Date: Wed, 2 Aug 2017 09:42:36 -0700
From: Thomas Garnier <>
To: "H. Peter Anvin" <>
Cc: Brian Gerst <>, Herbert Xu <>, 
	"David S . Miller" <>, Thomas Gleixner <>, Ingo Molnar <>, 
	Peter Zijlstra <>, Josh Poimboeuf <>, 
	Arnd Bergmann <>, Matthias Kaehlcke <>, 
	Boris Ostrovsky <>, Juergen Gross <>, 
	Paolo Bonzini <>, Radim Krčmář <>, 
	Joerg Roedel <>, Andy Lutomirski <>, Borislav Petkov <>, 
	"Kirill A . Shutemov" <>, Borislav Petkov <>, 
	Christian Borntraeger <>, "Rafael J . Wysocki" <>, 
	Len Brown <>, Pavel Machek <>, Tejun Heo <>, 
	Christoph Lameter <>, Kees Cook <>, 
	Paul Gortmaker <>, Chris Metcalf <>, 
	"Paul E . McKenney" <>, Andrew Morton <>, 
	Christopher Li <>, Dou Liyang <>, 
	Masahiro Yamada <>, Daniel Borkmann <>, 
	Markus Trippelsdorf <>, Peter Foley <>, 
	Steven Rostedt <>, Tim Chen <>, 
	Ard Biesheuvel <>, Catalin Marinas <>, 
	Matthew Wilcox <>, Michal Hocko <>, Rob Landley <>, 
	Jiri Kosina <>, "H . J . Lu" <>, Paul Bolle <>, 
	Baoquan He <>, Daniel Micay <>, 
	"the arch/x86 maintainers" <>, 
	Linux Kernel Mailing List <>, 
	kvm list <>, linux-pm <>, 
	linux-arch <>, 
	Kernel Hardening <>
Subject: Re: [RFC 16/22] x86/percpu: Adapt percpu for PIE support

On Thu, Jul 20, 2017 at 7:26 AM, Thomas Garnier <> wrote:
> On Wed, Jul 19, 2017 at 4:33 PM, H. Peter Anvin <> wrote:
>> On 07/19/17 11:26, Thomas Garnier wrote:
>>> On Tue, Jul 18, 2017 at 8:08 PM, Brian Gerst <> wrote:
>>>> On Tue, Jul 18, 2017 at 6:33 PM, Thomas Garnier <> wrote:
>>>>> Percpu uses a clever design where the .percpu ELF section has a virtual
>>>>> address of zero and the relocation code avoids relocating specific
>>>>> symbols. It makes the code simple and easily adaptable with or without
>>>>> SMP support.
>>>>> This design is incompatible with PIE because generated code always tries to
>>>>> access the zero virtual address relative to the default mapping address.
>>>>> It becomes impossible when KASLR is configured to go below -2G. This
>>>>> patch solves this problem by removing the zero mapping and adapting the GS
>>>>> base to be relative to the expected address. These changes are done only
>>>>> when PIE is enabled. The original implementation is kept as-is
>>>>> by default.
>>>> The reason the per-cpu section is zero-based on x86-64 is to
>>>> work around GCC hardcoding the stack protector canary at %gs:40.  So
>>>> this patch is incompatible with CONFIG_STACK_PROTECTOR.
>>> Ok, that makes sense. I don't want this feature to not work with
>>> CONFIG_CC_STACKPROTECTOR*. One way to fix that would be adding a GDT
>>> entry for gs so gs:40 points to the correct memory address and
>>> gs:[rip+XX] works correctly through the MSR.
>> What are you talking about?  A GDT entry and the MSR do the same thing,
>> except that a GDT entry is limited to an offset of 0-0xffffffff (which
>> doesn't work for us, obviously.)
> A GDT entry would allow gs:0x40 to be valid while all gs:[rip+XX]
> addresses use the MSR.
> I haven't tested it but that was used on the RFG mitigation [1]. The fs
> segment register was used for both thread storage and shadow stack.
> [1]

Small update on that.

I noticed that gs:0x40 being inaccessible is not the only problem:
the compiler also defaults to the fs register when mcmodel=kernel is
not set.

In the next patch set, I am going to add support for
-mstack-protector-guard=global so a global variable can be used
instead of the segment register. This is similar to the approach used
on ARM/ARM64.
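
As a rough illustration of the global-guard idea, the sketch below
hand-writes in C the kind of check a stack protector performs when the
canary comes from a global variable rather than from %gs:40. It is not
what the compiler emits and not kernel code: demo_stack_guard and
demo_guarded_copy are hypothetical names, the guard value is an
arbitrary constant, and the canary is placed in a struct next to the
buffer so the overwrite stays within one object.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a global guard variable (assumption:
 * a real setup would fill it with a random value at boot). */
static uintptr_t demo_stack_guard = 0x5bd1e995u;

/* Copies up to sizeof(frame) bytes into a local frame and reports
 * whether the canary next to the buffer survived.
 * Returns 0 if the canary is intact, -1 if it was overwritten. */
static int demo_guarded_copy(const char *src, size_t len)
{
    struct {
        char buf[16];        /* the protected local buffer */
        uintptr_t canary;    /* sits right after the buffer */
    } frame;

    /* "Prologue": load the canary from the global variable. */
    frame.canary = demo_stack_guard;

    /* Clamp to the frame so the demo never writes outside the object. */
    if (len > sizeof(frame))
        len = sizeof(frame);
    memcpy(&frame, src, len);

    /* "Epilogue": compare against the global variable again. */
    return frame.canary == demo_stack_guard ? 0 : -1;
}
```

A 16-byte copy leaves the canary intact, while a longer copy runs past
buf and is detected. The point of the global variant is that both the
load and the compare go through an ordinary symbol, so no fixed segment
offset is involved.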

Following this patch, I will work with gcc and llvm to add
-mstack-protector-reg=<segment register> support similar to PowerPC.
This way we can have gs used even without mcmodel=kernel. Once that's
an option, I can set up the GDT as described in the previous email
(similar to RFG).

Let me know what you think about this approach.

>>> Given the separate
>>> discussion on mcmodel, I am going first to check if we can move from
>>> PIE to PIC with a mcmodel=small or medium that would remove the percpu
>>> change requirement. I tried before without success but I understand
>>> better percpu and other components so maybe I can make it work.
>>>> This is silly.  The right thing for PIE is to be explicitly absolute,
>>>> without (%rip).  The use of (%rip) memory references for percpu is just
>>>> an optimization.
>>> I agree that it is odd but that's how the compiler generates code. I
>>> will re-explore PIC options with mcmodel=small or medium, as mentioned
>>> on other threads.
>> Why should the way the compiler generates code affect the way we do things
>> in assembly?
>> That being said, the compiler now has support for generating this kind
>> of code explicitly via the __seg_gs pointer modifier.  That should let
>> us drop the __percpu_prefix and just use variables directly.  I suspect
>> we want to declare percpu variables as "volatile __seg_gs" to account
>> for the possibility of CPU switches.
>> Older compilers won't be able to work with this, of course, but I think
>> that it is acceptable for those older compilers to not be able to
>> support PIE.
>>         -hpa
> --
> Thomas

