Date: Thu, 22 Sep 2016 19:42:19 -0400
From: Rich Felker <>
To: "LeMay, Michael" <>
Cc: "" <>
Subject: Re: [RFC] Support for segmentation-hardened SafeStack

On Thu, Sep 22, 2016 at 11:00:45PM +0000, LeMay, Michael wrote:
> Hi,
> I submitted several patches to LLVM and Clang to harden SafeStack
> using segmentation on x86-32 [1]. See [2] for general background on
> SafeStack. On Linux, I have been testing my compiler changes with a
> modified version of musl. I currently plan to submit my musl patches
> if and when the prerequisite LLVM and Clang patches are accepted.
> One of my LLVM patches depends on the details of my musl patches,
> which is the main reason that I am sending this RFC now.

My understanding is that this is a different, incompatible ABI for
i386, i.e. code that uses safestack is not calling-compatible with
code that doesn't, and vice versa. Is that true? This is probably the
most significant determining factor in how we treat it.

> Specifically, assumes that the
> unsafe stack pointer is stored at offset 0x24 in the musl thread
> control block. This would be between the pid and tsd_used variables
> that are currently defined. I also propose storing the base address
> of the unsafe stack at offset 0x28, but the compiler would not
> depend on that.

Almost none of the existing fields are public; I think the only
exception is the stack-protector canary. IMO you should prefer
avoiding per-libc offset variation over preserving existing offsets.
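
To make the offset discussion concrete, here is a minimal sketch of how a libc could pin the proposed slots at compile time. The struct name, the placeholder array, and the accessor function are hypothetical; only the 0x24 and 0x28 offsets and the pid/tsd_used ordering come from the quoted proposal, and pointers are modelled as 32-bit values to match i386.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the start of the i386 thread control block.
 * Only the order pid -> unsafe stack slots -> tsd_used is taken from the
 * proposal; everything before pid is an opaque placeholder. */
struct tcb_sketch {
    uint32_t earlier_fields[8];  /* placeholder for bytes 0x00..0x1f */
    int32_t  pid;                /* assumed to be 4 bytes, ending at 0x24 */
    uint32_t unsafe_stack_ptr;   /* proposed offset 0x24, read by the compiler */
    uint32_t unsafe_stack_base;  /* proposed offset 0x28, libc-internal */
    int32_t  tsd_used;
};

/* Fail the build if a later edit to the struct moves either slot. */
static_assert(offsetof(struct tcb_sketch, unsafe_stack_ptr) == 0x24,
              "compiler-visible unsafe stack pointer moved");
static_assert(offsetof(struct tcb_sketch, unsafe_stack_base) == 0x28,
              "unsafe stack base moved");

/* Hypothetical accessor: the offset a compiler would hard-code. */
size_t unsafe_stack_ptr_offset(void)
{
    return offsetof(struct tcb_sketch, unsafe_stack_ptr);
}
```

A check of this kind only guards one libc's layout; it does not by itself address the per-libc variation raised above.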

> Here is an overview of some other changes that I plan to propose
> with my musl patches:
> The segmentation-hardened SafeStack support would be enabled with a
> new configuration option, "--enable-safe-stack".
> When this is enabled, many library routines require that both a
> safe stack and an unsafe stack be available. I modified _start_c in
> crt1.c to temporarily set up a small, pre-allocated unsafe stack for

I'd have to see exactly what you mean, but my leaning is that crt1 is
not a good place for anything new. For dynamic-linked programs,
crt1-to-__libc_start_main is the main permanent ABI boundary, and not
a place where you want complexity that could later need changing.

> the early initialization routines to use. I also made similar
> changes in dlstart.c. A larger unsafe stack is allocated and set up
> later from either __libc_start_main or __dls3, depending on whether
> static or dynamic linking is used. I split __dls3 so that it only
> performs minimal initialization before allocating the larger unsafe
> stack and then performing the rest of its work in a new __dls4
> function.
> After the larger unsafe stack is allocated, I invoke the modify_ldt
> syscall to insert a segment descriptor with a limit that is below
> the beginning of the safe stacks. I load that segment descriptor
> into the DS and ES segment registers to block memory accesses through
> DS and ES from reaching the safe stacks. One purpose of my LLVM and
> Clang patches is to insert the necessary segment override prefixes
> to direct accesses to the appropriate segments.

The content on these stacks is purely return addresses, spills, and
other stuff that's only accessible to compiler-generated code, not
data whose addresses can be taken, right?
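
For readers following along, the segment-limit arrangement in the quoted paragraph can be modelled with a little arithmetic. This is a portable simulation only, not the modify_ldt call itself: the function names are hypothetical, the descriptor is assumed to be flat (base 0) and page-granular, and the lowest safe stack address is assumed to be page-aligned and nonzero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12u

/* Limit, in 4 KiB pages, for a base-0 data segment that ends just below
 * the lowest safe stack. */
uint32_t data_segment_limit_pages(uint32_t lowest_safe_stack)
{
    assert(lowest_safe_stack != 0 && (lowest_safe_stack & 0xfffu) == 0);
    return (lowest_safe_stack >> PAGE_SHIFT) - 1u;
}

/* Last byte offset covered by a page-granular limit: the low 12 bits of
 * the effective limit are all ones. */
uint32_t last_valid_offset(uint32_t limit_pages)
{
    return (limit_pages << PAGE_SHIFT) | 0xfffu;
}

/* True if an access of len bytes at addr stays within the limit, i.e. it
 * would not fault when made through DS or ES in this model. */
bool ds_access_allowed(uint32_t addr, uint32_t len, uint32_t limit_pages)
{
    if (len == 0)
        return true;
    uint64_t last = (uint64_t)addr + len - 1u;  /* 64-bit to avoid wraparound */
    return last <= last_valid_offset(limit_pages);
}
```

With a hypothetical lowest safe stack at 0xB0000000, the limit comes out as 0xAFFFF pages, so heap and unsafe-stack addresses below that boundary pass the check while any access touching the safe stack region is rejected.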

> Many instructions expect that argc, argv, the environment, and auxv
> are accessible in the DS and ES segments. These are stored on the
> initial stack, which is above the limit of the restricted DS and ES
> segments. I annotated auxv with an attribute to cause the compiler
> to emit SS segment-override prefixes when accessing auxv. I copied
> the other data to the heap, which is accessible in DS and ES.
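
The copy-to-heap step for argv and the environment might look roughly like the sketch below. Both function names are hypothetical, and malloc/calloc are used purely for illustration; real startup code runs under tighter constraints on what it may call that early.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Release a vector produced by copy_string_vector(). */
void free_string_vector(char **vec)
{
    if (!vec)
        return;
    for (size_t i = 0; vec[i]; i++)
        free(vec[i]);
    free(vec);
}

/* Deep-copy a NULL-terminated string vector (argv or envp style) so that
 * neither the pointer array nor the strings live on the initial stack.
 * Returns NULL on allocation failure. */
char **copy_string_vector(char *const *vec)
{
    assert(vec != NULL);

    size_t n = 0;
    while (vec[n])
        n++;

    /* calloc zeroes the array, so it is NULL-terminated at every stage
     * and can be freed safely after a partial failure. */
    char **copy = calloc(n + 1, sizeof *copy);
    if (!copy)
        return NULL;

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(vec[i]) + 1;
        copy[i] = malloc(len);
        if (!copy[i]) {
            free_string_vector(copy);
            return NULL;
        }
        memcpy(copy[i], vec[i], len);
    }
    return copy;
}
```

The caller would then switch over to the returned vector in place of the original argv or environment pointer.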

Invasive arch-specific changes to unrelated code are highly frowned
upon in musl. I think to be acceptable upstream at all the auxv would
also have to be relocated by startup code to an address where it's

> I modified the pthread routines to allocate and deallocate
> additional stacks as needed in the appropriate memory ranges. The
> safe stacks are allocated at high addresses so that they are above
> the limit of the modified DS and ES segments. The unsafe stack for
> each new thread is allocated below its TLS region and thread control
> block, which is where the stack is currently located by default.

This likely should be hidden inside __clone rather than in
non-arch-specific sources.

> The Linux vDSO code may be incompatible with programs that enable
> segmentation-hardened SafeStack. For example, it may allocate data
> on the safe stack and then attempt to access it in DS or ES, which
> would result in an exception due to the segment limit violation. My
> patches prevent the vDSO from being invoked when
> segmentation-hardened SafeStack is enabled.

That sounds reasonable but rather unfortunate.

> Finally, the i386 __clone implementation is written in assembly
> language, so the compiler is unable to automatically add a stack
> segment override prefix to an instruction in that routine that
> accesses a safe stack. I added that prefix manually in the source
> code.
> Comments appreciated.

Hope the above are helpful.

