Date: Tue, 25 Mar 2014 03:11:24 -0400
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: Transition path for removing lazy init of thread pointer

On Tue, Mar 25, 2014 at 06:35:02AM +0000, Laurent Bercot wrote:
> On 25/03/2014 01:55, Rich Felker wrote:
> >The mandatory syscall is set_thread_area or equivalent, e.g.
> >arch_prctl on x86_64. It's there because most archs need a syscall to
> >set the thread pointer used for accessing TLS. Even in single-threaded
> >programs, there are reasons one may want to have it.
> >
> >The big reason is that, on most archs, stack protector's canary value
> >is stored at a fixed offset from the thread pointer rather than in a
> >global, so stack protector can't work without the thread pointer being
> >initialized. Up to now we've tried to detect whether stack protector
> >is used based on symbol references to __stack_chk_fail, but this check
> >gives a false negative (and thus crashing programs) if gcc optimizes
> >out the check to __stack_chk_fail but not the load of the canary, e.g.
> >in the program: int main() { exit(0); }
> 
>  That's a good reason indeed.
>  I take it you're still hell-bent against compile-time options ? Because

In general, no. I'll probably eventually accept compile-time options
for things like iconv charset selection.

But gratuitous ones, yes. Especially if supporting the compile-time
option significantly complicates the code and forces us to have
multiple #ifdef/#else cases, a la uClibc. Making the thread pointer
optional would, at least in the long term, be one of those, since it
either precludes all optimization and simplification that assumes the
thread pointer is available, or forces us to have multiple versions of
the same code for with/without it.

> a musl compile-time option "I don't want this musl to support stack
> protector, yes I know it will crash programs compiled with it, but I'm
> a big boy and know what I'm doing" would be great for OCD people like
> me who like their strace clean. :)

Yeah, this is really just a case of appealing to OCD, so thanks for
acknowledging that. :-)
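
For reference, here is a minimal sketch of the canary load that makes the
thread pointer mandatory for stack-protected code. It assumes x86_64 Linux,
where GCC's stack protector reads the canary from %fs:0x28; the helper name
read_canary is made up for illustration and is not part of musl. On any other
target the function just reports that it is unsupported.

```c
#include <stdint.h>

/* Illustrative helper, not a musl function.
 * Returns 1 and stores the canary when built for x86_64 Linux, else 0.
 * Assumption: the canary lives at %fs:0x28, the offset GCC uses there. */
static int read_canary(uintptr_t *out)
{
#if defined(__x86_64__) && defined(__linux__)
    uintptr_t v;
    /* Same kind of thread-pointer-relative load a protected function
     * performs in its prologue. */
    __asm__ volatile ("movq %%fs:0x28, %0" : "=r"(v));
    *out = v;
    return 1;
#else
    (void)out;
    return 0;
#endif
}
```

If the thread pointer was never set, a load like this has nothing valid to
read from, which is the crash described in the quoted text above.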

I think we could still consider making the second syscall
(set_tid_address) get optimized out in static binaries that don't need
it, but it's enough of a complexity burden that I'd like to see what
others have to say about it, and at least wait to see how hard it
would be, once other cleanups related to this change are made.
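
As a rough illustration of what that second syscall does, here is a sketch
that invokes it directly on Linux. The names register_tid_slot and tid_slot
are hypothetical and this is not musl's startup code. The call registers an
address that the kernel clears (and futex-wakes) when the calling thread
exits, and it returns the caller's thread ID.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical slot: the kernel writes 0 here when the calling thread exits. */
static int tid_slot;

/* Illustrative helper, not a musl function. Returns the caller's thread ID. */
static long register_tid_slot(void)
{
    return syscall(SYS_set_tid_address, &tid_slot);
}
```

Calling this in a process whose libc has already registered an address
replaces that registration, so treat it as an experiment rather than
something to do in real code.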

> >The other main reason is that lazy initialization is a lot more
> >expensive at runtime.
> 
>  That's not a good reason for single-threaded programs.

Well, there are a lot of mostly-useless micro-optimizations you could
do that, theoretically, improve single-threaded programs. Like
accessing errno directly. The problem is that these preclude doing
major systemic simplifications that have much greater debloating
effects (even on single-threaded programs!) unless we make a whole
separate single-thread-only libc.

For example, __stdio_read and __stdio_write just got simpler because
they no longer have to special-case the threaded/non-threaded cases to
avoid gratuitous thread-pointer loads and possible crashes.

And pthread_setcancelstate, which is used in various functions which
need to avoid triggering cancellation, is now simpler since it knows
the absence/presence of a thread pointer will be constant (before, it
had to be able to get/set state before the thread pointer was loaded
for consistency in case it's loaded later).
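
To make the difference concrete, here is a toy model of the two schemes. It
is only a sketch: struct thread_block, tp, init_tp and both setter functions
are invented for illustration and do not mirror musl's actual source. The
lazy variant has to keep the cancel state somewhere else until the thread
pointer exists and carry it over later; the unconditional variant has a
single path.

```c
#include <stddef.h>

/* Hypothetical stand-in for a per-thread control block. */
struct thread_block {
    int cancel_disabled;
};

static struct thread_block main_thread;
static struct thread_block *tp;    /* stand-in for the hardware thread pointer */
static int pre_init_cancel_state;  /* extra state needed only by the lazy scheme */

/* Stand-in for the code that sets the thread pointer. */
static void init_tp(void)
{
    if (tp) return;
    /* Carry over any state recorded before the thread pointer existed. */
    main_thread.cancel_disabled = pre_init_cancel_state;
    tp = &main_thread;
}

/* Lazy scheme: must behave consistently before and after init_tp(). */
static int lazy_set_cancel_disabled(int new_state)
{
    int old;
    if (!tp) {
        old = pre_init_cancel_state;
        pre_init_cancel_state = new_state;
    } else {
        old = tp->cancel_disabled;
        tp->cancel_disabled = new_state;
    }
    return old;
}

/* Unconditional scheme: init_tp() is assumed to have run at startup. */
static int set_cancel_disabled(int new_state)
{
    int old = tp->cancel_disabled;
    tp->cancel_disabled = new_state;
    return old;
}
```

The second function is only valid once init_tp() has run, which is the
guarantee that unconditional initialization at startup provides.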

Right now that's about it for code that gets linked in NON-threaded
programs, but there will probably be more that gets simplified later,
and a lot more if you count code for programs using threads.

> >So despite always initializing the thread pointer kinda looking like
> >"bloat" from a minimal-program standpoint, it's really a major step
> >forward in debloating and simplifying lots of code.
> 
>  I totally understand and approve for multi-threaded programs and
> programs using stack protection. I just wish there were a special
> optimization for "int main() { return 0; }".

Yes, I miss the extreme-minimal strace too, but it's still pretty damn
minimal and not going to get any bigger anytime soon. What I don't
miss is the messy undocumented logic for lazy initialization.

Rich
