musl - Thread pointer changes

Message-ID: <20140610072835.GA8466@brightrain.aerifal.cx>
Date: Tue, 10 Jun 2014 03:28:35 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Thread pointer changes

One of the items on the roadmap for this release cycle is:

"Optimizations based on thread-pointer being always-available, and
fallbacks for old kernels without thread pointer setting."

I've worked out (and have POC code) how to use the modify_ldt syscall
to set up a valid thread pointer on i386 for any kernel back to 1.0.

For ARM, I think we should revisit the thread-pointer/atomic inlining
work that was done as a sloppy workaround for kernels without the
kuser_helper page. If the set_thread_area syscall fails (due to an old
kernel that doesn't support it), we can set up a function pointer for
the __aeabi_read_tp function that only supports a single main thread
and returns its thread pointer. Likewise at this stage we could detect
the presence or absence of the kuser_helper page and substitute our
own fallbacks (using the instructions directly) if needed. One thing
that should be checked though is whether there are any kernel versions
which support EABI syscalls but not the thread-pointer setup syscall.
If not, there's really no use in having a fallback for that. These
slides look like they might shed some light on the history:
http://wookware.org/talks/armeabidebconf.pdf
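
A rough sketch of the function-pointer fallback idea for ARM, written
as portable C so it can be read in isolation. Every name here
(try_set_thread_area, read_tp, init_tp_sketch, and so on) is
hypothetical, and the syscall is simulated by a flag; this is not
musl's actual code, only an illustration of the dispatch:

```c
#include <stddef.h>

/* Simulated kernel capability; a real build would issue the syscall. */
static int kernel_has_set_tls = 0;
static void *hw_tp; /* stands in for the hardware/kernel-held pointer */

/* Hypothetical wrapper: 0 on success, -1 if the kernel lacks support. */
static int try_set_thread_area(void *tp)
{
    if (!kernel_has_set_tls) return -1;
    hw_tp = tp;
    return 0;
}

/* Fallback storage: only valid while a single main thread exists. */
static void *main_thread_tp;

static void *read_tp_hw(void)       { return hw_tp; }
static void *read_tp_fallback(void) { return main_thread_tp; }

/* Plays the role the text assigns to __aeabi_read_tp. */
static void *(*read_tp)(void) = read_tp_hw;

/* Returns 1 if threads can be supported, 0 if main-thread-only. */
static int init_tp_sketch(void *tp)
{
    if (try_set_thread_area(tp) == 0) {
        read_tp = read_tp_hw;
        return 1;
    }
    main_thread_tp = tp;
    read_tp = read_tp_fallback;
    return 0;
}
```

The same selection point would be the natural place to probe for the
kuser_helper page and swap in direct-instruction fallbacks.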

For MIPS I see no way to support kernels that don't provide
set_thread_area.

For microblaze, powerpc, and sh, it's impossible not to support the
thread pointer; it's simply loaded in a register via a userspace
instruction.

For x86_64, I'm not sure if pre-2.6 kernels were ever relevant (and I
mean real pre-2.6, not RHEL-style 2.4 that's actually patched up to be
equivalent to 2.6). But there's no way to emulate the thread pointer.
Even if modify_ldt works (and I'm not sure if it does), selectors
set up by it can only address 32 bits.

As of 1.1.0, musl nominally _requires_ the thread pointer. However
there's still a flag indicating whether it's set up (important for code
that runs prior to it getting set up, especially accesses to errno)
which a few functions check before accessing it. If we're going to
eventually support building libc with stack protector (which would
have been nice for the recent dns parser issue!) it's important that
the thread pointer always be set up anyway.
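
To make the flag check concrete, here is a minimal sketch of an
errno lookup that must work before the thread pointer exists. The
names (has_thread_pointer as a plain global, struct thread_sketch,
errno_location_sketch) are illustrative assumptions, not the real
libc internals:

```c
/* Assumed flag: nonzero once the thread pointer has been installed. */
static int has_thread_pointer;

/* Used only during early startup, before any thread structure exists. */
static int early_errno;

/* Hypothetical per-thread structure reached through the thread pointer. */
struct thread_sketch { int errno_val; };
static struct thread_sketch *current_thread;

static int *errno_location_sketch(void)
{
    /* This branch is the cost that an always-available thread
     * pointer would let us remove. */
    if (!has_thread_pointer) return &early_errno;
    return &current_thread->errno_val;
}
```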

Right now, we have two flags in the libc structure,
libc.has_thread_pointer and libc.can_do_threads. Both of these are set
in __init_tp() based on the result of __set_thread_area; if it fails,
they're left at 0 (false) and if it succeeds they're set to 1 (true).
The idea here was that we could eventually change __set_thread_area to
have a more fine-grained error reporting to let the caller know that
it succeeded in setting up a thread pointer for the main thread, but
not in a way that will work for running multiple threads. I'm not sure
whether I'll keep this; it actually might make more sense for the
arch-specific __set_thread_area to simply store its own flag, which
the arch-specific __clone could then inspect and force itself to fail
if CLONE_SETTLS is specified but can't be honored.
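
A sketch of that alternative, where the arch code records how much
thread-pointer support it obtained and the clone wrapper consults it.
All names are hypothetical, kernel behavior is simulated by a
parameter, and the choice of ENOSYS as the error is my assumption
here rather than anything decided:

```c
#include <errno.h>

/* Same numeric value as Linux's CLONE_SETTLS; defined locally so the
 * sketch does not depend on <sched.h>. */
#define SKETCH_CLONE_SETTLS 0x00080000UL

enum tp_level { TP_NONE, TP_MAIN_ONLY, TP_FULL };

/* Arch-private state, replacing the two flags in the libc structure. */
static enum tp_level tp_support = TP_NONE;

/* Hypothetical arch setup; 'kernel_support' simulates what the kernel
 * allowed. Returns 0 if the main thread got a thread pointer. */
static int set_thread_area_sketch(void *tp, enum tp_level kernel_support)
{
    (void)tp;
    tp_support = kernel_support;
    return kernel_support == TP_NONE ? -1 : 0;
}

/* Hypothetical clone wrapper: refuse TLS requests it cannot honor. */
static int clone_sketch(unsigned long flags)
{
    if ((flags & SKETCH_CLONE_SETTLS) && tp_support != TP_FULL)
        return -ENOSYS; /* assumed error code */
    return 0; /* a real implementation would make the syscall here */
}
```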


Rich