Date: Wed, 17 Oct 2018 23:54:18 +0200
From: Florian Weimer <fw@...eb.enyo.de>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: Possible design for global thread list

* Rich Felker:

> On Sun, Oct 14, 2018 at 07:32:54PM +0200, Florian Weimer wrote:
>> * Rich Felker:
>> 
>> 
>> > Of course, this futex wake is already used for pthread_join, which
>> > would need another mechanism. This is solved simply: pthread_exit can
>> > FUTEX_REQUEUE a waiting joiner to the thread-list lock. pthread_join
>> > then has to wait on (but need not acquire) the thread-list lock after
>> > waiting on the thread's own exit futex in order to ensure the exit has
>> > actually finished. This is potentially subject to long waits if the
>> > lock is under contention (lots of threads exiting or being created)
>> > and retaken before pthread_join gets to run, but the probability of
>> > collision can be made negligible (only possible under extremely rapid
>> > tid reuse) by using the tid of the exiting thread as the wait value.
>> > Alternatively, the tid of the joiner could be used, making collisions
>> > impossible, but setting up to do this is more complex.
>> 
>> I'm not sure if this is compatible with existing software which
>> rapidly joins and creates many threads in succession because it looks
>> to me that the pthread_join operation can return before the kernel
>> resources are freed.  As a result, applications will get impossible
>> EAGAIN failures, even though the application never exceeds the thread
>> limit.
>> 
>> Depending on kernel version and cgroups configuration, this race can
>> even be observed with the more usual join sequence because the kernel
>> signals thread exit too early to user space.
>
> I think you must be confused about something, because either way, what
> pthread_join is waiting for is the same kind of futex wake by the
> kernel, and happens at the same point during kernel task exit. The
> only difference is what address it's at.

Right, I was confused.
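
The choice of wait value discussed in the quoted proposal can be sketched in C as follows. This is a minimal illustration, not musl's code: it assumes the thread-list lock is a 32-bit futex word holding the owner's tid (0 when free), and the names thread_list_lock, set_list_lock and join_wait_on_list_lock are hypothetical. The timeout exists only so that the demonstration terminates in a single thread; a real joiner would block until the kernel's wake at task exit.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for the thread-list lock word.
 * Assumption: it holds the owner's tid while held and 0 when free. */
static uint32_t thread_list_lock;

/* Test helper (hypothetical): force the lock word to a given value. */
void set_list_lock(uint32_t value)
{
    __atomic_store_n(&thread_list_lock, value, __ATOMIC_SEQ_CST);
}

/* Second wait of the joiner: sleep on the lock word only while it
 * still contains the exiting thread's tid.  FUTEX_WAIT compares the
 * word with the expected value atomically and fails with EAGAIN if
 * they differ, so a lock that was already released (0) or retaken by
 * another thread (a different tid) does not put the joiner to sleep.
 * Returns 0 if woken, or a negated errno value. */
int join_wait_on_list_lock(uint32_t exiting_tid, long timeout_ms)
{
    /* Relative timeout, used here only to bound the demonstration. */
    struct timespec ts = {
        .tv_sec = timeout_ms / 1000,
        .tv_nsec = (timeout_ms % 1000) * 1000000L,
    };
    long r = syscall(SYS_futex, &thread_list_lock, FUTEX_WAIT_PRIVATE,
                     exiting_tid, &ts, NULL, 0);
    return r == -1 ? -errno : 0;
}
```

With the lock word set to 0 or to some other tid, the call returns -EAGAIN at once. It blocks only while the word still equals the exiting thread's tid, which is why a collision needs that same tid to be reused and to hold the lock again before the joiner runs.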
