Message-ID: <20181014234035.GJ5150@brightrain.aerifal.cx>
Date: Sun, 14 Oct 2018 19:40:35 -0400
From: Rich Felker <dalias@...c.org>
To: Florian Weimer <fw@...eb.enyo.de>
Cc: musl@...ts.openwall.com
Subject: Re: Possible design for global thread list

On Sun, Oct 14, 2018 at 07:32:54PM +0200, Florian Weimer wrote:
> * Rich Felker:
> 
> > Of course, this futex wake is already used for pthread_join, which
> > would need another mechanism. This is solved simply: pthread_exit can
> > FUTEX_REQUEUE a waiting joiner to the thread-list lock. pthread_join
> > then has to wait on (but need not acquire) the thread-list lock after
> > waiting on the thread's own exit futex in order to ensure the exit has
> > actually finished. This is potentially subject to long waits if the
> > lock is under contention (lots of threads exiting or being created)
> > and retaken before pthread_join gets to run, but the probability of
> > collision can be made negligible (only possible under extremely rapid
> > tid reuse) by using the tid of the exiting thread as the wait value.
> > Alternatively, the tid of the joiner could be used, making collisions
> > impossible, but setting up to do this is more complex.
> 
> I'm not sure if this is compatible with existing software which
> rapidly joins and creates many threads in succession because it looks
> to me that the pthread_join operation can return before the kernel
> resources are freed.  As a result, applications will get impossible
> EAGAIN failures, even though the application never exceeds the thread
> limit.
> 
> Depending on kernel version and cgroups configuration, this race can
> even be observed with the more usual join sequence because the kernel
> signals thread exit too early to user space.

I think you must be confused about something. Either way, what
pthread_join waits for is the same kind of futex wake by the kernel,
and that wake happens at the same point during kernel task exit. The
only difference is which address the joiner ends up waiting on.
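
For illustration, here is a minimal sketch of that exit-time wake,
assuming Linux and a libc that exposes clone(). It is not musl's
actual code; exit_word, child_fn, wait_for_exit and join_demo are
hypothetical names. The kernel stores the new tid in exit_word
(CLONE_PARENT_SETTID), then zeroes it and does a futex wake on it
during task exit (CLONE_CHILD_CLEARTID). The joiner just waits for
the word to become zero:

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <sched.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical stand-in for the word the kernel clears at task exit. */
static volatile int exit_word;

static int child_fn(void *arg)
{
    (void)arg;
    return 0; /* the clone() wrapper then exits this task only */
}

/* Block until *addr is 0. The kernel's exit-time wake is a
 * non-private futex wake, so the wait must be non-private too. */
static void wait_for_exit(volatile int *addr)
{
    int v;
    while ((v = *addr) != 0)
        syscall(SYS_futex, (int *)addr, FUTEX_WAIT, v, NULL, NULL, 0);
}

/* Create a bare thread and "join" it via the exit futex.
 * Returns 0 on success, -1 on failure. */
int join_demo(void)
{
    const size_t stack_size = 256 * 1024;
    char *stack = malloc(stack_size);
    if (!stack)
        return -1;

    int flags = CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND
              | CLONE_THREAD | CLONE_SYSVSEM
              | CLONE_PARENT_SETTID | CLONE_CHILD_CLEARTID;

    /* parent_tid and child_tid both point at exit_word: the kernel
     * writes the tid there before clone() returns, and clears it
     * (then wakes waiters) when the task exits. */
    int tid = clone(child_fn, stack + stack_size, flags, NULL,
                    (pid_t *)&exit_word, NULL, (pid_t *)&exit_word);
    if (tid < 0) {
        free(stack);
        return -1;
    }

    wait_for_exit(&exit_word);
    free(stack); /* the task has left userspace by the time of the wake */
    return exit_word == 0 ? 0 : -1;
}
```

Calling join_demo() from main should return 0 once the child task
has exited. Whichever address a joiner sleeps on, its wakeup
originates from this same kernel action.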

Rich
