Date: Sun, 14 Oct 2018 19:32:54 +0200
From: Florian Weimer <fw@...eb.enyo.de>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: Possible design for global thread list

* Rich Felker:


> Of course, this futex wake is already used for pthread_join, which
> would need another mechanism. This is solved simply: pthread_exit can
> FUTEX_REQUEUE a waiting joiner to the thread-list lock. pthread_join
> then has to wait on (but need not acquire) the thread-list lock after
> waiting on the thread's own exit futex in order to ensure the exit has
> actually finished. This is potentially subject to long waits if the
> lock is under contention (lots of threads exiting or being created)
> and retaken before pthread_join gets to run, but the probability of
> collision can be made negligible (only possible under extremely rapid
> tid reuse) by using the tid of the exiting thread as the wait value.
> Alternatively, the tid of the joiner could be used, making collisions
> impossible, but setting up to do this is more complex.

I'm not sure this is compatible with existing software which rapidly
joins and creates many threads in succession, because it looks to me
as if the pthread_join operation can return before the kernel
resources are freed.  As a result, applications will get EAGAIN
failures from pthread_create that should be impossible, because the
application never exceeds the thread limit.

Depending on kernel version and cgroups configuration, this race can
even be observed with the more usual join sequence because the kernel
signals thread exit too early to user space.
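
The create/join pattern in question can be sketched roughly as follows.
This is only an illustration, not code from musl or from any affected
application: the names noop_thread and count_eagain_failures are made
up for the example.  At most one thread besides the caller exists at
any time, so with a thread limit just above the current usage, any
EAGAIN it counts would point to pthread_join returning before the
kernel has released the previous thread's resources.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical thread body: exits immediately. */
static void *noop_thread(void *arg)
{
    return arg;
}

/*
 * Hypothetical helper: create and join `iterations` threads strictly
 * one after another.  Returns the number of pthread_create calls that
 * failed with EAGAIN, or -1 on any other error.
 */
int count_eagain_failures(int iterations)
{
    int eagain = 0;

    assert(iterations >= 0);

    for (int i = 0; i < iterations; i++) {
        pthread_t t;
        int ret = pthread_create(&t, NULL, noop_thread, NULL);

        if (ret == EAGAIN) {
            /* No thread was created, so there is nothing to join. */
            eagain++;
            continue;
        }
        if (ret != 0)
            return -1;

        /* The previous thread is always joined before the next create. */
        if (pthread_join(t, NULL) != 0)
            return -1;
    }
    return eagain;
}
```

Whether the count is ever nonzero depends on the configured limit
(for example RLIMIT_NPROC or a pids cgroup), the kernel version, and
timing, so a result of zero does not show that the race is absent.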
