Date: Sun, 10 Feb 2019 15:15:55 +0300
From: Alexey Izbyshev <izbyshev@...ras.ru>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: __synccall: deadlock and reliance on racy /proc/self/task

On 2019-02-10 04:20, Rich Felker wrote:
> On Sun, Feb 10, 2019 at 02:16:23AM +0100, Szabolcs Nagy wrote:
>> * Rich Felker <dalias@...c.org> [2019-02-09 19:52:50 -0500]:
>> > Maybe it's salvageable though. Since __block_new_threads is true, in
>> > order for this to happen, tid J must have been between the
>> > __block_new_threads check in pthread_create and the clone syscall at
>> > the time __synccall started. The number of threads in such a state
>> > seems to be bounded by some small constant (like 2) times
>> > libc.threads_minus_1+1, computed at any point after
>> > __block_new_threads is set to true, so sufficiently heavy presignaling
>> > (heavier than we have now) might suffice to guarantee that all are
>> > captured.
>> 
>> heavier presignaling may catch more threads, but we don't
>> know how long should we wait until all signal handlers are
>> invoked (to ensure that all tasks are enqueued on the call
>> serializer chain before we start walking that list)
> 
> That's why reading /proc/self/task is still necessary. However, it
> seems useful to be able to prove you've queued enough signals that at
> least as many threads as could possibly exist are already in a state
> where they cannot return from a syscall with signals unblocked without
> entering the signal handler. In that case you would know there's no
> more racing going on to create new threads, so reading /proc/self/task
> is purely to get the list of threads you're waiting to enqueue
> themselves on the chain, not to find new threads you need to signal.

Like Szabolcs, I fail to see how heavier presignaling would help.
Even if we're sure that we'll *eventually* catch all threads (including
their future children) that were between the __block_new_threads check
in pthread_create and the clone syscall at the time we set
__block_new_threads to 1, we still have no way to know whether we have
reached a stable state. In other words, we don't know when to stop
spinning in the /proc/self/task loop, because we may miss threads that
are currently being created.
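
To make the "no stable state" point concrete, here is a minimal sketch
of such a scan. It is not musl's __synccall code: scan_tasks() and
count_new() are hypothetical helpers invented for illustration, and the
256-entry buffer is an arbitrary assumption. The sketch only shows that
a pass which finds no new tids tells us nothing about threads that have
not yet reached clone.

```c
#include <assert.h>
#include <dirent.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper, not musl code: store the numeric entries of a
 * task directory (normally "/proc/self/task") in tids[0..cap-1].
 * Returns the number stored, or -1 if the directory cannot be opened. */
int scan_tasks(const char *dir, pid_t *tids, int cap)
{
    assert(tids && cap > 0);
    DIR *d = opendir(dir);
    if (!d) return -1;
    int n = 0;
    struct dirent *de;
    while (n < cap && (de = readdir(d))) {
        char *end;
        long tid = strtol(de->d_name, &end, 10);
        if (end == de->d_name || *end || tid <= 0)
            continue; /* skip ".", ".." and anything non-numeric */
        tids[n++] = (pid_t)tid;
    }
    closedir(d);
    return n;
}

/* Hypothetical helper: one more pass over dir, counting tids that are
 * not among seen[0..nseen-1]. Returns -1 if the directory cannot be
 * opened. A result of 0 means only that this pass saw nothing new; a
 * thread still between the __block_new_threads check and clone has no
 * entry yet, so 0 is not evidence of a stable state. */
int count_new(const char *dir, const pid_t *seen, int nseen)
{
    pid_t now[256]; /* arbitrary bound, an assumption of this sketch */
    int n = scan_tasks(dir, now, 256), fresh = 0;
    if (n < 0) return -1;
    for (int i = 0; i < n; i++) {
        int known = 0;
        for (int j = 0; j < nseen && !known; j++)
            known = (seen[j] == now[i]);
        fresh += !known;
    }
    return fresh;
}
```

A caller that loops until count_new() returns 0 terminates on the first
quiet pass, which is exactly the stopping rule that cannot be justified
here.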

Also, note that __pthread_exit() blocks all signals and decrements
libc.threads_minus_1 before exiting, so an arbitrary number of threads
may be exiting while we're in the /proc/self/task loop, and we know that
concurrently exiting threads are related to missed threads.
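
The exit window can be modeled with a toy program. This is a sketch
under stated assumptions, not musl code: live_minus_1 is a stand-in for
libc.threads_minus_1, exit_window() and count_tasks() are hypothetical
names, the two semaphores exist only to hold the worker inside the
window, and signal blocking is not modeled. Link with -pthread where
the C library requires it.

```c
#include <assert.h>
#include <dirent.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>

static atomic_int live_minus_1;     /* stand-in for libc.threads_minus_1 */
static sem_t decremented, may_exit; /* hold the worker inside the window */

/* Count numeric entries in /proc/self/task, or return -1 on failure. */
static int count_tasks(void)
{
    DIR *d = opendir("/proc/self/task");
    if (!d) return -1;
    int n = 0;
    struct dirent *de;
    while ((de = readdir(d)))
        if (de->d_name[0] >= '1' && de->d_name[0] <= '9') n++;
    closedir(d);
    return n;
}

static void *worker(void *arg)
{
    (void)arg;
    atomic_fetch_sub(&live_minus_1, 1); /* counter says the thread is gone */
    sem_post(&decremented);
    sem_wait(&may_exit);                /* but the kernel task still exists */
    return NULL;
}

/* Sample both views while one thread sits between the decrement and
 * its actual exit. Stores the kernel's task count in *tasks and the
 * counter-derived thread count in *counted. Returns 0 on success. */
int exit_window(int *tasks, int *counted)
{
    pthread_t t;
    sem_init(&decremented, 0, 0);
    sem_init(&may_exit, 0, 0);
    atomic_store(&live_minus_1, 1);     /* main thread plus one worker */
    if (pthread_create(&t, NULL, worker, NULL)) return -1;
    sem_wait(&decremented);
    *tasks = count_tasks();
    *counted = atomic_load(&live_minus_1) + 1;
    sem_post(&may_exit);
    pthread_join(t, NULL);
    sem_destroy(&decremented);
    sem_destroy(&may_exit);
    return *tasks < 0 ? -1 : 0;
}
```

While the worker is held in the window, the counter-derived count is
lower than the number of tasks the kernel lists, so the counter cannot
bound how many threads a scan may still encounter.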

Alexey