Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 19 Sep 2022 11:29:37 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Illegal killlock skipping when transitioning to
 single-threaded state

On Wed, Sep 07, 2022 at 03:46:53AM +0300, Alexey Izbyshev wrote:
> Hi,
> 
> While reading pthread_exit() implementation I noticed that it can
> set "libc.need_locks" to -1 while still holding the killlock of the
> exiting thread:
> 
>     if (!--libc.threads_minus_1) libc.need_locks = -1;
> 
> If the remaining thread attempts to acquire the same killlock
> concurrently (which is valid for joinable threads), it works fine
> because LOCK() resets "libc.need_locks" only after a_cas():
> 
>     int need_locks = libc.need_locks;
>     if (!need_locks) return;
>     /* fast path: INT_MIN for the lock, +1 for the congestion */
>     int current = a_cas(l, 0, INT_MIN + 1);
>     if (need_locks < 0) libc.need_locks = 0;
>     if (!current) return;
> 
> However, because "libc.need_locks" is reset when using LOCK() for
> any other lock too, the following could happen (E is the exiting
> thread, R is the remaining thread):
> 
> E: libc.need_locks = -1
> R: LOCK(unrelated_lock)
> R:   libc.need_locks = 0
> R: UNLOCK(unrelated_lock)
> R: LOCK(E->killlock) // skips locking, now both R and E think they
> are holding the lock
> R: UNLOCK(E->killlock)
> E: UNLOCK(E->killlock)
> 
> The lack of mutual exclusion means that the tid reuse problem that
> killlock is supposed to prevent might occur.
> 
> Moreover, when UNLOCK(E->killlock) is called concurrently by R and
> E, a_fetch_add() could be done twice if timing is such that both
> threads notice that "l[0] < 0":
> 
>    /* Check l[0] to see if we are multi-threaded. */
>    if (l[0] < 0) {
>        if (a_fetch_add(l, -(INT_MIN + 1)) != (INT_MIN + 1)) {
>            __wake(l, 1, 1);
>        }
>    }
> 
> In that case E->killlock will be assigned to INT_MAX (i.e. "unlocked
> with INT_MAX waiters"). This is a bad state because the next LOCK()
> will wrap it to "unlocked" state instead of locking. Also, if more
> than two threads attempt to use it, a deadlock will occur if two
> supposedly-owners execute a_fetch_add() concurrently in UNLOCK()
> after a third thread registered itself as a waiter because the lock
> will wrap to INT_MIN.
> 
> Reordering the "libc.need_locks = -1" assignment and
> UNLOCK(E->killlock) and providing a store barrier between them
> should fix the issue.

I think this all sounds correct. I'm not sure what you mean by a store
barrier between them, since all lock and unlock operations are already
full barriers.

I'm still a little bit concerned though that there's no real model for
synchronization of the access to modify need_locks though.
Conceptually, the store of -1 from the exiting thread and the store of
0 in the surviving thread are not ordered with respect to each other
except by an unsynchronized causality relationship.

I suspect what "should be" happening is that, if we observe
need_locks==-1 (as a relaxed atomic load in our memory model), we take
the thread list lock (as the lock controlling write access to these
values) which ensures that the exiting thread has exited, then confirm
that we are the only thread left (are there async-signal ways without
UB that another thread could be created in the mean time?) before
setting it to 0. But this is a rather heavy operation. If there's an
assurance that no other threads can be created in the middle of LOCK
(which I think there should be), we could just use __tl_sync(), which
is much lighter, to conclude that we've synchronized with being the
only remaining thread and that unsynchronized access is okay.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.