Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date: Wed, 07 Sep 2022 03:46:53 +0300
From: Alexey Izbyshev <izbyshev@...ras.ru>
To: musl@...ts.openwall.com
Subject: Illegal killlock skipping when transitioning to single-threaded state

Hi,

While reading pthread_exit() implementation I noticed that it can set 
"libc.need_locks" to -1 while still holding the killlock of the exiting 
thread:

     if (!--libc.threads_minus_1) libc.need_locks = -1;

If the remaining thread attempts to acquire the same killlock 
concurrently (which is valid for joinable threads), it works fine 
because LOCK() resets "libc.need_locks" only after a_cas():

     int need_locks = libc.need_locks;
     if (!need_locks) return;
     /* fast path: INT_MIN for the lock, +1 for the congestion */
     int current = a_cas(l, 0, INT_MIN + 1);
     if (need_locks < 0) libc.need_locks = 0;
     if (!current) return;

However, because "libc.need_locks" is reset when using LOCK() for any 
other lock too, the following could happen (E is the exiting thread, R 
is the remaining thread):

E: libc.need_locks = -1
R: LOCK(unrelated_lock)
R:   libc.need_locks = 0
R: UNLOCK(unrelated_lock)
R: LOCK(E->killlock) // skips locking, now both R and E think they are 
holding the lock
R: UNLOCK(E->killlock)
E: UNLOCK(E->killlock)

The lack of mutual exclusion means that the tid reuse problem that 
killlock is supposed to prevent might occur.

Moreover, when UNLOCK(E->killlock) is called concurrently by R and E, 
a_fetch_add() could be done twice if timing is such that both threads 
notice that "l[0] < 0":

    /* Check l[0] to see if we are multi-threaded. */
    if (l[0] < 0) {
        if (a_fetch_add(l, -(INT_MIN + 1)) != (INT_MIN + 1)) {
            __wake(l, 1, 1);
        }
    }

In that case E->killlock will be assigned to INT_MAX (i.e. "unlocked 
with INT_MAX waiters"). This is a bad state because the next LOCK() will 
wrap it to "unlocked" state instead of locking. Also, if more than two 
threads attempt to use it, a deadlock will occur if two 
supposedly-owners execute a_fetch_add() concurrently in UNLOCK() after a 
third thread registered itself as a waiter because the lock will wrap to 
INT_MIN.

Reordering the "libc.need_locks = -1" assignment and UNLOCK(E->killlock) 
and providing a store barrier between them should fix the issue.

Thanks,
Alexey

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.