Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 05 Oct 2022 15:10:09 +0300
From: Alexey Izbyshev <izbyshev@...ras.ru>
To: musl@...ts.openwall.com
Subject: Re: Illegal killlock skipping when transitioning to
 single-threaded state

On 2022-10-05 04:00, Rich Felker wrote:
> On Wed, Sep 07, 2022 at 03:46:53AM +0300, Alexey Izbyshev wrote:
>> Reordering the "libc.need_locks = -1" assignment and
>> UNLOCK(E->killlock) and providing a store barrier between them
>> should fix the issue.
> 
> Back to this, because it's immediately actionable without resolving
> the aarch64 atomics issue:
> 
> Do you have something in mind for how this reordering is done, since
> there are other intervening steps that are potentially ordered with
> respect to either or both? I don't think there is actually any
> ordering constraint at all on the unlocking of killlock (with the
> accompanying assignment self->tid=0 kept with it) except that it be
> past the point where we are committed to the thread terminating
> without executing any more application code. So my leaning would be to
> move this block from the end of pthread_exit up to right after the
> point-of-no-return comment.
> 
This was my conclusion as well back when I looked at it before sending 
the report.

I was initially concerned about whether reordering with 
a_store(&self->detach_state, DT_EXITED) could cause an unwanted 
observable change (pthread_tryjoin_np() returning EBUSY after a pthread 
function acting on tid like pthread_getschedparam() returns ESRCH), but 
no, pthread_tryjoin_np() will block/trap if the thread is not 
DT_JOINABLE.

> Unfortunately while reading this I found another bug, this time a lock
> order one. __dl_thread_cleanup() takes a lock while the thread list
> lock is already held, but fork takes these in the opposite order. I
> think the lock here could be dropped and replaced with an atomic-cas
> list head, but that's rather messy and I'm open to other ideas.
> 
I'm not sure why using a lock-free list is messy, it seems like a 
perfect fit here to me.

However, doesn't __dl_vseterr() use the libc-internal allocator after  
34952fe5de44a833370cbe87b63fb8eec61466d7? If so, the problem that 
freebuf_queue was originally solving doesn't exist anymore. We still 
can't call the allocator after __tl_lock(), but maybe this whole free 
deferral approach can be reconsidered?

Alexey

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.