musl - Re: Illegal killlock skipping when transitioning to single-threaded state

Message-ID: <20221005140303.GS29905@brightrain.aerifal.cx>
Date: Wed, 5 Oct 2022 10:03:03 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Illegal killlock skipping when transitioning to
 single-threaded state

On Wed, Oct 05, 2022 at 03:10:09PM +0300, Alexey Izbyshev wrote:
> On 2022-10-05 04:00, Rich Felker wrote:
> >On Wed, Sep 07, 2022 at 03:46:53AM +0300, Alexey Izbyshev wrote:
> >>Reordering the "libc.need_locks = -1" assignment and
> >>UNLOCK(E->killlock) and providing a store barrier between them
> >>should fix the issue.
> >
> >Back to this, because it's immediately actionable without resolving
> >the aarch64 atomics issue:
> >
> >Do you have something in mind for how this reordering is done, since
> >there are other intervening steps that are potentially ordered with
> >respect to either or both? I don't think there is actually any
> >ordering constraint at all on the unlocking of killlock (with the
> >accompanying assignment self->tid=0 kept with it) except that it be
> >past the point where we are committed to the thread terminating
> >without executing any more application code. So my leaning would be to
> >move this block from the end of pthread_exit up to right after the
> >point-of-no-return comment.
> >
> This was my conclusion as well back when I looked at it before
> sending the report.
> 
> I was initially concerned about whether reordering with
> a_store(&self->detach_state, DT_EXITED) could cause an unwanted
> observable change (pthread_tryjoin_np() returning EBUSY after a
> pthread function acting on tid like pthread_getschedparam() returns
> ESRCH), but no, pthread_tryjoin_np() will block/trap if the thread
> is not DT_JOINABLE.
> 
> >Unfortunately while reading this I found another bug, this time a lock
> >order one. __dl_thread_cleanup() takes a lock while the thread list
> >lock is already held, but fork takes these in the opposite order. I
> >think the lock here could be dropped and replaced with an atomic-cas
> >list head, but that's rather messy and I'm open to other ideas.
> >
> I'm not sure why using a lock-free list is messy; it seems like a
> perfect fit here to me.

Just in general I've tried to reduce the direct use of atomics and use
high-level primitives, because (as this thread is evidence of) I find
the reasoning about direct use of atomics and their correctness to be
difficult and inaccessible to a lot of people who would otherwise be
successful readers of the code. But you're right that it's a "good
match" for the problem at hand.
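
For readers following along, the "atomic-cas list head" idea discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not musl's code: it uses C11 <stdatomic.h> where musl would use its internal a_* primitives, and the names struct node, queue_head, queue_push and queue_take_all are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical node type; musl's real freebuf_queue is laid out differently. */
struct node {
    struct node *next;
};

/* Head of the list; a null pointer means the list is empty. */
static _Atomic(struct node *) queue_head;

/* Push one node without taking any lock. The CAS retries if another
 * thread changed the head between our load and our store. */
static void queue_push(struct node *n)
{
    struct node *old = atomic_load_explicit(&queue_head, memory_order_relaxed);
    do {
        n->next = old; /* on CAS failure, old has been refreshed */
    } while (!atomic_compare_exchange_weak_explicit(
        &queue_head, &old, n, memory_order_release, memory_order_relaxed));
}

/* Detach the whole list in one atomic exchange. Taking everything at
 * once sidesteps the ABA hazard that a single-node pop would have. */
static struct node *queue_take_all(void)
{
    return atomic_exchange_explicit(&queue_head, (struct node *)0,
                                    memory_order_acquire);
}
```

The pusher needs no lock at all, which is what would remove the lock-order conflict with fork; the consumer walks the detached list privately and can free the nodes at a point where taking malloc locks is allowed.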

> However, doesn't __dl_vseterr() use the libc-internal allocator
> after commit 34952fe5de44a833370cbe87b63fb8eec61466d7? If so, the problem
> that freebuf_queue was originally solving doesn't exist anymore. We
> still can't call the allocator after __tl_lock(), but maybe this
> whole free deferral approach can be reconsidered?

I almost made that change when the MT-fork changes were done, but
didn't because it was wrong. I'm not sure if I documented this
anywhere (it might be in mail threads related to that or IRC) but it
was probably because it would need to take malloc locks with the
thread list lock held, which isn't allowed.

It would be nice if we could get rid of the deferred freeing here, but
I don't see a good way. The reason we can't free the buffer until
after the thread list lock is taken is that it's only freeable if this
isn't the last exiting thread. If it is the last exiting thread, the
buffer contents still need to be present for the atexit handlers to
see. And whether this is the last exiting thread is only
stable/determinate as long as the thread list lock is held.
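
The constraint described above can be modeled in a few lines. This is a single-threaded sketch with hypothetical names (struct exit_state, exit_decide, drain_deferred), not the actual pthread_exit logic; the real thread list lock is represented only by comments, and buffers are assumed to be at least sizeof(void *) bytes so they can link themselves into the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct exit_state {
    int live_threads; /* only stable while the thread list lock is held */
    void **deferred;  /* intrusive list: first word of each buffer is the link */
};

/* Assumed to be called with the thread list lock held.
 * Returns 0 if this is the last thread (buf must stay intact for
 * atexit handlers), 1 otherwise (buf is queued for a later free).
 * free() is never called here, since that would take malloc locks
 * while the thread list lock is held. */
static int exit_decide(struct exit_state *s, void *buf)
{
    if (s->live_threads == 1)
        return 0;
    s->live_threads--;
    if (buf) {
        void **p = buf;
        *p = s->deferred; /* link buffer onto the deferred list */
        s->deferred = p;
    }
    return 1;
}

/* Assumed to be called later, without the thread list lock held.
 * Frees every queued buffer and returns how many were freed. */
static int drain_deferred(struct exit_state *s)
{
    int n = 0;
    void **p = s->deferred;
    s->deferred = NULL;
    while (p) {
        void **next = *p;
        free(p);
        n++;
        p = next;
    }
    return n;
}
```

The point of the model is that the "last thread or not" test and the queueing happen in the same locked region, while the actual free is pushed to a context where the allocator may be used.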

Rich