Date: Thu, 6 Oct 2022 15:50:54 -0400
From: Rich Felker <>
Subject: Re: Re: MT fork and key_lock in pthread_key_create.c

On Thu, Oct 06, 2022 at 03:20:42PM -0400, Rich Felker wrote:
> On Thu, Oct 06, 2022 at 10:02:11AM +0300, Alexey Izbyshev wrote:
> > On 2022-10-06 09:37, Alexey Izbyshev wrote:
> > >Hi,
> > >
> > >I noticed that fork() doesn't take key_lock that is used to protect
> > >the global table of thread-specific keys. I couldn't find mentions of
> > >this lock in the MT fork discussion in the mailing list archive. Was
> > >this lock overlooked?
> > >
> > >Also, I looked at how __aio_atfork() handles a similar case with
> > >maplock, and it seems wrong. It takes the read lock and then simply
> > >unlocks it both in the parent and in the child. But if there were
> > >other holders of the read lock at the time of fork(), the lock won't
> > >end up in the unlocked state in the child. It should probably be
> > >completely nulled-out in the child instead.
> > >
> > Looking at aio further, I don't understand how it's supposed to work
> > with MT fork at all. __aio_atfork() is called in _Fork() when the
> > allocator locks are already held. Meanwhile another thread could be
> > stuck in __aio_get_queue() holding maplock in exclusive mode while
> > trying to allocate, resulting in deadlock.
>
> Indeed, this is messy and I don't think it makes sense to be doing
> this at all. The child is just going to throw away the state so the
> parent shouldn't need to synchronize at all, but if we walk the
> multi-level map[] table in the child after async fork, it's possible
> that the contents seen are inconsistent, even that the pointers are
> only half-written or something.
>
> I see a few possible solutions:
>
> 1. Just set map = 0 in the child and leak the memory. This is not
>    going to matter unless you're doing multiple generations of fork
>    with aio anyway.
> 2. The same, but be a little bit smarter. pthread_rwlock_tryrdlock in
>    the child, and if it succeeds, we know the map is consistent so we
>    can just zero it out the same as now. Still "leaks" but only on
>    contention to expand the map.
> 3. Getting a little smarter still: move the __aio_atfork for the
>    parent side from _Fork to fork, outside of the critical section
>    where malloc lock is held. Then proceed as in (2). Now, the
>    tryrdlock is guaranteed to succeed in the child. Leak is only
>    possible when _Fork is used (in which case the child context is an
>    async signal one, and thus calling any aio_* that would allocate
>    map[] again is UB -- note that in this case, the only reason we
>    have to do anything at all in the child is to prevent close from
>    interacting with aio).
>
> After writing them out, 3 seems like the right choice.

Proposed patch attached.

View attachment "aio_atfork.diff" of type "text/plain" (2551 bytes)
