Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Sat, 11 Feb 2023 14:49:51 -0500
From: Rich Felker <dalias@...c.org>
To: Alexey Izbyshev <izbyshev@...ras.ru>
Cc: musl@...ts.openwall.com
Subject: Re: [PATCH] mq_notify: fix close/recv race on failure path

On Sat, Feb 11, 2023 at 10:28:20PM +0300, Alexey Izbyshev wrote:
> On 2023-02-11 21:35, Rich Felker wrote:
> >On Sat, Feb 11, 2023 at 09:08:53PM +0300, Alexey Izbyshev wrote:
> >>On 2023-02-11 20:59, Rich Felker wrote:
> >>>On Sat, Feb 11, 2023 at 08:50:15PM +0300, Alexey Izbyshev wrote:
> >>>>On 2023-02-11 20:13, Markus Wichmann wrote:
> >>>>>On Sat, Feb 11, 2023 at 10:06:03AM -0500, Rich Felker wrote:
> >>>>>>--- a/src/thread/pthread_detach.c
> >>>>>>+++ b/src/thread/pthread_detach.c
> >>>>>>@@ -5,8 +5,12 @@ static int __pthread_detach(pthread_t t)
> >>>>>> {
> >>>>>> 	/* If the cas fails, detach state is either already-detached
> >>>>>> 	 * or exiting/exited, and pthread_join will trap or cleanup. */
> >>>>>>-	if (a_cas(&t->detach_state, DT_JOINABLE, DT_DETACHED) !=
> >>>>>>DT_JOINABLE)
> >>>>>>+	if (a_cas(&t->detach_state, DT_JOINABLE, DT_DETACHED) !=
> >>>>>>DT_JOINABLE) {
> >>>>>>+		int cs;
> >>>>>>+		__pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &cs);
> >>>>>> 		return __pthread_join(t, 0);
> >>>>>                ^^^^^^ I think you forgot to rework this.
> >>>>>>+		__pthread_setcancelstate(cs, 0);
> >>>>>>+	}
> >>>>>> 	return 0;
> >>>>>> }
> >>>>>>
> >>>>>
> >>>>>I see no other obvious missteps, though.
> >>>>>
> >>>>Same here, apart from this and misspelled "pthred_detach" in the
> >>>>commit message, the patches look good to me.
> >>>>
> >>>>Regarding the POSIX requirement to run sigev_notify_function in the
> >>>>context of a detached thread, while it's possible to observe the
> >>>>wrong detachstate for a short while via pthread_getattr_np after
> >>>>these patches, I'm not sure there is a standard way to do that. Even
> >>>>if it exists, this minor issue may be not worth caring about.
> >>>
> >>>Would this just be if the notification callback executes before
> >>>mq_notify returns in the parent?
> >>
> >>Yes, it seems so.
> >>
> >>>I suppose we could have the newly
> >>>created thread do the work of making the syscall, handling the error
> >>>case, detaching itself on success and and reporting back to the
> >>>mq_notify function whether it succeeded or failed via the
> >>>semaphore/args structure. Thoughts on that?
> >>>
> >>Could we just move pthread_detach call to the worker thread to the
> >>point after pthread_cleanup_pop?
> >
> >I thought that sounded dubious, in that it might lead to an attempt to
> >join a detached thread, but maybe it's safe to assume recv will never
> >return if the mq_notify syscall failed...?
> >
> Actually, because app signals are not blocked when the worker thread
> is created, recv can indeed return early with EINTR. But this looks
> like just a bug.

Yes. While it's not a conformance bug to run with signals unblocked
("The signal mask of this thread is implementation-defined.") it's a
functional bug to ever introduce threads that don't block all
application signals, since these interfere with sigwait & other
application control of where signals are delivered. This is an
oversight. I'll make it mask all signals.

> Otherwise, mq_notify already assumes that recv can't return before
> SYS_mq_notify (if it did, the syscall would try to register a closed
> fd). I haven't tried to prove it (e.g. maybe recv may need to
> allocate something before blocking and hence can fail with ENOMEM?),
> but if it's true, I don't see how a failed SYS_mq_notify could cause
> recv to return, so joining a detached thread should be impossible if
> we make pthread_detach follow recv.

I'm thinking for now maybe we should just drop the joining on error,
and leave it starting out detached. While recv should not fail, it's
obviously possible to make it fail in a seccomp sandbox, and you don't
want that to turn into UB inside the implementation. If it does fail,
the thread should still exit, but we have no way to synchronize with
the mq_notify parent to decide whether it's being joined or not in
this case without extra sync machinery...

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.