Date: Thu, 6 Dec 2018 11:23:36 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: sem_wait and EINTR

On Thu, Dec 06, 2018 at 04:57:56PM +0100, Markus Wichmann wrote:
> On Wed, Dec 05, 2018 at 10:17:56PM -0500, Rich Felker wrote:
> > I'd like it if we could avoid the pre-linux-2.6.22 bug of spurious
> > EINTR from SYS_futex, but I don't see any way to do so except possibly
> > wrapping all signal handlers and implementing restart-vs-EINTR
> > ourselves. So if we need to change this, it might just be a case where
> > we say "well, sorry, your kernel is broken" if someone is using a
> > broken kernel.
> > 
> > Thoughts?
> > 
> > Rich
> 
> I really don't know what you are getting at, here. In the hypothetical
> case you detected an EINTR return without a signal having been handled,
> you could just retry the syscall. The problem is getting that
> information in the first place.

See the commit c0ed5a201b2bdb6d1896064bec0020c9973db0a1 which
introduced the EINTR suppression, deliberately:

    per POSIX, the EINTR condition is an optional error for these
    functions, not a mandatory one. since old kernels (pre-2.6.22) failed
    to honor SA_RESTART for the futex syscall, it's dangerous to trust
    EINTR from the kernel. thankfully POSIX offers an easy way out.

(Ignore the apparently wrong claim about POSIX.)

The concern is that perfectly correct programs can use sem_wait
without a retry loop if they do not install interrupting signal
handlers (and most programs refrain from doing that, because it's
awful). However, if run on an old kernel (<2.6.22), these correct
programs would wrongly make forward progress without finishing the
sem_wait.
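
For illustration, this is the kind of caller-side retry loop that a
program would need if sem_wait could fail with EINTR, and that the
programs described above legitimately omit. It is a minimal sketch;
the helper name sem_wait_retry is hypothetical and does not come from
musl or from this thread.

```c
#include <errno.h>
#include <semaphore.h>

/* Hypothetical helper: block on the semaphore, retrying whenever the
 * wait is interrupted. A caller that skips this loop and ignores the
 * return value would carry on without holding the semaphore if
 * sem_wait ever returned -1 with errno set to EINTR. */
static int sem_wait_retry(sem_t *sem)
{
    int r;
    do {
        r = sem_wait(sem);
    } while (r == -1 && errno == EINTR);
    return r; /* 0 on success, -1 with errno set to something other than EINTR */
}
```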

One ugly hack that might be worth doing is simply tracking whether any
signal handler has ever been installed without SA_RESTART, and keeping
the retry-on-EINTR logic for as long as none has. Retrying under such
conditions could not break conformance, and it would preserve safety on
old kernels for programs which don't use interrupting signals at all.
It would not preserve the safety of *all possible* programs on such
kernels, since a program could install interrupting signal handlers but
leave the corresponding signals blocked in all threads that use
sem_wait, but I suspect that's a much less likely scenario.

> Practically, I see a lot of work for little gain. Wrapping all signal
> handlers means we need to save up to _NSIG function pointers. Access to
> those doesn't need serialization any more than sigaction() does. Though,
> what does it mean if someone changes the signal handler while we are in
> the wrapper?

This is not an actual proposal at this time (although the need has
been considered for other reasons at various times, which is why I'm
familiar with the concept). It was just a statement that I don't think
the problem can be worked around without such an extreme measure.

> Speaking of calls that shouldn't fail but do: Is futex_wake() affected
> by the same bug?

It shouldn't be, since it shouldn't enter any interruptible sleep.

Rich
