Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Sat, 8 Dec 2018 21:51:40 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: sem_wait and EINTR

On Thu, Dec 06, 2018 at 06:33:37PM +0100, Markus Wichmann wrote:
> Hi all,
> 
> is the attached patch acceptable? A word about the bitfields: I
> generally dislike them for most things, but I didn't want to destroy the
> alignment struct __libc had going on, and these other flags really only
> are 0 or 1.
> 
> Patch is untested for want of an old kernel.
> 
> Ciao,
> Markus

> >From 816abf54fbbb02923331b69b62333b8a0edb4181 Mon Sep 17 00:00:00 2001
> From: Markus Wichmann <nullplan@....net>
> Date: Thu, 6 Dec 2018 18:30:26 +0100
> Subject: [PATCH 3/3] Add workaround for old linux bug.
> 
> ---
>  src/internal/libc.h        | 4 +---
>  src/signal/sigaction.c     | 2 ++
>  src/thread/sem_timedwait.c | 5 ++++-
>  3 files changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/src/internal/libc.h b/src/internal/libc.h
> index ac97dc7e..2e6737d9 100644
> --- a/src/internal/libc.h
> +++ b/src/internal/libc.h
> @@ -18,9 +18,7 @@ struct tls_module {
>  };
>  
>  struct __libc {
> -	int can_do_threads;
> -	int threaded;
> -	int secure;
> +	unsigned can_do_threads:1, threaded:1, secure:1, handling_sigs:1;
>  	volatile int threads_minus_1;
>  	size_t *auxv;
>  	struct tls_module *tls_head;
> diff --git a/src/signal/sigaction.c b/src/signal/sigaction.c
> index af47195e..5f02b99b 100644
> --- a/src/signal/sigaction.c
> +++ b/src/signal/sigaction.c
> @@ -43,6 +43,8 @@ int __libc_sigaction(int sig, const struct sigaction *restrict sa, struct sigact
>  					SIGPT_SET, 0, _NSIG/8);
>  				unmask_done = 1;
>  			}
> +			if (!(sa->sa_flags & SA_RESTART))
> +				libc.handling_sigs = 1;
>  		}
>  		/* Changing the disposition of SIGABRT to anything but
>  		 * SIG_DFL requires a lock, so that it cannot be changed
> diff --git a/src/thread/sem_timedwait.c b/src/thread/sem_timedwait.c
> index 8132eb1b..e76ae9de 100644
> --- a/src/thread/sem_timedwait.c
> +++ b/src/thread/sem_timedwait.c
> @@ -22,7 +22,10 @@ int sem_timedwait(sem_t *restrict sem, const struct timespec *restrict at)
>  		pthread_cleanup_push(cleanup, (void *)(sem->__val+1));
>  		r = __timedwait_cp(sem->__val, -1, CLOCK_REALTIME, at, sem->__val[2]);
>  		pthread_cleanup_pop(1);
> -		if (r && r != EINTR) {
> +                /* Linux pre-2.6.22 bug: Sometimes SYS_futex returns with EINTR when it should not.
> +                 * Workaround: Retry on EINTR unless someone installed handlers before.
> +                 */
> +		if (r && (r != EINTR || libc.handling_sigs)) {
>  			errno = r;
>  			return -1;
>  		}
> -- 
> 2.19.1
> 

Conceptually the logic looks correct, but I'm trying to phase out the
__libc structure, and more importantly, introducing a bitfield here is
not safe unless access to all bitfield members is controlled by a
common mutex or other synchronization mechanism. There is a data race
if any access to the other fields is accessed when sigaction is
callable from another thread, and the impact of a data race on any of
these is critical.

Unfortunately, even without the bitfield, there is a data race between
access to the new flag from sem_timedwait and from sigaction. In
theory, this potentially results in a race window where sem_wait
retries on EINTR because it doesn't yet see the updated handling_sigs.

In reality, I don't think that happens -- at least not on kernels
without the bug -- because the store to handling_sigs is sequenced
before sigaction, which is sequenced before the EINTR-generating
signal delivery, which is sequenced before the load in sig_timedwait,
with memory barriers (necessarily from an implementation standpoint)
imposed by whatever kernel mechanism delivers the signal. However
there is still a race between concurrent/unsequenced sigaction calls.

The "right" fix would be to use an AS-safe lock to protect the
handling_sigs object, or make it atomic (a_store on the sigaction
side, a_barrier before load on the sem side). Alternatively we could
assume the above reasoning about sequencing of sigaction before EINTR
and just use a lock on the sigaction side, but the atomic is probably
simplest and "safer".

A couple other small nits: comments especially should not exceed 80
columns (preferably <75 or so) assuming tab width 8; code should avoid
it too but I see it slightly does in a few places there. Spaces should
not be used for indention. And the commit message should reflect the
change made; it's not adding a workaround, but reducing the scope of a
previous workaround (which should be cited by commit id) to fix a
conformance bug. The name "handling_sigs" is also a bit misleading;
what it's indicating is that interrupting signal handlers are or have
been present.

I'm happy to fix up these issues and commit if there aren't mistakes
in the above comments that still need to be addressed.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.