|   | 
| 
 | 
Message-ID: <1407566854.4988.231.camel@eris.loria.fr>
Date: Sat, 09 Aug 2014 08:47:34 +0200
From: Jens Gustedt <jens.gustedt@...ia.fr>
To: musl@...ts.openwall.com
Subject: Re: Explaining cond var destroy [Re: C threads, v3.0]
Hello,
Am Freitag, den 08.08.2014, 16:48 -0400 schrieb Rich Felker:
> On Fri, Aug 08, 2014 at 03:14:06PM -0400, Rich Felker wrote:
> > I think I may have a solution you'll like:
> > 
> > We can perform the release of the lock via a compare-and-swap rather
> > than a simple swap. In this way, we can know before releasing the lock
> > whether it's going to require a wake or not:
> > 
> > - If waiters was zero and the cas from owned/uncontended to zero
> >   succeeds, no futex wake operation is needed.
> > 
> > - If waiters was nonzero, or if the cas fails (thereby instead
> >   requiring a cas from owned/contended to zero), we can do the
> >   following:
> > 
> > Don't use a userspace CAS to release; this would allow the lock to be
> > acquired by another thread, released, destroyed, and freed before the
> > futex wake is performed. Instead, use FUTEX_WAKE_OP to atomically
> > perform the atomic assignment and futex wake.
> 
> FUTEX_WAKE_OP is highly under-documented, and i'm worried it might be
> unsupported on some archs (since the atomics for it have to be
> implemented on a per-arch basis in the kernel) but of course we can
> just fallback on archs where it's not supported yet.
> 
> Anyway, the behavior seems to be:
> 
> - Futex acquisition for uaddr1 and uaddr2 both happen prior to the
>   atomic operation, and this hold locks that seem to prevent new
>   waiters on the futex(es). This should preclude any risk of waking a
>   new waiter that arrives after the atomic operation, as desired.
> 
> - Both uaddr1 and uaddr2 are hashed, with no check for equality. This
>   is a fairly costly wasteful operation, but could be fixed on the
>   kernel side. At present I suspect they don't care because
>   FUTEX_WAKE_OP is considered unnecessary, but if I raise it on the
>   glibc bug tracker thread for issue 13690 as a solution to the
>   problem, I think there would be a lot more interest in optimizing
>   this kernel path.
> 
> - After the atomic operation is performed, a wake is always performed
>   on uaddr1 (based on the previous acquisition); this fact is omitted
>   from all the documentation, but it's obviously intentional since
>   otherwise the uaddr1 argument would not be used for anything but
>   wasting time. The wake on uaddr2 is conditional on a comparison.
> 
> - No allocation is required anywhere in the operation, so we don't
>   have to worry about lost actions on OOM. For plain FUTEX_WAKE this
>   would not have been an issue (if acquirin the futex required memory,
>   then failure for FUTEX_WAKE to acquire it would mean there was no
>   FUTEX_WAIT taking place anyway), but for FUTEX_WAKE_OP, failure
>   would omit the atomic operation, which must take place even if there
>   are no current FUTEX_WAIT waiters (e.g. if the FUTEX_WAIT was
>   interrupted by a signal handler).
> 
> Based on the above, I think it's safe to move forward with using
> FUTEX_WAKE_OP. It seems optimal to me to use uaddr1==uaddr2 and a
> comparison that always yields false, so that the wake only goes to
> uaddr1. This will allow the kernel to optimize out double-hashing in
> the future by checking for uaddr1==uaddr2, and already optimizes out
> the double-iteration of the hash bucket for waking purposes.
> 
> Any further thoughts on the matter? I think we should finish the
> private futex support task before starting on this, so that we don't
> do new work that's going to conflict with a pending patch.
This looks promissing, but I yet don't know enough about these less
common futex operations to comment more on it.
Generally I think that the control structures should be as tight as
possible, give provable properties in the mathematical sense. The
interaction between user- and kernelland should be minimal, and we
shouldn't provoque reactions of the kernel that concern threads (or
even process) that are not really targetted. 
Jens
PS: I will be a bit less available in the next days.
-- 
:: INRIA Nancy Grand Est ::: AlGorille ::: ICube/ICPS :::
:: ::::::::::::::: office Strasbourg : +33 368854536   ::
:: :::::::::::::::::::::: gsm France : +33 651400183   ::
:: ::::::::::::::: gsm international : +49 15737185122 ::
:: http://icube-icps.unistra.fr/index.php/Jens_Gustedt ::
Download attachment "signature.asc" of type "application/pgp-signature" (199 bytes)
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.