musl - Re: Explaining cond var destroy [Re: C threads, v3.0]

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Fri, 8 Aug 2014 12:53:21 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Explaining cond var destroy [Re: C threads, v3.0]

On Fri, Aug 08, 2014 at 11:20:27AM +0200, Jens Gustedt wrote:
> Hello,
> 
> Am Donnerstag, den 07.08.2014, 13:25 -0400 schrieb Rich Felker:
> > Most futex operations (including wait) access an object for read. Some
> > rarely-used ones write to an object too. Wake does neither. Wake is
> > purely an operation on the address/futex-waitqueue-object. This
> > property is not just a side effect but absolutely necessary in order
> > for futex to be useful, since you _always_ perform the futex wake
> > after releasing (and thereby potentially allowing destruction) of the
> > object.
> 
> I agree with almost all what you are saying, in particular the
> facts, ;) and your statement is a good summary of what I learned with
> our discussion.
> 
> I don't agree with your appreciation of these facts, though.
> 
> In particular the "(and thereby potentially allowing destruction)"
> bothers me a lot, and I don't think that *doing* this with control
> structures is desirable.

Do you have a reason it's undesirable? The only reason I've seen so
far is that it causes spurious wakes and makes the return values of
futex calls unusable. I agree it would be nice if the return values
were usable, but the benefit to having them seems to be limited to
making small optimizations in the implementation of synchronization
primitives. On the other hand, the cost to make such optimizations
(added synchronization to preclude sending a wake to an invalid
address) seems to be much higher, and essentially infinite (i.e.
fundamentally possible without moving the whole lock to kernelspace
and incurring a syscall for each operation) for process-shared
objects. So even if it is desirable, buying it seems like a HUGE net
loss.

> I think it is possible to have control structures that guarantee that
> no wake (or any other futex operation) is done on an address that
> isn't a valid address of a life object.

So far the way you've proposed to do this is using dynamic allocation.
This incurs additional synchronization cost (in malloc) for
creation/destruction, incurs a complexity cost (in no longer being
fail-safe) for both the implementation and the application, and it's
not even possible for process-shared objects because there is
fundamentally no way to allocate shared memory which is accessible
only by the exact set of processes which have access to the original
object. Any attempt to do so would either have false negatives (some
processes that nominally have access to the object would fail to
operate on it) or false positives (malicious processes would
potentially have access to operate on the object).

> I also think if only done for
> non-shared structures (such as for C threads) that such a design can
> be relatively simple, and with much less implicit state than what the
> current musl implementation does, in particular for pthread_cond_t.

Making mutexes heavy to make cond vars lighter seems like a net loss.
Cond vars are supposed to be a heavy primitive (typical case is
waiting) and mutexes a light one (typical case is immediately
acquiring the lock).

I would still like to explore ways to improve cond vars without
resorting to allocation. One potential idea (but I don't see a way to
make it work with cond vars yet) is to try something like the barriers
implementation, where the actual synchronization object lies on the
stack of the first waiter. Obviously this is only for
non-process-shared objects (barriers do something else more costly in
the process-shared case). The need for both signal and broadcast
semantics complicates it somewhat though; with such a design it might
be necessary for broadcast to simply keep signaling until the original
list is empty rather than using a single broadcast wake.

> Such an implementation should by design avoid late wakes from the
> kernel for addresses that are recycled for new control objects. (It
> can't inhibit them since other userspace tools might still have the
> late wake problem.)
> 
> I am currently debugging such a C thread implementation, and I'll come
> back to you once I have something to show.

I'm happy to see what you come up with, but I doubt it's right for
musl.

Rich
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.