musl - Re: Resuming work on new semaphore

Message-ID: <20150424150341.GP17573@brightrain.aerifal.cx>
Date: Fri, 24 Apr 2015 11:03:41 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Resuming work on new semaphore

On Fri, Apr 24, 2015 at 01:23:27PM +0300, Alexander Monakov wrote:
> On Thu, 23 Apr 2015, Rich Felker wrote:
> > Perhaps this can be patched up by saturating sem_getvalue's result? In
> > the case where the overflow happens it's transient, right? I think
> > that means discounting the overflow would be valid. But I'll need to
> > think about it more...
> 
> Hm, can't agree here.  This whole line of discussion stems from an attempt
> to align timedwait/trywait/getvalue behavior in light of dead waiters, which
> are indistinguishable from preempted waiters.

I don't think dead waiters are a solvable problem with this design,
but they're a minor problem until you hit overflow.

> If the "it's transient" claim can be made, it can also be used as a reason
> not to modify getvalue to look at val[1].

No, because you can interrupt a waiter with a signal handler and the
"transient" state becomes something you can synchronize with and
observe and thus no longer transient. That was the motivation for
needing to count the pending wakes.

> > With that said, my inclination right now is that we should hold off on
> > trying to commit the new semaphore for this release cycle. I've been
> > aiming for this month and just about everything else is in order for
> > release, but the semaphore rabbit-hole keeps going deeper and I think
> > we need to work through this properly. I hope that's not too much of a
> > disappointment.
> 
> Ack; thankfully I don't feel disappointment in this case, this discussion has
> been quite entertaining.  When I proposed my modification I felt it was very
> intuitive.  What I did not grasp back then is that the definition of a waiter
> is not solid.
> 
> How do you interpret the following?
> 
> 1. Semaphore initialized to 0. There's only one thread.
> 2. alarm(1)
> 3. sem_wait()
> .... (in SIGALRM handler)
>     4. sem_post()
>     5. sem_getvalue()
> 
> May getvalue be 0 here?  At step 4, can the thread possibly "be a waiter"
> on the semaphore?

Here steps 4 and 5 are undefined behavior (calling async-signal-unsafe
functions from a signal handler). But you can achieve the same with
another thread observing entry to the signal handler in a valid way
(e.g. via posting of a second sem from the signal handler).
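
A minimal sketch of that valid variant, in C, not taken from the thread:
the handler only calls sem_post (which POSIX lists as async-signal-safe)
on a second semaphore, and a separate observer thread performs steps 4
and 5. The names sem_a, sem_b, observer_done and run_scenario are
hypothetical, and the spin on an atomic flag (assumed lock-free) is an
assumption added here to keep the waiter inside the handler while the
observer looks.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <unistd.h>

static sem_t sem_a;               /* semaphore under discussion, starts at 0 */
static sem_t sem_b;               /* second sem: "handler has been entered" */
static atomic_int observer_done;  /* set once the observer has looked */
static int observed = -1;         /* value seen by sem_getvalue (step 5) */

static void on_alarm(int sig)
{
    (void)sig;
    sem_post(&sem_b);             /* the only sem call made in the handler */
    /* Assumption: hold the waiter in the handler until the observer is done. */
    while (!atomic_load(&observer_done))
        ;
}

static void *observer(void *arg)
{
    (void)arg;
    while (sem_wait(&sem_b) == -1 && errno == EINTR)
        ;
    sem_post(&sem_a);                 /* step 4, done from a normal thread */
    sem_getvalue(&sem_a, &observed);  /* step 5 */
    atomic_store(&observer_done, 1);
    return NULL;
}

/* Runs the scenario once and returns the value the observer saw. */
int run_scenario(void)
{
    sigset_t set;
    struct sigaction sa;
    pthread_t tid;
    int rc;

    rc = sem_init(&sem_a, 0, 0);
    assert(rc == 0);
    rc = sem_init(&sem_b, 0, 0);
    assert(rc == 0);
    atomic_store(&observer_done, 0);

    /* Create the observer with SIGALRM blocked so only the waiter gets it. */
    sigemptyset(&set);
    sigaddset(&set, SIGALRM);
    pthread_sigmask(SIG_BLOCK, &set, NULL);
    rc = pthread_create(&tid, NULL, observer, NULL);
    assert(rc == 0);
    pthread_sigmask(SIG_UNBLOCK, &set, NULL);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGALRM, &sa, NULL);
    assert(rc == 0);

    alarm(1);                         /* step 2 */
    while (sem_wait(&sem_a) == -1 && errno == EINTR)
        ;                             /* step 3, retried if interrupted */

    pthread_join(tid, NULL);
    sem_destroy(&sem_a);
    sem_destroy(&sem_b);
    return observed;
}
```

Which value run_scenario() returns depends on the implementation; the
question in this thread is whether both 0 and 1 are acceptable results.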

With that problem solved, I think it's valid at this point to observe
a value of 0 or 1. But if 0 is observed, sem_trywait would have to
fail, and sem_wait or sem_timedwait could return only in the case of
an error. This is why returning 0 does not seem to be practical -- I
don't know a way to let the existing suspended waiter take the wake
without allowing new waiters to steal it (and thus expose
inconsistency).

Rich
