Date: Fri, 24 Apr 2015 11:03:41 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Resuming work on new semaphore

On Fri, Apr 24, 2015 at 01:23:27PM +0300, Alexander Monakov wrote:
> On Thu, 23 Apr 2015, Rich Felker wrote:
> > Perhaps this can be patched up by saturating sem_getvalue's result? In
> > the case where the overflow happens it's transient, right? I think
> > that means discounting the overflow would be valid. But I'll need to
> > think about it more...
> 
> Hm, can't agree here.  This whole line of discussion stems from attempt to
> align timedwait/trywait/getvalue behavior in light of dead waiters, which are
> indistinguishable from preempted waiters.

I don't think dead waiters are a solvable problem with this design,
but they're a minor problem until you hit overflow.

> If "it's transient" claim can be
> made, it also can be used as a reason not to modify getvalue to look at val[1].

No, because you can interrupt a waiter with a signal handler, and the
"transient" state then becomes something you can synchronize with and
observe, so it is no longer transient. That was the motivation for
needing to count the pending wakes.

> > With that said, my inclination right now is that we should hold off on
> > trying to commit the new semaphore for this release cycle. I've been
> > aiming for this month and just about everything else is in order for
> > release, but the semaphore rabbit-hole keeps going deeper and I think
> > we need to work through this properly. I hope that's not too much of a
> > disappointment.
> 
> Ack; thankfully I don't feel disappointment in this case, this discussion has
> been quite entertaining.  When I proposed my modification I felt it was very
> intuitive.  What I did not grasp back then is that the definition of a waiter
> is not solid.
> 
> How do you interpret the following?
> 
> 1. Semaphore initialized to 0. There's only one thread.
> 2. alarm(1)
> 3. sem_wait()
> .... (in SIGALRM handler)
>     4. sem_post()
>     5. sem_getvalue()
> 
> May getvalue be 0 here?  At step 4, can the thread possibly "be a waiter"
> on the semaphore?

Here step 5 is UB (sem_getvalue is not async-signal-safe, so it cannot
be called from a signal handler; sem_post is). But you can achieve the
same with another thread observing entry to the signal handler in a
valid way (e.g. via posting of a second sem from the signal handler).

With that problem solved, I think it's valid at this point to observe
a value of 0 or 1. But if 0 is observed, sem_trywait would have to
fail, and sem_wait or sem_timedwait could return only in the case of
an error. This is why returning 0 does not seem to be practical -- I
don't know a way to let the existing suspended waiter take the wake
without allowing new waiters to steal it (and thus expose
inconsistency).

Rich
