Date: Wed, 13 Aug 2014 00:11:09 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: bug in pthread_cond_broadcast

On Tue, Aug 12, 2014 at 07:33:10PM -0400, Rich Felker wrote:
> One potential solution I have in mind is to get rid of this complex
> waiter accounting by:
> 
> 1. Having pthread_cond_broadcast set the new-waiter state on the mutex
>    when requeuing, so that the next unlock forces a futex wake that
>    otherwise might not happen.
> 
> 2. Having pthread_cond_timedwait set the new-waiter state on the mutex
>    after relocking it, either unconditionally (this would be rather
>    expensive) or on some condition. One possible condition would be to
>    keep a counter in the condvar object for the number of waiters that
>    have been requeued, incrementing it by the number requeued at
>    broadcast time and decrementing it on each wake. However the latter
>    requires accessing the cond var memory in the return path for wait,
>    and I don't see any good way around this. Maybe there's a way to
>    use memory on the waiters' stacks?

On further consideration, I don't think this works. If a thread other
than one of the cv waiters happened to get the mutex first, it would
fail to set the new-waiter state again at unlock time, and the waiters
could get stuck never waking up.

So I think it's really necessary to move the waiter count to the
mutex.

One way to do this with no synchronization cost at signal time would
be to have waiters increment the mutex waiter count before waiting on
the cv futex. Of course, this could yield a lot of spurious futex wake
syscalls for the mutex if other threads are locking and unlocking the
mutex before the signal occurs.
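
To make that cost concrete, here is a rough toy model in C. It is only
a sketch: the struct and function names are hypothetical and are not
musl's internals, and the "wake" is just a counter standing in for a
futex wake syscall. It assumes unlock must issue a wake whenever the
mutex waiter count is nonzero.

```c
#include <assert.h>

/* Toy model only: these names are hypothetical, not musl internals. */
struct toy_mutex {
    int waiters;     /* count that the unlock path consults */
    int wake_calls;  /* stand-in for futex wake syscalls issued */
};

/* A cv waiter registers on the mutex before sleeping on the cv futex. */
void cv_wait_begin(struct toy_mutex *m)
{
    m->waiters++;
}

/* Unlock issues a wake whenever the waiter count is nonzero. */
void toy_unlock(struct toy_mutex *m)
{
    assert(m->waiters >= 0);
    if (m->waiters > 0)
        m->wake_calls++;
}

/* Wakes issued by unrelated lock/unlock cycles before any signal occurs. */
int spurious_wakes(int cv_waiters, int unlock_cycles)
{
    struct toy_mutex m = {0, 0};
    for (int i = 0; i < cv_waiters; i++)
        cv_wait_begin(&m);
    for (int i = 0; i < unlock_cycles; i++)
        toy_unlock(&m);
    return m.wake_calls;
}
```

In this model, spurious_wakes(3, 5) counts five wake calls, one per
unrelated unlock, and none of them can find a thread actually sleeping
on the mutex futex, since all three waiters are still on the cv futex.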

I think the other way you proposed is in some ways ideal, but also
possibly unachievable. While the broadcasting thread can know how many
threads it requeued, the requeued threads seem to have no way of
knowing that they were requeued after the futex wait returns. Even
after they were successfully requeued, the futex wait could return
with a timeout or EINTR or similar, in which case there seems to be no
way for the waiter to know whether it needs to decrement the mutex
waiter count. I don't see any solution to this problem...
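
The ambiguity can be shown with another toy model in C. Again, this is
only a sketch under stated assumptions: the names are hypothetical, and
the assumption is that the waiter observes nothing but the futex wait
result, while whether it was requeued stays hidden from it. Any
decrement policy based on the result alone then miscounts in one of
the two cases.

```c
#include <assert.h>

/* What the waiter can observe when the futex wait returns. */
enum wait_result { WOKEN, TIMED_OUT };

/* Hypothetical policy: decide whether to decrement from the result alone. */
typedef int (*policy_fn)(enum wait_result);

int always_decrement(enum wait_result r) { (void)r; return 1; }
int never_on_timeout(enum wait_result r) { return r == WOKEN; }

/*
 * Leftover mutex waiter count after one waiter returns; 0 is correct.
 * was_requeued is ground truth that the waiter itself cannot see.
 */
int leftover_count(int was_requeued, enum wait_result r, policy_fn policy)
{
    assert(policy);
    int mutex_waiters = 0;
    if (was_requeued)
        mutex_waiters++;   /* broadcaster accounted for this waiter */
    if (policy(r))
        mutex_waiters--;   /* waiter decides from r only */
    return mutex_waiters;
}
```

For a timeout, always_decrement leaves the count at -1 when the waiter
was never requeued, and never_on_timeout leaves it at 1 when the waiter
was requeued. Both cases present the same TIMED_OUT result to the
waiter.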

Rich
