Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 24 Sep 2020 12:14:07 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Bug with priority inheritance and condition variables

On Thu, Sep 24, 2020 at 03:58:17PM +0100, Edward Scott wrote:
> Hello,
> 
> There appears to be a bug when using priority inheritance in combination
> with condition variables. I have some code that reproduces the bug:
> 
> https://github.com/edward-scott/musl-prio-inherit-cv-bug
> 
> Using git bisect I traced the origin of the bug to this commit:
> 
> https://git.musl-libc.org/cgit/musl/commit/?id=54ca677983d47529bab8752315ac1a2b49888870
> 
> which is the commit that is described as "implement priority inheritance
> mutexes".
> 
> From my analysis it appears that _m_waiters is used by the
> priority inheritance logic to maintain some state (as described in the
> commit message) but that conflicts with some use of _m_waiters in the
> condition variable implementation.

I think this is entirely correct analysis. Thanks for catching this!

> The consequence is that pthread_mutex_lock erroneously returns EDEADLK.

OK, it took me a second to understand this part, because I thought it
would be ENOTRECOVERABLE, but that's only for robust+PI mutexes.
EDEADLK seems to be a consequence of succeeding but returning EBUSY,
which is "wrong" but should only be able to happen with inconsistent
state, as produced by pthread_cond_timedwait.

> I don't understand the code well enough to produce a fix.

I'll take a look. I'd like to just drop adusting the waiters count
here and instead set the bit-31 may-have-waiters flag here, but I'm
not sure that's right for all mutex types. It certainly can be made to
do that just on PI mutexes if needed but having fewer special cases is
preferable.

> The demo code (a cut version of some production code) will reproduce the
> failure. Commenting out the pthread_mutexattr_setprotocol call in
> the iot_mutex_init function at the end of the thread.c file will cause the
> code to work as intended (without priority inheritance). The code works
> fine either way with the GNU lib.
> 
> BTW can I recommend that the "magic numbers"  used to represent mutex modes
> be replaced at some point with defined constants as it would make the code
> much easier to follow.

Yes, it's been something I kinda wanted to do, but that would have
obfuscated and cluttered the actual changes in development when it was
being done. It might be time to go back and add some now that this
code is mature.

> This is my first post to this list so I hope this message is on the right
> list and is helpful.

Yep, this is fine. Thanks again!

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.