Date: Wed, 19 Feb 2020 09:26:30 +0100
From: Sebastian Gottschall <s.gottschall@...wrt.com>
To: musl@...ts.openwall.com
Subject: Re: race condition in sem_wait


On 19.02.2020 at 04:39, Rich Felker wrote:
> On Wed, Feb 19, 2020 at 01:46:34AM +0100, Sebastian Gottschall wrote:
>> Hello
>>
>> I recently discovered a race condition while playing with threads
>> and sem_wait/sem_post.
>> sem_wait may fail with errno set to EAGAIN, which is not valid since
>> only sem_trywait is able to set that errno code.
>> This was causing a bug with a later select() and accept(), which
>> failed since accept does not work if errno is set to EAGAIN.
>> From my point of view the bug is in sem_timedwait.c:
>>
>>          if (!sem_trywait(sem)) return 0;
>>
>>          int spins = 100;
>>          while (spins-- && sem->__val[0] <= 0 && !sem->__val[1]) a_spin();
>>
>>          while (sem_trywait(sem)) {
>>
>>
>> The first sem_trywait fails with -1 and sets EAGAIN, but the
>> second sem_trywait does not fail and returns 0. The problem now
>> is that errno is still set and has not been reset.
>> This may happen if sem_post is called from a second thread on the
>> same semaphore.
>> Of course the same bug affects sem_timedwait itself.
>> So I assume sem_wait is not thread-safe, which is bad and does not
>> follow the POSIX specification.
>>
>> Or am I wrong here?
> errno is only meaningful on failure; unless specified otherwise (a few
> functions are special because you can't [easily] distinguish success
> from failure for them without examining errno), any standard function
> may have changed the value of errno when it returns with success. The
> only thing it's not allowed to do is clear it (set it to 0).
The problem is that the POSIX manual explicitly specifies that EAGAIN
cannot be returned by sem_wait, and in my code sample

the following happens:

sem_wait(semaphore)
select(....)
socket = accept(....)  -> fails

accept fails because sem_wait set errno to EAGAIN, and accept will
fail if errno is set to EAGAIN.
I use sem_wait to limit the number of threads in my webserver. In the
thread itself I call sem_post.
But to make it work correctly I have to set errno=0 before calling accept,
since accept will not work if errno is set to EAGAIN.
If you read the POSIX man page for accept, you will find that accept
reads errno unconditionally, and this is also the case for the musl
implementation.
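
The sem_trywait/sem_wait sequence discussed in this thread can be reduced
to a small single-threaded sketch. It is only an illustration, not the
webserver code: the helper name wait_after_failed_trywait is hypothetical,
and a single thread calling sem_post itself stands in for the second
thread. It shows a failed sem_trywait leaving EAGAIN in errno, followed by
a sem_wait that succeeds, so the return value can be compared against
whatever errno holds afterwards.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>

/* Hypothetical helper for illustration only; not part of musl or of the
 * program described in this message.
 * Returns the result of sem_wait() (0 on success) and stores the errno
 * value observed right after that call in *seen_errno. */
int wait_after_failed_trywait(int *seen_errno)
{
    sem_t sem;
    int rc;

    if (sem_init(&sem, 0, 0) != 0)      /* unshared semaphore, count 0 */
        return -1;

    /* Count is 0, so this fails with -1 and sets errno to EAGAIN. */
    rc = sem_trywait(&sem);
    assert(rc == -1 && errno == EAGAIN);

    sem_post(&sem);                     /* count becomes 1 */

    /* Count is 1, so this returns 0 without blocking. */
    rc = sem_wait(&sem);
    *seen_errno = errno;                /* leftover value, possibly EAGAIN */

    sem_destroy(&sem);
    return rc;
}
```

Calling the helper and comparing its return value with *seen_errno shows
which of the two actually signals the outcome of sem_wait. The same
comparison can be made for accept by testing its return value for -1
before looking at errno.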


Sebastian

>
> Rich
>
