musl - Re: trouble spots for atomic access

Follow @Openwall on Twitter for new release announcements and other news

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20150519220722.GO17573@brightrain.aerifal.cx>
Date: Tue, 19 May 2015 18:07:22 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: trouble spots for atomic access

On Tue, May 19, 2015 at 03:57:00PM +0200, Jens Gustedt wrote:
> Hello,
> by forcing the compiler to detect consistency checks for 
> atomics as I mentioned earlier, I detected 5 trouble spots. The first
> four are relatively clear:
> 
>  - a_and and a_or interfaces on i386 and friends are not consistent
>    with the remaining archs. They have `volatile void*` for the
>    arguments and then do a gratuitous cast to `int*`. As far as I can
>    see just using `volatile int*` as for the other archs works fine.

Indeed, this is purely an oversight. The motivation was probably
originally being able to operate on mismatching types (valid since the
actual access happens via asm) but this isn't used anyway. I'll fix it.xs

>  - pthread_once_t should always be volatile
>  - pthread_spinlock_t should always be volatile

These are C++ ABI changes. I'd like to make the change but I'm
concerned it might break things.

>  - pthread_barrier needs atomic increment

Where?

> The fifth troubles me a bit. It concerns __timedwait and
> __timedwait_cp. These both are mostly used with a first argument addr
> that is atomic. This makes sense, since addr then is passed to a call
> to futex, which internally might do some atomic operations. Now there

The volatile here is merely to make it legal to pass pointers to
volatile objects. The only access happens from kernelspace where the
volatility of the object in userspace is not visible.

> is one call that doesn't pass something that is otherwise seen as
> atomic, namely line 14 in pthread_join.c. It reads as
> 
> 	while ((tmp = t->tid)) __timedwait_cp(&t->tid, tmp, 0, 0, 0);
> 
> So is the task id here to be seen as atomic, or not? Will updates to
> that field that are not atomic (and maybe optimized in some sort) be
> able to mess up the futex call?

The only write to t->tid is by the kernel; it writes a zero and futex
wakes when the thread exits. No writes whatsoever happen before the
thread exits, so I don't see any way this code could give rise to an
early return for pthread_join (which would be very dangerous).
However, it seems plausible that the compiler could load t->tid twice,
see the running thread's tid the first time but zero the second time,
and thereby call __timedwait_cp with a value of 0, resulting in a wait
operation that never returns.

I don't know a really good solution to this. The vast majority of uses
of tid are self->tid, from which perspective tid is not volatile or
even mutable. We don't want to pessimize them with volatile access.
Probably the best approach is to split it out into separate tid and
exit_futex fields.

Rich

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.