musl - Re: C11 threads

Message-ID: <20140726022453.GH4038@brightrain.aerifal.cx>
Date: Fri, 25 Jul 2014 22:24:53 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: C11 threads

On Sat, Jul 26, 2014 at 01:26:09AM +0200, Jens Gustedt wrote:
> On Friday, 25.07.2014, at 18:19 -0400, Rich Felker wrote:
> > Since glibc is going that way, I think we should keep the type sizes
> > the same but leave it explicitly undefined to mix calls to C11 and
> > POSIX functions on the same synchronization objects.
> 
> Yes, definitely. Not even for the synchronization functions, basically
> none from pthread and C11 thread should be mixed, I think. I thought
> of figuring out a way to make this even a link error.

What do you mean? If you meant that calling C11 thread functions and
POSIX thread functions in the same program should be an error, I think
that's very wrong. The next issue of POSIX will be aligned with C11,
so both sets of interfaces will exist. Even if not for that, though,
it's just wrong conceptually to exclude the use of both. For example a
library written to ISO C could be using threads (C11 threads)
internally as an implementation detail, and this should not break a
caller which is using POSIX threads.
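
To make the library case concrete, here is a minimal sketch, assuming an
implementation that ships both <threads.h> and <pthread.h>. The names
lib_sum, sum_worker, posix_caller and call_from_pthread are hypothetical
and exist only for illustration. The "library" uses a C11 thread
internally, the caller is a POSIX thread, and no synchronization object
is shared between the two APIs:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <threads.h>

/* Hypothetical ISO C library: sums an array on a C11 helper thread. */
struct sum_job {
    const int *data;
    int n;
    long result;
};

static int sum_worker(void *arg)
{
    struct sum_job *job = arg;
    long total = 0;
    for (int i = 0; i < job->n; i++)
        total += job->data[i];
    job->result = total;
    return 0;
}

long lib_sum(const int *data, int n)
{
    struct sum_job job = { data, n, 0 };
    thrd_t t;

    assert(data != NULL && n >= 0);
    if (thrd_create(&t, sum_worker, &job) != thrd_success) {
        /* Could not start a helper: compute in the calling thread. */
        sum_worker(&job);
        return job.result;
    }
    thrd_join(t, NULL);
    return job.result;
}

/* Hypothetical application code: a POSIX thread calling the library. */
static void *posix_caller(void *arg)
{
    static const int values[] = { 1, 2, 3, 4 };
    *(long *)arg = lib_sum(values, 4);
    return NULL;
}

long call_from_pthread(void)
{
    long out = -1;
    pthread_t p;

    if (pthread_create(&p, NULL, posix_caller, &out) != 0)
        return -1;
    pthread_join(p, NULL);
    return out;
}
```

The caller never sees a thrd_t and the library never sees a pthread_t,
so the C11 usage stays an implementation detail.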

> > Could you elaborate? Perhaps we're thinking of different things.
> 
> probably.
> 
> The C11 specification is so bad that some of the functions don't even
> have a semantic description of what they are supposed to achieve. E.g.
> tss_t completely lacks a description of what a destructor is, what
> the "iterations" in TSS_DTOR_ITERATIONS describe, etc. Parts of it are
> or will be addressed in a TC, but there is much more. I wrote my findings up
> here
> 
> http://gustedt.wordpress.com/2012/10/14/c11-defects-c-threads-are-not-realizable-with-posix-threads/

Thanks -- I'll check it out.

> > One aspect is that all POSIX synchronization functions are full
> > barriers, whereas presumably the C11 ones are proper acquire/release
> > barriers as appropriate for the operation being performed.
> 
> That is probably the intention. But e.g. mtx_lock is only required to
> synchronize with calls to mtx_unlock, not with other calls to mtx_lock
> or similar. The term "lock" isn't even defined.

...

> > Another is Austin Group issue #755, which I'm hoping the WG14 will not
> > rule the same way on. The Austin Group's resolution for POSIX mutexes
> > makes recursive and error-checking mutexes very expensive and requires
> > threads to maintain a record of all such mutexes they own. I have not
> > yet implemented this in musl but I plan to do so soon (BTW this needs
> > to be added to the roadmap).
> 
> I am not sure that I completely understand what this is about. But in
> any case, C threads don't have error checking :)

It's about what happens when a thread exits while holding a recursive
or error-checking mutex. If the ownership of that mutex is tracked by a
thread id stored in the mutex (this is the only practical way to do
it), a newly created thread could wrongly become the owner of the
orphaned mutex just by getting the same thread id (by chance). The
only implementation options to avoid this are to have thread ids so
large that values never have to be reused, or to track the list of
mutexes owned by a thread so that it can change the owner to a dummy
value that will never match when it exits.
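
The thread id reuse problem can be shown with a toy model. This is only
a sketch under loud assumptions: struct toy_rmutex, toy_lock,
toy_orphan and the TOY_* constants are hypothetical names, the model is
single-threaded (a real mutex would use atomic operations), and it does
not reflect musl's actual mutex layout:

```c
#include <assert.h>

/* Toy recursive mutex: ownership is a thread id stored in the mutex. */
struct toy_rmutex {
    int owner;  /* thread id of the owner, 0 when unlocked */
    int count;  /* recursion depth */
};

#define TOY_OK          0
#define TOY_BUSY        1
#define TOY_DEAD_OWNER (-1)  /* dummy id that no live thread ever has */

/* Try to lock on behalf of thread `tid` (never blocks in this model). */
int toy_lock(struct toy_rmutex *m, int tid)
{
    assert(tid > 0);
    if (m->owner == 0) {
        m->owner = tid;
        m->count = 1;
        return TOY_OK;
    }
    if (m->owner == tid) {
        /* Looks like a recursive lock by the same thread. */
        m->count++;
        return TOY_OK;
    }
    return TOY_BUSY;
}

/*
 * What an exiting thread would do for each mutex on its "owned" list:
 * replace the owner with a value that can never match a later thread.
 */
void toy_orphan(struct toy_rmutex *m)
{
    m->owner = TOY_DEAD_OWNER;
}
```

If thread 42 locks the mutex and exits, a later thread that happens to
be assigned id 42 gets TOY_OK from toy_lock even though it never locked
the mutex. After toy_orphan, the same call returns TOY_BUSY, which is
the permanently-locked behavior the standard requires.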

The obvious way to avoid this problem would be to add to the
specification:

"If a thread exits while it is the owner of a mutex, the behavior is
undefined."

Unfortunately the Austin Group did not want to do this. I'm hoping
someone can raise the issue with WG14 and that they'll decide
differently for C11 threads, so that C11 recursive mutexes can be more
efficient.

For POSIX mutexes, we're basically going to have to treat recursive
and error-checking mutexes as robust mutexes to meet the requirements
of the standard, despite these requirements only affecting broken
programs that leave mutexes locked (and thus permanently locked) when
a thread exits.

> > > The advantage that I see is that this doesn't need all that attr
> > > stuff and has no concept of cancellation.
> > 
> > Cancellation is rather irrelevant to the synchronization primitives.
> > Obviously C11's lack of cancellation makes C11 threads much easier to
> > implement if you're not also doing POSIX threads, but if you're doing
> > both, the fact that C11 lacks a cancellation function makes almost no
> > difference (just one simple cleanup function in condvar wait).
> 
> There is a bit more, maybe. __timedwait does cleanup push and pop and
> messes around with cancelbuf. (But probably I just don't understand
> enough what's going on, here.)

That's just to avoid having two different versions of __timedwait.
It's used for condvar wait, join, and a few other operations which are
cancellable. For mutexes the cancellation feature of __timedwait is
not used.

> In any case, without attributes and possible cancellation, thrd_create
> becomes significantly shorter than pthread_create.

The vast majority of the code in pthread_create is setting up the
stack, TLS, POSIX TSD, and the contents of the __pthread structure
(called TCB on other implementations). Attributes are a pretty small
part.

> And in my limited experience, having well-defined atomics that are
> integrated in the language often helps to completely avoid mutexes
> and conditions.

I'm not sure about that. Atomics are mostly useful for the situations
where spinlocks would suffice. They don't help anywhere you would
expect a "wait" operation to happen (e.g. waiting for a queue to
become non-empty or non-full).
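
As an illustration of the queue case, here is a minimal sketch of a
bounded queue whose push and pop block on condition variables. The
names struct bqueue, bq_init, bq_push, bq_pop and QCAP are hypothetical.
With atomics alone, the two while loops would have to spin instead of
sleeping:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

#define QCAP 4  /* arbitrary capacity for the sketch */

struct bqueue {
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
    pthread_cond_t not_full;
    int items[QCAP];
    int head;  /* index of the oldest item */
    int len;   /* number of items stored */
};

void bq_init(struct bqueue *q)
{
    assert(q != NULL);
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
    q->head = 0;
    q->len = 0;
}

/* Blocks (sleeping, not spinning) while the queue is full. */
void bq_push(struct bqueue *q, int v)
{
    pthread_mutex_lock(&q->lock);
    while (q->len == QCAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->items[(q->head + q->len) % QCAP] = v;
    q->len++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Blocks (sleeping, not spinning) while the queue is empty. */
int bq_pop(struct bqueue *q)
{
    int v;

    pthread_mutex_lock(&q->lock);
    while (q->len == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    v = q->items[q->head];
    q->head = (q->head + 1) % QCAP;
    q->len--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return v;
}
```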

> > > > and definitely allows smaller size if the application
> > > > only uses them (but of course makes libc.a and libc.so larger).
> > > 
> > > If we use weak aliases, this basically blows up the symbol table a
> > > bit.
> > > 
> > > When I strip my test executable, it is impressively small.
> > 
> > Yes, this is one reason I don't want __-prefixed symbols for
> > everything, just functions that really _need_ namespace-safe versions.
> > We should probably do some linking tests against all the C11 functions
> > to make sure they're not pulling in functions outside the C11
> > namespace and add __-prefixed versions of those functions only, rather
> > than preemptively trying to guess what needs protection.
> 
> I tried to do that already.

OK.

> > > > > For the types this should be easy to have them typedef'ed to some
> > > > > opaque struct.
> > > > 
> > > > For the types that vary per-arch, I don't see any way around having
> > > > them in the alltypes.h.in bits.
> > > 
> > > The only types that seem to be in bits are mutex and condition. All
> > > others seem to be arch independent.
> > 
> > Ah yes the others are probably not needed for C11.
> > 
> > > In the long run we should probably have __pthread names in the alltypes.h
> > > files and typedef these in pthread.h and threads.h respectively.
> > 
> > That doesn't work because both sys/types.h and pthread.h have to
> > expose the pthread names.
> > 
> > > > As for the constants, which ones are
> > > > you talking about?
> > > 
> > > the mtx_ constants are the critical ones.
> > > 
> > > thrd_error and friends can easily be mapped to EXXX error codes from
> > > errno.h. Since these names are all reserved anyway, including errno.h
> > > shouldn't do much harm.
> > 
> > I don't think including errno.h implicitly is permitted. E* is only
reserved when errno.h is included. Anyway this is somewhere we want
> > to agree with glibc on the values; I suspect they'll just use values
> > 1,2,... rather than errno codes. If so, that means we really need a
> > wrapper function rather than just an alias.
> 
> Hm, not sure that I follow.
> 
> I only need EBUSY, EINVAL, ENOMEM, and ETIMEDOUT, and effectively only
> that these are consistent with the rest of the C library, which for
> this implementation of C threads will always be musl.

The point of ABI compatibility is that (at this point just some)
binaries and (more importantly) shared libraries without source that
were built/linked against glibc can be used with musl. But for this to
work, the values of the constants need to be the same.
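
If the result codes end up being small integers rather than errno
values, the translation a wrapper would need is roughly the following.
This is only a sketch: the EX_* names and their numeric values are
hypothetical placeholders, not the values glibc or musl actually use,
and ex_from_errno is an illustrative helper name:

```c
#include <assert.h>  /* only needed for assert-style checks by callers */
#include <errno.h>

/* Placeholder result codes; the real ABI values are not decided here. */
enum ex_result {
    EX_SUCCESS = 0,
    EX_BUSY,
    EX_ERROR,
    EX_NOMEM,
    EX_TIMEDOUT
};

/* Map an errno-style return value to a fixed, arch-independent code. */
int ex_from_errno(int e)
{
    switch (e) {
    case 0:
        return EX_SUCCESS;
    case EBUSY:
        return EX_BUSY;
    case ENOMEM:
        return EX_NOMEM;
    case ETIMEDOUT:
        return EX_TIMEDOUT;
    default:
        /* EINVAL and anything unexpected collapse to a generic error. */
        return EX_ERROR;
    }
}
```

Because the E* values vary per-arch while the placeholder codes do not,
a plain alias to the POSIX function could not return the right values;
some wrapper of this shape would be required.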

> > > > I don't think it's so easy to directly use the
> > > > POSIX ones for C11 since they have some different semantics (flag bits
> > > > vs enumeration-style) for a few.
> > > 
> > > the mtx_ ones are basically flags, just as the PTHREAD_MUTEX_ ones.
> > > This fits well with the actual values of these in musl.
> > 
> > The PTHREAD_MUTEX_* ones are not flags; each corresponds to a single
> > type and they are not combinable.
> 
> ah, ok, anyhow the only one that I am really interested in is
> PTHREAD_MUTEX_RECURSIVE. So perhaps I can push that down to the
> implementation.

mtx_init needs to be a wrapper for pthread_mutex_init rather than an
alias anyway, since mtx_init takes an integer and pthread_mutex_init
takes a pointer to an attribute object.
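
Roughly, such a wrapper could look like the sketch below. It is not
musl's implementation: wr_mtx_init and the WR_* constants are
hypothetical stand-ins for mtx_init and the mtx_* constants, with
made-up flag values, and the mutex is assumed to be a pthread_mutex_t
underneath:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

/* Hypothetical combinable flags standing in for the mtx_* constants. */
enum { WR_MTX_PLAIN = 0, WR_MTX_RECURSIVE = 1, WR_MTX_TIMED = 2 };
/* Hypothetical result codes standing in for thrd_success/thrd_error. */
enum { WR_SUCCESS = 0, WR_ERROR = 1 };

/* Translate an integer type argument into a POSIX attribute object. */
int wr_mtx_init(pthread_mutex_t *m, int type)
{
    pthread_mutexattr_t attr;
    int kind;
    int r;

    assert(m != NULL);
    if (type & ~(WR_MTX_RECURSIVE | WR_MTX_TIMED))
        return WR_ERROR;  /* unknown bits in the type argument */

    /* The "timed" bit needs no setup; only "recursive" changes the type. */
    kind = (type & WR_MTX_RECURSIVE) ? PTHREAD_MUTEX_RECURSIVE
                                     : PTHREAD_MUTEX_NORMAL;

    if (pthread_mutexattr_init(&attr) != 0)
        return WR_ERROR;
    r = pthread_mutexattr_settype(&attr, kind);
    if (r == 0)
        r = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return r == 0 ? WR_SUCCESS : WR_ERROR;
}
```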

> > The C11 ones are combinable, but mtx_timed is useless
> 
> yes, I think I already noted that in the file that I attached to my
> initial mail
> 
> > -- there's no need to care whether it will be
> > used for timed operations when it's initialized.
> > 
> > BTW n1570 only has 4 possible values to pass to mtx_init, but reads
> > (7.26.4.2 p2):
> > 
> > "The mtx_init function creates a mutex object with properties
> > indicated by type, which must have one of the six values:"
> > 
> > Any idea if six is just a mistake or if there are some values missing
> > from the list?
> 
> It is just a leftover, and has been or will be corrected.
> 
> To summarize, I'd need to get
> 
> EBUSY, EINVAL, ENOMEM, ETIMEDOUT and TSS_DTOR_ITERATIONS

TSS_DTOR_ITERATIONS can just be defined to whatever the right value is
-- IIRC we use the minimum POSIX requires. It doesn't need to
magically sync with something else. If we ever need to change it we
can change both.

Obviously if the error values are used directly, duplicating them in
another header is more trouble since they vary per-arch. This is part
of why I would actually prefer not to use them for the thread function
result codes, but which we do will depend on which way glibc does it.
I can check in with them and see if they have a plan yet.

Rich