musl - Re: C threads, v. 6.2

Message-ID: <20140831003034.GN12888@brightrain.aerifal.cx>
Date: Sat, 30 Aug 2014 20:30:34 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: C threads, v. 6.2

On Sat, Aug 30, 2014 at 09:43:36AM +0200, Jens Gustedt wrote:
> Am Samstag, den 30.08.2014, 01:30 -0400 schrieb Rich Felker:
> > On Fri, Aug 29, 2014 at 09:01:11PM +0200, Jens Gustedt wrote:
> > Unless the intent is to permanently have namespace violations, mtx_t
> > must be defined at some point such that it does not have
> > pthread_mutex_t as its C++ ABI "struct tag". It could have mtx_t
> > (because the specific name is reserved), or something like __mtx_t
> > (with a name in a general reserved namespace). This requires being a
> > different type from pthread_mutex_t.
> 
> Are we master of that decision, or do we have to coordinate that with
> other C libraries?

In principle we need to coordinate. However, I'm going to strongly
recommend using a struct with no tag (in which case the C++ pseudo-tag
becomes mtx_t) since any other solution has the C aliasing problems
we've discussed (a struct with a tag cannot be the same type as a
struct with no tag, can it?).

> > > There is basically one base choice to make:
> > > 
> > >  - we decide if pthread_mutex_t and mtx_t are seen as two different
> > >    types or not for any application that includes both headers
> > 
> > This is not a choice; it's mandated by the fact that our
> > pthread_mutex_t has a "struct tag" (in C++) that's a namespace
> > violation for use as the tag for mtx_t.
> 
> We could probably also find a trick that has us clean on the C side,
> and have namespace violation just in a C++ context :)

Yes, actually I thought of this option too, but I'd rather not get us
stuck with something ugly like that. :)

> > However, by the C rules, they're only "different types" when they're
> > both visible in the same translation unit. To a translation unit where
> > only one is visible, since the typedef name is not actually part of
> > the type, just an alias, both are structures without tags, and the one
> > that is visible is _the same type_ as whichever one it needs to be to
> > make the code correct.
> > 
> > I don't see any problem if an application has both types visible in
> > one of its TUs, since no "aliasing" takes place on the app side. The
> > tagless structure "struct { union {...} __u; }" (whichever instance of
> > it) is simply zero-initialized on the application TU side. On the
> > implementation side, functions like pthread_mutex_trylock access a
> > tagless structure "struct { union {...} __u; }", of which they have
> > only one defined: the one referenced by the pthread_mutex_t typedef.
> 
> As I said, the current C thread implementation needs a thorough
> revision to be sure that none of the TUs sees both types. I'll look
> into that.
> 
> > > (This should be made independent of the question if we silently use
> > > the same hidden type, or similar structured type, under the hood.)
> > > 
> > > For C this choice is not so relevant, since all interfaces are just
> > > pointers to struct, so they are interchangeable, and this helps for the
> > > implementation.
> > > 
> > > For C++ this is not the same because "type" for them means *typename*,
> > > which in addition is determined in some subtle and not so
> > > obvious way.
> > > 
> > > For backward compatibility, the C++ ABI seems to dictate that there
> > > must be at least one such type that is called pthread_mutex_t. So we
> > > have to keep the type with that typename for them, it is as simple as
> > > that.
> > > 
> > > Now in a C++ context that choice above boils down to the question
> > > 
> > >   - is mtx_t a typedef to pthread_mutex_t or is it a proper type?
> > > 
> > > If we want it to be a proper type (for which I would argue, I think)
> > > we have to think of ways to make C++ believe that the two types are
> > > different, even if we use the same implementation underneath.
> > 
> > Yes, because of the namespace, C++ has to believe the types are
> > different. But the (C) implementation of the functions is not subject
> > to C++ rules about types; it's not C++ code. Thus I think everything
> > is fine.
> > 
> > If you really still think there's a problem, I still have one trick
> > I've mentioned before that makes it a 100% non-problem: never using
> > the pthread_mutex_t or mtx_t type at all internally, but instead using
> > the type of their first member. I believe I could make this work with
> > only a few lines of source-level changes, no change to the output
> > code, and minimal ugliness. Let me know if you still have doubts
> > whether the above analysis I gave is correct, and if so, I'll give my
> > trick a try.
> 
> So let us talk through this. I suppose the main change that you would
> do for that is to change the accessor macros such that they have the
> additional indirection. I can see that this would easily work for the
> pthread TU.

No, if we did it that way, it would still be a potential aliasing
violation. For example, suppose m has type pthread_mutex_t* but
actually points to an mtx_t. Then m->[anything] is accessing *m with
the wrong effective type.

Instead, the argument name would be changed from "m" to "m0"
everywhere, and the first line of the function would be:

struct __mutex *const m = (struct __mutex *)m0;

This is legal because the first member of both pthread_mutex_t and
mtx_t would be a struct of type struct __mutex, and it's always valid
to convert back and forth between a pointer to a structure and a
pointer to its first member. So the argument m0 would essentially be
used like a void* to convey to the mutex functions a pointer to
the struct __mutex rather than the containing object.

Then the rest of the function body would be essentially identical to
what it is now, except we would either use m0 again, or cast back,
when calling other mutex functions.
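
A minimal sketch of that pattern, for illustration only. All names here
(struct demo_mutex, demo_pthread_mutex_t, demo_mtx_t and the demo_*
functions) are hypothetical stand-ins, not musl's actual definitions,
and the "lock" is a single-threaded toy with no atomics:

```c
/* Hypothetical shared inner type; stands in for struct __mutex. */
struct demo_mutex { int lock; };

/* Two distinct tagless structs whose first member has the same type. */
typedef struct { struct demo_mutex m; } demo_pthread_mutex_t;
typedef struct { struct demo_mutex m; } demo_mtx_t;

/* Returns 0 on success, 1 if the toy lock is already held. */
int demo_mutex_trylock(demo_pthread_mutex_t *m0)
{
    /* Only the first member is accessed; *m0 itself is never touched. */
    struct demo_mutex *const m = (struct demo_mutex *)m0;
    if (m->lock) return 1;
    m->lock = 1;
    return 0;
}

int demo_mutex_unlock(demo_pthread_mutex_t *m0)
{
    struct demo_mutex *const m = (struct demo_mutex *)m0;
    m->lock = 0;
    return 0;
}

/* C11-side wrapper: cast the pointer and pass it straight through. */
int demo_mtx_trylock(demo_mtx_t *mtx)
{
    return demo_mutex_trylock((demo_pthread_mutex_t *)mtx);
}
```

Here m0 only carries an address; every access goes through m, whose
type matches the first member of whichever containing object was
actually passed in.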

> For the C thread TU, what would be the mechanics for them to call one
> of the (aliased) pthread functions?

With my alternate solution just described, simply including the normal
pthread header and casting the pointer when making the call would be
fully legal.

With the approach we previously discussed, where we have to ensure
that no TU that accesses the contents of a mutex or cv structure can
see both the C11 and POSIX versions, the C11 TUs would have to contain
prototypes for the aliased POSIX functions like:

int __pthread_mutex_lock(mtx_t *);

Note that this is a perfectly correct prototype because mtx_t is just
this TU's typedef name for the tagless "struct { union { ... } __u; }"
that it's using, which is "the same type" as pthread_mutex_lock.c's
pthread_mutex_t.
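
The flip side, that two such tagless structs are distinct types once
both are visible in a single TU, can be checked with C11 _Generic. This
is only a sketch: the demo_* names and the member layout are made up
for illustration and are not musl's real definitions. _Generic rejects
association lists containing compatible types, so the fact that this
compiles at all shows the two typedefs name incompatible types here:

```c
/* Hypothetical stand-ins with identical layout, both visible in one TU. */
typedef struct { union { int i[10]; void *p[5]; } u; } demo_pthread_mutex_t;
typedef struct { union { int i[10]; void *p[5]; } u; } demo_mtx_t;

/* Selects 1 or 2 depending on which pointer type is passed. */
#define DEMO_KIND(ptr) _Generic((ptr), \
    demo_pthread_mutex_t *: 1, \
    demo_mtx_t *: 2, \
    default: 0)

int demo_kind_pthread(void)
{
    demo_pthread_mutex_t m = { { { 0 } } };
    return DEMO_KIND(&m);
}

int demo_kind_mtx(void)
{
    demo_mtx_t m = { { { 0 } } };
    return DEMO_KIND(&m);
}
```

Across separate TUs the compatibility rules (C11 6.2.7) compare tags and
members instead, which is why a TU that sees only one of the typedefs
can declare the prototype above.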

Rich