Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Sat, 22 Nov 2014 21:52:15 +0100
From: Joakim Sindholt <opensource@...sha.com>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] Add stdatomic.h for clang>=3.1 and gcc>=4.1

On Sun, 2014-11-09 at 18:11 +0100, Jens Gustedt wrote:
> Hello,
> interesting idea!
> 
> I am not at all sure how far we should go with such a support, because
> if the compiler is claiming to be C11 (which much of them do with
> -std=c11) and we don't provide incomplete atomics support, the
> application programmer will have difficulties dealing with the cases
> and have to do feature tests which are not trivial at all.
> 
> The __STD_NO_ATOMICS__ macro is the only feature test macro that is
> foreseen by the standard and it leaves no choice: either you have
> *all* atomics support (_Atomic qualifier, _Atomic type specifier,
> operations on atomic types, atomic type generic functions) or *none*.
> 
> I would propose some rules that should be respected by such a header
> file:
> 
>  - all syntax that this supports should be correct C11 syntax, but it
>    may be a restricted version, that is not all C11 might be
>    supported. In such a case the compilation should fail.
> 
>  - all C11 syntax that was not in C99 and that compiles with this
>    header should compile to correct executables with provable atomic
>    sematic as defined by C11
> 
>  - all produced object code must be ABI compatible with the newer
>    versions of the same compiler

Agreed.

I'm sorry it's taken so long to respond but I didn't want to do it from
my phone and I haven't been home in a while.

> Your file doesn't provide support for atomics for the non-basic
> types, but which the C11 standard requests? How to deal with them? Do
> you plan to implement libatomic.a ? I think that would be a good idea,
> because the gcc implementation is bogus and relies on
> pthread_mutex_t instead of atomic_flag.

I'm not sure what you mean here? Does __atomic_stuff in GCC not get
translated into assembly level atomics? I just disassembled a program
compiled with this revised version of stdatomic.h and GCC 4.7 and I see
no special calls and plenty of atomic asm.
And I do wish I had the time to go and implement atomic functions for
libc.a but I'm not really sure how to do it right now.

> Would we'd have to mimic the different interfaces on a per compiler
> base?

As it stands right now clang has their __c11_atomic stuff and gcc has
__atomic which doesn't mimic C11 exactly but close enough. Other
compilers will probably provide one or the other but the problem here is
obviously that we might end up trampling on a compiler's own
implementation.
There are 2 reasons I propose this be included in musl at some point
after rigorous testing:
1) clang provides full C11 atomic support as of 3.1 but doesn't ship
stdatomic.h until 3.6, which is not even released yet.
2) It would be ideal if we were to provide stdatomic.h support for
compilers that do not have it built in.
Number 2 has a few problems as it requires generic function-like macros
and I cannot see how to implement it properly without compiler-specific
features. That might be alright too since we can at least support old
GCC.

> Generally, at least for a start, I think it would be a good idea to
> separate the support for different compilers, and in particular don't
> claim support for a compiler which you didn't even test :)

In my defense, I did say it was only for personal testing purposes. I'm
not insane enough to propose committing this to musl without testing.

> Having something for earlier versions of clang would already be a good
> start.
> 
> Some more comments in the text, but I didn't do a complete review,
> yet, less did I do tests.
> 
> Jens
> 
> Am Sonntag, den 09.11.2014, 13:53 +0100 schrieb Joakim Sindholt:
> > GCC and Clang ship their own stdatomic.h headers but not for all the
> > versions mentioned above. They ship many other standard headers too that
> > musl also ships (stdbool/def/int/...).
> > 
> > In the end musl should be a complete libc with all standard headers and
> > this is one of them. It is up to the individual programmer to ensure
> > that his or her compiler is properly supported by for example checking
> > __STD_NO_ATOMICS__ and __STDC_VERSION__ >= 201112L.
> 
> As said above checking these would have to give the assertion that the
> platform supports all atomics, library and compiler, which we
> shouldn't give lighthearted
> 
> > For clang the support is complete for versions >= 3.1. GCC added the
> > __atomic functions in 4.7 but not _Atomic until 4.9 so from 4.1 it will
> > work correctly so long as you don't use VLAs together with things like
> > the ++ operator in arguments and is subject to the same limitations as
> > 4.7.
> 
> I don't follow here. Could you explain a bit more?
> 
> I have the impression that your patch can't work at all with gcc,
> since it is missing _Atomic and you need the typedef's with _Atomic in
> them. Then, what is the connection with VLA? and the ++ operator?

GCC was nice enough to add all the __atomic stuff sans _Atomic for 4.7.
The _Atomic keyword didn't come along until 4.9. So basically you can
trivially implement the function-like macros for 4.7 but you won't have
4.7.

> > From 4.7 it will work as inteded so long as you only use the
> > function-like-macros and from 4.9 it will have full support.
> 
> I don't think the functionlike macros will work for arbitrary types

GCC and Clang both implement the header this way.

> > This patch has only been tested with clang. The __typeof__ magic is to
> > avoid changes to alltypes. It should be changed before committing this
> > to musl. This patch is only for people who want to test themselves.
> 
> I would be easier to read if you would provide the changes in
> alltypes, as well. Since we are using git, it should not be a problem
> to keep all these changes to different files together in one patch.

I know but dalias suggested making it independent of the rest of musl so
people could just drop it into include/ and get testing. I originally
did it with a heap of changes to alltypes.h and if committed it would be
with those changes in place.

> > This work is released to the Public Domain.
> > ---
> >  include/stdatomic.h | 287 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 287 insertions(+)
> >  create mode 100644 include/stdatomic.h
> > 
> > diff --git a/include/stdatomic.h b/include/stdatomic.h
> > new file mode 100644
> > index 0000000..3a61e47
> > --- /dev/null
> > +++ b/include/stdatomic.h
> > @@ -0,0 +1,287 @@
> > +#ifndef _STDATOMIC_H
> > +#define _STDATOMIC_H
> > +
> > +#ifndef __has_extension
> > +#define __has_extension(x) 0
> > +#define __musl_has_extension 1
> > +#endif
> > +
> > +#if defined(__clang__) && __has_extension(c_atomic)
> > +
> > +#define kill_dependency(y) (y)
> > +
> > +#define ATOMIC_VAR_INIT(value) (value)
> > +#define atomic_init(obj, value) __c11_atomic_init(obj, value)
> > +
> > +#define atomic_is_lock_free(obj) __c11_atomic_is_lock_free(sizeof(obj))
> > +#define __atomic_type_is_lock_free(type) \
> > +    __c11_atomic_is_lock_free(sizeof(type))
> > +
> > +#define atomic_thread_fence(order) __c11_atomic_thread_fence(order)
> > +#define atomic_signal_fence(order) __c11_atomic_signal_fence(order)
> > +
> > +#define atomic_store_explicit(object, desired, order) \
> > +    __c11_atomic_store(object, desired, order)
> > +#define atomic_load_explicit(object, order) \
> > +    __c11_atomic_load(object, order)
> > +
> > +#define atomic_exchange_explicit(object, value, order) \
> > +    __c11_atomic_exchange(object, value, order)
> > +#define atomic_compare_exchange_strong_explicit(object, expected, desired, \
> > +                                                success, failure) \
> > +    __c11_atomic_compare_exchange_strong(object, expected, desired, \
> > +                                         success, failure)
> > +#define atomic_compare_exchange_weak_explicit(object, expected, desired, \
> > +                                              success, failure) \
> > +    __c11_atomic_compare_exchange_weak(object, expected, desired, \
> > +                                       success, failure)
> > +
> > +#define atomic_fetch_add_explicit(object, operand, order) \
> > +    __c11_atomic_fetch_add(object, operand, order)
> > +#define atomic_fetch_sub_explicit(object, operand, order) \
> > +    __c11_atomic_fetch_sub(object, operand, order)
> > +#define atomic_fetch_or_explicit(object, operand, order) \
> > +    __c11_atomic_fetch_or(object, operand, order)
> > +#define atomic_fetch_xor_explicit(object, operand, order) \
> > +    __c11_atomic_fetch_xor(object, operand, order)
> > +#define atomic_fetch_and_explicit(object, operand, order) \
> > +    __c11_atomic_fetch_and(object, operand, order)
> > +
> > +#elif (__GNUC__ == 4 && __GNUC_MINOR__ >= 7) || __GNUC__ > 4
> > +
> > +#define ATOMIC_VAR_INIT(value) (value)
> > +#define atomic_init(obj, value) do { *(obj) = (value); } while (0)
> > +
> > +#if (__GNUC__ == 4 && __GNUC_MINOR__ >= 9) || __GNUC__ > 4
> > +#define kill_dependency(y) __extension__({__auto_type __y = (y); __y;})
> > +
> > +#define atomic_is_lock_free(obj) \
> > +    __extension__({ \
> > +        __auto_type __obj = (obj); \
> > +        __atomic_is_lock_free(sizeof(*__obj), __obj); \
> > +    })
> > +#else
> > +#define __NEED__Atomic
> > +
> > +#define kill_dependency(y) (y)
> > +
> > +#define atomic_is_lock_free(obj) \
> > +    __atomic_is_lock_free(sizeof(*(obj)), (void *)0)
> > +#endif
> > +
> > +#define __atomic_type_is_lock_free(type) \
> > +    __atomic_always_lock_free(sizeof(type), (void *)0)
> > +
> > +#define atomic_thread_fence(order) __atomic_thread_fence(order)
> > +#define atomic_signal_fence(order) __atomic_signal_fence(order)
> > +
> > +#define atomic_store_explicit(object, value, order) \
> > +    __atomic_store_n(object, value, order)
> > +#define atomic_load_explicit(object, order) \
> > +    __atomic_load_n(object, order)
> > +
> > +#define atomic_exchange_explicit(object, desired, order) \
> > +    __atomic_exchange_n(object, desired, order)
> > +#define atomic_compare_exchange_strong_explicit(object, expected, desired, \
> > +                                                success, failure) \
> > +    __atomic_compare_exchange_n(object, expected, desired, 0, success, failure)
> > +#define atomic_compare_exchange_weak_explicit(object, expected, desired, \
> > +                                              success, failure) \
> > +    __atomic_compare_exchange_n(object, expected, desired, 1, success, failure)
> > +
> > +#define atomic_fetch_add_explicit(object, operand, order) \
> > +    __atomic_fetch_add(object, operand, order)
> > +#define atomic_fetch_sub_explicit(object, operand, order) \
> > +    __atomic_fetch_sub(object, operand, order)
> > +#define atomic_fetch_or_explicit(object, operand, order) \
> > +    __atomic_fetch_or(object, operand, order)
> > +#define atomic_fetch_xor_explicit(object, operand, order) \
> > +    __atomic_fetch_xor(object, operand, order)
> > +#define atomic_fetch_and_explicit(object, operand, order) \
> > +    __atomic_fetch_and(object, operand, order)
> > +
> > +#elif __GNUC__ == 4 && __GNUC_MINOR__ >= 1
> > +
> > +#define __NEED__Atomic
> > +
> > +#define kill_dependency(y) (y)
> > +
> > +#define ATOMIC_VAR_INIT(value) (value)
> > +#define atomic_init(obj, value) do { *(obj) = (value); } while (0)
> > +
> > +#define atomic_is_lock_free(obj) (sizeof(obj) <= sizeof(void *))
> > +#define __atomic_type_is_lock_free(type) (sizeof(type) <= sizeof(void *))
> > +
> > +#define atomic_thread_fence(order) __sync_synchronize()
> > +#define atomic_signal_fence(order) __asm__ volatile ("" : : : "memory")
> > +
> > +/* for GCC < 4.7 some concessions are made. First off, ordering is ignored and
> > + * always treated as a full seq_cst barrier.
> 
> ok, this is consistent with the standard
> 
> > Second, although these are
> > + * described as "generic functions" GCC < 4.7 cannot support uses such as:
> > + * int n = 4;
> > + * atomic_int arr[n];
> > + * atomic_int *i = arr;
> > + * if (atomic_compare_exchange_strong(i++, ...)) {...
> > + * because of the liberal use of the __typeof__ extension.
> > + * Furthermore, GCC specified that test_and_set doesn't necessarily support
> > + * arbitrary values. It's used as both exchange and store here. Be careful. */
> 
> lock_test_and_set is simply not the right tool for these operations,
> this should only be used for atomic_flag

I know. It was just a quick way of approximating. It's been removed now
as it didn't make much sense. GCC < 4.7 really should be implemented as
function calls.

> > +#define atomic_store_explicit(object, value, order) \
> > +    do { \
> > +        __sync_lock_test_and_set(object, value); \
> > +        __sync_synchronize(); \
> > +    while (0)
> 
> A missing "}" ?

Fixed.

> > +#define atomic_load_explicit(object, order) \
> > +    __atomic_fetch_and_add(object, 0)
> > +
> > +#define atomic_exchange_explicit(object, desired, order) \
> > +    __extension__({ \
> > +        __typeof__(*(object)) __ret; \
> > +        __ret = __sync_lock_test_and_set(object, desired); \
> > +        __sync_synchronize(); \
> > +        __ret; \
> > +    })
> > +#define atomic_compare_exchange_strong_explicit(object, expected, desired, \
> > +                                                success, failure) \
> > +    __extension__({ \
> > +        __typeof__(object) __expected = (expected); \
> > +        __typeof__(*__expected) __prev; \
> > +        _Bool __ret; \
> > +        __prev = __sync_val_compare_and_swap(object, *__expected, desired); \
> > +        __ret = (__prev == *__expected); \
> > +        *__expected = __prev; \
> > +        __ret; \
> > +    })
> > +#define atomic_compare_exchange_weak_explicit(object, expected, desired, \
> > +                                              success, failure) \
> > +    atomic_compare_exchange_strong_explicit(object, expected, desired, \
> > +                                            success, failure)
> > +
> > +#define atomic_fetch_add_explicit(object, operand, order) \
> > +    __sync_fetch_and_add(object, operand)
> > +#define atomic_fetch_sub_explicit(object, operand, order) \
> > +    __sync_fetch_and_sub(object, operand)
> > +#define atomic_fetch_or_explicit(object, operand, order) \
> > +    __sync_fetch_and_or(object, operand)
> > +#define atomic_fetch_xor_explicit(object, operand, order) \
> > +    __sync_fetch_and_xor(object, operand)
> > +#define atomic_fetch_and_explicit(object, operand, order) \
> > +    __sync_fetch_and_and(object, operand)
> > +
> > +#else
> > +
> > +#error "Musl's stdatomic.h does not support your compiler"
> > +
> > +#endif
> > +
> > +#ifdef __musl_has_extension
> > +#undef __musl_has_extension
> > +#undef __has_extension
> > +#endif
> > +
> > +#ifdef __NEED__Atomic
> > +#undef __NEED__Atomic
> > +#define _Atomic volatile
> > +#define volatile(x) x volatile
> > +#endif
> 
> argh, redefining volatile, no way
> 
> this messes the syntax completely think of contexts where () can
> appear after volatile. E.g
> 
> int volatile (*toto)[5];
> 
> is a valid array declaration. Probably there are other such uses of
> volatile that would mess up completely.
> 
> I suppose that you want to ensure that your _Atomic macro works with
> both, the qualifier version and the type specifier. I don't think that
> any of this is possible. This is a language feature, and nothing the
> library can deal with.
> 
> In P99 I made the choice to just support the specifier variant,
> definining _Atomic to be a function like macro. First, this is then
> still valid C99.
> 
> Then, this avoids to give the false impression to the application
> programmer that _Atomic qualifiers with all the operations would be
> supported.
> 
> E.g with your patch something like
> 
> _Atomic unsigned counter = ATOMIC_VAR_INIT(0);
> 
> ...
> 
> if (counter++) {
>   ... so something ..
> } else {
>   ... we are the first ...
> }
> 
> would compile and pass completely unnoticed, but wouldn't implement
> atomic semantics at all.
> 
> With P99, the eqivalent line with type specifiers
> 
>  Atomic(unsigned) counter = ATOMIC_VAR_INIT(0);
> 
> would compile, but the line
> 
> if (counter++) {
> 
> would fail.
> 
> In contrast, with the line
> 
> if (atomic_fetch_add(&counter, 1)) {
> 
> which also is correct C11 would work.

Noted and changed to an anonymous struct. I realized after sending it
how idiotic #define volatile(x) was.

> > +typedef enum {
> > +    memory_order_relaxed = 0,
> > +    memory_order_consume = 1,
> > +    memory_order_acquire = 2,
> > +    memory_order_release = 3,
> > +    memory_order_acq_rel = 4,
> > +    memory_order_seq_cst = 5
> > +} memory_order;
> 
> I'd prefer to provide the corresponding macros as gcc defines them
> with #ifdef and then just use these macros here

I'd prefer to just keep them as these constants. I would go so far as to
say that this is a valid interpretation of the standard.
http://port70.net/~nsz/c/c11/n1570.html#7.17.3

> > +typedef _Atomic _Bool atomic_bool;
> > +typedef _Atomic char atomic_char;
> > +typedef _Atomic signed char atomic_schar;
> > +typedef _Atomic short atomic_short;
> > +typedef _Atomic int atomic_int;
> > +typedef _Atomic long atomic_long;
> > +typedef _Atomic long long atomic_llong;
> > +typedef _Atomic unsigned char atomic_uchar;
> > +typedef _Atomic unsigned short atomic_ushort;
> > +typedef _Atomic unsigned int atomic_uint;
> > +typedef _Atomic unsigned long atomic_ulong;
> > +typedef _Atomic unsigned long long atomic_ullong;
> > +typedef _Atomic unsigned short atomic_char16_t;
> > +typedef _Atomic unsigned atomic_char32_t;
> > +typedef _Atomic __typeof__(L'\0') atomic_wchar_t;
> > +typedef _Atomic signed char atomic_int_least8_t;
> > +typedef _Atomic short atomic_int_least16_t;
> > +typedef _Atomic int atomic_int_least32_t;
> > +typedef _Atomic __typeof__(0x100000000) atomic_int_least64_t;
> > +typedef _Atomic signed char atomic_int_fast8_t;
> > +typedef _Atomic int atomic_int_fast16_t;
> > +typedef _Atomic int atomic_int_fast32_t;
> > +typedef _Atomic __typeof__(0x100000000) atomic_int_fast64_t;
> > +typedef _Atomic unsigned char atomic_uint_least8_t;
> > +typedef _Atomic unsigned short atomic_uint_least16_t;
> > +typedef _Atomic unsigned int atomic_uint_least32_t;
> > +typedef _Atomic __typeof__(0x100000000U) atomic_uint_least64_t;
> > +typedef _Atomic unsigned char atomic_uint_fast8_t;
> > +typedef _Atomic unsigned atomic_uint_fast16_t;
> > +typedef _Atomic unsigned atomic_uint_fast32_t;
> > +typedef _Atomic __typeof__(0x100000000U) atomic_uint_fast64_t;
> > +typedef _Atomic __typeof__((char *)0 - (char *)0) atomic_intptr_t;
> > +typedef _Atomic __typeof__(sizeof(0)) atomic_uintptr_t;
> > +typedef _Atomic __typeof__(sizeof(0)) atomic_size_t_t;
> > +typedef _Atomic __typeof__((char *)0 - (char *)0) atomic_ptrdiff_t;
> > +typedef _Atomic __typeof__((char *)0 - (char *)0) atomic_intmax_t;
> > +typedef _Atomic __typeof__(sizeof(0)) atomic_uintmax_t;
> > +
> > +#define ATOMIC_BOOL_LOCK_FREE __atomic_type_is_lock_free(_Bool)
> > +#define ATOMIC_CHAR_LOCK_FREE __atomic_type_is_lock_free(char)
> > +#define ATOMIC_CHAR16_T_LOCK_FREE __atomic_type_is_lock_free(unsigned short)
> > +#define ATOMIC_CHAR32_T_LOCK_FREE __atomic_type_is_lock_free(unsigned)
> > +#define ATOMIC_WCHAR_T_LOCK_FREE __atomic_type_is_lock_free(__typeof__(L'\0'))
> > +#define ATOMIC_SHORT_LOCK_FREE __atomic_type_is_lock_free(short)
> > +#define ATOMIC_INT_LOCK_FREE __atomic_type_is_lock_free(int)
> > +#define ATOMIC_LONG_LOCK_FREE __atomic_type_is_lock_free(long)
> > +#define ATOMIC_LLONG_LOCK_FREE __atomic_type_is_lock_free(long long)
> > +#define ATOMIC_POINTER_LOCK_FREE __atomic_type_is_lock_free(void *)
> > +
> > +#define atomic_store(object, desired) \
> > +    atomic_store_explicit(object, desired, memory_order_seq_cst)
> > +#define atomic_load(object) \
> > +    atomic_load_explicit(object, memory_order_seq_cst)
> > +
> > +#define atomic_exchange(object, desired) \
> > +    atomic_exchange_explicit(object, desired, memory_order_seq_cst)
> > +#define atomic_compare_exchange_strong(object, expected, desired) \
> > +    atomic_compare_exchange_strong_explicit(object, expected, desired, \
> > +                                            memory_order_seq_cst, \
> > +                                            memory_order_seq_cst)
> > +#define atomic_compare_exchange_weak(object, expected, desired) \
> > +    atomic_compare_exchange_weak_explicit(object, expected, desired, \
> > +                                          memory_order_seq_cst, \
> > +                                          memory_order_seq_cst)
> > +
> > +#define atomic_fetch_add(object, operand) \
> > +    atomic_fetch_add_explicit(object, operand, memory_order_seq_cst)
> > +#define atomic_fetch_sub(object, operand) \
> > +    atomic_fetch_sub_explicit(object, operand, memory_order_seq_cst)
> > +#define atomic_fetch_or(object, operand) \
> > +    atomic_fetch_or_explicit(object, operand, memory_order_seq_cst)
> > +#define atomic_fetch_xor(object, operand) \
> > +    atomic_fetch_xor_explicit(object, operand, memory_order_seq_cst)
> > +#define atomic_fetch_and(object, operand) \
> > +    atomic_fetch_and_explicit(object, operand, memory_order_seq_cst)
> > +
> > +typedef _Atomic _Bool atomic_flag;
> 
> that has chances to be wrong base type, probably this would be better
> 
> typedef _Atomic int atomic_flag;
> 
> at least you'd really have to be sure about the ABI that later
> versions of the corresponding compiler suppose

The GCC test_and_set function explicitly takes a bool * and both GCC and
Clang implement it as a _Bool. GCC of course makes it significantly more
complicated than that and will on occasion make it an unsigned char but
nonetheless bool is the correct type here.
I have changed it to be an atomic_bool in a struct as both GCC and Clang
has it in a struct. Presumably to separate it from the generic _Atomic
stuff.

> > +#define ATOMIC_FLAG_INIT ATOMIC_VAR_INIT(0)
> > +
> > +#define atomic_flag_test_and_set(object) \
> > +    atomic_exchange(object, 1)
> > +#define atomic_flag_test_and_set_explicit(object, order) \
> > +    atomic_exchange_explicit(object, 1, order)
> 
> here also this should definitively chose the ABI that is provided,
> namely the __sync_lock_test_and_set builtins.
> 
> Operations on atomic_flag must be guaranteed to be lockfree, and the
> __sync_lock builtins are the only ones where gcc seem to give such a
> guarantee.

Noted and changed.

> > +#define atomic_flag_clear(object) \
> > +    atomic_store(object, 0)
> > +#define atomic_flag_clear_explicit(object, order) \
> > +    atomic_store_explicit(object, 0, order)
> > +
> > +#endif

And this time I actually did test it with GCC 4.7. I don't have GCC 4.9
to test with here so I will leave that as an exercise to the reader.

I know this is a lot of crap to sift through but I would really like to
eventually see an stdatomic.h in musl.

-- Joakim



View attachment "stdatomic.h" of type "text/x-chdr" (11453 bytes)

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.