Date: Thu, 21 May 2015 14:06:12 +0200
From: Szabolcs Nagy <nsz@...t70.net>
To: musl@...ts.openwall.com
Subject: Re: Refactoring atomics as llsc?

* Rich Felker <dalias@...c.org> [2015-05-21 00:12:37 -0400]:
> This is coming along really well so far. Here's the ARMv7 code
> generated for a sample external x_swap function that calls a_swap:
> 
> x_swap:
>         mov     r3, r0
>         dmb ish
> .L3:
>         ldrex r0, [r3]
>         strex r2,r1,[r3]
>         cmp     r2, #0
>         bne     .L3
>         dmb ish
>         bx      lr
> 
> The code that's producing this is the arm atomic_arch.h (so far only
> supports inline atomics for v7+):
> 
...
> #ifndef a_swap
> #define a_swap a_swap
> static inline int a_swap(volatile int *p, int v)
> {
> 	int old;
> 	a_pre_llsc();
> 	do old = a_ll(p);
> 	while (!a_sc(p, v));
> 	a_post_llsc();
> 	return old;
> }
> #endif
> 

nice
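
For trying the quoted a_swap loop on a host without ll/sc, here is a minimal sketch. The a_ll, a_sc, a_pre_llsc and a_post_llsc bodies below are hypothetical stand-ins built on the gcc __atomic builtins; they are not the arm definitions elided above. The emulation is compare-and-swap based, so unlike a real ldrex/strex pair it cannot notice a value that changed and then changed back.

```c
/* Hypothetical host-side stand-ins, NOT the elided arm definitions:
 * a_ll remembers the value it saw, a_sc succeeds only if *p still
 * holds that value (CAS emulation of a reservation). */
static _Thread_local int ll_seen;

static inline void a_pre_llsc(void)  { __atomic_thread_fence(__ATOMIC_SEQ_CST); }
static inline void a_post_llsc(void) { __atomic_thread_fence(__ATOMIC_SEQ_CST); }

static inline int a_ll(volatile int *p)
{
	ll_seen = __atomic_load_n(p, __ATOMIC_RELAXED);
	return ll_seen;
}

static inline int a_sc(volatile int *p, int v)
{
	int expected = ll_seen;
	/* returns nonzero on success, like the a_sc used in the loop */
	return __atomic_compare_exchange_n(p, &expected, v, 0,
		__ATOMIC_RELAXED, __ATOMIC_RELAXED);
}

/* same loop as in the quoted atomic_arch.h fragment */
static inline int a_swap(volatile int *p, int v)
{
	int old;
	a_pre_llsc();
	do old = a_ll(p);
	while (!a_sc(p, v));
	a_post_llsc();
	return old;
}
```

With these stand-ins, a_swap(&x, 7) on an int holding 5 should return 5 and leave 7 in x.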

> Unfortunately there's a nasty snag: global objects like
> need_fallback_a_swap, v6_compat, or barrier_func_ptr will be re-read
> over and over in functions using atomics because the "memory" clobbers
> in the asm invalidate any value the compiler may have cached.
> 
> Fortunately, there seems to be a clean solution: load them via asm
> that looks like
> 
> static inline int v6_compat() {
> 	int r;
> 	__asm__ ( "..." : "=r"(r) );
> 	return r;
> }
> 
> where the "..." is asm to perform the load. Since this asm is not
> volatile and has no inputs, it can be CSE'd and treated like an
> attribute-const function. Strictly speaking this doesn't prevent
> reordering to the very beginning of program execution, before the
> runtime atomic selection is initialized, but I don't think that's a
> serious practical concern. It's certainly not a concern with dynamic
> linking since nothing can be reordered back into dynamic-linker-time,
> and the atomics would be initialized there. For static-linking LTO
> this may require some more thought for formal correctness.

does gcc cse that?

why is it guaranteed that r will always be the same?

(and how can gcc know the cost of the asm? it seems to
me that would be needed to decide whether it's worth keeping
r in a reg or just rerunning the asm every time)
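
One way to check this is a small test case compiled with gcc -O2 -S, counting how many copies of the asm end up in use_twice. The sketch below is only an assumption about how such a test could look: v6_compat_demo, barrier_demo and use_twice are made-up names, and the asm just materializes the constant 1 as a stand-in for the real "..." load, which is not shown in the quoted mail.

```c
/* Hypothetical stand-in for the "..." load: the real asm would load a
 * runtime-initialized global; this one only produces the constant 1
 * so the file builds without any musl internals. */
static inline int v6_compat_demo(void)
{
	int r;
#if defined(__x86_64__) || defined(__i386__)
	__asm__ ( "movl $1, %0" : "=r"(r) );
#elif defined(__aarch64__)
	__asm__ ( "mov %w0, #1" : "=r"(r) );
#elif defined(__arm__)
	__asm__ ( "mov %0, #1" : "=r"(r) );
#else
	r = 1; /* no asm stand-in written for this target */
#endif
	return r;
}

/* Compiler barrier with a "memory" clobber, like the ones in the atomics. */
static inline void barrier_demo(void)
{
	__asm__ __volatile__ ( "" : : : "memory" );
}

/* Two uses separated by a memory clobber: the asm is not volatile, has
 * no inputs and no memory clobber of its own, so the compiler is allowed
 * (but not required) to emit it only once. */
int use_twice(void)
{
	int a = v6_compat_demo();
	barrier_demo();
	int b = v6_compat_demo();
	return a + b;
}
```

Whichever way the compiler decides, use_twice returns 2; only the generated assembly shows whether the second asm was merged with the first.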
