Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Fri, 24 Aug 2018 16:17:36 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH v2] fix tls access on arm targets before armv6k

On Fri, Aug 24, 2018 at 09:27:50PM +0200, Szabolcs Nagy wrote:
> rewrote it in asm to fix a runtime-abi issue.

I've already queued the previous fix. I could replace it since it's
not pushed, but I'd kinda prefer to make this change independent from
the bugfix since it's already been done in 2 steps anyway; normally I
don't like to mix bugfixes with refactoring except when it would be
hard to do the bugfix independently first.

> >From 90da0eb3d0e92203d822ef9d5f96ef0e4e276ca2 Mon Sep 17 00:00:00 2001
> From: Szabolcs Nagy <nsz@...t70.net>
> Date: Thu, 23 Aug 2018 10:57:34 +0000
> Subject: [PATCH] fix tls access on arm targets before armv6k
> 
> commit 610c5a8524c3d6cd3ac5a5f1231422e7648a3791 changed the thread
> pointer setup so tp points at the end of the pthread struct on arm,
> but failed to update __aeabi_read_tp so it was off by 8.
> 
> this broke tls access in code that is compiled with -mtp=soft, which
> is the default when target arch is pre armv6k or thumb1.
> 
> __aeabi_read_tp used to call c code, but that was incorrect as the
> arm runtime abi specifies special pcs for this function: it is only
> allowed to clobber r0, ip, lr and cpsr.

Maybe keep just this paragraph if factoring as 2 commits?

> if musl is compiled for a target with hardware tp then the code can
> be further optimized so __aeabi_read_tp is an alias to __a_gettp_cp15,
> but that requires preprocessed asm with conditionals, for now this
> is a minimal bug fix.

I think this optimization isn't so important since, if the user is
compiling for a higher ISA level, they won't even be generating calls
to __aeabi_read_tp. It would marginally help the case where you
compile an optimized-for-native-cpu libc.so but use low-baseline
application binaries, but I doubt that's a case many users care about.

> ---
>  src/thread/arm/__aeabi_read_tp.s   | 10 ++++++----
>  src/thread/arm/__aeabi_read_tp_c.c |  8 --------
>  2 files changed, 6 insertions(+), 12 deletions(-)
>  delete mode 100644 src/thread/arm/__aeabi_read_tp_c.c
> 
> diff --git a/src/thread/arm/__aeabi_read_tp.s b/src/thread/arm/__aeabi_read_tp.s
> index 9d0cd311..2585620c 100644
> --- a/src/thread/arm/__aeabi_read_tp.s
> +++ b/src/thread/arm/__aeabi_read_tp.s
> @@ -2,7 +2,9 @@
>  .global __aeabi_read_tp
>  .type __aeabi_read_tp,%function
>  __aeabi_read_tp:
> -	push {r1,r2,r3,lr}
> -	bl __aeabi_read_tp_c
> -	pop {r1,r2,r3,lr}
> -	bx lr
> +	ldr r0,1f
> +	add r0,r0,pc
> +	ldr r0,[r0]
> +2:	bx r0
> +	.align 2
> +1:	.word __a_gettp_ptr - 2b

Isn't there an idiomatic "adr" pseudo-mnemonic preferable to use
instead of the explicit pc arithmetic here? Or is that necessarily
less efficient?

> diff --git a/src/thread/arm/__aeabi_read_tp_c.c b/src/thread/arm/__aeabi_read_tp_c.c
> deleted file mode 100644
> index 654bdc57..00000000
> --- a/src/thread/arm/__aeabi_read_tp_c.c
> +++ /dev/null
> @@ -1,8 +0,0 @@
> -#include "pthread_impl.h"
> -#include <stdint.h>
> -
> -__attribute__((__visibility__("hidden")))
> -void *__aeabi_read_tp_c(void)
> -{
> -	return (void *)((uintptr_t)__pthread_self()-8+sizeof(struct pthread));
> -}
> -- 
> 2.18.0
> 

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.