Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 30 Mar 2016 10:29:26 -0400
From: Rich Felker <dalias@...c.org>
To: Jaydeep Patil <Jaydeep.Patil@...tec.com>
Cc: "musl@...ts.openwall.com" <musl@...ts.openwall.com>
Subject: Re: [PATCH] Fix atomic_arch.h for MIPS32 R6

On Wed, Mar 30, 2016 at 09:45:59AM +0000, Jaydeep Patil wrote:
> >> >> >The compiler has no way of knowing that the operand will be used
> >> >> >with ll with the 9-bit offset restriction; as far as it knows, it
> >> >> >will be used in a normal context where a 16-bit offset is valid. I
> >> >> >don't have a toolchain that will target r6, but you can try the
> >> >> >following program which produces an offset of 4096 for loading p[1024]:
> >> >> >
> >> >> >unsigned ll1k(volatile unsigned *p) {
> >> >> >	unsigned val;
> >> >> >	__asm__ __volatile__ ("ll %0, %1" : "=r"(val) : "m"(p[1024]) :
> >> >> >"memory" );
> >> >> >	return val;
> >> >> >}
> >> >> >
> >> >> >I would expect this to produce errors at assembly time on r6.
> >> >> >Rich
> >> >>
> >> >> This is what compiler has generated for above function:
> >> >>
> >> >> $ gcc -c -o main.o main.c -O3 -mips32r6 -mabi=32
> >> >>
> >> >> Objdump:
> >> >>
> >> >> 00000000 <ll1k>:
> >> >>    0:   24821000        addiu   v0,a0,4096
> >> >>    4:   7c420036        ll      v0,0(v0)
> >> >>    8:   d81f0000        jrc     ra
> >> >>    c:   00000000        nop
> >> >
> >> >Can you try gcc -S instead of -c (still at -O3) to produce asm output
> >> >without assembling it?
> >>
> >> Generated asssembly:
> >>
> >> #APP
> >>  # 4 "test.c" 1
> >>         ll $2, 4096($4)
> >>  # 0 "" 2
> >> #NO_APP
> >>         jrc     $31
> >>
> >> Even if we set "noreorder" before LL, assembler generates addiu+ll:
> >>
> >> 00000000 <ll1k>:
> >>    0:   24821000        addiu   v0,a0,4096
> >>    4:   7c420036        ll      v0,0(v0)
> >>    8:   d81f0000        jrc     ra
> >>    c:   00000000        nop
> >
> >I see. I suspected the assembler was doing it. "noat", not "noreorder", is the
> >way to suppress things like this but I doubt even "noat" does it since a
> >separate temp register ("at") is not needed in this case.
> >
> >If all assembers that support R6 support this rewriting, then the ZC constraint
> >in gcc is really just an optimization, not strictly necessary. We should probably
> >check (1) whether clang's internal assembler can do the rewriting, and (2)
> >whether clang supports the ZC constraint. I would prefer using ZC but I want
> >to do whatever is more compatible; I don't think the codegen efficiency
> >matters a lot either way.
> >Rich
> 
> Clang's integrated assembler does not support this rewriting. However ZC is supported.
> I have modified both atomic_arch.h and pthread_arch.h to reflect this. 
> Please refer to https://github.com/JaydeepIMG/musl-1/tree/fix_inline_asm_for_R6 for the patch (also listed below).
> I have also added R6 as subarch.
> 
> 
> 
> >From 20054ee55643d9e81163ca58ac63cc38b5080969 Mon Sep 17 00:00:00 2001
> From: Jaydeep Patil <jaydeep.patil@...tec.com>
> Date: Wed, 30 Mar 2016 10:37:30 +0100
> Subject: [PATCH] [MIPS] Update inline asm for R6 and add R6 as subtarget
> 
> ---
>  arch/mips/atomic_arch.h    | 17 +++--------------
>  arch/mips/pthread_arch.h   |  8 +-------
>  arch/mips64/atomic_arch.h  | 12 +++++-------
>  arch/mips64/pthread_arch.h |  7 +------
>  configure                  |  2 ++
>  5 files changed, 12 insertions(+), 34 deletions(-)
> 
> diff --git a/arch/mips/atomic_arch.h b/arch/mips/atomic_arch.h
> index ce2823b..4dbe4bb 100644
> --- a/arch/mips/atomic_arch.h
> +++ b/arch/mips/atomic_arch.h
> @@ -3,10 +3,8 @@ static inline int a_ll(volatile int *p)
>  {
>  	int v;
>  	__asm__ __volatile__ (
> -		".set push ; .set mips2\n\t"
>  		"ll %0, %1"
> -		"\n\t.set pop"
> -		: "=r"(v) : "m"(*p));
> +		: "=r"(v) : "ZC"(*p));
>  	return v;
>  }

Did you see the proposed patch I sent? Your version does not work
because it removes support for pre-mips2. It also removed support for
gcc versions prior to 4.9, since "ZC" was not added until r6 support
was added (or maybe later). My version of the patch addresses both of
these issues by conditioning on __mips_isa_rev>=6.

>  #define a_pre_llsc a_barrier
> diff --git a/arch/mips/pthread_arch.h b/arch/mips/pthread_arch.h
> index 8a49965..d8b6955 100644
> --- a/arch/mips/pthread_arch.h
> +++ b/arch/mips/pthread_arch.h
> @@ -1,13 +1,7 @@
>  static inline struct pthread *__pthread_self()
>  {
> -#ifdef __clang__
> -	char *tp;
> -	__asm__ __volatile__ (".word 0x7c03e83b ; move %0, $3" : "=r" (tp) : : "$3" );
> -#else
>  	register char *tp __asm__("$3");
> -	/* rdhwr $3,$29 */
> -	__asm__ __volatile__ (".word 0x7c03e83b" : "=r" (tp) );
> -#endif
> +	__asm__ __volatile__ ("rdhwr %0,$29" : "=r" (tp));
>  	return (pthread_t)(tp - 0x7000 - sizeof(struct pthread));
>  }

This also needs to either be conditioned on __mips_isa_rev>=2 or
similar, or else the .set push/pop stuff needs to be used for
__mips_isa_rev<6.

>  #define a_barrier a_barrier
>  static inline void a_barrier()
>  {
> -	/* mips2 sync, but using too many directives causes
> -	 * gcc not to inline it, so encode with .long instead. */
> -	__asm__ __volatile__ (".long 0xf" : : : "memory");
> +	__asm__ __volatile__ ("sync" : : : "memory");
>  }

I think this should be okay, since all mips64 ISA levels have this
insn as far as I know.

> diff --git a/configure b/configure
> index 213a825..969671d 100755
> --- a/configure
> +++ b/configure
> @@ -612,11 +612,13 @@ trycppif __AARCH64EB__ "$t" && SUBARCH=${SUBARCH}_be
>  fi
>  
>  if test "$ARCH" = "mips" ; then
> +trycppif "__mips_isa_rev >= 6" "$t" && SUBARCH=${SUBARCH}r6
>  trycppif "_MIPSEL || __MIPSEL || __MIPSEL__" "$t" && SUBARCH=${SUBARCH}el
>  trycppif __mips_soft_float "$t" && SUBARCH=${SUBARCH}-sf
>  fi
>  
>  if test "$ARCH" = "mips64" ; then
> +trycppif "__mips_isa_rev >= 6" "$t" && SUBARCH=${SUBARCH}r6
>  trycppif "_MIPSEL || __MIPSEL || __MIPSEL__" "$t" && SUBARCH=${SUBARCH}el
>  trycppif __mips_soft_float "$t" && SUBARCH=${SUBARCH}-sf
>  fi

This looks ok, but a similar condition in arch/mips*/reloc.h is also
needed.

Since I've done most of the thinking about the above issues already
and have a patch for some of them, let me prepare a complete patch and
send it to the list for you to check and make sure it meets your needs
for r6. I should be able to prepare it very quickly. Then we can look
at applying the same changes to the n32 port and reviewing it.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.