musl - add loongarch64 port v8.

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3838b2d6-8330-33b5-fd87-8af3404a29dc@loongson.cn>
Date: Tue, 26 Sep 2023 11:28:27 +0800
From: Hongliang Wang <wanghongliang@...ngson.cn>
To: musl@...ts.openwall.com
Subject: add loongarch64 port v8.

Hi,

I have fixed the issues listed below, and post
0001-add-loongarch64-port-v8.patch,as shown in the attachment.
please review again, if there are anything that need to be modified,
please point out, thank you.

Hongliang Wang

在 2023/9/22 上午9:37, Hongliang Wang 写道:
> Hi,
> 
> Thank you for your review, I will modify it according to the review
> suggestions and post v8 patch later.
> 
> Hongliang Wang
> 
> 在 2023/9/20 下午9:16, Szabolcs Nagy 写道:
>> * Jianmin Lv <lvjianmin@...ngson.cn> [2023-09-20 15:45:39 +0800]:
>>> Sorry to bother you, but I just want to know if there is any progress on
>>> this, because Alpine is blocking by this patch for a long time. As 
>>> Hongliang
>>> has explained questions you mentioned one by one, if any question 
>>> need to be
>>> discussed, please point them out, so that the patch can be handled 
>>> further.
>>
>> i think you should post a v8 patch to move
>> this forward. (not on github, but here)
>>
>>> On 2023/8/15 下午4:17, Hongliang Wang wrote:
>>>> 在 2023/8/13 上午9:41, Rich Felker 写道:
>>>>> On Sat, Aug 05, 2023 at 11:43:08AM -0400, Rich Felker wrote:
>>>>>> On Sat, Aug 05, 2023 at 02:18:35PM +0800, 翟小娟 wrote:
>>>>>> -+#define __BYTE_ORDER 1234
>>>>>> ++#define __BYTE_ORDER  __LITTLE_ENDIAN
>>>>>
>>>>> This is gratuitous, mismatches what is done on other archs, and is
>>>>> less safe.
>>>>>
>>>> The modification is based on the following review suggestion(
>>>> WANG Xuerui <i@...0n.name> reviewed in
>>>> https://www.openwall.com/lists/musl/2022/10/12/1):
>>>>
>>>> `#define __BYTE_ORDER __LITTLE_ENDIAN` could be more consistent
>>>> with other arches.
>>
>> please use 1234.

--- a/arch/loongarch64/bits/alltypes.h.in
+++ b/arch/loongarch64/bits/alltypes.h.in
@@ -2,7 +2,7 @@
  #define _Int64 long
  #define _Reg   long

-#define __BYTE_ORDER  __LITTLE_ENDIAN
+#define __BYTE_ORDER  1234
  #define __LONG_MAX    0x7fffffffffffffffL


>>
>>>>>> -+TYPEDEF unsigned nlink_t;
>>>>>> ++TYPEDEF unsigned int nlink_t;
>>>>>
>>>>> Gratuitous and in opposite direction of coding style used elsewhere in
>>>>> musl. There are a few other instances of this too.
>>>>>
>>>> Based on the following review question(WANG Xuerui <i@...0n.name>
>>>> reviewed in https://www.openwall.com/lists/musl/2022/10/12/1):
>>>>
>>>> `unsigned int`? Same for other bare `unsigned` usages.
>>>>
>>>> I fixed it to a explicit definition.
>>
>> please use plain unsigned, not unsigned int.

--- a/arch/loongarch64/bits/alltypes.h.in
+++ b/arch/loongarch64/bits/alltypes.h.in
@@ -2,7 +2,7 @@

-TYPEDEF unsigned int nlink_t;
+TYPEDEF unsigned nlink_t;
  TYPEDEF int blksize_t;

>>
>>>>>> -+      register uintptr_t tp __asm__("tp");
>>>>>> -+      __asm__ ("" : "=r" (tp) );
>>>>>> ++      uintptr_t tp;
>>>>>> ++      __asm__ __volatile__("move %0, $tp" : "=r"(tp));
>>>>>>    +      return tp;
>>>>>
>>>>> Not clear what the motivation is here. Is it working around a compiler
>>>>> bug? The original form was more optimal.
>>>>
>>>> The modification is based on the following review suggestion(
>>>> WANG Xuerui <i@...0n.name> reviewed in
>>>> https://www.openwall.com/lists/musl/2022/10/12/1):
>>>>
>>>> While the current approach works, it's a bit fragile [1], and
>>>> the simple and plain riscv version works too:
>>>>
>>>> uintptr_t tp;
>>>> __asm__ ("move %0, $tp" : "=r"(tp));
>>>>
>>>> [1]:https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables 
>>>>
>>
>> this looks ok to me.
>>
>> (original code looks sketchy to me.)
>>
>>>>>>    +#define CRTJMP(pc,sp) __asm__ __volatile__( \
>>>>>> -+      "move $sp,%1 ; jr %0" : : "r"(pc), "r"(sp) : "memory" )
>>>>>> -+
>>>>>> -+#define GETFUNCSYM(fp, sym, got) __asm__ ( \
>>>>>> -+      ".hidden " #sym "\n" \
>>>>>> -+      ".align 8 \n" \
>>>>>> -+      "       la.local $t1, "#sym" \n" \
>>>>>> -+      "       move %0, $t1 \n" \
>>>>>> -+      : "=r"(*(fp)) : : "memory" )
>>>>>> ++      "move $sp, %1 ; jr %0" : : "r"(pc), "r"(sp) : "memory" )
>>>>>
>>>>> Not clear why this was changed. It was never discussed afaik. It looks
>>>>> somewhat dubious removing GETFUNCSYM and using the maybe-unreliable
>>>>> default definition.
>>>>>
>>>> Based on the following review question(
>>>> WANG Xuerui <i@...0n.name> reviewed
>>>> in https://www.openwall.com/lists/musl/2022/10/12/1):
>>>>
>>>> Does the generic version residing in ldso/dlstart.c not work?
>>>>
>>>> I found the code logic is consistent with the generic version, So
>>>> I removed the definition here and replaced it with generic version.
>>
>> i would change it back to the asm.
>>
>> generic version should work, but maybe we don't
>> want to trust the compiler here (there may be
>> ways to compile the generic code that is not
>> compatible with incompletely relocated libc.so)
>> the asm is safer.
>>
--- a/arch/loongarch64/reloc.h
+++ b/arch/loongarch64/reloc.h
@@ -18,3 +18,10 @@

  #define CRTJMP(pc,sp) __asm__ __volatile__( \
         "move $sp, %1 ; jr %0" : : "r"(pc), "r"(sp) : "memory" )
+
+#define GETFUNCSYM(fp, sym, got) __asm__ ( \
+       ".hidden " #sym "\n" \
+       ".align 8 \n" \
+       "       la.local $t1, "#sym" \n" \
+       "       move %0, $t1 \n" \
+       : "=r"(*(fp)) : : "memory" )

>>>>> It also looks like v5->v6 rewrote sigsetjmp.
>>>>>
>>>> Based on the following review question(
>>>> WANG Xuerui <i@...0n.name> reviewed
>>>> in https://www.openwall.com/lists/musl/2022/10/12/1):
>>>>
>>>> This is crazy complicated compared to the riscv port, why is the
>>>> juggling between a0/a1 and t5/t6 necessary?
>>>>
>>>> I optimized the code implementations.
>>
>> this should be ok.
>>
>>>>> v6->v7:
>>>>>
>>>>> sigcontext/mcontext_t change mostly makes sense, but the
>>>>> namespace-safe and full mcontext_t differ in their use of the aligned
>>>>> attribute and it's not clear that the attribute is needed (since the
>>>>> offset is naturally aligned and any instance of the object is produced
>>>>> by the kernel, not by userspace).
>>>>>
>>>>>> -+      long               __uc_pad;
>>>>>
>>>>> This change to ucontext_t actually makes the struct ABI-incompatible
>>>>> in the namespace-safe version without the alignment attribute. IIRC I
>>>>> explicitly requested the explicit padding field to avoid this kind of
>>>>> footgun. Removing it does not seem like a good idea.
>>>>>
>>>> Initially, we add __uc_pad to ensure uc_mcontext is 16 byte alignment.
>>>> Now, we added __attribute__((__aligned__(16))) to
>>>> uc_mcontext.__extcontext[],this can ensure uc_mcontext is also 16 byte
>>>> alignment. so __uc_pad is not used.
>>>>
>>>> Remove __uc_pad, from the point of struct layout, musl and kernel are
>>>> consistent. otherwise, I think it may bring a sense of inconsistency
>>>> between kernel and musl. Due to Loongarch is not merged into musl now,
>>>> Remove it no compatibility issues.
>>
>> please add back the padding field.

--- a/arch/loongarch64/bits/signal.h
+++ b/arch/loongarch64/bits/signal.h
@@ -40,6 +40,7 @@ typedef struct __ucontext
         struct __ucontext  *uc_link;
         stack_t            uc_stack;
         sigset_t           uc_sigmask;
+       long               __uc_pad;
         mcontext_t         uc_mcontext;
  } ucontext_t;

>>
>>>>> In stat.h:
>>>>>
>>>>>> -+      unsigned long __pad;
>>>>>> ++      unsigned long __pad1;
>>>>>
>>>>> This is gratuitous and makes the definition gratuitously mismatch what
>>>>> will become the "generic" version of bits/stat.h. There is no contract
>>>>> for applications to be able to access these padding fields by name, so
>>>>> no motivation to make their names match glibc's names or the kernel's.
>>>>>
>>>> OK.
>>
>> please fix it.

--- a/arch/loongarch64/bits/stat.h
+++ b/arch/loongarch64/bits/stat.h
@@ -6,7 +6,7 @@ struct stat {
         uid_t st_uid;
         gid_t st_gid;
         dev_t st_rdev;
-       unsigned long __pad1;
+       unsigned long __pad;
         off_t st_size;
         blksize_t st_blksize;
         int __pad2;

>>
>>>>> In fenv.S:
>>>>>
>>>>>> ++#ifdef __clang__
>>>>>> ++#define FCSR $fcsr0
>>>>>> ++#else
>>>>>> ++#define FCSR $r0
>>>>>> ++#endif
>>>>>
>>>>> It's not clear to me what's going on here. Is there a clang
>>>>> incompatibility in the assembler language you're working around? Or
>>>>> what? If so that seems like a tooling bug that should be fixed.
>>>>>
>>>> The GNU assembler cannot correctly recognize $fcsr0, but the
>>>> LLVM IAS does not have this issue, so make a distinction.
>>>> This issue has been fixed in GNU assembler 2.41. but for compatible
>>>> with GNU assembler 2.40 and below, $r0 need reserved.
>>
>> it sounds like the correct asm is $fcsr0, so that's
>> what should be used on all assemblers, not just on
>> clang as.
>>
>> only broken old gnu as should use different syntax.
>> for this a configure test is needed that adds a
>> CFLAG like -DBROKEN_LOONGARCH_FCSR_ASM when fails.
>> and use that for the ifdef.
>>
--- a/configure
+++ b/configure

if test "$ARCH" = "loongarch64" ; then
trycppif __loongarch_soft_float "$t" && SUBARCH=${SUBARCH}-sf
+printf "checking whether compiler support FCSRs... "
+echo "__asm__(\"movfcsr2gr \$t0,\$fcsr0\");" > "$tmpc"
+if $CC -c -o /dev/null "$tmpc" >/dev/null 2>&1 ; then
+printf "yes\n"
+else
+printf "no\n"
+CFLAGS_AUTO="$CFLAGS_AUTO -DBROKEN_LOONGARCH_FCSR_ASM"
+fi
fi

--- a/src/fenv/loongarch64/fenv.S
+++ b/src/fenv/loongarch64/fenv.S
@@ -1,9 +1,9 @@
  #ifndef __loongarch_soft_float

-#ifdef __clang__
-#define FCSR $fcsr0
-#else
+#ifdef BROKEN_LOONGARCH_FCSR_ASM
  #define FCSR $r0
+#else
+#define FCSR $fcsr0
  #endif

  .global        feclearexcept

>>>>
>>>> The linux kernel also has a similar distinction:
>>>>
>>>> commit 38bb46f94544c5385bc35aa2bfc776dcf53a7b5d
>>>> Author: WANG Xuerui <git@...0n.name>
>>>> Date:   Thu Jun 29 20:58:43 2023 +0800
>>>>
>>>>      LoongArch: Prepare for assemblers with proper FCSR class support
>>>>
>>>>      The GNU assembler (as of 2.40) mis-treats FCSR operands as 
>>>> GPRs, but
>>>>      the LLVM IAS does not. Probe for this and refer to FCSRs 
>>>> as"$fcsrNN"
>>>>      if support is present.
>>>>
>>>>      Signed-off-by: WANG Xuerui <git@...0n.name>
>>>>      Signed-off-by: Huacai Chen <chenhuacai@...ngson.cn>
>>>>
>>>>> Otherwise, everything looks as expected, I think. I'm okay with making
>>>>> any fixups for merging rather than throwing this back on your team for
>>>>> more revisions, but can you please follow up and clarify the above?
>>
>> all issues look minor.
>>
>> if you post a v8 it likely gets into a release faster.
>>

View attachment "0001-add-loongarch64-port-v8.patch" of type "text/x-patch" (45014 bytes)
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.