Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 9 Nov 2023 21:15:30 +0800
From: Jingyun Hua <huajingyun@...ngson.cn>
To: musl@...ts.openwall.com
Subject: Re: add loongarch64 port v8.

Hi all,

Thank you for taking the time to check this email. I'm very interested
in musl and alpine, and I'd like to know the current status of this
patch and what we can do for it next.

After Hongliang updated this patch from v7 to v8, in order to verify it,
I built the alpine linux from scratch base on this musl v8 patch on the
LoongArch64 machine, and successfully compiled projects such as gcc,
rust, llvm, go, etc.. Currently, more than 8,500 packages have been
built and running normally in my local repository.

The patch has indeed been on hold for a long time. Is it in the
immediate plans? Is there anything that needs to be modified and
improved?

Looking forward to your reply, thanks!

Regards,
Jingyun Hua

在 2023/9/26 上午11:28, Hongliang Wang 写道:
> Hi,
> 
> I have fixed the issues listed below, and post
> 0001-add-loongarch64-port-v8.patch,as shown in the attachment.
> please review again, if there are anything that need to be modified,
> please point out, thank you.
> 
> Hongliang Wang
> 
> 在 2023/9/22 上午9:37, Hongliang Wang 写道:
>> Hi,
>>
>> Thank you for your review, I will modify it according to the review
>> suggestions and post v8 patch later.
>>
>> Hongliang Wang
>>
>> 在 2023/9/20 下午9:16, Szabolcs Nagy 写道:
>>> * Jianmin Lv <lvjianmin@...ngson.cn> [2023-09-20 15:45:39 +0800]:
>>>> Sorry to bother you, but I just want to know if there is any 
>>>> progress on
>>>> this, because Alpine is blocking by this patch for a long time. As 
>>>> Hongliang
>>>> has explained questions you mentioned one by one, if any question 
>>>> need to be
>>>> discussed, please point them out, so that the patch can be handled 
>>>> further.
>>>
>>> i think you should post a v8 patch to move
>>> this forward. (not on github, but here)
>>>
>>>> On 2023/8/15 下午4:17, Hongliang Wang wrote:
>>>>> 在 2023/8/13 上午9:41, Rich Felker 写道:
>>>>>> On Sat, Aug 05, 2023 at 11:43:08AM -0400, Rich Felker wrote:
>>>>>>> On Sat, Aug 05, 2023 at 02:18:35PM +0800, 翟小娟 wrote:
>>>>>>> -+#define __BYTE_ORDER 1234
>>>>>>> ++#define __BYTE_ORDER  __LITTLE_ENDIAN
>>>>>>
>>>>>> This is gratuitous, mismatches what is done on other archs, and is
>>>>>> less safe.
>>>>>>
>>>>> The modification is based on the following review suggestion(
>>>>> WANG Xuerui <i@...0n.name> reviewed in
>>>>> https://www.openwall.com/lists/musl/2022/10/12/1):
>>>>>
>>>>> `#define __BYTE_ORDER __LITTLE_ENDIAN` could be more consistent
>>>>> with other arches.
>>>
>>> please use 1234.
> 
> --- a/arch/loongarch64/bits/alltypes.h.in
> +++ b/arch/loongarch64/bits/alltypes.h.in
> @@ -2,7 +2,7 @@
>   #define _Int64 long
>   #define _Reg   long
> 
> -#define __BYTE_ORDER  __LITTLE_ENDIAN
> +#define __BYTE_ORDER  1234
>   #define __LONG_MAX    0x7fffffffffffffffL
> 
> 
>>>
>>>>>>> -+TYPEDEF unsigned nlink_t;
>>>>>>> ++TYPEDEF unsigned int nlink_t;
>>>>>>
>>>>>> Gratuitous and in opposite direction of coding style used 
>>>>>> elsewhere in
>>>>>> musl. There are a few other instances of this too.
>>>>>>
>>>>> Based on the following review question(WANG Xuerui <i@...0n.name>
>>>>> reviewed in https://www.openwall.com/lists/musl/2022/10/12/1):
>>>>>
>>>>> `unsigned int`? Same for other bare `unsigned` usages.
>>>>>
>>>>> I fixed it to a explicit definition.
>>>
>>> please use plain unsigned, not unsigned int.
> 
> --- a/arch/loongarch64/bits/alltypes.h.in
> +++ b/arch/loongarch64/bits/alltypes.h.in
> @@ -2,7 +2,7 @@
> 
> -TYPEDEF unsigned int nlink_t;
> +TYPEDEF unsigned nlink_t;
>   TYPEDEF int blksize_t;
> 
>>>
>>>>>>> -+      register uintptr_t tp __asm__("tp");
>>>>>>> -+      __asm__ ("" : "=r" (tp) );
>>>>>>> ++      uintptr_t tp;
>>>>>>> ++      __asm__ __volatile__("move %0, $tp" : "=r"(tp));
>>>>>>>    +      return tp;
>>>>>>
>>>>>> Not clear what the motivation is here. Is it working around a 
>>>>>> compiler
>>>>>> bug? The original form was more optimal.
>>>>>
>>>>> The modification is based on the following review suggestion(
>>>>> WANG Xuerui <i@...0n.name> reviewed in
>>>>> https://www.openwall.com/lists/musl/2022/10/12/1):
>>>>>
>>>>> While the current approach works, it's a bit fragile [1], and
>>>>> the simple and plain riscv version works too:
>>>>>
>>>>> uintptr_t tp;
>>>>> __asm__ ("move %0, $tp" : "=r"(tp));
>>>>>
>>>>> [1]:https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables 
>>>>>
>>>
>>> this looks ok to me.
>>>
>>> (original code looks sketchy to me.)
>>>
>>>>>>>    +#define CRTJMP(pc,sp) __asm__ __volatile__( \
>>>>>>> -+      "move $sp,%1 ; jr %0" : : "r"(pc), "r"(sp) : "memory" )
>>>>>>> -+
>>>>>>> -+#define GETFUNCSYM(fp, sym, got) __asm__ ( \
>>>>>>> -+      ".hidden " #sym "\n" \
>>>>>>> -+      ".align 8 \n" \
>>>>>>> -+      "       la.local $t1, "#sym" \n" \
>>>>>>> -+      "       move %0, $t1 \n" \
>>>>>>> -+      : "=r"(*(fp)) : : "memory" )
>>>>>>> ++      "move $sp, %1 ; jr %0" : : "r"(pc), "r"(sp) : "memory" )
>>>>>>
>>>>>> Not clear why this was changed. It was never discussed afaik. It 
>>>>>> looks
>>>>>> somewhat dubious removing GETFUNCSYM and using the maybe-unreliable
>>>>>> default definition.
>>>>>>
>>>>> Based on the following review question(
>>>>> WANG Xuerui <i@...0n.name> reviewed
>>>>> in https://www.openwall.com/lists/musl/2022/10/12/1):
>>>>>
>>>>> Does the generic version residing in ldso/dlstart.c not work?
>>>>>
>>>>> I found the code logic is consistent with the generic version, So
>>>>> I removed the definition here and replaced it with generic version.
>>>
>>> i would change it back to the asm.
>>>
>>> generic version should work, but maybe we don't
>>> want to trust the compiler here (there may be
>>> ways to compile the generic code that is not
>>> compatible with incompletely relocated libc.so)
>>> the asm is safer.
>>>
> --- a/arch/loongarch64/reloc.h
> +++ b/arch/loongarch64/reloc.h
> @@ -18,3 +18,10 @@
> 
>   #define CRTJMP(pc,sp) __asm__ __volatile__( \
>          "move $sp, %1 ; jr %0" : : "r"(pc), "r"(sp) : "memory" )
> +
> +#define GETFUNCSYM(fp, sym, got) __asm__ ( \
> +       ".hidden " #sym "\n" \
> +       ".align 8 \n" \
> +       "       la.local $t1, "#sym" \n" \
> +       "       move %0, $t1 \n" \
> +       : "=r"(*(fp)) : : "memory" )
> 
>>>>>> It also looks like v5->v6 rewrote sigsetjmp.
>>>>>>
>>>>> Based on the following review question(
>>>>> WANG Xuerui <i@...0n.name> reviewed
>>>>> in https://www.openwall.com/lists/musl/2022/10/12/1):
>>>>>
>>>>> This is crazy complicated compared to the riscv port, why is the
>>>>> juggling between a0/a1 and t5/t6 necessary?
>>>>>
>>>>> I optimized the code implementations.
>>>
>>> this should be ok.
>>>
>>>>>> v6->v7:
>>>>>>
>>>>>> sigcontext/mcontext_t change mostly makes sense, but the
>>>>>> namespace-safe and full mcontext_t differ in their use of the aligned
>>>>>> attribute and it's not clear that the attribute is needed (since the
>>>>>> offset is naturally aligned and any instance of the object is 
>>>>>> produced
>>>>>> by the kernel, not by userspace).
>>>>>>
>>>>>>> -+      long               __uc_pad;
>>>>>>
>>>>>> This change to ucontext_t actually makes the struct ABI-incompatible
>>>>>> in the namespace-safe version without the alignment attribute. IIRC I
>>>>>> explicitly requested the explicit padding field to avoid this kind of
>>>>>> footgun. Removing it does not seem like a good idea.
>>>>>>
>>>>> Initially, we add __uc_pad to ensure uc_mcontext is 16 byte alignment.
>>>>> Now, we added __attribute__((__aligned__(16))) to
>>>>> uc_mcontext.__extcontext[],this can ensure uc_mcontext is also 16 byte
>>>>> alignment. so __uc_pad is not used.
>>>>>
>>>>> Remove __uc_pad, from the point of struct layout, musl and kernel are
>>>>> consistent. otherwise, I think it may bring a sense of inconsistency
>>>>> between kernel and musl. Due to Loongarch is not merged into musl now,
>>>>> Remove it no compatibility issues.
>>>
>>> please add back the padding field.
> 
> --- a/arch/loongarch64/bits/signal.h
> +++ b/arch/loongarch64/bits/signal.h
> @@ -40,6 +40,7 @@ typedef struct __ucontext
>          struct __ucontext  *uc_link;
>          stack_t            uc_stack;
>          sigset_t           uc_sigmask;
> +       long               __uc_pad;
>          mcontext_t         uc_mcontext;
>   } ucontext_t;
> 
>>>
>>>>>> In stat.h:
>>>>>>
>>>>>>> -+      unsigned long __pad;
>>>>>>> ++      unsigned long __pad1;
>>>>>>
>>>>>> This is gratuitous and makes the definition gratuitously mismatch 
>>>>>> what
>>>>>> will become the "generic" version of bits/stat.h. There is no 
>>>>>> contract
>>>>>> for applications to be able to access these padding fields by 
>>>>>> name, so
>>>>>> no motivation to make their names match glibc's names or the 
>>>>>> kernel's.
>>>>>>
>>>>> OK.
>>>
>>> please fix it.
> 
> --- a/arch/loongarch64/bits/stat.h
> +++ b/arch/loongarch64/bits/stat.h
> @@ -6,7 +6,7 @@ struct stat {
>          uid_t st_uid;
>          gid_t st_gid;
>          dev_t st_rdev;
> -       unsigned long __pad1;
> +       unsigned long __pad;
>          off_t st_size;
>          blksize_t st_blksize;
>          int __pad2;
> 
>>>
>>>>>> In fenv.S:
>>>>>>
>>>>>>> ++#ifdef __clang__
>>>>>>> ++#define FCSR $fcsr0
>>>>>>> ++#else
>>>>>>> ++#define FCSR $r0
>>>>>>> ++#endif
>>>>>>
>>>>>> It's not clear to me what's going on here. Is there a clang
>>>>>> incompatibility in the assembler language you're working around? Or
>>>>>> what? If so that seems like a tooling bug that should be fixed.
>>>>>>
>>>>> The GNU assembler cannot correctly recognize $fcsr0, but the
>>>>> LLVM IAS does not have this issue, so make a distinction.
>>>>> This issue has been fixed in GNU assembler 2.41. but for compatible
>>>>> with GNU assembler 2.40 and below, $r0 need reserved.
>>>
>>> it sounds like the correct asm is $fcsr0, so that's
>>> what should be used on all assemblers, not just on
>>> clang as.
>>>
>>> only broken old gnu as should use different syntax.
>>> for this a configure test is needed that adds a
>>> CFLAG like -DBROKEN_LOONGARCH_FCSR_ASM when fails.
>>> and use that for the ifdef.
>>>
> --- a/configure
> +++ b/configure
> 
> if test "$ARCH" = "loongarch64" ; then
> trycppif __loongarch_soft_float "$t" && SUBARCH=${SUBARCH}-sf
> +printf "checking whether compiler support FCSRs... "
> +echo "__asm__(\"movfcsr2gr \$t0,\$fcsr0\");" > "$tmpc"
> +if $CC -c -o /dev/null "$tmpc" >/dev/null 2>&1 ; then
> +printf "yes\n"
> +else
> +printf "no\n"
> +CFLAGS_AUTO="$CFLAGS_AUTO -DBROKEN_LOONGARCH_FCSR_ASM"
> +fi
> fi
> 
> --- a/src/fenv/loongarch64/fenv.S
> +++ b/src/fenv/loongarch64/fenv.S
> @@ -1,9 +1,9 @@
>   #ifndef __loongarch_soft_float
> 
> -#ifdef __clang__
> -#define FCSR $fcsr0
> -#else
> +#ifdef BROKEN_LOONGARCH_FCSR_ASM
>   #define FCSR $r0
> +#else
> +#define FCSR $fcsr0
>   #endif
> 
>   .global        feclearexcept
> 
>>>>>
>>>>> The linux kernel also has a similar distinction:
>>>>>
>>>>> commit 38bb46f94544c5385bc35aa2bfc776dcf53a7b5d
>>>>> Author: WANG Xuerui <git@...0n.name>
>>>>> Date:   Thu Jun 29 20:58:43 2023 +0800
>>>>>
>>>>>      LoongArch: Prepare for assemblers with proper FCSR class support
>>>>>
>>>>>      The GNU assembler (as of 2.40) mis-treats FCSR operands as 
>>>>> GPRs, but
>>>>>      the LLVM IAS does not. Probe for this and refer to FCSRs 
>>>>> as"$fcsrNN"
>>>>>      if support is present.
>>>>>
>>>>>      Signed-off-by: WANG Xuerui <git@...0n.name>
>>>>>      Signed-off-by: Huacai Chen <chenhuacai@...ngson.cn>
>>>>>
>>>>>> Otherwise, everything looks as expected, I think. I'm okay with 
>>>>>> making
>>>>>> any fixups for merging rather than throwing this back on your team 
>>>>>> for
>>>>>> more revisions, but can you please follow up and clarify the above?
>>>
>>> all issues look minor.
>>>
>>> if you post a v8 it likely gets into a release faster.
>>>

-- 
Jingyun Hua

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.