Message-ID: <305c3b94-685d-4182-8944-ca10e941741e@foss.arm.com>
Date: Mon, 12 Jan 2026 10:44:04 -0600
From: Bill Roberts <bill.roberts@...s.arm.com>
To: musl@...ts.openwall.com
Subject: Re: [RFC 00/14] aarch64: Convert to inline asm



On 1/7/26 9:06 AM, Rich Felker wrote:
> On Thu, Dec 18, 2025 at 02:34:42AM -0600, Bill Roberts wrote:
>> Sorry for the delay folks, holidays and such. Happy 2026!
>>
>> On 12/8/25 1:10 PM, Rich Felker wrote:
>>> On Mon, Dec 08, 2025 at 11:44:43AM -0600, Bill Roberts wrote:
>>>> Based on previous discussions on enabling PAC and BTI for Aarch64
>>>> targets, rather than annotating the existing assembler, use inline
>>>> assembly and mix of C. Now this has the benefits of:
>>>> 1. Handling PAC, BTI and GCS.
>>>>      a. prologue and epilogue insertion as needed.
>>>>      b. Adding GNU notes as needed.
>>>> 2. Adding in the CFI statements as needed.
>>>>
>>>> I'd love to get feedback, thanks!
>>>>
>>>> Bill Roberts (14):
>>>>     aarch64: drop crt(i|n).s since NO_LEGACY_INITFINI
>>>>     aarch64: rewrite fenv routines in C using inline asm
>>>>     aarch64: rewrite vfork routine in C using inline asm
>>>>     aarch64: rewrite clone routine in C using inline asm
>>>>     aarch64: rewrite __syscall_cp_asm in C using inline asm
>>>>     aarch64: rewrite __unmapself in C using inline asm
>>>>     aarch64: rewrite tlsdesc routines in C using inline asm
>>>>     aarch64: rewrite __restore_rt routines in C using inline asm
>>>>     aarch64: rewrite longjmp routines in C using inline asm
>>>>     aarch64: rewrite setjmp routines in C using inline asm
>>>>     aarch64: rewrite sigsetjmp routines in C using inline asm
>>>>     aarch64: rewrite dlsym routine in C using inline asm
>>>>     aarch64: rewrite memcpy routine in C using inline asm
>>>>     aarch64: rewrite memset routine in C using inline asm
>>>
>>> Of these, at least vfork, tlsdesc, __restore_rt, setjmp, sigsetjmp,
>>> and dlsym are fundamentally wrong in that they have to be asm entry
>>> points. Wrapping them in C breaks the state they need to receive.
>>
>> I went through the generated code and ran tests against all of this and it
>> didn't break, is there some specific case or some compiler option
>> where this explodes? What state exactly gets trashed?
> 
> Generated code? That's not how this works. The code has to be
> semantically correct, not happen to produce machine code that's
> correct when compiled with the compiler you tested it with.

Yes, and we *agree* there. I see my mistake with vfork: in my head I was
thinking of fork semantics, which do not apply here (duh).

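For anyone curious: with vfork, the child shares the parent's address
space, stack included, until it calls exec or _exit. It therefore has to
be very careful about modifying state, and C can't guarantee that. For
instance, the compiler could spill to the stack, and the child could then
overwrite stack contents the parent still depends on when it resumes.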
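
To make that concrete, here is a minimal sketch of the only kind of thing
a vfork child can safely do. It assumes a POSIX/Linux system where vfork
is available; run_child_exit_code is a hypothetical helper made up for
this mail, not anything from the patch series or from musl.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* assumption: needed for vfork() under strict -std= modes */
#endif
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: vfork a child that exits immediately with `code`,
 * then return the exit status observed by the parent (-1 on error). */
int run_child_exit_code(int code)
{
    pid_t pid = vfork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: it is borrowing the parent's address space, so it must
         * not write to memory or return from this function. It only
         * calls _exit (or an exec function in real use). */
        _exit(code);
    }

    /* Parent: resumes only after the child has exited (or exec'd). */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_child_exit_code(7) from a test program should hand back 7.
Note that this only shows a well-behaved caller; it says nothing about
how vfork itself has to be implemented, which is the part that needs a
real asm entry point.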

So with that said, do you want these two patches now while I re-spin the rest?
- aarch64: drop crt(i|n).s since NO_LEGACY_INITFINI
- aarch64: rewrite fenv routines in C using inline asm


> 
> Rich
