Message-ID: <20251208191058.GJ1827@brightrain.aerifal.cx>
Date: Mon, 8 Dec 2025 14:10:59 -0500
From: Rich Felker <dalias@...c.org>
To: Bill Roberts <bill.roberts@....com>
Cc: musl@...ts.openwall.com
Subject: Re: [RFC 00/14] aarch64: Convert to inline asm

On Mon, Dec 08, 2025 at 11:44:43AM -0600, Bill Roberts wrote:
> Based on previous discussions on enabling PAC and BTI for Aarch64
> targets, rather than annotating the existing assembler, use inline
> assembly and a mix of C. Now this has the benefits of:
> 1. Handling PAC, BTI and GCS.
>    a. prologue and epilogue insertion as needed.
>    b. Adding GNU notes as needed.
> 2. Adding in the CFI statements as needed.
>
> I'd love to get feedback, thanks!
>
> Bill Roberts (14):
>   aarch64: drop crt(i|n).s since NO_LEGACY_INITFINI
>   aarch64: rewrite fenv routines in C using inline asm
>   aarch64: rewrite vfork routine in C using inline asm
>   aarch64: rewrite clone routine in C using inline asm
>   aarch64: rewrite __syscall_cp_asm in C using inline asm
>   aarch64: rewrite __unmapself in C using inline asm
>   aarch64: rewrite tlsdesc routines in C using inline asm
>   aarch64: rewrite __restore_rt routines in C using inline asm
>   aarch64: rewrite longjmp routines in C using inline asm
>   aarch64: rewrite setjmp routines in C using inline asm
>   aarch64: rewrite sigsetjmp routines in C using inline asm
>   aarch64: rewrite dlsym routine in C using inline asm
>   aarch64: rewrite memcpy routine in C using inline asm
>   aarch64: rewrite memset routine in C using inline asm

Of these, at least vfork, tlsdesc, __restore_rt, setjmp, sigsetjmp,
and dlsym are fundamentally wrong in that they have to be asm entry
points. Wrapping them in C breaks the state they need to receive.

Some others like __syscall_cp_asm are wrong by virtue of putting
symbol definitions inside inline asm, which may be emitted a different
number of times than it appears in the source.
The labels in __syscall_cp_asm must exist only once, so it really
needs to be external asm (for a slightly different reason than the
entry point needing to be asm).

The advice to move to inline asm was to do it where possible, i.e.
where it's gratuitous that we had an asm source file. But even where
this can be done, it should be done by actually writing the inline asm
with proper register constraints, not just copy-pasting the asm into C
files wrapped in __asm__.

Some things, like __clone, even if they could be done as C source
files with asm, are not valid the way you've just wrapped them because
you're performing a return from within the asm but don't have access
to the return address or any way to undo potential stack adjustments
made in prologue before the __asm__. And this would catastrophically
break if LTO'd.

memcpy and memset are slated for "removal" at some point, replacing
the high level flow logic in arch-specific asm with shared high level
C and arch-provided asm only for the middle-section bulk copy/fill
operation in aligned and unaligned variants. I'm really not up for
reviewing and trusting in the correctness of large changes to any of
the existing arch-specific memcpy/memset asm or adding new ones for
other archs until then, because it's effort on something that's
intended to be removed. So these should just be kept as-is for now.

The approach to the fenv changes looks roughly right. This is also
something I'd like to do in an arch-generic way at some point, but
there's no good reason not to do it first on aarch64 like you've
proposed. Removing crt[in].s is probably okay as well.

We generally prefer patch series as a single email with multiple MIME
attachments instead of git send-email threads, if that's easy for you
to do. It's not a big deal either way but it keeps folks' inbox volume
down and makes it easier to reply with review of the whole series
together.

Rich
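
Illustration only, not part of the message above or of the patch
series: the remark about writing inline asm "with proper register
constraints" rather than wrapping a copied asm body in __asm__ might
look roughly like the sketch below for the fenv exception flags. The
names read_fpsr, write_fpsr and sketch_fe* are hypothetical, the mask
0x1f assumes the aarch64 FPSR layout (cumulative exception flags in
bits 0-4), and the non-aarch64 fallback exists only so the sketch
compiles on other hosts.

```c
#include <stdint.h>

/* Hypothetical helpers; names are assumptions, not taken from the series. */
#if defined(__aarch64__)
static inline uint64_t read_fpsr(void)
{
    uint64_t v;
    /* "=r" lets the compiler choose the destination register instead of
       hard-coding x0/x1 as a copied .s body would. */
    __asm__ __volatile__("mrs %0, fpsr" : "=r"(v));
    return v;
}

static inline void write_fpsr(uint64_t v)
{
    /* "r" input constraint: the compiler supplies the source register. */
    __asm__ __volatile__("msr fpsr, %0" : : "r"(v));
}
#else
/* Fallback for non-aarch64 hosts: a plain variable stands in for FPSR. */
static uint64_t fake_fpsr;
static inline uint64_t read_fpsr(void) { return fake_fpsr; }
static inline void write_fpsr(uint64_t v) { fake_fpsr = v; }
#endif

/* Assumed layout: aarch64 cumulative exception flags live in bits 0-4. */
#define EXCEPT_MASK 0x1fu

/* The control flow stays in C; only the register access is asm. */
int sketch_feclearexcept(int mask)
{
    write_fpsr(read_fpsr() & ~(uint64_t)(mask & EXCEPT_MASK));
    return 0;
}

int sketch_feraiseexcept(int mask)
{
    write_fpsr(read_fpsr() | (uint64_t)(mask & EXCEPT_MASK));
    return 0;
}

int sketch_fetestexcept(int mask)
{
    return (int)(read_fpsr() & (uint64_t)(mask & EXCEPT_MASK));
}
```

Because each asm statement is a single instruction with no labels, no
symbol definitions and no return, the compiler remains free to inline
or duplicate it, which is the property the objections to
__syscall_cp_asm and __clone above turn on.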