Message-ID: <16461434-a374-4e23-9948-80640fd3b31e@foss.arm.com>
Date: Thu, 18 Dec 2025 02:34:42 -0600
From: Bill Roberts <bill.roberts@...s.arm.com>
To: musl@...ts.openwall.com
Subject: Re: [RFC 00/14] aarch64: Convert to inline asm

Sorry for the delay folks, holidays and such. Happy 2026!

On 12/8/25 1:10 PM, Rich Felker wrote:
> On Mon, Dec 08, 2025 at 11:44:43AM -0600, Bill Roberts wrote:
>> Based on previous discussions on enabling PAC and BTI for Aarch64
>> targets, rather than annotating the existing assembler, use inline
>> assembly and mix of C. Now this has the benefits of:
>> 1. Handling PAC, BTI and GCS.
>>    a. prologue and epilogue insertion as needed.
>>    b. Adding GNU notes as needed.
>> 2. Adding in the CFI statements as needed.
>>
>> I'd love to get feedback, thanks!
>>
>> Bill Roberts (14):
>>   aarch64: drop crt(i|n).s since NO_LEGACY_INITFINI
>>   aarch64: rewrite fenv routines in C using inline asm
>>   aarch64: rewrite vfork routine in C using inline asm
>>   aarch64: rewrite clone routine in C using inline asm
>>   aarch64: rewrite __syscall_cp_asm in C using inline asm
>>   aarch64: rewrite __unmapself in C using inline asm
>>   aarch64: rewrite tlsdesc routines in C using inline asm
>>   aarch64: rewrite __restore_rt routines in C using inline asm
>>   aarch64: rewrite longjmp routines in C using inline asm
>>   aarch64: rewrite setjmp routines in C using inline asm
>>   aarch64: rewrite sigsetjmp routines in C using inline asm
>>   aarch64: rewrite dlsym routine in C using inline asm
>>   aarch64: rewrite memcpy routine in C using inline asm
>>   aarch64: rewrite memset routine in C using inline asm
>
> Of these, at least vfork, tlsdesc, __restore_rt, setjmp, sigsetjmp,
> and dlsym are fundamentally wrong in that they have to be asm entry
> points. Wrapping them in C breaks the state they need to receive.

I went through the generated code and ran tests against all of this and
it didn't break. Is there some specific case or some compiler option
where this explodes? What state exactly gets trashed?

>
> Some others like __syscall_cp_asm are wrong by virtue of putting
> symbol definitions inside inline asm, which may be emitted a different
> number of times than it appears in the source. The labels in
> __syscall_cp_asm must exist only once, so it really needs to be
> external asm (for a slightly different reason than the entry point
> needing to be asm).

Ah yes, inlining could duplicate the label.

>
> The advice to move to inline asm was to do it where possible, i.e.
> where it's gratuitous that we had an asm source file. But even where
> this can be done, it should be done by actually writing the inline asm
> with proper register constraints, not just copy-pasting the asm into C
> files wrapped in __asm__. Some things, like __clone, even if they
> could be done as C source files with asm, are not valid the way you've
> just wrapped them because you're performing a return from within the
> asm but don't have access to the return address or any way to undo
> potential stack adjustments made in prologue before the __asm__. And
> this would catastrophically break if LTO'd.
>
> memcpy and memset are slated for "removal" at some point, replacing
> the high level flow logic in arch-specific asm with shared high level
> C and arch-provided asm only for the middle-section bulk copy/fill
> operation in aligned and unaligned variants. I'm really not up for
> reviewing and trusting in the correctness of large changes to any of
> the existing arch-specific memcpy/memset asm or adding new ones for
> other archs until then, because it's effort on something that's
> intended to be removed. So these should just be kept as-is for now.

Ok, so where we keep the asm the same, do you want to just always
include a BTI or PAC instruction as needed?

>
> The approach to the fenv changes looks roughly right. This is also
> something I'd like to do in an arch-generic way at some point, but
> there's no good reason not to do it first on aarch64 like you've
> proposed.

So perhaps I can send this as a separate patch, not in an RFC state?

>
> Removing crt[in].s is probably okay as well.

Same for this patch: send it as a non-RFC patch, ready to go?

>
> We generally prefer patch series as a single email with multiple MIME
> attachments instead of git send-email threads, if that's easy for you
> to do. It's not a big deal either way but it keeps folks' inbox volume
> down and makes it easier to reply with review of the whole series
> together.

I need to figure that out exactly, my git send-email foo is not strong.
Is this literally just firing up a mail client and sending an email with
attachments, or is this using git send-email with -m?

>
> Rich

Thanks Rich. Reading your comments, I think I'll need to re-architect
some of the code, which is not a problem. It would be something similar
in concept to the fenv patch style. I can use plain asm entry points
that just tail call into C; they would be marked with a single BTI C
instruction, which is just a NOP on platforms without BTI support.
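
To make the entry-point idea concrete, here is a minimal sketch. It is
not taken from the patch series: demo_entry and demo_worker are
hypothetical names, and the non-aarch64 fallback exists only so the
snippet builds everywhere. BTI C is written as "hint 34", its encoding
in the hint space, so assemblers without BTI support still accept it.

```c
/* Hypothetical C worker that the asm entry point tail-calls into.
 * "used" keeps it alive even though its only caller is the asm below;
 * "hidden" lets the direct branch bind locally without going via a PLT. */
__attribute__((used, visibility("hidden")))
int demo_worker(int x)
{
    return x + 1;
}

/* Hypothetical entry point; defined in asm on aarch64 ELF targets. */
int demo_entry(int x);

#if defined(__aarch64__) && defined(__ELF__)
__asm__(
    ".pushsection .text\n"
    ".global demo_entry\n"
    ".type demo_entry,%function\n"
    "demo_entry:\n"
    "    hint 34\n"        /* BTI C; executes as a NOP where BTI is absent */
    "    b demo_worker\n"  /* tail call: x0 (argument) and x30 (return address) untouched */
    ".size demo_entry, .-demo_entry\n"
    ".popsection\n"
);
#else
/* Assumption: plain C fallback for other targets, for illustration only. */
int demo_entry(int x)
{
    return demo_worker(x);
}
#endif
```

Because the entry point uses "b" rather than "bl", the worker returns
straight to the original caller, and the label is defined exactly once
at file scope instead of inside a function body that could be inlined or
duplicated. Whether this shape fits each routine listed above is a
separate question.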