Message-ID: <ba825341-3024-4a1b-8f15-b34bd48e2955@foss.arm.com>
Date: Fri, 14 Nov 2025 00:43:41 -0600
From: Bill Roberts <bill.roberts@...s.arm.com>
To: Bill Roberts <bill.roberts@....com>, musl@...ts.openwall.com
Subject: Re: [PATCH 3/3] aarch64: enable PAC and BTI instruction support in musl build

On 11/14/25 10:18 AM, Szabolcs Nagy wrote:
> * Bill Roberts <bill.roberts@....com> [2025-11-13 13:44:28 -0600]:
>> This change adds support for Pointer Authentication (PAC) and Branch
>> Target Identification (BTI) within musl’s own code on AArch64. These
>> features improve control-flow integrity and mitigate return-oriented
>> programming attacks by hardening indirect branches and return
>> instructions.
>>
>> To integrate these instructions robustly across toolchains:
>>
>> - PAC and BTI instructions are inserted directly into the assembly,
>>   rather than being emitted via CFI directives. This approach is taken
>>   because it is far simpler to remove or rewrite instructions using AWK
>>   than to identify and manually annotate every location that requires
>>   them. New assembly code should therefore be written with PAC and BTI
>>   awareness in mind.
>
> i worked on the original gnu toolchain support, will
> add some notes.
>
> note that pac requires dwarf cfi directives for correct
> unwinding (it clobbers the return address), which matters
> for non-leaf functions e.g. when unwinding from a signal
> handler (musl optionally supports async unwind), or when
> debugging (broken backtrace in gdb is bad user experience).
> this cfi is a pac-specific extension so some dwarf based
> unwinders may not support it and crash on it (a cfi
> interpreter cannot ignore unknown op codes, and it is
> hard to bump the dwarf abi to notice this at link time),
> so the pac extension can cause crashes even on cpus where
> the instruction is a nop so i think we should only add
> pac if the user asked for it.

Oh yeah, I was initially thinking C only, so no unwinders, but gdb and the C++
users will need those CFI directives to indicate the stack frame has a signed
return address and which key was used.

>
> it seems the only non-leaf asm that requires pac is legacy
> _init/_fini and aarch64 seems to define NO_LEGACY_INITFINI
> so we can probably remove the current _init/_fini code (?)

That'd be nice, I always like removing code. Thanks for the review and insight
Szabolcs, appreciate it.

> then all the pac complexity goes away from the asm. (with
> the understanding that if aarch64 needs non-leaf asm in the
> future, that will require additional work)
>
>> --- a/crt/aarch64/crti.s
>> +++ b/crt/aarch64/crti.s
>> @@ -3,6 +3,7 @@
>>  .type _init,%function
>>  .align 2
>>  _init:
>> +    paciasp
>>      stp x29,x30,[sp,-16]!
>>      mov x29,sp
> ...
>> --- a/crt/aarch64/crtn.s
>> +++ b/crt/aarch64/crtn.s
>> @@ -1,7 +1,9 @@
>>  .section .init
>>      ldp x29,x30,[sp],#16
>> +    autiasp
>>      ret
>
> crti.s could be
>
> _init:
>     bti c
>     ret
>
> just in case somebody calls _init (musl does not).
>
>>
>> - Since some older toolchains may not recognize PAC or BTI mnemonics,
>>   the post-processing step rewrites them into equivalent `hint`
>>   instructions. This allows musl to continue building successfully on
>>   systems without assembler support for these instructions, while still
>>   emitting the correct opcodes when using newer toolchains.
>
> i think using an awk script is fine, but so far it did not
> modify the instructions, only added optional dwarf info.
>
> another solution is to use .S and a header with BTI_C macros.
> or just add bti unconditionally (it is a nop, so only adds
> minor size and performance overhead, the code alignment
> changes in memcpy may be measurable and there was at least
> one particular supercomputer where hints were not as fast as
> normal nop, but i think for musl this is a valid choice)
> handling the gnu property note is uglier then though, awk
> is probably the best for that.

You can just include a header file and ifdef on ASM add the GNU note, this is
how it's done for other projects. I'll repeat myself here, for readers, but as
I said in Patch 2/3, I would prefer to use the CPP and move .S files.

>
>> .hidden __fesetround
>> .type __fesetround,%function
>> __fesetround:
>> +    bti c
>>      mrs x1, fpcr
>
> note: bti c is only required if the symbol may be called
> indirectly, so for hidden or local symbols a library may
> omit the bti c if it ensures there are no indirect calls
> to them. normally the toolchain may introduce indirect
> calls e.g. if a direct call goes too far in a huge binary
> the linker adds a stub that reaches the target with an
> indirect branch (via x16/x17 registers that are reserved
> for such tail calls and allowed to land on bti c). but it
> was decided that for bti enabled binaries a linker must
> not add such indirect branches to targets without bti c/j
> enabling a compiler to omit some bti c/j. i think we dont
> want to introduce constraints on the generic c code in
> musl so hidden symbols should keep the bti, local
> functions can avoid it but musl has no such case in asm.
>
>> +++ b/arch/aarch64/crt_arch.h
>> @@ -3,6 +3,9 @@ __asm__(
>>  ".global " START "\n"
>>  ".type " START ",%function\n"
>>  START ":\n"
>> +#if defined(__ARM_FEATURE_BTI_DEFAULT)
>> +" hint 34\n" /* bti c */
>> +#endif
>
> i'd use
> #if __ARM_FEATURE_BTI_DEFAULT
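
The archived message stops at this suggestion without giving a reason for it. One plausible reading, which is an assumption and not something stated in the thread, is that `#if MACRO` tests the macro's value while `#if defined(MACRO)` only tests that it exists, so the two disagree when a macro is defined as 0. The sketch below shows that difference using the hypothetical stand-ins `DEMO_FEATURE` and `DEMO_UNDEFINED_FEATURE`; neither is a real ACLE macro, and the helper function names are invented for illustration.

```c
/* Hypothetical stand-in for a feature macro that something has
 * defined as 0 to mean "disabled". Not a real ACLE macro. */
#define DEMO_FEATURE 0

/* Returns 1 when the `#if defined(...)` guard is taken.
 * The guard only checks that the macro exists, so a value of 0
 * still enables the guarded code. */
static inline int guard_defined(void)
{
#if defined(DEMO_FEATURE)
	return 1;
#else
	return 0;
#endif
}

/* Returns 1 when the plain `#if ...` guard is taken.
 * The guard evaluates the macro's value, so 0 disables it. */
static inline int guard_value(void)
{
#if DEMO_FEATURE
	return 1;
#else
	return 0;
#endif
}

/* An identifier that is not defined at all evaluates to 0 in `#if`,
 * so the plain form also covers the "macro absent" case. */
static inline int guard_undefined(void)
{
#if DEMO_UNDEFINED_FEATURE
	return 1;
#else
	return 0;
#endif
}
```

With these definitions `guard_defined()` returns 1 while `guard_value()` and `guard_undefined()` both return 0. A guard written as `#if __ARM_FEATURE_BTI_DEFAULT` would behave like `guard_value()`, emitting the `hint 34` only when the macro is both present and nonzero.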