Message-ID: <20251114161837.GH3520958@port70.net>
Date: Fri, 14 Nov 2025 17:18:37 +0100
From: Szabolcs Nagy <nsz@...t70.net>
To: Bill Roberts <bill.roberts@....com>
Cc: musl@...ts.openwall.com
Subject: Re: [PATCH 3/3] aarch64: enable PAC and BTI instruction
 support in musl build

* Bill Roberts <bill.roberts@....com> [2025-11-13 13:44:28 -0600]:
> This change adds support for Pointer Authentication (PAC) and Branch
> Target Identification (BTI) within musl’s own code on AArch64. These
> features improve control-flow integrity and mitigate return-oriented
> programming attacks by hardening indirect branches and return
> instructions.
> 
> To integrate these instructions robustly across toolchains:
> 
>  - PAC and BTI instructions are inserted directly into the assembly,
>    rather than being emitted via CFI directives. This approach is taken
>    because it is far simpler to remove or rewrite instructions using AWK
>    than to identify and manually annotate every location that requires
>    them. New assembly code should therefore be written with PAC and BTI
>    awareness in mind.

i worked on the original gnu toolchain support, so i will
add some notes.

note that pac requires dwarf cfi directives for correct
unwinding (it clobbers the return address). this matters
for non-leaf functions, e.g. when unwinding from a signal
handler (musl optionally supports async unwind) or when
debugging (a broken backtrace in gdb is a bad user experience).
this cfi is a pac-specific extension, so some dwarf based
unwinders may not support it and may crash on it (a cfi
interpreter cannot ignore unknown opcodes, and it is
hard to bump the dwarf abi to notice this at link time).
so the pac extension can cause crashes even on cpus where
the instruction is a nop, and i think we should only add
pac if the user asked for it.
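
to illustrate the cfi point above, here is a rough sketch of what a
pac-protected non-leaf function looks like with unwind info in gnu as
syntax. the names example_nonleaf and callee are hypothetical and this
is not code from the patch. it assumes an assembler that knows the
.cfi_negate_ra_state directive, and uses the hint encodings so that the
pac mnemonics themselves are not needed.

```
	.text
	.global example_nonleaf          // hypothetical function name
	.type example_nonleaf,%function
example_nonleaf:
	.cfi_startproc
	hint 25                          // paciasp: sign x30 using sp as modifier
	.cfi_negate_ra_state             // pac-specific cfi: return address is now signed
	stp x29,x30,[sp,-16]!
	.cfi_def_cfa_offset 16
	.cfi_offset 29,-16
	.cfi_offset 30,-8
	mov x29,sp
	bl callee                        // hypothetical callee, makes this non-leaf
	ldp x29,x30,[sp],16
	.cfi_restore 30
	.cfi_restore 29
	.cfi_def_cfa_offset 0
	hint 29                          // autiasp: authenticate x30
	.cfi_negate_ra_state             // return address is plain again
	ret
	.cfi_endproc
```

an unwinder that does not understand the negate_ra_state op has no way
to skip it, which is the failure mode described above.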

it seems the only non-leaf asm that requires pac is legacy
_init/_fini and aarch64 seems to define NO_LEGACY_INITFINI
so we can probably remove the current _init/_fini code (?)
then all the pac complexity goes away from the asm. (with
the understanding that if aarch64 needs non-leaf asm in the
future, that will require additional work)

> --- a/crt/aarch64/crti.s
> +++ b/crt/aarch64/crti.s
> @@ -3,6 +3,7 @@
>  .type _init,%function
>  .align 2
>  _init:
> +	paciasp
>  	stp x29,x30,[sp,-16]!
>  	mov x29,sp
...
> --- a/crt/aarch64/crtn.s
> +++ b/crt/aarch64/crtn.s
> @@ -1,7 +1,9 @@
>  .section .init
>  	ldp x29,x30,[sp],#16
> +	autiasp
>  	ret

crti.s could be

_init:
	bti c
	ret

just in case somebody calls _init (musl does not).

> 
>  - Since some older toolchains may not recognize PAC or BTI mnemonics,
>    the post-processing step rewrites them into equivalent `hint`
>    instructions. This allows musl to continue building successfully on
>    systems without assembler support for these instructions, while still
>    emitting the correct opcodes when using newer toolchains.

i think using an awk script is fine, but so far it did not
modify the instructions, only added optional dwarf info.

another solution is to use .S and a header with BTI_C macros,
or to just add bti unconditionally (it is a nop, so it only adds
minor size and performance overhead; the code alignment
changes in memcpy may be measurable, and there was at least
one particular supercomputer where hints were not as fast as
a normal nop, but i think for musl this is a valid choice).
handling the gnu property note is uglier in that case though;
awk is probably the best for that.
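
a minimal sketch of the .S + header alternative, assuming the file is
run through the c preprocessor. BTI_C is the macro name mentioned above;
GNU_PROPERTY_BTI and example_func are hypothetical names made up for
this illustration, and the note layout follows the usual
NT_GNU_PROPERTY_TYPE_0 format with GNU_PROPERTY_AARCH64_FEATURE_1_AND.
this is not what the patch does, only what the macro approach could
look like.

```
/* hypothetical header part, shared by all .S files */
#if __ARM_FEATURE_BTI_DEFAULT
#define BTI_C hint 34              /* bti c */
#define GNU_PROPERTY_BTI 1         /* bti bit of FEATURE_1_AND */
#else
#define BTI_C
#define GNU_PROPERTY_BTI 0
#endif

	.text
	.global example_func       /* hypothetical function name */
	.type example_func,%function
example_func:
	BTI_C
	ret

#if GNU_PROPERTY_BTI
	.section .note.gnu.property,"a"
	.align 3
	.word 4                    /* name size: "GNU\0" */
	.word 16                   /* descriptor size */
	.word 5                    /* NT_GNU_PROPERTY_TYPE_0 */
	.string "GNU"
	.word 0xc0000000           /* GNU_PROPERTY_AARCH64_FEATURE_1_AND */
	.word 4                    /* property data size */
	.word GNU_PROPERTY_BTI     /* feature bits: bti = 1, pac = 2 */
	.word 0                    /* padding to 8 byte alignment */
#endif
```

the note has to be present in every object for the linker to mark the
output as bti compatible, which is why it is awkward to maintain by hand
in each asm file.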

>  .hidden __fesetround
>  .type __fesetround,%function
>  __fesetround:
> +	bti c
>  	mrs x1, fpcr

note: bti c is only required if the symbol may be called
indirectly, so for hidden or local symbols a library may
omit the bti c if it ensures there are no indirect calls
to them. normally the toolchain may introduce indirect
calls, e.g. if a direct call goes too far in a huge binary
the linker adds a stub that reaches the target with an
indirect branch (via the x16/x17 registers, which are reserved
for such tail calls and allowed to land on bti c). but it
was decided that for bti enabled binaries a linker must
not add such indirect branches to targets without bti c/j,
enabling a compiler to omit some bti c/j. i think we don't
want to introduce constraints on the generic c code in
musl, so hidden symbols should keep the bti; local
functions can avoid it, but musl has no such case in asm.
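
a small sketch of the direct vs indirect call cases described above.
all symbol names here (exported_func, caller, stub_to_exported_func) are
hypothetical, and the stub only shows the general shape of a long-branch
stub, not the exact code any particular linker emits.

```
	.text
	.global exported_func            // hypothetical, may be called indirectly
	.type exported_func,%function
exported_func:
	hint 34                          // bti c: landing pad for blr, and for br via x16/x17
	ret

	.type caller,%function           // hypothetical caller
caller:
	hint 34                          // bti c
	stp x29,x30,[sp,-16]!
	bl exported_func                 // direct call: needs no landing pad at the target
	adrp x1,exported_func
	add x1,x1,:lo12:exported_func
	blr x1                           // indirect call: target must start with bti c
	ldp x29,x30,[sp],16
	ret

// general shape of a long-branch stub (illustrative only)
stub_to_exported_func:
	adrp x16,exported_func
	add x16,x16,:lo12:exported_func
	br x16                           // br via x16/x17 may land on bti c
```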

> +++ b/arch/aarch64/crt_arch.h
> @@ -3,6 +3,9 @@ __asm__(
>  ".global " START "\n"
>  ".type " START ",%function\n"
>  START ":\n"
> +#if defined(__ARM_FEATURE_BTI_DEFAULT)
> +"	hint 34\n" /* bti c */
> +#endif

i'd use
#if __ARM_FEATURE_BTI_DEFAULT
