Message-ID: <20250709142646.GK288056@port70.net>
Date: Wed, 9 Jul 2025 16:26:46 +0200
From: Szabolcs Nagy <nsz@...t70.net>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: aarch64 SME support issues

* Rich Felker <dalias@...c.org> [2025-07-08 12:20:11 -0400]:
> On Sun, Jul 06, 2025 at 08:20:40PM +0200, Szabolcs Nagy wrote:
> > * Rich Felker <dalias@...c.org> [2025-07-01 17:37:03 -0400]:
> > > There's a thread going on elsewhere (glibc, kernel folks, etc.) that
> > > I'm CC'd on but that has not been on the musl list so far, about
> > > support for the aarch64 SME extension. I was under the impression that
> > > the way things were done on the ISA side, it should be possible to
> > > support applications that use it as long as the kernel does the right
> > > things, without any consideration for whether libc is new enough to
> > > know about it. (This is a condition I would deem necessary for it to
> > > be a transparent, non-ABI-breaking addition.) However, it seems that
> > > may not be the case. Here is a link to the current tail of the thread
> > > (note that it extends back thru June and May as well):
> > > 
> > > https://sourceware.org/pipermail/libc-alpha/2025-July/168330.html
> > > 
> > > At present, we should not have any musl-linked applications attempting
> > > to use SME, since it's mandatory to check the hwcap bits for it, and
> > > we have never defined the corresponding hwcap macro. (However it's
> > > possible that someone is wrongly bypassing libc headers and using the
> > > kernel ones, or defining it themselves, in which case they get to keep
> > > both pieces.)
> > > 
> > > Anyway, the immediate question I have in mind in preparation for a
> > > release is whether we should do something to future-proof for this
> > > now. Specifically, should we have the aarch64 entry code mask off all
> > > unknown hwcap bits? This would make it so if at some point in the
> > > future we expose a macro for SME, applications don't detect it as
> > > available if they're run with 1.2.6. (Note: this wouldn't help with
> > > 1.2.5 or earlier, since that ship has already sailed.)
> > 
> > fwiw i would not fiddle with hwcap for this release
> 
> Based on what you've said below I think that's not a good idea. See
> inline responses:
> 
> > 1. there are ways around that (cpu id registers for features
> > are now emulated for userspace by linux and hwcap is visible
> > in auxv etc) so we cant do it cleanly.
> 
> The documentation I found for using SME says it's required to use the
> hwcap bit to determine availability, not other means.
> 
> > 2. users of sme za state should rarely longjmp or create threads
> > so we are worrying about a cornercase we havent seen in practice
> > yet.
> 
> Yes, that's not generally the way musl deals with safety tho.
> 
> > 3. i think libgcc does not enable sme for musl due to lack of
> > __getauxval (not configure detected for bootstrap reasons,
> > based on target triplet, on for *-linux-gnu) so discussion is
> > moot until libgcc is updated.
> 
> I'm planning to include your patch exposing __getauxval in this
> release, which would thereby enable SME support on musl in a way that
> would silently break. So it sounds like adding __getauxval and one of
> either masking off hwcap, or actually adding working SME support, need
> to happen "atomically" in the same release in order not to put broken
> configurations into the wild.
> 
> > morally the sme runtime should be in libc but it ended up in
> > libgcc because that's supportable in old glibc without abi
> > update, there were glibc vs gcc release schedule dependency
> > delays and testability problems otherwise and because 2. the
> > abi breakage is unlikely.
> 
> It sounds like this was a commercial consideration for rapidly pushing
> a new feature to be available on existing system versions not actually
> prepared to support it safely. The norm should be that new
> functionality doesn't necessarily work on older systems and
> applications need to be prepared for that.
> 
> > but yes currently the libc control over sme is via __getauxval
> > and hwcap masking if we want it off.
> > https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libgcc/config/aarch64/__aarch64_have_sme.c;hb=HEAD
> 
> Do you have a recommendation/preference between masking it off or
> dropping the __getauxval exposure for now?
> 
> I think I'd rather mask it off, since in the (unusual but plausible)
> case where a static-only toolchain is built, I think the libgcc
> configure test will see the hidden __getauxval and be able to use it
> already.
> 
> And if we do masking, I think it makes sense to mask off all unknown
> bits so this doesn't happen again in the future with the next new
> thing, but I'm not sure. Does this sound reasonable? Are there any
> cases where *hiding* a hwcap bit could result in malfunction?

ok i hadnt considered the __getauxval change, i think that
is useful to go in: it will take time to safely update libgcc
so better to add it sooner and potentially more widely useful
than just for SME.

i think hiding a hwcap bit may lead to inconsistencies due
to kernel behaving differently than what libc pretends,
but i don't have a strong case, it likely can only affect
hacky code. so likely no abi break for normal code.

e.g. kernel enables BTI on vdso (or static exe) and user code
trying to indirect jump into the middle of a function after
checking via the libc hwcap that bti is off.

or creating MTE tagged objects via mprotect + instructions
based on cpuid and then passing them to a function that is
only MTE safe when HWCAP_MTE is set.

or different parts of the atomics code trying to detect 128bit
lse atomics support differently (hwcap vs cpuid).
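
a minimal sketch of that last kind of mismatch, as a pure model
rather than real detection code: the function names, the
`libc_mask` parameter and the idea of feeding in the kernel's
hwcap word directly are all assumptions made for illustration,
nothing here is taken from musl, libgcc or the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model. kernel_hwcap is the word the kernel reports,
 * libc_mask is the set of bits libc lets through to callers. */
static bool feature_via_libc(uint64_t kernel_hwcap, uint64_t libc_mask,
                             uint64_t bit)
{
    return (kernel_hwcap & libc_mask & bit) != 0;
}

/* Stand-in for a check that bypasses libc (cpuid registers, raw auxv). */
static bool feature_via_cpu(uint64_t kernel_hwcap, uint64_t bit)
{
    return (kernel_hwcap & bit) != 0;
}

/* True when the two detection paths disagree about the same feature. */
static bool detection_mismatch(uint64_t kernel_hwcap, uint64_t libc_mask,
                               uint64_t bit)
{
    return feature_via_libc(kernel_hwcap, libc_mask, bit)
        != feature_via_cpu(kernel_hwcap, bit);
}
```

the two paths only disagree when the kernel sets a bit that the
libc mask hides; a bit the kernel never set is reported as absent
by both.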

note that HWCAP2 is all used up, and now the top 32 bits
of HWCAP are getting allocated (used to be reserved when
we thought ilp32 was a thing, now only the top 2 bits are
kept for libc to use), musl does not have AT_HWCAP3 but
user code may query that anyway as AT_* values are abi.
not sure if you plan to deal with AT_HWCAP3 too.

i think masking HWCAP_SME* and top bits of AT_HWCAP
above 1<<41 should be fine for now. presumably this
can be undone if sme support is added.
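
a rough sketch of what such masking could look like, not a patch:
the helper names are made up, the set of bits to hide is passed in
by the caller instead of being spelled out (the real values live in
the kernel hwcap header), and "above 1<<41" is read here as keeping
bit 41 and clearing bits 42..63, which is an assumption.

```c
#include <stdint.h>

/* Highest AT_HWCAP bit left visible. Assumption: "above 1<<41" means
 * bits 42..63 are cleared and bit 41 itself is kept. */
#define HWCAP_TOP_VISIBLE_BIT 41

/* Clear every bit above HWCAP_TOP_VISIBLE_BIT. */
static uint64_t mask_hwcap_top(uint64_t hwcap)
{
    uint64_t keep = (UINT64_C(1) << (HWCAP_TOP_VISIBLE_BIT + 1)) - 1;
    return hwcap & keep;
}

/* Clear an explicit set of bits, e.g. the SME-related ones.
 * The caller supplies the set; no real bit values are assumed here. */
static uint64_t hide_hwcap_bits(uint64_t hwcap, uint64_t hidden)
{
    return hwcap & ~hidden;
}
```

undoing it later would then be a matter of dropping bits from the
hidden set or raising the visible limit.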



> 
> > > The downside of this is that it would prevent using any other ISA
> > > features newer than what were available when the libc version shipped.
> > > But if ARM is potentially going to be making future ISA extensions
> > > breaking like this, it might be the safety-correct option.
> > > 
> > > If OTOH applications that use SME reference a libc-provided symbol
> > > (rather than a libgcc-provided one) to do the ABI magic, failure to
> > > resolve symbols would prevent them from being run unsafely, and
> > > there's not any issue.
> > 
> > for newlib libgcc uses a libc symbol __aarch64_sme_accessible
> > because there is no __getauxval.
> > 
> > but that's problematic for dynamic linking: the sme runtime
> > is in shared libgcc like the unwinder so all applications using
> > libgcc would fail not just the sme ones if the symbol is missing.
> 
> Indeed, that doesn't seem like a great idea.
> 
> Rich
