Message-ID: <20251114192158.GK3520958@port70.net>
Date: Fri, 14 Nov 2025 20:21:58 +0100
From: Szabolcs Nagy <nsz@...t70.net>
To: Rich Felker <dalias@...c.org>
Cc: Demi Marie Obenour <demiobenour@...il.com>, musl@...ts.openwall.com,
	Da Xie <xxie_xd@....com>
Subject: Re: Future plans for RISC-V Zicfiss/Zicfilp support?

* Rich Felker <dalias@...c.org> [2025-11-14 12:41:40 -0500]:
> On Fri, Nov 14, 2025 at 12:34:33PM -0500, Demi Marie Obenour wrote:
> > On 11/13/25 08:36, Rich Felker wrote:
> > > On Thu, Nov 13, 2025 at 05:19:08PM +0800, Da Xie wrote:
> > >> Hi everyone,
> > >>
> > >> I'm new to the musl community and was exploring its support for RISC-V.
> > >>
> > >> I was wondering if there are any plans to support the Zicfiss (shadow
> > >> stack) and/or Zicfilp (landing pads) extensions in the future. I
> > >> understand these are relatively new extensions aimed at improving
> > >> security (similar in spirit to Arm's GCS).
> > > 
> > > Plans, no, and probably not. We have not supported similar things for
> > > other architectures because they break existing API contracts about
> > > how the stack can be used and make it impossible to free resources or
> > > make promises not to enter unrecoverable late-failure situations, and
> > > because the idea of playing whack-a-mole with gadgets when you have
> > > functions like system() present as valid call targets anyway seems
> > > like very misplaced hardening effort in terms of cost vs benefits.
> > > 
> > > If there's some way it can work in a non-contract-breaking way,
> > > supporting it could be on the table eventually, but it's up to folks
> > > who want it to explain convincingly how that could work.
> > > 
> > > Rich
> > 
> > If you are okay with sharing, do you have an explanation for why
> > shadow stacks cause resource problems?  Stacks used to be executable,
> > so making them non-executable was already a compatibility break in
> > the past.
> 
> There was never a contract that you can execute code on the stack or
> even any way in the standard language to put code on the stack. It
> required writing asm or using dubious compiler extensions that were
> never portable and which musl never purported to support.
> 
> There are contracts with sigaltstack, pthread_attr_setstack,
> makecontext, etc. that allow an application to specify its own stack
> that code will run on, but that don't have any way to specify a shadow
> stack. If the implementation allocates its own shadow stack behind the
> scenes for this purpose, there is no way for the application to tell
> the implementation when it's okay to release that allocation, and it
> may also be the case (I forget the details; maybe someone can remind
> me) that the allocation happens too late for the application to be
> able to handle failure.

makecontext can't report errors, so allocation failure is fatal.
a new api would be needed to deal with this and with freeing, but
since it is a broken legacy api the current glibc implementation
just hacked in a shadow stack that "mostly works".
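
To make the makecontext point concrete, here is a minimal sketch of the
existing contract, assuming a libc that provides the ucontext functions.
The helper name run_on_own_stack and the static contexts are hypothetical,
chosen only for illustration. The application allocates and frees the
regular stack itself, while makecontext returns void, so a hidden shadow
stack allocation would have no way to report failure and no matching call
to release it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

static ucontext_t main_ctx, co_ctx;
static volatile int co_ran;

/* Entry point that runs on the application-supplied stack. */
static void co_entry(void)
{
	co_ran = 1;
	/* Returning here resumes co_ctx.uc_link, i.e. main_ctx. */
}

/* Hypothetical helper: returns 0 if co_entry ran, -1 on a visible failure. */
int run_on_own_stack(size_t stack_size)
{
	void *stack;

	assert(stack_size > 0);
	stack = malloc(stack_size);
	if (!stack)
		return -1;      /* the application sees and handles this failure */

	if (getcontext(&co_ctx) != 0) {
		free(stack);
		return -1;
	}
	co_ctx.uc_stack.ss_sp = stack;
	co_ctx.uc_stack.ss_size = stack_size;
	co_ctx.uc_link = &main_ctx;

	/* makecontext returns void: anything the implementation allocates
	 * behind the scenes here cannot fail in a way the caller can see. */
	makecontext(&co_ctx, co_entry, 0);

	co_ran = 0;
	if (swapcontext(&main_ctx, &co_ctx) != 0) {
		free(stack);
		return -1;
	}

	/* The application frees its own stack; the api has no call that
	 * would tell the implementation to free anything it allocated. */
	free(stack);
	return co_ran ? 0 : -1;
}
```

A call such as run_on_own_stack(64 * 1024) exercises the whole lifecycle.
Every failure the caller can observe comes from a step the caller
performs itself.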

sigaltstack i think ended up not using an alt-shadow-stack but the
interrupted shadow stack, which is wrong 1) in case the signal
handler is deep or 2) if we are trying to handle a shadow stack
overflow safely (pretty much the main use case of an altstack) or
3) if one longjmps between various stacks (e.g. out from the
altstack and then back). these were considered to be corner cases
of a corner case. for 2) the idea was to allocate shadow stacks
large enough that it never happens. this way there is no
alt-stack resource management problem, and longjmp and unwind are
simpler too.
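
For reference, a minimal sketch of the sigaltstack contract being
discussed. The helper name handler_runs_on_altstack, the 64 KiB size and
the use of SIGUSR1 are assumptions made for this example. The application
hands over exactly one region through stack_t (ss_sp, ss_size, ss_flags),
and nothing in that interface describes a shadow stack, so it does not
say which shadow stack the handler should use.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>

#define ALT_STACK_SIZE (64 * 1024)   /* arbitrary size for this sketch */

static char *alt_base;
static volatile sig_atomic_t ran_on_alt;

static void on_sigusr1(int sig)
{
	char marker;   /* a local's address approximates the stack pointer */
	uintptr_t sp = (uintptr_t)&marker;
	uintptr_t lo = (uintptr_t)alt_base;

	(void)sig;
	ran_on_alt = (sp >= lo && sp < lo + ALT_STACK_SIZE);
}

/* Hypothetical helper: 1 if the handler ran on the alternate stack,
 * 0 if it did not, -1 if setup failed. Replaces and then disables any
 * alternate stack the calling thread had installed. */
int handler_runs_on_altstack(void)
{
	stack_t ss = { 0 };
	struct sigaction sa = { 0 }, old_sa;
	int result = -1;

	assert(alt_base == NULL);        /* this sketch is not reentrant */
	alt_base = malloc(ALT_STACK_SIZE);
	if (!alt_base)
		return -1;

	/* All the application can specify: base, size, flags. */
	ss.ss_sp = alt_base;
	ss.ss_size = ALT_STACK_SIZE;
	ss.ss_flags = 0;
	if (sigaltstack(&ss, NULL) != 0)
		goto out_free;

	sa.sa_handler = on_sigusr1;
	sa.sa_flags = SA_ONSTACK;        /* request delivery on the alt stack */
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, &old_sa) != 0)
		goto out_disable;

	ran_on_alt = 0;
	if (raise(SIGUSR1) == 0)
		result = ran_on_alt;
	sigaction(SIGUSR1, &old_sa, NULL);

out_disable:
	ss.ss_flags = SS_DISABLE;        /* release the region before freeing */
	sigaltstack(&ss, NULL);
out_free:
	free(alt_base);
	alt_base = NULL;
	return result;
}
```

The disable-then-free step at the end is the application-side resource
management that an implicitly allocated alt-shadow-stack would lack.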

threads are a bit annoying (mainly because clone3 handles
thread/fork/vfork in a hairy way and they all require different
stack behaviour) but resource management should be fine
(either kernel or user side). legacy clone does not know
the stack size and the shadow stack cannot be easily managed
on the user side, so the kernel has to guess the size, but
switching to clone3 can fix this.

another resource problem is simply the shadow stack size
for the main thread. RLIMIT_STACK can change dynamically,
so the main stack has no reasonable upper bound, and an
application may have RLIMIT_AS (or strict overcommit) set,
so the extra stacks can run into limits.

i think if distros are happy with such trade-offs then
adding shadow stack support can be an option.
