Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHs-p=ifPY=Hh1UuJrDyP3N8kmpB-ryfR=4AcfvfoWQDa1GVxg@mail.gmail.com>
Date: Mon, 12 May 2025 21:20:00 +0100
From: Andy Caldwell <andy.m.caldwell@...glemail.com>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: [PATCH] force -fno-lto for CRT files

On Mon, May 12, 2025 at 6:47 PM Rich Felker <dalias@...c.org> wrote:
>
> On Mon, May 12, 2025 at 04:15:34PM +0100, Andy Caldwell wrote:
> > Hey,
> >
> > An alternative approach would be to mark the function as `used` which the
> > attached patch does.  Sadly this attribute appears to maybe be a GNU-ism (also
> > supported in all versions of clang I could find) so maybe it would need a
> > feature check too to handle other compilers, but this feels like the correct
> > fix - telling the compiler that the symbol is used rather than disabling
> > compiler features that misbehave because the compiler can't see the indirect
> > calls.
> >
> >
> > From 988328716d269005ebb55095d74576bf88f5154f Mon Sep 17 00:00:00 2001
> > From: Andy Caldwell <andy.m.caldwell@...glemail.com>
> > Date: Mon, 12 May 2025 16:01:52 +0100
> > Subject: [PATCH] Mark `__dls2` as `__attribute__((used))`
> >
> > This function is only referenced indirectly (though an explicit symbol
> > lookup on inline assembly) which causes LTO to strip it, causing linker
> > errors.  The `used` attribute tells the compiler (including the LTO
> > phase) that the function is referenced, even if the compiler can't see
> > that, preventing the function getting dropped.
> > ---
> >  ldso/dynlink.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/ldso/dynlink.c b/ldso/dynlink.c
> > index 715948f4..ab78274e 100644
> > --- a/ldso/dynlink.c
> > +++ b/ldso/dynlink.c
> > @@ -1704,7 +1704,7 @@ static void install_new_tls(void)
> >   * symbols. Its job is to perform symbolic relocations on the dynamic
> >   * linker itself, but some of the relocations performed may need to be
> >   * replaced later due to copy relocations in the main program. */
> > -
> > +__attribute__((used))
> >  hidden void __dls2(unsigned char *base, size_t *sp)
> >  {
> >       size_t *auxv;
> > --
> > 2.49.0
>
> Is that even the only problem? My understanding is that this bug in
> LTO affects all the crt start files as well, and in those it's more
> explicit: it's ignoring a top-level asm root not just a reference via
> inline asm inside a function.

If the goal is to build `libc.so` with artifact-wide optimisations enabled,
then this is the only change I needed to get that working (including
in its `ld-musl` and `ldd` modes).

Note that naively passing `-flto` in `CFLAGS also causes `rcrti.o` and
`Scrti.o` to be LLVM bitcode rather than object files (since they don't
get re-compiled after their initial LTO-compatible compilation) so at
least _some_ amount of filtering makes sense here (though see below).

> On top of that, LTO-type transformations are wrong/unsafe for code
> that executes prior to relocations and passing control to the entry
> point, since at least in theory they could lift things without side
> effects to somehow run before relocations or other initialization they
> depend upon has happened. In some sense, this code runs in a separate,
> lower-level execution environment operating *on* the memory of the
> program that will run rather than *as part of* the program. So I think
> it makes sense to suppress LTO.

Is this true?  My understanding is that LTO optimizations follow the same
rules as compiler optimizations (this specific case feels special but it's just
dead-code-elimination that the per-unit compilations can't do since they
can't see the whole binary).  Currently start-files are compiled with `-O2`
so any problematic lifting etc could already happen.

Out of perverse interest, I compiled `libc.a`, `Scrti.o` and `rcrti.o`
with `-flto`
and built some dynamic and static(-pie) binaries with `-flto` which all just
worked out of the box (the main complexity was convincing `musl-clang`
to use `lld` or `gold`).  If anyone wants to explore this further,
`rcrti.o` also
needs another `((used))` on it's version of `__dls2` for exactly the same
reason `dynlink` needs one.

> I kinda don't like introducing a second parallel set of "NOLTO" rules
> that are identical to the "NOSSP" ones though, when the motivation for
> both (that the code runs prior to the program/runtime being ready) is
> the same. Maybe we could rename "NOSSP" to something more descriptive
> and use the same CFLAGS & makefile rules for both? OTOH, I'd kinda
> rather not change/remove CFLAGS_NOSSP in case it breaks folks' custom
> config.mak's.. but maybe I'm the only one with one of those anyway.
>
> Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.