Date: Mon, 26 Sep 2022 19:02:42 -0400
From: Rich Felker <dalias@...c.org>
To: Colin Cross <ccross@...roid.com>
Cc: musl@...ts.openwall.com, Ryan Prichard <rprichard@...gle.com>
Subject: Re: Running musl executables without a preinstalled dynamic
 linker

On Mon, Sep 26, 2022 at 03:42:01PM -0700, Colin Cross wrote:
> On Mon, Sep 26, 2022 at 3:38 PM Colin Cross <ccross@...roid.com> wrote:
> >
> > On Tue, Aug 23, 2022 at 1:18 AM Szabolcs Nagy <nsz@...t70.net> wrote:
> > >
> > > * Colin Cross <ccross@...roid.com> [2022-08-22 17:22:06 -0700]:
> > > > On Sat, Aug 20, 2022 at 2:43 AM Szabolcs Nagy <nsz@...t70.net> wrote:
> > > > > i would not use Scrt1.o though, the same toolchain should be
> > > > > usable for normal linking and relinterp linking, just use a
> > > > > different name like Xcrt1.o.
> > > >
> > > > Is there some way to get gcc/clang to use Xcrt1.o without using
> > > > -nostdlib and passing all the crtbegin/end objects manually?
> > >
> > > this requires compiler changes (new cmdline flag) but then i think
> > > the code is upstreamable.
> >
> > I've used relinterp.o for now, and selected it instead of Scrt1.o in
> > musl-gcc.specs and ld.musl-clang.
> >
> > >
> > > > > i would make Xcrt1.o self-contained and size optimized: it only
> > > > > runs at start up, this is a different requirement from the -O3
> > > > > build of normal string functions. and then there is no dependency
> > > > > on libc internals (which may have various instrumentations that
> > > > > do not work in Xcrt1.o).
> > > >
> > > > Doesn't this same logic apply to most of the code in dynlink.c?  My
> > > > main worry with a self contained implementation is that it requires
> > > > reimplementations of various string functions that are easy to get
> > > > wrong.  The current prototype reuses the C versions of musl's string
> > > > functions, but implements its own syscall wrappers to avoid
> > > > interactions with musl internals like errno.
> > >
> > > dynlink is in libc.so so it can use code from there.
> > >
> > > but moving libc code into the executable has different constraints.
> > > so you will have to make random decisions that string functions are
> > > in but errno is out, wrt which libc internal makes sense in the exe.
> > >
> > > i would just keep a separate implementation (or at least compile
> > > the code separately). string functions are easy to implement if
> > > you dont try to optimize them imo. then you have full control over
> > > what is going on in the exe entry code.
> >
> > I left the reimplementations of string functions and syscalls as
> > suggested.  Patch attached.

> From 0df460188b95f79272003bd0e5c12bceb2a3c25f Mon Sep 17 00:00:00 2001
> From: Colin Cross <ccross@...roid.com>
> Date: Thu, 22 Sep 2022 19:14:01 -0700
> Subject: [PATCH] Add entry point to find dynamic loader relative to the
>  executable
> 
> Distributing binaries built against musl to systems that don't already
> have musl is problematic due to the hardcoded absolute path to the
> dynamic loader (e.g. /lib/ld-musl-$ARCH.so.1) in the PT_INTERP header.
> This patch adds a feature to avoid the problem by leaving out PT_INTERP
> and replacing Scrt1.o with an entry point that can find the dynamic
> loader using DT_RUNPATH or LD_LIBRARY_PATH.
> 
> The entry point is in crt/relinterp.c.  It uses auxval to get the
> program headers and find the load address of the binary, then
> searches LD_LIBRARY_PATH or DT_RUNPATH for the dynamic loader.
> Once found, it mmaps the loader similar to the way the kernel
> does when PT_INTERP is set.  The musl loader uses PT_INTERP to set
> the path to the loader in the shared library info exported to the
> debugger, so relinterp creates a copy of the program headers
> with the PT_INTERP entry added pointing to the found location of
> the dynamic loader.  It updates AT_BASE to point to the address
> of the dynamic loader, then jumps to the loader's entry point.
> 
> The dynamic loader then loads shared libraries and handles
> relocations before jumping to the executable's entry point, which is
> the entry point in relinterp.c again.  Relinterp detects that
> relocations have been performed and calls __libc_start_main, the
> same way Scrt1.o would have.
> 
> Since relinterp runs before relocations have been performed it has
> to avoid referencing any libc functions.  That means reimplementing
> the few syscalls and string functions that it uses, and avoiding
> implicit calls to memcpy and memset that may be inserted by the
> compiler.
> 
> Enabling relinterp is handled in the spec file for gcc and in
> the linker script for clang via a -relinterp argument.
> 
> Normally gdb and lldb look for a symbol named "_dl_debug_state" in
> the interpreter to get notified when the dynamic loader has modified
> the list of shared libraries.  When using relinterp the debugger is
> not aware of the interpreter (at process launch PT_INTERP is unset
> and auxv AT_BASE is 0) so it doesn't know where to look for the symbol.
> 
> They fall back to looking in the executable, so we can provide a symbol
> in relinterp.c for them to find.  The dynamic loader is then modified
> to also find the symbol in the executable and to call it from its own
> _dl_debug_state function.
> 
> The same tests in libc_test pass with or without LDFLAGS += -relinterp
> with both musl-gcc and musl-clang.
> 
> Ryan Prichard (rprichard@...gle.com) authored the original prototype
> of relinterp.
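
For readers following the description above, the "copy of the program headers with the PT_INTERP entry added" step could look roughly like the sketch below. It is not the patch's code: the function name phdrs_with_interp, the 64-bit-only types, and the choice of field values are assumptions made for illustration. The new entry is placed before the first PT_LOAD because the ELF gABI requires PT_INTERP to precede any loadable segment entry.

```c
#include <elf.h>
#include <stddef.h>

/* Hypothetical sketch, 64-bit ELF only. Copies n program headers from in
 * to out (which must have room for n + 1 entries) and adds a PT_INTERP
 * entry describing a NUL-terminated path string already in memory at
 * interp_vaddr, interp_len bytes long including the NUL.
 * Assumes in contains no PT_INTERP entry. Returns the new entry count. */
size_t phdrs_with_interp(const Elf64_Phdr *in, size_t n, Elf64_Phdr *out,
                         Elf64_Addr interp_vaddr, Elf64_Xword interp_len)
{
    Elf64_Phdr interp = {0};
    interp.p_type   = PT_INTERP;
    interp.p_flags  = PF_R;
    interp.p_vaddr  = interp_vaddr;
    interp.p_paddr  = interp_vaddr;
    interp.p_filesz = interp_len;
    interp.p_memsz  = interp_len;
    interp.p_align  = 1;
    /* p_offset stays 0: the string is not assumed to exist in the file. */

    size_t o = 0;
    int placed = 0;
    for (size_t i = 0; i < n; i++) {
        /* PT_INTERP has to come before the first loadable segment. */
        if (!placed && in[i].p_type == PT_LOAD) {
            out[o++] = interp;
            placed = 1;
        }
        out[o++] = in[i];
    }
    if (!placed)
        out[o++] = interp;   /* no PT_LOAD at all: append at the end */
    return o;
}
```

This is ordinary hosted C; code that runs before relocation would also have to avoid the struct assignments here, since a compiler may lower them to memcpy calls.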

Have you looked at https://www.openwall.com/lists/musl/2020/03/29/9
where this has already been done? It's not upstream but my
understanding is that the author has been using it successfully for a
long time, and it's been through some review and as I recall was at
least close to acceptable for upstream.

Rich