Message-ID: <20251001120857.GF1827@brightrain.aerifal.cx>
Date: Wed, 1 Oct 2025 08:08:57 -0400
From: Rich Felker <dalias@...c.org>
To: Mike Hilgendorf <mike@...gram.dev>
Cc: musl@...ts.openwall.com
Subject: Re: `unsetenv()` does not always work when run in an
 `__attribute__((constructor))` function

On Wed, Oct 01, 2025 at 08:06:16AM -0400, Rich Felker wrote:
> On Tue, Sep 30, 2025 at 04:23:44PM -0500, Mike Hilgendorf wrote:
> > I've been unable to reproduce this bug except in rare cases; this is
> > the smallest reproduction I could make.
> > 
> > Say you want to set LD_PRELOAD to run some code before main() and
> > unset LD_PRELOAD within that function so it only runs for the parent
> > process. You might do something like this:
> > 
> > 
> > ```
> > #include <stdio.h>
> > #include <stdlib.h>
> > #include <errno.h>
> > 
> > __attribute__((constructor))
> > static void preload(void) {
> >     if (unsetenv("LD_PRELOAD")) {
> >         printf("unsetenv errored: %d\n", errno);
> >     }
> >     if (getenv("LD_PRELOAD")) {
> >         printf("LD_PRELOAD was still set\n");
> >     }
> > }
> > ```
> > compiled by running:
> > 
> > musl-gcc preload.c -shared -o preload.so
> > 
> > Now you want to inject this into a binary, say bash, compiled against
> > musl libc:
> > 
> > ```
> > cd bash-5.2.37
> > export CC=musl-gcc
> > ./configure
> > make
> > 
> > env -i LD_PRELOAD=path/to/preload.so ./bash
> > ```
> > 
> > You will see that LD_PRELOAD was not unset (even in the context of the
> > ctor function), and LD_PRELOAD is set in the shell.
> > 
> > From what I can tell, LD_PRELOAD is not special; the same is true of
> > other environment variables. However, this is not reproducible with
> > simpler programs. bash 5.2 is where I first saw it happen, and so far
> > it is the only program I know of that has this issue.
> 
> Are you sure it's reproducible with other programs? I suspect what's
> happening is that bash is using the optional third argument envp to
> main, not the actual current environment, as its source to derive the
> initial environment list.
> 
> I don't see any way the issue you're describing could happen
> otherwise.
> 
> > Let me know if I'm doing something very wrong or if there is an easier
> > reproduction. From scanning bash's source, I don't think they're doing
> > anything strange (they initialize variables with the `char** envp`
> > passed to main).
> 
> Yes, so it's exactly what I suspected.
> 
> > I do notice in `dynlink.c` that the stack pointer passed to the
> > entrypoint by the loader is whatever was passed to the loader, not
> > accounting for any envp mutations made by calling the DT_INIT_ARRAY
> > functions. But I don't know if this is actually a problem or not for
> > the implementation of _start.
> 
> No application code, not even DT_INIT_ARRAY handlers, has executed at
> this point. It all runs after execution is passed to the main
> program's ELF entry point (_start).
> 
> The behavior you're seeing has nothing to do with the dynamic linker.
> It's line 95 of src/env/__libc_start_main.c passing a pointer to the
> initial environment vector rather than any potentially-updated value
> of environ. I'm not sure if this should be changed or not (there are
> probably arguments for either), but bash using the nonstandard
> third-arg to main rather than the standard global environ[] strikes me
> as a bash bug.

Slight addendum: this is a big reason *why* folks should be using the
standard interfaces like environ[] rather than historical hacks like
the third arg envp: when there is a question about weird corner-case
behavior like we have here, there's an actual specification governing
what the behavior should be and usually a clear objective answer (and
if not, a consensus-based process for deciding if there should be one)
rather than "whatever your system happens to do".
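
To make the distinction concrete, here is a minimal sketch in C. The
helper names lookup_in() and views_disagree() are made up for
illustration and are not part of musl or bash. Whether a saved vector
pointer (such as main's third argument) goes stale after the
environment is modified is implementation-specific, which is the
point; only lookups through environ have specified behavior.

```c
/* Ensures setenv()/unsetenv() are declared for callers under strict -std= modes. */
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* POSIX requires the application to declare environ itself. */
extern char **environ;

/*
 * Hypothetical helper: search any NULL-terminated "NAME=value" vector
 * (environ, a saved envp, or a hand-built array) for name.
 * Returns a pointer to the value, or NULL if name is not present.
 */
const char *lookup_in(char **vec, const char *name)
{
    assert(name != NULL);
    size_t len = strlen(name);

    if (vec == NULL)
        return NULL;
    for (; *vec != NULL; vec++) {
        /* Match "NAME=" exactly, so "LD" does not match "LD_PRELOAD=". */
        if (strncmp(*vec, name, len) == 0 && (*vec)[len] == '=')
            return *vec + len + 1;
    }
    return NULL;
}

/*
 * Hypothetical helper: report whether a previously saved vector and the
 * current environ disagree about the presence of name.
 * Returns 1 if exactly one of them contains it, 0 otherwise.
 */
int views_disagree(char **saved, const char *name)
{
    int in_saved = lookup_in(saved, name) != NULL;
    int in_environ = lookup_in(environ, name) != NULL;
    return in_saved != in_environ;
}
```

A program that wants to know whether its envp argument still matches
the live environment could call views_disagree(envp, "LD_PRELOAD")
from main; a program that only ever reads environ never has to ask.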

Rich
