Message-ID: <20251001120616.GE1827@brightrain.aerifal.cx>
Date: Wed, 1 Oct 2025 08:06:16 -0400
From: Rich Felker <dalias@...c.org>
To: Mike Hilgendorf <mike@...gram.dev>
Cc: musl@...ts.openwall.com
Subject: Re: `unsetenv()` does not always work when run in an
 `__attribute__((constructor))` function

On Tue, Sep 30, 2025 at 04:23:44PM -0500, Mike Hilgendorf wrote:
> I've failed to reproduce this bug except in rare cases; this is the
> smallest I could make it.
> 
> Say you want to set LD_PRELOAD to run some code before main() and
> unset LD_PRELOAD within that function so it only runs for the parent
> process. You might do something like this:
> 
> 
> ```
> #include <stdio.h>
> #include <stdlib.h>
> #include <errno.h>
> 
> __attribute__((constructor))
> static void preload(void) {
>     if (unsetenv("LD_PRELOAD")) {
>         printf("unsetenv errored: %d\n", errno);
>     }
>     if (getenv("LD_PRELOAD")) {
>         printf("LD_PRELOAD was still set\n");
>     }
> }
> ```
> compiled by running:
> 
> musl-gcc preload.c -shared -o preload.so
> 
> Now you want to inject this into a binary, say bash, compiled against musl libc:
> 
> ```
> cd bash-5.2.37
> export CC=musl-gcc
> ./configure
> make
> 
> env -i LD_PRELOAD=path/to/preload.so ./bash
> ```
> 
> You will see that LD_PRELOAD was not unset (even in the context of the
> ctor function), and LD_PRELOAD is set in the shell.
> 
> From what I can tell, LD_PRELOAD is not special; this is true of other
> environment variables. However, this is not reproducible with simpler
> programs - bash 5.2 is the one where I saw this happen first, and so
> far the only program I know that has this issue.

Are you sure it's reproducible with other programs? I suspect what's
happening is that bash is using the optional third argument envp to
main, not the actual current environment, as its source to derive the
initial environment list.

I don't see any way the issue you're describing could happen
otherwise.
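
To illustrate the distinction, here is a minimal sketch (not taken from
bash or musl). The names lookup_in, snapshot_misses_new_var and the
ENVP_DEMO_* variables are made up for the example. It assumes only that
a saved pointer to the environment vector, like main's envp, is a
snapshot, while getenv() follows the current value of environ. Adding a
variable is used as the trigger because it forces the library to switch
to a new vector; whether other changes edit the old vector in place or
replace it depends on the implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

extern char **environ;

/* Search a NAME=value vector directly, as code walking envp would. */
const char *lookup_in(char **vec, const char *name)
{
    size_t n;

    assert(name && *name && !strchr(name, '='));
    n = strlen(name);
    for (; vec && *vec; vec++)
        if (!strncmp(*vec, name, n) && (*vec)[n] == '=')
            return *vec + n + 1;
    return NULL;
}

/* Returns 1 if a pointer saved before setenv() misses a variable that
 * getenv() can see afterwards, 0 if both see it, -1 on error. */
int snapshot_misses_new_var(void)
{
    /* A small starting environment owned by this example (an assumption
     * made so the result does not depend on the inherited environment). */
    static char first[] = "ENVP_DEMO_OLD=1";
    static char *initial[] = { first, NULL };
    char **saved = environ;
    char **snapshot = initial;      /* plays the role of main's envp */
    int result = -1;

    environ = initial;
    /* The fixed-size array has no room for another entry, so adding a
     * variable makes the library point environ at a new vector. */
    if (setenv("ENVP_DEMO_VAR", "1", 1) == 0) {
        int live = getenv("ENVP_DEMO_VAR") != NULL;
        int stale = lookup_in(snapshot, "ENVP_DEMO_VAR") == NULL;
        result = live && stale;
    }
    environ = saved;                /* restore the real environment */
    return result;
}
```

Calling snapshot_misses_new_var() from any main should return 1: the
saved pointer still describes the old list, while getenv() sees the new
one. A program that walks envp is in the position of lookup_in(snapshot,
...) here.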

> Let me know if I'm doing something very wrong or if there is an easier
> reproduction. From scanning bash's source, I don't think they're doing
> anything strange (they initialize variables with the `char** envp`
> passed to main).

Yes, so it's exactly what I suspected.

> I do notice in `dynlink.c` that the stack pointer passed to the
> entrypoint by the loader is whatever was passed to the loader, not
> accounting for any envp mutations made by calling the DT_INIT_ARRAY
> functions. But I don't know if this is actually a problem or not for
> the implementation of _start.

No application code, not even DT_INIT_ARRAY handlers, has executed at
this point. It all runs after execution is passed to the main
program's ELF entry point (_start).

The behavior you're seeing has nothing to do with the dynamic linker.
It's line 95 of src/env/__libc_start_main.c passing a pointer to the
initial environment vector rather than any potentially-updated value
of environ. I'm not sure if this should be changed or not (there are
probably arguments for either), but bash using the nonstandard third
argument to main rather than the standard global environ[] strikes me
as a bash bug.
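
For reference, the ordering described above can be modeled roughly as
follows. This is a simplified sketch for illustration, not musl's actual
source: model_start, model_environ, model_ctor and model_main are
hypothetical stand-ins for the startup code, the global environ, a
constructor, and the program's main.

```c
#include <assert.h>
#include <stddef.h>

char **model_environ;   /* stand-in for the global environ */

typedef int (*main_fn)(int, char **, char **);
typedef void (*ctor_fn)(void);

/* Simplified startup: envp is derived from the initial layout (argv,
 * a NULL terminator, then the environment pointers) before any
 * constructor runs, and that same pointer is what main receives. */
int model_start(main_fn main_like, ctor_fn ctor, int argc, char **argv)
{
    char **envp = argv + argc + 1;

    assert(argv[argc] == NULL);
    model_environ = envp;
    if (ctor)
        ctor();                          /* may repoint model_environ */
    return main_like(argc, argv, envp);  /* envp, not model_environ */
}

/* A constructor that replaces the environment with an empty one. */
static char *empty_env[] = { NULL };

void model_ctor(void)
{
    model_environ = empty_env;
}

/* Reports which views still hold a variable:
 * bit 0 = envp is non-empty, bit 1 = model_environ is non-empty. */
int model_main(int argc, char **argv, char **envp)
{
    (void)argc;
    (void)argv;
    return (envp[0] != NULL) | ((model_environ[0] != NULL) << 1);
}
```

With a block laid out as { "prog", NULL, "LD_PRELOAD=x", NULL },
model_start(model_main, model_ctor, 1, block) returns 1: the constructor
emptied the global view, but the envp handed to main still shows the
variable. A main that reads the global instead is not affected.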

Rich
