Date: Fri, 5 Aug 2022 18:57:12 -0400
From: James Y Knight <jyknight@...gle.com>
To: libc-coord@...ts.openwall.com
Subject: Re: Proposing dl* extensions with explicit caller specification

I note that the ABI implementation for C++ global destructors
(`__cxa_atexit`, `__cxa_thread_atexit`, `__cxa_finalize`) also has a
dependency upon the caller's DSO identity, so that dlclose can call
the global destructors for only the library being closed. Happily,
those functions were designed to take an extra parameter, the address
of a hidden symbol `__dso_handle` that gets defined within every DSO.

Sadly, the same isn't true for dlsym/dlmopen/etc. (Potentially it
could've been done behind the scenes like `#define dlsym(h,s)
__dlsym_impl(h, s, &__dso_handle)`, or whatever). But...many decades
too late for that. :)
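
To make the contrast concrete, here is a minimal sketch of how a DSO hands its own identity to `__cxa_atexit`. The names `say_goodbye` and `register_goodbye` are made up for illustration. Normally the compiler emits this registration itself, and I'm assuming a GCC/Clang-style toolchain whose startup objects define `__dso_handle`.

```c
#include <assert.h>
#include <stdio.h>

/* Defined once per DSO by the toolchain's startup objects (assumed here:
   a GCC/Clang toolchain); hidden so it always binds locally. */
extern void *__dso_handle __attribute__((visibility("hidden")));

/* Itanium C++ ABI registration hook, exported by the C library. */
extern int __cxa_atexit(void (*destructor)(void *), void *arg, void *dso);

/* Hypothetical destructor, only for illustration. */
static void say_goodbye(void *arg) {
    printf("destructor ran for %s\n", (const char *)arg);
}

/* Registers say_goodbye, tagged with the identity of the calling DSO.
   Returns 0 on success. */
int register_goodbye(void) {
    int rc = __cxa_atexit(say_goodbye, "this DSO", &__dso_handle);
    assert(rc == 0);
    return rc;
}
```

The third argument is what lets `__cxa_finalize` run only the destructors belonging to the object being closed. No return-address inspection is involved.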

As to your proposal: it seems fairly reasonable, but adding new
entry-points does have a cost.

With that in mind: are you sure you really need this? Instead, in
shared-libasan mode, could you link a tiny static archive which
contains a hidden-visibility dlopen symbol like:
```
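/* Hidden visibility keeps this definition private to the DSO it is linked into. */
__attribute__((visibility("hidden")))
void *dlopen(const char *p, int m) {
    __asan_dlopen_pre(p, m);
    void *r = REAL(dlopen)(p, m);
    return __asan_dlopen_post(p, m, r);
}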
```
That way, you're only ever intercepting dlopen from within the same
shared object that the call was made from, avoiding the problem -- and
the requirement for these new libc functions. Could that work?

> take an explicit `dl_caller` pointer that is used in place of the return address

The documentation for the new parameter needs to be expanded. I
believe the intended semantics are that it may be set to _any_ address
within the DSO's mapped memory regions, but you should say that
explicitly. That is: passing a function return address is one
possibility, but you could also pass any other address within the
DSO's memory mappings (e.g. address of a symbol such as __dso_handle).
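
As a rough illustration of the "any address within the DSO" semantics, the sketch below maps an address to its containing object by walking PT_LOAD segments with `dl_iterate_phdr`. The helper names (`object_containing`, `match_object`, `check_addresses_agree`) are hypothetical, and this is not how any particular libc resolves the caller internally. It only shows that a code address, a data address, and `&__dso_handle` all identify the same object.

```c
#define _GNU_SOURCE /* dl_iterate_phdr is a GNU extension in glibc's <link.h> */
#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

extern void *__dso_handle __attribute__((visibility("hidden")));

struct lookup {
    uintptr_t addr;          /* address being classified */
    const ElfW(Phdr) *phdr;  /* program headers of the match, or NULL */
};

/* dl_iterate_phdr callback: stop at the object with a PT_LOAD segment
   that covers q->addr. */
static int match_object(struct dl_phdr_info *info, size_t size, void *data) {
    struct lookup *q = data;
    (void)size;
    for (size_t i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_LOAD)
            continue;
        uintptr_t start = (uintptr_t)(info->dlpi_addr + ph->p_vaddr);
        if (q->addr >= start && q->addr - start < ph->p_memsz) {
            q->phdr = info->dlpi_phdr;
            return 1; /* nonzero return ends the iteration */
        }
    }
    return 0;
}

/* Hypothetical helper: returns a token identifying the loaded object that
   contains addr, or NULL if no object does. */
const void *object_containing(const void *addr) {
    struct lookup q = { (uintptr_t)addr, NULL };
    dl_iterate_phdr(match_object, &q);
    return q.phdr;
}

static int some_data = 1;
static void some_function(void) {}

/* Code, data, and __dso_handle addresses all name the same object. */
void check_addresses_agree(void) {
    const void *by_handle = object_containing(&__dso_handle);
    assert(by_handle != NULL);
    assert(object_containing((const void *)(uintptr_t)&some_function) == by_handle);
    assert(object_containing(&some_data) == by_handle);
}
```

An address that lies in no object's mappings (a stack variable, say) makes `object_containing` return NULL. The documentation for `dl_caller` should say what happens in that case too.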

> dlsym_caller dlvsym_caller dlmopen_caller

Beyond those 3 APIs, grepping glibc for uses of RETURN_ADDRESS I see
there's also dl_iterate_phdr, _dl_mcount_wrapper,
_dl_mcount_wrapper_check, and various malloc debugging things. The
first, at least, seems relevant to this proposal -- presumably there
should be a variant of `dl_iterate_phdr` added, too.



On Thu, Aug 4, 2022 at 6:45 PM Keno Fischer <keno@...iacomputing.com> wrote:
>
> Dear libc maintainers,
>
> I'm hoping to coordinate consensus on a dlfcn API extension to
> address a common paper cut that users encounter when attempting
> to use various instrumentation tooling such as the {address,
> memory, thread} sanitizers (and others). I don't think the
> implementation is particularly difficult, but as it touches
> core dlfcn API surface, some consensus would be required among
> libc implementations to avoid making a mess.
>
> # The problem
>
> A little-known quirk of the dlsym and (on certain implementations)
> dl(m)open APIs is that their behavior depends on the calling shared
> object. This shared object is usually determined using
> __builtin_return_address, or a hand-coded equivalent (e.g. reading
> the top of the stack on x86_64 or accessing the lr register on
> aarch64).
>
> This implicit dependence on the return address (apart from feeling a
> bit like an API smell) breaks the ability to use symbol interposition
> on these functions, as the usual interposition/RTLD_NEXT pattern will
> result in the call appearing to come from a different shared object
> than the non-interposed call. This is a regular cause of end user
> complaints (see e.g. [1-7]).
>
> A common suggestion is to use LD_LIBRARY_PATH in order to work around the
> missing caller-dependent RUNPATH lookup. However, as I will survey below,
> RUNPATH is not the only caller-dependent property (so the workaround
> is incomplete) and setting LD_LIBRARY_PATH may affect lookups in other
> parts of the application (or any spawned children) in undesirable ways (so
> the workaround is potentially harmful to correct operation).
>
> A different suggestion that was previously made (e.g. in [7]) is to switch
> the interceptors to a tail call. Where possible, this does indeed
> address the issue (e.g. rr's interceptor [8] does this and doesn't suffer
> from the same problem). Unfortunately, this is not always possible. For example,
> the memory sanitizer interceptor [9] needs to introspect the loaded object in
> order to set up shadow memory for all newly added mappings.
>
> The tail call issue also brings up a related concern: Compiler optimizations
> do not model the return-address dependence of these functions and will thus
> happily move them into tail call position when possible, raising the possibility
> that a compiler upgrade will cause dynamic linker behavior to change.
>
> # A brief survey of current caller-dependence in libcs
>
> How the return address is used is not consistent between different libcs.
> Perhaps the most consistent use of the return address is in RTLD_NEXT.
> POSIX specifies that:
>
> ```
> RTLD_NEXT
>    Specifies the next executable object file after this one that defines name.
>    This one refers to the executable object file containing the invocation of dlsym().
> ```
>
> Because of the above-mentioned tail-call issue, arguably the
> implementation using __builtin_return_address is not POSIX compliant,
> because the return address may not necessarily be the `object containing
> the invocation of dlsym`. Nevertheless, this is a minor issue and not generally
> what users run into.
>
> The more common situation of return-address dependence is in `dlopen`. POSIX
> makes no mention of return-address dependence in dlopen, so implementations
> differ somewhat in their use of the return address in dlopen context.
>
> For implementations that provide the `dlmopen` extension (e.g. Solaris/Illumos,
> glibc), the return address is generally used by `dlopen` to identify
> the calling object's namespace.
>
> Implementations without this extension that I surveyed (e.g. musl libc, FreeBSD
> libc), generally do not have caller dependence in dlopen (if there is one,
> I would love to know about it so I can add it to the list).
>
> For implementations that do look at the calling object inside dlopen, it is
> generally used for a few other purposes also, including RUNPATH/RPATH handling,
> lookup of certain flags, determination whether the calling object is an audit
> object, etc. The RUNPATH/RPATH handling is usually the one that users complain
> about, but of course the remaining uses could also introduce hard-to-diagnose
> issues. Implementations that do not look at the caller in dlopen generally
> use the main executable for all of these queries.
>
> Illumos also appears to have caller-dependence in `dlclose`, `dlerror` and
> `dlinfo`. I assume this is because lookup of this information is per-namespace,
> but I did not look into it too closely.
>
> # Proposed API
>
> The proposal here (previously made independently by other people in various
> forums) is to add new variants of the caller-dependent dlfcn functions
> that take an explicit `dl_caller` pointer that is used in place of the return
> address, e.g. for dlsym:
>
> ```
> #include <dlfcn.h>
>
> void *dlsym_caller(void *restrict handle, const char *restrict symbol, void *restrict dl_caller);
> ```
>
> Naturally there would be a `dlvsym_caller` for libcs that provide the `dlvsym`
> extension (and analogously for e.g. `dlfunc` on FreeBSD).
>
> For `dlopen`, since not all implementations have caller dependence, my proposal
> would be to not have `dlopen_caller`, but instead only provide `dlmopen_caller`
> (since caller-dependence seems to be pretty closely tied to the dlmopen
> extension):
>
> ```
> #include <dlfcn.h>
>
> void *dlmopen_caller(Lmid_t lmid, const char *restrict filename, int flags, void *restrict dl_caller);
> ```
>
> In order to ensure that the dlopen behavior can be emulated with this
> function, I would propose promoting `LM_ID_CALLER` to an exported flag (glibc
> already has an internal version of this):
> ```
> LM_ID_CALLER
> Load the shared object in the namespace of the calling object (determined
> implicitly by `dlmopen` or explicitly from the `dl_caller` argument to
> `dlmopen_caller`).
> ```
>
> # Next steps
>
> I'm hoping this overview was useful as a discussion of the problem I'm
> hoping to address and the current state of implementation. I'm not wedded
> to the specifics of the proposal, so suggestions for different names or
> semantics would be appreciated. I am particularly interested to know if
> there are additional complications in one implementation or another that
> I failed to pick up on in my survey above.
>
> Otherwise, assuming that people generally like this proposal, I would hope
> to be able to implement this in short order. I think in most implementations,
> this is simply a matter of adding the appropriate symbols as the functionality
> already exists. I recognize that it will probably take 10 years before this
> has propagated enough to be widely available to end users, but on the other
> hand, people have been complaining about this for the better part of 10 years,
> so if we'd fixed it at the time, we'd already be done - better late than never
> ;).
>
> Cheers,
> Keno
>
>
> [1] https://bugs.llvm.org/show_bug.cgi?id=27790
> [2] https://sourceware.org/bugzilla/show_bug.cgi?id=27504
> [3] https://sourceware.org/bugzilla/show_bug.cgi?id=25114
> [4] https://sourceware.org/bugzilla/show_bug.cgi?id=28008
> [5] https://sourceware.org/bugzilla/show_bug.cgi?id=28927
> [6] https://github.com/google/sanitizers/issues/1219
> [7] https://bugzilla.redhat.com/show_bug.cgi?id=1449604
> [8] https://github.com/rr-debugger/rr/blob/master/src/preload/overrides.c#L136-L143
> [9] https://github.com/llvm/llvm-project/blob/8e7acb670b3830a2c72ed2a47b93f88be971eed2/compiler-rt/lib/msan/msan_interceptors.cpp#L1332-L1337
