Message-ID: <20251112171455.GJ1827@brightrain.aerifal.cx>
Date: Wed, 12 Nov 2025 12:14:55 -0500
From: Rich Felker <dalias@...c.org>
To: Vivian Wang <wangruikang@...as.ac.cn>
Cc: musl@...ts.openwall.com, matthewcroughan <matt@...ughan.sh>
Subject: Re: [PATCH] ldso: Use rpath of dso of caller in dlopen

On Fri, Oct 17, 2025 at 06:50:52PM +0800, Vivian Wang wrote:
> Grab the return address using an arch-specific wrapper dlopen calling a
> generic __dlopen (analogous to dlsym and __dlsym), and use it to find
> the dso to use as needed_by for load_library in __dlopen. This way, when
> a dso calls dlopen, the library is searched from *this* dso's rpath.
> 
> This feature is used by shared libraries that dlopen on demand other
> shared libraries found in nonstandard paths.
> 
> This makes the behavior of DT_RUNPATH match glibc better. Also, since we
> already use this behavior with libraries loaded with DT_NEEDED, adding
> support for dlopen makes it more consistent.
> 
> By coincidence, both __dlsym and __dlopen take three arguments, the last
> of which is the return address. Therefore all of the arch-specific
> src/ldso/*/dlopen.s is just the corresponding dlsym.s with "dlsym"
> replaced by "dlopen".

I'm not convinced that this is a good change. With dlsym, behaving
differently based on the call point is optional nonstandard
functionality triggered by passing RTLD_NEXT, and it already has
problems. In particular, the return address does not properly
determine who the caller is; it will be wrong if there's a tail call
to dlsym. We've considered in the past making a new definition for
RTLD_NEXT that uses the address of an object in the translation unit
that uses RTLD_NEXT, which would fix this but has other subtly
different behavior (like if RTLD_NEXT isn't passed directly to dlsym
but to a wrapper for it in a different library) so it's not clear if
it would be a worthwhile improvement.
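A minimal sketch of the tail-call problem, not code from musl or from
the patch. It assumes GCC or Clang, since __builtin_return_address and
the noinline attribute are compiler extensions, and every toy_* name is
hypothetical. toy_return_address stands in for a dlsym-like entry point
that inspects its return address.

```c
#include <stddef.h>

/* Hypothetical stand-in for an entry point that identifies its caller
 * by return address (GCC/Clang builtin). */
__attribute__((noinline))
void *toy_return_address(void)
{
    return __builtin_return_address(0);
}

static void *volatile toy_sink;

/* The store after the call keeps the call out of tail position, so the
 * return address points back into this function. */
__attribute__((noinline))
void *toy_direct_caller(void)
{
    void *ra = toy_return_address();
    toy_sink = ra;
    return ra;
}

/* The call is in tail position. An optimizing compiler may emit a jump
 * instead of a call, in which case the return address seen by
 * toy_return_address belongs to whoever called toy_tail_caller. */
__attribute__((noinline))
void *toy_tail_caller(void)
{
    return toy_return_address();
}
```

Whether toy_tail_caller really becomes a jump depends on the compiler,
target, and optimization level, so the result is not something the
callee can rely on either way.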

In addition to violating least-surprise and having a nonstandard
behavior always active, changing dlopen as in this patch would have
the same tail-call issue, and would only give the behavior some
callers want if dlopen is called directly. For example, if you had
loaded a library with its own rpath and, instead of calling dlopen
itself, it called a module-loading abstraction provided by some other
library that in turn called dlopen, its rpath would not get used.
This seems confusing and undesirable.
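The indirect-call case can be sketched with a toy model. This is an
illustration under assumed names and addresses, not musl's loader:
toy_dso, toy_addr2dso, toy_rpath_for_dlopen, the address ranges, and
the paths are all hypothetical. It only models "pick needed_by from the
return address, then use that library's rpath".

```c
#include <stddef.h>
#include <stdint.h>

/* Toy record of a loaded library: mapped range plus its rpath. */
struct toy_dso {
    const char *name;
    uintptr_t base;
    size_t len;
    const char *rpath;  /* NULL if the library has none */
};

/* Hypothetical layout: a plugin with its own rpath, and a separate
 * module-loading helper library without one. */
static const struct toy_dso toy_dsos[] = {
    { "libplugin.so",    0x4000, 0x1000, "/opt/plugin/lib" },
    { "libmodloader.so", 0x8000, 0x1000, NULL },
};

/* Find the library whose mapped range contains addr. */
static const struct toy_dso *toy_addr2dso(uintptr_t addr)
{
    for (size_t i = 0; i < sizeof toy_dsos / sizeof toy_dsos[0]; i++)
        if (addr - toy_dsos[i].base < toy_dsos[i].len)
            return &toy_dsos[i];
    return NULL;
}

/* Model of the proposed behavior: the rpath searched is that of the
 * library containing the return address of the dlopen call. */
const char *toy_rpath_for_dlopen(uintptr_t return_addr)
{
    const struct toy_dso *needed_by = toy_addr2dso(return_addr);
    return needed_by ? needed_by->rpath : NULL;
}
```

In this model, a call whose return address is 0x4010 (inside
libplugin.so) searches /opt/plugin/lib. If libplugin.so instead goes
through libmodloader.so, the return address seen is something like
0x8010 and the plugin's rpath is never consulted.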


Rich
