Message-ID: <20251112171455.GJ1827@brightrain.aerifal.cx>
Date: Wed, 12 Nov 2025 12:14:55 -0500
From: Rich Felker <dalias@...c.org>
To: Vivian Wang <wangruikang@...as.ac.cn>
Cc: musl@...ts.openwall.com, matthewcroughan <matt@...ughan.sh>
Subject: Re: [PATCH] ldso: Use rpath of dso of caller in dlopen

On Fri, Oct 17, 2025 at 06:50:52PM +0800, Vivian Wang wrote:
> Grab the return address using an arch-specific wrapper dlopen calling a
> generic __dlopen (analogous to dlsym and __dlsym), and use it to find
> the dso to use as needed_by for load_library in __dlopen. This way, when
> a dso calls dlopen, the library is searched from *this* dso's rpath.
>
> This feature is used by shared libraries that dlopen on demand other
> shared libraries found in nonstandard paths.
>
> This makes the behavior of DT_RUNPATH match glibc better. Also, since we
> already use this behavior with libraries loaded with DT_NEEDED, adding
> support for dlopen makes it more consistent.
>
> By coincidence, both __dlsym and __dlopen take three arguments, the last
> of which is the return address. Therefore all of the arch-specific
> src/ldso/*/dlopen.s is just the corresponding dlsym.s with "dlsym"
> replaced by "dlopen".

I'm not convinced that this is a good change. With dlsym, behaving
differently based on the call point is optional nonstandard
functionality triggered by passing RTLD_NEXT, and it already has
problems. In particular, the return address does not properly
determine who the caller is; it will be wrong if there's a tail call
to dlsym. We've considered in the past making a new definition for
RTLD_NEXT that uses the address of an object in the translation unit
that uses RTLD_NEXT, which would fix this but has other subtly
different behavior (like if RTLD_NEXT isn't passed directly to dlsym
but to a wrapper for it in a different library) so it's not clear if
it would be a worthwhile improvement.

In addition to violating least-surprise and having a nonstandard
behavior always active, changing dlopen as in this patch would have
the same tail-call issue, and would only give the behavior some
callers want if dlopen is directly called. For example if you had
loaded a library with its own rpath and instead of calling dlopen, it
called some other-library-provided abstraction for loading modules
that in turn indirectly called dlopen, its rpath would not get used.
This seems confusing and undesirable.

Rich
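
The caller-identification problem described above can be illustrated
with a small model. This is only a sketch and is not musl code: the
struct, the function names, the addresses, and the rpath strings are
all hypothetical, chosen to show how mapping a return address to an
object picks a different rpath in each of the three situations
mentioned (direct call, call through another library's wrapper, and
tail call).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a loaded object: the address range of its code
 * and the rpath recorded in it. This is NOT musl's struct dso. */
struct model_dso {
    const char *name;
    uintptr_t base, end;   /* code occupies [base, end) */
    const char *rpath;
};

/* Made-up address layout for three loaded objects. */
const struct model_dso loaded[] = {
    { "app",          0x1000, 0x2000, "/opt/app/lib"    },
    { "libplugin.so", 0x2000, 0x3000, "/opt/plugin/lib" },
    { "libloader.so", 0x3000, 0x4000, "/opt/loader/lib" },
};

/* Return addresses a dlopen would observe when libplugin.so is the
 * library that wants a module loaded (values are assumptions). */
enum {
    /* libplugin.so calls dlopen itself. */
    RA_DIRECT_CALL    = 0x2040,
    /* libplugin.so calls a helper in libloader.so, which calls dlopen. */
    RA_VIA_WRAPPER    = 0x3080,
    /* A libplugin.so function tail-calls dlopen; its frame is gone, so
     * the return address points into its own caller, here app. */
    RA_AFTER_TAILCALL = 0x1100,
};

/* Map a return address to the object whose code contains it. */
const struct model_dso *dso_of_addr(uintptr_t ra)
{
    for (size_t i = 0; i < sizeof loaded / sizeof loaded[0]; i++)
        if (ra >= loaded[i].base && ra < loaded[i].end)
            return &loaded[i];
    return NULL;
}

/* The rpath a return-address-based dlopen would end up searching. */
const char *rpath_seen_by_dlopen(uintptr_t ra)
{
    const struct model_dso *d = dso_of_addr(ra);
    assert(d != NULL); /* the model only uses addresses inside loaded[] */
    return d->rpath;
}
```

In this model only rpath_seen_by_dlopen(RA_DIRECT_CALL) yields
libplugin.so's own rpath. The wrapper case yields libloader.so's rpath
and the tail-call case yields the application's, even though
libplugin.so is the library requesting the load in all three.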