Date: Fri, 24 Jun 2016 13:03:47 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: musl ldd: swt build: Error relocating / symbol not found

On Fri, Jun 24, 2016 at 04:23:55PM +0000, Andrei Pozolotin wrote:
> Szabolcs:
>
> On 06/23/2016 11:15 PM, Szabolcs Nagy wrote:
> > * Andrei Pozolotin <andrei.pozolotin@...il.com> [2016-06-23 19:42:44 +0000]:
> >> b) while at the same time musl ldd reporting that library dependency
> >> tree is resolved with no error:
> >>
> >> lddtree /usr/lib/libswt-atk-gtk-4530.so
> > that's not musl's ldd, but scanelf from pax-utils
> thank you for pointing out.
> > when debugging such a complicated setup the output
> > of tools that may use subtly different library paths
> > and symbol resolution logic is not very helpful.
> ok, got it.
> > ldd /usr/lib/libswt-gtk-4530.so
> ldd /usr/lib/libswt-gtk-4530.so
>     ldd (0x55e333e6c000)
>     libc.musl-x86_64.so.1 => ldd (0x55e333e6c000)
> > ldd /usr/lib/libswt-atk-gtk-4530.so
> ldd /usr/lib/libswt-atk-gtk-4530.so
>     ldd (0x55edc6edc000)
>     libatk-1.0.so.0 => /usr/lib/libatk-1.0.so.0 (0x7fc763298000)
>     libc.musl-x86_64.so.1 => ldd (0x55edc6edc000)
>     libgobject-2.0.so.0 => /usr/lib/libgobject-2.0.so.0 (0x7fc763058000)
>     libglib-2.0.so.0 => /usr/lib/libglib-2.0.so.0 (0x7fc762d6d000)
>     libintl.so.8 => /usr/lib/libintl.so.8 (0x7fc762b5f000)
>     libffi.so.6 => /usr/lib/libffi.so.6 (0x7fc762957000)
>     libpcre.so.1 => /usr/lib/libpcre.so.1 (0x7fc7626fe000)
> > would be more interesting..
> >
> > but even then we don't know what's going on
> > (if libswt-gtk-4530.so is dlopened with RTLD_LOCAL
> > then its libgobject dependency might not be visible
> > to libswt-atk-gtk-4530)
> OK. here is the story:
>
> * java native interface: NativeLibrary.load()
> http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/74e5fc94c77b/src/share/classes/java/lang/ClassLoader.java#l1726
>
> * java JNI implementation:
> Java_java_lang_ClassLoader_00024NativeLibrary_load
> http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/74e5fc94c77b/src/share/native/java/lang/ClassLoader.c#l369
>
> * libjvm.so entry point: os::dll_load
> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/tip/src/share/vm/prims/jvm.cpp#l3959
>
> * libjvm.so linux implementation
> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/4529ee76d3f9/src/share/vm/runtime/os.hpp#l564
> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/4529ee76d3f9/src/os/linux/vm/os_linux.cpp#l1773
> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/4529ee76d3f9/src/os/linux/vm/os_linux.cpp#l1767
> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/4529ee76d3f9/src/os/linux/vm/os_linux.cpp#l1997
> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/4529ee76d3f9/src/os/linux/vm/os_linux.cpp#l1988
>
> * and finally: it says: dlopen RTLD_LAZY:
> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/4529ee76d3f9/src/os/linux/vm/os_linux.cpp#l1988
> void * result = ::dlopen(filename, RTLD_LAZY);
>
> http://linux.die.net/man/3/dlopen
> RTLD_LAZY: Perform lazy binding. Only resolve symbols as the code that
> references them is executed.
> If the symbol is never referenced, then it is never resolved.
> (Lazy binding is only performed for function references;
> references to variables are always immediately bound when the library is
> loaded.)
>
> RTLD_LAZY is good, right? :-)

OK, this is likely the root of the problem: invalid code assuming that
it can load libraries with undefined symbols as long as it doesn't try
to use those code paths.

The man page you linked to is rather poor-quality. When symbol binding
takes place with RTLD_LAZY is actually implementation-defined and can
be anywhere between the time of dlopen and the time of use. The flag
should be treated only as a hint for allowing performance
optimizations, not as something that gives the caller permission to do
erroneous things.

Aside from formal correctness, there are multiple reasons for this.
It's architecture- and linktime-option-dependent whether late binding
is even possible at all, and musl purposefully does not implement lazy
binding because it's a huge surface for bugs (which you can see by
looking at glibc's history of bugs caused by lazy binding).

There's one other well-known piece of software, x.org, abusing
RTLD_LAZY in the same way, and we have discussed possible workarounds
before. It would be possible to accept relocations with undefined
symbol references at dlopen time by storing a list of them, and rather
than lazily processing them at call time, re-process them after each
additional dlopen. This would allow broken programs to work without
introducing the bug surface that actual lazy-binding introduces.
However it's a fairly big task to add, and it would be much nicer just
to get the buggy programs fixed (there are already reasonable
workarounds for x.org).

Rich
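
For illustration only (this is not from the original message): a minimal
sketch of a caller that does not depend on when RTLD_LAZY binds symbols.
It asks for RTLD_NOW, so a library with undefined symbols is rejected at
dlopen time, and the reason is read back with dlerror. The helper name
load_plugin_now is hypothetical, and the choice of RTLD_LOCAL is an
assumption; only dlopen, dlerror and the RTLD_* flags are standard.

```c
#include <dlfcn.h>
#include <stdio.h>

/*
 * Hypothetical helper: load a shared library with every symbol
 * reference resolved at dlopen time (RTLD_NOW), instead of assuming
 * that RTLD_LAZY defers resolution until first use.
 *
 * Returns the library handle, or NULL if loading or relocation failed.
 */
void *load_plugin_now(const char *path)
{
    void *handle;

    dlerror(); /* clear any stale error state */

    handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        /* dlerror() describes the failure, e.g. a missing symbol. */
        const char *msg = dlerror();
        fprintf(stderr, "dlopen(%s) failed: %s\n",
                path, msg != NULL ? msg : "unknown error");
    }
    return handle;
}
```

A caller would check the return value and treat NULL as "this library
cannot be used here", rather than loading it anyway and hoping the
unresolved code paths are never reached. On older glibc versions the
program may need to be linked with -ldl.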