Date: Sat, 10 Aug 2019 19:27:49 +0200
From: Szabolcs Nagy <nsz@...t70.net>
To: musl@...ts.openwall.com
Cc: Luiz Angelo Daros de Luca <luizluca@...il.com>
Subject: Re: dlsym returning unresolved symbol address instead of
 dependency library symbol address

* Rich Felker <dalias@...c.org> [2019-08-10 12:42:52 -0400]:
> On Sat, Aug 10, 2019 at 12:11:11PM +0200, Szabolcs Nagy wrote:
> > * Luiz Angelo Daros de Luca <luizluca@...il.com> [2019-08-10 05:16:19 -0300]:
> > > I'm ruby maintainer in OpenWrt 18.06 (musl 1.1.19). I got a bug report (
> > > https://github.com/openwrt/packages/issues/9297) related to musl in mipsel
> > > 32bit.
> > > 
> > > When ruby loads a module (.so), it checks if that module was built for the
> > > same ruby that is loading it. Ruby loads libruby at startup, which exports
> > > ruby_xmalloc sym. So, the check consists of loading the module, searching
> > > for ruby_xmalloc in the module context and comparing with global
> > > ruby_xmalloc address. If they do not match, the module is using a different
> > > libruby. Something like this:
> > > 
> > > handle = (void*)dlopen(file, RTLD_LAZY|RTLD_GLOBAL);
> > > void *ex = dlsym(handle, EXTERNAL_PREFIX"ruby_xmalloc");
> > > if (ex && ex != ruby_xmalloc) {
> > >    // module is incompatible!
> > > }
> > > 
> > > The first time a module is loaded, it simply works as expected.
> > > I debugged and musl is working nicely. At do_dlsym(struct dso *p, const
> > > char *s, void *ra), it correctly fails to find the symbol with:
> > > 
> > > sym = sysv_lookup(s, h, p)
> > > 
> > > and correctly finds it with:
> > > 
> > > sysv_lookup(s, h, p->deps[0])
> > > 
> > > Now, when the second module is loaded, it finds "ruby_xmalloc" already with:
> > > 
> > > sym = sysv_lookup(s, h, p)
> > > 
> > > However, sym now points to the address of the undefined symbol in the
> > > second library (sym->st_shndx is 0) instead of searching for it in
> > > dependencies. It seems that do_dlsym() only checks for undefined symbol
> > > (sym->st_shndx==0) when DL_FDPIC is 1 and DL_FDPIC is 0 in my case.
> > > 
> > > Does it make any sense to return an undefined symbol from dlsym()?
> > > Or does it make sense to return an undefined symbol from sysv_lookup()?
> > > Or is there any other arch specific issue that happened before, when
> > > library was loaded?
> > 
> > yes, if the search involves the main executable then
> > st_shndx==0 && st_value!=0 symbols must be included
> > because it's a plt in the exe and that's how function
> > addresses work.. on most targets except mips.
> > 
> > undef syms have st_value==0 in shared libs, maybe
> > not in mips? can you post the readelf -aW output of
> > the module that has st_shndx==0 && st_value!=0 entry
> > in its dynamic symbol table
> > 
> > i think this was going to be fixed by
> > https://www.openwall.com/lists/musl/2017/02/16/1/2
> > but that was never applied.
> 
> I brought it up a few times after that, asking what should be done
> since it no longer cleanly applies. The concept of that patch is
> probably still right but a localized fix now followed by deduplication
> later is probably preferable.
> 
> Do you know if the TLS and STB_LOCAL issues described there still
> exist too?

i think so

(there is no st_shndx check for STT_TLS and no OK_BIND check)
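
As a rough illustration of those two missing checks, here is a minimal
sketch. It is not the musl source: struct sk_sym, the SK_ names and
sk_passes_missing_checks() are hypothetical stand-ins, and the accepted
binding set (global, weak, GNU unique) is an assumption about what an
OK_BINDS-style mask contains. Only the numeric ELF binding and type
values are standard.

```c
#include <assert.h>

/* Hypothetical stand-in for the two Elf symbol fields involved. */
struct sk_sym {
	unsigned char st_info;   /* binding in the high nibble, type in the low */
	unsigned short st_shndx; /* 0 means the symbol is undefined */
};

/* Standard ELF binding and type values, renamed to avoid clashing with <elf.h>. */
enum { SK_STB_LOCAL = 0, SK_STB_GLOBAL = 1, SK_STB_WEAK = 2, SK_STB_GNU_UNIQUE = 10 };
enum { SK_STT_FUNC = 2, SK_STT_TLS = 6 };

/* Assumed set of bindings a lookup should accept. */
#define SK_OK_BINDS ((1u << SK_STB_GLOBAL) | (1u << SK_STB_WEAK) | (1u << SK_STB_GNU_UNIQUE))

/* Returns 1 if the symbol passes both checks, 0 if either one rejects it. */
int sk_passes_missing_checks(const struct sk_sym *s)
{
	unsigned type = s->st_info & 0xf;
	unsigned bind = s->st_info >> 4;

	if (!s->st_shndx && type == SK_STT_TLS)
		return 0; /* undefined STT_TLS reference, not a definition */
	if (!((1u << bind) & SK_OK_BINDS))
		return 0; /* e.g. STB_LOCAL must not be visible to a lookup */
	return 1;
}

/* Small self-check covering one rejected case per rule and one accepted case. */
void sk_selfcheck(void)
{
	struct sk_sym undef_tls  = { (SK_STB_GLOBAL << 4) | SK_STT_TLS,  0 };
	struct sk_sym local_func = { (SK_STB_LOCAL  << 4) | SK_STT_FUNC, 1 };
	struct sk_sym glob_func  = { (SK_STB_GLOBAL << 4) | SK_STT_FUNC, 1 };

	assert(!sk_passes_missing_checks(&undef_tls));
	assert(!sk_passes_missing_checks(&local_func));
	assert(sk_passes_missing_checks(&glob_func));
}
```

The sketch deliberately says nothing about undefined non-TLS symbols;
that case is the subject of the rest of this thread.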

> 
> > > I created a simple patch that skips a symbol if it is undefined.
> > > https://raw.githubusercontent.com/luizluca/openwrt/b9674d528513c7c93205fa000fed7c0d3c6bb2e7/toolchain/musl/patches/020-dlsym_donot_return_address_from_undef_sym.patch
> 
> This patch is wrong (on non-MIPS and on MIPS with PLT); it will result
> in wrong values for dlsym of a
> 
> > i think the find_sym logic should be copied
> > because mips behaves differently from other targets:
> > 
> > http://git.musl-libc.org/cgit/musl/commit/?id=2d8cc92a7cb4a3256ed07d86843388ffd8a882b1
> 
> Yes. Conceptually, compared to find_sym, need_def is always false for
> dlsym (dlsym must return PLT thunk and copy relocation definitions),
> and STT_TLS was already checked as a special case above to lookup the
> thread-local copy of the object, so the only additional check needed
> here is !ARCH_SYM_REJECT_UND(sym). Does that sound correct to you?

i think the right check is

 sym->st_shndx || !ARCH_SYM_REJECT_UND(sym)

so the mips plt bit is only checked if st_shndx==0
otherwise data symbols may be mishandled.
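
A minimal sketch of how that condition behaves, under stated
assumptions: struct sk2_sym and sk2_dlsym_accepts() are hypothetical
names, and the mips definition of the macro shown here (reject an
undefined symbol unless STO_MIPS_PLT is set in st_other) is an
assumption about arch/mips, not a quote from it. On targets that do not
define the macro it would presumably evaluate to 0, so every symbol
would pass.

```c
#include <assert.h>

/* Hypothetical stand-in for the Elf symbol fields the check reads. */
struct sk2_sym {
	unsigned char st_other;  /* carries the mips PLT flag */
	unsigned short st_shndx; /* 0 means undefined in this object */
};

#define SK2_STO_MIPS_PLT 0x8

/* Assumed mips behaviour: reject an undefined symbol that has no PLT bit. */
#define SK2_ARCH_SYM_REJECT_UND(s) (!((s)->st_other & SK2_STO_MIPS_PLT))

/* The proposed condition: the PLT bit only matters when st_shndx == 0. */
int sk2_dlsym_accepts(const struct sk2_sym *s)
{
	return s->st_shndx || !SK2_ARCH_SYM_REJECT_UND(s);
}

/* Self-check for the three cases discussed in the thread. */
void sk2_selfcheck(void)
{
	struct sk2_sym defined_data = { 0, 5 };                /* defined, no PLT bit */
	struct sk2_sym undef_ref    = { 0, 0 };                /* plain undefined reference */
	struct sk2_sym undef_plt    = { SK2_STO_MIPS_PLT, 0 }; /* undefined, PLT stub */

	assert(sk2_dlsym_accepts(&defined_data)); /* bare macro test would reject this */
	assert(!sk2_dlsym_accepts(&undef_ref));   /* lookup moves on to the deps */
	assert(sk2_dlsym_accepts(&undef_plt));    /* PLT address is a valid result */
}
```

The first case is the reason for the st_shndx term: a defined data
symbol has no PLT bit, so testing the macro alone would skip it.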
