Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Fri, 12 Oct 2012 11:43:54 -0400
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: PATCH: dl_iterate_phdr()

On Fri, Oct 12, 2012 at 08:28:59AM -0500, Alex Caudill wrote:
> Thanks for the review! To clarify, I meant that the patch is public
> domain. This version implements most of your changes and attempts
> support for dlopen()'d libraries and vdso.
> 
> Some questions:
> 
> 1.) I guarded link.h with _GNU_SOURCE or _BSD_SOURCE; I'm unclear on
> whether this is necessary since it isn't a standard header, so please
> let me know your preference.

Not needed. I'd rather avoid excessive feature test macro checks like
this. The only time it might be desirable is when the nonstandard
header has a strong traditional precedent and the GNU version pollutes
the namespace with a lot of junk that might break apps. Then, the GNU
part should be protected.

Also, FYI, any header that checks feature test macros needs to be
including <features.h> to setup the default features if none are
explicitly defined.

> 2.) I'm unsure of how to replace ElfW(Phdr) in dl_phdr_info - or did
> you mean that musl should internally use some opaque type for
> dl_phdr_info? Also, should ElfW() go in elf.h?

Indeed, I think it's best to just keep using ElfW like you originally
did, then.

The issue with the types in elf.h is that they are designed for
writing a program that processes arbitrary (possibly foreign arch) ELF
files, not for writing native code, and thereby they make it so that
you have to use ugly macros to get the right native types. Most of
musl's dynamic linker just uses size_t for the address/word-size
types, and uint32_t and uint16_t for the unconditionally 32/16-bit
types, rather than the complex ELF type mess. However, since link.h is
a public interface, it's probably best to just follow the ugly
precedent there. After all, if somebody takes &foo->dlpi_addr, they
expect it to have the right type, and this could break if the ELF type
is unsigned long while the uintptr_t type is unsigned int (even if
they have the same size/representation).

Sorry for wasting your time with this issue.

> 3.) The attached dltest.c reveals a problem with dlopen()'d libraries
> which seems to be related to the locking strategy (see output below).
> If I don't take the lock and check for current == saved_tail, it
> "fixes" the example. At the least, I think this reveals a flaw with
> dlopen("foo", RTLD_NOW) - shouldn't it hold the lock until the dso
> list has been updated?
> 
> This function is used by libunwind and (I think) libgcc_eh for C++
> exception support, and it's possible that additional fields in
> dl_phdr_info will be necessary in order for those to work unmodified
> with musl. I'll look into this today and come up with more tests.
> Solaris and FreeBSD, at least, have these appended to struct
> dl_phdr_info:
> 
> unsigned long long int dlpi_adds;       /* total # of loads */
> unsigned long long int dlpi_subs;       /* total # of unloads */
> 
> The dl_iterate_phdr() callback is passed a size arg, and the callback
> is responsible for checking the size to ensure that the additional
> fields are present. Don't shoot the messenger! :)

That's no problem; they look easy to add. Also, the last two fields
for TLS stuff would be easy to add; they're already in struct dso (as
tls_image and tls_id).

> @@ -57,6 +58,8 @@ struct dso {
>  	size_t *dynv;
>  	struct dso *next, *prev;
>  
> +        Phdr *phdr;
> +        uint16_t phnum;

It looks like your patch is mixing tabs and spaces. Please use tabs
for indention, and make indention consistent with the rest of musl.

>  	if (aux[AT_PHDR]) {
>  		size_t interp_off = 0;
>  		size_t tls_image = 0;
>  		/* Find load address of the main program, via AT_PHDR vs PT_PHDR. */
> -		phdr = (void *)aux[AT_PHDR];
> -		for (i=aux[AT_PHNUM]; i; i--, phdr=(void *)((char *)phdr + aux[AT_PHENT])) {
> +		app->phdr = phdr = (void *)aux[AT_PHDR];
> +                app->phnum = aux[AT_PHNUM];
> +		for (i=app->phnum; i; i--, phdr=(void *)((char *)phdr + aux[AT_PHENT])) {

I would avoid changing the for look here. Hopefully the compiler
generates the same code either way, but in principle it's
easier/faster to read from aux[AT_PHNUM] (a fixed offset from the
stack pointer) than through indirection via app->phnum. And avoiding
the extra change makes it clear that the patch is not changing much,
just saving new values.

> +int dl_iterate_phdr(int(*callback)(struct dl_phdr_info *info, size_t size, void *data), void *data)
> +{
> +    struct dso *current, *saved_tail;
> +    struct dl_phdr_info info;
> +    int ret = 0;
> +
> +    pthread_rwlock_rdlock(&lock);
> +    saved_tail = tail;
> +    pthread_rwlock_unlock(&lock);
> +
> +    for(current = head; ; current = current->next) {
> +        info.dlpi_addr  = (uintptr_t)current->base;
> +        info.dlpi_name  = current->name;
> +        info.dlpi_phdr  = current->phdr;
> +        info.dlpi_phnum = current->phnum;
> +
> +        ret = (callback)(&info, sizeof (info), data);
> +        if (ret != 0) break;
> +
> +        if (current == saved_tail) break;
> +    }
> +    return ret;
> +}
>  #else
>  void *dlopen(const char *file, int mode)

After the #else, there are no-op/dummy versions of all the dl
functions for static linked programs. I think it should be possible to
do this for dl_iterate_phdr too -- just use __libc.auxv to get the
AT_PHDR and AT_PHNUM entries, and there's only a single DSO, the main
program.

Actually with vsyscall, there's also code running from the vdso, but
exceptions cannot occur in that code so only the debugger needs to be
aware of its existence. If you want, you could report it with
__libc.auxv using AT_SYSINFO_EHDR to get the phdr pointer.

> #ifdef _LP64

This should still be something like #if UINTPTR_MAX > 0xffffffff

> struct dl_phdr_info {
>     uintptr_t dlpi_addr;
>     const char *dlpi_name;
>     const ElfW(Phdr) *dlpi_phdr;
>     uint16_t dlpi_phnum;
> };

And could you use consistent indention (tabs) here?

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.