Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 16 Aug 2012 20:03:59 +0200
From: musl <b.brezillon.musl@...il.com>
To: musl@...ts.openwall.com
CC: Rich Felker <dalias@...ifal.cx>
Subject: Re: ldso : dladdr support

Hi,

Here is a new patch (and a diff from the previous one).

Could you tell me if the new decode_vec function is what you expected?

Regards,

Boris

On 12/08/2012 01:05, Rich Felker wrote:
> On Wed, Aug 08, 2012 at 03:57:15PM +0200, musl wrote:
>> Same as before except I use bit mask to support multiple hash algorithms.
> Sorry for taking a while to get back to you. I haven't had as much
> time to work on musl the past couple weeks and some other topics (like
> mips dynamic linking) had priority, but I hope to have more again for
> a while now. Here's a quick review of some things that will hopefully
> turn into a discussion for improving/simplifying the code.
>
>> +#ifdef _GNU_SOURCE
>> +typedef struct {
>> +    const char *dli_fname;  /* Pathname of shared object that
>> +                               contains address */
>> +    void       *dli_fbase;  /* Address at which shared object
>> +                               is loaded */
>> +    const char *dli_sname;  /* Name of nearest symbol with address
>> +                               lower than addr */
>> +    void       *dli_saddr;  /* Exact address of symbol named
>> +                               in dli_sname */
>> +} Dl_info;
> musl policy is not to have commentary, especially copied from
> third-party sources, in the public headers. This is partly to
> strengthen the claim that all public headers are public domain
> (contain no copyrightable content) and partly just to avoid size creep
> and wasted parsing time by the compiler.
>
>>  static void decode_dyn(struct dso *p)
>>  {
>> -	size_t dyn[DYN_CNT] = {0};
>> -	decode_vec(p->dynv, dyn, DYN_CNT);
>> -	p->syms = (void *)(p->base + dyn[DT_SYMTAB]);
>> -	p->hashtab = (void *)(p->base + dyn[DT_HASH]);
>> -	p->strings = (void *)(p->base + dyn[DT_STRTAB]);
>> +	size_t *v;
>> +	p->hashalgs = 0;
>> +	for (v = p->dynv; v[0]; v+=2) {
>> +		switch (v[0]) {
>> +		case DT_SYMTAB:
>> +			p->syms = (void *)(p->base + v[1]);
>> +			break;
>> +		case DT_HASH:
>> +			p->hashtabs[SYSV_HASH] = (void *)(p->base + v[1]);
>> +			p->hashalgs |= (1 << SYSV_HASH);
>> +			break;
>> +		case DT_STRTAB:
>> +			p->strings = (void *)(p->base + v[1]);
>> +			break;
>> +		case DT_GNU_HASH:
>> +			p->hashtabs[GNU_HASH] = (void *)(p->base + v[1]);
>> +			p->hashalgs |= (1 << GNU_HASH);
>> +			break;
>> +		default:
>> +			break;
>> +		}
>> +	}
>>  }
> This is rather ugly but I don't see a better way right off, since
> DT_GNU_HASH has that huge value... Maybe it would be nice to improve
> decode_vec to have a variant that takes a (static const) table of DT_x
> values and struct offsets to store them at when found..? This is just
> some rambling and I'm not sure it's better, but it might be smart if
> we're going to want to continue adding support for more
> non-original-sysv DT_* entries with huge values, so we don't have to
> write new switches for each one.
>
> BTW, while I _want_ it to be safe, it's possible that early switches
> (early meaning prior to the comment in __dynlink that says libc is now
> fully functional) will actually fail to work/crash on some archs... So
> this needs consideration too.
>
>>  static struct dso *load_library(const char *name)
>> @@ -784,8 +899,11 @@ end:
>>  static void *do_dlsym(struct dso *p, const char *s, void *ra)
>>  {
>> [...]
>>  	if (sym && sym->st_value && (1<<(sym->st_info&0xf) & OK_TYPES))
>>  		return p->base + sym->st_value;
>>  	if (p->deps) for (i=0; p->deps[i]; i++) {
>> -		sym = lookup(s, h, p->deps[i]);
>> +		algs = p->deps[i]->hashalgs;
>> +		if (!(algs & ok)) {
>> +			if (algs & SYSV_HASH) {
>> +				h[SYSV_HASH] = sysv_hash(s);
>> +				sym = sysv_lookup(s, h[SYSV_HASH], p->deps[i]);
>> +				ok |= SYSV_HASH;
>> +			} else {
>> +				h[GNU_HASH] = gnu_hash(s);
>> +				sym = gnu_lookup(s, h[GNU_HASH], p->deps[i]);
>> +				ok |= GNU_HASH;
>> +			}
>> +		} else {
>> +			if (algs & SYSV_HASH)
>> +				sym = sysv_lookup(s, h[SYSV_HASH], p->deps[i]);
>> +			else
>> +				sym = gnu_lookup(s, h[GNU_HASH], p->deps[i]);
>> +		}
> This looks like a lot of code duplication and extra unnecessary
> variables. The way I would do it is something like:
>
> if (p->deps[i]->hashtab && (h || !p->deps[i]->ghashtab)) {
> 	if (!h) h = hash(s);
> 	sym = sysv_lookup(s, h, p->deps[i]);
> }
>
> i.e. if there's a sysv hash table and we've already computed h (sysv
> hash) or if there's no gnu hash table, compute h if it wasn't already
> computed, and then attempt a lookup with it.
>
> I'm not sure I got the logic all right (this is really a 1-minute
> glance over the code right now amidst doing lots of other stuff too)
> but the ideas are:
>
> - no need for extra vars for bitmask. Whether the hash var for the
>   corresponding hash type is nonzero is sufficient to tell whether
>   it's been computed.
> - no need for extra vars/fields to store which hash types a dso has.
>   Just use the hashtab/ghashtab fields in the dso struct, and let them
>   be null if the corresponding hash table does not exist. (And don't
>   make them an array unless there's a real benefit in using an array;
>   I don't think there is any benefit unless you're aiming for
>   extensibility to support N hash types.)
>
>> +static int do_dladdr (void *addr, Dl_info *info)
>> [...]
>> +			if (p->hashalgs & (1 << SYSV_HASH)) {
>> +				hashtab = p->hashtabs[SYSV_HASH];
>> +				for (i = 0; i < hashtab[1]; i++) {
> I'm not seeing why this function needs hash tables at all. It's not
> looking up symbols, just iterating over the entire symbol table, no?
> Please explain if I'm mistaken.
>
> Rich


View attachment "dladdr-gnu-hash-v2.diff" of type "text/x-patch" (9819 bytes)

View attachment "dladdr-gnu-hash-v2.patch" of type "text/x-patch" (10874 bytes)

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.