musl - Re: ldso : dladdr support

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <50212306.6070402@gmail.com>
Date: Tue, 07 Aug 2012 16:15:34 +0200
From: musl <b.brezillon.musl@...il.com>
To: musl@...ts.openwall.com
CC: Szabolcs Nagy <nsz@...t70.net>
Subject: Re: ldso : dladdr support

Thanks for your review.

On 07/08/2012 13:46, Szabolcs Nagy wrote:
> * musl <b.brezillon.musl@...il.com> [2012-08-07 11:04:19 +0200]:
>> This patch adds support for dladdr function.
>> It is based on my previous patch (gnu hash support).
>>
> i havent checked the content of your patch yet
> just had a quick glance
>
> i think before you make substantial changes
> it's better to have some discussion about
> the design
Could you tell me more about the design issues
(I guess this has something to do with function pointers and multi hash algorithms design) ?
>
> i'll just have some stylistic comments for now
> (i think there is no official guideline and
> we are not terribly anal about it but i'll
> still point out a few things for future reference)
>
>>  struct hash_algo {
>>  	uint32_t (*hash) (const char *);
>> -	Sym *(*lookup) (const char *s, uint32_t h, struct dso *dso);
>> +	Sym *(*lookup) (const char *, uint32_t, struct dso *);
>> +	void (*iterate) (struct dso *, int (*) (struct dso *, Sym *, void *), void *);
>>  };
>>  
> use
>   foo(arg)
> instead of
>   foo (arg)
>
> in case of keywords it's not clear what's better
> (musl usually uses spaces after if,while,for,..)
> but for function calls no space is clearer
>
> (in general musl code rarely uses unnecessary spaces and parentheses)
I'm correcting my code to match the musl coding style.
>
> actually i'm not sure if the function pointer approach is
> the right one here, but that's a design question
>
Could you tell me why (performance)?
>> +	uint32_t *hashtab = dso->hashtab;
>> +	uint32_t *buckets = hashtab + 4 + (hashtab[2] * (sizeof(size_t) / sizeof(uint32_t)));
> i don't know the algorithm but the sizeof magic looks suspicious to me
I took the algorithm described here :

https://blogs.oracle.com/ali/entry/gnu_hash_elf_sections

The maskwords are native word size (64 bits for elf64 and 32 bits for elf32).
I could define these macros:
#define MASKWORD_SIZE sizeof (Elf32_Word)
#define MASKWORD_SIZE sizeof (Elf64_Word)
>
>> +static uint32_t gnu_hash (const char *s0)
>> +{
>> +	const unsigned char *s = (void *)s0;
>> +	uint_fast32_t h = 5381;
>> +	for (unsigned char c = *s; c != '\0'; c = *++s)
>> +		h = h * 33 + c;
>> +	return h & 0xffffffff;
>> +}
>> +
> i think c != '\0' is not very idiomatic
> and i'd avoid using c99 style declaration in for
>
> use
>
> for (; *s; s++)
> 	h = 33*h + *s;
>
Corrected.
>> +	uint32_t shift2 = hashtab[3];
>> +	uint32_t h2 = h1 >> shift2;
> i'm not sure if input validation makes sense in ldso
> but shifts can be tricky (hashtab[3] must be in 0..31 here)
By validation do you mean mask the shift value with 0x1F ?
>
>> +	for (h1 &= ~1; 1; sym++) {
> the 1; in the middle is unnecessary
>
> h1 &= ~1; is problematic if signed int representation is
> not two's complement
> (~1 is an implementation defined negative value which is
> then converted to uint32_t according to well defined rules)
>
> so i'd use
>
>   for (h1 &= -2;; sym++) {
>
> which is probably less clear but more correct
> (maybe an explicit (uint32_t)-2 cast helps with that)
I'll take a closer look at the gnu_lookup algo to see if I can improve it.
>> -	if (h==0x6b366be && !strcmp(s, "dlopen")) rtld_used = 1;
>> -	if (h==0x6b3afd && !strcmp(s, "dlsym")) rtld_used = 1;
>> -	if (h==0x595a4cc && !strcmp(s, "__stack_chk_fail")) ssp_used = 1;
>> +	uint32_t computed[HASH_ALG_CNT / 32 + 1];
>> +	uint32_t hashes[HASH_ALG_CNT];
>> +	memset (computed, 0, sizeof (computed));
>> +	if (!strcmp(s, "dlopen")) rtld_used = 1;
>> +	if (!strcmp(s, "dlsym")) rtld_used = 1;
>> +	if (!strcmp(s, "__stack_chk_fail")) ssp_used = 1;
> ...
>> +		if (!(computed[dso->hashalg / 32] & (1 << (dso->hashalg % 32)))) {
>> +			h = hashalgs[dso->hashalg].hash(s);
>> +			hashes[dso->hashalg] = h;
>> +			computed[dso->hashalg / 32] |= (1 << (dso->hashalg % 32));
>> +		}
>> +		else {
>> +			h = hashes[dso->hashalg];
>> +		}
> i think we can have more efficient code here if
> only gnu and sysv hash algos are supported
> (i don't think it's realistic to assume new hash algorithms
> even gnu hash is a fairly useless addition to elf)
This has to do with the function pointer approach.
Do you prefer if/else alternative?
> the if else style is
>
>   if (cond) {
>   ...
>   } else {
>   ...
>   }
>
>
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.