|
|
Message-Id: <DETAAKGGXEL9.359K4AMK2X47X@tum.de>
Date: Tue, 09 Dec 2025 02:10:46 +0100
From: "Fabian Rast" <fabian.rast@....de>
To: "Rich Felker" <dalias@...c.org>
Cc: <musl@...ts.openwall.com>
Subject: Re: [PATCH v2] ldso: skip gnu hash calculation if
precomputed
On Mon Dec 8, 2025 at 5:20 PM CET, Rich Felker wrote:
> I was thinking the bloom filter would include everything in the symbol
> table, including both symbols that provide definitions and symbol
> table entries that are used to reference undefined symbols. In this
> case, we could use the bloom filter of the DSO making the reference to
> determine which of the two candidate hashes is correct, and do so very
> quickly (nothing but a few bitshifts and masks).
>
> However, it seems reasonable that the bloom filter would only include
> symbols that actually have definitions. Otherwise at each step of the
> DSO chain you'd get a false positive for every DSO that references the
> symbol without defining it. So we probably can't do it the way I'd
> hoped.
I think only defined symbols are included.
But UNDEF symbols are usually also not included in the hashtable - they
are usually sorted to the beginning and gnu hash uses a constant offset
for every symbol index to exclude them. So for these undef symbols where
the bloom filter would not work we do not even have the other 31 bits
because the symbol is not in the hashtable. Of course the hash is also
a link time constant for these symbols, but its just not in the binary...
>> > But if undefined symbols are present, then here's my concept for how
>> > it would work:
>> >
>> > It should consist of grabbing two consecutive bits aligned at a 2-bit
>> > boundary (so not crossing slots) from the bloom filter, and
>> > considering the possible values:
>> >
>> > 00 - impossible
>> > 01 - low bit is 0
>> > 10 - low bit is 1
>> > 11 - inconclusive; just compute hash. should be rare with good bloom.
This sounds like it should work. I will try to implement this and see.
> I think modern linkers sort relocations by the symbol referenced,
> which allows you to bypass the lookup if the reference is the same as
> the previous relocation. We do not take advantage of this in the
> do_relocs loop at all, but we could. That would probably give far more
> performance boost than eliding the hashing and might even make eliding
> the hashing pointless.
I have already implemented this and will send a patch.
Best Regards,
Fabian Rast
Download attachment "signature.asc" of type "application/pgp-signature" (229 bytes)
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.