Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251208052058.GQ3520958@port70.net>
Date: Mon, 8 Dec 2025 06:20:58 +0100
From: Szabolcs Nagy <nsz@...t70.net>
To: Rich Felker <dalias@...c.org>
Cc: Fabian Rast <fabian.rast@....de>, musl@...ts.openwall.com
Subject: Re: [PATCH v2] ldso: skip gnu hash calculation if precomputed

* Rich Felker <dalias@...c.org> [2025-12-07 21:25:42 -0500]:
> On Sun, Dec 07, 2025 at 08:44:54PM -0500, Rich Felker wrote:
> > This looks a lot more expensive than the method I suggested using the
> > bloom filter, which has no modulo, much less two.
> 
> Actually, I'm not sure if the bloom filter approach works. Is a hash
> included in the bloom filter as soon as it's present in the symbol
> table, or only if it's present with a definition?
> 
> If it's absent for undefined, then this approach is a non-starter. It
> can't tell us anything.

isn't it the other way around?

only defined symbols set bits in the bloom filter.
it is for checking if a dso might have a definition.

> 
> But if undefined symbols are present, then here's my concept for how
> it would work:
> 
> It should consist of grabbing two consecutive bits aligned at a 2-bit
> boundary (so not crossing slots) from the bloom filter, and
> considering the possible values:
> 
> 00 - impossible
> 01 - low bit is 0
> 10 - low bit is 1
> 11 - inconclusive; just compute hash. should be rare with good bloom.

yeah, one of the bits is 'in the filter' the other
is a random bit test: if p portion of the bits are
1 then with p chance we fail.

in bfd ld the bloom mask seems to be sized such that
if nsym in [ 3<<k, 3<<(k+1) ) then maskbits = 32<<k
so 3/32 <= nsym/maskbits < 6/32

each symbol sets two bits so the probability that
a bit is 0 is (1-1/maskbits)^(2*nsym) so a bit is 1
with roughly p = 1 - exp(-2*nsym/maskbits) that is
0.17 < p < 0.32 probability. (normally two bits
are tested so bloom fails with p^2, but the other
bit is shared between h&-2 and h|1)

if this is right and the hash is good, then we have
about 20-30% chance to fail and 70-80% to succeed
just with that one bit.


it seems to me that with many dsos (>50), it's not
the gnu_hash but the walk over all dsos and checking
gnu_lookup_filtered will be the slow part. not
sure if it's possible to help with that.
i also noticed that some symbols are looked up a lot:
in clang >30% of lookups are (multiple relocs per dso)
__cxa_pure_virtual
_ZTVN10__cxxabiv120__si_class_type_infoE
_ZTVN10__cxxabiv117__class_type_infoE
_ZTVN10__cxxabiv121__vmi_class_type_infoE
and these generate fair bit of lookups too (undef weak):
_ITM_deregisterTMCloneTable
_ITM_registerTMCloneTable
__deregister_frame_info
__register_frame_info

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.