musl - Re: [PATCH] ldso: skip gnu hash calculation if precomputed

Follow @Openwall on Twitter for new release announcements and other news

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20251205184832.GD1827@brightrain.aerifal.cx>
Date: Fri, 5 Dec 2025 13:48:34 -0500
From: Rich Felker <dalias@...c.org>
To: Fabian Rast <fabian.rast@....de>, musl@...ts.openwall.com
Subject: Re: [PATCH] ldso: skip gnu hash calculation if precomputed

On Fri, Dec 05, 2025 at 07:30:51PM +0100, Szabolcs Nagy wrote:
> * Fabian Rast <fabian.rast@....de> [2025-12-05 18:37:57 +0100]:
> > >> the gnu hash table only stores the upper 31 bits of the 32 bit wide
> > >> hashes. the lsb is used to indicate the end of a chain.
> > >> this missing bit can be guessed for the cost of traversing two chains
> > >> in the worst case.
> > >
> > > you mean two chains per loaded dso?
> > >
> > > then i'd expect the optimization to be only enabled
> > > if dso-index-in-symbol-lookup-order < N.
> > 
> > I don't quite understand what you mean by this.
> ...
> > Searching a hashtable bucket is not that expensive, in my experience they
> > are rather small.
> > For example here are the number of buckets for each length in libLLVM.so:
> > length: 1: 924 buckets
> > length: 2: 1429 buckets
> > length: 3: 1918 buckets
> > length: 4: 1895 buckets
> > length: 5: 1517 buckets
> > length: 6: 1045 buckets
> ...
> 
> you have to do this for every loaded dso that's
> before the dso being relocated in symbol lookup
> order.
> 
> we know the relocated dso has a definition for
> the symbol since it was in the hash table. so
> we expect every single earlier dso to fail the
> symbol lookup: symbol interposition is rare.
> the point of the hash is to make such lookup
> failure fast. but you made it twice as slow on
> average.
> 
> eventually the gain from not computing the hash
> is dominated by caculating 2*n bloom filters
> (and potentially chasing twice as many hash chains)

I think this analysis sounds correct, in which case the optimization
probably only makes sense if we can find a way to determine the
correct hash in tiny bounded time as a one-off operation, rather than
having to try two candidate hashes per dso.

One way to do this would be to use the bloom filter from the dso
making the symbol reference: if only one of the two candidates passes
it, we know which candidate was correct. I would expect this to happen
most of the time, if the bloom filter is actually doing anything
useful to begin with.

The other option would be seeing if there's a fast way to compute just
the low bit of the hash. I think it's just the parity of the low bits
of all the bytes in the symbol name, which may be moderately faster to
compute than the hash, but probably not enough to help.

My leaning would be that the winning move is just to use the bloom
filter when is succeeds, and otherwise treat it as not having a
precomputed gnuhash.

Rich

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.