Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 1 Jul 2015 17:10:16 +0300
From: Timo Teras <timo.teras@....fi>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: Further dynamic linker optimizations

On Wed, 1 Jul 2015 10:03:27 -0400
Rich Felker <dalias@...c.org> wrote:

> On Wed, Jul 01, 2015 at 08:41:29AM +0300, Timo Teras wrote:
> > On Tue, 30 Jun 2015 16:04:54 -0400
> > Rich Felker <dalias@...c.org> wrote:
> > 
> > > Discussion on #musl with Timo Teräs has produced the following
> > > results:
> > 
> > Nice summary. Thanks!
> > 
> > > - The whole outer for loop in find_sym is the hot path for
> > >   performance. As such, eliminating the lazy calculation of
> > > gnu_hash and simply doing it before the loop should be a
> > > measurable win, just by removing the if (!ghm) branch.
> > 
> > Additional thought. We could do a skip list here. If we calculate
> > the gnu-hash unconditionally, we could bloom filter bits to
> > construct a skip list.
> 
> This wasn't something I recall discussing...

Yes. That's why I said "additional thought". :)

> > That is, we have next_symlookup[] array that has pointer for each
> > wordsize bits (or potentially a small multiple of it). And we would
> > link each dso in next_symlookup array corresponding to each bloom
> > filter bit (for dso without gnu-hash it'd have to go to all of
> > them). Then on lookup we could just use the calculated bloomfilter
> > to follow the correct symlookup chain next pointers.
> 
> This is a very large size increase, and perhaps notable startup time
> increase, just for the sake of mislinked (clang) applications. It's
> something like the idea I wanted to do in a static linker, albeit with
> a much larger global table for where to start based on hash%largeval
> rather than local next/skip tables per module. But I don't think it's
> appropriate for dynamic linking.

Yes. And after some trivial benchmarking, it seems to not give any
significant improvement. bloomfilter using only one machine word is not
enough for anything. And doing anything larger gives too much memory
use overhead.

Just moving the gnu-hash calculation out of the loop and doing it
always + removing the ->global check will give already quite noticeable
boost.

Thanks.
Timo

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.