Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 20 Apr 2023 14:15:48 +0200
From: magnum <>
Subject: Re: Birthday paradox (Bloom filter, Perfect hash tables)

I kind of regret having this conversation on the ML instead of Github - 
the latter is way more convenient.

I found that Sayantan's perfect hash tables involves randomness - they 
end up with slightly different sizes.  A Mersenne Twister is seeded from 

Perhaps differences in speed between runs could be caused by different 
modulo figures ending up differently optimized?  Or should I expect that 
such optimizations are more or less uniform given that OFFSET_TABLE_SIZE 
is (assumably) always a prime?

I suppose we could give it a static seed for uniform test runs.  Perhaps 
we could even want to commit a static seed?  I assume we'd still need 
the varying seed for when it has to retry a table build though.

On 2023-04-19 23:47, magnum wrote:
> I now added code that checks if the offset + hash tables fit into local 
> memory and if so, skips the bitmap completely.  This happens up to about 
> 5300 loaded hashes.  But just like most things I tried, this performs 
> just the same as using the bitmap - neither better or worse.

Because of mentioned randomness, 5308 hashes *sometimes* fit in local 
memory without a bitmap, sometimes not.  But at that number, using local 
or not almost doesn't affect the speed I get!  Perhaps the freed local 
memory becomes cache (or caching is otherwise very good) - I recently 
heard someone mentioning that about modern GPU's.


Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.