john-dev - Re: Birthday paradox

Follow @Openwall on Twitter for new release announcements and other news

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20230418172936.GA2732@openwall.com>
Date: Tue, 18 Apr 2023 19:29:37 +0200
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Birthday paradox

On Tue, Apr 18, 2023 at 07:12:09PM +0200, magnum wrote:
> $ bc -l <<< '4096*(1-(4095/4096)^1024)'
> 906.12935699914074324992

I was using this same formula for these things.

> Background: I was amazed how an 4-level bitmap that "should" give 1/16 
> false positives, could end up as good as 1/43.  That was with 512 hashes 
> in a 4x1024 bitmap. Using this formula we end up with 1/41, so mystery 
> solved.
> 
> In another test case, 1024 hashes in a 8x1024 bitmap would be "full" and 
> should give 1/1 false positives if we don't factor in the birthday 
> paradox.  But empirical tests showed an amazing 1/38 - actually good 
> enough to make an nvidia perform near its best!  This was again 
> explained with the birthday paradox (which says 1/39).

Now you need to compare efficiency of these multiple bitmaps against one
Bloom filter of their cumulative size.  And then against a Cuckoo filter
of the same size.

Alexander

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.