[<prev] [next>] [thread-next>] [day] [month] [year] [list]
```Date: Tue, 18 Apr 2023 19:12:09 +0200
From: magnum <magnumripper@...hmail.com>
To: john-dev@...ts.openwall.com

On 2023-04-18 18:28, magnum wrote:
> On 2023-04-18 17:57, magnum wrote:
>> Can anyone point me to a (approximation) formula for the birthday
>> paradox, where for example we have a bitmap with 4096 bits and
>> populate it with 1024 random bits. What is the expected number of bits
>> set in the bitmap?
>>
>> I think the answer is ~907, as that's what I'm seeing in my
>> experiments - and also what this simple script shows:
>
> Talking to the duck works every time :)
>
> Found it at
> https://jaxwebster.wordpress.com/2012/01/24/expected-number-of-different-birthdays/
>
> It's as simple as 4096*(1-(4095/4096)^1023) where ^ means power of, as
> in bc(1):

A correction, just in the remote case someone read the above and wants
to use it:  I mistakingly decreased by one where I shouldn't - the
correct formula is 4096*(1-(4095/4096)^1024).

\$ bc -l <<< '4096*(1-(4095/4096)^1024)'
906.12935699914074324992

And that's even closer to my empirical data.

Background: I was amazed how an 4-level bitmap that "should" give 1/16
false positives, could end up as good as 1/43.  That was with 512 hashes
in a 4x1024 bitmap. Using this formula we end up with 1/41, so mystery
solved.

In another test case, 1024 hashes in a 8x1024 bitmap would be "full" and
should give 1/1 false positives if we don't factor in the birthday
paradox.  But empirical tests showed an amazing 1/38 - actually good
enough to make an nvidia perform near its best!  This was again
explained with the birthday paradox (which says 1/39).

Cool stuff,
magnum

```

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.