Date: Sun, 13 Sep 2015 20:28:08 -0700
From: Fred Wang <waffle.contest@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Re: Judy array


On Sep 13, 2015, at 7:45 PM, Sayantan Datta <std2048@...il.com> wrote:
> Nice! Unlike perfect hash tables, Judy arrays are supposed to be cache friendly. However, I'm curious about the number of lookups required. I'll study them in more detail.
> 
> Fred, have you compared the performance of Bloom filters vs. bitmaps (maybe one or multiple)?
> 

Yes.  For the most part (and in particular when the Bloom filter is very sparse), a Bloom filter is a win for "our" type of lookup: in cracking, the vast majority of hash attempts will fail, and that failing case is what I optimized for.

A Judy array on its own is already faster than what John is doing.  Fronting it with a Bloom filter vastly improves performance.  Regrettably, I did not keep my bitmap timings, but they were not impressive.
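
For anyone who wants to see the shape of it, here is a stripped-down sketch of that lookup path.  This is not the actual mdxfind code: the filter size, the two hash slices taken from the MD5 digest, and judy_lookup() are just placeholders standing in for the real Judy query.

#include <stdint.h>
#include <string.h>

#define BLOOM_BITS (1ULL << 27)            /* 2^27-bit filter, 16 MB (placeholder size) */

static uint64_t bloom[BLOOM_BITS / 64];    /* the Bloom filter bit array */

/* Two cheap "hash functions": since the candidate MD5 digest is already
   random-looking, two 8-byte slices of it are good enough for a sketch. */
static uint64_t h1(const unsigned char *md5) {
    uint64_t v; memcpy(&v, md5, 8);     return v % BLOOM_BITS;
}
static uint64_t h2(const unsigned char *md5) {
    uint64_t v; memcpy(&v, md5 + 8, 8); return v % BLOOM_BITS;
}

/* Called once per loaded hash while building the tables. */
static void bloom_add(const unsigned char *md5) {
    uint64_t a = h1(md5), b = h2(md5);
    bloom[a >> 6] |= 1ULL << (a & 63);
    bloom[b >> 6] |= 1ULL << (b & 63);
}

/* Stand-in for the exact lookup (the Judy array in mdxfind).  Always
   returns 0 here just so the sketch compiles and runs. */
static int judy_lookup(const unsigned char *md5) { (void)md5; return 0; }

/* The hot path: almost every candidate fails after one or two memory
   touches in the filter, so the exact structure is rarely consulted. */
static int hash_found(const unsigned char *md5) {
    uint64_t a = h1(md5);
    if (!(bloom[a >> 6] & (1ULL << (a & 63)))) return 0;
    uint64_t b = h2(md5);
    if (!(bloom[b >> 6] & (1ULL << (b & 63)))) return 0;
    return judy_lookup(md5);               /* rare: possible hit, verify exactly */
}

int main(void) {
    unsigned char digest[16] = {0};        /* fake 16-byte digest for the demo */
    bloom_add(digest);
    return hash_found(digest) ? 0 : 1;     /* filter passes; the stub says "no" */
}

The point of the sketch is the early-out: a miss costs at most two bit probes, and only a tiny fraction of candidates ever reach the Judy (or any exact) lookup.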

I use a 10-year-old Dell 2950 as my test environment, precisely because its slower memory makes improvements easier to see.  For my "standard" test case (MD5, 29 million hashes, a ~13 million entry dictionary, and the best64 rules, yielding about 1 billion hash attempts to find about 1.7 million solutions), the timings are:

hashcat	3 minutes 54 seconds
mdxfind	1 minute 15 seconds  (Judy only)
mdxfind	47 seconds  (current code, Bloom filter + Judy)

It's important to note that this includes the time to read and store the 29M hashes, which takes about 22 seconds on the test box.  The box uses dual E5410s @ 2.33GHz, so 8 cores.

So, Judy on its own is pretty darn good, but fronting it with a Bloom filter is a pure win for this application.

Please take some time to try it out.
