john-dev - Re: precomputed attacks for john: rainbow tables and other ways

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 7 Jul 2015 19:07:21 +0300
From: Aleksey Cherepanov <lyosha@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: precomputed attacks for john: rainbow tables and
 other ways

On Fri, Jun 26, 2015 at 10:17:11AM +0300, Aleksey Cherepanov wrote:
> Another idea about tracking: at any point, we are sure that we tried
> all candidates in the beginning. So we can skip their tracking. Also
> we can postpone tracking of tail. We can track a frame in key space
> and move it when beginning is filled tightly.
> 
> For instance
> 
> ******....*...**....*...*
>       ^
>       All candidates on the left hand side were tried due to
>       sequential tries.
> 
> So with framing:
> 
> v1          v2
> ******....*...__...._..._
>       ^3  ^4  ^^5
> 
> 1 is the beginning of the frame.
> 2 is the end of the frame.
> 3 is our position in sequential tries.
> 4 is a tried candidate, it was hit by middle of some chain.
> 5 are tried candidates hit by middle of chains, but we don't track
> them now.
> 
> Then we shift our frame and continue generation:
>     v           v
> ******....*..............
>       ^
> 
> This way we need only frame size of memory and skip closest findings
> hoping that hits in the tail will be repeated after the shift of the
> frame.

I made an implementation in C with openssl. I did not try cracking so
the code may be broken.

I implemented cyclic buffer for tracking limited to 'tracking_buffer'
size, 'total_count / 8' is max.

Some stats for 2^24 candidates (3 words) with md5:

Legend:
chain length / number of colors, number of candidates to track as part of keyspace

200/200, 1/1
i = 16711680, ck = 953833, reuse ratio = 0.94292
real	0m47.326s
так и было, хорошо

200/200, 1/2
i = 16711680, ck = 1266126, reuse ratio = 0.92424
real	1m2.085s

200/200, 1/4
i = 16711680, ck = 1707701, reuse ratio = 0.89781
real	1m20.159s

200/200, 1/8
i = 16711680, ck = 2402943, reuse ratio = 0.85621
real	1m54.208s

200/200, 1/16
i = 16711680, ck = 3406468, reuse ratio = 0.79616
real	2m39.457s

So it can be real to squeeze 2 more bits into attack.


Though experiments suggest that it may be useful to write down
remaining candidates as is at some point to reduce number of chains
and respective computations. It is not that handy tracking only a bit
of keyspace.

Stats for some lengths:

200/200
i = 1966080, ck = 386923, reuse ratio = 0.80320
i = 6553600, ck = 654749, reuse ratio = 0.90009
i = 16711680, ck = 953833, reuse ratio = 0.94292
real	0m48.769s

2000/2000
i = 196608, ck = 38782, reuse ratio = 0.80274
i = 720896, ck = 68275, reuse ratio = 0.90529
i = 2097152, ck = 104052, reuse ratio = 0.95038
i = 16711680, ck = 224271, reuse ratio = 0.98658
real	1m54.509s

65536/65536
i = 65536, ck = 3258, reuse ratio = 0.95029
i = 851968, ck = 8245, reuse ratio = 0.99032
i = 16711680, ck = 23359, reuse ratio = 0.99860
real	6m27.350s

Multiplying number of chains (ck) and chain length, we get total
number of hashing ops. Let's divide it by total_count:
11.370575428009
26.7351865768433
91.24609375

So for chains of length 200, we do 11x more hashing to generate the
table.

'reuse ratio' is computed based on the beginning, but it should be
right for the whole keyspace as ratio of coverage due to randomness.

Again, I am afraid the code is broken so the stats and conclusions may
be wrong.

Thanks!

-- 
Regards,
Aleksey Cherepanov

View attachment "ossl.c" of type "text/x-csrc" (6422 bytes)
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.