john-dev - Re: Re: precomputed attacks for john: rainbow tables and? other ways

Follow @Openwall on Twitter for new release announcements and other news

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <20150629205941.GA16087@openwall.com>
Date: Mon, 29 Jun 2015 23:59:41 +0300
From: Aleksey Cherepanov <lyosha@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Re: precomputed attacks for john: rainbow tables and?
 other ways

On Mon, Jun 29, 2015 at 11:54:40PM +0300, Aleksey Cherepanov wrote:
> Alain,
> 
> On Mon, Jun 29, 2015 at 03:53:38PM -0400, Alain Espinosa wrote:
> > 
> > 
> > -------- Original message --------
> > From: Aleksey Cherepanov <lyosha@...nwall.com> 
> > Date:06/29/2015 1:30 PM (GMT-05:00) 
> > To: john-dev@...ts.openwall.com 
> > Cc: 
> > Subject: Re: [john-dev] Re: precomputed attacks for john: rainbow tables and other ways 
> > 
> > > Sorry I'm not really following you...
> > 
> > Me neither :)
> > 
> > There is some research in perfect Hash tables to reduce memory and speed up access with TMTO:
> > 
> > - 2005, "Time-Memory Trade-Offs: False Alarm Detection Using Checkpoints"
> > - 2001, "Real Time Cryptanalysis of A5/1 on a PC"
> > 
> > Other topic:
> > I begin only recently to research TMTO but one optimization appear obvious, and to some extent was used but not completely.
> > 
> > N: size of key space
> > m: number of chains
> > t: chain length
> > 
> > The common analysis supposed mt^2=N for each table and t different tables. This is because larger m, t generate more probably duplicates and chain merges. t tables generate t^2 hash executions and t disk access in the online phase. Using more tables is only an optimization to cover a larger part of the keyspace, so generating only one table with distinguished points reduce hash execution to t and only one disk access at the expense of more offline time. How much? We can brute force the entire key space so at most t more expensive, possibly less with a smarter search. Note that we can easily eliminate duplicates because we use DP. This is a very large online speedup and I am not sure why this was not used.
> > 
> > This algorithm is not parallelizable as is but if many hashes are available, they can be searched in parallel.
> > 
> > Other big advantage is the very good client-server architecture. You don't need to download large table file. In the client you calculate the DP (performance heavy part) and ask the server if exists. The server only search in disk or RAM and return the start point, or false if the DP not exists.
> 
> I do not really get your algorithm. My view: let's end chains based on
> hash value, for instance end chain if hash has 2 leading zero bytes;
> so intermediate hashing results are not needed to be checked against
> the table, this way we compute (number of colors)*(chain length)
> hashes but check only (number of colors) hashes against the table, it
> allows to make lookup fast, also it allows to offload chain
> computation to devices with low bandwidth like fpga, gpu or remote
> machines allowing us to use very very long chains.

It is parallelizable. The only thing is that length of chains has to
be stored during generation phase. Then maximal length should be used.
Unfortunately max length is not guaranteed but it should not be very
big due to randomness of hashes.

> Do you mean the same?

Thanks!

-- 
Regards,
Aleksey Cherepanov

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.