Openwall GNU/*/Linux - a small security-enhanced Linux distro for servers
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date: Fri, 3 Jul 2015 12:08:22 +0300
From: Aleksey Cherepanov <lyosha@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: continuous feeding (for data driven branching)

On Sun, Jun 21, 2015 at 01:14:14AM +0300, Solar Designer wrote:
> On Fri, Jun 19, 2015 at 05:33:11PM +0300, Aleksey Cherepanov wrote:
> > Just to document a discussion behind an idea...
> > 
> > When there is a data driven branching we may want to pack intermediate
> > results into several queues. For example, sunmd5 does it after "coin
> > flip".
> > 
> > Currently we get a lot of candidates, separate them into 2 packs and,
> > oops, packs may be sub-optimal: for instance, we got 8 candidates and
> > based on "coin flip" (or other factor), we got packs with intermediate
> > results for 3 and 5 candidates while we would like to have 4 and 4 for
> > our sse code.
> 
> This problem also exists in cases where the data-driven branching is not
> in the original algorithm, but is acquired along with our optimizations
> such as bitslicing.  This happens e.g. when we bitslice an algorithm
> where processing varied by password length.  Once we do, we have
> data-driven (actually password length driven) branching in our attack
> specific code.  Another example is tripcode, where descrypt salts are
> generated from candidate passwords, whereas our bitslice DES
> implementation assumes a fixed salt for all candidates.  Buffering and
> grouping by password length or by tripcode descrypt salt or whatever is
> relevant works around this issue, but with the drawback you describe.
> 
> > Continuous feeding: format reads N candidates, separate them and if it
> > needs more input to fill up buffers optimally then it asks for them.
> 
> Yes, that's an obvious idea I had some years back... possibly as far
> back as 1998, when thinking of whether to bitslice bsdicrypt's key
> setup.  (I ended up not bitslicing it at the time, and it still is not -
> only the main processing is bitsliced; I am still considering changing
> this, though, so that we'd drop the non-bitslice DES code from JtR tree.
> Would also need to deal with what'd become a similar issue in AFS_fmt.c,
> though - grouping by length.  Right now, that format is not even
> bitsliced at all for that reason, and it feels wasteful to spend time on
> it when few people, if anyone at all, still need it.)
> 
> Another way to do the same is to make the multiple queues visible to the
> cracking modes, which would then knowingly work on the multiple queues.
> This complicates the cracking modes' implementations, but may help solve
> some of the new issues such as progress and speed reporting, which could
> then be 100% correct, even if too complicated to many users ("what are
> all those queues I get separate reporting for?"), and crash recovery
> (can have separate checkpoints for each queue, much like we do for
> separate --fork'ed processes, or alternatively can record the fully
> processed point only, like "single crack" mode does for its rules now).

Queues are interesting if you can predict branching. Though some
branching can't be done so: sunmd5's branching is based on
intermediate results of hashing, so queues can't be used by generator.

By accident I brought interesting case of branching in LMs on
john-users: full LM of 2 halves need split of candidate, so candidates
longer than 7 chars should be split into 2 candidate, shorter
candidates should not be split.

(BTW parity of LMs provides interesting optimizations: if all right
halves empty, we can skip long passwords; if all right halves not
empty, we can skip short passwords; if we cracked all right halves,
then we can skip long passwords that do not end with such letters.)

set_key() (or similar new method) could do that: it should return flag
if it would like to get more input or not. It would be sufficient for
length based branching.

Extension of idea of set_key() that can ask for more: such set_key()
can pack optimal pack, compute intermediate results using vector code,
split them into queues based and ask for more candidates. It should be
not slower than with crypt_all(). But there are questions: at some
moment, we will have fully hashed results for some candidates and some
"work in progress" candidates, it would be nice to check hashes as
soon as they hashed fully.

Thanks!

-- 
Regards,
Aleksey Cherepanov

Powered by blists - more mailing lists

Your e-mail address:

Powered by Openwall GNU/*/Linux - Powered by OpenVZ