john-dev - Re: [patch] optional new raw sha1 implemetation

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20120619141347.GA17060@openwall.com>
Date: Tue, 19 Jun 2012 18:13:47 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: [patch] optional new raw sha1 implemetation

On Tue, Jun 19, 2012 at 03:56:31PM +0200, Tavis Ormandy wrote:
> On Tue, Jun 19, 2012 at 05:31:39PM +0400, Solar Designer wrote:
> > How large is your actual max_keys_per_crypt?
> 
> At the moment, just 256. But I would like it to be much higher.

Why?  It's mostly just to reduce function call overhead and maybe to
optimize L1 cache usage (including instruction cache), but when you set
the value too high, you exceed the size of L1 data cache.  256 sounds
sane for a CPU-only format.  Well, maybe you can do 1024 and still be
within L1.  Even if you avoid the specific issue with
cmp_all()/cmp_one(), you would probably not want to use much higher
max_keys_per_crypt.

Well, another reason to use a higher max_keys_per_crypt is with
OpenMP-enabled builds, though.  Then, yes, it could make sense for you
to go for values that are beyond L1 cache size (but within L2 or L3).
However, performance will be worse than that of independent processes
anyway, since the candidate password generation is single-threaded and
not overlapping with hash computations under the current interfaces.
(I intend to address this by introducing set_mask() as discussed in
another john-dev thread.  I may also allow for async processing, but
unfortunately that's not compatible with OpenMP in particular, where
"nowait" only works within a parallel region, not across function
boundaries.)

> > Is your cmp_one() really as heavy as one SHA-1
> > computation?
> 
> A little heavier, because I optimise for the cmp_all case. for cmp_one,
> I have to unpack keys and so on.
> 
> >  Are you frequently running this against exactly 1 or 2
> > loaded hashes (not 3 or more)?
> 
> Yes.
> 
> > I am not sure if there's a way to reclaim that 0.1% without incurring it
> > (or more) elsewhere.  For example, you can do full instead of partial
> > hash comparisons in cmp_all(), but this might make its code slower
> > (through differences in the code and maybe through worse locality of
> > reference when full rather than partial binary hashes are in memory).
> 
> Hmm, not sure I follow. When I return a possible match from cmp_all,
> john calls cmp_one on all max_keys_per_crypt. As these are quite
> expensive for me as max_keys_per_crypt get's higher, and get_hash() ==
> get_binary() is very cheap, it seems like an easy win to test that
> before I really do the cmp_one test.
> 
> Does that sound wrong? It's not a big deal, but I don't see how it can
> hurt.
> 
> I mean, just this in cmp_one():
> 
> if (get_hash() != get_binary())
>   return 0
> 
> return full_comparison();

Oh, that's merely an optimization to your own cmp_one().  This makes
sense to me.  I did not realize your normal cmp_one() was heavier than
that one check.  Normally, that's pretty much what a cmp_one() does,
where any heavier stuff is left for cmp_exact().

Why limit it to the output size of the existing get_hash*() function
(the largest one?), though?  Can't you quickly compare a 32-bit value?

Thanks,

Alexander
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.