john-dev - Re: get_hash*() might be removed with other format interface

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20150913145939.GA2813@openwall.com>
Date: Sun, 13 Sep 2015 17:59:39 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: get_hash*() might be removed with other format interface

On Sun, Sep 13, 2015 at 05:17:35PM +0300, Aleksey Cherepanov wrote:
> Consider format interface where crypt_all() fills an array of integers
> just like get_hash_6() returns. So we could avoid MAX_KEYS_PER_CRYPT
> function calls. But we would need to move body of get_hash_*() into
> caller, so we would not reduce bodies.

This is already possible in the current interface, by having crypt_all()
make use of its optional capability to check against loaded hashes.  The
caller would still double-check, but only if crypt_all() returned that
there might be matches.  So what you suggest sounds like middle ground:
include the bitmap index computations into crypt_all(), but don't
necessarily include actual bitmap lookups in there.  This may be shorter
source code (given how many formats we have), but it's also less optimal
(having to store those indices somewhere instead of using them right away).
Maybe the source code complexity and duplication should be reduced in
some other way, such as by offering a macro that crypt_all() could use
to perform the bitmap lookups.

Another aspect is that with our current naive use of OpenMP, moving the
bitmap lookups into crypt_all() allows to have them in the same parallel
region and in the same threads with the corresponding hash computations.
In an OpenMP-enabled format, your "crypt_all() fills an array of
integers" idea would still involve the overhead of transferring many of
those integers between CPUs - well, unless we switch to program-level
OpenMP support (where the parallel region would be at higher levels, and
everything below it would be MT-safe) or some other way of managing
multiple threads at that higher level, which we should want to anyway.

> It would hit flexibility. For instance, dummy format uses different
> ways to compute get_hash_*(). Hm, I guess it does not affect
> performance at all, because only 1 get_hash_*() function is used
> during the whole run of john, right?

For a given salt, yes.

> It would hit FMT_BS formats when bitmap is not in use. For that case,
> a separate method might be introduced to fill the array separately.

Yes, and IIRC bitslicing was the primary motivation behind having those
separate get_hash*() functions.

OTOH, there's no good reason to have separate binary_hash*() functions -
this should go.  Instead, we can have one binary_hash() function
accepting a binary and a desired hash table index size.

BTW, I think a few smallest hash table sizes (2 or 3) should go as well.
They pre-date our use of bitmaps.  Our current 4th size is 16-bit, and
that's only an 8 KB bitmap, so fits typical L1 data cache.  The hash
table that follows (upon a hit) is larger, though, such as 128 KB on
64-bit with PASSWORD_HASH_SHR 2.  For a salted hash, those might consume
quite some RAM, so this may be a reason to keep size 3 as well (and only
use it with --save-memory?)

At the same time, I think a larger size (than our current largest)
should be introduced.  Currently, the largest is 27-bit.  If we
introduce 30-bit on top of it (128 MB bitmap) and keep PASSWORD_HASH_SHR
at 2 as it currently is, this means the hash table size will be 1 GB on
32-bit systems (28-bit indices and 32-bit elements), so is still barely
acceptable compatibility-wise.  (I assume that malloc() of 2 GB or more
might not work on 32-bit, and wouldn't leave enough address space in the
process even if it works.)  Another reasonable option would be to
replace the current 27-bit with 28-bit, and introduce 32-bit.  Then we'd
need to use PASSWORD_HASH_SHR of at least 4 (probably 5 or 6 since 1.5 GB
for bitmap + hash table is too much) on 32-bit (but can keep it at 2 on
64-bit if we prefer).  All of this is if we don't go for any better
solution yet, but merely tweak what we've got first, optimizing speed
rather than memory usage.

Alexander
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.