john-dev - Re: Bit slice implementation of DES based hashes

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAG5=z927csAUrGzbkSMg7-c0_rFzW4=n+1u-mBE24=LaVu3D+w@mail.gmail.com>
Date: Fri, 23 Dec 2011 07:33:07 +0530
From: piyush mittal <piyush.cse29@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Bit slice implementation of DES based hashes

>
> A straightforward approach would be to have a get_hash() function that
> would for a given index (bit layer if we're talking bitslice) return the
> full computed hash value.
>


Bit layer here means to say any perticular Nth bit of all words present.



On Fri, Dec 23, 2011 at 1:25 AM, Solar Designer <solar@...nwall.com> wrote:

> On Thu, Dec 22, 2011 at 11:31:14PM +0530, piyush mittal wrote:
> > > Also, do you understand why the get_hash() functions process a few
> > > initial elements of B[] only?
> >
> > No this I am not getting.Why here only B[0] have taken?
>
> Not only B[0] is taken, but several initial elements of B[] are - from 4
> for DES_bs_get_hash_0() to 27 for DES_bs_get_hash_6().  (I am referring
> to 1.7.9's revision of these.)
>
> > Is here just 1st bit of each hash is checking??
>
> JtR has several algorithms to check whether the computed hash matches
> one of those loaded for cracking or not.  Which one of these algorithms
> is used depends on the number of loaded hashes with a given salt.
>
> A straightforward approach would be to have a get_hash() function that
> would for a given index (bit layer if we're talking bitslice) return the
> full computed hash value.  For DES-based hashes, this function would
> typically return a 64-bit value.  With a bitslice implementation, this
> value would be based on all elements of B[] (with one bit taken out of
> each element).  JtR would then compare this value against each hash
> loaded for cracking (or each with the current salt, if applicable).
>
> But this would be inefficient, so JtR does not do it.  Instead, when the
> number of loaded hashes (which share a given salt) is small, it calls
> cmp_all() to compare one of the loaded hashes against all recently
> computed hashes.  For example, a bitslice implementation on a machine
> with 128-bit SIMD vectors (e.g., x86-64 with SSE2) may compute 128
> hashes at a time.  The corresponding cmp_all() function compares one
> loaded hash against all of the 128 computed hashes in fewer than 128
> iterations.  On x86-64 it does this in 12 iterations on average (this is
> 2*log2(64)).
>
> Now, when the number of loaded hashes is large (for the current salt, if
> applicable), even the cmp_all() algorithm is not good enough.  We'd have
> to be calling this function for each loaded hash, which would be
> time-consuming.  Thus, one of the get_hash*() functions is called for
> every computed hash instead, returning partial hashes (4-bit to 27-bit
> in version 1.7.9).  This function's return value is used as array index
> to identify the hash bucket that might contain the computed hash (if we
> have a full match).  Typically, the get_hash*() function and the hash
> table size to use are chosen such that very few loaded hashes share a
> hash bucket, if at all.  In fact, some hash buckets may be empty (in
> those cases, it is instantly determined that a computed hash does not
> match any of the hashes loaded for cracking).
>
> If the number of hashes computed at a time is N (128 in the example
> above) and the number of hashes loaded for cracking that share a given
> salt is M, then complexity of the two algorithms is roughly:
>
> log2(N) * M for the cmp_all() algorithm
> N for the algorithm involving get_hash*()
>
> As you can see, the former is preferable for small M and the latter for
> large M.
>
> In either case, if the algorithm indicates a likely match, it might not
> tell us which of the loaded hashes was cracked and what the
> corresponding index is (such that we can retrieve the corresponding
> plaintext password that was tested).  Additionally, partial hashes may
> be used with these algorithms (e.g., 32-bit), which is why I wrote
> "likely".  Thus, in those relatively infrequent cases, further checks
> are done with cmp_one() and cmp_exact() to confirm a match and to
> identify which hashes got cracked (yes, more than one may get cracked
> with one set of hash computations).
>
> Do you see why get_hash*() do not need to return the full hash values?
>
> > and Does DEPTH here represents total number of words present??
>
> No.  DEPTH is a macro that may expand either to empty string or to
> [depth].  The latter is used when we're dealing with SIMD vectors larger
> than machine word size - e.g., 128-bit with SSE2, whereas the native
> word size is just 64 or 32 bits.  In get_hash*() functions, we need to
> extract some bits from just one bit layer.  We do this with regular
> (non-SIMD) operations and machine words (in fact, even these are wider
> than necessary since we're dealing with just one bit layer in these
> functions).  Thus, we split the bit layer index into two components:
> native machine word "depth" (which word in a SIMD vector we're dealing
> with currently) and bit position in that word.  You can see this in the
> init_depth() macro:
>
> #define init_depth() \
>        depth = index >> ARCH_BITS_LOG; \
>        index &= (ARCH_BITS - 1);
>
> Alexander
>



-- 
Piyush Mittal
Department of Computer Science and Engineering
National Institute of Technology,Rourkela
INDIA

Content of type "text/html" skipped
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.