john-dev - Re: Bit slice implementation of DES based hashes

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAG5=z91mG0A-bVORMFkWXZB1X=cDvR97xCrj6+pin82redDP_g@mail.gmail.com>
Date: Fri, 23 Dec 2011 16:25:28 +0530
From: piyush mittal <piyush.cse29@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Bit slice implementation of DES based hashes

Oh...This is main problem between east n west...Day vs Night...no
communication.I think today I need to sacrifice my night sleep.

On Fri, Dec 23, 2011 at 7:33 AM, piyush mittal <piyush.cse29@...il.com>wrote:

> A straightforward approach would be to have a get_hash() function that
>> would for a given index (bit layer if we're talking bitslice) return the
>> full computed hash value.
>>
>
>
> Bit layer here means to say any perticular Nth bit of all words present.
>
>
>
> On Fri, Dec 23, 2011 at 1:25 AM, Solar Designer <solar@...nwall.com>wrote:
>
>> On Thu, Dec 22, 2011 at 11:31:14PM +0530, piyush mittal wrote:
>> > > Also, do you understand why the get_hash() functions process a few
>> > > initial elements of B[] only?
>> >
>> > No this I am not getting.Why here only B[0] have taken?
>>
>> Not only B[0] is taken, but several initial elements of B[] are - from 4
>> for DES_bs_get_hash_0() to 27 for DES_bs_get_hash_6().  (I am referring
>> to 1.7.9's revision of these.)
>>
>> > Is here just 1st bit of each hash is checking??
>>
>> JtR has several algorithms to check whether the computed hash matches
>> one of those loaded for cracking or not.  Which one of these algorithms
>> is used depends on the number of loaded hashes with a given salt.
>>
>> A straightforward approach would be to have a get_hash() function that
>> would for a given index (bit layer if we're talking bitslice) return the
>> full computed hash value.  For DES-based hashes, this function would
>> typically return a 64-bit value.  With a bitslice implementation, this
>> value would be based on all elements of B[] (with one bit taken out of
>> each element).  JtR would then compare this value against each hash
>> loaded for cracking (or each with the current salt, if applicable).
>>
>> But this would be inefficient, so JtR does not do it.  Instead, when the
>> number of loaded hashes (which share a given salt) is small, it calls
>> cmp_all() to compare one of the loaded hashes against all recently
>> computed hashes.  For example, a bitslice implementation on a machine
>> with 128-bit SIMD vectors (e.g., x86-64 with SSE2) may compute 128
>> hashes at a time.  The corresponding cmp_all() function compares one
>> loaded hash against all of the 128 computed hashes in fewer than 128
>> iterations.  On x86-64 it does this in 12 iterations on average (this is
>> 2*log2(64)).
>>
>> Now, when the number of loaded hashes is large (for the current salt, if
>> applicable), even the cmp_all() algorithm is not good enough.  We'd have
>> to be calling this function for each loaded hash, which would be
>> time-consuming.  Thus, one of the get_hash*() functions is called for
>> every computed hash instead, returning partial hashes (4-bit to 27-bit
>> in version 1.7.9).  This function's return value is used as array index
>> to identify the hash bucket that might contain the computed hash (if we
>> have a full match).  Typically, the get_hash*() function and the hash
>> table size to use are chosen such that very few loaded hashes share a
>> hash bucket, if at all.  In fact, some hash buckets may be empty (in
>> those cases, it is instantly determined that a computed hash does not
>> match any of the hashes loaded for cracking).
>>
>> If the number of hashes computed at a time is N (128 in the example
>> above) and the number of hashes loaded for cracking that share a given
>> salt is M, then complexity of the two algorithms is roughly:
>>
>> log2(N) * M for the cmp_all() algorithm
>> N for the algorithm involving get_hash*()
>>
>> As you can see, the former is preferable for small M and the latter for
>> large M.
>>
>> In either case, if the algorithm indicates a likely match, it might not
>> tell us which of the loaded hashes was cracked and what the
>> corresponding index is (such that we can retrieve the corresponding
>> plaintext password that was tested).  Additionally, partial hashes may
>> be used with these algorithms (e.g., 32-bit), which is why I wrote
>> "likely".  Thus, in those relatively infrequent cases, further checks
>> are done with cmp_one() and cmp_exact() to confirm a match and to
>> identify which hashes got cracked (yes, more than one may get cracked
>> with one set of hash computations).
>>
>> Do you see why get_hash*() do not need to return the full hash values?
>>
>> > and Does DEPTH here represents total number of words present??
>>
>> No.  DEPTH is a macro that may expand either to empty string or to
>> [depth].  The latter is used when we're dealing with SIMD vectors larger
>> than machine word size - e.g., 128-bit with SSE2, whereas the native
>> word size is just 64 or 32 bits.  In get_hash*() functions, we need to
>> extract some bits from just one bit layer.  We do this with regular
>> (non-SIMD) operations and machine words (in fact, even these are wider
>> than necessary since we're dealing with just one bit layer in these
>> functions).  Thus, we split the bit layer index into two components:
>> native machine word "depth" (which word in a SIMD vector we're dealing
>> with currently) and bit position in that word.  You can see this in the
>> init_depth() macro:
>>
>> #define init_depth() \
>>        depth = index >> ARCH_BITS_LOG; \
>>        index &= (ARCH_BITS - 1);
>>
>> Alexander
>>
>
>
>
> --
> Piyush Mittal
> Department of Computer Science and Engineering
> National Institute of Technology,Rourkela
> INDIA
>
>
>


-- 
Piyush Mittal
Department of Computer Science and Engineering
National Institute of Technology,Rourkela
INDIA

Content of type "text/html" skipped
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.