
Date: Thu, 22 Dec 2011 23:55:26 +0400 From: Solar Designer <solar@...nwall.com> To: johndev@...ts.openwall.com Subject: Re: Bit slice implementation of DES based hashes On Thu, Dec 22, 2011 at 11:31:14PM +0530, piyush mittal wrote: > > Also, do you understand why the get_hash() functions process a few > > initial elements of B[] only? > > No this I am not getting.Why here only B[0] have taken? Not only B[0] is taken, but several initial elements of B[] are  from 4 for DES_bs_get_hash_0() to 27 for DES_bs_get_hash_6(). (I am referring to 1.7.9's revision of these.) > Is here just 1st bit of each hash is checking?? JtR has several algorithms to check whether the computed hash matches one of those loaded for cracking or not. Which one of these algorithms is used depends on the number of loaded hashes with a given salt. A straightforward approach would be to have a get_hash() function that would for a given index (bit layer if we're talking bitslice) return the full computed hash value. For DESbased hashes, this function would typically return a 64bit value. With a bitslice implementation, this value would be based on all elements of B[] (with one bit taken out of each element). JtR would then compare this value against each hash loaded for cracking (or each with the current salt, if applicable). But this would be inefficient, so JtR does not do it. Instead, when the number of loaded hashes (which share a given salt) is small, it calls cmp_all() to compare one of the loaded hashes against all recently computed hashes. For example, a bitslice implementation on a machine with 128bit SIMD vectors (e.g., x8664 with SSE2) may compute 128 hashes at a time. The corresponding cmp_all() function compares one loaded hash against all of the 128 computed hashes in fewer than 128 iterations. On x8664 it does this in 12 iterations on average (this is 2*log2(64)). Now, when the number of loaded hashes is large (for the current salt, if applicable), even the cmp_all() algorithm is not good enough. We'd have to be calling this function for each loaded hash, which would be timeconsuming. Thus, one of the get_hash*() functions is called for every computed hash instead, returning partial hashes (4bit to 27bit in version 1.7.9). This function's return value is used as array index to identify the hash bucket that might contain the computed hash (if we have a full match). Typically, the get_hash*() function and the hash table size to use are chosen such that very few loaded hashes share a hash bucket, if at all. In fact, some hash buckets may be empty (in those cases, it is instantly determined that a computed hash does not match any of the hashes loaded for cracking). If the number of hashes computed at a time is N (128 in the example above) and the number of hashes loaded for cracking that share a given salt is M, then complexity of the two algorithms is roughly: log2(N) * M for the cmp_all() algorithm N for the algorithm involving get_hash*() As you can see, the former is preferable for small M and the latter for large M. In either case, if the algorithm indicates a likely match, it might not tell us which of the loaded hashes was cracked and what the corresponding index is (such that we can retrieve the corresponding plaintext password that was tested). Additionally, partial hashes may be used with these algorithms (e.g., 32bit), which is why I wrote "likely". Thus, in those relatively infrequent cases, further checks are done with cmp_one() and cmp_exact() to confirm a match and to identify which hashes got cracked (yes, more than one may get cracked with one set of hash computations). Do you see why get_hash*() do not need to return the full hash values? > and Does DEPTH here represents total number of words present?? No. DEPTH is a macro that may expand either to empty string or to [depth]. The latter is used when we're dealing with SIMD vectors larger than machine word size  e.g., 128bit with SSE2, whereas the native word size is just 64 or 32 bits. In get_hash*() functions, we need to extract some bits from just one bit layer. We do this with regular (nonSIMD) operations and machine words (in fact, even these are wider than necessary since we're dealing with just one bit layer in these functions). Thus, we split the bit layer index into two components: native machine word "depth" (which word in a SIMD vector we're dealing with currently) and bit position in that word. You can see this in the init_depth() macro: #define init_depth() \ depth = index >> ARCH_BITS_LOG; \ index &= (ARCH_BITS  1); Alexander
Powered by blists  more mailing lists