|
Date: Wed, 5 May 2021 00:11:14 -0700 From: David Sontheimer <david.sontheimer@...il.com> To: john-users@...ts.openwall.com Subject: Re: Cracking stats: p/s, c/s and C/s. Hashing cost factors. Alexander, Apologies for the slow response time. I have been exploring iterative cracking mode, and the longer run times that come with it. These ratios make more sense now, but in absolute terms these speeds in > the billions look unrealistic for the iterated hashes you previously > mentioned you were testing with. So I guess something else is still > wrong in your analysis script. > Those speeds were unsalted raw MD5. These are --test benchmarks for the hashes I mentioned previously: Benchmarking: md5crypt, crypt(3) $1$ (and variants) [MD5 256/256 AVX2 8x3]... (72xOMP) DONE > Many salts: 2726K c/s real, 37798 c/s virtual Only one salt: 2609K c/s real, 36303 c/s virtual Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X3]... (72xOMP) DONE > Speed for cost 1 (iteration count) of 32 Raw: 43416 c/s real, 607 c/s virtual > Benchmarking: sha512crypt, crypt(3) $6$ (rounds=5000) [SHA512 256/256 AVX2 4x]... (72xOMP) DONE > Speed for cost 1 (iteration count) of 5000 Raw: 43885 c/s real, 611 c/s virtual Benchmarking: sha1crypt, NetBSD's sha1crypt [PBKDF1-SHA1 256/256 AVX2 8x]... (72xOMP) DONE > Speed for cost 1 (iteration count) of 40000 Raw: 20132 c/s real, 280 c/s virtual Benchmarking: argon2 [Blake2 AVX]... (72xOMP) DONE > Speed for cost 1 (t) of 3, cost 2 (m) of 4096, cost 3 (p) of 1, cost 4 (type [0:Argon2d 1:Argon2i]) of 0 and 1 Raw: 3291 c/s real, 46.8 c/s virtual This is all on an older rack of 72 Xeon E5s. I was able to change sha1crypt to a single 40k iteration count - much appreciated - this will be a bit simpler to explain to my audience. You can lock Argon2 benchmarks to either Argon2d or Argon2i with: > ./john --test --format=argon2 --cost=0,0,0,0:0 > or: > ./john --test --format=argon2 --cost=0,0,0,1:1 > The rest of Argon2 parameter values are: t=3, m=4096, p=1. Is there documentation on the format of the --cost arguments for Argon2, bcrypt, sha256_crypt, and sha512_crypt? I'd like to run benchmarks with the costs matching the lowest that Passlib allows. For example, Passlib allows a minimum of 1000 rounds for sha512_crypt, while John defaults to 5000. When I attempt to alter the cost for these hashes it appears to either not effect the result, or I receive an error: FAILED (--cost pruned all 8 test vectors) > Moreover, I guess your reference to [rainbow tables] was colloquial rather > than > literal - you probably meant a precomputed table in general, not a > rainbow table specifically. However, the answer would be no anyway. > Yes, colloquial. This is helpful to understand. As I had mentioned, JtR does reuse the hashes it computes for multiple > comparisons against possible multiple loaded hashes sharing a salt. > However, this does not require maintaining any optimized nor long-term > table of the computed hashes, except for having the set of just-computed > hashes in memory. In the simplest case, you simply have one computed > hash at a time and you loop over all loaded hashes that have the same > salt. So you only store one currently computed hash in memory. In the > more complex case, which is what JtR typically has, multiple hashes are > computed in parallel, and the current batch of computed hashes for the > current salt (but different candidate passwords) is then mass-compared > against the loaded hashes sharing this salt. This mass-comparison > operation is optimized in some ways, including through having > performance-efficient data structures (bitmaps and hash tables) built > out of loaded hashes (but not out of just-computed hashes). > For example, if I use 72 forks, but only load ten password hashes (all sharing one uniform salt) then does John generate a batch of candidates across forks, and mass-compare against all ten loaded hashes... then repeat until matching candidates have been found for all ten loaded hashes? > As a special case, JtR may omit duplicate hashes on loading, in which > case the guess count and g/s reported while cracking will be limited to > those hashes it had loaded. "john --show" may then show a larger number > of guesses (including duplicates that might have been omitted before). > Given the hash types you're working with, you might face the first > special case (as you appear to deliberately play with reused salts), but > probably not the second. > This has certainly happened, and was bugging me for a while. Thank you. > > But we don't say "if a crypt matches ..." or the like. We say "if a > computed hash > matches ..." > > Therefore, I should not refer to c/s as crypts tested, but crypts > > generated, correct? > > This could have been correct, but we say "hashes computed". > Ok, many thanks. That's a helpful start on building my glossary. I'm now cracking randomly generated passwords of short lengths, consisting of only digits. I'm using iterative mode with both min-len and max-len constrained by the known length of the cleartext (eg: if cleartext_len = 4, then min-len=4 and max-len=4), with an alphabet of "Digits." It appears that without min-len equivalent to max-len, John has internal logic to begin with common password lengths. But with these constraints, in what order does iterative mode generate candidates by default? I imagine John could generate candidates by iterating through the bounded search-space by unicode value, regardless of number, letter or special character. Character frequency analysis? Something else? Not a random seed, no? If there is complex logic in the generation order, is this contained within the config file? Similarly, how does John generate candidates from a wordlist, or mask + wordlist? Forest from the trees - I'd like to better understand how to relate my theoretical time-to-crack calculations with the actual runtimes I'm seeing from John. -David
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.