Date: Wed, 7 Jun 2006 11:16:51 +0400
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: to MPI or not MPI john?

On Tue, Jun 06, 2006 at 02:51:18PM +0200, Otheus (aka Timothy J. Shelling) wrote:
> You mentioned a million hashes per second --- now I see your point about
> why this approach is not so scalable. The "cracks per second" metric
> reported by the --status command must then be including the words
> *rejected* by the filter.

No, it does not.

> Otherwise I wouldn't get almost perfect speedup using the internal
> chksum filter.

You got almost perfect speedup because you had many salts and not too
many nodes. You mentioned 299 salts. With 1M hashes computed per second
on each node, that gives us "only" 3,300 candidate passwords tried per
second on each node. If you have, say, 4 nodes, it means that each node
has to generate and chksum-filter around 13,000 candidate passwords per
second. That's not too many - so most of the CPU time is left for the
actual hash computations.

This is why I am saying that the scalability problem will primarily be
seen with saltless hashes or when cracking hashes with a single salt or
with a small number of salts.

> >> ... with unsalted hashes,
> >> it's clearly optimal to distribute the target patterns (the hashed
> >> passwords) amongst the nodes.
> >
> > I re-read the above several times, but I still don't understand you.
> > If I parse the above literally, you're suggesting that with multiple
> > unsalted hashes it would be optimal to have different nodes compare
> > against different hashes
>
> That is what I'm saying, assuming word generation is (even slightly)
> slower than hash generation.

That's irrelevant. Even if each candidate password takes 1 second to be
generated and only 1 microsecond to be hashed, you do not want to be
repeating any of those operations on each node - for the same candidate
passwords - only for the computed hashes to be compared against
different loaded ones.
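The salt arithmetic above can be checked with a quick back-of-the-envelope sketch. The numbers (299 salts, ~1M hash computations per second per node, 4 nodes) are taken from the discussion; they are illustrative figures, not measurements:

```python
# Back-of-the-envelope check of the candidate-password rates discussed
# above. Figures are illustrative, taken from the thread.
hash_rate = 1_000_000   # hash computations per second, per node
salts = 299             # distinct salts loaded

# Each candidate password is hashed once per salt, so new candidates
# are consumed far more slowly than the raw hash rate suggests:
candidates_per_sec = hash_rate / salts           # ~3,344 per node

# With 4 nodes chksum-filtering a shared candidate stream, each node
# still generates and filters the candidates destined for all nodes:
nodes = 4
generated_per_node = candidates_per_sec * nodes  # ~13,378 per second

print(round(candidates_per_sec), round(generated_per_node))
```

This is why per-candidate generation and filtering stays cheap relative to hashing when there are many salts, and why the same scheme breaks down as the salt count drops.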
You can do the same on a single node. By using multiple nodes in this
way, you're only distributing the processing cost of hash comparisons -
not of candidate password generation and not of hashing. It is far more
reasonable to distribute the latter costs, if the implementation is
scalable enough for the given number of nodes.

> Let's say you have a target password database of 10 hashes, and you
> have 2 nodes. Each node uses the same word-generation algorithm. Give
> each node half the hashes to crack. They should (on average) both
> complete (at about) half the time it would have taken 1 node with
> those same 10 hashes.

No. If those hashes are saltless, both nodes would be done with their
share of the work at about the same time it would have taken a single
node to complete all of the work with 10 hashes. So there would be no
speedup from using multiple nodes at all. (This assumes that the cost
of hash comparisons is negligible, which is not always true. But let's
not get into that for now.)

It's only with salted hashes and multiple salts that distributing the
hashes across your nodes makes sense. You might not see the perfect 2x
speedup with 2 nodes, though, since there's some per-candidate-password
(rather than per-salt) processing even with salted hashes. Obviously,
this "overhead" is only visible with small numbers of salts per node.
If you have 1,000 hashes and 10 nodes, then each of your nodes will be
doing the per-candidate-password processing only once per 100 hash
computations, so the "overhead" will be negligible. But with 10 hashes
and 10 nodes, it might correspond to around 10% of your total
processing cost. So you'd get a 9x speedup over a single node - not
10x.

> If we have 10 nodes, and we split the target database of hashes, then
> it should take (roughly) about 1/10th the time it would have taken 1
> node.
>
> How could it get any faster than that?
As I've explained above, it might take 1/9th the time - and you can
improve that to 1/10th by splitting the keyspace efficiently instead.

Also, this distribute-the-hashes approach only works as long as the
number of salts is no smaller than the number of nodes. When these
numbers get close to each other, things become suboptimal. When the
number of different salts is smaller than the number of nodes, you
can't make use of some of your nodes at all. So we're back to splitting
the keyspace.

-- 
Alexander Peslyak <solar at openwall.com>
GPG key ID: B35D3598  fp: 6429 0D7E F130 C13E C929 6447 73C3 A290 B35D 3598
http://www.openwall.com - bringing security into open computing environments
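The 9x-versus-10x estimate in the message can be reproduced with a toy cost model. The split of per-candidate versus per-salt work is an assumption taken from the message's "around 10%" figure, not a measurement:

```python
# Toy model of splitting S salted hashes across N nodes. Each candidate
# password costs `c` units of per-candidate work (word generation etc.,
# repeated on every node) plus `s` units per salt it is hashed against.
def speedup(salts, nodes, c, s):
    single_node = c + salts * s           # one node handles every salt
    per_node = c + (salts / nodes) * s    # each node keeps salts/nodes salts
    return single_node / per_node

# Assumption from the message: with 1 salt per node, per-candidate work
# is about 10% of that node's total, i.e. c / (c + s) = 0.1, so c = s/9.
s = 1.0
c = s / 9

print(speedup(salts=1000, nodes=10, c=c, s=s))  # ~10x: overhead negligible
print(speedup(salts=10, nodes=10, c=c, s=s))    # ~9x, as estimated above
```

The model also shows why the approach degrades as the salt count approaches the node count: with fewer salts per node, the repeated per-candidate work becomes a larger fraction of each node's total.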