Date: Sun, 10 Sep 2017 21:40:25 +0200
From: <spam@...lab.nl>
To: <john-users@...ts.openwall.com>
Subject: Hyperthreading / fork versus mpi / instruction sets?

Hi list,

I'm using john on one system with an enormous number of CPUs. The CPUs support hyperthreading. I'm trying to figure out which combination of settings is fastest. From a practical point of view I can run benchmarks, simply measuring the time for the same task with different settings. No problems there. However, I'd like to have a good understanding of the concepts and, if applicable, some specifics based on the under-the-hood details of john.

Hyperthreading:
As far as I understand, this is beneficial only if the cracking code is not 100% optimized. So in theory: not useful, since an HT thread cannot do anything useful while the 'real' core is fully saturated. In practice: run john on the HT cores as well to maximize CPU utilization. Even if the cracking code is 100% optimized, it won't harm me (no disadvantages). Two questions: (1) is this correct? And (2) any advice about the number of extra HT processes to assign? Use all? Use just, say, one or two to compensate for a small fraction of non-perfect code?

Fork vs. MPI:
I've noticed that there are a number of hash formats that support MPI and that john runs those hash types on MPI by default. Furthermore, I've seen that forked parallel processing (--fork=n) is possible for all hash types. AFAIK, MPI is typically used in network-connected multi-system environments, while forking is done on one machine. My assumption is that forking is more efficient than MPI because of less overhead (= faster). However, MPI might allow more granular control, such as rescheduling during the cracking process to get maximum efficiency, but this is *only* useful if MPI latency is extremely low compared to the cracking speed. My questions: (1) is this correct? Furthermore: (2) what's the best approach for fast hashes (e.g. raw-md5) and (3) what's the best approach for slow hashes (e.g. bcrypt)?

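The practical side mentioned above (timing the same task under different settings) can be sketched roughly as follows. This is only a minimal illustration, not part of john: the helper names time_command and compare are made up for this example, and the john command line in the comment uses hypothetical paths and a hypothetical process count. The placeholder commands exist only so the sketch runs without john installed.

```python
import subprocess
import sys
import time


def time_command(argv, repeats=3):
    """Run argv `repeats` times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises if the command fails, so a broken run is not timed silently.
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def compare(configs, repeats=3):
    """Time each named command; return (name, seconds) pairs, fastest first."""
    results = [(name, time_command(argv, repeats)) for name, argv in configs.items()]
    return sorted(results, key=lambda item: item[1])


if __name__ == "__main__":
    # Placeholder commands so the sketch runs anywhere. For a real comparison,
    # replace them with john invocations that differ in one setting only, e.g.
    # (hypothetical paths and process count):
    #   ["./john", "--fork=16", "--format=raw-md5", "--wordlist=words.txt", "hashes.txt"]
    configs = {
        "noop": [sys.executable, "-c", "pass"],
        "sleep": [sys.executable, "-c", "import time; time.sleep(0.2)"],
    }
    for name, seconds in compare(configs):
        print(f"{name}: {seconds:.3f} s")
```

For the timings to be comparable, every run has to start from the same state: same hashes, same candidate source, and no results left over from a previous run.
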
Instruction sets:
I've noticed that john contains *a lot* of code optimized for specific instruction sets (e.g. SSE4.x/AVXx). Older multi-CPU Xeon E5 and E7 systems are quite cheap nowadays, and in terms of absolute performance (in general) they're still extremely fast (still on the list of fastest processors). However, they lack e.g. AVX2 support. Right now it's very difficult for me to figure out what the best choice is without buying massively expensive new CPUs. Question: is there a benchmark available that compares builds for different instruction sets on something like the latest CPU that supports all of them, so one can figure out what the advantage of buying new CPUs is from a john cracking perspective?

Any other considerations or wise advice, especially concerning maximizing CPU cracking performance, are also more than welcome!

Thank you.