Date: Sun, 16 Aug 2015 23:04:17 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: FMT_OMP_BAD

On Sun, Aug 16, 2015 at 08:05:42PM +0300, Solar Designer wrote:
> In this test, I compared an OpenMP-enabled build at different thread
> counts (1 vs. 10). It would also be relevant to compare these against a
> non-OpenMP build. For some formats, it may show substantially different
> numbers than we're seeing for 1 thread in an OpenMP-enabled build (and
> this will suggest there's a need for optimization for the slower one of
> these two cases). In fact, for deciding on where to add
> FAST_FORMATS_OMP checks, a comparison of non-OpenMP vs. 10 threads would
> be more relevant than the above comparison for 1 vs. 10 threads.

I just ran non-OpenMP benchmarks as well. Here's the comparison of
non-OpenMP vs. 1 thread on super:

Number of benchmarks:           403
Minimum:                        0.55939 real, 0.55939 virtual
Maximum:                        1.12820 real, 1.12820 virtual
Median:                         0.99948 real, 0.99922 virtual
Median absolute deviation:      0.01257 real, 0.01305 virtual
Geometric mean:                 0.98630 real, 0.98600 virtual
Geometric standard deviation:   1.05860 real, 1.05886 virtual

Median and mean are pretty close to 1.0, which is good. However, there
are some outliers. I've attached the output of:

./relbench -v a0 a1 | grep ^Ratio: | sort -nk2 > nonvsomp.txt

where a0 was non-OpenMP, and a1 was OpenMP with 1 thread.

The worst performance impact of enabling OpenMP is seen for:

Ratio:  0.55939 real, 0.55939 virtual   dynamic_1400:Raw
Ratio:  0.64531 real, 0.64531 virtual   dynamic_1401:Only one salt

This is followed by my bitslice DES formats, for which the impact is
20%. That's pretty bad. They start collecting much larger groups of
candidate passwords when OpenMP is enabled, and this may result in
higher cache miss rate. This doesn't explain why the performance impact
is roughly the same regardless of iteration count, though. There must
be something else. I'll need to check if this same performance impact
is seen in core tree.

The highest performance improvement is seen for:

Ratio:  1.12820 real, 1.12820 virtual   Fortigate, FortiOS:Many salts
Ratio:  1.07760 real, 1.07760 virtual   kwallet, KDE KWallet:Raw
Ratio:  1.07405 real, 1.07405 virtual   EPI, EPiServer SID:Only one salt
Ratio:  1.07247 real, 1.07247 virtual   SSHA512, LDAP:Many salts
Ratio:  1.07125 real, 1.07125 virtual   Tiger:Raw
Ratio:  1.06730 real, 1.06730 virtual   sapb, SAP CODVN B (BCODE):Only one salt
Ratio:  1.06550 real, 1.05489 virtual   dahua, "MD5 based authentication" Dahua:Raw
Ratio:  1.06022 real, 1.06022 virtual   has-160:Raw

which might suggest that they need an equivalent of OMP_SCALE, and thus
higher max_keys_per_crypt, even for non-OpenMP builds. Someone might
want to explore this.

Overall, I am relieved these performance differences aren't much worse.

Alexander

View attachment "nonvsomp.txt" of type "text/plain" (25851 bytes)
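
For readers who want to reproduce summary figures like the ones above
from a list of per-format ratios, here is a minimal Python sketch. It
is not relbench's code. The function name ratio_summary is
hypothetical, and two details are assumptions rather than anything
confirmed by the message: the geometric standard deviation is taken as
exp() of the population standard deviation of the log-ratios, and the
median absolute deviation is the unscaled median of |r - median|.
relbench may define either one differently.

```python
import math
import statistics


def ratio_summary(ratios):
    """Summarize speed ratios (e.g. OpenMP 1 thread / non-OpenMP).

    Hypothetical helper; definitions of the deviation measures are
    assumptions and may differ from what relbench computes.
    """
    if not ratios:
        raise ValueError("need at least one ratio")
    if any(r <= 0 for r in ratios):
        raise ValueError("ratios must be positive")

    # Work in log space so that 2x faster and 2x slower cancel out.
    logs = [math.log(r) for r in ratios]
    mean_log = sum(logs) / len(logs)
    # Population variance of the log-ratios (assumption: divide by n).
    var_log = sum((x - mean_log) ** 2 for x in logs) / len(logs)

    median = statistics.median(ratios)
    # Unscaled median absolute deviation around the median.
    mad = statistics.median([abs(r - median) for r in ratios])

    return {
        "count": len(ratios),
        "min": min(ratios),
        "max": max(ratios),
        "median": median,
        "mad": mad,
        "geo_mean": math.exp(mean_log),
        "geo_stddev": math.exp(math.sqrt(var_log)),
    }


if __name__ == "__main__":
    # A few "real" ratios quoted in the message; this is only a subset,
    # so the result will not match the 403-benchmark summary.
    sample = [0.55939, 0.64531, 1.12820, 1.07760, 1.07405, 1.06022]
    for name, value in ratio_summary(sample).items():
        print(f"{name}: {value}")
```

The geometric mean is used rather than the arithmetic mean because the
inputs are ratios: averaging in log space treats a slowdown to 0.5 and
a speedup to 2.0 as equal and opposite.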