Date: Sun, 15 Apr 2012 04:17:47 +0400
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: statistics -openssl vs john

Hi Deepika,

I'm sorry we failed to reply to your question on john-dev sooner. Anyway, this fits john-users as well (or better).

First, magnum is right: you're comparing apples to pears. Yet such comparisons are sometimes useful if you know how to interpret the results and don't assume that you have a direct comparison.

Then, these performance numbers and those you posted before suggest that you might be on a virtual machine or at least on a system with other load. Benchmark results are very often incorrect when you run those benchmarks inside a VM: the VM's timers might not behave well enough. In your case, OpenSSL's performance numbers might be inflated (I am getting about half those speeds on a non-virtualized 2.5 GHz Core 2'ish CPU), and John's affected in some other way. Large c/s real vs. c/s virtual differences are not normal when you're benchmarking things on a supposedly otherwise idle system. You need to actually make the system idle first - and avoid VMs.

I've included some further comments inline:

On Sun, Apr 15, 2012 at 12:59:51AM +0530, Deepika Dutta Mishra wrote:
> Hi, I was doing speed test between openssl des and john des. I get
> following statistics for openssl
>
> type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
> des cbc        100225.76k    89521.76k    89778.20k    95060.70k    96158.84k

Here's what I am getting with OpenSSL 1.0.0d on a Xeon E5420 2.5 GHz (using one core in it):

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
des cbc         47318.13k    49515.67k    50002.01k    49758.55k    49883.82k

OpenSSL 1.0.1 on FX-8120 o/c 4.5 GHz (turbo):

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
des cbc         72202.15k    75636.31k    75584.09k    75591.00k    76682.58k

So your numbers look inflated to me. Possibly your VM's timer that OpenSSL's benchmark happened to use (via your guest OS kernel) ran slower than real time.
Well, or maybe you just used a faster or/and more suitable CPU (like overclocked Sandy Bridge)?

...Oh, I think I've just figured it out: OpenSSL uses virtual (CPU) time for its benchmarks (confirmed with a quick test with 48 parallel invocations with a script on the FX-8120, which only has 8 logical CPUs), and from your John benchmarks we already know that you have a large discrepancy between real and virtual time (other system load or/and VM). So your benchmark results are not to be relied upon for any purpose.

> and for john
>
> Benchmarking: Traditional DES [32/32 BS]... DONE
> Many salts:     434566 c/s real, 997527 c/s virtual
> Only one salt:  426208 c/s real, 568277 c/s virtual
>
> Benchmarking: LM DES [32/32 BS]... DONE
> Raw:    9306K c/s real, 12086K c/s virtual

These are pretty low speeds for John. Do you deliberately build it with a non-optimal make target for a "fair" comparison against OpenSSL (assuming that OpenSSL won't use SSE2 or the like for DES)? That might not actually be fair. The primary advantage of bitslicing is that it lets you use arbitrarily wide machine words or SIMD vectors efficiently. With only 32-bit machine words, that advantage is not present, but with 128-bit SSE2 vectors it is.

Then, speaking of apples and oranges^Wpears, with OpenSSL you have a fixed key and you encrypt one stream of data with it. With the bitslice DES code in JtR, you have a set of ever-changing keys and you encrypt a constant value with those. These two tasks are quite different - not only in terms of parallelism (you got to have 32 separate keys or/and blocks at once for your build of JtR above), but also in terms of work performed (with JtR, you're doing a lot of key setup, whereas in OpenSSL it is not benchmarked - it is out of the benchmark's loop there).
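As a rough illustration of the bitslicing point above, here is a toy Python sketch. It is not JtR's DES code: the function names and the sample boolean function are made up for illustration. It packs 32 independent inputs into "slices" so that each bitwise operation handles one bit position of all 32 inputs at once; with wider words or SIMD vectors, the same number of operations would cover more instances.

```python
# Toy bitslicing sketch. Not JtR's DES code: all names here are hypothetical.
WIDTH = 32  # instances per bitwise operation (cf. the 32 keys/blocks at once above)


def to_bitslices(values, nbits):
    """Transpose values so that slices[i] holds bit i of every value.

    Bit j of slices[i] is bit i of values[j].
    """
    slices = [0] * nbits
    for j, v in enumerate(values):
        for i in range(nbits):
            if (v >> i) & 1:
                slices[i] |= 1 << j
    return slices


def from_bitslices(slices, count):
    """Inverse of to_bitslices: rebuild the list of ordinary values."""
    values = [0] * count
    for i, s in enumerate(slices):
        for j in range(count):
            if (s >> j) & 1:
                values[j] |= 1 << i
    return values


def sliced_xor_and(a, b, c):
    """Compute (a ^ b) & c on bitsliced inputs.

    Each XOR/AND acts on one bit position of all packed instances at once,
    which is where the parallelism comes from.
    """
    return [(x ^ y) & z for x, y, z in zip(a, b, c)]


if __name__ == "__main__":
    nbits = 8
    a = [(7 * j + 3) % 256 for j in range(WIDTH)]
    b = [(13 * j + 5) % 256 for j in range(WIDTH)]
    c = [(29 * j + 11) % 256 for j in range(WIDTH)]

    sliced = sliced_xor_and(to_bitslices(a, nbits),
                            to_bitslices(b, nbits),
                            to_bitslices(c, nbits))
    result = from_bitslices(sliced, WIDTH)

    # Same answer as doing the 32 computations one at a time.
    assert result == [(x ^ y) & z for x, y, z in zip(a, b, c)]
    print("bitsliced and scalar results match")
```

Note that the transposition itself costs time; in a password cracker it is part of the per-candidate key setup work, which a single-stream encryption benchmark does not pay for.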
> Now considering openssl, it can process 100225.76 x 1000 = 100225760
> bytes/sec which should account to 100225760 /8 = 12528220 encryptions/sec
> (since DES block size is 8 bytes)

Yes (if your benchmark results were correct, which they are not).

> With john, considering LM DES (which according to what I read does 2 DES
> encryption),

No, it is just one DES encryption, but the key changes every time you do it (JtR tries different candidate passwords), and there's also the hash comparison step (to detect cracked passwords).

> the result is 9306 x 1000 = 9306000 x 2 = 18612000
> encryption/sec

It'd be just 9306 x 1000 = 9306000 encryptions/sec, but that's wrong because OpenSSL uses virtual time, so you have to pick c/s virtual here, so it'd be 12086 x 1000 = 12086000 encryptions/sec. But that's still wrong because we have no idea how your real to virtual time ratio changed between the two benchmarks (clearly, it does change over time significantly - this is seen on different ones of your JtR benchmarks) and, more importantly, because in one case you're benchmarking DES encryption alone and in the other key setup and encryption and hash comparisons at once.

> This provided 1.48 times speedup with john des (non sse or other
> optimizations). Am I right in my calculation?

No.

Anyway, to get an idea of how fast John can really get, see:

http://www.openwall.com/lists/announce/2011/06/22/1

Using a similar apples to pears comparison, this gives (for a Core i7-2600K 3.4 GHz + turbo):

---
Benchmarking: Traditional DES [128/128 BS AVX-16]... DONE
Many salts:     20668K c/s real, 2593K c/s virtual
Only one salt:  8724K c/s real, 1094K c/s virtual

That's for 8 threads on this quad-core CPU with SMT. (By the way, this corresponds to over 500 million of DES block encryptions per second, or a data encryption speed of 33 Gbps, if we were encrypting data. Of course, in practice there would be other limitations, such as data transfer bandwidth.
But the crypto code and the CPU are this fast.)
---

Newer versions of JtR built with newer gcc achieve higher speeds on the same machine:

Benchmarking: Traditional DES [128/128 BS AVX-16]... DONE
Many salts:     22773K c/s real, 2843K c/s virtual
Only one salt:  18284K c/s real, 2291K c/s virtual

Since every DES-based crypt(3) computation involves 25 modified-DES encryptions (slower than normal DES), that's over 4.5 Gbytes/sec or 36 Gbps data encryption speed. (In the multi-salt case, the key setup is out of the loop.)

For a more direct comparison (yet still apples to pears indeed) to the OpenSSL benchmarks I posted above, here's what John achieves on one core in the FX-8120 o/c 4.5 GHz (turbo):

Benchmarking: Traditional DES [128/128 BS XOP-16]... DONE
Many salts:     5275K c/s real, 5275K c/s virtual
Only one salt:  4993K c/s real, 4993K c/s virtual

(Non-OpenMP build this time, to use just one CPU core.)

That's 1055000 x 1000 bytes per second (about 1 Gbyte/sec), which is about 14 times faster than the OpenSSL speed. And that's not considering that JtR also implements DES-based crypt(3) salts in this benchmark (roughly a 7% performance hit).

For a pure 32-bit build, if you must, I expect JtR to be faster than OpenSSL's DES - in this apples to pears comparison - by a factor of 1.2 (x86 in 32-bit mode, register-starved) to 4 (decent architectures).

Here's a low speedup example (almost worst case for JtR), Pentium 3 1.0 GHz, deliberately non-optimal build of JtR ("make generic"):

Benchmarking: Traditional DES [32/32 BS]... DONE
Many salts:     125632 c/s real, 124388 c/s virtual
Only one salt:  124448 c/s real, 124448 c/s virtual

That's about 25 million bytes per second. OpenSSL on the same machine:

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
des cbc         20319.54k    21552.33k    21682.94k    21918.41k    21882.78k

So we have a speedup of only between 1.2x and 1.25x here.
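The c/s to bytes-per-second conversions used above can be sketched in a few lines of Python. This is only a back-of-the-envelope helper, with made-up constant and function names; it uses the FX-8120 figures from this message, and picking the 1024-byte OpenSSL column is an arbitrary choice for the example. It is still the same apples to pears comparison, not a direct one.

```python
# Back-of-the-envelope conversion sketch; constant and function names are
# hypothetical, the figures are the FX-8120 numbers quoted in this message.
DES_BLOCK_BYTES = 8   # DES block size in bytes
DES_PER_CRYPT3 = 25   # DES encryptions per DES-based crypt(3) computation


def crypt3_rate_to_bytes_per_sec(crypts_per_sec):
    """Treat each crypt(3) computation as 25 eight-byte DES block encryptions."""
    return crypts_per_sec * DES_PER_CRYPT3 * DES_BLOCK_BYTES


def openssl_k_to_bytes_per_sec(k_value):
    """OpenSSL's speed output is in thousands of bytes per second ('k' suffix)."""
    return k_value * 1000


if __name__ == "__main__":
    jtr_bytes = crypt3_rate_to_bytes_per_sec(5275 * 1000)  # "Many salts: 5275K c/s"
    openssl_bytes = openssl_k_to_bytes_per_sec(75591.00)   # 1024-byte column (assumed choice)
    print(f"JtR:     {jtr_bytes} bytes/s")
    print(f"OpenSSL: {openssl_bytes:.0f} bytes/s")
    print(f"ratio:   {jtr_bytes / openssl_bytes:.1f}x")
```

The same two functions can be applied to the Pentium 3 numbers; the resulting ratio depends on which OpenSSL block-size column is used.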
As soon as we switch to an optimal build, things change dramatically (same machine):

Benchmarking: Traditional DES [64/64 BS MMX]... DONE
Many salts:     376320 c/s real, 376320 c/s virtual
Only one salt:  367040 c/s real, 367040 c/s virtual

That's about 75 million bytes per second, or a speedup of 3.5x.

I hope this answers your question more than exhaustively. ;-)

Alexander