Date: Sat, 31 Dec 2011 20:32:43 +0000
From: Alex Sicamiotis <alekshs@...mail.com>
To: <john-users@...ts.openwall.com>
Subject: RE: DES with OpenMP

> Date: Sat, 31 Dec 2011 22:50:54 +0400
> From: solar@...nwall.com
> To: john-users@...ts.openwall.com
> Subject: Re: [john-users] DES with OpenMP
> 
> On Sat, Dec 31, 2011 at 02:09:13PM +0000, Alex Sicamiotis wrote:
> > I've benchmarked DES (openMP) with GCC 4.6 / 4.7 and ICC 12.1...
> 
> Thank you for contributing those benchmark results to the wiki.  It's an
> impressive overclock you got (a really cheap CPU at 4 GHz).
> 
Indeed, and it is also very efficient per MHz, so even at lower clock speeds it delivers a lot of cracking throughput.


> 
> +80% is reasonable (that's 90% efficiency - that is, 180% out of 200%),
> +99.5% is too high.  In my testing, the efficiency of the bitslice DES
> parallelization with OpenMP is at around 90% for DES-based crypt(3) for
> "many salts" on current multi-core CPUs.  +99.5% indicates that there is
> another source of speedup besides the use of a second core.

Interesting.

> (96% would be believable,
> albeit still very high for this code.)
>

I also did a longer real-world test, cracking an ordinary password file that takes approximately 1 min 30 s to crack on one core. The test was done at 3.45 GHz: the single-core icc build ran at 4347K c/s and took 1:29, while the OpenMP icc build ran at 8708K c/s and took 0:44. That actually exceeds 100% scaling (really unbelievable territory, lol). It would be extremely interesting to see what the high-end AVX-capable and HT-capable i7s do with icc and OpenMP.
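To make the "exceeded 100% scaling" claim concrete, here is a minimal Python sketch that derives the speedup both from the wall-clock times and from the reported c/s rates. The function names are made up for illustration; only the four measurements come from the test above.

```python
def to_seconds(mmss: str) -> int:
    """Convert an 'm:ss' wall-clock string to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def speedup_from_times(single: str, parallel: str) -> float:
    """Speedup as the ratio of single-core time to OpenMP time."""
    return to_seconds(single) / to_seconds(parallel)


def speedup_from_rates(single_kcps: float, parallel_kcps: float) -> float:
    """Speedup as the ratio of OpenMP c/s to single-core c/s."""
    return parallel_kcps / single_kcps


# Measurements quoted above (3.45 GHz, icc builds).
time_speedup = speedup_from_times("1:29", "0:44")
rate_speedup = speedup_from_rates(4347, 8708)

# "Scaling" here means the gain from adding the second core,
# so a speedup of 2.0 corresponds to +100%.
print(f"wall-clock speedup: {time_speedup:.3f}x (+{(time_speedup - 1) * 100:.1f}%)")
print(f"c/s speedup:        {rate_speedup:.3f}x (+{(rate_speedup - 1) * 100:.1f}%)")
```

Both ratios come out slightly above 2.0. The wall-clock figure is the coarser of the two, since the times are rounded to whole seconds and half a second is about 1% of a 44-second run.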

I read what you said about icc improving per-core speed. I hadn't thought of that difference, because the non-OpenMP builds all produce similar results due to the asm code. I just did a run of the OpenMP build with OMP_NUM_THREADS=1, and it seems you are right: I got 4527K c/s vs. 4330-4350K c/s for the best GCC/ICC/Open64 builds I have. That is about +4% in single-core cracking speed, which is not bad at all. Apparently icc's code is even better than the asm code. So its efficiency is actually 8707 / (4527 x 2) = ~96.2%.
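The efficiency arithmetic can be written out as a small Python sketch. The helper name `parallel_efficiency` is hypothetical, and the asm baseline of 4350K c/s is simply the upper end of the 4330-4350K range quoted above. It shows why the choice of single-thread baseline matters: measured against the asm build the result looks like perfect scaling, while measured against the same OpenMP build limited to one thread it is about 96%.

```python
def parallel_efficiency(parallel_kcps: float, single_kcps: float, threads: int) -> float:
    """Fraction of ideal linear scaling achieved: parallel / (threads * single)."""
    return parallel_kcps / (single_kcps * threads)


OPENMP_2_THREADS = 8707   # icc OpenMP build, both cores (K c/s)
OPENMP_1_THREAD = 4527    # same build with OMP_NUM_THREADS=1 (K c/s)
ASM_SINGLE = 4350         # best non-OpenMP (asm) build, upper end of range (K c/s)

# Baseline: the same OpenMP build on one thread (isolates the threading cost).
eff_vs_openmp = parallel_efficiency(OPENMP_2_THREADS, OPENMP_1_THREAD, 2)

# Baseline: the asm build (mixes in icc's faster per-core code).
eff_vs_asm = parallel_efficiency(OPENMP_2_THREADS, ASM_SINGLE, 2)

# The rule of thumb from the quoted reply: +80% on two cores is 90% efficiency.
eff_rule_of_thumb = parallel_efficiency(1.8, 1.0, 2)

print(f"vs. OpenMP single thread: {eff_vs_openmp:.1%}")
print(f"vs. asm single core:      {eff_vs_asm:.1%}")
print(f"+80% on two cores:        {eff_rule_of_thumb:.1%}")
```

Using the one-thread OpenMP run as the baseline separates the two effects: roughly +4% from the compiler's per-core code, and about 96% efficiency from the second core.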
 
Btw, I also found very small gains from tweaking the Linux kernel. I'm using openSUSE, which ships two kernels: desktop (low latency, more timeslicing) and server (higher latency, more processing throughput). Server gave about +30K c/s peak relative to desktop, and a custom kernel built for my CPU, with less debug code etc., gave another 10K c/s relative to server. I was kind of disappointed, though. Five or six years ago, in the Athlon XP days, latency was more critical for performance. IIRC, I was at around 920K c/s with the low-latency kernel, 970-980K c/s with the server kernel and over 1020K c/s in win32 with Cygwin, but the system was clearly less responsive there than even with the Linux server kernel. I later read somewhere that Windows XP uses a timer frequency of 100 Hz (Linux server kernel = 250 Hz, Linux desktop kernel = 1000 Hz). A higher timer frequency is more disruptive, and the increased timeslicing leads to more cache misses. Somehow, today's kernels only show marginal differences :|

 
> With GCC 4.6, there is a performance regression (compared to 4.5 and
> 4.4), which was especially bad without OpenMP.  This is one reason why
> JtR 1.7.9 forces the use of the supplied assembly code (whenever
> available) for non-OpenMP builds.  When you build with GCC 4.6 and
> OpenMP, you may be hit by this performance regression to some extent.
> You may want to try GCC 4.5 to avoid it.
> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017
> 

For the time being, I'm absolutely "settled" on the icc OpenMP build. It practically eliminates the need for two sessions of john, except when I'm running the KDE desktop. Then the speed falls from 8600K c/s to 8200-8300K c/s, and to about 7500K c/s when I have mp3s playing or the like. That's when I prefer two concurrent non-OpenMP sessions. Even a 2% load from X.Org, mp3 playback, Plasma and so on is very disruptive to OpenMP, despite nice values of -11 to -15 on a "server" kernel.
