Date: Thu, 22 Dec 2005 01:42:51 +0000 (UTC)
From: Radim Horak <yesbody@...nam.cz>
To: john-users@...ts.openwall.com
Subject: Re: john improvement suggestions - vc compilation test
Solar Designer <solar@...> writes:
> I really don't think that the performance improvement you've observed is
> related to this compiler producing better code (I do not know whether
> that is the case). LM_fmt.c is not really performance critical. For LM
> hashes on x86/MMX, the performance critical code is primarily in DES_bs.c
> and indeed in x86-mmx.S.
...
> Can you provide the specific "john --test" outputs for both builds and
> tell us your CPU clock rate (real and P4 rating)?
My CPU: AthlonXP (Barton) 2.2 GHz (PR 3200+)
john 1.6.40
gcc 3.4.4
target: win32-cygwin-x86-mmx
edited:
Makefile: CFLAGS = -c -Wall -O4 -funroll-loops -fomit-frame-pointer -march=athlon-xp -mtune=athlon-xp
params.h: BENCHMARK_TIME 30 (didn't help much)
bench.c: removed cygwin check
$ ./john -t
Benchmarking: Traditional DES [64/64 BS MMX]... DONE
Many salts: 753235 c/s real, 813754 c/s virtual
Only one salt: 683485 c/s
Benchmarking: BSDI DES (x725) [64/64 BS MMX]... DONE
Many salts: 24927 c/s real, 27091 c/s virtual
Only one salt: 24077 c/s
Benchmarking: FreeBSD MD5 [32/32]... DONE
Raw: 5643 c/s real, 6252 c/s virtual
Benchmarking: OpenBSD Blowfish (x32) [32/32]... DONE
Raw: 335 c/s real, 346 c/s virtual
Benchmarking: Kerberos AFS DES [48/64 4K MMX]... DONE
Short: 141239 c/s real, 156198 c/s virtual
Long: 528674 c/s
Benchmarking: NT LM DES [64/64 BS MMX]... DONE
Raw: 5745K c/s real, 6318K c/s virtual
-------------------
ms vc2003
method:
cl /c /TC /G7 /Ox /O2 /Zl /arch:SSE /FoDES_fmt.o DES_fmt.c
cl /c /TC /G7 /Ox /O2 /Zl /arch:SSE /FoDES_std.o DES_std.c
cl /c /TC /G7 /Ox /O2 /Zl /arch:SSE /FoDES_bs.o DES_bs.c
cl /c /TC /G7 /Ox /O2 /Zl /arch:SSE /FoBSDI_fmt.o BSDI_fmt.c
cl /c /TC /G7 /Ox /O2 /Zl /arch:SSE /FoMD5_fmt.o MD5_fmt.c
cl /c /TC /G7 /Ox /O2 /Zl /arch:SSE /FoMD5_std.o MD5_std.c
cl /c /TC /G7 /Ox /O2 /Zl /arch:SSE /FoBF_fmt.o BF_fmt.c
cl /c /TC /G7 /Ox /O2 /Zl /arch:SSE /FoBF_std.o BF_std.c
cl /c /TC /G7 /Ox /O2 /Zl /arch:SSE /FoAFS_fmt.o AFS_fmt.c
cl /c /TC /G7 /Ox /O2 /Zl /arch:SSE /FoLM_fmt.o LM_fmt.c
rem copy *.o to the src directory
rem make WITHOUT clean
make win32-cygwin-x86-mmx
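
The ten cl invocations above differ only in the file name, so they can be generated instead of typed. A minimal sketch in Python, assuming nothing beyond the flags and file names listed above; the names FLAGS, SOURCES and cl_command are made up for illustration, and the script only prints the command lines, it does not run cl:

```python
# Flags and source stems copied from the cl command lines in this message.
FLAGS = ["/c", "/TC", "/G7", "/Ox", "/O2", "/Zl", "/arch:SSE"]
SOURCES = [
    "DES_fmt", "DES_std", "DES_bs", "BSDI_fmt", "MD5_fmt",
    "MD5_std", "BF_fmt", "BF_std", "AFS_fmt", "LM_fmt",
]


def cl_command(stem):
    """Return the cl argument list for one source file (hypothetical helper)."""
    return ["cl", *FLAGS, f"/Fo{stem}.o", f"{stem}.c"]


if __name__ == "__main__":
    # Print one command per source file; paste into a batch file or run by hand.
    for stem in SOURCES:
        print(" ".join(cl_command(stem)))
```

Redirecting the output to a .bat file gives the same sequence as listed above.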
$ ./john -t
Benchmarking: Traditional DES [64/64 BS MMX]... DONE
Many salts: 748780 c/s real, 812828 c/s virtual
Only one salt: 672553 c/s
Benchmarking: BSDI DES (x725) [64/64 BS MMX]... DONE
Many salts: 25094 c/s real, 27090 c/s virtual
Only one salt: 24464 c/s
Benchmarking: FreeBSD MD5 [32/32]... DONE
Raw: 5667 c/s real, 6188 c/s virtual
Benchmarking: OpenBSD Blowfish (x32) [32/32]... DONE
Raw: 334 c/s real, 364 c/s virtual
Benchmarking: Kerberos AFS DES [48/64 4K MMX]... DONE
Short: 140758 c/s real, 156105 c/s virtual
Long: 503631 c/s
Benchmarking: NT LM DES [64/64 BS MMX]... DONE
Raw: 6514K c/s real, 7152K c/s virtual
--------------------------------
The improvement seems to be more than just luck: repeated testing shows it
consistently. LM goes from 5745K to 6514K c/s real, roughly 13%, while the
other hashes stay within about 1% of the gcc build. The most important vc
option for improving LM performance seems to be "/arch:SSE", which
unfortunately slipped from my original post.
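
To make the comparison easier to read, here is a small sketch in Python that computes the relative change in the "real" c/s figures between the two runs above. The numbers are copied from the benchmark output in this message (the K values are expanded to thousands); the dictionary names and the helper functions are made up for illustration:

```python
# "real" c/s figures copied from the two "john -t" runs in this message.
GCC = {
    "Traditional DES (many salts)": 753235,
    "BSDI DES (many salts)": 24927,
    "FreeBSD MD5": 5643,
    "OpenBSD Blowfish": 335,
    "Kerberos AFS DES (short)": 141239,
    "NT LM DES": 5745000,  # reported as 5745K
}
MSVC = {
    "Traditional DES (many salts)": 748780,
    "BSDI DES (many salts)": 25094,
    "FreeBSD MD5": 5667,
    "OpenBSD Blowfish": 334,
    "Kerberos AFS DES (short)": 140758,
    "NT LM DES": 6514000,  # reported as 6514K
}


def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def compare(base, other):
    """Percent change per benchmark, keyed by benchmark name."""
    return {name: percent_change(base[name], other[name]) for name in base}


if __name__ == "__main__":
    for name, change in compare(GCC, MSVC).items():
        print(f"{name:30s} {change:+6.2f}%")
```

Only the LM line moves by more than about 1%; the rest is within what I would call measurement noise.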
> Overall, I do not see a need to support building John with MSVC. That
> would complicate the code with more #ifdef's for no good reason.
>
Personally I am not interested in LM in john; I just wanted to share this
discovery :) with others who can compile it themselves. Maybe someone can
figure out the details of this behaviour and make good use of them.
However, I am interested in optimizing compilation, though my understanding of
the source code and my programming skills are very limited :)
-Radim