Date: Thu, 22 Dec 2005 01:42:51 +0000 (UTC)
From:  Radim Horak <yesbody@...nam.cz>
To: john-users@...ts.openwall.com
Subject:  Re: john improvement suggestions - vc compilation test

Solar Designer <solar@...> writes:

> I really don't think that the performance improvement you've observed is
> related to this compiler producing better code (I do not know whether
> that is the case).  LM_fmt.c is not really performance critical.  For LM
> hashes on x86/MMX, the performance critical code is primarily in DES_bs.c
> and indeed in x86-mmx.S.
...
> Can you provide the specific "john --test" outputs for both builds and
> tell us your CPU clock rate (real and P4 rating)?

My CPU: AthlonXP (Barton) 2.2 GHz (PR 3200+)

john 1.6.40
gcc 3.4.4
target: win32-cygwin-x86-mmx

edited:  
Makefile: CFLAGS = -c -Wall -O4 -funroll-loops -fomit-frame-pointer -march=athlon-xp -mtune=athlon-xp
params.h: BENCHMARK_TIME 30 (didn't help much)
bench.c:  removed cygwin check

$ ./john -t
Benchmarking: Traditional DES [64/64 BS MMX]... DONE
Many salts:     753235 c/s real, 813754 c/s virtual
Only one salt:  683485 c/s

Benchmarking: BSDI DES (x725) [64/64 BS MMX]... DONE
Many salts:     24927 c/s real, 27091 c/s virtual
Only one salt:  24077 c/s

Benchmarking: FreeBSD MD5 [32/32]... DONE
Raw:    5643 c/s real, 6252 c/s virtual

Benchmarking: OpenBSD Blowfish (x32) [32/32]... DONE
Raw:    335 c/s real, 346 c/s virtual

Benchmarking: Kerberos AFS DES [48/64 4K MMX]... DONE
Short:  141239 c/s real, 156198 c/s virtual
Long:   528674 c/s

Benchmarking: NT LM DES [64/64 BS MMX]... DONE
Raw:    5745K c/s real, 6318K c/s virtual

-------------------
ms vc2003

method:
cl /c /TC /G7 /Ox /O2 /Zl /arch:SSE /FoDES_fmt.o DES_fmt.c
cl /c /TC /G7 /Ox /O2 /Zl /arch:SSE /FoDES_std.o DES_std.c
cl /c /TC /G7 /Ox /O2 /Zl /arch:SSE /FoDES_bs.o DES_bs.c
cl /c /TC /G7 /Ox /O2 /Zl /arch:SSE /FoBSDI_fmt.o BSDI_fmt.c
cl /c /TC /G7 /Ox /O2 /Zl /arch:SSE /FoMD5_fmt.o MD5_fmt.c
cl /c /TC /G7 /Ox /O2 /Zl /arch:SSE /FoMD5_std.o MD5_std.c
cl /c /TC /G7 /Ox /O2 /Zl /arch:SSE /FoBF_fmt.o BF_fmt.c
cl /c /TC /G7 /Ox /O2 /Zl /arch:SSE /FoBF_std.o BF_std.c
cl /c /TC /G7 /Ox /O2 /Zl /arch:SSE /FoAFS_fmt.o AFS_fmt.c
cl /c /TC /G7 /Ox /O2 /Zl /arch:SSE /FoLM_fmt.o LM_fmt.c
rem copy *.o to the src directory
rem make WITHOUT clean
make win32-cygwin-x86-mmx
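
The ten cl invocations above differ only in the file name, so they can be generated from a list. The following is a minimal sketch, not part of the original procedure: the flags and file names are copied from the commands above, while the helper name cl_command is made up for illustration. It only prints the command lines; it does not run the compiler.

```python
# Source files compiled with MSVC in the method above (names without extension).
sources = [
    "DES_fmt", "DES_std", "DES_bs", "BSDI_fmt", "MD5_fmt",
    "MD5_std", "BF_fmt", "BF_std", "AFS_fmt", "LM_fmt",
]

# Compiler flags exactly as used in the post.
flags = "/c /TC /G7 /Ox /O2 /Zl /arch:SSE"


def cl_command(name):
    """Build one cl command line (hypothetical helper, for illustration only)."""
    return f"cl {flags} /Fo{name}.o {name}.c"


commands = [cl_command(name) for name in sources]
print("\n".join(commands))
```

The printed lines can be redirected into a batch file and run from a shell where cl is available; the remaining steps (copying the *.o files and running make without clean) stay as described above.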


$ ./john -t
Benchmarking: Traditional DES [64/64 BS MMX]... DONE
Many salts:     748780 c/s real, 812828 c/s virtual
Only one salt:  672553 c/s

Benchmarking: BSDI DES (x725) [64/64 BS MMX]... DONE
Many salts:     25094 c/s real, 27090 c/s virtual
Only one salt:  24464 c/s

Benchmarking: FreeBSD MD5 [32/32]... DONE
Raw:    5667 c/s real, 6188 c/s virtual

Benchmarking: OpenBSD Blowfish (x32) [32/32]... DONE
Raw:    334 c/s real, 364 c/s virtual

Benchmarking: Kerberos AFS DES [48/64 4K MMX]... DONE
Short:  140758 c/s real, 156105 c/s virtual
Long:   503631 c/s

Benchmarking: NT LM DES [64/64 BS MMX]... DONE
Raw:    6514K c/s real, 7152K c/s virtual
--------------------------------

The improvement seems to be more than just luck: repeated testing shows it 
is consistent. The most critical VC option for improving LM performance 
seems to be "/arch:SSE", which unfortunately slipped from my original 
post.
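
To put a number on the difference, here is a small sketch that computes the relative change between the two builds. The "real" c/s figures are copied from the two "john -t" runs above (the LM values are given there in thousands, so they are scaled by 1000 here); the dictionary labels and the function name relative_change are made up for illustration.

```python
# "real" c/s figures from the gcc 3.4.4 build.
gcc = {
    "Traditional DES (many salts)": 753235,
    "BSDI DES (many salts)": 24927,
    "FreeBSD MD5": 5643,
    "OpenBSD Blowfish": 335,
    "Kerberos AFS DES (short)": 141239,
    "NT LM DES": 5745 * 1000,  # reported as 5745K
}

# "real" c/s figures from the build with the MSVC-compiled objects.
msvc = {
    "Traditional DES (many salts)": 748780,
    "BSDI DES (many salts)": 25094,
    "FreeBSD MD5": 5667,
    "OpenBSD Blowfish": 334,
    "Kerberos AFS DES (short)": 140758,
    "NT LM DES": 6514 * 1000,  # reported as 6514K
}


def relative_change(before, after):
    """Percentage change from 'before' to 'after'."""
    return (after - before) / before * 100.0


changes = {name: relative_change(gcc[name], msvc[name]) for name in gcc}
for name, pct in changes.items():
    print(f"{name:30s} {pct:+6.2f}%")
```

On these figures LM comes out about 13.4% faster, while every other format listed stays within 1% of the gcc build, which is within what run-to-run noise could plausibly explain.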

> Overall, I do not see a need to support building John with MSVC.  That
> would complicate the code with more #ifdef's for no good reason.
> 
Personally I am not interested in LM in john; I just wanted to share this 
discovery :) with others who can compile it themselves. Maybe someone can 
figure out the details of this behaviour and make good use of them. 
However, I am interested in optimizing compilation, though my understanding 
of the source code and my programming skills are very limited :)

-Radim
