Date: Wed, 30 Jun 2010 06:13:37 +0400
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: bitslice DES parallelization with OpenMP
New best benchmark (dual Xeon X5460 3.16 GHz, under some unrelated load):
Benchmarking: Traditional DES [128/128 BS SSE2-16]... DONE
Many salts: 20889K c/s real, 2607K c/s virtual
Only one salt: 5701K c/s real, 711814 c/s virtual
That's over 87% efficiency for the multi-salt case (I say "over"
because the system was under a bit of other load during the benchmark).
guesses: 15 time: 0:00:00:36 c/s: 18655K trying: zntkzntk - zzzzzzzz
This is john-1.7.6-omp-des-4, already uploaded to:
http://openwall.info/wiki/john/patches
On Wed, Jun 30, 2010 at 04:42:26AM +0400, Solar Designer wrote:
> ... Changing DES_bs_mt from 8 to 96, I am getting a 1% to 2% slowdown on
> an otherwise idle system,
I was too quick to state that. I forgot that higher DES_bs_mt may also
make it feasible to parallelize set_salt() and even cmp_all(). Taking
care of that and increasing DES_bs_mt further to 192, I reclaimed the
old speed and more on an almost idle system. On the Core i7 920 2.67 GHz
system, I am now getting:
Benchmarking: Traditional DES [128/128 BS SSE2-16]... DONE
Many salts: 10174K c/s real, 1267K c/s virtual
Only one salt: 4841K c/s real, 602923 c/s virtual
That's 88% efficiency (of 11500K for 8 separate processes).
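As a rough illustration of the arithmetic behind the 88% figure, here is a minimal sketch in C. The function names are hypothetical and are not part of John. The second helper assumes that "virtual" c/s is measured per second of processor time summed over all threads, so that real divided by virtual approximates the number of busy logical CPUs.

```c
/* Parallel efficiency: the rate achieved by the single OpenMP-enabled
 * process divided by the combined rate of independent processes.
 * Both rates must be in the same unit (here, thousands of c/s). */
static double parallel_efficiency(double achieved, double combined)
{
    return achieved / combined;
}

/* Assumption: "virtual" c/s is per second of CPU time summed over all
 * threads, while "real" c/s is per second of wall-clock time.  Under
 * that assumption the ratio estimates how many logical CPUs were busy. */
static double busy_cpus(double real_cps, double virtual_cps)
{
    return real_cps / virtual_cps;
}
```

With the Core i7 numbers above, parallel_efficiency(10174.0, 11500.0) comes to about 0.885, and busy_cpus(10174.0, 1267.0) to about 8.03, which is consistent with 8 threads being in use.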
To avoid wasting CPU time when an actual run is about to terminate -
when it has fewer than a full chunk of candidate passwords yet to test -
I also enhanced the "crypt bodies" to perform only the required number
of loop iterations. With this, I am getting:
host!solar:~/john/john-1.7.6-omp-des/run$ ./john -e=double --salts=-2 ~/john/pw-fake-unix
Loaded 1458 password hashes with 1458 different salts (Traditional DES [128/128 BS SSE2-16])
simsim (u2671-des)
[...]
ssssss (u3087-des)
guesses: 14 time: 0:00:00:03 c/s: 9873K trying: ajjgajjg - btslbtsl
guesses: 14 time: 0:00:00:09 c/s: 10019K trying: btsmbtsm - debrdebr
guesses: 14 time: 0:00:00:15 c/s: 10053K trying: eokyeoky - fyudfyud
woofwoof (u1435-des)
guesses: 15 time: 0:00:01:02 c/s: 10055K trying: wtaywtay - ydkdydkd
guesses: 15 time: 0:00:01:08 c/s: 10004K trying: zntkzntk - zzzzzzzz
So 10M c/s on the Core i7 is achieved in practice.
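The idea of running only the required number of loop iterations when fewer than a full set of candidates remains might be sketched as follows. This is not the code from the patch: CHUNK_DEPTH, MAX_CHUNKS, chunks_needed(), process_chunk() and crypt_body() are hypothetical names, the value 128 assumes one candidate per bit of a 128-bit vector, and MAX_CHUNKS is only assumed to play a role similar to DES_bs_mt.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_DEPTH 128 /* assumed: candidates per bitslice chunk */
#define MAX_CHUNKS  192 /* assumed: role similar to DES_bs_mt */

/* Number of chunks that hold at least one candidate:
 * ceil(count / CHUNK_DEPTH), computed in integer arithmetic. */
static size_t chunks_needed(size_t count)
{
    assert(count <= (size_t)CHUNK_DEPTH * MAX_CHUNKS);
    return (count + CHUNK_DEPTH - 1) / CHUNK_DEPTH;
}

/* Hypothetical stand-in for the per-chunk DES computation;
 * it only records that the chunk was visited. */
static void process_chunk(size_t chunk, unsigned char *done)
{
    done[chunk] = 1;
}

/* Hypothetical "crypt body": iterate over the chunks that are in use
 * rather than always over MAX_CHUNKS.  Returns the iteration count. */
static size_t crypt_body(size_t count, unsigned char *done)
{
    long n = (long)chunks_needed(count);
    long i; /* signed index, as required by older OpenMP versions */

    /* Each iteration writes a distinct element of done[], so the
     * iterations are independent.  Without -fopenmp the pragma is
     * ignored and the loop runs serially. */
#pragma omp parallel for
    for (i = 0; i < n; i++)
        process_chunk((size_t)i, done);

    return (size_t)n;
}
```

With 300 candidates left, for example, this runs 3 iterations instead of 192, so the final partial batch of a run costs time roughly in proportion to the work that remains.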
On the dual Xeon, for which I included the new 20M benchmark at the
start of this message, an actual run now does:
host!solar:~/john$ ./john-omp-des-4 -e=double --salts=-2 pw-fake-unix
Loaded 1458 password hashes with 1458 different salts (Traditional DES [128/128 BS SSE2-16])
simsim (u2671-des)
cloclo (u2989-des)
mimi (u3044-des)
aaaa (u1638-des)
xxxx (u845-des)
aaaaaa (u156-des)
jamjam (u2207-des)
booboo (u171-des)
bebe (u1731-des)
gigi (u2082-des)
cccccc (u982-des)
jojo (u3027-des)
lulu (u3034-des)
ssssss (u3087-des)
guesses: 14 time: 0:00:00:01 c/s: 19487K trying: ajjgajjg - btslbtsl
guesses: 14 time: 0:00:00:06 c/s: 20544K trying: eokyeoky - fyudfyud
guesses: 14 time: 0:00:00:16 c/s: 18641K trying: kdvwkdvw - lofblofb
guesses: 14 time: 0:00:00:27 c/s: 18626K trying: snzgsnzg - tyiltyil
woofwoof (u1435-des)
guesses: 15 time: 0:00:00:36 c/s: 18655K trying: zntkzntk - zzzzzzzz
As you can see, it actually exceeds 20M at times, but then goes below
that because of the changing non-John load.
Any feedback?
Is anyone able to test this on other systems and with other versions of
gcc (it needs 4.2 or newer, but I have only tested 4.5.0)?
Alexander