Date: Wed, 17 Dec 2008 04:12:52 +0900
From: "Dumplinger Boy" <nasay.ognad@...il.com>
To: john-users@...ts.openwall.com
Subject: Bitslice DES fast implementation for AltiVec(PPC)

Hello, all.

I tried a way to make Bitslice DES run faster on PowerPC.
A single VSEL instruction can replace a combination of three
instructions (VAND, VANDC, and VOR):

 d = vec_or(vec_andc(a, c), vec_and(b, c));  ==>  d = vec_sel(a, b, c);
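
The identity above can be checked without AltiVec hardware. The following is only a minimal sketch in portable C, assuming 64-bit integers as a stand-in for the 128-bit vectors; the function names are hypothetical and are not taken from the attached patch. It compares the three-operation form against a bit-by-bit select that models what vec_sel(a, b, c) computes (the bit of b where c is 1, the bit of a where c is 0).

```c
#include <assert.h>
#include <stdint.h>

/* Three-operation form: (a AND NOT c) OR (b AND c),
 * i.e. the VANDC + VAND + VOR sequence on a scalar stand-in. */
static uint64_t select_and_or(uint64_t a, uint64_t b, uint64_t c)
{
    return (a & ~c) | (b & c);
}

/* Reference model of a bitwise select: for each bit position,
 * take the bit from b where c is 1, otherwise the bit from a. */
static uint64_t select_bitwise(uint64_t a, uint64_t b, uint64_t c)
{
    uint64_t d = 0;
    for (int i = 0; i < 64; i++) {
        uint64_t bit = (uint64_t)1 << i;
        d |= (c & bit) ? (b & bit) : (a & bit);
    }
    return d;
}

/* The low 8 bits of these constants enumerate all 8 combinations
 * of one bit each from a, b and c (truth-table columns), so one
 * comparison covers every case of the per-bit function. */
static void check_select_identity(void)
{
    const uint64_t a = 0xF0, b = 0xCC, c = 0xAA;
    assert(select_and_or(a, b, c) == select_bitwise(a, b, c));
}
```

Because a select is a pure per-bit function, agreement on the eight truth-table rows is enough to show the two forms agree for any vector width.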

I confirmed this on a PowerPC G4 (Mac OS X 10.4) and on the Cell PPU of a
PLAYSTATION 3 (Linux).
On the G4, the JtR 1.7.3.1 build compiled with Apple's gcc-3.3 was the
fastest, and my code made it faster still (about 20% with many salts,
about 17% with one salt).

before:
> Benchmarking: Traditional DES [128/128 BS AltiVec]... DONE
> Many salts:     725427 c/s real, 728340 c/s virtual
> Only one salt:  649676 c/s real, 652285 c/s virtual

after:
> Benchmarking: Traditional DES [128/128 BS AltiVec]... DONE
> Many salts:     867635 c/s real, 871119 c/s virtual
> Only one salt:  762291 c/s real, 763818 c/s virtual


P.S.
 As far as I know, scalar integer and AltiVec instructions can execute
in parallel on most PowerPC implementations. Therefore, further
performance gains may be possible.

Download attachment "john-the-ripper-1.7-faster-altivec.diff.gz" of type "application/x-gzip" (763 bytes)

Download attachment "sboxes-alti.c" of type "application/octet-stream" (17255 bytes)
