Openwall GNU/*/Linux - a small security-enhanced Linux distro for servers
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Sun, 11 Nov 2007 10:52:07 +0300
From: Solar Designer <>
Subject: Re: bitslice implementation of ORACLE hash cracking

On Mon, Nov 05, 2007 at 03:43:55AM +0000, Larry Bonner wrote:
> just wondering, is it possible to write bitslice implementation for
> oracle hashes that use (new hash) sha-1 and
> (older hash) des in cbc mode? if not, why not?

I haven't looked into the Oracle hashes specifically, but in general:

DES-based hashes can be efficiently and relatively easily bitsliced as
long as the number of DES block encryptions per hash computation is
fixed (for a given salt, if applicable).  When that number is not fixed,
implementation complexity increases dramatically and efficiency
decreases, but an implementation is still possible (with buffering).

CBC mode is not a problem for bitslicing (as multiple candidate
passwords will be tested in parallel), although I am not sure what you
mean by this mode being used for password hashing.  (I am not familiar
with Oracle password hashing.)

As to SHA-1 (as well as MD4 and MD5, for that matter), bitslice
implementations are possible, but they are only more efficient than
traditional ones on CPUs with wide registers (say, 128+ bits), with
large L1 caches, and/or with unusually(?) high instruction issue rates.
Current general-purpose CPUs that satisfy these criteria happen to also
support parallel operations on 32-bit elements within the 128-bit vector
registers (SSE, AltiVec), which is both more straightforward and likely
more efficient for these hashes.

It has been many years since I've experimented with a bitslice
implementation of MD5 - on an old Alpha (dual-issue, 64-bit) in 1998.
It achieved about the same performance as a traditional implementation
(of a single instance of MD5) did on that system, which led me to
conclusions mentioned above.  Maybe it's worth giving this another try
on SSE2, AltiVec, or maybe Cell's SPEs (128 registers would be very
handy).  With a large enough L1 instruction cache, bit rotates can be
omitted completely - instead, different registers (or array elements)
are substituted in subsequent steps.  (That old Alpha did not have a
large enough L1 I-cache, so bit rotates were performed at runtime.)

Alexander Peslyak <solar at>
GPG key ID: 5B341F15  fp: B3FB 63F4 D7A3 BCCC 6F6E  FC55 A2FC 027C 5B34 1F15 - bringing security into open computing environments

To unsubscribe, e-mail and reply
to the automated confirmation request that will be sent to you.

Powered by blists - more mailing lists

Your e-mail address:

Powered by Openwall GNU/*/Linux - Powered by OpenVZ