Date: Thu, 27 Apr 2006 22:01:54 +0400
From: Solar Designer <>
Subject: Re: Performance tuning

I wrote, regarding the MMX-to-SSE bitslice DES hack:
> > On Pentium 3 and on AMD processors, the SSE code is slower in all
> > cases (for the benchmarks I've performed or have seen so far).

On Thu, Apr 27, 2006 at 06:26:18PM +0200, Simon Marechal wrote:
> There are twice as many SSE registers in amd64 mode as in 32-bit mode.
> Do you use this feature?

No, not yet.  The extra registers are indeed very helpful, but the
slowdown with the move from MMX to SSE on AMD processors is bad enough
that the extra registers, if used to reduce the instruction count and/or
to avoid dependencies, would barely compensate for it (of course, this
is just my guesstimate).

Perhaps this is worth doing for EM64T and for future AMD processors.

> I think a cool feature would be to add a more "realistic" benchmark
> option. If it would be possible to have john do candidate key setup,
> hash and comparisons, instead of just hashing, it would reveal the cost
> of slower key setup functions, usually associated with SIMD versions of
> the cipher.

John does just that already.

It's only for salted hashes that John benchmarks things differently.
For those, it also does a benchmark for the typical "many salts" case,
doing salt setup instead of keys setup for each crypt_all() call, but
still doing keys setup once per BENCHMARK_MANY crypt_all() calls.  This
is implemented in bench.c: benchmark_format() - the do ... while loop.
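
To illustrate the "many salts" loop structure described above, here is a
minimal sketch in C.  It is not the actual code from bench.c: the counter
struct, the stub set_keys()/set_salt() helpers, bench_many_salts(), and
the value given for BENCHMARK_MANY are all assumptions made for
illustration.  Only the shape of the loop (salt setup before every
crypt_all() call, keys setup once per BENCHMARK_MANY calls) follows the
description.

```c
/* Placeholder value, assumed for this sketch only. */
#define BENCHMARK_MANY 0x100

/* Hypothetical counters standing in for the real work being timed. */
struct counters {
    unsigned long keys_setups;
    unsigned long salt_setups;
    unsigned long crypts;
};

/* Stubs: a real format would load candidate keys, set the salt, and hash. */
static void set_keys(struct counters *c)  { c->keys_setups++; }
static void set_salt(struct counters *c)  { c->salt_setups++; }
static void crypt_all(struct counters *c) { c->crypts++; }

/*
 * "Many salts" benchmark loop: the salt changes on every iteration, while
 * the keys are only reloaded once per BENCHMARK_MANY crypt_all() calls.
 * Being a do ... while loop, it runs at least once even if total_crypts
 * is 0.
 */
static void bench_many_salts(struct counters *c, unsigned long total_crypts)
{
    unsigned long left = 0; /* crypt_all() calls left before next keys setup */
    unsigned long done = 0;

    do {
        if (left == 0) {
            set_keys(c);
            left = BENCHMARK_MANY;
        }
        set_salt(c);
        crypt_all(c);
        left--;
    } while (++done < total_crypts);
}
```

With these assumed values, 1000 crypt_all() calls involve 1000 salt setups
but only 4 keys setups, so a slow key setup function barely affects the
"many salts" figure.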

Alexander Peslyak <solar at>
GPG key ID: B35D3598  fp: 6429 0D7E F130 C13E C929  6447 73C3 A290 B35D 3598
