Date: Tue, 29 Jan 2008 06:31:03 +0300
From: Solar Designer <>
Subject: Re: UltraSPARC T2 support

On Mon, Jan 28, 2008 at 03:26:43PM -0500, Al D. Baran wrote:
> Any work on UltraSPARC T2 support

There is no specific support for these processors yet, but people have
been running JtR benchmarks on the T1 and tuning the underlying Solaris
system to make JtR scale to up to 32 threads on that chip:

There are also newer revisions of the presentation by Thomas Nau that
include JtR as one of many examples and provide more hints on profiling
and performance tuning under Solaris:

I recall that there was another independent effort to tune JtR on the
T1, with similar results and conclusions, but I am no longer able to
find it on the web.

I am not aware of similar benchmarks for the T2.

> to use the onboard crypto processors

Actually, the T2 includes crypto accelerators on the chip - one per CPU
core (8 per chip).  Yes, it would be great to try to make JtR use them,
but I am not aware of such efforts.  It is not clear which of the
password hashes supported by JtR the accelerators would be usable for,
or what kind of speedup may be achieved - this will require some research.

This presentation:

gives some performance numbers for the accelerators on page 25.  The
numbers are per chip (8 accelerators).  While 83 Gbps for DES is quite
impressive - in theory, it could translate to up to 50M c/s at
traditional DES-based crypt(3) (83 Gbps / 64 bits per block / 25 DES
iterations per hash, assuming no "overhead", which can't be true in
practice) - there's no information on whether the expansion permutation
may be modified (required to support salts).  The numbers for MD5 and
SHA-1 are not nearly as impressive - an 80M c/s per chip theoretical
maximum for MD5, or 10M c/s per core, which is around the same per-core
speed that MDCrack achieves on x86 CPUs.  Well, if the 80M c/s were to
be achieved in practice (which I doubt), then you'd have this single
chip match the performance of two quad-core x86 CPUs - for some nice
power savings.
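
The arithmetic behind these estimates can be checked with a few lines of
Python.  This is only a back-of-the-envelope sketch: the throughput
figures are the ones quoted above, the 64-bit block size and the 25 DES
iterations per hash are properties of DES and of traditional crypt(3),
and all the names are made up for illustration.  It assumes zero
overhead, so the results are upper bounds, not predictions.

```python
# Figures quoted in the message above (per chip, 8 accelerators).
DES_BITS_PER_SECOND = 83e9   # 83 Gbps for DES
MD5_CHIP_CPS = 80e6          # theoretical maximum for MD5, per chip
CORES_PER_CHIP = 8
X86_CORE_CPS = 10e6          # rough per-core MD5 speed cited for MDCrack

# Properties of the algorithms themselves.
DES_BLOCK_BITS = 64          # DES encrypts 64-bit blocks
DES_ITERATIONS_PER_HASH = 25 # traditional crypt(3) applies DES 25 times


def des_crypt_upper_bound(bits_per_second):
    """Upper bound on crypt(3) computations per second, ignoring overhead."""
    blocks_per_second = bits_per_second / DES_BLOCK_BITS
    return blocks_per_second / DES_ITERATIONS_PER_HASH


des_cps = des_crypt_upper_bound(DES_BITS_PER_SECOND)
md5_per_core = MD5_CHIP_CPS / CORES_PER_CHIP
x86_cores_matched = MD5_CHIP_CPS / X86_CORE_CPS

print("DES-based crypt(3) upper bound: %.1fM c/s" % (des_cps / 1e6))
print("MD5 per T2 core: %.0fM c/s" % (md5_per_core / 1e6))
print("Equivalent x86 cores for MD5: %.0f" % x86_cores_matched)
```

The DES figure comes out slightly above 50M c/s, and the MD5 figure
corresponds to eight x86 cores, that is, two quad-core CPUs.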

> instead of the v8 assembly?

The SPARC V8 assembly code found in sparc.S in JtR 1.7 is almost unused
anyway.  When you build JtR for a 64-bit SPARC target, which is what you
should be doing on modern systems, that file is not used at all.  When
you build for a 32-bit SPARC target (other than Linux), the file is
compiled in, but on most systems bitslice DES is determined to be faster
at compile time and is used instead.  The only use left for the
non-bitslice implementation is then some non-performance-critical
processing in key setup for BSDI-style DES-based hashes that you
probably don't use anyway.  Thus, it is only on ancient systems that
this file is being used for real, and it is a candidate for removal in a
future release.

As to bitslice DES, modern C compilers produce decent code for it
(except on register-starved architectures, which SPARC is not).  Of
course, some further improvement with assembly code (perhaps
Perl-generated) is still possible - such as by making use of UltraSPARC
VIS extensions (more bitwise ops, and they work on floating-point
registers), which in fact was done in DES clients many
years ago.
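
For those unfamiliar with the bitslice approach, here is a toy sketch of
the idea in Python.  It is not JtR's code and not DES: the gate function
is made up, and Python integers stand in for 64-bit machine registers.
The point is only that bit N of every word belongs to independent
instance N, so a handful of bitwise operations evaluates the same
boolean circuit for 64 instances at once.

```python
import random

WIDTH = 64                 # lanes per word, standing in for a 64-bit register
MASK = (1 << WIDTH) - 1


def pack(bits):
    """Place one bit from each of WIDTH independent instances into a word."""
    word = 0
    for lane, bit in enumerate(bits):
        word |= (bit & 1) << lane
    return word


def unpack(word):
    """Recover the per-instance bits from a bitsliced word."""
    return [(word >> lane) & 1 for lane in range(WIDTH)]


def toy_gate_scalar(a, b, c):
    """Hypothetical one-bit circuit: (a AND NOT b) XOR c, one instance."""
    return (a & (b ^ 1)) ^ c


def toy_gate_bitslice(a, b, c):
    """The same circuit applied to all WIDTH lanes with three bitwise ops.

    An instruction set with a combined and-not operation could do the
    first two operations in a single instruction.
    """
    return (a & ~b & MASK) ^ c


def demo(seed=0):
    rng = random.Random(seed)
    a_bits = [rng.getrandbits(1) for _ in range(WIDTH)]
    b_bits = [rng.getrandbits(1) for _ in range(WIDTH)]
    c_bits = [rng.getrandbits(1) for _ in range(WIDTH)]
    sliced = unpack(toy_gate_bitslice(pack(a_bits), pack(b_bits), pack(c_bits)))
    scalar = [toy_gate_scalar(a, b, c) for a, b, c in zip(a_bits, b_bits, c_bits)]
    return sliced, scalar


sliced, scalar = demo()
print("bitslice matches per-instance evaluation:", sliced == scalar)
```

A real bitslice DES implementation expresses the S-boxes as such gate
sequences, which is why extra bitwise operations and extra registers to
hold intermediate values can matter.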

Alexander Peslyak <solar at>
GPG key ID: 5B341F15  fp: B3FB 63F4 D7A3 BCCC 6F6E  FC55 A2FC 027C 5B34 1F15 - bringing security into open computing environments
