Date: Tue, 29 Jan 2008 06:31:03 +0300
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: UltraSPARC T2 support

On Mon, Jan 28, 2008 at 03:26:43PM -0500, Al D. Baran wrote:
> Any work on UltraSPARC T2 support

There's not any specific support for these processors yet, but people
have been running JtR benchmarks on the T1 and tuning the underlying
Solaris system to make JtR scale to up to 32 threads on the T1:

http://www.sun.com/products-n-solutions/edu/events/archive/hpc/2006presentations/Dev04_ThomasNau_Showcase.pdf

There are also newer revisions of the presentation by Thomas Nau that
include JtR as one of many examples and provide more hints on profiling
and performance tuning under Solaris:

http://www.guug.de/veranstaltungen/osdevcon2007/slides/OSDevCon_Berlin_200307_DTrace.pdf
http://www.rz.rwth-aachen.de/computing/events/2007/sunhpc_2007/dtrace.pdf

I recall that there was another independent effort to tune JtR on the
T1, with similar results and conclusions, but I am no longer able to
find it on the web.

I am not aware of similar benchmarks for the T2.

> to use the onboard crypto processors

Actually, the T2 includes crypto accelerators on the chip - one per CPU
core (8 per chip). Yes, it would be great to try to make JtR use them,
but I am not aware of such efforts. It is not certain which of the
password hashes supported by JtR the accelerators are usable for, and
what kind of speedup may be achieved - this will require some research.

This presentation:

http://blogs.sun.com/sprack/resource/N2_Announce_Breakout_final.pdf

gives some performance numbers for the accelerators on page 25. The
numbers are per chip (8 accelerators). While 83 Gbps for DES is quite
impressive - it could translate to up to 50M c/s at traditional
DES-based crypt(3) in theory (assuming no "overhead", which can't be
true in practice) - there's no information on whether the expansion
permutation may be modified (required to support salts).
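
The "up to 50M c/s" estimate for DES can be reproduced with a little
arithmetic. The following is only a rough sketch: the 64-bit block size
and the 25 DES iterations per traditional crypt(3) hash are standard
facts about the algorithm rather than figures taken from the
presentation, 1 Gbps is assumed to mean 10^9 bits per second, and the
function name is made up for illustration.

```python
# Back-of-the-envelope check of the "up to 50M c/s" estimate.
# Assumptions (not stated in the message itself):
#   - 1 Gbps = 10**9 bits per second
#   - DES processes 64-bit blocks
#   - traditional DES-based crypt(3) runs 25 DES iterations per hash
#   - zero overhead (key setup, data movement), which cannot hold in practice

DES_BLOCK_BITS = 64
CRYPT_ITERATIONS = 25


def des_crypt_rate(throughput_bps: int) -> int:
    """Upper bound on crypt(3) computations per second for a raw DES throughput."""
    blocks_per_second = throughput_bps // DES_BLOCK_BITS
    return blocks_per_second // CRYPT_ITERATIONS


if __name__ == "__main__":
    rate = des_crypt_rate(83 * 10**9)
    print(f"{rate / 1e6:.1f}M c/s")  # prints "51.9M c/s"
```

Under these assumptions the bound comes out slightly above 50M c/s per
chip, which matches the rounded figure quoted above.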
And the numbers for MD5 and SHA-1 are not nearly as impressive - 80M c/s
per chip theoretical maximum for MD5, which is around the same per-core
speed that MDCrack achieves on x86 CPUs (10M c/s per core). Well, if the
80M c/s were to be achieved in practice (which I doubt), then you'd have
this single chip match the performance of two quad-core x86 CPUs - for
some nice power savings.

> instead of the v8 assembly?

The SPARC V8 assembly code found in sparc.S in JtR 1.7 is almost unused
anyway. When you build JtR for a 64-bit SPARC target, which is what you
should be doing on modern systems, that file is not used at all. When
you build for a 32-bit SPARC target (other than Linux), the file is
used, but on most systems bitslice DES is determined to be faster at
compile-time, so that is used instead, whereas the only use left for the
non-bitslice implementation is some non-performance-critical processing
in key setup for BSDI-style DES-based hashes that you probably don't use
anyway. Thus, it is only on ancient systems that this file is being used
for real, and it is a candidate for removal in a future release.

As to bitslice DES, modern C compilers produce decent code for it
(except on register-starved architectures, which SPARC is not). Of
course, some further improvement with assembly code (perhaps
Perl-generated) is still possible - such as by making use of UltraSPARC
VIS extensions (more bitwise ops, and they work on floating-point
registers), which in fact was done in distributed.net DES clients many
years ago.

-- 
Alexander Peslyak <solar at openwall.com>
GPG key ID: 5B341F15  fp: B3FB 63F4 D7A3 BCCC  6F6E FC55 A2FC 027C  5B34 1F15
http://www.openwall.com - bringing security into open computing environments

-- 
To unsubscribe, e-mail john-users-unsubscribe@...ts.openwall.com and reply
to the automated confirmation request that will be sent to you.