john-dev - Re: JtR on Power



Message-ID: <20150709105919.GA16586@openwall.com>
Date: Thu, 9 Jul 2015 13:59:20 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: JtR on Power

On Thu, Jul 09, 2015 at 12:00:41PM +0800, Lei Zhang wrote:
> I pick a few representative formats here to demonstrate the difference after using AltiVec:

I wouldn't call these formats representative (they are fast hashes), and
these are poor speeds either way.

Can you show md5crypt, phpass, sha256crypt, sha512crypt?  And PBKDF2-*?

> Strangely, MD5, SHA256 and SHA512 become even slower.

I don't know what exact CPUs you're on, but I suspect they are designed
to run 4 or 8 threads/core (for POWER7 and POWER8, respectively), and
the impact from not doing so might be more profound for SIMD
instructions (higher latency) than for scalar ones.

So you might need to be doing multi-threaded benchmarks for the slow
hashes, but you also need to be very careful about possible other load
on the system.  If there is any, then you'd need to set OMP_NUM_THREADS
and possibly GOMP_CPU_AFFINITY to get yourself some otherwise free cores.
To set GOMP_CPU_AFFINITY right, you'd first need to figure out the
system's mapping of logical CPUs to physical cores.

In fact, GOMP_CPU_AFFINITY spanning the whole range might be needed even
on an otherwise idle system, as we're seeing on "super".
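
As a rough illustration of the mapping step, here is a minimal Python sketch. It assumes a Linux system that exposes CPU topology under /sys/devices/system/cpu; the function names (read_topology, group_by_core, affinity_for_cores) are hypothetical helpers made up for this example and are not part of JtR or libgomp. It groups logical CPUs by physical core and builds OMP_NUM_THREADS and GOMP_CPU_AFFINITY values for the cores you choose.

```python
import glob
import os
import re
from collections import defaultdict


def read_topology(sysfs_root="/sys/devices/system/cpu"):
    """Return {logical_cpu: (package_id, core_id)} read from Linux sysfs.

    Assumption: the kernel exposes topology/core_id and
    topology/physical_package_id for each online CPU.
    """
    topo = {}
    for path in glob.glob(os.path.join(sysfs_root, "cpu[0-9]*")):
        m = re.search(r"cpu(\d+)$", path)
        if not m:
            continue
        try:
            with open(os.path.join(path, "topology", "core_id")) as f:
                core = int(f.read())
            with open(os.path.join(path, "topology", "physical_package_id")) as f:
                pkg = int(f.read())
        except (OSError, ValueError):
            continue  # offline CPU or topology not exposed
        topo[int(m.group(1))] = (pkg, core)
    return topo


def group_by_core(topo):
    """Group logical CPUs by (package_id, core_id), i.e. by physical core."""
    cores = defaultdict(list)
    for cpu, key in topo.items():
        cores[key].append(cpu)
    return {key: sorted(cpus) for key, cpus in sorted(cores.items())}


def affinity_for_cores(cores, wanted):
    """Build environment settings that use every hardware thread of the
    physical cores listed in `wanted` (keys of `cores`)."""
    cpus = [cpu for key in wanted for cpu in cores[key]]
    return {
        "OMP_NUM_THREADS": str(len(cpus)),
        # GOMP_CPU_AFFINITY accepts a space-separated list of logical CPUs.
        "GOMP_CPU_AFFINITY": " ".join(str(c) for c in cpus),
    }


if __name__ == "__main__":
    cores = group_by_core(read_topology())
    for key, cpus in cores.items():
        print("package %d core %d: logical CPUs %s" % (key[0], key[1], cpus))
    # Example choice: use all physical cores (the whole range).
    for name, value in affinity_for_cores(cores, list(cores)).items():
        print("%s='%s'" % (name, value))
```

Passing only a subset of the keys to affinity_for_cores() picks out specific physical cores on a loaded system; passing all of them gives a list spanning the whole range. The printed values would then be exported in the environment before running the benchmark.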

Alexander
