Date: Wed, 27 Jun 2012 07:46:40 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: mschap-v2 conversion

On Wed, Jun 27, 2012 at 01:44:28AM +0200, magnum wrote:
> On 2012-06-26 15:02, Solar Designer wrote:
> > On Tue, Jun 26, 2012 at 10:06:47AM +0200, magnum wrote:
> >> Solar may respond much better when he gets some more time.
> > 
> > I'm sorry, but my opinion is that tuning OpenMP performance with the
> > current early/experimental bitslicing implementation for this format is
> > premature.
> 
> This is my fault,

No problem.

> I just noticed she broke OMP and she went off fixing
> it. Anyway, even if it's not time to really *tune* it, it should
> definitely not be slower than running one core.

Running slower than on one core may happen during development.

> > Notice that the speedup from bitslicing without OpenMP is quite low,
> > compared to what we're seeing for purely DES formats (much higher
> > speedup there).  I guess this might be because of the uses of MD4 and
> > the conversions to/from bitslice representation, but that does not
> > explain the low speed for the "many salts" case (the uses of MD4 are in
> > key setup only).  We need to seriously look into this and see what can
> > be done about it.
> 
> I presume Deepika is not really interested in this particular format but
> in bit-slicing, so using SSE2 for MD4 might be out of scope. I could add
> SSE2 for MD4 at some point if it helps. I just need to understand the
> context. If we always have 32 or 64 (or 128) keys at a time, it should
> be a walk in the park.

Right.
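
For context on the "32 or 64 (or 128) keys at a time" point, here is a minimal toy sketch of the conversion to and from bitslice representation. It is not the code from the tree: NKEYS, NBITS, to_bitslice(), from_bitslice(), bitslice_xor_const() and demo_roundtrip() are hypothetical names made up for illustration, and the "candidates" are just 8-bit values rather than real keys.

```c
#include <stdint.h>

#define NKEYS 32  /* candidates packed per 32-bit word (toy assumption) */
#define NBITS 8   /* bits per toy candidate value (toy assumption) */

/* Transpose into bitslice form: bit k of slice[b] is bit b of in[k]. */
void to_bitslice(const uint8_t in[NKEYS], uint32_t slice[NBITS])
{
    for (int b = 0; b < NBITS; b++) {
        uint32_t w = 0;
        for (int k = 0; k < NKEYS; k++)
            w |= (uint32_t)((in[k] >> b) & 1u) << k;
        slice[b] = w;
    }
}

/* Inverse transpose: rebuild each candidate from its column of bits. */
void from_bitslice(const uint32_t slice[NBITS], uint8_t out[NKEYS])
{
    for (int k = 0; k < NKEYS; k++) {
        uint8_t v = 0;
        for (int b = 0; b < NBITS; b++)
            v |= (uint8_t)(((slice[b] >> k) & 1u) << b);
        out[k] = v;
    }
}

/* XOR every candidate with the same constant: one word operation per
 * bit position updates all NKEYS candidates at once. */
void bitslice_xor_const(uint32_t slice[NBITS], uint8_t c)
{
    for (int b = 0; b < NBITS; b++)
        if ((c >> b) & 1u)
            slice[b] = ~slice[b];
}

/* Returns 1 if the bitsliced XOR matches the per-candidate XOR. */
int demo_roundtrip(void)
{
    uint8_t in[NKEYS], out[NKEYS];
    uint32_t slice[NBITS];

    for (int k = 0; k < NKEYS; k++)
        in[k] = (uint8_t)(k * 7 + 3);
    to_bitslice(in, slice);
    bitslice_xor_const(slice, 0x5A);
    from_bitslice(slice, out);
    for (int k = 0; k < NKEYS; k++)
        if (out[k] != (uint8_t)(in[k] ^ 0x5A))
            return 0;
    return 1;
}
```

The two transposes are pure overhead around the bitsliced work itself, which is one reason conversions to/from bitslice representation can eat into the speedup when they happen often.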

> I just know the basic theory of BS. Deepika, what is the expected
> speedup from just BS (no OMP) if this had been a straight format with no
> MD4 and stuff involved? If we run 64-bit, we do 64 items at a time,
> right? But it's not 64x faster of course. How much faster, in general,
> should it be?

Sorry for commenting on a question addressed to Deepika rather than to
me, but I'd expect it to be "a few times" faster even on a 32-bit build
for the "many salts" case.  BTW, it looks like not only the MD4 stuff,
but also the DES key setup can be moved out of the per-salt loop.
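
To illustrate the loop structure only, here is a rough sketch with placeholder arithmetic standing in for the real MD4 and DES code. All names (toy_key_setup(), toy_per_salt(), crack_naive(), crack_hoisted(), setup_calls) are hypothetical and do not correspond to functions in the tree; the assumption is simply that the key-dependent work does not depend on the salt.

```c
#include <stddef.h>
#include <stdint.h>

/* Counts how often the (supposedly expensive) key setup runs. */
unsigned long setup_calls;

/* Placeholder for salt-independent work, e.g. MD4 of the candidate
 * password plus the DES key schedule.  Toy arithmetic, not real crypto. */
uint32_t toy_key_setup(uint32_t key)
{
    setup_calls++;
    return key * 2654435761u;
}

/* Placeholder for the per-salt work, e.g. encrypting the challenge. */
uint32_t toy_per_salt(uint32_t sched, uint32_t salt)
{
    return (sched ^ salt) + (sched >> 7);
}

/* Key setup inside the per-salt loop: nkeys * nsalts setups. */
uint32_t crack_naive(const uint32_t *keys, size_t nkeys,
                     const uint32_t *salts, size_t nsalts)
{
    uint32_t acc = 0;
    for (size_t s = 0; s < nsalts; s++)
        for (size_t k = 0; k < nkeys; k++)
            acc ^= toy_per_salt(toy_key_setup(keys[k]), salts[s]);
    return acc;
}

/* Key setup hoisted out of the per-salt loop: only nkeys setups.
 * sched must have room for nkeys entries. */
uint32_t crack_hoisted(const uint32_t *keys, size_t nkeys,
                       const uint32_t *salts, size_t nsalts,
                       uint32_t *sched)
{
    uint32_t acc = 0;
    for (size_t k = 0; k < nkeys; k++)
        sched[k] = toy_key_setup(keys[k]);
    for (size_t s = 0; s < nsalts; s++)
        for (size_t k = 0; k < nkeys; k++)
            acc ^= toy_per_salt(sched[k], salts[s]);
    return acc;
}
```

With 4 keys and 3 salts, the first version runs the setup 12 times and the second 4 times, with the same result; the more salts are loaded, the less the setup cost matters.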

Alexander
