john-dev - Re: Formats using non-SIMD SHA2 implementations

Follow @Openwall on Twitter for new release announcements and other news

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <op.x3c555zd3qx92w@mom-laptop>
Date: Fri, 14 Aug 2015 10:19:07 -0500
From: "jfoug@....net" <jfoug@....net>
To: john-dev@...ts.openwall.com
Subject: Re: Formats using non-SIMD SHA2 implementations

On Fri, 14 Aug 2015 04:31:00 -0500, magnum <john.magnum@...hmail.com>  
wrote:
> The problem is when you have different length input in one vector. Say  
> one of them required 4 limbs, and another just 3 and the rest only one.  
> This is doable (we do in eg. SAP F/G) but tedious - and reduces benefit  
> of SIMD much like diverging threads in OpenCL does. So we usually don't  
> do SIMD with such formats.

A couple exceptions to that are the sha[256][512]crypts.  Those have
varying number of SIMD limbs. The way we work around that problem, is to
set a number of passwords to work with pretty high, and then within the
format we 'group' the passwords into sets that all use the same number
of SHA* limbs for each 'step'. There are 5 (or 8?) different groupings of
limb counts (IIRC), and are based upon length of the password, and length
of the salt.  I simply use arrays of pointers and counts.  It is sort of
an ugly looking method, but it works very good. Doing this, we are assured
that we are not doing wasted hashes where only part of the items being
worked on are productive.  With the oSSL CTX model, you do not have to
concern yourself with issues like this, since you work on one items at
a time.  But when working on many in parallel, if you want to not waste
cycles, and the items vary in length, you have to group.  NOTE in the
dynamic format, we handle variable number of limbs in hashes (FLAT input)
but we do not do it in the 'smart' manner. We simply check after each
hash to see if there are any elements which are done and keep the results
and if there are still items which require more looping, we loop, doing
more hash limbs.  Dynamic was a much more difficult thing, since we can
not figure out a 'simple' formula for a known hash to say that all
input words between X and Y should group here, and all between Y+1 and Z
should group here, etc.  So I simply have not written that into code.
The times it would be helpful are actually few, so there is no great
overall loss.

In this format, (which I have not looked at), it may be that all of the
length of data to crypt is based upon the salt. If so, then you do not
have to worry about any grouping like this. They will all be grouped
the same.  You just have to do all the work yourself (like Magnum
listed) to properly hash the data.  NOTE, you do not (and should not)
build a set of buffers large enough to write all of the data to. That
would be wasteful and actually probably slower, since the working set
would be much larger.  Instead have 1 buffer, fill it properly, crypt
fill it again, crypt passing in results from last, etc until done.

Jim.

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.