Date: Sat, 26 Jan 2013 22:12:16 +0100
From: Lukas Odzioba <>
Subject: Re: Proposed optimizations to pwsafe

2013/1/26 Brian Wallace <>:
> I have some working improvements to pwsafe-cuda.  I don't have the most
> powerful GPU, but the improvement so far is as follows:
> Original:
> Benchmarking: Password Safe SHA-256 [CUDA]... DONE
> Raw:    51801 c/s real, 51801 c/s virtual
> New:
> Benchmarking: Password Safe SHA-256 [CUDA]... DONE
> Raw:    70593 c/s real, 70593 c/s virtual

Thank you very much for working on that! I saw your code and that was
exactly what we needed.
I do have some comments:
1) To me, the code would be cleaner if we used a single proper #define for this:

w[12] = sigma1( w[10] ) + w[5];
+  d += Sigma1( a ) + Ch( a, b, c ) + 0xc6e00bf3 + ( (w[12]) );
+  h += d;
+  d += Sigma0( e ) + Maj( e, f, g );

But at this point it is easier to see some obvious optimizations, such as
precomputing sigma0(const).
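
A minimal sketch of the single-#define idea from 1), in plain C so that it
can be checked on the host. ROUND, round_textbook, two_rounds_unrolled,
SIGMA0_OF_PAD and check_round_forms are hypothetical names, not identifiers
from the patch. The macro body follows the three-statement form quoted
above, and the helper functions are the standard FIPS 180-4 ones. The
constant 0x80000000 is only an example of a word known at compile time.

```c
#include <assert.h>
#include <stdint.h>

/* SHA-256 functions from FIPS 180-4, written as macros so that a constant
   argument can be folded at compile time. Arguments are evaluated more
   than once, so pass only side-effect-free expressions. */
#define ROTR(x, n)   (((x) >> (n)) | ((x) << (32 - (n))))
#define Ch(x, y, z)  (((x) & (y)) ^ (~(x) & (z)))
#define Maj(x, y, z) (((x) & (y)) ^ ((x) & (z)) ^ ((y) & (z)))
#define Sigma0(x)    (ROTR(x, 2) ^ ROTR(x, 13) ^ ROTR(x, 22))
#define Sigma1(x)    (ROTR(x, 6) ^ ROTR(x, 11) ^ ROTR(x, 25))
#define sigma0(x)    (ROTR(x, 7) ^ ROTR(x, 18) ^ ((x) >> 3))

/* One round. ROUND is a hypothetical name, not taken from the patch. */
#define ROUND(a, b, c, d, e, f, g, h, k, w) do {        \
        (h) += Sigma1(e) + Ch(e, f, g) + (k) + (w);     \
        (d) += (h);                                     \
        (h) += Sigma0(a) + Maj(a, b, c);                \
    } while (0)

/* sigma0 of a compile-time constant is itself a constant expression. */
static const uint32_t SIGMA0_OF_PAD = sigma0(0x80000000u);

/* Textbook round with explicit T1/T2 and a full register shuffle. */
static void round_textbook(uint32_t s[8], uint32_t k, uint32_t w)
{
    uint32_t t1 = s[7] + Sigma1(s[4]) + Ch(s[4], s[5], s[6]) + k + w;
    uint32_t t2 = Sigma0(s[0]) + Maj(s[0], s[1], s[2]);
    s[7] = s[6]; s[6] = s[5]; s[5] = s[4];
    s[4] = s[3] + t1;
    s[3] = s[2]; s[2] = s[1]; s[1] = s[0];
    s[0] = t1 + t2;
}

/* Two unrolled rounds: rotate the argument names instead of the data. */
static void two_rounds_unrolled(uint32_t s[8], const uint32_t k[2],
                                const uint32_t w[2])
{
    uint32_t a = s[0], b = s[1], c = s[2], d = s[3];
    uint32_t e = s[4], f = s[5], g = s[6], h = s[7];

    ROUND(a, b, c, d, e, f, g, h, k[0], w[0]);
    ROUND(h, a, b, c, d, e, f, g, k[1], w[1]);

    /* After two rounds the logical a..h live in g, h, a, b, c, d, e, f. */
    s[0] = g; s[1] = h; s[2] = a; s[3] = b;
    s[4] = c; s[5] = d; s[6] = e; s[7] = f;
}

/* Usage check: both forms must leave the same state behind. */
static void check_round_forms(void)
{
    uint32_t s1[8] = { 0x6a09e667u, 0xbb67ae85u, 0x3c6ef372u, 0xa54ff53au,
                       0x510e527fu, 0x9b05688cu, 0x1f83d9abu, 0x5be0cd19u };
    uint32_t s2[8];
    const uint32_t k[2] = { 0x428a2f98u, 0x71374491u };
    const uint32_t w[2] = { 0x61626380u, 0x00000000u };
    int i;

    for (i = 0; i < 8; i++)
        s2[i] = s1[i];
    round_textbook(s1, k[0], w[0]);
    round_textbook(s1, k[1], w[1]);
    two_rounds_unrolled(s2, k, w);
    for (i = 0; i < 8; i++)
        assert(s1[i] == s2[i]);
}
```

With the round in one macro, each unrolled round becomes a single line, and
a constant message word shows up as a literal macro argument that the
compiler can fold.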

2) w[64] should become w[16] sooner or later. For now it is not critical.
It would be good to get rid of the H and k tables too.
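
A rough sketch of the w[64] to w[16] change from 2), assuming the usual
SHA-256 message schedule from FIPS 180-4. schedule64 and schedule16_next
are hypothetical names. The 16-word version overwrites slot t & 15, which
still holds W[t-16] at that point, so only the last 16 words are kept.

```c
#include <assert.h>
#include <stdint.h>

#define ROTR(x, n) (((x) >> (n)) | ((x) << (32 - (n))))
#define sigma0(x)  (ROTR(x, 7) ^ ROTR(x, 18) ^ ((x) >> 3))
#define sigma1(x)  (ROTR(x, 17) ^ ROTR(x, 19) ^ ((x) >> 10))

/* Reference: expand one 16-word block into the full 64-word schedule. */
static void schedule64(const uint32_t block[16], uint32_t w[64])
{
    int t;

    for (t = 0; t < 16; t++)
        w[t] = block[t];
    for (t = 16; t < 64; t++)
        w[t] = sigma1(w[t - 2]) + w[t - 7] + sigma0(w[t - 15]) + w[t - 16];
}

/* Rolling version: w[] holds only the 16 most recent words.
   Computes W[t] in place and returns it. Must be called for
   t = 16, 17, ..., 63 in order, starting from the message block. */
static uint32_t schedule16_next(uint32_t w[16], int t)
{
    assert(t >= 16 && t < 64);
    /* w[t & 15] currently holds W[t-16]; add the other three terms. */
    w[t & 15] += sigma1(w[(t - 2) & 15]) + w[(t - 7) & 15]
               + sigma0(w[(t - 15) & 15]);
    return w[t & 15];
}
```

Once the rounds are fully unrolled, t is a literal in every call, so all
the (t - n) & 15 indices are compile-time constants and the 16 words have a
chance of staying in registers.
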
3) I guess this condition is true quite rarely:
 if(h + H[7] == v[7]){ ... }
and if it is true, we could do all the rest of the computations without
the following if statements.
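
One possible shape for 3), as a sketch only. digest_matches, state and
check_digest_matches are hypothetical names; H and v are used as in the
line above (initial hash values and the stored verifier). The last word is
tested first, and the remaining seven words are compared without any
further branches.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if state[i] + H[i] == v[i] for all eight words, else 0. */
static int digest_matches(const uint32_t state[8], const uint32_t H[8],
                          const uint32_t v[8])
{
    uint32_t diff = 0;
    int i;

    /* Cheap filter: a wrong candidate almost never passes this test,
       assuming the hash output behaves like uniform random data. */
    if (state[7] + H[7] != v[7])
        return 0;

    /* Rare path: fold all remaining differences together, no nested ifs. */
    for (i = 0; i < 7; i++)
        diff |= (state[i] + H[i]) ^ v[i];
    return diff == 0;
}

/* Usage check with made-up values (not real SHA-256 data). */
static void check_digest_matches(void)
{
    const uint32_t H[8]     = { 10, 20, 30, 40, 50, 60, 70, 80 };
    const uint32_t state[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    uint32_t v[8];
    int i;

    for (i = 0; i < 8; i++)
        v[i] = state[i] + H[i];
    assert(digest_matches(state, H, v) == 1);

    v[7] ^= 1u;                 /* last word differs: early reject */
    assert(digest_matches(state, H, v) == 0);

    v[7] ^= 1u;                 /* restore last word */
    v[0] ^= 1u;                 /* first word differs: caught on rare path */
    assert(digest_matches(state, H, v) == 0);
}
```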

I am just curious: did you write a tool for the unrolling, or did you do all of this by hand?

