Date: Sun, 27 Jan 2013 00:35:01 +0200
From: Milen Rangelov <>
Subject: Re: Proposed optimizations to pwsafe

Just a side note: I had a look at your OpenCL pwsafe code, and there are
some obvious optimizations that can be done. Some are minor, but the most
important is the following. You have this:

#define Ch(x, y, z) (z ^ (x & (y ^ z)))
#define Maj(x, y, z) ((y & z) | (x & (y | z)))

If you replace those with:

#define Ch(x, y, z) (bitselect(z, y, x))
#define Maj(x, y, z) (bitselect(y, x, (z ^ y)))

You are effectively using just 1 ALU operation per Ch (down from 3) and
2 ALU ops per Maj (down from 4).
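
If you want to convince yourself that the two forms are equivalent, here is
a minimal sketch in plain C. It assumes the OpenCL semantics of
bitselect(a, b, c), where each result bit is taken from b if the matching
bit of c is set and from a otherwise. The names bitselect_u32, CH_OLD,
CH_NEW, MAJ_OLD, MAJ_NEW and count_mismatches are made up for this
illustration and are not from the pwsafe kernel.

```c
#include <stdint.h>

/* Hypothetical host-side stand-in for OpenCL's bitselect(a, b, c):
   result bit = b's bit where c's bit is 1, otherwise a's bit. */
static uint32_t bitselect_u32(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & ~c) | (b & c);
}

/* The original macros (arguments parenthesized for safety). */
#define CH_OLD(x, y, z)  ((z) ^ ((x) & ((y) ^ (z))))
#define MAJ_OLD(x, y, z) (((y) & (z)) | ((x) & ((y) | (z))))

/* The proposed bitselect-based replacements. */
#define CH_NEW(x, y, z)  (bitselect_u32((z), (y), (x)))
#define MAJ_NEW(x, y, z) (bitselect_u32((y), (x), ((z) ^ (y))))

/* Ch and Maj are purely bitwise, so checking all 8 combinations of one
   (x, y, z) bit triple is enough. These three words contain every
   combination across their bit positions. Returns the number of
   functions whose old and new forms disagree. */
static int count_mismatches(void)
{
    const uint32_t x = 0xF0F0F0F0u;
    const uint32_t y = 0xCCCCCCCCu;
    const uint32_t z = 0xAAAAAAAAu;
    int bad = 0;

    if (CH_OLD(x, y, z) != CH_NEW(x, y, z))
        bad++;
    if (MAJ_OLD(x, y, z) != MAJ_NEW(x, y, z))
        bad++;
    return bad;
}
```

Calling count_mismatches() from a small test program should return 0. This
only checks the logic on the host; whether the compiler really emits a
single instruction for bitselect on a given device is something you would
have to confirm in the generated ISA.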

SHA-256 has 64 rounds per block, each with one Ch and one Maj, so you save
4 ALU ops per round, or 256 ALU ops per SHA-256 block. bitselect maps to
the BFI_INT hardware instruction. This applies to AMD hardware only, not
NVIDIA.

Hope that helps :)
