Date: Sun, 27 Jan 2013 00:35:01 +0200
From: Milen Rangelov <gat3way@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Proposed optimizations to pwsafe

Just a side note: I had a look at your OpenCL pwsafe code and there are
some obvious optimizations that can be made. Some are minor, but the most
important is the following.

You have this:

#define Ch(x, y, z) (z ^ (x & (y ^ z)))
#define Maj(x, y, z) ((y & z) | (x & (y | z)))

If you replace those with:

#define Ch(x,y,z) (bitselect(z,y,x))
#define Maj(x,y,z) (bitselect(y, x,(z^y)))

you are effectively using just 1 ALU operation per Ch instead of 3, and
2 ALU ops per Maj instead of 4. That is 4 ops saved per step, and SHA-256
has 64 steps per block operation, so you save about 256 ALU ops per
SHA-256 block.

bitselect is mapped to the hardware instruction BFI_INT. This applies to
AMD hardware only, not NVIDIA.

Hope that helps :)
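
As a sanity check, the equivalence of the two macro pairs can be verified on the host. The following is only a sketch in plain C, not code from pwsafe or John the Ripper: bitselect32 is a hypothetical stand-in that follows the OpenCL definition of bitselect(a, b, c) (take the bit from b where c is 1, otherwise from a), and the other function names are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical host-side stand-in for OpenCL's bitselect(a, b, c):
 * each result bit comes from b where the bit of c is 1, else from a. */
static uint32_t bitselect32(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & ~c) | (b & c);
}

/* The original macros, written as functions. */
static uint32_t ch_ref(uint32_t x, uint32_t y, uint32_t z)
{
    return z ^ (x & (y ^ z));
}

static uint32_t maj_ref(uint32_t x, uint32_t y, uint32_t z)
{
    return (y & z) | (x & (y | z));
}

/* The proposed bitselect forms. */
static uint32_t ch_bfi(uint32_t x, uint32_t y, uint32_t z)
{
    return bitselect32(z, y, x);
}

static uint32_t maj_bfi(uint32_t x, uint32_t y, uint32_t z)
{
    return bitselect32(y, x, z ^ y);
}

/* All four functions are purely bitwise, so checking the 8 possible
 * (x, y, z) combinations for one bit position covers every input.
 * Each combination is broadcast to all 32 bits here.
 * Returns the number of mismatches found (0 means equivalent). */
static int count_mismatches(void)
{
    int bad = 0;
    for (unsigned i = 0; i < 8; i++) {
        uint32_t x = (i & 1u) ? 0xFFFFFFFFu : 0u;
        uint32_t y = (i & 2u) ? 0xFFFFFFFFu : 0u;
        uint32_t z = (i & 4u) ? 0xFFFFFFFFu : 0u;
        if (ch_ref(x, y, z) != ch_bfi(x, y, z))
            bad++;
        if (maj_ref(x, y, z) != maj_bfi(x, y, z))
            bad++;
    }
    return bad;
}
```

Calling count_mismatches() from any small test program should report zero mismatches. This only confirms that the results agree; whether the compiler actually emits BFI_INT for bitselect depends on the device and driver.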