Date: Sun, 27 Jan 2013 03:10:37 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Proposed optimizations to pwsafe

On Sun, Jan 27, 2013 at 01:07:13AM +0200, Milen Rangelov wrote:
> #define rotate(a,b) ((a<<b)+(a>>(32-b))
>
> is faster than doing it the usual way:
>
> #define rotate(a,b) ((a<<b)|(a>>(32-b))
>
> and generated PTX is the same except for the ADD/OR thing. My theory is
> that using addition somehow utilizes the hardware instruction (the integer
> fused multiply-add one) but at least at PTX level, this is not visible.

This could be, although some MADs are visible at PTX level.

Another guess is that ADD might actually have lower latency than OR -
although it'd be weird.

Alexander
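
For reference, here is a minimal C sketch of why the two forms are interchangeable. It is not taken from the pwsafe code. The macros as quoted are missing a closing parenthesis, so the versions below balance them and parenthesize the arguments. The names ROTATE_ADD, ROTATE_OR and check_rotate_forms are made up for illustration. The sketch assumes a 32-bit unsigned operand and a shift count strictly between 0 and 32. Under those assumptions the two shifted halves never share a set bit, so ADD produces no carries and matches OR.

```c
#include <assert.h>
#include <stdint.h>

/* Balanced versions of the two macros from the quoted message.
 * Assumption: a is a 32-bit unsigned value and 0 < b < 32. */
#define ROTATE_ADD(a, b) (((a) << (b)) + ((a) >> (32 - (b))))
#define ROTATE_OR(a, b)  (((a) << (b)) | ((a) >> (32 - (b))))

/* Hypothetical helper: verify that both forms agree for every
 * valid shift count of x. */
static void check_rotate_forms(uint32_t x)
{
    for (unsigned b = 1; b < 32; b++) {
        uint32_t hi = x << b;         /* low b bits are always zero */
        uint32_t lo = x >> (32 - b);  /* only the low b bits can be set */

        assert((hi & lo) == 0);       /* no overlapping bits, so no carries */
        assert(ROTATE_ADD(x, b) == ROTATE_OR(x, b));
    }
}
```

This only shows that the results are identical. It says nothing about which form is faster on a given GPU. That depends on how the compiler lowers PTX to machine code, which is the open question above.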