
Date: Fri, 29 May 2015 10:56:09 +0300
From: Solar Designer <solar@...nwall.com>
To: Alain Espinosa <alainesp@...ta.cu>
Cc: john-dev@...ts.openwall.com
Subject: Re: bitslice SHA-256

On Fri, May 29, 2015 at 01:22:10AM -0400, Alain Espinosa wrote:
> ...I briefly experimented with merged ADDs in this md5slice.c revision
>
> I will take a look.
>
> ...add32c() is a 3-input ADD where one of the inputs is a constant
>
> I check this code searching how to reduce sum instructions count. If I
> understand it correctly you use more than 5 for one add (more than 10
> for 2, if I recall correctly you use 11).

My add32() appears to use 5 (not counting the loads and the store):

		a = *x++;
		b = *y++;
		*z++ = (p = a ^ b) ^ c;	/* sum bit */
		c = (p & c) | (a & b);	/* carry out */

But you're right - my add32c()'s code path when the constant has a 1 bit
uses 11 (with XNOR) or 12 (without).  This feels wrong, and there has
got to be a way to optimize this to 10 or fewer within the same
instruction set.  Its code path for when the current constant bit is 0
has only 7 operations, though - so this demonstrates how the addition of
a constant can be cheaper than that of a variable:

		a = *x++;
		b = *y++;
		if (c & 1) {	/* constant bit is 1 */
			*z++ = ~(a ^ b) ^ c1 ^ c2;
			c2 = (a & b & (p = c1 | c2)) | (c1 & c2 & (q = a | b));
			c1 = p | q;
		} else {	/* constant bit is 0 */
			*z++ = (q = (p = a ^ b) ^ c1) ^ c2;
			c1 = (p & c1) | (a & b);
			c2 &= q;
		}

Alexander
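
For anyone who wants to experiment with the two loop bodies above, here is a minimal self-contained sketch that wraps them in functions and compares the results with ordinary integer addition.  Everything around the loop bodies is an assumption made for illustration, not the actual md5slice.c code: the vec_t type (a plain uint32_t holding 32 lanes), the sketch_* names and signatures, the choice to shift the constant right by one bit per iteration, and the xorshift test-value generator.  The assignments to p and q are also split into separate statements, which keeps the same operation counts.

```c
#include <assert.h>
#include <stdint.h>

#define BITS 32		/* width of each value being added */
#define LANES 32	/* values processed in parallel, one per bit of vec_t */

typedef uint32_t vec_t;	/* assumed vector type: bit j of v[i] is bit i of lane j */

/* Store an ordinary integer into one lane of a bitsliced value. */
static void sketch_set_lane(vec_t v[BITS], unsigned lane, uint32_t val)
{
	assert(lane < LANES);
	for (unsigned i = 0; i < BITS; i++) {
		v[i] &= ~((vec_t)1 << lane);
		v[i] |= (vec_t)((val >> i) & 1u) << lane;
	}
}

/* Read one lane of a bitsliced value back as an ordinary integer. */
static uint32_t sketch_get_lane(const vec_t v[BITS], unsigned lane)
{
	uint32_t r = 0;
	assert(lane < LANES);
	for (unsigned i = 0; i < BITS; i++)
		r |= (uint32_t)((v[i] >> lane) & 1u) << i;
	return r;
}

/* z = x + y (mod 2^32) in every lane: 5 logic operations per bit position. */
static void sketch_add32(const vec_t *x, const vec_t *y, vec_t *z)
{
	vec_t a, b, p, c = 0;
	for (unsigned i = 0; i < BITS; i++) {
		a = *x++;
		b = *y++;
		*z++ = (p = a ^ b) ^ c;		/* sum bit */
		c = (p & c) | (a & b);		/* carry out */
	}
}

/* z = x + y + k (mod 2^32) in every lane, where k is the same constant for
 * all lanes.  c1 and c2 are two carries of equal weight into the next bit. */
static void sketch_add32c(const vec_t *x, const vec_t *y, uint32_t k, vec_t *z)
{
	vec_t a, b, p, q, c1 = 0, c2 = 0;
	for (unsigned i = 0; i < BITS; i++, k >>= 1) {
		a = *x++;
		b = *y++;
		if (k & 1) {
			/* a + b + c1 + c2 + 1: one carry if any input is set,
			 * two carries if at least three inputs are set */
			*z++ = ~(a ^ b) ^ c1 ^ c2;
			p = c1 | c2;
			q = a | b;
			c2 = (a & b & p) | (c1 & c2 & q);
			c1 = p | q;
		} else {
			/* a + b + c1 gives carry c1; adding c2 to its sum bit q
			 * gives carry c2 */
			*z++ = (q = (p = a ^ b) ^ c1) ^ c2;
			c1 = (p & c1) | (a & b);
			c2 &= q;
		}
	}
}

/* xorshift32, used only to produce repeatable test values. */
static uint32_t sketch_rand(uint32_t *state)
{
	uint32_t s = *state;
	s ^= s << 13;
	s ^= s >> 17;
	s ^= s << 5;
	return *state = s;
}

/* Count lanes where either adder disagrees with ordinary addition. */
static unsigned sketch_count_mismatches(uint32_t k)
{
	vec_t x[BITS] = {0}, y[BITS] = {0}, z[BITS], zc[BITS];
	uint32_t xs[LANES], ys[LANES], seed = 2463534242u;
	unsigned bad = 0;

	for (unsigned lane = 0; lane < LANES; lane++) {
		/* lane 0 uses all-ones inputs to exercise the longest carries */
		xs[lane] = lane ? sketch_rand(&seed) : 0xffffffffu;
		ys[lane] = lane ? sketch_rand(&seed) : 0xffffffffu;
		sketch_set_lane(x, lane, xs[lane]);
		sketch_set_lane(y, lane, ys[lane]);
	}
	sketch_add32(x, y, z);
	sketch_add32c(x, y, k, zc);
	for (unsigned lane = 0; lane < LANES; lane++) {
		bad += sketch_get_lane(z, lane) != (uint32_t)(xs[lane] + ys[lane]);
		bad += sketch_get_lane(zc, lane) != (uint32_t)(xs[lane] + ys[lane] + k);
	}
	return bad;
}
```

Calling sketch_count_mismatches() from a main() with a few constants (for example 0, 0xffffffff, and a typical round constant) should return 0 if the loop bodies are transcribed correctly.  Branching on the constant bit costs nothing per lane here because the constant is shared by all lanes, which is why a per-bit specialization of the 3-input ADD is possible at all.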