Date: Tue, 15 Mar 2011 18:21:59 -0500
From: "jfoug" <jfoug@....net>
To: <john-dev@...ts.openwall.com>
Subject: RE: Speedup of x86 .S build of raw-sha1 format

This is saved_key:

    unsigned char saved_key[80*4*MMX_COEF];

>From: RB [mailto:aoz.syn@...il.com]
>[snip]
>
>> + //memset(saved_key, 0, sizeof(saved_key));
>> + memset(saved_key, 0, 64*MMX_COEF);
>
>For the memory-management poor but intellectually curious on the list,
>can you explain why? Is it just because we avoid sizeof() in what I
>presume is a tight loop? What happens if saved_key is larger than
>64*MMX_COEF, or can that even happen?

The first 64 bytes (times MMX_COEF) hold the 'key'. The later bytes are
used as feedback bytes (i.e. temp vars). It appears those are written
before they are read, which is why I think clearing only the first 64
bytes (per MMX_COEF lane) is adequate.

The way the SSE code for SHA-1 works is that after the first 16
'rounds', it switches to a different subround. This subround 'feeds' in
some of the data from the prior 64 bytes and writes the stirred value
at the end of the array. So the working set of feedback data always
lies within a prior 64-byte window. Thus, for SHA-1 (as implemented),
the input buffer is 16 int32's, but the full schedule requires 16+64 of
these int32's. At least that is what I 'believe' is happening. The
first 16 int32's are the only read-only input; the last 64 are written
first, then read.

Jim.
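(For the curious: the write-before-read behavior described above can be
sketched in plain C. This is a minimal illustration of SHA-1's message
expansion, not the actual SSE/.S code from the patch; the function names
here are mine. The point is that W[16..79] are always assigned from
earlier entries before any round reads them, so they never need to be
zeroed -- only the 16-int32 (64-byte) input window does.)

```c
#include <stdint.h>
#include <string.h>

/* rotate-left, as used throughout SHA-1 */
static uint32_t rol32(uint32_t x, int n) { return (x << n) | (x >> (32 - n)); }

/* Expand a 64-byte input block into the full 80-word schedule.
   Only W[0..15] come from the input buffer (read-only input);
   W[16..79] are WRITTEN here, from prior W entries, before any
   later round ever READS them -- so leaving that tail of the
   buffer un-memset'd is safe. */
void sha1_expand(const uint32_t block[16], uint32_t W[80])
{
    memcpy(W, block, 16 * sizeof(uint32_t));
    for (int t = 16; t < 80; t++)
        W[t] = rol32(W[t - 3] ^ W[t - 8] ^ W[t - 14] ^ W[t - 16], 1);
}
```

Each new word depends only on words 3, 8, 14, and 16 positions back, i.e.
always within a prior 64-byte window, matching the "feedback" description
above.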