Date: Mon, 18 Mar 2013 10:40:16 +0400 From: Solar Designer <solar@...nwall.com> To: crypt-dev@...ts.openwall.com Subject: Re: scrypt TMTO defeater On Sun, Mar 17, 2013 at 08:03:21AM +0400, Solar Designer wrote: > Unfortunately, it turns out that BlockMix's shuffling(*) gets in the way > of efficient implementations, and _possibly_ the inefficiency mostly hurts > cleaner CPU code, as opposed to hacks and ASICs. Specifically, when the > V element we write into is or might be the same as V_j, in a clean > implementation we have to postpone those writes until the BlockMix > completes, and this means re-reading the same values from memory > (hopefully, from L1 cache) rather than simply storing register values > into two locations at once (such as X and V_j). The above applies when trying to store (or XOR) BlockMix output into a V element, but that in itself is problematic: if V_j on the very next iteration of the SMix loop is the same as the element we just wrote into, the XORs may cancel each other, in some cases resulting in really nasty consequences. The previous TMTO defeater patches I posted in the last 2 days suffer from this problem. Oops. I hope this is taken for granted, but to be on the safe side let me say that none of these early/experimental TMTO defeater patches are meant for actual use. Now, I could fix this by adding even more complexity, such as to avoid repeating j's (detect repeats and XOR with 1 if so, which may be done in a branch-less fashion). However, I chose to rethink the whole approach instead, and realized that Anthony's original example did not have this problem due to storage of the block value from just prior to a BlockMix. A reason why I did not try implementing Anthony's original example right away was that it was incompatible with an optimization I had made in crypto_scrypt-sse.c, where the XORs were re-ordered. Now I went for the extra implementation complexity in this version of the code, implementing a separate BlockMix function with the original order of XORs so that the right value could be written back into V_j. This appears to work well, and is fast. -ref and -nosse implement Anthony's original example without such extra complexity. New revision is attached. Comments are welcome. New MD5 of output of "tests" with defeat_tmto = 1: 2843045848a204727c8dd7677bd6b8e3 Because of all the partial bypasses of this kind of TMTO defeater (that I've documented in separate postings), I am likely to also proceed to implement a defeater by means of changes to the first SMix loop. Then the question will be whether to keep the defeater in the second loop as well or not. Alexander View attachment "escrypt-defeat-tmto-30.diff" of type "text/plain" (17136 bytes) Download attachment "escrypt-0.0.30.tar.gz" of type "application/x-gzip" (10529 bytes)
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.