
Date: Mon, 25 Nov 2013 15:05:46 +0400
From: Solar Designer <solar@...nwall.com>
To: cryptdev@...ts.openwall.com
Subject: increasing scrypt SMix 1st loop AT cost
Hi,
Attached is a program that simulates scrypt's SMix with its original two
loops, as well as 3 possible revisions of the first loop (thus a total
of four code versions). The revisions are less friendly to TMTO, so
should only be used when easy and efficient TMTO is not desired (for the
defender's own use).
loop1_pow2 increases SMix's total areatime cost for the attacker to 4/3
of original (that is, by one third). In practice, the increase is
likely higher since this also makes TMTO less efficient (and as we've
discussed before, an attacker with ASIC would actually want to use TMTO
with classic scrypt). The revised first loop's AT cost is calculated as
(1/2)^2 + (1/4)^2 + (1/8)^2 ..., which for large N approaches 1/3.
loop1_mod increases SMix's total areatime cost for the attacker to 1.5
of original, similarly without consideration of its effect on TMTO yet.
However, it has certain maybedrawbacks compared to loop1_pow2: the
distribution of j's is farther away from uniform (it is uniform for any
given i, but not across all i's) and fewer different V_j's are being
accessed total. These are in addition to drawbacks of the integer
division itself: depends on chosen size of X (beyond bits in N1),
somewhat slow, not constant time, slightly nonuniform when size of X is
not a lot larger than log2(N).
loop1_wrap is similar to loop1_mod, but it replaces the modulo division
with a custom function I came up with, which has both pros and cons.
Currently my preference is loop1_pow2 (this is what I have in our
scryptderived development tree).
Program output:
classic
B' = 292735ce775aaef5
t_cost = 2097152
at_cost1 = 1073741824 at_cost2 = 549755813888 at_cost = 550829555712
at_cost1/t = 512 at_cost2/t = 262144 at_cost/t = 262656
total: nhits=663284 (63.26%) mhits=9
loop1_pow2
loop1: nhits=595349 (56.78%) mhits=10
B' = d9247fb08f9e807c
t_cost = 2097152
at_cost1 = 183251937963 at_cost2 = 549755813888 at_cost = 733007751851
at_cost1/t = 87381 at_cost2/t = 262144 at_cost/t = 349525
total: nhits=882412 (84.15%) mhits=13
loop1_wrap
loop1: nhits=456640 (43.55%) mhits=24
B' = b7283d0ff61c6599
t_cost = 2097152
at_cost1 = 274877644800 at_cost2 = 549755813888 at_cost = 824633458688
at_cost1/t = 131072 at_cost2/t = 262144 at_cost/t = 393216
total: nhits=830413 (79.19%) mhits=24
loop1_mod
loop1: nhits=524257 (50.00%) mhits=21
B' = 2365d32db26bfc21
t_cost = 2097152
at_cost1 = 274877644800 at_cost2 = 549755813888 at_cost = 824633458688
at_cost1/t = 131072 at_cost2/t = 262144 at_cost/t = 393216
total: nhits=855632 (81.60%) mhits=22
For nhits, higher is better (more V elements were read from). For
mhits, lower is better (suggests we're closer to uniform distribution,
although indeed there's no guarantee from just looking at this value).
I've also tried compressing sequences of j's with gzip and comparing the
compressed sizes, I actually looked at lists of j's sorted by hit count,
and I similarly tested many other possible definitions of wrap().
Alexander
View attachment "simat.c" of type "text/xc" (4319 bytes)
Powered by blists  more mailing lists