Date: Fri, 11 Apr 2014 12:26:05 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Handling of hashes with different iteration counts

Frank,

On Fri, Apr 11, 2014 at 12:02:24AM +0200, Frank Dittrich wrote:
> So far, most formats just had a tunable parameter "iteration count".
> Now I am dealing with django-scrypt.
> It has the tunable cost parameters known from scrypt:
> N - the CPU cost, must be a power of 2, defaults to 2^15
> r - the memory cost, defaults to 8

It's wrong to call scrypt's N "the CPU cost" and r "the memory cost".
Either of them affects both the CPU time cost and the memory cost, in
roughly the same way.  The difference between them is in how they affect
the memory access pattern: higher r increases the size of individual
accesses (more cache lines are accessed sequentially if r is higher),
whereas higher N increases the number of individual random accesses.

> p - the parallelization parameter, defaults to 1

Right.

> If p is 1, then N*r = m_cost = N*r*p = t_cost. This doesn't seem to be
> very useful.

Yet it makes sense.

> For now (not committed yet, and thus no pull request), I decided to
> report N*p as t_cost and r*p as m_cost.
> (As long as p == 1, just N and r are reported; p > 1 influences both
> t_cost and m_cost. Does that make sense?)

No, this makes no sense to me.

With scrypt, the only way to tune t_cost without affecting m_cost is
via p.  That's the reality.  What you're doing is reusing my suggested
t_cost and m_cost names to mean something totally different.  That's
confusing.  Please don't do that!

BTW, in yescrypt, I've introduced t - a way to tune t_cost without
affecting either m_cost or p.  So with yescrypt you'd have:

m_cost = N * r
t_cost = N * r * f(t) * ((flags & YESCRYPT_PARALLEL_SMIX) ? 1 : p)

whereas with classic scrypt it is:

m_cost = N * r
t_cost = N * r * p

(This is assuming that we're not making use of p as parallelism, since
we have enough parallelism from multiple candidate passwords.)

Typically, we'll see p=1 and f(t)=1, though.  In other words, with
typical uses the defender maximizes the memory usage for the time
allotted to compute a hash, because this maximizes the area-time.  So
typically m_cost and t_cost calculated as above will be equal (or
numerically proportional, if we introduce scaling to some actual units
of memory and time, respectively).

I understand we might want to group hashes not only by their full
m_cost and t_cost, but also e.g. by N and r individually, as the
differences in memory access pattern may turn into speed differences
(this is why these settings are separately tunable, after all).
However, this simply does not fit into the m_cost and t_cost model.
If you want to include such support, you need to include it explicitly:
as ability to group hashes by individual hash type specific parameters.
For yescrypt, you would then also need to support grouping separately
by p, t, and flags.  Do we really want to introduce that support
proactively?

Thanks,

Alexander
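For reference, the two cost formulas above can be sketched as a small
Python helper.  This is only an illustration of the arithmetic in the
message, not yescrypt's actual API: the flag bit value chosen for
YESCRYPT_PARALLEL_SMIX is hypothetical, and f(t) is left abstract (passed
in pre-computed as f_t, defaulting to the typical f(t)=1):

```python
# Sketch of the m_cost/t_cost formulas discussed above.
# The bit value of YESCRYPT_PARALLEL_SMIX is hypothetical, for illustration.
YESCRYPT_PARALLEL_SMIX = 1 << 1


def scrypt_costs(N, r, p):
    """Classic scrypt: m_cost = N * r, t_cost = N * r * p."""
    return N * r, N * r * p


def yescrypt_costs(N, r, p, flags, f_t=1):
    """yescrypt: t is folded in via f(t); with PARALLEL_SMIX set, p does
    not multiply t_cost (the parallel sMix instances share the time
    budget rather than extending it)."""
    m_cost = N * r
    t_cost = N * r * f_t * (1 if flags & YESCRYPT_PARALLEL_SMIX else p)
    return m_cost, t_cost


# django-scrypt defaults quoted above: N = 2**15, r = 8, p = 1,
# so m_cost == t_cost, matching Frank's N*r = m_cost = N*r*p = t_cost case.
print(scrypt_costs(2**15, 8, 1))
```

Note that only p (for scrypt) or t/flags (for yescrypt) move t_cost away
from m_cost; reporting N*p and r*p instead would break this relationship.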