```Date: Mon, 29 Apr 2013 11:08:29 -0400
From: Matt Weir <cweir@...edu>
To: crypt-dev@...ts.openwall.com
Subject: Re: Representing the crack resistance of a password.

Jeffrey,
First of all, thanks for taking this conversation here since it'll be
good to get everyone else's input as well! My views on what make a useful
metric for the resistance of a password creation policy to attack have been
evolving over the years. I used to think that "Guessing Entropy" as brought
up by Alexander was the way to go, but I'm not so sure anymore. My main
issue with it, and with the metric you proposed, is I suspect the resulting
value won't provide actionable info to the defender, (Note: that's also my
objection to using Shannon Entropy).

Let me back up. From the way I understand your metric, it attempts to
measure the expected number of guesses it takes to crack 50% of user
passwords, (or to put it another way, to achieve a 50% success rate
cracking one password). The usual definition of "Guessing Entropy" is the
the average number of guesses required to crack a password, (which is a
slightly different metric if the distribution is not even). I'd argue that
either one of those metrics isn't what most defenders really want to know.

To put it another way, let's say your method said that for a given password
policy P, C(X,k) was equal to 32. Aka it takes around 2 billion guesses to
crack 50% of the passwords. One question many defenders want to know is how
does this password policy fare during an online attack when the attacker
can only make around 1000 guesses? While it does give an upper bound, (aka
an attacker will have less than a 50% success rate), that's not all that
useful since a 50% success rate is higher than the amount of risk most
defenders are willing to accept. Also the distribution is not even for
human generated passwords so the defender can't simply estimate the rate of
success as if the password policy created passwords equivalent to a random
keys 32 bits long.

I suspect a better metric is to model the expected success rate of an
attacker given K guesses. I apologize for mangling the equation since I'm
sending it as an e-mail, but you'd want something along the lines of:

X = (k=1 to k=Max Allowed Guesses) Sum( Pr(X=k) )

Where Pr(X=k) is the probability of an attacker guessing a password on
guess number k.

This would give you a metric that would allow you to say, "If an attacker
is allowed to make 1k guesses, their chances of success is X for password
creation policy P."

The advantage of this is that it also allows you to estimate risk for both
online and offline attacks where the number of allowed guesses can be
drastically different. You can then change variables until the resulting
value represents the amount of risk you are willing to accept. Aka let's
say we calculate X and the chance of an attacker succeeding is 10%, but
we're only willing to accept 1% risk. We then have a choice to either make
our creation policy stronger or limit the number of guesses an attacker can
make.

This is kind of what NIST tried to do in SP800-63, but they got into issues
trying to fit Shannon Entropy into that model.

The hard part of course is estimating Pr(X=k).... One way to go about doing
that is to model a password cracking session against real passwords. The
problem is, it's hard to estimate the effectiveness of password policies
that we haven't seen. For example, there hasn't been a major disclosure of

At the end of the day though, this is always going to be a problem with any
metric we choose. We need to go into this willing to revise our findings as

Matt

On Mon, Apr 29, 2013 at 8:26 AM, Solar Designer <solar@...nwall.com> wrote:

> On Mon, Apr 29, 2013 at 01:05:30AM -0500, Jeffrey Goldberg wrote:
> > In a discussion on Twitter, Matt Weir has persuaded me that talking
> about the Shannon Entropy, H, of a password generation system doesn't get
> what we want. H doesn't capture appropriate facts about the distribution of
>
> That's correct.
>
> > The far more meaningful notion would be "what is the prob of an attacker
> cracking a pw under that policy after N guesses". This, as I now
> understand, is not in general computable from H. (It is computable from H
> when the distribution is uniform, but the whole point is that humans do not
> select passwords uniformly from some set.)
> >
> > I asked how we should characterize, or even name, this notion. I tossed
> out
> >
> >   C(X, k) = 2 log_2 G(0.5, X, k)
> >
> > where k is the key or password, X is this distribution, and G(p, X, k)
> is is the number of Guesses needed to hit k in X with probability p.
>
> Regarding your G() above, see:
>
>
> http://www.lysator.liu.se/~jc/mthesis/4_Entropy.html#SECTION00430000000000000000
>
> for a formal definition of "Guessing entropy" and some discussion.
>
> Alexander
>

Content of type "text/html" skipped
```