|
|
Message-ID: <bb947076-61f4-b613-0337-aa39cba0316c@bestmx.net>
Date: Sat, 9 Jul 2016 18:00:46 +0200
From: "e@...tmx.net" <e@...tmx.net>
To: passwords@...ts.openwall.com
Subject: Don't Scratch Your Entropy
I have a strong conviction that 99% of "security experts" do not know
the definition of entropy. This conviction may well seem wildly
deranged to you, unless you already know the definition in question. So,
let's begin with the definition, by the book:
H = -sum(p_i * log(p_i))
This is a function of the probability vector P = {..., p_i, ...} that
represents a distribution of a random variable. Entropy is a
characteristic of a distribution of a random variable. No more and no less.
Let us find the entropy of your password. Your password's distribution
vector is {1}, therefore your password's entropy is:
H = -1 * log(1) = 0
Your password's entropy is ZERO. Try log(1) in different bases on
different computers if you are unsure.
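To make this concrete, here is a minimal sketch in Python (log base 2
is assumed so entropy is measured in bits; the base only scales the
value, never the zero):

```python
import math

def entropy(dist):
    """Shannon entropy H = -sum(p_i * log2(p_i)) of a probability vector."""
    return sum(-p * math.log2(p) for p in dist if p > 0)

# A password that is already chosen has the degenerate distribution {1}.
print(entropy([1.0]))  # 0.0
```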
A sophisticated reader may ask: "What if we apply entropy to the
password creation procedure?" That is doable in a seemingly reasonable way.
We can model any password creation procedure as a random choice from a
pool of candidate passwords, then characterize the password distribution
over this pool with the entropy. The resulting number will tell us how
much information our procedure represents. So what? Is this number of
any use in the context of "password security"?
Security experts usually jump in here and claim that this number
represents the strength of the produced password. For the sake of
argument, let's accept this claim and construct a password creation
procedure as follows:
the password pool is {"123", "password",
"gtfr3467ujhbvcddgy6r5ddsefvvs", "###"};
we toss two coins and pick one of these four according to the coin toss
outcome.
Given that the coin tosses produce uniformly distributed outcomes, the
entropy of this procedure (in bits, i.e. log base 2) is:
H1 = -(1/4) * log2(1/4) * 4 = 2
Now, according to mainstream computer "science" (as dictated by the
NIST recommendations), we must label all our passwords with this entropy
value:
"123" has the entropy based strength 2
"password" has the entropy based strength 2
"gtfr3467ujhbvcddgy6r5ddsefvvs" has the entropy based strength 2
"###" has the entropy based strength 2.
Looks somewhat counterintuitive, and not at all like what you are used
to hearing when "entropy" is pronounced by a respectable "expert" with a
straight face.
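The two-coin procedure above can be sketched as follows (the pool and
the uniform 1/4 probabilities are exactly those stated; the `secrets`
module stands in for the fair coins):

```python
import math
import secrets

def entropy(dist):
    """Shannon entropy in bits: H = -sum(p_i * log2(p_i))."""
    return sum(-p * math.log2(p) for p in dist if p > 0)

pool = ["123", "password", "gtfr3467ujhbvcddgy6r5ddsefvvs", "###"]

# Two fair coin tosses select one of the four candidates uniformly.
index = 2 * secrets.randbelow(2) + secrets.randbelow(2)
password = pool[index]

# The entropy belongs to the procedure, not to the chosen string:
print(entropy([1 / 4] * 4))  # 2.0, regardless of which password came out
```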
Furthermore, we can define another password creation procedure:
toss one coin and pick from the pool
{"123","gtfr3467ujhbvcddgy6r5ddsefvvs"}.
The entropy of this procedure is half that of the previous one: 1.
Therefore:
the password "123" has the entropy based strength 1.
The very same password "123" that also has the strength 2. A password
has two different strengths simultaneously. If we understand the
"strength" as the likelihood of being guessed by an attacker, then a
single password cannot have two different values, because the password
alone is the input argument of the hypothetical attack, not the
password creation procedure.
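The contradiction is easy to exhibit numerically: both procedures
described above can emit the string "123", yet they assign it different
entropy values:

```python
import math

def entropy(dist):
    """Shannon entropy in bits: H = -sum(p_i * log2(p_i))."""
    return sum(-p * math.log2(p) for p in dist if p > 0)

# Procedure 1: two coins over a four-password pool.
h1 = entropy([1 / 4] * 4)
# Procedure 2: one coin over a two-password pool.
h2 = entropy([1 / 2] * 2)

# "123" is a possible output of both procedures, so its "entropy based
# strength" would be 2 or 1 depending on which procedure we imagine.
print(h1, h2)  # 2.0 1.0
```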
Thus, accepting the premise "the entropy of the password creation
procedure characterizes the produced password", we end up with a
contradiction. Entropy is thereby demonstrated not to be a function of a
password. In a slightly less insane world, however, I could have skipped
this lengthy demonstration altogether. Entropy is defined as a function
of a random distribution -- who would have thought that it is therefore
NOT a function of anything else!
But I am not a champion of taking the longer route to obvious
conclusions. Matt Weir has conducted a meticulous experiment with
leaked passwords, leading to the statement: "entropy based password
strength measures do not provide any actionable information to the
defender", and also: "there is no way to convert the notion of Shannon
entropy into the guessing entropy of password creation policies". In
other words, he gave us experimental evidence that entropy is irrelevant
to the password strength problem. Of course it is irrelevant! This
irrelevance is plainly written in the definition of entropy. Matt, you
could have just read the definition and said: "corollary, dear
'experts', don't scratch your entropy". Nevertheless, these experimental
results are of great value for humanity, and I am glad we have them; the
more evidence the better. In this world of imbeciles, even the most
obvious facts require tons of "proofs", since the "experts" do not get
along with mathematical logic very well.
Still, there is more to the topic! Not only is the entropy of an
accurate password creation model irrelevant to the problem of password
strength, but the model itself is impossible in real-life use cases.
What distribution are you going to apply to human-created passwords,
given that (a) humans are incapable of randomization and (b) the pool of
passwords they choose from is not accessible to us, not even by
vivisection of the brain? This fact makes entropy even worse than
irrelevant; it makes entropy ARBITRARY -- whatever distribution we
assume for a human-created password is inevitably baseless, arbitrary
garbage.
Let's recap:
The entropy is a function of a distribution of a random variable.
Corollary:
(a) your password's entropy is 0
(b) every "security expert" pronouncing "entropy" without defining the
distribution, or at the very least the pool of candidate passwords, is a
brain dead buffoon.