```Date: Thu, 13 Dec 2012 23:17:45 -0500
From: "Matt Weir" <cweir@...edu>
To: <crypt-dev@...ts.openwall.com>
Subject: RE: Intentionally Increasing Collisions in Password Hashing Algorithms

Hey Steven,
You've hit on the real question here. Ignoring all the other
potential problems, can these collisions cause enough problems for attackers
that this scheme is potentially worthwhile? I wish I had a good answer one
way or the other but I don't yet. I was just starting to work through what
even constitutes valid assumptions when I posted to Twitter, became involved
in a discussion with Solar, and then decided to take the conversation here.
So that's a longer way of saying that I really hope I'm not wasting
everyone's time.

The central factor seems to be: "What is the maximum acceptable risk due to
collisions during online guessing attacks?". That question drives everything
else. On one extreme, if you can't reduce your password hash beyond 127 bits
to stay within a level of acceptable risk, sure you can lop off that last
bit in an MD5 hash but it's not going to impact the attacker at all. Going
completely in the opposite direction, you can have a one bit hash. That
would cause significant difficulty for an attacker when it comes to
attacking password re-use but online attack become comically easy.

This further breaks down into several other questions. For example: "How
many guesses can an attacker make?" and "What's an edge case for risk that's
more realistic than allowing a 1 bit hashes?" Or to put it the NIST way:

Chance of success = Number of Allowed Guesses / 2^H(x) where H(x) is
hash-length in bits.

So we *just* need to find a realistic chance of success as well as the
number of allowed/expected guesses and the rest is easy!

>> At 15-bits, an attacker would need about 16k guesses (on average) to get
into an account using
>> random guesses.  I think that's pretty unacceptable, even for a low
security site.

I'm not saying you're wrong, but consider this for a second. In the RockYou
list, 0.89% of all users choose the password '123456'. This means if a site
doesn't implement a blacklist, an attacker can expect on average to crack a
random account on a low value site for every 113 users they try one guess
on. That's an acceptable amount of risk for many sites. The big difference
vs. collisions of course is the user's ability to influence their own
security. The user who chooses '123456' is accepting the risk that their
account may be cracked. Having a user who chooses 'Re3l1yStr0^gPW' and then
has their account compromised by a collision may not be as acceptable. Also
there is the scale involved. Having 1% to 10% of your user base compromised
due to weak passwords might be ok, but having 100% compromised due to
collisions is a different story.

I have to admit I'm floundering even on what an expected number of allowed
guesses should be. As a quick test I ran THC Hydra against my router's https
admin page and was able to make around 3000 guesses a minute after messing
with the number of tasks to run. I'm sure against a real website and having
multiple clients, (plus more experience using the proper tools), an attacker
can do much, much better. On the other hand there was no rate limiting or
penalty for making a large number of guesses. This is where I need to search
some more to find what the typical volume of online attacks admins see in
real life. Then there's the basic question about how much is it worth to an
attacker to be able to break into an account? Is your threat model some
experienced cracker with a botnet behind them, or your friend who wants to

>>With a very high probability, every one of those 200 passwords is a
correct guess (not just a collision).

That's a very good point. If a user chooses a weak password then the number
of collisions isn't going to help them much. The attacker is going to guess
their password early on before too many collisions are generated. Increasing
collisions may only help users who select passwords that do require
millions/billions of guesses to crack. This kind of goes back to the
original issue of it would be nice if all sites just used strong hashes like
bcrypt/scrypt so we wouldn't even have to muck around with crazy ideas like
this.

In conclusion, I'm not disagreeing with anything you said. I'm just
struggling with formally defining what constitutes unacceptable security
when it comes to resistance against online cracking attacks.

Matt

```

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.