crypt-dev - Intentionally Increasing Collisions in Password Hashing Algorithms

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJ9ii1GG0QJe8i4vdSEdD5X0HMAM-gNn+X2bnWeZg65zVHS=aw@mail.gmail.com>
Date: Wed, 12 Dec 2012 18:03:12 -0500
From: Matt Weir <cweir@...edu>
To: crypt-dev@...ts.openwall.com
Subject: Intentionally Increasing Collisions in Password Hashing Algorithms

So this is a continuation of a rather long Twitter thread between
Solar, @DefuseSec and me, (apparently Twitter is not IRC), talking
about increasing collisions in certain password hashing algorithms as
a defensive measure. As a quick disclaimer, this very well might be a
horrible idea, (I know when people first hear about it that’s their
initial reaction). That being said, it is something that I feel is at
least worth looking into. Also i'd like to point people to
@DefuseSec's blogpost on the subject:

https://defuse.ca/blog/2012/12/increasing-collisions-in-password-hashing/

Quick Definitions:
Offline Attack: The attacker has access to the password hash. They can
make as many guesses as their resources allow them to. An example of
this would be using John the Ripper/HashCat to crack BCrypt hashes.
Online Attack: The attacker does not have access to the password hash.
The defender can limit the number/rate of guesses the attacker can
make. An example of this would be an attacker using THC Hydra to
submit password guesses to a web login page.

Background/Scope:
Just to get this out of the way at the start. Increasing collisions is
in fact a horrible idea if your goal is to defend against offline
attacks. Under no circumstances should this be used for encryption
applications.

So taking a step back, there are certain fundamental truths when it
comes to the current state of password security. Users will choose
weak passwords; users will reuse passwords; and many sites will not
implement memory hard or computationally hard password hashing
algorithms. Now some users do choose strong passwords, many users
avoid reusing passwords for important sites, and some sites do use
strong hashing functions. But there’s enough examples of that not
being the case I don’t think I have to spend too much time saying this
is a major problem. If it wasn't a problem then tools like JtR and
Hashcat would be useless and no-one would worry when LinkedIn’s
password hashes were stolen ;p

Now there’s certainly good tactics to combat this problem. Two-factor
authentication, password vaults like Keepass, hashing algorithms like
BCrypt and SCrypt, user training, etc. That being said, there’s
reasons why we haven’t solved the password problem yet. Many of these
techniques like password vaults are only slowly gaining user
acceptance. On the other hand blaming users for picking weak passwords
has gained high acceptance but that doesn't seem to be working very
well ;p Strong hashing algorithms certainly can scale up to websites
that support millions of users, (Twitter uses BCrypt). But going off
Solar and Simon Marechal’s slides from the Password^12 conference when
talking about phpass, “The portable hashes are very simple, which was
key to phpass’ acceptance. More elaborate portable hashes would likely
not be accepted; This may be something to try for a next generation
phpass now that the foot is in the door”. My interpretation of that,
(which is also based on my personal experiences), is there is going to
be a lot of resistance to widely deploying computationally expensive
hashing algorithms. Here's the link to their talk btw:

http://www.openwall.com/presentations/Passwords12-The-Future-Of-Hashing/

This brings us to password hash collisions. Ideally we could just
focus on making password hashes as expensive as possible to compute
while training users to choose strong/unique passwords to make
cracking cost prohibitive. Since there is a lot of pushback on this,
I've been looking at defensive techniques that might have an
asymmetric cost for an attacker vs. the defender. A good example of
such a technique is password salts. They have almost no impact on the
defender, but have a huge impact on an attacker who is targeting
multiple users.

Now one issue is that users have passwords for many websites where
their account has little value to an attacker, but the users then
reuse their passwords for sites that do contain valuable info.
Attackers exploit this by breaking into low value sites, stealing the
password hashes, (hopefully they are hashed…), cracking them and then
attempting to use the stolen credentials against high value sites.
Basically they are taking advantage of the fact that offline attacks
against weak hashes are fairly cheap while online attacks can be
comparatively quite expensive, (you have to deal with rate limiting,
IP blacklists, captchas, etc). Now online attacks are certainly
popular as well, (just look into all the SSH bruteforcing going on),
but targeting password reuse is popular for a reason. So it would be
nice if we could raise the cost of checking if a cracked password is
valid on a different website, (basically adding an “online” component
to an “offline” cracking attack).

That’s why I’m looking into intentionally increasing collisions in
password hashes for low value websites/accounts. If we can insert some
level of uncertainty into the hash cracking process, (does this guess
really match the plaintext password the user selected?), it can raise
the cost for an attacker when it comes to attacking other sites. For
example, now attackers will try a username+password combo along with a
few variations, (such as incrementing numbers at the end of the
password), but if that doesn't work they assume the user didn't reuse
their password and they move on. The best description I've seen of
this based on real world data is:

https://www.guildwars2.com/en/news/mike-obrien-on-account-security/

With collisions, the attacker doesn't know if the password they
cracked is the user’s real password. This means when they attempt to
reuse it and fail the attacker has a couple of choices:
1)	Proceed as normal, (depending on the probability of a collision,
the attacker’s first match still has a good chance of being the user’s
password). The advantage for a defender is, A) If the attacker did
crack the user’s password then it would have been cracked regardless
B) If it is a collision the attacker will fail to access the user’s
account even if the user reused their password.
2)	If the reused credentials don’t allow access, continue their
cracking session until they find another match and then try the new
credentials, (they can queue up possible passwords before checking
them). Repeat until they find the correct password, or they run out of
time, (they decide to stop). Not only is there the additional cost of
checking collisions, but if a user did not reuse their password the
attacker doesn't know that. Aka the attacker may waste time
cracking/checking passwords long beyond when they would normally stop
if there weren't collisions.

One other advantage with password hash collisions is they provide
deniability of the plaintext value. If someone’s password is
embarrassing, such as “I hate my boss”, this provides them the ability
to claim that their password was really something else if their
password hash is disclosed.

Now unlike salts, there are downsides to the defender for this
approach. The first is that it increases a user’s vulnerability to
online guessing attacks. AKA, an attacker’s guess might collide with
the user’s password allowing the attacker to log in. That’s not a
minor concern and I’ll come back to that later. The second problem is
that for this approach to have any impact it will make it trivial for
an attacker to find a collision when performing an offline attack.
Basically if an attacker has access to the hash they also have access
to that user’s account. That’s not something to discount and it was
the subject of a lot of the Tweets between Solar and myself earlier ;p

Before getting back to those trade-offs I’d like to describe some
ideas about how to increase hash collisions in a what I hope is a
controlled fashion.  For example, there are many different ways to
increase collisions in popular password hashing functions. The key
then is to select a method that does not leak additional info about
the plaintext password while at the same time allowing the defender to
estimate the number of collisions that are likely to occur.

This rules out increasing collisions by mangling the plaintext
password before hashing. An unintended example of this approach can be
seen in old DES based Crypt(3) hashes which truncate user passwords
after 8 characters. While this certainly increases collisions, (for
example ‘password’ and ‘password123’ would share the same hash), it
also allows an attacker to target only the first 8 characters of the
password instead of the entire password. While there are some
advantages to this approach, (the attacker doesn't know whether to try
‘password’ or ‘password123’ when reusing the credentials), those
advantages are more than outweighed by the fact that an attacker can
learn about the beginning portion of a password without having to
guess the entire password.

These requirements also disqualifies modifying the internal operation
of popular hashing algorithms, or building new collision prone hashing
algorithms from scratch. Beyond the fact that such an approach is a
bad idea in general, such changes or new hashing algorithms may
unintentionally leak information about the plaintext password. In
addition, it is unlikely without a detailed study of the new
algorithms that the number of collisions can be accurately predicted.
This is a longer way of saying that while it is theoretically possible
a new collision prone hashing algorithm could be developed, the
chances of it going catastrophically wrong are not trivial and there
is no reason to do so in the first place.

Therefore the approach I’m currently investigating is to create
collisions by truncating the results of existing password hashing
algorithms. First of all this approach does not leak any additional
information about the underlying password in an offline attack. I’m
working on a formal proof to show this but the colloquial example is
if this approach did leak additional info, then attackers would
already be doing it. Note, I’m specifically referencing offline
attacks. In an online attack it does leak data since it can let the
attacker know about collisions if they can guess them, (and as a side
effect log in which might be a bigger problem…). Still the knowledge
gained about the underlying plaintext password learned based on the
collisions would probably be limited if a standard/modern hashing
algorithm was used and the password salt was large and remained
unknown to the attacker.

As far as being able to estimate the collision frequency goes, this is
admittedly a trickier prospect. This is because it is dependent on the
underlying hash function being used. The closer in behavior the
underlying hash function is to a random oracle, then the easier it is
to calculate the frequency of collisions. While no hash function
exists that is a perfect random oracle, several are close enough in
practice that intelligent guesses might be able to be made as to the
probability that random collisions will occur, (yes I just suggested
we make the cow spherical…)

There are several other advantages to increasing collisions by
truncating existing hashes. First, it should cause no additional
performance impact compared to existing implementations. In fact it
might trivially make authentication attempts quicker as only the first
n bits need be compared between the hash from a user’s password
submission and the saved hash on disk, (hopefully that speedup isn't
noticeable or you need to be using a stronger hash!) Second, this
should limit any additional side channel leaks compared to the
existing hashing setup as the added work, (the truncation), is done on
a fixed length hash for a fixed length value for every single password
hash, regardless of the underlying plaintext. Finally, and perhaps
most importantly, it is easy to upgrade to this from existing
deployments. You simply take whatever hashing algorithm is being used
and truncate it to the desired length. Since this is not dependent on
the underlying plaintext password, site administrators don’t have to
deal with two sets of user hashes, one for users who have logged in
since the upgrade, and one for users who haven’t, (note if you want to
avoid collisions with a blacklist of banned passwords it does require
some additional work though).

Now one consideration is you probably don’t want a user’s strong
password to collide with passwords that are commonly used in online
attacks. Aka if a user’s password collided with ‘123456’ that could be
a serious problem. One solution for this when the user creates a new
password, (or logs in with an existing password and the plaintext is
available), is to check the password’s hash against the hash for those
blacklisted passwords. If a collision occurs, change the salt, (You
are using a salt right!?), and run a check with the new salt. Repeat
until there are no collisions with blacklisted passwords. Now this is
more complicated if you are upgrading your hashing scheme or
blacklists, and don’t have access to the user’s plaintext passwords.
There’s a couple of ways to deal with this, but one option might
simply be to “hash the hash” and make that part of the new hashing
algorithm going forward, (or switch back once the user logs in again).
It also helps if the same blacklist is used to prevent users from
actually picking those banned passwords.

Oh, also a unique long and random salt is required. If different sites
share the same collisions it kind of negates the advantages of this
approach ;p

So this leaves a lot of interesting questions. For example where
should this be used and on what accounts? Aka you might not want to
use this on webmaster accounts while using it for normal users. Also
you may not want to use it for accounts that contain sensitive info as
the assumption that if an attacker has the hash then they have access
to the user’s account might not hold true. Then there’s the basic
question of how likely should collisions be? Plus it's still left to
be determined if the advantages of this approach outweigh all the
negatives even for unimportant accounts.

…. But I’m way past my 140 character limit right now and to be
perfectly honest I don’t have good answers for these questions yet so
I’ll leave answering them to other people and future e-mails ;p
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.