passwords - Re: Submitting Partial Password Hashes to Pwned Password Lookup

Message-ID: <CA+E3k90pv45a9oWAKb5NM+1D-s1kxdbCS0-qYxjttpk4d-GhMA@mail.gmail.com>
Date: Mon, 12 Mar 2018 18:10:53 -0800
From: Royce Williams <royce@...hsolvency.com>
To: passwords@...ts.openwall.com
Subject: Re: Submitting Partial Password Hashes to Pwned Password Lookup

Agree 100% with Jim - and I believe that the "commonly" concept is indeed
being lost in context.

Royce

On Mon, Mar 12, 2018 at 2:01 PM, Jim Fenton <fenton@...epopcorn.net> wrote:

> On 3/12/18 1:19 PM, Matt Weir wrote:
>
> This e-mail has already grown too large as it is, but I’d be
> interested in other people’s thoughts on this subject. Am I
> misunderstanding the use of K-anonymity? How should we look at the
> security of this approach?
>
>
> Hi Matt,
>
> I have been looking for the appropriate venue to write something about
> Pwned Passwords, so thanks for the prompt.
>
> tl;dr: Whether you call this k-anonymity or not, I have concerns.
>
> The concern with this, of course, is that some attack (perhaps an attack
> on the web server or CDN) might give an attacker access to the queries and
> associated IP addresses. With the IP addresses, it might be possible to
> determine who the user is (and their userID). With a full hash, it would be
> possible to crack that hash to find the password.
>
> But these hashes also come with frequency statistics. So the attacker can
> just start with the most likely hash (the one with 47205 occurrences in
> your example) and work down the frequency list from there. There are 475
> hashes representing 53006 password instances in the page you cited, so it's
> likely that you'd only need to crack that one hash value to find the right
> password.  Your analogy with past use of Shannon entropy is indeed
> accurate.
>
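
A minimal sketch of the frequency argument above, for illustration only. The
47205 and 53006 figures are the ones quoted in the message; the suffix names
and the split of the remaining 5801 instances are made up (the real bucket has
475 hashes), and guess_order is a hypothetical helper, not part of any API.

```python
def guess_order(bucket):
    """Return (suffix, probability) pairs, most frequent hash first."""
    total = sum(bucket.values())
    ranked = sorted(bucket.items(), key=lambda kv: kv[1], reverse=True)
    return [(suffix, count / total) for suffix, count in ranked]

# Hypothetical bucket: only the top count (47205) and the total (53006)
# come from the message; the other entries are placeholders.
bucket = {
    "SUFFIX_B": 3000,
    "SUFFIX_A": 47205,
    "SUFFIX_C": 2801,
}

order = guess_order(bucket)
top_suffix, top_share = order[0]
uniform_share = 1 / 475  # what a naive "1 in k" reading would suggest

print(f"first guess: {top_suffix}, share of instances: {top_share:.3f}")
print(f"uniform 1/475 share: {uniform_share:.4f}")
```

Under these assumptions the first guess covers roughly 89% of the password
instances in the bucket, far more than the 1-in-475 chance that the bucket
size alone would suggest.
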
> Another concern I have is with the API itself. Many web servers by default
> log the URLs of their requests, so with this API it might log the IP
> address of the requester and the prefix of the hash. I'm sure Troy has this
> all covered but it sounds like he has an elaborate CDN to handle the flood
> of requests he gets for this, and I hope they have this covered as well. A
> better approach would be to use a POST request and put the hash prefix in
> the body of the request (although I don't know what effect that might have
> on the CDN).
>
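
A rough sketch of the GET versus POST point above. The requests are built but
never sent. The GET URL follows the published range lookup format; the POST
endpoint (example.invalid) is purely hypothetical, since nothing here confirms
that the service accepts POST. hash_prefix, get_request and post_request are
illustrative names.

```python
import hashlib
from urllib.parse import urlsplit
from urllib.request import Request

def hash_prefix(password, length=5):
    """Upper-case hex SHA-1 of the password, truncated to `length` characters."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:length]

def get_request(prefix):
    # The prefix is part of the URL path, so a default access log records it.
    return Request(f"https://api.pwnedpasswords.com/range/{prefix}")

def post_request(prefix):
    # HYPOTHETICAL endpoint: the prefix travels in the body, not the URL.
    return Request(
        "https://example.invalid/range-lookup",
        data=prefix.encode("ascii"),
        method="POST",
    )

prefix = hash_prefix("password")
get_req = get_request(prefix)
post_req = post_request(prefix)

# What a typical access log line would capture: method and request path.
get_logged = f"{get_req.get_method()} {urlsplit(get_req.full_url).path}"
post_logged = f"{post_req.get_method()} {urlsplit(post_req.full_url).path}"
print(get_logged)
print(post_logged)
```

In this sketch the prefix shows up in the logged GET path but not in the
logged POST path. Whether a CDN could still cache responses keyed on a POST
body is a separate question, as noted above.
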
> My other concern has to do with the effects of using such a large
> blacklist. This has the potential to frustrate users: "Every password I try
> is already taken!" when in fact if the password appears only a few
> times in such a large corpus it's probably pretty good. When users get
> frustrated like this, they tend to do predictable things, akin to appending
> ! to a password to meet composition rules. A good reference on this:
>
> Habib, Hana, Jessica Colnago, William Melicher, Blase Ur, Sean Segreti,
> Lujo Bauer, Nicolas Christin, and Lorrie Cranor. “Password Creation in the
> Presence of Blacklists,” 2017.
> https://www.ece.cmu.edu/~lbauer/papers/2017/usec2017-blacklists.pdf.
>
> If a pattern emerges, attackers don't have far to go from the passwords in
> this corpus to find the right answer. When the offline attack gets the
> attacker close enough to enable an online attack, we have a problem.
>
> I played with this a bit for PasswordsCon LV 2016 using Mark Burnett's
> corpus of 10 million breached passwords, and formed the opinion that a list
> of about 100,000 passwords (representing those appearing 3 or more times in
> the corpus) was reasonable, both from a security and lack-of-frustration
> perspective:
>
> https://www.slideshare.net/jim_fenton/toward-better-password-requirements
> (see slides 17-22)
>
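
One way the "3 or more times" cut described above could be sketched. The tiny
corpus is invented for illustration, and build_blacklist and min_count are
hypothetical names; this is not the method used for the slides.

```python
from collections import Counter

def build_blacklist(passwords, min_count=3):
    """Keep only passwords that occur at least `min_count` times in the corpus."""
    counts = Counter(passwords)
    return {pw for pw, n in counts.items() if n >= min_count}

# Made-up corpus: two common passwords, one seen twice, one seen once.
corpus = (
    ["123456"] * 5
    + ["password"] * 3
    + ["correct horse"] * 2
    + ["x9!fQ2"]
)

blacklist = build_blacklist(corpus)
print(sorted(blacklist))
```

Raising min_count shrinks the list and rejects fewer candidate passwords;
lowering it toward 1 approaches the "everything ever breached" list discussed
above.
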
> I haven't seen whether Troy has published any frequency statistics for
> this set. If not, I should ask him (or just do it myself).
>
> I'm a little concerned that people are interpreting the following wording
> in NIST 800-63B as a requirement that the list has to be as comprehensive
> as possible:
>
> When processing requests to establish and change memorized secrets,
> verifiers SHALL compare the prospective secrets against a list that
> contains values known to be commonly-used, expected, or compromised.
>
> I wrote that sentence, and that wasn't my intent (the key word here is
> "commonly"). There's a balance, and making it harder to choose an
> acceptable password isn't always good for security.
>
> -Jim
>  (personal opinions, not NIST's, of course)
>
