Date: Mon, 12 Mar 2018 15:01:24 -0700
From: Jim Fenton <fenton@...epopcorn.net>
To: passwords@...ts.openwall.com
Subject: Re: Submitting Partial Password Hashes to Pwned Password Lookup

On 3/12/18 1:19 PM, Matt Weir wrote:
> This e-mail has already grown too large as it is, but I’d be
> interested in other people’s thoughts on this subject. Am I
> misunderstanding the use of K-anonymity? How should we look at the
> security of this approach?

Hi Matt,

I have been looking for the appropriate venue to write something about
Pwned Passwords, so thanks for the prompt.

tl;dr: Whether you call this k-anonymity or not, I have concerns.

The concern with this approach, of course, is that some attack (perhaps
an attack on the web server or CDN) might give an attacker access to the
queries and associated IP addresses. With the IP addresses, it might be
possible to determine who the user is (and their userID). With a full
hash, it would be possible to crack that hash to find the password. With
only a prefix the attacker has a set of candidate hashes instead, but
these hashes also come with frequency statistics. So the attacker can
just start with the most likely hash (the one with 47205 occurrences in
your example) and work down the frequency list from there. There are 475
hashes representing 53006 password instances in the page you cited, so
it's likely that you'd only need to crack that one hash value, and that
it would be the right one. Your analogy with past use of Shannon entropy
is indeed accurate.

Another concern I have is with the API itself. Many web servers by
default log the URLs of their requests, so with this API the server
might log the IP address of the requester and the prefix of the hash.
I'm sure Troy has this all covered, but it sounds like he has an
elaborate CDN to handle the flood of requests he gets for this, and I
hope the CDN operators have this covered as well. A better approach
would be to use a POST request and put the hash prefix in the body of
the request (although I don't know what effect that might have on the
CDN).

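As a rough illustration of the frequency argument above, the sketch below ranks the hashes in one prefix bucket by count and computes the share held by the most frequent one. It assumes (and this is only an assumption) that the queried password is drawn with probability proportional to its count in the corpus. The function names `rank_candidates` and `top_share` are hypothetical, and the only real figures used are the two quoted in the message: 47205 occurrences out of 53006 instances.

```python
from typing import Dict, List, Tuple


def rank_candidates(bucket: Dict[str, int]) -> List[Tuple[str, int, float]]:
    """Order the hashes in one prefix bucket from most to least frequent.

    `bucket` maps a hash (or hash suffix) to its occurrence count.
    Each result is (hash, count, share of all instances in the bucket).
    """
    total = sum(bucket.values())
    ranked = sorted(bucket.items(), key=lambda item: item[1], reverse=True)
    return [(h, count, count / total) for h, count in ranked]


def top_share(top_count: int, total_instances: int) -> float:
    """Fraction of a bucket's password instances held by its top hash."""
    return top_count / total_instances


# Figures quoted in the message: the top hash has 47205 of the
# 53006 password instances in the bucket.
share = top_share(47205, 53006)
print(f"Top hash covers about {share:.0%} of instances in the bucket")
```

Under that proportionality assumption, an attacker who cracks only the top-ranked hash in this bucket would have the right password roughly nine times out of ten.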
My other concern has to do with the effects of using such a large
blacklist. This has the potential to frustrate users: "Every password I
try is already taken!" when in fact, if the password appears only a few
times in such a large corpus, it's probably pretty good. When users get
frustrated like this, they tend to do predictable things, akin to
appending ! to a password to meet composition rules. A good reference on
this:

Habib, Hana, Jessica Colnago, William Melicher, Blase Ur, Sean Segreti,
Lujo Bauer, Nicolas Christin, and Lorrie Cranor. “Password Creation in
the Presence of Blacklists,” 2017.
https://www.ece.cmu.edu/~lbauer/papers/2017/usec2017-blacklists.pdf

If a pattern emerges, attackers don't have far to go from the passwords
in this corpus to find the right answer. When the offline attack gets
the attacker close enough to enable an online attack, we have a problem.

I played with this a bit for PasswordsCon LV 2016 using Mark Burnett's
corpus of 10 million breached passwords, and formed the opinion that a
list of about 100,000 passwords (representing those appearing 3 or more
times in the corpus) was reasonable, from both a security and a
lack-of-frustration perspective:

https://www.slideshare.net/jim_fenton/toward-better-password-requirements
(see slides 17-22)

I haven't seen whether Troy has published any frequency statistics for
this set. If not, I should ask him (or just do it myself).

I'm a little concerned that people are interpreting the following
wording in NIST 800-63B as a requirement that the list has to be as
comprehensive as possible:

> When processing requests to establish and change memorized secrets,
> verifiers SHALL compare the prospective secrets against a list that
> contains values known to be commonly-used, expected, or compromised.

I wrote that sentence, and that wasn't my intent (the key word here is
"commonly").

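The frequency-threshold idea mentioned above (keeping only passwords that appear 3 or more times in a breach corpus) can be sketched as follows. This is a minimal illustration, not the procedure actually used for the PasswordsCon analysis. The function name `build_blacklist` and the tiny sample list are hypothetical.

```python
from collections import Counter
from typing import Iterable, Set


def build_blacklist(passwords: Iterable[str], min_count: int = 3) -> Set[str]:
    """Return the passwords that occur at least `min_count` times.

    `passwords` is an iterable of breached passwords, one entry per
    occurrence, so duplicates are expected and meaningful.
    """
    counts = Counter(passwords)
    return {pw for pw, n in counts.items() if n >= min_count}


# Hypothetical toy corpus, for illustration only.
corpus = ["123456", "123456", "123456", "letmein", "letmein", "x9!Tq2#v"]
blacklist = build_blacklist(corpus, min_count=3)
print(sorted(blacklist))
```

Raising `min_count` shrinks the list and rejects fewer candidate passwords. Lowering it toward 1 approaches the "as comprehensive as possible" interpretation.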
There's a balance, and making it harder to choose an acceptable password
isn't always good for security.

-Jim
(personal opinions, not NIST's, of course)