Date: Thu, 25 Aug 2016 08:46:37 +0200 From: Yoha <yoha@...on.org> To: passwords@...ts.openwall.com Subject: Re: GMOs And Passwords > A password does not have to be random!!! Yes, they do. I think you meant "generated by computers". > A password have to be UNKNOWN and UNOBTAINABLE for the attacker. > (it is not equivalent to randomness). That's pretty much the definition of random in cryptography. If you do not have a background in cryptography, do consider that it the intuitive concept of “randomness” does not map well to a mathematical definition. This article is a pretty good summary: https://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator . > Look, since we know that humans are destined to fail in creating a > random password, it should be obvious that asking them to fail is a > stupid move. Agreed, in the ideal case, humans should be be the one generating the passwords. They should only select a generation process. > I suggest making DEEPLY PERSONAL passwords. > You can ask your users to use a bit of memories that they know they > never shared with anyone; write a sentence about it, add some flavour; > Job's done. This is terrible advice, unless your goal is to crack the users' passwords. > Actually, people already feel about the passwords as very personal > stuff. Look, how many people use birthdays as their passwords -- this > is clearly an attempt of PERSONALIZATION. This is a total failure > because the birthdays are NOT personal information. Funny as it is, in > this birthday password situation the actual mistake is on the birthday > side not the password side. People understand passwords, people > misunderstand birthdays! And it is a solvable problem, it is very easy > to tell people, you are on the right path, just make it way further. It reads like an “is-ought” fallacy, but I see your point. Correct my if I misinterpret: people want passwords they can memorize, and you should take that into account. Then everyone here agree with you; the disagreement is about how to conciliate this goal with that of making passwords unguessable. I feel that your model is mostly intuitive, and you disregard the actual reasons for which we want computer-generated passwords. I have written of summary of those reason. If you are already familiar with the concept, it might be useful for someone else, and it is also an occasion for me to organize my thoughts. --- First, the setup. We consider two types of attacks at once, because they share similar patterns: in an *online attack*, an *adversary* is trying to access an user's account (for instance) by guessing their password; in an *offline attack*, an adversary has gotten access to discriminating information on the password (i.e. a cryptographic digest), and is trying to find which passwords maps to this information, knowing the mapping process. First, we should note that an online attack is, in practise, supposed to be strictly harder than an offline attack (we can use guess rate limits on the server's side). For the purpose of password generation, we can therefore focus on the second attack model. Second, we must realize that we cannot attain information theoretic security <https://en.wikipedia.org/wiki/Information-theoretic_security>, since most information is leaked (through the digest); to deter an attacker, we will have to rely on computational security <https://en.wikipedia.org/wiki/Computational_hardness_assumption>. In other words, given an infinite amount of computational power, the adversary /will/ succeed; our only hope is to assume they have a limited amount of available computational power, and make it so that guessing the password will take a long time. Since the adversary will try various guesses until they find the correct password, we have two approaches to protect the user: * make each guess costly * make the attack require many guesses The first approach is what key derivation functions (PBKDF2, bcrypt, scrypt, argon2) aim for. This is very effective at improving security, but is limited by the requirement of not deterring normal uses (real user logging in): it it takes 100 ms to test a guess, but that the password is obvious, the adversary will still take only 100 ms to crack it. --- So, now, we want the attack to require many guesses. Obviously, we cannot ensure that criterion with probability 1 (the adversary could always find the password at their first guess). Hence, we need to refine our model to allow for a probabilistic defense. First, we better give the adversary as much plausible superpowers as possible, so that our model encompasses most scenarios. For this, we let the attacker know pretty much everything about how the password was generated (by a computer or by a human), except the password itself. There is definitely some fuzzy limit in our definition: assuming perfect simulation of a deterministic universe (not even required if generation process use no quantum source of entropy¹), the adversary could simulate the whole generation process with the correct initial conditions and derive what password was generated. Realistically, though, we assume the adversary is somewhat more limited than that (there are some debates about this, when designing true random generators like /dev/random). ¹ in the absolute, most generation processes do use some quantum entropy anyway (noise in network activity, /probably/ some stuff in neuronal communication) Now, what does it mean, to know the generation process, from the point of view of the adversary? Well, now, they can use their knowledge of the generation process to construct the probabilistic distribution of all password, and optimize their search to minimize the expected guessing time, by choosing the best possible order. More specifically, they will try to guess in order decreasing likelyhood (first "123456", then "password", ... /then "Z///暗号/7bc_*ō[…]//")/. Since the goal of the adversary is to /minimize/ their expected guessing time, and our goal is to /maximize/ it, you can think of it as [minimax](https://en.wikipedia.org/wiki/Minimax). Now, it turns out that the solution is to have a uniform distribution. That way, the order of guesses do not effect the expected guessing time, meaning that the adversary cannot reduce it. Since constructing and comparing probabilistic distributions is not always very practical (especially distributions on all strings of length up to 64 characters <https://nakedsecurity.sophos.com/2016/08/18/nists-new-password-rules-what-you-need-to-know/>), we'd like a simpler metric. This is exactly what entropy, in the context of information theory <https://en.wikipedia.org/wiki/Entropy_%28information_theory%29>, is. This is very convenient: is h is the entropy, an optimal adversary will have an expected guessing time of 2^h (for b = 2). Our goal is thus: create a generation process whose entropy is maximum All of this was only to give us a security metric. Of course, we do need to take other constraints into account (ease of memorizing). --- Now, remember that we want a secure generation process for easy-to-memorize passwords. For the security, the passwords should be as long as possible, and generated as uniformly as possible; for the ease of memorizing, they should be as relatable as possible to the user. Now, the easiest and most generic solution to this to select a dictionary of common words in the language of the users, and then pick some uniformly at random. However, as long as we check the entropy of the process, a few tweaks are possible: * letting the user choose among several generated passwords (reroll), sacrificing just a few bits of the entropy * letting the user choose among several dictionaries (cities, sport teams, colors, presidents, numbers etc) * using a user-specific dictionary (however, such a dictionary will probably be pretty short, so passwords should be longer) Just keep in mind that the adversary does know the dictionary, or the preference of the user among several passwords. Finally, regarding human-generated passwords, we can measure their security very easily: let a human generate a probabilistic distribution, and compute its entropy; repeat for many people. Well, it turns out that we really suck at generating entropy. A tempting solutions would be to let the user choose a long passphrase (a sentence), but it's not automatically better. In the computer-generated case, we can make it so that each word is generated independently from the others (or, at least, know exactly their correlation, and adjust entropy); in the human-generated case, words will be highly correlated, and the entropy will probably grow /very/ slowly. For instance, some users are very likely to use famous quotes, which are definitely at the very left of the probability distribution for any decent adversary. --- With this in mind, we should be able to have users pick decently secure passwords, while avoiding horrendous policies of the like of lowercase+uppercase+digit+non-alphanumeric. Content of type "text/html" skipped
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.