john-users - Re: password ranking

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <68d9b34e0912081350g17e25865jbc3f20b362129b1f@mail.gmail.com>
Date: Tue, 8 Dec 2009 22:50:15 +0100
From: "Luke O'Connor" <lukejamesoconnor@...il.com>
To: john-users@...ts.openwall.com
Subject: Re: password ranking

Alexander,

thank you for the detailed reply - a bit beyond my current understanding
of JtR and that is for me to improve. A few other comments in the body

rgs Luke

On Mon, Dec 7, 2009 at 3:24 AM, Solar Designer <solar@...nwall.com> wrote:

> On Mon, Dec 07, 2009 at 01:08:40AM +0100, Luke O'Connor wrote:
> > What I was wondering was whether there is a document which describes the
> > search strategy that is followed by JtR. I see from the documentation
> > pages that many search options can be configured but I was hoping someone
> > could give me a simple answer based on standard settings (if they exist).
>
> Default settings and default data files do exist, but there can be no
> answer that would be simple, correct, and complete at the same time.
>
> > Imagine that I start JtR to search for a password like 8h2wt6ghw -
> expressed
> > as windows hash for example - how many guesses will JtR make before the
> > password will be found? Let's for the moment assume that JtR runs
> > indefinitely.
>
> This is either tricky or time-consuming to compute.  You could in fact
> write a program that would compute this reasonably quickly for the
> "incremental" mode (with given settings and .chr file), but such a
> program does not readily exist (as far as I'm aware) and it is not a
> trivial one to write.
>

ok, I guess I need to play with JtR a bit to understand how it works.


>
> > I think an interesting measure of password complexity would be some
> function
> > JtR(Password) which returns the position of Password in the list of
> > candidate passwords generated by JtR according to its search strategy.
>
> JtR's "incremental" mode tries candidate passwords roughly in order of
> decreasing estimated probability of each being "the" password.  (With
> multiple hashes loaded for cracking this would have to be worded a bit
> differently, but the main idea holds.)  The probability estimate is
> roughly the product of conditional probabilities of the characters in
> each character position in the candidate password.  The conditional
> probabilities take two preceding characters into account.  In turn, the
> conditional probabilities of each character are estimated as being equal
> to frequencies of occurrences of the corresponding character triplet at
> the corresponding position in passwords of the corresponding length as
> seen in john.pot at the time the .chr file is generated.  Because of the
> need to generate candidate passwords real fast and without consuming an
> enormous amount of memory (such as for a sort buffer), the actual
> ordering is slightly different.  Specifically, the .chr files do not
> even encode the probability values, but they encode ordered lists of
> characters and other parameters needed for the fast candidate password
> generating algorithm instead.
>
>
All right so the search strategy seems to be driven by tri-gram frequencies.
Will the search eventually try all tri-grams, even those with no
observations
in john.pot? I guess I am asking of the search is exhaustive or generated
over passwords that are deemed likely according to the observed tri-gram
stats?


> Thus, instead of focusing on JtR's approximation, you could calculate
> the product of estimated probabilities directly.  Well, you would not be
> able to do that working from a .chr file (at best, you'd get the same
> slightly distorted value that JtR "uses"), but you could do it from a
> set of passwords (a john.pot file or the like) that you would otherwise
> generate a .chr file from.  This could be easier to do, faster, and more
> accurate than your proposed "JtR(Password)" function.  (Well, it would
> not be more accurate if you're focusing on the difficulty of getting a
> password cracked with JtR specifically, but it would be more accurate if
> you are not focusing on JtR's specific attack.  It would continue to
> assume an attack based on just frequencies of character sequences,
> though, without other input information, whereas in practice a password
> can be weak for other reasons.)
>

I have to admit that you are losing me here based on my lack of knowledge
of JtR, but I will have to return to it after getting some experience.

>
> I must admit that probability of a candidate password is quite a
> different metric than the number of other candidate passwords that would
> be tried before reaching this one.  Yet either can be useful.
>

Well it can't be too different - at least related. If the prob of the
candidate
password is driving the search strategy then the number of guesses before
hitting the right password must be related to the prob of the password as
estimated by JtR. But again my ignorance could be showing here.


>
> Then, in practice, with default settings, JtR actually runs through
> "single crack" and wordlist modes, both with word mangling rules, before
> it proceeds with "incremental".  You could want to simulate that, too -
> and for this it is in fact practical to simply enumerate all candidate
> passwords until yours is possibly hit.  For "single crack" mode, you'd
> have to provide additional information along with the password (such as
> the username), so in a sense the strength of the password depends on
> what related information is available to a cracker.  This is quite true
> for the real world.
>

I need to look up the definitions of these modes again.

>
> > Is there any analysis along these lines, a document which describes how a
> > given search strategy works its way through the set of all passwords (if
> > it was able to run indefinitely)?
>
> I am not aware of a decent and detailed description of JtR's "search
> strategies", beyond various postings I made to this mailing list (much
> like this one).  There are brief descriptions in the JtR documentation,
> but those are meant for end-users.
>

So It seems I have to become an end user! Fair enough.

>
> However, there is a wiki page on the Markov mode, which is an unofficial
> addition to JtR (found in the jumbo patch):
>
> http://openwall.info/wiki/john/markov
>
> As you can see, the Markov mode is quite similar to JtR's "incremental"
> mode.  Perhaps the most obvious user-visible difference is that the
> Markov mode is not intended to run indefinitely, until success, or until
> being interrupted (it may also terminate on its own, yet fail to crack
> the password), whereas the "incremental" mode is (assuming a large
> enough charset and maximum length, resulting in a practically "infinite"
> password space).
>

I will read the paper and perhaps ask again

>
> I hope this helps.
>
> Alexander
>
> P.S. It appears that you have inconsistent configuration of your mail
> client software, where lines were first wrapped at something like 100
> characters, but then wrapped for a second time at under 80 characters.
> I corrected this manually for the quoting.  If you can, please correct
> this on your end before you make your next posting (say, to wrap at 72
> characters right away).  Thanks!
>

hope that is fixed now as well
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.