john-users - Re: Replacement for all.chr based on "Rock You" Passwords. URL inside.

Message-ID: <20100214060744.GA16077@openwall.com>
Date: Sun, 14 Feb 2010 09:07:44 +0300
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: Replacement for all.chr based on "Rock You" Passwords. URL inside.

On Thu, Feb 11, 2010 at 06:40:49AM -0500, Matt Weir wrote:
> Hey Minga, thanks for providing a .chr file based on the RockYou dataset. As
> Solar requested I ran some tests against various password lists using your
> .chr file, the default JtR All .chr file, along with a custom .chr file I
> created based on the PhpBB.com dataset. The short version is that the
> RockYou set performed the best when cracking other website password lists.
> You can find the full results + graphs + a lot of off topic rambling here:
> 
> http://reusablesec.blogspot.com/2010/02/even-more-markov-modeling-whats-in.html

Thank you, this is helpful.  It's a pity that you did not compare the
two RockYou .chr files against passwords other than RockYou's, though.

I also ran some tests.  I took this file with stronger-than-average Unix
crypt(3) hashes:

http://forum.insidepro.com/viewtopic.php?t=4260
http://forum.insidepro.com/download.php?id=1309

This has over 50,000 hashes (apparently, those the author of that forum
thread couldn't crack).  A comment on the second page has a file
attached with 23,860 of the hashes cracked, so they were not that
strong, after all:

http://forum.insidepro.com/download.php?id=1661

This also gives you a new training/test set, if you like.

Anyway, I focused on a single salt since there was one with more than
1000 hashes.  You can achieve this using the "--salts=1000" option to
John, or you can extract just the relevant 1634 lines with:

sed -n 's/^[^:]*\(:0\..*\)$/x\1/p' < total1-Uncracked.txt | sort -u > passwd
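To find which salts are shared by many hashes before picking one, a
small Python sketch like the following can count them.  It assumes
passwd-style "user:hash" lines holding traditional DES-based crypt(3)
hashes, where the first two characters of the 13-character hash are the
salt.  The function name count_salts is made up for illustration and is
not part of John.

```python
from collections import Counter

def count_salts(lines):
    """Count how many hashes use each 2-character salt.

    Assumes traditional DES-based crypt(3) hashes (13 characters,
    salt in the first two) in the second colon-separated field.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split(":")
        if len(fields) < 2:
            continue  # not a user:hash line
        hash_field = fields[1]
        if len(hash_field) == 13:
            counts[hash_field[:2]] += 1
    return counts

# Usage (file name taken from the sed command above):
#   with open("total1-Uncracked.txt") as f:
#       for salt, n in count_salts(f).most_common(5):
#           print(salt, n)
```

The most common salt reported this way should be the one the sed
command selects, provided the file really has the layout assumed here.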

So this was my test set, and I ran three instances of JtR on a Q6600
using different .chr files:

all.chr:
guesses: 287  time: 0:00:19:37  c/s: 2599M  trying: 0ertmsbr - 0ertmsi1
guesses: 319  time: 0:00:32:24  c/s: 2534M  trying: rfsabAd - rfsakas

Minga's rockyou.chr revision 1 (unique):
guesses: 404  time: 0:00:19:43  c/s: 2426M  trying: 2lodbhc - 2lod407
guesses: 456  time: 0:00:32:27  c/s: 2339M  trying: pjpsend1 - pjpseted

Minga's rockyou.chr revision 2 (non-unique):
guesses: 394  time: 0:00:19:49  c/s: 2453M  trying: bjnen80 - bjnemy7
guesses: 451  time: 0:00:32:27  c/s: 2363M  trying: pj1c96b - pj1c9bw

As you can see, both RockYou files performed better than the old
all.chr, but the new revision performed slightly worse than the initial
one did.  I think this is related to these passwords being stronger than
average, so the new rockyou.chr's increased focus on substrings found in
common passwords did not help.
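To put the guess counts in proportion, the sketch below converts them
to percentages of the 1634 hashes in the test set.  The numbers are
copied from the status lines above.  The elapsed times differ by a few
seconds between the runs, so the comparison is only approximate.

```python
TOTAL_HASHES = 1634  # hashes sharing the one salt in the test set

# (guesses after about 19.7 minutes, guesses after about 32.4 minutes)
results = {
    "all.chr": (287, 319),
    "rockyou.chr revision 1": (404, 456),
    "rockyou.chr revision 2": (394, 451),
}

def cracked_percent(guesses, total=TOTAL_HASHES):
    """Percentage of the test set cracked for a given guess count."""
    return 100.0 * guesses / total

for name, (early, late) in results.items():
    print(f"{name}: {cracked_percent(early):.1f}% then "
          f"{cracked_percent(late):.1f}%")
```

At the later checkpoint this gives roughly 19.5% for all.chr against
27.9% and 27.6% for the two rockyou.chr revisions.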

For another test, I ran all three .chr files against hashed password.lst
(common passwords only).  all.chr performed the best, rockyou.chr
revision 2 was second, and rockyou.chr revision 1 performed the worst.
all.chr's success can be explained by its training set actually
containing all of password.lst's common passwords (and a lot more).
rockyou.chr revision 2 performed better than revision 1 due to its
increased focus on common passwords in general.

> As of right now I don't have a good data-set containing computer log-ins,
> (vs website log-ins). Because of that I really want to stress that while
> these tests imply that Minga's Rockyou .chr file might perform better when
> attacking web based passwords, John the Ripper's default .chr files were
> trained on computer passwords. If you are attacking a
> LANMAN/NTLM/Crypt(3)/CISCO/etc hash you probably still want to use John's
> included .chr files.

This is not confirmed by the test I ran against the 1634 Unix hashes,
although perhaps 99% of those hashes came from the same system (which is
why they share the same salt - the system must have been misconfigured).

I think rockyou.chr performs so well primarily because the RockYou list
was so large.  This might also be related to a shortcoming of the
current "incremental" mode code: it uses trigraph frequencies only for
the exact character position and the exact password length, with no
fallback to trigraphs seen at other positions or lengths.  Instead, when
no precise data is available, it falls
back to digraph and then to individual character frequencies.  Perhaps I
need to try introducing a fallback to trigraphs from other positions and
lengths to be applied before the fallback to digraphs.  Meanwhile, a
larger training set mitigates this shortcoming.
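The fallback order described above can be illustrated with a toy
model.  This is not John's actual code or data layout: the table names
(tri_exact, tri_pooled, di, uni), both functions, and the
use_pooled_trigraphs switch are hypothetical and only show where the
proposed extra step would sit.

```python
from collections import Counter, defaultdict

def build_stats(passwords):
    """Collect toy frequency tables from a list of training passwords."""
    stats = {
        "tri_exact": defaultdict(Counter),   # keyed by (length, pos, c1, c2)
        "tri_pooled": defaultdict(Counter),  # keyed by (c1, c2), any pos/length
        "di": defaultdict(Counter),          # keyed by previous character
        "uni": Counter(),                    # individual characters
    }
    for pw in passwords:
        n = len(pw)
        for i, ch in enumerate(pw):
            stats["uni"][ch] += 1
            if i >= 1:
                stats["di"][pw[i - 1]][ch] += 1
            if i >= 2:
                stats["tri_exact"][(n, i, pw[i - 2], pw[i - 1])][ch] += 1
                stats["tri_pooled"][(pw[i - 2], pw[i - 1])][ch] += 1
    return stats

def next_char_counts(stats, length, pos, c1, c2, use_pooled_trigraphs=False):
    """Return (source, counts) for the character following c1, c2."""
    exact = stats["tri_exact"].get((length, pos, c1, c2))
    if exact:
        return "trigraph (exact)", exact
    if use_pooled_trigraphs:
        # Proposed extra step: trigraphs from other positions and lengths
        pooled = stats["tri_pooled"].get((c1, c2))
        if pooled:
            return "trigraph (pooled)", pooled
    digraph = stats["di"].get(c2)
    if digraph:
        return "digraph", digraph
    return "single character", stats["uni"]
```

With a small training set, a query for an unseen length and position
drops straight to digraphs unless the pooled step is enabled.  A larger
training set fills more of the exact table, so the fallback is needed
less often.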

> In fact you definitely want to use the included .chr
> files against LANMAN hashes due to how the hash works, (Only Uppercase, 7
> characters max). This also brings up the point that without additional
> filters, using other attack types, (such as 'Alnum'), that are provided with
> JtR might work better over short cracking sessions than using Minga's
> RockYou .chr file that includes uppercase, special characters, etc.

Right.  We'll need RockYou-based .chr files with the filters pre-applied.

Another thing to try would be keeping only some instances of each common
password in john.pot used to generate the .chr files.  For example, we
could replace the number of occurrences with a base-2 or base-10
logarithm of the actual number - e.g., if the password "123456" is seen
290k times, we can instead keep only 18 or only 5 instances of it (need
to experiment with this).  This would reduce the focus on substrings
from common passwords, yet preserve some bias towards those substrings.
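As a rough sketch of that scaling, the Python below maps an occurrence
count to the number of instances to keep.  The function name
log_scaled_count is made up, and rounding down plus keeping at least
one instance (so that passwords seen only once are not dropped) are
assumptions of this sketch that would need experimenting with.

```python
import math

def log_scaled_count(occurrences, log_fn=math.log2):
    """Number of instances to keep for a password seen `occurrences` times.

    log_fn is math.log2 or math.log10.  The result is rounded down and
    never goes below 1 (an assumption of this sketch).
    """
    if occurrences < 1:
        raise ValueError("occurrences must be at least 1")
    return max(1, math.floor(log_fn(occurrences)))

print(log_scaled_count(290000, math.log2))   # 18
print(log_scaled_count(290000, math.log10))  # 5
```

The two printed values match the 18 and 5 instances mentioned above for
a password seen 290k times.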

Alexander