john-users - Re: source of information for John's charset files

Message-ID: <CAJ9ii1EmrzuP3DacWGxtcd95JaOdLciBWhLjFu0_B903FpwKvQ@mail.gmail.com>
Date: Sun, 2 May 2021 23:00:34 -0400
From: Matt Weir <cweir@...edu>
To: john-users@...ts.openwall.com
Subject: Re: source of information for John's charset files

I apologize in advance if I misunderstood your testing procedure or your
results, but using the HIBP list as a test set is really problematic when
you apply the results to normal password cracking sessions.

Duplicates matter and our techniques should reflect that. Guessing
'123456' and 'password' before 'ajger' should be rewarded, but with the
HIBP list all three guesses are awarded the same value. I could see
excluding the top 10k passwords from an incremental training set (since
'123456' and 'password' will almost certainly be cracked by a dictionary
attack) to optimize how incremental plays with brute-force. But even that
approach, while it seems to make sense, has backfired on me every time I
have tried it, giving worse results when I applied it to new datasets.
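
To make the weighting point concrete, here is a minimal Python sketch.
Everything in it is made up for illustration: the toy leak, its counts,
and the cracked_fraction helper are assumptions, not data from HIBP or
RockYou and not code from John. It scores a single guess against the same
leak twice, once with duplicates kept and once deduplicated.

```python
from collections import Counter


def cracked_fraction(guesses, test_passwords):
    """Return the fraction of entries in test_passwords matched by any guess.

    Each entry counts once, so a test set that keeps duplicates rewards
    common passwords in proportion to how often they occur.
    """
    counts = Counter(test_passwords)
    total = sum(counts.values())
    hits = sum(counts[g] for g in set(guesses))
    return hits / total


# Hypothetical toy leak: two very common passwords and two rare ones.
leak = ["123456"] * 50 + ["password"] * 30 + ["ajger", "zx9!kq"]
unique_leak = sorted(set(leak))

for guess in ["123456", "password", "ajger"]:
    with_dupes = cracked_fraction([guess], leak)
    deduped = cracked_fraction([guess], unique_leak)
    print(f"{guess!r}: with dupes {with_dupes:.3f}, unique {deduped:.3f}")
```

Against the deduplicated list every one of the three guesses scores the
same, while the list with duplicates ranks '123456' far above 'ajger'.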

I've actually been looking into something similar with an "optimization" of
the PCFG tool. I wanted to make OMEN play nicer with the dictionary-like
attack that PCFG does, so I've tried to train OMEN on passwords that the
other parts of the grammar didn't crack. My thinking was that OMEN then
would specifically target those types of passwords. Long story short, those
tests were an unmitigated disaster when I then applied the grammar against
new test sets. It made my tool worse, not better.

Now I admit I could be wrong. Training on unique passwords might end up
making Incremental mode better. But before we make those changes, I'd
really like to see those tests run against a more realistic dataset than
HIBP. I know HIBP is based on real passwords, but there are so many
different artificialities that go into its construction that I have deep
suspicions about using it as a representative password set.

On a different point, I am totally ok with updating the training set from
RockYou. I could go on and on about the weirdness of that dataset, not to
mention that it really is showing its age. The gold standard right now of
public datasets would probably be the LinkedIn list, which also is showing
its age, but is a bit more comparable to current web passwords.

The one advantage of the HIBP list is it does have some non-English
datasets in it. That's a whole other conversation though on how to better
incorporate other languages into cracking sessions.

Side note, I just saw your most recent results of training/running against
RockYou. I'm willing to admit I'm wrong if you are getting better results
training without dupes. That's just contrary to what I've seen in the past.
I might need to run some tests of my own to look into this.

Cheers,
Matt/Lakiw

On Sun, May 2, 2021 at 5:39 PM Solar Designer <solar@...nwall.com> wrote:

> On Sun, May 02, 2021 at 11:21:34PM +0200, Solar Designer wrote:
> > Anyway, I just ran some tests the other way around - "cracking" RockYou
> > passwords.  I didn't try excluding RockYou itself from the training sets
> > here - can't do that while including our current .chr files in the
> > comparison.  So this is in-sample testing, which is generally a wrong
> > thing to do, but with that in mind here are the results for different
> > training sets (all are for incremental mode and 1 billion candidates):
> >
> > RockYou with dupes - 20.2%
> > RockYou unique - 21.9%
> > HIBPv7 cracked - 17.9%
> >
> > The percentages cracked are those of RockYou unique.
> >
> > Not surprisingly, RockYou is best fit for itself.  HIBP is an acceptable
> > fit as well.  It could have potentially performed better than RockYou
> > on this test due to its larger size, but as we can see that was not
> > enough to overcome it not being such a perfect fit as RockYou itself.
>
> FWIW, RockYou unique being best fit for itself persists after I shuffled
> it and split it into a 1M test set and 13.3M training set (no matching
> passwords in the sets, but both sets are parts of RockYou).  Got 21.5%.
>
> Alexander
>