john-users - Re: source of information for John's charset files

Message-ID: <20210505121450.GA16397@openwall.com>
Date: Wed, 5 May 2021 14:14:51 +0200
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: source of information for John's charset files

On Mon, May 03, 2021 at 04:18:15PM +0200, Solar Designer wrote:
> My expectation is that training on unique passwords only will in fact
> reduce the number of cracked accounts when only incremental mode is
> used.  However, after having run through RockYou as a wordlist, it's not
> obvious whether it's beneficial for incremental mode to also favor the
> repeated passwords like 123456.
> 
> What you suggest about excluding e.g. just the top 10k makes sense.
> Another approach I thought of, but I don't recall trying, is to apply a
> logarithmic scale to the counts.  For example, for passwords appearing
> 1000+ times include them 4 times, for 100+ include them 3 times, etc.

I experimented with this now, and it may very well be a way to have the
best of both worlds, or at least a reasonable balance.  Compared to
RockYou unique, I am getting the most improvement (overall across
different test sets) by adding to it a list of RockYou passwords that
appeared 3+ times on the original with-duplicates list.  In other words,
a password that appeared 3+ times is listed twice, otherwise just once.
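
A minimal Python sketch of this kind of reweighting, for illustration
only.  The function names and the tier tables are hypothetical, and the
10+ tier in the logarithmic table is a guess at what "etc." meant in the
quoted message.  It assumes the with-duplicates list fits in memory.

```python
from collections import Counter


def repeats_for(count, tiers):
    """Return how many times to list a password seen `count` times.

    `tiers` maps a minimum count to a number of repeats.  The highest
    matching tier wins; anything below every tier is listed once.
    """
    best = 1
    for min_count, repeats in tiers.items():
        if count >= min_count:
            best = max(best, repeats)
    return best


def build_training_list(passwords, tiers):
    """Expand a with-duplicates password list into a reweighted one."""
    counts = Counter(passwords)
    out = []
    for pw, count in counts.items():
        out.extend([pw] * repeats_for(count, tiers))
    return out


# Scheme described above: 3+ occurrences -> listed twice, else once.
THREE_PLUS_TWICE = {3: 2}
# Logarithmic scheme from the quoted message (10+ tier is assumed).
LOG_SCALE = {1000: 4, 100: 3, 10: 2}

sample = ["123456"] * 5 + ["password"] * 2 + ["hunter2"]
print(sorted(build_training_list(sample, THREE_PLUS_TWICE)))
# -> ['123456', '123456', 'hunter2', 'password']
```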

Of course, this doesn't distinguish frequencies of passwords within the
top ~1 million, so e.g. "password" ranks way lower than it does in our
current default .chr files generated from the original with-duplicates
RockYou.  However, "123456" still ranks first, because not only it but
also its substrings rank high.

I also tried adding top 100k and top 10k lists (giving 4 repeats for
passwords in top 10k), which hurt my tests on all-unique test sets a
tiny bit (possibly just noise), but it also brought "password" only a
bit higher.

> I decided to test these not only at 1 billion candidates, but also at
> other points.  I use three training sets: RockYou with dupes (same as
> was used to generate our currently bundled .chr files - in fact, I just
> reuse ascii.chr from there), RockYou unique shuffled and 1M test set
> removed from it (so 13.3M training set), and HIBP v7 458M cracked (after
> removal of the fbobh_* pattern).  The test set is always the mentioned
> 1M from RockYou unique shuffled.  Here are the percentages cracked at
> 10M, 100M, 1G, 10G, 100G candidates:
> 
> RockYou with dupes - 4.6%, 10.2%, 20.2%, 33.3%, 48.0%
> RockYou -1M unique - 4.7%, 11.2%, 21.5%, 35.0%, 48.3%
> HIBP v7 cracked    - 3.2%,  8.7%, 17.8%, 30.0%, 44.5%
> 
> So despite "RockYou -1M unique" being the only one 100% out-of-sample
> test (no password appears in both the training and the test set) and
> also having the smallest training set (at 13.3M), it outperforms the two
> other tests across this whole range.
> 
> Of course, HIBP performing worse doesn't necessarily mean it's a worse
> choice in general - just that it's a worse fit for RockYou.  We've also
> seen that when using a portion of HIBP as the test set, things are the
> other way around - training on the rest of HIBP produces better results
> than training on RockYou does.

Here's the mix of my new potion:

HIBP v7 cracked   - x1 (also includes RockYou)
RockYou unique    - x30
RockYou top 1M 3+ - x31

This gives a list roughly double the size of HIBP v7 cracked: ~924M
entries versus ~458M.  Generating a .chr file with
--external=filter_ascii uses ~896M of those (the filter excludes the
nested hashes, among other things).  This gives the highest weight to
passwords appearing on RockYou 3+ times, then to the rest of RockYou,
and hopefully only uses HIBP to resolve ties and as a fallback where
RockYou lacks a definitive pattern.
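
One way such a weighted mix could be assembled is sketched below in
Python.  This is not necessarily how the list above was built: the
function name and the file names in the usage comment are hypothetical.
Files are read and written as bytes, since leaked password lists are
often not valid UTF-8.

```python
def write_mix(weighted_paths, out_path):
    """Concatenate password lists, repeating each one `weight` times.

    `weighted_paths` is a list of (path, weight) pairs.  Each source is
    re-read once per repeat, so nothing large is held in memory.
    Returns the number of lines written.
    """
    written = 0
    with open(out_path, "wb") as out:
        for path, weight in weighted_paths:
            for _ in range(weight):
                with open(path, "rb") as src:
                    for line in src:
                        # Normalize line endings; note that this also drops
                        # any trailing "\r" that was part of a password.
                        out.write(line.rstrip(b"\r\n") + b"\n")
                        written += 1
    return written


# Hypothetical usage mirroring the weights above:
# write_mix([("hibp-v7-cracked.txt", 1),
#            ("rockyou-unique.txt", 30),
#            ("rockyou-3plus.txt", 31)], "mix.txt")
```

Only the per-password counts differ between this and the plain lists,
which is what the weighting is meant to control.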

The results directly comparable to those above, also repeated below for
comparison, are:

RockYou with dupes - 4.6%, 10.2%, 20.2%, 33.3%, 48.0%
RockYou -1M unique - 4.7%, 11.2%, 21.5%, 35.0%, 48.3%
HIBP v7 cracked    - 3.2%,  8.7%, 17.8%, 30.0%, 44.5%
New mix            - 4.9%, 10.8%, 20.8%, 34.0%, 47.6%
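
For readers who want to reproduce this kind of table on plaintext test
sets, here is a hedged Python sketch of the bookkeeping: the percentage
of a test set covered after the first N candidates of a stream.  The
function name is hypothetical, and this says nothing about how the
figures above were actually measured; it assumes a non-empty test set
and a candidate stream such as the output of a cracker's stdout mode.

```python
def cracked_at_checkpoints(candidates, test_set, checkpoints):
    """Return {checkpoint: percent of test_set found within that many candidates}."""
    remaining = set(test_set)
    total = len(remaining)
    pending = sorted(checkpoints)
    results = {}
    tried = 0
    for cand in candidates:
        if not pending:
            break
        tried += 1
        remaining.discard(cand)
        # Record the running percentage each time a checkpoint is reached.
        while pending and tried == pending[0]:
            results[pending.pop(0)] = 100.0 * (total - len(remaining)) / total
    # If the stream ended early, later checkpoints get the final percentage.
    for cp in pending:
        results[cp] = 100.0 * (total - len(remaining)) / total
    return results


demo = cracked_at_checkpoints(
    candidates=["a", "b", "c", "d"],
    test_set={"b", "d", "y", "z"},
    checkpoints=[1, 2, 4],
)
print(demo)  # -> {1: 0.0, 2: 25.0, 4: 50.0}
```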

Also for comparison, the best I am able to achieve by training on (full)
RockYou only (processed in various ways) at 1 billion candidates (middle
column above) is 22.0%.

As a reminder, the above uses 1M of RockYou unique shuffled as the test
set, so except for the "RockYou -1M unique" line this is in-sample
testing.  So these good results don't mean a lot by themselves, but they
do show that the penalty HIBP training incurred on this test is mostly
gone with the new mix.  This matters more together with another result:

As I mentioned earlier in this thread, training on HIBP v7 cracked not
surprisingly provided the best result when testing on HIBP v7 as well,
including with non-overlapping training and test sets.  The results at 1
billion candidates were 5.4% for training on RockYou with duplicates,
6.0% for RockYou unique, and 6.6% for HIBP cracked unique.  Well, here's
a new result: the new mix above achieves almost 6.6% as well (to be
specific, 6.64% vs. 6.59%).

So maybe it's a good balance for targeting yet unknown password sets.

> Also curious is how many different passwords the different training sets
> crack.  At the 100G mark, the three runs above cracked a total of 52.5%.

Adding the new mix increases the total for the four 100G pots to 52.7% -
not much of an increase from the previous 52.5%.
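
The combined figure is just the size of the union of what each run
cracked, relative to the test set.  A small Python sketch, assuming each
run's cracked plaintexts have already been extracted into a set (the
function name and the toy data are hypothetical):

```python
def combined_coverage(cracked_sets, test_size):
    """Percent of the test set cracked by at least one of the runs."""
    union = set().union(*cracked_sets)
    return 100.0 * len(union) / test_size


# Toy data: three earlier runs plus one new run, test set of 10 passwords.
earlier = [{"a", "b"}, {"b", "c"}, {"a", "d"}]
new_run = {"c", "e"}

before = combined_coverage(earlier, 10)
after = combined_coverage(earlier + [new_run], 10)
print(before, after)  # -> 40.0 50.0
```

A small gain from adding a run means it mostly cracks what the others
already covered.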

Alexander