john-users - Re: How does incremental mode works?



Message-ID: <50A7B9E4.8080108@banquise.net>
Date: Sat, 17 Nov 2012 17:23:00 +0100
From: Simon Marechal <simon@...quise.net>
To: john-users@...ts.openwall.com
Subject: Re: How does incremental mode works?

On 11/16/2012 10:16 PM, Richard Miles wrote:
> 1) Is there a command-line parameter to replace the default path of
> $JOHN/markov.stats?

I have not been following what's in jumbo for a while, but I suppose
there is a way to set it in the config file.

> 2) How big should be a wordlist to generate a stats file? I mean, the
> bigger is not always the best, right? Or too short will be bad as well,
> right? Does the size of the generated stats file influence on the attack's
> time?

I will try to give a high-level description of how it works in another
mail, but I will answer this specific question here. The statistical
generators all behave more or less the same in this regard. Once you
have generated the stats file from a training set, you will be somewhere
between the following two extremes:
* With too little data, you risk "overfitting": the model is not
generic enough to find passwords that differ from the training set.
For example, if you train it with the single word "aa", the markov
mode will only output candidates made of a's (not entirely true).
* With a lot of data, your model will be generic. This is a good situation.
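
To make the "aa" example concrete, here is a minimal sketch of a
first-order character model. It is a toy and not the code John uses: the
names train() and candidates() are made up for illustration, and it
strictly forbids transitions it has never seen, which the real markov
mode may not do (hence the "not entirely true" above).

```python
from collections import Counter, defaultdict

def train(words):
    """Count first characters and character-to-character transitions."""
    first = Counter()
    trans = defaultdict(Counter)
    for w in words:
        if not w:
            continue
        first[w[0]] += 1
        for a, b in zip(w, w[1:]):
            trans[a][b] += 1
    return first, trans

def candidates(first, trans, length):
    """List every string of the given length that uses only observed
    starting characters and observed transitions (toy assumption)."""
    out = sorted(first)
    for _ in range(length - 1):
        out = [s + c for s in out for c in sorted(trans.get(s[-1], ()))]
    return out

# Trained on the single word "aa", the toy model can only emit a's.
tiny = train(["aa"])
print(candidates(*tiny, 3))

# A slightly larger training set already allows more candidates.
larger = train(["abc", "bca", "cab"])
print(candidates(*larger, 2))
```

With only "aa" as training data, the single known transition is a -> a,
so every candidate is a run of a's. More varied training data adds more
transitions and therefore more reachable candidates.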

You usually want to train it with as much data as possible, provided
that this data matches the kind of passwords you are going to attack.

> 3) What is the proper kind of wordlist that I should use to generate a
> stats file? A default one such as passwords.lst? Rockyou leak? PHPbb leak?
> All of them together?

The proper wordlist is the one that looks like the passwords you want to
attack. If the target is a public leak, rockyou is your best choice. If
it is something else, you will have to find something else ;)
