Date: Mon, 19 Nov 2012 09:39:50 -0600
From: Richard Miles <richard.k.miles@...glemail.com>
To: john-users@...ts.openwall.com
Cc: magnum <john.magnum@...hmail.com>, simon@...quise.net
Subject: Re: How does incremental mode works?

Hi Simon,

Thanks for your answer, it is much appreciated. I still have some questions and suggestions, if you don't mind.

On Sat, Nov 17, 2012 at 10:23 AM, Simon Marechal <simon@...quise.net> wrote:

> On 11/16/2012 10:16 PM, Richard Miles wrote:
> > 1) Is there a command-line parameter to replace the default path of
> > $JOHN/markov.stats?
>
> I have not been following what's in jumbo for a while but I suppose
> there is a way in the config file.

Sorry, I was not clear in my previous e-mail. You are correct, it is possible to specify the path in john.conf. However, I would like to pass this parameter via the command line. Both the documentation and the config file say that no command-line option is available, and I'm curious why not. I don't believe there is a technical limitation. Is there a chance to add it to the TODO list? Magnum does a great job and constantly improves the jTr command-line options; could you consider it, please? :)

> > 2) How big should be a wordlist to generate a stats file? I mean, the
> > bigger is not always the best, right? Or too short will be bad as well,
> > right? Does the size of the generated stats file influence on the attack's
> > time?
>
> I will try to give a high level description of how it works in another
> mail, but will answer this specific question here. The statistical
> generators all behave more or less the same in this regard. Once you
> have generated the stats file from a training set, you will be between
> the following extreme configurations :
> * too little data and you risk "overfitting", that means not having a
>   model generic enough to find passwords that differ from the training
>   set.
>   For example, if you train it with a single word "aa", the markov mode
>   will only output candidates with a's in them (not entirely true).
> * with a lot of data, your model will be generic. This is a good
>   situation.
> You usually want to train it with as much data as possible, provided
> that this data matches the kind of passwords you are going to attack.

That makes sense. However, what I noticed in practice is that, even with the exact same Markov options and the same target password hashes, the run time varies a lot. For example:

A) 55 NTLM password hashes, with either the default stats or stats based on rockyou.txt, and the option --markov=240:0:0:13: the run completes in 16 hours at most on my computer. The stats file generated from rockyou.txt is bigger than the default stats file.

B) 55 NTLM password hashes, with a stats file based on a really big dictionary (~50GB) and the same option --markov=240:0:0:13: the run takes very long on the same computer. The interesting thing, however, is that this stats file is much smaller than the one generated from rockyou.txt. Strange, isn't it?

Consequently, I believe that really big dictionaries are not a good option with Markov. If I'm missing something, please let me know.

> > 3) What is the proper kind of wordlist that I should use to generate a
> > stats file? A default one such as passwords.lst? Rockyou leak? PHPbb leak?
> > All of them together?
>
> The proper wordlist is the one that looks like the passwords you want to
> attack. If this is a public leak, rockyou is your best choice. If this
> is something else, you will have to find something else ;)

I understand, but I still have questions and a suggestion, if you don't mind.

A.1) What is the minimum size (number of words) that a file must have to produce an effective stats file?

B.1) Once a new password is cracked with Markov, wouldn't it be useful to feed it back to update the stats file and recalculate the probabilities?
I may be wrong, but I guess that, given a good number of passwords cracked with Markov, using this data to modify the stats "on the fly" could give better results, couldn't it?

C.1) Also, speaking of modifying the stats file: the stats file that comes with jTr is not based on rockyou.txt, but it's great. However, the password list used to generate it is not publicly available (AFAIK). Is there a chance to take an existing stats file, read a new dictionary file, and generate an updated stats file that combines the results of the original stats file with the new wordlist? I think it could be a very nice feature. :)

D.1) This one is for Magnum again, since he always improves jTr with amazing small features that make our lives easier. Today, choosing Markov parameters so that a run fits a given amount of time is a bit of a pain, as described here:

http://openwall.info/wiki/john/markov

Do you think you could add a new command-line option to automate it? For example, maybe we could do something like --markov=autoadjust-10800:0:0:13, where jTr would itself work out the best Markov level based on the current cracking speed for the target hash, and adjust it so that the run lasts 3 hours (10800 seconds). What do you think?

Thanks a lot, and sorry for so many dumb questions / suggestions.
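
P.S. The arithmetic behind D.1 can be sketched roughly as follows. This is only an illustration of the idea, not an existing jTr feature: the function names are hypothetical, the per-level candidate counts and the cracking speed are made-up numbers, and it assumes the number of candidates for each Markov level is already known from elsewhere.

```python
from typing import Mapping, Optional


def max_candidates(speed_cps: float, budget_seconds: float) -> int:
    """Candidates that can be tried in the time budget at the given speed."""
    return int(speed_cps * budget_seconds)


def pick_level(counts_by_level: Mapping[int, int],
               speed_cps: float,
               budget_seconds: float) -> Optional[int]:
    """Return the highest level whose candidate count fits the budget.

    counts_by_level maps a Markov level to the number of candidates it
    generates (hypothetical input; this sketch does not compute it).
    Returns None when even the lowest listed level does not fit.
    """
    limit = max_candidates(speed_cps, budget_seconds)
    fitting = [level for level, count in counts_by_level.items()
               if count <= limit]
    return max(fitting) if fitting else None


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    counts = {200: 1_000_000, 220: 50_000_000, 240: 2_000_000_000}
    speed = 100_000      # candidates per second (assumed)
    budget = 10_800      # 3 hours, as in the autoadjust-10800 example
    print(pick_level(counts, speed, budget))
```

The sketch only picks among levels it is given; a real option would also need to measure the cracking speed for the target hash type and count candidates per level itself.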