john-users - Re: john improvement suggestions

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID:  <loom.20051222T130123-68@post.gmane.org>
Date: Thu, 22 Dec 2005 14:58:54 +0000 (UTC)
From:  Radim Horak <yesbody@...nam.cz>
To: john-users@...ts.openwall.com
Subject:  Re: john improvement suggestions - complexity reductions


  Things to test, when you can't test the whole keyspace*

Solar Designer <solar@...> writes:

> > - I think that the complexity gap between wordlist mode (even with rules) 
and 
> > incremental modes is too big, there is a "undiscovered country" of various 
> > optimizations/complexity reductions etc. Ie. like generating only 
"pronouncable" 
> > passwords
> 
> I disagree.

I'll try to substantiate my point.

> John effectively stores statistical information on trigraphs (and on
> arbitrary trigrams) in .chr files.  I don't think you can be generally
> smarter than that when programming a filter manually.  Besides, you
> might want to be trying the less likely candidate passwords later on.

I don't know exactly how smart .chr files are, but I started to think more about 
it only after using up all wordlists & std. rules and after the incremental 
modes ceased to crack more passwords in a reasonable timeframe.

> If you _really_ want to filter out non-pronounceable passwords, you can
> do that with an external filter() that you would specify in john.conf,
> then invoke John like:
> 
> 	./john -i=alpha -e=filter_pr0n passwd

To write such filter (in john/C-like syntax) is beyond my abilities.

> For a better c/s rate (but a highly non-optimal order of tries), you
> could define an entire external mode that would produce only
> pronounceable passwords.

OK. Here's how I've done that. 
I had the pronouncable-passwords idea about 5 years ago, but I bumped in the 
limits in storage place then. This year I had time and resources to realize it. 
The basic idea was to base the generation on concatenating written syllables of 
appropriate languages. I brainstormed over 1000 syllables (length 1-6) from 
various languages - which is far from complete but should be the majority of the 
most used. Then I generated all words with up to 4 combinations, limiting words 
at length=8. This resulted in about 5 * 10^9 of words, sorted and uniqued about 
3.5 * 10^9 (around 30 GB of storage space), while the smallalpha keyspace (at 
length=8) is about 209 * 10^9. All the manipulations with the tens of GB took 
its time, luckily the final "wordlists" are very compressible - that's also 
where I bumped in the 2GB limit. 
The resulting keyspace reduction is 1:60 against the smallalpha and the success 
rate of cracked passwords (within a month) was about 20 times better than my 
custom alpha.chr (generated and tested with john 1.6.30). The testing sample was 
 several hundred non-trivial (ie. not cracked so far) hashes.


> > (http://www.multicians.org/thvv/gpw.html)
> 
> Now, it could make sense to match a very particular password generator
> like that with a John external mode. 

This generator in particular would be little more difficult to match, since it 
generates trigram statistics from a supplied wordlist (to match a particular 
language) before compilation.
 
> Additionally, many of those password generators do not produce a uniform
> distribution.  So a combination of a .chr file generated from lots of
> actual passwords produced by the particular generator and an external
> filter() which would make John obey the rules of that generator might
> work best.

I think this is good idea even without the external filter (for specific 
languages).

> > or the recent addition of alpha-numeric charset.
> 
> Actually, in most cases all.chr should be used.  I believe most of the
> uses of the new alnum.chr would be because of a lack of understanding of
> how "incremental" mode works.  But the demand for this was just too
> high so I got tired of explaining why this was not needed except in very
> few cases. 

I generated my alnum.chr variant some time ago using external filter. The 
success rate of my alphanumeric mode for non-trivial passwords (length of 7) 
within a month of computer time was many times better then all.chr.
The majority of the passwords was not based on english language, which could 
handicap the all.chr file. I couldn't generate my custom language specific all.
chr, because I didn't have set of passwords with all characters. 

Even so I just can't lose the impression, that smaller keyspace is just more 
effective to begin with.
My tests with limited length incremental mode with all.chr shows that for about 
90% of passwords it was needed 25% of total keyspace cracking-time and there 
were passwords cracked after 50% of total keyspace cracking-time. If my 
calculations are correct, the alnum.chr reduces the keyspace in ratio 1:2352 
which is enormous. And the statistical optimization in all.chr just may not fit 
a particular scenario.

Other more or less successfuly tested complexity reductions include
(targeted at Tradit. DES):
- with alphanumeric: presume the position of numbers is on the end, like 
johntr12, jtr1995. My optimized order is this: $[1290763458]
- with all keyspace: presume there is only 1 special character and is in the end 
or in the beginning. My optimized, most used specials (unescaped): $[. !*/@$]
- with all keyspace: Even if the position of the spec. char is unpredictable, 
the total amounth of spec. chars is highly unlikely to be more than 2.
- with alpha keyspace: presume the Capital characters are only in the beginning, 
or in the end or both. (When processing wordlist, they can be either capitalized 
letter from a word or added to the word.)
- with alphanumeric: presume there is only 1 alpha character, the rest are 
numeric.
- with all keyspace: some rather simple passwords are hard to crack, because 
they are concatenated from a few small words with (or without) spaces between 
them. This could be solved by smarter and more careful generation of wordlists 
from e-texts.
- with alphanumeric: When national keyboards are used and the password is 
accepting only basic ascii characters, there are often passwords blindly typed 
on english keyboard as if they were typed on national keyboard resulting in 
passwords like "hesl94ko" which is result of czech diminutive of "password". 
This again can be solved only by sensitive wordlist generation.
- with all keyspace: The next character in password is just one key of keyboard 
distance away from the previous. Like asdf, but also qazwsx, 1q2w3e and even 
hytgbnmj. (With and without the possible repetition of last char.) I haven't yet 
written a program that would generate those, external filter for john would be 
the best - any volunteers? :)

Now I'd like to encourage others to come up with similar ideas!

Now I hope I have presented some things, that are just a little bit smarter than 
all.chr ;)

> (I've been reminded that one of those cases is testing for
> compliance to password policies that require alphanumeric plus one shift
> character.)

Well well, isn't this just nice example of reducing complexity? This would 
probably result in *just one* shift character at the beginning or at the end of 
the password - increasing the resulting keyspace only 1.44 times instead of 77 
times for the full small+big alphanumeric.

*) note: I am using term keyspace, while password-space would be more 
appropriate.

> As Frank has pointed out, some of the "single crack" mode rules may only
> be used on word pairs.

I noticed that when I was testing it. I commented out those few rules and the 
rest (which is majority) gave me good results.

-Radim
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.