Date: Sat, 31 Dec 2005 06:00:32 +0300
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re:  Re: john improvement suggestions - complexity reductions

On Thu, Dec 22, 2005 at 02:58:54PM +0000, Radim Horak wrote:
> I had the pronounceable-passwords idea about 5 years ago, but I bumped into 
> storage limits then. This year I had the time and resources to realize it. 
> The basic idea was to base the generation on concatenating written syllables of 
> appropriate languages. I brainstormed over 1000 syllables (length 1-6) from 
> various languages - which is far from complete but should cover the majority of 
> the most used ones. Then I generated all combinations of up to 4 syllables, 
> limiting words to length=8. This resulted in about 5 * 10^9 words, sorted and 
> uniqued to about 3.5 * 10^9 (around 30 GB of storage space), while the smallalpha 
> keyspace (at length=8) is about 209 * 10^9. All the manipulations with the tens 
> of GB took their time; luckily the final "wordlists" are very compressible - 
> that's also where I bumped into the 2GB limit. 
> The resulting keyspace reduction is 1:60 against the smallalpha and the success 
> rate of cracked passwords (within a month) was about 20 times better than my 
> custom alpha.chr (generated and tested with john 1.6.30). The testing sample was 
> several hundred non-trivial (i.e. not cracked so far) hashes.

OK, it's an interesting result you've got.

I'd like you to clarify a few things:

1. When you say the success rate was 20 times better, you're comparing
the first month of running against your custom "wordlist" against a
non-first month of running against your custom alpha.chr, correct?  The
.chr file's success rate would have been better on its first month,
correct?

2. Did you do any comparisons against the alpha.chr and/or all.chr
supplied with John 1.6 instead of against your custom alpha.chr?
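
For reference, the kind of syllable concatenation you describe could be
sketched in Perl roughly like this; the syllable list is a tiny placeholder
rather than your actual ~1000-syllable set, and the output would still need
the sort/unique pass you mention:

#!/usr/bin/perl -w
# Rough sketch of a syllable-concatenation wordlist generator: concatenate
# up to 4 syllables, keeping only words of at most 8 characters.
# The syllable list below is a small placeholder, not a real language set.
use strict;

my @syl = qw(a an ba be bo ka ko li lo ma mi na ne ra ri ta to va ve);
my $max_syllables = 4;
my $max_length = 8;

sub generate {
    my ($word, $depth) = @_;
    print "$word\n" if $word ne '';
    return if $depth >= $max_syllables;
    foreach my $s (@syl) {
        my $next = $word . $s;
        generate($next, $depth + 1) if length($next) <= $max_length;
    }
}

generate('', 0);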

> I generated my alnum.chr variant some time ago using an external filter. The 
> success rate of my alphanumeric mode for non-trivial passwords (length of 7) 
> within a month of computer time was many times better than all.chr.
> The majority of the passwords were not based on the English language, which could 
> handicap the all.chr file. I couldn't generate my custom language-specific 
> all.chr, because I didn't have a set of passwords with all characters. 

Well, I attribute most of the difference to the previously-cracked
passwords in your john.pot matching your remaining passwords better.
There shouldn't be much of a difference in efficiency between alnum.chr
and all.chr, -- except that all.chr will eventually pick more passwords.

I had done my testing, too, with alpha.chr vs. all.chr, -- as expected,
all.chr would outperform alpha.chr after some time (way earlier than the
alpha.chr keyspace would be exhausted).

> Even so, I just can't shake the impression that a smaller keyspace is simply 
> more effective to begin with.
> My tests with a length-limited incremental mode using all.chr show that about 
> 90% of the passwords needed 25% of the total keyspace cracking time, and there 
> were passwords cracked after 50% of the total keyspace cracking time.

I'm not sure what specific test you're referring to here, but I'll try
to guess.  You've reduced MaxLen making it low enough that you can
search the keyspace exhaustively, and you actually did that, correct?
Now, you're complaining that it's taken as much as 25% of the total
cracking time (that is, the time needed to exhaustively search the
reduced keyspace) to crack 90% of the passwords in your sample.  OK,
this is quite realistic.

Now, if you reduced your keyspace to that of alnum.chr (36 different
characters, no uppercase ones), would you crack those 90% at all, or
would more than 10% of the passwords in your sample fall outside of the
alnum.chr keyspace?..

> If my 
> calculations are correct, alnum.chr reduces the keyspace in a ratio of 1:2352, 
> which is enormous.

That's about right, for candidate passwords of up to 8 characters long.
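
For reference, that figure follows from comparing the 95 printable ASCII
characters (the character count of the default "All" incremental mode)
against your 36 lowercase-plus-digit characters, at length 8:

$ perl -e 'printf "%.0f\n", (95/36)**8'
2352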

> And the statistical optimization in all.chr just may not fit 
> a particular scenario.

That's true, but the same applies to alnum.chr, -- unless you would
actually search that smaller keyspace exhaustively.

> Other more or less successfully tested complexity reductions include
> (targeted at Tradit. DES):
> - with alphanumeric: presume the numbers are at the end, like 
> johntr12, jtr1995. My optimized order is this: $[1290763458]

Well, the existing all.chr and alnum.chr should already happen to place
numbers at the end more often than in other character positions.

They would also generate all-numeric candidate passwords.

> - with all keyspace: presume there is only 1 special character and it is at the 
> end or at the beginning. My optimized, most used specials (unescaped): $[. !*/@$]
> - with all keyspace: even if the position of the spec. char is unpredictable, 
> the total number of spec. chars is highly unlikely to be more than 2.
> - with alpha keyspace: presume the capital characters are only at the beginning, 
> at the end, or both. (When processing a wordlist, they can be either a capitalized 
> letter from the word or added to the word.)
> - with alphanumeric: presume there is only 1 alpha character, the rest are 
> numeric.

Well, .chr files only encode information on trigrams (separately for
different character positions and for different password lengths), not
on entire (longer) passwords, but most of the time the information is
sufficient for John to try candidate passwords of the kinds you describe
earlier than many others.
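
If you did want to enforce restrictions like these explicitly - say, on
candidates piped from "john --stdout" - a simple Perl filter could
approximate them; the patterns below are only an illustration, not an exact
encoding of your list:

#!/usr/bin/perl -w
# Keep only candidates that match some of the quoted complexity reductions:
# lowercase letters with digits (if any) only at the end, or a single
# special character at the very beginning or the very end.
use strict;

while (my $word = <STDIN>) {
    chomp $word;
    print "$word\n"
        if $word =~ /^[a-z]+[0-9]*$/                # digits only trail letters
        || $word =~ /^[^a-zA-Z0-9][a-zA-Z0-9]+$/    # one special at the start
        || $word =~ /^[a-zA-Z0-9]+[^a-zA-Z0-9]$/;   # one special at the end
}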

> - with all keyspace: some rather simple passwords are hard to crack, because 
> they are concatenated from a few small words with (or without) spaces between 
> them. This could be solved by smarter and more careful generation of wordlists 
> from e-texts.

Yes, a double-wordlist mode to be used with really small wordlists could
make some sense.

Alternatively, given that the input wordlists would need to be really
small, such a double-wordlist file could be pre-generated with a trivial
Perl script.
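
A rough sketch of such a script (deliberately simplistic; it emits both the
plain and the space-separated concatenations mentioned above, and its output
grows quadratically with the input wordlist, so the input really does have
to be small):

#!/usr/bin/perl -w
# Trivial double-wordlist generator: print every ordered pair of words from
# a (small!) input wordlist, concatenated directly and with a space between.
use strict;

my @words;
while (<>) {
    chomp;
    next if $_ eq '';
    push @words, $_;
}

foreach my $first (@words) {
    foreach my $second (@words) {
        print "$first$second\n";
        print "$first $second\n";
    }
}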

> - with alphanumeric: when national keyboards are used and the password field 
> accepts only basic ASCII characters, there are often passwords typed blindly 
> on an English keyboard as if it were the national keyboard, resulting in 
> passwords like "hesl94ko", which comes from the Czech diminutive of "password". 
> This again can be solved only by sensitive wordlist generation.

Yes.
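
For completeness, a wordlist transformation pass for that case could look
roughly like this; the table below assumes the number row of a Czech layout
and UTF-8 input, and would need adjusting for other national layouts:

#!/usr/bin/perl -w
# Map accented characters from a Czech layout's number row to the digits on
# the same keys of a US layout, so that a national-language word "typed
# blindly" on an English keyboard is reproduced (e.g. the quoted "hesl94ko").
use strict;
use utf8;
binmode STDIN,  ':encoding(UTF-8)';
binmode STDOUT, ':encoding(UTF-8)';

my %map = (
    'ě' => '2', 'š' => '3', 'č' => '4', 'ř' => '5', 'ž' => '6',
    'ý' => '7', 'á' => '8', 'í' => '9', 'é' => '0',
);

while (my $line = <STDIN>) {
    chomp $line;
    print join('', map { exists $map{$_} ? $map{$_} : $_ } split //, $line), "\n";
}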

> - with all keyspace: the next character in the password is just one key of 
> keyboard distance away from the previous one. Like asdf, but also qazwsx, 1q2w3e

Yes, these are quite common.

> and even hytgbnmj.

I really don't think this one is common. ;-)

> (With and without possible repetition of the last char.) I haven't yet 
> written a program that would generate those; an external filter for john would 
> be the best - any volunteers? :)

It's easier to code this in Perl, although if you really want to
generate _all_ possible passwords of this kind, an external mode could
do better.
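
A rough sketch in Perl could look like this; the adjacency table only covers
a corner of the US layout and would have to be completed (and perhaps
extended with shifted characters) for real use:

#!/usr/bin/perl -w
# Generate "keyboard walk" candidates: starting from each key, repeatedly
# append either the same key again or one of its neighbours, up to a given
# length.  The adjacency table is a deliberately incomplete illustration.
use strict;

my %adj = (
    '1' => ['2', 'q'],
    '2' => ['1', '3', 'q', 'w'],
    '3' => ['2', '4', 'w', 'e'],
    'q' => ['1', '2', 'w', 'a'],
    'w' => ['2', '3', 'q', 'e', 'a', 's'],
    'e' => ['3', '4', 'w', 'r', 's', 'd'],
    'a' => ['q', 'w', 's', 'z'],
    's' => ['w', 'e', 'a', 'd', 'z', 'x'],
    'd' => ['e', 'r', 's', 'f', 'x', 'c'],
    'z' => ['a', 's', 'x'],
    'x' => ['s', 'd', 'z', 'c'],
);

my $maxlen = shift || 6;
my $minlen = 4;

sub walk {
    my ($word) = @_;
    print "$word\n" if length($word) >= $minlen;
    return if length($word) >= $maxlen;
    my $last = substr($word, -1);
    foreach my $next ($last, @{$adj{$last} || []}) {
        walk($word . $next);
    }
}

walk($_) foreach sort keys %adj;

With the fragment above and the default length limit of 6, walks such as
"asdf" and "1q2w3e" come out, along with plenty of less likely ones.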

> Now I hope I have presented some things that are just a little bit smarter than 
> all.chr ;)

Yes, and you can definitely crack more passwords by using these tricks
_after_ you've been cracking with all.chr for a while.  But if you use
them _instead_ of the .chr files (that is, not use John's "incremental"
mode at all), then you may well crack fewer password hashes because of
your being "smart" like that. ;-)

> > (I've been reminded that one of those cases is testing for
> > compliance to password policies that require alphanumeric plus one shift
> > character.)
> 
> Well well, isn't this just a nice example of reducing complexity? This would 
> probably result in *just one* shift character at the beginning or at the end of 
> the password - increasing the resulting keyspace only 1.44 times instead of 77 
> times for the full small+big alphanumeric.

I'm not sure where the "1.44 times" figure came from, but, yes, this is
not the smartest security policy for the reason you describe - yet it
makes some sense.  There's no such thing as strong Windows passwords
anyway, if an attacker can get his/her hands on the LM hashes.
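
(Perhaps the idea was exactly one uppercase letter confined to the first or
the last position: that multiplies the 36-character length-8 keyspace by
2*26/36, or about 1.44, while allowing mixed case throughout would give
(62/36)^8, roughly 77 - but that is just a guess at the intended arithmetic.)

$ perl -e 'printf "%.2f and %.0f\n", 2*26/36, (62/36)**8'
1.44 and 77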

> > As Frank has pointed out, some of the "single crack" mode rules may only
> > be used on word pairs.
> 
> I noticed that when I was testing it. I commented out those few rules and the 
> rest (which is the majority) gave me good results.

Yes, indeed.  Those are good rules to apply to "highly-focused" wordlists.

Thanks,

-- 
Alexander Peslyak <solar at openwall.com>
GPG key ID: B35D3598  fp: 6429 0D7E F130 C13E C929  6447 73C3 A290 B35D 3598
http://www.openwall.com - bringing security into open computing environments

Was I helpful?  Please give your feedback here: http://rate.affero.net/solar
