john-users - Re: .chr files (Was: automation equipped working place of hash cracker, proposal)

Message-ID: <20120415212018.GC5913@debian>
Date: Mon, 16 Apr 2012 01:20:18 +0400
From: Aleksey Cherepanov <aleksey.4erepanov@...il.com>
To: john-users@...ts.openwall.com
Subject: Re: .chr files (Was: automation equipped working place
 of hash cracker, proposal)

On Fri, Apr 13, 2012 at 10:59:42PM +0200, Simon Marechal wrote:
> On 13/04/2012 22:46, Aleksey Cherepanov wrote:
> > Assume that we have mixed passwords of two patterns. We build .chr and
> > enumerate each password with a number according to its positions in a list of
> > candidates this .chr file provides. We drop one password from our set and redo
> > the steps and numbers are changed: if ratio between the biggest group of
> > password and the smallest group is higher than before then it was a password
> > from the smallest group else it was a password from the biggest group. I am
> > not sure how to measure numbers right.
> 
> You assume that incremental mode will be a good tool to model password
> patterns. I do not believe this is the case for most, even if it worked
> reasonably well during the contest.

It is an assumption without any real practice behind it.

> > Though I think there could be other statistical methods to find groups of
> > passwords. Something like cluster analysis comes to mind.
> 
> I tried to find ideas on the topic of identifying mangling rules
> patterns. When one of them exists (and is plain enough, a simple
> mutation rule combined with some append/prepend), it is easy to
> recognize it. When it gets more complex, you need more rules, and my
> stuff obviously doesn't run in polynomial time (but linear space I believe).
> Other methods I could think of would be even slower. Is there somebody
> on this list who is knowledgeable about linguistics and/or statistics ?

You could find groups of passwords by other methods and then search for rules.  I
think it would be faster to search for simple rules once we know they can be
found.

After the contest I wrote a small script to search for similar passwords:

On Thu, Aug 11, 2011 at 12:08:30AM +0400, Aleksey Cherepanov wrote:
> > Now I realize that my helper script was wrong by design because it
> > was intended to find groups of passwords that are similar in the sense
> > of small mutations. However there are a lot of groups with passwords of
> > a certain form that are not similar in that sense. I think it is
> > possible to make a universal script for grouping (maybe using an artificial
> > neural network or some statistical methods).
> 
> However groups of passwords with small mutations are important
> (mississippi and obsessiveness are such). And their differences could
> be simplified to just case differences that are easy to find.

I did not realize that a one-character difference is easy to find too.

Assume that we have two passwords: abc and axc. We drop one character at
each position in turn and use these modified passwords as keys to store
the original passwords in a hashtable (dictionary). For each key, the
hashtable holds the list/set of passwords that can be reduced to that
key. So abc and axc will mostly be listed under different keys, but both
of them will be listed under the key 'ac'.
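
To make the keys concrete, here is a minimal sketch in Python (chosen only for readability; the function name deletion_keys is made up for this illustration and is not part of any tool) that lists every key a password produces when one character is dropped:

```python
def deletion_keys(password):
    """Return the set of strings obtained by dropping exactly one character."""
    return {password[:i] + password[i + 1:] for i in range(len(password))}

# The two example passwords share exactly one key.
shared = deletion_keys("abc") & deletion_keys("axc")
print(sorted(deletion_keys("abc")))  # ['ab', 'ac', 'bc']
print(sorted(deletion_keys("axc")))  # ['ac', 'ax', 'xc']
print(sorted(shared))                # ['ac']
```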

Implementation:
perl -e 'chomp(@w = <>); for $w (@w) { $h{substr($w, 0, $_) . substr($w, $_ + 1)}{$w} = 1 for 0 .. length($w) - 1 }; keys(%{$h{$_}}) >= 2 && print map("$_\n", sort keys %{$h{$_}}), "\n" for keys %h' < cracked

It will find those groups and print them all, separated by empty
lines. However, some passwords will appear in several groups. The script
written during the contest merges such groups. That would probably be good
here too, wouldn't it?
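
One possible way to merge overlapping groups is a union-find over the passwords. This is only a sketch under my own assumptions, not the script used during the contest; the names deletion_keys and merged_groups are hypothetical. Passwords that share a one-character-deletion key are joined, and groups that share a member collapse into one:

```python
from collections import defaultdict

def deletion_keys(password):
    """Return the set of strings obtained by dropping exactly one character."""
    return {password[:i] + password[i + 1:] for i in range(len(password))}

def merged_groups(passwords):
    """Group passwords linked, directly or through a chain of other
    passwords, by a shared one-character-deletion key."""
    passwords = sorted(set(passwords))
    parent = {p: p for p in passwords}

    def find(p):
        # Follow parent links to the representative, halving the path.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    # Bucket every password under each of its deletion keys.
    buckets = defaultdict(list)
    for p in passwords:
        for key in deletion_keys(p):
            buckets[key].append(p)

    # Every password in a bucket belongs to the same merged group.
    for members in buckets.values():
        root = find(members[0])
        for other in members[1:]:
            parent[find(other)] = root

    groups = defaultdict(list)
    for p in passwords:
        groups[find(p)].append(p)
    # Keep only groups with at least two passwords.
    return sorted(g for g in groups.values() if len(g) >= 2)

print(merged_groups(["abc", "axc", "axd", "zzz"]))
# [['abc', 'axc', 'axd']]
```

Note that merging is transitive: abc and axd differ in two characters, but they land in the same group because axc links them. Whether that is desirable depends on how the groups are used afterwards.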

Regards,
Aleksey Cherepanov