john-users - Re: De-duping NT ruleset


Message-ID: <20130331082010.GA445@openwall.com>
Date: Sun, 31 Mar 2013 12:20:10 +0400
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: De-duping NT ruleset

Rich,

On Sat, Mar 30, 2013 at 08:02:59PM -0400, Rich Rumble wrote:
> I ran a small test on "NT" using a simple set of letters:
> aaaaaaaaaaaaA
> AaAaAaAaAaAaa
> aaaaa
> AAAAA
> 
> That produced 16448 candidates, and half are repeats; it should be 8224
> unique candidates. I changed NT to be like this instead (below), and it only
> produces 8224, like I believe it's supposed to.

You can't reliably adjust a JtR ruleset to avoid producing duplicates
stemming from similarities across multiple input words without also
missing some desirable candidate passwords on other valid inputs.  This
is a fundamental problem; it is not specific to the NT ruleset.  Thus,
what you have done is introduce a bug: on other valid inputs, your
NT_Length will miss some passwords that NT would produce.  This is easy
to see:

$ cat w
aaaaaaaaaaaaA
$ ./john -w=w -ru=NT_Length -stdo|sort -u|wc -l 
words: 4096  time: 0:00:00:00 DONE (Sun Mar 31 12:10:09 2013)  w/s: 45511  current: AAAAAAAAAAAAa
4096
$ ./john -w=w -ru=NT -stdo|sort -u|wc -l
words: 8192  time: 0:00:00:00 DONE (Sun Mar 31 12:10:11 2013)  w/s: 102400  current: AAAAAAAAAAAAa
8192

... or is this on purpose, with NT_Length being intended for use on some
special cases of input files, not on all?  If so, that's inconsistent
with the original purpose of the NT ruleset; perhaps you'd need to give
your ruleset a different name and explain what inputs it is valid for, then.
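
To make the arithmetic concrete, here is a rough Python model of the
counts discussed above.  It is only a sketch, not JtR code: it assumes
that NT acts as a full case toggler (every upper/lower combination of
each input word), which is consistent with the 8192 = 2^13 figure for a
13-letter word.  The names case_variants and count_candidates are made
up for this illustration.

```python
from itertools import product


def case_variants(word):
    """Yield every upper/lower-case combination of `word`.

    Assumption: this models a full case toggler, not the actual NT rules.
    """
    # Letters contribute two choices each; caseless characters contribute one.
    choices = [
        (c.lower(), c.upper()) if c.lower() != c.upper() else (c,)
        for c in word
    ]
    for combo in product(*choices):
        yield "".join(combo)


def count_candidates(words):
    """Return (total candidates generated, number of distinct candidates)."""
    total = 0
    unique = set()
    for w in words:
        for v in case_variants(w):
            total += 1
            unique.add(v)
    return total, len(unique)


if __name__ == "__main__":
    # The four-word test list from the quoted message.
    print(count_candidates(["aaaaaaaaaaaaA", "AaAaAaAaAaAaa", "aaaaa", "AAAAA"]))
    # The single-word list used in the transcript above.
    print(count_candidates(["aaaaaaaaaaaaA"]))
```

Under that assumption, the four-word list gives 16448 candidates of
which 8224 are distinct, because the two 13-letter words differ only in
case and so do the two 5-letter words.  The single word still needs all
8192 candidates, with no duplicates at all.  In this model, the repeats
come from the input list, not from the ruleset.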

Alexander
