john-users - Re: pwgen

Message-ID: <20101117185101.GA26442@openwall.com>
Date: Wed, 17 Nov 2010 21:51:01 +0300
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: pwgen

On Wed, Nov 17, 2010 at 05:11:46PM +0100, Albert Veli wrote:
> I noticed a LOT of *nix people (and others too) use the same tool to
> generate passwords, http://pwgen.sourceforge.net/
> 
> Since the tool generates "passwords which can be easily memorized by a
> human" the keyspace of those should be smaller than the brute force
> keyspace.
> 
> Did anybody look at implementing the pwgen algorithm for guessing
> passwords in john?

I did not bother trying to deal with pwgen's phonemes directly (see
pw_phonemes.c in pwgen's source tarball), but instead I tried to exploit
the non-uniform distribution of individual characters and character
combinations resulting from pwgen's use of phonemes (and maybe from
something else).  I did:

$ ./pwgen -1cn 8 1000000 | sed 's/^/:/' > john.pot
$ wc -l john.pot
1000000 john.pot
$ sort -u john.pot | wc -l
997936
$ ./john -make=pwgen.chr
Loaded 1000000 plaintexts
Generating charsets... 1 2 3 4 5 6 7 8 DONE
Generating cracking order... DONE
Successfully written charset file: pwgen.chr (62 characters)
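
For scale: the two line counts above differ by 1000000 - 997936 = 2064
repeats.  The sketch below is not part of the original session; it
estimates how many repeats a perfectly uniform generator would be
expected to produce, using the usual birthday approximation N^2 / (2K)
for N draws from a keyspace of size K.  The function name expected_dupes
is made up for this example.

```shell
# expected_dupes N K: approximate number of repeated draws when N values
# are picked uniformly at random from a keyspace of K values
# (birthday approximation N^2 / (2K), valid while repeats are rare).
expected_dupes() {
    LC_ALL=C awk -v n="$1" -v k="$2" \
        'BEGIN { printf "%.4f\n", n * n / (2 * k) }'
}

# One million 8-character passwords over 62 characters (62^8 values):
expected_dupes 1000000 218340105584896
```

Under that assumption the expected number of repeats is far below one,
so seeing thousands of them points to a much smaller effective keyspace.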

As you can see, the distribution is so bad that a million generated
passwords include 2064 repeats (yes, full passwords!).  To illustrate
this another way, here are some stats for the number of occurrences of
each possible character in the 8th position:

$ cut -c9 john.pot | sort | uniq -c | sort -rn > pos8freq
$ wc -l pos8freq 
60 pos8freq
$ head -17 pos8freq 
 109318 e
  91751 i
  91467 h
  74485 o
  67320 u
  56825 a
  17866 9
  17835 6
  17801 1
  17784 3
  17739 5
  17737 8
  17708 2
  17675 0
  17622 4
  17549 7
  16442 g
$ echo `head -8 pos8freq | awk '{ print $1 }'` | sed 's/ /+/g' | bc
526867
$ echo `head -17 pos8freq | awk '{ print $1 }'` | sed 's/ /+/g' | bc
684924

In plain English, 60 out of 62 characters are possible in the 8th
position with pwgen's default settings.  (The options I had passed to
pwgen actually make it match its defaults for tty-enabled operation,
even though I had redirected its output to a pipe.)

If we try only 8 out of 62, we nevertheless have a 52% chance of getting
a pwgen-generated password cracked (or we'd crack 52% of a large number
of such passwords).  If we try 17, we improve our chances to over 68%.
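
The same arithmetic can be wrapped in a small helper to see how coverage
grows with the number of characters tried.  This is only a sketch: the
function name coverage is made up here, and it assumes a frequency file
in the "uniq -c | sort -rn" format of pos8freq above (count first, most
frequent character first).

```shell
# coverage N FILE: percentage of all counted passwords whose character is
# among the N most frequent ones listed in FILE.
coverage() {
    LC_ALL=C awk -v n="$1" '
        { total += $1 }          # sum of all counts in the file
        NR <= n { top += $1 }    # sum of the first n (most frequent) counts
        END { if (total > 0) printf "%.1f\n", 100 * top / total }
    ' "$2"
}

# Example use with the file generated above:
#   coverage 8 pos8freq
#   coverage 17 pos8freq
```

Run against pos8freq, this should reproduce the 52% and 68% figures
quoted above, and any other cutoff can be checked the same way.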

Since JtR's incremental mode is even smarter in that it considers the
preceding characters as well, and since it applies this approach to all
character positions and not just the 8th, it should crack way more than
50% of passwords (perhaps 90% to 99%?) in 1/8th of the time needed to
search the whole 62-character keyspace for length 8.  It would be
interesting to give this a try.
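
To actually run incremental mode with the new charset file, john.conf
needs a section that points at it.  A minimal sketch follows, assuming
pwgen.chr sits in JtR's run directory; the mode name "pwgen", the helper
name add_pwgen_mode, and the hash file name hashes.txt are all made up
for this example.

```shell
# add_pwgen_mode CONF: append an incremental mode section named "pwgen"
# to the given john.conf file.  $JOHN is left unexpanded on purpose,
# since JtR itself substitutes it when reading the file.
add_pwgen_mode() {
    cat >> "$1" << 'EOF'

[Incremental:pwgen]
File = $JOHN/pwgen.chr
MinLen = 8
MaxLen = 8
CharCount = 62
EOF
}

# Example use (hashes.txt is a hypothetical file of password hashes):
#   add_pwgen_mode john.conf
#   ./john --incremental=pwgen hashes.txt
```

MinLen and MaxLen are both set to 8 here only because the sample
passwords above were all generated with length 8.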

I experimented with the above a year ago.  I only had one specific
password hash to attack (a client's lost password), so I did not proceed
to figure out the actual percentages, etc. for incremental mode runs on
multiple hashes of pwgen'ed passwords.  I also did not use pwgen.chr
during the KoreLogic contest; maybe it would have helped a little bit
since, according to info KoreLogic released after the contest, 1000 of
the passwords were actually pwgen'ed (random-1000-from-pwgen.txt).
They may be used as testing material now - to demonstrate that attacks
actually work for passwords generated by someone else.
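
One way such a test could be scripted is sketched below.  It is an
illustration only: the helper name hit_rate is made up, it assumes the
test file holds one plaintext password per line (the actual format of
random-1000-from-pwgen.txt is not described here), and it assumes an
incremental mode named "pwgen" has been defined in john.conf.

```shell
# hit_rate TARGETS: read candidate passwords on stdin and report how many
# distinct lines of the TARGETS file appear among them.
hit_rate() {
    awk -v targets="$1" '
        BEGIN {
            # Load the distinct target passwords into memory.
            while ((getline line < targets) > 0)
                if (!(line in want)) { want[line] = 1; total++ }
        }
        # Count each target once, the first time a candidate matches it.
        ($0 in want) && want[$0] == 1 { want[$0] = 2; hits++ }
        END { printf "%d of %d\n", hits, total }
    '
}

# Example use, testing the first 100 million candidates:
#   ./john --incremental=pwgen --stdout | head -n 100000000 |
#       hit_rate random-1000-from-pwgen.txt
```

Working on plaintexts with --stdout avoids hashing altogether, so it
measures only how early the candidates appear in the cracking order.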

Alexander