john-users - Re: Bleeding jumbo now defaults to UTF-8

Message-ID: <556C57B4.4060504@gmail.com>
Date: Mon, 01 Jun 2015 15:01:40 +0200
From: Marek Wrzosek <marek.wrzosek@...il.com>
To: john-users@...ts.openwall.com
Subject: Re: Bleeding jumbo now defaults to UTF-8

On 01.06.2015 at 03:08, magnum wrote:
> On 2015-06-01 01:34, Marek Wrzosek wrote:
>> On 01.06.2015 at 00:44, magnum wrote:
>>> On 2015-05-31 16:09, Marek Wrzosek wrote:
>>>> Let's summarize what has changed. Before the default became UTF-8,
>>>> john.pot held the plaintexts, and it was possible to use many
>>>> encodings in one wordlist. The plaintexts were known, but the
>>>> human-readable form of each password was lost. After the change,
>>>> john can only use single-encoding wordlists and stores
>>>> human-readable passwords in john.pot, but the plaintexts are gone
>>>> and one has to repeat the cracking run for each target encoding.
>>>> Defaulting to UTF-8 has solved some issues but created new ones.
>>>
>>> True. How often are the new defaults a problem IRL though? If you
>>> audit a system it will likely have just one encoding and you should
>>> have a good idea which it is.
>>>
>>> magnum
>>>
>> Can you guarantee that all passwords have just one encoding on an
>> audited system that runs an Internet service used by people from all
>> over the world, on different operating systems and speaking
>> different languages? It could be true today. But was it true in the
>> past?
> 
> We're talking about defaults and common cases. For uncommon cases, you'd
> use non-defaults. Makes sense, doesn't it? It has been the other way
> round until now, and it did not make sense.
> 
>> For systems with mixed encodings, old jumbo would crack all encodings
>> in one run using e.g. all.lst. New jumbo will need several runs, and
>> e.g. all ASCII-only passwords will be tried again in each of them.
> 
> Only if you insist on the idea of a single gigantic universal wordlist.
> No matter how you use that beast, you'll end up suboptimal (but easy to
> use).
> 
> Hey, no functionality was removed. Just reset john.conf to the legacy
> settings and temporarily use that. Do so with a separate pot file (using
> the -pot option) so you don't ruin the all-utf8 pot file.
> 
> I'd do it differently though.
> 
> magnum
> 
> 
I was talking about a non-default "smart" target encoding, but if that
is impossible, that's OK.

Even single-language wordlists contain words without any
language-specific letters, and those would be tried again unchanged if
someone runs john several times with different --target-encoding
values. So the other workaround is to split the UTF-8 wordlists: put
the ASCII-only passwords into an ASCII wordlist, and put the remaining
passwords (those with at least one non-ASCII character) into a
wordlist for cracking with a different --target-encoding. Separating
the Russian passwords was an easy task. Is there a simple way to make
such wordlists for e.g. German or French, or for the "iso-8859-1 part"
of all.lst_utf8? What would the grep command look like to achieve
this?

Reverting to --enc=ascii is not the answer, because one ends up with
plaintexts in an unknown encoding that may differ from line to line.
UTF-8 pot files are better: less information is lost than with the old
ones. Guessing the encoding of a UTF-8 password is easier than
guessing the human-friendly form of a password in an unknown encoding.
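
To make the split described above concrete, here is a rough sketch in
Python rather than grep. It assumes the words have already been decoded
from a valid UTF-8 wordlist; the function name split_wordlist and the
sample words are made up for illustration and are not part of john.
Note that "fits in iso-8859-1" only means every character is
representable in that encoding, not that the word is German or French.

```python
def split_wordlist(words, legacy_encoding="iso-8859-1"):
    """Split decoded words into three lists.

    ascii_only:  words made of ASCII characters only
    legacy_only: non-ASCII words that can be encoded in legacy_encoding
    other:       non-ASCII words that cannot be encoded in legacy_encoding
    """
    ascii_only, legacy_only, other = [], [], []
    for word in words:
        if word.isascii():
            # Identical bytes in every ASCII-compatible target encoding,
            # so these only need to be tried once.
            ascii_only.append(word)
            continue
        try:
            word.encode(legacy_encoding)
        except UnicodeEncodeError:
            other.append(word)
        else:
            legacy_only.append(word)
    return ascii_only, legacy_only, other


if __name__ == "__main__":
    # Hypothetical sample; a real run would read lines from the wordlist
    # with open(path, encoding="utf-8") and strip the line endings.
    sample = ["password", "Straße", "café", "пароль"]
    for name, part in zip(("ascii", "legacy", "other"), split_wordlist(sample)):
        print(name, part)
```

Passing another Python codec name as legacy_encoding (for example
"koi8-r") would select the words that fit that encoding instead. Each
resulting list can then be written out as its own UTF-8 wordlist and
used with the matching --target-encoding.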
-- 
Marek Wrzosek
marek.wrzosek@...il.com