Message-ID: <472408d6a850a7d84c2a08e0c3d87bdb@smtp.hushmail.com>
Date: Mon, 25 May 2015 10:32:07 +0200
From: magnum <john.magnum@...hmail.com>
To: john-users@...ts.openwall.com
Subject: Re: Bleeding jumbo now defaults to UTF-8
On 2015-05-24 06:17, Solar Designer wrote:
> On Fri, May 22, 2015 at 06:33:42PM +0200, magnum wrote:
>> On 2015-05-22 16:48, Marek Wrzosek wrote:
>>> That's great news! What is the simplest way to "repair" all.lst from
>>> Openwall?
>>
>> I bet it's a mix of encodings so can't simply be converted.
>
> Yes. And maybe it should stay as a mix of encodings despite magnum's
> change, because quite often multiple encodings may have been used in
> target passwords.
Yes, it might be relevant to keep one copy like that. Rockyou shows a
real-world case where most of the hashes were UTF-8 but some were
ISO-8859-1/CP1252 and a few were something else.
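To get a rough idea of how mixed a given wordlist is, each line can be checked for being pure ASCII, valid UTF-8, or something else. The following is only an illustrative sketch and not part of JtR; the names classify_line and survey are made up for this example. Note that a line that happens to decode as UTF-8 could still have been written in a legacy encoding, so the counts are an estimate.

```python
from collections import Counter


def classify_line(raw: bytes) -> str:
    """Return 'ascii', 'utf-8' or 'other' for one wordlist line (hypothetical helper)."""
    try:
        raw.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # 8-bit bytes that are not valid UTF-8: some legacy codepage
        return "other"


def survey(lines):
    """Count how many lines fall into each class, ignoring line endings."""
    return Counter(classify_line(line.rstrip(b"\r\n")) for line in lines)


if __name__ == "__main__":
    sample = [
        b"password\n",
        "Möller\n".encode("utf-8"),
        "Möller\n".encode("iso-8859-1"),
    ]
    print(survey(sample))
```

For a real file, the lines would be read in binary mode, e.g. `survey(open("all.lst", "rb"))`, so that no decoding happens before the check.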
> I am worried that some lines are not valid UTF-8, though.
If used with the new defaults, a warning will be emitted and the
conversion will be truncated at the first 8-bit byte that is not valid
UTF-8 ("Möller" in 8859-1 will become "M").
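The effect of that truncation can be imitated in a few lines of Python. This is a sketch of the described behaviour only, not JtR's actual conversion code, and truncate_at_invalid_utf8 is a hypothetical name.

```python
def truncate_at_invalid_utf8(raw: bytes) -> str:
    """Decode raw as UTF-8, keeping only the part before the first invalid byte."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte that could not be decoded,
        # so everything before it is known to be valid UTF-8.
        return raw[:err.start].decode("utf-8")


if __name__ == "__main__":
    print(truncate_at_invalid_utf8("Möller".encode("iso-8859-1")))
    print(truncate_at_invalid_utf8("Möller".encode("utf-8")))
```

The ISO-8859-1 form of the word is the bytes 4D F6 6C 6C 65 72; 0xF6 cannot start a UTF-8 sequence, so only the leading "M" survives.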
> How do we ensure those are tested against the hashes
> verbatim, like core (non-jumbo) JtR would test them? Will this just
> happen that way despite the recent change of default in jumbo?
If running with --enc=raw, the warnings will not be emitted and it will
behave just like non-jumbo (at least in this regard). --enc=raw is
actually just an alias for --enc=ascii, but the latter name might be
confusing for this use.
> magnum, what do you suggest we do? Simply assuming that e.g. md5crypt
> hashes are likely of UTF-8 plaintexts won't do. Some of them might be,
> but some older ones might be iso-8859-1 or koi8-r or windows-1251 as
> well. That's why current all.lst mixes all of these encodings together.
You would either run a mixed-codepage wordlist with --enc=raw (but, just
like core john, you won't get e.g. case-flipping of 8-bit characters.
Also, note that while this may be sensible for md5crypt, it isn't for
NT, or any other hash that uses UTF-16 internally).
Or you'd use UTF-8 wordlist(s) (perhaps some of the non-"all" ones) and
specify a target encoding. This will work for NT et al too.
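To illustrate why the two approaches differ, here is a small Python sketch (not JtR code; to_target_bytes is a hypothetical helper). A word held as proper Unicode can be re-encoded to whatever byte form a hash expects, including UTF-16LE for NT-style hashes. Raw 8-bit bytes of unknown codepage cannot, because the UTF-16 result depends on which codepage you assume.

```python
from typing import Optional


def to_target_bytes(word: str, target: str) -> Optional[bytes]:
    """Encode a Unicode word to the target encoding, or None if it cannot be represented."""
    try:
        return word.encode(target)
    except UnicodeEncodeError:
        return None


if __name__ == "__main__":
    word = "Möller"  # as read from a UTF-8 wordlist
    for target in ("utf-8", "iso-8859-1", "koi8-r", "utf-16-le"):
        print(target, to_target_bytes(word, target))

    # The same raw bytes give different UTF-16 depending on the assumed codepage,
    # which is why raw mode cannot be right for hashes that use UTF-16 internally.
    raw = b"M\xf6ller"
    as_latin1 = raw.decode("iso-8859-1").encode("utf-16-le")
    as_koi8r = raw.decode("koi8-r").encode("utf-16-le")
    print(as_latin1 != as_koi8r)
```

A word that cannot be represented in the target encoding (here "Möller" in KOI8-R) simply yields no candidate in this sketch; how JtR itself handles that case is not covered here.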
magnum