Date: Sat, 05 Mar 2011 14:59:23 +0100
From: magnum <>
To: JimF <>,
Subject: Re: md5_gen, proposed functionality

John-Dev list, I have exchanged a couple of mails with Jim regarding 
having unicode($p) as a function in md5_gen. I did it off-list as I 
thought most john-users members would not benefit from the discussion, 
but now that this new list exists, I am copying it to you too.

On 03/05/2011 03:36 AM, JimF wrote:
> Explain a little more the situations you have need for this.
> 1. Is it input file password hashes?

The situation is EXACTLY like NT hashes: The input file is ASCII, but 
the hashes *were created from* plaintexts encoded in UTF-16.

Consider this:

$ echo -n password | md5sum
5f4dcc3b5aa765d61d8327deb882cf99  -

$ echo -n password | iconv -t utf-16le | md5sum
b081dbe85e1ec3ffc3d4e7d0227400cd  -

You can crack the first hash with raw-md5 or md5(p) but you will *never 
ever* crack the second hash without my fix. Actually that one is still 
obviously raw-md5 but it is usually described as md5(unicode($p)) for 
clarity. Just google "md5(unicode(pass))".
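
For readers without iconv at hand, the same comparison can be sketched in 
Python. This is only an illustration: the helper names md5_hex and 
md5_unicode are made up here and are not part of John or md5_gen. It 
assumes "unicode" means UTF-16LE without a byte order mark, as in the 
iconv command above.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the hex MD5 digest of raw bytes."""
    return hashlib.md5(data).hexdigest()


def md5_unicode(password: str) -> str:
    """md5(unicode($p)): hash the UTF-16LE encoding of the password (no BOM)."""
    return md5_hex(password.encode("utf-16-le"))


# Same plaintext, two different byte strings, so two different hashes.
print(md5_hex(b"password"))     # plain raw-md5 of the 8-bit string
print(md5_unicode("password"))  # raw-md5 of the 16-byte UTF-16LE string
```

The first line printed should match the md5sum output above for the plain 
string; the second is the digest of the 16 bytes shown in the hex dump.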

$ echo -n password | iconv -t utf-16le | hd
00000000  70 00 61 00 73 00 73 00  77 00 6f 00 72 00 64 00 

Obviously you can't use a wordlist file encoded in UTF-16 because John 
will read the "p" from password and then see a NUL byte that ends the 
string. We would not want to change that because it would be a total 
rewrite of the whole lot. And it would likely slow down stuff when not 

> 2. Is it dictionary files?

NT hashes are md4(unicode($p)), right? This is the same thing, but I 
want md5(unicode($p)). Files, rules, modes are not changed. In my way of 
doing it (as well as in the present mscash, NT and NTLMv2 formats, for 
example) the candidates are never really converted per se; in 
set_key() they are what can be described as cast, so that a null byte 
is effectively inserted after every char.
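
The "cast" described above can be sketched as follows. This is a Python 
illustration of the idea only, not the actual set_key() code from any 
format; cast_to_utf16le is a hypothetical name. Each 8-bit char is 
widened to 16 bits, little-endian, which amounts to appending a zero 
byte to it.

```python
def cast_to_utf16le(key: bytes) -> bytes:
    """Widen each 8-bit char to a 16-bit little-endian code unit."""
    out = bytearray()
    for byte in key:
        out.append(byte)  # low byte: the original char
        out.append(0)     # high byte: always zero for 8-bit input
    return bytes(out)


# For ASCII input the result is the same as a real UTF-16LE conversion.
print(cast_to_utf16le(b"password") == "password".encode("utf-16-le"))
```

No table lookup or decoding is involved, which is why this approach is 
cheap compared with a true charset conversion.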

> 4. Exactly what do you 'think' needs done?

On a higher level and not caring about performance, I would like to be 
able to create an md5_gen(1xxx) or a thin format that uses unicode($p) 
instead of just ($p).

We have functions like MD5GenBaseFunc__append_salt so maybe we can have 
a function like MD5GenBaseFunc__convert2unicode.

Then I could state md5(unicode($p)) like this:

# expression shown will be the string:   md5_gen(1100) md5(unicode($p))

This one function would also let me do md5(unicode($p.$s)) if I appended 
the salt before the conversion.
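
To illustrate why the order of "append salt" and "convert" matters, here 
is a small Python sketch. The function names are hypothetical and do not 
correspond to any md5_gen primitive; the salt is assumed to be plain 
ASCII and "unicode" is assumed to mean UTF-16LE.

```python
import hashlib


def md5_unicode_p_s(password: str, salt: str) -> str:
    """md5(unicode($p.$s)): append the salt first, then convert everything."""
    return hashlib.md5((password + salt).encode("utf-16-le")).hexdigest()


def md5_unicode_p_then_s(password: str, salt: str) -> str:
    """md5(unicode($p).$s): convert the password only, then append an 8-bit salt."""
    data = password.encode("utf-16-le") + salt.encode("ascii")
    return hashlib.md5(data).hexdigest()


# The two orderings hash different byte strings, so the digests differ.
print(md5_unicode_p_s("password", "ab"))
print(md5_unicode_p_then_s("password", "ab"))
```

With a single conversion step, which of the two expressions is computed 
depends only on whether the salt is appended before or after it.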

> It may be that I can add some input flags, and add code to the proper
> functions that change thier behavior based upon these flags. Then you
> could simply build a format which has these flags set, and it would 'work'.

On a lower level, you'll know better than me. If we did not care about 
performance, this could happen in a separate function plain2utf16() 
called before set_key(), and maybe this is still an option. But that would 
mean that from that step on *key is a string (*wchar?) that can contain NUL 
bytes, so you'd have to rewrite set_key/get_key anyway.


I have left the UTF-8 discussion out of this because it makes things 
much more complicated and I think we should address it later. But this 
"casting" conversion will ONLY work for ASCII and ISO-8859-1 wordlists. 
This is a current problem with NT hashes too. UTF-8 support is a much lower 
priority, so let's leave that for now.
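
The limitation can be shown with a short Python sketch (again only an 
illustration; cast_to_utf16le is a hypothetical stand-in for the cast 
done in set_key()). ISO-8859-1 byte values coincide with the first 256 
Unicode code points, so widening each byte gives valid UTF-16LE. A UTF-8 
wordlist encodes non-ASCII chars as several bytes, and widening those 
bytes one by one gives something else.

```python
def cast_to_utf16le(key: bytes) -> bytes:
    """Append a zero byte to every input byte (no real charset conversion)."""
    return b"".join(bytes((byte, 0)) for byte in key)


word = "caf\u00e9"  # "cafe" with an acute accent on the e (U+00E9)
expected = word.encode("utf-16-le")

# ISO-8859-1 input: one byte per char, so the cast is correct.
latin1_ok = cast_to_utf16le(word.encode("iso-8859-1")) == expected

# UTF-8 input: U+00E9 is two bytes (C3 A9), so the cast yields two wrong chars.
utf8_ok = cast_to_utf16le(word.encode("utf-8")) == expected

print(latin1_ok, utf8_ok)  # True False
```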

