Date: Thu, 20 Aug 2009 11:26:22 -0500
From: "JimF" <>
To: <>
Subject: Thoughts and questions on creation of a 'generic' MD5 hash set format (to handle 'all' of them)

I have been busy with 'real work' (damn kids want food every day, I guess), so I have not done a lot with john for a while.

However, I have been thinking about how best to feed john some of the many md5 hash types (families).  I propose something like this:

Password 4turtles
Salts either ttzzz or i a   (i space a)

uid:md5($p.$s5)ttzzzf879de3ea2c872243bf38ff482fecb7f     (pw=4turtles salt=ttzzz)
uid:md5($s5.$p)ttzzzb5944fc539d959a300ac9896bb98bada     (pw=4turtles salt=ttzzz)
uid:md5($s5.$p.$s5)ttzzz9f0367a67426e852a08b54e0d25b2f99 (pw=4turtles salt=ttzzz)
uid:md5(md5($p).$s3)i a2abca28714f40edb09a639f555e63098  (pw=4turtles salt=i a)
uid:md5(md5($p))d894b3efe537e7c180c71129b7a5221b         (pw=4turtles)
uid:md5(md5(md5($p)))5ede6d1ca68d4c589c29084857cf0584    (pw=4turtles)
uid:md5($p)32ec7dad341b379d0b9103e45e7d1438              (pw=4turtles)
(note the last one is simply 'raw' MD5)
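
To make the proposed line layout concrete, here is a minimal Python sketch of how such a line might be split into its parts. The function name split_line and the constant HEX_LEN are made up for illustration, and the sketch assumes the line carries no trailing "(pw=...)" annotation and that the hash is always the last 32 lower-case hex characters.

```python
import re

HEX_LEN = 32  # an MD5 digest written as lower-case hex


def split_line(line):
    """Split 'uid:md5(...)<salt><hash>' into (uid, signature, salt, hash)."""
    uid, _, rest = line.partition(":")
    if not rest.startswith("md5("):
        raise ValueError("no md5(...) signature")
    # Walk to the parenthesis that closes the outermost md5( ... ).
    depth = 0
    for i, ch in enumerate(rest):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                break
    else:
        raise ValueError("unbalanced parentheses")
    signature, tail = rest[:i + 1], rest[i + 1:]
    if len(tail) < HEX_LEN:
        raise ValueError("line too short to hold a hash")
    # Whatever precedes the final 32 characters is taken to be the salt.
    salt, digest = tail[:-HEX_LEN], tail[-HEX_LEN:]
    if not re.fullmatch(r"[0-9a-f]{32}", digest):
        raise ValueError("hash is not 32 lower-case hex digits")
    return uid, signature, salt, digest
```

For the raw MD5 case the salt simply comes back empty, so the same splitter covers salted and unsalted lines.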

What are people's thoughts about this 'format'?  Then john could simply have a -format=md5-generic. I would think that john could be coded to handle this pretty easily (the parsing is trivial, since all you parse is md5, ( ), $p and the $sLen value).  It could even be 'optimized' by hard coding many of the 'common' known types, and then building a simple parser to handle the ones whose signature we do not recognise, so that a 'new' format may not get all the low-level 'tweaks', but should still be pretty damn fast.
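
As a rough illustration of the 'simple parser' fallback, the following Python sketch evaluates a signature for a given password and salt. It is only a sketch of the idea, not how john would do it in C: the names TOKEN, md5_hex and evaluate are made up here, it only understands md5( ), '.', $p and fixed-length $sN, and it assumes every inner md5 result is turned into lower-case hex before being hashed again.

```python
import hashlib
import re

# The only pieces a signature may contain in this sketch.
TOKEN = re.compile(r"md5\(|\)|\.|\$p|\$s\d*")


def md5_hex(data):
    """MD5 of data, returned as lower-case hex bytes ready for re-hashing."""
    return hashlib.md5(data).hexdigest().encode("ascii")


def evaluate(signature, password, salt=b""):
    """Compute the hex digest that `signature` describes for password/salt."""
    tokens = TOKEN.findall(signature)
    if not tokens or tokens[0] != "md5(" or "".join(tokens) != signature:
        raise ValueError("not a recognised md5(...) signature")
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("signature ends too early")
        tok = tokens[pos]
        pos += 1
        if tok == "md5(":
            inner = concat()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing ')'")
            pos += 1
            return md5_hex(inner)
        if tok == "$p":
            return password
        if tok.startswith("$s"):
            return salt
        raise ValueError("unexpected token " + tok)

    def concat():
        # One or more terms joined by '.', concatenated left to right.
        nonlocal pos
        out = term()
        while pos < len(tokens) and tokens[pos] == ".":
            pos += 1
            out += term()
        return out

    digest = term()
    if pos != len(tokens):
        raise ValueError("trailing text after signature")
    return digest.decode("ascii")
```

Something like evaluate("md5(md5($p).$s3)", b"4turtles", b"i a") would then produce the candidate hash to compare against the one stored on the line. The hard-coded 'common' types would just be faster special cases of the same thing.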

I 'believe' that ONLY 1 type of signature would be possible in a file at a time.  The format would probably simply use the first 'valid' md5(...) signature, set itself up to process 'that' type, and then only load lines from the file with that signature.  That is much like what happens today when there are multiple types mixed in the passwd file: the first 'type' is what is used.  Note, we might have to 'add' a command option to allow the user to 'force' which type.  So, he could call with -format=md5-generic -md5-type=md5(md5($p.$s6)) and get only those types processed, even if the first valid md5-generic signature seen was not md5(md5($p.$s6)).
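
The 'first valid signature wins, unless forced' loading rule could look roughly like this in Python. The names signature_of and select_lines and the forced_type parameter are hypothetical; forced_type stands in for the proposed (not yet existing) -md5-type option.

```python
def signature_of(line):
    """Return the leading md5(...) signature of the hash field, or None."""
    _, sep, rest = line.partition(":")
    if not sep or not rest.startswith("md5("):
        return None
    depth = 0
    for i, ch in enumerate(rest):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return rest[:i + 1]
    return None  # parentheses never balanced


def select_lines(lines, forced_type=None):
    """Keep only lines whose signature matches the forced or first-seen type."""
    wanted = forced_type
    kept = []
    for line in lines:
        sig = signature_of(line)
        if sig is None:
            continue  # not an md5-generic line at all
        if wanted is None:
            wanted = sig  # first valid signature decides the type
        if sig == wanted:
            kept.append(line)
    return wanted, kept
```

Lines with any other signature are silently skipped, the same way john skips hashes of a different type today.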

What do people think of this?   NOTE, this could be enhanced for MD4, SHA1, etc or any other 'base' hash family.  

I can see a few issues.  How do we handle up-cased hex?  Also, how do we deal with base-64 or other encodings?  But to start, we could handle most any situation of the multi-md5 'standard' with passwords and salts (i.e. lower-cased hex, with all intermediates fully converted to lower-case hex before re-hashing).

For salts, I think $s salts should be of this format:   $s#[-#-delim]  so $s5 means a 5 char salt, $s16 is a 16 char salt, while $s2-6-$ means variable 2 to 6 char salts, delimited with $ (but the trailing $ is not part of the salt).  The delim is probably not needed, since we 'know' the fixed size of the hash.  In md5-base16, it is 32.  So if we have $s2-6 and there are 35 bytes after the md5 signature, then the salt is 3 bytes.  This might also be a workable way to do it.  So, we 'could' code for the salt signature to be in the format $s#[-#[-d]]
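
Here is a small Python sketch of the $s#[-#[-d]] idea, including working out the salt length from the number of bytes left after the signature. The names SALT_SPEC, parse_salt_spec and split_salt are made up for illustration, and the sketch assumes the delimiter, when given, is a single character sitting between the salt and the hash.

```python
import re

# $s<min>[-<max>[-<delim>]]
SALT_SPEC = re.compile(r"\$s(\d+)(?:-(\d+)(?:-(.))?)?")


def parse_salt_spec(spec):
    """Return (min_len, max_len, delim) for a spec such as '$s5' or '$s2-6-$'."""
    m = SALT_SPEC.fullmatch(spec)
    if not m:
        raise ValueError("bad salt spec: " + spec)
    lo = int(m.group(1))
    hi = int(m.group(2)) if m.group(2) else lo
    if hi < lo:
        raise ValueError("salt range is backwards: " + spec)
    return lo, hi, m.group(3)


def split_salt(tail, spec, hash_len=32):
    """Split the text after the md5(...) signature into (salt, hash)."""
    lo, hi, delim = parse_salt_spec(spec)
    # The hash size is fixed, so the salt length is whatever is left over.
    extra = 1 if delim is not None else 0
    salt_len = len(tail) - hash_len - extra
    if not lo <= salt_len <= hi:
        raise ValueError("salt length outside the range given by " + spec)
    if delim is not None and tail[salt_len] != delim:
        raise ValueError("missing salt delimiter")
    return tail[:salt_len], tail[salt_len + extra:]
```

With $s2-6 and 35 characters after the signature, this gives a 3 character salt, as in the example above, which is why the delimiter looks optional.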

Hashes like phpass might still need more 'custom' work, since the number of recursive md5's is based upon the signature.  It 'could' be done, but might be better as a specialty format.

