Date: Fri, 8 Jun 2012 16:13:10 +0200
From: Frank Dittrich <frank_dittrich@...mail.com>
To: john-users@...ts.openwall.com
Subject: Re: JtR to process the LinkedIn hash dump

On 06/08/2012 02:41 PM, jfoug wrote:
>> There should be a possible permanent solution for this "special" format:
>> During prepare, or during split, always initialize the first 32 bits.
>>
>> Then, john should see only one hash, and write hashes with '00000' to
>> the pot file.
> 
> I do not believe this will help at all.  If you do get this to work, then
> you lose the non 00000 hashes altogether.

But if we assume that there are no false positives, then nobody really
needs to store the 00000 hashes in the pot file.

In prepare for this format (or in split?), you treat any hash as if it
began with 00000.
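
To illustrate what I mean by treating every hash as if it began with
00000, here is a rough Python sketch (not JtR code; the real thing would
live in the format's prepare() or split() in C). The function name
canonical_form is made up for this example.

```python
HEX_DIGITS = set("0123456789abcdef")

def canonical_form(hex_hash: str) -> str:
    """Return the 40-digit hex SHA-1 with its first five digits forced to '00000'.

    Hypothetical helper: it mimics what prepare()/split() could do so that
    both external representations of one hash map to the same string.
    """
    h = hex_hash.strip().lower()
    if len(h) != 40 or not set(h) <= HEX_DIGITS:
        raise ValueError("expected a 40-digit hex SHA-1")
    return "00000" + h[5:]
```

With this, the full hash and its 00000 variant compare equal as strings,
so the loader would see only one hash.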

> The problem is you really DO have duplicates in there.

Yes, you do have 2 different external representations for the same hash.

> DOES not see the duplicates, because it is not purposely broken like the
> format (the format is 'broken' in that it ignores the first 32 bits).  Thus,
> when you run JtR and it finds a dupe, it will write the first one to the pot
> file.

That's why I suggested that get_source recompute the correct SHA-1 (the
one without 00000) and store this in the pot file, no matter which of
the hashes you got.
(If you got the one without the first bits zeroed out, you can just
convert the internal binary hash into the external representation.
Only if the first bits are all zero do you need to compute the SHA-1
hash of the password.)
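
Roughly, the logic I have in mind looks like the Python sketch below.
This is not the actual get_source() interface; full_sha1_for_pot is a
made-up name, and I am assuming the password bytes are its UTF-8
encoding.

```python
import hashlib

def full_sha1_for_pot(loaded_hex: str, password: str) -> str:
    """Return the full SHA-1 (hex) that should be written to the pot file.

    Hypothetical helper, not JtR's get_source(). Assumes the password
    was hashed as UTF-8 bytes.
    """
    loaded = loaded_hex.strip().lower()
    if not loaded.startswith("00000"):
        # The loaded hash still has its real leading digits: use it as-is.
        return loaded
    # Leading digits were zeroed out: recompute them from the password.
    full = hashlib.sha1(password.encode("utf-8")).hexdigest()
    if full[5:] != loaded[5:]:
        raise ValueError("password does not match the loaded hash")
    return full
```

Either way, the pot file would end up with the same string for both
variants of a hash.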

> Then, upon loading from the pot file, a different dupe detection logic is
> used.  Within loader, the actual strings are compared (after prepare() and
> split()  ).  Thus, within loader, whichever hash was stored in the .pot
> file, WILL be removed from the current run. However, if there was another
> (or multiple), which are duplicates, BUT which have a different full hash
> signature, then those will be left intact.

Does that mean the loader will not run any format-specific code to
convert the representation stored in the pot file into the one needed
for this format? Then my suggestion will indeed not work.

> Now, re-reading your above (and below),  I do see what you are saying, but I
> have to see just how this will properly integrate in. This is like prepare,
> but in reverse.  Prepare is used to 'fixup' the ciphertext prior to JtR
> running (split also, but it has no information about GECOS which is often
> needed).  This would be fixing up the ciphertext on the backside, KNOWING
> the missing bits.  I do not think that get_source() would work for this,
> because it gets used for more than just writing the .pot file lines.

But get_source has the password as one of the input parameters, and
computing the SHA-1 of the known password should be easy.
What am I missing?
Is it the loader problem you mentioned above?
(Maybe this discussion is getting too technical for john-users.)

> One 'other' way to fix this issue, is to simply write a pre-processor, that
> simply drops any duplicates from the original input list.  Keep the real
> hash, and dump the 00000 smashed value that is the same.  

Yes. Advantage: you'd have to do it just once, and not for every john
session.
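
Such a pre-processor could be as simple as the following Python sketch.
It assumes one 40-digit hex hash per line; the function name is made up.
A genuine hash that really starts with 00000 and shares its remaining 35
digits with another hash would be dropped too, but that should be
negligibly rare.

```python
def drop_smashed_duplicates(lines):
    """Remove 00000-smashed hashes whose full counterpart is also present.

    Hypothetical pre-processor: expects an iterable of lines with one
    40-digit hex hash each, returns the surviving hashes in input order.
    """
    hashes = [ln.strip().lower() for ln in lines if ln.strip()]
    # Last 35 digits of every hash that still has its real leading digits.
    real_tails = {h[5:] for h in hashes if not h.startswith("00000")}
    seen = set()
    kept = []
    for h in hashes:
        if h.startswith("00000") and h[5:] in real_tails:
            continue  # smashed copy of a hash we keep in full form
        if h in seen:
            continue  # exact duplicate line
        seen.add(h)
        kept.append(h)
    return kept
```

Smashed hashes without a full counterpart are kept, so nothing that is
only present in the 00000 form gets lost.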


Frank
