john-dev - Re: binary hashes and BINARY


Message-ID: <4DD12B7B.1080901@bredband.net>
Date: Mon, 16 May 2011 15:49:47 +0200
From: magnum <rawsmooth@...dband.net>
To: john-dev@...ts.openwall.com
Subject: Re: binary hashes and BINARY_SIZE

On 2011-03-29 21:11, Alain Espinosa wrote:
> On 3/29/11, Solar Designer<solar@...nwall.com>  wrote:
>> ... moving some code from cmp_one() to cmp_exact().  Alain?  Many
>> other "formats" are similarly not optimized in this respect.  Not that I
>> care much (with modern RAM sizes), but Alain started to compare memory
>> usage of different tools here:
>> http://www.hashsuite.webuda.com/index.php?p=1_6 ;-)
>> JtR's memory usage at NTLM will grow by 4 MB or 8 MB with the large hash
>> tables patch, but we can reduce it by 8 MB or 12 MB with the BINARY_SIZE
>> optimization.
>
> I had a quick read and do not fully understand how john works. When I
> have some free time I will read it again and change the NTLM behavior.
> I think this 'john internal documentation' is very useful for john
> developers when starting a new format. Add it to the wiki or a 'How to
> support a new format'?
>
> I think memory usage/management has an appreciable influence on
> performance with a large number of hashes.

As the current code only uses 4 bytes for cmp_all() and cmp_one(), I'm
pretty sure a BINARY_SIZE of just 4 is plenty. cmp_exact() would have
to be rewritten so that it computes a full NT hash from scratch and
compares against that. This can use a copy of the (full) generic code,
as it doesn't need to be fast.
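
To make the split concrete, here is a minimal sketch, not the actual
NT_fmt.c code. toy_full_hash() is a hypothetical stand-in for the slow
generic hash (it is NOT MD4), and the names, buffer sizes and
simplified signatures are assumptions for illustration. In particular,
this cmp_exact() takes an already decoded 16-byte hash rather than the
ciphertext string.

```c
#include <stdint.h>
#include <string.h>

#define MAX_KEYS   4    /* hypothetical batch size */
#define FULL_WORDS 4    /* a, b, c, d: 16 bytes in total */

/* Hypothetical stand-in for the slow, generic full hash. NOT MD4. */
void toy_full_hash(const char *key, uint32_t out[FULL_WORDS])
{
    uint32_t h = 2166136261u;   /* FNV-1a style mixing, illustration only */
    size_t i;
    int w;

    for (w = 0; w < FULL_WORDS; w++) {
        for (i = 0; key[i]; i++)
            h = (h ^ (unsigned char)key[i]) * 16777619u;
        h = (h ^ (uint32_t)w) * 16777619u;
        out[w] = h;
    }
}

static char saved_key[MAX_KEYS][32];
static uint32_t partial_out[MAX_KEYS];  /* only 4 bytes kept per candidate */

void set_key(const char *key, int index)
{
    strncpy(saved_key[index], key, sizeof(saved_key[index]) - 1);
    saved_key[index][sizeof(saved_key[index]) - 1] = '\0';
}

/* Fast path: hash every candidate, but keep only word b. */
void crypt_all(int count)
{
    uint32_t full[FULL_WORDS];
    int i;

    for (i = 0; i < count; i++) {
        toy_full_hash(saved_key[i], full);
        partial_out[i] = full[1];
    }
}

/* binary points at the 4 stored bytes (BINARY_SIZE == 4). */
int cmp_all(const void *binary, int count)
{
    uint32_t b;
    int i;

    memcpy(&b, binary, sizeof(b));
    for (i = 0; i < count; i++)
        if (partial_out[i] == b)
            return 1;
    return 0;
}

int cmp_one(const void *binary, int index)
{
    uint32_t b;

    memcpy(&b, binary, sizeof(b));
    return partial_out[index] == b;
}

/* Slow path: rebuild the full hash from scratch and compare all 16 bytes. */
int cmp_exact(const uint32_t full_target[FULL_WORDS], int index)
{
    uint32_t full[FULL_WORDS];

    toy_full_hash(saved_key[index], full);
    return memcmp(full, full_target, sizeof(full)) == 0;
}
```

A candidate that passes cmp_one() on 4 bytes can still be rejected by
cmp_exact(), which is the only place where the remaining 12 bytes are
ever computed or looked at.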

I had a go at rearranging the current cmp_one() and cmp_exact() in a
way I believe is "more proper" (for the current binary size). I thought
this wouldn't affect performance in either direction (as cmp_all() was
not touched), but for some reason my benchmark dropped 5%. It must be
one of those random how-stuff-ends-up cases.

Anyway, I enclose the code in case it helps someone think. I started to
look at x86-64.S but that's way over my head. I could do it for
crypt_all_generic as an example. This would be dead simple: just store
b (instead of a, b, c, d) in the output array, making it a quarter of
its current size. The SSE and MMX versions would need more work, I
guess, since the buffers would have to be redesigned.
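
For the generic case, the change in buffer layout could look roughly
like this. It is a sketch under assumptions: the array names, the
interleaved a, b, c, d layout and NUM_KEYS are hypothetical and not
taken from the real source.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_KEYS 64     /* hypothetical batch size */

uint32_t out_full[NUM_KEYS * 4];    /* a, b, c, d per candidate */
uint32_t out_b[NUM_KEYS];           /* b only */

/* Current-style store: all four state words, interleaved per candidate. */
void store_full(int i, uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    out_full[4 * i + 0] = a;
    out_full[4 * i + 1] = b;
    out_full[4 * i + 2] = c;
    out_full[4 * i + 3] = d;
}

/* Reduced store: keep only the word the 4-byte comparisons look at. */
void store_b_only(int i, uint32_t b)
{
    out_b[i] = b;
}

/* Where a 4-byte comparison finds word b in each layout. */
uint32_t word_b_full(int i)
{
    return out_full[4 * i + 1];
}

uint32_t word_b_compact(int i)
{
    return out_b[i];
}
```

With 4-byte words the compact array takes NUM_KEYS * 4 bytes instead
of NUM_KEYS * 16, and the comparison index becomes i instead of
4 * i + 1.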

BTW, I was experimenting the other day with defaulting to the old
"non-fully-Unicode" crypt_all assembly code and only using the new code
when needed, but that too just caused a performance drop, coincidental
I presume.

magnum

View attachment "NT_fmt_cmp_fns.c" of type "text/x-csrc" (2741 bytes)
