Date: Thu, 24 May 2012 10:38:09 -0500
From: "jfoug" <jfoug@....net>
To: <john-dev@...ts.openwall.com>
Subject: memory usage within JtR and possible ways to significantly reduce it.

We talked at one point about setting the binary_size within a JtR format
as a way of reducing memory use.  I would like to propose a different
approach to setting up a format (it would also require changes to core
john).

 

I propose 

 

1. that many 'simple' formats allocate a full-sized binary.

2. add a new, optional function to the format (i.e. it has a default), with
this signature:

    char *rebuild_hash(void *binary, void *salt);

    The salt might not be needed, since we are already working with the
'current' salt, but passing the salt pointer in allows this function to be
called at times other than just after a crypt_all().

3. if that function is set, then JtR does NOT allocate or store the hash in
memory (I am 99% sure it does store it today).

4. if that function is set, then JtR would call rebuild_hash() prior to
cmp_exact() (or we simply say that for these formats cmp_one() returning
true is 'good enough').

5. also, the result from rebuild_hash() would be used to write the line to
the .pot file (and/or to the log file).
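
As a rough sketch of what such a rebuild_hash() could look like for NT: the
name nt_rebuild_hash and the constants are made up for illustration, and it
assumes the stored binary is the plain 16-byte digest in the same byte order
as the hex text, and that the hash text is "$NT$" followed by 32 lowercase
hex digits (the 36 bytes used in the example in this message).  A real format
that stores a transformed binary would have to undo that transformation first.

```c
#include <stddef.h>
#include <string.h>

#define NT_BINARY_SIZE 16   /* assumed: full digest kept as the binary */
#define NT_TAG         "$NT$"
#define NT_TAG_LEN     4

/*
 * Hypothetical rebuild_hash() for a saltless NT-style format.
 * Rebuilds the hash text from the binary instead of keeping the
 * text in memory.  Returns a pointer to a static buffer, so the
 * result must be used (or copied) before the next call.
 */
char *nt_rebuild_hash(void *binary, void *salt)
{
    static const char hex[] = "0123456789abcdef";
    static char out[NT_TAG_LEN + 2 * NT_BINARY_SIZE + 1];
    const unsigned char *bin = binary;
    size_t i;

    (void)salt; /* NT is saltless; parameter kept to match the signature */

    memcpy(out, NT_TAG, NT_TAG_LEN);
    for (i = 0; i < NT_BINARY_SIZE; i++) {
        out[NT_TAG_LEN + 2 * i]     = hex[bin[i] >> 4];
        out[NT_TAG_LEN + 2 * i + 1] = hex[bin[i] & 0x0f];
    }
    out[NT_TAG_LEN + 2 * NT_BINARY_SIZE] = '\0';
    return out;
}
```

The returned string is what would be handed to cmp_exact() and written to
the .pot file in steps 4 and 5.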

 

This should reduce memory significantly IF a format is designed to work this
way, without any slowdown in 'normal' runtime.  There might be just a touch
more CPU used each time a password is cracked, but in a normal JtR run that
is a rare event.

 

Here is an example (NT):

 

Current:

 

Bin      4 bytes  (might still be 16, but for this example I will assume it
is only 4 bytes).

Hash    36 bytes.

Salt     0 bytes.

Total allocation: 40 bytes, plus any overhead for MEM_ALIGN_WORD (let's say
that is 3 bytes).

 

Proposed:

Bin     16 bytes.

Hash     0 bytes.

Salt     0 bytes.

(same 3 bytes of MEM_ALIGN_WORD 'waste').

 

So this goes from 43 bytes per loaded hash down to 19 bytes: better than a
50% reduction.
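
The arithmetic can be checked with a small helper.  This is only a sketch:
the struct and function names are invented here, and the sizes are the
assumed ones from the example above (including the guessed 3 bytes of
alignment overhead), not measured values.

```c
#include <stddef.h>

/* Assumed per-hash storage, in bytes (illustrative only). */
struct per_hash_layout {
    size_t binary_bytes;    /* stored binary */
    size_t hash_text_bytes; /* stored hash text */
    size_t salt_bytes;      /* stored salt */
    size_t align_waste;     /* assumed alignment overhead */
};

/* Total bytes used for one loaded hash under the given layout. */
size_t per_hash_bytes(const struct per_hash_layout *l)
{
    return l->binary_bytes + l->hash_text_bytes +
           l->salt_bytes + l->align_waste;
}

/* Percentage saved going from `before` to `after`, rounded down. */
unsigned percent_saved(size_t before, size_t after)
{
    if (before == 0 || after > before)
        return 0;
    return (unsigned)((before - after) * 100 / before);
}

/* Total bytes for n_hashes loaded hashes. */
unsigned long long total_bytes(size_t per_hash, unsigned long long n_hashes)
{
    return (unsigned long long)per_hash * n_hashes;
}
```

With the layouts {4, 36, 0, 3} and {16, 0, 0, 3} this gives 43 and 19 bytes,
a saving of 24 bytes per hash.  For 140 million hashes that is roughly
3.36 GB less memory under these assumed sizes.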

 

Here is how the original logic works (saltless, so the logic is a little
simpler):

 

if (set_key(next_pass) == fmt.max_password_count) {
     crypt_all();
     foreach binary: if (cmp_all(binary)) {
           /* hash is the hash text stored in memory by the loader */
           foreach candidate: if (cmp_one(binary) && cmp_exact(hash)) {
                output_found_password(hash);
           }
     }
}

 

Here would be the 'new' logic

 

if (set_key(next_pass) == fmt.max_password_count) {
     crypt_all();
     foreach binary: if (cmp_all(binary)) {
           foreach candidate: if (cmp_one(binary)) {
                /* only formats that set rebuild_hash skip the stored hash */
                if (fmt.rebuild_hash != fmt_default_rebuild_hash)
                     hash = fmt.rebuild_hash(bin_hash, salt);
                if (cmp_exact(hash))
                     output_found_password(hash);
           }
     }
}

 

There would also be some logic changes in the loader (and possibly other
areas), so that the hashes are not stored.  The one issue that would have to
be addressed some other way is the removal of hashes that are already in the
.pot file.  Right now the hashes are in memory; we would no longer have that
luxury.  It may be that the pot file's hashes are temporarily loaded into
memory, and matching input hashes are then dropped as they are read from the
input file (i.e. the reverse of the order used today).
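
A minimal sketch of that reversed order, assuming the pot hashes fit in
memory as an array of strings: sort them once, then test each input hash
with a binary search as it is read and keep only the misses.  All names here
(pot_index_build, pot_contains, filter_uncracked) are hypothetical and not
part of the existing loader.

```c
#include <stdlib.h>
#include <string.h>

/* Compare two array elements that are pointers to C strings. */
static int cmp_str(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Sort the pot hashes once so lookups can use binary search. */
void pot_index_build(const char **pot, size_t n_pot)
{
    qsort(pot, n_pot, sizeof *pot, cmp_str);
}

/* Return nonzero if `hash` is already present in the sorted pot array. */
int pot_contains(const char *const *pot, size_t n_pot, const char *hash)
{
    return bsearch(&hash, pot, n_pot, sizeof *pot, cmp_str) != NULL;
}

/*
 * Drop input hashes that are already cracked.  Compacts `input` in
 * place and returns the number of hashes that still need cracking.
 * In a real loader this check would run per line as the input file
 * is read, so the hash text never has to be kept.
 */
size_t filter_uncracked(const char **input, size_t n_in,
                        const char *const *pot, size_t n_pot)
{
    size_t i, kept = 0;

    for (i = 0; i < n_in; i++)
        if (!pot_contains(pot, n_pot, input[i]))
            input[kept++] = input[i];
    return kept;
}
```

Once the input file has been processed, the temporary pot array can be
freed, so it only costs memory during loading.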

    

 

What do others think of these ideas?

 

This came up for me recently with that large 140m hash leak.  Reducing the
memory footprint like this will allow JtR to scale better to huge numbers
of input hashes.

 

Jim.

 

 

