Date: Mon, 3 Sep 2012 11:16:44 -0400
From:  <jfoug@....net>
To: john-dev@...ts.openwall.com
Subject: WAS:  Re: [john-users] JTR against 135 millions MD5 hashes

(Moved from john-users, because the reply to this message does not fit john-users content.)

The memory reduction is due to the source() function, and also to removing some wasteful allocations within prepare() and valid() in dynamic. I believe the wasteful allocations in dynamic have been removed all the way back to the J7-RC. The memory requirements were greatly reduced early on, but some of that gain has slowly been lost again (to the bitmaps). However, total memory usage is still much less than it was when we were keeping the full source buffer for each hash.

There are still other areas where significant memory reduction can be made:

When we do use source(), we currently have to keep the full binary in memory, which is a waste: it is ONLY needed to regenerate the source to print it back out. In 'non' source() mode, we only need to keep 4 bytes in hot memory, for a binary check. I think we could reduce this from 16 bytes per hash (or 64 bytes for sha512!!!) to only 8 bytes per hash, with an auxiliary file storing the full array of binary hash data. Since the binary data is all fixed size, JtR only needs to store an index rather than a real offset pointer, computing the true offset into the file at runtime. This should allow JtR to need only 4 bytes for the pointer into this file, on top of the 4 bytes for the binary check. That would allow running against 4 billion hashes at once, which should be an 'adequate' limitation. So this would drop us from 16 bytes to 8 bytes per hash. That would probably make up for the memory lost to the bitmaps (and quite a bit more). Run time should not be impacted much, except VERY early on, when a huge number of hashes are being found. Once you get past that early onslaught, having to look up data in an external file should not impact runtime at all. That file would ONLY need to be accessed when printing a found hash (or in some of the -show functions).
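A minimal sketch of that layout, in Python rather than JtR's C and not taken from any JtR code: each hash keeps a 4-byte check value plus a 4-byte index in memory, and the full fixed-size binaries live in an auxiliary file whose offset is computed as index times binary size. The names `build`, `candidates`, `full_binary` and the file name are made up for illustration, and the linear scan only stands in for JtR's real hash-table lookup.

```python
import os
import struct
import tempfile

BINARY_SIZE = 16               # full binary size in bytes (16 for raw MD5)
RECORD = struct.Struct("<II")  # 4-byte check value + 4-byte index = 8 bytes


def build(binaries, aux_path):
    """Write full binaries to aux_path; return the packed in-memory table."""
    table = bytearray()
    with open(aux_path, "wb") as aux:
        for index, binary in enumerate(binaries):
            aux.write(binary)  # record number `index` in the file
            check = struct.unpack_from("<I", binary)[0]
            table += RECORD.pack(check, index)
    return table


def candidates(table, computed):
    """Indexes whose 4-byte check matches the first 4 bytes of `computed`."""
    check = struct.unpack_from("<I", computed)[0]
    # Linear scan for brevity; a real cracker would use a hash table here.
    return [idx for chk, idx in RECORD.iter_unpack(table) if chk == check]


def full_binary(aux_path, index):
    """Fetch one full binary; only needed when a hash has to be printed."""
    with open(aux_path, "rb") as aux:
        aux.seek(index * BINARY_SIZE)  # true offset computed at runtime
        return aux.read(BINARY_SIZE)


# Tiny demo with three fake 16-byte binaries.
binaries = [bytes([n]) * BINARY_SIZE for n in range(1, 4)]
aux_path = os.path.join(tempfile.mkdtemp(), "binaries.aux")  # hypothetical name
table = build(binaries, aux_path)
hits = candidates(table, binaries[2])
recovered = full_binary(aux_path, hits[0])
```

With a 4-byte unsigned index the file can address 2^32 records, which is where the 4 billion hash limit comes from.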

Second, IIRC JtR uses way too much memory when the input file is largely cracked. If the input file has 140 million hashes, but 115 million of them are cracked, a lot of memory is used loading the hashes, then loading the .pot file hashes, then things get marked off, and a new list of the remaining candidates is created. But IIRC, this memory is not released (or not fully released). It would be nice if JtR were able to load 140m on the same machine, whether or not there was a .pot file with 115m. Also, it would be nice if JtR had the same final runtime footprint whether you run against 25m hashes, or against an input file with 140m where 115m were already found.

I have not looked into the 2nd issue, because I simply made tools that would strip out found hashes from the master list into a 'current running' input file. So instead of having a 140m input file with a 115m .pot file, I had many .pot files, which in total contained the 115m found, and then had the resultant stripped-down input file, which contained only the non-cracked hashes. Maintenance of this data was far from ideal, but it worked. However, it would be much better if JtR could do this all on its own, without breaking a sweat.
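The stripping tools themselves are not shown here, so the following is only a rough sketch of what such a tool might look like, not the actual ones. It assumes one bare hash per input line and .pot lines of the form `hash:plaintext`, with the hash spelled identically in both; `load_cracked` and `strip_cracked` are hypothetical names.

```python
def load_cracked(pot_files):
    """Collect the hash field from every line of every .pot file."""
    cracked = set()
    for pot_lines in pot_files:
        for line in pot_lines:
            line = line.rstrip("\n")
            if line:
                # Split on the first ':' only; the plaintext may contain colons.
                cracked.add(line.partition(":")[0])
    return cracked


def strip_cracked(master_lines, pot_files):
    """Return the master-list lines whose hash is not in any .pot file."""
    cracked = load_cracked(pot_files)
    return [line for line in master_lines if line.rstrip("\n") not in cracked]


# Demo with in-memory lines; open file objects would work the same way.
master = ["aaa\n", "bbb\n", "ccc\n"]
pots = [["aaa:pw1\n"], ["ccc:pw:with:colons\n"]]
current_running = strip_cracked(master, pots)
```

Note that this still holds every cracked hash in a set, so it only moves the memory cost out of the cracking session and into a separate preparation step.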

I am just throwing some mind food out there.  I have not looked too deeply into any of this, but:

For the binary in a separate file, the biggest issue is the separate file itself: making sure it gets cleaned up, and deciding what to do upon restart if one already exists. Leaving 2 GB or larger files scattered on a user's system is not a good thing to do. However, I think changing from having the full binary in memory to having most of the binary live in an external file should reduce memory about 30-40% beyond what it has been reduced already.
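One possible policy, sketched in Python purely as an illustration (the right restart behavior is an open question, and none of these names exist in JtR): treat any auxiliary file found at startup as a stale leftover and rebuild it, and register a cleanup hook so a normal exit removes the file. A crash would still leave the file behind, which is why the startup check is there.

```python
import atexit
import os
import tempfile


def cleanup(handle, path):
    """Close and delete the auxiliary file; safe to call more than once."""
    handle.close()
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


def open_aux_file(path):
    """Open a fresh auxiliary file, discarding any leftover from a prior run."""
    stale = os.path.exists(path)
    if stale:
        os.remove(path)             # assumption: never reuse an old file
    handle = open(path, "w+b")
    atexit.register(cleanup, handle, path)  # runs on normal interpreter exit
    return handle, stale


# Demo: simulate a leftover from a crashed run, then start over.
workdir = tempfile.mkdtemp()
aux_path = os.path.join(workdir, "john.binaries")  # hypothetical file name
with open(aux_path, "wb") as leftover:
    leftover.write(b"leftover from a crashed run")

handle, was_stale = open_aux_file(aux_path)
handle.write(b"\x00" * 16)
handle.flush()
size_after = os.path.getsize(aux_path)

cleanup(handle, aux_path)           # explicit cleanup at end of session
os.rmdir(workdir)
```

Reusing an existing file on restart would save the rebuild time, but would need some check that the file really matches the hashes being loaded.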

For trying to use the same memory for 25m vs 140m with 115m in the .pot file, it would likely require something like offloading things to disk, FULLY scrubbing memory, then reloading properly. There may be many other ways to do this, but often when there are this many allocations, that is about the only way to really free up the temporarily used pages and reclaim memory.

For loading 140m vs 140m with 115m in the .pot file on the same machine (i.e. using the same memory), we may have to do something along the lines of how unique does it, rather than loading the whole world into hot memory. This 'unique'-like reduction may also help with the problem discussed in the prior paragraph.
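One way to avoid holding everything in hot memory, shown here only as an illustration and not as a description of how unique works internally, is to sort both the input hashes and the cracked hashes on disk first and then stream a set difference over the two sorted sequences. The function name `remaining` is hypothetical, and the pre-sorting step (for example with an external sort) is assumed to have happened already.

```python
def remaining(sorted_hashes, sorted_cracked):
    """Yield hashes not present in sorted_cracked, using O(1) extra memory.

    Both inputs must be iterables in the same ascending order.
    """
    cracked = iter(sorted_cracked)
    current = next(cracked, None)
    for h in sorted_hashes:
        # Advance the cracked stream until it catches up with h.
        while current is not None and current < h:
            current = next(cracked, None)
        if current != h:
            yield h


# Demo with small in-memory lists; file iterators would work the same way.
left = list(remaining(["a", "b", "c", "d"], ["b", "d"]))
```

Only one line from each stream is held at a time, so the memory footprint is the same for a 25m input as for a 140m input with 115m already cracked. The cost is the sorting pass and the temporary files it needs.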

But again, a lot of these approaches leave straggler temp files, which are VERY large, and are ugly things if left uncleaned on a user's system. Also, doing things like this will likely increase load times quite a bit, and may not be the best method for things like the -show command, where load time is pretty much the entire runtime. For a long cracking session, load time does not matter much. For a short cracking session (or no cracking session, such as -show), load time is a very large part of overall time.

Again, the ideas are mind food. I have not put any time into code. But it is similar to when I was looking at doing source(). I KNEW JtR was using way more memory than was needed. It is all in finding the largest of the low-hanging fruit, picking it, and doing it in a way that has the minimal adverse effect on runtime speed.

Jim.

---- Simon Marechal <simon@...quise.net> wrote: 
> > I don't know why 16 GB RAM didn't appear to be sufficient in Simon's
> > experiment.  bleeding-jumbo is supposed to use less RAM (than older
> 
> I don't know either (I probably only had 8GB free at that time), turns
> out it eats 12.5GB with patch:
> 
> 138fcdd49c037f4883fbe0e48a444ab557d160ed
> 
> it uses 7.6G with latest bleeding jumbo:
> 
> ce390da7963e260bf97ee6cde0dcf31a91d0937a
