|
|
Message-ID: <cd0de2261299e6f6b049c58ca37bac97@smtp.hushmail.com>
Date: Mon, 03 Mar 2014 05:08:01 +0100
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: Reload pot file
On 2014-03-03 03:42, Solar Designer wrote:
>> A faster and safer solution would be to just re-process pot file using
>> existing functions. We miss the opportunity to reload the input files
>> containing hashes to crack but that was never my main goal anyway. The
>> worst problem seems to be the database used during initial load is not
>> exactly the same as the one ultimately used. Perhaps that doesn't
>> necessarily matter?
>
> I don't understand what you mean by "the database used during initial
> load is not exactly the same as the one ultimately used". Can you
> please clarify which aspect(s) you're referring to here? I'd like to
> comment on this, but as it is I am just confused.
This comment (I don't fully understand the implications yet):
struct db_password {
	...
	/* After loading is completed: pointer to next password hash with
	 * the same salt and hash-of-hash.
	 * While loading: pointer to next password hash with the same
	 * hash-of-hash. */
	struct db_password *next_hash;
	...
And these:
struct db_main {
	...
	/* Salt and password hash tables, used while loading */
	struct db_salt **salt_hash;
	struct db_password **password_hash;
	...
The latter two tables are freed in ldr_fix_database(). I thought I'd need
to re-allocate and rebuild them (from the running DB) before calling the
existing ldr_load_pot_file(), and then call ldr_fix_database() again
afterwards - roughly as sketched below. Does that make sense?
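In call-sequence form, something like this (just a sketch -
rebuild_loader_tables() is a hypothetical helper I'd have to write;
only ldr_load_pot_file() and ldr_fix_database() exist today):

#include "loader.h"

/* Hypothetical: redo what the loader does for db->salt_hash and
 * db->password_hash, but from the in-memory db rather than the input
 * files, so that ldr_load_pot_file() can use the tables again. */
static void rebuild_loader_tables(struct db_main *db);

static void ldr_resync_pot(struct db_main *db, char *name)
{
	rebuild_loader_tables(db);   /* re-alloc and re-hash live entries */
	ldr_load_pot_file(db, name); /* existing pot file processing */
	ldr_fix_database(db);        /* frees the tables again, recounts,
	                                drops salts that became empty */
}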
A different approach - and maybe quicker, unless the above is simpler
than I imagine - would be to do it more like cracker.c does when
cmp_exact() returns true: process each "hash:plain" pot line into
binaries, salts, sources and plains as if they came from a running
format after a crack loop (see the sketch below). This might be simpler
but I haven't thought it through yet.
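Roughly like this (again just a sketch - the name pot_line_remove() and
the details are mine; the count updates, logging and event handling that
crk_process_guess() does are omitted, and ciphertext is assumed to be
already split off from the plain):

#include <string.h>
#include "formats.h"
#include "loader.h"

static void pot_line_remove(struct db_main *db, char *ciphertext)
{
	struct fmt_main *format = db->format;
	struct db_salt *s;
	struct db_password **pw;
	void *salt, *binary;

	/* Convert the ASCII ciphertext like the loader would */
	salt = format->methods.salt(ciphertext);
	for (s = db->salts; s; s = s->next)
		if (!memcmp(s->salt, salt, format->params.salt_size))
			break;
	if (!s)
		return;

	binary = format->methods.binary(ciphertext);
	for (pw = &s->list; *pw; pw = &(*pw)->next)
		if (!memcmp((*pw)->binary, binary,
		    format->params.binary_size)) {
			*pw = (*pw)->next; /* unlink the cracked hash */
			break;
		}
}

For a big database the linear walks above would be way too slow, which
is exactly where the loading-time hash tables from the first approach
would pay off.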
> Re-reading the pot is more generic since it also supports MPI and
> independent invocations of john (e.g., if someone manually invokes john
> with multiple wordlists one after another while also running it in
> incremental mode, the incremental run's john would remove the hashes
> cracked by the wordlist runs). So even if the shared memory approach
> above would happen to work well for --fork, I realize there may be
> demand for re-reading the pot anyway. So you may implement that in
> jumbo, and leave the shared memory for me to eventually experiment with,
> or you may try the shared memory thing yourself if you like.
The shared memory stuff sounds cool but I'll leave that to you. OTOH I
have some plans for trying to actually *use* MPI a little more - and see
if we can do something cool without harming performance. But I think
nothing could obsolete a generic reload feature.
> Oh, and when you re-read, you can start reading from a previously
> recorded offset (the last re-reads pot file size). Then it may actually
> be fast.
Right, so it will be more a re-sync than a reload. This will be trivial
once the non-trivial stuff is taken care of %-) There are variants of
this - for example, once we write a new entry to the pot file, we can
detect that someone else wrote to it in between, so we can trigger a
resync under certain conditions.
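The bookkeeping for that could be as simple as this (nothing
john-specific, and pot_known_size / pot_resync() / pot_needs_resync()
are names made up for this example):

#include <stdio.h>
#include <sys/stat.h>

static long pot_known_size; /* how far into the pot file we have read */

/* Re-sync: parse only what was appended since the last (re)read */
static int pot_resync(const char *name, void (*process_line)(char *))
{
	FILE *fp;
	char line[0x400];

	if (!(fp = fopen(name, "r")))
		return -1;
	if (fseek(fp, pot_known_size, SEEK_SET)) {
		fclose(fp);
		return -1;
	}
	while (fgets(line, sizeof(line), fp))
		process_line(line); /* new "hash:plain" entries only */
	pot_known_size = ftell(fp); /* start here next time */
	fclose(fp);
	return 0;
}

/* If the file already grew past what we have read, someone else
 * wrote in between - time to re-sync before we append our own entry */
static int pot_needs_resync(const char *name)
{
	struct stat st;

	return !stat(name, &st) && st.st_size > pot_known_size;
}

The real thing would also need locking around the reads and writes so
we never parse a half-written line.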
magnum