Date: Thu, 12 Jul 2012 19:18:46 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: formats interface enhancements

Alain, all -

On Fri, Jun 22, 2012 at 09:39:00AM -0700, Alain Espinosa wrote:
> I did not send my proposal before because it is time consuming and I
> don't have enough time to implement the changes I propose.

Thank you for sending this now.

> The principal flaw I see is that "john-core" has the
> control (too much control) over "john-formats" and so restricts possible
> format optimization and feature development.

This is both good and bad.  The good aspect is that the format
implementations are more similar to each other and thus easier to
understand and maintain.  That said, yes, there's a need to give the
formats a bit more flexibility.

> // Get the current candidate keys.
> // Note that we do not know if the mode is incremental, wordlist or other.
> // Returns 1 if keys were filled or 0 if generation of keys is complete.
> // num_key: Number of keys to fill
> int get_candidate_keys(char* keys, int num_key);
> 
> #define NUM_KEYS 128
> 
> // Method to implement in each format
> void format_crack(int (*current_get_key)(char *keys, int num_key))
> {
>    // Create buffer to fill keys
>    char* key_buffer = malloc(28*NUM_KEYS);
> 
>    // Stop if "john-core" signals stop or the keyspace is exhausted
>    while(continue_crack && current_get_key(key_buffer, NUM_KEYS))
>    {
>        // Perform hashing
>        // ...
>        // Compare result with actual hashes
>        if(hashtable[result] != NO_ELEM)
>        {
>            // Compare more
>            if(total_match)
>                report_hash_found(key_buffer + 28*index, hash_index);
>        }
>    }
> 
>    // Report that this thread has finished cracking
>    report_thread_finish();
>    free(key_buffer);
> }

This would work, but it complicates formats that don't actually want to
take care of these things on their own.  We'd need to provide ways
(functions, macros) for those formats to leave such things to shared
code.

Also, this implies that the keys are buffered in a straightforward
manner, whereas for bitslice DES we currently store a
partially-transposed key bits matrix instead, for a specific good
reason.  On the other hand, calling set_key() per-key might be as much
of a performance hit as partially-transposing the matrix in a separate
step would be.
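
To illustrate why a flat key buffer and a bitslice layout differ, here
is a minimal sketch of a per-key set_key() that writes straight into a
transposed bit matrix.  This is a simplification and an assumption
throughout: it does a full transposition into 32-bit words, whereas the
actual bitslice DES code keeps a partially-transposed matrix, and the
names bitslice_set_key(), key_bits, DEPTH and the bit ordering are made
up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define DEPTH 32                  /* keys per batch = bits per word (assumed) */
#define KEY_BYTES 8
#define KEY_BITS (KEY_BYTES * 7)  /* 7 significant bits per key character */

/* Row b holds bit b of every key; bit position k within a row is key k. */
uint32_t key_bits[KEY_BITS];

/* Hypothetical per-key setter: scatters one key into column 'index'. */
void bitslice_set_key(const char *key, int index)
{
    int byte, bit;
    int at_end = 0;

    assert(index >= 0 && index < DEPTH);
    for (byte = 0; byte < KEY_BYTES; byte++) {
        unsigned char c = 0;

        /* Stop reading at the NUL terminator; pad the rest with zeros. */
        if (!at_end) {
            c = (unsigned char)key[byte];
            if (!c)
                at_end = 1;
        }
        for (bit = 0; bit < 7; bit++) {
            uint32_t *row = &key_bits[byte * 7 + bit];

            *row &= ~((uint32_t)1 << index);              /* clear old bit */
            *row |= (uint32_t)((c >> bit) & 1) << index;  /* set new bit */
        }
    }
}
```

Each call touches up to 56 different words, which is the kind of
per-key cost weighed above against doing the transposition as a
separate pass over a straightforward buffer.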

> Advantages: Almost out of the box we have very good multi-threading.
> Specific formats can be very heavily optimized.

Yes, but I see other ways to allow for this.

I think the main advantage of your proposal is in avoiding the need to
copy the keys in cases where they'd be buffered in a straightforward
manner anyway.  However, I think we can achieve the same by defining e.g.

int crypt_all(int count, int block, struct db_salt *salt, char **keys)

for use when FMT_CRYPT_KEYS is set.  The caller would then skip set_key()
calls, buffer the keys on its own, and pass this buffer by reference to
crypt_all().  For multi-threading, per-thread block ids may be used.
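
A minimal sketch of how the caller might branch on such a flag.  Only
the names set_key(), crypt_all() and FMT_CRYPT_KEYS come from the text
above; the flag value, struct fmt_sketch, the toy hash, process_batch()
and the meaning of the return value (number of keys processed) are
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FMT_CRYPT_KEYS 0x1000   /* hypothetical flag value */
#define MAX_KEYS 4
#define KEY_LEN 16

struct db_salt;                 /* left opaque; unused by the toy format */

/* Hypothetical, heavily reduced format descriptor. */
struct fmt_sketch {
    unsigned int flags;
    void (*set_key)(char *key, int index);
    int (*crypt_all)(int count, int block, struct db_salt *salt, char **keys);
};

unsigned int results[MAX_KEYS];
char saved_keys[MAX_KEYS][KEY_LEN];

/* Toy stand-in for a real hash function. */
static unsigned int toy_hash(const char *key)
{
    unsigned int h = 5381;

    while (*key)
        h = h * 33 + (unsigned char)*key++;
    return h;
}

/* Classic path: the format copies each key into its own storage. */
void toy_set_key(char *key, int index)
{
    strncpy(saved_keys[index], key, KEY_LEN - 1);
    saved_keys[index][KEY_LEN - 1] = '\0';
}

/* Uses the caller's buffer when given one, else the saved copies. */
int toy_crypt_all(int count, int block, struct db_salt *salt, char **keys)
{
    int i;

    (void)block;
    (void)salt;
    assert(count <= MAX_KEYS);
    for (i = 0; i < count; i++)
        results[i] = toy_hash(keys ? keys[i] : saved_keys[i]);
    return count;
}

/* Caller side: skip set_key() entirely when the format asks for it. */
int process_batch(struct fmt_sketch *fmt, char **keys, int count)
{
    int i;

    if (fmt->flags & FMT_CRYPT_KEYS)
        return fmt->crypt_all(count, 0, NULL, keys);
    for (i = 0; i < count; i++)
        fmt->set_key(keys[i], i);
    return fmt->crypt_all(count, 0, NULL, NULL);
}
```

Both paths produce the same results here; the flagged path just avoids
one copy per key.  The block argument is passed as 0 in this
single-threaded sketch.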

> Probably more contributors will
> contribute to ways to generate candidate passwords.

I don't see how your proposal makes this any more likely.  Cracking
modes are already separate from cracker.c's general/shared code.

Well, if we expand multi-threading to upper layers all the way to
cracking modes, then yes those will become more complicated to
implement, and your proposal might hide some of this complexity,
although your get_candidate_keys() would need to be thread-aware
somehow so that if 2+ simultaneous calls to it are made these return
different sets of keys.  Interrupt/restore may be a bit trickier than it
is now under either interface, and threads may need to be synchronized
once in a while under either interface unless we're OK with storing
per-thread checkpoints and not being able to adjust thread count when
restoring.
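
One way a thread-aware key source could hand out disjoint batches is
sketched below, using a C11 atomic counter over a toy keyspace of
decimal strings.  The names struct key_source and
get_candidate_keys_mt() are hypothetical, and unlike the quoted
prototype this version returns the number of keys filled rather than
1 or 0.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

#define KEY_LEN 28   /* per-key slot size, as in the quoted malloc(28*NUM_KEYS) */

/* Hypothetical shared generator state. */
struct key_source {
    atomic_ulong next;     /* index of the first unclaimed candidate */
    unsigned long total;   /* size of the toy keyspace */
};

/*
 * Atomically claim the next batch and fill up to num_key slots.
 * Returns the number of keys written, 0 once the keyspace is exhausted.
 * Concurrent callers get non-overlapping index ranges.
 */
int get_candidate_keys_mt(struct key_source *src, char *keys, int num_key)
{
    unsigned long start, end;
    int n = 0;

    assert(num_key > 0);
    start = atomic_fetch_add(&src->next, (unsigned long)num_key);
    if (start >= src->total)
        return 0;
    end = start + (unsigned long)num_key;
    if (end > src->total)
        end = src->total;
    for (; start < end; start++, n++)
        snprintf(keys + n * KEY_LEN, KEY_LEN, "%lu", start);
    return n;
}
```

Note that the counter alone is not a safe checkpoint: a batch may be
claimed but not yet hashed when the session is interrupted, which is
the synchronization issue mentioned above.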

> Note that to add
> distributed capabilities we only need to add a new get_candidate_keys.

I don't see how this is different from what we can do when the caller
provides the keys to crypt_all(), like we do now or otherwise.

> Also we are free to use any optimization in a GPU implementation.

Yes, this makes the move of hash comparisons into the format
implementation explicit in the interface.  However, I have a different
idea on how to achieve the same - revise the crypt_all() interface as
above, so that it can officially do hash comparisons if it likes.

As to get_candidate_keys() in your example, it runs on CPU anyway, right?
Or do you mean that some GPU format would skip calling it, or would
modify/multiply the returned keys e.g. by applying a mask on top of
them?  If so, there would need to be a way for the CPU code and for the
user to expect and control such behavior.  I think my proposed
set_mask() provides such a way.
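
As an illustration of "multiplying" keys by a mask, the sketch below
appends one character from a charset to each base key.  The set_mask()
signature shown here is an assumption, not the actual proposal, and
expand_key() and OUT_LEN are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OUT_LEN 32

/* Charset for the single appended position; NULL means no mask. */
static const char *mask_charset = NULL;

/* Hypothetical signature: the caller tells the format which mask to apply. */
void set_mask(const char *charset)
{
    mask_charset = charset;
}

/*
 * Expand one base key into one candidate per mask character.
 * Returns the number of candidates written (at most max_out),
 * or 0 if the result would not fit into OUT_LEN bytes.
 */
int expand_key(const char *base, char out[][OUT_LEN], int max_out)
{
    size_t len = strlen(base);
    const char *c;
    int n = 0;

    assert(max_out > 0);
    if (len + 2 > OUT_LEN)
        return 0;
    if (!mask_charset || !*mask_charset) {
        memcpy(out[0], base, len + 1);   /* no mask: pass through as is */
        return 1;
    }
    for (c = mask_charset; *c && n < max_out; c++, n++) {
        memcpy(out[n], base, len);
        out[n][len] = *c;
        out[n][len + 1] = '\0';
    }
    return n;
}
```

Because the mask is set explicitly by the caller, the CPU code and the
user both know how many candidates each supplied key stands for.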

> Disadvantages: Very time consuming. An overwhelming change in john architecture.

Another disadvantage is that the benefits over the more conservative
changes that I am proposing are unclear (maybe they're non-existent).

Yet another disadvantage is that reuse of JtR formats in another program
(with proper licensing) would be more difficult if each and every format
implements JtR-specific hash comparisons on its own.  Under my proposal,
presumably only a subset of formats would do that - those where this is
actually beneficial.

BTW, how does this compare to what Hash Suite uses currently, and to
what you intend to implement there in the future?  I think you don't
have perfect multi-threading like this in there yet (with only one
thread generating the candidates currently, albeit in parallel with
other threads doing the hashing), so perhaps this is something you're
merely considering now rather than something you have already tried out
and can recommend from experience?

Thanks again,

Alexander
