Date: Thu, 12 Jul 2012 16:14:38 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Get hash number in cmp_all()

myrice -

On Thu, Jul 12, 2012 at 01:50:16PM +0800, myrice wrote:
> On Tue, Jul 10, 2012 at 8:12 PM, Solar Designer <solar@...nwall.com> wrote:
> > Sounds inconvenient and inefficient.  John already groups the hashes by
> > salt, and you somehow undo that grouping?
> 
> John uses a linked list to store hashes and salts. However, a linked
> list is difficult to copy to the GPU and inefficient on the GPU. So I
> un-group them into arrays.

Sure.  But you should keep the grouping by salt.  Specifically, when
crypt_all() or cmp_all() is invoked for a specific salt, how do you
identify just which hashes to compare the computed ones against?
Scanning the entire hash array and skipping wrong-salt entries would be
inefficient.  So you need per-salt hash arrays or at least offsets or
pointers to the first proper-salt entry in the big all-hashes array.
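
For example, a per-salt slice into one flat hash array could look like
this (a minimal sketch; all names here are made up for illustration):

typedef struct {
    unsigned int hash_offset; /* index of this salt's first hash in all_hashes[] */
    unsigned int hash_count;  /* number of loaded hashes for this salt */
} salt_entry;

unsigned int *all_hashes; /* all loaded hashes, grouped by salt */
salt_entry *salt_index;   /* one entry per salt, indexed by salt ID */

int cmp_salt_slice(unsigned int computed, unsigned int sid)
{
    unsigned int i, base = salt_index[sid].hash_offset;

    for (i = 0; i < salt_index[sid].hash_count; i++)
        if (all_hashes[base + i] == computed)
            return 1;

    return 0;
}

Then cmp_all() only ever touches that salt's slice instead of scanning
the whole array.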

You'll also need per-salt bitmaps and hash tables for fast lookups when
the number of hashes per salt is large (and for saltless hash types).
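
For illustration, a per-salt bitmap probe might look like this (the
bitmap size is an arbitrary example; real code would size it based on
the per-salt hash count):

#define BITMAP_BITS (1U << 20) /* example: 2^20 bits, 128 KB per salt */

int bitmap_maybe_present(const unsigned int *bitmap, unsigned int h)
{
    unsigned int bit = h & (BITMAP_BITS - 1);

    /* 1 = possibly loaded (confirm via the hash table), 0 = definitely not */
    return (bitmap[bit >> 5] >> (bit & 31)) & 1;
}

A hit in the bitmap would then be confirmed against the per-salt hash
table.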

Thus, I guess you'd need some on-GPU array indexed by salt ID.  The ID
could be a custom fast hash function of salt value or maybe of on-CPU
salt pointer (and then you also need a next pointer in each on-GPU salt
struct to deal with possible collisions), or it could be a unique ID of
the loaded salt.  The latter should be more efficient and easier to
implement, but will likely require some support from core John code.
I might provide such support.
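
To illustrate the pointer-hash variant (hypothetical names again; with
a unique loaded-salt ID you'd instead index the array directly and drop
the next field):

#include <stddef.h>

#define SALT_HASH_SIZE 256 /* example table size, power of two */

typedef struct {
    void *cpu_salt;           /* on-CPU salt pointer this entry maps */
    unsigned int hash_offset; /* this salt's slice in the big hash array */
    unsigned int hash_count;
    int next;                 /* index of next entry on collision, or -1 */
} gpu_salt;

int salt_id(void *cpu_salt)
{
    /* fast hash of the pointer; low bits are mostly zero, so shift first */
    return (int)(((size_t)cpu_salt >> 5) & (SALT_HASH_SIZE - 1));
}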

> >> >> 1. Reallocate the memory when hash number exceeds the pre-defined
> >> >> size. Is it a good idea to add mem_realloc() to memory.c?
> >> >
> >> > I am fine with that, but I don't see why you want to be allocating a
> >> > copy of the hashes on CPU when you need that copy on GPU.  Is this just
> >> > an intermediate step, before you transfer the hashes to GPU?
> >>
> >> Yes. I don't want to memcpy to the GPU one small hash at a time.
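
Regarding the mem_realloc() question quoted above, here's a minimal
sketch of what such a helper might look like in memory.c - hypothetical,
not committed code, and the failure path should be made to match what
mem_alloc() does:

#include <stdio.h>
#include <stdlib.h>

extern void error(void); /* John's fatal-error helper, from misc.h */

void *mem_realloc(void *old_ptr, size_t size)
{
    void *res;

    if (!(res = realloc(old_ptr, size)) && size) {
        fprintf(stderr, "mem_realloc(): failed to allocate %lu bytes\n",
            (unsigned long)size);
        error();
    }

    return res;
}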
> >
> > Why not, given that you only do it once per hash when you've just
> > started cracking them?  Well, I understand that it could slow down
> > startup significantly when you have millions of hashes loaded, but in
> > that case the increase in memory consumption (even if temporary) may be
> > problematic (on the other hand, many gigabytes of hashes won't fit in
> > GPU memory anyway).  So maybe you'd need to transfer hashes in
> > reasonably sized groups, then - e.g., 1000 at a time.
> 
> 1000 is large enough for me. I mean I don't want to transfer hashes
> one by one to the GPU. That would not be efficient.

OK, so transfer them in 1000s if you say this is large enough for you.
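
Concretely, you could fill a small staging buffer from the loaded
password list and push it to the device in groups - a sketch using the
CUDA runtime, assuming one 32-bit word per hash and loader.h's
db_password list (the chunk size and names are illustrative):

#include <cuda_runtime.h>

#include "loader.h" /* struct db_password: binary, next */

#define CHUNK 1000

static unsigned int staging[CHUNK];

void transfer_hashes(struct db_password *pw, unsigned int *gpu_hashes)
{
    unsigned int n = 0, base = 0;

    while (pw) {
        staging[n++] = *(unsigned int *)pw->binary; /* low 32 bits */
        pw = pw->next;
        if (n == CHUNK || !pw) {
            cudaMemcpy(gpu_hashes + base, staging,
                n * sizeof(staging[0]), cudaMemcpyHostToDevice);
            base += n;
            n = 0;
        }
    }
}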

> I have a question here: if we have many gigabytes of hashes, how can
> john store them? Or does john first read a reasonable number of
> hashes, crack them, and then move on to the next chunk?

Currently, we can load as many hashes as fit in memory.  With loading of
hashes to GPU, this will be several times less (since graphics card
memory sizes tend to be smaller than the host's).  We can accept this as a
limitation of our GPU-enabled fast hash implementations, or we can have
fallback code that will keep the loaded hashes on CPU side if their
total size (as well as bitmaps and hash tables) is excessive for loading
to GPU.  For now, you should just implement loading to GPU, and it
shouldn't be too hard to optionally revert to keeping the hashes on host
in some cases if we choose to.
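
The check itself could be as simple as comparing the total size against
the free device memory at startup (cudaMemGetInfo() is the CUDA runtime
call; the fifty-percent threshold is just an example policy):

#include <cuda_runtime.h>

/* Return 1 to load hashes, bitmaps, and hash tables onto the GPU,
   0 to fall back to keeping them on the host. */
int fits_on_gpu(size_t total_bytes)
{
    size_t gpu_free, gpu_total;

    if (cudaMemGetInfo(&gpu_free, &gpu_total) != cudaSuccess)
        return 0;

    return total_bytes < gpu_free / 2; /* leave room for keys and buffers */
}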

Thanks,

Alexander
