john-users - Re: --fork using different OpenCL devices


Message-ID: <CABh=JRGkP-v4qh1nYTrRJwvjhyXxJz3wJXcZRqijKvd0-ogxPQ@mail.gmail.com>
Date: Thu, 8 Aug 2013 08:16:36 +0300
From: Milen Rangelov <gat3way@...il.com>
To: john-users@...ts.openwall.com
Subject: Re: --fork using different OpenCL devices

Hello,

I think slow formats would benefit as well. I do something similar in
hashkill (using multiple threads rather than multiple processes). There is
a noticeable speed increase when cracking "fast hashes" and a smaller but
still measurable increase for slow hashes. Oddly, on GCN hardware the
increase for slow hashes is much more noticeable than on VLIW (I have a
theory about why, but I'm not 100% sure about it).

Regards,
Milen


On Thu, Aug 8, 2013 at 6:30 AM, Solar Designer <solar@...nwall.com> wrote:

> magnum, Claudio, all -
>
> On Wed, Aug 07, 2013 at 09:32:49PM +0200, magnum wrote:
> > Claudio had an idea a while ago that I think still hasn't been discussed
> > on list so here goes:
> >
> > The idea is to have -fork pick a different device (starting from 0 or
> > picking from a given list) for each child. Picture having two 7990 cards
> > for a total of four devices. Using "-fork=4" with an OpenCL format would
> > pick device 0 for the mother process, device 1 for first child and so on.
>
> This would provide poor man's multi-GPU support.  Unfortunately, in the
> current implementation of --fork there's some use of signals - such as
> to get the status line printed by all children on a keypress - and this
> appears incompatible with AMD's SDK.
>
> > Only very fast formats [where set_key() is a bottleneck] would benefit.
>
> This is confused/confusing.  What I think was meant here is that if we
> _don't_ direct the different fork'ed processes to different GPUs (let
> them all use one GPU), then we'll hide the latency of key setup and key
> transfers.  This is similar to how I sometimes invoke Sayantan's
> descrypt-opencl on one GPU multiple times to achieve much better
> cumulative speed than is possible with one invocation.  Yes, --fork
> would help here (already the current implementation of it, with no
> changes), except that there's the issue with AMD's SDK that I mentioned
> above.  On NVIDIA GPUs, this just works.
>
> > I think it's a cool idea and Claudio has a trivial PoC patch. Should we
> > do this? It will hopefully be obsoleted by mask mode and other planned
> > things. OTOH I would not mind at all applying it.
>
> I don't see mask mode as obsoleting it.
>
> I don't recall what exactly Claudio's patch did, though.  Like I said,
> for hiding the latencies for key setup/transfers with fast hashes, no
> patch is needed (but there's an issue with AMD SDK, which we have no
> patch for).  However, for poor man's multi-GPU a patch would in fact be
> needed (but it will similarly be problematic with AMD SDK).
>
> Maybe we should revise --fork such that it would not use signals (would
> use solely other IPC mechanisms).  Or maybe AMD will fix their SDK soon
> (wishful thinking).
>
> Alexander
>
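
For illustration only, the device-per-child idea quoted above (device 0 for
the mother process, device 1 for the first child, and so on) could be
sketched roughly as follows. This is not Claudio's patch and not John the
Ripper code: pick_device(), fork_workers() and run_demo() are hypothetical
names, the wrap-around when there are more processes than devices is an
assumption, and no OpenCL calls are made; the sketch only prints which
device index each process would use.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: map a process number (0 = mother process,
 * 1.. = children) to an entry of a user-supplied device list.
 * Wrapping around with modulo is an assumption of this sketch. */
int pick_device(int proc, const int *devices, int ndevices)
{
    assert(proc >= 0 && ndevices > 0);
    return devices[proc % ndevices];
}

/* Fork nproc-1 children. Returns this process's number: 0 in the
 * mother process, 1..nproc-1 in the children. If fork() fails, the
 * mother process simply stops creating further children. */
int fork_workers(int nproc)
{
    for (int i = 1; i < nproc; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            break;
        }
        if (pid == 0)
            return i;   /* child: do not fork any further */
    }
    return 0;
}

/* Demo: each process reports its device; the mother process then
 * reaps its children and returns how many it collected. */
int run_demo(int nproc, const int *devices, int ndevices)
{
    int reaped = 0;

    fflush(stdout);     /* avoid duplicating buffered output in children */
    int self = fork_workers(nproc);

    printf("process %d -> device %d\n",
           self, pick_device(self, devices, ndevices));
    fflush(stdout);

    if (self != 0)
        _exit(0);       /* children are done after reporting */

    while (wait(NULL) > 0)
        reaped++;
    return reaped;
}
```

Calling run_demo(4, devs, 4) with int devs[] = {0, 1, 2, 3} corresponds to
the "-fork=4" example with two 7990 cards. A real patch would also have to
deal with the signal-related AMD SDK issue discussed above, which this
sketch does not touch.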
