Date: Sun, 24 Jun 2012 23:07:47 +0400
From: Solar Designer <>
Subject: Re: async key transfers to GPU

myrice -

On Sun, Jun 24, 2012 at 11:02:46PM +0800, myrice wrote:
> I think you mean we do not use multiple streams, we only overlap the
> memcpyH2D with CPU code. So in crypt_all(), I will do the following:
> 1. copy second half of the keys to GPU 2. hash first half of the keys
> 3. hash second half of the keys. When 2 has finished, 1 may already be
> done and 3 will start.
> I don't know whether 1 and 2 will overlap within the same stream. I am
> trying this now. Will let you know the result soon.

Yes, I did not suggest using multiple streams.  I am not familiar with
this, but Lukas was able to have data transfers to GPU overlap with
computation on GPU by interleaving these inside crypt_all().  I am
suggesting an improvement upon this where you'd only need two chunks for
(potentially) full efficiency, whereas Lukas' inside-crypt_all()
approach would need more chunks to get close to full efficiency (but not
reach it).
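
To make the chunk-count argument concrete, here is a rough timing model
in Python.  It is only a sketch under stated assumptions, not a
measurement of any real code: copies and kernels are assumed to overlap
perfectly, chunks are equal-sized, and in the two-chunk case the first
half of the keys is assumed to have reached the GPU before crypt_all()
is entered, hidden behind CPU work.  The function names and the times
in the usage note are hypothetical.

```python
def interleaved_time(transfer, compute, chunks):
    """Modelled crypt_all() time when every chunk is copied and hashed
    inside crypt_all().  Assumes copies run back to back and a kernel
    starts once its own chunk is copied and the previous kernel is done."""
    t, c = transfer / chunks, compute / chunks
    copy_done = kernel_done = 0.0
    for _ in range(chunks):
        copy_done += t
        kernel_done = max(copy_done, kernel_done) + c
    return kernel_done


def two_chunk_time(transfer, compute):
    """Modelled crypt_all() time for the two-chunk scheme.  Assumes the
    first half of the keys is already on the GPU when crypt_all() starts."""
    half_t, half_c = transfer / 2, compute / 2
    first_hash_done = half_c    # hashing the first half starts at once
    second_copy_done = half_t   # copy of the second half runs alongside
    return max(first_hash_done, second_copy_done) + half_c
```

With a hypothetical total transfer time of 4 and compute time of 8,
interleaved_time(4, 8, 2) gives 10 and interleaved_time(4, 8, 8) gives
8.5, always leaving the first chunk's copy exposed, whereas
two_chunk_time(4, 8) gives 8, which equals the compute time alone.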

Please do try this out and post your results.


