Date: Sun, 24 Jun 2012 23:07:47 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: async key transfers to GPU

myrice -

On Sun, Jun 24, 2012 at 11:02:46PM +0800, myrice wrote:
> I think you mean we do not use multiple streams, we only overlap the
> memcpyH2D with CPU code. So in crypt_all(), I will do the followings
> 1. copy second half of the keys to GPU
> 2. hash first half of the keys
> 3. hash second half of the keys.
> When 2 finished, 1 may be already done and 3 will start.
>
> I don't know whether 1 and 2 is overlapped with the same stream. I am
> doing this. Will let you know the result soon.

Yes, I did not suggest to use multiple streams. I am not familiar with
this, but Lukas was able to have data transfers to GPU overlap with
computation on GPU by interleaving these inside crypt_all().

I am suggesting an improvement upon this where you'd only need two
chunks for (potentially) full efficiency, whereas Lukas'
inside-crypt_all() approach would need more chunks to get close to full
efficiency (but not reach it).

Please do try this out and post your results.

Thanks,

Alexander
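
The chunk-count argument above can be illustrated with a rough timing model in C. This is only a sketch, not John the Ripper code: the function names and the cost model are hypothetical. It assumes that a transfer and a GPU computation can overlap perfectly, that costs scale linearly with chunk size, and (for the two-chunk case) that the first half of the keys is already on the GPU when crypt_all() starts, having been copied while the CPU was still busy with the remaining keys.

```c
/* Hypothetical timing model; all names here are illustrative only. */

static double max2(double a, double b)
{
    return a > b ? a : b;
}

/*
 * All n chunks are both transferred and hashed inside crypt_all().
 * The first transfer cannot overlap with anything, the last hash
 * cannot overlap with anything, and each of the n - 1 steps in between
 * costs whichever of "transfer next chunk" / "hash current chunk" is
 * slower.
 */
double interleaved_time(double transfer, double compute, int n)
{
    double t = transfer / n; /* time to copy one chunk */
    double c = compute / n;  /* time to hash one chunk */

    return t + (n - 1) * max2(t, c) + c;
}

/*
 * Two chunks, with the ASSUMPTION that the first half is already on
 * the GPU when crypt_all() starts. Inside crypt_all(), the copy of the
 * second half overlaps with hashing the first half, then the second
 * half is hashed.
 */
double two_chunk_time(double transfer, double compute)
{
    return max2(transfer / 2, compute / 2) + compute / 2;
}
```

With transfer = 2 and compute = 8 time units, interleaved_time() gives 9 for two chunks and 8.5 for four, approaching but never reaching 8, while two_chunk_time() gives 8, which is the pure computation time. When the transfer is slower than the computation, the two-chunk figure grows accordingly, which matches the "(potentially)" qualifier above.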