Date: Wed, 4 Apr 2012 21:07:43 +0800
From: myrice <qqlddg@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: fast hashes on GPU

On Wed, Apr 4, 2012 at 1:01 AM, Solar Designer <solar@...nwall.com> wrote:

>
> You're transferring the computed hashes to the host anyway, right?
> You'd need to avoid that (postpone it to the time that the first
> get_hash*() or cmp_one() call is made - hoping that one won't be made).
> ....
> I've attached the patch - please review and likely apply.
>

I applied your patch, and the results are very good.
With THREADS changed to 512 and sm_20:
Benchmarking: Mac OS X 10.7+ salted SHA-512 [CUDA]... DONE
Many salts:     57100K c/s real, 57671K c/s virtual
Only one salt:  25165K c/s real, 24916K c/s virtual

With sm_10:
Benchmarking: Mac OS X 10.7+ salted SHA-512 [CUDA]... DONE
Many salts:     63438K c/s real, 63438K c/s virtual
Only one salt:  26214K c/s real, 26214K c/s virtual

I also moved the transfer of the computed hashes to the host into
get_hash*() and cmp_one().
With sm_20:
Benchmarking: Mac OS X 10.7+ salted SHA-512 [CUDA]... DONE
Many salts:     62291K c/s real, 62291K c/s virtual
Only one salt:  26214K c/s real, 25954K c/s virtual

With sm_10:
Benchmarking: Mac OS X 10.7+ salted SHA-512 [CUDA]... DONE
Many salts:     69559K c/s real, 70254K c/s virtual
Only one salt:  27262K c/s real, 27538K c/s virtual
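
The deferred transfer can be sketched roughly as below. This is only an
illustration in plain C, not the code from the patch: the "device" buffer is
an ordinary array, memcpy() stands in for a cudaMemcpy() with
cudaMemcpyDeviceToHost, and KEYS_PER_CRYPT, host_copy_valid, transfer_count,
fetch_hashes_if_needed() and the simplified crypt_all(), get_hash_0() and
cmp_one() signatures are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define KEYS_PER_CRYPT 4              /* hypothetical batch size */

/* Stand-in for GPU memory; real code would hold a device pointer here. */
static uint64_t device_hashes[KEYS_PER_CRYPT];
static uint64_t host_hashes[KEYS_PER_CRYPT];
static int host_copy_valid;           /* 0 until this batch is copied back */
static int transfer_count;            /* counts copies, for illustration only */

/* Stand-in for the kernel launch: fills the "device" buffer and marks
 * the host copy as stale. No transfer happens here. */
static void crypt_all(int count)
{
    assert(count <= KEYS_PER_CRYPT);
    for (int i = 0; i < count; i++)
        device_hashes[i] = 0x0123456789abcdefULL * (uint64_t)(i + 1);
    host_copy_valid = 0;
}

/* Copy the hashes back at most once per batch, and only on demand. */
static void fetch_hashes_if_needed(void)
{
    if (host_copy_valid)
        return;
    /* Real code: cudaMemcpy(..., cudaMemcpyDeviceToHost) */
    memcpy(host_hashes, device_hashes, sizeof(host_hashes));
    transfer_count++;
    host_copy_valid = 1;
}

/* Simplified get_hash*(): low 4 bits of the computed hash. */
static int get_hash_0(int index)
{
    fetch_hashes_if_needed();
    return (int)(host_hashes[index] & 0xf);
}

/* Simplified cmp_one(): compares one computed hash with a binary value. */
static int cmp_one(uint64_t binary, int index)
{
    fetch_hashes_if_needed();
    return host_hashes[index] == binary;
}
```

In this sketch a batch for which neither get_hash_0() nor cmp_one() is called
causes no transfer at all, and a batch for which both are called causes
exactly one.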

I looked into why sm_20 is slower. It seems that sm_20 has tighter IEEE
precision requirements, which hurt performance. Since SHA-512 involves no
floating-point computation, could we use sm_10 instead? I find that sm_13 is
slightly better:

With sm_13:
Benchmarking: Mac OS X 10.7+ salted SHA-512 [CUDA]... DONE
Many salts:     70254K c/s real, 70964K c/s virtual
Only one salt:  27262K c/s real, 27262K c/s virtual

Thanks!
Dongdong Li
