Date: Thu, 29 Mar 2012 05:16:21 +0800
From: myrice <qqlddg@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: rawsha256.cu patch(using shared memory)

Hi,

Lukas, Solar

Thank you for your help!

To Lukas:
> You used shared memory after ALL time consuming computations, totaly
> not good idea:)
> First of all you must decide will you apply for slow, or fast hashes.
> Those are DIFFERENT tasks with different needs.

I went through the code and found your comment "//use shared memory". I
took it to mean using shared memory to make the final output (i.e. the
write to global memory) coalesced. As the code shows, I first write the
result to shared memory and then write it to global memory. This
contributes to the performance gain. It was only a first try. Now I
understand that making fast hashes efficient takes a lot more work, which
I will discuss with Solar next.
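
A minimal sketch of the staging idea described above, not the actual
rawsha256.cu patch. It assumes 8 output words per hash (one SHA-256 digest)
and a fixed block size. THREADS, fake_digest_word, staged_output_kernel and
staged_output_mismatches are hypothetical names, and fake_digest_word only
stands in for the real SHA-256 computation. Each thread stores its own
digest in shared memory, then the block copies the staging buffer out so
that neighbouring threads write neighbouring global addresses.

```cuda
#include <cstdint>
#include <vector>
#include <cuda_runtime.h>

#define WORDS_PER_HASH 8   // a SHA-256 digest is 8 x 32-bit words
#define THREADS 128        // assumed block size; the kernel must be launched with it

// Hypothetical stand-in for the real hash computation (NOT SHA-256).
__host__ __device__ inline uint32_t fake_digest_word(uint32_t idx, uint32_t j) {
    return idx * 2654435761u + j;
}

__global__ void staged_output_kernel(uint32_t *out, uint32_t n) {
    __shared__ uint32_t stage[THREADS * WORDS_PER_HASH];
    uint32_t idx = blockIdx.x * blockDim.x + threadIdx.x;

    // Step 1: each thread places its own digest in the staging buffer.
    if (idx < n) {
        for (uint32_t j = 0; j < WORDS_PER_HASH; j++)
            stage[threadIdx.x * WORDS_PER_HASH + j] = fake_digest_word(idx, j);
    }
    __syncthreads();

    // Step 2: the block copies the buffer out linearly. In every iteration,
    // consecutive threads write consecutive words, so the stores coalesce.
    uint32_t base = blockIdx.x * blockDim.x * WORDS_PER_HASH;
    uint32_t total = n * WORDS_PER_HASH;
    for (uint32_t k = threadIdx.x; k < blockDim.x * WORDS_PER_HASH; k += blockDim.x) {
        if (base + k < total)
            out[base + k] = stage[k];
    }
}

// Runs the kernel for n hashes and returns the number of words that differ
// from a CPU reference, or -1 if a CUDA call fails.
int staged_output_mismatches(uint32_t n) {
    std::vector<uint32_t> host(static_cast<size_t>(n) * WORDS_PER_HASH);
    uint32_t *dev = nullptr;
    if (cudaMalloc(&dev, host.size() * sizeof(uint32_t)) != cudaSuccess)
        return -1;

    uint32_t blocks = (n + THREADS - 1) / THREADS;
    staged_output_kernel<<<blocks, THREADS>>>(dev, n);
    cudaError_t err = cudaMemcpy(host.data(), dev, host.size() * sizeof(uint32_t),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess)
        return -1;

    int bad = 0;
    for (uint32_t i = 0; i < n; i++)
        for (uint32_t j = 0; j < WORDS_PER_HASH; j++)
            if (host[i * WORDS_PER_HASH + j] != fake_digest_word(i, j))
                bad++;
    return bad;
}
```

Calling staged_output_mismatches(1000) from main() exercises a partially
filled last block. Note that the stride-8 stores in step 1 may cause
shared memory bank conflicts, so whether the staging pays off has to be
measured on the target card.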

To Solar:
> That's nice, but this is still awfully slow.  In fact, even the
> benchmarks we have on the wiki somehow show higher speeds, even though
> you have a faster card (GTX-580, right?)

I am sorry for not giving my hardware details. The GTX-580 is in my lab's
server, but it has recently become unstable :(
I tested this code on my laptop, which has a GeForce 9600M GS card and a
P8600 CPU, so the performance is low.

> The formats interface bottleneck is somewhere above 50M c/s.  Actually,
> --format=dummy shows it at around 130M c/s on Core i7-2600, which is
> what you said you use, but indeed interfacing to the GPU takes time.
> With Samuele's fast hash implementations in OpenCL and running on GPU,
> we're getting close to 50M c/s.  So you also need to get close to that.
> This is a good thing for you to attempt.

> (And once you get there, you'd need to somehow demonstrate that your
> code would be even faster without the interface bottleneck - e.g., by
> starting to implement candidate password generation and hash comparison
> on GPU in whatever quick way you can for the demo.)

Okay, I will implement XSHA512 first. If I have time, I will do this.
However, I think implementing candidate password generation and
comparison on the GPU will be a lot of work. I would have to go
through the existing password generation code (I guess it is mainly in
cracker.c?) and substitute it with CUDA.
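
A rough sketch of what "candidate generation and hash comparison on GPU"
could look like for a quick demo, under loud assumptions: the candidates are
fixed-length lowercase strings enumerated by index, and toy_hash is a toy
FNV-1a hash standing in for XSHA512. None of these names (toy_hash,
index_to_candidate, crack_kernel, crack_on_gpu) come from John the Ripper.
The point is only that each thread builds and tests its own candidate, so a
single index crosses the host/GPU interface instead of every password and
every hash.

```cuda
#include <cstdint>
#include <cuda_runtime.h>

#define CHARSET_LEN 26     // assumed charset: 'a'..'z'
#define PW_LEN 4           // assumed fixed candidate length
#define NOT_FOUND 0xFFFFFFFFu

// Toy FNV-1a hash, a stand-in only (NOT XSHA512).
__host__ __device__ inline uint32_t toy_hash(const char *pw, int len) {
    uint32_t h = 2166136261u;
    for (int i = 0; i < len; i++) {
        h ^= (uint8_t)pw[i];
        h *= 16777619u;
    }
    return h;
}

// Maps a candidate index to a string by treating it as a base-26 number.
__host__ __device__ inline void index_to_candidate(unsigned int idx, char *pw) {
    for (int i = 0; i < PW_LEN; i++) {
        pw[i] = (char)('a' + idx % CHARSET_LEN);
        idx /= CHARSET_LEN;
    }
}

// Each thread generates one candidate, hashes it, and compares on the GPU.
__global__ void crack_kernel(uint32_t target, unsigned int total,
                             unsigned int *found) {
    unsigned int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= total)
        return;
    char pw[PW_LEN];
    index_to_candidate(idx, pw);
    if (toy_hash(pw, PW_LEN) == target)
        atomicMin(found, idx);   // keep the lowest matching index
}

// Returns the index of a matching candidate, or NOT_FOUND.
unsigned int crack_on_gpu(uint32_t target) {
    const unsigned int total = CHARSET_LEN * CHARSET_LEN * CHARSET_LEN * CHARSET_LEN;
    const unsigned int threads = 256;
    unsigned int result = NOT_FOUND;
    unsigned int *dev_found = nullptr;

    if (cudaMalloc(&dev_found, sizeof(result)) != cudaSuccess)
        return NOT_FOUND;
    cudaMemcpy(dev_found, &result, sizeof(result), cudaMemcpyHostToDevice);
    crack_kernel<<<(total + threads - 1) / threads, threads>>>(target, total, dev_found);
    if (cudaMemcpy(&result, dev_found, sizeof(result),
                   cudaMemcpyDeviceToHost) != cudaSuccess)
        result = NOT_FOUND;
    cudaFree(dev_found);
    return result;
}
```

A caller would pass the target hash to crack_on_gpu() and decode the
returned index with index_to_candidate(). Because the toy hash is only 32
bits wide, the returned candidate is guaranteed to hash to the target but
may be a collision rather than the original password.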
