john-dev - Re: Lukas - status report #2


Message-ID: <CABob6ipOY69Lo0dOWTbVAAon7ZUZEBgELcnNYCHdDT8qpXfpmw@mail.gmail.com>
Date: Tue, 1 May 2012 06:13:01 +0200
From: Lukas Odzioba <lukas.odzioba@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Lukas - status report #2

2012/5/1 Solar Designer <solar@...nwall.com>:
> Hi Lukas,
> I noticed that you created this wiki page:
>
> http://openwall.info/wiki/john/WPA-PSK
>
> Maybe you can add a specific example to it (a sample input file,
> commands to run on it, their output) and link to it more prominently?

Of course, I'll add an example.
Do you have any suggestions on how to make it more prominent? Move it
to the top of the page, or link to it from the GPU formats table?

> Here's what I am getting with the code currently in magnum-jumbo:
>
> user@...l:~/john/magnum-jumbo/run$ ./john -te -fo=wpapsk-cuda
> Benchmarking: wpapsk-cuda [GPU]... DONE
> Raw:    17341 c/s real, 17341 c/s virtual
>
> This is 43% of hashcat's reported speed for this card.

A GTX 460 with sm_20 and threads=256 does ~15k c/s, versus ~10k c/s by default.

> user@...l:~/john/magnum-jumbo/run$ ./john -te -fo=wpapsk-opencl
> OpenCL platform 0: NVIDIA CUDA, 1 device(s).
> Using device 0: GeForce GTX 570
> OpenCL error (CL_OUT_OF_RESOURCES) in file (opencl_wpapsk_fmt.c) at line (122) - (Run kernel)
> Not important right now (since we have CUDA working), but nasty.
> I wonder if the same would occur on some AMD cards as well.

Arghh...

> user@...l:~/john/magnum-jumbo/run$ ./john -te -fo=wpapsk-opencl -pla=1
> OpenCL platform 1: AMD Accelerated Parallel Processing, 2 device(s).
> Using device 0: Tahiti
> Optimal Group work Size = 128
> Benchmarking: wpapsk-opencl [pbkdf2-sha1]... DONE
> Raw:    64000 c/s real, 531692 c/s virtual
>
> This is quite nice.  hashcat is reported to do 158.1k c/s on 5970, so
> our target speed for 7970 may be about 130k c/s.

I would be happier to see 80-90k. Previously, with just the PMK
calculation (the most time-consuming part), we had 90% of hashcat's
speed. For now the difference will be even worse on systems with a
very fast GPU and a slow CPU. Besides, the CPU-side code uses only
1 core. Do you have any ideas for getting around that other than MPI?
Alternatively, we could move all of that code to a second GPU kernel.
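
For reference, a minimal sketch of the PMK calculation mentioned above,
assuming the standard WPA-PSK definition (PBKDF2-HMAC-SHA1 over the
passphrase, with the SSID as salt, 4096 iterations, 32-byte output). It
uses Python's hashlib rather than the john code, and the name derive_pmk
is only illustrative:

```python
import hashlib

PMK_ITERATIONS = 4096  # iteration count used by WPA-PSK
PMK_LENGTH = 32        # PMK size in bytes (256 bits)

def derive_pmk(passphrase: str, ssid: str) -> bytes:
    """Illustrative helper (not from john): PBKDF2-HMAC-SHA1 with the SSID as salt."""
    return hashlib.pbkdf2_hmac(
        "sha1",
        passphrase.encode("utf-8"),
        ssid.encode("utf-8"),
        PMK_ITERATIONS,
        PMK_LENGTH,
    )

if __name__ == "__main__":
    # Passphrase "password" and SSID "IEEE" are a published 802.11i test case.
    print(derive_pmk("password", "IEEE").hex())
```

Each candidate costs 4096 iterations, and a 32-byte result needs two
SHA-1-sized output blocks, which is why this step dominates the run time.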


> user@...l:~/john/magnum-jumbo/run$ ./john -te -fo=wpapsk-opencl -pla=1 -dev=1
> OpenCL platform 1: AMD Accelerated Parallel Processing, 2 device(s).
> Using device 1: AMD FX(tm)-8120 Eight-Core Processor
> Optimal Group work Size = 16
> Benchmarking: wpapsk-opencl [pbkdf2-sha1]... DONE
> Raw:    2133 c/s real, 267 c/s virtual
>
> This is also reasonable, although a CPU-specific implementation using
> the intrinsics should be much faster.

My current code is based on OpenSSL (it's not yet in jumbo), and it
gives ~230 c/s on an i3 2100. In Aircrack-ng I get 510 c/s (it uses 4
threads).
As far as I know, OpenSSL is not super optimized, and with
intrinsics we should get much better results. Am I right?
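
To illustrate the single-thread versus 4-thread comparison, here is a
rough sketch of spreading candidate passphrases over a small thread
pool. It is not the OpenSSL-based code or Aircrack-ng's approach; the
names derive_pmk and derive_many are made up for the example, and
whether the threads really run in parallel depends on the runtime
releasing its interpreter lock during the hash call, which I have not
measured:

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

def derive_pmk(passphrase: str, ssid: str) -> bytes:
    """Illustrative helper: WPA-PSK PMK via PBKDF2-HMAC-SHA1 (4096 iterations, 32 bytes)."""
    return hashlib.pbkdf2_hmac(
        "sha1", passphrase.encode("utf-8"), ssid.encode("utf-8"), 4096, 32
    )

def derive_many(candidates, ssid, workers=4):
    """Hypothetical helper: derive one PMK per candidate using `workers` threads.

    Results come back in the same order as `candidates`.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: derive_pmk(p, ssid), candidates))

if __name__ == "__main__":
    words = ["password", "password1", "letmein123", "qwertyuiop"]
    for word, pmk in zip(words, derive_many(words, "IEEE")):
        print(word, pmk.hex())
```

Because the candidates are independent, the work splits cleanly; the
open question is only how well a given runtime and CPU scale with it.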

Thank you for the comments on my work.
Lukas
