Date: Tue, 10 May 2011 20:38:09 +0200
From: Samuele Giovanni Tonon <samu@...uxasylum.net>
To: john-users <john-users@...ts.openwall.com>
Subject: jtr 1.7.7 jumbo 1 opencl 02 patch
Hello,
I've just uploaded OpenCL patch 02 to the wiki.
I'm posting to john-users to get more testers, but I'll also go into some
technical detail for anyone interested in looking at the code.
This patch should fix the segmentation fault in the NT format, and it
also brings some speed improvements.
The following formats are currently supported:
raw_sha1 ( raw-sha1-opencl )
raw_md5 ( raw-md5-opencl )
NT ( nt_opencl )
NSLDAPS ( ssha-opencl )
The code should work on both Nvidia and ATI GPUs. Unfortunately I wasn't
able to test on Nvidia cards, so please feel free to report any problems
to me.
ATI users: make sure ATISTREAMSDKROOT is set and points to the right
place. Nvidia users: make sure CL/cl.h is in /usr/include or
/usr/local/include.
Quick Readme:
If you are interested in tweaking, look for the *NUM_KEYS defines and the
local_work_size variable. These two control how many passwords are tried
at a time and how large a local work size is given to the GPU.
For raw-sha1 and NSLDAPS, NUM_KEYS is defined in the .c code and passed
as an argument to the .cl kernel; this is a convenience that helps you
test faster.
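When experimenting with these two values, keep in mind that OpenCL 1.x requires the global work size passed to clEnqueueNDRangeKernel to be a multiple of the local work size. A minimal sketch of a helper that rounds a key count up accordingly; the function name is hypothetical and is not taken from the patch:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the patch): round the number of keys
 * up to the next multiple of local_work_size, so the result can be
 * used as a global work size that divides evenly into work-groups. */
static size_t round_up_global_size(size_t num_keys, size_t local_work_size)
{
    assert(local_work_size > 0);            /* avoid division by zero */
    size_t rem = num_keys % local_work_size;
    return rem == 0 ? num_keys : num_keys + (local_work_size - rem);
}
```

For example, 1000 keys with a local work size of 64 would be padded to 1024; the kernel then has to ignore (or the host has to discard) the 24 padding slots.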
Changes:
* NSLDAPS: I moved the SHA update (salt) into the .cl code to save
transfer bandwidth, for a minor performance gain
* NT_opencl has been separated from NT_fmt to make the code easier to
understand
* cmp_all and cmp_one have been improved: if cmp_all returns TRUE, I
"download" all the remaining hashes in outbuffer2 once, so the
subsequent cmp_one calls just check local data instead of calling
clEnqueueReadBuffer again for each hash and each password.
This gained me 20% more performance.
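The cmp_all / cmp_one change can be illustrated with a host-only sketch. Everything here is an assumption made for illustration: the buffer layout (first digest word already on the host, remaining words still on the device), the names, and mock_read_rest(), which stands in for a single clEnqueueReadBuffer call. It is not the patch's actual code.

```c
#include <stdint.h>
#include <string.h>

#define NUM_KEYS   4   /* illustrative batch size, not the patch's value */
#define REST_WORDS 3   /* words of a 128-bit digest beyond the first one */

static uint32_t first_words[NUM_KEYS];              /* already on the host */
static uint32_t device_rest[NUM_KEYS * REST_WORDS]; /* stand-in for GPU memory */
static uint32_t host_rest[NUM_KEYS * REST_WORDS];   /* local copy, filled once */
static int have_rest;        /* 1 once host_rest is valid for this batch */
static int transfer_count;   /* counts simulated device-to-host reads */
static int last_match_index = -1;

/* Stand-in for one bulk clEnqueueReadBuffer of the remaining words. */
static void mock_read_rest(void)
{
    memcpy(host_rest, device_rest, sizeof(host_rest));
    transfer_count++;
}

/* Cheap pre-check on the first word; on a hit, fetch the rest once. */
static int cmp_all_sketch(const uint32_t *binary)
{
    for (int i = 0; i < NUM_KEYS; i++) {
        if (first_words[i] == binary[0]) {
            if (!have_rest) {
                mock_read_rest();
                have_rest = 1;
            }
            return 1;
        }
    }
    return 0;
}

/* Full comparison against local data only: no device transfer here. */
static int cmp_one_sketch(const uint32_t *binary, int index)
{
    return first_words[index] == binary[0] &&
           memcmp(&host_rest[index * REST_WORDS], binary + 1,
                  REST_WORDS * sizeof(uint32_t)) == 0;
}

/* Fills the buffers with made-up data, runs cmp_all then cmp_one for
 * every index, and returns how many simulated transfers were needed. */
static int run_demo(void)
{
    const uint32_t binary[4] = { 3, 200, 201, 202 };
    have_rest = 0;
    transfer_count = 0;
    last_match_index = -1;
    for (int i = 0; i < NUM_KEYS; i++) {
        first_words[i] = (uint32_t)(i + 1);
        for (int j = 0; j < REST_WORDS; j++)
            device_rest[i * REST_WORDS + j] = (uint32_t)(100 * i + j);
    }
    if (cmp_all_sketch(binary))
        for (int i = 0; i < NUM_KEYS; i++)
            if (cmp_one_sketch(binary, i))
                last_match_index = i;
    return transfer_count;
}
```

However many times cmp_one_sketch is called in a batch, only one simulated transfer happens. In a real format the have_rest flag would need to be cleared whenever a new batch of keys is hashed.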
I need benchmarkers willing to test the formats and send me speed
comparisons along with their GPU specs. Example:
ATI Radeon HD 6970 on 64-bit Linux.
Benchmarking: Raw SHA-1 OpenCL [SHA-1]...
Kernel path is : ./sha1_opencl_kernel.cl
OpenCL Platform: <<<ATI Stream>>> and device: <<<Cayman>>>
DONE
Many salts: 14155K c/s real, 15728K c/s virtual
Only one salt: 15702K c/s real, 15548K c/s virtual
Benchmarking: Raw MD5 [raw-md5-opencl]...
Kernel path is : ./md5_opencl_kernel.cl
OpenCL Platform: <<<ATI Stream>>> and device: <<<Cayman>>>
DONE
Raw: 64368K c/s real, 63736K c/s virtual
Benchmarking: Netscape LDAP SSHA OPENCL [salted SHA-1]...
Kernel path is : ./ssha_opencl_kernel.cl
OpenCL Platform: <<<ATI Stream>>> and device: <<<Cayman>>>
DONE
Many salts: 35651K c/s real, 36011K c/s virtual
Only one salt: 23301K c/s real, 23301K c/s virtual
Benchmarking: NT MD4 [OpenCL 1.0]...
Kernel path is : ./nt_opencl_kernel.cl
OpenCL Platform: <<<ATI Stream>>> and device: <<<Cayman>>>
DONE
Raw: 28550K c/s real, 28835K c/s virtual
TODO:
* better handling of *NUM_KEYS and local_work_size
* fix the memory issue with NSLDAPS in single mode
* compare binary and hashes in the .cl code and return one partial hash
(for the get_hash functions) plus an array of 0|1 flags indicating
whether each hash matched the binary.
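The last TODO item could look roughly like the following. This is a plain C emulation of what each work-item might do, with a loop index standing in for get_global_id(0); the function names, the 4-word digest size, and the data layout are assumptions for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DIGEST_WORDS 4   /* assumed 128-bit digest, e.g. MD4 or MD5 */

/* Emulates a kernel that compares every computed hash against the
 * target binary and writes back only a partial hash plus a 0|1 flag. */
static void compare_on_device_sketch(const uint32_t *hashes, size_t num_keys,
                                     const uint32_t *binary,
                                     uint32_t *partial, uint8_t *matched)
{
    for (size_t gid = 0; gid < num_keys; gid++) {   /* one work-item each */
        const uint32_t *h = hashes + gid * DIGEST_WORDS;
        partial[gid] = h[0];                        /* for get_hash lookups */
        matched[gid] = memcmp(h, binary,
                              DIGEST_WORDS * sizeof(uint32_t)) == 0;
    }
}

/* Runs the sketch on made-up data; returns the first matching index,
 * or -1 if no flag is set. */
static int first_match_demo(void)
{
    const uint32_t hashes[3 * DIGEST_WORDS] = {
        1, 2, 3, 4,   5, 6, 7, 8,   9, 10, 11, 12
    };
    const uint32_t binary[DIGEST_WORDS] = { 5, 6, 7, 8 };
    uint32_t partial[3];
    uint8_t matched[3];

    compare_on_device_sketch(hashes, 3, binary, partial, matched);
    for (int i = 0; i < 3; i++)
        if (matched[i])
            return i;
    return -1;
}
```

With this layout the host would read back one word and one flag per key instead of the whole digest, which is where any bandwidth saving would come from.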
Cheers
Samuele