Date: Tue, 8 May 2012 01:21:52 +0200
From: Lukas Odzioba <lukas.odzioba@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Lukas - status report #3

2012/5/8 magnum <john.magnum@...hmail.com>:
> On 05/07/2012 11:22 PM, Lukas Odzioba wrote:
>> I've been working on opencl problems on Bull. Unfortunately I wasn't
>> able to fix any of them. Magnum stated that they should be trivial,
>> but somehow I couldn't make formats work as they should.
>
> I didn't really intend to fix your problems but I noticed you never
> implemented this: http://www.openwall.com/lists/john-dev/2012/04/10/4 so
> I got curious now and tried it.
>
>
> OpenCL platform 0: NVIDIA CUDA, 1 device(s).
> Using device 0: GeForce GTX 570
> Optimal Group work Size = 256
> Benchmarking: PHPASS-OPENCL [PORTABLE-MD5]... DONE
> Raw: 502690 c/s real, 601043 c/s virtual
>
> OpenCL platform 0: NVIDIA CUDA, 1 device(s).
> Using device 0: GeForce GTX 570
> Optimal Group work Size = 32
> Benchmarking: wpapsk-opencl [GPU - OpenCL]... DONE
> Raw: 24094 c/s real, 24094 c/s virtual
>
>
> Simple as that ;-)

I don't believe what I see... I tried that too. I have even tried to
hardcode work_group_size to 128 without any change. OMG.

> For crypt-md5, it's a matter of compilers not being very verbose (or
> rather, not telling you a dang thing). When I get weird problems like
> this I use to try all compilers, nvidia, AMD and Intel. Usually one of
> them (and usually just one, and you never know which) informs about a
> problem but not this time. I have yet to succeed in building clcc on
> bull, but on my laptop I got this:
>
> $ clcc opencl/cryptmd5_kernel.cl output.ptx
> Building...
>
> :150:6: error: call to 'rotate' is ambiguous
>  a = ROTATE_LEFT(AC1 + x, S11);
>      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
> :9:27: note: instantiated from:
> #define ROTATE_LEFT(x, s) rotate(x,s)
>                           ^~~~~~
> <built-in>:2784:22: note: candidate function
> int __OVERLOADABLE__ rotate(int, int);
>
> ...and almost 10,000 similar lines. So armed with this knowledge it was
> in fact trivial:
>
> -#define ROTATE_LEFT(x, s) rotate(x,s)
> +#define ROTATE_LEFT(x, s) rotate(x, (uint32_t)s)
>
>
> OpenCL platform 0: NVIDIA CUDA, 1 device(s).
> Using device 0: GeForce GTX 570
> Max Group Work Size 960
> Optimal Group work Size = 128
> Benchmarking: CRYPTMD5-OPENCL [MD5-based CRYPT]... DONE
> Raw: 653872 c/s real, 646647 c/s virtual
>
>
> On the 7970 we get a nice ASIC hang as usual though :-/

Thank you for figuring this out for me. The clcc hint is of great value.
Hopefully next time I'll do better.

Lukas
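
For readers following the rotate() fix quoted above, here is a minimal host-side sketch of why the call was ambiguous and why casting the shift count resolves it. It is plain C++ rather than OpenCL C, and everything in it is an assumption made for illustration: the demo namespace, the two rotate overloads (hypothetical stand-ins for two of the built-in overloads the compiler listed) and the rotl_demo helper do not come from the kernel source. The idea is only that a call with an unsigned value and a signed int count matches neither overload exactly, so the compiler cannot pick one; once both arguments are unsigned, one overload is an exact match.

```cpp
#include <cassert>
#include <cstdint>

using std::int32_t;
using std::uint32_t;

namespace demo {
// Hypothetical stand-ins for two overloads of a built-in rotate().
// Unsigned version: rotate x left by s bits (count taken modulo 32).
constexpr uint32_t rotate(uint32_t x, uint32_t s) {
    return (x << (s & 31u)) | (x >> ((32u - s) & 31u));
}
// Signed version: same bit pattern, computed through the unsigned overload.
constexpr int32_t rotate(int32_t x, int32_t s) {
    return static_cast<int32_t>(
        rotate(static_cast<uint32_t>(x), static_cast<uint32_t>(s)));
}
}  // namespace demo

// Form before the fix (left commented out on purpose): with an unsigned x
// and an int count, each overload needs one conversion, so the call is
// ambiguous and does not compile.
//   #define ROTATE_LEFT(x, s) demo::rotate(x, s)

// Form after the fix: the cast makes the count unsigned, so the
// (uint32_t, uint32_t) overload is an exact match. This assumes x is
// already unsigned, as it is in an MD5 round.
#define ROTATE_LEFT(x, s) demo::rotate((x), (uint32_t)(s))

// Illustrative helper: takes the count as a plain int, like a literal
// shift constant would be.
constexpr uint32_t rotl_demo(uint32_t x, int s) {
    return ROTATE_LEFT(x, s);
}

// Compile-time check: the top bit wraps around to bit 0.
static_assert(rotl_demo(0x80000001u, 1) == 0x00000003u,
              "rotate-left by 1 should wrap the top bit");
```

Uncommenting the first macro instead of the second should reproduce an "ambiguous call" diagnostic on a host C++ compiler, which is a cheap way to see the same class of error when the OpenCL compiler at hand stays silent.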