Date: Tue, 8 May 2012 01:21:52 +0200
From: Lukas Odzioba <lukas.odzioba@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Lukas - status report #3

2012/5/8 magnum <john.magnum@...hmail.com>:
> On 05/07/2012 11:22 PM, Lukas Odzioba wrote:
>> I've been working on opencl problems on Bull. Unfortunately I wasn't
>> able to fix any of them. Magnum stated that they should be trivial,
>> but somehow I couldn't make formats work as they should.
>
> I didn't really intend to fix your problems but I noticed you never
> implemented this: http://www.openwall.com/lists/john-dev/2012/04/10/4 so
> I got curious now and tried it.
>
>
> OpenCL platform 0: NVIDIA CUDA, 1 device(s).
> Using device 0: GeForce GTX 570
> Optimal Group work Size = 256
> Benchmarking: PHPASS-OPENCL [PORTABLE-MD5]... DONE
> Raw:    502690 c/s real, 601043 c/s virtual
>
> OpenCL platform 0: NVIDIA CUDA, 1 device(s).
> Using device 0: GeForce GTX 570
> Optimal Group work Size = 32
> Benchmarking: wpapsk-opencl [GPU - OpenCL]... DONE
> Raw:    24094 c/s real, 24094 c/s virtual
>
>
> Simple as that ;-)
I don't believe what I see... I tried that too. I even tried to
hardcode work_group_size to 128, without any change. OMG.

> For crypt-md5, it's a matter of compilers not being very verbose (or
> rather, not telling you a dang thing). When I get weird problems like
> this I usually try all compilers: nvidia, AMD and Intel. Usually one of
> them (and usually just one, and you never know which) reports the
> problem, but not this time. I have yet to succeed in building clcc on
> bull, but on my laptop I got this:
>
> $ clcc opencl/cryptmd5_kernel.cl output.ptx
> Building...
>
> :150:6: error: call to 'rotate' is ambiguous
>        a = ROTATE_LEFT(AC1 + x[0], S11);
>            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
> :9:27: note: instantiated from:
> #define ROTATE_LEFT(x, s) rotate(x,s)
>                          ^~~~~~
> <built-in>:2784:22: note: candidate function
> int __OVERLOADABLE__ rotate(int, int);
>
> ...and almost 10,000 similar lines. So armed with this knowledge it was
> in fact trivial:
>
> -#define ROTATE_LEFT(x, s) rotate(x,s)
> +#define ROTATE_LEFT(x, s) rotate(x, (uint32_t)s)
>
>
> OpenCL platform 0: NVIDIA CUDA, 1 device(s).
> Using device 0: GeForce GTX 570
> Max Group Work Size 960
> Optimal Group work Size = 128
> Benchmarking: CRYPTMD5-OPENCL [MD5-based CRYPT]... DONE
> Raw:    653872 c/s real, 646647 c/s virtual
>
>
> On the 7970 we get a nice ASIC hang as usual though :-/

Thank you for figuring this out for me.
The clcc hint is very valuable. Hopefully next time I'll do better.
Lukas
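
For readers of the archive, a minimal sketch of why the cast in the quoted
fix helps. OpenCL's built-in rotate() is overloaded per type, and both
arguments must have the same type, so a call that mixes an unsigned value
with a signed shift count can match several overloads equally well. The
plain C below is only a host-side illustration: rotate_u32() and
rotate_left_selfcheck() are hypothetical names, not the OpenCL built-in and
not code from the kernel, and it assumes the rotated value is a 32-bit
unsigned integer. The extra parentheses around the macro arguments are an
added precaution that is not part of the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the OpenCL built-in rotate(): both parameters
   share one type, mirroring the "same type for both arguments" rule. */
static uint32_t rotate_u32(uint32_t v, uint32_t n)
{
    n &= 31u;                                   /* keep the count in 0..31 */
    return (v << n) | (v >> ((32u - n) & 31u)); /* well defined for n == 0 */
}

/* Same shape as the fixed macro: the shift count is cast to the type of
   the rotated value, so only one candidate signature fits. */
#define ROTATE_LEFT(x, s) rotate_u32((x), (uint32_t)(s))

/* Small self-check; returns 1 if the rotations behave as expected. */
int rotate_left_selfcheck(void)
{
    assert(ROTATE_LEFT(0x80000001u, 7) == 0x000000c0u);
    assert(ROTATE_LEFT(1u, 31) == 0x80000000u);
    return 1;
}
```

In the kernel itself the cast is the only change needed; the helper above
exists only so the idea can be compiled and checked on the host.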
