Date: Wed, 11 Apr 2012 01:11:57 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: OpenCL runtime errors

On 04/04/2012 02:17 AM, Solar Designer wrote:
> Somehow four of the OpenCL formats don't work for me on this system
> where I previously had a mix of Nvidia and AMD stuff, which I've now
> temporarily tried to clean up to just Nvidia.  The errors are different.
> 
> The rest of the OpenCL formats work fine - specifically, all of
> Samuele's fast hashes and magnum's new RAR with OpenCL pass the tests.
> So my setup is not 100% broken. ;-)  (Also, all CUDA stuff works.)

All,

While experimenting with RAR I found a detail that stops many runtime
errors similar to the ones mentioned - and makes adjustment to different
devices (including really weak ones) easier. Most current OpenCL formats
(I think all except mine) use this to determine the device's maximum
local worksize:

    clGetDeviceInfo(devices[gpu_id], CL_DEVICE_MAX_WORK_GROUP_SIZE,
                    sizeof(max_group_size), &max_group_size, NULL);

...but this figure does not help you much. It is only the upper bound the
device supports for any kernel, which only a lean kernel can actually use.
I use this instead:

    clGetKernelWorkGroupInfo(crypt_kernel, devices[gpu_id],
                             CL_KERNEL_WORK_GROUP_SIZE,
                             sizeof(max_group_size), &max_group_size, NULL);

This one tells us the maximum local worksize for *this very kernel* on
this device. That is, the OpenCL implementation uses the resource
requirements of the kernel (register usage etc.) to determine the max
usable local worksize. Works like a charm. I currently don't even have a
find_best_lws(), it's just a couple of simple (and quick) tests in init().
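
To make the idea concrete, here is a minimal sketch of how the two figures
could be combined into a local worksize. It is not the code from the RAR
format: pick_local_work_size() and floor_pow2() are hypothetical helpers,
and rounding down to a power of two is an assumption made for the example,
not something OpenCL requires. The two limits are plain parameters so the
logic can be tried without a device; in a real format they would come from
the clGetKernelWorkGroupInfo() and clGetDeviceInfo() calls shown above.

```c
#include <stddef.h>

/* Hypothetical helper: largest power of two <= limit (0 if limit is 0). */
static size_t floor_pow2(size_t limit)
{
    size_t p = 1;

    if (limit == 0)
        return 0;
    while (p <= limit / 2)
        p *= 2;
    return p;
}

/*
 * Hypothetical helper: choose a local worksize for one kernel.
 *   preferred  - what the format would like to use (0 = no preference)
 *   kernel_max - value reported for CL_KERNEL_WORK_GROUP_SIZE
 *   device_max - value reported for CL_DEVICE_MAX_WORK_GROUP_SIZE
 * The result never exceeds the kernel-specific limit.
 */
static size_t pick_local_work_size(size_t preferred, size_t kernel_max,
                                   size_t device_max)
{
    size_t limit = kernel_max < device_max ? kernel_max : device_max;

    if (preferred != 0 && preferred < limit)
        limit = preferred;
    /* Assumption: prefer a power of two; not mandated by OpenCL. */
    return floor_pow2(limit);
}
```

With a device limit of 1024 and a kernel limit of 192, for example,
pick_local_work_size(0, 192, 1024) gives 128. A result of 0 means the
kernel cannot be run with these limits and the format should bail out.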

The OpenCL support in RAR is very far from perfect though. I guess I
need to experiment with CL_KERNEL_LOCAL_MEM_SIZE or something, as well
as coalescing.

magnum
