Date: Wed, 11 Apr 2012 01:11:57 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: OpenCL runtime errors

On 04/04/2012 02:17 AM, Solar Designer wrote:
> Somehow four of the OpenCL formats don't work for me on this system
> where I previously had a mix of Nvidia and AMD stuff, which I've now
> temporarily tried to clean up to just Nvidia. The errors are different.
>
> The rest of the OpenCL formats work fine - specifically, all of
> Samuele's fast hashes and magnum's new RAR with OpenCL pass the tests.
> So my setup is not 100% broken. ;-) (Also, all CUDA stuff works.)

All,

While experimenting with RAR I found a detail that stops many runtime
errors similar to the ones mentioned - and makes adjustment to different
devices (including really weak ones) easier.

Most current OpenCL formats (I think all except mine) use this to
determine the device's maximum local worksize:

    clGetDeviceInfo(devices[gpu_id], CL_DEVICE_MAX_WORK_GROUP_SIZE,
                    sizeof(max_group_size), &max_group_size, NULL);

...but this figure does not help you much. It just shows the maximum
supported worksize for any (lean) kernel on this device. I use this
instead:

    clGetKernelWorkGroupInfo(crypt_kernel, devices[gpu_id],
                             CL_KERNEL_WORK_GROUP_SIZE,
                             sizeof(max_group_size), &max_group_size, NULL);

This one tells us the maximum local worksize for *this very kernel* on
this device. That is, the OpenCL implementation uses the resource
requirements of the kernel (register usage etc.) to determine the max
usable local worksize.

Works like a charm. I currently don't even have a find_best_lws(), it's
just a couple of simple (and quick) tests in init().

The OpenCL support in RAR is very far from perfect though. I guess I
need to experiment with CL_KERNEL_LOCAL_MEM_SIZE or something, as well
as coalescing.

magnum
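
For illustration only, here is a rough sketch of the kind of quick check that could sit in init() once both limits have been queried. It is not the code from the RAR format. The helper name clamp_local_work_size() and its parameters are hypothetical, and rounding down to a power of two is an assumption made for this sketch, not something OpenCL requires. The inputs kernel_max and device_max stand for the values returned by the CL_KERNEL_WORK_GROUP_SIZE and CL_DEVICE_MAX_WORK_GROUP_SIZE queries above, so the helper itself needs no OpenCL headers.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the original post): choose a local work
 * size that respects both limits.
 *
 *   requested   - the local work size the format would like to use
 *   kernel_max  - value reported by CL_KERNEL_WORK_GROUP_SIZE
 *   device_max  - value reported by CL_DEVICE_MAX_WORK_GROUP_SIZE
 *
 * Assumption for this sketch: the result is rounded down to a power of
 * two, and is never smaller than 1.
 */
size_t clamp_local_work_size(size_t requested, size_t kernel_max,
                             size_t device_max)
{
    /* The usable ceiling is the smaller of the two reported limits. */
    size_t limit = kernel_max < device_max ? kernel_max : device_max;
    size_t lws = 1;

    /* Never exceed what the caller asked for. */
    if (requested < limit)
        limit = requested;

    /* Guard against a zero request or a zero limit. */
    if (limit < 1)
        limit = 1;

    /* Largest power of two that does not exceed the limit. */
    while (lws <= limit / 2)
        lws *= 2;

    assert(lws >= 1 && lws <= limit);
    return lws;
}
```

With a kernel limit of 192 on a device that advertises 1024, a request for 1024 would come back as 128, which is the sort of case where relying on the device-wide figure alone would likely fail at enqueue time.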