Date: Thu, 22 Nov 2012 01:16:02 +0100
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: Problems on recent AMD driver

On 21 Nov, 2012, at 22:06 , magnum <john.magnum@...hmail.com> wrote:
> ...but per nvidia whitepapers, the following (which is what I do, and others before me) is recommended for using "pinned" or "page-locked" memory. And it's a mix of the above:
> 
> Init:
>  1. mem_object = clCreateBuffer(...);
>  2. host_ptr = clEnqueueMapBuffer(..., mem_object, ...);
> 
> Loop:
>  3. (write stuff to host_ptr buffer)
>  4. clEnqueueWriteBuffer(..., mem_object, ..., host_ptr, ...);
>  5. clEnqueueNDRangeKernel(...);
>  6. goto 3

Whoa. After re-reading (yet again) section 3.1.1 of http://www.nvidia.com/content/cudazone/CUDABrowser/downloads/papers/NVIDIA_OpenCL_BestPracticesGuide.pdf I realized I had missed a subtle detail. It actually says we should do this:

Init:
 1. pinned_mem = clCreateBuffer(..., CL_MEM_ALLOC_HOST_PTR, ...);
 2. device_mem = clCreateBuffer(...);
 3. host_ptr = clEnqueueMapBuffer(..., pinned_mem, ...);

(the parameters for #1 and #2 are exactly the same except for CL_MEM_ALLOC_HOST_PTR)

Loop:
 4. (write stuff to host_ptr buffer)
 5. clEnqueueWriteBuffer(..., device_mem, ..., host_ptr, ...);
 6. clEnqueueNDRangeKernel(...);
 7. goto 4

So there are TWO allocations, one for device memory and another for the host pinned memory. The pinned memory is used for mapping while the device memory is used for clEnqueueWriteBuffer. This is obviously a whole lot different from what we've been running... but will it fix the problem? Claudio, could you try fc065b5 and see if md5crypt-opencl works better with this code in place? I will hold my breath.

magnum
