Date: Mon, 27 Aug 2012 00:44:27 +0530
From: Sayantan Datta <>
Subject: Re: bitslice DES on GPU

Hi Solar,

On Sun, Aug 26, 2012 at 2:23 PM, Solar Designer <> wrote:

> So let's suppose DES_BS_EXPAND=1.  In that case, you could do the
> expansion on CPU side, but the reason why you won't is that you want to
> keep the key setup finalization on GPU.  That's fine.  However, all you
> need on GPU to perform the expansion is that constant array of 768
> indices; you don't need any pointers, and you don't want to use pointers
> for the same reason that would apply in the DES_BS_EXPAND=0 case (the
> pointers would differ between work-items and thus would require a lot
> more memory total).

int index;
for (index = 0; index < 0x300; index++)
    vst(*(kvtype *)&opencl_DES_bs_all[section].KS.v[index], 0,
        *(kvtype *)opencl_DES_bs_all[section].KSp[index]);

So you are saying that I should do this portion of the code on the CPU. I
think I could do it in DES_bs_init() and get rid of KSp altogether (right?).
It would also be easier this way.
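
For reference, the index-based expansion described in the quoted text could
look roughly like the sketch below. It is only an illustration under stated
assumptions: vtype stands in for kvtype, the names ks_index,
build_placeholder_index() and expand_keys() are hypothetical, and the table
contents are a placeholder pattern, not the real DES key schedule.

```c
#include <assert.h>

#define KS_ENTRIES 0x300  /* 768 key schedule slots, as in the loop above */
#define KEY_BITS   56     /* DES key bits per bitslice group */

typedef unsigned int vtype;  /* hypothetical stand-in for kvtype */

/*
 * Placeholder table: ks_index[i] names the key bit that feeds slot i.
 * The pattern below is NOT the real DES key schedule; a real table would
 * be derived from the key schedule that currently fills KSp.
 */
static void build_placeholder_index(unsigned short ks_index[KS_ENTRIES])
{
    int i;

    for (i = 0; i < KS_ENTRIES; i++)
        ks_index[i] = (unsigned short)((i * 7 + 3) % KEY_BITS);
}

/* Expand using indices only; nothing here depends on where keys[] lives. */
static void expand_keys(vtype ks[KS_ENTRIES], const vtype keys[KEY_BITS],
                        const unsigned short ks_index[KS_ENTRIES])
{
    int i;

    for (i = 0; i < KS_ENTRIES; i++) {
        assert(ks_index[i] < KEY_BITS);
        ks[i] = keys[ks_index[i]];
    }
}
```

Because the table holds small integers rather than addresses, the same 768
entries could be shared by every work-item, which is the memory argument made
in the quoted text.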

There is another, bigger problem. The addresses stored in the E.E[] array
alias the B[] array. Transferring data to the GPU changes the location of
B[], which causes the same kind of problem as above. The most complex part
is the initialization of the E.E[] array in the set_salt() function. I could
port both DES_bs_init() and set_salt() to the GPU (as separate kernels), but
that would lead to the same problem: I would need to transfer data to and
from the GPU to synchronize it with the CPU between kernel calls, and
whenever I transfer data from the CPU to the GPU, the address of B[]
changes.
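
One possible way around the address problem, sketched below, is to store
offsets into B[] instead of pointers, so the table stays valid wherever B[]
ends up. This is an assumption-laden illustration, not the existing code:
vtype, the array sizes, and the helpers pointers_to_indices(),
simulate_transfer() and load_e() are all hypothetical, and the copy is a
plain memcpy() standing in for a host-to-device transfer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define B_SIZE 64   /* assumed number of entries in B[] */
#define E_SIZE 96   /* assumed number of entries in E.E[] */

typedef unsigned int vtype;  /* hypothetical stand-in for the vector type */

/* Convert pointers into b_base[] to offsets that survive relocation. */
static void pointers_to_indices(unsigned char e_index[E_SIZE],
                                vtype **e_ptr, vtype *b_base)
{
    int i;

    for (i = 0; i < E_SIZE; i++) {
        ptrdiff_t off = e_ptr[i] - b_base;

        assert(off >= 0 && off < B_SIZE);
        e_index[i] = (unsigned char)off;
    }
}

/* Stand-in for a CPU-to-GPU copy: the data lands at a new address. */
static void simulate_transfer(vtype *dst, const vtype *src)
{
    memcpy(dst, src, B_SIZE * sizeof(vtype));
}

/* Read the i-th E input relative to whatever base B[] currently has. */
static vtype load_e(const vtype *b_base, const unsigned char e_index[E_SIZE],
                    int i)
{
    return b_base[e_index[i]];
}
```

With this layout, set_salt() would only rewrite small integers, and the
kernel would add its own base address for B[] at the point of use.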

One more thing I would like to know: which function is called immediately
after set_salt()? If it is crypt_all(), then maybe I could move the required
portion of the code from DES_bs_init() to set_salt() and port set_salt() to
the GPU. In that case there would be no need for a CPU-to-GPU data transfer
in between, and the address of B[] would remain the same for the set_salt()
and DES_bs_crypt_25() kernels.

P.S.: I assume that a GPU buffer stays at the same global memory location
unless we explicitly write to the buffer.
