Date: Fri, 17 Aug 2012 11:44:10 +0530
From: Sayantan Datta <>
Subject: Re: bitslice DES on GPU

On Fri, Aug 17, 2012 at 10:16 AM, Solar Designer <> wrote:

> With a GPU implementation, you will only have one thread on the CPU
> side (unless you try to use the CPU for computation as well, which is
> tricky), so you will probably not have a cpt parameter in your revision
> of the code.  However, if you choose not to use a huge DES_BS_DEPTH
> value, you would instead need an array of a large number of structs
> similar to DES_bs_combined, and then you'd have some variable or
> constant corresponding to this array's size.

This is what I'm trying to do. In fact, I'm trying to simulate this
condition on the CPU (before porting to OpenCL), but I'm kind of stuck with
that too. I have declared DES_bs_all[] as an array and made some of the
necessary adjustments, but certainly not all. I guess each instance of
DES_bs_combined is used for 32 hashes, given that I set DES_BS_DEPTH=32. I
have also set MAX_KEYS_PER_CRYPT to a multiple of 32. Now, what other
global parameters must be changed for such an implementation? Or is it
going to be too complex if I proceed this way?
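
To make the layout concrete, here is a minimal sketch of what I mean by an
array of 32-wide instances. It is not the actual John the Ripper code: the
names DEPTH, MAX_KEYS, NUM_BLOCKS, bs_block, bs_set_key_bits and
bs_get_key_bits are hypothetical stand-ins, and bs_block holds only the 56
key-bit words rather than everything in DES_bs_combined. The point is just
that candidate number "index" lands in instance index / DEPTH, at bit
position index % DEPTH.

```c
#include <assert.h>
#include <stdint.h>

#define DEPTH      32                  /* stands in for DES_BS_DEPTH */
#define MAX_KEYS   (DEPTH * 4)         /* stands in for MAX_KEYS_PER_CRYPT */
#define NUM_BLOCKS (MAX_KEYS / DEPTH)  /* size of the array of instances */

/* Hypothetical, much reduced stand-in for DES_bs_combined:
 * one 32-bit word per key bit; bit j of every word belongs to
 * the j-th candidate stored in this instance. */
typedef struct {
    uint32_t K[56];
} bs_block;

static bs_block blocks[NUM_BLOCKS];

/* Store the 56 key bits of candidate `index` in bitslice form. */
void bs_set_key_bits(int index, uint64_t key56)
{
    assert(index >= 0 && index < MAX_KEYS);
    bs_block *b = &blocks[index / DEPTH];            /* which instance */
    uint32_t mask = (uint32_t)1 << (index % DEPTH);  /* which bit lane */

    for (int i = 0; i < 56; i++) {
        if ((key56 >> i) & 1)
            b->K[i] |= mask;
        else
            b->K[i] &= ~mask;
    }
}

/* Read the 56 key bits of candidate `index` back out of its lane. */
uint64_t bs_get_key_bits(int index)
{
    assert(index >= 0 && index < MAX_KEYS);
    const bs_block *b = &blocks[index / DEPTH];
    int lane = index % DEPTH;
    uint64_t key = 0;

    for (int i = 0; i < 56; i++)
        key |= (uint64_t)((b->K[i] >> lane) & 1) << i;
    return key;
}
```

With this layout, the only size-related values are DEPTH and the candidate
count; the array size follows from them, which is why the candidate count
has to be a multiple of DEPTH.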

>  It will have some
> similarity to cpt, as well as to DES_bs_nt.  On CPU, we need both of
> these (and the latter has the former factored into it).  When
> interfacing to GPU, you will only need one - and only if you choose to
> keep DES_BS_DEPTH low, which you don't have to.

So, is it a better idea to use a bigger vector and then process the parts
of the vector in parallel?
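
As I understand the alternative, it would look roughly like the sketch
below. This is only my assumption about what a "bigger vector" would mean:
wide_vec, WORDS, xor_part and xor_wide are made-up names, and the loop in
xor_wide simulates on the CPU what separate GPU work-items would do, each
handling one 32-bit part of the vector.

```c
#include <assert.h>
#include <stdint.h>

#define WORDS 4  /* hypothetical: 4 x 32 = 128 candidates per wide vector */

/* One bitslice "bit" for 128 candidates, stored as 32-bit parts. */
typedef struct {
    uint32_t w[WORDS];
} wide_vec;

/* One part of a bitwise XOR, the kind of operation bitslice DES is
 * built from. `work_item` plays the role of a GPU work-item id. */
void xor_part(wide_vec *out, const wide_vec *a, const wide_vec *b,
              int work_item)
{
    assert(work_item >= 0 && work_item < WORDS);
    out->w[work_item] = a->w[work_item] ^ b->w[work_item];
}

/* CPU simulation: the parts are independent of each other, so this
 * loop could run as WORDS parallel work-items instead. */
void xor_wide(wide_vec *out, const wide_vec *a, const wide_vec *b)
{
    for (int i = 0; i < WORDS; i++)
        xor_part(out, a, b, i);
}
```

Since no part depends on any other, the result is the same whether the
parts are computed one after another or all at once.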


