Date: Mon, 7 Jan 2013 22:53:29 +0400
From: Solar Designer <>
Subject: Re: des-opencl

On Tue, Jan 08, 2013 at 12:07:37AM +0530, Sayantan Datta wrote:
> If I'm correct we would need more cases, e.g. 96, 144, 192, ... up to 720,
> to fully hardcode the entire loop.

Not exactly.  With one instance of DES fully unrolled, you would no
longer need the rounds_and_swapped variable and those branches.  You
would simply have one big loop for descrypt's 25 iterations, with one
fully unrolled instance of DES inside (no branching in it).  The unrolled
code would exceed the instruction cache size, but maybe that's OK for some
GPUs with some settings (this will need tuning).
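
To illustrate the control-flow difference only, here is a minimal sketch in
plain C.  It assumes a toy Feistel round as a stand-in for a DES round (it is
NOT real DES and not the actual kernel code); toy_round, crypt25_looped,
crypt25_unrolled, demo_keys and check_variants_agree are hypothetical names.
The first variant tracks the round in a state variable and branches on it;
the second has one loop for the 25 iterations with 16 rounds written out and
no per-round branch.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for one DES round (NOT real DES): a generic Feistel step. */
static inline void toy_round(uint32_t *l, uint32_t *r, uint32_t k)
{
    uint32_t t = *r;
    *r = *l ^ (((t << 3) | (t >> 29)) + k);
    *l = t;
}

/* Variant with a round-state variable that is tested after every round. */
static uint64_t crypt25_looped(uint64_t block, const uint32_t keys[16])
{
    uint32_t l = (uint32_t)(block >> 32), r = (uint32_t)block;
    int iterations = 25, round = 0;

    while (iterations) {
        toy_round(&l, &r, keys[round]);
        if (++round == 16) {            /* branch on the round state */
            uint32_t t = l; l = r; r = t;
            round = 0;
            iterations--;
        }
    }
    return ((uint64_t)l << 32) | r;
}

#define R(i) toy_round(&l, &r, keys[i])

/* One loop for the 25 iterations; all 16 rounds unrolled, no inner branch. */
static uint64_t crypt25_unrolled(uint64_t block, const uint32_t keys[16])
{
    uint32_t l = (uint32_t)(block >> 32), r = (uint32_t)block;

    for (int iter = 0; iter < 25; iter++) {
        R(0);  R(1);  R(2);  R(3);  R(4);  R(5);  R(6);  R(7);
        R(8);  R(9);  R(10); R(11); R(12); R(13); R(14); R(15);
        uint32_t t = l; l = r; r = t;
    }
    return ((uint64_t)l << 32) | r;
}

/* Arbitrary demo round keys, for the self-check only. */
static const uint32_t demo_keys[16] = {
    3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3
};

/* Both variants must compute the same result for any input block. */
static void check_variants_agree(uint64_t block)
{
    assert(crypt25_looped(block, demo_keys) ==
           crypt25_unrolled(block, demo_keys));
}
```

Calling check_variants_agree() with a few blocks confirms the two shapes are
equivalent.  Whether the larger unrolled body is actually faster is a
separate question that depends on the device.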

> > I'm not sure how to keep both (or all three?) approaches in the same
> > source tree best, though.  3 formats?  Or a format with compile-time
> > fallbacks (e.g., use binary patching when the target GPU type is one of
> > those where we've tested this and it works, and fallback to E[] for
> > other devices?)  Perhaps we'll make a "final" determination on that at a
> > later time, but for now we simply need to have these available.
> I'll make a compile time fallback for now if you agree.

I agree, but I'm not sure what criteria you'd use for triggering the
fallback.  What do you have in mind?  Manual switch?
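
For concreteness, one possible shape for such a criterion is sketched below
in plain C.  Everything in it is an assumption made for illustration: the
des_variant enum, the tested_devices list with its placeholder device names,
and pick_des_variant are hypothetical and not taken from the source tree.  A
manual switch wins when given; otherwise only devices on a tested list get
binary patching, and everything else falls back to E[].

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical identifiers for the approaches discussed in this thread. */
enum des_variant {
    DES_USE_E_ARRAY = 0,      /* generic fallback via E[] */
    DES_USE_BINARY_PATCH = 1, /* binary patching, tested devices only */
    DES_USE_FULL_UNROLL = 2   /* one fully unrolled DES instance */
};

/* Placeholder names; a real list would hold devices that were tested. */
static const char *const tested_devices[] = { "ExampleGPU-A", "ExampleGPU-B" };

/*
 * manual_override < 0 means "decide automatically"; otherwise it must be
 * one of the enum values above and is returned as-is (the manual switch).
 */
static enum des_variant pick_des_variant(const char *device_name,
                                         int manual_override)
{
    size_t i;

    assert(manual_override <= DES_USE_FULL_UNROLL);
    if (manual_override >= 0)
        return (enum des_variant)manual_override;

    for (i = 0; i < sizeof(tested_devices) / sizeof(tested_devices[0]); i++)
        if (strcmp(device_name, tested_devices[i]) == 0)
            return DES_USE_BINARY_PATCH;

    return DES_USE_E_ARRAY;   /* untested device: safe fallback */
}
```

The chosen value could then be turned into a preprocessor define when the
kernel is built, so the decision is fixed at kernel compile time.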

