Date: Fri, 8 Feb 2013 10:24:20 -0500
From: George D. Gal <ggal@...curity.com>
To: john-dev@...ts.openwall.com
Subject: des-opencl broken on OSX

Hey Guys,

I have a question about some of the recent changes to the des-opencl patches in unstable and bleeding jumbo. The kernel fails to build on my Mac with an ATI Radeon HD 6970M:

> john --format=des-opencl --test
> OpenCL platform 0: Apple, 2 device(s).
> Device 1: ATI Radeon HD 6970M
> Build log: Error getting function data from server
> Error -11 building kernel. DEVICE_INFO=1090
> OpenCL error (CL_BUILD_PROGRAM_FAILURE) in file (common-opencl.c) at line (209) - (clBuildProgram failed.)

I've tried changing the work sizes in opencl_DES_WGS.h, as well as the other options in that file, without success. However, when I change the default work size from 64, either down or up, the build error above goes away, but john just hangs, as shown below:


> john --format=des-opencl --test
> OpenCL platform 0: Apple, 2 device(s).
> Device 1: ATI Radeon HD 6970M
> 
> 


I saw some recent discussion (thread excerpt below) about switching between the other fallback modes of the kernel, but I'm not sure what that involves. Could you provide any further guidance?

I've tried nearly all of the other OpenCL formats, and most of them seem to work fine. The exceptions are this one and ntlmv2-opencl, which hangs indefinitely, much like des-opencl does when I adjust the work size.

Any ideas?

Regards,
 George

> On Tue, Jan 8, 2013 at 12:23 AM, Solar Designer <solar@...nwall.com> wrote:
> 
> > On Tue, Jan 08, 2013 at 12:07:37AM +0530, Sayantan Datta wrote:
> > > If I'm correct we would need more cases, e.g. 96, 144, 192, ... up to 720, to
> > > fully hardcode the entire loop.
> >
> > Not exactly.  With one instance of DES fully unrolled, you would no
> > longer need the rounds_and_swapped variable and those branches.  You
> > would simply have one big loop for descrypt's 25 iterations, with one
> > fully unrolled instance of DES inside (no branching in it).  It'd exceed
> > cache size, but maybe that's OK for some GPUs with some settings (will
> > need to tune).
> >
> 
> So you mean to say something like this would do:
> 
> for(i=1;i<=25;i++) {
>        hardcode k=0;
>        hardcode k=96;
>        hardcode k=192;
>        hardcode k=288;
>        .
>        .
>        .
>        hardcode k=672;
> }
> 
> Although this isn't equivalent to the current loop structure as written,
> maybe the two are mathematically the same. Is this what you mean?
> 
> 
> >
> > > > I'm not sure how to keep both (or all three?) approaches in the same
> > > > source tree best, though.  3 formats?  Or a format with compile-time
> > > > fallbacks (e.g., use binary patching when the target GPU type is one of
> > > > those where we've tested this and it works, and fallback to E[] for
> > > > other devices?)  Perhaps we'll make a "final" determination on that at a
> > > > later time, but for now we simply need to have these available.
> > >
> > > I'll make a compile time fallback for now if you agree.
> >
> > I agree, but I'm not sure what criteria you'd use for triggering the
> > fallback.  What do you have in mind?  Manual switch?
> >
> > Alexander
> >
> 
> Use the binary patch whenever possible and manually switch between the
> other two fallback modes.
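
For reference, the loop structure discussed in the quoted thread (one big loop for descrypt's 25 iterations, with one fully unrolled, branch-free DES instance inside) can be sketched in plain C roughly as follows. This is only an illustration of the control flow, not the des-opencl kernel: hardcoded_block() is a hypothetical stand-in that records the key offset k instead of doing any DES work, and reading each "hardcode k=..." step as a pair of rounds (2 x 48 subkey bits = 96) is an assumption based on the offsets 0, 96, ..., 672 in the pseudocode above.

```c
#include <assert.h>
#include <stddef.h>

enum {
    DES_ROUNDS         = 16, /* rounds in one DES instance */
    KEY_BITS_PER_ROUND = 48, /* subkey bits consumed per round */
    ROUNDS_PER_BLOCK   = 2,  /* assumption: one hardcoded block = one round pair */
    K_STEP             = KEY_BITS_PER_ROUND * ROUNDS_PER_BLOCK, /* 96 */
    BLOCKS_PER_DES     = DES_ROUNDS / ROUNDS_PER_BLOCK,         /* 8 */
    CRYPT_ITERATIONS   = 25  /* descrypt applies DES 25 times */
};

/* Hypothetical stand-in for one hardcoded block: it only records the
 * key offset k so that the order of execution can be inspected. */
static void hardcoded_block(int k, int *trace, size_t *n)
{
    trace[(*n)++] = k;
}

/* One fully unrolled DES instance: fixed key offsets, no branches and
 * no rounds_and_swapped style state. */
static void des_unrolled(int *trace, size_t *n)
{
    hardcoded_block(0 * K_STEP, trace, n); /* k = 0   */
    hardcoded_block(1 * K_STEP, trace, n); /* k = 96  */
    hardcoded_block(2 * K_STEP, trace, n); /* k = 192 */
    hardcoded_block(3 * K_STEP, trace, n); /* k = 288 */
    hardcoded_block(4 * K_STEP, trace, n); /* k = 384 */
    hardcoded_block(5 * K_STEP, trace, n); /* k = 480 */
    hardcoded_block(6 * K_STEP, trace, n); /* k = 576 */
    hardcoded_block(7 * K_STEP, trace, n); /* k = 672 */
}

/* The single outer loop: 25 iterations, each running the unrolled
 * instance once. Returns the number of blocks executed. */
size_t descrypt_loop(int *trace, size_t cap)
{
    size_t n = 0;

    assert(cap >= (size_t)CRYPT_ITERATIONS * BLOCKS_PER_DES);
    for (int i = 1; i <= CRYPT_ITERATIONS; i++)
        des_unrolled(trace, &n);
    return n;
}
```

Calling descrypt_loop() with a 200-entry trace buffer fills it with the offsets 0 through 672, repeated 25 times. In a real kernel each hardcoded block would be straight-line code, which is why the fully unrolled version may exceed the instruction cache, as noted in the thread.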

