Date: Wed, 11 Apr 2012 06:23:30 +0200
From: Lukas Odzioba <lukas.odzioba@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: cryptmd5.cu fails to build with -arch sm_13

2012/4/11 Solar Designer <solar@...nwall.com>:
> The current magnum-jumbo fails to build with the CUDA target on
> cryptmd5.cu as follows:
>
> cd cuda; nvcc -c -Xptxas -v -arch sm_13 cryptmd5.cu
> ptxas info    : Compiling entry function '_Z14kernel_crypt_rP18crypt_md5_passwordPj' for 'sm_13'
> ptxas info    : Used 38 registers, 64+0 bytes lmem, 30736+16 bytes smem, 21 bytes cmem[0], 48 bytes cmem[1]
> ptxas error   : Entry function '_Z14kernel_crypt_rP18crypt_md5_passwordPj' uses too much shared data (0x7810 bytes + 0x10 bytes system, 0x4000 max)
> make[1]: *** [cuda_cryptmd5.o] Error 255

On my system it builds fine for sm_13:
cd cuda; nvcc -c -Xptxas -v -arch sm_13 cryptmd5.cu
ptxas info    : Compiling entry function '_Z14kernel_crypt_rP18crypt_md5_passwordPj' for 'sm_13'
ptxas info    : Used 28 registers, 64+0 bytes lmem, 10248+16 bytes smem, 21 bytes cmem[0], 40 bytes cmem[1]
mv cuda/cryptmd5.o cuda_cryptmd5.o

My cuda_cryptmd5.h has:
#define BLOCKS 28
#define THREADS 256

Your build uses 3 times as much shared memory (did you change THREADS to
768?) and 10 more registers.
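
To make the comparison concrete, here is a small host-side sketch in plain C
that checks both shared memory totals against the limit ptxas printed. The
numbers are copied from the two reports above; the helper name smem_fits is
made up for this example and is not part of the john tree.

```c
#include <assert.h>
#include <stddef.h>

/* Shared memory figures copied from the two ptxas reports in this thread. */
enum {
    SMEM_FAILING = 30736 + 16, /* 0x7810 bytes + 0x10 bytes system */
    SMEM_WORKING = 10248 + 16, /* build with THREADS 256 */
    SMEM_MAX     = 0x4000      /* limit quoted in the ptxas error for sm_13 */
};

/* Hypothetical helper: does the total shared memory fit under the limit? */
int smem_fits(size_t total_bytes, size_t limit_bytes)
{
    assert(limit_bytes > 0);
    return total_bytes <= limit_bytes;
}
```

With these values, smem_fits(SMEM_WORKING, SMEM_MAX) is true and
smem_fits(SMEM_FAILING, SMEM_MAX) is false, which matches the two build
results.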

I agree that john should build for every card. Auto-configuration
similar to that in the OpenCL code has to be created, so users don't have
to bother about sm versions or threads/blocks.
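
A rough idea of what the thread-count part of such auto-configuration could
look like, as a sketch only. The function pick_threads and its parameters are
hypothetical. The 40 bytes per thread is an assumption inferred from 10248
bytes at THREADS 256, and the 32 bytes of fixed overhead is also assumed. In
real code the limit would presumably come from the device at run time (for
example the sharedMemPerBlock field of cudaDeviceProp) instead of a constant.

```c
#include <assert.h>
#include <stddef.h>

#define WARP_SIZE 32 /* threads per warp on NVIDIA GPUs */

/*
 * Hypothetical helper: return the largest multiple of WARP_SIZE, not above
 * max_threads, whose shared memory use fits in smem_limit.
 * Returns 0 if not even one warp fits.
 */
unsigned pick_threads(size_t smem_limit, size_t fixed_bytes,
                      size_t bytes_per_thread, unsigned max_threads)
{
    assert(bytes_per_thread > 0);
    if (smem_limit <= fixed_bytes)
        return 0;

    /* How many threads fit once the fixed overhead is subtracted. */
    size_t fit = (smem_limit - fixed_bytes) / bytes_per_thread;
    if (fit > max_threads)
        fit = max_threads;

    /* Round down to a whole number of warps. */
    return (unsigned)(fit - fit % WARP_SIZE);
}
```

Under these assumptions, pick_threads(0x4000, 32, 40, 768) gives 384 for the
16 KB limit from the error message, so a request for 768 threads would be
scaled down instead of failing at build time.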

Lukas
