Date: Mon, 21 May 2012 18:56:15 -0300
From: Claudio André <claudioandre.br@...il.com>
To: john-dev@...ts.openwall.com
Subject: Nvidia compiler bug

Hi,

Looking at the verbose output of the Nvidia compiler and comparing it against the compiler output for Lukas' CUDA code (thanks for your good code), I realized the compiler was doing something silly.

So I used another valid path to achieve what I wanted, checked that it was doing what I expected, and:
- result: 3.2x faster.

Local work size (LWS) 512, Keys per crypt (KPC) 7680
Benchmarking: crypt SHA-512 (rounds=5000) [OpenCL]... DONE
Raw: 11405 c/s real, 11405 c/s virtual

If you use __local (or tried it and failed), try your code again and check whether smem is there as it should be.

Claudio.
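
One way to run the smem check suggested above is to scan the compiler's verbose output for the shared-memory figure reported per kernel. The following is only a minimal Python sketch. It assumes the log contains ptxas-style lines such as "Used N registers, X+Y bytes smem", as printed by nvcc with --ptxas-options=-v or placed in the OpenCL build log when the program is built with -cl-nv-verbose on Nvidia's driver. The exact line format may differ between driver versions. The function name smem_by_kernel, the kernel names and the sample log text are hypothetical, made up for illustration; none of them come from the original post.

```python
import re

# Assumed format: "Compiling entry function 'name' for 'sm_20'"
ENTRY_RE = re.compile(r"Compiling entry function '([^']+)'")
# Assumed format: "4096 bytes smem" or "4096+16 bytes smem"
SMEM_RE = re.compile(r"(\d+)(?:\+(\d+))? bytes smem")


def smem_by_kernel(log):
    """Return {kernel_name: shared-memory bytes} found in a ptxas-style log."""
    usage = {}
    current = None
    for line in log.splitlines():
        entry = ENTRY_RE.search(line)
        if entry:
            current = entry.group(1)
            usage[current] = 0  # stays 0 if no smem figure is reported
            continue
        if current is not None and "Used" in line:
            smem = SMEM_RE.search(line)
            if smem:
                usage[current] = int(smem.group(1)) + int(smem.group(2) or 0)
    return usage


# Hypothetical log text, for illustration only (not real compiler output).
SAMPLE_LOG = """\
ptxas info    : Compiling entry function 'kernel_a' for 'sm_20'
ptxas info    : Used 40 registers, 4096+0 bytes smem, 72 bytes cmem[0]
ptxas info    : Compiling entry function 'kernel_b' for 'sm_20'
ptxas info    : Used 63 registers, 72 bytes cmem[0]
"""

if __name__ == "__main__":
    for name, nbytes in smem_by_kernel(SAMPLE_LOG).items():
        status = "ok" if nbytes else "no smem reported"
        print(f"{name}: {nbytes} bytes smem ({status})")
```

A kernel that declares __local buffers but shows up with 0 bytes of smem would be the case worth a closer look, since the compiler may have placed the data somewhere other than shared memory.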