Date: Tue, 18 Jun 2013 16:58:07 +0200
From: Dániel Bali <balijanosdaniel@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: sha3-opencl

Hello!

Here are some profiler results for the keccak256 OpenCL kernel. They are for a Turks (AMD Radeon HD 7600M Series) GPU, on which I got 919K c/s (real) performance.

These pages explain what the values mean:
http://developer.amd.com/tools-and-sdks/heterogeneous-computing/amd-app-profiler/user-guide/app-profiler-session/
http://developer.amd.com/tools-and-sdks/heterogeneous-computing/amd-app-profiler/user-guide/app-profiler-settings/

VGPRs: 43
ScratchRegs: 23 ("If non zero, this is typically the main bottleneck. To reduce this number, reduce the number of GPRs used by the kernel.")
KernelOccupancy: 15.625 (The limiting factor is the # of VGPRs available)
ALUBusy (%): ~16 (This is bad)
ALUPacking (%): 73 (This is okay, could be better)
CacheHit (%): 0 (No caching happens)
PathUtilization (%): 100

Another very useful feature is the kernel analyzer, which shows statistics for different architectures. Here is what it shows for Tahiti and Turks in comparison:

(Tahiti / Turks)
ScratchRegs: 0 / 23
MaxVGPRs: 256 / 248
VGPRs: 199 / 43

This means that we aren't bottlenecked by ScratchRegs on Tahiti. What's strange is that even though Turks should allow 248 VGPRs, the kernel only uses 43 in practice.

Regards,
Daniel
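
As a rough cross-check of the KernelOccupancy figure above, the sketch below recomputes it from the reported register counts (43 VGPRs used, 248 available on Turks). It is not the profiler's actual model. The function names are hypothetical, the value 15.625 is read as a percentage, and the figure of 32 wavefront slots per compute unit is an assumption, chosen only because it reproduces the reported number. Other limiters such as local memory and work-group size are ignored.

```python
# Rough occupancy estimate when VGPR usage is the only limiting factor.
# ASSUMED_WAVE_SLOTS is an assumption, not a value taken from the post.
ASSUMED_WAVE_SLOTS = 32


def waves_limited_by_vgprs(max_vgprs: int, vgprs_used: int) -> int:
    """Number of wavefronts whose registers fit in the register file."""
    if vgprs_used <= 0:
        raise ValueError("vgprs_used must be positive")
    return max_vgprs // vgprs_used


def occupancy_percent(active_waves: int, wave_slots: int) -> float:
    """Share of wavefront slots that are occupied, as a percentage."""
    return 100.0 * min(active_waves, wave_slots) / wave_slots


# Values reported for Turks: MaxVGPRs = 248, VGPRs = 43.
turks_waves = waves_limited_by_vgprs(248, 43)
turks_occupancy = occupancy_percent(turks_waves, ASSUMED_WAVE_SLOTS)
```

Under these assumptions only five wavefronts fit at 43 VGPRs each, which would be consistent with the profiler naming VGPRs as the limiting factor. Getting below 41 VGPRs would allow a sixth wavefront under the same model.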