Date: Tue, 18 Jun 2013 16:58:07 +0200
From: Dániel Bali <balijanosdaniel@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: sha3-opencl

Hello!

Here are some profiler results for the keccak256 OpenCL kernel.
They are for a Turks GPU (AMD Radeon HD 7600M Series), on which I measured
919K c/s (real).
The AMD APP Profiler documentation explains what the values mean:

http://developer.amd.com/tools-and-sdks/heterogeneous-computing/amd-app-profiler/user-guide/app-profiler-session/

http://developer.amd.com/tools-and-sdks/heterogeneous-computing/amd-app-profiler/user-guide/app-profiler-settings/

VGPRs: 43
ScratchRegs: 23 ("If non zero, this is typically the main bottleneck. To
reduce this number, reduce the number of GPRs used by the kernel.")
KernelOccupancy (%): 15.625 (The limiting factor is the number of VGPRs
available)
ALUBusy (%): ~16 (This is bad: the ALUs are idle most of the time)
ALUPacking (%): 73 (This is okay, could be better)
CacheHit (%): 0 (No caching happens)
PathUtilization (%): 100

Another very useful feature is the kernel analyzer which shows statistics
for different architectures. Here is what it shows for Tahiti and Turks in
comparison:

(Tahiti / Turks)
ScratchRegs: 0 / 23
MaxVGPRs: 256 / 248
VGPRs: 199 / 43

This means that we aren't bottlenecked by ScratchRegs on Tahiti.
What's strange is that, even though Turks should allow 248 VGPRs, the kernel
only uses 43 there in practice and still needs 23 scratch registers.
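
As a rough cross-check of the VGPR-limited occupancy reported above, the
arithmetic can be sketched as below. This is only a sketch: the values 43 and
248 are the Turks figures from the profiler and kernel analyzer, but the cap
of 32 wavefronts per compute unit and the function name are assumptions, not
something the tools printed. With that assumed cap, the result matches the
reported 15.625.

```python
def occupancy_from_vgprs(vgprs_used, max_vgprs, max_wavefronts):
    """Return (wavefronts, occupancy in percent) if VGPRs are the only limit."""
    # Each resident wavefront needs its own set of the kernel's VGPRs,
    # so the register file bounds how many wavefronts fit at once.
    wavefronts = min(max_vgprs // vgprs_used, max_wavefronts)
    return wavefronts, 100.0 * wavefronts / max_wavefronts


# 43 (VGPRs) and 248 (MaxVGPRs) are the Turks values reported above.
# 32 wavefronts per compute unit is an ASSUMPTION, not profiler output.
waves, percent = occupancy_from_vgprs(43, 248, 32)
print(waves, percent)
```

Under this model, fewer VGPRs per work-item would let more wavefronts be
resident at once, which is the direction the profiler hint points in.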

Regards,
Daniel
