Date: Tue, 24 Apr 2012 11:07:26 -0300
From: Claudio André <claudioandre.br@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: New RAR OpenCL kernel - [3]

Again
----------

Hi, see the attached files. Please note that 2560 seems to be a "magic 
number".

- TXT: raw results (no profiler)
- The same CSV file.
- And some more summary information.

Profiler run using:
Local worksize (LWS) 256, Global worksize (KPC) 2560
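
To make the relation between these two sizes concrete, here is a minimal sketch in C. The helper names are hypothetical and are not taken from rar_fmt.c. It assumes the OpenCL 1.x rule that, when a local work size is given, the global work size must be a multiple of it, so GWS 2560 with LWS 256 corresponds to 10 work-groups.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round a requested global work size up to the
 * next multiple of the local work size (lws). */
size_t round_up_gws(size_t requested, size_t lws)
{
    assert(lws > 0);
    return (requested + lws - 1) / lws * lws;
}

/* Hypothetical helper: number of work-groups launched for a given
 * global/local pair. The global size must already be a multiple of lws. */
size_t work_group_count(size_t gws, size_t lws)
{
    assert(lws > 0 && gws % lws == 0);
    return gws / lws;
}
```

For example, work_group_count(2560, 256) gives 10, and round_up_gws(2500, 256) gives 2560.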

----
   src/opencl/rar_kernel.cl |   34 ++++++++------
   src/rar_fmt.c            |  116 ++++++++++++++++++++++++++++++++++++++++-----
   2 files changed, 122 insertions(+), 28 deletions(-)
----




On 22-04-2012 22:07, magnum wrote:
> On 04/23/2012 12:02 AM, Claudio André wrote:
>>> Would both these figures be closer to 100 in a dream scenario, or what?
>>>
>>> By the way my previous version of rar got an "occupancy" of 0.01 or so
>>> (lol) in nvidia profiler. We'll see if there is any change now.
>>>
>>> magnum
>>>
>> I like the "dream scenario". Valid explanation. And 100 is the target.
>>
>> Alu packing has a ">  70" expectation.
>> Alubusy is where 100% is optimal.
>>
>> I agree that sprofile is not very useful, but it is better than nothing (or
>> simple guessing). Since you have NVIDIA tools, it is not that important.
> I think sprofile is useful, it's just that my laptop GPU is so weak I
> can't draw any conclusions.
>
> Your profiling info was with LWS=GWS. Please try this if you have the time:
>
> 1. Pull latest git
> 2. Run with KPC=0 (I expect it to pick 4096 or higher as best)
> 3. Do another profiling run with the best KPC
>
> The ALU figures (and speed) should go up a lot (I hope). If they are
> not, the profiling info should tell why.
>
> thanks,
> magnum
>
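
The "pick the best KPC" step quoted above can be pictured as a search over candidate sizes. The sketch below is only an illustration under stated assumptions: it is not how John the Ripper implements KPC=0, and pick_best_kpc, rate_fn and toy_rate are hypothetical names. It tries every multiple of LWS rather than only powers of two, since a value such as 2560 (10 x 256) would be skipped by a doubling search.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: returns a measured rate (higher is better)
 * for one candidate KPC value. */
typedef double (*rate_fn)(size_t kpc, void *ctx);

/* Try every multiple of lws up to max_kpc and return the candidate
 * with the highest measured rate (0 if there is no candidate). */
size_t pick_best_kpc(size_t lws, size_t max_kpc, rate_fn measure, void *ctx)
{
    size_t best = 0;
    double best_rate = -1.0;

    assert(lws > 0 && measure != NULL);
    for (size_t kpc = lws; kpc <= max_kpc; kpc += lws) {
        double rate = measure(kpc, ctx);
        if (rate > best_rate) {
            best_rate = rate;
            best = kpc;
        }
    }
    return best;
}

/* Toy stand-in for a real benchmark, NOT measured data: the rate is
 * highest at the KPC value that ctx points to. */
double toy_rate(size_t kpc, void *ctx)
{
    size_t peak = *(const size_t *)ctx;
    size_t dist = kpc > peak ? kpc - peak : peak - kpc;
    return 1.0 / (1.0 + (double)dist);
}
```

In real use, the callback would time an actual kernel run at each candidate size. The toy function only exists so the search logic can be exercised without a GPU.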


View attachment "0_api_trace_sum.KernelSummary.html" of type "text/html" (35575 bytes)

View attachment "0_api_trace_sum.Top10DataTransferSummary.html" of type "text/html" (37179 bytes)

View attachment "0_api_trace_sum.Top10KernelSummary.html" of type "text/html" (37258 bytes)

View attachment "0_perf_counter.csv" of type "text/csv" (6079 bytes)
