Date: Mon, 23 Apr 2012 03:07:39 +0200
From: magnum <>
Subject: Re: New RAR OpenCL kernel

On 04/23/2012 12:02 AM, Claudio André wrote:
>> Would both these figures be closer to 100 in a dream scenario, or what?
>> By the way my previous version of rar got an "occupancy" of 0.01 or so
>> (lol) in nvidia profiler. We'll see if there is any change now.
>> magnum
> I like the "dream scenario". Valid explanation. And 100 is the target.
> Alu packing has a "> 70" expectation.
> Alubusy is where 100% is optimal.
> I agree that sprofile is not very useful, but is better than nothing (or
> simple guessing). Since you have NVIDIA tools, it is not that important.

I think sprofile is useful, it's just that my laptop GPU is so weak I
can't draw any conclusions.

Your profiling info was with LWS=GWS. Please try this if you have the time:

1. Pull latest git
2. Run with KPC=0 (I expect it to pick 4096 or higher as best)
3. Do another profiling run with the best KPC
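
For readers unfamiliar with the idea, the kind of auto-tune pass that KPC=0 is expected to trigger can be sketched roughly as below. This is only an illustration under assumptions, not the actual John the Ripper code: `pick_best_kpc`, `measure_rate` and `toy_rate` are hypothetical names, and the throughput model is made up to stand in for a real timed kernel run.

```python
def pick_best_kpc(measure_rate, start=256, limit=65536):
    """Try batch sizes start, 2*start, ... up to limit.

    measure_rate(kpc) is assumed to return candidates per second for
    one timed run at that batch size. Returns (best_kpc, best_rate).
    """
    best_kpc, best_rate = start, measure_rate(start)
    kpc = start * 2
    while kpc <= limit:
        rate = measure_rate(kpc)
        if rate > best_rate:
            best_kpc, best_rate = kpc, rate
        kpc *= 2
    return best_kpc, best_rate


def toy_rate(kpc, overhead=4096.0, peak=1000.0):
    # Hypothetical model: a fixed per-launch overhead is amortized over
    # the batch, so larger batches approach the peak rate.
    return peak * kpc / (kpc + overhead)


if __name__ == "__main__":
    best_kpc, _ = pick_best_kpc(toy_rate)
    print("best KPC in the toy model:", best_kpc)
```

In a real run, `measure_rate` would time the kernel on the GPU, and the profiling pass in step 3 would then use whichever value came out best.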

The ALU figures (and speed) should go up a lot (I hope). If they do
not, the profiling info should tell us why.

