Date: Fri, 22 May 2015 14:22:48 +0200
From: Agnieszka Bielec <>
Subject: Re: autotune_run problem

2015-05-22 2:56 GMT+02:00 magnum <>:
> On 2015-05-22 02:00, Agnieszka Bielec wrote:
>> hi,
>> I've fixed one bug in my parallel-opencl but I have a problem with
>> autotune_run.
>> I discovered that, depending on the set of arguments, a different
>> algorithm is used to determine when tuning for gws stops.
>> If I call
>> autotune_run(self, 1000, 0, 500);

>> the computed gws is optimal on my laptop and with --dev=1, but not with
>> --dev=5: it prints "exceed" for the optimal value, and setting a higher
>> duration_time doesn't help
> Did you try setting it to 1000 instead of 500? If that works better you
> should implement a split kernel though. A full second duration is way too
> long.

Thanks! I was only testing the last argument with autotune_run(self, 1,
1000, 100000); I don't know why.
With 1000 it works so far for parallel.
But I discovered that pomelo at higher costs (5, which is not even that
high) can be up to 10 times faster than it currently is: setting the last
argument to 100000 makes it 4x faster,
and when I also change 1.8 to 1.1 it is 10x faster! (I must redo my speed
tests :<)
I tested this on my laptop with --cost=5:5,5:5.
Yesterday I thought there were two different algorithms for
determining when autotune_run stops, but it seems that both of them
check whether we exceeded the limits.
Is it possible to disable the time-based check?

>> when my autotune_run call looks like:
>> autotune_run(self, 1, 1000, 100000);
>> the time when we stop computing is determined by:
>> if (best_speed && speed < 1.8 * best_speed &&
>>     max_run_time && run_time > max_run_time) {
>>     if (!optimal_gws)
>>         optimal_gws = num;
>>     if (options.verbosity > 3)
>>         fprintf(stderr, " - too slow\n");
>>     break;
>> }
>> we stop computing new values for gws only when the new speed is less
>> than 1.8 times the previous best speed,
>> and 1.8 is the wrong value for parallel. Changing 1.8 to 1.1 works
>> well for --dev=1 and on my laptop, but for --dev=5 it stops at the
>> suboptimal gws=4096.


>> it stops at 4096 because there is no difference in speed between
>> gws=4096 and 8192;
>> for 32768 the speed is better
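
The stop condition quoted above can be tried in isolation. What follows is
only a minimal sketch in C, not the actual autotune_run code: the struct, the
function name tune_gws, the rule for updating the best speed, and every number
in hypothetical_samples are assumptions made up for illustration. Only the
shape of the break condition is taken from the quoted snippet.

```c
#include <assert.h>
#include <stddef.h>

/* One measurement per candidate GWS. All figures below are invented. */
struct gws_sample {
    size_t gws;       /* global work size that was tried */
    double speed;     /* candidates per second at this gws */
    double run_time;  /* kernel duration, same unit as max_run_time */
};

/* Hypothetical device: flat between 4096 and 8192, better at 32768. */
static const struct gws_sample hypothetical_samples[] = {
    {  1024, 100.0, 10.0 },
    {  2048, 190.0, 11.0 },
    {  4096, 360.0, 12.0 },
    {  8192, 361.0, 23.0 },
    { 16384, 400.0, 41.0 },
    { 32768, 700.0, 47.0 },
};

/*
 * Walk the samples in order and return the GWS the loop settles on.
 * gain_factor plays the role of the 1.8 (or 1.1) constant;
 * max_run_time == 0 disables the time-based part of the check,
 * mirroring the "max_run_time &&" term in the quoted condition.
 */
size_t tune_gws(const struct gws_sample *s, size_t n,
                double gain_factor, double max_run_time)
{
    double best_speed = 0.0;
    size_t optimal_gws = 0;
    size_t i;

    assert(s != NULL || n == 0);

    for (i = 0; i < n; i++) {
        /* Same shape as the quoted check: too little gain AND too slow. */
        if (best_speed > 0.0 && s[i].speed < gain_factor * best_speed &&
            max_run_time > 0.0 && s[i].run_time > max_run_time) {
            if (!optimal_gws)
                optimal_gws = s[i].gws;
            break;
        }
        /* Assumption: any faster candidate becomes the new best. */
        if (s[i].speed > best_speed) {
            best_speed = s[i].speed;
            optimal_gws = s[i].gws;
        }
    }
    return optimal_gws;
}
```

With these made-up numbers, tune_gws(hypothetical_samples, 6, 1.1, 20.0)
settles on 4096, because 8192 shows no real gain and already exceeds the time
limit. Passing 0 as max_run_time skips the time check in this sketch and lets
the loop reach 32768. Whether the real autotune_run behaves the same way with
a zero limit is not something this sketch can confirm.
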
> Perhaps you could try bumping your starting figure. You could make it start
> at e.g. 16384 or 32768 by changing the SEED macro. Setting it too high might
> break running on weak devices though - even if they are slower than CPU and
> utterly unusable, we should still behave.

I was thinking about making one more step forward
