Date: Fri, 22 May 2015 02:00:27 +0200
From: Agnieszka Bielec <bielecagnieszka8@...il.com>
To: john-dev@...ts.openwall.com
Subject: autotune_run problem

hi,
I've fixed one bug in my parallel-opencl, but I have a problem with autotune_run.
I discovered that, depending on the arguments, a different algorithm
decides when tuning of gws stops.
If I call
autotune_run(self, 1000, 0, 500);

the stopping point is determined by this code (common-opencl.c):

        if (duration_time && (endTime - startTime) > duration_time) {
            runtime = looptime = 0;

            if (options.verbosity > 4)
                fprintf(stderr, " (exceeds %s)", ns2string(duration_time));
            break;
        }

which means that we keep trying gws values until one exceeds the
time limit, and then we choose the gws that gave the best speed

the computed gws is optimal on my laptop and with --dev=1, but not with
--dev=5: there it prints "exceeds" for the optimal value, and setting a
higher duration_time doesn't help
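
To make the cutoff described above concrete, here is a minimal sketch of a "measure until too slow, then keep the fastest" loop. It is not the code from common-opencl.c: `struct sample` and `pick_gws` are hypothetical names, and the measurements are passed in as an array instead of being timed on a device. It does not model why raising duration_time fails to help on --dev=5.

```c
#include <stddef.h>

/* One hypothetical measurement; not a structure from common-opencl.c. */
struct sample {
    size_t gws;                  /* global work size that was tried */
    unsigned long long duration; /* measured kernel time, in ns */
    double speed;                /* candidates per second at this gws */
};

/*
 * Walk the samples in order and stop at the first one whose duration
 * exceeds duration_time (0 disables the limit, as in the quoted check).
 * Return the gws with the best speed among the samples seen before the
 * stop, or 0 if no sample qualified.
 */
static size_t pick_gws(const struct sample *s, size_t n,
                       unsigned long long duration_time)
{
    size_t best_gws = 0;
    double best_speed = 0.0;

    for (size_t i = 0; i < n; i++) {
        if (duration_time && s[i].duration > duration_time)
            break; /* this gws is discarded, even if it is the fastest */
        if (s[i].speed > best_speed) {
            best_speed = s[i].speed;
            best_gws = s[i].gws;
        }
    }
    return best_gws;
}
```

In this simplified model, a gws whose kernel time is over the limit can never be selected, no matter how fast it is.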

when my autotune_run call looks like:
autotune_run(self, 1, 1000, 100000);
the point at which we stop computing is determined by:
        if (best_speed && speed < 1.8 * best_speed &&
            max_run_time && run_time > max_run_time) {
            if (!optimal_gws)
                optimal_gws = num;

            if (options.verbosity > 3)
                fprintf(stderr, " - too slow\n");
            break;
        }

we stop trying new gws values only when the new speed is less than
1.8 times the best speed so far and the run time exceeds max_run_time

and 1.8 is the wrong value for parallel; changing 1.8 to 1.1 works
well for --dev=1 and on my laptop, but for --dev=5 it stops at a
suboptimal gws=4096.

it stops at 4096 because there is no difference in speed between
gws=4096 and 8192;
for 32768 the speed is better
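
The effect of the factor can be checked by hand. The sketch below wraps the quoted condition in a hypothetical helper, `too_slow`, with the factor as a parameter (1.8 in the quoted code, 1.1 in my experiment). The helper is only an illustration and is not part of common-opencl.c; when trying it, any run_time larger than max_run_time will do, since the real run times are not listed in this message.

```c
/*
 * Hypothetical helper mirroring the quoted stop condition.
 * Returns 1 when tuning would stop at this step, 0 otherwise.
 */
static int too_slow(double best_speed, double speed, double factor,
                    unsigned long long max_run_time,
                    unsigned long long run_time)
{
    return best_speed && speed < factor * best_speed &&
           max_run_time && run_time > max_run_time;
}
```

With the "Many salts" real figures for --dev=5 listed below, going from 4096 to 8192 gives 23405 / 23405 = 1.0, which is under 1.1, so tuning stops before 32768 is ever tried. The step to 32768 would give 33098 / 23405, about 1.41, which passes a factor of 1.1 but not 1.8.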

the speed for parallel on --dev=5 for various gws:
Local worksize (LWS) 64, global worksize (GWS) 4096
Benchmarking: parallel-opencl, parallel SHA-512 [ ]... DONE
Speed for cost 1 (N) of 0
Many salts:     23405 c/s real, 23630 c/s virtual
Only one salt:  23630 c/s real, 23630 c/s virtual

Local worksize (LWS) 64, global worksize (GWS) 8192
Benchmarking: parallel-opencl, parallel SHA-512 [ ]... DONE
Speed for cost 1 (N) of 0
Many salts:     23405 c/s real, 23630 c/s virtual
Only one salt:  21557 c/s real, 21557 c/s virtual

Local worksize (LWS) 64, global worksize (GWS) 32768
Benchmarking: parallel-opencl, parallel SHA-512 [ ]... DONE
Speed for cost 1 (N) of 0
Many salts:     33098 c/s real, 33098 c/s virtual
Only one salt:  33098 c/s real, 33098 c/s virtual

any idea how I can get the optimal gws for --dev=5 as well?
the results above might suggest that some hashes are not autotuned
properly, but people who know autotune_run better should
comment on this
