Message-ID: <CAKGDhHVG=GeZaQ0W1bDunhEeg9sG2qYf50TYPCMh-05d9AjT_w@mail.gmail.com>
Date: Fri, 21 Aug 2015 17:40:42 +0200
From: Agnieszka Bielec <bielecagnieszka8@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Argon2 on GPU
2015-08-20 22:34 GMT+02:00 Solar Designer <solar@...nwall.com>:
> Also, we're optimizing this blindfolded, and that's wrong. We should be
> reviewing the generated code. You may patch common-opencl.c:
> opencl_build_kernel_opt() to invoke opencl_build() like this:
>
> opencl_build(sequential_id, opts, 1, "kernel.out");
>
> instead of the current:
>
> opencl_build(sequential_id, opts, 0, NULL);
>
> Then when targeting NVIDIA cards it dumps PTX assembly to the filename
> specified there. It looks something like this, just much larger:
>
> http://arrayfire.com/demystifying-ptx-code/
>
> You could start by experimenting with a kernel that is much simpler than
> Argon2 yet similar in some ways: implement some trivial operation like XOR
> on different vector widths and see whether/how this changes the assembly.
> Then make it slightly less trivial (just enough to prevent the compiler
> from optimizing things out) and add uses of private or local memory,
> and see if you can make it run faster by using wider vectors for the
> same private or local memory usage.
>
I tested (on the 960M only):
- copying memory from __private to __private
- copying from __global to __private
- XORing __private arrays with __private arrays
using ulong, ulong2, ulong4, and ulong8 (I was getting an empty kernel with ulong16).
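A test of this kind could look roughly like the sketch below. It is not the kernel used for the numbers in this message, only an illustration: the names xor_test, VEC, BYTES and build_opts are made up, and the kernel source is kept in a C string so that the vector width can be chosen with a -DVEC=... build option. Whether the arrays end up in registers or in local (spilled) memory is up to the compiler.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical OpenCL test kernel, stored as a string for the host code.
 * VEC is supplied at build time (-DVEC=ulong, ulong2, ulong4 or ulong8);
 * BYTES fixes the private memory per array, so wider vectors mean fewer
 * elements, not more memory. */
const char *xor_kernel_src =
    "#define BYTES 256\n"
    "#define N (BYTES / sizeof(VEC))\n"
    "__kernel void xor_test(__global const VEC *in, __global VEC *out)\n"
    "{\n"
    "    size_t gid = get_global_id(0);\n"
    "    VEC a[N], b[N];\n"
    "    for (uint i = 0; i < N; i++) {\n"
    "        a[i] = in[gid * 2 * N + i];\n"
    "        b[i] = in[gid * 2 * N + N + i];\n"
    "    }\n"
    "    for (uint i = 0; i < N; i++)\n"
    "        a[i] ^= b[i];\n"
    "    VEC acc = a[0];\n"
    "    for (uint i = 1; i < N; i++)\n"
    "        acc ^= a[i];\n"
    "    out[gid] = acc; /* keep the result live */\n"
    "}\n";

/* Write the build option for a given vector width (1, 2, 4 or 8) into buf.
 * Returns 0 on success, -1 on an unsupported width or too small a buffer. */
int build_opts(char *buf, size_t len, unsigned width)
{
    int n;

    if (width == 1)
        n = snprintf(buf, len, "-DVEC=ulong");
    else if (width == 2 || width == 4 || width == 8)
        n = snprintf(buf, len, "-DVEC=ulong%u", width);
    else
        return -1;
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The option string would be appended to the usual build options, so the same source can be rebuilt once per width and the resulting PTX dumps compared.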
In the generated PTX code, ulong4 and ulong8 accesses were split into ulong2 ones,
something like this (ulong4):
ld.global.v2.u64 {%rd73, %rd74}, [%rd926+8000];
ld.global.v2.u64 {%rd77, %rd78}, [%rd926+8016];
st.local.v2.u64 [%rd937+208], {%rd77, %rd78};
st.local.v2.u64 [%rd937+192], {%rd73, %rd74};
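The split follows from the access width: each load and store in this listing moves two 64-bit values (16 bytes), so a 32-byte ulong4 needs two of them and a 64-byte ulong8 would need four. A small C sketch of that arithmetic, assuming 16 bytes is the widest single access the compiler emits here (the helper names are made up for illustration):

```c
/* Bytes moved by the widest access in the listing (v2.u64 = 2 x 8 bytes).
 * Assumption: the compiler never emits anything wider on this target. */
#define ACCESS_BYTES 16u

/* Number of loads (or stores) needed for one vector of `lanes` 64-bit
 * elements, when each access moves at most ACCESS_BYTES. */
unsigned accesses_per_vector(unsigned lanes)
{
    unsigned bytes = lanes * 8u;
    return (bytes + ACCESS_BYTES - 1u) / ACCESS_BYTES;
}

/* Byte offset of the k-th access of a vector that starts at `base`. */
unsigned access_offset(unsigned base, unsigned k)
{
    return base + k * ACCESS_BYTES;
}
```

For example, a ulong4 at offset 8000 gives accesses at 8000 and 8016, matching the two ld.global.v2.u64 lines.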
I was getting the best speed with ulong (except when copying from global to private).
Speeds (//1, //2, //4, //8 mean ulong, ulong2, ulong4, ulong8):
xoring:
//1
Local worksize (LWS) 64, global worksize (GWS) 256
DONE
Speed for cost 1 (t) of 2, cost 2 (m) of 2
Many salts: 1312 c/s real, 1312 c/s virtual
Only one salt: 1301 c/s real, 1312 c/s virtual
//2
Local worksize (LWS) 64, global worksize (GWS) 256
DONE
Speed for cost 1 (t) of 2, cost 2 (m) of 2
Many salts: 590 c/s real, 590 c/s virtual
Only one salt: 595 c/s real, 595 c/s virtual
//4
Local worksize (LWS) 64, global worksize (GWS) 256
DONE
Speed for cost 1 (t) of 2, cost 2 (m) of 2
Many salts: 914 c/s real, 914 c/s virtual
Only one salt: 906 c/s real, 898 c/s virtual
//8
Speed for cost 1 (t) of 2, cost 2 (m) of 2
Many salts: 738 c/s real, 731 c/s virtual
Only one salt: 738 c/s real, 738 c/s virtual
copying from global:
//1
Local worksize (LWS) 64, global worksize (GWS) 256
DONE
Speed for cost 1 (t) of 2, cost 2 (m) of 2
Many salts: 853 c/s real, 860 c/s virtual
Only one salt: 860 c/s real, 860 c/s virtual
//2
Local worksize (LWS) 64, global worksize (GWS) 256
DONE
Speed for cost 1 (t) of 2, cost 2 (m) of 2
Many salts: 1174 c/s real, 1185 c/s virtual
Only one salt: 1174 c/s real, 1163 c/s virtual
//4,8
Local worksize (LWS) 64, global worksize (GWS) 256
DONE
Speed for cost 1 (t) of 2, cost 2 (m) of 2
Many salts: 1122 c/s real, 1113 c/s virtual
Only one salt: 1132 c/s real, 1132 c/s virtual
copying from private:
//1
Local worksize (LWS) 64, global worksize (GWS) 256
DONE
Speed for cost 1 (t) of 2, cost 2 (m) of 2
Many salts: 2258 c/s real, 2236 c/s virtual
Only one salt: 2258 c/s real, 2258 c/s virtual
//2
Local worksize (LWS) 64, global worksize (GWS) 256
DONE
Speed for cost 1 (t) of 2, cost 2 (m) of 2
Many salts: 685 c/s real, 679 c/s virtual
Only one salt: 685 c/s real, 691 c/s virtual
//4
Local worksize (LWS) 64, global worksize (GWS) 256
DONE
Speed for cost 1 (t) of 2, cost 2 (m) of 2
Many salts: 1153 c/s real, 1142 c/s virtual
Only one salt: 1163 c/s real, 1163 c/s virtual
//8
Local worksize (LWS) 64, global worksize (GWS) 256
DONE
Speed for cost 1 (t) of 2, cost 2 (m) of 2
Many salts: 1796 c/s real, 1796 c/s virtual
Only one salt: 1812 c/s real, 1812 c/s virtual
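To compare the three tests at a glance, the "Many salts" real c/s figures can be put side by side. The sketch below only restates the numbers from this message and picks the fastest width per test; it assumes the //1, //2, //4, //8 labels stand for ulong, ulong2, ulong4, ulong8, and the array and function names are made up for illustration.

```c
#include <stddef.h>

/* "Many salts" real c/s from the runs in this message; columns are
 * //1, //2, //4, //8 (assumed to mean ulong, ulong2, ulong4, ulong8).
 * The combined "//4,8" result is used for both of its columns. */
const unsigned widths[4]     = {    1,    2,    4,    8 };
const unsigned xor_cs[4]     = { 1312,  590,  914,  738 };
const unsigned global_cs[4]  = {  853, 1174, 1122, 1122 };
const unsigned private_cs[4] = { 2258,  685, 1153, 1796 };

/* Return the vector width with the highest c/s (first one wins on a tie). */
unsigned best_width(const unsigned cs[4])
{
    size_t best = 0;

    for (size_t i = 1; i < 4; i++)
        if (cs[i] > cs[best])
            best = i;
    return widths[best];
}
```

With these figures the scalar ulong is fastest for XORing and for private-to-private copies, while ulong2 is fastest for copies from global memory.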