Date: Sun, 23 Aug 2015 11:11:52 -0400
From: Alain Espinosa <alainesp@...ta.cu>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Argon2 on GPU

-------- Original message --------
From: Solar Designer <solar@...nwall.com>
Date: 08/23/2015 3:21 AM (GMT-05:00)
To: john-dev@...ts.openwall.com
Cc:
Subject: Re: [john-dev] PHC: Argon2 on GPU

...In source code, it should be something like:

    dst_lo = src1_lo + src2_lo;
    dst_hi = src1_hi + src2_hi + (dst_lo < src1_lo);

In SHA512 OpenCL code I use:

    uint2 x1, x2; // declare vars (instead of ulong x1, x2)
    uint2 result = as_uint2(as_ulong(x1) + as_ulong(x2)); // to sum x1 and x2

This generates the appropriate 32-bit sums with carry on Nvidia, AMD and Intel GPUs. The 64-bit rotation is done manually, not using OpenCL rotate. amd_bitalign provides a very small speedup, but note that when used with multiples of 8 it generates errors, at least when I tested it, so we need to use amd_bytealign in those cases.

Regards,
Alain
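
For reference, the carry-propagating addition and the manual 64-bit rotation discussed in this message can be sketched in plain C rather than OpenCL. This is only an illustration under stated assumptions: the type `u32x2` and the helpers `from_u64`, `to_u64`, `add64`, `rotr64` and `self_check` are hypothetical names that do not come from the SHA512 kernel, and the sketch does not model `amd_bitalign` or `amd_bytealign`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for OpenCL's uint2: a 64-bit value as two 32-bit halves. */
typedef struct { uint32_t lo, hi; } u32x2;

static u32x2 from_u64(uint64_t v) {
    u32x2 r = { (uint32_t)v, (uint32_t)(v >> 32) };
    return r;
}

static uint64_t to_u64(u32x2 v) {
    return ((uint64_t)v.hi << 32) | v.lo;
}

/* 64-bit addition built from two 32-bit additions plus a carry. */
static u32x2 add64(u32x2 a, u32x2 b) {
    u32x2 r;
    r.lo = a.lo + b.lo;                 /* wraps modulo 2^32 */
    r.hi = a.hi + b.hi + (r.lo < a.lo); /* wrap in the low half means carry = 1 */
    return r;
}

/* Manual 64-bit rotate right using only 32-bit shifts. */
static u32x2 rotr64(u32x2 x, unsigned n) {
    u32x2 r;
    n &= 63;
    if (n >= 32) {                      /* rotating by 32 just swaps the halves */
        uint32_t t = x.lo;
        x.lo = x.hi;
        x.hi = t;
        n -= 32;
    }
    if (n == 0)                         /* avoid an undefined shift by 32 below */
        return x;
    r.lo = (x.lo >> n) | (x.hi << (32 - n));
    r.hi = (x.hi >> n) | (x.lo << (32 - n));
    return r;
}

/* Small self-check against native 64-bit arithmetic. */
static void self_check(void) {
    uint64_t a = 0x00000000FFFFFFFFULL, b = 1;
    assert(to_u64(add64(from_u64(a), from_u64(b))) == a + b);
    assert(to_u64(rotr64(from_u64(0x0123456789ABCDEFULL), 8)) ==
           0xEF0123456789ABCDULL);
}
```

Calling `self_check()` from a test harness compares both helpers against native `uint64_t` arithmetic. On a GPU the compiler is expected to produce the equivalent add-with-carry pair from the `as_ulong` form shown above, so the explicit comparison is only needed when the halves are handled by hand.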