john-dev - Re: PHC: Argon2 on GPU


Message-ID: <3ds6vvg1b1wnduysrpnxw7oi.1440342712808@email.android.com>
Date: Sun, 23 Aug 2015 11:11:52 -0400
From: Alain Espinosa <alainesp@...ta.cu>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Argon2 on GPU

-------- Original message --------
From: Solar Designer <solar@...nwall.com> 
Date: 08/23/2015 3:21 AM (GMT-05:00)
To: john-dev@...ts.openwall.com 
Cc: 
Subject: Re: [john-dev] PHC: Argon2 on GPU 

...In source code, it should be something like:
dst_lo = src1_lo + src2_lo;
dst_hi = src1_hi + src2_hi + (dst_lo < src1_lo);

In SHA512 OpenCL code I use:

uint2 x1, x2; // declared as uint2 instead of ulong x1, x2

uint2 result = as_uint2(as_ulong(x1) + as_ulong(x2)); // sum of x1 and x2

This generates the appropriate 32-bit additions with carry on Nvidia, AMD and Intel GPUs.
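For checking the arithmetic on a CPU, here is a minimal plain-C sketch of the same decomposition. It is not the SHA512 kernel code. The names u32x2, add64_via_halves, add64_native and check_add64 are hypothetical, and the struct only stands in for OpenCL's uint2, assuming the first component holds the low 32 bits (as on little-endian devices).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for OpenCL's uint2: the two 32-bit halves of a 64-bit value. */
typedef struct { uint32_t lo, hi; } u32x2;

/* 64-bit addition using only 32-bit additions plus an explicit carry. */
u32x2 add64_via_halves(u32x2 a, u32x2 b)
{
    u32x2 r;
    r.lo = a.lo + b.lo;                 /* wraps modulo 2^32 */
    r.hi = a.hi + b.hi + (r.lo < a.lo); /* carry out of the low half */
    return r;
}

/* Reference: the same sum through a native 64-bit addition. */
u32x2 add64_native(u32x2 a, u32x2 b)
{
    uint64_t x = ((uint64_t)a.hi << 32) | a.lo;
    uint64_t y = ((uint64_t)b.hi << 32) | b.lo;
    uint64_t s = x + y;
    u32x2 r;
    r.lo = (uint32_t)s;
    r.hi = (uint32_t)(s >> 32);
    return r;
}

/* Compare both versions on one input that forces a carry. */
void check_add64(void)
{
    u32x2 a = { 0xFFFFFFFFu, 0x00000001u };
    u32x2 b = { 0x00000001u, 0x00000002u };
    u32x2 p = add64_via_halves(a, b);
    u32x2 q = add64_native(a, b);
    assert(p.lo == q.lo && p.hi == q.hi);
}
```

Both functions should agree for every input; the carry term (r.lo < a.lo) is 1 exactly when the low-half addition wrapped.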

The 64-bit rotation is done manually, without using OpenCL's rotate. amd_bitalign provides a very small speedup, but note that when used with shift counts that are multiples of 8 it generates errors, at least when I tested it, so we need to use amd_bytealign in those cases.
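One way a manual 64-bit rotation can be built from 32-bit halves is sketched below in plain C. This is an illustration under assumptions, not the code from the kernel: half_pair, funnel_shift_right, rotr64_via_halves and rotr64_native are hypothetical names. The funnel-shift helper returns the low 32 bits of a 64-bit (hi:lo) pair shifted right, which is the kind of operation amd_bitalign is meant to provide; being ordinary C, it says nothing about the multiple-of-8 problem mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of 32-bit halves of one 64-bit word. */
typedef struct { uint32_t lo, hi; } half_pair;

/* Low 32 bits of the 64-bit value (hi:lo) shifted right by n, with n taken modulo 32. */
uint32_t funnel_shift_right(uint32_t hi, uint32_t lo, unsigned n)
{
    return (uint32_t)((((uint64_t)hi << 32) | lo) >> (n & 31));
}

/* Rotate a 64-bit word right by n bits using only operations on its halves. */
half_pair rotr64_via_halves(half_pair x, unsigned n)
{
    half_pair r;
    n &= 63;
    if (n & 32) {            /* rotating by 32 just swaps the halves */
        uint32_t t = x.lo;
        x.lo = x.hi;
        x.hi = t;
    }
    n &= 31;                 /* remaining rotation is 0..31 bits */
    r.lo = funnel_shift_right(x.hi, x.lo, n);
    r.hi = funnel_shift_right(x.lo, x.hi, n);
    return r;
}

/* Reference rotation on a native 64-bit integer. */
uint64_t rotr64_native(uint64_t x, unsigned n)
{
    n &= 63;
    return n ? (x >> n) | (x << (64 - n)) : x;
}

/* Compare both versions for every rotation count on one sample word. */
void check_rotr64(void)
{
    half_pair x = { 0x89ABCDEFu, 0x01234567u };
    uint64_t v = ((uint64_t)x.hi << 32) | x.lo;
    for (unsigned n = 0; n < 64; n++) {
        half_pair r = rotr64_via_halves(x, n);
        assert((((uint64_t)r.hi << 32) | r.lo) == rotr64_native(v, n));
    }
}
```

Rotation counts of 32 or more are handled by swapping the halves first, so the funnel shift only ever sees counts from 0 to 31.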

Regards, 
Alain
