Date: Tue, 18 Aug 2015 10:30:32 +0200
From: magnum <>
Subject: Re: Formats using non-SIMD SHA2 implementations

On 2015-08-18 05:51, Lei Zhang wrote:
>> On Aug 18, 2015, at 9:43 AM, magnum <> wrote:
>> RAR3 can also be tens of MB in size (per lane). But in an early rar-opencl kernel I had it as just two full buffers: "unsigned char c[2*64]" (which was also in a union with other ways to describe it). Then I always wrote to buffer[index & 127]. Whenever I saw that I had crossed into "the other" buffer, I called the digest function for the buffer that had just been filled.
>> I'm not sure I'm describing it very well %-)  Maybe looking at "git show 2972a53899:src/opencl/" will show what I mean. That code did not use vectors, but the idea applies to SIMD on CPU too. Very effective in terms of memory use.
> I viewed your code. It seems you only need to handle a single lane in the kernel function. The problem in the SIMD code is that I have to handle all lanes simultaneously. With your double-buffer approach, I need to call the digest function when the buffers for all lanes are full, but they might not fill up in the same round. The buffers for some lanes might fill faster than those for other lanes, so it's complicated to determine at which points to call the digest function.
> Or perhaps I didn't understand your point correctly?
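
For readers following the thread, the double-buffer idea quoted above can be sketched for a single lane roughly as follows. This is only an illustration, not the code from the rar-opencl kernel: lane_t, lane_init, lane_write and digest_block are hypothetical names, and digest_block is a trivial stand-in for a real compression function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK 64

typedef struct {
    unsigned char c[2 * BLOCK]; /* two full blocks, addressed as a ring */
    size_t index;               /* total number of bytes written so far */
    size_t blocks;              /* number of blocks handed to the digest */
    uint32_t state;             /* stand-in digest state (assumption) */
} lane_t;

/* Hypothetical stand-in for the real hash compression function. */
static void digest_block(lane_t *l, const unsigned char *blk)
{
    for (size_t i = 0; i < BLOCK; i++)
        l->state = l->state * 31 + blk[i];
    l->blocks++;
}

static void lane_init(lane_t *l)
{
    assert(l != NULL);
    l->index = 0;
    l->blocks = 0;
    l->state = 0;
}

/* Append len bytes; digest each 64-byte half as soon as it is complete. */
static void lane_write(lane_t *l, const unsigned char *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        /* Same addressing as "buffer[index & 127]" in the description. */
        l->c[l->index & (2 * BLOCK - 1)] = data[i];
        l->index++;
        if ((l->index & (BLOCK - 1)) == 0) {
            /* We are about to enter the other half: digest the one
               that was just filled. */
            size_t start = (l->index - BLOCK) & (2 * BLOCK - 1);
            digest_block(l, &l->c[start]);
        }
    }
}
```

Writing 200 bytes through lane_write would trigger three digest calls and leave 8 bytes pending in the ring, so memory use stays at 128 bytes per lane regardless of input size. With multi-byte stores, the second half is what absorbs a store that straddles a block boundary.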

Oh, you are right. The most effective way of handling this case might be
to sort the inputs by length, as in Jim's sha256crypt, and *then* do it like above.
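
A minimal sketch of why sorting helps, assuming 64-byte blocks and a SIMD group of `width` lanes. It is not taken from sha256crypt; cand_t, sort_by_length and count_divergent_groups are hypothetical names used only for illustration. After sorting, neighbouring lanes tend to cross block boundaries in the same round, so fewer groups need special handling.

```c
#include <stddef.h>
#include <stdlib.h>

#define BLOCK 64

/* One candidate: its input length and its original position. */
typedef struct {
    size_t len;
    int idx;
} cand_t;

/* Order by length; break ties by original index for a stable result. */
static int cmp_len(const void *a, const void *b)
{
    const cand_t *x = a, *y = b;
    if (x->len != y->len)
        return x->len < y->len ? -1 : 1;
    return (x->idx > y->idx) - (x->idx < y->idx);
}

static void sort_by_length(cand_t *c, int n)
{
    qsort(c, (size_t)n, sizeof *c, cmp_len);
}

/* Count SIMD groups whose lanes do not all fill the same number of
   complete blocks, i.e. groups where the digest calls would not line up. */
static int count_divergent_groups(const cand_t *c, int n, int width)
{
    int divergent = 0;
    for (int start = 0; start < n; start += width) {
        int end = start + width < n ? start + width : n;
        size_t first = c[start].len / BLOCK;
        for (int i = start + 1; i < end; i++) {
            if (c[i].len / BLOCK != first) {
                divergent++;
                break;
            }
        }
    }
    return divergent;
}
```

With lengths {10, 130, 20, 140, 30, 150, 40, 160} and a width of 4, both groups are divergent before sorting and neither is afterwards. The idx field lets results be mapped back to the original candidate order.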

