Openwall GNU/*/Linux - a small security-enhanced Linux distro for servers
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 18 Aug 2015 03:43:16 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: Formats using non-SIMD SHA2 implementations

On 2015-08-18 02:08, Lei Zhang wrote:
> The problem in 7z is that the message to be hashed needs to be constructed first.
>
> The original scalar code (simplified):
>
> for (round = 0; round < rounds; ++round) {
>      SHA256_Update(&ctx, password, len);
>      SHA256_Update(&ctx, &round, sizeof(round));
> }
>
> The 'rounds' is a really big number, and a small difference in 'len' might results in very different length of the whole message. As a result, I cannot pre-tell how many limbs are needed.
>
> One way is to construct the whole message before feeding it on-fly to a small vector buffer; or use a large vector buffer to construct the message in-place. Either way I cannot avoid using a large buffer. The optimal way might be to construct the message on-fly while feeding it to a small vector buffer. This is theoretically doable, but I found it too tricky to implement... So I chose to use a large vector buffer (~30MB for a non-OpenMP build), which works ok so far.

RAR3 can also be tens of MB in size (per lane). But in early rar-opencl 
kernel I had it as just two full buffers: "unsigned char c[2*64]" (which 
was also in a union with other ways to describe it). Then I always wrote 
to buffer[index & 127]. Whenever I saw that I went into "the other" 
buffer, I called the digest function for the just filled buffer.

I'm not sure I describe it very well %-)  Maybe looking at "git show 
2972a53899:src/opencl/rar_kernel.cl" will show what I mean. That code 
did not use vectors but the idea will apply to SIMD CPU too. Very 
effective in terms of memory use.

magnum

Powered by blists - more mailing lists

Your e-mail address:

Powered by Openwall GNU/*/Linux - Powered by OpenVZ