Openwall GNU/*/Linux - a small security-enhanced Linux distro for servers
Date: Mon, 24 Aug 2015 11:43:50 +0800
From: Lei Zhang <>
Subject: Re: Formats using non-SIMD SHA2 implementations

On Aug 23, 2015, at 11:15 PM, magnum <> wrote:
>> I might not have stated my problem clearly... I'll sum it up here: suppose I have a message of length (A + B) and I've already computed the hash of the first A bytes, which is H(A); given only H(A) and the remaining B bytes of data, can I compute the hash of the entire message, i.e. H(A + B)?
>> I want to use OpenSSL's SHA function to do this, but I don't see such an interface provided. This is already in use in JtR's SIMD SHA implementation (with the SSEi_RELOAD flag), so I think it's technically doable in OpenSSL. Do I need to hack SHA_CTX somehow?
> That's easy, but unfortunately it's not portable (we're not guaranteed the ctx struct internals; names differ among various libs). A better alternative might be to simply use our SIMD function (with RELOAD like you said) using just one lane.
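
For readers following the thread, here is a minimal scalar sketch of the "reload" idea in plain C. It is not JtR's SIMD code and not an OpenSSL interface; the names sha1_state, sha1_absorb and sha1_finish are hypothetical and exist only for illustration. One caveat it makes explicit: what can be reused is the chaining state saved after whole 64-byte blocks of the prefix, not a finalized digest (which already includes padding), and the final padding must encode the total message length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical midstate type: the five SHA-1 chaining words. */
typedef struct { uint32_t h[5]; } sha1_state;

static uint32_t rol32(uint32_t x, int n) { return (x << n) | (x >> (32 - n)); }

/* Standard SHA-1 initial values (FIPS 180-4). */
void sha1_init(sha1_state *s)
{
    s->h[0] = 0x67452301; s->h[1] = 0xEFCDAB89; s->h[2] = 0x98BADCFE;
    s->h[3] = 0x10325476; s->h[4] = 0xC3D2E1F0;
}

/* Compress one 64-byte block into the state. */
void sha1_block(sha1_state *s, const unsigned char *p)
{
    uint32_t w[80], a, b, c, d, e, f, k, t;
    int i;

    for (i = 0; i < 16; i++)
        w[i] = (uint32_t)p[4*i] << 24 | (uint32_t)p[4*i+1] << 16 |
               (uint32_t)p[4*i+2] << 8 | (uint32_t)p[4*i+3];
    for (; i < 80; i++)
        w[i] = rol32(w[i-3] ^ w[i-8] ^ w[i-14] ^ w[i-16], 1);

    a = s->h[0]; b = s->h[1]; c = s->h[2]; d = s->h[3]; e = s->h[4];
    for (i = 0; i < 80; i++) {
        if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999; }
        else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1; }
        else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDC; }
        else             { f = b ^ c ^ d;                   k = 0xCA62C1D6; }
        t = rol32(a, 5) + f + e + k + w[i];
        e = d; d = c; c = rol32(b, 30); b = a; a = t;
    }
    s->h[0] += a; s->h[1] += b; s->h[2] += c; s->h[3] += d; s->h[4] += e;
}

/* Absorb whole blocks only; the caller keeps the resulting midstate. */
void sha1_absorb(sha1_state *s, const unsigned char *data, size_t len)
{
    assert(len % 64 == 0);
    for (; len; data += 64, len -= 64)
        sha1_block(s, data);
}

/* Resume from a midstate (passed by value, so the saved copy stays intact):
 * hash the tail, then pad using the TOTAL message length in bytes. */
void sha1_finish(sha1_state s, const unsigned char *tail, size_t tail_len,
                 uint64_t total_len, unsigned char out[20])
{
    unsigned char buf[128] = {0};
    uint64_t bits = total_len * 8;
    size_t n, i;

    while (tail_len >= 64) { sha1_block(&s, tail); tail += 64; tail_len -= 64; }
    memcpy(buf, tail, tail_len);
    buf[tail_len] = 0x80;
    n = (tail_len < 56) ? 64 : 128;          /* does the length field fit? */
    for (i = 0; i < 8; i++)
        buf[n - 1 - i] = (unsigned char)(bits >> (8 * i));
    sha1_block(&s, buf);
    if (n == 128)
        sha1_block(&s, buf + 64);
    for (i = 0; i < 20; i++)
        out[i] = (unsigned char)(s.h[i / 4] >> (24 - 8 * (i % 4)));
}
```

Because sha1_finish takes the state by value, one saved midstate can be reused for many different tails, which is the situation where reloading pays off.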

OK, it's done now, but the improvement is not that significant (no OpenMP):

Benchmarking: rar, RAR3 (4 characters) [SHA1 AES 32/64]... DONE
Raw:	91.1 c/s real, 91.1 c/s virtual

Benchmarking: rar, RAR3 (4 characters) [SHA1 256/256 AVX2 8x AES]... DONE
Raw:	141 c/s real, 141 c/s virtual

The code is already as optimized as 7z, so I don't expect much room for improvement.

BTW, I encountered an issue when testing with OpenMP. I used some stack arrays as vector buffers, but gcc-5 cannot align them properly when OpenMP is enabled (thus giving me segfaults). This seems to be a known issue in some older gcc versions:

I also used stack arrays as vector buffers in 7z, but somehow it just works fine. If this cannot be solved, I could switch to global arrays anyway.
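
Besides global arrays, another possible workaround is to over-allocate the stack buffer and align the pointer by hand, so the code does not depend on the compiler honouring an alignment attribute. The following is only a sketch under assumptions: VEC_ALIGN, align_up and worker_buffer_is_aligned are hypothetical names, and 32 bytes is assumed as the AVX2 vector width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VEC_ALIGN 32            /* assumed: AVX2 vector width in bytes */
#define BUF_BYTES (8 * 64)      /* assumed payload: 8 lanes x 64 bytes */

/* Round p up to the next multiple of align (align must be a power of two). */
static void *align_up(void *p, size_t align)
{
    assert(align && (align & (align - 1)) == 0);
    return (void *)(((uintptr_t)p + align - 1) & ~(uintptr_t)(align - 1));
}

/* Hypothetical per-thread worker: its stack buffer is aligned manually,
 * so it stays aligned even if the compiler misplaces the raw array. */
int worker_buffer_is_aligned(void)
{
    unsigned char raw[BUF_BYTES + VEC_ALIGN];   /* payload plus slack */
    unsigned char *buf = align_up(raw, VEC_ALIGN);

    memset(buf, 0, BUF_BYTES);                  /* stays inside raw[] */
    return ((uintptr_t)buf % VEC_ALIGN) == 0 &&
           buf >= raw && buf + BUF_BYTES <= raw + sizeof raw;
}
```

The cost is VEC_ALIGN extra bytes of stack per buffer; a real worker would pass buf to the vector code where this sketch calls memset.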

