Date: Mon, 17 Aug 2015 08:26:51 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: Formats using non-SIMD SHA2 implementations

On 2015-08-17 05:07, Lei Zhang wrote:
> I finally got 7z to work correctly with SIMD :)

Cool. Maybe you could do RAR3 too?

> On an AVX2 machine, with OpenMP disabled:
>
> [without SIMD]
> Benchmarking: 7z, 7-Zip (512K iterations) [SHA256 AES 32/64]... DONE
> Speed for cost 1 (iteration count) of 524288
> Raw:	14.5 c/s real, 14.5 c/s virtual
>
> [with SIMD]
> Benchmarking: 7z, 7-Zip (512K iterations) [SHA256 AES 32/64]... DONE
> Speed for cost 1 (iteration count) of 524288
> Raw:	41.0 c/s real, 41.0 c/s virtual
>
> So there's a ~3x speedup, while the ideal speedup is 8x. As magnum mentioned, the code is really tricky to write. I'm not sure if there's space for further optimization.

Are you sorting lengths, like Jim hinted? Or are you handling diverging 
lengths like in SAP F/G?
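
For readers unfamiliar with the first option, here is a minimal sketch of sorting candidates by length so that each group of SIMD lanes holds inputs of similar size. It is not the 7z or SAP F/G code from the tree: `struct cand`, `sort_candidates()` and `group_spread()` are hypothetical names, and the lane count is passed in as a parameter rather than taken from any real header.

```c
#include <stdlib.h>

/* Hypothetical per-candidate record; not a John the Ripper structure. */
struct cand {
	int index;  /* original position in the key buffer */
	int len;    /* candidate length in bytes */
};

/* Order by length, then by original index to keep the result stable. */
static int by_len(const void *a, const void *b)
{
	const struct cand *x = a, *y = b;

	if (x->len != y->len)
		return (x->len > y->len) - (x->len < y->len);
	return (x->index > y->index) - (x->index < y->index);
}

/* Sort candidates so neighbouring entries have similar lengths. */
void sort_candidates(struct cand *c, int count)
{
	qsort(c, count, sizeof(*c), by_len);
}

/*
 * Difference between the longest and shortest candidate in one group of
 * `lanes` consecutive entries. A spread of 0 means all lanes in that
 * group can run in lockstep without any divergence handling.
 */
int group_spread(const struct cand *c, int count, int group, int lanes)
{
	int start = group * lanes;
	int end = start + lanes < count ? start + lanes : count;
	int lo, hi, i;

	if (start >= end)
		return 0;
	lo = hi = c[start].len;
	for (i = start + 1; i < end; i++) {
		if (c[i].len < lo) lo = c[i].len;
		if (c[i].len > hi) hi = c[i].len;
	}
	return hi - lo;
}
```

The `index` field is what lets the results be mapped back to the original key order after the lanes have been processed in sorted order.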

> And there's another minor issue: in 7z, the size of the message to be hashed is roughly plaintext_length*rounds (not accurate, just for easy discussion), where rounds is a big number. The original plaintext_length in the scalar code is 125, which makes the entire message really big and the overhead of copying it to the vector buffer extremely high. So I defined plaintext_length as a much smaller number (e.g. 28) in the SIMD code. I don't know if this would cause problems in practical use, though.
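
To put rough numbers on the size estimate above, here is a small sketch. It assumes the 7-Zip key derivation feeds SHA-256 with salt, then the password in UTF-16LE (2 bytes per character), then an 8-byte counter, once per round. The quoted text calls its own formula approximate, so treat this one the same way; `sevenzip_kdf_bytes()` is a made-up helper name.

```c
#include <stdint.h>

/*
 * Total bytes hashed for one candidate, ASSUMING each round hashes
 * salt || UTF-16LE password || 8-byte counter, for 2^log2_rounds rounds.
 */
uint64_t sevenzip_kdf_bytes(unsigned pw_chars, unsigned salt_len,
                            unsigned log2_rounds)
{
	uint64_t per_round = (uint64_t)salt_len + 2u * (uint64_t)pw_chars + 8u;

	return per_round << log2_rounds;
}
```

With an empty salt and the 524288 (2^19) iterations from the benchmark, a 125-character maximum gives 258 bytes per round, about 129 MiB per candidate, while 28 characters gives 64 bytes per round, or 32 MiB per candidate. Under these assumptions 64 bytes is exactly one SHA-256 input block.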

I think anything over 20 is fine, but you need to add the UTF-8 kludge 
(multiply by 3 if target_enc is UTF-8) to init(). The reason we didn't 
already have that kludge was that 125 is the core max anyway so bumping 
to MIN(125, 3*125) would be a no-op.
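
A standalone sketch of that arithmetic, with hypothetical names (`effective_plaintext_length()`, `CORE_MAX_LENGTH`) rather than the real format struct fields; in the tree this would be an assignment inside the format's init(), not a separate function:

```c
/* Core maximum mentioned above. */
#define CORE_MAX_LENGTH 125
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Maximum plaintext length in bytes to advertise. With a UTF-8 target
 * encoding one character may take up to 3 bytes here, so the byte limit
 * is tripled and then capped at the core maximum.
 */
int effective_plaintext_length(int format_max, int target_is_utf8)
{
	if (target_is_utf8)
		return MIN(CORE_MAX_LENGTH, 3 * format_max);
	return format_max;
}
```

With the old limit of 125 the result is 125 either way, which is the no-op described above. With a limit of 28 the UTF-8 case becomes 84.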

magnum
