Date: Wed, 21 Mar 2012 09:49:36 +0200
From: Milen Rangelov <gat3way@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: SSSE3 PSHUFB (was: AMD Bulldozer and XOP)

Hello Alexander,

Yes, it was the SSSE3 PSHUFB instruction. I am not that familiar with the
JtR code, so I guess it would be better if someone who knows what they are
doing applies such a change.

The overall idea is very simple. Since SHA1 expects the input words
w[0]...w[15] in big-endian byte order (and the final result needs to be
converted to little-endian), we have two options: either swap the bytes
before we load the input into xmm registers (slow), or do the swap inside
the registers with SSE2 bitwise operations. What I used before was a
bit-twiddling hack built from several bitwise operations. I realized that
the same swap can be performed with just one PSHUFB instruction, like
this:


#include <tmmintrin.h> /* SSSE3: _mm_shuffle_epi8 (PSHUFB) */

__m128i swapmask = _mm_set_epi32(0x00010203, 0x04050607,
                                 0x08090a0b, 0x0c0d0e0f);

#define SSE_Endian_Reverse(a) \
{ \
    __m128i l = (a); \
    (a) = _mm_shuffle_epi8(l, swapmask); \
}


It was just an idea, but it turned out to work fine.
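
To make the byte selection easier to follow, here is a minimal sketch that models PSHUFB in portable scalar C, so it can be run without SSSE3 hardware. All helper names (pshufb_model, set_epi32_model, bswap32, get_lane, shuffle_lanes, self_check) are hypothetical and are not taken from the JtR code; bswap32 is a generic shift-and-mask swap shown only for comparison, not the earlier SSE2 hack itself. The model shows that with the arguments in the order given above, the mask selects bytes 15 down to 0: each 32-bit word is byte-swapped, and the order of the four words is reversed as well. That should be harmless when every lane is independent and the same shuffle is applied to both the input and the final result, but that is an assumption about how the macro is used. A mask that swaps bytes while keeping each word in its own lane is included for comparison.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of PSHUFB (_mm_shuffle_epi8): dst[i] = src[mask[i] & 15],
   or 0 when the high bit of mask[i] is set. */
static void pshufb_model(uint8_t dst[16], const uint8_t src[16],
                         const uint8_t mask[16])
{
    uint8_t tmp[16];
    for (int i = 0; i < 16; i++)
        tmp[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0f];
    memcpy(dst, tmp, 16);
}

/* Lay out four 32-bit values the way _mm_set_epi32(e3, e2, e1, e0) does:
   e0 fills bytes 0..3 (least significant byte first), e3 bytes 12..15. */
static void set_epi32_model(uint8_t out[16], uint32_t e3, uint32_t e2,
                            uint32_t e1, uint32_t e0)
{
    const uint32_t e[4] = { e0, e1, e2, e3 };
    for (int lane = 0; lane < 4; lane++)
        for (int b = 0; b < 4; b++)
            out[4 * lane + b] = (uint8_t)(e[lane] >> (8 * b));
}

/* Plain shift-and-mask byte swap of one 32-bit word, for comparison. */
static uint32_t bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Read lane n (0 = lowest) of a modelled register as a 32-bit word. */
static uint32_t get_lane(const uint8_t r[16], int n)
{
    return (uint32_t)r[4 * n] | ((uint32_t)r[4 * n + 1] << 8) |
           ((uint32_t)r[4 * n + 2] << 16) | ((uint32_t)r[4 * n + 3] << 24);
}

/* Shuffle four lanes (in[0] = lowest) with a mask given in
   _mm_set_epi32 argument order (m3 first, m0 last). */
static void shuffle_lanes(uint32_t out[4], const uint32_t in[4],
                          uint32_t m3, uint32_t m2, uint32_t m1, uint32_t m0)
{
    uint8_t src[16], mask[16], dst[16];
    set_epi32_model(src, in[3], in[2], in[1], in[0]);
    set_epi32_model(mask, m3, m2, m1, m0);
    pshufb_model(dst, src, mask);
    for (int n = 0; n < 4; n++)
        out[n] = get_lane(dst, n);
}

/* Compare both masks against the scalar byte swap. */
static void self_check(void)
{
    const uint32_t in[4] = { 0x11223344u, 0x55667788u,
                             0x99aabbccu, 0xddeeff00u };
    uint32_t out[4];

    /* Mask as written in the message: bytes swapped, lanes reversed. */
    shuffle_lanes(out, in, 0x00010203u, 0x04050607u,
                  0x08090a0bu, 0x0c0d0e0fu);
    for (int n = 0; n < 4; n++)
        assert(out[n] == bswap32(in[3 - n]));

    /* Alternative mask: bytes swapped, each word stays in its lane. */
    shuffle_lanes(out, in, 0x0c0d0e0fu, 0x08090a0bu,
                  0x04050607u, 0x00010203u);
    for (int n = 0; n < 4; n++)
        assert(out[n] == bswap32(in[n]));
}
```

Calling self_check() exercises both masks. If lane order matters at the point where the macro is used, the second argument order would be the one to pass to _mm_set_epi32.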

Regards,
