Date: Sun, 16 Aug 2015 20:16:03 +0800
From: Lei Zhang <zhanglei.april@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Formats using non-SIMD SHA2 implementations


> On Aug 16, 2015, at 5:04 PM, magnum <john.magnum@...hmail.com> wrote:
> 
> On 2015-08-16 10:10, Lei Zhang wrote:
>> On Aug 14, 2015, at 5:31 PM, magnum <john.magnum@...hmail.com> wrote:
>>> 
>>> On 2015-08-14 04:35, Lei Zhang wrote:
>>>> 
>>>> 
>>>> I traced the execution of 7z's encryption: the size of the hashed message can be really big, far beyond even 4 SHA2 input blocks. I don't think it's possible to do the hashing with a single call to SIMDSHA256body().
>>>> 
>>>> Is there a way to invoke SIMDSHA256body() repeatedly, just like SHA256_Update()?
>>> 
>>> Sure, you just have to do the job yourself. The last (or only) block holds at most 55 bytes of input; all others can hold 64 bytes.
>>> 
>>> Say you need to do 189 bytes. You take the first 64 bytes (no 0x80, no length) and call SIMDSHA256body(). Then you take the next 64 bytes and call it again. Now you have 61 bytes left. You put them in the buffer, add a 0x80 and zero the rest, and call SIMDSHA256body() again. Finally, in this case, you take a block of all zeros, just add the length in bits (189*8) and make a final call.
>>> 
>>> The problem is when you have inputs of different lengths in one vector. Say one of them requires 4 limbs, another just 3, and the rest only one. This is doable (we do it in e.g. SAP F/G) but tedious, and it reduces the benefit of SIMD much like diverging threads do in OpenCL. So we usually don't do SIMD with such formats.
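
[Editor's sketch] The 189-byte walk-through above can be checked with a small scalar helper. This is only a sketch in plain C: sha256_num_blocks() and sha256_pad() are hypothetical names, not JtR functions, and the buffer is laid out as a flat byte stream rather than in the interleaved word layout that SIMDSHA256body() expects. How the state is carried from one call to the next is not shown.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of JtR): number of 64-byte SHA-256 blocks
 * needed for len message bytes plus the 0x80 byte and the 8-byte length. */
static size_t sha256_num_blocks(size_t len)
{
    return (len + 1 + 8 + 63) / 64;
}

/* Hypothetical helper: write the padded message into out as a flat byte
 * stream. Returns the number of blocks written, or 0 if out is too small. */
static size_t sha256_pad(const uint8_t *msg, size_t len,
                         uint8_t *out, size_t out_cap)
{
    size_t nblocks = sha256_num_blocks(len);
    size_t total = nblocks * 64;
    uint64_t bits = (uint64_t)len * 8;   /* length is stored in bits */
    int i;

    if (out_cap < total)
        return 0;

    memset(out, 0, total);               /* zero everything first */
    if (len)
        memcpy(out, msg, len);           /* message bytes */
    out[len] = 0x80;                     /* single 0x80 marker byte */

    /* 64-bit big-endian bit count in the last 8 bytes of the last block */
    for (i = 0; i < 8; i++)
        out[total - 1 - i] = (uint8_t)(bits >> (8 * i));

    return nblocks;
}
```

For len = 189 this yields four blocks: two full data blocks, a third holding 61 bytes followed by 0x80 and zeros, and a fourth that is all zeros except for the bit length 1512 (0x05e8) at the end.
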
>> 
>> I might need a little help here... I wrote a small snippet of code to experiment with SIMDSHA256body(), but somehow I can't get my anticipated output from it. Here's the code:
>> 
>> -------------------------------------------------
>> #include <stdint.h>
>> #include <stdio.h>
>> #include <string.h>
>> #include <openssl/sha.h>
>> #include "simd-intrinsics.h"
>> 
>> #define BIN_SIZE 32
>> #define BUF_SIZE 64
>> #define MSG_SIZE 8
>> #define HASH_IDX ((index&(SIMD_COEF_32 - 1)) + index/SIMD_COEF_32*BUF_SIZE/4*SIMD_COEF_32)
>> 
>> int main() {
>>     /* use OpenSSL */
>>     static uint32_t msg[MSG_SIZE/4] = {-1,-1}, // test input
>>                     out[BIN_SIZE/4];
>> 
>>     SHA256((unsigned char*)msg, sizeof(msg), (unsigned char*)out);
>> 
>>     /* use SIMD */
>>     static uint32_t vec_in [BUF_SIZE/4*SIMD_COEF_32],
>>                     vec_out[BIN_SIZE/4*SIMD_COEF_32];
>>     memset(vec_in, 0, sizeof(vec_in));
>> 
>>     int i, index;
>>     for (index = 0; index < SIMD_COEF_32; ++index) {
>>         for (i = 0; i < MSG_SIZE/4; ++i)
>>             vec_in[HASH_IDX + i*SIMD_COEF_32] = __builtin_bswap32(msg[i]);
>>         // padding
>>         vec_in[HASH_IDX + i*SIMD_COEF_32] = (0x80 << 24);
>>         vec_in[HASH_IDX + 15*SIMD_COEF_32] = MSG_SIZE*8;
>>     }
>> 
>>     SIMDSHA256body(vec_in, vec_out, NULL, SSEi_MIXED_IN);
>> 
>>     // compare results
>>     printf("0x%x == 0x%x ?\n", out[0], vec_out[0]);
>> }
>> -------------------------------------------------
>> 
>> I tweaked it for a while but couldn't find out what's wrong. I think the copying of message and the padding are fine. Maybe I used the wrong flag for SIMDSHA256body()?
> 
> I get this output:
> 
> 0x44aea312 == 0x12a3ae44 ?
> 
> Looks good to me, you just need to endian-swap the SIMD output.
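
[Editor's sketch] The two values in the output above differ only in byte order: OpenSSL writes the digest as bytes, which the test program then reads back as a little-endian uint32_t, while the SIMD output word is in native order. A minimal portable byte swap, assuming nothing beyond standard C (swap32() is a hypothetical name; __builtin_bswap32() as used in the test program does the same on gcc and clang):

```c
#include <stdint.h>

/* Portable 32-bit byte swap: reverses the order of the four bytes. */
static uint32_t swap32(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u)
         | (x << 24);
}
```

Applied to the SIMD word 0x12a3ae44, this gives 0x44aea312, the value printed for the OpenSSL side.
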

Well, now I know what's going on... I compiled my source file without adding '-mavx2' (I was on an AVX2 machine), and linked it with simd-intrinsics.o, which is automatically compiled by JtR with '-mavx2'. That's how I got the erroneous output.

I thought gcc would by default choose the widest SIMD instruction set available. I was wrong...
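
[Editor's sketch] One way to catch this kind of mismatch is to print which SIMD extension a translation unit was actually compiled for, using the macros gcc and clang predefine when '-mavx2' (or '-march=native' on a capable machine) is given. compiled_simd_level() is a hypothetical helper, not part of JtR. Presumably the vector width seen through simd-intrinsics.h depends on the same macros, but that is an assumption not confirmed in this thread.

```c
#include <stdio.h>

/* Hypothetical helper: name of the widest x86 SIMD extension enabled for
 * this translation unit, based on compiler-predefined macros. */
static const char *compiled_simd_level(void)
{
#if defined(__AVX512F__)
    return "AVX-512F";
#elif defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "none/unknown";
#endif
}
```

Calling printf("%s\n", compiled_simd_level()) from the test program and comparing the result with the flags used to build simd-intrinsics.o would show whether the two objects agree on the vector width.
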


Thanks,
Lei

