john-dev - Re: MD5 on XOP, NEON, AltiVec


Message-ID: <20150905024718.GA23332@openwall.com>
Date: Sat, 5 Sep 2015 05:47:18 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: MD5 on XOP, NEON, AltiVec

On Sat, Sep 05, 2015 at 05:25:16AM +0300, Solar Designer wrote:
> Here's what we had last year:
> 
> Benchmarking: md5crypt, crypt(3) $1$ [MD5 128/128 XOP 8x]... (8xOMP) DONE
> Raw:    201472 c/s real, 25152 c/s virtual
> 
> Here's what we have now:
> 
> Benchmarking: md5crypt, crypt(3) $1$ [MD5 128/128 XOP 4x2]... (8xOMP) DONE
> Raw:    150272 c/s real, 18784 c/s virtual
> 
> I tried looking at "objdump -d sse-intrinsics.o" in the old build vs.
> "objdump -d simd-intrinsics.o" in the current version, and I don't see
> any obvious problem.  Moreover, raw-md5 hasn't regressed, and I think
> both it and md5crypt share the SIMDmd5body() function.  At this point,
> my best guess is we might be getting unaligned buffers.

That guess is not confirmed.  We use buffers on the stack, and they are
properly aligned for 128-bit SIMD.  Relying on stack alignment is
unreliable for AVX2 and above, though.
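
For reference, a minimal sketch of how such an alignment check can be done
in C.  The helper names is_aligned() and demo_stack_alignment() are
hypothetical and not part of the JtR tree; the _Alignas(32) request assumes
a C11 compiler that supports over-aligned stack objects (GCC and Clang do):

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if p is a multiple of `alignment` bytes (alignment must be a
 * power of two).  Hypothetical helper, for illustration only. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (uintptr_t)(alignment - 1)) == 0;
}

/* A stack buffer with an explicit 32-byte alignment request.  Without
 * _Alignas, the compiler only has to satisfy the element type's natural
 * alignment, so 16- or 32-byte alignment would be a matter of luck. */
static int demo_stack_alignment(void)
{
    _Alignas(32) unsigned char buf[64];

    return is_aligned(buf, 16) && is_aligned(buf, 32);
}
```

Dropping a check like is_aligned(buf, 16) into a debug build right before
the SIMD call is usually enough to confirm or rule out this kind of guess.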

Disabling the "#if __SSE4_1__ || __MIC__" block in SIMDmd5body()
improves performance slightly (by about 4%):

Benchmarking: md5crypt, crypt(3) $1$ [MD5 128/128 XOP 4x2]... (8xOMP) DONE
Raw:    156160 c/s real, 19520 c/s virtual

Perhaps there are other changes like this causing regressions as well.
We'll need to bisect the changes.  magnum, will you do that?

> Once we figure this out and fix it, we'll need to revise MD5_I in
> simd-intrinsics.c to use my newly found expression with vcmov() on XOP,
> and the obvious expression with OR-NOT on NEON and AltiVec (IIRC, those
> archs have OR-NOT, which might be lower latency than select).
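
To illustrate the two styles of expression mentioned in the quoted text,
here is a scalar sketch.  MD5's I function is y ^ (x | ~z), which maps
directly onto an OR-NOT instruction.  The select-based form below is just
one possible identity (x | ~z equals a bitwise select between x and
all-ones, controlled by z); it is an assumption for illustration and not
necessarily the expression referred to above.  All function names are
hypothetical:

```c
#include <stdint.h>

/* Textbook MD5 round 4 function: I(x, y, z) = y ^ (x | ~z).
 * The (x | ~z) part is a single instruction on archs with OR-NOT. */
static uint32_t md5_i_ornot(uint32_t x, uint32_t y, uint32_t z)
{
    return y ^ (x | ~z);
}

/* Bitwise select: bits of a where mask is 1, bits of b where mask is 0.
 * This models what a select/cmov-style vector instruction computes. */
static uint32_t bitselect(uint32_t a, uint32_t b, uint32_t mask)
{
    return (a & mask) | (b & ~mask);
}

/* One possible select-based form (an assumption, see above):
 * (x & z) | (~0 & ~z) == x | ~z. */
static uint32_t md5_i_select(uint32_t x, uint32_t y, uint32_t z)
{
    return y ^ bitselect(x, 0xffffffffu, z);
}

/* Both forms are purely bitwise, so it suffices to test inputs whose bit
 * columns cover all 8 combinations of (x, y, z) bits. */
static int md5_i_forms_agree(void)
{
    const uint32_t x = 0xf0f0f0f0u, y = 0xccccccccu, z = 0xaaaaaaaau;

    return md5_i_ornot(x, y, z) == md5_i_select(x, y, z);
}
```

Which form is faster depends on the instruction latencies of the target;
the sketch only shows that the two are interchangeable in terms of result.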

Alexander
