Date: Sat, 5 Sep 2015 13:00:47 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: MD5 on XOP, NEON, AltiVec

On 2015-09-05 07:16, Solar Designer wrote:
> On Sat, Sep 05, 2015 at 07:17:49AM +0300, Solar Designer wrote:
>> On Sat, Sep 05, 2015 at 05:25:16AM +0300, Solar Designer wrote:
>>> Here's what we had last year:
>>>
>>> Benchmarking: md5crypt, crypt(3) $1$ [MD5 128/128 XOP 8x]... (8xOMP) DONE
>>> Raw:    201472 c/s real, 25152 c/s virtual
>>>
>>> Here's what we have now:
>>>
>>> Benchmarking: md5crypt, crypt(3) $1$ [MD5 128/128 XOP 4x2]... (8xOMP) DONE
>>> Raw:    150272 c/s real, 18784 c/s virtual
>>
>> I sort of found it: somehow the code handling SSEi_FLAT_OUT, when
>> compiled in, changes the stack frame layout in such a way that
>> performance drops.  I wasn't yet able to tell why it drops.  The
>> offsets look properly aligned to me either way.

Thanks, both of your patches are committed. I will add some comments to
those "#if 0" blocks later, and perhaps change them to "#if USE_EXPERIMENTAL".
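
For illustration only, a minimal sketch of what such a named guard could
look like. USE_EXPERIMENTAL is the name floated above; the default of 0 and
the helper experimental_enabled() are assumptions made for this example and
are not taken from the tree.

```c
/*
 * Hypothetical sketch: replace a bare "#if 0" with a named switch.
 * Build with -DUSE_EXPERIMENTAL=1 to compile the guarded code in.
 */
#ifndef USE_EXPERIMENTAL
#define USE_EXPERIMENTAL 0   /* assumed default: experimental code is off */
#endif

/* Hypothetical helper: reports whether the experimental path was built. */
int experimental_enabled(void)
{
#if USE_EXPERIMENTAL
    return 1;   /* experimental code would go here */
#else
    return 0;   /* same effect as the old "#if 0" */
#endif
}
```

Compared with "#if 0", a named macro documents why the code is disabled and
lets it be switched on from the compiler command line without editing the
source.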

I've been meaning to move the FLAT_IN/FLAT_OUT handling into separate
functions. I never liked all that branching, regardless of code/stack size.
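
A rough sketch of the idea, not the actual code: every name below
(DEMO_FLAT_OUT, store_by_flag, store_flat, store_interleaved, LANES, WORDS)
is hypothetical, and the two output layouts are assumptions chosen only to
contrast a flag-driven function with separate per-layout functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 4              /* hypothetical number of parallel lanes */
#define WORDS 4              /* an MD5 state is four 32-bit words */
#define DEMO_FLAT_OUT 0x1u   /* hypothetical stand-in for a FLAT_OUT flag */

/* Flag-driven variant: the layout is tested inside the loop. */
void store_by_flag(uint32_t state[WORDS][LANES], uint32_t *out,
                   unsigned flags)
{
    assert(out != NULL);
    for (size_t w = 0; w < WORDS; w++)
        for (size_t l = 0; l < LANES; l++) {
            if (flags & DEMO_FLAT_OUT)
                out[l * WORDS + w] = state[w][l];   /* per-lane contiguous */
            else
                out[w * LANES + l] = state[w][l];   /* interleaved */
        }
}

/* Separate functions: each caller picks its layout once, no flag tests. */
void store_interleaved(uint32_t state[WORDS][LANES], uint32_t *out)
{
    assert(out != NULL);
    for (size_t w = 0; w < WORDS; w++)
        for (size_t l = 0; l < LANES; l++)
            out[w * LANES + l] = state[w][l];
}

void store_flat(uint32_t state[WORDS][LANES], uint32_t *out)
{
    assert(out != NULL);
    for (size_t l = 0; l < LANES; l++)
        for (size_t w = 0; w < WORDS; w++)
            out[l * WORDS + w] = state[w][l];
}
```

With the split, a format that wants flat output would call store_flat()
directly, and the common interleaved path carries no flag handling at all.
Whether that actually helps the stack frame layout would have to be measured.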

magnum
