Date: Sat, 5 Sep 2015 13:00:47 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: MD5 on XOP, NEON, AltiVec

On 2015-09-05 07:16, Solar Designer wrote:
> On Sat, Sep 05, 2015 at 07:17:49AM +0300, Solar Designer wrote:
>> On Sat, Sep 05, 2015 at 05:25:16AM +0300, Solar Designer wrote:
>>> Here's what we had last year:
>>>
>>> Benchmarking: md5crypt, crypt(3) $1$ [MD5 128/128 XOP 8x]... (8xOMP) DONE
>>> Raw: 201472 c/s real, 25152 c/s virtual
>>>
>>> Here's what we have now:
>>>
>>> Benchmarking: md5crypt, crypt(3) $1$ [MD5 128/128 XOP 4x2]... (8xOMP) DONE
>>> Raw: 150272 c/s real, 18784 c/s virtual
>>
>> I sort of found it: somehow the code handling SSEi_FLAT_OUT, when
>> compiled in, changes the stack frame layout in such a way that
>> performance drops. I wasn't yet able to tell why it drops. The
>> offsets look properly aligned to me either way.

Thanks, both your patches are committed. I will add some comments to
those "#if 0" later, perhaps change them to "#if USE_EXPERIMENTAL".

I've had in mind to move the FLAT_IN/FLAT_OUT to separate functions. I
never liked all that branching regardless of code/stack size.

magnum