Message-ID: <a82a919e0810181641r2f33bbdfg12feb42c9f2cfd62@mail.gmail.com>
Date: Sun, 19 Oct 2008 00:41:45 +0100
From: "Larry Bonner" <larry.bonner1@...il.com>
To: john-users@...ts.openwall.com
Subject: Re: fast freebsd MD5 implementation [with the attached file ...]
On Fri, Oct 17, 2008 at 12:52 PM, Simon Marechal <simon@...quise.net> wrote:
> And for more spam, this adds an ICC target
>
> http://btb.banquise.net/bin/john-1.7.3.1-all-5-fastMD5.3.diff.gz
>
Hi Simon, great work. It really opened my eyes to how good the Intel
compiler is for this stuff!
I was looking at how, in the I function, you use the following intrinsic
to initialize an SSE register:
#define I(x,y,z) \
PARA_DO(i) tmp[i] = _mm_andnot_si128((z[i]), mask); \
PARA_DO(i) tmp[i] = _mm_or_si128((tmp[i]),(x[i])); \
PARA_DO(i) tmp[i] = _mm_xor_si128((tmp[i]),(y[i]));
_mm_andnot_si128(a, b) = computes (NOT a) AND b = PANDN
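To spell out what those three steps compute, here is a minimal sketch for a single vector, without the PARA_DO loop. It assumes mask holds all ones, so the PANDN step reduces to NOT z. The names md5_I_sse2 and md5_I_scalar are made up for illustration and are not from the patch.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Scalar reference: the MD5 round-4 function I(x,y,z) = y ^ (x | ~z). */
static inline uint32_t md5_I_scalar(uint32_t x, uint32_t y, uint32_t z)
{
    return y ^ (x | ~z);
}

/* Same function on four 32-bit lanes at once.
 * Assumption: mask is all ones, so andnot(z, mask) = ~z & mask = ~z. */
static inline __m128i md5_I_sse2(__m128i x, __m128i y, __m128i z, __m128i mask)
{
    __m128i tmp = _mm_andnot_si128(z, mask); /* PANDN: ~z      */
    tmp = _mm_or_si128(tmp, x);              /* POR:   x | ~z  */
    return _mm_xor_si128(tmp, y);            /* PXOR:  y ^ ... */
}
```

If mask were anything other than all ones, the first step would no longer be a plain NOT and the result would not match the scalar version.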
Presumably your mask is stored as a local variable and loaded only once,
with a single MOVDQA?
Given that it might go to the stack each time PANDN is used, would
another intrinsic be better?
Such as:
_mm_cmpeq_epi32 = Equal = PCMPEQD (_mm_cmpeq_pi32 is the 64-bit MMX form)
This sets all bits of an SSE register to 1 when the same register is used
as both source and destination:
pcmpeqd xmm1,xmm1
Just thinking that less use of the stack/memory might help... not sure.
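A rough sketch of the idea in intrinsics form, in case it is useful. The helper names all_ones and not_si128 are hypothetical. Whether the compiler really emits a bare PCMPEQD here, or keeps the constant in a register anyway, depends on the compiler and flags, so this would need checking in the generated assembly.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Build an all-ones vector by comparing a register with itself:
 * every 32-bit lane is equal to itself, so every lane becomes 0xFFFFFFFF.
 * No constant has to be loaded from memory for this. */
static inline __m128i all_ones(void)
{
    __m128i t = _mm_setzero_si128();
    return _mm_cmpeq_epi32(t, t);   /* PCMPEQD */
}

/* NOT z, expressed as PANDN against the generated all-ones value. */
static inline __m128i not_si128(__m128i z)
{
    return _mm_andnot_si128(z, all_ones());
}
```

The result of all_ones() could stand in for mask in the I macro, so the only open question is whether that saves anything over a mask the compiler already keeps in a register.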
All the best.
--
To unsubscribe, e-mail john-users-unsubscribe@...ts.openwall.com and reply
to the automated confirmation request that will be sent to you.