Date: Thu, 25 Apr 2013 15:40:58 -0500
From: "jfoug" <>
To: <>
Subject: RE: ICC performance regression

Here are my timings.  Cygwin 32 bit, i7-2600 @ 3.40GHz:

New ICC sse2 (*)                          md5: 30525  dynamic_0  27042K
Old ICC sse2                              md5: 30685  dynamic_0  27191K
gcc 4.7.2 x64 32-bit cross compile sse2   md5: 31575  dynamic_0  26949K

(*) Built on my VM, not the files you built.

Here are my timings on Ubuntu 12.10 (running in a VM, non-OMP builds):

New ICC sse2                              md5: 38081  dynamic_0  30906K
Old ICC sse2                              md5: N/A    dynamic_0  29974K
gcc 4.7.2 x64 32-bit cross compile sse2   md5: 34950  dynamic_0  27720K

The gcc build is much bigger.  Now that I have a working icc environment, I
will look at carrying this forward.  Since I can build and test, I will look
at some of the changes we had talked about offline: providing a 'usual'
interlaced input/output interface, providing a flat 'scalar' interface, and
possibly even providing a multi-input CTX-like interface.  However, as we
have seen from experience, a huge amount of the gain usually comes from the
calling format: doing a fast job of input/output buffer handling, and
letting the crypt code just perform the crypt on 1 (para) block of prepared
data.
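
To make the layout terms above concrete, here is a minimal sketch of the
difference between a flat 'scalar' buffer and an interlaced one.  It is only
an illustration, not the code in the intrinsics file: the names LANES,
interleave() and deinterleave() are made up for this example, it assumes 4
32-bit lanes per 128-bit SSE2 register, and it ignores the 'para' dimension
entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: 4 x 32-bit lanes in one 128-bit SSE2 register. */
#define LANES 4

/*
 * Flat layout:       flat[lane * nwords + word]  (one candidate after another)
 * Interlaced layout: inter[word * LANES + lane]  (word 0 of every candidate,
 *                    then word 1 of every candidate, and so on), so that one
 *                    128-bit load fetches the same word for all lanes.
 */
static void interleave(const uint32_t *flat, uint32_t *inter, size_t nwords)
{
    assert(flat != NULL && inter != NULL);
    for (size_t w = 0; w < nwords; w++)
        for (size_t l = 0; l < LANES; l++)
            inter[w * LANES + l] = flat[l * nwords + w];
}

/* Inverse transform, e.g. for turning interlaced output back into flat form. */
static void deinterleave(const uint32_t *inter, uint32_t *flat, size_t nwords)
{
    assert(flat != NULL && inter != NULL);
    for (size_t w = 0; w < nwords; w++)
        for (size_t l = 0; l < LANES; l++)
            flat[l * nwords + w] = inter[w * LANES + l];
}
```

A flat interface would run something like interleave() on every call, while
an interlaced interface lets the caller prepare its buffers in that order up
front, so the crypt code only has to process the prepared block.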

But I can do a lot better now that I have my own environment for doing the
builds.  The last time I looked at making any mods, when we reduced the temp
buffers in SHA1, I did not have an icc (or current linux x64) build
environment.  I have those now, so I can do a lot more playing around with
that file, and work on getting the sha2 functions added as well.

