Date: Thu, 25 Apr 2013 15:40:58 -0500
From: "jfoug" <jfoug@....net>
To: <john-dev@...ts.openwall.com>
Subject: RE: ICC performance regression

Here are my timings.

Cygwin 32 bit, i7-2600 @ 3.40GHz:

New ICC sse2                              md5: 30525   dynamic_0 27042K   (Built on my VM, not the files you built)
Old ICC sse2                              md5: 30685   dynamic_0 27191K
gcc 4.7.2 x64 32-bit cross compile sse2   md5: 31575   dynamic_0 26949K

Here are my timings on ubuntu 12.10 (running VM, no OMP builds):

New ICC sse2                              md5: 38081   dynamic_0 30906K
Old ICC sse2                              md5: N/A     dynamic_0 29974K
gcc 4.7.2 x64 32-bit cross compile sse2   md5: 34950   dynamic_0 27720K

The gcc build is much bigger. But now that I have a working icc environment, I will look at carrying this forward.

Now that I can build and test, I will look at some of the changes we had talked about offline: providing a 'usual' interlaced input/output interface, providing a flat 'scalar' interface, and possibly even providing a multi-input CTX-like interface. However, as we have seen from experience, a huge amount of the gain usually comes from the calling format doing a fast job of input/output buffer handling and letting the crypt code just perform the crypt on 1 (para) block of prepared data.

But I can do a lot better now that I have my own environment for doing the builds. Last time I looked at making any mods, when we reduced the temp buffers in SHA1, I did not have an icc (or current linux x64) build environment. I have those now, so I can do a lot more playing around with that file, and can now work on getting sha2 functions added as well.

Jim.
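
For readers unfamiliar with the difference between the 'flat' and 'interlaced' buffer layouts mentioned above, here is a minimal sketch in C. It is not the code from the file under discussion. The names (LANES, interleave_blocks, deinterleave_blocks) and the exact word ordering are assumptions made for illustration: four 32-bit lanes as on SSE2, one 64-byte block per lane, and word w of lane l stored at index w*LANES + l.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LANES 4             /* assumption: SSE2, four 32-bit lanes per vector */
#define WORDS_PER_BLOCK 16  /* one 64-byte MD5/SHA-1 input block */
#define BLOCK_BYTES 64

/*
 * Flat -> interlaced. 'flat' holds LANES consecutive 64-byte blocks.
 * After the call, out[w * LANES + l] is 32-bit word w of block l, so the
 * same word of all LANES inputs sits in one 16-byte vector.
 */
static void interleave_blocks(const uint8_t *flat, uint32_t *out)
{
    for (size_t l = 0; l < LANES; l++) {
        for (size_t w = 0; w < WORDS_PER_BLOCK; w++) {
            uint32_t v;
            /* memcpy avoids unaligned or aliasing-unsafe loads */
            memcpy(&v, flat + l * BLOCK_BYTES + w * 4, sizeof v);
            out[w * LANES + l] = v;
        }
    }
}

/* Interlaced -> flat: the exact inverse of interleave_blocks(). */
static void deinterleave_blocks(const uint32_t *in, uint8_t *flat)
{
    for (size_t l = 0; l < LANES; l++) {
        for (size_t w = 0; w < WORDS_PER_BLOCK; w++) {
            uint32_t v = in[w * LANES + l];
            memcpy(flat + l * BLOCK_BYTES + w * 4, &v, sizeof v);
        }
    }
}
```

A flat 'scalar' interface would accept the first layout and do this shuffling internally on every call, while an interlaced interface expects the caller to have prepared the second layout already. That per-call copying is the input/output buffer handling cost the message refers to.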