Date: Thu, 25 Apr 2013 17:18:18 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: ICC performance regression

On 25 Apr, 2013, at 16:35 , "jfoug" <jfoug@....net> wrote:
> I have just built, using gcc, to build the sse-intrinsics-32.S file, and the
> speed was almost identical to the older version made with icc.  I simply
> used the exact same command line to build to a .S file, but added  -o
> sse-intrinsic-32.S -S  and things worked.

I found that the icc regression can be mitigated simply by changing -O3 to -O2 in the build targets. I have committed that change, along with new files. Please compare the 32-bit ones with yours.


Speeds for i7-3820 CPU @ 3.60GHz:

gcc 4.7.2, plain -64 build:
Benchmarking: Raw MD4 [128/128 SSE2 intrinsics 12x]... DONE
Raw:    46139K c/s real, 46139K c/s virtual

Benchmarking: Raw MD5 [128/128 SSE2 intrinsics 12x]... DONE
Raw:    34509K c/s real, 34509K c/s virtual

Benchmarking: Raw SHA-1 [128/128 SSE2 intrinsics 4x]... DONE
Raw:    18081K c/s real, 18081K c/s virtual

Benchmarking: crypt-MD5 [128/128 SSE2 intrinsics 12x]... DONE
Raw:    41700 c/s real, 42121 c/s virtual


icc 13.1.1 -O2 (builds in seconds, these are committed):
Benchmarking: Raw MD4 [128/128 SSE2 intrinsics 12x]... DONE
Raw:    48790K c/s real, 48790K c/s virtual

Benchmarking: Raw MD5 [128/128 SSE2 intrinsics 12x]... DONE
Raw:    38513K c/s real, 38513K c/s virtual

Benchmarking: Raw SHA-1 [128/128 SSE2 intrinsics 8x]... DONE
Raw:    18043K c/s real, 18043K c/s virtual

Benchmarking: crypt-MD5 [128/128 SSE2 intrinsics 12x]... DONE
Raw:    44064 c/s real, 44064 c/s virtual


icc 13.1.1 -O3 (builds in 45 minutes):
Benchmarking: Raw MD4 [128/128 SSE2 intrinsics 12x]... DONE
Raw:    32388K c/s real, 32388K c/s virtual

Benchmarking: Raw MD5 [128/128 SSE2 intrinsics 12x]... DONE
Raw:    23006K c/s real, 23239K c/s virtual

Benchmarking: Raw SHA-1 [128/128 SSE2 intrinsics 8x]... DONE
Raw:    17986K c/s real, 17986K c/s virtual

Benchmarking: crypt-MD5 [128/128 SSE2 intrinsics 12x]... DONE
Raw:    24912 c/s real, 24912 c/s virtual
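
For anyone who wants the relative differences rather than the raw figures, here is a minimal sketch that computes them from the "real" c/s numbers above. The dictionary layout and the helper name percent_change are only illustrative assumptions, not part of John the Ripper; the numbers are copied from the three benchmark runs (raw hashes in K c/s, crypt-MD5 in plain c/s).

```python
# "real" c/s figures copied from the benchmark output above.
# Raw MD4/MD5/SHA-1 are in K c/s, crypt-MD5 is in plain c/s; since we only
# compare a format against itself, the unit does not matter.
results = {
    "Raw MD4":   {"gcc": 46139, "icc -O2": 48790, "icc -O3": 32388},
    "Raw MD5":   {"gcc": 34509, "icc -O2": 38513, "icc -O3": 23006},
    "Raw SHA-1": {"gcc": 18081, "icc -O2": 18043, "icc -O3": 17986},
    "crypt-MD5": {"gcc": 41700, "icc -O2": 44064, "icc -O3": 24912},
}


def percent_change(new, base):
    """Relative change of `new` versus `base`, in percent (hypothetical helper)."""
    return (new - base) / base * 100.0


for name, speeds in results.items():
    o2_vs_gcc = percent_change(speeds["icc -O2"], speeds["gcc"])
    o3_vs_o2 = percent_change(speeds["icc -O3"], speeds["icc -O2"])
    print(f"{name:10s} icc -O2 vs gcc: {o2_vs_gcc:+6.1f}%   "
          f"icc -O3 vs icc -O2: {o3_vs_o2:+6.1f}%")
```

Run as-is, it prints one line per format. The loss at -O3 shows up for MD4, MD5 and crypt-MD5, while SHA-1 barely moves.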


BTW, I did some quick tests with the SSE_PARA settings, and they seem fine for 13.1.1 too.

magnum
