Date: Tue, 11 Oct 2011 02:19:39 +0400
From: Solar Designer <>
Subject: Re: Benchmarks vs GCC version

On Mon, Oct 10, 2011 at 04:48:08PM -0400, Erik Winkler wrote:
> Why is the gcc 4.2.1 code significantly faster than the 4.6.1 code?  Is it because of the SSE2 intrinsics code used with gcc 4.6.1?

Maybe, but this was not supposed to be the case.  I might do some more
testing with gcc 4.6.1 specifically.  However, in the current CVS tree, the
use of intrinsics has been disabled for another reason anyway.  You can
try this trivial patch too:;r2=1.11

--- john/src/x86-64.h	2011/05/08 20:55:36	1.10
+++ john/src/x86-64.h	2011/10/09 01:38:32	1.11
@@ -109,8 +109,7 @@
 #define DES_BS_ALGORITHM_NAME		"128/128 BS AVX-16"
-#elif defined(__SSE2__) && defined(__GNUC__) && \
-    ((__GNUC__ == 4 && __GNUC_MINOR__ >= 4) || __GNUC__ > 4)
+#elif defined(__SSE2__) && 0
 #define DES_BS_ASM			0
 #if 1
 #define DES_BS_VECTOR			2

Does it help?

Its commit message is:

"Don't enable the use of SSE2 intrinsics even for very recent GCC now that
x86-64.S has been optimized and appears to result in performance that is at
least as good as or is slightly better than GCC-generated code (although more
testing may be needed - on more CPUs and with more GCC versions)."
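
For readers unfamiliar with this kind of compile-time switch, here is a
minimal standalone sketch of what the patch changes.  It is not code from
John the Ripper: the SKETCH_* macros and the des_bs_backend() helper are
hypothetical names made up for illustration, and the claim that the
fallback is the x86-64.S assembly path is inferred from the commit message
above rather than from the full header.

```c
#include <stdio.h>

/*
 * Before the patch: intrinsics are selected when SSE2 is available and
 * the compiler is GCC 4.4 or newer (same condition as the removed lines).
 */
#if defined(__SSE2__) && defined(__GNUC__) && \
    ((__GNUC__ == 4 && __GNUC_MINOR__ >= 4) || __GNUC__ > 4)
#define SKETCH_USE_INTRINSICS		1
#else
#define SKETCH_USE_INTRINSICS		0
#endif

/*
 * After the patch: "&& 0" makes the condition false for every compiler,
 * so this branch is never taken and the build falls through to whatever
 * comes next (assumed here to be the assembly code in x86-64.S).
 */
#if defined(__SSE2__) && 0
#define SKETCH_USE_INTRINSICS_PATCHED	1
#else
#define SKETCH_USE_INTRINSICS_PATCHED	0
#endif

/* Hypothetical helper: name the DES bitslice backend a build would use. */
static const char *des_bs_backend(int use_intrinsics)
{
	return use_intrinsics ? "SSE2 intrinsics (C)" : "assembly (x86-64.S)";
}
```

Calling des_bs_backend(SKETCH_USE_INTRINSICS) shows what the old condition
would pick for your compiler, while
des_bs_backend(SKETCH_USE_INTRINSICS_PATCHED) always reports the assembly
path, regardless of GCC version.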

The optimizations being referred to here are post-1.7.8.  I implemented
them in the CVS tree a couple of weeks ago.  You may try this newer
x86-64.S file too (on an otherwise clean 1.7.8 tree, not jumbo).

The intrinsics will remain important for OpenMP-enabled builds, though.

Thank you for reporting!
