Date: Sat, 1 Sep 2012 03:07:40 +0400
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: Questions about compiling for Optimal CPU Performance

On Wed, Aug 29, 2012 at 11:55:14AM -0400, Brad Tilley wrote:
> >> 1. -fopenmp
> >> 2. -fopenmp -msse2
[...]
> I find the former to work better than the latter on 32-bit systems.

This is puzzling. Normally, a build on 32-bit x86 with -fopenmp, but
without -msse2, "should" fail - I've just tried with
john-1.7.9-jumbo-6, building it as linux-x86-sse2, and it failed with:

x86-sse.o: In function `DES_bs_crypt':
(.text+0x40): multiple definition of `DES_bs_crypt'
DES_bs_b.o:DES_bs_b.c:(.text+0x7e2c): first defined here

and so on, because in -jumbo we're adding -msse2 to CFLAGS, but not to
ASFLAGS. (magnum - BTW, I think that's a minor bug. Also, the addition
of -msse2 even for john.c is a bug. That's a john-dev topic, though.)

Without -jumbo, when you don't specify -msse2 and build for 32-bit x86,
you should have a #warning printed by DES_bs_b.c telling you that
you'll only get assembly code, but not OpenMP, and suggesting that you
add -msse2.

Brad, why are you building with OpenMP on your Celeron? Does it have
more than one logical CPU? If not, then the assembly code for DES is
indeed faster than the thread-safe alternative that an OpenMP build
would use. Even with two logical CPUs (but one core), the assembly code
is likely faster (using only one of the logical CPUs). The performance
hit of going from assembly to compiler-generated SSE2 code on 32-bit
x86 is just too great.

Alexander
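
The rule of thumb above can be restated as a small C sketch. This is only an illustration, not code from John the Ripper: the function names logical_cpus() and openmp_worth_trying() are hypothetical, the core count has to be supplied by the caller, and _SC_NPROCESSORS_ONLN is a widely available extension (glibc, BSDs) rather than strict POSIX. The result is a hint about which build to benchmark first on 32-bit x86, not a measurement.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: number of logical CPUs currently online.
 * Falls back to 1 if the query is unsupported or fails. */
long logical_cpus(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}

/* Hypothetical rule of thumb for 32-bit x86, restating the advice above.
 * Returns 0 when the non-OpenMP assembly build is the likely winner,
 * 1 when an OpenMP build (with -msse2) is worth benchmarking. */
int openmp_worth_trying(long logical, long cores)
{
    if (logical <= 1)
        return 0; /* one logical CPU: the assembly DES code is faster */
    if (cores <= 1)
        return 0; /* two logical CPUs on one core: assembly likely still faster */
    return 1;     /* several cores: compare both builds with a benchmark */
}

/* Prints the hint for this machine; the caller supplies the core count. */
void print_build_hint(long cores)
{
    long logical = logical_cpus();
    printf("logical CPUs: %ld, cores: %ld -> %s\n", logical, cores,
           openmp_worth_trying(logical, cores)
               ? "try an OpenMP build and benchmark it"
               : "prefer the non-OpenMP assembly build");
}
```

Calling print_build_hint(1) on a single-core Celeron would point to the assembly build. In either case, benchmarking both builds on the actual machine is what settles the question.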