Date: Tue, 26 Jun 2012 05:41:34 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: precompiled sse-intrinsics vs. -march=native

magnum, Jim -

It appears that we shouldn't use the precompiled sse-intrinsics files
(icc's *.S) in -march=native builds. Specifically, when I tried
linux-x86-64-gpu on bull, where -march=native implies XOP, the build
reported that XOP intrinsics were being used, whereas in reality it used
the icc-precompiled SSE2 code. What's worse, I got a segfault for
--format=md5 (a read beyond the end of the heap after MD5_Update() was
called with a huge size from the precompiled intrinsics code, I don't
know why), and failed self-tests for raw-md5 and raw-md4 (raw-sha1
passed).

The misreporting issue has an obvious cause. The segfault and failed
self-tests are a mystery to me: I don't see why the precompiled code
would be incompatible with -march=native on this machine. The ABI
should stay the same. Maybe there's a bug lurking around that will also
bite us in other cases.

The attached patch removes the use of precompiled sse-intrinsics from
GPU targets. The alternative would have been to remove -march=native
from them. I don't know which is the better choice; neither is perfect.
Since modern systems tend to have at least AVX (and maybe also XOP),
-march=native may be preferable over the precompiled SSE2 code. Yet
another alternative would be to have more GPU targets for the different
combinations, but that would be confusing.

I've only tested this with linux-x86-64-gpu (the problem went away with
this patch), even though the patch changes 6 targets.

Alexander

View attachment "john-gpu-no-precompiled-intrinsics.diff" of type "text/plain" (3483 bytes)
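
For illustration only, here is a minimal C sketch of how the misreporting described above can arise. It is not taken from the John the Ripper source. It assumes the reported SIMD level is derived purely from the compiler's predefined macros (`__XOP__`, `__AVX__`, `__SSE2__`, which gcc sets under -march=native according to the build host), while the intrinsics code actually linked may be a precompiled SSE2 object. The function names and the `precompiled` flag are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Widest SIMD extension the compiler was told to target.
 * With -march=native these macros reflect the build host's CPU. */
const char *compiler_simd_level(void)
{
#if defined(__XOP__)
    return "XOP";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "generic";
#endif
}

/* Hypothetical helper: the SIMD level of the intrinsics code that is
 * really linked in. A non-zero `precompiled` models an icc-built SSE2
 * object, whose instruction set was fixed when it was assembled and
 * does not change with -march=native. */
const char *effective_simd_level(int precompiled)
{
    return precompiled ? "SSE2" : compiler_simd_level();
}

/* Write an accurate banner into buf. Returns 1 if a banner based only
 * on the compiler macros would have matched it, 0 if that banner would
 * have misreported what the build uses. */
int format_simd_banner(char *buf, size_t len, int precompiled)
{
    assert(buf != NULL && len > 0);
    const char *claimed = compiler_simd_level();
    const char *actual = effective_simd_level(precompiled);

    snprintf(buf, len, "%s intrinsics%s", actual,
             precompiled ? " (precompiled)" : "");
    return strcmp(claimed, actual) == 0;
}
```

Under these assumptions, a build with -march=native on an XOP-capable host and `precompiled` set would get 0 back from `format_simd_banner()`: the macros say XOP while the linked code is SSE2. This sketch covers only the reporting mismatch; it says nothing about the cause of the segfault or the failed self-tests.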