Date: Mon, 14 Sep 2015 23:43:31 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: md5crypt mmxput*()

On Mon, Sep 14, 2015 at 09:46:56PM +0200, magnum wrote:
> BTW the total size of simd-intrinsics.o (after stripping) actually
> increased. I'm not sure how to get detailed figures (eg. per function or
> something?).

With "nm -S":

http://www.openwall.com/lists/john-dev/2015/09/05/6

What matters even more is the size of the loops, excluding any
relatively rarely performed initialization.  For example, for md5crypt
the size of initialization before the 1000 MD5s loop doesn't matter as
much as the size of that 1000 MD5s loop - but both are counted towards
size of just one function (after inlining).

What also matters is how the various blocks of code are re-ordered by
the compiler.  We could try using gcc's __builtin_expect(), like we
already do in compiler.c's virtual machine.  We could wrap them in
likely() and unlikely() macros like the Linux kernel uses, and put those
in common.h.  This could reduce the address range corresponding to the
loop's body (with infrequently used conditional blocks moved to outside
of that range), which might help on CPUs with low L1 instruction cache
associativity (like Bulldozer).

Alexander
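
For illustration, here is a minimal sketch of the likely() and unlikely() wrappers suggested above, assuming a compiler that provides gcc's __builtin_expect(). The fallback definitions for other compilers and the sum_with_rare_fixup() function are hypothetical and exist only to show the usage pattern; they are not taken from common.h or compiler.c.

```c
#include <stddef.h>

/*
 * Hypothetical branch-hint macros in the style of the Linux kernel.
 * The !!(x) normalizes any non-zero value to 1 so that it matches the
 * expected value passed to __builtin_expect().
 */
#if defined(__GNUC__)
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)
#else
/* Assumed fallback: no hint, same semantics */
#define likely(x)	(x)
#define unlikely(x)	(x)
#endif

/*
 * Hypothetical hot loop with one rarely taken branch.  The hint only
 * tells the compiler which path to expect; whether the rare block is
 * actually moved out of the loop's address range is up to the compiler.
 */
static unsigned long sum_with_rare_fixup(const unsigned char *buf, size_t len)
{
	unsigned long sum = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		if (unlikely(buf[i] == 0xff)) {
			/* Rare path */
			sum += 1000;
			continue;
		}
		/* Common path */
		sum += buf[i];
	}

	return sum;
}
```

The hints do not change what the code computes, so any effect would show up only in code layout and timing. One way to check would be to compare the disassembly of the loop with and without the hints.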