Date: Wed, 8 Aug 2012 17:48:55 -0400
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: crypt* files in crypt directory

On Wed, Aug 08, 2012 at 09:03:00AM +0200, Daniel Cegiełka wrote:
> > Maybe you could support -DFAST_CRYPT or the like. It could enable
> > forced inlining and manual unrolls in crypt_blowfish.c.
> >
> > Alexander
>
> This can be a very sensible solution.

Unless there's a really compelling reason to do so, I'd like to avoid
having multiple alternative versions of the same code in a codebase.
It means there are more combinations you have to test to be sure the
code works and has no regressions.

As it stands, the code I posted with the manual unrolling removed
performs _better_ than the manually unrolled code with gcc 4 on x86_64
when optimized for speed, and it's 33% smaller when optimized for
size.

As for being slower on gcc 3, there's already much more
performance-critical code that's significantly slower on musl+gcc3
than on glibc due to gcc3 badness, for example all of the
endianness-swapping functions (byteswap.h, and htonl etc. in
arpa/inet.h).

Really the only place where crypt performance is critical is in JtR,
and there you're using your own optimized code internal to JtR, right?
Even if crypt is half-speed on gcc3 without the manual unrolling, that
still only makes a one-doubling (factor of 2, i.e. one order of
magnitude base 2) difference in the iteration count you can use while
keeping the same responsiveness/load, i.e. not nearly enough to make
or break somebody's ability to crack your hashes. (In general, as long
as you don't try to iterate this principle, an attacker who can afford
N time (or N cores) can also afford 2*N time (or 2*N cores).)

Aside from my own feelings on the matter, I'm trying to consider the
impression this makes on our user base.
I've already taken some heat for replacing the heap-sort qsort code in
musl with smoothsort, despite smoothsort being a lot faster in a task
where performance is generally important, and the size difference
there was smaller than this crypt unrolling. When someone frustrated
with bloat sees hand-unrolled loops, their first reaction is "eew,
this code is bloated".

My intent with modernizing (and fixing the stack usage of) the old DES
crypt code was to save enough space that we could get the new
algorithms (blowfish, md5, sha) integrated without much (or even any,
if possible) size increase versus the old, bad DES code. I think this
makes a difference for "selling" the idea of supporting all these
algorithms to the anti-bloat faction of musl's following.

Rich