Date: Mon, 7 Sep 2015 19:38:21 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Intrinsics experiment with CPP macros

Solar, all,

https://github.com/magnumripper/JohnTheRipper/issues/1720

A little experiment is currently in the cpp-intrinsics topic branch. Specifically 51f3fe6 for now.

To the formats, nothing changed. But e.g. SIMDmd4body() is now a function-like macro that will optimize away some branching and make the actual functions smaller (this could be taken further). Currently only MD4 & MD5 are done, and more could be done to them.

What is done is that there are now (behind the curtain) two different functions: one for a single (or first) block and another for "reload". Also, the "flat to interleaved" conversion is moved to a separate function, and that is also hidden by PP macros (optimized away unless needed, since SSEi_flags are a constant).

Boost seems to be 5-10% depending on format. Still, I'm not quite sure we want to walk this path at all.

One effect of this (version) is that we now have two copies of the core MD4(w, a, b, c, d) function inside simd-intrinsics.c. The good thing about that is the optimizer may do some good stuff with the beginning of the "non-reload" version, since a, b, c and d are now constants, but I'm not sure the optimizer actually manages to do that with intrinsics (for plain C or OpenCL I'm pretty sure it does).

Perhaps we can do this in a different (and better) way, or perhaps this is fine. Or perhaps we should forget about this whole idea, for easier-to-follow code. I have no opinion yet, I'm just experimenting.

magnum
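
For readers following along, here is a minimal scalar sketch of the dispatch idea described above. It is not the code from the cpp-intrinsics branch: every demo_* name, the flag values, the lane and word counts, and the toy mixing step (a = a * 33 + w) are assumptions made up for illustration, and the real SIMDmd4body() and SSEi_flags may look quite different. It only shows how a function-like macro that receives constant flags can pick between a "first block" and a "reload" function, and skip the flat-to-interleaved conversion when it is not requested.

```c
#include <stdint.h>

/* Hypothetical flags, standing in for the real SSEi_flags. */
#define DEMO_FLAG_RELOAD   0x1u  /* continue from the state already in out[] */
#define DEMO_FLAG_FLAT_IN  0x2u  /* input is flat, not interleaved */

#define DEMO_LANES 4             /* toy "vector width" */
#define DEMO_WORDS 4             /* toy block size, in 32-bit words */
#define DEMO_IV    1u            /* toy initial state */

/* Toy stand-in for the hash core: mixes one interleaved block into a[]. */
static inline void demo_core(const uint32_t *in, uint32_t *a)
{
    for (int w = 0; w < DEMO_WORDS; w++)
        for (int l = 0; l < DEMO_LANES; l++)
            a[l] = a[l] * 33u + in[w * DEMO_LANES + l];
}

/* Single/first block: the starting state is a compile-time constant. */
static inline void demo_body_first(const uint32_t *in, uint32_t *out)
{
    for (int l = 0; l < DEMO_LANES; l++)
        out[l] = DEMO_IV;
    demo_core(in, out);
}

/* "Reload": the starting state is whatever the previous call left in out[]. */
static inline void demo_body_reload(const uint32_t *in, uint32_t *out)
{
    demo_core(in, out);
}

/* Flat layout is [lane][word]; interleaved layout is [word][lane]. */
static inline void demo_flat_to_interleaved(const uint32_t *flat, uint32_t *inter)
{
    for (int l = 0; l < DEMO_LANES; l++)
        for (int w = 0; w < DEMO_WORDS; w++)
            inter[w * DEMO_LANES + l] = flat[l * DEMO_WORDS + w];
}

/*
 * Function-like macro seen by the callers. When flags is a constant at the
 * call site, each condition below is a constant expression, so the compiler
 * is free to drop the branches that are not taken.
 */
#define demo_body(data, out, flags) do {                       \
    uint32_t demo_tmp_[DEMO_LANES * DEMO_WORDS];               \
    const uint32_t *demo_in_ = (data);                         \
    if ((flags) & DEMO_FLAG_FLAT_IN) {                         \
        demo_flat_to_interleaved(demo_in_, demo_tmp_);         \
        demo_in_ = demo_tmp_;                                  \
    }                                                          \
    if ((flags) & DEMO_FLAG_RELOAD)                            \
        demo_body_reload(demo_in_, (out));                     \
    else                                                       \
        demo_body_first(demo_in_, (out));                      \
} while (0)
```

A caller would write something like demo_body(buf, state, DEMO_FLAG_FLAT_IN) for a first block and demo_body(buf, state, DEMO_FLAG_RELOAD) for later ones. Whether a given compiler really folds the constant start state into the first rounds when the core is written with intrinsics is exactly the open question raised in the message; this sketch does not settle it.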