Date: Mon, 19 Dec 2011 16:38:22 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: optimized Lotus5

Hi,

BTW, even with my recent optimizations, lotus5_fmt_plug.c still uses
only 8-bit operations most of the time.  This is wasteful, since most
of each register's width goes unused.  Possible future optimizations
are:

1. Bitslicing.  lotus_magic_table[] is essentially an 8-to-8 S-box,
for which we could derive (likely suboptimal) Boolean expressions.
(This might also be somewhat GPU-friendly, unlike the straightforward
table-lookup implementation we have now.)

2. Making use of VSIB addressing (vector gather loads) on AVX2, to do
several table lookups per instruction.  But it might take around 2
years until those CPUs are available.
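
To illustrate item 1, here is a minimal sketch of a bitsliced 8-to-8
S-box, evaluated as a naive sum of minterms for 64 instances at a time.
It is far from optimal (256 minterms of 8 ANDs each per call) and is
only meant to show that the lookup can be expressed with Boolean
operations alone.  The names bs_pack, bs_unpack, bs_sbox and demo_sbox
are made up for this example, and demo_sbox is a stand-in table, not
the real lotus_magic_table[].

```c
#include <stdint.h>

/* Hypothetical stand-in S-box, NOT the real lotus_magic_table[]. */
static uint8_t demo_sbox[256];

static void demo_sbox_init(void)
{
    for (int i = 0; i < 256; i++)
        demo_sbox[i] = (uint8_t)(i * 167 + 13);
}

/* Transpose 64 bytes into 8 bit planes: bit l of x[k] is bit k of in[l]. */
static void bs_pack(const uint8_t in[64], uint64_t x[8])
{
    for (int k = 0; k < 8; k++) {
        x[k] = 0;
        for (int l = 0; l < 64; l++)
            x[k] |= (uint64_t)((in[l] >> k) & 1) << l;
    }
}

/* Inverse of bs_pack: gather bit l of each plane back into byte l. */
static void bs_unpack(const uint64_t y[8], uint8_t out[64])
{
    for (int l = 0; l < 64; l++) {
        out[l] = 0;
        for (int k = 0; k < 8; k++)
            out[l] |= (uint8_t)(((y[k] >> l) & 1) << k);
    }
}

/*
 * Apply the S-box to 64 lanes at once using only AND/OR/NOT.
 * For each possible input value v, m has a 1 in every lane whose
 * input equals v; m is then ORed into the output planes that
 * correspond to the set bits of table[v].
 */
static void bs_sbox(const uint8_t table[256], const uint64_t x[8],
                    uint64_t y[8])
{
    for (int k = 0; k < 8; k++)
        y[k] = 0;

    for (int v = 0; v < 256; v++) {
        uint64_t m = ~(uint64_t)0;
        for (int k = 0; k < 8; k++)
            m &= ((v >> k) & 1) ? x[k] : ~x[k];
        for (int k = 0; k < 8; k++)
            if ((table[v] >> k) & 1)
                y[k] |= m;
    }
}
```

A real implementation would replace the minterm loop with a much
smaller gate network derived for the specific table, and would keep the
data in bitsliced form across the whole main loop rather than packing
and unpacking around each lookup.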

The current main loop is:

  unsigned char p1, p2;
  unsigned char *t1, *t2;

  p1 = p2 = 0x00;

  for (i = 18; i > 0; i--)
    {
      t1 = m1;
      t2 = m2;
      for (j = 48; j > 0; )
        {
          p1 = t1[0] ^= lotus_magic_table[ARCH_INDEX((j + p1) & 0xff)];
          p2 = t2[0] ^= lotus_magic_table[ARCH_INDEX((j-- + p2) & 0xff)];
          p1 = t1[1] ^= lotus_magic_table[ARCH_INDEX((j + p1) & 0xff)];
          p2 = t2[1] ^= lotus_magic_table[ARCH_INDEX((j-- + p2) & 0xff)];
          t1 += 2;
          t2 += 2;
        }
    }
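
For anyone who wants to experiment with the loop outside the tree, a
rough compilable sketch follows.  It is not the code from
lotus5_fmt_plug.c: mix_x2, mix_ref and fill_demo_table are hypothetical
names, the table is a stand-in rather than lotus_magic_table[], and
ARCH_INDEX() is dropped in favor of plain indexing.  mix_x2 mirrors the
two-buffer, 2x-unrolled loop above, and mix_ref is a plain single-buffer
version that can be used to check it.

```c
#include <stdint.h>

/* Hypothetical stand-in for lotus_magic_table[] (not the real values). */
static void fill_demo_table(uint8_t table[256])
{
    for (int i = 0; i < 256; i++)
        table[i] = (uint8_t)(i * 167 + 13);
}

/* Same structure as the loop above: two independent 48-byte states. */
static void mix_x2(uint8_t m1[48], uint8_t m2[48], const uint8_t table[256])
{
    uint8_t p1 = 0, p2 = 0;

    for (int i = 18; i > 0; i--) {
        uint8_t *t1 = m1, *t2 = m2;
        for (int j = 48; j > 0; ) {
            p1 = t1[0] ^= table[(j + p1) & 0xff];
            p2 = t2[0] ^= table[(j-- + p2) & 0xff];
            p1 = t1[1] ^= table[(j + p1) & 0xff];
            p2 = t2[1] ^= table[(j-- + p2) & 0xff];
            t1 += 2;
            t2 += 2;
        }
    }
}

/* Plain reference: one state, one byte per step, j counts 48 down to 1. */
static void mix_ref(uint8_t m[48], const uint8_t table[256])
{
    uint8_t p = 0;

    for (int i = 0; i < 18; i++)
        for (int j = 48, k = 0; j > 0; j--, k++)
            p = m[k] ^= table[(j + p) & 0xff];
}
```

Each step depends on the previous byte through p1 (or p2), so within
one state the lookups are strictly sequential; the two states are
independent of each other, which is what lets their lookups overlap.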

Alexander
