Date: Mon, 12 Dec 2016 22:44:09 +0100
From: "Jason A. Donenfeld" <Jason@...c4.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: "kernel-hardening@...ts.openwall.com" <kernel-hardening@...ts.openwall.com>, 
	LKML <linux-kernel@...r.kernel.org>, 
	Linux Crypto Mailing List <linux-crypto@...r.kernel.org>, George Spelvin <linux@...izon.com>, 
	Scott Bauer <sbauer@....utah.edu>, Andi Kleen <ak@...ux.intel.com>, 
	Andy Lutomirski <luto@...capital.net>, Greg KH <gregkh@...uxfoundation.org>, 
	Jean-Philippe Aumasson <jeanphilippe.aumasson@...il.com>, "Daniel J . Bernstein" <djb@...yp.to>
Subject: Re: [PATCH v2] siphash: add cryptographically secure hashtable function

Hi Linus,

> I guess you could try to just remove the "if (left)" test entirely, if
> it is at least partly the mispredict. It should do the right thing
> even with a zero count, and it might schedule the code better. Code
> size _should_ be better with the byte mask model (which won't matter
> in the hot loop example, since it will all be cached, possibly even in
> the uop cache for really tight benchmark loops).

Originally I had simply forgotten the `if (left)`, and I got the same
sub-par benchmarks. In the v3 revision that I'm working on at the
moment, I'm using your dcache trick for cases 3, 5, 6, and 7, and
short-circuiting cases 1, 2, and 4 to access those bytes directly as
integers. For the 32-bit case (or whenever CONFIG_DCACHE_WORD_ACCESS
is not available), I do something similar, but built into the Duff's
device. This should give the best performance for the most popular
use cases, which involve hashing "some stuff" plus a leftover u16
(port number?) or u32 (IPv4 addr?).

#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
       switch (left) {
       case 0: break;
       case 1: b |= data[0]; break;
       case 2: b |= get_unaligned_le16(data); break;
       case 4: b |= get_unaligned_le32(data); break;
       default:
               b |= le64_to_cpu(load_unaligned_zeropad(data) &
                                bytemask_from_count(left));
               break;
       }
#else
       switch (left) {
       case 7: b |= ((u64)data[6]) << 48; /* fall through */
       case 6: b |= ((u64)data[5]) << 40; /* fall through */
       case 5: b |= ((u64)data[4]) << 32; /* fall through */
       case 4: b |= get_unaligned_le32(data); break;
       case 3: b |= ((u64)data[2]) << 16; /* fall through */
       case 2: b |= get_unaligned_le16(data); break;
       case 1: b |= data[0];
       }
#endif

It seems like this might be best of all worlds?
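
For anyone who wants to poke at the two strategies outside the kernel,
here is a rough userspace sketch, not the actual patch. The names
load_le16, load_le32, tail_switch, tail_mask and check_tails are made
up for illustration. tail_mask only imitates the
load_unaligned_zeropad() plus bytemask_from_count() idea, and it
assumes the caller's buffer has at least 8 readable bytes, which the
real kernel helper does not require.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for get_unaligned_le16()/get_unaligned_le32(). */
static uint16_t load_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Portable path: switch with deliberate fallthrough, as in the #else branch. */
static uint64_t tail_switch(const uint8_t *data, size_t left)
{
	uint64_t b = 0;

	switch (left) {
	case 7: b |= (uint64_t)data[6] << 48; /* fall through */
	case 6: b |= (uint64_t)data[5] << 40; /* fall through */
	case 5: b |= (uint64_t)data[4] << 32; /* fall through */
	case 4: b |= load_le32(data); break;
	case 3: b |= (uint64_t)data[2] << 16; /* fall through */
	case 2: b |= load_le16(data); break;
	case 1: b |= data[0]; break;
	}
	return b;
}

/*
 * Mask path: read a full 8-byte word, then keep only the low `left` bytes.
 * Assumption: data points at 8 readable bytes and left is in 0..7.
 */
static uint64_t tail_mask(const uint8_t *data, size_t left)
{
	uint8_t raw[8];
	uint64_t word = 0;
	int i;

	memcpy(raw, data, sizeof(raw));
	for (i = 7; i >= 0; i--)	/* assemble a little-endian word */
		word = (word << 8) | raw[i];
	return word & ~(~0ULL << (left * 8));
}

/* Both paths should agree for every tail length from 0 to 7. */
static void check_tails(const uint8_t buf[8])
{
	size_t left;

	for (left = 0; left < 8; left++)
		assert(tail_switch(buf, left) == tail_mask(buf, left));
}
```

Calling check_tails() on any 8-byte buffer compares the two paths for
all tail lengths. It says nothing about speed, which is the actual
question in this thread.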

Jason
