Date: Fri, 19 Oct 2012 00:14:51 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: bf-opencl vectorization

Sayantan,

On Thu, Oct 18, 2012 at 09:49:29PM +0530, Sayantan Datta wrote:
> As a matter of fact, I did try doing two hashes per kernel. OpenCL seems
> to auto-vectorize the code if the instructions are mixed at the
> BF_ENCRYPT() level. However, if we mix at the BF_ROUND() level, the
> results are poor. I could achieve nearly 2900 c/s on an Intel Core i5
> 2500 with OpenCL. For comparison, the i5 2500 does 3600 c/s with native
> OpenMP at stock speeds. However, there wasn't much improvement for the
> FX-8120. Maybe I could try doing four hashes per kernel later. I already
> posted the patch to the git repo yesterday.

I think you're still confused.  Let me repeat: you can't possibly get
vectorized code for bf-opencl, no matter how hard and in what way you
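try, when you're targeting CPUs prior to AVX2.  They simply lack gather
addressing, which the data-dependent S-box lookups would need in order
to load from a different address in each vector lane.  The changes in
performance that you're observing are thus from something other than
vectorization.  It is indeed possible that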
doing several hashes per kernel in a certain way results in better or
worse performance than one hash per kernel - simply because the task
for the optimizer changes a bit, so it might perform better or worse.
The best it can do for pre-AVX2 CPUs is mix instructions like we do in C
code.  It can't vectorize.
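
To illustrate what "mixing instructions" means here, below is a minimal
sketch in plain C.  It is not the actual bf-opencl or JtR code: the names
bf_ctx, bf_f, bf_fill_demo, bf_encrypt_one, bf_encrypt_two, and
bf_selfcheck are made up for this example, and the tables are filled with
LCG filler rather than the real Blowfish constants.  The point to notice
is in bf_f(): every S-box index is derived from the data, so a vector
version would need a separate load address per lane (that is, gather).
Without gather, the most that can be done is to interleave two
independent scalar instances, as bf_encrypt_two() does.

```c
#include <assert.h>
#include <stdint.h>

/* One Blowfish-style state: 18-word P-array plus four 256-entry S-boxes. */
typedef struct {
    uint32_t P[18];
    uint32_t S[4][256];
} bf_ctx;

/* Blowfish F function: four lookups whose indices come from the data. */
static inline uint32_t bf_f(const bf_ctx *c, uint32_t x)
{
    return ((c->S[0][x >> 24] + c->S[1][(x >> 16) & 0xff])
            ^ c->S[2][(x >> 8) & 0xff]) + c->S[3][x & 0xff];
}

/* Fill a state with deterministic filler (NOT the real Blowfish tables). */
void bf_fill_demo(bf_ctx *c, uint32_t seed)
{
    uint32_t x = seed;
    for (int i = 0; i < 18; i++)
        c->P[i] = x = x * 1664525u + 1013904223u;
    for (int s = 0; s < 4; s++)
        for (int j = 0; j < 256; j++)
            c->S[s][j] = x = x * 1664525u + 1013904223u;
}

/* Reference: 16 rounds on a single instance. */
void bf_encrypt_one(const bf_ctx *c, uint32_t *l, uint32_t *r)
{
    uint32_t L = *l, R = *r, t;
    for (int i = 0; i < 16; i++) {
        L ^= c->P[i];
        R ^= bf_f(c, L);
        t = L; L = R; R = t;
    }
    t = L; L = R; R = t;            /* undo the final swap */
    R ^= c->P[16];
    L ^= c->P[17];
    *l = L; *r = R;
}

/* Two independent instances interleaved round by round.  Still scalar:
 * the two dependency chains merely give the CPU more work to overlap. */
void bf_encrypt_two(const bf_ctx *a, const bf_ctx *b,
                    uint32_t *la, uint32_t *ra, uint32_t *lb, uint32_t *rb)
{
    uint32_t La = *la, Ra = *ra, Lb = *lb, Rb = *rb, t;
    for (int i = 0; i < 16; i++) {
        La ^= a->P[i];          Lb ^= b->P[i];
        Ra ^= bf_f(a, La);      Rb ^= bf_f(b, Lb);
        t = La; La = Ra; Ra = t;
        t = Lb; Lb = Rb; Rb = t;
    }
    t = La; La = Ra; Ra = t;
    t = Lb; Lb = Rb; Rb = t;
    Ra ^= a->P[16];             Rb ^= b->P[16];
    La ^= a->P[17];             Lb ^= b->P[17];
    *la = La; *ra = Ra; *lb = Lb; *rb = Rb;
}

/* Returns 1 if the interleaved version matches two separate runs. */
int bf_selfcheck(void)
{
    static bf_ctx a, b;
    assert(sizeof a.S == 4096);     /* 4 KiB of S-boxes per instance */
    bf_fill_demo(&a, 1);
    bf_fill_demo(&b, 2);
    uint32_t la = 0x01234567u, ra = 0x89abcdefu;
    uint32_t lb = 0xdeadbeefu, rb = 0x0badf00du;
    uint32_t la2 = la, ra2 = ra, lb2 = lb, rb2 = rb;
    bf_encrypt_one(&a, &la, &ra);
    bf_encrypt_one(&b, &lb, &rb);
    bf_encrypt_two(&a, &b, &la2, &ra2, &lb2, &rb2);
    return la == la2 && ra == ra2 && lb == lb2 && rb == rb2;
}
```

Calling bf_selfcheck() from a small main() is enough to confirm that the
interleaved form computes the same thing as two separate runs.  Whether
it is actually faster depends on the compiler and the CPU.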

So I am more interested in whether it'd vectorize when you target AVX2
(can you force it to target AVX2 somehow, then review the asm code?) and
how these changes affect performance on GPUs (which do have gather).

BTW, I wouldn't be surprised if the two hashes per kernel change
actually hurts performance with AVX2, as we'll have barely enough L1
data cache for _one_ set of bcrypts using full 256-bit AVX2 vectors,
whereas with the two hashes at a time change at source level the
compiler might be mixing two sets of such 8x bcrypts in code.
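
The arithmetic behind this can be sketched as follows.  The Blowfish
figures (four S-boxes of 256 32-bit entries per instance) are fixed by
the algorithm, but the 32 KiB L1 data cache size is an assumption about
the target CPU, and the helper names sbox_bytes and fits_in_l1d are
invented for this example.  The P-arrays and any other working data are
left out, so the real footprint is somewhat larger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    BF_SBOXES         = 4,          /* S-boxes per Blowfish instance */
    BF_SBOX_ENTRIES   = 256,        /* 32-bit entries per S-box */
    AVX2_LANES        = 256 / 32,   /* 32-bit lanes in a 256-bit vector */
    L1D_BYTES_ASSUMED = 32 * 1024   /* ASSUMPTION: 32 KiB L1 data cache */
};

/* S-box bytes touched by `sets` groups of `lanes` bcrypt instances. */
size_t sbox_bytes(size_t lanes, size_t sets)
{
    assert(lanes > 0 && sets > 0);
    return sets * lanes * BF_SBOXES * BF_SBOX_ENTRIES * sizeof(uint32_t);
}

/* Nonzero if the S-boxes alone fit in the assumed L1 data cache. */
int fits_in_l1d(size_t lanes, size_t sets)
{
    return sbox_bytes(lanes, sets) <= (size_t)L1D_BYTES_ASSUMED;
}
```

Under these assumptions, sbox_bytes(AVX2_LANES, 1) is 8 * 4096 = 32768
bytes, which exactly fills the assumed cache, while two interleaved sets
need 65536 bytes and no longer fit.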

Alexander
