Date: Mon, 8 Oct 2012 23:54:16 +0530
From: Sayantan Datta <>
Subject: Re: Password hashing at scale (for Internet companies
 with millions of users) - YaC 2012 slides

On Mon, Oct 8, 2012 at 9:23 PM, Solar Designer <> wrote:

> With Xeon Phi, each of its CPU'ish cores directly controls its SIMD
> unit - just like normal CPUs do, but with a simpler CPU core (based off
> the original Pentium, so no out-of-order execution, etc.) and with wider
> SIMD vector width (512-bit vs. AVX2's 256-bit).  With tightly coupled
> CPU and GPU cores in APUs, the architecture is more similar to what we
> have in computers with discrete CPU and GPU chips now - that is, I
> expect those embedded GPUs to run code on their own rather than have
> their SIMD units directly controllable by the CPU cores.

If we are doing bcrypt on Xeon Phi, then in order to utilize the 512-bit
wide SIMD units, I think we must interleave at least 16 bcrypt hashes per
core at the instruction level (512 bits / 32-bit words = 16 lanes).
However, for GCN GPUs we usually don't have to worry about
instruction-level parallelism (this applies to the GCN architecture only;
VLIW4 could still benefit from ILP), because by definition the kernels
follow SIMD execution.  Doesn't this make programming on Xeon Phi harder?
In my opinion, a GCN GPU with gather-scatter load/store should be the best
for the
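
To make the "16 hashes per core" point concrete, here is a minimal sketch
of what interleaving at the instruction level means for the Blowfish F
function that bcrypt is built on.  It is only a model: plain Python lists
stand in for the 16 lanes of a 512-bit vector, gather() is a hypothetical
helper standing in for a vector gather load, and make_sboxes() fills the
tables with placeholder values rather than the real Blowfish constants.
It is not Xeon Phi or GCN code, just an illustration of the access pattern.

```python
LANES = 512 // 32   # 16 lanes of 32-bit words in one 512-bit vector
MASK = 0xFFFFFFFF   # keep arithmetic modulo 2**32

def make_sboxes(seed):
    """Placeholder S-boxes (NOT the real Blowfish constants): 4 tables x 256 words."""
    state = seed & MASK
    boxes = []
    for _ in range(4):
        box = []
        for _ in range(256):
            state = (state * 1664525 + 1013904223) & MASK  # simple LCG filler
            box.append(state)
        boxes.append(box)
    return boxes

def bf_f(S, x):
    """Blowfish F for ONE hash: four S-box lookups indexed by the bytes of x."""
    a, b, c, d = x >> 24, (x >> 16) & 0xFF, (x >> 8) & 0xFF, x & 0xFF
    return ((((S[0][a] + S[1][b]) & MASK) ^ S[2][c]) + S[3][d]) & MASK

def gather(tables, box, indices):
    """Model of a vector gather: lane i loads tables[i][box][indices[i]]."""
    return [tables[i][box][indices[i]] for i in range(len(indices))]

def bf_f_lanes(tables, xs):
    """The same F for all lanes in lockstep; each statement models one vector op."""
    a = gather(tables, 0, [x >> 24 for x in xs])
    b = gather(tables, 1, [(x >> 16) & 0xFF for x in xs])
    c = gather(tables, 2, [(x >> 8) & 0xFF for x in xs])
    d = gather(tables, 3, [x & 0xFF for x in xs])
    return [((((a[i] + b[i]) & MASK) ^ c[i]) + d[i]) & MASK
            for i in range(len(xs))]

# Each lane is an independent bcrypt instance with its own S-boxes.
tables = [make_sboxes(lane) for lane in range(LANES)]
xs = [(0x01234567 * (lane + 1)) & MASK for lane in range(LANES)]
out = bf_f_lanes(tables, xs)
```

Every lane indexes its own table with its own data-dependent offset, which
is why the lookups need gather addressing rather than a single contiguous
vector load.  On a CPU-style SIMD unit the programmer writes something
shaped like bf_f_lanes() explicitly; in a GPU kernel one writes something
shaped like bf_f() and the hardware runs many work-items in lockstep.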

