Date: Fri, 16 Mar 2012 01:45:23 +0400
From: Solar Designer <>
Subject: Re: AMD Bulldozer and XOP

On Thu, Mar 15, 2012 at 11:24:07PM +0200, Milen Rangelov wrote:
> I can help with some kernels. In fact, JtR is a very inspiring project. I
> like to look at how people solved similar problems, often in different ways.


> So 256-bit XOP is slower than 128-bit one?

According to benchmarks that were sent to me before I got a Bulldozer of
my own, yes: about twice slower per bit.  That is, each 256-bit
instruction is about four times slower than a 128-bit one while
processing only twice as many bits.  (I haven't tested 256-bit XOP on my
own FX-8120 yet.  I will do so a bit later.)

> This reminds me of SSE2 and some old Pentium 4 CPUs :)

I think you mean SSE and Pentium 3.  Yes, that was disappointing.  In
fact, the cause might be similar: officially, those wider registers and
operations on them are "floating point" (true for both the original SSE
and now for 256-bit AVX and XOP), so there might be some overhead on
updating some CPU-internal floating-point state (flags reflecting the
current values in the vector elements if interpreted as floating-point?).
That's just a guess, though.

