Date: Thu, 15 Mar 2012 18:24:42 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: AMD Bulldozer and XOP

On Thu, Mar 15, 2012 at 04:08:31PM +0200, Milen Rangelov wrote:
> Hehe I believe jtr would be better off without my buggy code inside :)

It got plenty of buggy code already, adding some more wouldn't hurt. ;-)

Seriously, though, I think many contributions are useful as PoCs - then
other contributors to the project may re-code things in a cleaner way.

> > Oh, you use explicit asm, not intrinsics? Does XOP even offer
> > bit-select instructions for XMM registers? I thought it only added
> > VPCMOV, which operates on YMM registers. Or do you mix XMM/YMM?

I'm sorry, that comment of mine was wrong and misleading. For a moment
I confused VEX-encoding vs. not with XMM vs. YMM registers. Indeed,
VPCMOV exists for XMM registers as well, and in fact this is what JtR
normally uses in -xop builds (except in 32-bit x86 builds, where it
tries to use 256-bit AVX and XOP to compensate for the low register
count, which in practice turned out to be beneficial on Sandy Bridge,
but not on Bulldozer - so I am going to stop doing that for -xop).

> No, I am not a big fan of having hand-written assembly. I used the VPCMOV
> intrinsic (_mm_cmov_si128) and it operates on xmm registers. I was not
> aware there is 256-bit version though, hm need to check that.. Actually I
> was not aware about that instruction at all, I just knew they have bitwise
> rotation. I was very pleasantly surprised when I looked at the XOP
> intrinsics list in MSDN and found that one out. It was like "hey wtf they
> have the vec_sel thing, great!". I spent too much time playing with GPUs
> recently, looks like there were some interesting news on the CPU front that
> I've missed :)

Yeah, I was surprised to hear you got into the CPU stuff at all - I
thought you were a GPU guy. ;-)

Like I said above, the 256-bit bitwise ops are currently slow. On Sandy
Bridge, they deliver about the same per-bit speed that the 128-bit ones
do, and on Bulldozer they appear to be slower. Things should change
with future CPUs, though - especially with those supporting AVX2 (which
officially gives us 256-bit integer vectors).

Alexander
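
For readers unfamiliar with the "vec_sel thing" discussed above, here is a minimal sketch of what a bit-select does, first in portable scalar C and then with the 128-bit XOP intrinsic mentioned in the thread. The function names bitselect64 and bitselect128 are made up for this illustration and are not taken from JtR or any other code base; the XOP variant is only compiled when the compiler defines __XOP__ (e.g. gcc -mxop), and it needs an XOP-capable CPU to run.

```c
#include <stdint.h>

#if defined(__XOP__)
#include <x86intrin.h>  /* declares _mm_cmov_si128 in XOP-enabled builds */
#endif

/*
 * Scalar reference (hypothetical helper name): for every bit position,
 * take the bit from a where mask is 1, and from b where mask is 0.
 * This costs three logical operations (AND, AND-NOT, OR).
 */
static inline uint64_t bitselect64(uint64_t a, uint64_t b, uint64_t mask)
{
    return (a & mask) | (b & ~mask);
}

#if defined(__XOP__)
/*
 * Same operation on a 128-bit XMM register (hypothetical helper name).
 * _mm_cmov_si128 maps to VPCMOV, so the whole select is one instruction.
 */
static inline __m128i bitselect128(__m128i a, __m128i b, __m128i mask)
{
    return _mm_cmov_si128(a, b, mask);
}
#endif
```

With this, bitselect64(0x1234, 0xABCD, 0x0F0F) picks the second and fourth nibbles from the first argument and the others from the second. The point of VPCMOV is simply that the three-operation scalar pattern collapses into a single instruction per vector.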