Message-Id: <FB9C8433-9FEA-4508-AEB8-05204A0F35A4@gmail.com>
Date: Tue, 7 Jul 2015 21:48:13 +0800
From: Lei Zhang <zhanglei.april@...il.com>
To: john-dev@...ts.openwall.com
Subject: AltiVec troubleshooting
Hi,
I think I've run into a nasty compiler issue again, on Power this time.
In pseudo-intrinsics, there's an intrinsic 'vcmov', which is available natively in XOP (_mm_cmov_si128) and is emulated on other archs in the same way:
#define vcmov(y, z, x) vxor(z, vand(x, vxor(y, z)))
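For reference, the emulation can be checked on plain integers. The following is only a scalar sketch of what the expression computes, not the actual pseudo-intrinsics code; the function names cmov32 and cmov32_ref are made up for illustration. For each bit, the result takes y where the mask x is 1 and z where it is 0:

```c
#include <stdint.h>

/* Scalar stand-in for vcmov(y, z, x): z ^ (x & (y ^ z)).
   Where a bit of x is 1, the xor with (y ^ z) turns z into y;
   where it is 0, z passes through unchanged. */
static uint32_t cmov32(uint32_t y, uint32_t z, uint32_t x)
{
    return z ^ (x & (y ^ z));
}

/* The same bit selection written as an explicit mux, for comparison. */
static uint32_t cmov32_ref(uint32_t y, uint32_t z, uint32_t x)
{
    return (y & x) | (z & ~x);
}
```

With y = 0xAAAAAAAA, z = 0x55555555 and x = 0xFFFF0000, both forms give 0xAAAA5555: the upper half comes from y and the lower half from z.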
But somehow, this emulation doesn't give the correct result on Power. I looked at the preprocessed code generated by 'gcc -E', and found that the compiler generated some weird code for 'vcmov'. Here is a simplified example:
Source:
--------------------------------------
#include <stdint.h>
#include <altivec.h>
typedef vector unsigned vtype32;
typedef union {
vtype32 v32;
uint32_t s32[4];
} vtype;
#define vand(x, y) (vtype)vec_and((x).v32, (x).v32)
#define vxor(x, y) (vtype)vec_xor((x).v32, (y).v32)
#define vcmov(y, z, x) vxor(z, vand(x, vxor(y, z)))
int main() {
vtype y, z, x, v;
v = vcmov(y, z, x);
}
-----------------------------------------
Preprocessed code:
-----------------------------------------
(...)
int main() {
vtype y, z, x, v;
v = (vtype)__builtin_vec_xor((z).v32, ((vtype)__builtin_vec_and((x).v32, (x).v32)).v32);
}
-----------------------------------------
The generated code for vcmov obviously doesn't match its definition (missing an xor operation). This looks really weird to me. Do you think it's a bug in gcc (4.9.2)?
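One thing that may be worth checking before blaming the compiler, assuming the macros are exactly as posted above: vand(x, y) passes (x).v32 to vec_and twice and never uses its second parameter y. If so, the preprocessor itself discards the inner vxor(y, z), which would account for the output shown. The scalar sketch below uses plain uint32_t instead of AltiVec vectors, and the _posted / _fixed names are hypothetical, introduced only to compare the two variants:

```c
#include <stdint.h>

/* As posted: the second parameter y is never used. */
#define vand_posted(x, y) ((x) & (x))
/* Hypothetical correction: use both operands. */
#define vand_fixed(x, y)  ((x) & (y))
#define vxor(x, y)        ((x) ^ (y))

#define vcmov_posted(y, z, x) vxor(z, vand_posted(x, vxor(y, z)))
#define vcmov_fixed(y, z, x)  vxor(z, vand_fixed(x, vxor(y, z)))

static uint32_t cmov_posted(uint32_t y, uint32_t z, uint32_t x)
{
    return vcmov_posted(y, z, x);   /* expands to z ^ (x & x) */
}

static uint32_t cmov_fixed(uint32_t y, uint32_t z, uint32_t x)
{
    return vcmov_fixed(y, z, x);    /* expands to z ^ (x & (y ^ z)) */
}
```

With y = 0x12345678, z = 0x0F0F0F0F and x = 0xFFFF0000, the posted variant gives 0xF0F00F0F (just z ^ x), while the corrected variant gives 0x12340F0F (upper half from y, lower half from z).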
Lei