Date: Wed, 10 Jun 2015 23:59:05 +0800
From: Lei Zhang <zhanglei.april@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Interleaving of intrinsics


> On Jun 9, 2015, at 8:46 PM, Lei Zhang <zhanglei.april@...il.com> wrote:
> 
> I tried to see the 'size' of sse-intrinsics.o under different interleaving factors and compiled by clang and icc respectively.
> 
> lei-mac:src lei$ size clang/*
> __TEXT	__DATA	__OBJC	others	dec	hex
> 122863	0	0	26572	149435	247bb	clang/x1.o
> 127951	0	0	28699	156650	263ea	clang/x2.o
> 128479	0	0	28614	157093	265a5	clang/x3.o
> 127679	0	0	28527	156206	2622e	clang/x4.o
> 
> lei-mac:src lei$ size icc/*
> __TEXT	__DATA	__OBJC	others	dec	hex
> 102084	7545	0	50442	160071	27147	icc/x1.o
> 113012	9799	0	49375	172186	2a09a	icc/x2.o
> 113348	9799	0	51275	174422	2a956	icc/x3.o
> 114740	9799	0	53235	177774	2b66e	icc/x4.o

I looked further into the asm code that icc generates for x1 and x2 (SIMD_PARA_SHA256) on my laptop (AVX). In SSESHA256body, there are about 200 vmovdqu instructions under x1 and 260 under x2. Most of the vmovdqu instructions load or store xmm registers from or to memory; only a few are register-to-register moves. I think the additional vmovdqu instructions under x2 are likely caused by register spilling.
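
For anyone who wants to repeat the count, here is a rough sketch of how the vmovdqu instructions could be classified from an assembly listing. It is not the exact method used above. It assumes AT&T syntax (source operand first, memory operands written with parentheses, '#' comments) and one instruction per line with no leading label; the function name classify_vmovdqu is made up for illustration.

```python
def classify_vmovdqu(asm_text):
    """Count vmovdqu instructions by kind in an AT&T-syntax listing.

    Assumptions (not taken from the original post): memory operands
    contain parentheses, '#' starts a comment, and no label precedes
    the mnemonic on the same line.
    """
    counts = {"load": 0, "store": 0, "reg": 0}
    for line in asm_text.splitlines():
        code = line.split("#", 1)[0].strip()   # drop trailing comment
        parts = code.split(None, 1)            # mnemonic, operand string
        if len(parts) != 2 or parts[0] != "vmovdqu":
            continue
        operands = parts[1].strip()
        if "(" not in operands:
            counts["reg"] += 1      # no memory operand: register-to-register
        elif operands.endswith(")"):
            counts["store"] += 1    # destination (last operand) is memory
        else:
            counts["load"] += 1     # source (first operand) is memory
    return counts


sample = """
    vmovdqu 16(%rsp), %xmm0
    vmovdqu %xmm1, 32(%rsp,%rax,8)
    vmovdqu %xmm2, %xmm3
    vpaddd  %xmm0, %xmm1, %xmm2
    vmovdqu %xmm4, (%rdi)        # store to memory
"""
print(classify_vmovdqu(sample))
```

Running this on the listings for x1 and x2 (restricted to the body of SSESHA256body) and comparing the load/store counts would give a quick check of the spilling guess: spills should show up as extra loads and stores addressed relative to the stack pointer.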


Lei
