Date: Thu, 22 Dec 2005 04:04:26 +0300
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re:  john improvement suggestions

On Wed, Dec 21, 2005 at 08:57:09PM +0100, Simon Marechal wrote:
> Solar Designer wrote:
> >On some CPUs, ALU+SSE, MMX+SSE, and even ALU+MMX+SSE could be beneficial,
> >too - and people have been doing that.  There's definitely room for
> >improvement here.
> 
> MMX and SSE are the same physical registers (with FPU too) on my Athlon 
> XP, and AFAIK, this is true for the other architectures too:
> 
> 	mov $0x4444, %eax
> 	movd %eax, %mm0
> 	pxor %xmm0, %xmm0
> 	movd %eax, %mm1
> 	paddd %mm1, %mm0
> 	movd %mm0, %eax
> 
> eax will have the value 0x4444
> if the "pxor %xmm0, %xmm0" line is removed, it will be 0x8888

It's not so simple.  Your code actually produces different results on
P3/SSE (shared MMX and SSE register files) vs. P4/SSE2 (distinct
register files).
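
To make the two possible outcomes concrete, here is a toy model of the
quoted snippet in C.  It does not execute any MMX or SSE instructions; it
only mirrors the six steps under two assumptions: either the pxor on xmm0
also clears mm0, or it leaves mm0 alone.  The names run_snippet,
check_model and clobbers_mm0 are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the six-instruction snippet quoted above.
 * clobbers_mm0 != 0: the pxor on xmm0 also clears mm0.
 * clobbers_mm0 == 0: xmm0 and mm0 are independent registers.
 * Only the low 32 bits are modelled, which is all the snippet reads back. */
uint32_t run_snippet(int clobbers_mm0)
{
    uint32_t eax = 0x4444;      /* mov $0x4444, %eax */
    uint32_t mm0, mm1;
    uint32_t xmm0 = 0;

    mm0 = eax;                  /* movd %eax, %mm0 */
    xmm0 ^= xmm0;               /* pxor %xmm0, %xmm0 */
    if (clobbers_mm0)
        mm0 = xmm0;             /* assumption: the pxor hit mm0 as well */
    mm1 = eax;                  /* movd %eax, %mm1 */
    mm0 += mm1;                 /* paddd %mm1, %mm0 (low dword only) */
    eax = mm0;                  /* movd %mm0, %eax */
    return eax;
}

/* The two results discussed in the thread. */
void check_model(void)
{
    assert(run_snippet(1) == 0x4444);   /* mm0 was cleared by the pxor */
    assert(run_snippet(0) == 0x8888);   /* mm0 survived the pxor */
}
```

So getting 0x4444 on one CPU shows only that the pxor clobbered mm0 on
that CPU; a CPU where the result is 0x8888 behaves like the second case.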

> I do not think it is possible to use MMX+SSE at the same time effectively.

Well, the execution units that we're interested in appear to be shared
between MMX and SSE on current CPUs (even when registers are distinct).
But this gives us more registers (with SSE2), so we can bring more
parallelism down to instruction level avoiding some stalls (on data
dependencies) which we would otherwise incur.
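
As a rough illustration of the data dependency point, here is a sketch in
plain scalar C (not MMX/SSE code, and not taken from John).  The first
function keeps one accumulator, so each XOR has to wait for the previous
one.  The second keeps four accumulators live at once, which needs more
registers but gives the CPU four independent chains to overlap.  The
function names and the choice of four chains are assumptions made for the
example, and a compiler may well transform either loop on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* One accumulator: every XOR depends on the result of the previous one. */
uint32_t xor_chain(const uint32_t *v, size_t n)
{
    uint32_t acc = 0;
    size_t i;

    for (i = 0; i < n; i++)
        acc ^= v[i];
    return acc;
}

/* Four accumulators: four independent dependency chains that can be in
 * flight together, at the cost of keeping four values in registers. */
uint32_t xor_interleaved(const uint32_t *v, size_t n)
{
    uint32_t a0 = 0, a1 = 0, a2 = 0, a3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        a0 ^= v[i];
        a1 ^= v[i + 1];
        a2 ^= v[i + 2];
        a3 ^= v[i + 3];
    }
    for (; i < n; i++)          /* leftover elements when n % 4 != 0 */
        a0 ^= v[i];
    return a0 ^ a1 ^ a2 ^ a3;   /* combine the chains at the end */
}
```

Both functions return the same value for the same input; any speed
difference depends on the CPU and compiler and would have to be measured.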

-- 
Alexander Peslyak <solar at openwall.com>
GPG key ID: B35D3598  fp: 6429 0D7E F130 C13E C929  6447 73C3 A290 B35D 3598
http://www.openwall.com - bringing security into open computing environments
