Date: Fri, 10 Aug 2012 14:06:13 -0400
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: crypt* files in crypt directory

On Fri, Aug 10, 2012 at 09:04:35PM +0400, Solar Designer wrote:
> > > why increase ptr at the beginning?
> > > it seems the idiomatic way would be
> > > 
> > >  *ptr++ = L;
> > >  *ptr++ = R;
> > 
> > For me, making this change makes it 5% faster. I suspect the
> > difference comes from the fact that gcc is not smart enough to move
> > the ptr+=2; across the rest of the loop body, and the fact that it
> > gets spilled to the stack and reloaded for *both* points of usage
> > rather than just one. The original version may perform better on
> > machines with A LOT more registers, but I'm doubtful...
> 
> The spilling theory makes sense to me, but it does not fully explain the
> 5% difference - I think it could explain a 1% difference or so.  More
> likely there's some change in register allocation overall, not only for
> ptr - or something like it.

Indeed, that's possible too. I haven't read the asm diff.

> Anyhow, this does not match my test results so far, for different
> revisions of this code.  What compiler, options, architecture, CPU?

gcc 4.6.3 at -O3, generic i486 code generation with no tuning for my
CPU, which is an Atom.
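
For anyone who wants to reproduce this, the two loop shapes being
compared look roughly like the sketch below. It is not the crypt code
itself: mix() is a made-up stand-in for the real per-iteration work,
and the function names, the buffer, and the even-length requirement
are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the real round function; it only exists
 * so that L and R change on every iteration. */
static void mix(uint32_t *L, uint32_t *R)
{
	uint32_t t = *L ^ (*R * 2654435761u);
	*L = *R + 0x9e3779b9u;
	*R = t;
}

/* Variant A: advance ptr at the top of the loop body, then store
 * through negative offsets. ptr is live across the whole body. */
static void fill_early(uint32_t *buf, size_t n)
{
	uint32_t L = 0, R = 0, *ptr = buf;
	assert(n >= 2 && n % 2 == 0); /* assumed: even, non-empty buffer */
	do {
		ptr += 2;
		mix(&L, &R);
		*(ptr - 2) = L;
		*(ptr - 1) = R;
	} while (ptr < buf + n);
}

/* Variant B: the idiomatic post-increment stores, with ptr only
 * touched after the work is done. */
static void fill_post(uint32_t *buf, size_t n)
{
	uint32_t L = 0, R = 0, *ptr = buf;
	assert(n >= 2 && n % 2 == 0); /* assumed: even, non-empty buffer */
	do {
		mix(&L, &R);
		*ptr++ = L;
		*ptr++ = R;
	} while (ptr < buf + n);
}

/* Sanity check: both variants must write identical data. */
static int variants_agree(void)
{
	uint32_t a[16] = {0}, b[16] = {0};
	fill_early(a, 16);
	fill_post(b, 16);
	return memcmp(a, b, sizeof a) == 0;
}
```

Both variants produce the same output, so any timing difference comes
from code generation alone. Comparing the output of gcc -S for the two
functions should show whether ptr is actually spilled and reloaded, or
whether the register allocation differs more broadly.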

Rich
