musl - Re: [PATCH 1/2] x86_64/memset: avoid multiply insn if possible



Message-ID: <20150213072436.GD23507@brightrain.aerifal.cx>
Date: Fri, 13 Feb 2015 02:24:36 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH 1/2] x86_64/memset: avoid multiply insn if possible

On Thu, Feb 12, 2015 at 09:36:26PM +0100, Denys Vlasenko wrote:
> On Thu, Feb 12, 2015 at 8:26 PM, Denys Vlasenko
> <vda.linux@...glemail.com> wrote:
> >> I'd actually like to extend the "short" range up to at least 32 bytes
> >> using two 8-byte writes for the middle, unless the savings from using
> >> 32-bit imul instead of 64-bit are sufficient to justify 4 4-byte
> >> writes for the middle. On the cpu I tested on, the difference is 11
> >> cycles vs 32 cycles for non-rep path versus rep path at size 32.
> >
> > The short path causes mixed feelings in me.
> >
> > On one hand, it's elegant in a contrived way.
> >
> > On the other hand, multiple
> > overlaying stores must be causing hell in store unit.
> > I'm thinking, maybe there's a faster way to do that.

In practice it performs quite well; x86 CPUs handle overlapping stores
efficiently. Note that the generic C code in memset.c does not do any
overlapping writes of different sizes in the short-buffer code path --
all writes there are single-byte, and the only redundancy is that some
of the inner bytes get written more than once, depending on the value
of n.
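
A minimal sketch of the short-buffer idea described above, for readers
following along. It is an illustration only, not the actual memset.c
source: the name short_fill and the n <= 8 limit are assumptions made
for this example. Every store is a single byte, written inward from
both ends, so only a few comparisons on n are needed and some inner
bytes are stored twice.

```c
#include <stddef.h>

/* Hypothetical helper (not musl's memset): fill n bytes, n <= 8,
 * using only single-byte stores placed from both ends of the buffer.
 * The caller must guarantee n <= 8. */
static void *short_fill(void *dest, int c, size_t n)
{
    unsigned char *s = dest;

    if (n == 0) return dest;
    s[0] = c;            /* first and last byte; same byte when n == 1 */
    s[n-1] = c;
    if (n <= 2) return dest;
    s[1] = c;            /* next two from each end; these overlap */
    s[2] = c;            /* for n = 3..5 (redundant stores) */
    s[n-2] = c;
    s[n-3] = c;
    if (n <= 6) return dest;
    s[3] = c;            /* remaining middle; same byte when n == 7 */
    s[n-4] = c;
    return dest;
}
```

For n = 5, for example, bytes 2 is stored twice and no branch depends
on anything finer than "n <= 2" and "n <= 6".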

> For example, like in the attached implementation.
> 
> This one will not perform eight stores to memory
> to fill 15 byte area... only two.

I could try comparing its performance, but I expect branches to cost a
lot more than redundant stores to cached memory. My approach in the C
code seems to be the absolute minimum possible number of branches for
short memsets, and it pays off -- it's even faster than the current
asm for these small sizes.
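
For comparison, the "only two stores for a 15 byte area" idea from the
quoted message can be sketched as follows. This is not the attached
implementation (which is not reproduced here); the function name
fill_8_to_16 and the 8..16 range are assumptions for illustration. It
broadcasts the byte into a 64-bit word and issues two 8-byte stores
that overlap in the middle. Selecting this path requires a branch on n,
which is the cost being weighed against redundant byte stores.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill n bytes, 8 <= n <= 16, with exactly two
 * 8-byte stores that may overlap. The caller must guarantee the range. */
static void fill_8_to_16(void *dest, int c, size_t n)
{
    unsigned char *s = dest;
    /* Replicate the byte into all eight lanes of a 64-bit word. */
    uint64_t v = (uint64_t)(unsigned char)c * 0x0101010101010101ULL;

    /* memcpy with a constant size is the portable way to express an
     * unaligned store; compilers typically emit a single mov for it. */
    memcpy(s, &v, 8);          /* bytes 0 .. 7 */
    memcpy(s + n - 8, &v, 8);  /* bytes n-8 .. n-1 */
}
```

With n = 15 the two stores cover bytes 0..7 and 7..14, so only byte 7
is written twice.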

Rich
