musl - Re: Re: [J-core] Aligned copies and cacheline conflicts?

Message-ID: <20160916221603.GS15995@brightrain.aerifal.cx>
Date: Fri, 16 Sep 2016 18:16:03 -0400
From: Rich Felker <dalias@...c.org>
To: Rob Landley <rob@...dley.net>
Cc: "j-core@...ore.org" <j-core@...ore.org>, musl@...ts.openwall.com
Subject: Re: Re: [J-core] Aligned copies and cacheline conflicts?

On Wed, Sep 14, 2016 at 10:36:45PM -0400, Rich Felker wrote:
> On Wed, Sep 14, 2016 at 07:58:52PM -0500, Rob Landley wrote:
> > On 09/14/2016 07:34 PM, Rich Felker wrote:
> > > I could put a fork of memcpy.c in sh/memcpy.c and work on it there and
> > > only merge it back to the shared one if others test it on other archs
> > > and find it beneficial (or at least not harmful).
> > 
> > Both musl and the kernel need it. And yes at the moment it seems
> > architecture-specific, but it's a _big_ performance difference...
> 
> I actually think it's justifiable to have in the generic C memcpy,
> from a standpoint that the generic C shouldn't assume an N-way (N>1,
> i.e. not direct mapped) associative cache. Just need to make sure
> changing it doesn't make gcc do something utterly idiotic for other
> archs, I guess. I'll take a look at this.

Attached is a draft memcpy I'm considering for musl. Compared to the
current one, it:

1. Works on 32 bytes per iteration, and adds barriers between the load
   phase and store phase to preclude cache line aliasing between src
   and dest with a direct-mapped cache.

2. Equally unrolls the misaligned src/dest cases.

3. Adjusts the offsets used in the misaligned src/dest loops so they
   are all multiples of 4, with the pointer adjustments needed to make
   that work done once outside the loops. This helps compilers generate
   indexed addressing modes (e.g. @(4,Rm)) rather than having to
   resort to explicit address arithmetic.

4. Factors the misaligned cases into a common inline function to
   reduce code duplication.
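
To make item 1 concrete, here is a minimal sketch of the load-phase/store-phase split for the case where src and dest are both 4-byte aligned. It is not the attached draft: the names sketch_memcpy and COMPILER_BARRIER are hypothetical, the barrier is assumed to be a GCC/Clang-style compiler-only barrier, and the misaligned cases from items 2-4 are replaced by a plain byte loop.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit word type that may alias other types (GCC/Clang extension). */
typedef uint32_t __attribute__((__may_alias__)) u32;

/* Hypothetical helper: a compiler-only barrier. It stops the compiler from
 * moving memory accesses across it; it emits no hardware fence instruction. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

/* Sketch only (hypothetical name). Copies n bytes from src to dest. */
void *sketch_memcpy(void *restrict dest, const void *restrict src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    if ((uintptr_t)s % 4 == 0 && (uintptr_t)d % 4 == 0) {
        for (; n >= 32; s += 32, d += 32, n -= 32) {
            const u32 *ws = (const u32 *)s;
            u32 *wd = (u32 *)d;

            /* Load phase: read all 32 bytes of the source block first. */
            uint32_t a = ws[0], b = ws[1], c = ws[2], e = ws[3];
            uint32_t f = ws[4], g = ws[5], h = ws[6], i = ws[7];

            /* Keep the compiler from interleaving loads and stores, so a
             * src line and a dest line that map to the same slot of a
             * direct-mapped cache are not touched alternately. */
            COMPILER_BARRIER();

            /* Store phase: write the whole block out. */
            wd[0] = a; wd[1] = b; wd[2] = c; wd[3] = e;
            wd[4] = f; wd[5] = g; wd[6] = h; wd[7] = i;
        }
    }

    /* Tail bytes, and the (here unoptimized) misaligned cases. */
    while (n--) *d++ = *s++;
    return dest;
}
```

Without the barrier, a compiler is free to schedule each store right after its load, which is the access pattern that can make src and dest evict each other's lines when they alias in a direct-mapped cache.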

Comments welcome.

Rich

Attachment: "memcpy-draft.c" of type "text/plain" (2705 bytes)
