musl - Re: Optimized C memcpy [updated]


Message-ID: <52077217.9070004@gentoo.org>
Date: Sun, 11 Aug 2013 13:14:31 +0200
From: Luca Barbato <lu_zero@...too.org>
To: musl@...ts.openwall.com
Subject: Re: Optimized C memcpy [updated]

On 11/08/13 10:13, Rich Felker wrote:
>> Unfortunately this case seems to be compiling to a call to memcpy on
>> powerpc (but nowhere else I found). So I may need to drop the special
>> case for 64-bit alignment. I wish there was some source for knowledge
>> of the cases that can trigger gcc's stupidity, though...
> 
> It turns out mips at certain optimization levels is also generating a
> memcpy for the structure assignments. I think I just need to drop all
> of the structure-assignment tricks and use a mildly unrolled loop with
> uint32_t units for the aligned case. This gives much worse performance
> on ARM, where gcc fails to generate the proper ldmia/stmia without the
> struct, but we have asm we can use for ARM anyway. On other archs, the
> struct copy code does not even seem to help. The simple integer loop
> works just as well.
> 
> I'll do some more experimenting and probably commit the ARM asm soon,
> followed by the C code once I get some better feedback on how it
> performs on real machines.
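
For reference, the "mildly unrolled loop with uint32_t units" for the aligned case could look roughly like the sketch below. This is not the code under discussion in the thread: the name copy_aligned_words is hypothetical, both pointers are assumed to be 4-byte aligned, and the may_alias typedef is a GCC/Clang extension assumed here to keep the word accesses valid under strict aliasing.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: GCC/Clang extension; lets u32 accesses alias any object type. */
typedef uint32_t __attribute__((__may_alias__)) u32;

/* Hypothetical helper: copy n bytes; dest and src must both be 4-byte aligned. */
static void *copy_aligned_words(void *restrict dest, const void *restrict src, size_t n)
{
    u32 *d = dest;
    const u32 *s = src;

    /* Main loop, unrolled by four: 16 bytes per iteration. */
    for (; n >= 16; n -= 16, d += 4, s += 4) {
        u32 a = s[0], b = s[1], c = s[2], e = s[3];
        d[0] = a; d[1] = b; d[2] = c; d[3] = e;
    }

    /* Remaining whole 32-bit words. */
    for (; n >= 4; n -= 4)
        *d++ = *s++;

    /* Trailing 0 to 3 bytes. */
    unsigned char *db = (unsigned char *)d;
    const unsigned char *sb = (const unsigned char *)s;
    while (n--)
        *db++ = *sb++;

    return dest;
}
```

Depending on compiler version and optimization level, gcc may still recognize a loop like this as a copy idiom and emit a call to memcpy; building with -fno-tree-loop-distribute-patterns is one way to suppress that transformation.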

What about sprinkling volatile here and there?

lu
