Date: Sun, 11 Aug 2013 02:20:10 -0400
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: Optimized C memcpy [updated]

On Sun, Aug 11, 2013 at 01:11:35AM -0400, Rich Felker wrote:
> struct block32 { uint32_t data[8]; };
> struct block64 { uint64_t data[8]; };
> 
> void *memcpy(void *restrict dest, const void *restrict src, size_t n)
> {
> 	unsigned char *d = dest;
> 	const unsigned char *s = src;
> 	uint32_t w, x;
> 
> 	for (; (uintptr_t)s % 8 && n; n--) *d++ = *s++;
> 	if (!n) return dest;
> 
> 	if (n>=4) switch ((uintptr_t)d % 4) {
> 	case 0:
> 		if (!((uintptr_t)d%8)) for (; n>=64; s+=64, d+=64, n-=64)
> 			*(struct block64 *)d = *(struct block64 *)s;

Unfortunately, on powerpc this case compiles to a call to memcpy (I
have not seen that on any other target I tried). So I may need to drop
the special case for 64-bit alignment. I wish there were some
documentation of the cases that can trigger gcc's stupidity, though...
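
For illustration, here is a minimal sketch of one possible way to avoid
the struct assignment: copy each 64-byte block as eight explicit 64-bit
word moves. The helper name copy_blocks64 and the u64a typedef are
hypothetical and not part of the code quoted above, and the may_alias
attribute is a GCC/Clang extension. Whether gcc still emits a library
call for this form depends on the version, target and flags, so it
would need to be checked in the generated assembly.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: may_alias is a GCC/Clang extension. It allows arbitrary
 * bytes to be accessed through this type without breaking the strict
 * aliasing rules. */
typedef uint64_t __attribute__((__may_alias__)) u64a;

/* Hypothetical helper, not from the quoted code: copy as many whole
 * 64-byte blocks as fit in n, one 64-bit word at a time, instead of
 * assigning a struct block64. Both pointers must be 8-byte aligned.
 * Returns the number of bytes copied (a multiple of 64). */
static size_t copy_blocks64(unsigned char *d, const unsigned char *s, size_t n)
{
	size_t done = 0;

	for (; n - done >= 64; done += 64) {
		u64a *dw = (u64a *)(d + done);
		const u64a *sw = (const u64a *)(s + done);

		dw[0] = sw[0]; dw[1] = sw[1]; dw[2] = sw[2]; dw[3] = sw[3];
		dw[4] = sw[4]; dw[5] = sw[5]; dw[6] = sw[6]; dw[7] = sw[7];
	}
	return done;
}
```

The caller would advance s and d by the returned count, subtract it
from n, and let the remaining cases handle the tail.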

Rich
