Date: Sat, 04 Aug 2012 01:22:10 +0200
From: John Spencer <maillist-musl@...fooze.de>
To: musl@...ts.openwall.com
Subject: Re: Re: musl libc, memcpy

i've set up a performance test ( https://github.com/rofl0r/memcpy-test )

these are the average results for i386 (100 runs on big sizes, 10000 on 
smaller ones)

                asm version     current C version
size: 3     172 ticks       199 ticks
size: 4     167 ticks       167 ticks
size: 5     197 ticks       186 ticks
size: 8     187 ticks       186 ticks
size: 15        195 ticks       196 ticks
size: 16        186 ticks       185 ticks
size: 23        202 ticks       199 ticks
size: 24        193 ticks       188 ticks
size: 25        205 ticks       212 ticks
size: 31        199 ticks       198 ticks
size: 32        195 ticks       192 ticks
size: 33        204 ticks       192 ticks
size: 63        213 ticks       255 ticks
size: 64        219 ticks       226 ticks
size: 65        208 ticks       238 ticks
size: 95        220 ticks       247 ticks
size: 96        214 ticks       239 ticks
size: 97        217 ticks       243 ticks
size: 127       233 ticks       261 ticks
size: 128       225 ticks       254 ticks
size: 129       229 ticks       266 ticks
size: 159       242 ticks       279 ticks
size: 160       235 ticks       268 ticks
size: 161       238 ticks       273 ticks
size: 191       255 ticks       288 ticks
size: 192       264 ticks       288 ticks
size: 193       248 ticks       287 ticks
size: 255       279 ticks       323 ticks
size: 256       266 ticks       313 ticks
size: 257       269 ticks       319 ticks
size: 383       332 ticks       391 ticks
size: 384       308 ticks       370 ticks
size: 385       307 ticks       384 ticks
size: 511       345 ticks       439 ticks
size: 512       315 ticks       434 ticks
size: 513       318 ticks       439 ticks
size: 767       370 ticks       571 ticks
size: 768       330 ticks       555 ticks
size: 769       334 ticks       566 ticks
size: 1023      382 ticks       740 ticks
size: 1024      349 ticks       727 ticks
size: 1025      358 ticks       694 ticks
size: 1535      423 ticks       936 ticks
size: 1536      393 ticks       930 ticks
size: 1537      400 ticks       929 ticks
size: 2048      448 ticks       1176 ticks
size: 4096      822 ticks       2404 ticks
size: 8192      3136 ticks      8310 ticks
size: 16384     6481 ticks      9780 ticks
size: 32768     11645 ticks     19060 ticks
size: 65536     29700 ticks     52051 ticks
size: 131072    307029 ticks    310875 ticks
size: 262144    608502 ticks    617698 ticks
size: 524288    1222116 ticks   1244987 ticks
size: 1048576   2500207 ticks   2712991 ticks
size: 2097152   5279016 ticks   5566665 ticks
size: 4194304   10586333 ticks  10849110 ticks
size: 8388608   21961730 ticks  22473953 ticks
size: 16777216  45966254 ticks  47159258 ticks
size: 33554432  92434464 ticks  95873868 ticks
size: 67108864  189858530 ticks 190456107 ticks

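it looks as if the asm version is up to almost three times as fast 
(822 vs 2404 ticks at size 4096), depending on the size of data copied. 
the gap is largest between sizes 1024 and 8192 and mostly disappears 
from 131072 up.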
now waiting for the x86_64 version (if you could provide a working 64bit 
rdtsc inline asm function, i'll gladly take that as well)
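
For reference, a minimal sketch of an rdtsc reader that should build for both i386 and x86_64 with GCC-style inline asm. The "=A" constraint often used on i386 does not name the EDX:EAX pair on x86_64, so this version reads the two halves separately. The names rdtsc and avg_memcpy_ticks are illustrative and are not taken from the memcpy-test repository; the averaging loop only approximates the "average over N runs" method described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read the CPU time-stamp counter (x86 only, GCC/Clang inline asm).
 * RDTSC puts the low 32 bits in EAX and the high 32 bits in EDX,
 * in both 32-bit and 64-bit mode, so the halves are joined by hand. */
static inline uint64_t rdtsc(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical helper: average tick count over `runs` memcpy calls
 * of n bytes each. */
static uint64_t avg_memcpy_ticks(void *dst, const void *src,
                                 size_t n, unsigned runs)
{
    uint64_t total = 0;

    assert(runs > 0);
    for (unsigned i = 0; i < runs; i++) {
        uint64_t start = rdtsc();
        memcpy(dst, src, n);
        total += rdtsc() - start;
    }
    return total / runs;
}
```

Note that rdtsc is not a serializing instruction, so measurements of very short copies can be imprecise.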

someone on ##asm suggested that movaps with xmm regs was fastest in his 
tests.
would be interesting to test such a version as well.
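
A rough sketch of what such a variant could look like, written with SSE intrinsics rather than hand-written asm. _mm_load_ps and _mm_store_ps are the aligned 128-bit load/store intrinsics, which compilers typically lower to movaps. The function name copy_sse_aligned is made up for illustration, and the sketch assumes both pointers are 16-byte aligned; it has not been benchmarked against either version in the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <xmmintrin.h>  /* SSE intrinsics */

/* Copy n bytes in 16-byte chunks through an xmm register.
 * Assumption: dst and src are both 16-byte aligned; aligned
 * loads/stores fault on misaligned addresses. */
static void *copy_sse_aligned(void *dst, const void *src, size_t n)
{
    float *d = dst;
    const float *s = src;
    size_t chunks = n / 16;

    assert(((uintptr_t)dst & 15) == 0 && ((uintptr_t)src & 15) == 0);

    /* Each iteration moves 4 floats = 16 bytes; the bits are copied
     * unchanged, no floating-point arithmetic is performed. */
    for (size_t i = 0; i < chunks; i++)
        _mm_store_ps(d + 4 * i, _mm_load_ps(s + 4 * i));

    /* Leftover 0..15 bytes go through plain memcpy. */
    memcpy((unsigned char *)dst + chunks * 16,
           (const unsigned char *)src + chunks * 16, n % 16);
    return dst;
}
```

On i386 this needs SSE enabled at compile time (e.g. -msse with GCC); on x86_64 SSE is part of the baseline.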

On 08/01/2012 08:19 AM, Rich Felker wrote:
> On Wed, Aug 01, 2012 at 01:40:11AM -0400, Rich Felker wrote:
>> On Wed, Aug 01, 2012 at 12:27:22AM -0400, Rich Felker wrote:
>>> I'm attaching a (possibly buggy; not heavily tested) rep-movsd-based
>>> version. I'd be interested in hearing how it performs.
>> And here is the attachment...
> And here's a version that might be faster; reportedly, rep movsd works
> better when the destination address is aligned.
>
> Rich
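
To make the quoted idea concrete, here is a minimal sketch of a rep-movsd copy that first aligns the destination. This is not the attachment from the quoted mails (which is not reproduced here); the function name copy_rep_movsd is hypothetical, and the code assumes GCC/Clang inline asm on x86 with the direction flag clear, as the ABI requires on function entry. In AT&T syntax the instruction is spelled "rep movsl".

```c
#include <stddef.h>
#include <stdint.h>

/* Copy n bytes: byte-wise until dst is 4-byte aligned, then
 * 4 bytes at a time with rep movsd, then the 0..3 byte tail. */
static void *copy_rep_movsd(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Head: advance until the destination address is 4-byte aligned. */
    while (n && ((uintptr_t)d & 3)) {
        *d++ = *s++;
        n--;
    }

    /* Body: EDI/ESI/ECX (RDI/RSI/RCX on x86_64) are updated by the
     * instruction, hence the read-write "+" constraints. */
    size_t words = n / 4;
    __asm__ __volatile__("rep movsl"
                         : "+D"(d), "+S"(s), "+c"(words)
                         :
                         : "memory");

    /* Tail: remaining 0..3 bytes. */
    n &= 3;
    while (n--)
        *d++ = *s++;

    return dst;
}
```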
