Date: Thu, 29 Aug 2013 14:54:22 -0400
From: Yaniv Sapir <yaniv@...pteva.com>
To: john-dev <john-dev@...ts.openwall.com>
Subject: Re: Parallella: Litecoin mining

On Wed, Aug 28, 2013 at 9:37 PM, Solar Designer <solar@...nwall.com> wrote:

> Does this mean that replacing memcpy() improved the overall speed by as
> much as 15% or so? If so, this suggests that the code wastes too much
> time copying data, and needs to be revised at higher level (than memcpy()
> itself), in addition to optimizing memcpy().
>
> Also, I just took a look at your currently committed code - your
> memcpy() replacement, at least at source code level, copies data byte by
> byte. This is very slow, unless the compiler optimizes this into 32-bit
> or 64-bit loads and stores somehow. I doubt that replacing memcpy()
> with this implementation of blkcpy() provided any speedup (but I could
> be wrong - weird things happen).
>
> Ideally, your blkcpy() should be a partially unrolled loop with LDRD and
> STRD instructions in it, and all of the data needs to be 8 byte aligned.

Absolutely. newlib's memcpy() implementation copies by bytes or by words,
depending on the alignment of the pointers, but never by shorts or doubles.
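
To illustrate the kind of copy loop suggested in the quoted text, here is a
minimal sketch in portable C. The name blkcpy64() and its signature are
hypothetical and are not taken from the committed code. It assumes that both
pointers are 8-byte aligned, that the buffers do not overlap, and that the
length is a multiple of 8 bytes. Whether the compiler actually turns the
64-bit accesses into LDRD/STRD pairs depends on the toolchain and flags, so
the generated assembly would need to be checked.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical blkcpy64(): copy len bytes as 64-bit words.
 * Assumptions (not checked here): dst and src are 8-byte aligned,
 * the buffers do not overlap, and len is a multiple of 8.
 */
static void blkcpy64(void *restrict dst, const void *restrict src, size_t len)
{
    uint64_t *d = dst;
    const uint64_t *s = src;
    size_t n = len / 8; /* number of 64-bit words to copy */

    /* Partially unrolled: four doubleword loads, then four stores
     * (32 bytes per iteration). */
    while (n >= 4) {
        uint64_t w0 = s[0], w1 = s[1], w2 = s[2], w3 = s[3];
        d[0] = w0;
        d[1] = w1;
        d[2] = w2;
        d[3] = w3;
        s += 4;
        d += 4;
        n -= 4;
    }

    /* Copy the remaining 0 to 3 words one at a time. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
}
```

Grouping the loads ahead of the stores gives the compiler room to schedule
them back to back. If the buffers are declared as arrays of another type
(for example 32-bit words), accessing them through uint64_t pointers can run
into C's strict aliasing rules, so the buffers themselves would ideally be
declared as, or in a union with, 64-bit words.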