john-dev - Re: Parallella: Litecoin mining



Message-ID: <CAFYn=yCuO_C-ce9Wgo2ha-jdEx7Shr-m0ENLmxj-ZHiiAdngFg@mail.gmail.com>
Date: Thu, 29 Aug 2013 14:54:22 -0400
From: Yaniv Sapir <yaniv@...pteva.com>
To: john-dev <john-dev@...ts.openwall.com>
Subject: Re: Parallella: Litecoin mining

On Wed, Aug 28, 2013 at 9:37 PM, Solar Designer <solar@...nwall.com> wrote:

> Does this mean that replacing memcpy() improved the overall speed by as
> much as 15% or so?  If so, this suggests that the code wastes too much
> time copying data, and needs to be revised at higher level (than memcpy()
> itself), in addition to optimizing memcpy().
>
> Also, I just took a look at your currently committed code - your
> memcpy() replacement, at least at source code level, copies data byte by
> byte.  This is very slow, unless the compiler optimizes this into 32-bit
> or 64-bit loads and stores somehow.  I doubt that replacing memcpy()
> with this implementation of blkcpy() provided any speedup (but I could
> be wrong - weird things happen).
>
> Ideally, your blkcpy() should be a partially unrolled loop with LDRD and
> STRD instructions in it, and all of the data needs to be 8 byte aligned.


Absolutely. Newlib's memcpy() implementation copies in bytes or words,
depending on the alignment of the pointers, but never in shorts or doublewords.
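
To illustrate the kind of blkcpy() suggested in the quoted text, here is a
minimal sketch, not the committed code. It assumes that both pointers are
8-byte aligned, that the buffers do not overlap, that the length is a multiple
of 32 bytes, and that the buffers are declared as 64-bit integer arrays. The
function body and its preconditions are hypothetical. Whether the compiler
actually emits LDRD/STRD for the 64-bit accesses is also an assumption that
should be checked in the generated assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical blkcpy(): copies len bytes in 64-bit units, four units
 * (32 bytes) per loop iteration.
 * Assumptions: dst and src are 8-byte aligned, do not overlap, and len
 * is a multiple of 32.
 */
static void blkcpy(void *dst, const void *src, size_t len)
{
    uint64_t *d = dst;
    const uint64_t *s = src;
    size_t n = len / 32;

    /* Check the stated preconditions in debug builds. */
    assert(((uintptr_t)dst & 7) == 0);
    assert(((uintptr_t)src & 7) == 0);
    assert((len & 31) == 0);

    while (n--) {
        /* Issue all four loads before the stores so they can be scheduled
         * back to back. */
        uint64_t a = s[0], b = s[1], c = s[2], e = s[3];
        d[0] = a;
        d[1] = b;
        d[2] = c;
        d[3] = e;
        s += 4;
        d += 4;
    }
}
```

Loading all four values before storing them gives the compiler room to
schedule the memory accesses, and fixing the alignment and length up front
removes the per-call alignment checks that a general-purpose memcpy() needs.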

