Date: Thu, 8 Nov 2018 14:34:51 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: strstr & memcmp stuff

It's come up on libc-alpha that the performance bottleneck for strstr
(in glibc, but the same applies to musl) is likely the forward/reverse
memcmp-like comparison loops:

https://sourceware.org/ml/libc-alpha/2018-11/msg00193.html

The reason the proposed "quicksearch" (a really naive quadratic strstr
with no optimizations but a bad-character shift table) does well on
glibc is because they have extremely optimized strcmp. musl's strcmp
is a trivial C loop, and as a consequence quicksearch performs
abysmally here. (Note that quicksearch itself is not interesting, but
getting the same benefits for a good strstr is.)

Optimizing memcmp has been something we've kinda needed to do for a
long time. What the above suggests to me is that, rather than writing
an optimized standalone memcmp (which has an awful interface, because
it throws away valuable information: the location of the first
mismatch), we should write an optimized memcmp core that returns the
location of the first mismatch, and that can be shared between memcmp,
strstr, and memmem.
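A minimal sketch of what such a mismatch-reporting core might look like, with the public memcmp reduced to a thin wrapper over it. The names memcmp_index and my_memcmp are illustrative only, not musl's actual internals:

```c
#include <stddef.h>

/* Illustrative sketch: a memcmp core that reports where the first
 * mismatch occurred instead of discarding that information. strstr
 * and memmem could use the returned index directly to compute shifts;
 * memcmp only needs it to form its return value. */
static size_t memcmp_index(const unsigned char *a, const unsigned char *b,
                           size_t n)
{
	size_t i;
	for (i = 0; i < n; i++)
		if (a[i] != b[i]) break;
	return i; /* i == n means the buffers matched fully */
}

/* memcmp itself becomes a trivial wrapper over the core. */
int my_memcmp(const void *va, const void *vb, size_t n)
{
	const unsigned char *a = va, *b = vb;
	size_t i = memcmp_index(a, b, n);
	return i == n ? 0 : a[i] - b[i];
}
```

The point of the interface is that the byte loop above is the only part that needs arch-specific optimization; all three callers then share it.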

I'm not sure how much portable optimization can be done (since there's
no reason for the operands to have matching alignment, word-sized
accesses to both might not be possible), but arch-specific optimized
versions are definitely possible. It might be possible/reasonable to
do something like the C memcpy for the case where the operands are
misaligned with respect to each other.

Rich
