Date: Sun, 18 Sep 2016 16:40:30 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: memchr() performance

On Sun, Sep 18, 2016 at 08:54:22PM +0200, Georg Sauthoff wrote:
> (please CC me as I am not subscribed to this ML)
>
> Hello,
>
> fyi, I've done some benchmarking of different memchr() and std::find()
> versions.
>
> I also included the memchr() version from musl.
>
> In general, musl's memchr() implementation doesn't perform better than a
> simple unrolled loop (as used in libstdc++ std::find()) - and that is
> consistent over different CPU generations and architectures.
>
> On recent Intel CPUs it is even slower than a naive implementation:

Are you assuming vectorization of the naive version by the compiler? I
think it's reasonable to assume that on x86_64 but not on 32-bit since
many users build for a baseline ISA that does not have vector ops
(i486 or i586).

> https://gms.tf/stdfind-and-memchr-optimizations.html#measurements
> https://gms.tf/sparc-and-ppc-find-benchmark-results.html
>
> Of course, on x86, other implementations that use SIMD instructions
> perform even better.

I'm aware that musl's memchr (and more generally the related functions
like strchr, strlen, etc.) are not performing great, but it's not
clear to me what the right solution is, since the different approaches
vary A LOT in terms of how they compare with each other depending on
the exact cpu model and compiler. Improving this situation is probably
a big project.

Rich
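
As a rough illustration of the two baselines discussed in this thread, here
is a minimal sketch of a "naive" byte-at-a-time memchr and a 4x unrolled
variant. The names memchr_naive and memchr_unrolled are hypothetical; this
is not the code from musl, from libstdc++, or from the linked benchmarks,
and the unroll factor of 4 is an assumption chosen for illustration.
Whether a compiler vectorizes either loop depends on the target ISA and
the optimization flags, which is the point raised above about i486/i586
baselines.

```c
#include <stddef.h>

/* Naive version: compare one byte per iteration.
 * (Hypothetical name; not taken from any library.) */
void *memchr_naive(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    unsigned char ch = (unsigned char)c;

    for (; n; p++, n--)
        if (*p == ch)
            return (void *)p;
    return NULL;
}

/* Manually unrolled version: test four bytes per loop iteration,
 * then handle the remaining 0..3 bytes one at a time.
 * The factor 4 is an assumption made for this sketch. */
void *memchr_unrolled(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    unsigned char ch = (unsigned char)c;

    for (; n >= 4; p += 4, n -= 4) {
        if (p[0] == ch) return (void *)p;
        if (p[1] == ch) return (void *)(p + 1);
        if (p[2] == ch) return (void *)(p + 2);
        if (p[3] == ch) return (void *)(p + 3);
    }
    for (; n; p++, n--)
        if (*p == ch)
            return (void *)p;
    return NULL;
}
```

Both functions follow the memchr() contract (the search byte is converted
to unsigned char, and NULL is returned when no match is found within n
bytes), so either can be dropped into a benchmark harness in place of the
library call for comparison.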