Date: Wed, 10 Jul 2013 13:49:50 -0700
From: Nathan McSween <>
Subject: Re: Thinking about release

I would expect the iterate-per-char-until-zero step to take the most time.
Even if GCC vectorized the loop without SIMD, it would still need to step
through the bytes of the word that contains the zero in order to find it.
Current musl does this as well, though.
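
To illustrate the point, here is a rough sketch of a word-at-a-time strlen
with the per-byte tail scan. It is not a copy of musl's source; the name
sketch_strlen and the macro names are assumptions made for this example. The
word loop only reports that some byte in the word is zero, so a final
byte-by-byte pass is still needed to locate it.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative names, not taken from any particular libc. */
#define ONES  ((size_t)-1 / UCHAR_MAX)        /* 0x0101...01 */
#define HIGHS (ONES * (UCHAR_MAX / 2 + 1))    /* 0x8080...80 */
/* Nonzero iff at least one byte of x is zero. */
#define HASZERO(x) (((x) - ONES) & ~(x) & HIGHS)

size_t sketch_strlen(const char *s)
{
    const char *p = s;

    /* Step byte by byte until p is word-aligned. */
    for (; (uintptr_t)p % sizeof(size_t); p++)
        if (!*p)
            return (size_t)(p - s);

    /* Step one word at a time while no byte in the word is zero.
     * Note: this may read a few bytes past the terminator, which is
     * outside what ISO C guarantees; it relies on an aligned word
     * never crossing a page boundary. */
    for (;;) {
        size_t w;
        memcpy(&w, p, sizeof w);
        if (HASZERO(w))
            break;
        p += sizeof w;
    }

    /* The per-char tail: find which byte of the word is the zero. */
    for (; *p; p++)
        ;
    return (size_t)(p - s);
}
```

The last loop runs at most sizeof(size_t) - 1 iterations, so its cost
matters mostly for short strings.
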
On Jul 10, 2013 1:34 PM, "Andre Renaud" <> wrote:

> >> What also might be worth testing is whether GCC can compete if you
> >> just give it a naive loop (not the fancy pseudo-vectorized stuff
> >> currently in musl) and good CFLAGS. I know on x86 I was able to beat
> >> the fanciest asm strlen I could come up with simply by writing the
> >> naive loop in C and unrolling it a lot.
> >
> >
> > Duff's device!
> That was exactly my first idea too, but interestingly it turns out not
> to have really added any performance improvement. Looking at the
> assembler, with -O3, gcc does a pretty good job of unrolling as it is.
> Regards,
> Andre
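
For reference, the naive loop and a hand-unrolled variant discussed in the
quoted text might look roughly like the sketch below. The function names and
the unroll factor of 4 are assumptions for illustration, not the code that
was actually benchmarked in this thread. As Andre notes, GCC at -O3 may
already unroll the naive form by itself, so the manual version is not
guaranteed to be any faster.

```c
#include <stddef.h>

/* Naive loop: one byte per iteration. */
size_t naive_strlen(const char *s)
{
    const char *p = s;
    while (*p)
        p++;
    return (size_t)(p - s);
}

/* Same loop unrolled by hand, four bytes per iteration.
 * Each byte is tested in order, so nothing past the
 * terminator is ever read. */
size_t unrolled_strlen(const char *s)
{
    const char *p = s;
    for (;;) {
        if (!p[0]) return (size_t)(p - s);
        if (!p[1]) return (size_t)(p - s) + 1;
        if (!p[2]) return (size_t)(p - s) + 2;
        if (!p[3]) return (size_t)(p - s) + 3;
        p += 4;
    }
}
```

Comparing the assembler output of the two functions at the same
optimization level is a quick way to check whether the manual unrolling
changes anything on a given target.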
