Date: Sat, 27 May 2017 20:00:15 +0000 (UTC)
From: Brad Conroy <technosaurus@...oo.com>
To:  <musl@...ts.openwall.com>
Subject: SSE2 strcasecmp

The recent discussion of tolower performance prompted me
to dig out my SSE2 version of strcasecmp.  It has about the same
number of instructions as musl's generic strcasecmp, although
the compiled code is slightly larger because of the SIMD instructions.

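#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>     /* size_t */

/* Relies on GCC/Clang vector extensions (|, &, ^ on __m128i) and
   __builtin_ctz.  Each iteration loads 16 bytes, so it can read up to
   15 bytes past a terminator and could fault if that crosses into an
   unmapped page. */
int strcasecmp_sse2(const char *s0, const char *s1){
  const __m128i *l =(const __m128i*)s0, *r=(const __m128i*)s1;
  __m128i all0 = _mm_setzero_si128(), all1 = _mm_set1_epi8(-1),
          allA = _mm_set1_epi8('A'-1), allZ = _mm_set1_epi8('Z'+1),
          all32 = _mm_set1_epi8(1<<5), lcl, lcr, tmp;
  union{__m128i v;unsigned char c[16];} ul, ur;
  unsigned m;
  size_t i = 0;
  do{
    lcl = _mm_loadu_si128 (l+i);
    lcr = _mm_loadu_si128 (r+i);
    tmp = _mm_cmpeq_epi8(lcl,all0);           /* NUL bytes in s0 */
    /* lowercase: set bit 5 only in bytes within 'A'..'Z' */
    lcl |= (_mm_cmpgt_epi8(lcl,allA) & _mm_cmplt_epi8(lcl,allZ) & all32);
    lcr |= (_mm_cmpgt_epi8(lcr,allA) & _mm_cmplt_epi8(lcr,allZ) & all32);
    tmp |= (_mm_cmpeq_epi8(lcl,lcr) ^ all1);  /* bytes that still differ */
    ++i;
  }while(!(m=_mm_movemask_epi8(tmp)));
  /* compare the first flagged byte as unsigned char so the sign of the
     result is right for bytes >= 0x80 */
  ul.v = lcl; ur.v = lcr; m = __builtin_ctz(m);
  return ul.c[m] - ur.c[m];
}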
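
For checking the SSE2 routine, a scalar model of the same per-byte fold may be handy. The sketch below is only an illustration: fold_ascii and strcasecmp_scalar are hypothetical names, not part of musl, and the code assumes ASCII-only case folding just as the vector version does.

```c
#include <stddef.h>

/* Scalar model of the per-byte fold: OR in bit 5 only for 'A'..'Z'.
   is_upper plays the role of the 0xFF/0x00 byte mask that the SIMD
   compares produce. */
static unsigned char fold_ascii(unsigned char c)
{
    unsigned char is_upper = (unsigned char)-(c > 'A' - 1 && c < 'Z' + 1);
    return c | (is_upper & (1 << 5));
}

/* Hypothetical byte-at-a-time reference: stop at the first NUL in s0
   or at the first pair of bytes that differ after folding. */
static int strcasecmp_scalar(const char *s0, const char *s1)
{
    const unsigned char *l = (const unsigned char *)s0;
    const unsigned char *r = (const unsigned char *)s1;
    size_t i = 0;
    while (l[i] && fold_ascii(l[i]) == fold_ascii(r[i]))
        ++i;
    return fold_ascii(l[i]) - fold_ascii(r[i]);
}
```

Comparing the sign of the two results over random inputs, with the buffers padded so that 16-byte loads stay in bounds, should be enough to catch a mistake in the vector masks.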
