Message-ID: <20260613011842.GL3520958@port70.net>
Date: Sat, 13 Jun 2026 03:18:42 +0200
From: Szabolcs Nagy <nsz@...t70.net>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: using builtins within musl

* Rich Felker <dalias@...c.org> [2026-06-12 19:11:12 -0400]:
> On Fri, Jun 12, 2026 at 11:43:41PM +0200, Szabolcs Nagy wrote:
> > builtins are supposed to improve code generation and are most useful
> > when a library call can be lowered to a few instructions. e.g. memcpy
> > and memset with a fixed small size can be a few load/store/move ops.
> > 
> > unfortunately gcc creates a mess on x86_64 of code like
> > 
> >  if (n < 64) __builtin_memcpy(d,s,n);
> >  if (n < 64) __builtin_memset(p,0,n);
> > 
> > libc.so .text change:
> >     arch   diff   size
> >   x86_64: +4525 667424
> >  riscv64:  +724 613171
> >  aarch64:  -432 679747
> >      arm:  -152 658809
> 
> Can you give a brief summary of what gcc does such a bad job of on
> x86_64? Does it inline something with a bunch of branching cases for
> different sizes or something? The results on the other archs don't
> look so bad.

gcc on x86_64 expands more of these calls inline. i assume the result
is fast, but it is not the best for size, e.g. try on godbolt:

void foo(char *d, char *s, long n)
{
    if (n < 64) __builtin_memcpy(d,s,n);
}
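
a minimal sketch for comparing the two code shapes locally instead of on
godbolt. copy_builtin is the same shape as foo above. copy_call,
memcpy_ptr and check_same are hypothetical names that are not in musl:
copy_call goes through a volatile function pointer only to force a real
call, and check_same just confirms both variants behave the same for
0 <= n <= 64.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* same shape as foo() above: gcc is free to expand the builtin inline */
void copy_builtin(char *d, const char *s, long n)
{
    if (n < 64) __builtin_memcpy(d, s, n);
}

/* hypothetical comparison variant, not from musl: the call goes through
   a volatile function pointer, so the compiler has to emit an indirect
   call and cannot expand the copy inline */
static void *(*volatile memcpy_ptr)(void *, const void *, size_t) = memcpy;

void copy_call(char *d, const char *s, long n)
{
    if (n < 64) memcpy_ptr(d, s, n);
}

/* both variants must copy the same bytes for every 0 <= n < 64 and
   leave the destination untouched for n == 64 */
void check_same(void)
{
    char src[64], a[64], b[64];
    for (int i = 0; i < 64; i++) src[i] = (char)(i + 1);
    for (long n = 0; n <= 64; n++) {
        memset(a, 0, sizeof a);
        memset(b, 0, sizeof b);
        copy_builtin(a, src, n);
        copy_call(b, src, n);
        assert(memcmp(a, b, sizeof a) == 0);
    }
}
```

compiling this with -c and looking at the two functions with objdump -d
(or comparing .text with size) shows what each target expands. the
indirect call in copy_call is a little larger than a direct call would
be, so treat it as a rough baseline only.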

on x86_64 the .text changes sorted by size diff:

  diff   size file
   -53   3836 res_msend.lo
   -44    497 gethostbyaddr_r.lo
   -33    548 sigaction.lo
   -26    387 netlink.lo
   -23    696 pthread_cancel.lo
   -19   5678 malloc.lo
   -13    356 getpw_r.lo
   -12    116 __stack_chk_fail.lo
   -12    193 mkdtemp.lo
...
  +104   6621 crypt_blowfish.lo
  +107    337 textdomain.lo
  +112   1656 getifaddrs.lo
  +116    650 execvp.lo
  +128   2091 getnameinfo.lo
  +149   1244 dn_comp.lo
  +149   3908 vfscanf.lo
  +150  19487 regcomp.lo
  +182    439 calloc.lo
  +182    439 libc_calloc.lo
  +187   4414 __tz.lo
  +204   2087 qsort.lo
  +204    896 if_nameindex.lo
  +240   2172 dcngettext.lo
  +257   3433 crypt_sha256.lo
  +357   1232 locale_map.lo
  +372   3179 crypt_md5.lo
  +616   4848 crypt_sha512.lo
