Message-ID: <20260612231112.GK27423@brightrain.aerifal.cx>
Date: Fri, 12 Jun 2026 19:11:12 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: using builtins within musl

On Fri, Jun 12, 2026 at 11:43:41PM +0200, Szabolcs Nagy wrote:
> i experimented with using compiler builtins in musl after it was shown
> that builtin memcpy allows optimising qsort for various element sizes.
> 
> i expected code gen to get more streamlined. for math functions this
> mostly works when calls are inlined as a single instruction. for string
> functions cold and hot code should ideally be handled differently, but
> the compiler does not know which calls are hot so it can unnecessarily
> bloat the code with inlines.
> 
> i don't know if this is worth it, attached the patches for the record.

> From 1f3b9db8c25bb5996ca5c5c2e988d1babcc5daa6 Mon Sep 17 00:00:00 2001
> From: Szabolcs Nagy <nsz@...t70.net>
> Date: Mon, 25 May 2026 08:35:41 +0000
> Subject: [PATCH 1/4] string.h: use builtin memcpy and memset internally
> 
> musl is compiled for a freestanding execution environment so the
> compiler does not make assumptions about standard API calls. If an
> API is internally used according to the public interface contract,
> i.e. not relying on musl specific behaviours, then enabling compiler
> builtin for it is safe.
> 
> builtins are supposed to improve code generation and are most useful
> when a library call can be lowered to a few instructions. e.g. memcpy
> and memset with a fixed small size can be a few load/store/move ops.
> 
> unfortunately gcc creates a mess on x86_64 of code like
> 
>  if (n < 64) __builtin_memcpy(d,s,n);
>  if (n < 64) __builtin_memset(p,0,n);
> 
> libc.so .text change:
>     arch   diff   size
>   x86_64: +4525 667424
>  riscv64:  +724 613171
>  aarch64:  -432 679747
>      arm:  -152 658809

Can you give a brief summary of what gcc does such a bad job of on
x86_64? Does it inline something with a bunch of branching cases for
different sizes or something? The results on the other archs don't
look so bad.

Rich
