Date: Wed, 22 Apr 2015 22:23:09 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: building musl libc.so with gcc -flto

On Wed, Apr 22, 2015 at 03:48:52PM -0700, Andre McCurdy wrote:
> Hi all,
> 
> Below are some observations from building musl libc.so with gcc's -flto
> (link time optimization) option.

Interesting!

> 1) With today's master (afbcac68), adding -flto to CFLAGS causes the
> build to fail:
> 
>  | `_dlstart_c' referenced in section `.text' of /tmp/cc8ceNIy.ltrans0.ltrans.o: defined in discarded section `.text' of src/ldso/dlstart.lo (symbol from plugin)
>  | collect2: error: ld returned 1 exit status
>  | make: *** [lib/libc.so] Error 1
> 
> Reverting f1faa0e1 (make _dlstart_c function use hidden visibility)
> seems to be a workaround.

I think the problem is that LTO is garbage collecting "unused" symbols
before it gets to the step of linking with asm for which there is no
IR code, thereby losing anything that's only referenced from asm. A
better workaround might be to define _dlstart_c with a different name
as a non-hidden function (e.g. call it __dls1) and then make
_dlstart_c a hidden alias for it via:

__attribute__((__visibility__("hidden")))
void _dlstart_c(size_t *, size_t *);

weak_alias(__dls1, _dlstart_c);

If you get a chance to try that, let me know if it works. Another
option might be adding -Wl,-u,_dlstart_c to LDFLAGS.
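
For illustration, here is a minimal self-contained sketch of the alias
approach. The weak_alias macro is written out on the assumption that it
matches the one in musl's internal headers, and the body of __dls1 is a
placeholder, not the real stage-1 dynamic linker code.

```c
#include <stddef.h>

/* Assumed to mirror musl's internal weak_alias macro (GCC/ELF only). */
#define weak_alias(old, new) \
	extern __typeof(old) new __attribute__((__weak__, __alias__(#old)))

/* The real definition keeps default (non-hidden) visibility.
 * __dls1 is the name suggested above; the body is a placeholder
 * for illustration only. */
void __dls1(size_t *sp, size_t *dynv)
{
	sp[0] = dynv[0];
}

/* Hidden declaration of the name that the asm entry point references. */
__attribute__((__visibility__("hidden")))
void _dlstart_c(size_t *, size_t *);

/* _dlstart_c becomes a second, hidden name for the same code. */
weak_alias(__dls1, _dlstart_c);
```

Whether this actually survives LTO's symbol elimination is untested; the
sketch only shows how the declarations would fit together.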

> 2) With f1faa0e1 reverted, the build succeeds, but with a warning about
> differing declarations for dummy_tsd and __pthread_tsd_main:
> 
>  | src/thread/pthread_create.c:169:1: warning: type of '__pthread_tsd_main' does not match original declaration
>  |  weak_alias(dummy_tsd, __pthread_tsd_main);
>  |  ^
>  | src/thread/pthread_key_create.c:4:7: note: previously declared here
>  |  void *__pthread_tsd_main[PTHREAD_KEYS_MAX] = { 0 };
>  |        ^

This should be harmless but perhaps there's a better way it could be
done.
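
A rough sketch of where the mismatch probably comes from, assuming
dummy_tsd is a one-element array (its declaration is not shown in the
warning) and that weak_alias is defined as written below: the alias
takes its type from dummy_tsd, while pthread_key_create.c declares the
same symbol as an array of PTHREAD_KEYS_MAX pointers. Without LTO the
two declarations are never compared; with LTO the compiler sees both.

```c
/* Assumed to mirror musl's internal weak_alias macro (GCC/ELF only). */
#define weak_alias(old, new) \
	extern __typeof(old) new __attribute__((__weak__, __alias__(#old)))

/* Assumption: the fallback object is a one-element array. */
static void *dummy_tsd[1] = { 0 };

/* The alias inherits the type of dummy_tsd, i.e. void *[1]. */
weak_alias(dummy_tsd, __pthread_tsd_main);

/* A separate translation unit (pthread_key_create.c in the warning
 * above) holds the strong definition with a different array length:
 *
 *   void *__pthread_tsd_main[PTHREAD_KEYS_MAX] = { 0 };
 */
```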

> 3) Overall build times are similar, but achieving the best results
> with -flto relies on manually duplicating any 'make -j' options for
> the linker. Times below are from a quad core + hyperthreading system
> running 'make -j8 lib/libc.so':
> 
>   original : real 0m8.501s
>   -flto    : real 0m18.034s
>   -flto=4  : real 0m9.885s
>   -flto=8  : real 0m8.876s

Yeah that would be expected.

> 4) Changes in code size seem to be minor, except when compiling with
> -O3, where the code gets noticeably larger (presumably due to -flto
> giving a lot more scope for inlining?). Results below are from building
> with gcc 4.9.2 for 32bit x86:
> 
>     text    data     bss     dec     hex filename
> 
>   536405    1416    8800  546621   8573d lib/libc.so      ( -Os )
>   536324    1324    8780  546428   8567c lib/libc.so.lto  ( -Os )
> 
>   612028    1416    8928  622372   97f24 lib/libc.so      ( -O2 )
>   611701    1304    9132  622137   97e39 lib/libc.so.lto  ( -O2 )
> 
>   687708    1416    8992  698116   aa704 lib/libc.so      ( -O3 )
>   713704    1312    9208  724224   b0d00 lib/libc.so.lto  ( -O3 )

Also seems rather like what I would expect. Any idea if performance is
significantly better? It's not very comprehensive but you could try
libc-bench.

Rich
