Date: Mon, 2 Nov 2015 17:36:49 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] configure: add gcc flags for better link-time
 optimization

On Sun, Nov 01, 2015 at 02:56:58PM -0500, Rich Felker wrote:
> On Fri, Oct 23, 2015 at 02:30:26PM +0200, Denys Vlasenko wrote:
> > +#
> > +# Put every function and data object into its own section:
> > +# .text.funcname, .data.var, .rodata.const_struct, .bss.zerovar
> > +#
> > +# Previous optimization isn't working too well by itself
> > +# because data objects aren't living in separate sections,
> > +# they are all grouped in one .data and one .bss section per *.o file.
> > +# With -ffunction/data-sections, section sorting eliminates more padding.
> > +#
> > +# Object files in static *.a files will also have their functions
> > +# and data objects each in its own section.
> > +#
> > +# This enables programs statically linked with -Wl,--gc-sections
> > +# to perform "section garbage collection": drop unused code and data
> > +# not on per-*.o-file basis, but on per-function and per-object basis.
> > +# This is a big thing: --gc-sections sometimes eliminates several percent
> > +# of unreachable code and data in final executable.
> > +#
> > +tryflag CFLAGS_AUTO -ffunction-sections
> > +tryflag CFLAGS_AUTO -fdata-sections
> > +
> > +#
> 
> This is not just an optimization; it is going to save us from a
> horrible class of compiler/assembler bugs that threatened to force
> dropping support for all non-bleeding-edge toolchains:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68178
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66609
> https://sourceware.org/bugzilla/show_bug.cgi?id=18561
> 
> By putting functions/objects in their own sections, the illegal but
> widespread assembler 'optimization' of resolving differences between
> symbols to a constant when one or both of the symbols has a weak
> definition is suppressed, simply because differences of this form are
> never constants when they cross sections.
> 
> As such I want to go ahead and apply this regardless of optimization
> issues, but I think we should update the comments and commit message
> to reflect that this is also working around serious toolchain issues.
> I hope to get to it soon now; working on some other things at the
> moment.
> 
> BTW thanks a lot for raising the idea of using these options. If it
> hadn't been for your pending patch I probably would never have thought
> of this as a solution to the toolchain problems above.

Unfortunately there's an issue blocking this patch: on some archs, the
crt_arch.h asm fragments assume that a "short" branch can reach
_start_c/_dlstart_c. With -ffunction-sections that no longer holds;
the linker can move the entry point and the C start code arbitrarily
far apart. To fix this we either need to use a fully general branch to
reach the C code, or have file-specific suppression of
-ffunction-sections for crt1, dlstart, etc. I'd rather just fix the
asm not to make assumptions about shortness -- some of these
assumptions are dangerously close to being wrong at -O0 anyway -- but
to do that I need to audit all the crt_arch.h files, find the affected
ones, and fix them. I'll start looking and see how bad it is.

Rich
