Date: Thu, 10 Sep 2020 10:18:05 +0900
From: Masahiro Yamada <masahiroy@...nel.org>
To: Sami Tolvanen <samitolvanen@...gle.com>
Cc: Will Deacon <will@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Kees Cook <keescook@...omium.org>,
        Nick Desaulniers <ndesaulniers@...gle.com>,
        clang-built-linux <clang-built-linux@...glegroups.com>,
        Kernel Hardening <kernel-hardening@...ts.openwall.com>,
        linux-arch <linux-arch@...r.kernel.org>,
        linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
        Linux Kbuild mailing list <linux-kbuild@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-pci@...r.kernel.org, X86 ML <x86@...nel.org>
Subject: Re: [PATCH v2 00/28] Add support for Clang LTO

On Wed, Sep 9, 2020 at 8:46 AM Sami Tolvanen <samitolvanen@...gle.com> wrote:
>
> On Sun, Sep 06, 2020 at 09:24:38AM +0900, Masahiro Yamada wrote:
> > On Fri, Sep 4, 2020 at 5:30 AM Sami Tolvanen <samitolvanen@...gle.com> wrote:
> > >
> > > This patch series adds support for building x86_64 and arm64 kernels
> > > with Clang's Link Time Optimization (LTO).
> > >
> > > In addition to performance, the primary motivation for LTO is
> > > to allow Clang's Control-Flow Integrity (CFI) to be used in the
> > > kernel. Google has shipped millions of Pixel devices running three
> > > major kernel versions with LTO+CFI since 2018.
> > >
> > > Most of the patches are build system changes for handling LLVM
> > > bitcode, which Clang produces with LTO instead of ELF object files,
> > > postponing ELF processing until a later stage, and ensuring initcall
> > > ordering.
> > >
> > > Note that patches 1-4 are not directly related to LTO, but are
> > > needed to compile LTO kernels with ToT Clang, so I'm including them
> > > in the series for your convenience:
> > >
> > >  - Patches 1-3 are required for building the kernel with ToT Clang,
> > >    and IAS, and patch 4 is needed to build allmodconfig with LTO.
> > >
> > >  - Patches 3-4 are already in linux-next, but not yet in 5.9-rc.
> > >
> >
> >
> > I still do not understand how this patch set works.
> > (only me?)
> >
> > Please let me ask fundamental questions.
> >
> >
> >
> > I applied this series on top of Linus' tree,
> > and compiled for ARCH=arm64.
> >
> > I compared the kernel size with/without LTO.
> >
> >
> >
> > [1] No LTO  (arm64 defconfig, CONFIG_LTO_NONE)
> >
> > $ llvm-size   vmlinux
> >    text    data     bss     dec     hex filename
> > 15848692 10099449 493060 26441201 19375f1 vmlinux
> >
> >
> >
> > [2] Clang LTO  (arm64 defconfig + CONFIG_LTO_CLANG)
> >
> > $ llvm-size   vmlinux
> >    text    data     bss     dec     hex filename
> > 15906864 10197445 490804 26595113 195cf29 vmlinux
> >
> >
> > I compared the size of raw binary, arch/arm64/boot/Image.
> > Its size increased too.
> >
> >
> >
> > So, in my experiment, enabling CONFIG_LTO_CLANG
> > increases the kernel size.
> > Is this correct?
>
> Yes. LTO does produce larger binaries, mostly due to function
> inlining between translation units, I believe. The compiler people
> can probably give you a more detailed answer here. Without -mllvm
> -import-instr-limit, the binaries would be even larger.
>
> > One more thing, could you teach me
> > how Clang LTO optimizes the code against
> > relocatable objects?
> >
> >
> >
> > When I learned Clang LTO first, I read this document:
> > https://llvm.org/docs/LinkTimeOptimization.html
> >
> > It is easy to confirm the final executable
> > does not contain foo2, foo3...
> >
> >
> >
> > In contrast to userspace programs,
> > kernel modules are basically relocatable objects.
> >
> > Does Clang drop unused symbols from relocatable objects?
> > If so, how?
>
> I don't think the compiler can legally drop global symbols from
> relocatable objects, but it can rename and possibly even drop static
> functions.


Compilers can already drop unused static functions without LTO.
In fact, an unused static function triggers a compiler warning
(-Wunused-function), so such code should be cleaned up anyway.



> This is why we need global wrappers for initcalls, for
> example, to have stable symbol names.
>
> Sami



At first, I thought the motivation for LTO
was to remove unused global symbols and
to perform further optimization.


That is true for userspace programs.
In fact, the example in
https://llvm.org/docs/LinkTimeOptimization.html
produces a smaller binary.


In contrast, this patch set produces a bigger kernel
because LTO cannot remove unused global symbols.

So, I do not understand what the benefit is.


Is inlining beneficial?
I am not sure.


Documentation/process/coding-style.rst
"15) The inline disease"
mentions that inlining is not always
a good thing.


Overall, I still do not understand
the motivation for this patch set.


-- 
Best Regards
Masahiro Yamada
