kernel-hardening - Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace

Message-ID: <CA+55aFzcNfSKc2uU5=okU_g0utnYKoeOueJb5enoP78mqMBZPQ@mail.gmail.com>
Date: Tue, 6 Mar 2018 15:09:30 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Arnd Bergmann <arnd@...db.de>, Ard Biesheuvel <ard.biesheuvel@...aro.org>, 
	Daniel Micay <danielmicay@...il.com>, Ingo Molnar <mingo@...nel.org>, 
	Kees Cook <keescook@...omium.org>, Dave Hansen <dave.hansen@...ux.intel.com>, 
	Alexander Popov <alex.popov@...ux.com>, 
	Kernel Hardening <kernel-hardening@...ts.openwall.com>, PaX Team <pageexec@...email.hu>, 
	Brad Spengler <spender@...ecurity.net>, Andy Lutomirski <luto@...nel.org>, 
	Tycho Andersen <tycho@...ho.ws>, Laura Abbott <labbott@...hat.com>, Mark Rutland <mark.rutland@....com>, 
	Borislav Petkov <bp@...en8.de>, Richard Sandiford <richard.sandiford@....com>, 
	Thomas Gleixner <tglx@...utronix.de>, "H . Peter Anvin" <hpa@...or.com>, 
	Peter Zijlstra <a.p.zijlstra@...llo.nl>, "Dmitry V . Levin" <ldv@...linux.org>, 
	Emese Revfy <re.emese@...il.com>, Jonathan Corbet <corbet@....net>, 
	Andrey Ryabinin <aryabinin@...tuozzo.com>, 
	"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>, Thomas Garnier <thgarnie@...gle.com>, 
	Andrew Morton <akpm@...ux-foundation.org>, Alexei Starovoitov <ast@...nel.org>, Josef Bacik <jbacik@...com>, 
	Masami Hiramatsu <mhiramat@...nel.org>, Nicholas Piggin <npiggin@...il.com>, 
	Al Viro <viro@...iv.linux.org.uk>, "David S . Miller" <davem@...emloft.net>, 
	Ding Tianhong <dingtianhong@...wei.com>, David Woodhouse <dwmw@...zon.co.uk>, 
	Josh Poimboeuf <jpoimboe@...hat.com>, Dominik Brodowski <linux@...inikbrodowski.net>, 
	Juergen Gross <jgross@...e.com>, Greg Kroah-Hartman <gregkh@...uxfoundation.org>, 
	Dan Williams <dan.j.williams@...el.com>, Mathias Krause <minipli@...glemail.com>, 
	Vikas Shivappa <vikas.shivappa@...ux.intel.com>, Kyle Huey <me@...ehuey.com>, 
	Dmitry Safonov <dsafonov@...tuozzo.com>, Will Deacon <will.deacon@....com>, X86 ML <x86@...nel.org>, 
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()

On Tue, Mar 6, 2018 at 2:52 PM, Steven Rostedt <rostedt@...dmis.org> wrote:
>
> I get your point. Basically you are saying that the language should
> have forced all local variables to be initialized to zero in the spec,
> and the compiler is free to optimize that initialization out if the
> variable is always initialized first.

Exactly.

> Just like it would optimize the below.
>
> int g(int c)
> {
>         int i = 0;
>
>         if (c < 10)
>                 i = 1;
>         else
>                 i = 2;
>         return i;
> }
>
> There's no reason to initialize i to zero because it will never return
> zero. But what you are saying is to make that implicit by just
> declaring 'i'.

Yes.

And at the same time, it is certainly true that a compiler cannot
*always* remove the unnecessary initialization.

But with any half-way modern compiler, all the _normal_ cases would be
no-brainers, and the case where the compiler couldn't do so really is
code that almost certainly wants that initialization anyway, because a
_human_ generally wouldn't see that it's initialized either.
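
As a rough illustration of the "normal" case, here is the quoted function again with comments added. Both branches assign 'i' before it is read, so an optimizing compiler can typically treat the initial store as dead and drop it. This is a sketch of the idea, not a claim about what any particular compiler emits.

```c
/* Every path assigns i before it is returned, so the "= 0" is a dead store. */
int g(int c)
{
        int i = 0;      /* redundant: overwritten on both branches below */

        if (c < 10)
                i = 1;
        else
                i = 2;
        return i;       /* only ever 1 or 2 */
}
```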

The obvious counter-example is literally

> int g(int c)
> {
>         int i = 0;
>
>         initialize_variable(&i);
>
>         if (c < 10)

where a *human* can tell that "hey, that initialization to zero is
pointless, because 'i' is clearly being initialized by that
'initialize_variable()' function call".

But the compiler often cannot see that "obvious" initialization, and
would have to spend the extra instruction to actually do that zero
initialization.
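
A minimal sketch of that counter-example, filled out so it compiles on its own. The body of 'initialize_variable()' and everything after the 'if' are assumptions made up for illustration; in the scenario described, the helper would live in a different translation unit, which is what would hide its behavior from the compiler.

```c
/*
 * Hypothetical helper. Imagine it is defined in another file, so the
 * compiler building g() cannot prove that it always writes *p.
 * It is defined here only so the sketch is self-contained.
 */
void initialize_variable(int *p)
{
        *p = 42;        /* arbitrary illustrative value */
}

int g(int c)
{
        int i = 0;      /* looks pointless to a human, but may cost a store */

        initialize_variable(&i);

        if (c < 10)     /* assumed continuation of the truncated example */
                i += 1;
        return i;
}
```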

So I'm not saying that it would always be entirely free to just have
the safe default.

But in the context of the kernel, I think that we can probably agree
that "oops, we've had exactly the kind of bug where a function
_didn't_ end up initializing the variable every time even though a
human obviously thought it did" actually has happened.
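
A hedged sketch of that bug class, with made-up names ('lookup', 'read_entry'): the helper writes its output only on success, and the caller forgets to check the return value. With the zero initializer the error path yields a well-defined 0; without it, the caller would return whatever happened to be on the stack.

```c
#include <stddef.h>

/* Hypothetical helper: writes *out only when idx is in range. */
int lookup(const int *table, size_t len, size_t idx, int *out)
{
        if (idx >= len)
                return -1;      /* error path: *out is left untouched */
        *out = table[idx];
        return 0;
}

/* Hypothetical caller that "obviously" initializes val via lookup(). */
int read_entry(const int *table, size_t len, size_t idx)
{
        int val = 0;    /* the safe default; remove it and the error path is UB */

        lookup(table, len, idx, &val);  /* return value ignored: the bug */
        return val;
}
```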

Now, as mentioned, aggregate types are different.

They are _less_ different for the kernel than they are for user
programs (because we have to be careful about aggregate types on the
stack anyway just for stack size reasons), but they do have more
issues with the implicit initializers.

For aggregate types, initializers are obviously more expensive to
begin with, and the "initialize by passing a reference to an external
function" is much more common too. So there's a double hit.
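
To make the "double hit" concrete, here is a small sketch with a hypothetical 'struct request' and a hypothetical 'fill_request()' standing in for the external function. The initializer zeroes the whole struct, and the call then overwrites every byte of it; if the callee were in another file, the compiler could not tell that the zeroing was wasted.

```c
#include <string.h>

struct request {                /* hypothetical aggregate */
        int  id;
        char payload[64];
};

/* Hypothetical filler; imagine it is defined in another translation unit. */
void fill_request(struct request *r, int id)
{
        r->id = id;
        memset(r->payload, 'x', sizeof(r->payload));
}

int handle(int id)
{
        struct request r = { 0 };       /* zeroes the whole struct... */

        fill_request(&r, id);           /* ...which is then fully overwritten */
        return r.id + r.payload[0];
}
```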

At the same time, aggregate types are obviously where we tend to have
most problems, and I really think we should strive to avoid even
medium-sized aggregate types in the kernel anyway.

Which is why I'm much more willing to take a hit on those kinds of
things (that I think should be rare), than add things like "let's just
clear the stack after every system call" kinds of things.

But I do agree that for aggregate types, we almost certainly do want
to have some way to flag "this is explicitly not initialized".

Instead, right now, we're in the reverse situation, where we have to
add explicit initializers.

I think we'd be in a better situation if the *default* semantics was
the safe and well-defined behavior, and we'd have to do extra things
to get the undefined uninitialized behavior for when we know better,
and we really really care.
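
What that opt-out could look like, as a sketch only: 'EXPLICITLY_UNINIT' below is a made-up marker that expands to nothing, not an existing compiler or kernel feature. It just shows the shape of the idea, where the unusual case (skipping initialization of a larger buffer) is the one that needs the extra annotation.

```c
/* Hypothetical annotation: a no-op here, it only marks the intent. */
#define EXPLICITLY_UNINIT

#define BUF_LEN 256

int sum_squares(int n)
{
        EXPLICITLY_UNINIT int buf[BUF_LEN];     /* opted out: we know better */
        int total = 0;                          /* scalars keep the safe default */
        int i;

        if (n > BUF_LEN)
                n = BUF_LEN;
        for (i = 0; i < n; i++)
                buf[i] = i * i;                 /* every element is written... */
        for (i = 0; i < n; i++)
                total += buf[i];                /* ...before it is read */
        return total;
}
```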

              Linus