kernel-hardening - Re: [PATCH 0/2] introduce post-init read-only memory

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5656D779.17137.12A1EA53@pageexec.freemail.hu>
Date: Thu, 26 Nov 2015 10:57:13 +0100
From: "PaX Team" <pageexec@...email.hu>
To: Ingo Molnar <mingo@...nel.org>
CC: kernel-hardening@...ts.openwall.com,
        Mathias Krause <minipli@...glemail.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Kees Cook <keescook@...omium.org>,
        Andy Lutomirski <luto@...capital.net>, Ingo Molnar <mingo@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>, "H. Peter Anvin" <hpa@...or.com>,
        x86-ml <x86@...nel.org>, Arnd Bergmann <arnd@...db.de>,
        Michael Ellerman <mpe@...erman.id.au>, linux-arch@...r.kernel.org,
        Emese Revfy <re.emese@...il.com>
Subject: Re: [PATCH 0/2] introduce post-init read-only memory

On 26 Nov 2015 at 9:54, Ingo Molnar wrote:

> * PaX Team <pageexec@...email.hu> wrote:
> 
> > actually the kernel could silently recover from this given how the page fault 
> > handler could easily determine that the fault address fell into the 
> > data..read_only section and just silently undo the read-only property, log the 
> > event to dmesg and retry the faulting access.
> 
> So a safer method would be to decode the faulting instruction, to skip it by 
> fixing up the return RIP and to log the event. It would be mostly equivalent to 
> trying to write to ROM (which get ignored as well), so it's a recoverable (and 
> debuggable) event.

if by skipping you mean ignoring the write attempt then it's not a good idea
as it has a good chance to cause unexpected behaviour down the line.

e.g., imagine that the write was to a function pointer (even an entire ops
structure) or a boolean that controls some important feature for after-init
code. ignoring/dropping such writes could cause all kinds of logic bugs (if
not worse).

my somewhat related war story is that i once tried to constify machine_ops
(both the struct and the variable of the same name) directly and just forced
the writes in kvm/xen/etc via type casts. now i knew it was all undefined
behaviour but i didn't expect gcc to take advantage of it but it did (const
propagated the *initial* fptr values into the indirect calls by turning them
into direct calls) and which in turn prevented proper reboots for guests
(an event which obviously happens much later after init/boot to the great
puzzlement of end users and myself).

misusing __read_only and ignoring write attempts would effectively produce
the same misbehaviour as above so i strongly advise against it.

now i understand that the motivation is probably to preserve the read-only
property despite wrong use of __read_only but for this i'd suggest to simply
not make the silent recovery the default behaviour and enable it on the
kernel command line only. after all, this is expected to be a reproducible
problem and affected users can just reboot (in fact, they'd be advised to)
when it happens and get a proper report that they could then send back to lkml.

i also don't expect this to be a frequent problem as __read_only will have
to be handled carefully in every case and not just at the time of adding it
to a variable but also later during code evolution. as i suggested in the
constify plugin thread earlier, use of __read_only should be tied to compile
time checking of all uses of the affected variable to eliminate the entire
problem of problematic writes.
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.