Date: Wed, 01 Feb 2023 18:13:22 +0100
From: Christophe de Dinechin <dinechin@...hat.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: James Bottomley <jejb@...ux.ibm.com>, "Reshetova, Elena"
 <elena.reshetova@...el.com>, Leon Romanovsky <leon@...nel.org>, Greg
 Kroah-Hartman <gregkh@...uxfoundation.org>, "Shishkin, Alexander"
 <alexander.shishkin@...el.com>, "Shutemov, Kirill"
 <kirill.shutemov@...el.com>, "Kuppuswamy, Sathyanarayanan"
 <sathyanarayanan.kuppuswamy@...el.com>, "Kleen, Andi"
 <andi.kleen@...el.com>, "Hansen, Dave" <dave.hansen@...el.com>, Thomas
 Gleixner <tglx@...utronix.de>, Peter Zijlstra <peterz@...radead.org>,
 "Wunner, Lukas" <lukas.wunner@...el.com>, Mika Westerberg
 <mika.westerberg@...ux.intel.com>, Jason Wang <jasowang@...hat.com>,
 "Poimboe, Josh" <jpoimboe@...hat.com>, "aarcange@...hat.com"
 <aarcange@...hat.com>, Cfir Cohen <cfir@...gle.com>, Marc Orr
 <marcorr@...gle.com>, "jbachmann@...gle.com" <jbachmann@...gle.com>,
 "pgonda@...gle.com" <pgonda@...gle.com>, "keescook@...omium.org"
 <keescook@...omium.org>, James Morris <jmorris@...ei.org>, Michael Kelley
 <mikelley@...rosoft.com>, "Lange, Jon" <jlange@...rosoft.com>,
 "linux-coco@...ts.linux.dev" <linux-coco@...ts.linux.dev>, Linux Kernel
 Mailing List <linux-kernel@...r.kernel.org>, Kernel Hardening
 <kernel-hardening@...ts.openwall.com>
Subject: Re: Linux guest kernel threat model for Confidential Computing


On 2023-02-01 at 11:02 -05, "Michael S. Tsirkin" <mst@...hat.com> wrote...
> On Wed, Feb 01, 2023 at 02:15:10PM +0100, Christophe de Dinechin Dupont de Dinechin wrote:
>>
>>
>> > On 1 Feb 2023, at 12:01, Michael S. Tsirkin <mst@...hat.com> wrote:
>> >
>> > On Wed, Feb 01, 2023 at 11:52:27AM +0100, Christophe de Dinechin Dupont de Dinechin wrote:
>> >>
>> >>
>> >>> On 31 Jan 2023, at 18:39, Michael S. Tsirkin <mst@...hat.com> wrote:
>> >>>
>> >>> On Tue, Jan 31, 2023 at 04:14:29PM +0100, Christophe de Dinechin wrote:
>> >>>> Finally, security considerations that apply irrespective of whether the
>> >>>> platform is confidential or not are also outside of the scope of this
>> >>>> document. This includes topics ranging from timing attacks to social
>> >>>> engineering.
>> >>>
>> >>> Why are timing attacks by hypervisor on the guest out of scope?
>> >>
>> >> Good point.
>> >>
>> >> I was thinking that mitigation against timing attacks is the same
>> >> irrespective of the source of the attack. However, because the HV
>> >> controls CPU time allocation, there are presumably attacks that
>> >> are made much easier through the HV. Those should be listed.
>> >
>> > Not just that, also because it can and does emulate some devices.
>> > For example, are disk encryption systems protected against timing of
>> > disk accesses?
>> > This is why some people keep saying "forget about emulated devices, require
>> > passthrough, include devices in the trust zone".
>> >
>> >>>
>> >>>> </doc>
>> >>>>
>> >>>> Feel free to comment and reword at will ;-)
>> >>>>
>> >>>>
>> >>>> 3/ PCI-as-a-threat: where does that come from
>> >>>>
>> >>>> Isn't there a fundamental difference, from a threat model perspective,
>> >>>> between a bad actor, say a rogue sysadmin dumping the guest memory (which CC
>> >>>> should defeat) and compromised software feeding us bad data? I think there
>> >>>> is: at least inside the TCB, we can detect bad software using measurements,
>> >>>> and prevent it from running using attestation.  In other words, we first
>> >>>> check what we will run, then we run it. The security there is that we know
>> >>>> what we are running. The trust we have in the software is from testing,
>> >>>> reviewing or using it.
>> >>>>
>> >>>> This relies on a key aspect provided by TDX and SEV, which is that the
>> >>>> software being measured is largely tamper-resistant thanks to memory
>> >>>> encryption. In other words, after you have measured your guest software
>> >>>> stack, the host or hypervisor cannot willy-nilly change it.
>> >>>>
>> >>>> So this brings me to the next question: is there any way we could offer the
>> >>>> same kind of service for KVM and qemu? The measurement part seems relatively
>> >>>> easy. The tamper-resistant part, on the other hand, seems quite difficult to
>> >>>> me. But maybe someone else will have a brilliant idea?
>> >>>>
>> >>>> So I'm asking the question, because if you could somehow prove to the guest
>> >>>> not only that it's running the right guest stack (as we can do today) but
>> >>>> also a known host/KVM/hypervisor stack, we would also switch the potential
>> >>>> issues with PCI, MSRs and the like from "malicious" to merely "bogus", and
>> >>>> this is something which is evidently easier to deal with.
>> >>>
>> >>> Agree absolutely that's much easier.
>> >>>
>> >>>> I briefly discussed this with James, and he pointed out two interesting
>> >>>> aspects of that question:
>> >>>>
>> >>>> 1/ In the CC world, we don't really care about *virtual* PCI devices. We
>> >>>>  care about either virtio devices, or physical ones being passed through
>> >>>>  to the guest. Let's assume physical ones can be trusted, see above.
>> >>>>  That leaves virtio devices. How much damage can a malicious virtio device
>> >>>>  do to the guest kernel, and can this lead to secrets being leaked?
>> >>>>
>> >>>> 2/ He was not as negative as I anticipated on the possibility of somehow
>> >>>>  being able to prevent tampering of the guest. One example he mentioned is
>> >>>>  a research paper [1] about running the hypervisor itself inside an
>> >>>>  "outer" TCB, using VMPLs on AMD. Maybe something similar can be achieved
>> >>>>  with TDX using secure enclaves or some other mechanism?
>> >>>
>> >>> Or even just a secureboot-based root of trust?
>> >>
>> >> You mean host secureboot? Or guest?
>> >>
>> >> If it’s host, then the problem is detecting malicious tampering with
>> >> host code (whether it’s kernel or hypervisor).
>> >
>> > Host.  Lots of existing systems do this.  As an extreme, boot a RO disk
>> > and limit which packages are allowed.
>>
>> Is that provable to the guest?
>>
>> Consider a cloud provider doing that: how do they prove to their guest:
>>
>> a) What firmware, kernel and kvm they run
>>
>> b) That what they booted cannot be maliciously modified, e.g. by a rogue
>>    device driver installed by a rogue sysadmin
>>
>> My understanding is that SecureBoot is only intended to prevent non-verified
>> operating systems from booting. So the proof is given to the cloud provider,
>> and the proof is that the system boots successfully.
>
> I think I should have said measured boot not secure boot.

The problem, again, is: how do you prove to the guest that you are not lying?

We know how to do that from a guest [1], but you will note that in the
normal process, a trusted hardware component (e.g. the PSP for AMD SEV)
proves the validity of the measurements of the TCB by signing them with an
attestation key derived from some chip-unique secret. For AMD, this
is called the VCEK, and TDX has something similar. In the case of SEV, this
goes through firmware, and you have to tell the firmware each time you
insert data in the original TCB (using SNP_LAUNCH_UPDATE). This is all tied
to a VM execution context. I do not believe there is any provision to do the
same thing to measure host data. And again, it would be somewhat pointless
without a mechanism to ensure that the host data does not change after
the measurement.
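
To make this concrete, the guest-side request in [1] boils down to
something like the following with the Linux sev-guest driver (a minimal
sketch, assuming the SNP_GET_REPORT ioctl from
include/uapi/linux/sev-guest.h as of recent kernels; fetching the VCEK
certificate chain and verifying the report signature are left out):

    /* Sketch: request a VCEK-signed attestation report from inside
     * an SNP guest; build against kernel headers >= 5.19 */
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <linux/sev-guest.h>

    int main(void)
    {
            struct snp_report_req req = { .vmpl = 0 };
            struct snp_report_resp resp = { 0 };
            struct snp_guest_request_ioctl arg = {
                    .msg_version = 1,
                    .req_data = (uint64_t)(unsigned long)&req,
                    .resp_data = (uint64_t)(unsigned long)&resp,
            };
            int fd = open("/dev/sev-guest", O_RDWR);

            if (fd < 0) {
                    perror("open(/dev/sev-guest)");
                    return 1;
            }
            /* 64 bytes of caller data (typically a fresh nonce) get
             * bound into the signed report */
            memset(req.user_data, 0x42, sizeof(req.user_data));
            if (ioctl(fd, SNP_GET_REPORT, &arg) < 0) {
                    fprintf(stderr, "SNP_GET_REPORT: fw_err=%llu\n",
                            (unsigned long long)arg.fw_err);
                    close(fd);
                    return 1;
            }
            /* resp.data is the report: it attests the *guest* launch
             * measurement, and says nothing about the host stack */
            close(fd);
            return 0;
    }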

Now, I don't think it would be super-difficult to add a firmware service
that would let the host do some kind of equivalent to PVALIDATE, setting
some physical pages aside that then get measured and become inaccessible to
the host. The PSP or similar could then integrate these measurements as part
of the TCB, and the fact that the pages were "transferred" to this special
invariant block would assure the guests that the code will not change after
being measured.
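
Something like the following, to illustrate; every name in it is made
up, and as far as I know nothing similar exists on SEV-SNP, TDX or any
other platform today:

    /* Purely hypothetical firmware service, for illustration only */
    #include <stdint.h>

    struct host_measure_region {
            uint64_t paddr;    /* host-physical start of the region */
            uint64_t npages;   /* number of 4K pages to measure */
    };

    /*
     * host_measure_and_lock() (hypothetical) would ask the PSP to:
     * 1) extend a host measurement with the content of the region,
     * 2) revoke host write access to those pages (the "transfer" to
     *    the invariant block described above),
     * 3) fold that measurement into subsequent guest attestation
     *    reports, so tenants can check what host code they run on.
     */
    int host_measure_and_lock(const struct host_measure_region *region)
    {
            (void)region;
            return -1;    /* no such PSP command exists today */
    }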

I am not aware that such a mechanism exists on any of the existing CC
platforms. Please feel free to enlighten me if I'm wrong.

[1] https://www.redhat.com/en/blog/understanding-confidential-containers-attestation-flow
>
>>
>> After that, I think all bets are off. SecureBoot does little AFAICT
>> to prevent malicious modifications of the running system by someone with
>> root access, including deliberately loading a malicious kvm-zilog.ko
>
> So disable module loading then or don't allow root access?

Who would do that?

The problem is that we have a host and a tenant, and the tenant does not
trust the host in principle. So it is not sufficient for the host to disable
module loading or carefully control root access. It is also necessary to
prove to the tenant(s) that this was done.
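
The host-side half is easy even today. Here is a trivial sketch that
flips the one-way kernel.modules_disabled sysctl (the equivalent of
echo 1 > /proc/sys/kernel/modules_disabled); the point is that nothing
about it produces evidence a tenant could check:

    /* Sketch: disable module loading on the host until next reboot;
     * the sysctl is one-way, once set to 1 it cannot be cleared */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
            int fd = open("/proc/sys/kernel/modules_disabled", O_WRONLY);

            if (fd < 0) {
                    perror("open");
                    return 1;
            }
            if (write(fd, "1", 1) != 1) {
                    perror("write");
                    close(fd);
                    return 1;
            }
            close(fd);
            /* done -- but the tenant cannot observe any of this */
            return 0;
    }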

>
>>
>> It does not mean it cannot be done, just that I don’t think we
>> have the tools at the moment.
>
> Phones, chromebooks do this all the time ...

Indeed, but there, this is to prove to the phone's real owner (who,
surprise, is not the naive person who thought they'd get some kind of
ownership by buying the phone) that the software running on the phone has
not been replaced by some horribly jailbroken goo.

In other words, the user of the phone gets no proof whatsoever of anything,
except that the phone appears to work. This is somewhat the situation in the
cloud today: the owners of the hardware get all sorts of useful checks, from
SecureBoot to error-correction for memory or I/O devices. However, someone
running in a VM on the cloud gets none of that, just like the user of your
phone.

--
Cheers,
Christophe de Dinechin (https://c3d.github.io)
Theory of Incomplete Measurements (https://c3d.github.io/TIM)
