Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 19 Feb 2020 18:38:15 +0000
From: James Morse <james.morse@....com>
To: Sami Tolvanen <samitolvanen@...gle.com>, Marc Zyngier <maz@...nel.org>
Cc: Will Deacon <will@...nel.org>, Catalin Marinas <catalin.marinas@....com>,
 Steven Rostedt <rostedt@...dmis.org>, Masami Hiramatsu
 <mhiramat@...nel.org>, Ard Biesheuvel <ard.biesheuvel@...aro.org>,
 Mark Rutland <mark.rutland@....com>, Dave Martin <Dave.Martin@....com>,
 Kees Cook <keescook@...omium.org>, Laura Abbott <labbott@...hat.com>,
 Marc Zyngier <maz@...nel.org>, Nick Desaulniers <ndesaulniers@...gle.com>,
 Jann Horn <jannh@...gle.com>, Miguel Ojeda
 <miguel.ojeda.sandonis@...il.com>,
 Masahiro Yamada <yamada.masahiro@...ionext.com>,
 clang-built-linux@...glegroups.com, kernel-hardening@...ts.openwall.com,
 linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v8 00/12] add support for Clang's Shadow Call Stack

Hi Sami,

(CC: +Marc)

On 19/02/2020 00:08, Sami Tolvanen wrote:
> This patch series adds support for Clang's Shadow Call Stack
> (SCS) mitigation, which uses a separately allocated shadow stack
> to protect against return address overwrites.

I took this for a spin on some real hardware. cpu-idle, kexec hibernate etc all work
great... but starting a KVM guest causes the CPU to get stuck in EL2.

With CONFIG_SHADOW_CALL_STACK disabled, this doesn't happen ... so its something about the
feature being enabled.


I'm using clang-9 from debian bullseye/sid. (I tried to build tip of tree ... that doesn't
go so well on arm64)

KVM takes an instruction abort from EL2 to EL2, because some of the code it runs is not
mapped at EL2:

| ffffa00011588308 <__kvm_tlb_flush_local_vmid>:
| ffffa00011588308:       d10103ff        sub     sp, sp, #0x40
| ffffa0001158830c:       f90013f3        str     x19, [sp, #32]
| ffffa00011588310:       a9037bfd        stp     x29, x30, [sp, #48]
| ffffa00011588314:       9100c3fd        add     x29, sp, #0x30
| ffffa00011588318:       97ae18bf        bl      ffffa0001010e614 <__kern_hyp_va>

INSTRUCTION ABORT!

| ffffa0001158831c:       f9400000        ldr     x0, [x0]
| ffffa00011588320:       97ae18bd        bl      ffffa0001010e614 <__kern_hyp_va>
| ffffa00011588324:       aa0003f3        mov     x19, x0
| ffffa00011588328:       97ae18c1        bl      ffffa0001010e62c <has_vhe>


__kern_hyp_va() is static-inline which is patched wherever it appears at boot with the EL2
ASLR values, it converts a kernel linear-map address to its EL2 KVM alias:

| ffffa0001010dc5c <__kern_hyp_va>:
| ffffa0001010dc5c:       92400000        and     x0, x0, #0x1
| ffffa0001010dc60:       93c00400        ror     x0, x0, #1
| ffffa0001010dc64:       91000000        add     x0, x0, #0x0
| ffffa0001010dc68:       91400000        add     x0, x0, #0x0, lsl #12
| ffffa0001010dc6c:       93c0fc00        ror     x0, x0, #63
| ffffa0001010dc70:       d65f03c0        ret


The problem here is where __kern_hyp_va() is. Its outside the __hyp_text section:
| morse@...on:~/kernel/linux-pigs$ nm -s vmlinux | grep hyp_text
| ffffa0001158b800 T __hyp_text_end
| ffffa000115838a0 T __hyp_text_start


If I disable CONFIG_SHADOW_CALL_STACK in Kconfig, I get:
| ffffa00011527fe0 <__kvm_tlb_flush_local_vmid>:
| ffffa00011527fe0:       d100c3ff        sub     sp, sp, #0x30
| ffffa00011527fe4:       a9027bfd        stp     x29, x30, [sp, #32]
| ffffa00011527fe8:       910083fd        add     x29, sp, #0x20
| ffffa00011527fec:       92400000        and     x0, x0, #0x1
| ffffa00011527ff0:       93c00400        ror     x0, x0, #1
| ffffa00011527ff4:       91000000        add     x0, x0, #0x0
| ffffa00011527ff8:       91400000        add     x0, x0, #0x0, lsl #12
| ffffa00011527ffc:       93c0fc00        ror     x0, x0, #63
| ffffa00011528000:       f9400000        ldr     x0, [x0]
| ffffa00011528004:       910023e1        add     x1, sp, #0x8
| ffffa00011528008:       92400000        and     x0, x0, #0x1
| ffffa0001152800c:       93c00400        ror     x0, x0, #1
| ffffa00011528010:       91000000        add     x0, x0, #0x0
| ffffa00011528014:       91400000        add     x0, x0, #0x0, lsl #12
| ffffa00011528018:       93c0fc00        ror     x0, x0, #63
| ffffa0001152801c:       97ffff78        bl      ffffa00011527dfc <__tlb_switch_>
| ffffa00011528020:       d508871f        tlbi    vmalle1
| ffffa00011528024:       d503201f        nop


This looks like reserving x18 is causing Clang to not-inline the __kern_hyp_va() calls,
losing the vitally important section information. (I can see why the compiler thinks this
is fair)

Is this a known, er, thing, with clang-9?

>From eyeballing the disassembly __always_inline on __kern_hyp_va() is enough of a hint to
stop this, ... with this configuration of clang-9. But KVM still doesn't work, so it isn't
the only inlining decision KVM relies on that is changed by SCS.

I suspect repainting all KVM's 'inline' with __always_inline will fix it. (yuck!) I'll try
tomorrow.

I don't think keeping the compiler-flags as they are today for KVM is the right thing to
do, it could lead to x18 getting corrupted with the shared vhe/non-vhe code. Splitting
that code up would lead to duplication.

(hopefully objtool will be able to catch these at build time)


Thanks,

James

> SCS is currently supported only on arm64, where the compiler
> requires the x18 register to be reserved for holding the current
> task's shadow stack pointer.

> Changes in v8:
>  - Added __noscs to __hyp_text instead of filtering SCS flags from
>    the entire arch/arm64/kvm/hyp directory

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.