Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 26 Oct 2021 10:48:23 +0200
From: John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>
To: mpe@...erman.id.au
Cc: linuxppc-dev@...ts.ozlabs.org, oss-security@...ts.openwall.com,
 "debian-powerpc@...ts.debian.org" <debian-powerpc@...ts.debian.org>
Subject: Re: Linux kernel: powerpc: KVM guest can trigger host crash on Power8

Hi Michael!

> The Linux kernel for powerpc since v5.2 has a bug which allows a
> malicious KVM guest to crash the host, when the host is running on
> Power8.
> 
> Only machines using Linux as the hypervisor, aka. KVM, powernv or bare
> metal, are affected by the bug. Machines running PowerVM are not
> affected.
> 
> The bug was introduced in:
> 
>     10d91611f426 ("powerpc/64s: Reimplement book3s idle code in C")
> 
> Which was first released in v5.2.
> 
> The upstream fix is:
> 
>   cdeb5d7d890e ("KVM: PPC: Book3S HV: Make idle_kvm_start_guest() return 0 if it went to guest")
>   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cdeb5d7d890e14f3b70e8087e745c4a6a7d9f337
> 
> Which will be included in the v5.16 release.

I have tested these patches against 5.14 but it seems the problem [1] still remains for me
for big-endian guests. I built a patched kernel yesterday, rebooted the KVM server and let
the build daemons do their work over night.

When I got up this morning, I noticed the machine was down, so I checked the serial console
via IPMI and saw the same messages again as reported in [1]:

[41483.963562] watchdog: BUG: soft lockup - CPU#104 stuck for 25521s! [migration/104:175]
[41507.963307] watchdog: BUG: soft lockup - CPU#104 stuck for 25544s! [migration/104:175]
[41518.311200] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[41518.311216] rcu:     136-...0: (135 ticks this GP) idle=242/1/0x4000000000000000 softirq=32031/32033 fqs=2729959 
[41547.962882] watchdog: BUG: soft lockup - CPU#104 stuck for 25581s! [migration/104:175]
[41571.962627] watchdog: BUG: soft lockup - CPU#104 stuck for 25603s! [migration/104:175]
[41581.330530] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[41581.330546] rcu:     136-...0: (135 ticks this GP) idle=242/1/0x4000000000000000 softirq=32031/32033 fqs=2736378 
[41611.962202] watchdog: BUG: soft lockup - CPU#104 stuck for 25641s! [migration/104:175]
[41635.961947] watchdog: BUG: soft lockup - CPU#104 stuck for 25663s! [migration/104:175]
[41644.349859] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[41644.349876] rcu:     136-...0: (135 ticks this GP) idle=242/1/0x4000000000000000 softirq=32031/32033 fqs=2742753 
[41671.961564] watchdog: BUG: soft lockup - CPU#104 stuck for 25697s! [migration/104:175]
[41695.961309] watchdog: BUG: soft lockup - CPU#104 stuck for 25719s! [migration/104:175]
[41707.369190] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[41707.369206] rcu:     136-...0: (135 ticks this GP) idle=242/1/0x4000000000000000 softirq=32031/32033 fqs=2749151 
[41735.960884] watchdog: BUG: soft lockup - CPU#104 stuck for 25756s! [migration/104:175]
[41759.960629] watchdog: BUG: soft lockup - CPU#104 stuck for 25778s! [migration/104:175]
[41770.388520] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[41770.388548] rcu:     136-...0: (135 ticks this GP) idle=242/1/0x4000000000000000 softirq=32031/32033 fqs=2755540 
[41776.076307] rcu: rcu_sched kthread timer wakeup didn't happen for 1423 jiffies! g49897 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[41776.076327] rcu:     Possible timer handling issue on cpu=32 timer-softirq=1056014
[41776.076336] rcu: rcu_sched kthread starved for 1424 jiffies! g49897 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=32
[41776.076350] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[41776.076360] rcu: RCU grace-period kthread stack dump:
[41776.076434] rcu: Stack dump where RCU GP kthread last ran:
[41783.960374] watchdog: BUG: soft lockup - CPU#104 stuck for 25801s! [migration/104:175]
[41807.960119] watchdog: BUG: soft lockup - CPU#104 stuck for 25823s! [migration/104:175]
[41831.959864] watchdog: BUG: soft lockup - CPU#104 stuck for 25846s! [migration/104:175]
[41833.407851] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[41833.407868] rcu:     136-...0: (135 ticks this GP) idle=242/1/0x4000000000000000 softirq=32031/32033 fqs=2760381 
[41863.959524] watchdog: BUG: soft lockup - CPU#104 stuck for 25875s! [migration/104:175]

It seems that in this case, it was the testsuite of the git package [2] that triggered the bug. As you
can see from the overview, the git package has been in the building state for 8 hours meaning the
build server crashed and is no longer giving feedback to the database.

Adrian

> [1] https://bugzilla.kernel.org/show_bug.cgi?id=206669
> [2] https://buildd.debian.org/status/package.php?p=git&suite=experimental

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaubitz@...ian.org
`. `'   Freie Universitaet Berlin - glaubitz@...sik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913

Powered by blists - more mailing lists

Please check out the Open Source Software Security Wiki, which is counterpart to this mailing list.

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.