Date: Thu, 30 Sep 2021 18:05:54 +0300
From: Alexander Popov <alex.popov@...ux.com>
To: Petr Mladek <pmladek@...e.com>,
	"Paul E. McKenney" <paulmck@...nel.org>
Cc: Jonathan Corbet <corbet@....net>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Joerg Roedel <jroedel@...e.de>,
	Maciej Rozycki <macro@...am.me.uk>,
	Muchun Song <songmuchun@...edance.com>,
	Viresh Kumar <viresh.kumar@...aro.org>,
	Robin Murphy <robin.murphy@....com>,
	Randy Dunlap <rdunlap@...radead.org>,
	Lu Baolu <baolu.lu@...ux.intel.com>,
	Kees Cook <keescook@...omium.org>,
	Luis Chamberlain <mcgrof@...nel.org>,
	Wei Liu <wl@....org>,
	John Ogness <john.ogness@...utronix.de>,
	Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
	Alexey Kardashevskiy <aik@...abs.ru>,
	Christophe Leroy <christophe.leroy@...roup.eu>,
	Jann Horn <jannh@...gle.com>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Mark Rutland <mark.rutland@....com>,
	Andy Lutomirski <luto@...nel.org>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Will Deacon <will.deacon@....com>,
	David S Miller <davem@...emloft.net>,
	Borislav Petkov <bp@...en8.de>,
	kernel-hardening@...ts.openwall.com,
	linux-hardening@...r.kernel.org,
	linux-doc@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	notify@...nel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Dmitry Vyukov <dvyukov@...gle.com>
Subject: Re: [PATCH] Introduce the pkill_on_warn boot parameter

On 30.09.2021 12:15, Petr Mladek wrote:
> On Wed 2021-09-29 12:49:24, Paul E. McKenney wrote:
>> On Wed, Sep 29, 2021 at 10:01:33PM +0300, Alexander Popov wrote:
>>> On 29.09.2021 21:58, Alexander Popov wrote:
>>>> Currently, the Linux kernel provides two types of reaction to kernel
>>>> warnings:
>>>>  1. Do nothing (by default),
>>>>  2. Call panic() if panic_on_warn is set. That's a very strong reaction,
>>>>     so panic_on_warn is usually disabled on production systems.
>
> Honestly, I am not sure if panic_on_warn() or the new pkill_on_warn()
> work as expected. I wonder who uses it in practice and what is
> the experience.
>
> The problem is that many developers do not know about this behavior.
> They use WARN() when they are lazy to write more useful message or when
> they want to see all the provided details: task, registry, backtrace.
>
> Also it is inconsistent with pr_warn() behavior. Why a single line
> warning would be innocent and full info WARN() cause panic/pkill?
>
> What about pr_err(), pr_crit(), pr_alert(), pr_emerg()? They inform
> about even more serious problems. Why a warning should cause panic/pkill
> while an alert message is just printed?

That's a good question.

I guess various kernel continuous integration (CI) systems have panic_on_warn
enabled.

[Adding Dmitry Vyukov to this discussion]

If we look at the syzbot dashboard [1] with the results of Linux kernel fuzzing,
we see the issues that appear as various kernel crashes and warnings.
We don't see anything from pr_err(), pr_crit(), pr_alert(), pr_emerg(). Maybe
these situations are not considered as kernel bugs that require fixing.

Anyway, from a security point of view, a kernel warning output is interesting
for attackers as an infoleak. The messages printed by pr_err(), pr_crit(),
pr_alert(), pr_emerg() provide less information.

[1]: https://syzkaller.appspot.com/upstream

> It somehow reminds me the saga with %pK. We were not able to teach
> developers to use it correctly for years and ended with hashed
> pointers.
>
> Well, this might be different. Developers might learn this the hard
> way from bug reports. But there will be bug reports only when
> anyone really enables this behavior. They will enable it only
> when it works the right way most of the time.
>
>
>>>> From a safety point of view, the Linux kernel misses a middle way of
>>>> handling kernel warnings:
>>>>  - The kernel should stop the activity that provokes a warning,
>>>>  - But the kernel should avoid complete denial of service.
>>>>
>>>> From a security point of view, kernel warning messages provide a lot of
>>>> useful information for attackers. Many GNU/Linux distributions allow
>>>> unprivileged users to read the kernel log, so attackers use kernel
>>>> warning infoleak in vulnerability exploits. See the examples:
>>>>   https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
>>>>   https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html
>>>>
>>>> Let's introduce the pkill_on_warn boot parameter.
>>>> If this parameter is set, the kernel kills all threads in a process
>>>> that provoked a kernel warning. This behavior is reasonable from a safety
>>>> point of view described above. It is also useful for kernel security
>>>> hardening because the system kills an exploit process that hits a
>>>> kernel warning.
>>>>
>>>> Signed-off-by: Alexander Popov <alex.popov@...ux.com>
>>>
>>> This patch was tested using CONFIG_LKDTM.
>>> The kernel kills a process that performs this:
>>>   echo WARNING > /sys/kernel/debug/provoke-crash/DIRECT
>>>
>>> If you are fine with this approach, I will prepare a patch adding the
>>> pkill_on_warn sysctl.
>>
>> I suspect that you need a list of kthreads for which you are better
>> off just invoking panic(). RCU's various kthreads, for but one set
>> of examples.
>
> I wonder if kernel could survive killing of any kthread. I have never
> seen a code that would check whether a kthread was killed and
> restart it.

The do_group_exit() function calls do_exit() from kernel/exit.c, which is also
called during a kernel oops. This function cares about a lot of special cases
depending on the current task_struct. Is it fine?

Best regards,
Alexander
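
For reference, the three reactions debated in this thread (continue, kill the
offending process, panic) can be modelled as a small userspace C sketch. This
is not code from the patch or from the kernel: the names warn_action, warn_ctx
and warn_reaction() are hypothetical, and escalating to panic for kernel
threads is only an assumption based on Paul's remark about RCU's kthreads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the policy under discussion, not kernel code. */
enum warn_action {
	WARN_ACTION_CONTINUE,	/* default: print the warning and carry on */
	WARN_ACTION_KILL,	/* pkill_on_warn: kill all threads of the process */
	WARN_ACTION_PANIC,	/* panic_on_warn: stop the whole system */
};

struct warn_ctx {
	bool panic_on_warn;	/* existing boot parameter / sysctl */
	bool pkill_on_warn;	/* boot parameter proposed in this thread */
	bool is_kthread;	/* the warning was hit in a kernel thread */
};

/* Choose the reaction to a kernel warning for the given settings. */
enum warn_action warn_reaction(const struct warn_ctx *ctx)
{
	assert(ctx != NULL);

	/* panic_on_warn is the strongest reaction, so it takes precedence. */
	if (ctx->panic_on_warn)
		return WARN_ACTION_PANIC;

	if (ctx->pkill_on_warn) {
		/*
		 * Assumption: the system may not survive losing a kthread,
		 * so this model escalates to panic instead of killing it.
		 */
		return ctx->is_kthread ? WARN_ACTION_PANIC : WARN_ACTION_KILL;
	}

	return WARN_ACTION_CONTINUE;
}
```

Whether a kthread should be escalated, skipped, or killed like any other task
is exactly the open question above; the sketch only makes the decision points
explicit.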