Date: Thu, 30 Sep 2021 12:59:03 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Petr Mladek <pmladek@...e.com>
Cc: "Paul E. McKenney" <paulmck@...nel.org>,
	Alexander Popov <alex.popov@...ux.com>,
	Jonathan Corbet <corbet@....net>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Joerg Roedel <jroedel@...e.de>,
	Maciej Rozycki <macro@...am.me.uk>,
	Muchun Song <songmuchun@...edance.com>,
	Viresh Kumar <viresh.kumar@...aro.org>,
	Robin Murphy <robin.murphy@....com>,
	Randy Dunlap <rdunlap@...radead.org>,
	Lu Baolu <baolu.lu@...ux.intel.com>,
	Kees Cook <keescook@...omium.org>,
	Luis Chamberlain <mcgrof@...nel.org>,
	Wei Liu <wl@....org>,
	John Ogness <john.ogness@...utronix.de>,
	Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
	Alexey Kardashevskiy <aik@...abs.ru>,
	Christophe Leroy <christophe.leroy@...roup.eu>,
	Jann Horn <jannh@...gle.com>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Mark Rutland <mark.rutland@....com>,
	Andy Lutomirski <luto@...nel.org>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	Thomas Garnier <thgarnie@...gle.com>,
	Will Deacon <will.deacon@....com>,
	Ard Biesheuvel <ard.biesheuvel@...aro.org>,
	Laura Abbott <labbott@...hat.com>,
	David S Miller <davem@...emloft.net>,
	Borislav Petkov <bp@...en8.de>,
	kernel-hardening@...ts.openwall.com,
	linux-hardening@...r.kernel.org,
	linux-doc@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	notify@...nel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH] Introduce the pkill_on_warn boot parameter

On Thu, 30 Sep 2021 11:15:41 +0200
Petr Mladek <pmladek@...e.com> wrote:

> On Wed 2021-09-29 12:49:24, Paul E. McKenney wrote:
> > On Wed, Sep 29, 2021 at 10:01:33PM +0300, Alexander Popov wrote:
> > > On 29.09.2021 21:58, Alexander Popov wrote:
> > > > Currently, the Linux kernel provides two types of reaction to kernel
> > > > warnings:
> > > > 1. Do nothing (by default),
> > > > 2. Call panic() if panic_on_warn is set. That's a very strong reaction,
> > > > so panic_on_warn is usually disabled on production systems.
>
> Honestly, I am not sure if panic_on_warn() or the new pkill_on_warn()
> work as expected. I wonder who uses it in practice and what is
> the experience.

Several people use it, as I see reports all the time when someone can
trigger a warn on from user space, and it's listed as a DOS of the
system.

>
> The problem is that many developers do not know about this behavior.
> They use WARN() when they are lazy to write more useful message or when
> they want to see all the provided details: task, registry, backtrace.

WARN() Should never be used just because of laziness. If it is, then
that's a bug. Let's not use that as an excuse to shoot down this
proposal. WARN() should only be used to test assumptions where you do
not believe something can happen. I use it all the time when the logic
prevents some state, and have the WARN() enabled if that state is hit.
Because to me, it shows something that shouldn't happen happened, and I
need to fix the code.

Basically, WARN should be used just like BUG. But Linus hates BUG,
because in most cases, these bad areas shouldn't take down the entire
kernel, but for some people, they WANT it to take down the system.

>
> Also it is inconsistent with pr_warn() behavior. Why a single line
> warning would be innocent and full info WARN() cause panic/pkill?

pr_warn() can be used for things that the user can hit. I'll use
pr_warn, for memory failures, and such. Something that says "we ran out
of resources, this will not work the way you expect". That is perfect
for pr_warn. But not something that requires a stack dump.

>
> What about pr_err(), pr_crit(), pr_alert(), pr_emerg()? They inform
> about even more serious problems. Why a warning should cause panic/pkill
> while an alert message is just printed?

Because really, WARN() == BUG() but like I said, Linus doesn't like
taking down the entire system on these areas.

>
>
> It somehow reminds me the saga with %pK. We were not able to teach
> developers to use it correctly for years and ended with hashed
> pointers.
>
> Well, this might be different. Developers might learn this the hard
> way from bug reports. But there will be bug reports only when
> anyone really enables this behavior. They will enable it only
> when it works the right way most of the time.

The panic_on_warn() has been used for years now. I do not think this is
an issue.

>
> I wonder if kernel could survive killing of any kthread. I have never
> seen a code that would check whether a kthread was killed and
> restart it.

We can easily check if the thread is a kernel thread or a user thread,
and make the decision on that.

-- Steve
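
The kernel-thread check mentioned at the end of the message can be modeled in user space. The sketch below is an illustration only, not code from the patch: `struct task_model`, `enum warn_action`, and `warn_reaction()` are hypothetical names, and only the `PF_KTHREAD` value mirrors the kernel's definition in include/linux/sched.h. It assumes that panic_on_warn takes precedence over pkill_on_warn and that kernel threads are exempt from the kill; neither is confirmed by the thread.

```c
#include <stdbool.h>

/* Same value as PF_KTHREAD in include/linux/sched.h ("I am a kernel thread"). */
#define PF_KTHREAD 0x00200000u

/* Hypothetical stand-in for the one task_struct field this sketch needs. */
struct task_model {
    unsigned int flags;
};

/* Hypothetical reaction policy; the names are not taken from the patch. */
enum warn_action {
    WARN_ACTION_CONTINUE,   /* default: print the warning and carry on */
    WARN_ACTION_KILL_TASK,  /* pkill_on_warn: kill the offending process */
    WARN_ACTION_PANIC       /* panic_on_warn: take the whole system down */
};

/* In the kernel this test would read: current->flags & PF_KTHREAD */
static bool is_kernel_thread(const struct task_model *t)
{
    return (t->flags & PF_KTHREAD) != 0;
}

/*
 * Decide how to react to a warning raised in the context of task t.
 * Assumption: panic_on_warn wins over pkill_on_warn, and kernel threads
 * are never killed because nothing would restart them.
 */
static enum warn_action warn_reaction(const struct task_model *t,
                                      bool panic_on_warn,
                                      bool pkill_on_warn)
{
    if (panic_on_warn)
        return WARN_ACTION_PANIC;
    if (pkill_on_warn && !is_kernel_thread(t))
        return WARN_ACTION_KILL_TASK;
    return WARN_ACTION_CONTINUE;
}
```

With pkill_on_warn set, a task whose flags include `PF_KTHREAD` falls through to the default reaction, while a user task is selected for killing.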