Date: Tue, 16 Nov 2021 11:34:17 +0300
From: Alexander Popov <alex.popov@...ux.com>
To: Christophe Leroy <christophe.leroy@...roup.eu>,
	Steven Rostedt <rostedt@...dmis.org>,
	Lukas Bulwahn <lukas.bulwahn@...il.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
	Jonathan Corbet <corbet@....net>,
	Paul McKenney <paulmck@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Joerg Roedel <jroedel@...e.de>,
	Maciej Rozycki <macro@...am.me.uk>,
	Muchun Song <songmuchun@...edance.com>,
	Viresh Kumar <viresh.kumar@...aro.org>,
	Robin Murphy <robin.murphy@....com>,
	Randy Dunlap <rdunlap@...radead.org>,
	Lu Baolu <baolu.lu@...ux.intel.com>,
	Petr Mladek <pmladek@...e.com>,
	Kees Cook <keescook@...omium.org>,
	Luis Chamberlain <mcgrof@...nel.org>,
	Wei Liu <wl@....org>,
	John Ogness <john.ogness@...utronix.de>,
	Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
	Alexey Kardashevskiy <aik@...abs.ru>,
	Jann Horn <jannh@...gle.com>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Mark Rutland <mark.rutland@....com>,
	Andy Lutomirski <luto@...nel.org>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	Will Deacon <will@...nel.org>,
	Ard Biesheuvel <ardb@...nel.org>,
	Laura Abbott <labbott@...nel.org>,
	David S Miller <davem@...emloft.net>,
	Borislav Petkov <bp@...en8.de>,
	Arnd Bergmann <arnd@...db.de>,
	Andrew Scull <ascull@...gle.com>,
	Marc Zyngier <maz@...nel.org>,
	Jessica Yu <jeyu@...nel.org>,
	Iurii Zaikin <yzaikin@...gle.com>,
	Rasmus Villemoes <linux@...musvillemoes.dk>,
	Wang Qing <wangqing@...o.com>,
	Mel Gorman <mgorman@...e.de>,
	Mauro Carvalho Chehab <mchehab+huawei@...nel.org>,
	Andrew Klychkov <andrew.a.klychkov@...il.com>,
	Mathieu Chouquet-Stringer <me@...hieu.digital>,
	Daniel Borkmann <daniel@...earbox.net>,
	Stephen Kitt <steve@....org>,
	Stephen Boyd <sboyd@...nel.org>,
	Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
	Mike Rapoport <rppt@...nel.org>,
	Bjorn Andersson <bjorn.andersson@...aro.org>,
	Kernel Hardening <kernel-hardening@...ts.openwall.com>,
	linux-hardening@...r.kernel.org,
	"open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
	linux-arch <linux-arch@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	notify@...nel.org,
	main@...ts.elisa.tech,
	safety-architecture@...ts.elisa.tech,
	devel@...ts.elisa.tech,
	Shuah Khan <shuah@...nel.org>,
	Gabriele Paoloni <gpaoloni@...hat.com>,
	Robert Krutsch <krutsch@...il.com>
Subject: Re: [PATCH v2 0/2] Introduce the pkill_on_warn parameter

On 16.11.2021 09:37, Christophe Leroy wrote:
> On 15/11/2021 at 17:06, Steven Rostedt wrote:
>> On Mon, 15 Nov 2021 14:59:57 +0100
>> Lukas Bulwahn <lukas.bulwahn@...il.com> wrote:
>>
>>> 1. Allow a reasonably configured kernel to boot and run with
>>> panic_on_warn set. Warnings should only be raised when something is
>>> not configured as the developers expect it or the kernel is put into a
>>> state that generally is _unexpected_ and has been exposed little to
>>> the critical thought of the developer, to testing efforts and use in
>>> other systems in the wild. Warnings should not be used for something
>>> informative, which still allows the kernel to continue running in a
>>> proper way in a generally expected environment. To my knowledge,
>>> there are some kernels in production that run with panic_on_warn; so,
>>> IMHO, this requirement is generally accepted (we might of course
>>
>> To me, WARN*() is the same as BUG*(). If it gets hit, it's a bug in the
>> kernel and needs to be fixed. I have several WARN*() calls in my code, and
>> it's all because the algorithms used are expected to prevent the condition
>> in the warning from happening. If the warning triggers, it means either that
>> the algorithm is wrong or my assumption about the algorithm is wrong. In
>> either case, the kernel needs to be updated. All my tests fail if a WARN*()
>> gets hit (anywhere in the kernel, not just my own).
>>
>> After reading all the replies and thinking about this more, I find the
>> pkill_on_warning actually worse than not doing anything. If you are
>> concerned about exploits from warnings, the only real solution is a
>> panic_on_warning. Yes, it brings down the system, but really, it has to be
>> brought down anyway, because it is in need of a kernel update.
>>
>
> We also have LIVEPATCH to avoid bringing down the system for a kernel
> update, don't we? So I wouldn't expect bringing down a vital system
> just for a WARN.

Hello Christophe,

I would say that different systems have different requirements. Not every
Linux-based system needs live patching (which also has its own limitations).
That's why I proposed a sysctl and didn't change the default kernel behavior.

> As far as I understand from
> https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on,
> WARN() and WARN_ON() are meant to deal with those situations as
> gracefully as possible, allowing the system to continue running the best
> it can until a human controlled action is taken.

I can't agree here. There is a very strong push against adding BUG*() to the
kernel source code. So there are a lot of cases where WARN*() is used for
severe problems because kernel developers just don't have other options.

Currently, it looks like there is no consistent error handling policy in the
kernel.

> So I'd expect the WARN/WARN_ON to be handled and I agree that that
> pkill_on_warning seems dangerous and irrelevant, probably more dangerous
> than doing nothing, especially as the WARN may trigger for a reason
> which has nothing to do with the running thread.

Sorry, I see a contradiction. If killing a process hitting a kernel warning is
"dangerous and irrelevant", why is killing a process on a kernel oops fine?
That's strange. Linus calls that behavior "fairly benign" here:
http://lkml.iu.edu/hypermail/linux/kernel/1610.0/01217.html

Best regards,
Alexander