Message-ID: <874it6qzd0.ffs@tglx>
Date: Sat, 13 Sep 2025 15:02:51 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
    LKML <linux-kernel@...r.kernel.org>
Cc: Peter Zilstra <peterz@...radead.org>,
    "Paul E. McKenney" <paulmck@...nel.org>,
    Boqun Feng <boqun.feng@...il.com>,
    Jonathan Corbet <corbet@....net>,
    Prakash Sangappa <prakash.sangappa@...cle.com>,
    Madadi Vineeth Reddy <vineethr@...ux.ibm.com>,
    K Prateek Nayak <kprateek.nayak@....com>,
    Steven Rostedt <rostedt@...dmis.org>,
    Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
    Arnd Bergmann <arnd@...db.de>,
    linux-arch@...r.kernel.org,
    Florian Weimer <fweimer@...hat.com>,
    "carlos@...hat.com" <carlos@...hat.com>,
    libc-coord@...ts.openwall.com
Subject: Re: [patch 00/12] rseq: Implement time slice extension mechanism

On Fri, Sep 12 2025 at 15:26, Mathieu Desnoyers wrote:
> On 2025-09-12 12:31, Thomas Gleixner wrote:
>>> 2) Slice requests are a good fit for locking. Locking typically
>>>    has nesting ability.
>>>
>>>    We should consider making the slice request ABI a 8-bit
>>>    or 16-bit nesting counter to allow nesting of its users.
>>
>> Making request a counter requires to keep request set when the
>> extension is granted. So the states would be:
>>
>>     request    granted
>>     0          0          Neutral
>>     >0         0          Requested
>>     >=0        1          Granted
>

Second thoughts on this.

Such a scheme means that slice_ctrl.request must be read-only for the
kernel because otherwise the user space decrement would need to be an
atomic dec_if_not_zero(). We just argued the one atomic operation away. :)

That means the kernel can only set and clear Granted. That in turn
loses the information whether a slice extension was denied or revoked,
which was something the Oracle people wanted to have. I'm not sure
whether that was a functional or more of an instrumentation feature.

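
The atomic dec_if_not_zero() mentioned above can be sketched in C11 as a compare-and-swap loop. This is only an illustration of why a counting request field that the kernel could also modify would put an atomic operation back on the user space path; `request_count` is a hypothetical stand-in and not a field of the actual rseq ABI.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counting request field; NOT the real rseq ABI layout. */
static _Atomic uint8_t request_count;

/*
 * Decrement the counter only if it is currently non-zero.
 * Returns true if a decrement happened, false if the counter was 0.
 */
static bool dec_if_not_zero(_Atomic uint8_t *cnt)
{
    uint8_t old = atomic_load_explicit(cnt, memory_order_relaxed);

    while (old != 0) {
        /* On failure 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak_explicit(cnt, &old,
                                                  (uint8_t)(old - 1),
                                                  memory_order_release,
                                                  memory_order_relaxed))
            return true;
    }
    return false;
}
```

A plain store of 0 or 1, as in the boolean scheme, needs none of this.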
But what's worse: this is a recipe for disaster as it creates obviously
subtle and hard to debug ways to leak an increment, which means the
request would stay active forever, defeating the whole purpose.

And no, the kernel cannot keep track of the counter and observe whether
it became zero at some point or not. You surely could come up with a
convoluted scheme to work around that in form of sequence counters or
whatever, but that just creates extra complexity for a very dubious
value.

The point is that the time slice extension is just providing an
opportunistic priority ceiling mechanism with low overhead and without
guarantees.

Once a request is not granted or revoked, the performance of that
particular operation goes south no matter what. Nesting does not help
there at all, which is a strong argument for using KISS as the primary
engineering principle here.

The simple boolean request/granted pair is simple and very well
defined. It does not suffer from any of those problems.

If user space wants nesting, then it can do so on its own without
creating an ill-defined and fragile kernel/user ABI. We created enough
of them in the past and all of them resulted in long-term headaches.

> Handling syscall within granted extension by killing the process

I'm absolutely not opposed to lifting the syscall restriction to make
things easier, but this is the wrong argument for it:

> will likely reserve this feature to the niche use-cases.

Having this used only by people who actually know what they are doing
is actually the preferred outcome.

We've seen it over and over that supposedly "easy" features result in
mindless overutilization because everyone and his dog thinks they need
them just because and for the very wrong reasons. The unconditional
usage of the most power hungry floating point extensions just because
they are available is only one example of many.

Thanks,

        tglx
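
One way user space could do nesting on its own, on top of the boolean request/granted pair, is a private per-thread depth counter that the kernel never sees. A minimal sketch, assuming a hypothetical `struct slice_ctrl` with one `request` and one `granted` byte (names and layout are illustrative, not the real ABI); `slice_enter()` and `slice_exit()` are invented helpers:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared control word; NOT the real ABI. */
struct slice_ctrl {
    volatile uint8_t request;   /* written by user space only */
    volatile uint8_t granted;   /* written by the kernel only */
};

/* In a real program this would live in the thread's rseq area. */
static _Thread_local struct slice_ctrl slice_ctrl;

/* Private nesting depth, invisible to the kernel. */
static _Thread_local unsigned int slice_depth;

/* Outermost entry sets the boolean request; inner entries only count. */
static inline void slice_enter(void)
{
    if (slice_depth++ == 0)
        slice_ctrl.request = 1;
}

/*
 * Outermost exit clears the request with a plain store, no atomics.
 * Returns true if an extension had been granted, so the caller can
 * decide how to relinquish the CPU (outside the scope of this sketch).
 */
static inline bool slice_exit(void)
{
    assert(slice_depth > 0);
    if (--slice_depth != 0)
        return false;
    slice_ctrl.request = 0;
    return slice_ctrl.granted != 0;
}
```

Because the depth is thread-private, a leaked increment stays a user space bug and never becomes part of the kernel/user contract.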