Date: Tue, 30 Oct 2018 21:41:13 -0700
From: Andy Lutomirski <luto@...capital.net>
To: Matthew Wilcox <willy@...radead.org>
Cc: Igor Stoppa <igor.stoppa@...il.com>, Tycho Andersen <tycho@...ho.ws>,
    Kees Cook <keescook@...omium.org>, Peter Zijlstra <peterz@...radead.org>,
    Mimi Zohar <zohar@...ux.vnet.ibm.com>, Dave Chinner <david@...morbit.com>,
    James Morris <jmorris@...ei.org>, Michal Hocko <mhocko@...nel.org>,
    Kernel Hardening <kernel-hardening@...ts.openwall.com>,
    linux-integrity <linux-integrity@...r.kernel.org>,
    LSM List <linux-security-module@...r.kernel.org>,
    Igor Stoppa <igor.stoppa@...wei.com>,
    Dave Hansen <dave.hansen@...ux.intel.com>, Jonathan Corbet <corbet@....net>,
    Laura Abbott <labbott@...hat.com>, Randy Dunlap <rdunlap@...radead.org>,
    Mike Rapoport <rppt@...ux.vnet.ibm.com>,
    "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
    LKML <linux-kernel@...r.kernel.org>, Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH 10/17] prmem: documentation

On Tue, Oct 30, 2018 at 2:36 PM Matthew Wilcox <willy@...radead.org> wrote:
>
> On Tue, Oct 30, 2018 at 10:43:14PM +0200, Igor Stoppa wrote:
> > On 30/10/2018 21:20, Matthew Wilcox wrote:
> > > > > So the API might look something like this:
> > > > >
> > > > >     void *p = rare_alloc(...);      /* writable pointer */
> > > > >     p->a = x;
> > > > >     q = rare_protect(p);            /* read-only pointer */
> >
> > With pools and memory allocated from vmap_areas, I was able to say
> >
> > protect(pool)
> >
> > and that would do a swipe on all the pages currently in use.
> > In the SELinux policyDB, for example, one doesn't really want to
> > individually protect each allocation.
> >
> > The loading phase happens usually at boot, when the system can be assumed to
> > be sane (one might even preload a bare-bone set of rules from initramfs and
> > then replace it later on, with the full blown set).
> >
> > There is no need to process each of these tens of thousands allocations and
> > initialization as write-rare.
> >
> > Would it be possible to do the same here?
>
> What Andy is proposing effectively puts all rare allocations into
> one pool. Although I suppose it could be generalised to multiple pools
> ... one mm_struct per pool. Andy, what do you think to doing that?

Hmm. Let's see.

To clarify some of this thread, I think that the fact that rare_write
uses an mm_struct and alias mappings under the hood should be
completely invisible to users of the API. No one should ever be
handed a writable pointer to rare_write memory (except perhaps during
bootup or when initializing a large complex data structure that will
be rare_write but isn't yet, e.g. the policy db).

For example, there could easily be architectures where having a
writable alias is problematic. On such architectures, an entirely
different mechanism might work better. And, if a tool like KNOX ever
becomes a *part* of the Linux kernel (hint hint!)

If you have multiple pools and one mm_struct per pool, you'll need a
way to find the mm_struct from a given allocation. Regardless of how
the mm_structs are set up, changing rare_write memory to normal memory
or vice versa will require a global TLB flush (all ASIDs and global
pages) on all CPUs, so having extra mm_structs doesn't seem to buy
much.

(It's just possible that changing rare_write back to normal might be
able to avoid the flush if the spurious faults can be handled
reliably.)
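
To make the two points above concrete, here is a rough userspace model,
not kernel code. Every name in it is hypothetical (rare_pool_create,
rare_pool_of, mm_id, and a rare_alloc that returns a const pointer
rather than the writable one in the API quoted above), and nothing is
actually write-protected. It only shows the shape: callers hold
read-only pointers, all writes go through one function, and with
several pools that function first has to find the pool (and so the mm)
that owns the address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Userspace model only: all names are hypothetical and no memory is
 * really protected. A plain array stands in for the pool's pages and
 * an integer stands in for the pool's mm_struct.
 */
#define RARE_POOL_SIZE 256
#define RARE_MAX_POOLS 4

struct rare_pool {
	unsigned char mem[RARE_POOL_SIZE];	/* stands in for the pool's pages */
	size_t used;				/* bump-allocator offset */
	int mm_id;				/* stands in for the pool's mm_struct */
};

static struct rare_pool rare_pools[RARE_MAX_POOLS];
static size_t rare_nr_pools;

/* Create a pool; returns NULL when the pool table is full. */
static struct rare_pool *rare_pool_create(void)
{
	struct rare_pool *pool;

	if (rare_nr_pools == RARE_MAX_POOLS)
		return NULL;
	pool = &rare_pools[rare_nr_pools];
	pool->used = 0;
	pool->mm_id = (int)rare_nr_pools;
	rare_nr_pools++;
	return pool;
}

/* Callers only ever receive a const pointer; NULL when the pool is full. */
static const void *rare_alloc(struct rare_pool *pool, size_t len)
{
	size_t aligned = (len + 7) & ~(size_t)7;	/* round up to 8 bytes */
	const void *p;

	if (aligned < len || aligned > RARE_POOL_SIZE - pool->used)
		return NULL;
	p = pool->mem + pool->used;
	pool->used += aligned;
	return p;
}

/* The extra lookup multiple pools need: which pool owns this address? */
static struct rare_pool *rare_pool_of(const void *p)
{
	uintptr_t addr = (uintptr_t)p;
	size_t i;

	for (i = 0; i < rare_nr_pools; i++) {
		uintptr_t start = (uintptr_t)rare_pools[i].mem;

		if (addr >= start && addr - start < rare_pools[i].used)
			return &rare_pools[i];
	}
	return NULL;
}

/* The only write path: 0 on success, -1 if dst is not allocated rare memory. */
static int rare_write(const void *dst, const void *src, size_t len)
{
	struct rare_pool *pool = rare_pool_of(dst);
	size_t off;

	if (!pool)
		return -1;
	off = (size_t)((const unsigned char *)dst - pool->mem);
	if (len > pool->used - off)
		return -1;
	/* A kernel would switch to the pool's mm and use the alias here. */
	memcpy(pool->mem + off, src, len);
	return 0;
}
```

With a single pool, rare_pool_of() collapses to one range check, which
is roughly why extra pools add bookkeeping without obviously buying
anything. The linear scan is just the simplest thing that works for a
sketch; a real implementation would presumably want something cheaper.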