![]() |
|
Message-ID: <5c2383449991a272ab3e3fe63b758e894172dd66.camel@intel.com> Date: Wed, 12 Dec 2018 19:51:18 +0000 From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com> To: "luto@...nel.org" <luto@...nel.org> CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "daniel@...earbox.net" <daniel@...earbox.net>, "jeyu@...nel.org" <jeyu@...nel.org>, "rostedt@...dmis.org" <rostedt@...dmis.org>, "ast@...nel.org" <ast@...nel.org>, "ard.biesheuvel@...aro.org" <ard.biesheuvel@...aro.org>, "linux-mm@...ck.org" <linux-mm@...ck.org>, "jannh@...gle.com" <jannh@...gle.com>, "Dock, Deneen T" <deneen.t.dock@...el.com>, "kristen@...ux.intel.com" <kristen@...ux.intel.com>, "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>, "will.deacon@....com" <will.deacon@....com>, "mingo@...hat.com" <mingo@...hat.com>, "namit@...are.com" <namit@...are.com>, "kernel-hardening@...ts.openwall.com" <kernel-hardening@...ts.openwall.com>, "Keshavamurthy, Anil S" <anil.s.keshavamurthy@...el.com>, "mhiramat@...nel.org" <mhiramat@...nel.org>, "naveen.n.rao@...ux.vnet.ibm.com" <naveen.n.rao@...ux.vnet.ibm.com>, "davem@...emloft.net" <davem@...emloft.net>, "netdev@...r.kernel.org" <netdev@...r.kernel.org>, "Hansen, Dave" <dave.hansen@...el.com> Subject: Re: [PATCH v2 4/4] x86/vmalloc: Add TLB efficient x86 arch_vunmap On Tue, 2018-12-11 at 18:24 -0800, Andy Lutomirski wrote: > On Tue, Dec 11, 2018 at 4:12 PM Rick Edgecombe > <rick.p.edgecombe@...el.com> wrote: > > > > This adds a more efficient x86 architecture specific implementation of > > arch_vunmap, that can free any type of special permission memory with only 1 > > TLB > > flush. > > > > In order to enable this, _set_pages_p and _set_pages_np are made non-static > > and > > renamed set_pages_p_noflush and set_pages_np_noflush to better communicate > > their different (non-flushing) behavior from the rest of the set_pages_* > > functions. > > > > The method for doing this with only 1 TLB flush was suggested by Andy > > Lutomirski. > > > > Suggested-by: Andy Lutomirski <luto@...nel.org> > > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@...el.com> > > --- > > arch/x86/include/asm/set_memory.h | 2 + > > arch/x86/mm/Makefile | 3 +- > > arch/x86/mm/pageattr.c | 11 +++-- > > arch/x86/mm/vmalloc.c | 71 +++++++++++++++++++++++++++++++ > > 4 files changed, 80 insertions(+), 7 deletions(-) > > create mode 100644 arch/x86/mm/vmalloc.c > > > > diff --git a/arch/x86/include/asm/set_memory.h > > b/arch/x86/include/asm/set_memory.h > > index 07a25753e85c..70ee81e8914b 100644 > > --- a/arch/x86/include/asm/set_memory.h > > +++ b/arch/x86/include/asm/set_memory.h > > @@ -84,6 +84,8 @@ int set_pages_x(struct page *page, int numpages); > > int set_pages_nx(struct page *page, int numpages); > > int set_pages_ro(struct page *page, int numpages); > > int set_pages_rw(struct page *page, int numpages); > > +int set_pages_np_noflush(struct page *page, int numpages); > > +int set_pages_p_noflush(struct page *page, int numpages); > > > > extern int kernel_set_to_readonly; > > void set_kernel_text_rw(void); > > diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile > > index 4b101dd6e52f..189681f863a6 100644 > > --- a/arch/x86/mm/Makefile > > +++ b/arch/x86/mm/Makefile > > @@ -13,7 +13,8 @@ CFLAGS_REMOVE_mem_encrypt_identity.o = -pg > > endif > > > > obj-y := init.o init_$(BITS).o fault.o ioremap.o extable.o pageattr.o > > mmap.o \ > > - pat.o pgtable.o physaddr.o setup_nx.o tlb.o cpu_entry_area.o > > + pat.o pgtable.o physaddr.o setup_nx.o tlb.o cpu_entry_area.o \ > > + vmalloc.o > > > > # Make sure __phys_addr has no stackprotector > > nostackp := $(call cc-option, -fno-stack-protector) > > diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c > > index db7a10082238..db0a4dfb5a7f 100644 > > --- a/arch/x86/mm/pageattr.c > > +++ b/arch/x86/mm/pageattr.c > > @@ -2248,9 +2248,7 @@ int set_pages_rw(struct page *page, int numpages) > > return set_memory_rw(addr, numpages); > > } > > > > -#ifdef CONFIG_DEBUG_PAGEALLOC > > - > > -static int __set_pages_p(struct page *page, int numpages) > > +int set_pages_p_noflush(struct page *page, int numpages) > > Maybe set_pages_rwp_noflush()? Sure. > > diff --git a/arch/x86/mm/vmalloc.c b/arch/x86/mm/vmalloc.c > > new file mode 100644 > > index 000000000000..be9ea42c3dfe > > --- /dev/null > > +++ b/arch/x86/mm/vmalloc.c > > @@ -0,0 +1,71 @@ > > +// SPDX-License-Identifier: GPL-2.0 > > +/* > > + * vmalloc.c: x86 arch version of vmalloc.c > > + * > > + * (C) Copyright 2018 Intel Corporation > > + * > > + * This program is free software; you can redistribute it and/or > > + * modify it under the terms of the GNU General Public License > > + * as published by the Free Software Foundation; version 2 > > + * of the License. > > This paragraph may be redundant with the SPDX line. Ok. > > + */ > > + > > +#include <linux/mm.h> > > +#include <linux/set_memory.h> > > +#include <linux/vmalloc.h> > > + > > +static void set_area_direct_np(struct vm_struct *area) > > +{ > > + int i; > > + > > + for (i = 0; i < area->nr_pages; i++) > > + set_pages_np_noflush(area->pages[i], 1); > > +} > > + > > +static void set_area_direct_prw(struct vm_struct *area) > > +{ > > + int i; > > + > > + for (i = 0; i < area->nr_pages; i++) > > + set_pages_p_noflush(area->pages[i], 1); > > +} > > + > > +void arch_vunmap(struct vm_struct *area, int deallocate_pages) > > +{ > > + int immediate = area->flags & VM_IMMEDIATE_UNMAP; > > + int special = area->flags & VM_HAS_SPECIAL_PERMS; > > + > > + /* Unmap from vmalloc area */ > > + remove_vm_area(area->addr); > > + > > + /* If no need to reset directmap perms, just check if need to flush > > */ > > + if (!(deallocate_pages || special)) { > > + if (immediate) > > + vm_unmap_aliases(); > > + return; > > + } > > + > > + /* From here we need to make sure to reset the direct map perms */ > > + > > + /* > > + * If the area being freed does not have any extra capabilities, we > > can > > + * just reset the directmap to RW before freeing. > > + */ > > + if (!immediate) { > > + set_area_direct_prw(area); > > + vm_unmap_aliases(); > > + return; > > + } > > + > > + /* > > + * If the vm being freed has security sensitive capabilities such as > > + * executable we need to make sure there is no W window on the > > directmap > > + * before removing the X in the TLB. So we set not present first so > > we > > + * can flush without any other CPU picking up the mapping. Then we > > reset > > + * RW+P without a flush, since NP prevented it from being cached by > > + * other cpus. > > + */ > > + set_area_direct_np(area); > > + vm_unmap_aliases(); > > + set_area_direct_prw(area); > > Here you're using "immediate" as a proxy for "was executable". And > it's barely faster to omit immediate -- it's the same number of > flushes, and all you save is one pass over the direct map. Not sure I understand, I think its still generic to "special capabilities". You mean fix the comment? > Do we really need to support all these combinations? Even if we do > support them, I think that "immediate" needs a better name. It saves TLB flushes in the generic implementation for the just RO case, but yea here it doesn't save much here.
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.