Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 17 Jan 2019 22:39:15 +0000
From: Nadav Amit <namit@...are.com>
To: "H. Peter Anvin" <hpa@...or.com>
CC: Masami Hiramatsu <mhiramat@...nel.org>, Rick Edgecombe
	<rick.p.edgecombe@...el.com>, Andy Lutomirski <luto@...nel.org>, Ingo Molnar
	<mingo@...hat.com>, LKML <linux-kernel@...r.kernel.org>, X86 ML
	<x86@...nel.org>, Thomas Gleixner <tglx@...utronix.de>, Borislav Petkov
	<bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>, Peter Zijlstra
	<peterz@...radead.org>, Damian Tometzki <linux_dti@...oud.com>,
	linux-integrity <linux-integrity@...r.kernel.org>, LSM List
	<linux-security-module@...r.kernel.org>, Andrew Morton
	<akpm@...ux-foundation.org>, Kernel Hardening
	<kernel-hardening@...ts.openwall.com>, Linux-MM <linux-mm@...ck.org>, Will
 Deacon <will.deacon@....com>, Ard Biesheuvel <ard.biesheuvel@...aro.org>,
	Kristen Carlson Accardi <kristen@...ux.intel.com>, "Dock, Deneen T"
	<deneen.t.dock@...el.com>, Kees Cook <keescook@...omium.org>, Dave Hansen
	<dave.hansen@...el.com>
Subject: Re: [PATCH 01/17] Fix "x86/alternatives: Lockdep-enforce text_mutex
 in text_poke*()"

> On Jan 17, 2019, at 1:15 PM, hpa@...or.com wrote:
> 
> On January 16, 2019 10:47:01 PM PST, Masami Hiramatsu <mhiramat@...nel.org> wrote:
>> On Wed, 16 Jan 2019 16:32:43 -0800
>> Rick Edgecombe <rick.p.edgecombe@...el.com> wrote:
>> 
>>> From: Nadav Amit <namit@...are.com>
>>> 
>>> text_mutex is currently expected to be held before text_poke() is
>>> called, but we kgdb does not take the mutex, and instead *supposedly*
>>> ensures the lock is not taken and will not be acquired by any other
>> core
>>> while text_poke() is running.
>>> 
>>> The reason for the "supposedly" comment is that it is not entirely
>> clear
>>> that this would be the case if gdb_do_roundup is zero.
>>> 
>>> This patch creates two wrapper functions, text_poke() and
>>> text_poke_kgdb() which do or do not run the lockdep assertion
>>> respectively.
>>> 
>>> While we are at it, change the return code of text_poke() to
>> something
>>> meaningful. One day, callers might actually respect it and the
>> existing
>>> BUG_ON() when patching fails could be removed. For kgdb, the return
>>> value can actually be used.
>> 
>> Looks good to me.
>> 
>> Reviewed-by: Masami Hiramatsu <mhiramat@...nel.org>
>> 
>> Thank you,
>> 
>>> Cc: Andy Lutomirski <luto@...nel.org>
>>> Cc: Kees Cook <keescook@...omium.org>
>>> Cc: Dave Hansen <dave.hansen@...el.com>
>>> Cc: Masami Hiramatsu <mhiramat@...nel.org>
>>> Fixes: 9222f606506c ("x86/alternatives: Lockdep-enforce text_mutex in
>> text_poke*()")
>>> Suggested-by: Peter Zijlstra <peterz@...radead.org>
>>> Acked-by: Jiri Kosina <jkosina@...e.cz>
>>> Signed-off-by: Nadav Amit <namit@...are.com>
>>> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@...el.com>
>>> ---
>>> arch/x86/include/asm/text-patching.h |  1 +
>>> arch/x86/kernel/alternative.c        | 52
>> ++++++++++++++++++++--------
>>> arch/x86/kernel/kgdb.c               | 11 +++---
>>> 3 files changed, 45 insertions(+), 19 deletions(-)
>>> 
>>> diff --git a/arch/x86/include/asm/text-patching.h
>> b/arch/x86/include/asm/text-patching.h
>>> index e85ff65c43c3..f8fc8e86cf01 100644
>>> --- a/arch/x86/include/asm/text-patching.h
>>> +++ b/arch/x86/include/asm/text-patching.h
>>> @@ -35,6 +35,7 @@ extern void *text_poke_early(void *addr, const void
>> *opcode, size_t len);
>>>  * inconsistent instruction while you patch.
>>>  */
>>> extern void *text_poke(void *addr, const void *opcode, size_t len);
>>> +extern void *text_poke_kgdb(void *addr, const void *opcode, size_t
>> len);
>>> extern int poke_int3_handler(struct pt_regs *regs);
>>> extern void *text_poke_bp(void *addr, const void *opcode, size_t
>> len, void *handler);
>>> extern int after_bootmem;
>>> diff --git a/arch/x86/kernel/alternative.c
>> b/arch/x86/kernel/alternative.c
>>> index ebeac487a20c..c6a3a10a2fd5 100644
>>> --- a/arch/x86/kernel/alternative.c
>>> +++ b/arch/x86/kernel/alternative.c
>>> @@ -678,18 +678,7 @@ void *__init_or_module text_poke_early(void
>> *addr, const void *opcode,
>>> return addr;
>>> }
>>> 
>>> -/**
>>> - * text_poke - Update instructions on a live kernel
>>> - * @addr: address to modify
>>> - * @opcode: source of the copy
>>> - * @len: length to copy
>>> - *
>>> - * Only atomic text poke/set should be allowed when not doing early
>> patching.
>>> - * It means the size must be writable atomically and the address
>> must be aligned
>>> - * in a way that permits an atomic write. It also makes sure we fit
>> on a single
>>> - * page.
>>> - */
>>> -void *text_poke(void *addr, const void *opcode, size_t len)
>>> +static void *__text_poke(void *addr, const void *opcode, size_t len)
>>> {
>>> 	unsigned long flags;
>>> 	char *vaddr;
>>> @@ -702,8 +691,6 @@ void *text_poke(void *addr, const void *opcode,
>> size_t len)
>>>  */
>>> 	BUG_ON(!after_bootmem);
>>> 
>>> -	lockdep_assert_held(&text_mutex);
>>> -
>>> 	if (!core_kernel_text((unsigned long)addr)) {
>>> 		pages[0] = vmalloc_to_page(addr);
>>> 		pages[1] = vmalloc_to_page(addr + PAGE_SIZE);
>>> @@ -732,6 +719,43 @@ void *text_poke(void *addr, const void *opcode,
>> size_t len)
>>> return addr;
>>> }
>>> 
>>> +/**
>>> + * text_poke - Update instructions on a live kernel
>>> + * @addr: address to modify
>>> + * @opcode: source of the copy
>>> + * @len: length to copy
>>> + *
>>> + * Only atomic text poke/set should be allowed when not doing early
>> patching.
>>> + * It means the size must be writable atomically and the address
>> must be aligned
>>> + * in a way that permits an atomic write. It also makes sure we fit
>> on a single
>>> + * page.
>>> + */
>>> +void *text_poke(void *addr, const void *opcode, size_t len)
>>> +{
>>> +	lockdep_assert_held(&text_mutex);
>>> +
>>> +	return __text_poke(addr, opcode, len);
>>> +}
>>> +
>>> +/**
>>> + * text_poke_kgdb - Update instructions on a live kernel by kgdb
>>> + * @addr: address to modify
>>> + * @opcode: source of the copy
>>> + * @len: length to copy
>>> + *
>>> + * Only atomic text poke/set should be allowed when not doing early
>> patching.
>>> + * It means the size must be writable atomically and the address
>> must be aligned
>>> + * in a way that permits an atomic write. It also makes sure we fit
>> on a single
>>> + * page.
>>> + *
>>> + * Context: should only be used by kgdb, which ensures no other core
>> is running,
>>> + *	    despite the fact it does not hold the text_mutex.
>>> + */
>>> +void *text_poke_kgdb(void *addr, const void *opcode, size_t len)
>>> +{
>>> +	return __text_poke(addr, opcode, len);
>>> +}
>>> +
>>> static void do_sync_core(void *info)
>>> {
>>> 	sync_core();
>>> diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
>>> index 5db08425063e..1461544cba8b 100644
>>> --- a/arch/x86/kernel/kgdb.c
>>> +++ b/arch/x86/kernel/kgdb.c
>>> @@ -758,13 +758,13 @@ int kgdb_arch_set_breakpoint(struct kgdb_bkpt
>> *bpt)
>>> if (!err)
>>> 		return err;
>>> 	/*
>>> -	 * It is safe to call text_poke() because normal kernel execution
>>> +	 * It is safe to call text_poke_kgdb() because normal kernel
>> execution
>>>  * is stopped on all cores, so long as the text_mutex is not
>> locked.
>>>  */
>>> 	if (mutex_is_locked(&text_mutex))
>>> 		return -EBUSY;
>>> -	text_poke((void *)bpt->bpt_addr, arch_kgdb_ops.gdb_bpt_instr,
>>> -		  BREAK_INSTR_SIZE);
>>> +	text_poke_kgdb((void *)bpt->bpt_addr, arch_kgdb_ops.gdb_bpt_instr,
>>> +		       BREAK_INSTR_SIZE);
>>> 	err = probe_kernel_read(opc, (char *)bpt->bpt_addr,
>> BREAK_INSTR_SIZE);
>>> if (err)
>>> 		return err;
>>> @@ -783,12 +783,13 @@ int kgdb_arch_remove_breakpoint(struct
>> kgdb_bkpt *bpt)
>>> if (bpt->type != BP_POKE_BREAKPOINT)
>>> 		goto knl_write;
>>> 	/*
>>> -	 * It is safe to call text_poke() because normal kernel execution
>>> +	 * It is safe to call text_poke_kgdb() because normal kernel
>> execution
>>>  * is stopped on all cores, so long as the text_mutex is not
>> locked.
>>>  */
>>> 	if (mutex_is_locked(&text_mutex))
>>> 		goto knl_write;
>>> -	text_poke((void *)bpt->bpt_addr, bpt->saved_instr,
>> BREAK_INSTR_SIZE);
>>> +	text_poke_kgdb((void *)bpt->bpt_addr, bpt->saved_instr,
>>> +		       BREAK_INSTR_SIZE);
>>> 	err = probe_kernel_read(opc, (char *)bpt->bpt_addr,
>> BREAK_INSTR_SIZE);
>>> if (err || memcmp(opc, bpt->saved_instr, BREAK_INSTR_SIZE))
>>> 		goto knl_write;
>>> -- 
>>> 2.17.1
> 
> If you are reorganizing this code, please do so so that the caller doesn’t
> have to worry about if it should call text_poke_bp() or text_poke_early().
> Right now the caller had to know that, which makes no sense.

Did you look at "[11/17] x86/jump-label: remove support for custom poker”?

https://lore.kernel.org/patchwork/patch/1032857/

If this is not what you regard, please be more concrete. text_poke_early()
is still used directly on init and while modules are loaded, which might not
be great, but is outside of the scope of this patch-set.

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.