Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 7 Jul 2016 13:37:43 -0400
From: Kees Cook <>
To: Arnd Bergmann <>
Cc: "" <>, LKML <>, 
	Jan Kara <>, 
	"" <>, 
	Catalin Marinas <>, Will Deacon <>, 
	Linux-MM <>, sparclinux <>,, Christoph Lameter <>, 
	Andrea Arcangeli <>, linux-arch <>, 
	"" <>, Russell King <>, 
	"" <>, PaX Team <>, 
	Mathias Krause <>, Fenghua Yu <>, 
	Rik van Riel <>, David Rientjes <>, Tony Luck <>, 
	Andy Lutomirski <>, Joonsoo Kim <>, 
	Dmitry Vyukov <>, Laura Abbott <>, 
	Brad Spengler <>, Ard Biesheuvel <>, 
	Pekka Enberg <>, Casey Schaufler <>, 
	Andrew Morton <>, "David S. Miller" <>
Subject: Re: [PATCH 1/9] mm: Hardened usercopy

On Thu, Jul 7, 2016 at 4:01 AM, Arnd Bergmann <> wrote:
> On Wednesday, July 6, 2016 3:25:20 PM CEST Kees Cook wrote:
>> This is the start of porting PAX_USERCOPY into the mainline kernel. This
>> is the first set of features, controlled by CONFIG_HARDENED_USERCOPY. The
>> work is based on code by PaX Team and Brad Spengler, and an earlier port
>> from Casey Schaufler. Additional non-slab page tests are from Rik van Riel.
>> This patch contains the logic for validating several conditions when
>> performing copy_to_user() and copy_from_user() on the kernel object
>> being copied to/from:
>> - address range doesn't wrap around
>> - address range isn't NULL or zero-allocated (with a non-zero copy size)
>> - if on the slab allocator:
>>   - object size must be less than or equal to copy size (when check is
>>     implemented in the allocator, which appear in subsequent patches)
>> - otherwise, object must not span page allocations
>> - if on the stack
>>   - object must not extend before/after the current process task
>>   - object must be contained by the current stack frame (when there is
>>     arch/build support for identifying stack frames)
>> - object must not overlap with kernel text
>> Signed-off-by: Kees Cook <>
> Nice!
> I have a few further thoughts, most of which have probably been
> considered before:
>> +static inline const char *check_bogus_address(const void *ptr, unsigned long n)
>> +{
>> +     /* Reject if object wraps past end of memory. */
>> +     if (ptr + n < ptr)
>> +             return "<wrapped address>";
>> +
>> +     /* Reject if NULL or ZERO-allocation. */
>> +     if (ZERO_OR_NULL_PTR(ptr))
>> +             return "<null>";
>> +
>> +     return NULL;
>> +}
> This checks against address (void*)16, but I guess on most architectures the
> lowest possible kernel address is much higher. While there may not be much
> that to exploit if the expected kernel address points to userland, forbidding
> any obviously incorrect address that is outside of the kernel may be easier.
> Even on architectures like s390 that start the kernel memory at (void *)0x0,
> the lowest address to which we may want to do a copy_to_user would be much
> higher than (void*)0x16.

Yeah, that's worth exploring, but given the shenanigans around
set_fs(), I'd like to leave this as-is, and we can add to these checks
as we remove as much of the insane usage of set_fs().

>> +
>> +     /* Allow kernel rodata region (if not marked as Reserved). */
>> +     if (ptr >= (const void *)__start_rodata &&
>> +         end <= (const void *)__end_rodata)
>> +             return NULL;
> Should we explicitly forbid writing to rodata, or is it enough to
> rely on page protection here?

Hm, interesting. That's a very small check to add. My knee-jerk is to
just leave it up to page protection. I'm on the fence. :)

>> +     /* Allow kernel bss region (if not marked as Reserved). */
>> +     if (ptr >= (const void *)__bss_start &&
>> +         end <= (const void *)__bss_stop)
>> +             return NULL;
> accesses to .data/.rodata/.bss are probably not performance critical,
> so we could go further here and check the kallsyms table to ensure
> that we are not spanning multiple symbols here.

Oh, interesting! Yeah, would you be willing to put together that patch
and test it? I wonder if there are any cases where there are
legitimate usercopys across multiple symbols.

> For stuff that is performance critical, should there be a way to
> opt out of the checks, or do we assume it already uses functions
> that avoid the checks? I looked at the file and network I/O path
> briefly and they seem to use kmap_atomic() to get to the user pages
> at least in some of the common cases (but I may well be missing
> important ones).

I don't want to start with an exemption here, so until such a case is
found, I'd rather leave this as-is. That said, the primary protection
here tends to be buggy lengths (which is why put/get_user() is
untouched). For constant-sized copies, some checks could be skipped.
In the second part of this protection (what I named
CONFIG_HARDENED_USERCOPY_WHITELIST in the RFC version of this series),
there are cases where we want to skip the whitelist checking since it
is for a constant-sized copy the code understands is okay to pull out
of an otherwise disallowed allocator object.


Kees Cook
Chrome OS & Brillo Security

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.