kernel-hardening - Re: [PATCH] slub: Introduce CONFIG_SLUB_RCU

Message-ID: <CAG48ez34DN_xsj7hio8epvoE8hM3F_xFoqwWYM-_LVZb39_e9A@mail.gmail.com>
Date: Mon, 28 Aug 2023 16:39:33 +0200
From: Jann Horn <jannh@...gle.com>
To: Dmitry Vyukov <dvyukov@...gle.com>
Cc: Andrey Ryabinin <ryabinin.a.a@...il.com>, Christoph Lameter <cl@...ux.com>, 
	Pekka Enberg <penberg@...nel.org>, David Rientjes <rientjes@...gle.com>, 
	Joonsoo Kim <iamjoonsoo.kim@....com>, Vlastimil Babka <vbabka@...e.cz>, 
	Alexander Potapenko <glider@...gle.com>, Andrey Konovalov <andreyknvl@...il.com>, 
	Vincenzo Frascino <vincenzo.frascino@....com>, Andrew Morton <akpm@...ux-foundation.org>, 
	Roman Gushchin <roman.gushchin@...ux.dev>, Hyeonggon Yoo <42.hyeyoo@...il.com>, 
	kasan-dev@...glegroups.com, linux-kernel@...r.kernel.org, linux-mm@...ck.org, 
	linux-hardening@...r.kernel.org, kernel-hardening@...ts.openwall.com
Subject: Re: [PATCH] slub: Introduce CONFIG_SLUB_RCU_DEBUG

On Sat, Aug 26, 2023 at 5:32 AM Dmitry Vyukov <dvyukov@...gle.com> wrote:
> On Fri, 25 Aug 2023 at 23:15, Jann Horn <jannh@...gle.com> wrote:
> > Currently, KASAN is unable to catch use-after-free in SLAB_TYPESAFE_BY_RCU
> > slabs because use-after-free is allowed within the RCU grace period by
> > design.
> >
> > Add a SLUB debugging feature which RCU-delays every individual
> > kmem_cache_free() before either actually freeing the object or handing it
> > off to KASAN, and change KASAN to poison freed objects as normal when this
> > option is enabled.
> >
> > Note that this creates a 16-byte unpoisoned area in the middle of the
> > slab metadata area, which kinda sucks but seems to be necessary in order
> > to be able to store an rcu_head in there without triggering a KASAN
> > splat during RCU callback processing.
>
> Nice!
>
> Can't we unpoison this rcu_head right before call_rcu() and repoison
> after receiving the callback?

Yeah, I think that should work. It looks like currently
kasan_unpoison() is exposed in include/linux/kasan.h but
kasan_poison() is not, and its inline definition probably means I
can't just move it out of mm/kasan/kasan.h into include/linux/kasan.h;
do you have a preference for how I should handle this? Hmm, and it
also looks like code outside of mm/kasan/ wouldn't know which values
are valid for the "value" argument to kasan_poison() anyway.
I have another feature idea that would also benefit from having
something like kasan_poison() available in include/linux/kasan.h, so I
would prefer that over adding another special-case function inside
KASAN for poisoning this piece of slab metadata...

I guess I could define a wrapper around kasan_poison() in
mm/kasan/generic.c that uses a new poison value for "some other part
of the kernel told us to poison this area", and then expose that
wrapper with a declaration in include/linux/kasan.h? Something like:

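/* Out-of-line wrapper so code outside mm/kasan/ can poison a range. */
void kasan_poison_outline(const void *addr, size_t size, bool init)
{
	/* KASAN_CUSTOM: proposed new poison value, does not exist yet */
	kasan_poison(addr, size, KASAN_CUSTOM, init);
}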
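
To make the unpoison/repoison idea concrete, here is a toy userspace
model of it. This is not kernel code and not the patch itself: the
shadow array, the sizes and offsets, the POISON_* values and all the
model_*() helpers are made up for illustration, and POISON_CUSTOM only
stands in for the proposed KASAN_CUSTOM. It just shows the window in
which the rcu_head would be accessible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy userspace model, NOT kernel code: one shadow byte per data byte. */
#define OBJ_BYTES     32   /* object payload (hypothetical size) */
#define META_BYTES    32   /* slab metadata after the object (hypothetical) */
#define HEAD_OFF      40   /* hypothetical offset of the rcu_head */
#define HEAD_BYTES    16   /* the 16-byte rcu_head area */
#define POISON_META   0xF1 /* made-up shadow value for metadata */
#define POISON_FREE   0xF2 /* made-up shadow value for freed objects */
#define POISON_CUSTOM 0xF3 /* stand-in for the proposed KASAN_CUSTOM */

static uint8_t shadow[OBJ_BYTES + META_BYTES];

/* Live object: payload accessible, metadata poisoned. */
static void model_init(void)
{
	memset(shadow, 0, OBJ_BYTES);
	memset(shadow + OBJ_BYTES, POISON_META, META_BYTES);
}

/* An access is valid only if every shadow byte in the range is zero. */
static bool model_access_ok(size_t off, size_t size)
{
	assert(off + size <= sizeof(shadow));
	for (size_t i = 0; i < size; i++)
		if (shadow[off + i] != 0)
			return false;
	return true;
}

/* Free path: open a window for the rcu_head right before call_rcu(). */
static void model_free_deferred(void)
{
	memset(shadow + HEAD_OFF, 0, HEAD_BYTES);
}

/* RCU callback: close the window again, then poison the freed object. */
static void model_rcu_callback(void)
{
	memset(shadow + HEAD_OFF, POISON_CUSTOM, HEAD_BYTES);
	memset(shadow, POISON_FREE, OBJ_BYTES);
}
```

In this model the rcu_head is only accessible between
model_free_deferred() and model_rcu_callback(), so there would be no
permanently unpoisoned hole in the metadata.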

> What happens on cache destruction?
> Currently we purge quarantine on cache destruction to be able to
> safely destroy the cache. I suspect we may need to somehow purge rcu
> callbacks as well, or do something else.

Ooh, good point, I hadn't thought about that... currently
shutdown_cache() assumes that all the objects have already been freed,
then puts the kmem_cache on a list for
slab_caches_to_rcu_destroy_workfn(), which then waits with an
rcu_barrier() until the slab's pages are all gone.

Luckily kmem_cache_destroy() is already a sleepable operation, so
maybe I should just slap another rcu_barrier() in there for builds
with this config option enabled... I think that should be fine for an
option mostly intended for debugging.
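
As a rough illustration of why the extra barrier would help, here is a
toy userspace model. It is not kernel code: struct toy_cache, the
pending[] queue and all toy_*() functions are hypothetical, and
toy_rcu_barrier() only mimics the "wait until all queued callbacks
have run" behaviour of rcu_barrier(). The rcu_debug flag stands in for
the new config option.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

/* Toy cache: counts objects that have not actually been freed yet. */
struct toy_cache {
	int allocated;
};

/* Queue of deferred frees, standing in for pending RCU callbacks. */
static struct toy_cache *pending[MAX_PENDING];
static size_t npending;

/* Deferred free: the object stays accounted until its callback runs. */
static bool toy_free_deferred(struct toy_cache *c)
{
	if (npending == MAX_PENDING)
		return false;
	pending[npending++] = c;
	return true;
}

/* Mimics rcu_barrier(): returns after every queued callback has run. */
static void toy_rcu_barrier(void)
{
	for (size_t i = 0; i < npending; i++)
		pending[i]->allocated--;
	npending = 0;
}

/* Returns true if the cache is empty and could be torn down. */
static bool toy_cache_destroy(struct toy_cache *c, bool rcu_debug)
{
	if (rcu_debug)
		toy_rcu_barrier(); /* flush deferred frees first */
	return c->allocated == 0;
}
```

Without the barrier, toy_cache_destroy() still sees the deferred
objects as allocated; with it, the queue is drained first and the
destroy can proceed.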