Date: Tue, 20 Dec 2016 12:55:02 +0200
From: Liljestrand Hans <ishkamiel@...il.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: "Reshetova, Elena" <elena.reshetova@...el.com>, 
 "kernel-hardening@...ts.openwall.com"
 <kernel-hardening@...ts.openwall.com>, Greg KH
 <gregkh@...uxfoundation.org>,  Kees Cook <keescook@...omium.org>,
 "will.deacon@....com" <will.deacon@....com>, Boqun Feng
 <boqun.feng@...il.com>, David Windsor <dwindsor@...il.com>, "aik@...abs.ru"
 <aik@...abs.ru>, "david@...son.dropbear.id.au" <david@...son.dropbear.id.au>
Subject: Re: Conversion from atomic_t to refcount_t: summary of issues

On Tue, 2016-12-20 at 10:41 +0100, Peter Zijlstra wrote:
> On Tue, Dec 20, 2016 at 09:13:58AM +0000, Reshetova, Elena wrote:
> > > On Mon, Dec 19, 2016 at 07:55:15AM +0000, Reshetova, Elena wrote:
> > > > Well, again, you are right in theory, but in practice for example for
> > > > struct sched_group { atomic_t ref; ... }:
> > > >
> > > > http://lxr.free-electrons.com/source/kernel/sched/core.c#L6178
> > > >
> > > > To me this is a refcounter that needs the protection.
> > > 
> > > Only if you have more than UINT_MAX CPUs or something like that.
> > > 
> > > And if you really really want to use refcount_t there, you could +1 the
> > > scheme and it'd work again.
> > 
> > Well, yes, probably, but there are many cases like this in practice,
> > so we would need to have a good plan how to get it all submitted and
> > tested properly. The current patch set is already bigger than what we
> > had before and it is only growing.  Hans will provide more info later
> > today based on his testing, which shows many places in kernel core
> > where we DO actually have increment on zero happening in practice and
> > whole kernel doesn't even boot with the strictest approach (refusing
> > to inc on zero). And we are only able to test for x86.... 
> > 
> > Given the massive amount of changes, it would be good to merge this at
> > least in couple of stages: 
> > 
> > 1) first soft version of refcount_t API which at least allows
> >    increment on zero and all atomic_t used as refcounter occurrences
> >    that don't require reference counter scheme change (+1 or other)
> > 2) patch set that fixes all problematic places (potentially with code
> >    rewrite)
> > 3) patch that removes possibility of inc on zero from refcount_t
> 
> I don't get it. Why ?
> 
> Just leave the weird and problematic cases using atomic_t. Its far
> harder to remove crap later.

Yes, ideally we would either fix them or leave them as atomic_t. One
reason for the proposal is the subtle places that might not get caught
in audit/testing. In those cases, allowing refcount_inc to increment on
0 (with a WARN) would ensure the code still works.
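
To make the difference concrete, here is a rough userspace sketch of the
two behaviours being discussed. It is not the proposed refcount_t code:
the names (sketch_refcount_t, sketch_inc_strict, sketch_inc_soft,
sketch_warnings) are made up for illustration, and C11 atomics stand in
for the kernel's atomic_t helpers.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in; not the kernel's refcount_t. */
typedef struct { atomic_uint refs; } sketch_refcount_t;

/* Counts the places where the kernel version would WARN. */
static atomic_uint sketch_warnings;

/* Strict variant: refuse to increment from zero. Returns false if refused. */
static bool sketch_inc_strict(sketch_refcount_t *r)
{
    unsigned int old = atomic_load(&r->refs);

    do {
        if (old == 0) {
            atomic_fetch_add(&sketch_warnings, 1);
            return false;
        }
        /* On failure the CAS reloads 'old', so the zero check is repeated. */
    } while (!atomic_compare_exchange_weak(&r->refs, &old, old + 1));

    return true;
}

/* Soft variant: warn on increment from zero, but perform it anyway. */
static void sketch_inc_soft(sketch_refcount_t *r)
{
    unsigned int old = atomic_fetch_add(&r->refs, 1);

    if (old == 0) {
        atomic_fetch_add(&sketch_warnings, 1);
        fprintf(stderr, "sketch: increment on 0\n");
    }
}
```

With the strict variant, a caller that increments from zero silently
keeps an object without holding a reference to it unless it checks the
return value; with the soft variant the count stays consistent and the
warning points at the call site that needs auditing.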

We were also hoping reviewing might have been easier with that
separation, but perhaps that was misguided, and separating/skipping the
weird places might serve the same purpose without mucking with the API.


For reference, I've listed here the places that were causing "increment
on 0" WARNs on my previous boot (I temporarily allowed inc on 0 to make
boot possible). These seem to be mostly related to resource reuse, but
we haven't yet looked in detail at how to deal with them.

fs/ext4/mballoc.c:3399          ext4_mb_use_preallocated
        Seems to have separate tracking of destruction
net/ipv4/fib_semantics.c:994    fib_create_info
net/ipv4/devinet.c:233          inetdev_init
net/ipv4/tcp_ipv4.c:1793        inet_sk_rx_dst_set
net/ipv4/route.c:2153           __ip_route_output_key_hash
net/ipv6/ip6_fib.c:949          fib6_add
net/ipv6/route.c:1048           ip6_pol_route
net/ipv6/addrconf.c:930         ipv6_add_addr
net/ipv6/addrconf.c:357         ipv6_add_dev
net/core/filter.c:940           sk_filter_charge
        net stuff related to caching?
fs/inode.c:813                  find_inode_fast
        Seems to reuse freeing resources?
mm/backing-dev.c:399            wb_congested_get_create
        Initializes to 0

There are also some places that initialize the refcounts to zero (either
using REFCOUNT_INIT or refcount_set). Some of these places are quite
confusing (at least to me), so the idea was that doing the changes
incrementally might keep them more manageable.
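
As a rough illustration of the "+1 the scheme" idea mentioned earlier in
the thread, a counter that currently starts at zero could instead start
at one, with the creator owning that first reference. The sketch below
is userspace C with made-up names (biased_ref_t, biased_ref_init,
biased_ref_get, biased_ref_put); it only assumes the general pattern and
is not taken from any of the call sites listed above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical type for illustration; not a kernel structure. */
typedef struct { atomic_uint refs; } biased_ref_t;

/* Start at 1: the creator (e.g. a cache or list) owns one reference. */
static void biased_ref_init(biased_ref_t *r)
{
    atomic_init(&r->refs, 1);
}

/* Take a reference; fails only if the count has already dropped to zero. */
static bool biased_ref_get(biased_ref_t *r)
{
    unsigned int old = atomic_load(&r->refs);

    do {
        if (old == 0)
            return false;
    } while (!atomic_compare_exchange_weak(&r->refs, &old, old + 1));

    return true;
}

/* Drop a reference; returns true when the caller dropped the last one. */
static bool biased_ref_put(biased_ref_t *r)
{
    return atomic_fetch_sub(&r->refs, 1) == 1;
}
```

In this arrangement zero only ever means "dead", so an increment from
zero is always a bug rather than a normal first use.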

Regards,
-hans liljestrand
