owl-dev - Re: kref

Message-ID: <20120812190141.GB3000@albatros>
Date: Sun, 12 Aug 2012 23:01:41 +0400
From: Vasily Kulikov <segoon@...nwall.com>
To: owl-dev@...ts.openwall.com
Subject: Re: kref_overflow

Solar,

On Sun, Aug 12, 2012 at 22:31 +0400, Solar Designer wrote:
> On Sun, Aug 12, 2012 at 10:00:21PM +0400, Vasily Kulikov wrote:
> > The light version of PAX_REFCOUNT was backported to Owl kernel.
> > It protects kref only, not all atomic_t.  The pro is almost zero maintenance
> > time.  The con is obviously missing protection for counters which were not
> > explicitly marked as refcounter by using kref instead of atomic_t.
> > 
> > The sysctl for it is kernel.kref_overflow_action.  It can be set to:
> > 
> > 0 - no overflow check at all.  Current upstream behaviour.
> > 1 - protection is on (default).  Each overflow emits stack dump and a big log
> >     warning.
> 
> Is this protection at all?

The protection itself is undoing the increment: on overflow the counter is
decremented back.  IOW, kref_get() becomes a noop from the refcounter's point
of view.

Compared to "0", with the protection on the refcounter cannot overflow, so it
cannot reach zero (no users) through increments alone and thus cannot lead to
a use-after-free bug.  So a use-after-free bug becomes a memory leak, which is
much better.
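
A minimal userspace sketch of that idea, not the actual Owl patch.  The
names (struct sketch_kref, sketch_kref_get(), sketch_kref_put()) are
hypothetical, the counter is a plain int rather than atomic_t, and INT_MAX
is assumed as the overflow point.  The sketch checks before incrementing to
avoid signed overflow in portable C, whereas the text above describes
undoing the increment after the fact; the visible effect on the counter is
the same.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct kref; not the kernel's definition. */
struct sketch_kref {
    int refcount;
    bool overflowed;    /* set once an overflow attempt has been caught */
};

static inline void sketch_kref_init(struct sketch_kref *k)
{
    k->refcount = 1;
    k->overflowed = false;
}

/*
 * Take a reference.  If the increment would overflow, leave the counter
 * untouched (the "get" is a noop) and report it.  The object can then
 * never be freed through this counter wrapping around: at worst it leaks.
 */
static inline bool sketch_kref_get(struct sketch_kref *k)
{
    if (k->refcount == INT_MAX) {
        k->overflowed = true;   /* a real kernel would log here */
        return false;
    }
    k->refcount++;
    return true;
}

/* Drop a reference; returns true when the last reference is gone. */
static inline bool sketch_kref_put(struct sketch_kref *k)
{
    assert(k->refcount > 0);
    return --k->refcount == 0;
}
```

With the counter pinned at INT_MAX, the matching puts can no longer bring it
down to zero while stale references exist, which is the leak-instead-of-free
trade-off.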

> It's at best knowing that there was an
> overflow, and only if the attacker (in case this was malicious) could
> not or otherwise did not overwrite the log records yet (in case they're
> local and the attack is successful).
> 
> > 2 - the same as 1 plus the current task is killed.

Actually, PAX_USERCOPY implements "2" unconditionally.  I've implemented "1"
for rare cases: a "debug" mode, or a theoretically possible case where root
knows that there is indeed a kernel bug but wants to live with it for now
because e.g. a legitimate user really does increment the counter past
UINT_MAX/2.  There were kernel bugs where legitimate users overflowed some
counters on x86_64; I don't know how realistic such a scenario is for a
_reference_ counter...

> Does "action 2" above protect against a subset of attacks (which ones?) or is
> it almost equivalent to "action 1"?

From the identification point of view they are the same.

> I'm afraid that we might have to make "action 3" the default in order
> for this protection to be of much use, but then we'll also make systems
> potentially less reliable in practice (causing kernel panics where the
> system could otherwise mostly stay up for longer, until a sysadmin
> reboots it more cleanly).

Yes, an attacker may just do fork() and only the spawned child will be killed.
Other user processes will be OK.

I hadn't thought about the defaults yet; now I think 2 or 3 should be the
default.
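
For reference, the kernel.kref_overflow_action values discussed in this
thread can be sketched as a small dispatch table.  This is only an
illustration: the enum and function names are hypothetical, and the meaning
of "3" (panic) is inferred from the quoted text above rather than stated in
this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the numeric values mirror the sysctl settings. */
enum sketch_action {
    SKETCH_ACT_NONE  = 0,   /* no overflow check, upstream behaviour */
    SKETCH_ACT_WARN  = 1,   /* undo the increment, stack dump + warning */
    SKETCH_ACT_KILL  = 2,   /* same as 1, plus kill the current task */
    SKETCH_ACT_PANIC = 3    /* assumed: same as 1, plus panic */
};

static_assert(SKETCH_ACT_PANIC == 3, "values must match the sysctl numbers");

struct sketch_response {
    bool undo_increment;
    bool log_warning;
    bool kill_task;
    bool panic;
};

/* Map a sysctl value to what happens when an overflow is detected. */
static inline struct sketch_response sketch_on_overflow(int action)
{
    struct sketch_response r = { false, false, false, false };

    if (action == SKETCH_ACT_NONE)
        return r;               /* counter is allowed to wrap */

    r.undo_increment = true;    /* the protection proper */
    r.log_warning = true;
    r.kill_task = (action == SKETCH_ACT_KILL);
    r.panic = (action == SKETCH_ACT_PANIC);
    return r;
}
```

Note that every non-zero setting undoes the increment; the settings differ
only in how loudly the event is reported and who pays for it.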

> Presumably most of these overflows won't
> actually be malicious.

These are _refcounter_ overflows, not just some integers.  My patch protects
only variables explicitly marked as reference counters.  Having billions of
users of a single kernel object is rather unrealistic.


I was thinking about a generic scheme like this (not for refcounter overflows,
but for catching other kernel bugs, PAX_USERCOPY'ish ones, or even for
catching userspace bugs):

1) The kernel detects that someone is trying to exploit some bug
(kernel/userspace).
2) It signals the event to a userspace application.  (synchronously, not
stopping the triggering process)
3) That application decides whether this exploitation attempt is dangerous and
possibly takes some action, e.g. it kills/freezes all tasks of the user
which triggered the bug and locks this account.

The policy of danger estimation moves from the kernel to userspace (which
might be rather tricky or even dynamic).
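
A rough sketch of what the userspace side of step 3 might look like.
Everything here is hypothetical: the event kinds, the verdicts, and the
"escalate after N events from the same uid" rule are made up to show a
policy that is dynamic, and the delivery mechanism from the kernel (audit or
otherwise) is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_event_kind {
    SKETCH_EV_REFCOUNT_OVERFLOW,
    SKETCH_EV_USERCOPY_VIOLATION,
    SKETCH_EV_USERSPACE_BUG
};

enum sketch_verdict {
    SKETCH_LOG_ONLY,
    SKETCH_FREEZE_USER,
    SKETCH_KILL_AND_LOCK
};

struct sketch_event {
    unsigned int uid;               /* user whose task triggered the bug */
    enum sketch_event_kind kind;
};

#define SKETCH_MAX_USERS 64

struct sketch_policy {
    unsigned int uids[SKETCH_MAX_USERS];
    unsigned int hits[SKETCH_MAX_USERS];
    size_t nusers;
    unsigned int kill_threshold;    /* events per uid before escalating */
};

/* Record one event for uid and return how many it has caused so far. */
static inline unsigned int sketch_count_hit(struct sketch_policy *p,
                                            unsigned int uid)
{
    for (size_t i = 0; i < p->nusers; i++)
        if (p->uids[i] == uid)
            return ++p->hits[i];

    assert(p->nusers < SKETCH_MAX_USERS);
    p->uids[p->nusers] = uid;
    p->hits[p->nusers] = 1;
    p->nusers++;
    return 1;
}

/* Decide what to do about one event reported by the kernel. */
static inline enum sketch_verdict sketch_decide(struct sketch_policy *p,
                                                const struct sketch_event *ev)
{
    unsigned int hits = sketch_count_hit(p, ev->uid);

    if (hits >= p->kill_threshold)
        return SKETCH_KILL_AND_LOCK;
    if (ev->kind == SKETCH_EV_USERSPACE_BUG)
        return SKETCH_LOG_ONLY;     /* treated as less severe at first */
    return SKETCH_FREEZE_USER;
}
```

The point is only that the severity rules live in a replaceable userspace
program, so they can be changed without touching the kernel.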

The audit daemon can be used here (or at least the kernel audit feature).
The RHEL6'ish kernel has full audit capabilities.

Thanks,

-- 
Vasily