kernel-hardening - Re: [PATCH 0/2] sysctl: allow CLONE

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8737tp0zhr.fsf@x220.int.ebiederm.org>
Date: Fri, 22 Jan 2016 21:02:40 -0600
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Kees Cook <keescook@...omium.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,  Al Viro
 <viro@...iv.linux.org.uk>,  Richard Weinberger <richard@....at>,  Andy
 Lutomirski <luto@...capital.net>,  Robert Święcki
 <robert@...ecki.net>,  Dmitry Vyukov <dvyukov@...gle.com>,  David Howells
 <dhowells@...hat.com>,  Miklos Szeredi <mszeredi@...e.cz>,  Kostya
 Serebryany <kcc@...gle.com>,  Alexander Potapenko <glider@...gle.com>,
  Eric Dumazet <edumazet@...gle.com>,  Sasha Levin
 <sasha.levin@...cle.com>,  linux-doc@...r.kernel.org,
  linux-kernel@...r.kernel.org,  kernel-hardening@...ts.openwall.com
Subject: Re: [PATCH 0/2] sysctl: allow CLONE_NEWUSER to be disabled

Kees Cook <keescook@...omium.org> writes:

> There continues to be unexpected side-effects and security exposures
> via CLONE_NEWUSER. For many end-users running distro kernels with
> CONFIG_USER_NS enabled, there is no way to disable this feature when
> desired. As such, this creates a sysctl to restrict CLONE_NEWUSER so
> admins not running containers or Chrome can avoid the risks of this
> feature.

I don't actually think there do continue to be unexpected side-effects
and security exposures with CLONE_NEWUSER.  It takes a while for all of
the fixes to trickle out to distros.  At most what I have seen recently
are problems with other kernel interfaces being amplified with user
namespaces.  AKA the current mess with devpts, and the unexpected
issues with bind mounts in mount namespaces.

I have a couple of concerns with a sysctl.

1) As user namespaces settle out this sysctl has the potential to
   decrease the security of the system overall as sandboxing
   features of the kernel will not be available to unprivileged
   applications.

   Web browsing with chrome will be less safe for example.

2) I strongly suspect the granularity of a sysctl is wrong for access to
   user namespaces on a production system.

   In general I suspect what we want is something like seccomp.  I
   believe all of the relevant bits are in registers.  I actually
   thought that was enough for seccomp.  Does seccomp not work for
   some reason?

3) A sysctl breeds a false sense of security in thinking that if a
   security issue is discovered you can just flip a switch, disable
   all new user namespaces and you won't be vulnerable.

   In fact most of the issues in the past have only required being in
   a user namespace to trigger.  Which means any containers or user
   namespaces that already exist could be used to exploit any new
   found issue.  Which means that a I don't think a sysctl will give
   the desired level of protection.

   In my analysis of the issues to date I don't know of anything
   short of a reboot that would meaninfully remove the threat.

4) With applications like docker coming on-line I don't think a
   restriction to processes with capabilities is actually meaninful
   for restricting access to user namespaces.

So I have concerns about both efficacy and usability with the proposed
sysctl.

So to keep this productive.  Please tell me about the threat model
you envision, and how you envision knobs in the kernel being used to
counter those threats.

Eric
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.