Date: Sat, 13 Aug 2011 10:41:54 -0500
From: "H. Peter Anvin" <hpa@...or.com>
To: Vasiliy Kulikov <segoon@...nwall.com>
CC: Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
        James Morris <jmorris@...ei.org>, kernel-hardening@...ts.openwall.com,
        x86@...nel.org, linux-kernel@...r.kernel.org,
        linux-security-module@...r.kernel.org
Subject: Re: [RFC] x86: restrict pid namespaces to 32 or 64 bit syscalls

Vasiliy Kulikov <segoon@...nwall.com> wrote:

>On Fri, Aug 12, 2011 at 15:08 -0500, H. Peter Anvin wrote:
>> On 08/12/2011 10:03 AM, Vasiliy Kulikov wrote:
>> > This patch allows x86-64 systems with 32 bit syscalls support to lock a
>> > pid namespace to 32 or 64 bitness syscalls/tasks.  By denying rarely
>> > used compatibility syscalls it reduces an attack surface for 32 bit
>> > containers.
>> > 
>> > A new sysctl, abi.bitness_locked, is introduced.  If set to 1, it locks
>> > all tasks inside the current pid namespace to the bitness of the init
>> > task (pid_ns->child_reaper).  After that:
>> > 
>> > 1) a task trying to do a syscall of other bitness would get a signal as
>> > if the corresponding syscall is not enabled (IDT entry/MSR is not
>> > initialized).
>> > 
>> > 2) loading ELF binaries of another bitness is prohibited (as if the
>> > corresponding CONFIG_BINFMT_*=N).
>[...]
>> However, I have to question the value of this... if this is enabled in
>> the system as a whole (as opposed to compiled out) it seems kind of
>> pointless...
>
>No, it is not for the system as a whole, but for containers (however,
>it's possible to lock the whole system).  We use OpenVZ kernels with
>multiple containers, some of them are 32 bit, some are 64 bit.  64 bit
>syscalls are not needed for 32 bit containers and 32 bit syscalls are
>not needed for 64 bit containers.  As needless interfaces, they
>unreasonably increase the kernel attack surface.  Some 32 bit compatibility
>syscalls are rarely used, and sometimes they are not tested well.
>
>In IA-64 the IA-32 compatibility support was broken for 2 years:
>
>http://www.spinics.net/lists/linux-ia64/msg07840.html
>
>In amd64 some specific rarely used syscalls might behave in a similar way.
>Removing this attack vector is the goal of the patch.
>
>> if there are bugs we need to deal with them anyway.
>
>Definitely.
>
>> > Questions/thoughts:
>> > 
>> > The patch adds a check in syscalls code.  Is it a significant
>> > slowdown for fast syscalls?  If so, it is probably worth moving the check
>> > into scheduler code and enabling/disabling the corresponding interrupt/MSRs
>> > on each task switch?
>> > 
>> 
>> *YOU* are the person who needs to answer that question by providing
>> measurements.  Quite frankly I suspect checks in the syscall code *or*
>> task switching MSRs are going to be unacceptable from a performance
>> point of view.
>
>OK, I'll do it.
>
>Thank you,
>
>-- 
>Vasiliy Kulikov
>http://www.openwall.com - bringing security into open computing environments

IA64 is totally different.  I'm extremely sceptical of this patch; it feels like putting code in a super-hot path to paper over a problem that has to be fixed anyway.
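
For readers following the thread, the rule under discussion (once a pid namespace is locked, syscalls and ELF binaries of the other bitness are refused) can be modelled in user space roughly as below. This is only an illustrative sketch, not the code from the patch: `struct pid_ns_model`, `enum abi_bitness`, `syscall_allowed()` and `exec_allowed()` are hypothetical names, and the real check would sit in the kernel's syscall entry and binfmt paths, where its cost is exactly what is being questioned.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names come from the patch. */
enum abi_bitness { ABI_32 = 32, ABI_64 = 64 };

struct pid_ns_model {
    bool bitness_locked;            /* models the abi.bitness_locked sysctl */
    enum abi_bitness init_bitness;  /* bitness of the namespace's init task */
};

/* Returns true if a syscall entered through the given ABI would be allowed.
 * An unlocked namespace allows both ABIs; a locked one allows only the ABI
 * that matches its init task. */
bool syscall_allowed(const struct pid_ns_model *ns, enum abi_bitness entry)
{
    if (!ns->bitness_locked)
        return true;
    return entry == ns->init_bitness;
}

/* The same rule applied to loading an ELF binary of a given class. */
bool exec_allowed(const struct pid_ns_model *ns, enum abi_bitness elf_class)
{
    return syscall_allowed(ns, elf_class);
}
```

Even in this toy form the check adds a load and a branch per syscall, which is why measurements on the real fast-syscall path matter.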
-- 
Sent from my mobile phone. Please excuse my brevity and lack of formatting.
