kernel-hardening - Re: [PATCH v8 3/8] seccomp: add system call filtering using BPF

Message-ID: <CAObL_7GJ2TdKpdvJRpiLh1c9KPGcTsELdhw7v4Vj39a0P0JxZg@mail.gmail.com>
Date: Thu, 16 Feb 2012 16:23:53 -0800
From: Andrew Lutomirski <luto@....edu>
To: Will Drewry <wad@...omium.org>
Cc: "H. Peter Anvin" <hpa@...or.com>, Markus Gutschke <markus@...omium.org>, linux-kernel@...r.kernel.org, 
	linux-arch@...r.kernel.org, linux-doc@...r.kernel.org, 
	kernel-hardening@...ts.openwall.com, netdev@...r.kernel.org, x86@...nel.org, 
	arnd@...db.de, davem@...emloft.net, mingo@...hat.com, oleg@...hat.com, 
	peterz@...radead.org, rdunlap@...otime.net, mcgrathr@...omium.org, 
	tglx@...utronix.de, eparis@...hat.com, serge.hallyn@...onical.com, 
	djm@...drot.org, scarybeasts@...il.com, indan@....nu, pmoore@...hat.com, 
	akpm@...ux-foundation.org, corbet@....net, eric.dumazet@...il.com, 
	keescook@...omium.org
Subject: Re: [PATCH v8 3/8] seccomp: add system call filtering using BPF

On Thu, Feb 16, 2012 at 3:00 PM, Will Drewry <wad@...omium.org> wrote:
> On Thu, Feb 16, 2012 at 4:06 PM, H. Peter Anvin <hpa@...or.com> wrote:
>> On 02/16/2012 01:51 PM, Will Drewry wrote:
>>>>
>>>> Put the bloody bit in there and let the pattern program make that decision.
>>>
>>> Easy enough to add a bit for the mode: 32-bit or 64-bit.  It seemed
>>> like a waste of cycles for every 32-bit program or every 64-bit
>>> program to check to see that its calling convention hadn't changed,
>>> but it does take away a valid decision the pattern program should be
>>> making.
>>>
>>> I'll add a flag for 32bit/64bit while cleaning up seccomp_data. I
>>> think that will properly encapsulate the is_compat_task() behavior in
>>> a way that is stable for compat and non-compat tasks to use.  If
>>> there's a more obvious way, I'm all ears.
>>>
>>
>> is_compat_task() is not going to be the right thing for x86 going
>> forward, as we're introducing the x32 ABI (which uses the normal x86-64
>> entry point, but with different eax numbers, and bit 30 set.)
>>
>> The actual state is the TS_COMPAT flag in the thread_info structure,
>> which currently matches is_compat_task(), but perhaps we should add a
>> new helper function syscall_namespace() or something like that...
>
> Without the addition of x32, it is still the intersection of
> is_compat_task()/TS_COMPAT and CONFIG_64BIT for all arches to
> determine if the call is 32-bit or 64-bit, but this will add another
> wrinkle.  Would it make sense to assume that system call namespaces
> may be ever expanding and offer up an unsigned integer value?
>
> struct seccomp_data {
>  int nr;
>  u32 namespace;
>  u64 instruction_pointer;
>  u64 args[6];
> };
>
> Then syscall_namespace(current, regs) returns
> * 0 - SYSCALL_NS_32 (for existing 32 and CONFIG_COMPAT)
> * 1 - SYSCALL_NS_64 (for existing 64 bit)
> * 2 - SYSCALL_NS_X32 (everything after 2 is arch specific)
> * ..
>
> This patch series is pegged to x86 right now, so it's not a big deal
> to add a simple syscall_namespace to asm/syscall.h.  Of course, the
> code is always the easy part.  Even easier would be to only assign 0
> and 1 in the seccomp_data for 32-bit or 64-bit, then leave the rest of
> the u32 untouched until x32 stabilizes and the TS_COMPAT interactions
> are sorted.
>
> The other option, of course, is to hide it from the users and peg to
> is_compat_task and later to however x32 is exposed, but that might
> just be me trying to avoid adding more dependencies to this patch
> series :)
>
>> Either that or we can just use another bit in the syscall number field...
>
> That would simplify the case here. The seccomp_data bit would say the
> call is 64-bit and then the syscall number with the extra bit would
> say that it is x32 and wouldn't collide with the existing 64-bit
> numbering, and the filter program author wouldn't make a filter
> program that allows a call that it shouldn't.

Presumably this works for x32 (since bit 30 might as well be part of
the syscall number), but the namespace or whatever it's called would
be nice to distinguish between the three 32-bit entry points.

For 32-bit code, I can easily see two different entry points getting
used in the same program -- some library could issue int80 directly,
but other code (in the vdso, for example, if it ever starts being
useful) could hit the other entry.  And if 64-bit code ever gets a new
entry point, the same problem would happen.
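
To make the collision concern concrete, here is a minimal userspace sketch
in C. It assumes the seccomp_data layout and the 0/1/2 namespace numbering
proposed in the quoted text. The names seccomp_data_sketch, X32_SYSCALL_BIT,
call_allowed and is_x32_nr are made up for illustration; none of this is a
kernel definition, and separate values per 32-bit entry point would be a
further assumption on top of it.

```c
#include <stdint.h>

/* Hypothetical namespace values, following the numbering in the quoted proposal. */
enum syscall_ns {
    SYSCALL_NS_32  = 0,
    SYSCALL_NS_64  = 1,
    SYSCALL_NS_X32 = 2,
};

/* Bit 30, which the thread says x32 sets in the syscall number. */
#define X32_SYSCALL_BIT (1u << 30)

/* Userspace mirror of the proposed layout (assumption, not the kernel struct). */
struct seccomp_data_sketch {
    int      nr;
    uint32_t ns;                  /* "namespace" in the proposal */
    uint64_t instruction_pointer;
    uint64_t args[6];
};

/* Returns 1 if the syscall number carries the x32 marker bit, else 0. */
static int is_x32_nr(int nr)
{
    return ((uint32_t)nr & X32_SYSCALL_BIT) != 0;
}

/*
 * Returns 1 only when both the namespace and the full syscall number
 * (including bit 30) match the single pair this toy filter allows.
 */
static int call_allowed(const struct seccomp_data_sketch *sd,
                        uint32_t allowed_ns, int allowed_nr)
{
    if (sd->ns != allowed_ns)
        return 0;
    return sd->nr == allowed_nr;
}
```

With this shape, a filter that allows number 1 in SYSCALL_NS_64 rejects
number 1 arriving with SYSCALL_NS_32, and also rejects (1 | X32_SYSCALL_BIT),
because the comparison covers the whole number field.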

Of course, if the args are magically fixed up in the 32-bit case, then
maybe the multiple entries are a nonissue.  (Sorry, I haven't kept
track of that part of this patch set.)

--Andy