kernel-hardening - Re: [PATCH v2] kernel: escape non-ASCII and control characters in printk()


Message-ID: <20110626165409.GA2584@albatros>
Date: Sun, 26 Jun 2011 20:54:09 +0400
From: Vasiliy Kulikov <segoon@...nwall.com>
To: Ingo Molnar <mingo@...e.hu>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
	James Morris <jmorris@...ei.org>, Namhyung Kim <namhyung@...il.com>,
	Greg Kroah-Hartman <gregkh@...e.de>,
	kernel-hardening@...ts.openwall.com, linux-kernel@...r.kernel.org,
	Alan Cox <alan@...rguk.ukuu.org.uk>
Subject: Re: [PATCH v2] kernel: escape non-ASCII and control characters in
 printk()

Hi Ingo,

On Sun, Jun 26, 2011 at 12:39 +0200, Ingo Molnar wrote:
> > +	if (!iscntrl(c) || (c == '\n') || (c == '\t'))
> > +		emit_log_char(c);
> > +	else {
> > +		len = sprintf(buffer, "#x%02x", c);
> > +		for (i = 0; i < len; i++)
> > +			emit_log_char(buffer[i]);
> > +	}
> 
> Nit: please use balanced curly braces.

OK.

> Also, i think it would be better to make this opt-out, i.e. exclude 
> the handful of control characters that are harmful (such as backline 
> and console escape), instead of trying to include the known-useful 
> ones.

Do you see any issue with the check above?

> The whole non-ASCII-languages issue would not have happened if such 
> an approach was taken.
>
> It's also the better approach for the kernel: we handle known harmful 
> things and are permissive otherwise.

I hope this is not meant as universal advice for kernel development as
a whole.  Black lists almost always suck.

Could you instantly answer, without reading the previous discussion,
which control characters are harmful, which are sometimes harmful (on
some ttys), and which are always safe and why (or even answer why they
are harmful at all)?  I'm not a tty guy and I have to read
console_codes(4) or similar docs to answer this question; the majority
of kernel devs might have to read the docs too.

Writing a black list implies full knowledge of _all_ possible
malformed input values, which is hard (or even impossible) to achieve.
Some developers might not be interested in learning such details, but
are still interested in how this API can be used.
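
As a rough illustration (a user-space sketch with a hypothetical helper
name, not code from the patch), a black list looks something like this.
It covers only what its author remembered to add:

```c
#include <stdbool.h>

/*
 * Hypothetical deny list: returns true only for the characters that
 * someone explicitly listed as harmful.  Everything else is let through.
 */
static bool is_denied(unsigned char c)
{
	switch (c) {
	case 0x08:	/* backspace */
	case 0x1b:	/* ESC, starts console escape sequences */
		return true;
	default:
		return false;
	}
}
```

Here '\r' and 0x9b are not on the list, so they would be emitted as-is.
Whether that matters depends on the tty, which is exactly the knowledge
the author of the list has to have.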

On the contrary, a set of allowed values makes sense to the developer
and more strictly defines the API in question.  Discussing the API
goals and reaching consensus about its usage is much more productive.
It might catch some wrong and dangerous API misuses.  If the allowed set
becomes too strict one day, no problem - just explicitly relax the
check.  If you miss some value in the black list (e.g. it becomes known
that some control char sequence can be used to fake the logs), the
significance of the miss would be higher.
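
For comparison, the white list check quoted at the top can be sketched
in user space roughly as follows.  escape_char() is a hypothetical
stand-in for the emit_log_char() loop, and iscntrl() here is the libc
one in the default "C" locale, which may not classify every byte the
same way as the kernel's:

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical user-space stand-in for the check in the patch.
 * Writes c into out (which must hold at least 5 bytes), either
 * verbatim or escaped as "#xNN".  Returns the number of characters
 * written, not counting the terminating NUL.
 */
static int escape_char(unsigned char c, char *out)
{
	/* Allowed set: anything that is not a control char, plus \n and \t. */
	if (!iscntrl(c) || c == '\n' || c == '\t') {
		out[0] = (char)c;
		out[1] = '\0';
		return 1;
	}
	/* Everything else is escaped, whether or not it is known to be harmful. */
	return sprintf(out, "#x%02x", c);
}
```

If the set turns out to be too strict, relaxing it is one more explicit
condition (say, `|| c == '\r'`) in a single place.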

And from a cynical point of view, the white list is simply smaller and
cleaner than the black list.

Thanks,

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments