Date: Fri, 23 Dec 2022 00:21:12 +0100
From: Solar Designer <solar@...nwall.com>
To: Dominique Martinet <asmadeus@...ewreck.org>
Cc: oss-security@...ts.openwall.com,
	Alejandro Colomar <alx.manpages@...il.com>,
	Michael Kerrisk <mtk.manpages@...il.com>,
	linux-kernel@...r.kernel.org, linux-man@...r.kernel.org
Subject: Re: [patch] proc.5: tell how to parse /proc/*/stat correctly

On Fri, Dec 23, 2022 at 07:03:17AM +0900, Dominique Martinet wrote:
> Alexey Dobriyan wrote on Thu, Dec 22, 2022 at 07:42:53PM +0300:
> > --- a/man5/proc.5
> > +++ b/man5/proc.5
> > @@ -2092,6 +2092,11 @@ Strings longer than
> >  .B TASK_COMM_LEN
> >  (16) characters (including the terminating null byte) are silently truncated.
> >  This is visible whether or not the executable is swapped out.
> > +
> > +Note that \fIcomm\fP can contain space and closing parenthesis characters. 
> > +Parsing /proc/${pid}/stat with split() or equivalent, or scanf(3) isn't
> > +reliable. The correct way is to locate closing parenthesis with strrchr(')')
> > +from the end of the buffer and parse integers from there.
> 
> That's still not enough unless new lines are escaped, which they aren't:
> 
> $ echo -n 'test) 0 0 0
> ' > /proc/$$/comm
> $ cat /proc/$$/stat
> 71076 (test) 0 0 0
> ) S 71075 71076 71076 34840 71192 4194304 6623 6824 0 0 10 3 2 7 20 0 1 0 36396573 15208448 2888 18446744073709551615 94173281726464 94173282650929 140734972513568 0 0 0 65536 3686404 1266761467 1 0 0 17 1 0 0 0 0 0 94173282892592 94173282940880 94173287231488 140734972522071 140734972522076 140734972522076 140734972526574 0
> 
> The silver lining here is that comm length is rather small (16) so we
> cannot emulate full lines and a very careful process could notice that
> there are not enough fields after the last parenthesis... So just look
> for the last closing parenthesis in the next line and try again?

No, just don't treat this file's content as a line (nor as several
lines) - treat it as a string that might contain new line characters.
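
As a rough illustration of that approach (a minimal sketch in Python, not
code from procps-ng or from the proposed man page text), the whole file can
be read as one byte string and split at the first opening parenthesis and
the last closing parenthesis.  The function name parse_proc_stat and the
SAMPLE constant are hypothetical; SAMPLE is the output quoted above with
most of the trailing numeric fields dropped.

```python
def parse_proc_stat(data: bytes):
    """Split the raw contents of /proc/<pid>/stat into (pid, comm, fields).

    The buffer is treated as one string, never as a line or set of lines,
    because comm may contain spaces, ')' and new line characters.
    """
    # The pid before comm is purely numeric, so the first '(' opens comm.
    open_paren = data.index(b"(")
    # The fields after comm are a state letter and integers, so the last
    # ')' in the whole buffer closes comm (the strrchr() idea from the patch).
    close_paren = data.rindex(b")")
    pid = int(data[:open_paren].decode("ascii").strip())
    comm = data[open_paren + 1:close_paren]
    # Everything after the last ')' is whitespace-separated.
    fields = data[close_paren + 1:].split()
    return pid, comm, fields


# Shortened version of the test case quoted above (assumption: most of the
# trailing numeric fields are omitted here for brevity).
SAMPLE = b"71076 (test) 0 0 0\n) S 71075 71076 71076 34840 71192 4194304\n"

pid, comm, fields = parse_proc_stat(SAMPLE)
```

For a live process, the input would come from something like
open("/proc/self/stat", "rb").read(), i.e. opened in binary mode and read
in full rather than with readline().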

The ps command from procps-ng seems to manage, e.g. for your test "ps c"
prints:

29394 pts/3    S      0:00 test) 0 0 0?

where the question mark is what it substitutes for the non-printable
character (the new line character).  I didn't check whether the process
name it prints comes from /proc/$$/stat or /proc/$$/status, though (per
strace, it reads both).
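
The substitution itself is easy to imitate.  The following is only a guess
at the visible behavior (replace anything outside printable ASCII with a
question mark), not procps-ng's actual code, and the helper name
printable_comm is hypothetical:

```python
def printable_comm(comm: bytes) -> str:
    """Return comm with every byte outside printable ASCII shown as '?'."""
    # 0x20 (space) through 0x7e ('~') are kept; control characters such as
    # the new line character, and any non-ASCII bytes, become '?'.
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "?" for b in comm)


shown = printable_comm(b"test) 0 0 0\n")
```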

> But, really, I just don't see how this can practically be said to be parsable...

This format certainly makes it easier to get a parser wrong than to get
it right.

I agree the above man page edit is not enough; it should also mention
the caveat that this file shouldn't be read in or parsed as a line.

Also, the Linux kernel does have problems with new lines in the comm
field elsewhere, at least in the log messages it produces:

https://github.com/lkrg-org/lkrg/issues/165

I looked into this in the context of LKRG development, but since the
kernel itself also produces messages with comm in them, fixing only
LKRG's messages would be moot.

Alexander

P.S. While this thread is going well so far, please note that in general
CC'ing other lists on postings to oss-security (or vice versa) is
discouraged.  With such CC's, possible follow-ups from members of those
other lists can be off-topic for oss-security - e.g., they might focus
on non-security technicalities.  Probably not this time when only a man
page is to be patched, but proposed patches to the Linux kernel often
result in lengthy discussions and multiple versions of the patch.  In
those cases, I think it's better to have separate threads and only post
summary follow-up(s) to oss-security (e.g., one message stating that a
patch was proposed and linking to the thread, and another after the
final version is merged).
