oss-security - CVE abstraction choices and the Linux kernel

Message-ID: <Pine.GSO.4.64.1303080956290.14630@faron.mitre.org>
Date: Fri, 8 Mar 2013 09:57:00 -0500 (EST)
From: "Steven M. Christey" <coley@...re.org>
To: oss-security@...ts.openwall.com
Subject: CVE abstraction choices and the Linux kernel


Apologies to all for the long post, but this discussion might
significantly impact how CVE assignment occurs in the future, and I
intend to reference it heavily.

We (MITRE CVE) specifically request feedback from members of the
various Linux distributions on this list, since they will be affected
most directly, although input from anybody is welcome.

In the "Linux kernel: various info leaks, some NULL ptr derefs"
thread, Petr Matousek said:

>In the past we've usually assigned one CVE per issue even for info leak
>bugs. Or at least one CVE per subsystem, as Alexander says. I agree with
>Alexander that one CVE for about ~20 issues is not right.

Established CVE practice does not dictate assigning a separate ID for
each bug.  While this is useful for some people, it's not useful for
others, and many times this information is not even available - or it
can change over time.  Although it often looks like one CVE is
assigned per issue, that is by accident, and there are other drivers
that decide how many CVEs should be assigned.  More explanation later
in this post.

The "spirit" of CVE content decisions is documented here:

http://cve.mitre.org/cve/editorial_policies/cd_abstraction.html

In the first couple of years of CVE, we tried to assign unique IDs for
each vulnerability, but there was too much inconsistency and too much
uncertainty.  In many cases, we simply did not have enough data to
know how many vulnerabilities there even were.  Or, we might choose
to assign "X" number of IDs to a multi-issue disclosure, and then 2
weeks later, more details would come out that would really suggest
needing a different number of IDs than we had originally assigned.
You see this kind of "counting uncertainty" on oss-security on a
regular basis, even today.  So, CVE needs to operate in a space where
the amount of detail varies widely.

CVE also needs to be usable by many communities with different
needs.  Its primary role is to help these communities to coordinate
vulnerability information with each other - that is, CVE acts as a
"coordination ID."  Different vulnerability-information consumers
operate at different levels of abstraction.  For example, many system
administrators don't necessarily care about individual kernel bugs,
but they might care much more about a single patch action as implied
by a single vendor advisory that updates to a new kernel version;
these admins may operate on the "advisory ID" level of abstraction.
On the opposite end of the spectrum is the open source community,
which has effectively started using CVE as a "universal bug ID" -
that is, they operate at the "bug ID" level of abstraction.  This is
useful for coordination between distro maintainers, but not for
coordination with other communities that CVE serves.

With respect to the number of IDs that get assigned to a disclosure
of one or more bugs/vulnerabilities, CVE's abstraction has evolved to
be somewhere in between that of the advisory ID and the bug ID.  Since
CVE's primary role is to support coordination across many communities
who operate at different levels of abstraction, being "in the middle"
maximizes CVE's utility to all of these communities - but it also
means that it is rarely a perfect match for each individual
community.

For CVE, we knowingly combine multiple vulnerabilities into the same
ID, if they (1) are the same vulnerability type, (2) affect the same
code versions, and (3) were disclosed at the same time by the same
person/organization.  (Note that this is a simplification.)  We have
found that these details are usually available in disclosures, they
are provided very early in the disclosure process, and they don't
often change significantly over time.  With these guidelines, it is
easier for different people to consistently assign the same number of
CVE IDs.  The system is not perfect, and CVE Numbering Authorities
(CNAs) don't always follow these guidelines, but it works pretty well
to keep CVEs usable as coordination IDs.
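
To make the three merge criteria above concrete, here is a minimal
sketch in Python.  It is an illustration only, not MITRE's actual
tooling: the Issue record, its field names, and the group_for_cve
helper are all hypothetical, and (as noted above) the real guidelines
are more nuanced than an exact match on three fields.

```python
from collections import defaultdict
from typing import NamedTuple


class Issue(NamedTuple):
    """Hypothetical record for one reported bug (field names are assumptions)."""
    name: str
    vuln_type: str               # e.g. "out-of-bounds read"
    affected_versions: frozenset  # set of affected version strings
    discloser: str
    disclosure_date: str


def group_for_cve(issues):
    """Merge issues that share type, affected versions, and disclosure.

    Each returned group would map to one CVE ID under the simplified
    reading of the criteria; issues differing on any field are SPLIT.
    """
    groups = defaultdict(list)
    for issue in issues:
        key = (
            issue.vuln_type,
            issue.affected_versions,
            issue.discloser,
            issue.disclosure_date,
        )
        groups[key].append(issue.name)
    return list(groups.values())
```

Under this sketch, two bugs of the same type reported together
against the same versions land in one group, while a third bug of a
different type in the same report gets its own group.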

Whatever decision MITRE makes on how to go forward, we will be
following the spirit of these well-established practices.  I know
this conflicts with the open source community's need for a "universal
bug ID," and that's why I'm suggesting the creation of a separate bug
ID system, perhaps centered around a scheme such as a commit ID/hash
(which is often used already, such as in the original CVE request
that prompted this message).

There is still a question about how CVE can reasonably handle
disclosures of multiple issues for the Linux kernel and other
complex, large software that is heavily reused and adapted.  Such
code may be maintained by a single upstream developer, but as we all
know, each distribution has its own practices and maintains its own
versions.  For the Linux kernel, this is a special challenge.

Let's take Kurt's assignment of CVE-2012-6138 for the kernel
info-leak issues discovered by Mathias Krause.  Mathias said that he
did not investigate these too closely, which is completely
understandable - yet now we have some very raw, detailed, public
information that is not necessarily expressed in ways that help CVE
to make appropriate decisions about the number of IDs to assign.

With respect to CVE's practice of "SPLIT by vuln type," while the
issues are all information leaks, note that an "information leak" is
a *consequence* of a bug, not an actual *type* of bug.  There are
many different bug types that can lead to the disclosure of
potentially-sensitive information.  The issues reported by Mathias
appear to contain problems like out-of-bounds reads and
improperly-initialized data, so a closer investigation would likely
produce a SPLIT.

With respect to CVE's practice of "SPLIT by version," we don't know
for sure which bugs affect which versions.  We know when the bugs got
fixed, but not necessarily when they were introduced - it's too early
in the disclosure.  And the distributions are likely to handle these
groups of bugs differently, so even if they are fixed in the same
upstream kernel version, there are likely to be variations in which
issues are fixed by each distro in their "local" (downstream) kernel
versions.

While it would be convenient to assume that each bug might affect
some slightly different version in at least one distro, assigning a
unique CVE ID for each bug would increase the volume of CVEs to a
point where it hurts the usability of CVE to other consumers outside
the distro-maintenance community.

Solar Designer's suggestion of per-subsystem SPLITs is an intriguing,
approximate solution to CVE's "version" problem in widely-shared code
like the Linux kernel.  It seems likely that many subsystems are
introduced in different upstream kernel versions, and probably
updated in different versions.  Some subsystems might be enabled or
disabled by sysadmins.  By using the directory structure of the
source code tree, subsystems might be reasonably inferred on a
consistent basis.  It is by no means perfect, but it should be fairly
repeatable.
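
As a rough sketch of how subsystems might be inferred from the source
tree, the Python below maps a file path to a subsystem label.  The
rule it uses (take two directory levels under net/, fs/, and
drivers/, one level elsewhere) is purely an assumption for
illustration, as are the function names; any real scheme would need
to be agreed on by the people doing the assignments.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Assumption: these top-level directories are broad enough that the
# second directory level is the more meaningful "subsystem".
NESTED_ROOTS = {"net", "fs", "drivers"}


def infer_subsystem(path):
    """Return a subsystem label for a source path relative to the tree root."""
    parts = PurePosixPath(path).parts
    # A file directly under net/ (two parts) stays in plain "net".
    if len(parts) >= 3 and parts[0] in NESTED_ROOTS:
        return f"{parts[0]}/{parts[1]}"
    return parts[0]


def split_by_subsystem(paths):
    """Group patched file paths by inferred subsystem (one candidate CVE each)."""
    groups = defaultdict(list)
    for path in paths:
        groups[infer_subsystem(path)].append(path)
    return dict(groups)
```

For example, with hypothetical paths, "net/dccp/example.c" would map
to "net/dccp" and "crypto/example.c" to "crypto", so fixes touching
those two files would fall into separate groups.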

Considering the Krause kernel info-leaks as an example, this might
suggest about 11 CVEs for crypto, xfrm_user, net (including net/tun),
ipvs, dccp, llc, l2tp, Bluetooth, atm, udf, and isofs.  There might
be additional SPLITs based on bug type.

What do people think?  To the distro maintainers: given that CVE
cannot support per-bug IDs for the reasons I've already described,
are per-subsystem SPLITs workable?

- Steve & the CVE-Assign Team