Date: Tue, 07 Apr 2015 21:19:29 -0400 (EDT)
From: "David A. Wheeler" <dwheeler@...eeler.com>
To: "oss-security" <oss-security@...ts.openwall.com>
Subject: Re: Hanno Boeck found Heartbleed using afl + ASan!

On Tue, 7 Apr 2015 13:27:40 -0700, Michal Zalewski <lcamtuf@...edump.cx> wrote:
> You know... on some level, I'm happy - but on another, I'm always
> trying to be skeptical when such claims are made for other projects.
> It's only fair not to treat this case differently.

Fair enough!

> It's worth remembering that the authors of several static analysis or
> symbolic execution frameworks have also claimed that their products
> would have found Heartbleed.

To be fair, the (reputable) posts I've seen don't make such claims.
Instead, the static analysis tool creators generally acknowledge that they
could *not* have found Heartbleed at the time, and then discuss changes
they're making so that they can find similar problems in the future.

For example, Andy Chou (Coverity) stated,
"Many of our customers have asked whether Coverity can detect Heartbleed.
The answer is Not Yet - but we've put together a new analysis heuristic
that works remarkably well and does detect it."
http://security.coverity.com/blog/2014/Apr/on-detecting-heartbleed-with-static-analysis.html

Similarly, Paul Anderson (GrammaTech) said,
"The minute I heard about Heartbleed... I downloaded the source code and ran CodeSonar
to see if it would find the defect. Unfortunately it didn’t...."
http://www.grammatech.com/blog/finding-heartbleed-with-codesonar

You *do* have to read a little between the lines in the post by Roy Sarkar (Klocwork),
but his post clearly states that you have to create specialized overrides for their
tool to detect Heartbleed.  That is an admission that it CANNOT find Heartbleed without significant help.
http://blog.klocwork.com/software-security/saving-you-from-heartbleed/


> IIRC, their experiments were far more
> convoluted than Hanno's, but the bottom line is that when you're
> trying to "discover" a bug you already know about, it's almost
> impossible to avoid subconsciously optimizing for the expected outcome.

That's a fair critique.

> So, I always urge people to ask a simple question: would someone think
> of running the tool this particular way and on this particular code
> before we knew about the bug? And if yes, why haven't they?=)

A few answers below.  Sorry, it got long :-).

> The answer I've always heard from commercial software vendors is that
> "they had no time to work on open source projects", but that's about
> as unconvincing as it gets. I bet they would love to be credited for
> this or any comparably serious find.

Nit: Let me change "commercial" to "proprietary", since there is
lots of commercially-supported OSS.

Those are odd claims to be hearing.
Many companies who develop proprietary software absolutely *do* examine
open source software, and they ask for credit if their tool finds things.
They get that credit, too.

The most obvious example for Heartbleed is Codenomicon, who were
one of the two organizations to find and report Heartbleed in the first place.
(Google also found it, through source code examination.)
Codenomicon found Heartbleed using their generation ("smart") fuzzing approach
as implemented in their proprietary tools.
They had programmed in what the protocol was supposed to do (the typical
generation approach), and also developed an additional mechanism called “Safeguard” to
analyze the response (output). Safeguard doesn't check if the result is *exactly*
as expected; it checks if the result meets certain expectations.
Safeguard was the key in finding Heartbleed;
since Heartbleed was an out-of-bounds read, and out-of-bounds reads don't normally
cause crashes, fuzzing processes that only detect native crashes can't find Heartbleed.
Some details, and links to more, are here:
http://www.dwheeler.com/essays/heartbleed.html#fuzzer-examine-output
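
To illustrate the general idea, here is a minimal sketch of an output check in that
spirit (this is my own illustration, *not* Codenomicon's code; the function name
check_heartbeat_response is hypothetical).  The check doesn't compare against an exact
expected reply; it just verifies a property, namely that a heartbeat response is no
longer than the payload we asked to have echoed back:

  /* Minimal sketch of a Safeguard-style output check (NOT Codenomicon's code).
   * Instead of matching an exact expected reply, verify a property of it. */
  #include <stddef.h>
  #include <stdio.h>

  /* Returns nonzero if the response violates our expectations. */
  static int check_heartbeat_response(size_t requested_payload_len,
                                      const unsigned char *response,
                                      size_t response_len)
  {
      (void)response;
      /* A heartbeat response should echo only the payload we sent, plus a
       * fixed-size header and padding.  A longer reply means the peer is
       * leaking memory, which is exactly Heartbleed's symptom. */
      size_t max_expected = requested_payload_len + 3 /* type + length */ + 16 /* padding */;
      return response_len > max_expected;
  }

  int main(void)
  {
      unsigned char fake_response[64] = {0};
      /* We "sent" a 1-byte payload but "received" 64 bytes back. */
      if (check_heartbeat_response(1, fake_response, sizeof fake_response))
          printf("response longer than expected: possible memory over-read\n");
      return 0;
  }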

Codenomicon's approach took some effort (you need to describe the protocol
and the required postconditions), but their approach unquestionably worked.
It works even if you don't have access to the binaries, so you can use it when you only have
I/O access to the system... something that is *NOT* practical today with afl.

Codenomicon absolutely *did* ask for (and receive) credit.  Credit they rightly deserved, too.
They even created the Heartbleed logo (which I happen to like, BTW; sue me).

They aren't the only ones to look for vulnerabilities in OSS, find them, and get credit.
Coverity has an entire program for examining open source software:
  https://scan.coverity.com/
HP/Fortify has the "Fortify Open Source Review" project:
  https://hpfod.com/open-source-review-project
These programs absolutely *do* find vulnerabilities, and many OSS projects
happily give credit to the tools for finding vulnerabilities or other problems
(just like they would give credit for any reporter).

Indeed, my understanding is that proprietary tool makers are heavy users
of OSS as sample programs.  It is easy to create tools that
detect vulnerabilities in trivial programs but fail as the complexity goes up.
Proprietary static analysis tools are often applied to custom or proprietary software
where the user is NOT willing to send the code to the toolmaker, and this creates
a problem: how can toolmakers improve their tools without that feedback?
One solution: the toolmakers can use OSS to detect cases where complexity is
causing their tool to fail, and then determine what to do about it.

All toolmakers are trying to improve their tools.
So the real question is, are these improvements so general that they will
improve catching a large swath of related problems, or are they fairly limited
and only help re-find Heartbleed?

For static analyzers, the answer in many cases is unfortunately mixed.
Andy Chou (Coverity) came up with a clever heuristic -
detect byte swaps (e.g., ntohs), and presume they are tainted because they
"constitute fairly strong evidence that the data is from the outside network and therefore tainted".
http://security.coverity.com/blog/2014/Apr/on-detecting-heartbleed-with-static-analysis.html
This approach has been duplicated in clang, and shown to work for its purpose:
http://blog.trailofbits.com/2014/04/27/using-static-analysis-and-clang-to-find-heartbleed/
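
To make the pattern concrete, here is a simplified sketch of the kind of code that
heuristic flags (my own simplified example, *not* OpenSSL's actual code): a 16-bit
length comes off the wire via ntohs(), and is then used in a copy without being
checked against the real record size:

  /* Simplified sketch of the Heartbleed code *shape* (not OpenSSL's code). */
  #include <arpa/inet.h>   /* ntohs */
  #include <stdint.h>
  #include <stdlib.h>
  #include <string.h>

  static void handle_record(const unsigned char *rec, size_t rec_len)
  {
      if (rec_len < 2)
          return;
      /* The heuristic: a value passed through ntohs() very likely came off
       * the network, so treat it as attacker-controlled (tainted). */
      uint16_t len_net;
      memcpy(&len_net, rec, sizeof len_net);
      uint16_t claimed_len = ntohs(len_net);

      unsigned char *reply = malloc(claimed_len);
      if (!reply)
          return;
      /* Bug: claimed_len is never checked against rec_len, so this copy can
       * read far past the end of the record (the Heartbleed pattern). */
      memcpy(reply, rec + 2, claimed_len);
      /* ... send reply back ... */
      free(reply);
  }

  int main(void)
  {
      /* A well-formed record: claims 4 payload bytes and actually has them. */
      unsigned char ok[6] = {0x00, 0x04, 'a', 'b', 'c', 'd'};
      handle_record(ok, sizeof ok);
      /* A malicious record would claim (say) 0x4000 bytes while sending only
       * a few, and handle_record() would then read far out of bounds. */
      return 0;
  }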

However, this clever heuristic is pretty limited; it helps find Heartbleed, and a few
extremely similar problems, but it's hardly a general solution for finding vulnerabilities.
So I'll commend this heuristic as a tool improvement, while simultaneously noting
that this particular improvement will have limited utility.
Other toolmakers have done other things, but I think this demonstrates the point.


> Today, I'm asking myself the same
> question about AFL. Was it too counterintuitive to set this up? Were
> there other barriers to entry? Can I fix this now?

I think there are a couple of issues, some of which aren't really "tool" technology issues.

One is that until recently, fuzzers generally have been much more limited in
their ability to actually find and report vulnerabilities:
1. Fuzzers (particularly ones that are not generational)
are notorious for doing only "shallow" examinations.  Fuzzers
that just generate random data, with no feedback loop like afl's, don't go deep.
Generational fuzzers get much deeper, but they require more work to give them that information.
Reports like "Pulling JPEGs out of thin air"
http://lcamtuf.blogspot.com/2014/11/pulling-jpegs-out-of-thin-air.html
show that perhaps some fuzzers are becoming better at diving in.
2. Fuzzers historically don't detect problems like this.
Out-of-bounds reads do not normally cause crashes, so fuzzers historically won't detect them.
Reports like Hanno's show that when combined with ASan, fuzzers are more likely to
detect whole classes of problems (see the small example below).
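
Here is a tiny illustration of point 2 (my own example, not from Hanno's post).
Compiled normally, this program almost never crashes even though it reads well past
a heap buffer; compiled with -fsanitize=address it aborts immediately with a
heap-buffer-overflow report:

  /* Out-of-bounds heap read that usually does NOT crash without ASan. */
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  int main(void)
  {
      char *buf = malloc(8);
      if (!buf)
          return 1;
      memcpy(buf, "secret", 7);

      /* Over-read: copy 64 bytes out of an 8-byte allocation.  The adjacent
       * heap memory is normally mapped, so there is no SIGSEGV; ASan's
       * redzones catch it instead. */
      char leak[64];
      memcpy(leak, buf, sizeof leak);
      printf("%.8s...\n", leak);

      free(buf);
      return 0;
  }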

So one "barrier to entry" is simply the time it takes for people to
discover that something has gotten better, including tutorials that show how to use ASan
simultaneously with fuzzers like afl.  Making it *easier* to use them together helps too.
So there need to be better tutorials and easier ways to combine ASan with fuzzers
(as you know, I co-created one solution that works today on Linux).

Also, if you want to test a network protocol (like SSL/TLS),
you have to actually run the code for the protocol.  That's hardly a big insight :-).
Currently afl is *only* a file fuzzer.  Hanno simply created some stub code so that
the file-fuzzing machinery could exercise the network code.  This is hardly new; other people
have done it before.  What's more, I think it is a *much* more general
technique than the "detect byte swap" heuristic I described above.
That's why I think this blog post is valuable; it shows that by simply wrapping a
network protocol (a really *general* approach that's not hard to emulate),
you can use an advanced file-fuzzing tool combined with ASan to find vulnerabilities.
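
For the curious, the general shape of such a wrapper is tiny.  The sketch below is
generic (it is *not* Hanno's actual stub, and handle_packet() is a hypothetical
stand-in for the real protocol entry point): read the file afl hands you, then feed
the bytes to the packet-processing code instead of a socket.

  /* Generic file-to-"network" fuzzing wrapper sketch (not Hanno's stub). */
  #include <stdio.h>
  #include <stdlib.h>

  /* Hypothetical stand-in; in practice, call the server's real
   * record/packet-processing function here. */
  static void handle_packet(const unsigned char *data, size_t len)
  {
      (void)data; (void)len;   /* parse 'data' as if it arrived off the wire */
  }

  int main(int argc, char **argv)
  {
      if (argc < 2)
          return 1;
      FILE *f = fopen(argv[1], "rb");
      if (!f)
          return 1;

      unsigned char buf[65536];
      size_t n = fread(buf, 1, sizeof buf, f);
      fclose(f);

      /* afl mutates the input file; ASan flags any out-of-bounds access
       * triggered while the network code parses it. */
      handle_packet(buf, n);
      return 0;
  }

In practice you would compile something like this with afl's instrumenting compiler
plus ASan, and point afl-fuzz at a directory of sample packets.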

That means that another "barrier to entry" to be solved is that it should be
easier to fuzz network protocols using tools like afl.
That's been discussed in the afl-users mailing list,
but perhaps this gives it more urgency.  If anyone has ideas on how to solve this
in a general and easy-to-use way, I'd love to hear about it.

So while it's right to be skeptical of reports that say, "I found the problem I
already knew was there", I think this has a lot of merit.
If you want to fuzz a network protocol, and you have a fuzzer that only
generates files, then to use it (today) you need to wrap it in something
that does a conversion between files and (quasi) network activity.
That is unsurprising and very general.

Anyway, I hope this helps.

--- David A. Wheeler
