Date: Wed, 8 Oct 2014 08:38:12 -0700
From: Tim <tim-security@...tinelchicken.org>
To: oss-security@...ts.openwall.com
Cc: "David A. Wheeler" <dwheeler@...eeler.com>
Subject: Re: Thoughts on Shellshock and beyond


Hi Michal,

I sincerely admire your work, both on this issue and in many of your
past projects.  However, I disagree with you on many of the points
below.


> I feel that to some extent, "separation of code and data" is an
> overused, overly simplistic, and arbitrarily applied mantra; it is
> also the antithesis of interpreted scripting (and a good chunk of
> other stuff in computing), for mostly valid reasons.

Of course any principle or rule of thumb can be over-used or applied
to situations where it doesn't make sense.  Exceptions make the rule.
But if you aren't even aware of the rule or principle to begin with
(as apparently many programmers aren't), then you won't realize where
the edge cases are and when you can get away with breaking it.

I do appreciate a pragmatic attitude about these kinds of things when
I try to apply principles to concrete problems.  But it was my
impression that David was looking for some inductive reasoning to
generalize this to a "class" of issues or in some other way try to
help us understand how to avoid similar issues in the future.  So
generalizing something was the goal.


> Heck, did you know that web fonts loaded and displayed by your browser
> come with an embedded hinting bytecode that gets executed in a
> miniature VM? And while this is kind of crazy, it's there because...
> well, there aren't that many sane alternatives.

I'm not familiar with the details of web font hinting.  Maybe you are
right and there aren't really other ways to achieve this without some
kind of code embedded in the data.  And if the bytecode is
well-defined and well sandboxed, then perhaps there's not much risk.
But if such a system were designed by a programmer who didn't
appreciate the potential risk involved in combining code and data,
then how well do you think that would have worked out?

This case stands in stark contrast to both the OGNL/Struts case and
the Bash case.

Countless web frameworks successfully parse POST parameters without
interpreting them as code.  The parameters are simply mapped to hash
tables, arrays, and strings.  The Struts programmers didn't appreciate
the risk of reusing their OGNL parser as a data parser, so the project
has basically become a security nightmare repeatedly waiting to
happen.  It's not like they are using special OGNL scripting features
in these parameters; OGNL is just used as a data parser.  There was
clearly no need for this kind of mixing.
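
To illustrate the data-only approach, here is a minimal Python sketch
(not taken from any particular framework).  It assumes a form-encoded
POST body; the function name parse_post_body and the hostile-looking
parameter name are hypothetical.  The point is only that the parser
maps names and values to strings and never evaluates them.

```python
from typing import Dict, List
from urllib.parse import parse_qs


def parse_post_body(body: str) -> Dict[str, List[str]]:
    """Map a form-encoded POST body to plain strings.

    Names and values are percent-decoded and stored as-is; nothing is
    ever handed to an expression evaluator.
    """
    return parse_qs(body, keep_blank_values=True)


# Hypothetical request body: the second parameter *name* looks like an
# expression, but to a data-only parser it is just an odd dictionary key.
body = "user=alice&%28%23ctx.exec%28%27id%27%29%29=1"
params = parse_post_body(body)

for name, values in params.items():
    print(repr(name), "->", values)
```

A framework that instead passes each parameter name to a general
expression evaluator turns that same odd key into something that runs.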

In Bash, you guys have already found a reasonable solution to
separating code and data by putting functions in their own namespace.
Could there have been alternative ways to communicate function
definitions to subprocesses if the bash programmers had more than a
few days to think about it?  Probably.  So there was clearly not a
need for mixing code and data; it just happened to be convenient so
many years ago.
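
The namespace idea can be modeled in a few lines of Python.  This is
a sketch of the concept, not Bash's actual import code.  The
BASH_FUNC_ prefix and %% suffix follow the naming used by patched
Bash as I understand it, so treat them as an assumption, and
split_environment is a hypothetical helper.  What matters is that the
variable's name, never its value, decides whether it is treated as a
function definition.

```python
FUNC_PREFIX = "BASH_FUNC_"  # assumed reserved prefix for exported functions
FUNC_SUFFIX = "%%"          # assumed reserved suffix


def split_environment(env):
    """Separate function definitions from ordinary variables by name only.

    Returns (functions, variables).  A value that merely *looks* like a
    function body stays in `variables` and is treated as inert data.
    """
    functions, variables = {}, {}
    reserved_len = len(FUNC_PREFIX) + len(FUNC_SUFFIX)
    for name, value in env.items():
        in_namespace = (
            name.startswith(FUNC_PREFIX)
            and name.endswith(FUNC_SUFFIX)
            and len(name) > reserved_len  # require a non-empty function name
        )
        if in_namespace:
            func_name = name[len(FUNC_PREFIX):-len(FUNC_SUFFIX)]
            functions[func_name] = value
        else:
            variables[name] = value
    return functions, variables


# Hypothetical environment: one deliberately exported function, and one
# attacker-influenced variable carrying a Shellshock-style payload.
env = {
    "BASH_FUNC_greet%%": "() { echo hello; }",
    "HTTP_USER_AGENT": "() { :; }; echo pwned",
}
functions, variables = split_environment(env)
print("functions:", functions)
print("variables:", variables)
```

An attacker who controls only the value of a variable such as
HTTP_USER_AGENT cannot move it into the function namespace.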

So I assert that while some edge cases may require mixing code with
data, the vast majority of the time it isn't necessary and indeed
won't diminish the power of the software.

On the flip side, we DO need to provide programmers with simple rules
of thumb to help keep them out of trouble.  So many developers don't
really care all that much about security.  While they wouldn't want
their software to be vulnerable to attack, they won't spend a great
deal of time studying security issues.  So we need "soundbites", if
you will, to give them a starting point on where to tread lightly.


> (Some 15 years ago, I would have given you a different answer - I even
> had a pet project of a brand new operating system that would solve all
> of world's ills. Today, I sort of accept that we're stuck with Unix
> and that there's plenty of usability-security trade-offs that exist
> for a reason, not just because other people are clueless ;-).

Yup, I went through a similar phase.  You and I are about the same age
and likely have a similarly geeky back-story.

Now let me say that I think "security-usability tradeoffs" is an
over-used concept.  While it is *totally true* that in some cases you
simply have to choose more security or more usability, these are often
not mutually exclusive.  This trade-off concept is used over and over
again by people (who are mentally lazy) to stop thinking about more
clever solutions.  There always is a trade-off, when you dig deep
enough, but in many real-world problems there are solutions that
improve both security *and* usability.  At least, until you reach some
sort of asymptotic limit (at which point you really do have to
choose between the two).  I'm not saying you are mentally lazy by any
stretch, but don't forget that many other people are.  So don't give
them an excuse to stop thinking.


> 1) The feature was clearly added with no basic consideration for the
> possibility of ever seeing untrusted data in the value of an
> environmental variable. This lack of a threat model seems to be the
> core issue, essentially precluding the discussion of potential "best
> practices" such as namespaces, magical out-of-band function passing,
> etc.
> 
> Ideally, post Morris worm, this assumption should have raised some
> eyebrows. On the flip side, the code predated much of the modern
> infosec practice, and it's unlikely that any security engineers
> monitor bash development even today - so while it's easy to prescribe
> solutions in retrospect, not sure how credible they can be...

It was certainly hard for the original developer to anticipate how
this would become a problem, given the time and place.  But I think we
can try to learn from this and similar issues and hopefully make fewer
of these mistakes in the future.



> 2) The mechanism wasn't well-documented *and* just as importantly, has
> fallen into near complete obscurity, largely precluding security
> researchers from bumping into it by accident. The "not falling into
> obscurity" part is not solvable, although it's a pattern that also
> haunts the browser world, and may be an argument for aggressively
> sunsetting features that do not catch on - something currently not
> mentioned on your list.
> 
> The detailed documentation part is perhaps easier to tackle. The
> security properties of shells are generally under-documented and
> counterintuitive, as evidenced in some of the followup discussions
> where somebody was showing off a "safe" use of system() supposedly
> rendered unsafe by Florian's patch. Decent security-centric docs,
> authored or even merely just reviewed by the maintainers, would have
> helped highlight the risk.

To some extent, "don't mix code and data" is about expectations: both
user expectations and programmer expectations.  If I invent a new file
format for storing images of cute puppies (with a .PUP file
extension), then it seems unlikely that users would expect that file
format to include executable code and be somehow harmful to their
computer.

So yes, documentation is important for setting expectations.  But no
one reads the manual, either.  Users and programmers often make a
significant number of assumptions.  If a format primarily carries
data, they will *assume* it is just data, and not code.  If the format
is primarily billed as a code format, then people will treat it as
such.  The danger comes when the expectation is that something is
data, when it can also include code.  Mixing the two confuses
expectations for at least some large percentage of users/programmers.


> 3) Apparently, for 20+ years, nobody in the security community has
> ever read a book on shell programming that mentioned this feature, and
> has never ventured deep enough into the man page, to have a "hmm, I
> wonder how that works" moment when seeing a vague mention of the
> feature.

See previous comment on no one reading the manual. ;-)


Cheers,
tim
