oss-security - Re: linux-distros list policy and Linux kernel, again

Message-ID: <20230830152633.GA6199@openwall.com>
Date: Wed, 30 Aug 2023 17:26:33 +0200
From: Solar Designer <solar@...nwall.com>
To: Willy Tarreau <w@....eu>
Cc: oss-security@...ts.openwall.com,
	Vegard Nossum <vegard.nossum@...cle.com>,
	Jiri Kosina <jkosina@...e.cz>, Donald Buczek <buczek@...gen.mpg.de>,
	Greg KH <gregkh@...uxfoundation.org>
Subject: Re: linux-distros list policy and Linux kernel, again

Hi Willy,

On Mon, Aug 28, 2023 at 09:17:56PM +0200, Willy Tarreau wrote:
> what I suspect instead
> is that reporting security issues is so stressful for anyone (constantly
> making sure not to make a mistake or send to the wrong people) that once
> they see the fix merged, they just relax and consider the job done, so
> most likely linux-distros isn't even contacted at this point. And it's
> very possible that some having experienced a friendly process on s@k.o
> and felt some unneeded pressure on l-d just don't want to go there again.
> I personally see this a bit like projects asking to sign a CLA: you come
> there saying "hey, you had a bug there, I fixed it, look" and in return
> you feel like you're swamped by some heavy process so you just give up,
> swearing you'll never go there again. That might be exaggerated but I
> can understand how it could be felt that way. I'm having periods where
> it's very difficult for me to find even one extra hour a day, and I would
> certainly not appreciate at all being pressured like this to tidy my stuff
> and prepare for it to be published when I have other things to do, after
> having made the effort to report a bug. So that's something to keep in
> mind, not everyone deals with it the same way.

Of course, I understand this.  (linux-)distros isn't a send-and-forget
list, and this does exclude its usage by people who are aware of this
fact and only want or have time to send one message without staying on
top of the issue afterwards.  The obvious alternative would be
a vendor-sec lookalike, without specific rules, which had other problems of its own.

In practice, no matter what we say in the policy, sometimes the reporter
just won't communicate further.  In those cases, (specific) list members
should take over, including making the eventual public disclosure.  What
we could possibly do, if we want to and have the resources, is make this
a pre-allowed option for reporters, instead of an undesirable exception,
which it currently is.

> I couldn't blame a bug reporter for
> wanting to have their week-ends and nights again and think everything's
> behind them and in someone else's hands now.

Right.  This is in part a matter of resources - are we providing only
the lists infrastructure and list members' best-effort volunteer
contributions to issue handling, or are we providing any guaranteed
service?  For the latter, perhaps list admin(s) (me) should always take
over whenever the member distros don't handle those sorts of
contributing-back tasks on time.  Then we'll be able to provide a
guarantee that all issues will be handled without the reporter having to
stay on top of them.

A drawback is that this may encourage lower-quality or lower-relevance
reports, including of issues that are not worth handling in private.  So
it could end up wasting those extra resources allocated to this effort.

> On Mon, Aug 28, 2023 at 08:05:18PM +0200, Solar Designer wrote:
> > That said, can you share more detail on the specific issue you referred
> > to above and its handling/disclosure timeline?  Was it ever brought to
> > oss-security, and if not then why not?
> 
> I just checked and I'm not seeing any traces of it there. I don't even
> know who normally notifies about such issues there.

If you worked on the issue, then perhaps you were the most appropriate
person to notify oss-security about it?

Note: this is unrelated to disclosure timelines, policy, etc. - I am
talking about public notification for the already-public issue.

> > I am guessing this is related to your work on random32 in 2020:
> > 
> > https://lore.kernel.org/netdev/20200808152628.GA27941@SDF.ORG/
> 
> Ah yes indeed it's that one! How painful memories suddenly come back!
> 
> > If so, it looks like the original issue became public via your commit in
> > July 2020, but further issues with that fix commit were discovered and
> > fixes for them prepared in public in August and only merged in October.
> > 
> > So I guess some lengthy private discussion occurred before July 2020,
> 
> Yeah it started in early March, and Eric, Amit and I basically spent all
> our week-ends and numerous evenings experimenting with different methods
> to deliver good enough randoms without breaking the principle of not
> reusing the same IDs too fast (still have a long minimal period), and
> running tests on real traffic, counting failures. At some point in July
> I gave up and concluded we couldn't fix it alone between us and needed
> some public help, hence the posting.

After I posted the above, Brad Spengler pointed me at another related
issue that you worked on in 2022:

https://lore.kernel.org/all/20220502084614.24123-1-w@1wt.eu/
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ef562489818723ea0a66c57bfdfbf151ad568c42

In fact, your description above sounds like it could be (in part?) for
that newer issue.

Anyway, perhaps both of these should have been brought to oss-security
at some point, but they were not?  As to handling them in private on
linux-distros, I see little value in that, so they're not a reason for
us to have allowed longer embargoes.

> > but it wasn't enough anyway, which makes me question the value of having
> > the initial handling in private.  Maybe the issue wasn't critical enough
> > and privately-fixable enough for that.  Maybe this actually illustrates
> > that such issues are best handled entirely in public... if it were not
> > for the researchers' incentive you mentioned (plan to publish a paper).
> 
> It's always the same for random attacks: the reporter sees a very high
> success rate in a lab while those dealing with production know for sure
> that the success rate is so close to zero in the field that it cannot be
> represented on a float. But there's a wide spectrum between the two,
> such as mostly idle routers serving as route reflectors, or monitoring
> devices etc. Thus you start from "it could theoretically be damaging in
> certain environments, let's be careful", with the researchers initially
> willing to be discreet since working to prepare a paper. As we made
> progress and saw the risks of attack significantly fade away but never
> close enough to zero, we concluded that in the worst case we had something
> better than the original and it wasn't that much of a problem anymore to
> make it public. But I think the researchers also progressed on their side
> seeing the hopes to get a quick fix fade away and the reality hit the
> theory, thus being more willing to disclose more of their work. It's a
> bit of everything.

OK.  None of this feels like good material for linux-distros (except
maybe very close to its publication, if there was a known date), but it
does feel like good material for eventual summary on oss-security.

> > Alternatively, we may need to relax the policy.
> 
> I personally think it does have a flaw that is emphasized by the linux
> kernel handling but can actually affect other projects. Some sole
> developers might just not have enough resources to do everything in
> 14 days, from diagnosing the problem at night or only during a few work
> hours, setting up a lab on the week-end to test a fix, to contacting
> whoever needs to be contacted and making releases. Some even make the
> mistake of developing new stuff in maintenance branches and feel like
> they need to finish before releasing (already seen)! I remember having
> had to search in my boxes of hardware to re-assemble a working PC with
> a floppy drive just to be able to validate a fix in the floppy driver.
> You can be sure I only did that the week-end after the report, but
> that's possibly 5 days lost already!

This is partially addressed in our current instructions, which say:

"Please notify upstream projects/developers of the affected software,
other affected distro vendors, and/or affected Open Source projects
before notifying one of these mailing lists in order to ensure that
these other parties are OK with the maximum embargo period that would
apply (and if not, then you may have to delay your notification to the
mailing list)"

Incidentally, this is consistent with the Linux kernel documentation
edit that prompted this thread.

> I understand the rationale behind your policy. I, too, was on vendor-sec
> where we saw some vendors say "just FYI we're trying to fix this, we'll
> keep you updated" and one year later, no news. But all those doing
> serious work (and there are, and the linux security team is doing that
> serious work) can be heavily penalized by that policy when they're not
> quick enough to obtain a fix. The linux people are known for being vocal,
> so you hear about them. But other developers might just feel completely
> crushed by this and it could really be harmful to them, especially when
> they're new to this and haven't been dealing with security reports for
> 25 years like many of us.
> 
> That's why I tend to think that what would better address what you want
> to prevent, is ensuring the discussion doesn't come to a stall. This
> could remove a lot of frustration. And if something has to be published
> before the end because the developers or vendor stay silent, it's much
> more powerful to say "they didn't dare respond for 14 days" than
> "they couldn't figure out a working fix for this complex issue in 14 days".

I had similar thoughts too, but OTOH allowing arbitrarily long even if
non-stalled discussions means not only longer embargoes and higher risk
and impact of leaks, but also a greater number of simultaneous
discussions on the list.  When issues take a long time to handle and
many are tracked at once, this increases/wastes the effort per issue.

> > So the real problem
> > may be that (linux-)distros is misunderstood as permanently-private
> > rather than temporarily-private.  Unfortunately, I don't know how to
> > address that reliably.  Even with automated delayed publication, some
> > people would initially have the wrong idea... maybe unless they have to
> > pass through a web page with the public archives before finding the
> > posting address?
> 
> Just a stupid idea, it could possibly be addressed by a confirmation
> e-mail on an opening thread. Something like "we need you to confirm that
> what you posted will be made public by YY/MM/DD, if that's really what
> you want, please visit this link within 24h otherwise all your materials
> will be destroyed".

We already use a somewhat obscure posting address and a required Subject
prefix, although the latter is currently not enforced strictly (it is
mostly an anti-spam measure, so it is bypassed by some other keywords
contained in the headers and/or message).  I think part of the problem
was that the kernel documentation gave these away directly, without
people having to see our policy and instructions first.

> I'm not sure, that's just an idea. But yes, it needs
> to be understood as public so that confidential stuff is not shared
> there, and it must be possible to ask for some materials to be erased
> early if the reporter wasn't aware of this or made a mistake (e.g. send
> a pcap just before the security team says "never ever share a pcap!").

There's no reliable way to erase stuff from all subscribers' mailboxes.
At "best", we could exclude it from delayed publication.

> You're welcome. I don't want to interfere with the lists you operate
> nor with those working on them, but I observe that there has been some
> friction multiple times for reasons that are probably not too hard to
> address if respective participants discuss just a bit, which is why I'm
> sharing some observations ;-)

I appreciate this.

Thanks,

Alexander