Date: Fri, 23 Oct 2015 00:27:20 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Re: Would love to see reconsideration for domain and
 search

On Thu, Oct 22, 2015 at 04:37:18PM -0700, Tim Hockin wrote:
> On Thu, Oct 22, 2015 at 4:00 PM, Josiah Worcester <josiahw@...il.com> wrote:
> > On Thu, Oct 22, 2015 at 3:37 PM Tim Hockin <thockin@...gle.com> wrote:
> >>
> >> On Thu, Oct 22, 2015 at 2:56 PM, Rich Felker <dalias@...c.org> wrote:
> >> > On Thu, Oct 22, 2015 at 02:24:11PM -0700, Tim Hockin wrote:
> >> >> Hi all,
> >> >>
> >> >> I saw this thread on the web archive but am not sure how to respond to
> >> >> the thread directly as a new joinee of the ML.  I hope this finds its
> >> >> way...
> >> >
> >> > No problem; just starting a new thread like this and quoting the old
> >> > one is fine.
> >> >
> >> >> I am one of the developers of Kubernetes and I own the DNS portion, in
> >> >> particular.  I desperately want to use Alpine Linux (based on musl)
> >> >> but for now I have to warn people NOT to use it because of this issue.
> >> >>
> >> >> On Fri, Sep 04, 2015 at 02:04:29PM -0400, Rich Felker wrote:
> >> >> > On Fri, Sep 04, 2015 at 12:11:36PM -0500, Andy Shinn wrote:
> >> >> >> I'm writing the wonderful musl project today to open discussion
> >> >> >> about the future possibility of DNS search and domain keyword
> >> >> >> support. We've been using musl libc (by way of Alpine Linux) for
> >> >> >> new development of applications as containers that discover each
> >> >> >> other through DNS and other software defined networking.
> >> >> >>
> >> >> >> In particular, we are starting to use applications like SkyDNS,
> >> >> >> Consul, and Kubernetes, all of which rely on local name
> >> >> >> resolution in some way using search paths. Many users of the
> >> >> >> Alpine Linux container image have also expressed their desire for
> >> >> >> this feature at
> >> >> >> https://github.com/gliderlabs/docker-alpine/issues/8.
> >> >> >>
> >> >> >> The "functional differences from glibc" page says of the domain
> >> >> >> and search keywords: "Support may be added in the future if there
> >> >> >> is demand". So please consider this request an addition to whatever
> >> >> >> demand for the feature already exists.
> >> >> >>
> >> >> >> Thank you for your time and great work on the musl libc project!
> >> >> >
> >> >> > I think this is a reasonable request. I'll look into it more.
> >> >> >
> >> >> > One property I do not want to break is deterministic results, so
> >> >> > when a search is performed, if any step of the search ends with
> >> >> > an error rather than a positive or negative result, the whole
> >> >> > lookup needs to stop and report the error rather than falling
> >> >> > back. Falling back is not safe and creates a situation where DoS
> >> >> > can be used to control which results are returned.
> >> >>
> >> >> I understand your point, though the world at large tends to disagree.
> >> >> Everyone has a primary and secondary `nameserver` record (or should).
> >> >> If the first one times out, try the second.  Most resolver libs seem
> >> >> to accept a SERVFAIL response or a timeout as a signal to try the next
> >> >> server, and I would encourage you to do the same.
> >> >
> >> > musl intentionally does not do this because it yields abysmal
> >> > performance. If the first nameserver is overloaded or the packet is
> >> > lost, you suffer several-second lookup latency.
> >>
> >> But at least it works eventually.  You're faced with a choice.  Wait 2
> >> seconds for ns1 to time out and then fail in a way that most apps don't
> >> handle well, or wait for 2 seconds and then (usually) get a fast
> >> response from ns2.
> >>
> >> It seems better in every way to eventually succeed, though I agree
> >> it's a bit less visible.
> >
> >
> > With musl's current design, you get a request to ns1 and ns2, and the first
> > authoritative response wins. So, if ns1 fails then all is well and
> > performance isn't even notably impacted. What you are describing appears to
> > be how you would *have* to implement it if you decide against considering
> > all servers equal, but instead try and serve the union of their responses
> > (that is, wait for timeout and then fail).
> 
> The authoritative-ness is a dimension I had not considered.  I could
> believe that the first authoritative answer wins, but what if you only
> get a non-authoritative answer from ns1 and ns2 never responds?

I don't think Josiah's use of "authoritative" here is consistent with
what musl actually does, and it's likely not a useful dimension. A
stub resolver is generally not intended to communicate with
authoritative nameservers. The current behavior is that a positive or
negative (nxdomain) result is treated as concluding the query, and
other results (servfail, refused, etc.) are treated as inconclusive
and allow the query to continue until it times out. (Also, servfail
results in a limited number of immediate retries; this behavior was
arrived at based on the behavior of nameservers that fail with
servfail.)
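
To make the rule above concrete, here is a rough sketch in Python. It is
not musl's code: the function names, the string labels, and the retry
limit of 2 are assumptions made for illustration (the text above only
says "a limited number"). Only the RCODE numbers are the standard DNS
values.

```python
# Illustrative only: not musl's code. Names and the retry limit are assumptions.
NOERROR, FORMERR, SERVFAIL, NXDOMAIN, NOTIMP, REFUSED = 0, 1, 2, 3, 4, 5

MAX_SERVFAIL_RETRIES = 2  # hypothetical; the message only says "limited"


def classify(rcode):
    """Map a DNS RCODE to how the stub resolver treats the reply."""
    if rcode in (NOERROR, NXDOMAIN):
        return "conclusive"      # positive or negative answer ends the query
    if rcode == SERVFAIL:
        return "retry"           # inconclusive, but worth an immediate retry
    return "inconclusive"        # e.g. REFUSED: keep waiting for other servers


def resolve(replies):
    """Consume (server, rcode) replies in arrival order.

    Returns (server, rcode, retries) for the first conclusive reply, where
    retries lists the servers that were re-queried after SERVFAIL, or
    (None, None, retries) if the replies run out, which models a timeout.
    """
    servfails = {}
    retries = []
    for server, rcode in replies:
        kind = classify(rcode)
        if kind == "conclusive":
            return server, rcode, retries
        if kind == "retry":
            servfails[server] = servfails.get(server, 0) + 1
            if servfails[server] <= MAX_SERVFAIL_RETRIES:
                retries.append(server)
    return None, None, retries
```

With this model, a REFUSED from ns1 followed by a NOERROR from ns2
concludes with the ns2 answer, while a stream of nothing but SERVFAIL
replies ends as a timeout after the retry budget is used up.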

> > Consider what would happen if ns1 and ns2 have different responses, but ns1
> > for whatever reason times out (potentially an attacker). Then you get the
> > results for ns2, even though ns1 is intended to override it.
> 
> I agree in theory.  And yet this is how most resolvers work today.
> Are they all broken?

No, the resolvers are not broken. The configuration is, at least the
way I see it. The intent of the resolvers is that they're
communicating with redundant servers, not a sequence of overlaid
hostname databases. If that assumption is satisfied, there's no
problem.

BTW I think there are other strong reasons to move to a model based on
a local nameserver that does the unioning, not just performance. The
most compelling is DNSSEC, which requires a trusted channel between
the nameserver and the stub resolver in order for results to be
meaningful/trusted. In the future everybody should be running a
nameserver on localhost to do DNSSEC signature validation. In that
scheme, resolv.conf would just contain 127.0.0.1 (or could be omitted
entirely since that's the default, at least on musl).
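
As a small illustration of that last point, the sketch below reads the
"nameserver" lines from resolv.conf-style text and falls back to
127.0.0.1 when there are none. The function name and the simplified
parsing are assumptions for the example, not a description of how musl
implements it.

```python
def nameservers(resolv_conf_text):
    """Return the configured nameservers, defaulting to localhost.

    Simplified on purpose: only "nameserver <address>" lines are read;
    comments and other keywords (search, domain, options) are skipped.
    """
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    # An empty or missing list means "ask the resolver on this machine".
    return servers or ["127.0.0.1"]
```

Under this sketch, an empty file and a file containing only
"nameserver 127.0.0.1" give the same result.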

Rich
