Date: Wed, 31 Aug 2022 19:59:15 -0400
From: Rich Felker <dalias@...c.org>
To: Dalton Hubble <dghubble@...il.com>
Cc: musl@...ts.openwall.com
Subject: Re: musl resolver handling of "search ." in /etc/resolv.conf

On Wed, Aug 31, 2022 at 10:33:05AM -0700, Dalton Hubble wrote:
> Hey folks,
> 
> I wanted to flag a possible issue with musl handling of DNS "search ." in
> /etc/resolv.conf. The easiest way I have to repro and consume musl is
> starting an alpine or busybox musl container image.
> 
> podman run -it docker.io/alpine:3.16.2 /bin/ash
> 
> Edit /etc/resolv.conf to the following (note the "." at the end of search):
> 
> ```
> search default.svc.cluster.local .
> nameserver 8.8.8.8
> options ndots:5
> ```
> 
> ```
> wget www.google.com
> wget: bad address 'www.google.com'
> ```
> 
> Remove the "." from search and wget will work fine again.
> 
> https://github.com/coreos/fedora-coreos-tracker/issues/1287 has some great
> details showing DNS packet capture and a malformed packet.
> 
> Broader context is that systemd and recently Kubernetes start adding
> "search ." to resolv.conf in certain scenarios, which seems to break
> musl-based resolvers.
> - https://github.com/systemd/systemd/pull/17201
> - https://github.com/kubernetes/kubernetes/pull/109441
> - https://github.com/kubernetes/kubernetes/issues/112135

Ugh. It was not foreseen that . would be put in the search domains
list, and putting it there, especially anywhere but the final position
in the list, recreates a bad behavior that we explicitly tried to
avoid having in musl.

The mechanism of the failure is that malformed DNS queries are sent
with a literal . at the end of the name. This probably also happens if
the domains in the search list end in a dot. Since the queries are
malformed, they get no response (or get ServFail), and then the search
cannot continue.
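
A minimal sketch of how such a name can arise, assuming candidates are
formed by plain concatenation of the name, a dot, and the raw search
entry. The helper names are hypothetical and this is not musl's code;
whether the extra dot then reaches the wire depends on the query
encoder, which is not shown.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: join a name and a search entry the naive way,
 * as name + "." + entry, with no normalization of the entry.
 * Returns the length written, or -1 if it did not fit. */
static int naive_join(char *out, size_t outsz,
                      const char *name, const char *entry)
{
    int n = snprintf(out, outsz, "%s.%s", name, entry);
    return (n < 0 || (size_t)n >= outsz) ? -1 : n;
}

/* Nonzero if the joined name still ends in a literal dot. */
static int ends_in_dot(const char *s)
{
    size_t l = strlen(s);
    return l && s[l-1] == '.';
}
```

With the search entry "." the joined name for www.google.com is
"www.google.com..", and with an entry like "example.org." it is
"www.google.com.example.org."; both still end in a dot.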

This can be fixed by properly stripping the final dot in search
entries, and skipping ones that are otherwise malformed. Then we need
to decide what to do with the empty (root) search suffix. There are 3
options I see:

- Actually support it as a search. This is *bad* behavior, but at
  least unlike the version of this behavior musl explicitly does not
  implement, it was explicitly requested by the user. Except that it
  wasn't, because systemd is just putting it in everyone's
  resolv.conf.

- Skip it completely. Never search root; wait for the end of the
  search list and query root as always.

- End search on encountering it and go directly to the post-search
  query at root.
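
To make the three options concrete, here is a rough sketch of the
stripping/skipping step and of each policy for the root suffix. All
names (root_policy, normalize_suffix, build_candidates) are
hypothetical, the code only builds the list of names that would be
queried, and it assumes the input name is valid, has no trailing dot,
and is under 254 bytes. It is an illustration, not a proposed patch.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names throughout; this is not musl's code. */
enum root_policy { ROOT_SEARCH, ROOT_SKIP, ROOT_END };

/* Normalize one search entry. Returns -1 if malformed, 0 for the
 * root (empty) suffix, else the length with one final dot stripped. */
static int normalize_suffix(const char *tok, size_t len)
{
    if (len && tok[len-1] == '.') len--;
    if (len > 253) return -1;
    for (size_t i = 0, start = 0; i <= len; i++) {
        if (i == len || tok[i] == '.') {
            size_t lab = i - start;
            /* empty or oversized label in a non-root suffix */
            if (len && (lab == 0 || lab > 63)) return -1;
            start = i + 1;
        }
    }
    return (int)len;
}

/* Fill out[] with the names to try, in order; returns the count.
 * The post-search root query is appended unless root was already
 * tried as a search entry. */
static int build_candidates(const char *name, const char *search,
                            enum root_policy policy,
                            char out[][256], int max)
{
    int n = 0, root_done = 0;
    const char *p = search, *z;
    for (; n < max; p = z) {
        while (isspace((unsigned char)*p)) p++;
        for (z = p; *z && !isspace((unsigned char)*z); z++);
        if (z == p) break;                    /* end of list */
        int l = normalize_suffix(p, z - p);
        if (l < 0) continue;                  /* malformed: skip */
        if (l == 0) {                         /* root suffix */
            if (policy == ROOT_END) break;    /* option 3 */
            if (policy == ROOT_SKIP) continue;/* option 2 */
            snprintf(out[n++], 256, "%s", name); /* option 1 */
            root_done = 1;
            continue;
        }
        if (strlen(name) + 1 + (size_t)l > 253) continue;
        snprintf(out[n++], 256, "%s.%.*s", name, l, p);
    }
    if (!root_done && n < max)
        snprintf(out[n++], 256, "%s", name);
    return n;
}
```

For "search . default.svc.cluster.local" and the name www.google.com,
ROOT_SEARCH tries root first and then the suffixed name, ROOT_SKIP
tries the suffixed name and then root, and ROOT_END tries root only.
When . is the last entry, all three produce the same sequence.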

If it weren't for systemd and other things creating searches for .
without the user's intent, I think the first option would clearly be
the most reasonable. It provides a way to explicitly "get back" the
functionality musl omits, on an opt-in basis. And maybe systemd is
only emitting it as "search .", not putting . in the middle of other
search domains?

One of the other options might be a more conservative choice to make
now, to avoid creating a new "feature" without thinking through what
consequences it might have. We could always allow searching root
later after there's been time to think through the consequences,
rather than rushing it in as part of a bugfix.

Anyone care strongly about this one way or another?

Rich
