musl - Re: Resolver overhaul concepts

Message-ID: <20140504162437.GA27258@brightrain.aerifal.cx>
Date: Sun, 4 May 2014 12:24:37 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Resolver overhaul concepts

On Sun, May 04, 2014 at 05:07:33PM +0100, Laurent Bercot wrote:
> 
>  I believe the very first thing to address is what exactly you call
> a resolver.

There are some legacy dn_*/res_* interfaces in demand which are
presently supported only poorly or not at all. A side goal of the
resolver overhaul is to provide them cleanly without code
duplication. But for the most part, "resolver" means "getaddrinfo",
since it is the only standard, non-deprecated interface to name
resolution.
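
To make concrete what "resolver" means from the caller's side, here is
a minimal sketch of portable getaddrinfo usage. The helper name
count_addresses is hypothetical and exists only for illustration; the
getaddrinfo, gai_strerror and freeaddrinfo calls are the POSIX ones.

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not part of any libc): resolve host/service
 * and return how many addresses came back, or -1 on failure. */
int count_addresses(const char *host, const char *service)
{
    struct addrinfo hints, *res, *p;
    int n = 0, err;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     /* accept IPv4 and IPv6 results */
    hints.ai_socktype = SOCK_STREAM; /* one entry per address, not per socket type */

    err = getaddrinfo(host, service, &hints, &res);
    if (err) {
        /* getaddrinfo reports its own EAI_* codes, not errno values */
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(err));
        return -1;
    }
    for (p = res; p; p = p->ai_next)
        n++;
    freeaddrinfo(res);
    return n;
}
```

Calling count_addresses("127.0.0.1", "80") exercises the same code
path without touching the network, since a numeric address and a
numeric port need no lookup.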

>  getaddrinfo() is a horrible interface, and one of the reasons why
> is that it is loosely designed. Not much is standardized, and it's up
> to you to decide exactly what to do with it; it's important to be
> clear about what is implemented, and to document it, because not all
> applications have the same expectations, and it's very easy to get
> confused when the resolution path is unexpected.

It's standardized by POSIX, and the POSIX text is sufficient to tell
you how to use it for all portable usages. Most of the confusion/mess
comes from non-conforming implementations, particularly in the area
of returning wrong error codes.

>  glibc's getaddrinfo() is the entry point to the NSS layer, which
> can basically implement *any* kind of "name resolution". AFAICT,
> it's not a goal of musl to reimplement the whole NSS spaghetti
> monster, but some applications will depend on /etc/nsswitch.conf
> or something similar; even without supporting /etc/nsswitch.conf,
> it would be nice to provide a mechanism to selectively enable/disable
> at least /etc/hosts lookup and DNS lookup. The current resolution

The policy for supporting something like nss has always been that musl
implements a perfectly reasonable public protocol for providing any
back-end you want: the DNS protocol. You can run a local daemon
speaking DNS and serving names from any backend you like, and this is
the correct way to achieve it (rather than linking random buggy,
likely-not-namespace-clean libraries into the application's address
space). In order to make this the most useful, though, musl should
support nameservers on non-default ports (is there a standard syntax
for this, or can we support one without breaking anything?), and it
would also be nice to be able to override resolv.conf on a per-process
basis (e.g. via the environment).

> policy is hardcoded as "/etc/hosts, then DNS, and nothing else",
> which is a very sensible default, but probably shouldn't be the only
> alternative - or if it is, it should be made abundantly clear.

There was a legacy file, /etc/host.conf, that allowed the order to be
changed, but changing the order seems rather useless to me. On the
other hand suppressing /etc/hosts could be useful in some instances.

> >The concepts of the new DNS query backend are not really solid yet.
> >One idea is that it should support the "search"/"domain" functionality
> >of resolv.conf to allow querying multiple search suffixes in parallel
> >and returning as soon as there's a (possibly zero-length) initial run
> >of negative results followed immediately by a positive result. The
> >cleanest way to implement this kind of thing may be using a callback
> >function for writing each packet and for reading the responses;
> >otherwise, storing all the queries and responses as full DNS packets
> >would take an unwantedly-large amount of space.
> 
>  This is the approach I used in s6-dns (src/libs6dns/s6dns_resolveq.c)
> and it has worked fine for me so far.
>  I don't think the amount of space is a concern here: the typical
> search line is very short - 3 to 4 suffixes at most. You will have
> to store the queries anyway to check the responses against them.

4 suffixes times 2 RR types (A and AAAA) makes for 8 queries, which
takes 4k to store the responses (512 bytes each) and up to 2k to store
the queries. That's not too bad, but along with the address lists,
file buffers, and other stuff getaddrinfo has around, it's getting the
stack usage up to the point where getaddrinfo would probably be the
biggest stack user in musl, which in turn increases the minimum stack
size you need for some usage cases (think: getaddrinfo_a, which makes
one thread per query and would like to be able to set the thread stack
to one page with no guard page).
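
The arithmetic behind that estimate can be spelled out as a small
sketch. The 512-byte figure is the classic DNS-over-UDP message limit;
the 256 bytes per query is an assumption, a rounded per-query figure
implied by the "up to 2k" total, and all names below are hypothetical.

```c
#include <stddef.h>

enum {
    SEARCH_SUFFIXES = 4,   /* suffixes on the resolv.conf search line */
    RR_TYPES        = 2,   /* A and AAAA */
    RESPONSE_BYTES  = 512, /* classic DNS-over-UDP message limit */
    QUERY_BYTES     = 256  /* assumption: rounded size of one stored query */
};

/* One query per (suffix, record type) pair. */
size_t query_count(void)
{
    return (size_t)SEARCH_SUFFIXES * RR_TYPES;
}

/* Bytes needed to keep every response as a full packet. */
size_t response_storage(void)
{
    return query_count() * RESPONSE_BYTES;
}

/* Bytes needed to keep every query as a full packet. */
size_t query_storage(void)
{
    return query_count() * QUERY_BYTES;
}
```

Under these assumptions the responses alone come to 4096 bytes, which
already fills a 4096-byte page before the queries and the rest of
getaddrinfo's working state are counted.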

>  Another question that comes to mind is the timeout and retry policy.
> This is network, it's going to suck; this is DNS, it's going to suck
> even more. getaddrinfo() doesn't allow the user to specify a timeout
> (yay for unboundedly synchronous network-facing interfaces), so it's

For asynchronous use, you call it from its own thread (or use the
getaddrinfo_a extension, which we don't yet provide but which is easy
to provide on your own and which I may add to musl since it's
convenient and ultra-light).
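
A minimal sketch of the "own thread" approach, assuming POSIX threads.
The struct lookup type and the lookup_start/lookup_finish helpers are
hypothetical names made up for this example; this is not how
getaddrinfo_a is implemented anywhere, just one way to get the call
off the caller's thread.

```c
#include <netdb.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical request record shared by the caller and the worker. */
struct lookup {
    const char *host;
    const char *service;
    struct addrinfo *result; /* caller frees with freeaddrinfo() after join */
    int error;               /* return value of getaddrinfo() */
};

/* Thread body: performs the blocking lookup and stores the outcome. */
static void *lookup_worker(void *arg)
{
    struct lookup *l = arg;
    struct addrinfo hints;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    l->result = 0;
    l->error = getaddrinfo(l->host, l->service, &hints, &l->result);
    return 0;
}

/* Start the lookup; returns 0 on success or a pthread_create error. */
int lookup_start(pthread_t *t, struct lookup *l)
{
    return pthread_create(t, 0, lookup_worker, l);
}

/* Wait for completion; afterwards l->error and l->result are valid. */
int lookup_finish(pthread_t t, struct lookup *l)
{
    (void)l; /* results were written in place by the worker */
    return pthread_join(t, 0);
}
```

The caller is free to do other work between lookup_start and
lookup_finish. The struct lookup must stay alive until the join
returns, since the worker writes into it.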

> up to musl to decide what to do: do you resend a query after a soft
> timeout ? do you have a hard timeout after which you report failure ?
> or do you block indefinitely ?

There is presently a hard-coded failure timeout of 5 seconds and a
retry time of 1 second. It would be nice to honor settings from
resolv.conf to tweak these.
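
As a rough sketch of what honoring those settings could look like,
the following parses the timeout:n and attempts:n tokens from a
resolv.conf "options" line, as described in resolv.conf(5). The
struct res_opts type and parse_options_line function are hypothetical,
and how the parsed values would map onto the failure timeout and retry
interval is left open here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the two tunables. */
struct res_opts {
    int timeout;  /* seconds, from "timeout:n" */
    int attempts; /* count, from "attempts:n" */
};

/* Parse one resolv.conf line. If it is an "options" line, update *o
 * with any timeout/attempts values found and return 1; otherwise
 * leave *o untouched and return 0. Unknown options are ignored. */
int parse_options_line(const char *line, struct res_opts *o)
{
    const char *p;
    int v;

    if (strncmp(line, "options", 7) != 0)
        return 0;
    if (line[7] != ' ' && line[7] != '\t')
        return 0;

    p = line + 7;
    while (*p) {
        p += strspn(p, " \t\n");          /* skip separators */
        if (sscanf(p, "timeout:%d", &v) == 1 && v > 0)
            o->timeout = v;
        else if (sscanf(p, "attempts:%d", &v) == 1 && v > 0)
            o->attempts = v;
        p += strcspn(p, " \t\n");         /* skip past this token */
    }
    return 1;
}
```

A caller would initialize struct res_opts with the built-in defaults
and feed each line of the file through parse_options_line, so that a
missing or malformed option simply leaves the default in place.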

>  Doing network communications the right way (especially with an old
> and ugly protocol) is complex. It should be way outside the scope of
> a libc. glibc people have it easy: the DNS part of NSS directly ties
> into libresolv, so they have a full-fledged resolver to use. I say
> we should do the same and tie musl to libs6dns. :P

Using a full-fledged DNS library to provide getaddrinfo is akin to
using GMP to provide printf...

Rich