musl - Re: Resolver overhaul concepts

Message-ID: <5366B1E1.7020008@skarnet.org>
Date: Sun, 04 May 2014 22:32:17 +0100
From: Laurent Bercot <ska-dietlibc@...rnet.org>
To: musl@...ts.openwall.com
Subject: Re: Resolver overhaul concepts

> Requiring port 53 is not very prohibitive relative to resolv.conf and
> nsswitch.conf which are impossible to override without root, but it's
> slightly worse: it might be a problem if you also need a public DNS on
> the same machine.

  You mean running a resolver and a server on the same machine? I've
been doing that for years with tinydns on the outside and dnscache on
127.0.0.1: there's no reason why a user couldn't do the same with a
custom resolution daemon.
  Of course, providing custom resolution *and* DNS data to the outside
world then requires two different public IP addresses, but that's
nothing new: using the same port for resolution and data service is a
fundamental flaw in the DNS protocol in the first place (and the main
reason why mainstream DNS software is so hopelessly monolithic), and
a custom resolution daemon won't be in a different position from, say,
dnscache.


> Of course if we want to make it possible to override the config on a
> per-process basis, requiring port 53 is a fairly serious limitation.

  I don't feel the same way. Name resolution is name resolution; if a
resolver, no matter how it resolves, reads and understands client DNS
queries, then it makes sense for it to listen on port 53 somewhere
(if we forget that DNS data servers also listen on the same port).
I understand the desire for flexibility, and I can imagine cases where
this would be useful, but I don't see a blatant need for it.
Especially with IPv6 around the corner (only two or three decades now),
where address space is cheap.

  
> Yes, tcp is not supported at all. I don't see any reason one would
> need tcp for a non-recursive resolver. In principle a response just
> needs a few more bytes than the request, plus 4 bytes per address (or
> 16 for AAAA), and the request size is bounded just above 256 bytes
> (the max hostname length).

  Plus the authority and additional sections, and... oh wait, you only
have to handle A and AAAA, which shouldn't have any of those. OK, now
I understand how you don't need a full DNS engine. :)

  Still, I wouldn't bet it's going to remain that way. Having 6 or 7 A
records isn't uncommon nowadays (google.com has 6 in most places).
Now imagine 6 or 7 AAAA records instead: it will begin to seriously
flirt with the 512-byte UDP limit. It won't happen in the near future,
but it will happen.
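
  To put rough numbers on that, here is a back-of-the-envelope sketch
in C. It assumes a plain response with no CNAME chain, no authority or
additional records, and every answer's owner name compressed to a
2-byte pointer; the function and constant names are made up for the
illustration and come from no existing library.

```c
#include <assert.h>
#include <stddef.h>

enum {
    DNS_HEADER_LEN  = 12,  /* fixed DNS message header */
    DNS_QFIXED_LEN  = 4,   /* QTYPE + QCLASS following the question name */
    DNS_NAMEPTR_LEN = 2,   /* compression pointer back to the question name */
    DNS_RRFIXED_LEN = 10,  /* TYPE + CLASS + TTL + RDLENGTH */
    DNS_UDP_LIMIT   = 512  /* classic UDP message limit without EDNS0 */
};

/* Estimated size of a response carrying `count` address records of
 * `rdlen` bytes each (4 for A, 16 for AAAA), for a queried name whose
 * wire encoding takes `name_wire_len` bytes (1 to 255). */
static size_t dns_response_size(size_t name_wire_len, size_t count,
                                size_t rdlen)
{
    assert(name_wire_len >= 1 && name_wire_len <= 255);
    size_t question = name_wire_len + DNS_QFIXED_LEN;
    size_t per_rr = DNS_NAMEPTR_LEN + DNS_RRFIXED_LEN + rdlen;
    return DNS_HEADER_LEN + question + count * per_rr;
}
```

  Under those assumptions, "google.com" (12 bytes on the wire) with 7
AAAA records comes to 224 bytes. A maximum-length name with 7 AAAA
records comes to 467 bytes, and with 9 it goes over 512.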


> More complexity, more failure cases, and then it also depends on free
> as opposed to just malloc.
>
> Also it avoids additional fragmentation.

  Don't get me wrong, I'm a huge advocate of using the stack whenever
possible. s6-dns actually started when I studied djbdns's client
library and went "ewww, there are way too many mallocs in there -
I can do better."
  It's just that for generic DNS responses, there's no way around
malloc - but if you don't need to support TCP, then you can indeed
store all responses on the stack, and that's a lot of savings.


> A round trip network query (even to localhost) takes several times as
> long as creating a thread (and for tcp, typically takes hundreds of
> times the resources of thread creation since the kernel allocates
> massively bloated send/recv buffers).

  I was thinking of ease of use from a programmer's point of view.
Creating a thread to perform a simple operation, then joining it,
is not elegant. In Go, this is absolutely the right way of doing
things (because goroutines are even lighter than threads - the runtime
has its own multiplexer), but in C, something like getaddrinfo_a is
easier and more idiomatic.
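
  For reference, the thread-then-join pattern looks roughly like the
sketch below. It uses only POSIX getaddrinfo and pthreads; the struct
and the lookup_* helpers are hypothetical names for this illustration.
getaddrinfo_a itself is a glibc extension and is not shown here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <netdb.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Everything the worker thread needs, plus room for its result. */
struct lookup {
    const char *host;       /* name or numeric address to resolve */
    int flags;              /* ai_flags for the hints, e.g. AI_NUMERICHOST */
    struct addrinfo *res;   /* result list; free with freeaddrinfo() */
    int err;                /* return value of getaddrinfo() */
};

/* Thread body: run one blocking getaddrinfo() call. */
static void *lookup_thread(void *arg)
{
    struct lookup *lk = arg;
    struct addrinfo hints;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = lk->flags;
    lk->err = getaddrinfo(lk->host, NULL, &hints, &lk->res);
    return NULL;
}

/* Start the lookup in the background; returns 0 or a pthread error. */
static int lookup_start(pthread_t *tid, struct lookup *lk)
{
    assert(tid != NULL && lk != NULL && lk->host != NULL);
    lk->res = NULL;
    lk->err = 0;
    return pthread_create(tid, NULL, lookup_thread, lk);
}

/* Wait for the lookup; on success, inspect lk->err and lk->res. */
static int lookup_finish(pthread_t tid, struct lookup *lk)
{
    (void)lk;               /* results are already stored in *lk */
    return pthread_join(tid, NULL);
}
```

  The caller fills in host and flags, calls lookup_start(), does other
work, then calls lookup_finish() and checks err before using res. It
works, but it is three pieces of bookkeeping around a single call.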


> Similarly, getaddrinfo needs DNS, but it only needs generation of
> fixed-form queries, minimal data extraction from result packets, and
> some degree of validation.

  Being able to guarantee that all queries and responses fit on the
stack is simply huge.
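
  As an illustration of what "fixed-form" buys you, here is a minimal
sketch of a query builder that writes into a caller-supplied buffer,
which can live on the stack because the size is bounded. This is not
musl's code; dns_build_query and DNS_MAX_QUERY are hypothetical names,
and the sketch only handles a single question of class IN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* header + longest possible wire-format name + QTYPE/QCLASS */
#define DNS_MAX_QUERY (12 + 255 + 4)

/* Build a one-question query for `name` in `buf`.
 * Returns the query length, or 0 if the name is malformed or the
 * buffer is too small. */
static size_t dns_build_query(unsigned char *buf, size_t buflen,
                              const char *name, uint16_t qtype,
                              uint16_t id)
{
    assert(buf != NULL && name != NULL);

    size_t namelen = strlen(name);
    if (namelen && name[namelen - 1] == '.')
        namelen--;                          /* accept one trailing dot */
    /* The wire encoding takes namelen + 2 bytes, at most 255. */
    if (namelen == 0 || namelen > 253 || name[namelen - 1] == '.')
        return 0;
    if (buflen < 12 + namelen + 2 + 4)
        return 0;

    memset(buf, 0, 12);
    buf[0] = (unsigned char)(id >> 8);
    buf[1] = (unsigned char)(id & 0xff);
    buf[2] = 0x01;                          /* RD: recursion desired */
    buf[5] = 1;                             /* QDCOUNT = 1 */

    size_t pos = 12;
    const char *label = name;
    const char *end = name + namelen;
    while (label < end) {
        const char *dot = memchr(label, '.', (size_t)(end - label));
        size_t len = dot ? (size_t)(dot - label) : (size_t)(end - label);
        if (len == 0 || len > 63)
            return 0;                       /* empty or oversized label */
        buf[pos++] = (unsigned char)len;    /* length-prefixed label */
        memcpy(buf + pos, label, len);
        pos += len;
        label += len + (dot ? 1 : 0);
    }
    buf[pos++] = 0;                         /* root label ends the name */
    buf[pos++] = (unsigned char)(qtype >> 8);
    buf[pos++] = (unsigned char)(qtype & 0xff);
    buf[pos++] = 0;
    buf[pos++] = 1;                         /* QCLASS = IN */
    return pos;
}
```

  A caller would declare unsigned char q[DNS_MAX_QUERY] as a local
variable and pass it in; no allocation is involved at any point.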


> FYI the current code is ~4k binary and the overhaul is not expected to
> increase that much. I really doubt you could achieve that with general
> DNS library code.

  Indeed. My s6-dnsip4 static binary (linked against musl, for x86_64) is
just short of 40k, including roughly 25k of DNS code, and I've made it as
lean as I could without sacrificing readability.

-- 
  Laurent