Date: Sun, 1 Jun 2014 10:53:52 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Requirements for new dns backend, factoring considerations

On Sun, Jun 01, 2014 at 12:19:44PM +0100, Laurent Bercot wrote:
> 
>  Hi Rich,
>  Great work, as usual.
> 
> 
> >The problem however is implementing this on top of something that
> >looks like res_send. Even if not for search paths, res_search and
> >res_query need parallel A and AAAA queries, whereas res_send has no
> >means to request both. We could imagine implementing res_send on top
> >of a hypothetical "res_multisend" with a count of 1. While this would
> >work, it's not a very friendly interface for implementing res_search
> >or res_query since they would have to provide a large number of
> >pre-generated query packets (6 search domains * 2 RR types * up to 280
> >bytes per query packet = 3360 bytes of stack usage) despite it being
> >trivial to generate the Nth packet in just 280 bytes of storage. The
> >storage requirements for storing all the results are even worse
> >(6*2*512 = 6144) compared to what's actually needed (2*512 = 1024).
> 
>  I actually never thought about that. Since s6-dns stores answers in the
> heap, it doesn't have to pre-allocate storage for them, so it happily
> sends everything in parallel.

Well the important part is the same: dirty pages. Whether it's on the
heap or the stack, an extra 9k of temp data means touching 2-3 extra
pages that may have previously been untouched. If this just happens at
startup, the memory usage persists for the rest of the process's
lifetime and it's essentially wasted.

Also the 9k is just _additional_ here. There are already several
0.5k-1k buffers for accessing files, storing address results, storing
the canonical name, etc. and it quickly adds up. It doesn't reach the
threshold where I'd say "this isn't reasonable to assume we have
available on the stack" but it's a cost, and it probably makes
getaddrinfo "musl's biggest stack user" by a nontrivial margin
(otherwise printf with float is the biggest).
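
To make the arithmetic above concrete, here is a small sketch that
recomputes the figures quoted in this thread. The 4096-byte page size
and all identifier names are assumptions for illustration; none of
this is taken from musl's source.

```c
#include <assert.h>
#include <stddef.h>

enum {
    SEARCH_DOMAINS    = 6,    /* search suffixes, as quoted above */
    RR_TYPES          = 2,    /* A and AAAA */
    QUERY_MAX         = 280,  /* bytes per query packet, as quoted above */
    ANSWER_MAX        = 512,  /* bytes per stored answer, as quoted above */
    PAGE_SIZE_ASSUMED = 4096  /* assumption: typical page size */
};

/* Storage if every query and every answer is held at the same time. */
static size_t pregenerated_bytes(void)
{
    return SEARCH_DOMAINS * RR_TYPES * (QUERY_MAX + ANSWER_MAX);
}

/* Storage if packets are generated one at a time and only the current
 * best answer per address family is kept. */
static size_t incremental_bytes(void)
{
    return QUERY_MAX + RR_TYPES * ANSWER_MAX;
}

/* Minimum number of pages needed to hold `bytes` (rounded up). */
static size_t pages_spanned(size_t bytes, size_t page)
{
    assert(page > 0);
    return (bytes + page - 1) / page;
}
```

With these inputs, pregenerated_bytes() is 3360 + 6144 = 9504 (the
"9k"), incremental_bytes() is 1304, and the 9504 bytes span at least
3 pages of the assumed size.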

> >The alternative I see is some sort of "res_multisend" that uses a
> >callback to let the caller generate the Nth packet and a callback to
> >notify the caller of replies. Then res_send could be implemented with
> >callbacks that just feed in and save out the single query/response.
> >And res_search would generate all the query packets but only save the
> >"current best match" for each address family.
> 
>  That sounds reasonable.

It's definitely reasonable from an efficiency standpoint, but it's
also the most complex approach (storing the working set in structs,
passing a context back and forth, passing buffers around without
wasteful copying, etc.), and possibly has the largest code size too.
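
A rough sketch of what the callback shape could look like, just to
show where the storage saving comes from. Everything here is
hypothetical: multisend_sketch, the callback types and struct
search_ctx are made-up names, there is no network I/O (each generated
packet is handed straight back as its own "reply"), and the "packets"
are plain text rather than DNS wire format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define QPKT_MAX 280  /* max query packet size quoted in the thread */

/* Hypothetical callback types; not an existing musl or resolver API. */
typedef int  (*gen_query_fn)(void *ctx, int n, unsigned char *buf, size_t len);
typedef void (*on_reply_fn)(void *ctx, int n, const unsigned char *pkt, size_t len);

/* Stand-in driver. A real one would send the packets and poll for
 * answers; this one only demonstrates the flow of control and that a
 * single scratch buffer is reused for every generated packet. */
static int multisend_sketch(int nqueries, gen_query_fn gen,
                            on_reply_fn reply, void *ctx)
{
    unsigned char scratch[QPKT_MAX];
    assert(gen && reply);
    for (int n = 0; n < nqueries; n++) {
        int len = gen(ctx, n, scratch, sizeof scratch);
        if (len < 0) return -1;
        reply(ctx, n, scratch, (size_t)len);
    }
    return 0;
}

/* Caller-side state: the name, the search suffixes, and a tally. */
struct search_ctx {
    const char *name;
    const char *const *suffixes;
    int nsuffixes;
    int replies;
    size_t last_len;
};

/* Query n maps to suffix n/2 and record type n%2 (0 = A, 1 = AAAA). */
static int gen_search_query(void *p, int n, unsigned char *buf, size_t len)
{
    struct search_ctx *c = p;
    if (n < 0 || n / 2 >= c->nsuffixes) return -1;
    int r = snprintf((char *)buf, len, "%s.%s %s",
                     c->name, c->suffixes[n / 2], n % 2 ? "AAAA" : "A");
    return (r < 0 || (size_t)r >= len) ? -1 : r;
}

/* A real caller would keep the "current best match" per address
 * family here; this one only counts replies. */
static void count_reply(void *p, int n, const unsigned char *pkt, size_t len)
{
    struct search_ctx *c = p;
    (void)n; (void)pkt;
    c->replies++;
    c->last_len = len;
}
```

In this shape, a res_send-like function would pass a count of 1 with
callbacks that copy its single query in and its single answer out,
while a search function would pass a generator like the one above.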

> >As another alternative, we could drop the goal of doing search
> >suffixes in parallel.
> 
>  The best choice depends on your timeout values and retry policy.

Current retry time is 1s and failure timeout is 5s. These should be
configurable, but to do that, we need a way of reinterpreting
resolv.conf's timeout settings for the way musl does things: musl does
not wait for one nameserver to time out and then fall back to the
next, but queries them in parallel. (I know some people have doubts
about this, but in practice it results in a massive performance
improvement for
resolving, especially if you have several nameservers with different
latency and caching properties, such as localhost, isp-nameserver, and
8.8.8.8.)
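
As a toy model of the difference, the sketch below compares how long
a caller waits for the first answer under serial fallback versus
parallel querying. It is a simplification under stated assumptions:
retries are ignored, a latency below zero stands for a server that
never answers, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Serial fallback: wait out each server in turn until one answers
 * within the timeout. latency[i] < 0 means "never answers". */
static long serial_wait_ms(const long *latency, size_t n, long timeout_ms)
{
    long waited = 0;
    assert(latency != NULL);
    for (size_t i = 0; i < n; i++) {
        if (latency[i] >= 0 && latency[i] <= timeout_ms)
            return waited + latency[i];
        waited += timeout_ms;
    }
    return waited;  /* every server timed out */
}

/* Parallel: all servers are asked at once and the first answer wins. */
static long parallel_wait_ms(const long *latency, size_t n, long timeout_ms)
{
    long best = timeout_ms;
    assert(latency != NULL);
    for (size_t i = 0; i < n; i++)
        if (latency[i] >= 0 && latency[i] < best)
            best = latency[i];
    return best;
}
```

With a 5000 ms failure timeout and servers at { dead, 40 ms, 15 ms },
the serial model waits 5040 ms while the parallel one waits 15 ms.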

>  So if your timeouts are very short, sure, serial search will work. But

Timeout is not really the relevant factor unless your nameservers are
misconfigured. A properly configured nameserver returns a negative
response rather than just timing out. It might not cache negative
results (or might not cache them for long), so the request may have
higher latency than a typical one because the whole recursive lookup
is repeated, but it should still be nowhere near as long as a timeout.

> >negligible, I think) and make a trivial res_multisend that does N(=2)
> >queries in parallel using packets provided by the caller.
> 
>  That's what the synchronous s6-dns resolver does, and I think that's
> the least amount of parallelism that's still acceptable in a v4+v6 world.

That's what musl's current implementation does too, and this is not a
feature I'd want to drop. In fact I consider it essential to adoption
of IPv6; otherwise everyone will disable IPv6 because it "makes DNS
lookups slower".
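
For illustration, the N=2 case amounts to building two query packets
for the same name, one for A and one for AAAA, and sending both
before waiting for either answer. The sketch below only covers the
packet-building half. build_query is a hypothetical hand-rolled
helper, not musl's code; real code would more likely go through
res_mkquery. It does not handle the root name or IDNA.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { RR_A = 1, RR_AAAA = 28, CLASS_IN = 1 };

/* Hypothetical helper: write a one-question DNS query for `name` into
 * buf and return its length, or -1 on a bad name or short buffer. */
static int build_query(const char *name, int rrtype, unsigned id,
                       unsigned char *buf, size_t cap)
{
    size_t nlen = strlen(name);
    assert(buf != NULL);
    /* 12-byte header + at most nlen+2 bytes of encoded name + 4 bytes
     * of type and class. */
    if (nlen == 0 || nlen > 253 || cap < nlen + 18) return -1;

    memset(buf, 0, 12);
    buf[0] = (id >> 8) & 0xff;   /* query ID, big-endian */
    buf[1] = id & 0xff;
    buf[2] = 0x01;               /* RD: recursion desired */
    buf[5] = 1;                  /* QDCOUNT = 1 */

    size_t pos = 12;
    const char *p = name;
    while (*p) {                 /* one length-prefixed label per part */
        const char *dot = strchr(p, '.');
        size_t l = dot ? (size_t)(dot - p) : strlen(p);
        if (l == 0 || l > 63) return -1;
        buf[pos++] = (unsigned char)l;
        memcpy(buf + pos, p, l);
        pos += l;
        p += l;
        if (*p == '.') p++;
    }
    buf[pos++] = 0;              /* root label terminates the name */
    buf[pos++] = (rrtype >> 8) & 0xff;
    buf[pos++] = rrtype & 0xff;
    buf[pos++] = 0;
    buf[pos++] = CLASS_IN;
    return (int)pos;
}
```

A caller would invoke this twice with RR_A and RR_AAAA (and distinct
IDs) into two 280-byte buffers, then hand both packets to whatever
sends them in parallel.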

Rich
