Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 18 Oct 2022 13:27:27 -0400
From: Rich Felker <dalias@...c.org>
To: Tom Shen <sjiagc.dev@...il.com>
Cc: musl@...ts.openwall.com
Subject: Re: gethostbyname2_r returns invalid IPv6 address if DNS
 server replies IPv4 address

On Wed, Oct 19, 2022 at 12:02:36AM +0800, Tom Shen wrote:
> Hi Musl team,
> 
> Recently we encountered an issue in requesting name resolution in Alpine
> Linux with Musl.
> 
> Environment:
> * Alpine Linux 3.14.2
> * IPv6 module disabled
> * musl-libc 1.2.2
> * CoreDNS with "rewrite stop type AAAA A" configured
> 
> Reproduce steps:
> 1. getent hosts <any-name-only-has-ipv4-record>
> 2. Check output
> 
> Expected: Get correct IPv4 address, e.g. 10.96.36.74
> 
> Actual: Return an invalid IPv6 address, e.g. a60:244a::, which is the
> actual IPv4 32-bit address followed by zeros.
> 
> Debugging:
> 
> 1. Alpine Linux's getent (
> https://github.com/alpinelinux/aports/blob/3.14-stable/main/musl/getent.c)
> tries gethostbyname2 with AF_INET6 first, if no result, then calls with
> AF_INET.

gethostbyname2 is legacy and has lots of inherent problems, so this
really should be changed to use getaddrinfo. If nothing else, such a
change would make it finish twice as fast.

> 2. With requested family AF_INET6, gethostbyname2_r -> __lookup_name
> -> name_from_dns_search sends a DNS request with RR_AAAA to CoreDNS.
> 3. In CoreDNS, we configured "rewrite stop type AAAA A", which results in
> the RR_AAAA request type being rewritten to RR_A. So the response from
> CoreDNS is: AF_INET (2) and 32-bit IP 10.96.36.74 (0x0a60244a below).
> 
> > (gdb) p addrs
> > $55 = {{family = 2, scopeid = 0, addr = "\n`$J", '\000' <repeats 11
> > times>, sortkey = 0}
> > (gdb) x/4xb addrs[0]->addr
> > 0x7fffffffe0d8: 0x0a    0x60    0x24    0x4a

Are you saying that the configured nameserver (coredns) is responding
with an answer packet for a different question (A instead of AAAA)
than the query packet asked for? I'm confused why it would do that.
The expected behavior of a process receiving such an answer would be
to ignore it as bogus, but maybe we're not doing that, and then
wrongly returning answers that don't match the type we asked for?

It sounds like this is also wrong behavior by coredns. If it's
rewriting AAAA queries to A for the sake of determining if the name
exists vs NxDomain, that's fine for the outgoing query it performs,
but instead of returning the answer to the rewritten query back to the
asker as an answer that mismatches the question, it should just be
removing the answer section to make it into a NODATA answer (if it's
not already NxDomain).

> 4. In gethostbyname2_r, the result address family is assigned to the
> request's family, regardless of the family in the response.
> 
> > h->h_addrtype = af;
> > h->h_length = af==AF_INET6 ? 16 : 4;
> >
>  This results in the IPv4 address replied from CoreDNS is wrongly copied to
> the result as IPv6 address which is a60:244a:: .

OK, it looks like gethostbyname2_r is assuming the results match the
address type it asked for, which is mostly correct (except AF_UNSPEC?
that doesn't seem to be handled and it's probably invalid but we
should probably error out on it or something); it's __lookup_name
which is wrong to return results of the wrong type (as a consequence
of not validating the DNS responses).

> 5. getent gets an IPv4 address, so it won't try AF_INET, rather directly
> return the invalid IPv6 address a60:244a:: .
> 
> Suggested fix:
> In gethostbyname2_r when adding answered addresses into the result, we need
> to filter out the addresses with mismatching address family.
> 
> > for (i=0; i<cnt; i++) {
> >     // if (addrs[i].family != h->h_addrtype) continue; // need to fix the
> > index of h->h_addr_list too
> >     h->h_addr_list[i] = (void *)buf;
> >     buf += h->h_length;
> >     memcpy(h->h_addr_list[i], addrs[i].addr, h->h_length);
> > }
> >
> 
> Could you review this issue? Thanks in advance!

I'd rather fix this at the layer the actual bug (missing validation)
is at.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.