Date: Wed, 05 Oct 2022 09:05:49 -0500
From: Mike Karels <mike@...els.net>
To: libc-coord@...ts.openwall.com
Subject: Re: EAI_NOADDR ?

On Tue, 04 Oct 2022, Rich Felker wrote:
> On Mon, Oct 03, 2022 at 12:47:21PM -0500, Mike Karels wrote:
> > Replying to my own message:
> > 
> > > On Wed, Sep 28, Rich Felker wrote:
> > > > On Wed, Sep 28, 2022 at 02:19:23PM -0500, Mike Karels wrote:
> > > > > Hi, I am Mike Karels, a FreeBSD committer.  Coincidentally, I sent a
> > > > > message on this subject to a FreeBSD list yesterday; you can see it at
> > > > > https://lists.freebsd.org/archives/freebsd-net/2022-September/002461..html.
> > > > > Kostik pointed me at this thread.
> > > > > 
> > > > > On Wed, Sep 28 Hajimu UMEMOTO <ume@...eBSD.org> wrote:
> > > > > > Hi,
> > > > > 
> > > > > > >>>>> On Tue, 27 Sep 2022 23:36:20 +0300
> > > > > > >>>>> Konstantin Belousov <kostikbel@...il.com> said:
> > > > > 
> > > > > > kostikbel> On Tue, Sep 20, 2022 at 03:29:35PM -0400, Rich Felker wrote:
> > > > > > > On Tue, Sep 20, 2022 at 11:39:55AM +0300, Konstantin Belousov wrote:
> > > > > > > > On Tue, Sep 20, 2022 at 10:28:16AM +0200, Florian Weimer wrote:
> > > > > > > > > * Rich Felker:
> > > > > > > > > 
> > > > > > > > > > On Mon, Sep 19, 2022 at 10:57:55PM +0200, Florian Weimer wrote:
> > > > > > > > > >> * Rich Felker:
> > > > > > > > > >> 
> > > > > > > > > >> > One problem I've seen come up again and again with libc stub resolver
> > > > > > > > > >> > API is that there's no way to distinguish between NxDomain and NODATA
> > > > > > > > > >> > responses from DNS. These have very different meanings ("name doesn't
> > > > > > > > > >> > exist" vs "name exists but has no address (or whatever record type you
> > > > > > > > > > >> > were looking for)") and being able to distinguish them is important for
> > > > > > > > > >> > implementing containerized-type DNS service on top of the host's
> > > > > > > > > >> > resolver API rather than direct proxying to outside DNS (when the
> > > > > > > > > >> > latter isn't desirable).
> > > > > > > > > >> >
> > > > > > > > > >> > POSIX defines EAI_NONAME as:
> > > > > > > > > >> >
> > > > > > > > > >> > [EAI_NONAME]
> > > > > > > > > >> >     The name does not resolve for the supplied parameters. 
> > > > > > > > > >> >
> > > > > > > > > >> > which, under generous interpretation of "parameters", seems to cover
> > > > > > > > > >> > both cases, although arguably it does "resolve" to just an empty list
> > > > > > > > > >> > of addresses in the NODATA case.
> > > > > > > > > >> >
> > > > > > > > > >> > To address this, I'm considering proposing a new error code EAI_NOADDR
> > > > > > > > > >> > that would be defined something like:
> > > > > > > > > >> >
> > > > > > > > > >> > [EAI_NOADDR]
> > > > > > > > > >> >     The name does not have any addresses for the supplied parameters.
> > > > > > > > > >> >
> > > > > > > > > > >> > Would other implementors be on-board with such a proposal?
> > > > > > > > > >> 
> > > > > > > > > >> I think several libcs implemented this as EAI_NODATA already.  I see it
> > > > > > > > > >> documented for AIX, glibc, NetBSD, OpenBSD, QNX, Solaris.  Apparently,
> > > > > > > > > >> it's absent from FreeBSD (and Windows).
> > > > > 
> > > > > FreeBSD has EAI_NOADDR and EAI_ADDRFAMILY defined inside #if 0 in the
> > > > > header, but still included in the error strings.  EAI_NOADDR is "No
> > > > > address associated with hostname", and EAI_ADDRFAMILY is "Address
> > > > > family for hostname not supported."  Based on these strings, I proposed
> > > > > EAI_ADDRFAMILY for the case where the name was valid but had no
> > > > > address for the address family, as opposed to "No address associated
> > > > > with hostname" (which implies that there are no addresses at all).
> > 
> > > > Distinguishing EAI_ADDRFAMILY vs EAI_NOADDR like this requires
> > > > querying both A and AAAA even if the caller only requested one, which
> > > > users would probably not be happy with as an added cost.
> > 
> > > The BSD/FreeBSD resolver code distinguishes between NXDOMAIN (name
> > > doesn't resolve), and zero answers of the type requested.  The latter
> > > might mean that there are addresses of other types, or records such
> > > as NS, MX or others.  That means the name is valid.  fwiw, I have a
> > > prototype of getaddrinfo() distinguishing the two by simply shuffling
> > > error returns.  It returns EAI_NONAME if there is an NXDOMAIN error,
> > > or EAI_ADDRFAMILY if there is no address.  As noted earlier, FreeBSD
> > > does not currently use EAI_NODATA (or EAI_ADDRFAMILY).
> > 
> > > > > fwiw, NetBSD and OpenBSD seem to use EAI_NOADDR, or at least that
> > > > > error string, for both "name invalid" and "no address of requested
> > > > > family".
> > 
> > Oops, that's EAI_NODATA (No address associated with hostname).
> > 
> > > > This is what we're leaning toward in musl for the reason above.
> > 
> > Just to be sure: you mean using EAI_NODATA, or a new EAI_NOADDR?

> Yes, I meant EAI_NODATA. EAI_NOADDR was the proposed name I introduced
> it as in this thread, not remembering it was a thing some
> implementations already had under the name EAI_NODATA, just not in the
> standard.

I'm torn between EAI_ADDRFAMILY (which has a better current error message
in FreeBSD) and EAI_NODATA.  I could change the error message for EAI_NODATA,
but then it will sound close to EAI_ADDRFAMILY.  Changing the English error
message is easy, but we have several translations as well.

Any other opinions on the best choice?  I suppose glibc is unlikely to
change.

> > > > Alternatively, I suppose EAI_ADDRFAMILY could be used for both cases
> > > > (all NODATA responses), but that seems less intuitive and less inline
> > > > with current practices on existing systems that have one or both of
> > > > these error codes.
> > 
> > > > > > > > > > Oh, perfect! In that case, can we push this for standardization?
> > > > > > > > > 
> > > > > > > > > I think a separate error code makes sense.
> > > > > > > > > 
> > > > > > > > > > And, it looks like glibc also defines EAI_ADDRFAMILY with somewhat
> > > > > > > > > > overlapping meaning. Is there good documentation for how they're
> > > > > > > > > > distinguished? I don't think you can meaningfully choose which to
> > > > > > > > > > return unless you query both A and AAAA even when only one was
> > > > > > > > > > requested..?
> > > > > > > > > 
> > > > > > > > > EAI_ADDRFAMILY is only used when the host name is a numeric address that
> > > > > > > > > implies an address family, and a different address family is requested.
> > > > > > > > > EAI_NODATA implies that the host name exists, which doesn't really apply
> > > > > > > > > to a numeric address, so I guess that's why a different error code was
> > > > > > > > > introduced.
> > > > > 
> > > > > It seems that Linux (at least Ubuntu 22.04.1) uses EAI_ADDRFAMILY, or at
> > > > > least "Address family for hostname not supported", for the case where
> > > > > there is no address but the name is valid.  That was also part of the
> > > > > reason I proposed EAI_ADDRFAMILY for this case.
> > 
> > > > Are you sure? I couldn't find any indication of this in the glibc
> > > > source and couldn't get it to happen testing either.
> > 
> > > Hmm, my test case was ping6, as that was where I tripped over this
> > > on FreeBSD.  Now I see that ping6 is not representative on Ubuntu;
> > > no idea why.  Things like telnet and ftp say "No address associated
> > > with hostname".
> > 
> > Ubuntu behavior for the case where there is no address for the name
> > doesn't seem to match the getaddrinfo(3) man page, which has:
> > 
> >        EAI_ADDRFAMILY
> >               The  specified  network host does not have any network addresses
> >               in the requested address family.
> >        EAI_NODATA
> >               The specified network host exists, but does not have any network
> >               addresses defined.
> > 
> > EAI_ADDRFAMILY seems like the better match.  It also seems to be used as
> > described above for numeric addresses that don't match:
> > 
> > mike@...ntu:~$ telnet -6 127.0.0.1
> > telnet: could not resolve 127.0.0.1/telnet: Address family for hostname not supported
> > 
> > I don't see that this means that the same error shouldn't be used for
> > another purpose that also matches the description.  However, EAI_NODATA
> > seems to be used now in this case.  There is something to be said for
> > consistency, although it would also be nice if the error string was
> > informative to the end user.  "No address associated with hostname"
> > seems to over-generalize.  The current FreeBSD situation for this error
> > produces "Name does not resolve", which is worse, and I want to fix.
> > 
> > Does anyone know why Linux/glibc does what it does?

> Distinguishing "no address" from "no address in the requested family"
> fundamentally requires spurious queries for the unrequested family. I
> would assume everyone deems this unnecessarily costly (not to mention
> error-prone -- some environments are using AI_ADDRCONFIG with ipv6
> disabled because they're behind broken middleboxes that barf on AAAA
> queries) just for the sake of distinguishing these cases that are
> otherwise semantically the same (the name exists but doesn't translate
> to an address in the requested form(s)).

No, you can tell the difference between "no address" and "no address
in the requested family" when using DNS as described above with a single
query (per name), and glibc seems to do this.  If you do an AAAA query,
for example, and there is an NXDomain error, the getaddrinfo error
looks like EAI_NONAME:

mike@...ntu:~$ telnet -6 nosuchhost
telnet: could not resolve nosuchhost/telnet: Name or service not known

and the DNS exchange looks like this:
08:15:54.354575 IP X.Y.Z.P.49219 > X.Y.Z.Q.53: 8064+ AAAA? nosuchhost.A.B. (39)
08:15:54.355053 IP X.Y.Z.Q.53 > X.Y.Z.P.49219: 8064 NXDomain* 0/1/0 (86)
08:15:54.355187 X.Y.Z.P.52791 > X.Y.Z.Q.53: 42941+ AAAA? nosuchhost. (28)
08:15:54.432793 IP X.Y.Z.Q.53 > X.Y.Z.P.52791: 42941 NXDomain 0/1/0 (103)

On the other hand, if there is no error but no answer record for a AAAA
query, the error seems to be EAI_NODATA:

mike@...ntu:~$ telnet -6 redrock
telnet: could not resolve redrock/telnet: No address associated with hostname

and DNS did this:
08:18:23.608066 IP X.Y.Z.P.44280 > X.Y.Z.Q.53: 22570+ AAAA? redrock.A.B. (36)
08:18:23.608691 IP X.Y.Z.Q.53 > X.Y.Z.P.44280: 22570* 0/1/0 (83)
08:18:23.608778 IP X.Y.Z.P.59096 > X.Y.Z.Q.53: 10534+ AAAA? redrock. (25)
08:18:23.609037 IP X.Y.Z.Q.53 > X.Y.Z.P.59096: 10534 NXDomain 0/1/0 (100)

Similarly for A records, of course.
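
To make the distinction concrete, here is a rough C sketch of the mapping
described above: one query's outcome (response code plus answer count) is
enough to choose between EAI_NONAME and EAI_NODATA, and a caller can then
tell the two apart.  The helper names (dns_outcome_to_eai,
classify_gai_error, enum lookup_result) and the GAI_DEMO_MAIN switch are
made up for illustration and are not taken from glibc, FreeBSD or musl.
It assumes EAI_NODATA and EAI_ADDRFAMILY may be missing from <netdb.h>,
so both are guarded, and it lumps every other response code into EAI_FAIL
as a simplification.

```c
/* Sketch only: every helper name here is hypothetical, not a libc API. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* assumption: glibc hides EAI_NODATA without this */
#endif
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

#define DNS_RCODE_NOERROR  0 /* DNS response code: no error */
#define DNS_RCODE_NXDOMAIN 3 /* DNS response code: name does not exist */

/* Resolver side: map the outcome of a single query to an EAI_* value. */
static int dns_outcome_to_eai(int rcode, int answer_count)
{
    if (rcode == DNS_RCODE_NXDOMAIN)
        return EAI_NONAME;  /* the name itself does not exist */
    if (rcode != DNS_RCODE_NOERROR)
        return EAI_FAIL;    /* simplification: all other rcodes */
    if (answer_count > 0)
        return 0;           /* at least one record of the requested type */
#ifdef EAI_NODATA
    return EAI_NODATA;      /* name exists, no record of this type */
#else
    return EAI_NONAME;      /* fallback where EAI_NODATA is not defined */
#endif
}

/* Caller side: a hypothetical three-way view of getaddrinfo() results. */
enum lookup_result { LOOKUP_OK, LOOKUP_NO_NAME, LOOKUP_NO_ADDR, LOOKUP_OTHER };

static enum lookup_result classify_gai_error(int err)
{
    if (err == 0)
        return LOOKUP_OK;
    if (err == EAI_NONAME)
        return LOOKUP_NO_NAME;
#ifdef EAI_NODATA
    if (err == EAI_NODATA)
        return LOOKUP_NO_ADDR;
#endif
#ifdef EAI_ADDRFAMILY
    if (err == EAI_ADDRFAMILY)
        return LOOKUP_NO_ADDR;
#endif
    return LOOKUP_OTHER;
}

#ifdef GAI_DEMO_MAIN
/* Optional demo: AAAA-only lookup of argv[1], similar to "telnet -6". */
int main(int argc, char **argv)
{
    struct addrinfo hints, *res = NULL;
    int err;

    if (argc < 2) {
        fprintf(stderr, "usage: %s hostname\n", argv[0]);
        return 2;
    }
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET6;
    hints.ai_socktype = SOCK_STREAM;
    err = getaddrinfo(argv[1], NULL, &hints, &res);
    if (err == 0)
        freeaddrinfo(res);
    printf("class %d: %s\n", (int)classify_gai_error(err),
           err ? gai_strerror(err) : "ok");
    return err != 0;
}
#endif
```

Building with -DGAI_DEMO_MAIN gives a small test program; what it prints
for a name such as "redrock" above depends on which error the local libc
actually returns.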

btw, I mentioned that ping's behavior is different on Ubuntu.  It is
doing both A and AAAA requests in parallel, not sure why, but it has
additional information.  Apparently it is generating EAI_ADDRFAMILY for
the case with no address of the requested family.

		Mike

> FWIW, I don't like the glibc error strings. The one we're planning to
> use in musl so far (for EAI_NODATA) is "Name has no usable address"
> which makes no statement about whether it has any address at all.

> Rich
