Date: Thu, 6 Oct 2022 01:14:07 -0400
From: Rich Felker <dalias@...c.org>
To: libc-coord@...ts.openwall.com
Subject: Re: EAI_NOADDR ?

On Wed, Oct 05, 2022 at 09:05:49AM -0500, Mike Karels wrote:
> On Tue, 04 Oct 2022, Rich Felker wrote:
> > On Mon, Oct 03, 2022 at 12:47:21PM -0500, Mike Karels wrote:
> > > Replying to my own message:
> > > 
> > > > On Wed, Sep 28, Rich Felker wrote:
> > > > > On Wed, Sep 28, 2022 at 02:19:23PM -0500, Mike Karels wrote:
> > > > > > Hi, I am Mike Karels, a FreeBSD committer.  Coincidentally, I sent a
> > > > > > message on this subject to a FreeBSD list yesterday; you can see it at
> > > > > > > https://lists.freebsd.org/archives/freebsd-net/2022-September/002461.html.
> > > > > > Kostik pointed me at this thread.
> > > > > > 
> > > > > > On Wed, Sep 28 Hajimu UMEMOTO <ume@...eBSD.org> wrote:
> > > > > > > Hi,
> > > > > > 
> > > > > > > >>>>> On Tue, 27 Sep 2022 23:36:20 +0300
> > > > > > > >>>>> Konstantin Belousov <kostikbel@...il.com> said:
> > > > > > 
> > > > > > > kostikbel> On Tue, Sep 20, 2022 at 03:29:35PM -0400, Rich Felker wrote:
> > > > > > > > On Tue, Sep 20, 2022 at 11:39:55AM +0300, Konstantin Belousov wrote:
> > > > > > > > > On Tue, Sep 20, 2022 at 10:28:16AM +0200, Florian Weimer wrote:
> > > > > > > > > > * Rich Felker:
> > > > > > > > > > 
> > > > > > > > > > > On Mon, Sep 19, 2022 at 10:57:55PM +0200, Florian Weimer wrote:
> > > > > > > > > > >> * Rich Felker:
> > > > > > > > > > >> 
> > > > > > > > > > >> > One problem I've seen come up again and again with libc stub resolver
> > > > > > > > > > >> > API is that there's no way to distinguish between NxDomain and NODATA
> > > > > > > > > > >> > responses from DNS. These have very different meanings ("name doesn't
> > > > > > > > > > >> > exist" vs "name exists but has no address (or whatever record type you
> > > > > > > > > > > >> > were looking for)") and being able to distinguish them is important for
> > > > > > > > > > >> > implementing containerized-type DNS service on top of the host's
> > > > > > > > > > >> > resolver API rather than direct proxying to outside DNS (when the
> > > > > > > > > > >> > latter isn't desirable).
> > > > > > > > > > >> >
> > > > > > > > > > >> > POSIX defines EAI_NONAME as:
> > > > > > > > > > >> >
> > > > > > > > > > >> > [EAI_NONAME]
> > > > > > > > > > >> >     The name does not resolve for the supplied parameters. 
> > > > > > > > > > >> >
> > > > > > > > > > >> > which, under generous interpretation of "parameters", seems to cover
> > > > > > > > > > >> > both cases, although arguably it does "resolve" to just an empty list
> > > > > > > > > > >> > of addresses in the NODATA case.
> > > > > > > > > > >> >
> > > > > > > > > > >> > To address this, I'm considering proposing a new error code EAI_NOADDR
> > > > > > > > > > >> > that would be defined something like:
> > > > > > > > > > >> >
> > > > > > > > > > >> > [EAI_NOADDR]
> > > > > > > > > > >> >     The name does not have any addresses for the supplied parameters.
> > > > > > > > > > >> >
> > > > > > > > > > > >> > Would other implementors be on-board with such a proposal?
> > > > > > > > > > >> 
> > > > > > > > > > >> I think several libcs implemented this as EAI_NODATA already.  I see it
> > > > > > > > > > >> documented for AIX, glibc, NetBSD, OpenBSD, QNX, Solaris.  Apparently,
> > > > > > > > > > >> it's absent from FreeBSD (and Windows).
> > > > > > 
> > > > > > FreeBSD has EAI_NOADDR and EAI_ADDRFAMILY defined inside #if 0 in the
> > > > > > header, but still included in the error strings.  EAI_NOADDR is "No
> > > > > > address associated with hostname", and EAI_ADDRFAMILY is "Address
> > > > > > family for hostname not supported."  Based on these strings, I proposed
> > > > > > EAI_ADDRFAMILY for the case where the name was valid but had no
> > > > > > address for the address family, as opposed to "No address associated
> > > > > > with hostname" (which implies that there are no addresses at all).
> > > 
> > > > > Distinguishing EAI_ADDRFAMILY vs EAI_NOADDR like this requires
> > > > > querying both A and AAAA even if the caller only requested one, which
> > > > > users would probably not be happy with as an added cost.
> > > 
> > > > The BSD/FreeBSD resolver code distinguishes between NXDOMAIN (name
> > > > doesn't resolve), and zero answers of the type requested.  The latter
> > > > might mean that there are addresses of other types, or records such
> > > > as NS, MX or others.  That means the name is valid.  fwiw, I have a
> > > > prototype of getaddrinfo() distinguishing the two by simply shuffling
> > > > error returns.  It returns EAI_NONAME if there is an NXDOMAIN error,
> > > > or EAI_ADDRFAMILY if there is no address.  As noted earlier, FreeBSD
> > > > does not currently use EAI_NODATA (or EAI_ADDRFAMILY).
> > > 
> > > > > > fwiw, NetBSD and OpenBSD seem to use EAI_NOADDR, or at least that
> > > > > > error string, for both "name invalid" and "no address of requested
> > > > > > family".
> > > 
> > > Oops, that's EAI_NODATA (No address associated with hostname).
> > > 
> > > > > This is what we're leaning toward in musl for the reason above.
> > > 
> > > Just to be sure: you mean using EAI_NODATA, or a new EAI_NOADDR?
> 
> > Yes, I meant EAI_NODATA. EAI_NOADDR was the proposed name I introduced
> > it as in this thread, not remembering it was a thing some
> > implementations already had under the name EAI_NODATA, just not in the
> > standard.
> 
> I'm torn between EAI_ADDRFAMILY (which has a better current error message
> in FreeBSD) and EAI_NODATA.  I could change the error message for EAI_NODATA,
> but then it will sound close to EAI_ADDRFAMILY.  Changing the English error
> message is easy, but we have several translations as well.
> 
> Any other opinions on the best choice?  I suppose glibc is unlikely to
> change.

Per the EAI names, I prefer EAI_NODATA. It corresponds directly to the
familiar DNS condition and can reasonably mean "name exists but doesn't
have an address in any of the families you requested"; this is just a
special case of not having any addresses at all.

On the other hand, EAI_ADDRFAMILY comes across as implying
affirmatively that there *is* an address in at least one family, just
not the one(s) you requested. I would lean towards saying that it's
wrong for getaddrinfo to fail with EAI_ADDRFAMILY when AF_UNSPEC was
requested.

So I think if we're going with just one of the two errors (not doing
the spurious queries to disambiguate them), EAI_NODATA is the
preferred choice.
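
As a rough sketch of how a caller might consume that single error
code: the enum and the function names below are made up for
illustration, and the code assumes a libc that defines EAI_NODATA
(glibc only exposes it under _GNU_SOURCE). On a libc that reports
NODATA as EAI_NONAME, the LOOKUP_NXDOMAIN result stays ambiguous.

```c
#define _GNU_SOURCE /* glibc exposes EAI_NODATA only with _GNU_SOURCE */
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical names, for illustration only. */
enum lookup_status { LOOKUP_OK, LOOKUP_NXDOMAIN, LOOKUP_NODATA, LOOKUP_OTHER };

/* Map a getaddrinfo() return value to one of the cases discussed above. */
enum lookup_status classify_gai_error(int err)
{
    if (err == 0)
        return LOOKUP_OK;
    if (err == EAI_NONAME)
        return LOOKUP_NXDOMAIN; /* or NODATA, on a libc that folds the two */
#ifdef EAI_NODATA
    if (err == EAI_NODATA)
        return LOOKUP_NODATA;   /* name exists, no address of requested type */
#endif
    return LOOKUP_OTHER;        /* EAI_AGAIN, EAI_FAIL, EAI_ADDRFAMILY, ... */
}

/* Resolve name in one family (or AF_UNSPEC) and report only the status. */
enum lookup_status resolve_status(const char *name, int family)
{
    struct addrinfo hints, *res = NULL;
    int err;

    assert(name != NULL);
    memset(&hints, 0, sizeof hints);
    hints.ai_family = family;
    hints.ai_socktype = SOCK_STREAM;

    err = getaddrinfo(name, NULL, &hints, &res);
    if (err == 0)
        freeaddrinfo(res);
    return classify_gai_error(err);
}
```

A caller would then treat LOOKUP_NODATA as "the name is valid, try
another family or record type" and LOOKUP_NXDOMAIN as "give up on
this name", without any extra query for the unrequested family.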

> 
> > > > > Alternatively, I suppose EAI_ADDRFAMILY could be used for both cases
> > > > > (all NODATA responses), but that seems less intuitive and less inline
> > > > > with current practices on existing systems that have one or both of
> > > > > these error codes.
> > > 
> > > > > > > > > > > Oh, perfect! In that case, can we push this for standardization?
> > > > > > > > > > 
> > > > > > > > > > I think a separate error code makes sense.
> > > > > > > > > > 
> > > > > > > > > > > And, it looks like glibc also defines EAI_ADDRFAMILY with somewhat
> > > > > > > > > > > overlapping meaning. Is there good documentation for how they're
> > > > > > > > > > > distinguished? I don't think you can meaningfully choose which to
> > > > > > > > > > > return unless you query both A and AAAA even when only one was
> > > > > > > > > > > requested..?
> > > > > > > > > > 
> > > > > > > > > > EAI_ADDRFAMILY is only used when the host name is a numeric address that
> > > > > > > > > > implies an address family, and a different address family is requested.
> > > > > > > > > > EAI_NODATA implies that the host name exists, which doesn't really apply
> > > > > > > > > > to a numeric address, so I guess that's why a different error code was
> > > > > > > > > > introduced.
> > > > > > 
> > > > > > It seems that Linux (at least Ubuntu 22.04.1) uses EAI_ADDRFAMILY, or at
> > > > > > least "Address family for hostname not supported", for the case where
> > > > > > there is no address but the name is valid.  That was also part of the
> > > > > > reason I proposed EAI_ADDRFAMILY for this case.
> > > 
> > > > > Are you sure? I couldn't find any indication of this in the glibc
> > > > > source and couldn't get it to happen testing either.
> > > 
> > > > Hmm, my test case was ping6, as that was where I tripped over this
> > > > on FreeBSD.  Now I see that ping6 is not representative on Ubuntu;
> > > > no idea why.  Things like telnet and ftp say "No address associated
> > > > with hostname".
> > > 
> > > Ubuntu behavior for the case where there is no address for the name
> > > doesn't seem to match the getaddrinfo(3) man page, which has:
> > > 
> > >        EAI_ADDRFAMILY
> > >               The  specified  network host does not have any network addresses
> > >               in the requested address family.
> > >        EAI_NODATA
> > >               The specified network host exists, but does not have any network
> > >               addresses defined.
> > > 
> > > EAI_ADDRFAMILY seems like the better match.  It also seems to be used as
> > > described above for numeric addresses that don't match:
> > > 
> > > mike@...ntu:~$ telnet -6 127.0.0.1
> > > telnet: could not resolve 127.0.0.1/telnet: Address family for hostname not supported
> > > 
> > > I don't see that this means that the same error shouldn't be used for
> > > another purpose that also matches the description.  However, EAI_NODATA
> > > seems to be used now in this case.  There is something to be said for
> > > consistency, although it would also be nice if the error string was
> > > informative to the end user.  "No address associated with hostname"
> > > seems to over-generalize.  The current FreeBSD situation for this error
> > > produces "Name does not resolve", which is worse, and I want to fix.
> > > 
> > > Does anyone know why Linux/glibc does what it does?
> 
> > Distinguishing "no address" from "no address in the requested family"
> > fundamentally requires spurious queries for the unrequested family. I
> > would assume everyone deems this unnecessarily costly (not to mention
> > error-prone -- some environments are using AI_ADDRCONFIG with ipv6
> > disabled because they're behind broken middleboxes that barf on AAAA
> > queries) just for the sake of distinguishing these cases that are
> > otherwise semantically the same (the name exists but doesn't translate
> > to an address in the requested form(s)).
> 
> No, you can tell the difference between "no address" from "no address
> in the requested family" when using DNS as described above with a single
> query (per name), and glibc seems to do this.  If you do an AAAA query,
> for example, and there is an NXDomain error, the getaddrinfo error
> looks like EAI_NONAME:

That's not the difference between "no address" and "no address in the
requested family". It's the difference between "name does not exist"
and "no address in the requested family".

Folks get this wrong all the time, but it's really important. The name
not existing (NxDomain) is very different from the name existing and
not having the record you asked for (NODATA). That's the whole topic I
started this thread for -- exposing this distinction correctly to the
application so that it can act on the difference. There are various
places it matters; some that come to mind are:

- Applying DNSSEC and DANE logic where nonexistence has different
  semantics.

- Implementing a DNS gateway server on top of the libc getaddrinfo API
  (several virtualization-oriented implementations have been caught
  doing this wrong, specifically the NxDomain/NODATA distinction,
  thereby breaking guests that care).
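
For the gateway case, a minimal sketch of the mapping I have in mind
is below. The function name is hypothetical, the RCODE values are the
standard DNS ones, and it only does the right thing on a libc that
actually returns EAI_NODATA for NODATA; where NODATA comes back as
EAI_NONAME, this mapping produces exactly the wrong NxDomain answer
described above.

```c
#define _GNU_SOURCE /* glibc exposes EAI_NODATA only with _GNU_SOURCE */
#include <netdb.h>

/* Standard DNS response codes. */
enum { RCODE_NOERROR = 0, RCODE_SERVFAIL = 2, RCODE_NXDOMAIN = 3 };

/* Hypothetical helper: choose the RCODE a gateway should send back for
 * a given getaddrinfo() result. */
int rcode_for_gai_error(int err)
{
    if (err == 0)
        return RCODE_NOERROR;   /* answer section carries the addresses */
    if (err == EAI_NONAME)
        return RCODE_NXDOMAIN;  /* name does not exist */
#ifdef EAI_NODATA
    if (err == EAI_NODATA)
        return RCODE_NOERROR;   /* NODATA: success with an empty answer */
#endif
    return RCODE_SERVFAIL;      /* temporary or unclassified failure */
}
```

The point is that NODATA is sent as NOERROR with zero answers, never
as NxDomain, so a guest does not conclude that the name is absent for
every record type.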

Rich
