Date: Fri, 30 Mar 2018 14:19:48 -0500
From: William Pitcock <nenolod@...eferenced.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] resolver: only exit the search path loop there are a positive number of results given

Hello,

On Fri, Mar 30, 2018 at 2:14 PM, Rich Felker <dalias@...c.org> wrote:
> On Fri, Mar 30, 2018 at 06:52:25PM +0000, William Pitcock wrote:
>> In the event of no results being given by any of the lookup modules, EAI_NONAME will still
>> be thrown.
>>
>> This is intended to mitigate problems that occur when zones are hosted by weird DNS servers,
>> such as the one Cloudflare have implemented, and appear in the search path.
>> ---
>>  src/network/lookup_name.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/network/lookup_name.c b/src/network/lookup_name.c
>> index 209c20f0..b068bb92 100644
>> --- a/src/network/lookup_name.c
>> +++ b/src/network/lookup_name.c
>> @@ -202,7 +202,7 @@ static int name_from_dns_search(struct address buf[static MAXADDRS], char canon[
>>                       memcpy(canon+l+1, p, z-p);
>>                       canon[z-p+1+l] = 0;
>>                       int cnt = name_from_dns(buf, canon, canon, family, &conf);
>> -                     if (cnt) return cnt;
>> +                     if (cnt > 0) return cnt;
>>               }
>>       }
>
> This patch is incorrect, and the reason should be an FAQ item if it's
> not already. Only a return value of 0 means that the requested name
> does not exist and that it's permissible to continue search. Other
> nonpositive return values indicate either that the name does exist but
> does not have a record of the requested type, or that a transient error
> occurred, making it impossible to determine whether the search can be
> continued and thus requiring the error to be reported to the caller.
> Anything else results in one or both of the following bugs:
>
> - Nondeterministically returning different results for the same query
>   depending on transient unavailability of the nameservers to answer
>   on time.
>
> - Returning inconsistent results (for different search components)
>   depending on whether AF_INET, AF_INET6, or AF_UNSPEC was requested.
>
> I'm aware that at least rancher-dns and Cloudflare's nameservers have
> had bugs related to this issue. I'm not sure what the status on
> getting them fixed is, and for Cloudflare I don't know exactly what it
> is they're doing wrong or why. But I do know the problem is that
> they're returning semantically incorrect DNS replies.

Kubernetes imposes a default search path with the cluster domain last, so:

- local.prod.svc.whatever
- prod.svc.whatever
- svc.whatever
- yourdomain.com

The Cloudflare issue is that they send a SUCCESS code with 0 replies, which
causes musl to error when it hits yourdomain.com.

Do you have any suggestions on a mitigation which would be more palatable?
We need to ship a mitigation for this in Alpine 3.8 regardless.
I would much rather carry a patch that is upstreamable, but I am quite
willing to carry one that isn't, in order to solve this problem.

William
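
For readers following the thread, the return-value rule quoted above can be
sketched as a small standalone program. This is an illustration only, not
musl's code: search_lookup, fake_lookup, the SKETCH_* codes and the host names
are all hypothetical, and the fallback to the bare name after the loop is a
simplification chosen for this sketch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical result codes for this sketch; not musl's actual values. */
#define SKETCH_NXDOMAIN    0   /* name does not exist: search may continue */
#define SKETCH_NODATA    (-1)  /* name exists, no record of requested type */
#define SKETCH_TRANSIENT (-2)  /* timeout or server failure */

typedef int (*lookup_fn)(const char *fqdn);

/* Try name under each search domain in order. Only a result of exactly 0
 * lets the loop move on; any positive count or negative code is returned
 * to the caller at once, mirroring the unpatched `if (cnt) return cnt;`. */
static int search_lookup(const char *name, const char *const *domains,
                         size_t ndomains, lookup_fn lookup)
{
    char fqdn[256];
    for (size_t i = 0; i < ndomains; i++) {
        int len = snprintf(fqdn, sizeof fqdn, "%s.%s", name, domains[i]);
        if (len < 0 || (size_t)len >= sizeof fqdn)
            continue;               /* candidate too long, skip it */
        int cnt = lookup(fqdn);
        if (cnt)
            return cnt;
    }
    return lookup(name);            /* last resort: the name as given */
}

/* Hypothetical resolver backend with canned answers for the demo. */
static int fake_lookup(const char *fqdn)
{
    if (!strcmp(fqdn, "db.prod.svc.whatever")) return 1;
    if (!strcmp(fqdn, "api.svc.whatever"))     return SKETCH_TRANSIENT;
    if (!strcmp(fqdn, "web.yourdomain.com"))   return SKETCH_NODATA;
    return SKETCH_NXDOMAIN;
}
```

With the four-entry search path from the message, "db" resolves at the second
domain, "api" stops at the third with the transient error, and "web" stops at
yourdomain.com with the no-data code instead of falling through to the bare
name. The last case is the situation reported for servers that answer with a
success code and zero records. Changing the test to `cnt > 0` would let "api"
and "web" continue past those errors, which is the nondeterminism the quoted
reply objects to.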