|
|
Message-ID: <aeTfPpPXFJWqju1w@pie> Date: Sun, 19 Apr 2026 13:57:18 +0000 From: Yao Zi <me@...ao.cc> To: musl@...ts.openwall.com Subject: Re: [PATCH] dns: Avoid division-by-zero when zero attempts is specified in resolv.conf On Sat, Apr 18, 2026 at 09:02:53PM -0400, Rich Felker wrote: > On Sat, Apr 18, 2026 at 10:28:13PM +0000, Yao Zi wrote: > > DNS query retry interval is calculated through timeout / attempts in > > __res_msend_rc(), both are loaded from resolv.conf in > > __get_resolv_conf(), while value of attempts isn't checked. This would > > trigger an undefined behavior if attempts is set to zero in > > configuration, causing misfunction or termination with SIGFPE. > > > > Gracefully handle it by returning early, with errno set to ECONNREFUSED > > to make res_send(), which invokes __res_msend_rc() and inherits its > > errno, fail with the same errno as glibc in this case. > > I don't think errno is meaningful after any of the res_* functions. At > least it's not documented as being meaningful. According to the docs, > any failure reason for this functions is shown in h_errno. Yes, I had a brief look among Linux and FreeBSD's manpages, and none of them mentions errno. > I thought the glibc behavior you observed is probably just a stray > errno (it's always valid for libc functions to clobber it unless > documented otherwise as long as they don't set it to zero), but it > looks like it is intentional. I'm not sure why they're doing that. > Moreover, the comment where it seems to be set indicates it's also > supposed to cover the "no nameservers configured" case, but it looks > like they already covered that earlier with ESRCH. > > Since we don't set errno elsewhere and since it's not documented, I > don't think there's any good reason to do it here. We generally do not > try to match undocumented glibc behaviors. This makes sense, I'll wait for some time and change it in v2. > > I have to admit this bug is found by auditing musl source code with > > Large Language Model's help. However, LLM is only used to seek for > > problems, and I've confirmed the issue by hand and reproduced it. This > > fix, and the commit message are all written by myself. > > > > I've viewed AI policies[1] proposed by Rich before auditing. Besides a > > little worried about authorship problems since I did view AI-generated > > analysis and an initial reproducer, I'd prefer to say this doesn't > > violate it. It would be good if we could agree on a clear policy for > > such AI usage for code auditing. > > Thank you so much for your disclosure. I don't see any problem with > what you've done here. > > If you were making complex changes to code where the patch is more > elaborate, there is a question of whether that's derivative of any > LLM-suggested code you might have read. But this is a very simple > change. In the future please just disclose as you've done here. If > it's questionable we can always have someone rewrite a patch based on > an understanding of the bug you found. Really thanks for the confirmation. > > Back to the technical part, this problem could be easily reproduced by > > adding > > > > options attempts:0 > > > > to resolv.conf (do not close the editor since many stuff might > > segfault!), then basically every program recursively invoking > > __res_msend_rc() through either gethostbyname*() or res_send() would > > crash on x86_64 with SIGFPE. > > > > It does sound weird to make zero attempt on DNS requests, but glibc > > accepts so, and fails with errno = ECONNREFUSED, which seems an > > unintended behavior. I chose to follow it in this patch, while > > considering errno = EINVAL to be a better outcome. > > > > It's worth noting that before the commit introduced this bug, musl > > silently clamps attempts to be at least one, which makes sense but > > introducing such an implicit behavior might be unexpected. > > I think for now we can just return -1 without doing anything else, or > restore the historical behavior of clamping it at 1. I think allowing > 0 to intentionally fail all queries is probably a better behavior. Okay. I prefer to return -1 here, too. > This report has also raised the issue that we are not setting h_errno > like we should for the res_* interfaces. res_query has logic to > produce a value for h_errno, but the rest of them leave it untouched. > I don't think it makes sense to mix fixing that with fixing the > attempts:0 crash. We can figure out what to do about that separately. > > Rich Best regards, Yao Zi
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.