Clients get DNS timeouts because IPv6 means more queries for each lookup

Mark Andrews marka at isc.org
Wed Jul 13 06:13:08 UTC 2011


No.  The fix is to correct the nameservers.  They are not correctly
following the DNS protocol, and everything else is fallout from
that.

> Well, all the prodding from people here prompted me to investigate 
> further exactly what's going on. The problem isn't what I thought it 
> was. It appears to be a bug in glibc, and I've filed a bug report and 
> found a workaround.

There is no bug in glibc.

> In a nutshell, the getaddrinfo function in glibc sends both A and AAAA 
> queries to the DNS server at the same time and then deals with the 
> responses as they come in. Unfortunately, if the responses to the two 
> queries come back in reverse order, /and/ the first one to come back is 
> a server failure, both of which are the case when you try to resolve 
> en.wikipedia.org immediately after restarting your DNS server so nothing 
> is cached, the glibc code screws up and decides it didn't get back a 
> successful response even though it did.

There is *nothing* wrong with sending both queries at once.
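
To illustrate why (a sketch only, not glibc's code; make_header,
fake_response, response_id and match_responses are names invented for
this example): every DNS message carries a 16-bit ID in its header, so
a client can pair each response with the query it answers no matter
which one arrives first.

```python
import struct


def make_header(query_id: int) -> bytes:
    """12-byte DNS query header: ID, flags (RD set), QDCOUNT=1, other counts 0."""
    return struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)


def fake_response(query_id: int) -> bytes:
    """Stand-in for a server reply: same ID, QR/RD/RA flags set, no records."""
    return struct.pack("!HHHHHH", query_id, 0x8180, 1, 0, 0, 0)


def response_id(message: bytes) -> int:
    """The ID is the first two bytes of any DNS message, in network byte order."""
    return struct.unpack("!H", message[:2])[0]


def match_responses(pending: dict, responses: list) -> dict:
    """Map each record type to its response, using the ID rather than arrival order.

    `pending` maps query ID -> record type ("A" or "AAAA").
    Responses with an unknown ID are ignored.
    """
    matched = {}
    for message in responses:
        record_type = pending.get(response_id(message))
        if record_type is not None:
            matched[record_type] = message
    return matched


# Two queries sent back to back, each with its own ID.
pending = {0x1234: "A", 0x1235: "AAAA"}
in_order = [fake_response(0x1234), fake_response(0x1235)]
reversed_order = list(reversed(in_order))

# The pairing comes out the same whichever response arrives first.
same = match_responses(pending, in_order) == match_responses(pending, reversed_order)
```

The pairing depends only on the IDs, which is why the order of the
replies is not in itself a problem.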

> If you do the same lookup again, it works, because the CNAME that was 
> sent in response to the A query is cached, so both the A and AAAA 
> queries get back valid responses from the DNS server. And even if that 
> weren't the case, since the CNAME is cached it gets returned first, 
> since the server doesn't need to do a query to get it, whereas it does 
> need to do another query to get the AAAA record (which recall isn't 
> being cached because of the previously discussed FORMERR problem). It'll 
> keep working until the cached records time out, at which point it'll 
> happen again, and then be OK again until the records time out, etc.
> 
> The workaround is to put "options single-request" in /etc/resolv.conf to 
> prevent the glibc innards from sending out both the A and AAAA queries 
> at the same time.
> 
> FYI, here's the glibc bug I filed about this:
> 
> http://sourceware.org/bugzilla/show_bug.cgi?id=12994
> 
> Thank you for telling me I was full of it and making me dig deeper into 
> this until I located the actual cause of the issue. :-)
> 
>    jik
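
For anyone who wants to check whether a machine already has that
workaround in place, here is a minimal Python sketch.  The function
name resolv_options and the sample file contents are made up for the
example, and it only looks at "options" lines, nothing else.

```python
def resolv_options(text: str) -> set:
    """Collect every token that follows an 'options' keyword in resolv.conf text."""
    options = set()
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and comment lines (starting with '#' or ';').
        if not fields or fields[0][0] in "#;":
            continue
        if fields[0] == "options":
            options.update(fields[1:])
    return options


# Hypothetical file contents; 192.0.2.53 is a documentation-only address.
SAMPLE = """\
# example resolv.conf
nameserver 192.0.2.53
options single-request
"""

has_workaround = "single-request" in resolv_options(SAMPLE)
```

On a real system you would pass in the contents of /etc/resolv.conf
instead of SAMPLE.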

Note that your "fix" won't help clients that ask only for AAAA records,
because it is the authoritative servers that are broken, not the
resolver library or the recursive server.
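
One way to see this for yourself is to send a lone AAAA query straight
to an authoritative server and look at the RCODE in the reply.  The
following is a rough Python sketch under some assumptions: build_query,
rcode_of and ask are names invented here, the server address is
whatever you supply, and there is no validation of label lengths, no
retry, and no TCP fallback.

```python
import socket
import struct

QTYPE_AAAA = 28  # AAAA record type
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def build_query(name: str, qtype: int, query_id: int = 0x4242) -> bytes:
    """Build a one-question DNS query with no flags set (RD is off,
    since the question goes directly to an authoritative server)."""
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 0)
    # QNAME is a sequence of length-prefixed labels ending in a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN


def rcode_of(response: bytes) -> str:
    """Return the RCODE name from the low four bits of the flags word."""
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, f"RCODE{code}")


def ask(server_ip: str, name: str, timeout: float = 3.0) -> str:
    """Send an AAAA-only query over UDP and return the reply's RCODE name.

    Raises a timeout error if the server does not answer in time.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, QTYPE_AAAA), (server_ip, 53))
        response, _ = sock.recvfrom(4096)
    return rcode_of(response)
```

Calling ask() with the address of one of the zone's authoritative
servers and the name in question involves neither glibc's parallel
lookup nor a recursive server, so anything other than NOERROR in the
result points at the authoritative server itself.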

Mark
-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742                 INTERNET: marka at isc.org



More information about the bind-users mailing list