Clients get DNS timeouts because ipv6 means more queries for each lookup

Mon Jul 11 21:50:21 UTC 2011

On 7/11/2011 2:11 PM, Jonathan Kamens wrote:
> The number of DNS queries required for each address lookup requested 
> by a client has gone up considerably because of IPV6. The problem is 
> being exacerbated by the fact that many DNS servers on the net don't 
> yet support IPV6 queries. The result is that address lookups are 
> frequently taking so long that the client gives up before getting the 
> result.
>
> The example I am seeing this with most frequently is my RSS feed 
> reader, rss2email, trying to read a feed from en.wikipedia.org in a 
> cron job that runs every 15 minutes. I am regularly seeing this in the 
> output of the cron job:
>
>     W: Name or service not known [8]
>     http://en.wikipedia.org/w/index.php?title=/[elided]/&feed=atom&action=history
>
> The wikipedia.org domain has three DNS servers. Let's assume that the 
> root and org. nameservers are cached already when rss2email does its 
> query. If so, then it has to do the following queries:
>
>     wikipedia.org DNS
>     en.wikipedia.org AAAA
>     en.wikipedia.org A
>
> This is fine when the wikipedia.org nameservers are working, but let's 
> postulate for the moment that two of them are down, unreachable, or 
> responding slowly, which apparently happens pretty often. Then we end 
> up doing:
>
>     wikipedia.org DNS
>     en.wikipedia.org AAAA /times out/
>     en.wikipedia.org AAAA /times out/
>     en.wikipedia.org AAAA
>     en.wikipedia.org A /times out/
>     en.wikipedia.org A /times out/
>     en.wikipedia.org A
>
> By the end of that sequence, the typical 30-second DNS request 
> timeout has been exceeded, and the client gives up.

The math isn't working. I just ran a quick test and named (9.7.x) failed 
over from a non-working delegated NS to a working delegated NS in less 
than 30 milliseconds. How are you reaching a 30-*second* timeout 
threshold in only 6 queries?
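
To make the arithmetic concrete, here is a rough sketch in Python. It 
assumes the quoted sequence (four tries that time out, three that are 
answered) and, purely as an assumption, charges about 30 ms for every try, 
in line with the failover time measured above. The function names are 
hypothetical and this does not model how named actually schedules retries.

```python
def worst_case_seconds(timed_out_tries, per_try_timeout_s,
                       answered_tries, answer_latency_s):
    """Total wall-clock time if every try is made one after another."""
    return (timed_out_tries * per_try_timeout_s
            + answered_tries * answer_latency_s)


def per_try_timeout_needed(budget_s, timed_out_tries,
                           answered_tries, answer_latency_s):
    """Per-try timeout that would be needed to use up the whole budget."""
    return (budget_s - answered_tries * answer_latency_s) / timed_out_tries


# Counts taken from the quoted sequence: 4 tries time out, 3 are answered.
TIMED_OUT = 4
ANSWERED = 3

# Assumption: each try, failed or not, costs about 30 ms.
ASSUMED_TRY_COST_S = 0.030

total = worst_case_seconds(TIMED_OUT, ASSUMED_TRY_COST_S,
                           ANSWERED, ASSUMED_TRY_COST_S)
needed = per_try_timeout_needed(30.0, TIMED_OUT,
                                ANSWERED, ASSUMED_TRY_COST_S)

print(f"total with fast failover: {total:.2f} s")
print(f"per-try timeout needed to reach 30 s: {needed:.2f} s")
```

Under those assumptions the whole sequence costs roughly 0.21 seconds, and 
each of the four failed tries would have to stall for about 7.5 seconds 
before the sequence could reach a 30-second limit.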

In practice, it would also be quite unlikely that named would pick 
"dead" nameservers before live ones for *both* the AAAA and the A query. 
At the very least, once the timeouts were encountered for the AAAA 
query, those NSes would be penalized in terms of NS selection, so they 
are unlikely to be chosen *again*, ahead of the working NS, for the A 
query. Any en.wikipedia.org NSes which were found to be *persistently* 
broken would gravitate to the bottom of the selection list and be 
tried approximately never.
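
The selection effect described above can be illustrated with a toy model in 
Python. This is only a sketch and not named's actual algorithm: the 
server names (ns-a, ns-b, ns-c), the initial score, the timeout penalty, and 
the smoothing factor are all made-up values chosen for illustration.

```python
class ServerStats:
    """Toy per-nameserver score; lower is better (a smoothed RTT in ms)."""

    def __init__(self, names, initial_rtt_ms=10.0):
        self.rtt = {name: initial_rtt_ms for name in names}

    def pick_order(self):
        # Try the best-scoring server first; break ties by name.
        return sorted(self.rtt, key=lambda name: (self.rtt[name], name))

    def record_timeout(self, name, penalty_ms=1000.0):
        # Assumed penalty: a timeout pushes the server down the list.
        self.rtt[name] += penalty_ms

    def record_answer(self, name, measured_ms, alpha=0.3):
        # Exponentially weighted moving average of observed latency.
        self.rtt[name] = (1 - alpha) * self.rtt[name] + alpha * measured_ms


def resolve(stats, alive, answer_ms=20.0):
    """Return the servers tried, in order, until one of them answers."""
    tried = []
    for name in stats.pick_order():
        tried.append(name)
        if name in alive:
            stats.record_answer(name, answer_ms)
            return tried
        stats.record_timeout(name)
    return tried


# Hypothetical delegation: three nameservers, only ns-c is reachable.
stats = ServerStats(["ns-a", "ns-b", "ns-c"])
alive = {"ns-c"}

aaaa_tried = resolve(stats, alive)  # first lookup hits the dead servers
a_tried = resolve(stats, alive)     # second lookup benefits from the penalty

print("AAAA tried:", aaaa_tried)
print("A tried:   ", a_tried)
```

In this model the AAAA lookup pays for both dead servers, but the A lookup 
that follows goes straight to the working one, so the two timeouts are not 
repeated.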

I think maybe you need to probe deeper and find out what _else_ is going on.

                                             - Kevin
