Caching only nameserver fails to resolve external zones periodically

Curtis Rempel curtis at telus.net
Mon May 17 03:50:59 UTC 2004


Hi,

I've got a caching name server which also handles a zone (.lan) on an
internal 192.168.1.0/24 network.  Both internal and external lookups work
fine, as I have a forwarder entry defined in
/var/named/chroot/etc/named.conf.

That is, until "something" happens which causes the external lookups to
fail.  The internal zone resolution still works.  As far as I can tell,
the forwarder stops responding, so named starts crawling through the
root name servers and eventually gives up.

Here's some sample output (from Fedora Core 1 Linux and bind 9.2.2.P3-9).

When everything is working (i.e. immediately after a 'service named
restart' command), the 'host' command below succeeds.  When things
aren't working, I get this output instead:

[root@vault root]# host www.telus.net
;; connection timed out; no servers could be reached

This can be rectified by restarting the name server as above, but only for
a while (how long seems to vary), and then external lookups hang again.
The internal zone information can still be resolved.

When the system is not responding to external zone lookups, a tcpdump
looks like this with the above 'host' command:

15:51:01.996338 vault.lan.33305 > ns7so.cg.shawcable.net.domain:  35946+ [1au] A? www.telus.net. (42) (DF)
15:51:03.728476 vault.lan.33305 > f.root-servers.net.domain:  50741 PTR? 182.181.179.142.in-addr.arpa. (46) (DF)
15:51:06.008121 vault.lan.33305 > 198.41.0.4.domain:  14024 [1au] A? www.telus.net. (42) (DF)
15:51:07.747854 vault.lan.33305 > G.ROOT-SERVERS.NET.domain:  52631 PTR? 182.181.179.142.in-addr.arpa. (46) (DF)
15:51:10.027489 vault.lan.33305 > 128.9.0.107.domain:  65124 [1au] A? www.telus.net. (42) (DF)
15:51:11.767237 vault.lan.33305 > 128.63.2.53.domain:  65468 PTR? 182.181.179.142.in-addr.arpa. (46) (DF)
15:51:14.046919 vault.lan.33305 > 192.33.4.12.domain:  65502 A? www.telus.net. (31) (DF)
15:51:15.786573 vault.lan.33305 > 192.36.148.17.domain:  32751 PTR? 182.181.179.142.in-addr.arpa. (46) (DF)
15:51:18.066210 vault.lan.33305 > d.root-servers.net.domain:  55260 A? www.telus.net. (31) (DF)
15:51:19.038994 laser.lan.1024 > vault.lan.domain:  27316 A? fsa.cpsc.ucalgary.ca. (50)
15:51:19.805969 vault.lan.33305 > k.root-servers.net.domain:  13778 PTR? 182.181.179.142.in-addr.arpa. (46) (DF)
15:51:22.085587 vault.lan.33305 > E.ROOT-SERVERS.NET.domain:  3376 A? www.telus.net. (31) (DF)
15:51:23.825310 vault.lan.33305 > 202.12.27.33.domain:  1688 PTR? 182.181.179.142.in-addr.arpa. (46) (DF)
15:51:26.104947 vault.lan.33305 > f.root-servers.net.domain:  844 A? www.telus.net. (31) (DF)
15:51:27.844754 vault.lan.33305 > j.root-servers.net.domain:  33190 PTR? 182.181.179.142.in-addr.arpa. (46) (DF)
15:51:30.124317 vault.lan.33305 > G.ROOT-SERVERS.NET.domain:  49363 A? www.telus.net. (31) (DF)
15:51:31.864043 vault.lan.33305 > l.root-servers.net.domain:  18756 PTR? 182.181.179.142.in-addr.arpa. (46) (DF)
15:51:34.143694 vault.lan.33305 > 128.63.2.53.domain:  4724 A? www.telus.net. (31) (DF)
15:51:35.883596 vault.lan.33305 > ns7so.cg.shawcable.net.domain:  2362+ PTR? 182.181.179.142.in-addr.arpa. (46) (DF)
15:51:38.163051 vault.lan.33305 > 192.36.148.17.domain:  1181 A? www.telus.net. (31) (DF)
15:51:40.902620 vault.lan.33305 > 198.41.0.4.domain:  24263 PTR? 182.181.179.142.in-addr.arpa. (46) (DF)
15:51:42.182418 vault.lan.33305 > k.root-servers.net.domain:  22529 A? www.telus.net. (31) (DF)

The first entry above (15:51:01) indicates that the request is being
forwarded to the "forwarders" entry, which resolves to
ns7so.cg.shawcable.net.

When external resolution is working, this is the last entry as
ns7so.cg.shawcable.net provides the answer.

In a "hung" lookup, the output is as above: the first stop is the
forwarder entry, then the root servers, and finally failure.
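
To make that pattern easier to see, here is a minimal sketch in Python that
pulls the destination, query type and recursion flag out of tcpdump lines
like the ones above.  It is not part of my setup: the regular expression
only assumes the line format shown in this trace, and the names
QUERY_LINE, parse_query and SAMPLE are made up for illustration.  tcpdump
prints a '+' after the query ID when the "recursion desired" flag is set.

```python
import re

# Matches the tcpdump DNS query lines shown in the trace above.
QUERY_LINE = re.compile(
    r"^(?P<time>\S+) (?P<src>\S+) > (?P<dst>\S+)\.domain:\s+"
    r"(?P<id>\d+)(?P<rd>\+?) (?:\[\S+\] )?"
    r"(?P<qtype>\w+)\? (?P<qname>\S+) \((?P<size>\d+)\)"
)

def parse_query(line):
    """Return a dict describing one query line, or None if it does not match."""
    m = QUERY_LINE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["rd"] = rec["rd"] == "+"      # '+' means recursion desired
    rec["size"] = int(rec["size"])    # DNS message length in bytes
    return rec

# Four lines copied from the trace above.
SAMPLE = """\
15:51:01.996338 vault.lan.33305 > ns7so.cg.shawcable.net.domain:  35946+ [1au] A? www.telus.net. (42) (DF)
15:51:03.728476 vault.lan.33305 > f.root-servers.net.domain:  50741 PTR? 182.181.179.142.in-addr.arpa. (46) (DF)
15:51:06.008121 vault.lan.33305 > 198.41.0.4.domain:  14024 [1au] A? www.telus.net. (42) (DF)
15:51:35.883596 vault.lan.33305 > ns7so.cg.shawcable.net.domain:  2362+ PTR? 182.181.179.142.in-addr.arpa. (46) (DF)
"""

records = [r for r in map(parse_query, SAMPLE.splitlines()) if r]

# Servers that were asked to recurse on our behalf (i.e. used as forwarders).
recursive_targets = sorted({r["dst"] for r in records if r["rd"]})

for r in records:
    print(r["time"], r["dst"], r["qtype"], "RD" if r["rd"] else "no-RD")
```

In the trace, the '+' appears only on the queries sent to
ns7so.cg.shawcable.net; the queries sent to the root servers are plain
iterative ones.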

Does anybody have any idea why this external name resolution is
periodically failing like this?  Any suggestions for debugging info?
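
One check that might separate "the forwarder is not answering" from "named
is misbehaving" is to query the forwarder directly, bypassing named.  A
minimal sketch in Python follows, using only the standard library.  The
helper names build_query and probe are made up for illustration, and the
server address has to be supplied as an IP address, since looking up the
forwarder's name would itself go through the failing resolver.

```python
import socket
import struct

def build_query(name, qtype=1, qid=0x1234, recurse=True):
    """Build a plain (non-EDNS) DNS query packet; qtype 1 is A, 12 is PTR."""
    flags = 0x0100 if recurse else 0x0000   # RD bit, shown as '+' by tcpdump
    header = struct.pack("!HHHHHH", qid, flags, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)   # class IN

def probe(server_ip, name, timeout=3.0):
    """Return True if server_ip sends back a reply with a matching query ID
    within `timeout` seconds, False on timeout or socket error."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server_ip, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:            # covers timeouts and unreachable hosts
            return False
    return reply[:2] == query[:2]  # first two bytes are the query ID
```

Calling probe("192.0.2.53", "www.telus.net") during a "hung" period, with
the placeholder address replaced by the forwarder's real IP, would show
whether the forwarder still answers when asked directly.  The packet built
for www.telus.net is 31 bytes long, which matches the (31) lengths in the
trace for the queries without the [1au] marker.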

External lookups can work fine for days and then quit, or sometimes for
only minutes and then quit.

Thanks!

curtis at telus dot net (which the smarter spambots can likely figure out
anyway...)


More information about the bind-users mailing list