Repeatable resolving failure
Scott Haneda
lists at newgeo.com
Thu Oct 28 02:51:03 UTC 2004
Hello, I am having a tough time tracking down some DNS problems. I am
relatively new to using BIND and its options, and equally new to dig. I do,
however, have a fair understanding of how DNS works in general. Forgive me
if I do not use the correct terminology.
Network Background
Several OS X servers run in a colo cabinet. I have full control of this
cabinet; the router is managed by the colo provider. I am assured that, for
now, the two machines I am testing are wide open with regard to ports.
My "workstation" is connected via Comcast, though I have access to other
machines in other states as well.
Problem
On my workstation, I have my DNS server, ns1.hostwizard.com, listed in my
TCP/IP settings; it is this machine that I use to resolve hosts locally.
When I type a hostname into my browser (Safari, but I have tested others),
it thinks about it for 10 seconds or so, then fails. If I simply reload that
hostname, it then works fine.
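The "fails first, works on reload" behaviour can be reproduced outside the
browser by timing a few back-to-back lookups through the system resolver.
The following is only a minimal sketch: the function name time_lookups and
its parameters are made up for illustration, and it assumes the machine is
already configured to use the name server under test.

```python
import socket
import time


def time_lookups(host, attempts=3, resolve=socket.getaddrinfo):
    """Resolve `host` several times in a row.

    Returns one (succeeded, seconds) tuple per attempt. `resolve` is
    injectable so the function can be exercised without a network.
    """
    results = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            resolve(host, 80)
            ok = True
        except OSError:  # socket.gaierror is a subclass of OSError
            ok = False
        results.append((ok, time.monotonic() - start))
    return results
```

Calling time_lookups("www.redlightcafe.com") and printing the result would
show whether the first attempt is slow or fails while the later ones return
quickly, which is the pattern described above.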
I have this issue when on my Comcast connection or if I TB2/VNC into my colo
machines and try it internally on the same subnet as the DNS server. The
DNS server sees little load, 2-5 queries per second.
named -v
BIND 9.2.3
I use QuickDNS as the front end for configuring it.
If I use an alternate name server in my TCP/IP settings, the issue goes
away: using Comcast's DHCP-supplied name servers, or those of some friends,
always resolves it.
As a test, I installed BIND 9.2.3 on another server. I did not add any
zones, just turned it on. I then used that new server's IP as my TCP/IP DNS
setting, and the problem persists there as well.
With that test, I think I have ruled out a machine-specific issue.
I also think I have ruled out this being a Comcast issue as I have tested
this on various other networks and as long as my DNS is used, the problem
rears its head.
I seem to be able to load any websites I host in Safari just fine, so the
few hundred zones I have all seem to resolve nice and fast.
If I watch the BIND logs, I can see roughly what is happening: as the
browser looks up the hostname, I see the request for the A record come in,
and I see it attempted many times.
Oct 27 19:41:53.614 queries: info: client 24.5.47.136#50508: query:
www.redlightcafe.com IN A
Oct 27 19:41:54.325 queries: info: client 24.5.47.136#50508: query:
www.redlightcafe.com IN A
Oct 27 19:41:55.025 queries: info: client 24.5.47.136#50508: query:
www.redlightcafe.com IN A
Oct 27 19:41:55.731 queries: info: client 24.5.47.136#50508: query:
www.redlightcafe.com IN A
Oct 27 19:41:57.248 queries: info: client 24.5.47.136#50511: query:
www.redlightcafe.com IN A
Oct 27 19:41:57.726 queries: info: client 24.5.47.136#50512: query:
www.redlightcafe.com IN AAAA
At this point, the browser alerts me to trouble, a reload then reports this:
Oct 27 19:42:31.803 queries: info: client 24.5.47.136#50516: query:
www.redlightcafe.com IN A
At which point the page is rendered.
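The spacing of the retries in the log excerpt above can be measured rather
than eyeballed. This is a rough sketch, not a general BIND log parser: it
assumes the query-log format shown in this message, with each entry on a
single unwrapped line, and the names LOG_RE, SAMPLE and retry_gaps_ms are
made up for illustration. The sample lines are copied from the excerpt.

```python
import re
from collections import defaultdict

# Matches the query-log format shown in this message (one entry per line).
LOG_RE = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})\.(\d{3}) queries: info: "
    r"client ([\d.]+)#(\d+): query: (\S+) IN (\S+)"
)

SAMPLE = [
    "Oct 27 19:41:53.614 queries: info: client 24.5.47.136#50508: query: www.redlightcafe.com IN A",
    "Oct 27 19:41:54.325 queries: info: client 24.5.47.136#50508: query: www.redlightcafe.com IN A",
    "Oct 27 19:41:55.025 queries: info: client 24.5.47.136#50508: query: www.redlightcafe.com IN A",
    "Oct 27 19:41:55.731 queries: info: client 24.5.47.136#50508: query: www.redlightcafe.com IN A",
    "Oct 27 19:41:57.248 queries: info: client 24.5.47.136#50511: query: www.redlightcafe.com IN A",
    "Oct 27 19:41:57.726 queries: info: client 24.5.47.136#50512: query: www.redlightcafe.com IN AAAA",
]


def retry_gaps_ms(lines):
    """Group queries by (source port, name, type); return gaps in milliseconds.

    Assumes all lines fall on the same day (no midnight rollover).
    """
    seen = defaultdict(list)
    for line in lines:
        m = LOG_RE.search(line)
        if not m:
            continue  # skip anything that is not a query-log entry
        h, mi, s, ms, _ip, port, name, rrtype = m.groups()
        stamp = ((int(h) * 60 + int(mi)) * 60 + int(s)) * 1000 + int(ms)
        seen[(int(port), name, rrtype)].append(stamp)
    return {
        key: [b - a for a, b in zip(stamps, stamps[1:])]
        for key, stamps in seen.items()
    }
```

For the excerpt, the four queries from source port 50508 come out about 700
ms apart. Repeated queries from one source port at a steady interval look
like a client resending the same question after getting no reply in time,
though the log alone does not show why no reply arrived.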
The only thing I have been able to learn about all this is the +trace option
to dig, which I think may lead to an answer, though I am not entirely sure.
The domain rlc.net is one my DNS should never have seen before. I tested
this while ssh'd into the alternate DNS machine I just set up, which gets no
other traffic at all.
dig rlc.net +trace
; <<>> DiG 9.2.2 <<>> rlc.net +trace
;; global options: printcmd
. 511574 IN NS E.ROOT-SERVERS.NET.
. 511574 IN NS F.ROOT-SERVERS.NET.
. 511574 IN NS G.ROOT-SERVERS.NET.
. 511574 IN NS H.ROOT-SERVERS.NET.
. 511574 IN NS I.ROOT-SERVERS.NET.
. 511574 IN NS J.ROOT-SERVERS.NET.
. 511574 IN NS K.ROOT-SERVERS.NET.
. 511574 IN NS L.ROOT-SERVERS.NET.
. 511574 IN NS M.ROOT-SERVERS.NET.
. 511574 IN NS A.ROOT-SERVERS.NET.
. 511574 IN NS B.ROOT-SERVERS.NET.
. 511574 IN NS C.ROOT-SERVERS.NET.
. 511574 IN NS D.ROOT-SERVERS.NET.
;; Received 340 bytes from 64.84.37.40#53(64.84.37.40) in 4 ms
net. 172800 IN NS E.GTLD-SERVERS.net.
net. 172800 IN NS F.GTLD-SERVERS.net.
net. 172800 IN NS G.GTLD-SERVERS.net.
net. 172800 IN NS H.GTLD-SERVERS.net.
net. 172800 IN NS I.GTLD-SERVERS.net.
net. 172800 IN NS J.GTLD-SERVERS.net.
net. 172800 IN NS K.GTLD-SERVERS.net.
net. 172800 IN NS L.GTLD-SERVERS.net.
net. 172800 IN NS M.GTLD-SERVERS.net.
net. 172800 IN NS A.GTLD-SERVERS.net.
net. 172800 IN NS B.GTLD-SERVERS.net.
net. 172800 IN NS C.GTLD-SERVERS.net.
net. 172800 IN NS D.GTLD-SERVERS.net.
;; Received 510 bytes from 192.203.230.10#53(E.ROOT-SERVERS.NET) in 10 ms
rlc.net. 172800 IN NS ns1.musictoday.com.
rlc.net. 172800 IN NS ns1.rlcom.net.
rlc.net. 172800 IN NS ns2.musictoday.com.
rlc.net. 172800 IN NS ns2.rlcom.net.
;; Received 181 bytes from 192.12.94.30#53(E.GTLD-SERVERS.net) in 25 ms
dig: Couldn't find server 'ns1.musictoday.com': No address associated with
nodename
paris:~ haneda$
I seem to get repeated errors like this with the +trace option on the first
try. The second try normally works, a third almost always works, and I have
never needed more than four tries.
While running the +trace above, I also ran tail -f on the BIND logs and got
this:
Oct 27 19:45:30.959 queries: info: client 64.84.37.40#63231: query: . IN NS
Oct 27 19:45:30.970 queries: info: client 64.84.37.40#63232: query:
E.ROOT-SERVERS.NET IN A
Oct 27 19:45:30.994 queries: info: client 64.84.37.40#63233: query:
E.ROOT-SERVERS.NET IN A
Oct 27 19:45:30.996 queries: info: client 64.84.37.40#63234: query:
E.ROOT-SERVERS.NET IN AAAA
Oct 27 19:45:31.006 queries: info: client 64.84.37.40#63235: query:
E.ROOT-SERVERS.NET.hostwizard.com IN AAAA
Oct 27 19:45:31.071 queries: info: client 64.84.37.40#63237: query:
E.GTLD-SERVERS.net IN A
Oct 27 19:45:31.092 queries: info: client 64.84.37.40#63238: query:
E.GTLD-SERVERS.net IN A
Oct 27 19:45:31.095 queries: info: client 64.84.37.40#63239: query:
E.GTLD-SERVERS.net IN AAAA
Oct 27 19:45:31.113 queries: info: client 64.84.37.40#63240: query:
E.GTLD-SERVERS.net.hostwizard.com IN AAAA
Oct 27 19:45:31.168 queries: info: client 64.84.37.40#63242: query:
ns1.musictoday.com IN A
Oct 27 19:45:31.873 queries: info: client 64.84.37.40#63242: query:
ns1.musictoday.com IN A
Oct 27 19:45:32.579 queries: info: client 64.84.37.40#63242: query:
ns1.musictoday.com IN A
Oct 27 19:45:33.284 queries: info: client 64.84.37.40#63242: query:
ns1.musictoday.com IN A
Oct 27 19:45:33.990 queries: info: client 64.84.37.40#63243: query:
ns1.musictoday.com.hostwizard.com IN A
Oct 27 19:45:34.025 queries: info: client 64.84.37.40#63244: query:
ns1.musictoday.com IN A
Oct 27 19:45:34.733 queries: info: client 64.84.37.40#63244: query:
ns1.musictoday.com IN A
Oct 27 19:45:35.269 queries: info: client 64.84.37.40#63245: query:
ns1.musictoday.com IN AAAA
Oct 27 19:45:35.974 queries: info: client 64.84.37.40#63245: query:
ns1.musictoday.com IN AAAA
Oct 27 19:45:36.680 queries: info: client 64.84.37.40#63245: query:
ns1.musictoday.com IN AAAA
Oct 27 19:45:37.396 queries: info: client 64.84.37.40#63245: query:
ns1.musictoday.com IN AAAA
Oct 27 19:45:38.102 queries: info: client 64.84.37.40#63246: query:
ns1.musictoday.com.hostwizard.com IN AAAA
Some of it I just don't get, such as why it tried to resolve
ns1.musictoday.com.hostwizard.com, since I never asked for that and it is
not a hostname I would ever deal with.
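One likely explanation for the .hostwizard.com names, offered as an
assumption rather than a diagnosis: if the machine running dig has
hostwizard.com configured as a search domain, the system's stub resolver
will retry a failed lookup with that domain appended. The sketch below only
models that ordering rule as described in resolv.conf(5); the function name
candidate_names and the default search list are hypothetical.

```python
def candidate_names(name, search=("hostwizard.com",), ndots=1):
    """Return, in order, the names a stub resolver would try for `name`.

    Models the usual search-list rule: a name with a trailing dot is taken
    as fully qualified; a name with at least `ndots` dots is tried as-is
    first and then with each search domain appended; otherwise the search
    domains are tried first.
    """
    if name.endswith("."):
        return [name[:-1]]  # fully qualified: the search list is skipped
    with_search = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        return [name] + with_search
    return with_search + [name]
```

Under that assumption, candidate_names("ns1.musictoday.com") gives the
plain name followed by ns1.musictoday.com.hostwizard.com, which is the same
order the two names appear in the log above once the plain lookups have
gone unanswered.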
At this point, I am at a complete loss as to what the issue is and how to
fix it. About 10 clients that I allow to use my DNS for local lookups were
essentially knocked offline as a result of this. I have since had them fall
back on Comcast, Verizon, etc. for name servers; however, these people often
make many DNS changes and want to see those changes quickly, which is why I
generally add them to my allow list of people who are able to use my DNS for
queries.
Sorry for the length of this email; I felt it would help to give as much
detail as possible. Hopefully one of you can shed some light on what is
happening.
Thanks again
--
-------------------------------------------------------------
Scott Haneda Tel: 415.898.2602
<http://www.newgeo.com> Fax: 313.557.5052
<scott at newgeo.com> Novato, CA U.S.A.