BIND fails if one of 2 servers is bad?

Barry Margolin barry.margolin at level3.com
Mon Sep 29 19:16:59 UTC 2003


In article <bl9uc3$9n$1 at sf1.isc.org>,
Andre Burgoyne <comp.protocols.dns.bind at fishbear.com> wrote:
>Running BIND 9.2.1 (RedHat 9), I get the following results:
>
># dig counterpunch.org
>
>; <<>> DiG 9.2.1 <<>> counterpunch.org
>;; global options:  printcmd
>;; Got answer:
>;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 20001
>;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
>
>;; QUESTION SECTION:
>;counterpunch.org.              IN      A
>
>But when I do +trace I get:
>
># dig counterpunch.org +trace
>
>; <<>> DiG 9.2.1 <<>> counterpunch.org +trace
>;; global options:  printcmd
>...
>org.                    86400   IN      NS      TLD2.ULTRADNS.NET.
>org.                    86400   IN      NS      TLD1.ULTRADNS.NET.
>;; Received 116 bytes from 195.206.104.13#53(M.ROOT-SERVERS.ORSC) in 192 ms
>
>COUNTERPUNCH.ORG.       172800  IN      NS      NS.LEB.NET.
>COUNTERPUNCH.ORG.       172800  IN      NS      NS.DOLEH.COM.
>;; Received 100 bytes from 204.74.113.1#53(TLD2.ULTRADNS.NET) in 35 ms
>
>counterpunch.org.       86400   IN      A       38.117.146.196
>counterpunch.org.       86400   IN      NS      ns.leb.net.
>;; Received 74 bytes from 206.127.55.2#53(NS.LEB.NET) in 108 ms
>
>So NS.LEB.NET is working and answers for the domain, but when I do the
>simple query (e.g. for normal web browsing) I get the server fail.
>(Presumably because NS.DOLEH.COM does not exist).  Is my server somehow
>mis-configured? Seems like it should answer as long as one of the name
>servers is responding (isn't that the whole point of redundant servers?)

My first guess is that NS.DOLEH.COM *did* exist at the time you did the
initial query, but it didn't have the zone loaded into its configuration.
So you had a 50% chance of querying a lame server.  Failover only occurs if
a server doesn't respond at all; if it responds with a SERVFAIL error code,
this response is passed on to the client.
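
To make the distinction concrete, here is a toy model of the behavior described above. It is only a sketch, not BIND's actual code: the names Reply and resolve and the behavior table are made up for illustration, and the server replies are the guess from this post, not measured results.

```python
from enum import Enum


class Reply(Enum):
    ANSWER = "NOERROR"
    SERVFAIL = "SERVFAIL"
    TIMEOUT = None  # the server sent nothing back at all


def resolve(servers, behavior):
    """Try servers in order, moving on only when a server stays silent.

    `behavior` is a hypothetical table mapping server name -> Reply.
    """
    for server in servers:
        reply = behavior[server]
        if reply is Reply.TIMEOUT:
            continue  # no response: fail over to the next server
        return server, reply  # any response, even SERVFAIL, ends the search
    return None, Reply.SERVFAIL  # every listed server timed out


# Assumed situation: ns.doleh.com responds but is lame for the zone.
lame = {"ns.leb.net": Reply.ANSWER, "ns.doleh.com": Reply.SERVFAIL}
print(resolve(["ns.doleh.com", "ns.leb.net"], lame))
print(resolve(["ns.leb.net", "ns.doleh.com"], lame))
```

In this model the outcome depends entirely on which server is asked first, which is where the 50% figure comes from. If ns.doleh.com were completely silent instead of lame, the loop would fall through to ns.leb.net and the query would succeed.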

BTW, it looks like the registration of the domain has since been updated;
the delegation now points to ns1.leb.net and ns2.leb.net.  However, that
could be the source of another problem, because the new delegation didn't
match the NS record on the authoritative server, which was:

counterpunch.org.	IN	NS	ns.leb.net.

This is what's in my caching server's memory, but it looks like leb.net has
fixed it to match the delegation since it got cached.  However, there's a
similar inconsistency in the leb.net domain itself, which they *haven't*
fixed.  This could cause problems due to glue issues.
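
The kind of mismatch described above can be checked mechanically once you have both NS sets in hand (for instance, copied from dig output). The following is a rough sketch; normalize and ns_mismatch are hypothetical helper names, and the sample data is just the names mentioned in this post.

```python
def normalize(name):
    """Compare DNS names case-insensitively and without the trailing dot."""
    return name.rstrip(".").lower()


def ns_mismatch(delegation, authoritative):
    """Return (only in the delegation, only on the authoritative server)."""
    parent = {normalize(n) for n in delegation}
    child = {normalize(n) for n in authoritative}
    return sorted(parent - child), sorted(child - parent)


# Delegation from the parent zone vs. the NS record I had cached.
only_parent, only_child = ns_mismatch(
    ["ns1.leb.net.", "ns2.leb.net."],
    ["ns.leb.net."],
)
print("only in delegation:   ", only_parent)
print("only on authoritative:", only_child)
```

Two empty lists mean the delegation and the zone's own NS records agree; anything else is the sort of inconsistency worth fixing.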

-- 
Barry Margolin, barry.margolin at level3.com
Level(3), Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

