Name-server redundancy

Kevin Darcy kcd at chrysler.com
Mon Jun 9 21:47:33 UTC 2014


That scenario still shouldn't have led to an NXDOMAIN. If none of the 
delegated nameservers are responding, you'd get a timeout or SERVFAIL. 
So I think there's still some investigation to be done. But using dig 
instead of nslookup at least makes things clearer :-)
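
To make the distinction concrete, here is a toy Python model (a sketch
only, not how dig, nslookup or any resolver is actually implemented).
fake_resolver, dig_style and nslookup_style are made-up names, and I'm
assuming the servers for department1.example.com are still reachable
while those for www.example.com are not. It contrasts a single absolute
query with the searchlist behavior described in my earlier message,
quoted below:

```python
from typing import Callable, List

# Hypothetical stand-in for a recursive resolver: maps a fully qualified
# name to the RCODE it would hand back. This is not a real DNS client.
Resolver = Callable[[str], str]


def fake_resolver(fqdn: str) -> str:
    """Assumed scenario: no server for www.example.com. is answering."""
    if fqdn == "www.example.com.":
        # Nobody authoritative answered, so the resolver gives up.
        return "SERVFAIL"
    # Assumption: every other name is answered by servers that are up
    # and that know the name does not exist.
    return "NXDOMAIN"


def dig_style(qname: str, resolve: Resolver) -> str:
    """Send one query for the name as given and report its RCODE."""
    fqdn = qname if qname.endswith(".") else qname + "."
    return resolve(fqdn)


def nslookup_style(qname: str, search: List[str], resolve: Resolver) -> str:
    """Simplified searchlist walk that reports the LAST RCODE seen."""
    if qname.endswith("."):
        # Dot-terminated names skip the searchlist entirely.
        return resolve(qname)
    last = resolve(qname + ".")
    if last == "NOERROR":
        return last
    for suffix in search:
        last = resolve(f"{qname}.{suffix}.")
        if last == "NOERROR":
            break
    return last


search = ["department1.example.com"]
print(dig_style("www.example.com", fake_resolver))
print(nslookup_style("www.example.com", search, fake_resolver))
print(nslookup_style("www.example.com.", search, fake_resolver))
```

In this model the absolute query reports SERVFAIL, which is the real
failure, while the searchlist walk ends up reporting the NXDOMAIN for
www.example.com.department1.example.com, a name nobody asked about.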

Of course, caching may complicate things here. The NS records published 
at the apex (which I assume were all 6 of them) take precedence over the 
delegation NS'es, so for a period of time, some resolvers would be able 
to resolve names in the zone, and some would not. Eventually, depending 
on your TTLs, everyone would expire the cached NS records and the zone 
would be completely unresolvable.
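
That timeline can be sketched in a few lines of Python. Treat it as a
simplification under stated assumptions: the ns1..ns6 names, the times
and the usable_servers helper are all invented for illustration, and
real resolvers have more elaborate caching and server-selection rules.

```python
from typing import Optional, Set

# Hypothetical server names; the real zone's NS names are not given here.
APEX_NS = {f"ns{i}.example.com." for i in range(1, 7)}    # all 6, from the zone apex
DELEGATION_NS = {"ns1.example.com.", "ns2.example.com."}  # the 2 in the registration
DOWN = {"ns1.example.com.", "ns2.example.com."}           # the region that is offline


def usable_servers(now: float, apex_cached_until: Optional[float]) -> Set[str]:
    """Reachable servers a resolver would try at time `now` (seconds).

    apex_cached_until is when this resolver's cached copy of the apex NS
    set expires, or None if it never cached that set.
    """
    has_apex = apex_cached_until is not None and now < apex_cached_until
    # Cached apex NS records win over the delegation; without them the
    # resolver only knows what the parent (.com) hands out.
    candidates = APEX_NS if has_apex else DELEGATION_NS
    return candidates - DOWN


# Resolver A cached the apex NS set; assume the cache entry expires at t=3600.
print(len(usable_servers(now=600, apex_cached_until=3600)))   # still has 4 to try
print(len(usable_servers(now=4000, apex_cached_until=3600)))  # cache expired: 0
# Resolver B never cached the apex NS set, so it has nothing reachable.
print(len(usable_servers(now=600, apex_cached_until=None)))

# The underlying mismatch: servers in the zone but not in the registration.
print(sorted(APEX_NS - DELEGATION_NS))
```

The last line is the check worth automating: any name in the apex NS
set that is missing from the delegation is invisible to a resolver
starting from the parent.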

         - Kevin

On 6/9/2014 5:38 PM, Sid Shapiro wrote:
> Thanks, Kevin, for your quick reply. In the last few minutes, I've 
> come to realize that my problem is likely that the domain is only 
> registered with two name servers - the ones which were offline. Even 
> though the zone has 6 NS records, the .com servers probably only know 
> of the ones in the registration. So registration and DNS are not in sync. 
> Silly mistake.
>
> (And FWIW, I *was* using dig, not nslookup)
>
> --
> Sid Shapiro sid_shapiro at bio-rad.com
> Bio-Rad Corporate IT  - Desk: (510) 741-6846   Mobile: (510) 224-4343
>
>
> On Mon, Jun 9, 2014 at 2:32 PM, Kevin Darcy <kcd at chrysler.com> wrote:
>
>     Well, you shouldn't be getting an NXDOMAIN just because some of
>     your auth servers are off-line, but you could get some query
>     timeouts if performance to your failover servers is really bad (or
>     blocked, due to firewall rules, bad routes, etc.), or, if your
>     expire times are *really* low, and the master's been down a while,
>     it's possible the zone may have expired on the slaves.
>
>     In any of those cases, I'm suspecting you're using nslookup, and
>     you might be suffering from its horrible misfeature where it
>     searchlists on a query failure, and then reports the *last* RCODE
>     it received as the result of the entire lookup. So, for example,
>     if your query is www.example.com and your searchlist ends in the
>     domain department1.example.com, if the first query fails (e.g.
>     with a timeout or a SERVFAIL), nslookup might work through the
>     searchlist, ultimately querying
>     www.example.com.department1.example.com, which returns NXDOMAIN,
>     and that's what nslookup (mis-)reports as the result of the
>     query.
>
>     You can avoid this by dot-terminating the original query (thus
>     inhibiting nslookup's searchlist behavior), or even better, using
>     a real DNS troubleshooting tool like dig or host. If you want to
>     continue to use nslookup, at the very least add the -debug flag so
>     you can see what it's really doing under the covers.
>
>                                 - Kevin
>
>     On 6/9/2014 4:36 PM, Sid Shapiro wrote:
>>     Hello,
>>     I've got 6 name-servers, 2 in each of 3 global regions. Each
>>     name-server has a net connection. Each name-server is
>>     authoritative. The domains it serves have all six NS records.
>>
>>     My question has to do with redundancy. If one of my "regions"
>>     goes down, I would have expected that a query against a domain
>>     would reach one of the other region's name-servers. However,
>>     during a maintenance window when one region was off the air, I
>>     did some simple queries. I did not have a lot of time to do a lot
>>     of detailed testing and tracing. I was simply trying to see if I
>>     could get a query resolved.
>>
>>     What I got was a "no name-server" error. I do not have the exact
>>     message, nor the timings. I could see (somehow) that there might
>>     be some time-out issue on the client, but the no name-servers
>>     response came pretty quickly.
>>
>>     This doesn't seem like a configuration problem, although I
>>     suppose it might be. It seems more like a misunderstanding of how
>>     redundancy works at the domain level.
>>
>>     Have I totally misunderstood a concept here?
>>     Thanks
>>     --
>>     Sid Shapiro sid_shapiro at bio-rad.com
>>     Bio-Rad Corporate IT  - Desk: (510) 741-6846   Mobile: (510) 224-4343
>>
>>
>>     _______________________________________________
>>     Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list
>>
>>     bind-users mailing list
>>     bind-users at lists.isc.org
>>     https://lists.isc.org/mailman/listinfo/bind-users
>
>
