Name-server redundancy

Sid Shapiro sid_shapiro at bio-rad.com
Mon Jun 9 21:56:15 UTC 2014


Again - thanks for the quick response - that'll teach me to post without
all the facts. I simply don't remember what the specific error was, darn
it. It might have been NXDOMAIN or SERVFAIL - I didn't write it down.
The test I was running was on a rarely, if ever, used domain, so I was
pretty sure it wasn't cached anywhere.

I'm trying to figure out ways to test this without taking name servers
offline :-)
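
One way to check this without an outage might be to compare the NS set in
the delegation (what the .com servers hand out) with the NS set at the zone
apex, and then query each listed server directly. A minimal sketch in
Python, under stated assumptions: example.com and the nsN.example.com names
are placeholders, the function names are made up for illustration, and the
script only prints dig commands to run by hand rather than sending any
queries itself.

```python
def normalize(name):
    """Lower-case a host name and strip the trailing dot so names compare equal."""
    return name.strip().lower().rstrip(".")


def compare_ns_sets(delegated, apex):
    """Return (only_at_apex, only_in_delegation) as sorted lists of names."""
    d = {normalize(n) for n in delegated}
    a = {normalize(n) for n in apex}
    return sorted(a - d), sorted(d - a)


def dig_commands(zone, servers):
    """Build one non-recursive SOA query per server, to be run by hand."""
    zone = normalize(zone) + "."
    return [f"dig +norecurse @{normalize(s)} {zone} SOA" for s in sorted(servers)]


if __name__ == "__main__":
    # Hypothetical data: two servers in the registration, six at the apex.
    delegated = ["ns1.example.com.", "ns2.example.com."]
    apex = [f"ns{i}.example.com." for i in range(1, 7)]

    only_at_apex, only_in_delegation = compare_ns_sets(delegated, apex)
    print("at apex but not delegated:", only_at_apex)
    print("delegated but not at apex:", only_in_delegation)

    # Each server should answer for the zone on its own, whatever its peers do.
    for cmd in dig_commands("example.com", apex):
        print(cmd)
```

The delegated list could be collected by asking a .com server directly, for
example: dig +norecurse @a.gtld-servers.net example.com. NS
The apex list comes from asking one of the zone's own servers the same
question. Any name that shows up only at the apex is one that a resolver
starting from the parent would never learn about.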

--
Sid Shapiro
sid_shapiro at bio-rad.com
Bio-Rad Corporate IT  - Desk: (510) 741-6846   Mobile: (510) 224-4343


On Mon, Jun 9, 2014 at 2:47 PM, Kevin Darcy <kcd at chrysler.com> wrote:

>  That scenario still shouldn't have led to an NXDOMAIN. If none of the
> delegated nameservers are responding, you'd get a timeout or SERVFAIL. So I
> think there's still some investigation to be done. But using dig instead of
> nslookup at least makes things clearer :-)
>
> Of course, caching may complicate things here. The NS records published at
> the apex (which I assume were all 6 of them) take precedence over the
> delegation NS'es, so for a period of time, some resolvers would be able to
> resolve names in the zone, and some would not. Eventually, depending on
> your TTLs, everyone would expire the cached NS records and the zone would
> be completely unresolvable.
>
>             - Kevin
>
>
> On 6/9/2014 5:38 PM, Sid Shapiro wrote:
>
> Thanks, Kevin, for your quick reply. In the last few minutes, I've come to
> realize that my problem is likely that the domain is only registered with
> two name servers - the ones which were offline. Even though the zone has 6
> NS records, the .com servers probably only know of the ones in the
> registration. So registration and DNS are not in sync. Silly mistake.
>
>  (And FWIW, I *was* using dig, not nslookup)
>
>  --
> Sid Shapiro
> sid_shapiro at bio-rad.com
>  Bio-Rad Corporate IT  - Desk: (510) 741-6846   Mobile: (510) 224-4343
>
>
> On Mon, Jun 9, 2014 at 2:32 PM, Kevin Darcy <kcd at chrysler.com> wrote:
>
>>  Well, you shouldn't be getting an NXDOMAIN just because some of your
>> auth servers are off-line, but you could get some query timeouts if
>> performance to your failover servers is really bad (or blocked, due to
>> firewall rules, bad routes, etc.), or, if your expire times are *really*
>> low, and the master's been down a while, it's possible the zone may have
>> expired on the slaves.
>>
>> In any of those cases, I'm suspecting you're using nslookup, and you
>> might be suffering from its horrible misfeature where it searchlists on a
>> query failure, and then reports the *last* RCODE it received as the result
>> of the entire lookup. So, for example, if your query is www.example.com
>> and your searchlist ends in the domain department1.example.com, if the
>> first query fails (e.g. with a timeout or a SERVFAIL), nslookup might work
>> through the searchlist, ultimately querying
>> www.example.com.department1.example.com, which returns NXDOMAIN, and
>> that's what nslookup (mis-)reports as the result of the query.
>>
>> You can avoid this by dot-terminating the original query (thus inhibiting
>> nslookup's searchlist behavior), or even better, using a real DNS
>> troubleshooting tool like dig or host. If you want to continue to use
>> nslookup, at the very least add the -debug flag so you can see what it's
>> really doing under the covers.
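
The search-list behaviour described above can be illustrated with a toy
model. This is a sketch of the behaviour as described in this thread, not
nslookup's actual code: the function names, the answers table, and the
default of NXDOMAIN for unknown names are all assumptions made for the
example.

```python
def candidate_names(name, searchlist):
    """Names a search-list-style lookup would try, in order (toy model)."""
    if name.endswith("."):
        # Dot-terminated names are treated as absolute: no search list.
        return [name]
    return [name + "."] + [f"{name}.{suffix.rstrip('.')}." for suffix in searchlist]


def reported_rcode(name, searchlist, answers):
    """Return the RCODE of the last name tried, stopping at the first success.

    `answers` maps fully qualified names to RCODE strings; names not in the
    table are assumed (for this sketch) to return NXDOMAIN.
    """
    last = None
    for candidate in candidate_names(name, searchlist):
        last = answers.get(candidate, "NXDOMAIN")
        if last == "NOERROR":
            break
    return last


if __name__ == "__main__":
    searchlist = ["department1.example.com"]
    # Hypothetical situation: the real name fails with SERVFAIL.
    answers = {"www.example.com.": "SERVFAIL"}

    # Without the trailing dot, the search list masks the real failure.
    print(reported_rcode("www.example.com", searchlist, answers))   # NXDOMAIN
    # With the trailing dot, the original failure is what gets reported.
    print(reported_rcode("www.example.com.", searchlist, answers))  # SERVFAIL
```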
>>
>>
>>             - Kevin
>>
>> On 6/9/2014 4:36 PM, Sid Shapiro wrote:
>>
>>  Hello,
>> I've got 6 name-servers, 2 in each of 3 global regions. Each name-server
>> has a net connection. Each name-server is authoritative, and the domains
>> it serves have all six NS records.
>>
>>  My question has to do with redundancy. If one of my "regions" goes
>> down, I would have expected that a query against a domain would reach one
>> of the other regions' name-servers. However, during a maintenance window
>> when one region was off the air, I did some simple queries. I did not have
>> time to do much detailed testing and tracing. I was simply
>> trying to see if I could get a query resolved.
>>
>>  What I got was a "no name-server" error. I do not have the exact
>> message, nor the timings. I could see (somehow) that there might be some
>> time-out issue on the client, but the no name-servers response came pretty
>> quickly.
>>
>>  This doesn't seem like a configuration problem, although I suppose it
>> might be. It seems more like a misunderstanding of how redundancy works at
>> the domain level.
>>
>>  Have I totally misunderstood a concept here?
>> Thanks
>>  --
>> Sid Shapiro
>> sid_shapiro at bio-rad.com
>>  Bio-Rad Corporate IT  - Desk: (510) 741-6846   Mobile: (510) 224-4343
>>
>>
>>  _______________________________________________
>> Please visit https://lists.isc.org/mailman/listinfo/bind-users to
>> unsubscribe from this list
>>
>> bind-users mailing list
>> bind-users at lists.isc.org
>> https://lists.isc.org/mailman/listinfo/bind-users