What happens when one out of three NSs are down?

Chris Buxton clists at buxtonfamily.us
Wed Jun 12 15:56:13 UTC 2013


On Jun 11, 2013, at 4:12 PM, Gary Wallis <wgg1970 at gmail.com> wrote:
> DNS experts:
> 
> What really happens in the real world when one of three authoritative NSs is down for 30 minutes due to a datacenter outage?
> 
> For example, we have 3 NSs:
> 
> ns1.someisp.net 12.23.34.45
> ns2.someisp.net 23.34.45.56
> ns3.someisp.net 34.45.56.67
> 
> All in different datacenters.
> All are authoritative for a given zone.
> All have the same zone data and SOA serial number for the zone.
> 
> Suppose the datacenter handling ns3 broke routing for 34.45.56.0/24 (a mistake in a new router configuration), so ns3 is no longer reachable.
> 
> I think I have a grasp on the basic theory here, but in practice, the unreachable ns3 nameserver creates problems for a small group of customers trying to reach web sites with zones hosted by these three authoritative NSs.
> 
> Will round robin glue NS records help?
> 
> Can quick or automated changes to the ns3 IP at the registrar help, for example switching to a hot spare in some other datacenter? In that case, would the NS A record on the running NSs also have to be changed to match?
> 
> Any comments and best practice solution info very welcome.

You might consider using anycast to route around the problem.

In practice, though, your best bet is to find out why that small group of customers is having problems. Are they querying the servers directly?
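
As a starting point for that investigation, here is a rough sketch (not a definitive tool) that sends a non-recursive SOA query straight to each authoritative server over UDP with a short timeout, which is roughly what a client querying the servers directly would experience. It uses only the Python standard library. The function names `build_soa_query` and `probe` and the zone `example.com` are placeholders I made up for illustration; the addresses are the example ones from the question.

```python
import socket
import struct


def build_soa_query(zone: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet asking for the zone's SOA record."""
    # Header: ID, flags (all zero: standard query, recursion not desired),
    # QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in zone.strip(".").split(".")
        if label
    ) + b"\x00"
    # QTYPE=6 (SOA), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 6, 1)


def probe(server_ip: str, zone: str, timeout: float = 2.0) -> str:
    """Send one SOA query to server_ip and describe what came back."""
    query = build_soa_query(zone)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server_ip, 53))
            reply, _ = sock.recvfrom(512)
        except socket.timeout:
            return "timeout"
        except OSError as exc:
            return f"error: {exc}"
    # A valid reply echoes our ID and has the QR (response) bit set.
    if len(reply) >= 12 and reply[:2] == query[:2] and reply[2] & 0x80:
        return f"answered (rcode={reply[3] & 0x0F})"
    return "unexpected reply"


if __name__ == "__main__":
    # Example addresses from the question; the zone name is a placeholder.
    servers = {
        "ns1.someisp.net": "12.23.34.45",
        "ns2.someisp.net": "23.34.45.56",
        "ns3.someisp.net": "34.45.56.67",
    }
    for name, ip in servers.items():
        print(f"{name} ({ip}): {probe(ip, 'example.com')}")
```

Running this from the affected customers' networks as well as from a known-good network would show whether the unreachable server is the only difference, or whether something else in their resolution path is involved.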

Chris Buxton


More information about the bind-users mailing list