DNS Round Robin

Fri Apr 14 15:24:55 UTC 2000

Kevin Darcy wrote:

> There are 2 basic DNS-based approaches to this problem, each with serious
> drawbacks:
> 
> 1) Define the name with multiple A records, set a "fixed" rrset-order on
> the master and slaves.
> DRAWBACKS: A) unless you can configure this on all of the slaves, and all
> servers which may potentially cache the name -- if the name is an Internet
> name, then forget it -- you're going to get a certain amount of
> "leakage" to your backup server, since caching servers will usually
> round-robin answers from cache. You can minimize the effect of the caching
> servers by lowering the TTL values on the records, but only at the cost of
> increasing DNS traffic; B) each client needs to be smart enough to
> fail over to the second IP in the list it gets from the nameserver. Not all
> clients -- especially older clients -- are this smart.
> 
> 2) Define the name with a single A record and then change it -- using
> Dynamic Update or some other mechanism -- when that host fails. DRAWBACK:
> as with option #1, caching is going to get in your way here, not to
> mention the fact that the slaves may take a while to get the change, even
> if they are NOTIFY-aware. Again, you can minimize the effect of caching by
> lowering TTL values and putting up with the increased traffic, but unlike
> option #1, where round-robin'ing caching servers will at least give out a
> working address first in the list 50% of the time, even during an outage,
> with option #2 when the "primary" is down, clients will get the
> non-working address 100% of the time until their local caching server
> times out the cache entry and fetches the changed A record. Depending on
> the protocol and the client software, a 50% connection failure rate may
> still allow the users to continue working -- although probably with
> degraded performance -- and may therefore be preferable to a temporary
> 100% failure rate.

Or, if you combine the two and have multiple IPs for one name, but in the
case of failure of one of the hosts you use Dynamic Update just to delete
the failed host's IP, your clients will be getting the failed IP first only
50% of the time (in the case of 2 IPs for a name, or 1/N * 100% for N IPs
with one host down), until caching servers update their entries.
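
To make the arithmetic concrete, here is a rough Python sketch. It assumes
the caching server rotates the cached record set by exactly one position per
query, which is only one possible ordering policy; the function name and the
192.0.2.x addresses are made up for illustration. It counts how often a dead
address lands first in the answer over one full rotation: with one host down
out of N that is 1/N of the answers, so a working address comes first
(N-1)/N of the time.

```python
from collections import deque
from fractions import Fraction


def first_answer_failure_rate(addresses, failed):
    """Fraction of answers whose first address is in `failed`.

    Assumes strict one-step rotation of the cached RRset per query
    (a simplifying assumption, not a statement about any real server).
    """
    rrset = deque(addresses)
    bad_first = 0
    for _ in range(len(addresses)):  # one full rotation covers every ordering
        if rrset[0] in failed:
            bad_first += 1
        rrset.rotate(-1)  # next query sees the list shifted by one
    return Fraction(bad_first, len(addresses))


# Two addresses, one host down: half of the answers lead with the dead IP.
print(first_answer_failure_rate(["192.0.2.1", "192.0.2.2"], {"192.0.2.2"}))

# Four addresses, one host down: a quarter of the answers do.
print(first_answer_failure_rate(
    ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"], {"192.0.2.4"}))
```

Real caches may order answers randomly or not at all, so treat the result as
a long-run average rather than a guarantee.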

I would also like to ask something. A lot of people here bring up the
well-known point that it's up to clients to try the other IP addresses in
the DNS response if the first one fails to respond. Now, could somebody be
more specific? Which applications implement that, which browsers, and how
common is it today? Even if we still can't rely on it, it would be useful
to know the trend. If it is becoming standard for today's Internet
applications, then this problem will exist only for a short period of
time, until everybody switches to the new applications.
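
For reference, the client behaviour I am asking about looks roughly like the
Python sketch below. It is only an illustration of the idea, not how any
particular browser or application does it; the function name
connect_with_failover and the 3-second timeout are my own choices. It walks
the address list the resolver returns and moves on to the next address
whenever a connection attempt fails.

```python
import socket


def connect_with_failover(host, port, timeout=3.0):
    """Try each address returned for `host` in order.

    Returns the first socket that connects; raises the last error if
    every address fails.
    """
    last_error = None
    for family, socktype, proto, _canon, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)  # bound the wait on an unresponsive host
        try:
            sock.connect(sockaddr)
            return sock
        except OSError as exc:
            last_error = exc  # remember the failure, try the next address
            sock.close()
    raise last_error or OSError("no addresses returned for %r" % (host,))
```

A caller would use it as `sock = connect_with_failover("www.example.com", 80)`.
Note that each dead address can cost up to `timeout` seconds before the next
one is tried, which is the "degraded performance" mentioned above. Python's
own socket.create_connection does the same kind of loop internally.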

damir