Global Availability

Brad Knowles brad.knowles at skynet.be
Thu Aug 23 15:12:48 UTC 2001


At 9:29 AM -0500 8/23/01, Joseph K Gainey wrote:

>        Client------------[Client DNS]
>                          |
>        +-----------------+                       [Office]
>        |                                         DNS(MASTER)
>        |                                           |
>        |                                           |
>        +---------------------+---------------------+
>    (t1)|                 (t1)|                 (t1)|
>        SITEA(Seattle)        SITEB(New York)       SITEC(Houston)
>        DNS1(SLAVE)           DNS2(SLAVE)           DNS3(SLAVE)
>        WWW(1)                WWW(2)                WWW(3)

	If any and all of these sites can go down, and cannot be trusted 
to remain operational, then you've got a very real problem.  You 
could have Layer Four load-balancing switches at each site that 
constantly monitor the accessibility of the other sites, and which 
are configured to hand off connections to a less loaded site once 
you reach a certain threshold.  But someone somewhere has to have a 
set of IP addresses that gets this thing started, and gets those 
connections pointed towards a hopefully operational server.


	I imagine that you could "anycast" a shared IP address for one 
or two virtual slave nameservers (actually, IP aliases set up on 
each of the slave nameservers, which are configured to do TCP 
connections via their real IP addresses), and have the IP address 
of the web server sit off in a subdomain of its own, with a very 
low TTL.  You could then have the L4 load-balancing switches 
automatically update this zone whenever they noticed that one of 
the other sites went down, so as to remove that IP address from 
the list to be handed out.
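
	To make that concrete, here is a minimal sketch of the kind of 
dynamic update the monitoring side would send, assuming the 
switches (or a script acting on their behalf) can issue RFC 2136 
dynamic updates, e.g. via BIND's nsupdate.  The zone name, master 
address, and web server address here are all hypothetical:

   # tell the hypothetical master at 192.0.2.53 to drop the A record
   # for the failed site (SITEC in this example)
   nsupdate <<EOF
   server 192.0.2.53
   zone pool.example.com
   update delete www.pool.example.com. A 203.0.113.3
   send
   EOF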

	This would at least minimize the chance that someone would 
obtain, and then cache for a long time, a non-functional IP 
address for your virtual web server.  You could then leave the 
actual load-balancing issues to the L4 load-balancing switches 
(such as the RadWARE WSD Pro+, or other related members of the 
RadWARE family).
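
	One possible shape for that low-TTL zone, assuming the same 
hypothetical pool.example.com subdomain as above; all names, 
addresses, and the 60-second TTL are illustrative only:

   $TTL 60
   $ORIGIN pool.example.com.
   @    IN  SOA  ns1.example.com. hostmaster.example.com. (
                 2001082301  ; serial
                 3600        ; refresh
                 900         ; retry
                 604800      ; expire
                 60 )        ; negative-caching TTL
        IN  NS   ns1.example.com.
        IN  NS   ns2.example.com.
   ; one A record per site; a record gets deleted via dynamic
   ; update when its site is detected as down
   www  IN  A    192.0.2.1     ; SITEA (Seattle)
   www  IN  A    198.51.100.2  ; SITEB (New York)
   www  IN  A    203.0.113.3   ; SITEC (Houston)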


	But this is getting dangerously close to DNS-based load-balancing 
schemes that I am violently opposed to.

	If you go this route, make absolutely damn bloody sure that you 
never cause DNS packet truncation, because anycasting only works 
properly with UDP.  A truncated response forces the resolver to 
retry the query over TCP, and there would be too much chance of a 
network route changing in the middle of the TCP connection setup, 
leaving you talking to two or more servers answering for the same 
IP address.  That would mean a connection reset and retry, which is 
far worse than just making things work properly with UDP.
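
	A quick way to spot-check this is to query over UDP and look 
for the "tc" (truncated) bit in the flags of the response, again 
using the hypothetical names from the sketches above:

   # if "tc" appears among the flags, the answer was truncated
   # and resolvers will retry the query over TCP
   dig @192.0.2.53 www.pool.example.com A | grep 'flags:'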

>  We have entered into our contracts with clients that we will have
>  99.99% uptime; if 1/3 (33%) of connections made in a 24hr period
>  fail, then this is not 99.99% uptime.  The problem I've run into is
>  that the only way for clients (and their DNS servers) to not see the
>  down site is to remove the down site from DNS.  Not a problem, right?
>  Except I'd rather not be called at 2am to remove something from DNS;
>  I'd rather have DNS do it itself.

	Then you should have comparable SLAs from your ISPs, and when 
you get hit with the consequences of a failure, you should be able 
to get full restitution from the ISP that caused that failure.  It's 
insane to give out guarantees of 100% (or very near 100%) 
reliability without in turn requiring that same level of reliability 
from the sources you're using to build that system.
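
	For scale: 99.99% uptime permits only 0.01% downtime, which 
works out to about 8.6 seconds per day, or roughly 52.6 minutes 
per year.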

-- 
Brad Knowles, <brad.knowles at skynet.be>
