Inbound solid-state multihoming without BGP

Fri Oct 31 22:56:11 UTC 2003

Guillaume Filion wrote:

>Hi all,
>
>We're looking for a cheap multihoming solution, and I spent some time
>trying to understand how commercial products that do link sharing
>work. (AstroCom PowerLink, f5.Net BigIP, Fat Pipe Xtreme, Radware
>Linkproof, Alteon Link Optimizer, etc). I think that I found a way of
>doing it with a PC running OpenBSD, and I'd like to have your comments on
>this.
>
>Inbound solid-state multihoming without BGP
>
>When trying to design a custom made multihoming system without having
>to use BGP, I found out that most of the 'art' of multihoming was with
>figuring out when a link is working and when it's not.
>
>I think that I found a way of doing 'smart' DNS without having to
>monitor links at the router. Unlike systems that monitor their links
>to check their validity to switch ISPs, this system is always in the
>same state; it cannot fall into an unknown or unwanted state. One
>could make an analogy with the physical world by calling this
>multihoming 'solid-state' in the sense that it has no 'moving parts'.
>The main idea behind this system is to let the network do the logic.
>
>Here's an example, let's say that we have two ISPs:
>ISP1 provides us with addresses 1.1.1.0/24
>ISP2 provides us with addresses 2.2.2.0/24
>
>At the edge of the network, we set up a firewall router with three
>ports. One connected to the LAN (interface name:lan0,
>subnet:10.10.0.0/16), one connected to ISP1 (interface name:isp1,
>subnet:1.1.1.0/24) and one connected to ISP2 (interface name:isp2,
>subnet:2.2.2.0/24).
>
>The firewall is set for 1:1 mapping (bidirectional NAT):
>-- pf.conf --
># Translation rules. pf expects these ahead of the filter rules.
>
># Web server
>binat on isp1 from 10.10.1.40 to any -> 1.1.1.40
>binat on isp2 from 10.10.2.40 to any -> 2.2.2.40
>
># Mail server
>binat on isp1 from 10.10.1.45 to any -> 1.1.1.45
>binat on isp2 from 10.10.2.45 to any -> 2.2.2.45
>
># FTP server that's only available from isp2
>binat on isp2 from 10.10.2.55 to any -> 2.2.2.55
>
># DNS server for isp1
>binat on isp1 from 10.10.1.15 to any -> 1.1.1.5
>
># DNS server for isp2
>binat on isp2 from 10.10.2.15 to any -> 2.2.2.5
>
># Normal NAT for clients (needs to be switched)
>nat on isp1 from 10.10.10.0/24 to any -> 1.1.1.10
># nat on isp2 from 10.10.10.0/24 to any -> 2.2.2.10
>
># Filter rules. pf translates before it filters, so inbound rules
># match the internal (post-binat) destination address.
>
># Web server
>pass in quick on isp1 proto tcp from any to 10.10.1.40 port 80
>pass in quick on isp2 proto tcp from any to 10.10.2.40 port 80
>
># Mail server
>pass in quick on isp1 proto tcp from any to 10.10.1.45 port 25
>pass in quick on isp2 proto tcp from any to 10.10.2.45 port 25
>
># FTP server
>pass in quick on isp2 proto tcp from any to 10.10.2.55 port 21
>
># DNS servers
>pass in quick on isp1 proto { tcp, udp } from any to 10.10.1.15 port 53
>pass in quick on isp2 proto { tcp, udp } from any to 10.10.2.15 port 53
>----
>
>Inside the LAN, the web server binds to 10.10.1.40 and 10.10.2.40 and
>serves the same content to both addresses. Similarly, the mail server
>binds to 10.10.1.45 and 10.10.2.45.
>
>We also set up two DNS servers inside the LAN; note that they can be on
>the same machine using two different IPs.
>
>One DNS server binds to 10.10.1.15 and serves zone-isp1:
>-- zone-isp1 ---
>ttl=30 secs
>www -> 1.1.1.40
>mail -> 1.1.1.45
>ftp -> 2.2.2.55
>dns-isp1 -> 1.1.1.5
>dns-isp2 -> 2.2.2.5
>----
>
>The other DNS server binds to 10.10.2.15 and serves zone-isp2:
>-- zone-isp2 --
>ttl=30 secs
>www -> 2.2.2.40
>mail -> 2.2.2.45
>ftp -> 2.2.2.55
>dns-isp1 -> 1.1.1.5
>dns-isp2 -> 2.2.2.5
>----
>
>Basically, when a client wants to connect to a server inside our
>network, it will need to reach one of our DNS servers, either dns-isp1
>(1.1.1.5) or dns-isp2 (2.2.2.5). Four things can happen:
>
>1) Both links work.
>
>The client has an equal chance of doing its request to either DNS
>server, so it will be served with the addresses of either ISP.
>
>2) The link with ISP1 is down, but the link with ISP2 is up.
>
>The client will only be able to reach the DNS server dns-isp2
>(2.2.2.5), and will be served with addresses that use the link with
>ISP2.
>
>3) The link with ISP1 is up, but the link with ISP2 is down.
>
>The client will only be able to reach the DNS server dns-isp1
>(1.1.1.5), so it will only be served with addresses that use the link
>with ISP1.
>
>4) Both links are down.
>
>Murphy's law in action. We're screwed. Call another ISP, rinse,
>repeat.
>
>It's also possible to do crude static link balancing by setting more
>than 2 DNS servers. For example, set 2 DNS servers to point to isp1
>and only 1 that points to isp2. Again, all these DNS servers can be
>hosted on the same computer. This should give roughly a 2:1 use ratio
>for ISP1 vs ISP2. Make sure that you list at least 1 DNS server for
>each ISP on your whois record.
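To put numbers on the 2:1 idea quoted above, here is a minimal Python sketch. It assumes that every resolver picks uniformly at random among the name servers it can reach, which real resolvers do not guarantee (many prefer the server with the lowest round-trip time). The server names dns-isp1-a and dns-isp1-b are made up for the example.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical name-server list: two servers hand out ISP1 addresses,
# one hands out ISP2 addresses (the 2:1 setup described above).
NAMESERVERS = {
    "dns-isp1-a": "isp1",
    "dns-isp1-b": "isp1",
    "dns-isp2": "isp2",
}

def expected_share(nameservers, reachable=None):
    """Return the expected fraction of clients steered to each ISP.

    Assumption: each resolver picks uniformly at random among the
    name servers it can reach.
    """
    if reachable is None:
        reachable = set(nameservers)
    counts = Counter(isp for ns, isp in nameservers.items() if ns in reachable)
    total = sum(counts.values())
    if total == 0:
        return {}  # no name server reachable: nobody gets an answer
    return {isp: Fraction(n, total) for isp, n in counts.items()}

shares = expected_share(NAMESERVERS)
for isp, share in sorted(shares.items()):
    print(isp, share)
```

Passing reachable={"dns-isp2"} models ISP1 being down: every client that gets an answer at all is then steered to ISP2.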
>
>Upsides:
>
>- It does the switching from the client point of view, rather than
>from the server's. That means that if ISP1 loses some peering links,
>and for example European clients can't reach ISP1, they will
>automatically use ISP2 while other clients will continue to use ISP1
>and ISP2.
>
>- It's simple and can't fall into an unwanted or unknown state; it has
>only one state.
>
>- It can benefit from pf's traffic shaping features.
>
>- It's cheap. It can be done with a low end PC running OpenBSD. For
>example, one could use a Soekris net4801 single board computer
>(http://www.soekris.com/) that costs around US$300. Someone familiar
>with pf, networking and DNS could set this up in a couple days.
>
>Downsides:
>
>- We can't set up DNS servers outside of our network. While this is
>common practice, not doing it is not a big problem. The only real
>difference is more DNS traffic inside our network. Read
>http://cr.yp.to/djbdns/third-party.html for a discussion about using
>external (third-party) DNS servers.
>
>- Some dumb clients might do caching that does not honor the DNS ttl
>value. These would have to wait until their caching mechanism has
>expired before making the switch to the 'good' ISP when the one
>they're using is down. I read that this caching generally lasts between
>5 and 15 minutes, so it's not too much of a problem.
>
>- It gives less control over the load balancing than the traditional
>ping-and-switch solution. The traditional solution can change the DNS
>values in real time to adjust the load.
>
>- I have not figured out a way of switching the traffic initiated
>from inside the LAN. It's not a problem for my situation: we could
>have an admin log into the router and change the default route, and
>our priority is making our servers available to the outside world.
>
>I'm surprised that most commercial link balancing products don't use
>this. It's much simpler and more reliable than traditional
>ping-and-switch. It reminds me of something I heard a while ago: "We all
>agree that your theory is crazy, but we can't agree whether it's crazy
>enough to work. I personally believe it's not crazy enough."
>
>I would really like to hear comments about this. We're actively
>searching for a cheap multihoming solution and I believe that this
>could be what we're looking for.
>
Using DNS for load-balancing is inherently icky. In order to defeat the 
effects of caching, you have to reduce your TTL values to levels that 
are arguably anti-social to the rest of the Internet (always remember, 
when you lower your TTL, you're not just making your own nameserver work 
harder to answer queries, you're also making other people's nameservers 
and the networks in between work harder too).
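
As a back-of-the-envelope illustration of that cost, here is a small Python sketch. It computes an upper bound on how often a single TTL-honoring cache has to re-ask for one name per day. The 30-second value is the one from the zones above; the one-hour and one-day values are only assumed comparison points.

```python
def max_queries_per_day(ttl_seconds):
    """Upper bound on daily queries for one name from one caching resolver.

    A cache that honors the TTL re-asks at most once per TTL interval,
    and only if its own clients keep requesting the name.
    """
    seconds_per_day = 24 * 60 * 60
    return seconds_per_day // ttl_seconds

# 30 s comes from the example zones; 3600 s and 86400 s are assumed
# reference points for comparison.
for ttl in (30, 3600, 86400):
    print(f"TTL {ttl:>5}s -> at most {max_queries_per_day(ttl)} queries/day")
```

A busy cache can therefore ask up to 2880 times a day at a 30-second TTL, against 24 times at one hour.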

Beyond that, though, your solution consists basically of tying each DNS 
server to a given link. How is this really an improvement? Sure, you get 
out of the business of trying to figure out whether a link is really 
down (as long as the query packets aren't getting to the DNS server on 
that particular link, or the responses can't get back, then effectively 
it's "down" in your solution), but in eliminating that source of 
uncertainty, you're introducing another class of uncertainty, i.e. where 
the link is fine, but DNS or the DNS server on that link is having a 
problem. The simplest case would be the crash -- or simply the reboot -- 
of the DNS server. Do you really want all of your network traffic 
"failing over" every time your DNS server hiccups?
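
To make that concrete, here is a rough Python sketch that enumerates the states. It reuses the www addresses from the example zones and models, as an assumption, that a name server answers only when both its link and its daemon are working. The function name www_answers is invented for the illustration.

```python
from itertools import product

# "www" addresses from the example zones: each name server hands out
# the address that belongs to its own ISP.
WWW = {"isp1": "1.1.1.40", "isp2": "2.2.2.40"}

def www_answers(link_up, dns_up):
    """Return the set of www addresses that clients can still be given.

    link_up and dns_up map an ISP name to True/False. A name server
    answers only if both its link and its daemon are up; from the
    outside, the two kinds of failure look the same.
    """
    return {addr for isp, addr in WWW.items() if link_up[isp] and dns_up[isp]}

# Walk through every combination of link and daemon state.
for l1, l2, d1, d2 in product([True, False], repeat=4):
    answers = www_answers({"isp1": l1, "isp2": l2}, {"isp1": d1, "isp2": d2})
    print(f"link1={l1!s:5} link2={l2!s:5} dns1={d1!s:5} dns2={d2!s:5} "
          f"-> {sorted(answers)}")
```

The rows with link1=True and dns1=False are the ones I mean: the ISP1 link is healthy, yet every new client is steered to 2.2.2.40.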

Lastly, for completeness, I'll point out a couple of discrete downsides 
that you neglected to mention in your list:
1) the hassle of maintaining two different versions of the same zone(s) 
on two different nameservers
2) the fact that, due to the inability to set up off-site nameservers, 
the query load on the nameserver associated with a particular link will 
approximately double whenever the other link is down.
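
On point 1, one way to reduce the hassle (a sketch only, not something the original proposal includes) is to keep a single table of hosts and derive both zone versions from it. The Python below uses the names and addresses from the example zones; the HOSTS layout and the build_zone helper are hypothetical.

```python
# One shared table: name -> address per ISP. A name with a single entry
# is reachable through that ISP only (ftp and the DNS servers here).
HOSTS = {
    "www":      {"isp1": "1.1.1.40", "isp2": "2.2.2.40"},
    "mail":     {"isp1": "1.1.1.45", "isp2": "2.2.2.45"},
    "ftp":      {"isp2": "2.2.2.55"},
    "dns-isp1": {"isp1": "1.1.1.5"},
    "dns-isp2": {"isp2": "2.2.2.5"},
}

def build_zone(isp, hosts=HOSTS):
    """Return the name -> address mapping served by the given ISP's zone."""
    zone = {}
    for name, addrs in hosts.items():
        # Prefer this ISP's address; otherwise fall back to the only
        # address the name has.
        zone[name] = addrs.get(isp) or next(iter(addrs.values()))
    return zone

for isp in ("isp1", "isp2"):
    print(f"-- zone-{isp} --")
    for name, addr in build_zone(isp).items():
        print(f"{name} -> {addr}")
```

You would still have to turn that output into real zone files and reload both servers, but at least the two versions could no longer drift apart.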

         - Kevin