KEEPING THE DOOR OPEN

Redundant Links to ISPs Help Prevent Outages

by Rik Farrow

Suppose you have done everything you can to secure your Web servers. You have them behind firewalls, the firewall (or another service) inspects all HTTP requests, you have audited all scripts and programs used by the Web servers, and have patched both the Web server and its supporting OS. If your Web server needs to be working for your organization 24/7, you still have forgotten something important.

What happens if your link to the Internet goes down? All the work you have done in securing your Web servers will not help if your customers cannot even access your Web servers.

The obvious answer is to get a second Internet connection. But getting multiple Internet connections to support HTTP servers is not an easy proposition. And packaged products that promise to make this work may not be the most appropriate choice. You would do best to use either BGP or a hosting service.

Redundancy

About the only thing you can say about your Internet link going down is that it wasn't your problem. Unless you misconfigured your router, or unplugged the wrong cable, most problems with your Internet link involve either off-premises wiring or your ISP. And you might feel that you have dealt with ISP downtime by getting a service guarantee.

But there are a couple of things that a service level agreement won't help you with. The most common example comes from the dreaded backhoe. While having a construction or road repair project tear through your cable might not seem like a security problem, it is about the most effective denial of service (DoS) attack that exists--outside of someone bombing your facility.

Speaking of DoS attacks, while it might seem that having a second link to the Internet would help you there, the truth is that the second link might get just as saturated with traffic as your first link. The problem arises because the attacker can take advantage of routing to reach your site, and both links have routes to your site. I'll get to a possible solution to DoS attacks later.

A lot of network devices, even very low end ones made by D-Link, support failover connections, so having multiple connections appears to be something simple to do. The real problem arises because the goal here is to have redundant live links to the Internet that both route to the same IP network address.

Most home users, and organizations concerned primarily with email and outgoing connections, don't have this problem. Home users, unless they pay a premium, get a different IP address assigned to them every time they connect. Since the home user is not running any servers, having the IP address change is not a big deal. But for any site that wants publicly accessible servers, permanent IP addresses are crucial. And that is where the problems with redundant links begin.

BGP

Throughout the world, each ISP has been granted blocks of IP addresses. These blocks are logically grouped together through the assignment of an AS (Autonomous System) number. Each ISP that manages its own block or blocks of IP network addresses has one (or more) AS numbers assigned to represent those blocks, and BGP (Border Gateway Protocol) uses the AS numbers to handle routing on the Internet (see http://www.spirit.com/Network/net0102.html).

When you connect to an ISP, whether you are using dialup, DSL, a cable modem, or leased lines, your end of that connection gets assigned an IP address that belongs to your ISP's AS number. IP traffic from other sites on the Internet gets routed to your ISP using its AS number; then your ISP routes your traffic to your network. If you decide to get a second Internet connection from a different ISP, that ISP will assign you one or more addresses from its own blocks of addresses, and routing to your site relies on that ISP's AS number.

Now, perhaps, you can see the problem. Each ISP has assigned you one or more IP addresses from their own netblocks. While you can certainly use NAT (Network Address Translation) so that your servers see a single external IP address, the rest of the world has IP addresses for your site that are from two different networks. For example, one ISP has assigned 192.2.200/24 to you, and the other has assigned you 204.175.22/24. Suppose the NAT device or firewall sitting in front of your Web servers uses 192.2.200.2 for one network, and 204.175.22.2 for the other network. Now the rest of the world can reach your Web servers using either of these addresses--until one link fails.
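
The addressing problem can be made concrete with a short sketch. It uses Python's standard ipaddress module and the two example blocks from the paragraph above, written out in full as 192.2.200.0/24 and 204.175.22.0/24. The labels "ISP A" and "ISP B" and both function names are placeholders of my own, not anything a real router or firewall provides.

```python
import ipaddress

# Example blocks from the text, written in full CIDR form.
# The ISP labels are placeholders for illustration only.
ISP_BLOCKS = {
    "ISP A": ipaddress.ip_network("192.2.200.0/24"),
    "ISP B": ipaddress.ip_network("204.175.22.0/24"),
}

def owning_isp(address):
    """Return the label of the ISP whose block contains the address, or None."""
    ip = ipaddress.ip_address(address)
    for label, block in ISP_BLOCKS.items():
        if ip in block:
            return label
    return None

def reachable_addresses(public_addresses, failed_isp):
    """Return the public addresses that still work when one ISP's link is down."""
    return [a for a in public_addresses if owning_isp(a) != failed_isp]

# The two external addresses of the NAT device in the example.
public = ["192.2.200.2", "204.175.22.2"]
print(owning_isp("192.2.200.2"))
print(reachable_addresses(public, "ISP A"))
```

Each public address is tied to exactly one ISP, so losing a link takes the matching address with it, even though the servers behind the NAT device are still running.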

DNS servers will happily send out multiple address records associated with a single domain name. In the example above, we could have A (address) records for both addresses associated with the same name, say www.trouble.org. And DNS servers, like BIND (Berkeley Internet Name Domain), will even change the order of the addresses each time they send them out (using a simple round-robin algorithm). What happens when one link goes down is that a Web browser has a fifty-fifty chance that it will have received the working address first. If the address associated with the failed network was sent first, the browser will time out trying to make the connection over the failed link.
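
The fifty-fifty figure can be checked with a small simulation. This is only a sketch of simple round-robin rotation, not BIND's actual code, and it assumes, as the paragraph above does, that the browser tries only the first address it receives. The function names are made up for the example.

```python
from collections import deque

def round_robin_answers(addresses, queries):
    """Yield the answer list for each query, rotating the order by one each time."""
    ring = deque(addresses)
    for _ in range(queries):
        yield list(ring)
        ring.rotate(-1)

def fraction_hitting_dead_link(addresses, dead, queries):
    """Fraction of queries whose first address is on the failed link."""
    hits = sum(
        1 for answer in round_robin_answers(addresses, queries)
        if answer[0] == dead
    )
    return hits / queries

addresses = ["192.2.200.2", "204.175.22.2"]
# Assume the link carrying 192.2.200.2 has failed.
print(fraction_hitting_dead_link(addresses, "192.2.200.2", 1000))  # prints 0.5
```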

Note that this wouldn't be a problem for mail servers, as DNS supports having multiple MX records for mail servers, and mail servers are supposed to fail over automatically.

There are products, such as ones from Radware (http://www.rad-direct.com/) and StoneSoft (http://www.stonesoft.com/) that claim to handle this addressing problem for you. What these products do is work with DNS or DDNS (Dynamic DNS) so that only the currently working IP address will appear in public DNS records. This technique only works when caching of DNS records is either totally disabled, or the cache timeout (time to live) value is set to a very short interval. Disabling DNS caching is considered an abuse of the DNS system. Instead of a browser resolving your HTTP server's domain name once, it must do so for every connection it makes, placing additional loads on your network and DNS server, as well as the rest of the Internet.
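
To get a feel for the extra load, here is a rough sketch that counts how many DNS lookups one client's resolver makes for a series of connections at different time to live values. The model is my own simplification (a single cache and evenly spaced connections), and the numbers are invented for illustration.

```python
def count_lookups(connection_times, ttl):
    """Count DNS queries sent for connections made at the given times (in seconds).

    A cached record is reused until ttl seconds after it was fetched.
    A ttl of 0 models caching being disabled.
    """
    lookups = 0
    expires_at = None
    for t in sorted(connection_times):
        if expires_at is None or t >= expires_at:
            lookups += 1
            expires_at = t + ttl
    return lookups

# Hypothetical client: one connection every 10 seconds for 10 minutes.
times = list(range(0, 600, 10))

print(count_lookups(times, 0))     # caching disabled: prints 60
print(count_lookups(times, 30))    # 30 second time to live: prints 20
print(count_lookups(times, 3600))  # one hour time to live: prints 1
```

The trade-off runs the other way as well: the longer the time to live, the longer a client may keep using the address of a failed link.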

The proper solution is to get both your own network address and an AS number assigned to your organization. While neither of these tasks is impossible, both will be difficult. You must apply to ARIN to get your own AS number (see Resources), and there are only a limited number of AS numbers available worldwide. You must also find ISPs that will cooperate with you in this venture, which essentially implies getting competitors to cooperate. Each ISP must also be willing to work with you by modifying how they exchange BGP updates with the rest of the Internet, so that your network will be reachable from both ISPs.

And finally, you must own and configure routers that will use BGP4. Note that I said routers, because if you are going to the expense and trouble of having redundant links, you will want to have redundant routers as well.

BGP4 is the most complex IP routing protocol to properly configure and maintain. And routers that can handle the entire routing table for the Internet require a lot of memory, making them more expensive to purchase. But all is not lost. Your ISPs may have some expertise in working with BGP configuration, and most likely will be happy to be paid to help you set up your routers. After all, if you screw up the configuration of your routers, it will impact your ISP as well--enlightened self interest at work.

Also, you can configure your pair of BGP-aware routers so that both accept only partial routes, instead of the entire routing map of the Internet. Then, choose the best connected of your ISPs (usually the largest ISP) to act as your default route. Having one of the two ISPs set as the default route means that your outgoing traffic will not be evenly balanced between your two links. But this would likely be true anyway, as the larger of the two ISPs will also have better connectivity, and would be more likely to be chosen as the best route even if you did have complete routes.
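
Here is a minimal sketch of how outgoing traffic gets chosen when you hold only partial routes plus a default. It is plain Python, not router configuration. The prefixes are reserved documentation ranges standing in for whatever partial routes the smaller ISP would send you, and the labels "ISP A" and "ISP B" are placeholders.

```python
import ipaddress

# Hypothetical partial routes learned from the smaller ISP ("ISP B").
# The prefixes are documentation ranges, used purely as placeholders.
PARTIAL_ROUTES = [
    (ipaddress.ip_network("198.51.100.0/24"), "ISP B"),
    (ipaddress.ip_network("203.0.113.0/24"), "ISP B"),
]

# The larger, better connected ISP acts as the default route.
DEFAULT_NEXT_HOP = "ISP A"

def next_hop(destination):
    """Pick the outgoing ISP: the longest matching prefix wins, else the default."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in PARTIAL_ROUTES if dest in net]
    if not matches:
        return DEFAULT_NEXT_HOP
    return max(matches, key=lambda m: m[0].prefixlen)[1]

print(next_hop("198.51.100.7"))  # covered by a partial route
print(next_hop("192.0.2.55"))    # not covered, so it follows the default
```

Anything without a specific route leaves through the default ISP, which is why the outgoing load ends up lopsided.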

Once you have your two links and BGP working properly, people will have no trouble reaching your network over either of the two links. If one link goes down, whether from a router failure or backhoe accident, the link between the pair of BGP neighbor routers will also go down. Shortly after this link goes down, the information that the route through that link no longer works will also begin to diffuse through the Internet. This information does not have to spread through the entire Internet to be effective, just to the nearest router that is on the path between a remote client and both of your ISPs.
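
The failover can be sketched from the point of view of one remote router that has heard about your network through both ISPs. This is a toy model: real BGP route selection has many more steps than comparing AS path lengths, and the class, its methods, the prefix, and the AS numbers (all taken from ranges reserved for documentation) are invented for the example.

```python
class RemoteRouter:
    """Toy model of a router that has learned paths to a prefix from its neighbors."""

    def __init__(self):
        self.paths = {}  # prefix -> {neighbor: AS path as a tuple}

    def announce(self, prefix, neighbor, as_path):
        """Record a path to the prefix learned from a neighbor."""
        self.paths.setdefault(prefix, {})[neighbor] = tuple(as_path)

    def withdraw(self, prefix, neighbor):
        """Forget the path learned from a neighbor, as after a link failure."""
        self.paths.get(prefix, {}).pop(neighbor, None)

    def best_path(self, prefix):
        """Return (neighbor, AS path) with the shortest AS path, or None."""
        candidates = self.paths.get(prefix, {})
        if not candidates:
            return None
        # Simplification: shortest AS path wins, neighbor name breaks ties.
        return min(candidates.items(), key=lambda kv: (len(kv[1]), kv[0]))

MY_PREFIX = "192.0.2.0/24"   # placeholder for your own network block
MY_AS = 64511                # placeholder for your own AS number

router = RemoteRouter()
router.announce(MY_PREFIX, "ISP A", [64496, MY_AS])
router.announce(MY_PREFIX, "ISP B", [64497, 64500, MY_AS])
print(router.best_path(MY_PREFIX))   # the path through ISP A

router.withdraw(MY_PREFIX, "ISP A")  # the link to ISP A has gone down
print(router.best_path(MY_PREFIX))   # traffic now follows the path through ISP B
```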

There is still an obvious pitfall in this not-so-simple scenario. What if the backhoe cuts through both of your connections to your ISPs? Don't laugh, as this has happened. Just as the telephone trunks leaving your building may all take the same physical route, it is quite likely that the leased lines or fiber connecting you to your ISPs also follow the same route, at least close to your building. If you really want to have fully redundant links, you must also research the path taken between your site and your ISPs.

DDoS

I hope by now you understand why not everybody has run out and arranged for redundant connections to the Internet. Just the organizational end is complex enough, without having to trace physical lines and learn BGP. What's worse, redundant links won't be of much use to you if your site gets targeted by a flood of packets in a DoS attack. At best, you now have more bandwidth between your site and the Internet, so more packets will be required to flood both links. Imagine that you have incited the wrath of a political group that has 2,000 members all willing to run a script or DoS tool aimed at your site. Even if all 2,000 sources of this attack have nothing but a dialup modem for an Internet connection, they can still send (with a 38400 baud connection) some 76,800,000 bits per second in your direction.
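
The arithmetic is easy to check, and it helps to set the result beside the capacity of your links. The attacker count and modem speed come from the paragraph above, with the modem rate treated as bits per second. The pair of T1 lines at 1,544,000 bits per second each is purely my assumption for the sake of comparison.

```python
ATTACKERS = 2_000
MODEM_BPS = 38_400   # modem rate from the text, treated as bits per second

T1_BPS = 1_544_000   # assumption: each of your links is a T1 (not from the text)
LINKS = 2

flood_bps = ATTACKERS * MODEM_BPS
capacity_bps = LINKS * T1_BPS

print(f"flood:    {flood_bps:,} bits per second")
print(f"capacity: {capacity_bps:,} bits per second")
print(f"ratio:    {flood_bps / capacity_bps:.1f} times the capacity of both links")
```

Under that assumption the flood is more than twenty times what both links together can carry, so the second link buys you very little.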

To make matters even worse, you now must rely on getting your two upstream ISPs to help you by filtering out as much of the flood as they can. For the attackers, life is easy. They just fire up their DoS tools and let rip.

You might want to look into an alternative solution to having highly available Web servers--find a large hosting facility. Most likely, even your own ISP already has redundant links to the Internet, an improvement over your own situation. Even better would be to use a very large hosting organization, such as Cable & Wireless (www.cw.com). Cable & Wireless has its own network (of OC192 links) that connects it to different sites, as well as to the Internet at many different locations. And if you are willing to pay a premium for Incident Response, they will also watch your network, and react to any signs of DoS attacks for you, without your having to do anything.

Redundant Internet links just might be the final step in making your Web site secure and always available. But the complexities involved in making them work should make you want to consider the alternative of using a large hosting site.

Resources:

An Open Source (Linux) solution for load balancing over two links and running email servers:
http://www.emailxl.com/~cross/redundant.htm

ARIN information about registering an AS number: http://www.arin.net/registration/asn/index.html

Cisco guide to setting up BGP with HSRP: http://www.cisco.com/warp/public/459/hsrp_bgp.html