NETWORK DEFENSE

Routing Instability

by Rik Farrow

Border Gateway Protocol, the Routing Glue of the Internet, Lacks Strong Security

People have been talking about the Internet failing catastrophically since the Internet became a popular topic in 1995. And, so far, these predictions have failed to materialize. True, there have been some localized disruptions, like the DDoS attacks against e-commerce sites in 2000, and DNS problems, similar to the ones that made Microsoft-controlled Web sites, like msnbc.com, disappear from the Internet in 2001.

But when network engineers gather to discuss real Internet instability, Border Gateway Protocol (BGP) always comes up. BGP version 4 (BGP-4) distributes routing information between all the routers that interconnect different network providers, and is crucial to the moment-to-moment functioning of the Internet. And BGP-4 lacks a security design that prevents serious Internet instability.

BGP-4 does have security features, and these features have proven critical to its proper functioning. It is just that these features can be ignored, or accidentally or intentionally misconfigured, with serious consequences, as happened in April of 1997.

Routing

TCP/IP is a packet-switched protocol. That means that each time a packet arrives at a router, a decision must be made about the next destination of that packet. The next destination must be another router directly connected to the current router, and this process continues until the packet either reaches its final destination or is dropped. Note that routing decisions only control the next hop, that is, the next router on the path.

Routers make their decisions about where to send a packet based on routing tables. The router looks up the destination address of the packet in its routing table and forwards the packet to the next router, the next hop on the packet's journey. The router may then cache this information, so that subsequent packets in the same stream will learn the routing information from the cache, instead of a routing table lookup, which is slower. But the routing table is the key, and setting up the routing table correctly is critical.

Organizations use Interior Gateway Protocols (IGP) like OSPF (Open Shortest Path First) to set up and maintain consistent routing table information on interior networks. Some router manufacturers make this very easy to do, as a newly installed router will probe its interfaces for other routers, and begin exchanging routing information with them. Routing tables can also be statically defined, that is, entries made for all the possible routes in a configuration file. In either case, for network addresses not found in the organization's network, a default route that generally points to the Internet is used.
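The lookup just described can be sketched in a few lines of Python. This is only an illustration, not how any real router is built: the table contents, the next-hop names, and the function name next_hop are invented for the example. It assumes the usual rule that the most specific (longest) matching prefix wins, with 0.0.0.0/0 standing in for the default route.

```python
import ipaddress

# Hypothetical routing table: (destination prefix, next hop).
# The addresses and router names are illustrative only.
ROUTING_TABLE = [
    (ipaddress.ip_network("10.1.0.0/16"), "interior-router-a"),
    (ipaddress.ip_network("10.1.2.0/24"), "interior-router-b"),
    (ipaddress.ip_network("0.0.0.0/0"), "internet-gateway"),  # default route
]


def next_hop(destination: str) -> str:
    """Return the next hop for the most specific prefix containing destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTING_TABLE if addr in net]
    # The longest prefix (largest prefixlen) is the most specific match.
    _, hop = max(matches, key=lambda entry: entry[0].prefixlen)
    return hop


print(next_hop("10.1.2.7"))      # interior-router-b
print(next_hop("10.1.9.9"))      # interior-router-a
print(next_hop("198.51.100.4"))  # internet-gateway
```

Only the last address falls outside the organization's own prefixes, so it is the only one handed to the default route.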

Routers that belong to the large network service providers (NSPs), like Worldcom and Sprint, obviously cannot rely on default routes. They are the default route, and must have routing information for all network addresses. These routers exchange information using Exterior Gateway Protocols (EGP), with BGP-4 being the standard. Under classful routing (dividing network addresses into several classes using 8, 16, or 24 bits of the IP address), there would be over two million network addresses to track. Classless Inter-Domain Routing (CIDR) replaced that practice to reduce the size of routing tables. With CIDR, blocks of network addresses are aggregated by assigning each block to a single large NSP or organization.
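A little arithmetic shows where the "over two million" figure comes from and what aggregation buys. The sketch below counts raw classful network numbers (ignoring reserved values) and then uses Python's standard ipaddress module to collapse eight adjacent 24-bit networks into one CIDR block. The 192.168.x.0 addresses are chosen purely for illustration.

```python
import ipaddress

# Raw classful network counts. The class is encoded in the leading bits,
# leaving 7, 14, and 21 bits of network number for classes A, B, and C.
class_a = 2 ** 7
class_b = 2 ** 14
class_c = 2 ** 21
print(class_a + class_b + class_c)  # 2113664

# Eight adjacent 24-bit networks (illustrative addresses)...
networks = [ipaddress.ip_network(f"192.168.{i}.0/24") for i in range(8)]

# ...collapse into a single CIDR block, so one table entry replaces eight.
aggregated = list(ipaddress.collapse_addresses(networks))
print(aggregated)  # [IPv4Network('192.168.0.0/21')]
```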

The CIDR blocks of network addresses then get collected into an even smaller number of chunks. Every large entity in the Internet gets represented by an Autonomous System (AS) number, where an AS is a single organization, consisting possibly of many CIDR blocks, that controls all the internal routing for that organization. Think of an AS as a way of abstracting large chunks of the Internet for routing purposes. Instead of concerning itself with routing inside an AS, BGP-4 describes routes to networks in terms of the various ASes that traffic must transit.

Using AS numbers helps by making the routes exchanged shorter, as well as making routing loops easier to detect. Each time a router receives a BGP-4 update, it can look for its own AS number in the AS_PATH (list of AS numbers in this route), and detect if there is a loop. And when a router sends out updates, the information is a network address prefix and the list of AS numbers. The first AS number on the list represents the next hop, and will be a router on the edge of that AS.
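The loop check is simple enough to show directly. In this sketch the AS numbers are hypothetical values from the private-use range, and the names has_loop and announce are made up for the example. The AS_PATH is modeled as a plain Python list with the nearest AS first.

```python
from typing import List

MY_AS = 64512  # hypothetical local AS number (private-use range)


def has_loop(as_path: List[int], my_as: int = MY_AS) -> bool:
    """A route whose AS_PATH already contains our own AS has looped back."""
    return my_as in as_path


def announce(as_path: List[int], my_as: int = MY_AS) -> List[int]:
    """Put our AS at the front of the path before passing the route on."""
    return [my_as] + as_path


received = [64513, 64514]
print(has_loop(received))                # False
print(announce(received))                # [64512, 64513, 64514]
print(has_loop([64513, 64512, 64514]))   # True
```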

BGP-4

Routers that "speak" BGP-4 use TCP connections to exchange information with neighboring routers in other networks that are also using BGP-4. These neighbor routers are called peers, and the TCP connections serve as a mechanism to show that each peer is still up and reachable. When a peering connection gets broken, for whatever reason, each end of the connection withdraws all routes that go through the now-unreachable neighbor. Also, whenever a BGP speaker learns of additional or withdrawn routes, it may also share these with its peers.
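The withdrawal behavior can be modeled with a toy class. This is a sketch under heavy simplification: the class name BgpSpeaker, its methods, and the peer names are all invented here, and a real implementation tracks far more state (multiple paths per prefix, timers, attributes).

```python
class BgpSpeaker:
    """Toy model of a BGP speaker that remembers who announced each prefix."""

    def __init__(self):
        # Maps each network prefix to the peer that announced it.
        self.routes = {}

    def learn(self, peer, prefix):
        """Record a route announced by a peer over its TCP session."""
        self.routes[prefix] = peer

    def peer_down(self, peer):
        """Drop every route learned from a peer whose session has broken.

        Returns the withdrawn prefixes, which would then be passed on to
        the remaining peers as withdrawals.
        """
        withdrawn = sorted(p for p, source in self.routes.items() if source == peer)
        for prefix in withdrawn:
            del self.routes[prefix]
        return withdrawn


speaker = BgpSpeaker()
speaker.learn("peer-a", "10.1.0.0/16")
speaker.learn("peer-a", "10.2.0.0/16")
speaker.learn("peer-b", "172.16.0.0/12")
print(speaker.peer_down("peer-a"))  # ['10.1.0.0/16', '10.2.0.0/16']
print(speaker.routes)               # {'172.16.0.0/12': 'peer-b'}
```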

BGP-4 implementations support several security mechanisms. Neighbors may authenticate each other using a login and password, for example. More common is the use of MD5 digital signatures, where each packet in a peering connection carries an MD5 signature computed over its header information and data. Each neighbor must be configured with a secret, and that secret is included in the MD5 signature calculation, making it unlikely that an attacker could modify or inject information into a BGP peering exchange.
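The idea behind the MD5 mechanism can be sketched with Python's hashlib. This is a simplification and not a working implementation of RFC 2385 (listed under Resources), which digests the TCP pseudo-header, the TCP header, the segment data, and then the shared secret. Here the header and data are stand-in byte strings, and the function names and the secret are made up.

```python
import hashlib
import hmac


def sign_segment(header: bytes, data: bytes, secret: bytes) -> bytes:
    """Digest the header and data together with the shared secret."""
    return hashlib.md5(header + data + secret).digest()


def verify_segment(header: bytes, data: bytes, secret: bytes, received: bytes) -> bool:
    """Recompute the digest locally and compare it with the one received."""
    expected = sign_segment(header, data, secret)
    return hmac.compare_digest(expected, received)


secret = b"hypothetical-shared-secret"
header = b"stand-in TCP header"
data = b"stand-in BGP update"

digest = sign_segment(header, data, secret)
print(verify_segment(header, data, secret, digest))             # True
print(verify_segment(header, b"forged update", secret, digest)) # False
```

An attacker who does not know the secret cannot produce a matching digest for altered data, which is the whole point. It also shows the weakness discussed next: anyone who can read the secret out of a router's configuration can.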

The trouble with these authentication/integrity mechanisms is that they require coordination between the organizations managing neighboring routers. Also, the secrets these mechanisms rely on are part of the configuration of the routers themselves, and anyone who can view the configuration can learn the secrets. Finally, if an attacker breaks into a router, any routing information injected by the attacker goes over the authenticated connection. So even if organizations use the optional authentication, someone who can control the router, or control another router that is trusted by this router, can potentially inject misleading routing information.

Filtering

The final, and most important, mechanism for protecting the integrity of BGP is policy-based filtering. Policy-based filtering is a complex topic, but it boils down to this: which routing information will be accepted from (ingress) or sent to (egress) a particular neighbor. Filtering is the crucial issue in proper operation of BGP today.

Policy-based filtering began not as a security mechanism, but as a way to control the routes an AS would advertise or accept. For example, imagine that your organization is international in scope, and has its own leased lines for carrying internal traffic. Also, your organization has connections to the Internet at several locations around the world, so your private leased line doesn't wind up carrying traffic destined for the Internet to a single access point. In this case, your exterior routers would be running BGP-4, so your internal network has routing information about the best way to reach different networks within the Internet.

But what about the rest of the world? Do you want just anybody on the Internet to be able to use your leased lines? Policy-based filtering allows you to block the advertisement of routes that would transit your own AS. Your AS becomes a stub network--one that participates in the Internet, but does not carry traffic destined for ASes other than your own.
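A stub policy amounts to one rule on the egress side: advertise only what you originate. The sketch below is not router configuration syntax. The data layout, the function name stub_egress_filter, and the AS numbers are assumptions made for illustration, with an empty AS path marking a prefix that originates inside your own AS.

```python
# Each candidate route is (prefix, as_path). An empty as_path marks a prefix
# that originates inside our own AS; anything else was learned from a peer.
def stub_egress_filter(routes):
    """Keep only locally originated routes, so no one can transit our AS."""
    return [(prefix, as_path) for prefix, as_path in routes if not as_path]


candidates = [
    ("10.1.0.0/16", []),                   # our own network
    ("172.16.0.0/12", [64513]),            # learned from a neighbor
    ("192.168.0.0/24", [64514, 64515]),    # learned two ASes away
]
print(stub_egress_filter(candidates))  # [('10.1.0.0/16', [])]
```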

Policy-based filtering is optional. Your organization, for example, might have a single connection to the Internet, and thus it appears that figuring out how to use the filtering is a lot of work for not much gain. But when policy-based filtering is not done properly, amazing and embarrassing things will happen.

On Friday morning, April 25 of 1997, a small ISP in Florida made a mistake in the configuration of the router that joined their small network to Sprint. This ISP, known by their AS number, 7007, allowed all the routes learned from Sprint using BGP to be exported back to Sprint as their own routes. This actually is easy to do, as BGP implementations can take routes from IGP and convert them into EGP routes. In this case, the IGP converted CIDR routes into classful routes.

The Sprint BGP speaker was not filtering properly either, and began sending out updates that added AS7007 as the correct route for a portion of every CIDR block (essentially, the first class C, or 24-bit-long, network prefix).
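Why would a bogus route for a small slice of a block matter? Because routers prefer the most specific matching prefix. The sketch below assumes that longest-prefix rule and uses two made-up CIDR blocks. It adds the first 24-bit prefix of each block with AS7007 as the origin, then shows which origin wins for addresses inside and outside that slice. The label "legitimate AS" and the function origin_for are placeholders.

```python
import ipaddress

# Two illustrative CIDR blocks with their rightful origin.
cidr_blocks = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
]
table = {block: "legitimate AS" for block in cidr_blocks}

# The bad announcements: the first 24-bit prefix of every block.
for block in cidr_blocks:
    first_24 = next(block.subnets(new_prefix=24))
    table[first_24] = "AS7007"


def origin_for(destination: str) -> str:
    """Pick the origin of the most specific prefix containing destination."""
    addr = ipaddress.ip_address(destination)
    best = max((net for net in table if addr in net), key=lambda net: net.prefixlen)
    return table[best]


print(len(cidr_blocks), len(table))  # 2 4
print(origin_for("10.0.0.5"))        # AS7007
print(origin_for("10.9.0.5"))        # legitimate AS
```

One extra route per block is enough to double the table, and traffic for the first slice of each block heads for the wrong place.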

This misinformation first spread through Sprint's network, then to neighboring NSPs, including ANS, MCI, UUNet, and other NSPs. Many routers crashed, as their routing tables suddenly doubled in size (an additional route added for each CIDR block), and the routing instability spread throughout the Internet. Remember that when a router crashes, it drops its BGP connection with its peer, who then sends out an update withdrawing all the routes previously announced by the crashed router. It took over an hour for the Internet to gradually become stable again. Network managers added filters that blocked routes that included AS7007, fixing the problem until the ISP solved their local problem and Sprint reconnected them to the Internet.
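The emergency fix, rejecting any route whose path includes AS7007, is an ingress filter on the AS_PATH. A minimal sketch follows. The update format and the other AS numbers are hypothetical, and real routers express this in their own configuration language rather than Python.

```python
BLOCKED_AS = 7007


def accept(update) -> bool:
    """Reject any update whose AS_PATH includes the blocked AS."""
    _prefix, as_path = update
    return BLOCKED_AS not in as_path


updates = [
    ("10.0.0.0/24", [64513, 7007]),   # bogus route via AS7007
    ("10.0.0.0/8", [64513, 64514]),   # unaffected route
]
print([prefix for prefix, path in updates if accept((prefix, path))])
# ['10.0.0.0/8']
```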

And this was just an accident, caused by a misconfiguration that redistributed routing information learned from BGP, into an IGP, then back into BGP.

Lessons Learned

NSPs have learned to be very careful with their BGP route filtering. There are also tools that smaller organizations can use to help them create filters. Routing Registries, like RADB (see Resources), can be used with free tools like the RAToolSet to build BGP filters.

But does filtering really solve the problem? If peers are not checking digital signatures on the data exchanged, an attacker could potentially inject routes--routes that might be trusted because of their source. A simpler attack would be to reset the peering connection, forcing each peer to withdraw all the routes learned from its neighbor. Either of these attacks would cause, at least, local disruption.

More coordinated attacks would have a much greater range of effect. If something like a DDoS attack were used, that is, a large number of cooperating agents that simultaneously subvert routers and begin announcing spoofed routes, an attack would cause widespread problems. And difficult ones to fix, as the source of the problems would appear to be the routers themselves. Also, just as in the Internet Worm of 1988, the Internet itself would be unusable, leaving each network manager on his or her own to discover the source of the attack.

The defense against a distributed attack is for every BGP-speaking router to be secure against remote attacks--a tall order. At the very least, routers must be administered using strong authentication and encrypted links (SSH) whenever possible.

There have been other proposed solutions, such as including a digital signature with every AS included in a routing path. This solution, proposed by BBN, relies on a non-existent PKI, and requires participating routers to verify signatures on every AS included in a BGP update, something that would be an enormous drain on the router CPU. And this solution would still not solve the problem of a distributed attack that subverted routers--the AS numbers would still come with signatures.

In a paper presented in 1997, Craig Labovitz, of Arbor Networks, and others analyzed BGP updates at Internet exchange points. What they found was evidence of at least an order of magnitude more BGP updates than required, including many duplicates. More recently, Labovitz claims that the problems have not gotten better. BGP-4 makes the Internet work. But it is still not perfect, and its weaknesses are a problem that someone will someday take serious advantage of.

Resources:

Cisco's Guide for ISPs:
http://cio.cisco.com/warp/public/459/13.html

BGP-4 RFCs, 1771 (protocol definition) and 1772 (usage and policies): http://www.ietf.org/rfc/rfc1771.txt

Protection of BGP Sessions via the TCP MD5 Signature Option: http://www.ietf.org/rfc/rfc2385.txt

RFC draft for improvements to BGP-4, including some security issues:
http://www.ietf.org/internet-drafts/draft-iab-bgparch-02.txt

The largest collection of routing registrations: http://www.radb.net/

Tool for creating configuration file using routing registration databases: http://www.isi.edu/ra/RAToolSet/

Open source routing software that supports BGP: MRTD, www.mrtd.net and Zebra, www.zebra.org

Labovitz, et al, Internet Routing Instability: http://www.comsoc.org/confs/ieee-infocom/1999/papers/