Networks Lecture 17

Inter-A.S. Routing

Aggregation and Fragmentation

Each A.S. advertises which subnets may be reached via the A.S. CIDR is used to aggregate related addresses into a common prefix. For example, say that the following subnets may be reached via an A.S.:

CIDR           binary
-----------------------------------------
138.16.64/24   10001010 00010000 01000000
138.16.65/24   10001010 00010000 01000001
138.16.66/24   10001010 00010000 01000010
138.16.67/24   10001010 00010000 01000011
138.16.68/24   10001010 00010000 01000100
138.16.69/24   10001010 00010000 01000101
138.16.70/24   10001010 00010000 01000110
138.16.71/24   10001010 00010000 01000111

All of these prefixes have the first 21 bits, 10001010 00010000 01000, in common. In CIDR notation, this shared prefix is written 138.16.64/21. So, the A.S. can advertise a route to all of these subnets using a single address prefix.
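As a quick check of the aggregation above, here is a minimal sketch using Python's standard ipaddress module. The module requires all four octets, so 138.16.64/24 is written 138.16.64.0/24; the variable names are illustrative only.

```python
import ipaddress

# The eight /24 subnets from the table (third octet 64 through 71).
subnets = [ipaddress.ip_network(f"138.16.{i}.0/24") for i in range(64, 72)]

# collapse_addresses merges adjacent networks into the fewest covering prefixes.
aggregate = list(ipaddress.collapse_addresses(subnets))

print(aggregate)  # [IPv4Network('138.16.64.0/21')]
```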

Assume the A.S. is an ISP, and the subnets attached to it are customers of the ISP. Say that the customer using 138.16.66/24 takes their business to another ISP, so that the 138.16.66/24 subnet is no longer directly connected to the original A.S. Can the A.S. continue advertising 138.16.64/21?

It can if it is still willing to forward traffic to 138.16.66/24. A route advertisement is a promise that the A.S. is willing to forward traffic to all hosts matching the advertised address prefix.

If the A.S. is no longer willing to forward traffic to 138.16.66/24, then it must change its routing advertisements so that they don't include this prefix. This causes fragmentation of the address space: the original A.S. no longer directly connects all of the subnets matching 138.16.64/21. In general, when a subnet is removed from an address prefix, how bad can the fragmentation be?

We can think of an address space as being a binary tree, where each edge represents either 0 or 1, and each node is a specific prefix. The address space described by the example above can be viewed this way:

Each of the original subnets is a leaf. An address prefix can be advertised only if the A.S. is willing to forward to every host falling within the prefix. This is equivalent to requiring that only perfect binary trees are allowed as advertised address prefixes. If we remove one of the leaves (indicated by the red leaf below), then we no longer have a perfect binary tree:

Each colored node can no longer be advertised as an address prefix, because its subtree is no longer perfect. However, at each level of the tree from the root down to the removed leaf, the child that is not on that path is still the root of a perfect binary tree. Thus, when one subnet is removed from an address prefix containing 2^n subnets, n prefixes (one per level of the tree) are needed to advertise routes to the remaining subnets, which is (n-1) more than the single prefix advertised originally.
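The fragmentation for the running example (8 = 2^3 subnets, with 138.16.66/24 removed) can be worked out with the standard ipaddress module. This is only a sketch: the four-octet spellings and variable names are assumptions made for illustration.

```python
import ipaddress

original = ipaddress.ip_network("138.16.64.0/21")   # covers 2^3 = 8 subnets of size /24
departed = ipaddress.ip_network("138.16.66.0/24")   # the customer that left

# address_exclude yields the largest prefixes covering everything except `departed`.
remaining = sorted(original.address_exclude(departed))

for prefix in remaining:
    print(prefix)
# 138.16.64.0/23
# 138.16.67.0/24
# 138.16.68.0/22
```

Three prefixes are left where one was advertised before, i.e. two additional prefixes for n = 3.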

BGP

BGP = Border Gateway Protocol

This is the Internet's inter-A.S. routing protocol. It determines which sequence of autonomous systems a datagram should be forwarded through in order to reach a particular destination. An intra-A.S. routing protocol is assumed to exist for forwarding a datagram through each A.S.

Each BGP session between two routers is a long-lived TCP connection. One session exists between each pair of connected routers in different autonomous systems (an external BGP, or eBGP, session), and between each pair of routers within an A.S. (an internal BGP, or iBGP, session). eBGP is used to advertise address prefixes that are reachable by routing datagrams to or through the A.S. iBGP is used to distribute received route advertisements to all routers within the A.S.

For example: A.S. 3 might advertise a new prefix P. A.S. 3's gateway router, 3b, would convey this advertisement to its peer 1c in A.S. 1. From there it could be conveyed to 1b: depending on how BGP is configured, 1b could advertise the route to 2a, indicating that A.S. 1 is willing to forward datagrams destined for hosts that are part of P.

Each A.S. has a unique A.S. number assigned by a central authority. BGP messages consist of collections of advertisements. An advertisement is a prefix for some reachable network along with a collection of attributes. Two important attributes are:

  1. AS-PATH: the sequence of autonomous systems traversed to reach the destination network
  2. NEXT-HOP: the router to which traffic for the prefix should be sent. An A.S. may have multiple points of connection to the A.S. to which it is advertising the route, and the NEXT-HOP attribute specifies which gateway router should be used.

BGP gives the administrators of an A.S. many ways to configure which routes are used. By default, BGP picks the route with the shortest AS-PATH attribute. Because each advertisement carries the full path of autonomous systems rather than just a distance, BGP is a path-vector protocol, a relative of distance-vector protocols. Other factors affecting the choice of route are how close the NEXT-HOP gateway is, and local preference, which can reflect arbitrary criteria defined by the administrators.
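The interaction of these factors can be sketched in a few lines of Python. This is a simplified illustration, not the full BGP decision process: the Route class, the igp_cost field (standing in for "how close the NEXT-HOP gateway is"), the default local preference of 100, and the A.S. numbers are all assumptions made for the example. The ordering used here (local preference first, then AS-PATH length, then closeness of the next hop) follows common BGP practice; real implementations apply additional rules.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple        # A.S. numbers traversed, nearest first
    next_hop: str         # gateway router to use
    local_pref: int = 100 # administrator-assigned; higher is preferred (assumed default)
    igp_cost: int = 0     # hypothetical intra-A.S. cost to reach next_hop

def best_route(routes):
    """Pick one route: highest local_pref, then shortest AS-PATH, then nearest next hop."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path), r.igp_cost))

short = Route("138.16.64.0/21", (64497, 64499), next_hop="1c")
long_ = Route("138.16.64.0/21", (64498, 64500, 64499), next_hop="1b")

# With equal local preference, the shorter AS-PATH wins.
default_choice = best_route([short, long_])

# Policy override: raising local preference makes the longer path win.
preferred = Route(long_.prefix, long_.as_path, long_.next_hop, local_pref=200)
policy_choice = best_route([short, preferred])

print(default_choice.next_hop, policy_choice.next_hop)  # 1c 1b
```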

In many respects, the problem of inter-A.S. routing is not simply finding the shortest path over which to send datagrams, but finding a path that is compatible with the policy decisions made by the administrators of the A.S. and the agreements made with the operators of other autonomous systems.

A common principle in defining routing policy is that an intermediate A.S. B should not carry traffic originating in network A and destined for network C if A and C are directly connected.

Broadcast routing

Most LAN technologies (such as Ethernet) can broadcast a frame to every node attached to the LAN. However, frames broadcast at the link layer are generally not forwarded by routers: you wouldn't expect an Ethernet frame broadcast from your PC to be delivered to every computer on the Internet!

However, there are some situations in which a packet should be delivered to every node in a network, even when there are routers. For example, the routers within an A.S. might be using a link-state protocol like OSPF that requires periodic broadcasts of routing information. To the extent possible, we would like these broadcasts to be done efficiently, meaning that as few packets as possible are sent on each network link. Ideally, a broadcast packet is sent on a particular link only once, or not at all. Broadcast routing is the general problem of delivering a broadcast packet to every node in a network.

A very simple broadcast routing algorithm is n-way unicast. The sender of a broadcast packet simply sends one copy of the packet to every recipient using normal unicast routing. This approach has one nice property: it does not require any support by routers.

Example: say that node S wants to send a broadcast packet to 100 recipients. The recipients are reached through a first-hop router, which has 4 outgoing links.

The blue lines show the direction the broadcast packets travel in, and how many packets are sent on each link. Because n-way unicast generates a separate copy of the packet for each recipient, the first link carries 100 copies, and then the outgoing links of the router must carry multiple copies (here shown as 25 copies on each link). This is clearly not an ideal situation: we are wasting network resources by sending multiple, identical packets on each link.

The preferred situation is to send exactly one copy of the broadcast packet on each link:

Uncontrolled flooding

Each router forwards a broadcast packet on every outgoing link except the link on which the packet was received. This works very well for any network that does not contain cycles. However, if the network does contain cycles, a broadcast storm is created in which packets multiply without limit.

Controlled flooding

Like uncontrolled flooding, except that each broadcast packet carries a sequence number that, together with the sender's address, identifies it uniquely. Each router drops packets it has already seen. This requires book-keeping at each router.
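A minimal simulation of controlled flooding, assuming a small hypothetical topology (a triangle A-B-C with a tail C-D) represented as an adjacency dictionary. The function name and the use of (sender, sequence number) pairs as packet identifiers are choices made for this sketch.

```python
from collections import deque

def controlled_flood(graph, source, seq=1):
    """Flood one packet from `source`; return copies sent per directed link."""
    packet_id = (source, seq)
    seen = {node: set() for node in graph}   # per-router book-keeping
    seen[source].add(packet_id)
    copies = {}
    in_flight = deque((source, nbr) for nbr in graph[source])
    while in_flight:
        sender, receiver = in_flight.popleft()
        copies[(sender, receiver)] = copies.get((sender, receiver), 0) + 1
        if packet_id in seen[receiver]:
            continue                         # duplicate: drop it
        seen[receiver].add(packet_id)
        for nbr in graph[receiver]:
            if nbr != sender:                # never send back on the arrival link
                in_flight.append((receiver, nbr))
    return copies

# Hypothetical topology containing a cycle (A-B-C) plus a tail (C-D).
graph = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
copies = controlled_flood(graph, "A")
print(sum(copies.values()))  # 5
```

Despite the cycle, the flood terminates: each router forwards the packet at most once, so no directed link carries more than one copy.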

Reverse Path Forwarding

A router drops any broadcast packet that does not arrive on the link that lies on its own shortest unicast path back to the sender. Otherwise, it forwards the packet on every other outgoing link. This works fairly well because the unicast paths to the sender of the broadcast form a spanning tree of the network. Some unnecessary forwarding takes place, but each link will carry only 1 or 2 copies of the broadcast packet.
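Reverse path forwarding can be sketched the same way. The example below assumes every link has unit cost, so that a breadth-first search gives each router's shortest unicast path to the sender; the topology and function names are hypothetical.

```python
from collections import deque

def shortest_path_parents(graph, source):
    """For each node, the neighbor on its shortest (hop-count) path to `source`."""
    parent = {source: None}
    frontier = deque([source])
    while frontier:
        node = frontier.popleft()
        for nbr in graph[node]:
            if nbr not in parent:
                parent[nbr] = node
                frontier.append(nbr)
    return parent

def reverse_path_forward(graph, source):
    """Return (accepted, dropped) lists of directed link transmissions."""
    parent = shortest_path_parents(graph, source)
    accepted, dropped = [], []
    in_flight = deque((source, nbr) for nbr in graph[source])
    while in_flight:
        sender, receiver = in_flight.popleft()
        if parent[receiver] != sender:
            dropped.append((sender, receiver))   # not on the reverse unicast path
            continue
        accepted.append((sender, receiver))
        for nbr in graph[receiver]:
            if nbr != sender:
                in_flight.append((receiver, nbr))
    return accepted, dropped

# Hypothetical topology: triangle A-B-C with a tail C-D; A is the sender.
graph = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
accepted, dropped = reverse_path_forward(graph, "A")
print(accepted)  # [('A', 'B'), ('A', 'C'), ('C', 'D')]
print(dropped)   # [('B', 'C'), ('C', 'B')]
```

The accepted transmissions trace out the spanning tree rooted at A. The link between B and C is not on that tree, and it carries two wasted copies, one in each direction.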

The blue lines identify packets sent along the reverse unicast paths, and the red lines identify packets not sent along the reverse unicast paths.