Routing

Outline
• Algorithms
• Scalability
Overview
• Forwarding vs Routing
– forwarding: to select an output port based on
destination address and routing table
– routing: process by which routing table is built
• Network as a Graph
[Figure: example network drawn as a graph with nodes A through F and weighted edges (costs between 1 and 9)]
• Problem: Find lowest cost path between two nodes
• Factors
– static: topology
– dynamic: load
Distance Vector
• Each node maintains a set of triples
– (Destination, Cost, NextHop)
• Directly connected neighbors exchange updates
– periodically (on the order of several seconds)
– whenever table changes (called triggered update)
• Each update is a list of pairs:
– (Destination, Cost)
• Update local table on receiving a “better” route
– smaller cost
– or the update came from the current next hop
• Refresh existing routes; delete if they time out
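The update rule above can be sketched in Python. This is a minimal illustration under stated assumptions, not any particular protocol's implementation: the function name `merge_update`, the dict layout (destination mapped to a `(cost, next_hop)` pair), the default link cost of 1, and the use of 16 as infinity are all choices made for the example. Route timeouts are not modeled.

```python
INFINITY = 16  # assumed "unreachable" cost for this sketch


def merge_update(table, neighbor, update, link_cost=1):
    """Apply one neighbor's (Destination, Cost) pairs to the local table.

    `table` maps destination -> (cost, next_hop).
    Returns True if the table changed, which would call for a triggered update.
    """
    changed = False
    for dest, cost in update:
        new_cost = min(cost + link_cost, INFINITY)
        current = table.get(dest)
        if current is None:
            if new_cost < INFINITY:  # a new, reachable destination
                table[dest] = (new_cost, neighbor)
                changed = True
        elif new_cost < current[0] or (current[1] == neighbor and new_cost != current[0]):
            # Either a smaller cost, or news from the current next hop,
            # which must be believed even when the cost gets worse.
            table[dest] = (new_cost, neighbor)
            changed = True
    return changed
```

With a table entry `G: (3, "A")`, an update from A reporting G at infinity replaces the entry with `(16, "A")`, and a later update from C reporting G at cost 2 replaces it with `(3, "C")`.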
Example
[Figure: example network with nodes A through G]
Routing Table at B
Destination  Cost  NextHop
A            1     A
C            1     C
D            2     C
E            2     A
F            2     A
G            3     A
Routing Loops
• Example 1
– F detects that link to G has failed
– F sets distance to G to infinity and sends update to A
– A sets distance to G to infinity since it uses F to reach G
– A receives periodic update from C with 2-hop path to G
– A sets distance to G to 3 and sends update to F
– F decides it can reach G in 4 hops via A
• Example 2
– link from A to E fails
– A advertises distance of infinity to E
– B and C advertise a distance of 2 to E
– B decides it can reach E in 3 hops; advertises this to A
– A decides it can reach E in 4 hops; advertises this to C
– C decides that it can reach E in 5 hops…
Loop-Breaking Heuristics
• Set infinity to 16
– cannot loop forever
• Split horizon
– B does not send update (D, h, A) to A, since it learned the route from A
– prevents two-node loops
• Split horizon with poison reverse
– B sends update (D, inf., A) to A, since it learned the route from A
– prevents two-node loops
• Does not scale to large networks
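A small Python sketch of how a node might build the update it sends to one neighbor under split horizon, with or without poison reverse. The function name `build_update`, the table layout (destination mapped to `(cost, next_hop)`), and the constant `INFINITY = 16` are assumptions for illustration only.

```python
INFINITY = 16  # assumed value of "infinity" for this sketch


def build_update(table, neighbor, poison_reverse=False):
    """Build the list of (Destination, Cost) pairs to send to `neighbor`.

    `table` maps destination -> (cost, next_hop).
    """
    update = []
    for dest, (cost, next_hop) in sorted(table.items()):
        if next_hop == neighbor:
            # This route was learned from `neighbor` itself.
            if poison_reverse:
                update.append((dest, INFINITY))  # advertise it back as unreachable
            # Plain split horizon: omit the route entirely.
            continue
        update.append((dest, cost))
    return update
```

If B holds `{"C": (1, "C"), "D": (2, "A")}`, the update it sends to A contains only C under split horizon, and contains D at cost 16 as well under poison reverse.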
Link State
• Strategy
– send to all nodes (not just neighbors)
information about directly connected links (not
entire routing table)
• Link State Packet (LSP)
– id of the node that created the LSP
– cost of link to each directly connected neighbor
– sequence number (SEQNO)
– time-to-live (TTL) for this packet
Link State (cont)
• Reliable flooding
– store most recent LSP from each node
– forward LSP to all neighbors except the one that sent it
– generate new LSP periodically
• increment SEQNO
– start SEQNO at 0 on reboot
– decrement TTL of each stored LSP
• discard when TTL=0
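The flooding rules above can be sketched as two small Python functions. The LSP layout (a dict with keys `id`, `seqno`, `ttl`, `links`) and the function names are hypothetical. Treating an LSP whose SEQNO is not larger than the stored one as a duplicate is an assumption drawn from "store most recent LSP from each node".

```python
def receive_lsp(database, lsp, from_neighbor, neighbors):
    """Process one incoming LSP and return the neighbors to forward it to.

    `database` maps node id -> most recent LSP stored for that node.
    """
    stored = database.get(lsp["id"])
    if stored is not None and stored["seqno"] >= lsp["seqno"]:
        return []  # old or duplicate copy: neither store nor forward
    database[lsp["id"]] = lsp
    # Forward to every neighbor except the one the LSP arrived from.
    return [n for n in neighbors if n != from_neighbor]


def age_database(database):
    """Decrement the TTL of each stored LSP; discard those that reach 0."""
    for node_id in list(database):
        database[node_id]["ttl"] -= 1
        if database[node_id]["ttl"] <= 0:
            del database[node_id]
```

Calling `age_database` once per aging interval is left to the caller; the sketch does not model retransmission or acknowledgments.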
Route Calculation
• Dijkstra’s shortest path algorithm
• Let
– N denotes set of nodes in the graph
– l(i, j) denotes non-negative cost (weight) for edge (i, j); infinity if there is no edge
– s denotes this node
– M denotes the set of nodes incorporated so far (labeled set)
– C(n) denotes cost of the path from s to node n

M = {s}
for each n in N - {s}
    C(n) = l(s, n)
while (N != M)
    M = M union {w} such that C(w) is the minimum for all w in (N - M)
    for each n in (N - M)
        C(n) = MIN(C(n), C(w) + l(w, n))
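A runnable Python sketch of the same computation. It keeps the names C, M, s, w and n from the pseudocode, but swaps the linear search for the minimum C(w) for a priority queue (`heapq`), which is a common implementation choice rather than something the slide specifies. The `links` adjacency layout and the sample graph are made up for illustration and are not the graph from the earlier figure.

```python
import heapq


def dijkstra(links, s):
    """Return C, the lowest path cost from s to every reachable node.

    `links` maps node -> {neighbor: cost}, with non-negative costs.
    """
    C = {s: 0}
    M = set()  # nodes incorporated so far
    heap = [(0, s)]
    while heap:
        cost, w = heapq.heappop(heap)
        if w in M:
            continue  # stale heap entry for an already incorporated node
        M.add(w)
        for n, l_wn in links.get(w, {}).items():
            if n not in M and cost + l_wn < C.get(n, float("inf")):
                C[n] = cost + l_wn
                heapq.heappush(heap, (C[n], n))
    return C


# Hypothetical four-node graph.
sample_links = {
    "A": {"B": 3, "C": 1},
    "B": {"A": 3, "C": 1, "D": 4},
    "C": {"A": 1, "B": 1, "D": 6},
    "D": {"B": 4, "C": 6},
}
sample_costs = dijkstra(sample_links, "A")
```

For this sample, the cost to B is 2 (via C) rather than the direct link cost of 3, and the cost to D is 6 (A, C, B, D).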
Metrics
• Original ARPANET metric
– measured number of packets queued on each link
– took neither latency nor bandwidth into consideration
• New ARPANET metric
– stamp each incoming packet with its arrival time (AT)
– record departure time (DT)
– when link-level ACK arrives, compute
Delay = (DT - AT) + Transmit + Latency
– if timeout, reset DT to departure time for retransmission
– link cost = average delay over some time period
• Fine Tuning
– compressed dynamic range
– replaced Delay with link utilization
Routing Table at Routers
[Figure: three subnets joined by routers R1 and R2, with hosts H1, H2, H3]
– Subnet number 128.96.34.0, subnet mask 255.255.255.128: addresses 128.96.34.15 and 128.96.34.1 (H1, R1)
– Subnet number 128.96.34.128, subnet mask 255.255.255.128: addresses 128.96.34.130, 128.96.34.139 and 128.96.34.129 (R1, R2, H2)
– Subnet number 128.96.33.0, subnet mask 255.255.255.0: addresses 128.96.33.1 and 128.96.33.14 (R2, H3)
Forwarding table at router R1
Subnet Number    Subnet Mask        Next Hop
128.96.34.0      255.255.255.128    interface 0
128.96.34.128    255.255.255.128    interface 1
128.96.33.0      255.255.255.0      R2
Forwarding Algorithm
D = destination IP address
for each entry (SubnetNum, SubnetMask, NextHop)
    D1 = SubnetMask & D
    if D1 = SubnetNum
        if NextHop is an interface
            deliver datagram directly to D
        else
            deliver datagram to NextHop
• Use a default router if nothing matches
• Not necessary for all 1s in subnet mask to be contiguous
• Can put multiple subnets on one physical network
• Subnets not visible from the rest of the Internet
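The forwarding loop can be tried out in Python against the R1 forwarding table. The function name `forward`, the tuple-based return value, and the `default_router` parameter are assumptions for the sketch. Recognizing an interface by the string prefix "interface" is a shortcut that only suits this example.

```python
import ipaddress

# Forwarding table at R1, as (SubnetNum, SubnetMask, NextHop) entries.
R1_TABLE = [
    ("128.96.34.0", "255.255.255.128", "interface 0"),
    ("128.96.34.128", "255.255.255.128", "interface 1"),
    ("128.96.33.0", "255.255.255.0", "R2"),
]


def forward(dest, table, default_router=None):
    """Return ("direct", interface) or ("via", router) for destination `dest`."""
    D = int(ipaddress.IPv4Address(dest))
    for subnet_num, subnet_mask, next_hop in table:
        D1 = int(ipaddress.IPv4Address(subnet_mask)) & D
        if D1 == int(ipaddress.IPv4Address(subnet_num)):
            if next_hop.startswith("interface"):
                return ("direct", next_hop)  # deliver datagram directly to D
            return ("via", next_hop)         # deliver datagram to NextHop
    return ("via", default_router)           # nothing matched
```

For example, `forward("128.96.34.139", R1_TABLE)` selects interface 1, because masking with 255.255.255.128 yields 128.96.34.128, while `forward("128.96.33.14", R1_TABLE)` is handed to R2.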
Internet Structure
Recent Past
[Figure: the NSFNET backbone connects regional networks (BARRNET, Westnet, MidNet and others), which in turn serve sites including Stanford, ISU, Berkeley, PARC, UNM, NCAR, UNL, KU and UA]
Internet Structure
Today
[Figure: a backbone service provider with peering points; “Consumer” ISPs, large corporations and a small corporation connect to it]
How to Make Routing Scale
• Still Too Many Networks
– routing tables do not scale
– route propagation protocols do not scale
CIDR: Classless Inter-Domain Routing
• CIDR (RFC 1519) assigns variable-sized address blocks, without regard to classes, to ease the IPv4 address shortage.
– An IP address is accompanied by a network mask to indicate the boundary, usually written as 128.131.0.0/22 (first IP address + number of bits in the network part).
• Longest prefix match and address aggregation make routing scalable.
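To see what the /22 notation encodes, the prefix from the slide can be inspected with Python's standard `ipaddress` module. The variable names are illustrative only.

```python
import ipaddress

net = ipaddress.ip_network("128.131.0.0/22")

prefix_bits = net.prefixlen      # number of bits in the network part
host_bits = 32 - net.prefixlen   # remaining bits identify hosts
mask = str(net.netmask)          # the same boundary written as a dotted mask
size = net.num_addresses         # 2 ** host_bits addresses in the block
```

Here the 22-bit network part leaves 10 host bits, so the block spans 1024 addresses and corresponds to the mask 255.255.252.0.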
Longest Prefix Match and Address Aggregation
Entry  Prefix (binary)                          Length  Host bits
A      11000010 00011000 00000000 00000000      /21     11
B      11000010 00011000 00001000 00000000      /22     10
C      11000010 00011000 00001100 00000000      /22     10
D      11000010 00011000 00010000 00000000      /20     12
• If a packet comes in with destination address 11000010 00011000 00010001 00000100 (194.24.17.4), the only entry that produces a match is D.
• The above 4 entries can be further aggregated into 1 if the router has the same next hop for the 4 destinations, in the form of 194.24.0.0/19, or 11000010 00011000 00000000 00000000 /19.
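Both claims can be checked with a short Python sketch using the standard `ipaddress` module. The four entries are the binary prefixes above written in dotted form. The function name `longest_prefix_match` and the dict layout are assumptions for the example, and the linear scan stands in for the trie-based lookups that real routers typically use.

```python
import ipaddress

ENTRIES = {
    "A": ipaddress.ip_network("194.24.0.0/21"),
    "B": ipaddress.ip_network("194.24.8.0/22"),
    "C": ipaddress.ip_network("194.24.12.0/22"),
    "D": ipaddress.ip_network("194.24.16.0/20"),
}


def longest_prefix_match(dest, entries):
    """Return the name of the matching entry with the longest prefix, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [(net.prefixlen, name) for name, net in entries.items() if addr in net]
    return max(matches)[1] if matches else None


# Aggregation: merge adjacent blocks into the fewest covering prefixes.
aggregate = list(ipaddress.collapse_addresses(ENTRIES.values()))
```

`longest_prefix_match("194.24.17.4", ENTRIES)` selects D, and `aggregate` collapses to the single prefix 194.24.0.0/19. If that /19 were present in the table alongside D, the lookup would still prefer D because its prefix is longer.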
How Routing Works in the Internet
• Know a smarter router
– hosts know local router (default router)
– local routers know site routers
– site routers know core router
– core routers know everything
• Autonomous System (AS)
– corresponds to an administrative domain
– examples: University, company, backbone network
– assign each AS a 16-bit number
• Two-level route propagation hierarchy
– interior gateway protocol (each AS selects its own)
– exterior gateway protocol (Internet-wide standard)
Popular Interior Gateway Protocols
• RIP: Routing Information Protocol
– developed for XNS
– distributed with Unix
– distance-vector algorithm
– based on hop-count
• OSPF: Open Shortest Path First
– recent Internet standard
– uses link-state algorithm
– supports load balancing
– supports authentication
EGP: Exterior Gateway Protocol
• Overview
– designed for tree-structured Internet
– concerned with reachability, not optimal routes
• Protocol messages
– neighbor acquisition: one router requests that another
be its peer; peers exchange reachability information
– neighbor reachability: one router periodically tests if the other is still reachable; exchange HELLO/ACK messages; uses a k-out-of-n rule
– routing updates: peers periodically exchange their
routing tables (distance-vector)
BGP-4: Border Gateway Protocol
• AS Types
– stub AS: has a single connection to one other AS
• carries local traffic only
– multihomed AS: has connections to more than one AS
• refuses to carry transit traffic
– transit AS: has connections to more than one AS
• carries both transit and local traffic
• Each AS has:
– one or more border routers
– one BGP speaker that advertises:
• local networks
• other reachable networks (transit AS only)
• gives path information
BGP Example
• Speaker for AS2 advertises reachability to P and Q
– networks 128.96, 192.4.153, 192.4.32, and 192.4.3 can be reached directly from AS2
[Figure: the backbone network (AS 1) connects regional provider A (AS 2) and regional provider B (AS 3). Provider A serves customer P (AS 4: 128.96, 192.4.153) and customer Q (AS 5: 192.4.32, 192.4.3). Provider B serves customer R (AS 6: 192.12.69) and customer S (AS 7: 192.4.54, 192.4.23).]
• Speaker for backbone advertises
– networks 128.96, 192.4.153, 192.4.32, and 192.4.3 can be reached
along the path (AS1, AS2).
• Speaker can cancel previously advertised paths
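The path-carrying advertisements in this example can be modeled in a few lines of Python. This is a toy model, not the BGP message format: the function names, the dict layout, and the use of bare integers for AS numbers are all hypothetical. The `accept` check (rejecting a path that already contains the receiver's own AS number) is the usual use of path information for loop avoidance, which the slides do not spell out and is included here as an assumption.

```python
def advertise(own_as, prefixes, received_path=()):
    """Build an advertisement: the prefixes plus the AS path with own_as prepended."""
    return {"prefixes": list(prefixes), "path": (own_as,) + tuple(received_path)}


def accept(own_as, advertisement):
    """Reject any path that already contains own_as, since it would form a loop."""
    return own_as not in advertisement["path"]


# AS2 advertises its customers' networks; the backbone (AS1) passes them on.
networks = ["128.96", "192.4.153", "192.4.32", "192.4.3"]
from_as2 = advertise(2, networks)
from_as1 = advertise(1, from_as2["prefixes"], from_as2["path"])
```

`from_as1["path"]` is `(1, 2)`, matching the path (AS1, AS2) advertised by the backbone speaker. AS3 would accept this advertisement, while AS2 would reject it.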
IP Version 6
• Features:
– Addresses are 16 bytes long (IPv4 addresses are 4 bytes).
– The header is simplified, having only 7 fields (IPv4 has 13).
– Less-used features are moved to the option fields, which are made easier to process.
– Better support for security.
– Better support for QoS.
• Header
– 40-byte “base” header
– extension headers (fixed order, mostly fixed length)
• fragmentation
• source routing
• authentication and security
• other options
The Main IPv6 Header
• The version field is always 6.
• The traffic class field indicates the QoS treatment required for the packet.
• The flow label field provides a mechanism to implement a virtual circuit, which is uniquely identified by the tuple (source address, destination address, flow label). A virtual circuit makes it easier to provide QoS.
• The payload length field indicates the number of bytes in the packet, excluding the 40-byte fixed header.
• The next header field indicates which of the six extension headers (options) follows the fixed header; if none follows, it indicates the upper-layer protocol, e.g. TCP or UDP, to pass the data to.
• The hop limit field indicates the maximum number of hops the packet is allowed to traverse, to prevent a packet from looping forever, similar to TTL in IPv4.
• The IPv4 header has fields that the IPv6 base header lacks: fragmentation (an option in IPv6), checksum, HLEN, etc.
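The field layout described above can be made concrete by decoding the 40-byte base header in Python. The bit widths used here (4-bit version, 8-bit traffic class, 20-bit flow label, 16-bit payload length, 8-bit next header, 8-bit hop limit, two 128-bit addresses) are those of the standard IPv6 header. The function name and the dict keys are choices made for this sketch.

```python
import struct


def parse_ipv6_base_header(data):
    """Decode the 40-byte fixed IPv6 header into a dict of its fields."""
    if len(data) < 40:
        raise ValueError("need at least 40 bytes")
    # First 8 bytes: one 32-bit word, a 16-bit length, and two 8-bit fields.
    first_word, payload_length, next_header, hop_limit = struct.unpack(
        "!IHBB", data[:8]
    )
    return {
        "version": first_word >> 28,
        "traffic_class": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_length": payload_length,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "source": data[8:24],
        "destination": data[24:40],
    }
```

A next header value of 6 or 17 would mean the payload is handed to TCP or UDP respectively, the same protocol numbers IPv4 uses.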
IPv6 Address
• The address fields use 16-byte IPv6 addresses. The number of possible IPv6 addresses is 2^128, or about 3 x 10^38. A new notation is used: an address is written as eight groups of four hexadecimal digits, with colons between the groups, like this:
8000:0000:0000:0000:0123:4567:89AB:CDEF
• Since many zeros can appear in an address, three optimizations are made
– Leading zeros within a group can be omitted, so 0123 becomes 123.
– One or more groups of all zeros can be replaced by a pair of colons, so the above address becomes 8000::123:4567:89AB:CDEF.
– IPv4 addresses can be written as a pair of colons and an old dotted decimal number, e.g., ::192.31.20.46.
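These notations can be checked with Python's standard `ipaddress` module, which parses all three forms. Note that Python prints hexadecimal digits in lowercase, so the output differs from the slide in case only. The variable names are illustrative.

```python
import ipaddress

addr = ipaddress.IPv6Address("8000:0000:0000:0000:0123:4567:89AB:CDEF")

short = addr.compressed  # leading zeros dropped, run of zero groups shown as "::"
full = addr.exploded     # all eight groups, four digits each

# An IPv4 address written after a pair of colons occupies the low 32 bits.
mapped = ipaddress.IPv6Address("::192.31.20.46")
```

`short` is the compressed form `8000::123:4567:89ab:cdef`, and parsing that shortened string gives back the same address as the full form.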