Transcript IP: Addresses and Forwarding
BANANAS: An Evolutionary Framework for Explicit and Multipath Routing in the Internet
Hema T. Kaur, Shiv Kalyanaraman, Andreas Weiss, Shifalika Kanwar, Ayesha Gandhi
Rensselaer Polytechnic Institute
http://www.ecse.rpi.edu/Homepages/shivkuma
Rensselaer Polytechnic Institute 1 Shivkumar Kalyanaraman
Acknowledgements
Biplab Sikdar (faculty colleague)
Mehul Doshi (MS)
Niharika Mateti (MS)
Also thanks to:
Satish Raghunath (PhD)
Jayasri Akella (PhD)
Hemang Nagar (MS)
Work funded in part by
Intel Corp and DARPA ITO, NMS Program. Contract number: F30602 00-2-0537
The Question
Can we emulate a subset of MPLS properties without signaling?
Key: Can we do source routing
  without signaling
  without variable (and large) per-packet overhead
  being backward compatible with OSPF & BGP
  allowing incremental network upgrades?
[Figure: TE spectrum, ranging from Shortest Path to MPLS (Signaled TE), with BANANAS-TE on the spectrum]
Why cannot we do it today?
Connectionless TE today uses a parametric approach:
  Eg: changing link weights in OSPF, IS-IS or parameters of BGP-4 (LOCAL_PREF, MED etc)
  Performance limited by the single shortest/policy path
[Figure: example topology with link weights; links AC and CD are overloaded]
Alt: Connection-oriented/signaled approach (eg: MPLS)
  Complex to extend MPLS-TE across multiple areas.
  Not a solution for inter-AS issues.
  MPLS also needs the support of all the nodes along the path
MPLS Signaling and Forwarding Model
MPLS label is swapped at each hop along the LSP
Labels = LOCAL IDENTIFIERS …
Signaling maps global identifiers (addresses, path spec) to local identifiers
[Figure: MPLS example across San Francisco (Ingress), Seattle, Miami and New York (Egress); the IP packet carries a different local label on each hop (1321, 120, 5, 0)]
Global Path Identifiers
Instead of using local path identifiers (labels in MPLS), consider the use of global path identifiers
[Figure: same topology, San Francisco (Ingress), Seattle, Miami, New York (Egress); the IP packet carries a PathId (36, 27, 0) in place of the label]
Global Path Identifier: Key Ideas
[Figure: path from node i to node j through nodes 1, 2, …, k, …, m-1 over links with weights w_1, w_2, …, w_m; the IP packet carries PathId(i,j) at node i and PathId(1,j) at node 1]
Key ideas:
1. Swap/process global pathids hop-by-hop instead of local labels!
2. Avoid inefficient encoding (IP) or signaling (MPLS)
3. Only upgraded nodes need to locally compute a subset of valid PathIDs.
Global Path Identifier (continued)
Path = {i, w_1, 1, w_2, 2, …, w_k, k, w_{k+1}, …, w_m, j}
Sequence of globally known node IDs & link weights
Global Path ID is a hash of this sequence => locally computable without the need for signaling!
Potential hash functions:
  [j, { h(1) + h(2) + … + h(k) + … + h(m-1) } mod 2^b ]: node ID sum
  MD5 one-way hash, XOR (eg: LIRA), 32-bit CRC etc…
Canonical method: MD5 hashing of the subsequence of nodeIDs followed by a CRC-32 to get a 32-bit hash value (MD5+CRC)
Low collision (i.e. non-uniqueness) probability
Different PathID encodings have different architectural implications
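The two hash options named on this slide can be sketched roughly as follows. This is an illustration only: the serialization of the node-ID sequence (comma-joined decimal strings), the identity function used for h, and mapping an empty sequence to PathID 0 are assumptions, not details taken from the slides.

```python
import hashlib
import zlib

def md5_crc_path_id(node_ids):
    """32-bit PathID: MD5 over the node-ID sequence, then CRC-32 of the digest."""
    if not node_ids:
        return 0  # assumption: an empty sequence maps to PathID 0 (default path)
    # Assumed serialization: decimal node IDs joined by commas.
    data = ",".join(str(n) for n in node_ids).encode("ascii")
    digest = hashlib.md5(data).digest()
    return zlib.crc32(digest) & 0xFFFFFFFF

def node_sum_path_id(node_ids, b=32):
    """Node-ID-sum variant: (h(1) + ... + h(m-1)) mod 2^b, with h assumed to be identity."""
    return sum(node_ids) % (1 << b)
```

Because every router sees the same node IDs through the link-state database, each one can compute the same value locally, which is the property the slide relies on to avoid signaling.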
Abstract Forwarding Paradigm
Forwarding table (Eg: at Node k):
  [Destination Prefix, [j, PathID H{k, k+1, … , m-1}]]
  [Next-Hop, [k+1, SuffixPathID H{k+1, … , m-1}]]
Incoming Packet Hdr: Destination address (j) & PathID = H{k, k+1, … , m-1}
Outgoing Packet Hdr: [j, PathID = H{k+1, … , m-1}]
Longest prefix match + exact label match + label swap!
PathID mismatch => map to shortest (default) path, and set PathID = 0
No signaling because of globally meaningful pathIDs!
[Figure: path from node i to node j through nodes 1, 2, …, k, …, m-1 over links with weights w_1, …, w_m]
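A minimal sketch of the forwarding step described on this slide: longest prefix match, exact PathID match, PathID swap, and fallback to the default path with PathID = 0 on a mismatch. The table layout (two Python dicts) and the names `forward`, `explicit` and `default` are hypothetical, chosen only for illustration.

```python
import ipaddress

def forward(dest, path_id, explicit, default):
    """Return (next_hop, outgoing_path_id), or None if no prefix matches.

    explicit: {(prefix, path_id): (next_hop, suffix_path_id)}  (hypothetical layout)
    default:  {prefix: shortest_path_next_hop}
    """
    addr = ipaddress.ip_address(dest)
    matches = [p for p in default if addr in ipaddress.ip_network(p)]
    if not matches:
        return None
    # Longest prefix match over the known destination prefixes.
    prefix = max(matches, key=lambda p: ipaddress.ip_network(p).prefixlen)
    hit = explicit.get((prefix, path_id))  # exact PathID match
    if hit is not None:
        return hit                         # swap PathID for the suffix PathID
    return (default[prefix], 0)            # mismatch: default path, PathID = 0
```

For example, with `explicit = {("10.4.0.0/16", 260): ("n2", 177)}` and `default = {"10.4.0.0/16": "n4"}`, a packet to 10.4.1.1 carrying PathID 260 leaves toward n2 with PathID 177, while any other PathID is sent to n4 with PathID 0.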
BANANAS TE: Explicit, Multi-Path Forwarding
Explicit source-directed routing:
  Not limited by the shortest path nature of IGP
  Different PathIds => different next-hops (multi-paths)
No signaling required to set-up the paths
Traffic mapping is decoupled from route discovery
[Figure: San Francisco (Ingress), Seattle, Miami and New York (Egress); IP packets carry PathIds 5, 36, 27 and 0 on different links]
BANANAS TE: Partial Deployment
Only “red” routers are upgraded
Non-upgraded routers forward everything on the shortest path (default path): forming a “virtual hop”
[Figure: partial-deployment example on the San Francisco (Ingress), Seattle, Miami, New York (Egress) topology, with PathIds 5, 27 and 0 shown on the packets]
Simplistic Route Computation Strategy: All Paths Under Partial Upgrades
Assume 1-bit in LSA’s to advertise that an upgraded router is “multi-path capable” (MPC)
Two phase algorithm: (assume m upgraded nodes)
1. (N-m) Dijkstra’s for non-upgraded nodes or one all-pairs shortest path (Floyd Warshall)
2. DFS to discover valid paths to destinations.
   Explore all neighbors of upgraded nodes
   Explore only shortest-path next-hop of non-upgraded nodes
   Visited bit set to avoid loops
Computes all possible valid paths under PU constraints in a fully distributed manner (global consistency)
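The two-phase idea can be sketched as below, under simplifying assumptions that are not from the slides: the graph is undirected with symmetric weights (so a single Dijkstra run from the destination yields every node's shortest-path next hop), ties are broken by node name, and the function names are hypothetical.

```python
import heapq

def dijkstra(graph, src):
    """Shortest distances from src; graph is {node: {neighbor: weight}}."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def all_paths_pu(graph, upgraded, src, dest):
    """All loop-free paths src -> dest that are forwardable under partial upgrades."""
    dist = dijkstra(graph, dest)  # phase 1 (symmetric weights assumed)
    paths = []

    def dfs(node, path):          # phase 2
        if node == dest:
            paths.append(list(path))
            return
        if node in upgraded:
            candidates = list(graph[node])  # explore all neighbors
        else:
            # Non-upgraded node: only its shortest-path next hop toward dest.
            candidates = [min(graph[node],
                              key=lambda n: (graph[node][n] + dist[n], n))]
        for n in candidates:
            if n not in path:     # "visited" check to avoid loops
                path.append(n)
                dfs(n, path)
                path.pop()

    dfs(src, [src])
    return paths
```

With no upgraded nodes the search degenerates to the single shortest path; each additional upgraded node widens the set of reachable paths.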
Simulation/Implementation/Testing Platforms
MIT’s Click Modular Router on Linux: Forwarding Plane
Utah’s Emulab Testbed: Experiments with Linux/Zebra/Click implementation
SSFnet Simulation for OSPF/BGP Dynamics
Zebra/Click Implementation on Linux (Tested on Utah Emulab)
[Figure: test topology with link weights]
Part of table at node1: (PathID = Link Weights, for simplicity)
Destination   PathID   NextHop   SuffixPathID
4             260      2         177 (= 260 – 83)
4             98       3         0 (= 98 – 98)
4             51       4         0 (= 51 – 51)
4             160      5         0 (= 160 – 160)
SSFnet Simulation Results
Flat OSPF Area, 19 Nodes; Only 3 Active-MPC nodes
A-MPC Nodes   Avg. # of Paths to each Dest   P-MPC Nodes, #Paths: Avg.   Max    Min
A, B, D       6.3, 5.7, 5.6                  7.7                         13.1   2.8
B, C, D       6.7, 9.4, 6.2                  7.2                         13.9   2.8
B, D, E       6.4, 6.9, 7.9                  6.9                         15.5   2.7
Refinement 1: Heterogeneous Route Computation
[Figure: example topology with nodes A-F and link weights]
Goal: Upgraded nodes (eg: A, D, E) can use any route computation algorithm, so long as it computes the shortest (default) path!
Eg: k shortest-paths from a given source s to each vertex in the graph, in total time O(E + V log V + kV): lower complexity than AP-PU
Issue: Forwarding for k-shortest paths may not exist
Need to validate the forwarding availability for paths!
Two-Phase Path Validation Algorithm
Concept: Forwarding for path exists only if the forwarding for each of its suffixes exists.
Phase 1 (cont’d): compute {k-shortest} paths for all other upgraded nodes, and 1-shortest paths for non-upgraded nodes.
  Sort computed paths by hopcount
Phase 2: Validate paths starting from hopcount = 1.
  All 1-hop paths valid.
  p-hop paths valid if the (p-1)-hop path suffix is valid
  Throw out invalid paths as they are found
Polynomial complexity to discover all valid paths in the network & validation can be done in the background
Validation algorithm correct by mathematical induction
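Phase 2 of the validation can be sketched as follows. The input format (a dict from each node to the candidate paths it computed, each path a tuple of node IDs starting at that node) and the name `validate_paths` are assumptions made for illustration.

```python
def validate_paths(candidates):
    """Return the set of paths whose forwarding state exists at every hop.

    candidates: {node: [path, ...]}, each path a tuple of node IDs
    beginning at that node (hypothetical input format).
    """
    # Sort all computed paths by hop count so suffixes are checked first.
    all_paths = sorted({tuple(p) for ps in candidates.values() for p in ps},
                       key=len)
    valid = set()
    for p in all_paths:
        hops = len(p) - 1
        if hops < 1:
            continue
        # 1-hop paths are valid; a p-hop path is valid only if its
        # (p-1)-hop suffix, computed by the next node, is valid.
        if hops == 1 or p[1:] in valid:
            valid.add(p)
    return valid
```

In the example `{"A": [("A","B","D"), ("A","C","D")], "B": [("B","D")], "C": [("C","B","D")]}`, the path A-C-D is thrown out because node C never computed the suffix C-D, so no forwarding entry for it would exist at C.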
OSPF LSA Extensions
Linux/Zebra/Emulab Results
Flat OSPF Area, 3 Active-MPC nodes; Up to k-shortest, validated paths
Active Nodes   Avg. # of Paths to each Dest   Avg. # of Paths/k *100
B(k=3)         2.94                           98%
D(k=3)         2.94                           98%
C(k=3)         2.79                           93%

B(k=5)         4.83                           97%
D(k=5)         4.78                           96%
C(k=5)         4.44                           89%

B(k=7)         6.5                            93%
D(k=5)         4.78                           96%
C(k=5)         4.44                           89%
Refinement 2: Index-based PathID Encoding
Issue: increase in computation/storage complexity at upgraded nodes
Question: Can we move complexity to the network “edges” and simplify “core” nodes?
Ans: YES! The key is to consider an alternative, global PathID encoding
Globally-known link IDs can be locally hashed using a well-known function (eg: link ID index)
PathID = concatenation of well known local link ID hashes
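One way to picture the index-based encoding is sketched below. The details are assumptions rather than the slides' specification: each node numbers its links by sorting neighbor IDs (1-based), the PathID is represented as a tuple of these indices rather than packed bits, and the function names are hypothetical.

```python
def link_index(graph, node, neighbor):
    """Local index of the link node -> neighbor (assumed: sorted neighbor order, 1-based)."""
    return sorted(graph[node]).index(neighbor) + 1

def encode_path(graph, path):
    """Edge node: PathID = concatenation of per-hop local link indices."""
    return tuple(link_index(graph, a, b) for a, b in zip(path, path[1:]))

def forward(graph, node, path_id):
    """Core node: consume the leading index; only the local index table is needed."""
    next_hop = sorted(graph[node])[path_id[0] - 1]
    return next_hop, path_id[1:]
```

In this form a core node performs one table lookup per packet and keeps no per-path state, while the edge node that calls `encode_path` carries the route computation and validation burden.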
Why is the Index-based Encoding Interesting?
Ans: Architectural flexibility and simplification
Core (interior) nodes:
  Forwarding function dramatically simplified
  Minimal state (only the index table)
  No control-plane computation complexity at interior nodes
Edge nodes:
  Path validation dramatically simplified
  Edge-nodes can store an arbitrary subset of validated paths
  Heterogeneous route computation algorithms can be used
Index-Based Forwarding Example
Multiple Areas
[Figure: OSPF topology with Area 1, Area 0 and Area 2, nodes A-D and G-J, border routers ABR1-ABR5, and link weights]
Red nodes: upgraded
Green nodes: regular
PathID re-initialized after crossing area boundaries
Eg: From node A (area 1) to node I (area 2)
  Available paths: A-B-C-ABR1-area2, A-B-C-ABR2-area2 etc
  When the packet reaches area2, ABR3 may choose one of many paths to reach I. Eg: ABR3-H-I, ABR3-J-I, ABR3-H-G-I etc
Source-routing notion similar to, but weaker than PNNI
Inter-domain TE
Outbound TE: Multi-exit (or Explicit-exit) routing
  Useful to manage peering vs transit costs
  Goal: fine-grained traffic engineering policy
  BANANAS Hash = (Exit ASBR, destination address)
  Forwarding paradigm: Connectionless tunneling thru the AS
Inbound TE: NOT ADDRESSED DIRECTLY
Multi-AS-Path or Explicit AS-Path routing:
  Framework similar to IGP: e-PathID concept
BGP Explicit-Exit Routing: Route Selection
Explicit-Exit routing is easier than Explicit-Path Routing
  Only the “source” and “exit” nodes need upgrades!
Explicit exit routing easily extended to “multi-exit” routing
Upgrade selected EBGP and IBGP routers
All BGP routers synchronize on the default policy route to every destination prefix (as usual)
Only upgraded IBGP routers and EBGP routers synchronize on a set of exits for chosen prefixes
Upgraded IBGP routers can independently choose any exit without further synchronization with other BGP nodes
BGP Explicit-Exit Routing: Forwarding
IBGP locally installs explicit & default exits for chosen prefix
  Dest-Prefix Exit-ASBR Next-Hop
  Dest-Prefix Default-Next-Hop
Next-hop refers to the IGP next hop to reach Exit-ASBR
Default-Next-Hop: regular IBGP function
When a packet matches the explicit route (policy definable):
  Push its destination address into an Address Stack field
  Replace destination address with Exit-ASBR address.
  Emulates 1-level label-stacking (i.e. tunneling)
Exit-ASBR simply swaps back the destination address, before regular IP lookup => popping the stack
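The push and pop steps above can be sketched as follows. The `Packet` class, the function names, and the exact-match lookup on the destination (standing in for a real prefix match and policy check) are simplifications assumed for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Packet:
    dst: str
    addr_stack: list = field(default_factory=list)  # models the Address Stack field

def ingress(pkt, explicit_exits, default_next_hop):
    """Upgraded IBGP router: returns the IGP next hop for the packet.

    explicit_exits: {dest: (exit_asbr, igp_next_hop)}  (hypothetical table;
    exact match on destination used here instead of a prefix match)
    """
    entry = explicit_exits.get(pkt.dst)
    if entry is None:
        return default_next_hop         # regular IBGP behaviour
    exit_asbr, igp_next_hop = entry
    pkt.addr_stack.append(pkt.dst)      # push the original destination
    pkt.dst = exit_asbr                 # tunnel toward the chosen exit
    return igp_next_hop

def egress(pkt, my_addr):
    """Exit-ASBR: swap the original destination back before the normal IP lookup."""
    if pkt.dst == my_addr and pkt.addr_stack:
        pkt.dst = pkt.addr_stack.pop()  # pop the stack
    return pkt
```

Routers between the ingress and the exit see an ordinary packet addressed to the Exit-ASBR, which is why only the source and exit nodes need upgrades.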
Explicit-Exit Routing Example
[Figure: example topology with AS1, AS2, AS3 and AS4 (destination d); routers ABR1, ABR2, ASBR1, ASBR2, ASBR3, ASBR4]
Default (AS Path, Exit) to d = (1-3-4, ASBR3)
Now, ABR1 can have explicit exits ASBR4 (implied ASPath = 1-2-4), ASBR2 (implied ASPath = 1-3-4) as well!!
Inter-AS Explicit AS-Path Choice
[Figure: example topology with AS0, AS1, AS2, AS3 and AS4 (destination d); routers ASBR1, ASBR2, ASBR3]
Allow AS0 to explicitly choose an AS-PATH: e.g. 0-1-2-4 or 0-1-3-4
Explicit AS-Path choice encoded as an e-PathID = Hash{1,2,4}
e-PathID is updated only when the packet leaves the AS at Exit border routers.
At ASBR1, this explicit AS-path choice is mapped to an exit ASBR.
Within an upgraded AS, the packet is tunneled using the routing header as explained earlier.
Only selected EBGP nodes need be upgraded & synchronized
Re-advertisements of Multi-AS-Paths
[Figure: topology with AS0 to AS5 (destination d), routers ASBR1, ASBR2, iBG-1, iBG-3]
3 AS-paths to “d”: (0 4), (0 3 4), (0 5 4)
1 AS-path or 3 AS-paths to “d”??
Issue: in path-vector algorithms, without re-advertisements (of a subset of paths), remote AS’s cannot see the availability of multiple paths
But, re-advertisements add control traffic overhead
An AS may choose to re-advertise only, and not support multi-path forwarding (i.e. interpreting e-PathID or Address Stack fields)
Putting It Together: Integrated OSPF/BGP Simulation
E-PathID Processing
Blow-up of AS2’s Internal Topology
FORWARDING Table in AS2 (node#5) Corresponding Changes in Packet Headers
Future: Exploiting Multiplicity In The Internet
[Figure: access options (phone modem, Firewire/802.11a/b, USB/802.11a/b, 802.11a, WiFi (802.11b), Ethernet) connecting AS1 through ISP-1 … ISP-n to the Internet]
Exploiting Multiplicity…
Unlike telephony, data networking can get statistical multiplexing gains from simultaneously using:
  Multiple transmission modes (802.11a/b, 3G etc)
  Multiple exits (USB, Firewire, Ethernet, modem)
  Multiple paths (routes)
Lightweight distributed QoS on each path
  Eg: OverQoS (UCB) or Closed-loop QoS (Dave Harrison’s work)
Scavenge performance from this path diversity to meet requirements of high-quality multimedia apps!
BANANAS concepts are generic
  Can be applied for intra-domain, inter-domain, overlay routing, or ad-hoc peer-to-peer routing
Eg: Multipath MPEG using Multi-band 802.11a/b Community Wireless Networks
[Figure: “Slow” path and “Fast” path, labeled P and I]
Summary
TE: “Towards Better routing performance”:
Key: Decoupling route availability and setup issues from traffic mapping issues, without signaling
BANANAS-TE can leverage the rich interconnectivity and multi-homed nature of the Internet, with manageable increase in complexity
Applicable to OSPF, BGP, geographical routing, large-scale overlay networks; tested on Emulab, SSFnet
Currently deploying BANANAS on Planetlab, a community wireless network in Troy, NY and in p2p streaming/videoconferencing
[Figure: TE spectrum, ranging from Shortest Path to MPLS (Signaled TE), with BANANAS-TE on the spectrum]