Towards an Accurate AS-level Traceroute Tool

Download Report

Transcript Towards an Accurate AS-level Traceroute Tool

Inter-domain Routing:
Today and Tomorrow
Dr. Jia Wang
[email protected]
AT&T Labs Research
Florham Park, NJ 07932, USA
http://www.research.att.com/~jiawang/
Prof. Zhuoqing Morley Mao
[email protected]
Department of EECS
University of Michigan
Ann Arbor, MI 48109, USA
http://www.eecs.umich.edu/~zmao/
IEEE INFOCOM 2004 Tutorial
March 8, 2004
Outline
1.
2.
3.
4.
5.
6.
7.
Overview of Inter-domain routing
Routing policies
Measuring inter-domain paths
Routing instability
BGP Beacon - measurement infrastructure
Implication on network engineering
Security issues
Our opinions should not be taken to represent AT&T policies
March 8, 2004
2
Part I: Overview of Interdomain Routing
Internet
 Loose cooperative effort of Internet
Service Providers (ISPs)
 E.g., AT&T, Sprint, UUNet, AOL
 Best effort service
 Connectedness
 Anyone connected to the Internet can
exchange traffic with anyone else
connected to the Internet
March 8, 2004
4
Internet routing
routes
Control plane:
exchange routes
Internet
: Routing session
Data plane:
forward traffic
IP traffic
rusty.cs.berkeley.edu
IP=169.229.62.116
Prefix=169.229.0.0/16
March 8, 2004
www.cnn.com
IP=64.236.16.52
Prefix=64.236.16.0/20
5
Internet routing dictates
application performance
routes
Control plane:
exchange routes
: Routing session
Internet
Data plane:
forward traffic
IP traffic
rusty.cs.berkeley.edu
IP=169.229.62.116
Prefix=169.229.0.0/16
March 8, 2004
Fail over to alternate route
www.cnn.com
IP=64.236.16.52
Prefix=64.236.16.0/20
6
Internet routing domain
 Network devices under same technical
and administrative control
 Common routing policy
 E.g., ISPs, enterprise networks
March 8, 2004
7
Autonomous System (AS)
 Autonomous routing domain with an AS
number (ASN)
 AS numbers




16 bits integer
Public AS number: 1 – 64511
Private AS number: 64512 – 65535
Examples
 AT&T: 7018, 6431, …
 Sprint: 1239, 1240, …
 MIT: 3
March 8, 2004
8
More than 14,000 ASes today
Internet
Autonomous
System
ISP
Level3
Calren
Berkeley
March 8, 2004
ISP
ISP
Qwest
ISP
Business
business
ISP
ISP
AT&T
Sprint UUnet
ISP
ISP
IP traffic
University
Company
company
GNN
CNN
9
Internet Initiative Japan (IIJ)
March 8, 2004
10
IIJ, Tokyo
March 8, 2004
11
Telstra international
March 8, 2004
12
WorldCom (UUNet)
March 8, 2004
13
UUNet, Europe
March 8, 2004
14
Sprint, USA
March 8, 2004
15
AT&T IP Backbone, USA
Anchorage,
AK
Year end 2001
Seattle Spokane
Portland
Portland Worcester
Manchester
Minneapolis
R
St. Paul
Milwaukee
Madison
Des
Moines
Sacramento
R
San
Las
Vegas
San
Francisco
Francisco
Oakland
Redwood
City
LAAirport
Blvd
Kansas
City
Colorado
Springs
Angeles
Albuquerque
San
Bernardino
Garden
a
St Louis
Oklaho
ma
City
R R
Buffalo
Harrisbu
rg
Detroit
Plymouth
Wash Phil
.DC
Louisville
Nashville
Tulsa
NY
C
R
BaltimoreNewark
Bohemia
White
Brunswick
Plains
Cedar
Knolls
Rochelle Pk
Hamilton
Square
Freehold
R
NYC-
R
Norfolk Camden,
NJ
Richmond
Bdwy
Raleigh
Greensboro
Charlotte
Little Rock
Memphis
Backbone Node
Remote GSR Access Router
Remote Access Router
March 8, 2004
Cambridge
Hartford
Framingham
Wayn
StamfordProvidence
Providence
Bridgepo
e
rt New
Florissant
Phoenix
Gateway Node
N X DS3
N X OC3
N X OC12
N X OC48
NX OC192
Grand
RapidsBirmingham
Pittsburgh
South Bend ClevelandAkron
Chicago
DaytoColumbus Silver
Springs
n
Indianapolis
Arlington
Cincinnati
R
Anaheim
San Diego
R
R
Albany
Syracuse
Rochester
Rolling
Meadows
Oak
Davenport Brook
Chicag
o
Omaha
Denver
R San Jose
Los
Sherman Oaks
Honolulu
R
Salt Lake
City
Glenvie
w
Ft. Worth
Birmingham
Norcross
Columbia
Dunwoody
Atlanta
Dallas
Jacksonville
New Orleans
Austin
Orlando
Houston
San Antonio
R
Note: Connectivity
and nodes shown are
targeted for
deployment; actual
deployment
may vary. Maps
should not be used to
predict service
availability.
Tampa
R
Ft.
Lauderdale
W. Palm Beach
Ft.
Ojus
Lauderdale
Miami
San Juan PR
Rev. 6-4-01
16
GARR-B
March 8, 2004
17
Gigabit research network
March 8, 2004
18
wiscnet.net
UW-Superior
Rice Lake
Rhinelander
UW-Stout
Marshfield
UW-River Falls
Stiles
Jct.
Wausau
UW-Eau Claire
Qwest
and Other
Provider(s)
Clintonville
er '02)
(Summ
UW-Stevens Point
UW-Green Bay
(Summer
'02)
Fox Valley TC
(Summer '03)
um
(S
UW-Oshkosh
m
'
er
)
02
UW-La Crosse
La Crosse
Portage
Dodgeville
GO BUCKY!
Genuity
UW-Madison
(Summer '03)
UW-Milwaukee
UW-Whitewater
UW-Parkside
)
(Winter '02
UW-Platteville
Gigabit Ethernet
OC-12 (622Mbps)
OC-3 (155Mbps)
DS-3 (45Mbps)
T1 (1.5Mbps)
March 8, 2004
Chicago






Internet 2
& Qwest
Peering - Public and Private
Commodity Internet Transit
Internet2
Merit and Other State Networks
National Education Network
Regional Research Peers
2
ter '0
(Win
)
Chicago - 2
(Winter '02)
Chicago - 1
19
MIT.edu
http://bgp.lcs.mit.edu/
March 8, 2004
20
Internet routing architecture
Intra-domain
routing
Calren
Berkeley
March 8, 2004
Level3
IP traffic
Internet
Inter-domain
routing
GNN
CNN
21
Intra-domain routing
 Run within a certain network infrastructure
 Optimize routes taken between points within
a network
 Internal Gateway Protocols (IGPs)




Metrics based
OSPF (Open Shortest Path First)
RIP (Routing Information Protocol)
IS-IS (Intermediate System to Intermediate
System)
March 8, 2004
22
Inter-domain routing
 Run between networks
 Provide full connectivity of entire
Internet
 External Gateway Protocol (EBGP)
 Policy based
 BGP (Border Gateway Protocol)
March 8, 2004
23
Inter-domain routing and BGP
 Static routing
 Mainly for stub networks
 Default routing
 Small stub networks
 Dynamic routing
 Via BGP
No need to run BGP in static routing and default routing.
March 8, 2004
24
Link state
 Examples: OSPF, IS-IS
 Based on Dijkstra’s shortest path computation
 Each router periodically floods immediate
reachability information to other routers
 Fast convergence
 High communication and computation
overhead
 Not scalable for large networks
 Requires periodic refreshes
March 8, 2004
25
Vectoring
 Distance vs. Path Vector
 Distance: hop count (RIP)
 Path: entire path (BGP)
 Helps identify loops
 Supports policy-based routing based on path
 Minimal communication overhead
 Takes longer to converge, i.e., in
proportion to the maximum path length
March 8, 2004
26
Link state vs. vectoring
Link state Vectoring
IGP
EGP
OSPF
IS-IS
RIP
BGP
BGP is a path vector protocol
March 8, 2004
27
Classful addressing
 IPv4: 32 bits
 Five classes of networks
Class
Address
Mask
# of networks # of hosts
A
0*
255.0.0.0
128
~1.6M
B
10*
255.255.0.0
16384
65535
C
110*
255.255.255.0
~2.1M
255
D
Used for multicast
E
Reserved and currently unused
Improve scaling factor of routing in the Internet => classless
March 8, 2004
28
RFC1519: Classless Inter-domain
Routing (CIDR)
 No implicit mask based on the class of
the network
 Explicit masks passed in the routing
protocol
 Allow aggregation and hierarchical
routing
March 8, 2004
29
CIDR addressing
IP address: 12.70.0.0 Mask: 255.255.252.0
Address
Mask
00001100 00100110 00000000 00000000
11111111 11111111 11000000 00000000
Network prefix
Host
identifier
CIDR representation: 12.70.0.0/22
March 8, 2004
30
Address aggregation
12.70.0.0/24
12.70.1.0/24
12.70.2.0/24
March 8, 2004
12.70.3.0/24
Internet
ISP A
12.71.0.0/16
ISP B
12.70.0.0/22
12.71.0.0/16
31
Routing and forwarding
 Routing
 The decision process of choosing optimal
path that is consistent with the
administrative or technical policy
 Forwarding
 The act of receiving a packet, doing a
lookup, and copying a packet to the next
hop
March 8, 2004
32
Classless forwarding
Internet
12.70.0.20
10.20.128.10
10.20.128.1
10.20.0.1
IP traffic
10.20.1.1
135.120.0.1
March 8, 2004
Prefix
12.70.0.0/24
12.70.0.0/16
12.0.0.0/8
0.0.0.0
Next hop
10.20.0.1
10.20.1.1
10.20.128.1
10.20.128.10
33
Inter-domain routing with CIDR
support
 BGP-4 [RFC1771]





De facto EGP
Path vector protocol
Run on top of TCP for reliability
Carry routing information between ASes
Policy based routing
March 8, 2004
34
BGP basic operations
 Set up BGP session
 Exchange all candidate routes
 Send incremental updates
March 8, 2004
35
Establish BGP session
Establish neighboring session
between 12.10.0.1 and 12.10.0.2
12.10.0.1
Prefix
135.120.0.0/24
68.35.0.0/16
March 8, 2004
TCP 179
Next hop
10.128.0.1
10.192.1.1
12.10.0.2
Prefix
12.70.0.0/24
12.9.0.0/16
Next hop
10.20.0.1
10.20.1.1
36
Exchange all candidate routes
12.70.0.0/24
12.9.0.0/16
10.20.0.1
10.20.1.1
12.10.0.1
12.10.0.2
135.120.0.0/24
68.35.0.0/16
Prefix
135.120.0.0/24
68.35.0.0/16
12.70.0.0/24
12.9.0.0/16
March 8, 2004
Next hop
10.128.0.1
10.192.1.1
10.20.0.1
10.20.1.1
10.128.0.1
10.192.1.1
Prefix
12.70.0.0/24
12.9.0.0/16
Next hop
10.20.0.1
10.20.1.1
135.120.0.0/24
68.35.0.0/16
10.128.0.1
10.192.1.1
37
Send incremental updates
Withdraw 12.9.0.0/16
12.10.0.1
Prefix
135.120.0.0/24
68.35.0.0/16
12.70.0.0/24
12.9.0.0/16
March 8, 2004
12.10.0.2
Next hop
10.128.0.1
10.192.1.1
10.20.0.1
10.20.1.1
Prefix
12.70.0.0/24
12.9.0.0/16
135.120.0.0/24
68.35.0.0/16
Next hop
10.20.0.1
10.20.1.1
10.128.0.1
10.192.1.1
38
BGP messages
 OPEN: set up a peering session
 UPDATE: announce new routes or
withdraw previously announced routes
 NOTIFICATION: shut down a peering
session
 KEEPALIVE: confirm active connection
at regular interval
March 8, 2004
39
Internal vs. external BGP
Internet
E-BGP
I-BGP
AS B
AS C
AS A
March 8, 2004
40
I-BGP mesh
E-BGP
update
March 8, 2004
I-BGP update
41
Make I-BGP scale for large AS
 Route reflectors
 Confederations
March 8, 2004
42
Route reflector
E-BGP
update
RR
RR
Only best paths
being sent by RR
March 8, 2004
43
Confederation
AS 1000
EBGP
March 8, 2004
IBGP
IBGP
AS 65010
AS 65020
EBGP
44
BGP updates
 Three blocks
 Prefix
 Path attributes
 Unreachable routes
March 8, 2004
45
BGP attributes
Value Code Reference
1 ORIGIN [RFC1771]
2 AS_PATH [RFC1771]
3 NEXT_HOP [RFC1771]
4 MULTI_EXIT_DISC [RFC1771]
5 LOCAL_PREF [RFC1771]
6 ATOMIC_AGGREGATE [RFC1771]
7 AGGREGATOR [RFC1771]
8 COMMUNITY [RFC1997]
9 ORIGINATOR_ID [RFC1998]
10 CLUSTER_LIST [RFC1998]
11 DPA [Chen]
12 ADVERTISER [RFC1863]
13 RCID_PATH / CLUSTER_ID
[RFC1863]
14 MP_REACH_NLRI [RFC2283]
15 MP_UNREACH_NLRI [RFC2283]
16 EXTENDED COMMUNITIES
[Rosen]
17 NEW_AS_PATH [E.Chen]
18 NEW_AGGREGATOR [E.Chen]
19 SAFI Specific Attribute (SSA)
[Nalawade]
20-254 Unassigned
255 reserved for development
http://www.iana.org/assignments/bgp-parameters
March 8, 2004
46
Establish connectivity
Prefix
135.120.0.0/16
AS 3
Next hop AS path
12.10.0.5 2 1
Prefix
135.120.0.0/16
IBGP
Next hop AS path
12.10.0.1 1
12.10.0.6
EBGP
12.10.0.5
AS 1
AS 2
135.120.0.0/16
IBGP
March 8, 2004
EBGP
12.10.0.2
IBGP
12.10.0.1
Prefix
135.120.0.0/16
Next hop AS path
12.10.0.1 1
47
IGP and BGP working together
Prefix
135.120.0.0/16
AS 3
IBGP
Next hop AS path
12.10.0.1 1
Prefix
12.10.0.0/30
135.120.0.0/16
12.10.0.6
Next hop
10.10.0.1
10.10.0.1
EBGP
12.10.0.5
AS 1
12.10.0.1
135.120.0.0/16
EBGP
AS 2
12.10.0.2
IBGP
IBGP
March 8, 2004
10.10.0.1
Prefix
135.120.0.0/16
Next hop AS path
12.10.0.1 1
48
Part II: Inter-domain Routing
Policies
What is routing policy?
ISP2
ISP1
traffic
Connectivity DOES NOT
imply reachability!
ISP3
ISP4
traffic
Cust1
March 8, 2004
Cust2
Policy determines how traffic
can flow on the Internet
50
BGP routing process
Routes
received
from peers
Apply
input
policy
Select
best
route
Best
routes
Apply
output
policy
Routes
advised
to peers
Routing Forwarding
table
table
BGP is not shortest path routing!
March 8, 2004
51
Best route selection






Highest local preference
Shortest AS path
Lowest MED
I-BGP < E-BGP
Lowest I-BGP cost to E-BGP egress
Tie breaking rules
March 8, 2004
52
Best route selection
 Highest local preference
 To enforce economical relationships
between domains





Shortest AS path
Lowest MED
I-BGP < E-BGP
Lowest I-BGP cost to E-BGP egress
Tie breaking rules
March 8, 2004
53
Best route selection
 Highest local preference
 Shortest AS path
 Compare the quality of routes, assuming
shorter AS-path length is better




Lowest MED
I-BGP < E-BGP
Lowest I-BGP cost to E-BGP egress
Tie breaking rules
March 8, 2004
54
Best route selection
 Highest local preference
 Shortest AS path
 Lowest MED
 To implement “cold potato” routing
between neighboring domains
 I-BGP < E-BGP
 Lowest I-BGP cost to E-BGP egress
 Tie breaking rules
March 8, 2004
55
Best route selection




Highest local preference
Shortest AS path
Lowest MED
I-BGP < E-BGP
 Prefer EBGP routes to IBGP routes
 Lowest I-BGP cost to E-BGP egress
 Tie breaking rules
March 8, 2004
56
Best route selection





Highest local preference
Shortest AS path
Lowest MED
I-BGP < E-BGP
Lowest I-BGP cost to E-BGP egress
 Prefer routes via the nearest IGP neighbor
 To implement “hot potato” routing
 Tie breaking rules
March 8, 2004
57
Best route selection
Highest local preference
Shortest AS path
Lowest MED
I-BGP < E-BGP
Lowest I-BGP cost to E-BGP egress
Tie breaking rules








Router ID based: lowest router ID
Age based: oldest route
March 8, 2004
58
BGP route propagation
 Not all possible routes propagate
 Commercial relationships determine policies
for
 Route import
 Route selection
 Route export
March 8, 2004
59
Typical AS relationships
 Provider-customer
 customer pay money for transit
 Peer-peer
 typically exchange respective customers’
traffic for free
 Siblings
March 8, 2004
60
Transit vs. peering
 ISP definition
 Internet service provider is an organization that
sells access to the Internet
 Transit definition
 “Business relationship whereby one ISP provides
(usually sells) access to all destinations in its
routing table”.
 Peering is non-transitive relationship
 A peers with B, B peers with C, does not imply A
peers with C
March 8, 2004
61
What is peering?
 Peering definition
 “An interconnection business relationship
whereby ISPs provide connectivity to each
others’ transit customers.”
 Hybrid exists
 Regional transit
 Paid peering
March 8, 2004
62
Example of commercial
relationship
Cogent
ESnet
Merit
UMich
March 8, 2004
Google
Berkeley
63
Tier-1 peering




Buy no transit from any other providers
Have only customers and peers
Has full mesh peering with other tier-1’s
Motivation for peering:
 Minimize their interconnection costs while
providing sufficient interconnection BW to support
customer and its growth
March 8, 2004
64
Tier-2 peering
 ISP that purchases (resells) transit
within an Internet region
 Benefits
 Decreases the cost and reliance on
purchased Internet transit
 Lowers inter-AS traffic latency
 Fewer AS hops, AS peering links traversed
March 8, 2004
65
Is peering always better than
transit?
 Concerns of peering
 Traffic asymmetry
 No SLAs: less liability or incentive to
improve performance
 “Free” rather than getting paid
 Peers become more powerful
March 8, 2004
66
Where to peer?
 Public peering: at public peering locations
 Private peering
 Exchange-based interconnection model
 A meet point at which ISPs exchange traffic
 Can be neutral Internet business exchange
 Direct circuit interconnection model
 Point-to-point circuit between the exchange
parties
March 8, 2004
67
What are siblings?
 Mutual transit agreement
 Provide connectivity to the rest of the
Internet for each other
 Typically between two administrative
domains such as small ISPs or
universities located close to each other,
cannot afford additional Internet
services for better connectivity
March 8, 2004
68
AS relationships translate
into BGP export rules
 Export to a provider or a peer
 Allowed: its routes and routes of its
customers and siblings
 Disallowed: routes learned from other
providers or peers
 Export to a customer or a sibling
 Allowed: its routes, the routes of its
customers and siblings, and routes learned
from its providers and peers
March 8, 2004
69
Which AS paths are legal?
 Valley-free:
 After traversing a provider-customer or
peer-peer edge, cannot traverse a
customer-provider or peer-peer edge
 Invalid path: >= 2 peer links, downhilluphill, downhill-peer, peer-uphill
March 8, 2004
70
Example of valley-free paths
[1 2 3], [1 2 6 3] are valley-free
X
X
[1 4 3], [1 4 5 3] are not valley free
March 8, 2004
71
Inferring AS relationships
 Identify the AS-level hierarchy of Internet
 Not shortest path routing





Predict AS-level paths
Traffic engineering
Understand the Internet better
Correlate with and interpret BGP update
Identify BGP misconfigurations
 E.g., errors in BGP export rules
March 8, 2004
72
Existing approaches
 On inferring Autonomous Systems
Relationships in the Internet, by L. Gao, IEEE
Global Internet, 2000.
 Characterizing the Internet hierarchy from
multiple vantage points, by L. Subramanian,
S. Agarwal, J. Rexford, and R. Katz, IEEE
Infocom, 2002.
 Computing the Types of the Relationships
between Autonomous Systems, by G.
Battista, M. Patrignani, and M. Pizzonia, IEEE
Infocom, 2003.
March 8, 2004
73
Gao’s approach
 Assumptions
 Provider is typically larger than its
customers
 Two peers are typically of comparable size
March 8, 2004
74
Gao’s algorithm
 Find the highest degree AS node to be the
top provider of the AS path
 Left to the top node
 customer-provider or sibling-sibling links
 Right to the top node
 provider-customer or sibling-sibling links
 Sibling-sibling
 if providing mutual transit service for each other
 Peer-peer
 with top provider and of comparable degree value
March 8, 2004
75
Subramanian’s Approach
 Use BGP tables from multiple vantage points
 More complete
 Exploit uniqueness of each point
 Build AS-level hierarchy of Internet
 Relationship based, not degree based
 5 level classification of AS’s
March 8, 2004
76
Relationship inference rules
 Position of AS in AS graph gives rank
 Combine ranks from multiple tables
 Compare ranks
 Peer-peer with similar ranks
 Provider-customer: provider with higher
ranks
March 8, 2004
77
Hierarchy inference
 Internet hierarchy
inference
 Based on
relationships
 Not degree [Gao]
March 8, 2004
78
Battista’s approach
 Cast it as an optimization problem to
find provider-customer relationships
that minimize the number of conflicts
 Shows the problem is NP-hard
 Do not deal with peer-peer relationships
well
March 8, 2004
79
Policy routing causes path
inflation
 End-to-end paths are significantly longer than
necessary
 Why?
 Topology and routing policy choices within an ISP,
between pairs of ISPs, and across the global
Internet
 Peering policies and interdomain routing lead to
significant inflation
 Interdomain path inflation is due to lack of BGP
policy to provide convenient engineering of good
paths across ISPs
March 8, 2004
80
Path inflation
 Based on
[Mahajan03]
 Comparing
actual
Internet
paths with
hypothetical
“direct” link
March 8, 2004
81
Part III: Measuring Interdomain Paths
Packet forwarding path
Destination
Internet
IP traffic
Source
Forwarding path - the path packets traverse through
the Internet from a source to a destination
March 8, 2004
83
An inter-domain level view
AS D
Destination
Internet
AS C
IP traffic
AS A
AS B
Source
March 8, 2004
An IP forwarding path often span across multiple
Autonomous Systems.
84
Why do we care?
 Characterize end-to-end network paths
 Diagnose routing anomalies
 Discover Internet topology
March 8, 2004
85
Why do we care?
 Characterize end-to-end network paths




Latency
Capacity
Link utilization
Loss rate.
 Diagnose routing anomalies
 Discover Internet topology
March 8, 2004
86
Varies link capacity
Internet
Destination
Source
March 8, 2004
87
Different loss rate
Internet
Destination
Source
March 8, 2004
88
Traffic engineering
Internet
Destination
Source
Customer service enhancement
March 8, 2004
89
Why do we care?
 Characterize end-to-end network paths
 Diagnose routing anomalies
 Forwarding loop, black holes, routing
changes, unexpected paths, main
component of end-to-end latency.
 Discover Internet topology
March 8, 2004
90
Forwarding loops
Internet
Destination
Source
March 8, 2004
91
Black holes
Internet
Destination
Source
March 8, 2004
92
Routing changes
Internet
Destination
Source
March 8, 2004
93
Unexpected routes
Internet
Destination
Source
March 8, 2004
94
Performance bottleneck
Internet
Destination
Source
March 8, 2004
95
Why do we care?
 Characterize end-to-end network paths
 Diagnose routing anomalies
 Discover Internet topology
 Server placement
March 8, 2004
96
Internet topology
Server
Internet
Client
Client
Client
March 8, 2004
97
Server placement
Server
Internet
Client
Proxy
Client
Client
March 8, 2004
98
Key challenge
 Need to understand how packets flow
through the Internet without real-time
access to proprietary routing data from
each domain.
 Identify accurate packet forwarding paths
 Characterize the performance metrics of
each hop along the paths
March 8, 2004
99
Identify forwarding path
 Traceroute gives IP level forwarding
path
 IP address of the router interfaces on a
forwarding path
 RTT statistics for each hop along the way
March 8, 2004
100
Traceroute from UC Berkeley to
www.cnn.com
Traceroute output: (hop number, IP address, DNS name)
1 169.229.62.1
2 169.229.59.225
3 128.32.255.169
4 128.32.0.249
5 128.32.0.66
6 209.247.159.109
7 *
8 64.159.1.46
9 209.247.9.170
10 66.185.138.33
11 *
12 66.185.136.17
13 64.236.16.52
March 8, 2004
inr-daedalus-0.CS.Berkeley.EDU
soda-cr-1-1-soda-br-6-2
vlan242.inr-202-doecev.Berkeley.EDU
gigE6-0-0.inr-666-doecev.Berkeley.EDU
qsv-juniper--ucb-gw.calren2.net
POS1-0.hsipaccess1.SanJose1.Level3.net
?
?
pos8-0.hsa2.Atlanta2.Level3.net
pop2-atm-P0-2.atdn.net
?
pop1-atl-P4-0.atdn.net
www4.cnn.com
101
Traceroute from AT&T Research
to www.cnn.com
traceroute to cnn.com (64.236.24.12), 30 hops max,
40 byte packets
1 oden (135.207.16.1) 1 ms 1 ms 1 ms
2 ***
3 attlr-gate (192.20.225.1) 2 ms 2 ms 2 ms
4 12.119.155.157 (12.119.155.157) 3 ms 4 ms 4
ms
5 gbr6-p52.n54ny.ip.att.net (12.123.192.18) 4 ms
4 ms 4 ms
6 tbr2-p012401.n54ny.ip.att.net (12.122.11.29) 4
ms (ttl=249!) 5 ms (ttl=249!) 5 ms (ttl=249!)
7 ggr2-p390.n54ny.ip.att.net (12.123.3.62) 4 ms 5
ms 4 ms
8 att-gw.ny.aol.net (192.205.32.218) 4 ms 4 ms 4
ms
9 bb2-nye-P1-0.atdn.net (66.185.151.66) 4 ms 4
ms 4 ms
10 bb2-vie-P8-0.atdn.net (66.185.152.201) 13 ms
(ttl=245!) 12 ms (ttl=245!) 12 ms (ttl=245!)
11 bb1-vie-P11-0.atdn.net (66.185.152.206) 10 ms
10 ms 10 ms
12 bb1-cha-P7-0.atdn.net (66.185.152.28) 20 ms
20 ms 20 ms
13 bb1-atm-P6-0.atdn.net (66.185.152.182) 25 ms
25 ms 25 ms
14 pop1-atl-P4-0.atdn.net (66.185.136.17) 25 ms
(ttl=243!) 24 ms (ttl=243!) 24 ms (ttl=243!)
15 * * *
March 8, 2004
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
Destination
unreachable!
Who is responsible for
the forwarding problem?
102
Need to know Inter-domain level
path
AS D
www.cnn.com
Internet
AS C
AS A
AS B
AT&T Research
March 8, 2004
Routing loop in
AS C!
103
How to obtain AS level paths
 BGP AS path
 Traceroute AS path
March 8, 2004
104
BGP AS path
Signaling path: control traffic
d: path=[BC]
d: path=[C]
Prefix d
Forwarding path: data traffic
Prefix
d
…
AS path
ABC
…
Is BGP AS path the answer? No!
March 8, 2004
105
BGP AS path is not the
answer
 Requires timely access to BGP data
 Signaling path may differ from
forwarding path
 Route aggregation and filtering
 Routing anomalies: e.g., deflections, loops
[Griffin2002]
 BGP misconfigurations: e.g., incorrect AS
prepending
March 8, 2004
Two paths may differ precisely when operators
most need accurate data to diagnose a problem!
106
Traceroute AS path
 Obtain IP level path using traceroute
 Map IP addresses to ASes
a
b
c
d
Source
e
Destination
AS A
AS B
AS C
AS D
Is traceroute AS path the answer? NO!
March 8, 2004
107
Example: UC Berkeley to CNN
Traceroute output: (hop number, IP)
1 169.229.62.1
AS25
2 169.229.59.225
AS25
3 128.32.255.169
AS25
4 128.32.0.249
AS25
5 128.32.0.66
AS11423 Calren
Berkeley
6 209.247.159.109 AS3356
March 8, 2004
7 *
AS3356
8 64.159.1.46
AS3356
9 209.247.9.170
AS3356
10 66.185.138.33
AS1668
11 *
AS1668
12 66.185.136.17
AS1668
13 64.236.16.52
AS5662 CNN
Level3
GNN
108
Traceroute AS path is not the
answer
 Identifying ASes along forwarding path
is surprisingly difficult!
 Internet route registry
 Origin AS in BGP routes
March 8, 2004
109
Internet route registry
 Whois database
 E.g. NANOG traceroute, prtraceroute
 Out-of-date, incomplete
 Address allocation to customers
 Acquisition, mergers, break-ups
March 8, 2004
110
Origin AS in BGP routes
 Last AS in the AS path for each prefix
Prefix
AS path
d
ABC
…
…
 More accurate and complete than whois data
March 8, 2004
111
Limitations of BGP origin AS
 Multiple Origin AS (MOAS)
 Infrastructure addresses may not be
advertised
 Addresses announced by someone else
March 8, 2004
112
Limitations of BGP origin AS
 Multiple Origin AS (MOAS)
 Multi-homing
 Misconfiguration
 Internet eXchange Points (IXPs)
 Infrastructure addresses may not be
advertised
 Addresses announced by someone else
March 8, 2004
113
Limitations of BGP origin AS
 Multiple Origin AS (MOAS)
 Infrastructure addresses may not be
advertised
 Does not require to be announced publicly
 Security concerns
 Addresses announced by someone else
March 8, 2004
114
Limitations of BGP origin AS
 Multiple Origin AS (MOAS)
 Infrastructure addresses may not be
advertised
 Addresses announced by someone else
 Static routed customers
 Shared equipments at boundary between
ASes
Need accurate IP-to-AS mapping!
March 8, 2004
115
Accurate AS-level traceroute
Combine BGP and traceroute data to find a better answer!
March 8, 2004
116
Assumptions
 IP-to-AS mapping
 Mappings from BGP tables are mostly
correct.
 Change slowly
 BGP paths and forwarding paths mostly
match.
 70% of the BGP path and traceroute path
match
March 8, 2004
117
BGP path and traceroute path
could differ!
 Inaccurate IP-to-AS mapping
 Traceroute problems
 Legitimate mismatches
March 8, 2004
118
BGP path and traceroute path
could differ!
 Inaccurate IP-to-AS mapping
 Internet eXchange Points (IXPs)
 Sibling ASes
 Unannounced infrastructure addresses
 Traceroute problems
 Legitimate mismatches
March 8, 2004
119
Internet eXchange Points (IXPs)
 Shared infrastructure connected to multiple
service providers
 Exchange BGP routes and data traffic
 May have its own AS number or announced
by participating ASes
 Dedicated BGP sessions between pairs of
participating ASes
 E.g., Mae-East, Mae-West, PAIX.
March 8, 2004
120
IXPs cause extra AS hop
 Extra AS hop in traceroute path
 Large number of fan-in and fan-out
ASes
 Non-transit AS, small address block,
likely MOAS
March 8, 2004
121
IXPs cause extra AS hop
A
B
C
D
E
A
E
F
B
F
G
C
G
Traceroute AS path
March 8, 2004
BGP AS path
122
Sibling ASes
 Single organization owns and manages
multiple ASes
 May share address space
 Large fan-in and fan-out for the “sibling
AS pair”
March 8, 2004
123
Sibling ASes cause extra AS hop
 Large fan-in and fan-out for the “sibling
AS pair”
A
B
C
H
D
E
A
F
B
G
C
Traceroute AS path
March 8, 2004
E
D
F
G
BGP AS path
124
Unannounced infrastructure
addresses
 ASes do not necessarily announce
infrastructure via BGP
 Lead to “unmapped” addresses
 Sometimes fall into supernet announced
by AS’s provider or sibling
March 8, 2004
125
Unannounced infrastructure
addresses
AS loop in
traceroute path
AS A
4. A,C,A
3. B,A
AS B
Substitute AS hop
2. A
Missing AS hop in
traceroute path
Extra AS hop in
traceroute path
AS C
1. A,C
March 8, 2004
126
BGP path and traceroute path
could differ!
 Inaccurate IP-to-AS mapping
 Traceroute problems
 Forwarding path changing during
traceroute
 Interface numbering at AS boundaries
 ICMP response refers to outgoing interface
 Legitimate mismatches
March 8, 2004
127
Forwarding path changing during
traceroute
AS D
AS E
Route flaps between A
B C and A D E
AS A
AS A
AS B
AS D
AS C
AS C
AS hop B is substituted by AS D in the traceroute path
March 8, 2004
128
Interface numbering at AS
boundaries
AS A
AS A
AS C
AS B
AS C
Missing AS hop B in traceroute path
March 8, 2004
129
ICMP response refers to outgoing
interface
AS A
AS C
ICMP
message
AS B
Extra AS hop B in traceroute path
March 8, 2004
130
BGP path and traceroute path
could differ!
 Inaccurate IP-to-AS mapping
 Traceroute problems
 Legitimate mismatches
 Route aggregation and filtering
 Routing anomalies, e.g., deflections
March 8, 2004
131
Route aggregation/filtering
AS A
8.0.0.0/8 B C
AS B
AS C
8.0.0.0/8
C
8.64.0.0/16 C D
Extended traceroute path due to filtering by AS B
March 8, 2004
132
Mismatch patterns and causes
Extra
AS
Miss
AS
AS
Loop
Subst
AS
IXP
X
Sibling ASes
X
X
X
X
Unannounced IP
X
X
X
X
Aggregation/ filtering
X
Inter-AS interface
X
ICMP source address
X
X
Routing anomaly
X
X
March 8, 2004
Other
X
X
X
X
X
X
133
BGP and traceroute data
collection
Initial mappings from
origin AS of a large set of BGP tables
(Ignoring unstable paths)
For each location:
For each location:
Combine all locations:
Local BGP paths
Traceroute paths
from multiple locations
Traceroute AS paths
•Compare
•Look for known causes of mismatches
(e.g., IXP, sibling ASes)
•Edit IP-to-AS mappings
(a single change explaining a large number of mismatches)
March 8, 2004
134
Experimental methodology
200,000 destinations:
d0, d1, d2, d3, d4, … d200,000
March 8, 2004
For each di
-Traceroute path
-BGP path
135
Measurement setup
 Eight vantage points
 Upstream providers: US-centric tier-1 ISPs
 Sweep all routable IP address space
 About 200,000 IP addresses, 160,000
prefixes, 15,000 destination ASes
March 8, 2004
136
Eight vantage points
Organization
Location
Upstream provider
AT&T Research
NJ, US
UUNET, AT&T
UC Berkeley
CA, US
Qwest, Level3, Internet 2
PSG home network
WA, US
Sprint, Verio
Univ of Washington
WA, US
Verio, Cable&Wireless
ArosNet
UT, US
UUNET
Nortel
ON, Canada
AT&T Canada
Vineyard.NET
MA, US
UUNET, Sprint, Level3
Peak Web Hosting
CA, US
Level 3, Global Crossing, Teleglobe
Many thanks to people who let us collect data!
March 8, 2004
137
Preprocessing BGP paths
 Discard prefixes with BGP paths
containing





Routing changes based on BGP updates
Private AS numbers (64512 - 65535)
Empty AS paths (local destinations)
AS loops from misconfiguration
AS SET instead of AS sequence
 Less than 1% prefixes affected
March 8, 2004
138
Preprocessing traceroute
paths
 Resolving incomplete traceroute paths
 Unresolved hops within a single AS map to that AS
 Unmapped hops between ASes
 Try match to neighboring AS using DNS, Whois
 Trim unresponsive (*) hops at the end
 Compare with the beginning of local BGP paths
 MOAS at the end of paths
 Assume multi-homing without BGP
 Validation using AT&T router configurations
 More than 98% cases validated
March 8, 2004
139
Initial IP-to-AS Mapping
Whois
Combined
BGP tables
Resolving
incompletes
Match
44.7%
73.2%
78.0%
Mismatch
29.4%
8.3%
9.0%
1.5
8.8
9.0
Ratio
March 8, 2004
140
Heuristics to improve mappings
 Overall modification to mappings




10% IP-to-AS mappings modified
25 IXPs identified
28 pairs of sibling ASes found
1150 of the /24 prefixes shared
March 8, 2004
141
Heuristics to improve mappings
IXPs
Match
Mismatch
Ratio
March 8, 2004
Sibling
ASes
84.4% 85.9%
Unannounced
address space
90.6%
8.7%
7.8%
3.5%
9.7
11.0
26.0
142
Systematic optimization
 Dynamic-programming and iterative
improvement
 Initial IP-to-AS mapping derived from BGP
routing tables
 Identify a small number of modifications
that significantly improve the match rate.
 95% match ratio, less than 3% changes,
very robust
March 8, 2004
143
Optimization results
Mismatch ratio
Full initial Mapping
5.23%
Heuristically optimized mapping
3.08%
Omit 10% initial mapping
6.57%
Omit 4 probing sources
6.34%
Omit probing destinations
(one probe per unique BGP path)
7.12%
March 8, 2004
144
Validation
 Public data
 Whois/DNS data
 pch.net for known IXPs
 Private data
 AS 7018
March 8, 2004
145
Validations – IXP heuristic
 25 inferences: 19 confirmed
 Whois/DNS data confirm 18 of 25 inferences
 AS5459 -- “London Internet Exchange”
 198.32.176.0/24:
part of “Exchange Point Blocks”
DNS name: sfba-unicast1-net.eng.paix.net
 Known list from pch.net confirm 16 of 25
 Missing 13 known IXPs due to
 Limited number of measurement locations
 Mostly tier-1 US-centric providers
March 8, 2004
146
Validations – Sibling heuristic
 28 inferences: all confirmed
 Whois for organization names (15 cases)
 E.g., AS1299 and AS8233 are TeliaNet
 MOAS origin ASes for several address blocks
(13 cases)
 E.g., 148.231.0.0/16 has MOAS:
AS5677 and AS7132
(Pacific Bell Internet Services and SBC Internet Services)
March 8, 2004
147
Summary
 Identify accurate AS level forwarding
path
 improve infrastructure IP to AS mappings
 Heuristics and Dynamic programming
optimization
 Match/mismatch ratio improvement: 8-12
to 25-35
 Reduction of incomplete paths: 18-22% to
6-7%
March 8, 2004
148
Summary
 Dependence on operational realities




Most BGP routes are relatively stable
Few private ASes, AS_SETs
Public, routable infrastructure addresses
Routers respond with ICMP replies
http://www.research.att.com/~jiawang/as_traceroute
March 8, 2004
149
Part IV: BGP Routing Instability
BGP routing updates
 Route updates at prefix level
 No activity in “steady state”
 Routing messages indicate changes, no
refreshes
March 8, 2004
151
Internet routing instability
 Large # of BGP updates
 Failures
 Policy changes
 Redundant messages
 Routing instability
 Route keeps changing, e.g., routes keep
going up and down
March 8, 2004
152
Implications
 Router overhead
 Transient delay and loss





Unreachable hosts
High loss rate
High jitter
Long delays
Significant packet reordering
 Poor predictability of traffic flow
March 8, 2004
153
Question
How do we know if the instability is due to routing
or network congestion?
March 8, 2004
154
Measure BGP stability
 First work by Labovitz et al.
 Methodology
 Collect routing messages from five public
exchange points
 BGP information considered
 AS path
 Next hop: next hop to reach a network
 Two routes are the same if they have the same AS
path and next hop
 Other attributes (e.g., MED, communities) ignored
 Focus on forwarding path stability
March 8, 2004
155
Measurement methodology
March 8, 2004
156
AS path
 Sequence of AS’s a route traverses
 Used for loop detection and to apply policy
AS-3
AS-4
130.10.0.0/16
AS-2
120.10.0.0/16
AS-5
110.10.0.0/16
AS-1
March 8, 2004
120.10.0.0/16 AS-2 AS-3 AS-4
130.10.0.0/16 AS-2 AS-3
110.10.0.0/16 AS-2 AS-5
157
BGP information exchange
 Announcements: a router has either
 Learned of a new route, or
 Made a policy decision that it prefers a new route
 Withdrawals: a router concludes that a
network is no longer reachable
 Explicit: associated to the withdrawal message
 Implicit: (in effect an announcement) when a
route is replaced as a result of an announcement
message
March 8, 2004
158
BGP information exchange
 In steady state BGP updates should be
only the result of infrequent policy
changes
 BGP is stateful, requires no refreshes
 Update rate: indication of network stability
March 8, 2004
159
Example of delayed convergence
stage
0
1
4
2: [1] [41] [431]
node 3: [1] [41] [241]
4: [1] [31] --
Example topology:
9
----
d
1
2
4
3
Assuming node 1 has a route to a destination, and it withdraws the route:
Stage (msg processed)
0:
1: 1->{2,3,4}W
Msg queued
1->{2,3,4}W
2->{3,4}A[241], 3->{2,4}A[341], 4->{2,3}A[431]
2: 2->{3,4}A[241]
3: 3->{2,4}A[341]
4: 4->{2,3}A[431]
3->{2,4}A[341], 4->{2,3}A[431]
4->{2,3}A[431], 4->{2,3}W
MinRouteAdver timer expires:
4->{2,3}W, 3->{2,4}A[3241], 2->{3,4}A[2431]
… (omitted)
9: 3->{2,4}W
Note: In response to a withdrawal from 1, node 3 sends out 3 messages:
March 8, 2004
3->{2,4}A[341], 3->{2,4}A[3241], 3->{2,4}W
160
Types of inter-domain routing
updates
 Forwarding instability
 may reflect topology changes
 Policy fluctuations (routing instability)
 may reflect changes in routing policy information
 Pathological updates
 redundant updates that are neither routing nor
forwarding instability
 Instability
 forwarding instability and policy fluctuation 
change forwarding path
March 8, 2004
161
Routing successive events
(instability)
 WADiff
 W: a route is explicitly withdrawn as it becomes
unreachable
 A: is later replaced with an alternative route
 Forwarding instability
 AADiff
 A: a route is implicitly withdrawn
 A: then replaced by an alternative route as the
original route becomes unavailable or a new
preferred route becomes available
 Forwarding instability
March 8, 2004
162
Routing successive events
(pathological instability)
 WADup
 W: a route is explicitly withdrawn
 A: then reannounced later
 forwarding instability or pathological behavior
 AADup
 A: a route is implicitly withdrawn
 A: then replaced with a duplicate of the original
route
 pathological behavior or policy fluctuation
March 8, 2004
163
Routing successive events
 WWDup
 The repeated transmission of BGP
withdrawals for a prefix that is currently
unreachable (pathological behavior)
March 8, 2004
164
Measurement findings:
overview
 Year 2000
 BGP updates more than one order of
magnitude larger than expected
 Routing information dominated by
pathological updates
 Implementation problems
 BGP self-synchronization
 Unconstrained routing policies
March 8, 2004
165
Routing problem findings
 Implementation problems
 Redundant updates
 Routers do not maintain the history of the
announcements sent to neighbors
 Self-synchronization
 BGP routers exchange information simultaneously
 may lead to periodic link/router failures
 Unconstrained routing policies may lead to
persistent route oscillations
March 8, 2004
166
Instability measurement
 Instability and redundant updates
exhibits strong correlation with load
 (30 seconds, 24 hours and seven days
periods)
 Instability usually exhibits high
frequency
 Pathological updates exhibits both high
and low frequencies
March 8, 2004
167
Non-localized instability
 No single AS dominates instability
statistics
 No correlation between size of AS and
its impact on instability statistics
 There is no small set of paths that
dominate instability statistics
March 8, 2004
168
Measurement conclusions
 Routing in the Internet exhibits many
undesirable behaviors




Instability over a wide range of time scales
Asymmetric routes
Network outages
Problem seems to worsen
 Many problems are due to software
bugs or inefficient router architectures
March 8, 2004
169
Lessons
 Even after decades of experience routing in
the Internet is not a solved problem
 This attests the difficulty and complexity of
building distributed algorithm in the Internet,
i.e., in a heterogeneous environment with
products from various vendors
 Simple protocols may increase the chance to
be
 Understood
 Implemented right
March 8, 2004
170
Part V: BGP Beacons --
An Infrastructure for BGP Monitoring
Better understanding of BGP
dynamics
 Difficulties




Multiple administrative domains
Unknown information (policies, topologies)
Unknown operational practices
Ambiguous protocol specs
Proposal: a controlled active measurement infrastructure
for continuous BGP monitoring – BGP Beacons.
March 8, 2004
172
What is a BGP Beacon?
 An unused, globally visible prefix with known
Announce/Withdrawal schedule
 For long-term, public use
March 8, 2004
173
Who will benefit from BGP
Beacon?
 Researchers: study BGP dynamics
 To calibrate and interpret BGP updates
 To study convergence behavior
 To analyze routing and data plane interaction
 Network operators
 Serve to debug reachability problems
 Test effects of configuration changes:
 E.g., flap damping setting
March 8, 2004
174
Related work
 Differences from Labovitz’s “BGP faultinjector”




Long-term, publicly documented
Varying advertisement schedule
Beacon sequence number (AGG field)
Enabler for many research in routing dynamics
 RIPE Ris Beacons
 Set up at 9 exchange points
March 8, 2004
175
Active measurement
infrastructure
Many Observation points:
1:Oregon RouteViews
Internet
ISP
2. RIPE
ISP
3.AT&T
ISP
Send
route update
Upstream
provider
Stub AS
BGP Beacon #1
March 8, 2004
198.133.206.0/24
ISP
ISP
ISP
ISP
Upstream
provider
ISP
ISP
ISP
4. Verio
5. MIT
6.Berkeley
176
Deployed PSG Beacons
Prefix
Src
AS
Start
date
Upstream
Beacon
provider AS host
Beacon
location
198.133.206.0/24
3130
8/10/02
2914, 1239
Randy Bush
WA, US
192.135.183.0/24
5637
9/4/02
3701, 2914
Dave Meyer
OR, US
203.10.63.0/24
1221
9/25/02
1221
Geoff Huston
Australia
198.32.7.0/24
3944
10/24/02 2914, 8001
Andrew Partan
MD, US
192.83.230.0/24
3130
06/12/03 2914, 1239
Randy Bush
WA, US
March 8, 2004
177
Deployed PSG Beacons
 B1, 2, 3, 5:
 Announced and withdrawn with a fixed period
 (2 hours) between updates
 1st daily ANN: 3:00AM GMT
 1st daily WD: 1:00AM GMT
 B4: varying period
 B5: fail-over experiments
 Software available at:
http://www.psg.com/~zmao
March 8, 2004
178
Beacon 5 schedule
Live host behind
the beacon for
data analysis
March 8, 2004
Study fail-over
Behavior for
multi-homed
customers
179
Beacon terminology
Internet
Beacon
AS
Beacon prefix:
198.133.206.0/24
Input signal:
Beacon-injected change
3:00:00 GMT: Announce (A0)
5:00:00 GMT: Withdrawal (W)
March 8, 2004
Output signal:
RouteView
AT&T
5:00:10 A1
5:00:40 W
5:01:10 A2
Signal length: number of updates in
output signal (3 updates)
Signal duration: time between first and
last update in the signal (5:00:10 -5:01:10, 60 seconds)
Inter-arrival time: time between
consecutive updates
180
Process Beacon data
 Identify output signals, ignore external events
 Minimize interference between consecutive
input signals
 Time stamp and sequence number
March 8, 2004
181
Process Beacon data
 Identify output signals, ignore external events
 Data cleaning
 Anchor prefix as reference
 Same origin AS as beacon prefix
 Statically nailed down
 Minimize interference between consecutive
input signals
 Time stamp and sequence number
March 8, 2004
182
Process Beacon data
 Identify output signals, ignore external events
 Minimize interference between consecutive
input signals
 Beacon period is set to 2 hours
 Time stamp and sequence number
March 8, 2004
183
Process Beacon data
 Identify output signals, ignore external events
 Minimize interference between consecutive
input signals
 Time stamp and sequence number
 Attach additional information in the BGP updates
 Make use of a transitive attribute: Aggregator fields
March 8, 2004
184
Beacon data cleaning process
 Goal
 Clearly identify
updates associated
with injected routing
change
 Discard beacon
events influenced by
external routing
changes
March 8, 2004
185
Beacon example analysis
 BGP implementation impact
 Cisco vs. Juniper
 Route flap damping analysis
 Convergence analysis
 Inter-arrival time analysis
March 8, 2004
186
Cumulative Beacon statistics:
significant noise
 Current observation points:
 111 peers: RIPE, Route-View, Berkeley,
MIT, MIT-RON nodes, ATT-Research, AT&T,
AMS-IXP, Verio
Avg expansion: 2*0.2+1*0.8=1.2
March 8, 2004
187
Cumulative Beacon statistics:
significant noise
 Example response to ANN-beacon at peer p
 R1: ASpath= 286 209 1 3130 3927
No. transient routes=2
 R2: ASpath= 286 209 2914 3130 3927
Out-signal length=1
 100 events: 20: R1 R2, 80: R2
Beacon
Max no. Max ANN- Max WD- Max ANN-avg Max WD-avg
transient out-signal out-signal
expansion
expansion
routes
length
length
1
186
11
14
9.7
11.2
2
179
9
15
7.0
10.8
3
117
16
13
5.8
11.4
4
307
18
15
8.8
16.3
March 8, 2004
188
Cisco vs. Juniper
update rate-limiting
Known last-hop Cisco
and Juniper routers
from the same AS and
location
Average signal length:
average number of
updates observed for a
single beacon-injected
change
March 8, 2004
189
“Cisco-like” last-hop routers
Linear increase in
signal duration wrt
signal length
Slope=30 second
Due to Cisco’s
default rate-limiting
setting
March 8, 2004
190
“Juniper-like” last-hop routers
Signal duration
relatively stable
wrt increase in
signal length
Shorter signal
duration compared
to “Cisco-like”
last-hops
March 8, 2004
191
Route flap damping
 A mechanism to punish unstable routes
by suppressing them
 RFC2439 [Villamizar et al. 1998]
 Supported by all major router vendors
 Believed to be widely deployed [AT&T,
Verio]
March 8, 2004
192
Goals
 Reduce router processing load due to
instability
 Prevent sustained routing oscillations
 Do not sacrifice convergence times for
well-behaved routes
There is conjecture a single announcement can
cause route suppression.
March 8, 2004
193
Route flap damping
Cisco default setting
 Scope
Penalty
Exponentially decayed
3000
Suppress threshold
 Inbound external routes
 Per neighbor, per
destination
 Penalty
 Flap: route change
 Increases for each flap
 Decays exponentially
2000
1000
750
P(t ' )  P(t )e (t 't )
Reuse threshold
0
2
4
March 8, 2004
32
Time (min)
194
Route flap damping analysis
Strong evidence for
withdrawal- and
announcementtriggered
suppression.
March 8, 2004
195
Distinguish between
announcement and withdrawal
Summary:
•WD-triggered sup
more likely
than ANNtriggered sup
•Cisco: overall
more likely trigger
sup than Juniper
(AAAW-pattern)
•Juniper: more
aggressive for
AWAW pattern
March 8, 2004
196
Convergence analysis
Summary:
•Withdrawals
converge
slower than
announcements
•Most beacon
events converge
within 3 minutes
March 8, 2004
197
Output signal duration
30
March 8, 2004
60
90
120
198
Beacon 1’s upstream change
Single-homed
(AS2914)
March 8, 2004
Multi-homed
(AS1,2914)
Multi-homed
(AS1239, 2914)
199
Beacon for identifying router
behavior
Rate-limiting timer
30 second
Beacon 2
Seen from
RouteView data
Different
rate-limiting
behavior:
Cisco vs.
Juniper
March 8, 2004
200
Complementary cumulative
distribution plot
Inter-arrival time analysis
Cisco-like last-hop routers
March 8, 2004
201
Inter-arrival time analysis
March 8, 2004
202
Inter-arrival time modeling
 Geometric distribution (body):
 Update rate-limiting behavior: every 30 sec
 Prob(missing update train) independent of how many
already missed
 Mass at 1:
 Discretization of timestamps for times<1
 Shifted exponential distribution (tail):
 Most likely due to route flap damping
March 8, 2004
203
Summary
 Beacons -- a public infrastructure for
BGP analysis
 Shown examples of Beacon usage
 Future direction
 Construction of robust and realistic model
for BGP dynamics
 Correlation with data plane
 Analysis of RIPE Beacons
http://www.psg.com/~zmao
March 8, 2004
204
Part VI: Implication of Routing
Instability
BGP routing (in)stability
 Large # of BGP updates
 Failures
 Policy changes
 Redundant messages
 Implications
 Router overhead
 Transient delay and loss
 Poor predictability of traffic flow
March 8, 2004
206
All flaps are NOT created equal
Does instability hamper network engineering?
March 8, 2004
207
BGP routing and traffic popularity
 A possible saving grace…
 Most BGP updates due to few prefixes
 … and, most traffic due to few prefixes
 ... but, hopefully not the same prefixes
March 8, 2004
208
Popularity vs. BGP stability
 Do popular prefixes have stable routes?
 Yes, for ~ 10 days at a stretch!
 Does most traffic travel on stable
routes?
 A resounding yes!
 Direct correlation of popularity and
stability?
 Well, no, not exactly…
March 8, 2004
209
BGP Updates
 BGP updates (March 2002)
 AT&T route reflector
 RouteViews and RIPE-NCC
 Data preprocessing
 Filter duplicate BGP updates
 Filter resets of monitor sessions
 Removes 7-30% of updates
March 8, 2004
210
BGP update events
 Grouping updates into “events”
 Updates for the same prefix
 Close together in time (45 seconds)
 Reduces sensitivity to timing
Confirmed: few prefixes responsible for most events
March 8, 2004
211
Two Views of Prefix Popularity
 AT&T traffic data
 Netflow data on peering
links
 Aggregated to the prefix
level
 Outbound from AT&T
customers
Internet
in
out
AT&T
 Inbound to AT&T customers
March 8, 2004
212
Two Views of Prefix Popularity
 NetRatings Web sites
 NetRatings top-25 list
Amazon
 Convert to site names
www.amazon.com
 DNS to get IP addresses
207.171.182.16
 Clustered into 33 prefixes
207.171.176.0/20
March 8, 2004
213
Traffic volume vs. BGP events
(CDF)
Traffic volume (%)
100
Inbound
Outbound
80
50% of events
1.4% of traffic
(4.5% of prefixes)
60
40
50% of traffic
0.1% of events
(0.3% of prefixes)
20
0
0
March 8, 2004
10 20 30 40 50 60 70 80 90 100
BGP events (%)
214
Update events/day (CCDF, log-log
plot)
Percentage (%)
100
All
Inbound
Outbound
Netrating
10
1
1% had > 5
events per day
No “popular” prefix
had > 3 events per
day
0.1
Most “popular” prefixes had < 0.2
events/day and just 1 update/event
0.01
March 8, 2004
0.1
1
#Events/Day
10
215
An interpretation of the results
 Popular  stable
 Unstable  unpopular
 Unpopular does not imply unstable
March 8, 2004
216
An interpretation of the results
 Popular  stable
 Well-managed
 Few failures and fast recovery
 Single-update events to alternate routes
 Unstable  unpopular
 Unpopular does not imply unstable
March 8, 2004
217
An interpretation of the results
 Popular  stable
 Unstable  unpopular
 Persistent flaps: hard to reach
 Frequent flaps: poorly-managed sites
 Unpopular does not imply unstable
March 8, 2004
218
An interpretation of the results
 Popular  stable
 Unstable  unpopular
 Unpopular does not imply unstable
 Most prefixes are quite stable
 Well-managed, simple configurations
 Managed by upstream provider
March 8, 2004
219
Summary
 Measurement contributions
 Grouping BGP updates into “events”
 Popular prefixes from NetRatings
 Joint analysis of popularity & stability
 Positive result for network operators
 BGP instability does not affect most traffic
March 8, 2004
220
Open problems
 Stability of the IP forwarding path
 Does popularity imply stable forwarding path?
 Relationship between BGP and forwarding
path?
 BGP traffic engineering
 Tune BGP routing policies to prevailing traffic
 Prefixes with stable BGP routes & high/stable
volumes
March 8, 2004
221
Part VII: BGP Security Issues
Why do we care about
Internet routing security?
 BGP ties the Internet together
 Critical communication infrastructure
 BGP is vulnerable to configuration and routing
attacks
 No mechanisms in verifying correctness of routing
information
 Configuration errors are common
 Example: April 1997 small ISP in Virginia mistaken
announces “attractive” routes, creating blackholes
March 8, 2004
223
Can BGP be easily attacked?
 Example routing attacks




Fraudulent origination
Fraudulent modification
Overloading router CPU
Prefix hijacking
 Impact
 Traffic black holed
 Destinations unreachable – “dark” address space
 Traffic intercepted, modified
March 8, 2004
224
Dark address space [Arbor]
March 8, 2004
225
Quote from Rob Thomas
“I would stress that all of these things, particularly
prefix hijacking and backbone router ‘ownage’, are
real threats, happening today, happening with
alarming frequency. Folks need to realize that the
underground is abusing this stuff today, and has been
for quite some time.”
March 8, 2004
226
Restrictive route filters help!
March 8, 2004
227
Current proposals
 SBGP: Secure BGP
 http://www.net-tech.bbn.com/sbgp/sbgp-
index.html
 Routing origination digitally signed
 BGP updates digitally signed
 Address-based PKI to validate signatures
 SO-BGP: Secure Origin BGP
 ftp://ftp-eng.cisco.com/sobgp/index.html
 Guards against origination fraud
 No protection against mid-path disruptions
March 8, 2004
228
Current proposals
 Current ad hoc solutions
 TCP MD5 (RFC 2385) protects a single hop
 Inbound filters, route limits, martian checks, BTSH
(ttl hack)
 Neither SBGP nor So-BGP guarantees that
routes are actually usable
 Provides accountability
March 8, 2004
229
Details of SBGP
 Uses PKI
 Signing party certifies the next hop and
propagates it throughout the net
 Use optional, transitive BGP attributes
to encode signatures
March 8, 2004
230
SBGP optimization
 Predistribute most certificates to each
BGP speaker
 Offload certificate verification
 Lazy validation of routes
 Cache signed routes and originations
March 8, 2004
231
Why is SBGP not here today?
 Expensive to deploy
 Steady state overhead is 1.4 Kbps
 Consumes a lot of CPU – need hardware support
 Need more memory on routers
 PKI has to be set up
 Complex
 Requires router upgrade
 Do not deal with route withdrawals
 Perhaps an intermediate solution can be used
 PKI among tier-1 ISPs
March 8, 2004
232
Generic Threats to Routing
Protocols
 [Barbir,Murphy,Yang2003]
 Provides a framework for discussion of
 Routing attacks
 Defense and detection mechanisms
March 8, 2004
233
Classification of vulnerability
 Design: inherent choice in protocol spec
 Important to discover
 Implementation: bug based on coding error
 Should eventually get fixed
 Misconfiguration: weak passwords, failure to
use security features, block admin ports
 More prevalent today and need better tools for
configuration
March 8, 2004
234
Background
 Scope: all routing protocols
 Routing functions:
 Transport subsystems: e.g., IP or TCP
 Can be attacked
 Neighbor state maintenance
 Configuration of neighbors: e.g., HELLO,
KeepAlive
 Database maintenance: routing state
March 8, 2004
235
Threat sources
 Insider: transmit bogus messages
 Outsider: subvert unprotected transport
 Read, insert, reply, modify
 Outsider is more difficult?
March 8, 2004
236
Threat consequences
 Network as a whole
 Network congestion
 Routing loops
 Routing information disclosure
 Arguable less true for Internet interdomain routing
 Routing instability, churn
 Routing blackholes
 Network partition
 Router overload
March 8, 2004
237
Threat consequences
 Individual prefixes




Starvation or blackhole
Eavesdrop
Cut: external reachability affected
Delay or performance degradation, loops
March 8, 2004
238
Threat consequence
 Threat consequence zone
 The area within which threat actions are
affected
 Threat consequence periods
 Duration: long lived?
 Does the protocol itself have a way to limit
duration?
 E.g., route refreshes
March 8, 2004
239
Threat actions
 Some actions can be prevented e.g.,
authorization policies with strong
authentication
 Some actions can be detected: auditing and
logging
 Tradeoff between security and performance
 “Complexity if the enemy of security” –smb
My comment: after detection what is required to revert to
the normal state?
-- An important operational issue
March 8, 2004
240
BGP Vulnerability Testing
 [Nanog28: Convery&Franz]
 Is BGP really vulnerable?
 Answer the question based on testing 7
vendor implementations
March 8, 2004
241
TCP connection establishment
tests
 Varies from “silent reject” to “full 3-way
handshake”
 BGP RST or NOTIFICATION
 Timeout varies from none to 1-3
minutes before next attack attempt
 No BGP session established:
 TCP spoofing is required to inject data
March 8, 2004
242
Effect of TCP resource
exhaustion on BGP
 Goal: prevent new BGP sessions from being
established or impact existing sessions
 Methods: SYN, ESTABLISHED, FIN-WAIT1 flooding,
 Result
 Up to 5-6 minutes delay in BGP session establishment
 Moderate increase in CPU utilization and latency
 No impact on existing sessions
 For significant impact, attacker needs to break the
current session and SYN flood both peers
 ACL can help reduce impact on CPU
March 8, 2004
243
TCP reset and BGP route
insertion
 Blind TCP sequence number guessing
operationally impossible
 Pseudo-random ISN
 Requires some guessing work
 Routers can notice
 Based on large packet volume
 Assume one can guess TCP seq no.
 Routes can be inserted
 ACK with overlapping seq no. will detect it
 May impact the FIB and takes some time to flush
bad route
March 8, 2004
244
BGP peer hijack using ARP
spoofing
 Arpspoof allows an attacker to poison the Arp
table of a BGP peer on a LAN
 Session is terminated and reestablished with
the attacker
 Defense mechanisms:
 Static Arp for ethernet peering
 Static CAM entries and port security for ethernet
switches
 Detection: duplicate ARP replies
March 8, 2004
245
BGP/TCP implementation
recommendations
 Extensive, configurable logging of connection
failures
 Aggressive rejection of TCP connections from
non-configured peers and aggressive
timeouts
 To minimize TCP resource exhaustion attacks
 Source port randomization
 Length BGP session timeouts
 To minimize message flooding attacks
 BGP TTL Hack
March 8, 2004
246
Best common practice
 A compromised router is the most
valuable asset to an attacker
 Non BGP specific
 Router hardening
 Packet filtering to stop spoofed BGP
messages at the edge prevents almost
all TCP based attacks
March 8, 2004
247
Considerations in validating
the path in routing protocols
 [draft-white-pathconsiderations-00.txt]
 Path vector protocol participant cannot
verify
 whether the path a packet takes to its
destination corresponds to the path
advertised by the routing protocol
 whether the chosen path is in accordance
with the policies of other ASes.
March 8, 2004
248
Why?
 This due to
 path vector routing protocols abstract
information about intra-AS routing
decisions
 ASes can remove routes form the routing
systems, this may prevent another AS from
enforcing its own policy
March 8, 2004
249
Validity of a path
1. Does a path from the advertising router
to the destination advertised actually
exist?
2. Does the path advertised fall within the
policies of the route's originator and all
intermediate autonomous systems?
3. Is the advertising router authorized to
advertise a path to the destination?
March 8, 2004
2 and 3 cannot be verified in a
distance or path vector protocol
250
Example 1
The advertised path may not fall
within the policies of the receiver
E: local path
C: through AS 3
D: through E
B: through E
AS 3
AS 9
C
E
AS 1
A
B
AS 5
10.1.1.0/24
AS 2
D
B’s routing table:
AS path, nexthop
[5]
E (preferred)
[3 5]
C
March 8, 2004
AS 6
AS 7
2518
AS
Some subtleties here
 BGP forwarding information looks like this:
 Prefix and nexthop
 Nexthop is the IP address of the nexthop router
for forwarding traffic
 You must have the IGP route to the nexthop for
the route to be usable
 When B forwards traffic, it goes through C to
reach E – the nexthop of the path
 C’s forwarding table is inconsistent with B
 It prefers AS path [2 3 5]
March 8, 2004
252
Why can this happen?
 Intra-AS configuration of an AS can cause
packets to follow a path inconsistent with
advertised path
 Internal inconsistency in routing decisions
within an AS
 Path vector routing depends interior routing
protocols
 Other examples: route reflection
 Any lesson here?
 Guarantee the consistency of routes for all routers
within an AS
March 8, 2004
253
Example 2
Advertising router may not be authorized to
advertised a path to the destination
10.1.2.0/23
AS B
AS A
AS D
10.1.2.0/23
AS C
10.1.2.0/24
10.1.2.0/23
10.1.2.0/24
10.1.2.0/23
D prefers
10.1.2.0/24 from C (more secure)
10.1.3.0/24 from B
 A does not receive 10.1.2.0/24 from C
 A’s choice of [B D] overrides D’s implicit policy of only
accepting this traffic from C
 This is due to removal of information from the routing
system
 Lack of information does NOT mean lack of authorization to
March 8, 2004 transit a path
254
How can routing information
be “deleted”?
 Filtering based on prefix length
 Filtering based on the presence of
supernets
 Filtering based on receiver
 Doesn’t want to transit traffic for a peer
 Very prevalent especially between peers
or inside Internet core
March 8, 2004
255
BGP security summary
 BGP is vulnerable to attacks
 Defensive configuration really helps
 Restrictive route import filters
 Consistency checking
 Many proposals exist for improving BGP
security
 Tradeoff between performance and security
 Tradeoff between complexity and ease of
management
March 8, 2004
256
What does the future hold for
BGP?
 BGP is becoming increasing more complex
 New features: e.g., support for VPN [RFC2547]
 Susceptible to malicious users
 Core router “ownage”
 Vulnerable to misconfigurations
 Oops! I forgot leaked some routes!
 Increased scalability requirements
 Address usage growth
 IPV6
 Increased address usage
 Deaggregation
March 8, 2004
257
BGP does not support realtime application today
 Failover takes minutes, not seconds
 It can get worse due to
 Multihoming
 More networks
 Is overlay networks the answer to BGP’s
bad performance?
March 8, 2004
258
Some BGP future research
directions
 Traffic engineering
 Inbound & outbound traffic control
 Security
 Protecting routing infrastructure
 Router-based DDoS defense
 Simplification for BGP policies
 Policy language
 Network debugging
 Whose network is at fault?
March 8, 2004
259