IP Tutorial - Electrical Engineering Department


TCP/IP Naming, Addressing, and
Routing
An IP Tutorial
Tutorial Overview
- Part 1: Internet Background
- Part 2: Internet Basics
- Part 3: How does data get from A to B?
- Part 4: IP Routing
- Part 5: IP QoS
- Part 6: Internet History, Governance, References

What is the Internet?
A very large
“network of networks.”
Uses TCP/IP protocols and
packet switching.
Runs on any communications
substrate.
Internet Architecture: WAN
Interconnection Points
(NAPs/MAEs)
National Service Providers (NSPs)
Regional
Enterprise
Enterprise
Regional
Internet Architecture:
Enterprise Attachment
H1
H
Internet
Service
Provider
H
FDDI
Net # 1
R1
Ethernet
Net # 2 H5
H6
H7
R2
Private
Line
H2
H3
H4
Ethernet
Net # 3
Internet - Recent Statistics







20M hosts, 18K adds/day
755K “www”-prefixed hosts, 256% annual
growth rate
Highest growth rate: USA (1), Japan (2)
1300K Domains (60/40 USA vs. Rest)
Largest domain: “.com” with 4.5M hosts
214 connected IP countries
55 million users
Internet Growth 1969-1997
[Figure: log-scale plot (1 to 100,000,000) of Hosts, Networks, and Domains over time]
Worldwide Networks Growth
[Figure: number of countries (0 to 180) reached by Internet, Bitnet, UUCP, FidoNet, and OSI networks, 1991 to 1997]
Internet Traffic Statistics
- Internet NAP traffic ~ 1 Gbps, growing at 5x/year
- Total Internet Bandwidth ~ 350 Gbps
- World’s telecom traffic ~ 1 Tbps
Comparing Internet Growth
- Telephone Lines: CAGR* = 5.1%
- Cellular Phones: CAGR = 68.9%
- Internet Users: CAGR = 113.1%
* Compound Annual Growth Rate
Moore’s Law vs. Internet Growth
Moore’s Law
Internet Growth
PC Performance Growth
= 2 x Every 18 months
Internet Bandwidth
Demand Growth
= 2 x Every 3-4 months

Part 2: Internet Basics
- Philosophy and Terminology
- Addressing
- Naming and the Domain Name System
Design Philosophies
- Shared Fate Principle
  - connection state maintained at end-points
  - little state maintained in routers
- Addresses are Globally Significant
  - allows local decisions on routing
- Provide a Virtual Network Layer
  - separates physical/link layers from internetwork layer
Connectionless Paradigm
- There is no “connection” in IP
  - Packets can be delivered out-of-order
  - Each packet can take a different path to the destination
  - No error detection or correction in payload
  - No congestion control (beyond “drop”)
- TCP mitigates these for connection-oriented applications
  - error correction is by retransmission
Connectionless Example
H
H
Internet
Service
Provider
H
FDDI
Private
Line
Router
Ethernet
H
Router
H
Ethernet
H
H
H
H
Internet Protocol Architecture
Ping
FTP
TELNET
SMTP
ICMP
HTTP
DNS
RTP
BGP
SNMP
RIP
UDP
TCP
OSPF
IP
LANs
10/100BaseT
ATM
FR
Dedicated B/W:
DSx, SONET, ...
PPP
Circuit-Switched B/W:
POTS, SDS, ISDN, ...
CDPD
Wireless
OSI Hierarchy
7 Application
6 Presentation
5 Session
4 Transport
3 Network
2 Link
1 Physical
- Physical
  - SONET, T1, T3
- Link
  - Ethernet, FDDI
  - Circuit, ATM, FR switches
- Network
  - Routing, Call control
  - IP internetworking
OSI Hierarchy
7 Application
6 Presentation
5 Session
4 Transport
3 Network
2 Link
1 Physical
- Transport
  - Error and congestion control
  - TCP, UDP
- Session, Presentation, Application
  - Data, voice encodings
  - Authentication
  - web/http, ftp, telnet
TCP/IP: Postal Analogy
- IP Packets are like Postcards
  - Globally significant To/From Addresses
  - Finite but variable length content
  - Variable delays
  - Delivery failures
  - Out-of-order deliveries
  - May take different routes
- In networking language, IP is “connectionless”
TCP: Postal Analogy
- TCP is like sending a Novel on Postcards
  - Network delivers postcards “best effort”
  - Endpoints handle all service actions above “best effort”
    – Page numbering (ordering, duplicate detection)
    – Positive Acknowledgment
    – Retransmission on Timeout
- In networking language, TCP is “connection-oriented”
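The "page numbering" idea can be sketched in a few lines of Python. This is only an illustration of ordering and duplicate detection at the receiving endpoint; the function name `reassemble` and the sample postcards are assumptions, and acknowledgment and retransmission are left out.

```python
def reassemble(postcards):
    """Put numbered postcards back in order at the receiving endpoint.

    postcards: iterable of (page_number, text) in arrival order.
    Returns (pages deliverable in order, next page number still awaited).
    """
    pages = {}
    for number, text in postcards:
        # Duplicate detection: a page number already seen is ignored.
        pages.setdefault(number, text)

    delivered = []
    expected = 1
    # Ordering: hand pages to the reader only while there is no gap.
    while expected in pages:
        delivered.append(pages[expected])
        expected += 1
    return delivered, expected


# Page 2 arrives first and twice; page 3 is lost, so page 4 must wait.
arrivals = [(2, "b"), (1, "a"), (2, "b"), (4, "d")]
print(reassemble(arrivals))
```

The returned page number is the one the receiver is still waiting for, which is what a sender would retransmit after a timeout.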
IP Network Model
- The Internet is a “network of networks”
- A network is a collection of hosts that can communicate directly with each other
  - Any pair can communicate
  - The network defines how the pair exchanges information
IP Network Model
- An internet is a concatenation of networks
  - The networks involved may be (and usually are) heterogeneous
  - An end-to-end path is achieved by concatenating the transport of data over possibly multiple networks
  - A Router mediates the differences between the preceding and succeeding networks in the concatenation
Ramifications of Design Principles
- Hosts contain connection state
  - Amount of state maintained is determined by the application
  - Not all applications require the same amount of state (e.g., reliable delivery)
- Network elements contain either no connection state or only “soft” state
  - “Soft” state is state that can be lost and refreshed without completely losing the “connection”
Ramifications of Design Principles
- Since intermediate systems do not maintain “hard” state, requested QoS is difficult to manage
  - When soft state is lost, intermediate systems will not be able to maintain the QoS (the information on what the QoS was is lost momentarily)
Ramifications of Design Principles
- IP routers take actions independent of other routers to forward data toward its destination
  - IP routers make local decisions only; there is no network-wide coordination
  - a bad routing decision by one router can be corrected by its neighbors
  - a failure of a router does not affect the forwarding of traffic to a destination not directly attached to the failed router
Ramifications of Design Principles
- Implementation Performance Varies
  - Most implementations are highly optimized for the most common case
  - Use of other IP features can cause significant performance degradation
    – out-of-order datagram delivery
    – use of IP options
Bandwidth Bottlenecks

Routing Protocols Create A Single "Shortest Path"
C1
C3
C2
"Longer" paths
become underutilised
Path for C1 <> C3
Path for C2 <> C3
Engineering-Out

The Bottlenecks
ATM Switches Enable Traffic Engineering
C1
C3
C2
PVC C1 <> C3
PVC C2 <> C3
MPLS Takes Over

MPLS LSRs Enable Traffic Engineering
C1
C3
C2
LSP C1 <> C3
LSP C2 <> C3
MPLS Path Creation:
Quality of Service Refinements

Source device (S) determines the type of path on the
basis of the data
S
D
Low delay (preferred for VoIP traffic)
High bandwidth (preferred for FTP)
Hosts, Subnets, &
Routers
Protocols above IP
Host
Host
IP Subnet
(No IP Processing)
R
R
IP Processing
IP Subnet
(No IP Processing)
R
IP Subnet
(No IP Processing)
R
IP Subnet
(No IP Processing)
IP Packets
IP Subnet: Ethernet, Private Line, Frame Relay, ATM, ….
Names and Addresses

Every TCP/IP device (optionally) has a
“name”. Each IP subnet interface on
the device has an IP “address” and one
or more “subnet specific addresses”
(sometimes called “physical
addresses”).
Names and Addresses
Name: Character string based on a
“domain” structure, e.g., www.att.com
 IP Address: A.B.C.D (4-octet binary
string consisting of “subnet id” and
“host id”)

Subnet Specific Addresses

Subnet Specific Addresses are often
referred to as “physical addresses” but
are really either
 true
network addresses (like E.164, ATM
End System Addresses)
 link layer addresses (like Frame Relay
DLCIs or ATM VPI/VCI)
Examples of Subnet Specific
Addresses
Ethernet, IEEE 802.3 MAC/link
 Frame Relay (E.164/network,
DLCI/link)
 Circuit-switched (E.164/network)
 ATM (E.164/network, AESA/network,
VPI/VCI/link)
 Dedicated Serial Line (null subnet
specific address)

Subnet Confusion Possible

Note: the term “subnet” is also used as
a logical subdivision of the IP address
space
 which
is meant should be clear from the
context
Names & Addresses: An
Example
IP: A.3
E.164: 201-876-4477
H
R
Circuit-switched Net
(IP subnet id = A)
IP: A.1
E.164: 908-949-1254
IP: C.1
IP: A.2
E.164: 212-546-1355
IP: B.1
NSAP: af26c9
Private Line Net
(IP subnet id = C)
R
VPI/VCI: 555
VPI/VCI: 898
ATM Network
(IP subnet id = B)
VPI/VCI: 456
IP: B.3
NSAP: ed43fc
VPI/VCI: 222
IP: C.2
R
VPI/VCI: 666
VPI/VCI: 222
IP: B.2
NSAP: cd675f
IP: D.2
MAC: 458ef9
Ethernet
(IP subnet id = D)
R
IP: D.3
MAC: b23cd1
Name: www.att.com
H
IP: D.1
MAC: efd462
IP Addresses
- IP version 4 addresses are all 32 bits in length
- Representation is in “dotted-decimal” notation: A.B.C.D
  - A is the decimal number equivalent to the 8-bit quantity in the first octet
  - B is the decimal number equivalent to the 8-bit quantity in the second octet, etc.
- All IP addresses contain a “network” part and a “host” part
IP Address Network/Host Parts
- When a specific boundary between network and host parts is needed:
  - a “subnet” mask is paired with the address
    – the mask is ANDed with the address to obtain the network part
    – e.g., 255.255.255.0 means that the first 3 octets are network and the last octet is host, or
  - a specific bit-length is included
    – the length is placed after a slash separating the address from the length
Example: Subnet/Host Address
- Example: Host snipe.ho.att.com
  - IP address is 135.16.157.112
  - IP network is 135.16.157.0 255.255.255.0
  - IP network is 135.16.157.0/24
- Which representation to use is determined by local software
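The mask-and-AND step for the snipe.ho.att.com example can be sketched with Python's standard `ipaddress` module. The helper names `network_part` and `prefix_length` are illustrative assumptions, not part of the tutorial.

```python
import ipaddress


def network_part(address: str, mask: str) -> str:
    """AND the 32-bit address with the 32-bit mask to get the network part."""
    addr_bits = int(ipaddress.IPv4Address(address))
    mask_bits = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(addr_bits & mask_bits))


def prefix_length(mask: str) -> int:
    """Convert a contiguous subnet mask to the equivalent slash length."""
    return bin(int(ipaddress.IPv4Address(mask))).count("1")


print(network_part("135.16.157.112", "255.255.255.0"))  # 135.16.157.0
print(prefix_length("255.255.255.0"))                   # 24
```

Both notations carry the same information, which is why local software can use either one.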
Classless Inter-Domain Routing (CIDR)
- IP addresses originally had a “natural” network length
  - Class A addresses had an 8-bit network and 24-bit host part
  - Class B addresses had a 16-bit network and 16-bit host part
  - Class C addresses had a 24-bit network and 8-bit host part
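As a rough sketch of the "natural" lengths, the class of a pre-CIDR address was determined by its leading bits, which is equivalent to a range check on the first octet. The function name `natural_prefix_length` is an assumption made for this example.

```python
def natural_prefix_length(address: str) -> int:
    """Return the pre-CIDR "natural" network length for a dotted-decimal address."""
    first_octet = int(address.split(".")[0])
    if first_octet < 128:   # leading bit 0: Class A
        return 8
    if first_octet < 192:   # leading bits 10: Class B
        return 16
    if first_octet < 224:   # leading bits 110: Class C
        return 24
    raise ValueError("not a Class A, B, or C address")


print(natural_prefix_length("135.16.157.112"))  # 16 (Class B)
```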
CIDR and Addresses
- Later subnet extensions were allowed
  - the natural network part could be extended out to, but not including, the host part
  - when this is done, a subnet mask is required to allow various IP processing stages to determine the network/host boundary
CIDR and Addresses
- CIDR removes the “natural” network length
  - subnets can now be any prefix of length 1 to 31 bits
  - this required changes to routing protocols to allow carriage of the subnet length field
IP Packet Structure
Header
S
D
...
Data
S = Source Address (“Calling Number”)
D = Destination Address (“Called Number”)
IP Packet Structure
| 4-bit Version | 4-bit Header Length | 8-bit Type of Service (TOS) | 16-bit Total Length (Bytes) |
| 16-bit Identification | 3-bit Flags | 13-bit Fragment Offset |
| 8-bit Time to Live (TTL) | 8-bit Protocol | 16-bit Header Checksum |
| 32-bit Source IP Address |
| 32-bit Destination IP Address |
| Options (if any) |
| Payload |
The first five rows form the 20-byte Header.
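The fixed 20-byte header layout can be illustrated with Python's `struct` module. This is a minimal sketch: the function name `parse_ipv4_header` and the sample field values are assumptions, the checksum in the sample is a zero placeholder rather than a computed value, and options are not handled.

```python
import socket
import struct

# Network byte order: version/header-length byte, TOS, total length,
# identification, flags + fragment offset, TTL, protocol, checksum,
# source address, destination address. Total: 20 bytes.
HEADER_FORMAT = "!BBHHHBBH4s4s"


def parse_ipv4_header(raw: bytes) -> dict:
    """Unpack the fixed part of an IPv4 header into named fields."""
    (ver_ihl, tos, total_length, ident, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack(HEADER_FORMAT, raw[:20])
    return {
        "version": ver_ihl >> 4,                      # upper 4 bits
        "header_length_bytes": (ver_ihl & 0x0F) * 4,  # lower 4 bits, in 32-bit words
        "tos": tos,
        "total_length": total_length,
        "identification": ident,
        "flags": flags_frag >> 13,                    # upper 3 bits
        "fragment_offset": flags_frag & 0x1FFF,       # lower 13 bits
        "ttl": ttl,
        "protocol": protocol,
        "checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


# Hypothetical sample header: version 4, 5 words long, TTL 64, protocol 6 (TCP).
sample = struct.pack(
    HEADER_FORMAT, (4 << 4) | 5, 0, 40, 1, 0, 64, 6, 0,
    socket.inet_aton("135.16.157.23"), socket.inet_aton("135.16.157.15"),
)
header = parse_ipv4_header(sample)
print(header["version"], header["header_length_bytes"], header["source"])
```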

Part 3: How Does a Datagram get from A to B?
- Host Configurations
  - How does a host get an IP address?
  - Other fixed configurations: DNS server and default router
- Name to address translation
- Mask and Match on Address
  - Decision: resolve the address or forward?
- Address resolution
Getting from A to B
- Host address resolution protocol (ARP) and table
- Host forwarding table
Host Configurations
- A host needs to be configured to know 3 IP addresses
  - Its own IP address
  - The IP address of its DNS server (two are preferred, primary and secondary)
  - The IP address of the default router it will use to reach hosts not on its local (sub)network
- These can be either static (manual) or dynamic configurations
Host Configurations
- A host also needs to know the subnet mask (or prefix length) of its own IP address
  - subnet mask uses a 32-bit quantity with logical AND to extract the IP subnet
  - prefix length explicitly indicates what part of the local IP address is the IP subnet
Dynamic Configuration
- Dynamic Host Configuration Protocol (DHCP)
  - Uses central administration to maintain a server
  - The protocol uses the host’s Ethernet address (on its interface) to identify it
  - The DHCP server responds with the specific configuration information for that host
DHCP at Bootup
DHCP: Dynamic Host Configuration Protocol
DHCP Response
Broadcast DHCP Request:
Ethernet
Name: myhost.att.com
IP addr: 135.16.12.44
MAC addr: ef655c
Source. MAC addr. =
ef655c
No IP Addr
DHCP
Server
Name/Address Translations
IP Over Ethernet
DNS: Domain Name Server
ARP: Address Resolution Protocol
DNS
ARP: 135.16.12.44?
ef655c
Ethernet
http://www.att.com
Dest. MAC addr. =
ef655c
Dest. IP Addr.=
135.16.12.44
Name: www.att.com
IP addr: 135.16.12.44
MAC addr: ef655c
Name/Address Translations
IP Over ATM Network
DNS: Domain Name Server
ARPS: Address Resolution Protocol
Server
DNS
ARPS
ATM Network
SVC set-up to “ef655c”
Assign VPI/VCI = 1234
http://www.att.com
VPI/VCI = 1234
Dest. IP Addr.=
135.16.12.44
Name: www.att.com
IP addr: 135.16.12.44
NSAP addr: ef655c
Name to Address Translation
- The host obtains a name from the user
  - www.att.com
- The “resolver” is called to map the name to an address
- A name resolution query is sent to the configured DNS server
Name to Address Translation
- The DNS server responds with
  - the address(es) corresponding to the name, if it knows them, or
  - the address of another server that should know more
- Translation can be name to:
  - Host address
  - Mail exchange
  - other information (e.g., services supported)
Name to Address: Example
- A host named coyote.acme.com wants to know the address of roadrunner.aspca.org
  - Assume the configured name server for coyote is dns.acme.com
Name to Address: Example
- dns.acme.com receives a name query for roadrunner.aspca.org
  - this DNS server has no idea about
    – roadrunner.aspca.org, or
    – aspca.org
  - but it knows org is handled by dns.internic.net and its IP address
- dns.acme.com returns a reply referring to the address of dns.internic.net
Name to Address: Example
- coyote.acme.com sends a query to dns.internic.net for roadrunner.aspca.org
- dns.internic.net looks in its database and finds
  - it doesn’t know about roadrunner.aspca.org
  - but it does know that the name server for aspca.org is called dns.aspca.org at a.b.c.d
Name to Address: Example
- dns.internic.net replies with a referral to dns.aspca.org at a.b.c.d
- coyote.acme.com sends a query to dns.aspca.org for roadrunner.aspca.org
  - dns.aspca.org finds the entry and replies with the address
  - The server will also respond with any other information it has for that name
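The referral chain in this example can be mimicked with a toy in-memory model. This is a sketch only: no real DNS messages are exchanged, the `SERVERS` table and the functions `query` and `resolve` are assumptions made for illustration, and the address 192.0.2.10 is a made-up documentation address standing in for the real answer.

```python
# Toy databases: each server knows some records and some referrals.
SERVERS = {
    "dns.acme.com": {"records": {}, "referrals": {"org": "dns.internic.net"}},
    "dns.internic.net": {"records": {}, "referrals": {"aspca.org": "dns.aspca.org"}},
    "dns.aspca.org": {"records": {"roadrunner.aspca.org": "192.0.2.10"}, "referrals": {}},
}


def query(server: str, name: str):
    """Answer with an address if known, else refer to a server for the closest suffix."""
    db = SERVERS[server]
    if name in db["records"]:
        return ("answer", db["records"][name])
    labels = name.split(".")
    for i in range(1, len(labels)):      # try longer suffixes first
        suffix = ".".join(labels[i:])
        if suffix in db["referrals"]:
            return ("referral", db["referrals"][suffix])
    return ("unknown", None)


def resolve(name: str, first_server: str, max_referrals: int = 10):
    """Follow referrals from the configured server until an answer arrives."""
    server = first_server
    path = [server]
    for _ in range(max_referrals):
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        if kind != "referral":
            break
        server = value
        path.append(server)
    raise LookupError(f"could not resolve {name}")


address, path = resolve("roadrunner.aspca.org", "dns.acme.com")
print(address, path)
```

The `path` list records which servers were asked, in the same order as the slides: dns.acme.com, then dns.internic.net, then dns.aspca.org.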
Hierarchical Structure
of the DNS
root
top level
domains arpa
second level
domains
com
edu
gov
int
mil
net
org
us
att
att
va
www
worldnet
reston
cnri
uk
in
….
Administration of the Domain
Name System

Top Level Domains are assigned and a
set of top level servers are maintained
 Internet
Society is owner
(http://www.isoc.org)
 Internet Assigned Number Authority
within ISOC contracts actual running of
top-level servers (3 sites: US, Europe,
Asia/Pacific)
Administration of the Domain
Name System

Within a top level domain
 names
are created and assigned
 administration is delegated to that
subordinate name
 for each subordinate name, a minimum of
two servers must answer for that name: a
primary and at least one secondary
 the primary is the point of administration
 secondaries are updated automatically
using a domain/zone transfer protocol
Forwarding: Local or Remote?
- Once the DNS returns the destination IP address, the host must determine whether it is local or remote
  - local: the subnet the sender is connected to
    – there is a presumption that all local hosts are directly reachable
    – for example all hosts on the same Ethernet are directly reachable
  - remote: not local and therefore must be reached via a router
    – the router must be local
Forwarding: Local or Remote?
- The determination of local or remote is based on comparing the IP subnet of the source with that of the destination
  - If the local IP subnets match, the two hosts are local to each other
  - The assignment of IP addresses must maintain this rule!
- This is often called “mask and match”
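A minimal sketch of "mask and match" using Python's `ipaddress` module follows. The function name `first_hop` and the default router address 135.16.157.1 are assumptions for illustration; the other addresses are taken from examples elsewhere in the tutorial.

```python
import ipaddress


def first_hop(source_ip: str, prefix_len: int, destination_ip: str,
              default_router: str) -> str:
    """Return the IP address the host should hand the datagram to."""
    # Mask: derive the sender's own subnet from its address and prefix length.
    local_subnet = ipaddress.ip_network(f"{source_ip}/{prefix_len}", strict=False)
    # Match: is the destination inside that subnet?
    if ipaddress.ip_address(destination_ip) in local_subnet:
        return destination_ip      # local: send directly
    return default_router          # remote: send via the local router


print(first_hop("135.16.157.23", 24, "135.16.157.15", "135.16.157.1"))
print(first_hop("135.16.157.23", 24, "135.16.12.44", "135.16.157.1"))
```

In both cases the returned address is on the sender's own subnet, so the next step is address resolution for that address.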
Local: Send it Directly
- If the destination is local, then it can be sent directly
  - but you first need to know the destination host Ethernet address
  - (this generalizes for any layer 2 subnet)
Local: Send it Directly
- Given the IP address of a local destination, use the Address Resolution Protocol (ARP)
  - ARP is not based on IP, but rather supports IP
  - ARP relies on a broadcast request and a reply (normally unicast to the requester)
ARP Request:
My Ethernet address: ef655c
My IP address: 135.16.157.23
Your Ethernet address: ?
Your IP address: 135.16.157.15
ARP Reply:
Your Ethernet address: ef655c
Your IP address: 135.16.157.23
My Ethernet address: fc893e
My IP address: 135.16.157.15
ARP Cache
- ARP requests are broadcast, so they are seen by all local hosts
- Each host maintains an ARP cache
  - mapping between IP address and Ethernet (layer 2) address
  - each cache entry times out (approx. 10 minutes)
  - the cache is consulted for address resolution before an ARP request is sent
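The cache behaviour described here can be sketched as a small class. This is an illustrative model only: the class name `ArpCache`, its methods, and the injectable `clock` parameter are assumptions, and the 600-second default simply mirrors the "approx. 10 minutes" on the slide.

```python
import time


class ArpCache:
    """Map IP addresses to layer 2 addresses, with per-entry timeout."""

    def __init__(self, timeout_seconds: float = 600.0, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock            # injectable so the timeout can be tested
        self.entries = {}             # ip -> (mac, time the entry was stored)

    def learn(self, ip: str, mac: str) -> None:
        """Store or refresh a mapping, e.g. after seeing an ARP reply."""
        self.entries[ip] = (mac, self.clock())

    def lookup(self, ip: str):
        """Return the cached MAC, or None if absent or timed out.

        A None result means an ARP request has to be sent.
        """
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, stored_at = entry
        if self.clock() - stored_at > self.timeout:
            del self.entries[ip]      # entry aged out
            return None
        return mac


cache = ArpCache()
cache.learn("135.16.157.15", "fc893e")
print(cache.lookup("135.16.157.15"))   # fc893e
print(cache.lookup("135.16.157.99"))   # None, so an ARP request is needed
```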
Remote: Send it to the Router
- If the destination is remote (subnet match fails)
  - then send it to the local router
  - the router has a local IP address
  - use ARP or the ARP cache to translate to a layer 2 address
- Once the Router has the datagram
  - it uses its FIB to determine the next hop
  - the entire process repeats at this point
Sending Over Point-to-Point Links
- Previous discussions assumed a broadcast network for transmission
- IP treats a point-to-point link as a subnet with exactly two hosts
  - sending to the “other” end is both broadcast and unicast
  - point-to-point examples: private line, frame relay PVC, ATM PVC
Data Transfer
- Once the subnet and interface are selected, data transmission uses the underlying layer 2 medium
- IP is encapsulated in a multiprotocol sublayer (may be different by medium)
- The multiprotocol PDU is encapsulated using the appropriate layer 2 mechanism for that medium
- Transmission begins

Data Transfer Over Frame-based Networks
File
TCP
IP
Frame
(Ethernet,
FR, PPP)
Data Transfer Over Cell-based
Networks
File
TCP
IP
Adaptation
ATM Cells

Part 4: IP Routing
- Elements of IP Routing
- Internet Routing Architecture and Autonomous Systems
- Interior Routing Protocols (RIP, OSPF, IS-IS)
- Exterior Routing Protocols (BGP)

Elements of IP Routing

IP routing is done at each IP capable
node
 at
all routers
 at all hosts (even though it may be much
simplified)
IP Routing & Forwarding
Source
H
IP Subnet
R
IP Subnet
IP Subnet
R
R


R
IP Subnet
Destination
H
IP Routing is a dynamic, fully distributed process. Does
not rely on any centralized administration.
Packet Forwarding is a hop-by-hop process. Each entity
(host or router) only forwards the packet to another
entity (host or router) attached to its local IP subnet.
Internet Routing Architecture
Autonomous
System (AS)
Autonomous
System (AS)
Autonomous
System (AS)
Autonomous
System (AS)
Autonomous
System (AS)
Autonomous System: A collection of IP subnets and routers
under the same administrative authority.
Interior Routing Protocol
Exterior Routing Protocol
Internet Routing Hierarchy
The Internet is composed of
Autonomous Systems
 Each Autonomous System is an
administrative entity that

 Uses
Interior Gateway Protocols (IGPs) to
determine routing within the Autonomous
System
 Uses Exterior Gateway Protocols (EGPs) to
interact with other Autonomous Systems
ISPs and Autonomous
Systems

A Service Provider may have multiple
Autonomous Systems within its
operating network
 The
AT&T WorldNet dial platform and
Common Backbone were two separate ASs
that have merged
 There are two ASs within the WorldNet
Common Backbone: one for Internet
Gateway Routers (IGRs) and one for the
rest
Routing’s 3 Aspects
- Acquisition of information about the IP subnets that are reachable through an internet
  - static routing configuration information
  - dynamic routing information protocols (e.g., BGP4, OSPF, RIP, IS-IS)
  - each mechanism/protocol constructs a Routing Information Base (RIB)
Routing Aspect #2

Construction of a Forwarding Table
 synthesis
of a single table from all the
Routing Information Bases (RIBs)
 information about a destination subnet
may be acquired multiple ways
 a precedence is defined among the RIBs to
arbitrate conflicts on the same subnet
 Also called a Forwarding Information Base
(FIB)
Routing #3

Use of a Forwarding Table to forward
individual packets
 selection
of the next-hop router and
interface
 hop-by-hop, each router makes an
independent decision
RIB Construction

Multiple routing protocols may run on
the same router
 static
routing
 Interior Gateway Protocols, e.g., OSPF
 Exterior Gateway Protocols, e.g., BGP
RIB Construction
- Each routing protocol builds its own Routing Information Base (RIB)
- Each protocol has its own “view” of “costs”
  - e.g., OSPF uses administrative weights
  - e.g., BGP4 uses Autonomous System path length
FIB Construction

An algorithm is used to choose one
next-hop toward each IP destination
known by any routing protocol
 the
set of IP destinations present in any RIB
are collected
 if a particular IP destination is present in
only one RIB, that RIB determines the next
hop forwarding path for that destination
FIB Construction
- Choosing FIB entries, cont.
  - if a particular IP destination is present in multiple RIBs, then a precedence is defined to select which RIB entry determines the next hop forwarding path for that destination
  - This process normally chooses exactly one next-hop toward a given destination
- There are no standards for this; it is an implementation (vendor) decision
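One way to picture the synthesis of a single FIB from several RIBs is the sketch below. Since the precedence is a vendor decision, the order in `PRECEDENCE` is purely an assumption, as are the function name `build_fib` and the sample prefixes and next hops.

```python
# Assumed precedence, most preferred first; real routers define their own.
PRECEDENCE = ["static", "ospf", "bgp"]


def build_fib(ribs: dict) -> dict:
    """Merge per-protocol RIBs into one table: prefix -> (next hop, source RIB).

    ribs: protocol name -> {destination prefix: next-hop address}
    """
    fib = {}
    # Walk from least to most preferred so preferred entries overwrite others.
    for protocol in reversed(PRECEDENCE):
        for prefix, next_hop in ribs.get(protocol, {}).items():
            fib[prefix] = (next_hop, protocol)
    return fib


ribs = {
    "static": {"0.0.0.0/0": "135.17.21.1"},
    "ospf": {"135.16.157.0/24": "135.17.21.1", "135.16.12.0/24": "135.17.21.4"},
    "bgp": {"135.16.157.0/24": "135.17.21.4"},
}
fib = build_fib(ribs)
for prefix, entry in sorted(fib.items()):
    print(prefix, entry)
```

A destination present in only one RIB is taken from that RIB; 135.16.157.0/24 appears in two, and the assumed precedence picks the OSPF entry, leaving exactly one next hop per destination.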
FIB Contents
- IP subnet and mask (or length) of destinations
  - can be the “default” IP subnet
- IP address of the “next hop” toward that IP subnet
- Interface id of the subnet associated with the next hop
- Optional: cost metric associated with this entry in the forwarding table

Packet Forwarding

Forwarding is the process of determining
where a particular datagram should be
sent next
 involves
searching the FIB for the next hop IP
address and interface

Uses the “longest matching prefix”
 several
prefixes may have common upper
parts, the longest one matching is used
Longest Matching Prefix

Next hop for “101010111...” is
135.17.21.1
Prefix    Length  Next Hop
1010110   7       135.17.21.4
10101     5       135.17.21.1
101       3       135.17.21.4
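The lookup in this table can be reproduced with a short sketch that treats addresses and prefixes as bit strings. The linear scan and the name `longest_prefix_match` are assumptions chosen for clarity; production routers use faster data structures such as tries.

```python
# The forwarding table from the slide: (prefix bits, next hop).
FIB = [
    ("1010110", "135.17.21.4"),
    ("10101", "135.17.21.1"),
    ("101", "135.17.21.4"),
]


def longest_prefix_match(address_bits: str, fib):
    """Return the next hop of the longest prefix that matches the address."""
    best_prefix, best_hop = None, None
    for prefix, next_hop in fib:
        if address_bits.startswith(prefix):
            if best_prefix is None or len(prefix) > len(best_prefix):
                best_prefix, best_hop = prefix, next_hop
    return best_hop


print(longest_prefix_match("101010111", FIB))  # 135.17.21.1
```

For "101010111" both the 3-bit and 5-bit prefixes match, the 7-bit prefix does not (the address continues 1010101, not 1010110), so the 5-bit entry wins.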
Routing Information Base
Construction
A dynamic, fully distributed process
done for each routing protocol being
run
 Distance Vector and Link State routing
are the two basic techniques.

Distance Vector and Link State

Distance Vector
 Accumulates
a metric hop-by-hop as the
protocol messages traverse the subnets

Link State
 Builds
a network topology database
 Computes best path routes from current
node to all destinations based on the
topology
Distance Vector Protocols
Each router only advertises to its
neighbors, its “distance” to various IP
subnets
 Each router computes its next-hop
routing table based on least cost
determined from information received
from its neighbors and the cost to those
neighbors

Distance Vector
- Attempts to minimize messaging overhead and memory requirements at the expense of slower convergence
- Needs careful design to avoid problems
  - packet looping, or counting to infinity
  - split horizon with poisoned reverse
    – if A routes to X via B, then B should not try to route to X via A (loop formation)
    – A sends to B updates that list X with infinite (poisoned) cost
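Split horizon with poisoned reverse amounts to tailoring each advertisement to the neighbor that will receive it. In the sketch below, the RIB layout, the function name `advertisement_for`, and the value 16 for "infinity" (borrowed from RIP) are assumptions.

```python
INFINITY = 16  # assumed "unreachable" metric, as used by RIP


def advertisement_for(rib: dict, neighbor: str) -> dict:
    """Build the distance vector that a router sends to one neighbor.

    rib: subnet -> (cost, next hop). Routes that go via the receiving
    neighbor are advertised back to it with infinite (poisoned) cost.
    """
    return {
        subnet: (INFINITY if next_hop == neighbor else cost)
        for subnet, (cost, next_hop) in rib.items()
    }


# Router A reaches X via B and Y via C.
rib_of_a = {"X": (3, "B"), "Y": (1, "C")}
print(advertisement_for(rib_of_a, "B"))  # X is poisoned toward B
print(advertisement_for(rib_of_a, "C"))  # Y is poisoned toward C
```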
Distance Vector RIB Construction
Cost to D = 5
Next Hop = A.2
H
A.1
A.3
R
IP Subnet “A”
Cost = 2
Cost to D = 4
Next Hop = C.2
C.1
A.2
Cost to D = 3
Next Hop = B.2
B.1
IP Subnet “C”
Cost = 2
R
C.2
B.3
IP Subnet “B”
Cost = 1
R
Cost to D = 2
Next Hop = direct
D.2
Destination
B.2
Cost to D = 2
Next Hop = direct
R
D.3
IP Subnet “D”
Cost = 2
D.1
H
Packet Forwarding
Cost to D = 5
Next Hop = A.2
H
A.1
A.3
IP Subnet “A”
Cost = 2
R
Cost to D = 4
Next Hop = C.2
C.1
A.2
D.1
R
Cost to D = 3
Next Hop = B.2
IP Subnet “C”
Cost = 2
D.1
B.1
C.2
B.3
IP Subnet “B”
Cost = 1
R
Cost to D = 2
Next Hop = direct
D.2
B.2
D.1
R
Cost to D = 2
Next Hop = direct
D.3
D.1
IP Subnet “D”
Cost = 2
Destination
D.1
H
D.1
Distance Vector RIB Parameters

Accumulated cost
 cost
is a constant administrative
assignment for each subnet
 assignment is typically “1” for each subnet
(equivalent to hop-count)
 included in routing protocol exchange

Time the update was received (for
timeout)
Distance Vector RIB Parameters

The next-hop the entry was received
from
 sender’s
id is included in routing protocol
exchange

Accumulated Hop count and Maximum
Hop Count
 used
to detect cycles
 hop count included in routing protocol
exchange
Distance Vector: Additions
- When a router learns of new reachable subnets
  - at router startup
  - when an interface is enabled or restored to service
- A routing update is broadcast to all neighbors
Distance Vector: Additions
Any router receiving the packet
compares the cost it received in the new
packet with that in its RIB
 If the cost is smaller or the subnet is
new

 the
new entry is used in the RIB
 the new entry is broadcast to all its
neighbors (except the one from which it
was received)
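The comparison rule on this slide can be sketched as follows. The RIB layout, the function name `process_update`, and the sample costs are assumptions; the sketch covers only the "smaller cost or new subnet" rule and leaves out timeouts and removals.

```python
INFINITY = 16  # assumed cap on the metric


def process_update(rib: dict, neighbor: str, link_cost: int, advertised: dict) -> list:
    """Apply one neighbor's distance vector to the local RIB.

    rib: subnet -> (cost, next hop); it is updated in place.
    advertised: subnet -> cost as seen by the neighbor.
    Returns the subnets whose entries changed and should be re-advertised
    to all neighbors except the one the update came from.
    """
    changed = []
    for subnet, neighbor_cost in advertised.items():
        new_cost = min(neighbor_cost + link_cost, INFINITY)
        current = rib.get(subnet)
        # Use the new entry if the subnet is new or the cost is smaller.
        if current is None or new_cost < current[0]:
            rib[subnet] = (new_cost, neighbor)
            changed.append(subnet)
    return changed


rib = {"D": (5, "A.2")}
changed = process_update(rib, neighbor="C.2", link_cost=2, advertised={"D": 2, "B": 1})
print(rib, changed)
```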
Distance Vector: Removals

Each RIB entry is aged
a
timeout defines when an entry is
removed from the RIB

Periodically, each router re-advertises
all the routes it knows to its neighbors
 this
can be done in many ways: from
simple neighbor hellos to enumeration of
all routes
Distance Vector: Removals
If a neighbor does not respond within a
timeout, all routes learned from that
neighbor are removed
 Route removal may be advertised to
neighbors

Link State Protocols
Each router broadcasts to all the routers
in the network the state of its locally
attached links and IP subnets
 Each router constructs a complete
topology view of the entire network
based on these link state updates and
computes its next-hop routing table
based on this topology view

Link State Protocols
Attempts to minimize convergence
times and eliminate non-transient
packet looping at the expense of higher
messaging overhead, memory, and
processing requirements
 Allows multiple metrics/costs to be
used

Link State Protocols
- The “broadcast” of link state from one router to all others uses a variety of mechanisms
  - true broadcast when the layer 2 subnet interconnecting the routers supports broadcast
  - multicast among the routers when the layer 2 subnet supports that (e.g., Frame Relay, ATM)
  - hop-by-hop flooding as a last resort
Link State Protocols

Transmission of link state must be done
reliably
 the
protocol assumes that the topology
databases of all nodes are identical to
prevent routing-loops from forming
 acknowledgments from all neighbors are
needed
 routers must deal with out-of-order
delivery of updates, replicates, etc., all of
which requires processing time
Link State RIB Parameters
- Topology Database
  - Router IDs
  - Link IDs
    – From Router ID
    – To Router ID
  - Metric(s)
  - Sequence number
- List of Shortest Paths to Destinations
Link State Operation: Additions

Flooding Algorithm
 each
router announces itself and each link
it is attached to
 announcements by broadcast or multicast
or unicast to all neighbors
 Designated router used on broadcast nets
– to minimize number of adjacencies

Each router constructs its Topology DB
Link State Operation: Removals
Removals are announcements with the
metric set to “infinity”
 Adjacencies must be refreshed

 neighbors
use “hello” protocol
 if a router loses a neighbor, then routes via
that neighbor are recomputed
 send announcements with link metric to
lost neighbor set to infinity
Link State: Shortest Path

Dijkstra’s Shortest Path First graph
algorithm
 Use
yourself as starting point
 Search outward on the graph and add
router IDs as you expand the front

Addresses are associated with routers
 Hence
the SPF algorithm needs to deal
only in the number of routers, not the
number of routes
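A compact sketch of the Shortest Path First computation over a topology database follows. The four-router graph and its costs are hypothetical (not the topology from the slides), and the function name `shortest_paths` is an assumption.

```python
import heapq


def shortest_paths(graph: dict, source: str) -> dict:
    """Dijkstra's algorithm over routers.

    graph: router -> {neighbor router: link cost}
    Returns router -> (total cost, first-hop router from the source).
    """
    best = {source: (0, None)}
    frontier = [(0, source, None)]   # (cost so far, router, first hop)
    visited = set()
    while frontier:
        cost, router, first_hop = heapq.heappop(frontier)
        if router in visited:
            continue
        visited.add(router)          # expand the front by one router
        for neighbor, link_cost in graph[router].items():
            new_cost = cost + link_cost
            if neighbor not in best or new_cost < best[neighbor][0]:
                # Directly attached neighbors are their own first hop.
                hop = neighbor if router == source else first_hop
                best[neighbor] = (new_cost, hop)
                heapq.heappush(frontier, (new_cost, neighbor, hop))
    return best


# Hypothetical topology database.
topology = {
    "R1": {"R2": 3, "R3": 2},
    "R2": {"R1": 3, "R4": 3},
    "R3": {"R1": 2, "R4": 2},
    "R4": {"R2": 3, "R3": 2},
}
print(shortest_paths(topology, "R1"))
```

Each router runs this with itself as the starting point; the first-hop column is what ends up in its next-hop table.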
Link State: Shortest Path
From R1
A.3
Next
Router Hop Link
R2
IP Subnet “A”
Cost =3
C.1
A.2
IP Subnet “C”
Cost = 2
R1
B.1
C.2
B.3
IP Subnet “B”
Cost = 2
R3
D.2
B.2
R4
D.3
IP Subnet “D”
Cost = 3
R2
R3
R4
R1
R2
R3
A.3
A
B.3
B
B.2
B
From R4
B.1
B
B.3
B
B.3
B
IGP: Routing Information Protocol (RIP)
- The first interior routing protocol based on “distance vector” concepts (RFC 1058, 6/1/88, updated to RIP v2 in RFC 1723, 11/15/94)
- Limited scalability (max diameter 15 hops; a metric of 16 means “unreachable”)
- Suffers from problems such as
  - creation of routing loops
  - creation of “black holes”
IGP: Open Shortest Path First (OSPF)
- Current generation interior routing protocol based on “link state” concepts (RFC 1131, 10/1/89, superseded by OSPF v2, RFC 1583, 3/94)
- Supports hierarchies for scalability
- Fast convergence and loop avoidance
- Used within the WorldNet Common Backbone and Dial Platform

IGP: Intermediate System-to-Intermediate System (IS-IS)
- OSI routing protocol extended to allow IP (RFC 1142, 12/30/91)
- Very similar to OSPF
  - Differences are small and deal mostly with failure modes
- Used in many Internet Service Provider networks
  - Cisco’s implementation of IS-IS is believed to be better than Cisco’s OSPF
IGP: Interior Gateway Routing
Protocol (IGRP)
Cisco’s proprietary routing protocol
 Based on “distance vector” concepts,
but avoids RIP problems
 Dominant in enterprise networks
 Cisco’s EIGRP is a hybrid protocol
using both distance vector and link
state concepts

EGP: Exterior Gateway
Protocol (EGP)
The first exterior routing protocol based
on “distance vector” concepts (RFC
0904, 4/1/84)
 Designed for a simple tree-structured
topology with “regional” networks
with a single “backbone.”
 Topology restrictions quickly made this
protocol obsolete
 No longer used widely in the Internet

EGP: Border Gateway
Protocol version 4 (BGP4)
The current generation exterior routing
protocol based on “path vector”
concepts (RFC 1771, 3/21/95)
 Supports complex mesh topologies with
loop-avoidance
 Required protocol for use at Internet
exchange points

EGP: Border Gateway
Protocol version 4 (BGP4)

Supports policy-based routing by
keeping the path of ASs toward the
destination
 e.g.,
allows filtering out routes through
specified ASs

Part 5: IP QoS
- Philosophy
- How things work on the Internet
  - data
  - voice, video
- How IP QoS tries to make them work better
- The role of ATM

Internet QoS Philosophy
- Things should work with best-effort service
  - best-effort service supports no explicit bounds on delay, throughput, or packet loss
- Selectively do resource reservation if you need things to work better
- Maintain only soft state or no state

Protocol Architecture
Voice,
Video
Data
HTTP
FTP
RPC
TCP
•reliable transport
•resequencing
•flow control
RTP
UDP
IP
•timing recovery
•resequencing
•adaptive encoding
•delivery not reliable
- congestion may cause
packet loss
•sequence may not be preserved
- packets may follow different
paths
•delays variable
Competing
traffic
Router
Router
Voice, Video, Jitter, & Delay
to Codec
Playout
Point
Packets experience variable delay (jitter)
under best-effort service
 Receiver can accommodate jitter by
adapting the playout point


larger jitter implies larger end-to-end delay
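The trade-off between jitter and end-to-end delay can be illustrated with a toy calculation. Everything here is an assumption for illustration: the function names, the sample delays in milliseconds, and the simple policy of setting the playout point at the largest observed delay.

```python
def choose_playout_delay(delays_ms, margin_ms=0):
    """Pick a playout point that accommodates every observed network delay."""
    return max(delays_ms) + margin_ms


def late_packets(delays_ms, playout_delay_ms):
    """Packets arriving after the playout point miss their slot at the codec."""
    return [d for d in delays_ms if d > playout_delay_ms]


observed = [20, 35, 25, 60, 30]           # hypothetical per-packet delays
print(choose_playout_delay(observed))     # 60: no packet is late
print(late_packets(observed, 40))         # [60]: lower delay, one late packet
```

With a wider spread of delays, the playout point needed to avoid late packets moves out, which is the larger end-to-end delay noted on the slide.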
Sliding Windows
Packets 1-3: ACKed by receiver
Packets 4-5: sent, but not ACKed
Packets 6-8: can send now
Packets 9-10: can’t send yet
Receiver acknowledges successfully
received packets
 Sender limits number of packets that
have been sent but not acknowledged

 Limit

= Window
Window size limits transmission rate
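The window rule can be sketched as a small sender-side tracker. This is an illustrative sketch only: the class name is hypothetical, ACKs are assumed to be cumulative, and retransmission is left out.

```python
class SlidingWindowSender:
    """Tracks which packet numbers a sender may transmit (hypothetical sketch)."""

    def __init__(self, window):
        self.window = window   # limit on packets sent but not yet acknowledged
        self.base = 1          # oldest unacknowledged packet number
        self.next_seq = 1      # next packet number to send

    def can_send(self):
        # Sending is allowed while the outstanding count is below the window.
        return self.next_seq < self.base + self.window

    def send(self):
        if not self.can_send():
            raise RuntimeError("window full")
        seq = self.next_seq
        self.next_seq += 1
        return seq

    def ack(self, seq):
        # Cumulative ACK (an assumption): everything up to seq is acknowledged.
        if self.base <= seq < self.next_seq:
            self.base = seq + 1
```

With a window of 5, after sending packets 1-5 and receiving an ACK for packet 3, the sender may send 6, 7, and 8 but not yet 9.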
Data Transport & Packet Loss
[Figure: transmitter and receiver exchanging user data and acknowledgments as the window size grows: W=1, W=2, W=3, W=4.]
 TCP probes for bandwidth by increasing its window size until loss occurs, then backs off and tries again
    loss more critical than delay for data
Data Transport & Packet Loss
[Figure: transmitter and receiver with the window shrinking from W=4 to W=2 after a loss. Legend: user data, ack, duplicate ack, retransmission.]
 TCP decreases window size if hole detected in window or if time-out occurs
    loss of more than one packet per round-trip time typically results in an over-reaction to congestion
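A rough sketch of the probe-and-back-off behavior, assuming the window grows by one packet per round trip and is halved when a loss is detected. Real TCP also has slow start and time-out handling, which are omitted; the function names and `max_window` are illustrative assumptions.

```python
def next_window(window, loss_detected, max_window=64):
    """Return the window size for the next round trip."""
    if loss_detected:
        return max(1, window // 2)         # back off: halve the window
    return min(max_window, window + 1)     # probe: grow by one packet


def trace(loss_rounds, rounds, start=1):
    """Window size used in each round, given the set of rounds that see a loss."""
    window = start
    history = []
    for r in range(rounds):
        history.append(window)
        window = next_window(window, r in loss_rounds)
    return history
```

`trace({3}, 6)` gives `[1, 2, 3, 4, 2, 3]`: the window climbs to 4, a loss in that round cuts it to 2, and probing resumes.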
Internet Work on Resource Management
and QoS Support
[Figure: three areas of work (Signaling, QoS Routing, Scheduling), annotated with “Little Effort Here” and “Most Effort Here”.]
Routing: Best-Effort vs. QoS
Best-Effort Routing
 Routing based on
    hop counts
    facility speeds
 QoS requirements not met if resources are insufficient on best-effort path
QoS Routing
 Routing based on
    hop counts
    facility speeds
    bandwidth and delay requirements
    bandwidth availability
 QoS requirements supported if feasible path through network exists
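One simple way to picture the difference, as a sketch only: drop the links that cannot supply the requested bandwidth, then pick the fewest-hop path over what remains. The link table, units, and function name are assumptions for illustration; delay requirements are not modeled.

```python
from collections import deque


def fewest_hops(links, src, dst, min_bw=0):
    """Fewest-hop path using only links with available bandwidth >= min_bw.

    links: dict mapping (node_a, node_b) -> available bandwidth (hypothetical, Mbps).
    Links are treated as bidirectional. Returns a list of nodes, or None.
    """
    neighbors = {}
    for (a, b), bw in links.items():
        if bw >= min_bw:                   # prune links that cannot carry the flow
            neighbors.setdefault(a, []).append(b)
            neighbors.setdefault(b, []).append(a)
    parent = {src: None}
    queue = deque([src])
    while queue:                           # breadth-first search = fewest hops
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in neighbors.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


links = {("A", "B"): 2, ("B", "D"): 2, ("A", "C"): 10, ("C", "E"): 10, ("E", "D"): 10}
```

With `min_bw=0` the result is the best-effort path A-B-D. With `min_bw=5` the thin links are excluded and the longer path A-C-E-D is chosen; with `min_bw=20` no feasible path exists and the result is `None`.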
Flow
 Sequence of packets defined by common destination address or subnet and possibly also by one or more of the following attributes:
    Source IP Address/Subnet
    Protocol (TCP or UDP)
    Source TCP/UDP port number
    Destination TCP/UDP port number
    Type of Service (TOS) field
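A flow definition like this can be sketched as a key built from whichever attributes are selected, with the rest left as wildcards. The field names and the packet-as-dict representation are illustrative assumptions.

```python
from collections import namedtuple

# Hypothetical field names for the attributes listed above.
FlowKey = namedtuple("FlowKey", "dst src proto sport dport tos")


def flow_key(pkt, fields=("dst",)):
    """Build a flow key from a packet dict; unselected fields become None (wildcard)."""
    return FlowKey(*(pkt[f] if f in fields else None for f in FlowKey._fields))


p1 = {"dst": "10.1.1.1", "src": "10.2.0.5", "proto": "UDP", "sport": 5004, "dport": 5004, "tos": 0}
p2 = dict(p1, src="10.3.0.9")
```

Classified by destination only, `p1` and `p2` belong to the same flow; adding `"src"` to `fields` separates them.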
Integrated Services

 Flow-Based QoS
    signaled via the Resource ReSerVation Protocol (RSVP)
    per-flow reservations requested by receiver, propagated router-by-router
    difficult to implement; not widely deployed
 Class-Based QoS (Differential Services)
    flows mapped into small # of classes
    packets marked (via TOS field) at network edge and prioritized in network interior based on marking
Services

QoS Goal              RSVP               Differential Services
Reduce Delays         Guaranteed QoS     Priority
Improve Throughput    Controlled Load    Assured

 With exception of Guaranteed QoS service, QoS objectives are described qualitatively, not quantitatively
With Freedom Comes
Responsibility: Token Buckets
[Figure: an arriving packet is checked against the token bucket (“Token Available?”). If no: tag packet, drop packet, or treat as best effort.]
 Token bucket defines token rate & bucket depth
 Use of token buckets common to all Integrated Services
 Similar to ATM and Frame Relay networks
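A minimal token bucket sketch with the two parameters named above, token rate and bucket depth. The class and method names are hypothetical, time is passed in explicitly, and what happens to a non-conforming packet (tag, drop, or best effort) is left to the caller.

```python
class TokenBucket:
    def __init__(self, rate, depth):
        self.rate = rate       # tokens added per second
        self.depth = depth     # maximum number of tokens the bucket holds
        self.tokens = depth    # assume the bucket starts full
        self.last = 0.0        # time of the previous check, in seconds

    def conforms(self, now, size=1):
        """Return True and consume tokens if a packet needing `size` tokens conforms."""
        elapsed = now - self.last
        # Refill for the elapsed time, never beyond the bucket depth.
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= size:
            self.tokens -= size
            return True
        return False           # caller tags, drops, or treats as best effort
```

With `rate=1` and `depth=2`, two back-to-back packets at time 0 conform, a third does not, and one more conforms a second later.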
RSVP
[Figure: Sender and Receiver connected through routers (R), with steps 1-3 marked along the path.]
1. Forward data flow established
2. PATH message traces route from sender to receiver
3. RESV message backtracks route of PATH message and installs reservation
 Soft state periodically refreshed by new PATH and RESV messages
 Interior routers maintain per-flow state
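Soft state can be pictured as a per-flow table whose entries disappear unless they are refreshed. This sketch is not RSVP itself; the class name, the lifetime value, and the flow identifiers are assumptions for illustration.

```python
class SoftStateTable:
    """Per-flow reservations that lapse unless refreshed (hypothetical sketch)."""

    def __init__(self, lifetime):
        self.lifetime = lifetime   # seconds an entry survives without a refresh
        self.expires = {}          # flow id -> time at which the entry lapses

    def refresh(self, flow, now):
        # Called when a refresh message for this flow arrives.
        self.expires[flow] = now + self.lifetime

    def expire(self, now):
        """Remove and return the flows whose state was not refreshed in time."""
        dead = [flow for flow, t in self.expires.items() if t <= now]
        for flow in dead:
            del self.expires[flow]
        return dead
```

No explicit teardown is needed: if a sender or receiver goes away, its refreshes stop and the router's state for that flow times out on its own.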
Differential Services
Bandwidth Brokers
[Figure: User Net 1 asks for 10 Mbps to destination D in User Net 2. Bandwidth Brokers (BB) in the user networks and the ISP pass the request along and answer OK; the brokers are labeled with the values 20 and 50.]
 Signaling is between agents from adjacent Autonomous Systems
    Agents generically called “Bandwidth Brokers (BBs)”
 Interior routers not necessarily aware of individual bandwidth allocations
    pre-provisioned rates per class between administratively separate networks
Algorithms for Frame Scheduling and
Buffer Management

 Weighted Fair Queueing (WFQ)
    link bandwidth allocated per-flow or per-class in proportion to a configured weight
    supports minimum bandwidth guarantees and fair allocation of excess bandwidth
 Random Early Detection (RED)
    randomizes packet loss to optimize TCP performance
    drop probabilities depend on buffer occupancy and possibly on packet priority (Weighted RED)
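A sketch of how a RED-style drop probability might depend on buffer occupancy, assuming the commonly described linear ramp between a minimum and a maximum threshold. The parameter names are illustrative, and the averaging of the queue length is left out.

```python
def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Drop probability for a given average queue length (same units as thresholds)."""
    if avg_queue < min_th:
        return 0.0          # light load: never drop
    if avg_queue >= max_th:
        return 1.0          # buffer nearly full: always drop
    # In between, the probability rises linearly from 0 to max_p.
    return max_p * (avg_queue - min_th) / (max_th - min_th)
```

A weighted variant could call the same function with different thresholds for each packet priority, so that lower-priority packets start being dropped at a smaller queue length.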
Voice Delay w/ Two WFQ Implementations
(Bennett and Zhang)
[Figure: voice delay under two WFQ implementations; axis ticks from 5 ms to 30 ms.]
 Accounts for queueing delay at single DS3 link saturated by background traffic
 Assumes 9 Mbps of voice
 With First-In-First-Out queueing (rather than WFQ), voice delays in the hundreds of msec would result
Example: 150 msec budget for one-way
voice delay (gateway-gateway)





 Packetization + Look Ahead (G.729): 45 msec
    assumes 4 frames per packet
    10 msec per frame and 5 msec look ahead
 DSP Processing: 5 msec
 Propagation: 50 msec
 Queueing: 25 msec (gateway-to-gateway)
 Buildout: 25 msec
» To consistently live within budget, voice must be prioritized at links, or links must be dedicated to voice
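The budget arithmetic can be checked directly. The figures come from the list above; the variable names are only for illustration.

```python
FRAME_MS = 10            # G.729 frame duration
FRAMES_PER_PACKET = 4    # frames carried in each packet
LOOK_AHEAD_MS = 5        # coder look-ahead

budget_ms = {
    "packetization + look ahead": FRAMES_PER_PACKET * FRAME_MS + LOOK_AHEAD_MS,
    "dsp processing": 5,
    "propagation": 50,
    "queueing": 25,
    "buildout": 25,
}
total_ms = sum(budget_ms.values())
```

Packetization plus look ahead comes to 4 x 10 + 5 = 45 msec, and the five components add up to exactly the 150 msec budget, which leaves no slack for extra queueing.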
Link Sharing
[Figure: a 155 Mbps link (total share 1.0) divided among customers, each share subdivided by class]
   Customer 1: .14    Priority .05, Assured .03, Best-Effort .06
   ...
   Customer N: .21    Priority .01, Assured .12, Best-Effort .08
 Provides characteristics of a private network
 Implemented via WFQ or other service discipline that guarantees bandwidth shares
    experience with layer-2 services (frame relay and ATM) indicates that sub-classes must be queued separately to systematically divide bandwidth between them
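As a worked example, the shares in the figure can be turned into bandwidth on the 155 Mbps link. This assumes the three sub-shares under each customer are listed in the order Priority, Assured, Best-Effort, which the figure residue suggests but does not confirm.

```python
LINK_MBPS = 155

# Fractions of the link, as read from the figure (class order is an assumption).
shares = {
    "Customer 1": {"Priority": 0.05, "Assured": 0.03, "Best-Effort": 0.06},
    "Customer N": {"Priority": 0.01, "Assured": 0.12, "Best-Effort": 0.08},
}


def guaranteed_mbps(shares, link_mbps=LINK_MBPS):
    """Convert fractional shares into bandwidth per customer and class."""
    return {
        customer: {cls: round(frac * link_mbps, 2) for cls, frac in classes.items()}
        for customer, classes in shares.items()
    }


result = guaranteed_mbps(shares)
```

Customer 1's .14 share is about 21.7 Mbps in total, of which the Priority class gets 7.75 Mbps.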
Role of ATM
[Figure: routers R1-R4 and nodes S1, S2 connected by three virtual circuits: Priority VC, Assured VC, Best-Effort VC.]
 ATM can provide a “designer link layer” for routers
    Link sharing implemented through ATM Virtual Circuits (VCs)
    About 16K VCs supported per OC12 (today) with queueing and QoS differentiation on a per-VC basis
 QoS routing at ATM layer can compensate for lack thereof at IP layer
Part 6: Internet History, Governance, References
Internet Timeline: 1960s
 1965: ARPA sponsors a study on “cooperative network of time-sharing computers”
 1969
    ARPANET commissioned
    First Request for Comments (RFC) published: “Host Software”
Internet Timeline: 1970s

 Store-and-forward networks
    Email and conferencing technologies developed
 Telnet and FTP developed (1972/73)
 Metcalfe outlines ideas behind Ethernet
 BBN starts Telenet, first public packet data service (1974)
 UUCP developed at Bell Labs (1976)

Internet Timeline: 1980s

 TCP/IP suite of protocols (1982)
    Transmission Control Protocol (TCP)
    Internet Protocol (IP)
    Concatenates heterogeneous networks using IP
 Internet Activities Board created (1983)
 Domain Name System introduced (1984)

Internet Timeline: 1980s

 NSFNET created (1986)
    backbone 56 kbps links (1986), T1 (1988)
    regional networks also created
 UUNET founded for commercial netnews service (1987)
 First commercial email exchanges via Internet (1989)
    MCI Mail and CompuServe
Internet Timeline: 1990s
 ARPANET ceases to exist (1990)
 First commercial dial service: The World (1990)
 Commercial Internet eXchange (CIX) association (1991)
 NSFNET backbone to T3 (1991)
    1 terabyte/month
    10 giga-packets/month
 Multicast backbone established (1992)
Internet Timeline: 1990s

 World Wide Web (1993)
    Mosaic from NCSA leads to Netscape Navigator and MS Internet Explorer
    WWW growth is 341,634% per year
 NSFNET reverts to a research net (1995)
    very high-speed Backbone Network Service (vBNS) at OC-3, contract to MCI
    The Internet “completely” commercial
 AT&T WorldNet becomes the largest pure Internet Service Provider
Internet Governance
 Internet Society
 Internet Architecture Board (IAB)
 Internet Engineering Steering Group (IESG)
 Internet Engineering Task Force (IETF)
 Internet Research Task Force (IRTF)

IETF Areas
 Application Area
 Internet Area
 Operations & Management Area
 Routing Area
 Security Area
 Transport Area
 User Services Area

Request for Comments

 RFC process is based on rough consensus
    representation is individual, not based on company or other affiliation
 Internet Drafts are submitted to IETF working groups
 Internet Draft to Proposed Standard
    stable specification agreed to by IESG
    all design choices resolved
Request for Comments

 Proposed to Draft Standard
    Two independent and interoperable implementations including all options
    IESG approval
    Draft Standard is normally considered final
 Draft Standard to Internet Standard
    Exhibits a high degree of technical maturity
    Provides significant benefit to the community
References
 Comer, Internetworking with TCP/IP, Prentice-Hall, 1988.
 Huitema, Routing on the Internet, Prentice-Hall PTR, 1995.
 Perlman, Interconnections: Bridges and Routers, Addison-Wesley, 1992.
 Stevens, TCP/IP Illustrated, volumes 1-3, Addison-Wesley, 1995.
 Hobbes’ Internet Timeline, IETF RFC 2235, Nov. 1997.
References on the Web

 www.isoc.org
    The Internet Society
 www.iab.org
    Internet Architecture Board
 www.ietf.org
    RFCs and Internet drafts
    meeting schedules
 www.internic.net
    RFCs and Internet drafts
    IP address and DNS registration information
    Databases of various and sundry Internet-related “stuff”
Part 7: Miscellaneous
Load Balancing

 A particular routing protocol may determine there are multiple paths toward a destination with the same “cost”
    Typical when there are multiple parallel trunks between routers
 If a RIB has multiple entries for the same destination, then the FIB could include one, some, or all of them
Load Balancing

 If there is more than one entry in the FIB for a destination, load balancing is possible
    round-robin distribution of packets onto paths
    hashed distribution attempts to keep packets with the same source and destination addresses on the same trunk to minimize out-of-order delivery
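Hashed distribution can be sketched as follows. The choice of CRC32 as the hash and the trunk names are assumptions for illustration; routers use their own hash functions.

```python
import ipaddress
import zlib


def pick_trunk(src, dst, trunks):
    """Choose a trunk from a hash of the (source, destination) address pair."""
    key = ipaddress.ip_address(src).packed + ipaddress.ip_address(dst).packed
    # The same address pair always hashes to the same index, hence the same trunk.
    return trunks[zlib.crc32(key) % len(trunks)]


trunks = ["trunk-0", "trunk-1", "trunk-2"]   # hypothetical parallel trunks
```

Every packet between a given pair of hosts takes the same trunk, so packets within that conversation are not reordered by the split. The trade-off against round-robin is that the load is balanced only on average across many address pairs.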
IP Multicast
 Design and purpose
 Distributed communication model
 Class “D” addresses
 MBONE

IP Multicast
 Designed for efficient support of one-to-many and many-to-many communications, e.g., conferencing
 Sender sends one copy addressed to a “multicast group” and the network delivers one copy to each multicast group member

IP Multicast

 Based on a fully-distributed communication model that does not require a centralized “bridge”:
    Participants join/drop multicast sessions via the Internet Group Management Protocol (IGMP).
    Multicast routing protocols (DVMRP, MOSPF, PIM, etc.) are used for packet routing and delivery.
 The Internet Multicast Backbone (MBONE) was deployed between 1988 and 1992 for experimentation and development of multicast protocols
RIP Messages

 Request / Response

   Bytes   Field
   1       Command (Req/Resp)
   1       Version
   2       reserved
   2       Address Family (IP=2)
   2       reserved
   4       Address
   8       reserved
   4       metric

   (the Address Family through metric fields may be repeated)
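The layout can be sketched with Python's `struct` module. The function name and the example address are illustrative; the command codes (1 = request, 2 = response) and the field sizes follow the RIP version 1 format.

```python
import socket
import struct

RIP_REQUEST, RIP_RESPONSE = 1, 2
AF_IP = 2   # address family identifier for IP


def pack_rip_v1(command, entries):
    """Pack a RIP version 1 message; entries is a list of (dotted_address, metric)."""
    # Header: 1-byte command, 1-byte version, 2 reserved bytes.
    msg = struct.pack("!BBH", command, 1, 0)
    for address, metric in entries:
        # Entry: family (2), reserved (2), address (4), reserved (8), metric (4).
        msg += struct.pack("!HH4s8xI", AF_IP, 0, socket.inet_aton(address), metric)
    return msg


msg = pack_rip_v1(RIP_RESPONSE, [("10.0.0.0", 3)])
```

Each entry occupies 20 bytes after the 4-byte header, so a response carrying one route is 24 bytes long.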
RIP Protocol

 Updates are sent
    periodically
    upon request
    optional: upon change of metric on destination (e.g., due to link failure)
 RIB entries time out and must be refreshed
RIP Protocol

 Convergence times are long because
    The entire RIB is sent, not just entries that changed
    Convergence sometimes encounters loops
      – count-to-infinity in RIP means count-to-16
      – each hop may wait the full period to forward updates
 RIP v1 does not implement CIDR support (v2 does)
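A simplified sketch of how a distance-vector router might merge a neighbor's update, with metrics capped at 16 so that counting to infinity stops there. The table layout and function name are assumptions; timers, split horizon, and triggered updates are omitted.

```python
INFINITY = 16   # RIP treats a metric of 16 as unreachable


def apply_update(table, neighbor, advertised):
    """Merge a neighbor's {destination: metric} advertisement into the table.

    table maps destination -> (metric, next_hop). Returns True if anything changed.
    """
    changed = False
    for dest, metric in advertised.items():
        new_metric = min(metric + 1, INFINITY)   # one more hop, capped at 16
        current = table.get(dest)
        if current is None:
            if new_metric < INFINITY:
                table[dest] = (new_metric, neighbor)
                changed = True
        elif new_metric < current[0] or (
            current[1] == neighbor and new_metric != current[0]
        ):
            # Take a better route, or any news from the current next hop,
            # even when that news is worse.
            table[dest] = (new_metric, neighbor)
            changed = True
    return changed
```

Because bad news from the current next hop is accepted one increment at a time, two routers pointing at each other can keep raising a metric until it reaches 16, which is the slow convergence described above.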