IP Multicasting
By
Behzad Akbari
Fall 2013
These slides are based on the slides of J. Kurose (UMASS) and Shivkumar (RPI)
1
Broadcast Routing
deliver packets from source to all other nodes
source duplication is inefficient:
[Figure: source duplication (source creates and transmits a separate copy for each of R1–R4) vs. in-network duplication (routers duplicate copies along the way)]
source duplication: how does source determine
recipient addresses?
2
In-network duplication
flooding: when node receives broadcast packet,
sends copy to all neighbors
Problems: cycles & broadcast storm
controlled flooding: node only broadcasts packet if it
hasn't broadcast same packet before
Node keeps track of packet ids already broadcast
Or reverse path forwarding (RPF): only forward packet if it
arrived on shortest path between node and source
spanning tree
No redundant packets received by any node
3
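The controlled-flooding bookkeeping above can be sketched in a few lines of Python (the topology, node names, and copy counting are illustrative, not from the slides): a node refuses to re-flood a packet it has already broadcast, which is what breaks the cycle/broadcast-storm problem.

```python
# Controlled-flooding sketch: each node records packets it has already
# broadcast and never re-floods them, so broadcasts terminate even on cycles.

def controlled_flood(adjacency, source, packet_id):
    """Deliver one broadcast packet; return nodes reached and copies sent."""
    seen = {source}          # per-packet "already broadcast" state, per node
    queue = [source]
    copies = 0
    while queue:
        node = queue.pop(0)
        for neighbor in adjacency[node]:
            copies += 1      # one copy crosses every link out of a flooding node
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen, copies

# A 4-node cycle: naive flooding would circulate forever; this terminates.
topology = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["A", "C"]}
reached, copies = controlled_flood(topology, "A", packet_id=1)
```

On the cycle, every node still receives a redundant copy across each of its links (8 copies for 4 nodes), which is exactly the inefficiency the spanning-tree approach removes.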
Spanning Tree
First construct a spanning tree
Nodes forward copies only along spanning
tree
[Figure: spanning tree over nodes A–G; (a) broadcast initiated at A, (b) broadcast initiated at D]
4
Spanning Tree: Creation
Center node
Each node sends unicast join message to center
node
Message forwarded until it arrives at a node already
belonging to spanning tree
[Figure: nodes A–G send joins toward center E in order 1–5; (a) stepwise construction of spanning tree, (b) constructed spanning tree]
5
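The stepwise construction above can be sketched as follows (a minimal sketch: the next-hop table and node names are assumptions for illustration): each joining node walks its unicast path toward the center, and the walk stops at the first node already on the tree.

```python
# Center-based spanning-tree construction sketch: joins are forwarded toward
# the center; the path a join walks becomes a new branch of the tree.

def build_center_tree(next_hop_to_center, center, nodes):
    """next_hop_to_center[n] is n's unicast next hop toward the center."""
    on_tree = {center}
    tree_edges = set()
    for node in nodes:
        hop = node
        while hop not in on_tree:          # forward join until it hits the tree
            parent = next_hop_to_center[hop]
            tree_edges.add((hop, parent))  # this link becomes a tree branch
            on_tree.add(hop)
            hop = parent
    return tree_edges

# Tiny example with E as center; unicast next hops all point toward E.
hops = {"A": "C", "B": "E", "C": "E", "D": "E", "F": "C", "G": "D"}
edges = build_center_tree(hops, "E", ["A", "B", "F", "G"])
```

Note that when G joins, its join is forwarded through D, pulling D onto the tree too, even though D never joined explicitly.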
Multicast Routing: Problem Statement
Goal: find a tree (or trees) connecting routers
having local mcast group members
tree: not all paths between routers used
source-based: different tree from each sender to rcvrs
shared-tree: same tree used by all group members
[Figure: a shared tree vs. source-based trees]
6
Approaches for building mcast trees
Approaches:
source-based tree: one tree per source
shortest path trees
reverse path forwarding
group-shared tree: group uses one tree
minimal spanning (Steiner)
center-based trees
…we first look at basic approaches, then specific
protocols adopting these approaches
7
Shortest Path Tree
mcast forwarding tree: tree of shortest path
routes from source to all receivers
Dijkstra’s algorithm
[Figure: shortest-path tree from source S to R1–R7; legend: router with attached group member, router with no attached group member, link used for forwarding with i indicating the order the link is added by the algorithm]
8
Reverse Path Forwarding
rely on router’s knowledge of unicast
shortest path from it to sender
each router has simple forwarding behavior:
if (mcast datagram received on incoming link
on shortest path back to source)
then flood datagram onto all outgoing links
else ignore datagram
9
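The forwarding rule above can be sketched directly (the next-link table and interface names are hypothetical):

```python
# RPF check sketch: forward a multicast datagram only if it arrived on the
# link this router would itself use to reach the source; otherwise drop it.

def rpf_forward(router, in_link, source, unicast_next_link, out_links):
    """Return the links to flood onto, or [] if the datagram fails RPF."""
    if in_link != unicast_next_link[router][source]:
        return []                                   # not on reverse shortest path
    return [l for l in out_links[router] if l != in_link]

next_link = {"R2": {"S": "to_R1"}}                  # R2 reaches S via R1
links = {"R2": ["to_R1", "to_R3", "to_R5"]}
forwarded = rpf_forward("R2", "to_R1", "S", next_link, links)
dropped = rpf_forward("R2", "to_R5", "S", next_link, links)
```

The check needs only the router's own unicast table, which is why RPF avoids the per-packet duplicate tracking that controlled flooding requires.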
Reverse Path Forwarding: example
[Figure: RPF example from source S across R1–R7; legend distinguishes routers with and without attached group members and marks whether each datagram will or will not be forwarded]
result is a source-specific reverse SPT
may be a bad choice with asymmetric links
10
Reverse Path Forwarding: pruning
forwarding tree contains subtrees with no mcast
group members
no need to forward datagrams down subtree
“prune” msgs sent upstream by router with no
downstream group members
[Figure: pruning example from source S; routers R6 and R7 with no downstream members send prune messages upstream; legend: router with attached group member, router with no attached group member, prune message, links with multicast forwarding]
11
Shared-Tree: Steiner Tree
Steiner Tree: minimum cost tree connecting
all routers with attached group members
problem is NP-complete
excellent heuristics exist
not used in practice:
computational complexity
information about entire network needed
monolithic: rerun whenever a router needs to
join/leave
12
Center-based trees
single delivery tree shared by all
one router identified as “center” of tree
to join:
edge router sends unicast join-msg addressed to center
router
join-msg “processed” by intermediate routers and
forwarded towards center
join-msg either hits existing tree branch for this center, or
arrives at center
path taken by join-msg becomes new branch of tree for this
router
13
Center-based trees: an example
Suppose R6 chosen as center:
[Figure: edge routers send join messages toward center R6; numbers 1–3 show the order in which join messages are generated; legend: router with attached group member, router with no attached group member]
14
IP Multicast Architecture
[Figure: IP multicast architecture — service model at hosts; host-to-router protocol (IGMP) between hosts and routers; multicast routing protocols (various) among routers]
15
Internet Group Management Protocol
IGMP: “signaling” protocol to establish,
maintain, remove groups on a subnet.
Objective: keep router up-to-date with group
membership of entire LAN
Routers need not know who all the members are,
only that members exist
Each host keeps track of which mcast groups
are subscribed to
Socket API informs IGMP process of all joins
16
How IGMP Works
[Figure: a LAN with several routers, one elected querier Q, and attached hosts]
On each link, one router is elected the “querier”
Querier periodically sends a Membership Query message
to the all-systems group (224.0.0.1), with TTL = 1
On receipt, hosts start random timers (between 0 and 10
seconds) for each multicast group to which they belong
17
How IGMP Works (cont.)
[Figure: querier Q and hosts on the LAN; several hosts are members of group G]
When a host’s timer for group G expires, it sends a
Membership Report to group G, with TTL = 1
Other members of G hear the report and stop (suppress)
their timers
Routers hear all reports, and time out non-responding
groups
18
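The timer-suppression behavior can be sketched as follows (host names and the single-group simplification are assumptions; real IGMP runs one timer per joined group): only the member whose random timer fires first reports, because its on-link report (TTL = 1) suppresses everyone else.

```python
# IGMP report-suppression sketch: each member of a group picks a random delay
# in [0, 10) s after a query; the first timer to expire sends the Membership
# Report, and the other members hear it and cancel their own timers.
import random

def respond_to_query(member_delays):
    """Return the hosts that actually send a Membership Report."""
    # the host whose timer expires first reports; all others hear the
    # on-link report and suppress themselves
    first = min(member_delays, key=member_delays.get)
    return [first]

random.seed(7)  # fixed seed so the sketch is repeatable
delays = {h: random.uniform(0, 10) for h in ["h1", "h2", "h3", "h4"]}
reporters = respond_to_query(delays)
```

This is why the normal case on the next slide is a single report per group per query, regardless of how many members the group has.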
How IGMP Works (cont.)
Normal case: only one report message per group
present is sent in response to a query
Query interval is typically 60-90 seconds
When a host first joins a group, it sends immediate
reports, instead of waiting for a query
IGMPv2: Hosts may send a “Leave group” message
to “all routers” (224.0.0.2) address
Querier responds with a Group-specific Query message
to see if any group members are still present
Lower leave latency
19
IP Multicast Architecture
[Figure: IP multicast architecture — service model at hosts; host-to-router protocol (IGMP) between hosts and routers; multicast routing protocols among routers]
20
Multicast Routing
Basic objective – build distribution tree for
multicast packets
The “leaves” of the distribution tree are the
subnets containing at least one group member
(detected by IGMP)
Multicast service model makes it hard
Anonymity
Dynamic join/leave
21
Routing Techniques
Flood and prune
Begin by flooding traffic to entire network
Prune branches with no receivers
Examples: DVMRP, PIM-DM
Link-state multicast protocols
Routers advertise groups for which they have
receivers to entire network
Compute trees on demand
Example: MOSPF
22
Routing Techniques (…)
Core-based protocols
Specify “meeting place” aka “core” or “rendezvous
point (RP)”
Sources send initial packets to core
Receivers join group at core
Requires mapping between multicast group
address and “meeting place”
Examples: CBT, PIM-SM
23
Routing Techniques (…)
Tree building methods:
Data-driven: calculate the tree only when the first
packet is seen. Eg: DVMRP, MOSPF
Control-driven: Build tree in background before any
data is transmitted. Eg: CBT
Join-styles:
Explicit-join: The leaves explicitly join the tree. Eg:
CBT, PIM-SM
Implicit-join: All subnets are assumed to be receivers
unless they say otherwise (eg via tree pruning). Eg:
DVMRP, MOSPF
24
Shared vs. Source-based Trees
Source-based trees
Separate shortest path tree for each sender
(S,G) state at intermediate routers
Eg: DVMRP, MOSPF, PIM-DM, PIM-SM
Shared trees
Single tree shared by all members
Data flows on same tree regardless of sender
(*,G) state at intermediate routers
Eg: CBT, PIM-SM
25
Source-based Trees
[Figure: source-based trees — each source S roots its own distribution tree to the receivers R]
26
A Shared Tree
[Figure: a shared tree — sources S send toward the rendezvous point RP, which forwards down a single tree to the receivers R]
27
Shared vs. Source-Based Trees
Source-based trees
Shortest path trees – low delay, better load distribution
More state at routers (per-source state)
Efficient in dense-area multicast
Shared trees
Higher delay (bounded by factor of 2), traffic
concentration
Choice of core affects efficiency
Per-group state at routers
Efficient for sparse-area multicast
28
Distance-Vector Multicast Routing
DVMRP consists of two major components:
A conventional distance-vector routing protocol (like
RIP)
A protocol for determining how to forward multicast
packets, based on the unicast routing table
DVMRP router forwards a packet if
The packet arrived from the link used to reach the
source of the packet
Reverse path forwarding check – RPF
If downstream links have not pruned the tree
29
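The DVMRP forwarding decision above can be sketched by combining the RPF check with per-(S,G) prune state (the tables and interface names are illustrative):

```python
# DVMRP forwarding sketch: accept the packet only if it passes the RPF check
# against the unicast routing table, then forward on every downstream link
# that has not sent a Prune for this (source, group).

def dvmrp_forward(in_link, src, group, unicast_link_to, downstream, pruned):
    """Return downstream links to forward on, or [] if the packet is dropped."""
    if in_link != unicast_link_to[src]:           # RPF check failed
        return []
    return [l for l in downstream if (src, group, l) not in pruned]

unicast = {"S": "if0"}                            # link used to reach source S
pruned = {("S", "G1", "if2")}                     # if2 sent Prune(S, G1)
out = dvmrp_forward("if0", "S", "G1", unicast, ["if1", "if2"], pruned)
rpf_fail = dvmrp_forward("if2", "S", "G1", unicast, ["if1", "if2"], pruned)
```

The pruned-set entry is exactly the (S,G) state the later slide criticizes: it must be kept even in parts of the network that carry no traffic for the group.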
Example Topology
[Figure: example topology with source S and three group members G]
30
Flood with Truncated Broadcast
[Figure: source S floods via truncated broadcast toward the group members G]
31
Prune
[Figure: routers with no downstream members send Prune (S,G) messages back toward S]
32
Graft
[Figure: a new member announces itself with an IGMP Report (G); routers send Graft (S,G) messages upstream toward S to restore the pruned branch]
33
Steady State
[Figure: steady-state distribution tree from S to all group members G]
34
DVMRP limitations
Like distance-vector protocols, affected by
count-to-infinity and transient looping
Shares the scaling limitations of RIP. New
scaling limitations:
(S,G) state in routers: even in pruned parts!
Broadcast-and-prune has an initial broadcast.
No hierarchy: flat routing domain
35
Multicast Backbone (MBone)
An overlay network of IP multicast-capable
routers using DVMRP
Tools: sdr (session directory), vic, vat, wb
[Figure: MBone — multicast-capable routers and hosts connected by tunnels across unicast-only paths; legend: host/router, MBone router, physical link, tunnel, part of MBone]
36
MBone Tunnels
A method for sending multicast packets through multicast-ignorant routers
IP multicast packet is encapsulated in a unicast IP packet (IP-in-IP) addressed to far end of tunnel:
[ IP header, dest = unicast | IP header, dest = multicast | transport header and data… ]
Tunnel acts like a virtual point-to-point link
Intermediate routers see only outer header
Tunnel endpoint recognizes IP-in-IP (protocol type = 4)
and de-capsulates datagram for processing
Each end of tunnel is manually configured with unicast address
of the other end
37
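The encapsulation can be sketched by building the outer IPv4 header by hand (addresses are illustrative, and the checksum is left zero for brevity, so this is a format sketch rather than a transmit-ready packet):

```python
# IP-in-IP sketch: prepend a unicast IPv4 header with protocol number 4 to a
# multicast datagram, as an MBone tunnel endpoint would before forwarding it
# across multicast-ignorant routers.
import socket
import struct

def encapsulate(inner_packet, tunnel_src, tunnel_dst):
    """Prepend a minimal 20-byte outer IPv4 header (no options, checksum 0)."""
    total_len = 20 + len(inner_packet)
    outer = struct.pack("!BBHHHBBH4s4s",
                        (4 << 4) | 5,             # version 4, IHL 5 (20 bytes)
                        0, total_len,             # TOS, total length
                        0, 0,                     # identification, flags/fragment
                        64, 4,                    # TTL, protocol 4 = IP-in-IP
                        0,                        # header checksum (omitted here)
                        socket.inet_aton(tunnel_src),
                        socket.inet_aton(tunnel_dst))
    return outer + inner_packet

inner = b"stand-in multicast datagram"            # placeholder for the real packet
packet = encapsulate(inner, "10.0.0.1", "10.0.0.2")
```

The far tunnel endpoint matches on protocol type 4, strips the outer header, and hands the inner multicast datagram to its own forwarding logic.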
Protocol Independent Multicast (PIM)
Support for both shared and per-source trees
Dense mode (per-source tree): similar to DVMRP
Sparse mode (shared tree): core = rendezvous point (RP)
Independent of unicast routing protocol
Just uses unicast forwarding table
38
PIM Protocol Overview
Basic protocol steps
Routers with local members send Join
toward Rendezvous Point (RP) to join shared tree
Routers with local sources encapsulate data
in Register messages to RP
Routers with local members may initiate data-driven
switch to source-specific shortest path trees
PIM v.2 Specification (RFC2362)
39
PIM Example: Build Shared Tree
Shared tree after
R1,R2 join
[Figure: Join messages from the receivers' routers travel toward the RP, installing (*,G) state at each router along the shared tree; Source 1 and Receivers 1–3 shown]
40
Data Encapsulated in Register
Unicast encapsulated data packet to RP in Register
[Figure: Source 1's router unicasts the data packet to the RP inside a Register message; routers on the shared tree hold (*,G) state]
RP de-capsulates, forwards down shared tree
41
RP Sends Join to High-Rate Source
[Figure: the RP sends an (S1,G) Join message toward Source 1, installing (S1,G) state on the path; shared tree to Receivers 1–3 unchanged]
42
Build Source-Specific Distribution Tree
[Figure: (S1,G) Join messages travel toward Source 1; routers on the path now hold (S1,G) alongside (*,G) state]
Build source-specific tree for high data rate source
43
Forward On “Longest-match” Entry
[Figure: the shared tree rooted at the RP and Source 1's distribution tree; routers hold both (S1,G) and (*,G) entries]
Source-specific (S1,G) entry is a "longer match" for source S1 than the shared-tree (*,G) entry, which can be used by any source
44
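The longest-match rule can be sketched as a two-step table lookup (the mrib structure and interface names are assumptions for illustration):

```python
# Longest-match forwarding sketch: a PIM router prefers a source-specific
# (S,G) entry over the shared-tree (*,G) entry for the same group.

def lookup(mrib, src, group):
    """Return the forwarding entry: (S,G) if present, else (*,G), else None."""
    return mrib.get((src, group)) or mrib.get(("*", group))

mrib = {("*", "G"): {"iif": "toward_RP", "oifs": ["r1", "r2"]},
        ("S1", "G"): {"iif": "toward_S1", "oifs": ["r1", "r3"]}}

specific = lookup(mrib, "S1", "G")   # source-specific entry wins for S1
shared = lookup(mrib, "S2", "G")     # any other source falls back to (*,G)
```

Once traffic from S1 matches the (S1,G) entry, its packets arrive on a different incoming interface than the shared tree expects, which is what triggers the prune on the next slide.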
Prune S1 off Shared Tree
[Figure: Prune(S1) messages sent up the shared tree toward the RP; Source 1's traffic now reaches Receivers 1–3 over its distribution tree]
Prune S1 off shared tree where incoming interfaces of the (S1,G) and (*,G) entries differ
45
Reliable Multicast Transport
Problems:
Retransmission can make reliable multicast as
inefficient as replicated unicast
Ack-implosion if all destinations ack at once
Source does not know # of destinations
“Crying baby”: a bad link affects entire group
Heterogeneity: receivers, links, group sizes
Not all multicast applications need strong reliability
of the type provided by TCP.
Some can tolerate reordering, delay, etc
46
Reliability Models
Reliability => requires redundancy to recover
from uncertain loss or other failure modes.
Two types of redundancy:
Spatial redundancy: independent backup copies
Forward error correction (FEC) codes
Problem: requires large overhead; since the FEC is itself
carried in packets, it cannot recover from erasure of all
packets
Temporal redundancy: retransmit if packets lost/error
Lazy: trades off response time for reliability
Design of status reports and retransmission optimization
47
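The spatial-redundancy idea can be illustrated with the simplest FEC, a single XOR parity packet over equal-length data packets (packet contents are illustrative): any one lost packet is recoverable from the survivors, but losing more than one is not, echoing the point above that FEC cannot recover from erasure of all packets.

```python
# XOR parity FEC sketch: the parity packet is the byte-wise XOR of k data
# packets; XORing the parity with any k-1 survivors rebuilds the missing one.
from functools import reduce

def xor_parity(packets):
    """Byte-wise XOR of equal-length packets."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), packets)

data = [b"pkt1", b"pkt2", b"pkt3"]
parity = xor_parity(data)

# Suppose pkt2 is lost in transit: the receiver recovers it locally,
# with no retransmission and hence no ack-implosion at the source.
recovered = xor_parity([data[0], data[2], parity])
```

This is the "lazy-free" end of the trade-off: one extra packet in four of overhead buys recovery from any single loss without any status reports.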