Reliable multicast
from end-to-end solutions
to active solutions
C. Pham
RESO/LIP-Univ. Lyon 1, France
DEA DIF Nov. 13th, 2002
ENS-Lyon, France
Q&A




Q1: How many people in the audience have heard about multicast?
Q2: How many people in the audience know basically what multicast is?
Q3: How many people in the audience have ever tried multicast technologies?
Q4: How many people think they need multicast?
2
My guess on the answers

Q1: How many people in the audience have heard about multicast?
 100%
Q2: How many people in the audience know basically what multicast is?
 about 40%
Q3: How many people in the audience have ever tried multicast technologies?
 0%
Q4: How many people think they need multicast?
 0%
3
Purpose of this tutorial

 Provide a comprehensive overview of current multicast technologies
 Show the evolution of multicast technologies
 Achieve 100%, 100%, 30% and 70% on the previous answers next time!
4
How can multicast change the way people use the Internet?

multicast! multicast! multicast! multicast! multicast! multicast! multicast! multicast!

"Everybody's talking about multicast! Really annoying! Why would I need multicast anyway?"
From unicast…

[Figure: a Sender pushes a separate copy of the same data to each Receiver.]
 Problem: sending the same data to many receivers via unicast is inefficient
 Example: popular WWW sites become serious bottlenecks
6
…to multicast on the Internet.

[Figure: the Sender emits one copy of the data; IP multicast replicates it toward the Receivers.]
 Not n unicasts from the sender's perspective
 Efficient one-to-many data distribution
 Towards low latency, high bandwidth
7
New applications for the Internet

Think about…
 high-speed www
 video-conferencing
 video-on-demand
 interactive TV programs
 remote archival systems
 tele-medicine, white board
 high-performance computing, grids
 virtual reality, immersion systems
 distributed interactive simulations/gaming…
8
A whole new world for multicast…
9
A very simple example

 File replication
   10 MBytes file
   1 source, n receivers (replication sites)
   512 Kbits/s upstream access
 n = 100
   Tx = 4.55 hours
 n = 1000
   Tx = 1 day 21 hours 30 mins!
10
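The slide's figures are easy to reproduce. A quick sketch in Python (taking MBytes as 2^20 bytes and Kbits/s as 1000 bits/s, which is what the slide's numbers imply):

```python
# Back-of-the-envelope check of the sequential-unicast cost above.
FILE_BITS = 10 * 2**20 * 8     # 10 MBytes file
UPSTREAM_BPS = 512 * 1000      # 512 Kbits/s upstream access

def unicast_time(n):
    """Seconds to push the file to n receivers, one unicast copy each."""
    return n * FILE_BITS / UPSTREAM_BPS

for n in (100, 1000):
    print(f"n={n}: {unicast_time(n) / 3600:.2f} hours")
# n=100  -> 4.55 hours
# n=1000 -> 45.51 hours, i.e. 1 day 21 hours 30 mins
```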
A real example: LHC (DataGrid)

[Figure, source DataGrid: the LHC tiered computing model. 1 TIPS = 25,000 SpecInt95; a PC (1999) = ~15 SpecInt95.]
 Online System: one bunch crossing per 25 nsecs, 100 triggers per second, each event ~1 MByte in size; ~PBytes/sec off the detector, ~100 MBytes/sec toward the offline farm
 Tier 0: CERN Computer Center, > ~20 TIPS (Offline Farm ~20 TIPS, fed at ~100 MBytes/sec)
 Tier 1 (~4 TIPS each): France, UK and Italy Regional Centers and Fermilab, reached at ~622 Mbits/sec or by air freight, interconnected at ~2.4 Gbits/sec
 Tier 2: Tier2 Centers, at ~622 Mbits/sec
 Tier 3: Institutes (~0.25 TIPS) with a physics data cache, then workstations at 100 to 1000 Mbits/sec
Physicists work on analysis "channels". Each institute has ~10 physicists working on one or more channels. Data for these channels should be cached by the institute server.
11
Multicast for computational grids

[Figure: an application user in a computational grid, from Dorian Arnold: Netsolve Happenings.]
12
Some grid applications

 Astrophysics: black holes, neutron stars, supernovae
 Mechanics: fluid dynamics, CAD, simulation
 Distributed & interactive simulations: DIS, HLA, training
 Chemistry & biology: molecular simulations, genomic simulations
13
Reliable multicast: a big win for grids

 Data replication
 Code & data transfers, interactive job submissions
 Data communications for distributed applications (collective & gather operations, sync. barriers)
 Databases, directory services
[Figure: the SDSC IBM SP (1024 procs, 5x12x17 = 1020), the NCSA Origin Array (256+128+128, 5x12x(4+2+2) = 480) and the CPlant cluster (256 nodes), all joined to multicast address group 224.2.0.1.]
14
From reliable multicast to Nobel prize!

[Figure: a storyboard told over several slides, rendered in the original as a dot-field map of grid sites; the recoverable dialogue, in story order:]
 "We see something, but too weak. Please simulate to enhance signal!"
 Resource Estimator: "OK! Says need 5TB, 2TF. Where can I do this?"
 Resource Broker: "LANL is best match… but down for the moment."
 Resource Broker: "7 sites OK, but need to send data fast…"
 "The phenomenon was short but we managed to react quickly. This would not have been possible without efficient multicast facilities to enable quick reaction and fast distribution of data."
 "Congratulations, you have done a great job, it's the discovery of the century!!"
 "Nobel Prize is on the way :-)"
From [email protected]
15
Wide-area interactive simulations

[Figure: a computer-based sub-marine simulator, a battle field simulation display and a human-in-the-loop flight simulator interconnected across the INTERNET.]
16
The challenges of multicast

SCALABILITY
SCALABILITY
SCALABILITY
17
Part I
The IP multicast model
A look back in history of multicast
History

 Long history of usage on shared-medium networks
   Data distribution
   Resource discovery: ARP, Bootp, DHCP
 Ethernet
   Broadcast (software filtered)
   Multicast (hardware filtered)
 Multiple LAN multicast protocols
   DECnet, AppleTalk, IP
19
IP Multicast - Introduction

 Efficient one-to-many data distribution
   Tree-style data distribution
   Packets traverse network links only once
   Replication/multicast engine at the network layer
 Location-independent addressing
   One IP address per multicast group
 Receiver-oriented service model
   Receivers subscribe to any group
   Senders do not know who is listening
   Routers find the receivers
   Similar to the television model
   Contrasts with the telephone network and ATM
20
The Internet group model

multicast/group communications means...
 1-to-n as well as n-to-m
 a group is identified by a class D IP address (224.0.0.0 to 239.255.255.255)
   an abstract notion that does not identify any host!
[Figure, from V. Roca: the logical view, a source and receivers around multicast group 225.1.2.3, versus the physical view, hosts on the site 1 and site 2 Ethernets (194.199.25.100, 194.199.25.101, 133.121.11.22) joined by multicast routers and the multicast distribution tree across the Internet.]
21
Example: video-conferencing

[Figure: hosts joined to multicast address group 224.2.0.1, from UREC, http://www.urec.fr]
22
The Internet group model... (cont’)

 local-area multicast
   uses the diffusion capabilities of the physical layer (e.g. Ethernet)
   efficient and straightforward
 wide-area multicast
   requires going through multicast routers, using IGMP/multicast routing/... (e.g. DVMRP, PIM-DM, PIM-SM, PIM-SSM, MSDP, MBGP, BGMP, etc.)
   routing within the same administrative domain is simple and efficient
   inter-domain routing is complex, not fully operational
from V. Roca
23
IP Multicast Architecture

 Service model
 Hosts to routers: host-to-router protocol (IGMP)
 Routers among themselves: multicast routing protocols (various)
24
Multicast and the TCP/IP layered model

[Figure, from V. Roca: the protocol stack. In user space, higher-level services are built from building blocks (security, reliability, management, congestion control and others) above the socket layer; in kernel space, TCP and UDP sit over IP / IP multicast, with ICMP, IGMP and multicast routing alongside, down to the device drivers.]
25
Internet Group Management Protocol

 IGMP: the "signaling" protocol to establish, maintain and remove groups on a subnet.
 Objective: keep the router up-to-date with the group membership of the entire LAN
   Routers need not know who all the members are, only that members exist
 Each host keeps track of which mcast groups it is subscribed to
   The socket API informs the IGMP process of all joins
26
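As a concrete illustration of that last point, here is a minimal receiver-side sketch in Python (the group and port are arbitrary examples): the IP_ADD_MEMBERSHIP socket option is what hands the join to the kernel's IGMP process, which then announces it on the subnet.

```python
# Minimal multicast receiver: the setsockopt() join triggers the IGMP Report.
import socket
import struct

GROUP, PORT = "224.2.0.1", 5004   # illustrative group and port

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", PORT))

# Join: the kernel records the membership and IGMP announces it.
mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

data, src = sock.recvfrom(2048)   # then receive datagrams like any UDP socket
```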
IGMP: subscribe to a group (1)

[Figure, from UREC: hosts 1, 2 and 3 hold memberships in 224.2.0.1 and 224.5.5.5; the router, whose membership table is still empty, periodically sends an IGMP Query to 224.0.0.1.]
224.0.0.1 reaches all multicast hosts on the subnet
27
IGMP: subscribe to a group (2)

[Figure, from UREC: one host answers with a Report for 224.2.0.1 and the other members of that group suppress theirs ("somebody has already subscribed for the group"); the router adds 224.2.0.1 to its table.]
28
IGMP: subscribe to a group (3)

[Figure, from UREC: a host then sends a Report for 224.5.5.5; the router's table now holds 224.2.0.1 and 224.5.5.5.]
29
Data distribution example

[Figure, from UREC: data addressed to 224.2.0.1 arrives at the router, which forwards it ("OK", the group is in its table) to the subscribed hosts.]
30
IGMP: leave a group (1)

[Figure, from UREC: a host sends a Leave for 224.2.0.1 to 224.0.0.2; the router's table still holds 224.2.0.1 and 224.5.5.5.]
224.0.0.2 reaches the multicast-enabled routers on the subnet
31
IGMP: leave a group (2)

[Figure, from UREC: the router sends a group-specific IGMP Query for 224.2.0.1 to check whether any member remains.]
32
IGMP: leave a group (3)

[Figure, from UREC: another member ("Hey, I'm still here!") sends a Report for 224.2.0.1, so the router keeps both 224.2.0.1 and 224.5.5.5 in its table.]
33
IGMP: leave a group (4)

[Figure, from UREC: a host sends a Leave for 224.5.5.5 to 224.0.0.2.]
34
IGMP: leave a group (5)

[Figure, from UREC: the router sends an IGMP Query for 224.5.5.5; nobody answers, so only 224.2.0.1 remains in its table.]
35
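The query/report exchange walked through in the preceding slides fits in a few lines. Below is a simplified host-side sketch (IGMPv2-style behavior; the timer values and method names are my own assumptions): on a Query, each member arms a random timer per subscribed group, and it suppresses its own Report if it overhears another member's Report first, since one Report per group is enough for the router.

```python
# Simplified host-side IGMP report scheduling with suppression.
import random

class IgmpHost:
    def __init__(self, groups):
        self.groups = set(groups)        # e.g. {"224.2.0.1", "224.5.5.5"}
        self.timers = {}                 # group -> delay before reporting

    def on_query(self, max_resp=10.0):
        # Arm a random timer for every subscribed group.
        for g in self.groups:
            self.timers[g] = random.uniform(0, max_resp)

    def on_report_seen(self, group):
        # Suppression: somebody already answered for this group.
        self.timers.pop(group, None)

    def on_timer_fired(self, group):
        # Our timer won the race: multicast the Report to the group itself.
        self.timers.pop(group, None)
        return ("Report", group)
```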
Part II
Introducing reliability
User perspective of the Internet

[Figure: from UREC, http://www.urec.fr]
37
Links: the basic element in networks

 Backbone links
   optical fibers
   2.5 to 160 Gbits/s with DWDM techniques
 End-user access
   9.6 Kbits/s (GSM) to 2 Mbits/s (UMTS)
   56 Kbits/s V.90 modem on twisted pair
   64 Kbits/s to 1930 Kbits/s ISDN access
   512 Kbits/s to 2 Mbits/s with xDSL modems
   1 Mbits/s to 10 Mbits/s cable modem
   155 Mbits/s to 2.5 Gbits/s SONET/SDH
38
Routers: key elements of internetworking

 Routers
   run routing protocols and build the routing table,
   receive data packets and perform relaying,
   may have to consider Quality of Service constraints when scheduling packets,
   are highly optimized for packet forwarding functions.
39
The Wild Wild Web

[Figure: important data crossing a cloud of heterogeneity, link failures and congested routers; packet loss, packet drops, bit errors… will it get through?]
40
Multicast difficulties

 At the routing level
   management of the group address (IGMP)
   dynamic nature of the group membership
   construction of the multicast tree (DVMRP, PIM, CBT…)
   multicast packet forwarding
 At the transport level
   reliability, loss recovery strategies
   flow control
   congestion avoidance
41
Reliability Models

 Reliability requires redundancy to recover from uncertain loss or other failure modes.
 Two types of redundancy:
   Spatial redundancy: independent backup copies
     Forward error correction (FEC) codes
     Problem: this requires significant overhead, and since the FEC is itself carried in the packets, it cannot recover from the erasure of all the packets
   Temporal redundancy: retransmit if packets are lost or corrupted
     Lazy: trades off response time for reliability
     The design of status reports and retransmission optimization is important
42
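As a toy illustration of spatial redundancy: a single XOR parity packet per block lets a receiver rebuild any one erased packet of that block. Real FEC codes generalize this idea; the sketch below is mine, not from the slides.

```python
# One XOR parity packet per block of equal-sized packets.
def parity(block):
    out = bytearray(len(block[0]))
    for pkt in block:
        for i, b in enumerate(pkt):
            out[i] ^= b                 # XOR all packets byte by byte
    return bytes(out)

block = [b"pkt0", b"pkt1", b"pkt2"]     # one FEC block
p = parity(block)                       # the redundant packet sent alongside

# Suppose pkt1 is erased: XOR of everything that did arrive restores it.
recovered = parity([block[0], block[2], p])
assert recovered == b"pkt1"
```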
Temporal Redundancy Model

[Figure: packets flow from sender to receiver; timeouts trigger status reports, and status reports trigger retransmissions.]
 Status reports: sequence numbers, CRC or checksum, ACKs, NAKs, SACKs, bitmaps
 Retransmissions: packets, FEC information
43
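A minimal receiver-side sketch of this model (the structure is my own assumption): sequence numbers reveal gaps, gaps trigger NAK-style status reports, and retransmissions fill the holes.

```python
# Gap detection driving NAK-style status reports.
expected = 0          # next in-order sequence number
missing = set()       # packets still awaiting a retransmission

def send_nack(seq):
    print(f"NACK {seq}")               # stand-in for the real control channel

def on_data(seq):
    """Called for every received data packet (original or retransmitted)."""
    global expected
    if seq > expected:                 # gap: expected..seq-1 were lost
        for s in range(expected, seq):
            missing.add(s)
            send_nack(s)
    expected = max(expected, seq + 1)
    missing.discard(seq)               # a retransmission fills the hole
```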
Part III
End-to-end solutions
End-to-end solutions for reliability

 Sender-reliable
   The sender detects packet losses by gaps in the ACK sequence
   Easy resource management
 Receiver-reliable
   Receivers detect packet losses and send NACKs towards the source
45
Challenge: Reliable multicast scalability

 Many problems arise with 10,000 receivers...
 Problem 1: scalable control traffic
   ACK each data packet (à la TCP)... oops, 10,000 ACKs/pkt!
   NAK (negative ack) only on failure... oops, if a pkt is lost close to the src, 10,000 NAKs!
[Figure: source implosion! Feedback from every receiver converges on the source.]
46
Challenge: Reliable multicast scalability

 Problem 2: exposure
   receivers may receive the same packet several times
47
One example – SRM
Scalable Reliable Multicast

 Receiver-reliable
   NACK-based
   Not much per-receiver state at the sender
 Every member may multicast a NACK or a retransmission
48
SRM (cont’)

 NACK/retransmission suppression
   Delay before sending
   Based on RTT estimation
   Deterministic + stochastic
 Periodic session messages
   Sequence numbers: detection of loss
   Estimation of the distance matrix among members
49
SRM Request Suppression

[Figure: a step-by-step animation, from Haobo Yu, Christos Papadopoulos, of request suppression in a tree rooted at Src, originally spread over slides 50 to 62.]
50-62
Deterministic Suppression

[Figure, from Haobo Yu, Christos Papadopoulos: a chain of nodes, each at distance d from the next, with the sender at one end; data, session messages, the requestor's NACK and the repairer's repair line up at times d, 2d, 3d and 4d.]
Delay = C1 × d(S,R)
63
SRM Star Topology

[Figure: a star topology rooted at Src, from Haobo Yu, Christos Papadopoulos.]
64
SRM: Stochastic Suppression

[Figure, from Haobo Yu, Christos Papadopoulos: receivers 0 to 3, each at distance d from the sender; data, session messages, NACK and repair exchanged over time, with the NACK timer drawn at random.]
Delay = U[0, C2] × d(S,R)
65
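Putting the two previous slides together: a receiver at distance d(S,R) from the source schedules its request at a delay drawn uniformly from [C1 × d(S,R), (C1+C2) × d(S,R)], and cancels it if it overhears an equivalent NACK first. A sketch (the constants and class shape are illustrative, not SRM's code):

```python
# SRM-style request timer: deterministic floor plus stochastic spread.
import random

def request_delay(d_SR, C1=2.0, C2=2.0):
    """Delay before multicasting a NACK, scaled by distance to the source."""
    return random.uniform(C1, C1 + C2) * d_SR

class Receiver:
    def __init__(self, d_SR):
        self.d_SR = d_SR
        self.pending = {}                   # seq -> scheduled NACK time

    def on_loss(self, seq, now):
        self.pending[seq] = now + request_delay(self.d_SR)

    def on_overheard_nack(self, seq):
        self.pending.pop(seq, None)         # suppression: someone asked already
```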
What’s missing?

 Losses on link (A,C) cause a retransmission to the whole group
 Better: only retransmit to those members who lost the packet
 [Better still: only request from the nearest responder]
[Figure, from Haobo Yu, Christos Papadopoulos: a multicast tree with sender S, internal nodes A, B, C and receivers D, E, F; a loss on link (A,C) only affects the subtree below C.]
66
Idea: perform local recovery with scope limitation

 TTL-scoped multicast
   use the TTL field of IP packets to limit the scope of the repair packet
[Figure: repair scopes TTL=1, TTL=2 and TTL=3 radiating outward from the repairer, below Src.]
67
Example: RMTP

 Reliable Multicast Transport Protocol, by Purdue and AT&T Research Labs
 Designed for file dissemination (single sender)
 Deployed in AT&T’s billing network
from Haobo Yu, Christos Papadopoulos
68
RMTP: Fixed hierarchy

 Rcvrs are grouped into local regions
 Each rcvr unicasts periodic ACKs to its ACK Processor (AP); the AP unicasts its own ACK to its parent
 A rcvr dynamically chooses the closest statically configured Designated Receiver (DR) as its AP
[Figure, from Haobo Yu, Christos Papadopoulos: sender S at the root; routers R1 to R5; each region's receivers send ACKs (A) to their DR/AP, which aggregates them upward.]
69
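The aggregation step is simple to sketch (the names below are assumed, not RMTP's): each AP tracks the last in-order sequence number per child and reports the region's minimum upstream, so each parent only ever sees one ACK per region.

```python
# RMTP-style ACK aggregation at an ACK Processor (AP).
class AckProcessor:
    def __init__(self, children):
        self.lowest = {c: -1 for c in children}   # last in-order seq per child

    def on_ack(self, child, seq):
        # Periodic unicast ACK from one receiver in the local region.
        self.lowest[child] = max(self.lowest[child], seq)

    def ack_to_parent(self):
        # The region has everything up to its slowest member's sequence number.
        return min(self.lowest.values())
```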
RMTP: Error control

 The DR checks retx "requests" periodically
 Mcast or unicast retransmission
   Based on the percentage of requests
   Scoped mcast for local recovery
from Haobo Yu, Christos Papadopoulos
70
RMTP: Comments

 Pro: heterogeneity. A lossy link or a slow receiver only affects its local region.
 Con: the position of the DR is critical. The static hierarchy cannot adapt the local recovery zone to the loss points.
from Haobo Yu, Christos Papadopoulos
71
Summary: reliability problems

 What is the problem with loss recovery?
   feedback (ACK or NACK) implosion
   duplication of replies/repairs
   difficult adaptability to dynamic membership changes
 Design goals
   reduce the feedback traffic
   reduce recovery latencies
   improve recovery isolation
72
Summary: end-to-end solutions

 ACK/NACK aggregation based on timers is approximate!
 TTL-scoped retransmissions are approximate!
 Not really scalable!
 Cannot exploit in-network information.
73
Part IV
Active solutions
What are active networks?

 Programmable nodes/routers
 Customized computations on packets
 Standardized execution environment and programming interface
 However, they add extra processing cost
75
Motivations behind active networking

 user applications can implement and deploy customized services and protocols
   specific data filtering criteria (DIS, HLA)
   fast collective and gather operations…
 globally better performance by reducing the amount of traffic
   high throughput
   low end-to-end latency
76
Active network implementations

 Discrete approach (the operator's approach)
   Adds dynamic deployment features to nodes/routers
   New services can be downloaded into the router's kernel
 Integrated approach
   Adds executable code to data packets
   Capsule = data + code
   Granularity set at the packet level
77
The discrete approach

 Separates the injection of programs from the processing of packets
[Figure: active codes A1 and A2 are installed on the routers ahead of time; the data packets then flow through them.]
78
The integrated approach

 User packets carry the code to be applied to the data part of the packet
[Figure: a capsule (code + data) travels ahead of plain data packets.]
 High flexibility to define new services
79
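A toy rendering of the capsule idea (purely illustrative; real execution environments sandbox and standardize this, as the previous slides note): the packet ships the code that each active node instantiates and applies to the packet's own data.

```python
# Toy capsule: data plus the code each active node applies to it.
capsule = {
    "code": "def handle(data, node):\n    return data.upper()",  # shipped code
    "data": b"hello",
}

def process_capsule(capsule, node):
    env = {}
    exec(capsule["code"], env)           # the node instantiates the service
    return env["handle"](capsule["data"].decode(), node)

print(process_capsule(capsule, node="router-1"))   # -> HELLO
```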
An active router

[Figure: on the input side, IP input processing feeds a filter; matching AL packets are diverted up to some layer for executing code, let's call it the Active Layer, while ordinary IP packets go straight through. A routing agent maintains the forwarding table, and packet schedulers sit on the IP output processing paths.]
80
Where to put active components?

 In the core network?
   routers already have to process millions of packets per second
   gigabit rates make additional processing difficult without a dramatic slowdown
 At the edge?
   to efficiently handle the heterogeneity of user accesses
   to provide QoS and implement intelligent congestion avoidance mechanisms…
81
Users' accesses

[Figure: residential (PSTN, ADSL, cable…) and office users reach the Internet through network providers; a metro ring interconnects a campus, a data center and the providers.]
82
Solutions

 Traditional
   end-to-end retransmission schemes
   scoped retransmission with the TTL field
   receiver-based local NACK suppression
 Active contributions
   caching of data to allow local recoveries
   feedback aggregation
   subcast
   early lost packet detection
   …
83
The reliable multicast universe

[Figure: a map of the reliable multicast protocol universe built up over 10 human years (which means much more in computer years). Protocols are grouped into three families: "End to End", "Application-based" and "Router supported, active networking". Named protocols include MTP, XTP, SRM, RMTP, TRAM, LBRM (with a logging server/replier), AFDP, RLC, RLM, FLID, Layered/FEC, LMS, RMF, HBM, YOID, ALMI, PGM, AER, RMANP, CIFL, ARM and DyRAM.]

Router supported, active networking

[Figure: zoom on the router-supported, active-networking family: PGM, AER, RMANP, ARM, DyRAM.]
Routers have specific functionalities/services for supporting multicast flows. Active networking goes a step further by opening routers to dynamic code provided by end-users. This opens new perspectives for efficient in-network services and rapid deployment.

A step toward active services: LBRM
86
Active local recovery

 routers cache data packets
 repair packets are sent by routers when available
[Figure: data1…data5 flow downstream and are cached at the router; a receiver missing data4 is repaired locally from the router's cache.]
87
Global NACK suppression

[Figure: several receivers NACK the lost data4, but only one NACK is forwarded to the source.]
88
Local NACK suppression

[Figure: the router absorbs duplicate NACKs for the same data packet arriving from its downstream links.]
89
Early lost packet detection

 The repair latency can be reduced if the lost packet is requested as soon as possible
[Figure: seeing data3 then data5, the router detects the gap and sends a NACK for data4 itself; the receivers' later NACKs for data4 are ignored.]
90
Active subcast features

 Send the repair packet only to the relevant set of receivers
91
The DyRAM framework
(Dynamic Replier Active Reliable Multicast)

 Motivations for DyRAM
   low recovery latency using local recovery
   low memory usage in routers: local recovery is performed from the receivers (no cache in routers)
   low processing overhead in routers: light active services
92
DyRAM's main active services

 DyRAM is NACK-based with…
   global NACK suppression
   early packet loss detection
   subcast of repair packets
   dynamic replier election
93
Replier election

 A receiver is elected to be the replier for each lost packet (one recovery tree per packet)
 Load balancing can be taken into account in the replier election
94
Replier election and repair

[Figure: DyRAM routers D0 and D1 sit above receivers R1…R7, with plain IP multicast clouds in between. NAK 2 arrives at D1 on links 1 and 2; D1 forwards a single NAK 2 upstream, carrying the elected replier's address (@), and the elected replier's Repair 2 is subcast only on the links that reported the loss.]
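The router-side logic of this figure can be sketched as follows (an illustration under my own assumptions, not the DyRAM implementation): NACKs are aggregated per lost packet, only the first goes upstream together with the elected replier's address, and the repair is subcast only on the links that reported the loss.

```python
# Per-packet DyRAM-style router logic: aggregate, elect, subcast.
class ActiveRouter:
    def __init__(self):
        self.nack_state = {}   # seq -> set of downstream links that lost it

    def on_nack(self, seq, link, upstream_send, elect):
        first = seq not in self.nack_state
        self.nack_state.setdefault(seq, set()).add(link)
        if first:
            # Global NACK suppression: only the first NACK is forwarded,
            # carrying the address (@) of the elected replier.
            replier = elect(seq)          # e.g. a receiver on another link
            upstream_send(seq, replier)

    def on_repair(self, seq, subcast_send):
        # Subcast: forward the repair only where packet `seq` was missed.
        for link in self.nack_state.pop(seq, ()):
            subcast_send(seq, link)
```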
The DyRAM framework for grids

[Figure: a source behind an active router (1000 Base FX), a core network at Gbits rates, and active routers (100 Base FX) in front of the receiver groups.]
 The backbone is very fast, so it runs nothing other than fast forwarding functions.
 Active routers near the source: NACK suppression, subcast, loss detection.
 Active routers near the receivers: NACK suppression, subcast, replier election.
 Any receiver can be elected as a replier for a lost packet.
 A hierarchy of active routers can be used to run specific functions at the different layers of the hierarchy.

Network model
[Figure: the simulated topology, a source router feeding the receiver groups, used for a 10 MBytes file transfer.]
97
Local recovery from the receivers

[Plot: completion time, 4 receivers/group, #grp: 6…24, p = 0.25.]
 Local recovery reduces the end-to-end delay (especially for high loss rates and a large number of receivers).
98
Local recovery from the receivers

[Plot: bandwidth consumption, 48 receivers distributed in g groups, #grp: 2…24.]
 As the group size increases, doing the recoveries from the receivers greatly reduces the bandwidth consumption.
99
DyRAM vs ARM

 ARM performs better than DyRAM only for very low loss rates, and then only with considerable caching requirements.
100
Simulation results

[Plot: 4 receivers/group, #grp: 6…24, p = 0.25.]
 The simulation results are very close to those of the analytical study.
 EPLD (early packet loss detection) is very beneficial to DyRAM.
101
DyRAM implementation
Preliminary experimental results

Testbed configuration:
 TAMANOIR active execution environment
 Java 1.3.1 and a Linux 2.4 kernel
 A set of PC receivers and 2 PC-based routers (Pentium II 400 MHz, 512 KB cache, 128 MB RAM)
 Data packets of 4 KB
 Packet format: ANEP

Data/repair packets: S@IP | D@IP | SVC | SEQ | isR | Payload
NACK packets: S@IP | D@IP | SVC | SEQ | S@IP
The data path

[Figure: an FTP flow from S enters the TAMANOIR node on the FTP port; the payload (S, @IP, data) is wrapped into an ANEP packet (IP | UDP | S,@IP | data) and handled on the Tamanoir port over UDP/IP.]
Router’s data structures

 The track list TL, which maintains, for each multicast session:
   lastOrdered: the sequence number of the last packet received in order
   lastReceived: the sequence number of the last received data packet
   lostList: the list of data packets not received in between
 The NACK structure NS, which keeps, for each lost data packet:
   seq: the sequence number of the lost data packet
   subList: the list of IP addresses of the downstream receivers (active routers) that have lost it
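A sketch of the two structures in Python (the field names follow the slide; the update logic is my own assumption, consistent with the definitions above):

```python
# TL and NS router structures, sketched with dataclasses.
from dataclasses import dataclass, field

@dataclass
class TrackList:                 # TL: one per multicast session
    lastOrdered: int = -1        # last sequence number received in order
    lastReceived: int = -1       # last sequence number received at all
    lostList: set = field(default_factory=set)  # the gaps in between

    def on_data(self, seq):
        if seq > self.lastReceived + 1:          # gap: record the losses
            self.lostList.update(range(self.lastReceived + 1, seq))
        self.lostList.discard(seq)               # a repair fills its hole
        self.lastReceived = max(self.lastReceived, seq)
        # Advance the in-order pointer past any filled gaps.
        while (self.lastOrdered + 1 not in self.lostList
               and self.lastOrdered < self.lastReceived):
            self.lastOrdered += 1

@dataclass
class NackStruct:                # NS: one per lost data packet
    seq: int                     # sequence number of the lost packet
    subList: list = field(default_factory=list)  # downstream losers' addresses
```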
The first configuration

[Figure: testbed machines ike, resama, resamo, stan and resamd.]
Active service costs

 NACK: 135 μs
 DP (data packet): 20 μs if there is no sequence gap; 12 to 17 ms otherwise because of the timer setting (only 256 μs without it)
 Repair: 123 μs
The second configuration

[Figure: testbed machines ike and resamo.]
The replier election cost

 The election is performed on-the-fly.
 Its cost depends on the number of downstream links.
 It ranges from 0.1 to 1 ms for 5 to 25 links per router.
Conclusions

 Reliability for large-scale multicast is difficult.
 End-to-end solutions face many critical problems with only approximate solutions.
 Active services can provide more efficient solutions.
 The main DyRAM design goal is to reduce end-to-end latencies using active services.
 Preliminary results are very encouraging.
111