Best Current Practices for IPv4 Multicast Deployment Bill Nickless [email protected] http://www.mcs.anl.gov/home/nickless Presented by: Bill Nickless.

Best Current Practices for IPv4
Multicast Deployment
Bill Nickless
[email protected]
http://www.mcs.anl.gov/home/nickless
Presented by: Bill Nickless
What is Multicast?
• A multicast sender simply sends its data,
and intervening routers "conspire" to get
the data to all interested listeners.
(S. Deering)
• Destination of IP multicast packets is a
“Group” address, within 224.0.0.0/4.
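As a small illustration of the group address range, the following sketch uses Python's standard `ipaddress` module to test whether an address falls inside 224.0.0.0/4. The helper name `is_group_address` is made up for this example.

```python
import ipaddress

# The IPv4 multicast ("group") address block named on the slide.
MULTICAST_BLOCK = ipaddress.ip_network("224.0.0.0/4")


def is_group_address(addr: str) -> bool:
    """Return True if addr falls inside 224.0.0.0/4."""
    return ipaddress.ip_address(addr) in MULTICAST_BLOCK


for addr in ("224.2.177.155", "239.255.255.255", "140.221.11.103"):
    print(addr, is_group_address(addr))
```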
Notation
• Specific source address(es): S
• Specific group address(es): G
• Specific source traffic for a group: (S,G)
• All sources traffic for a group: (*,G)
• Rendezvous Point: RP
Any Source Multicast
• Senders send multicast group-addressed packets.
• Receivers register their interest in groups by way of
IGMPv2 (*,G) Joins
• Network keeps track of all senders for each group,
and delivers packets from all senders to each
interested Receiver.
Source Specific Multicast
• Senders send multicast group-addressed packets.
• Receivers register their interest in specific sources
sending to specific groups by way of IGMPv3 (S,G)
Joins (well, group membership reports….)
• Receivers are responsible for specifying which
Senders’ traffic they want to receive.
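The difference between the two service models can be sketched as a filter on (source, group) pairs. This is only an illustration of the idea, not how any router or host stack implements it; the function name `receiver_wants` and the use of `"*"` as a wildcard marker are assumptions for this example.

```python
def receiver_wants(source, group, joins):
    """Decide whether a packet from `source` to `group` is wanted.

    `joins` is the set of entries one receiver has registered:
    ("*", G) for an Any Source join, (S, G) for a Source Specific join.
    """
    return ("*", group) in joins or (source, group) in joins


# Any Source Multicast: the receiver names only the group.
asm_joins = {("*", "224.2.177.155")}
# Source Specific Multicast: the receiver names source and group.
ssm_joins = {("140.221.34.1", "233.2.171.1")}

print(receiver_wants("128.163.3.214", "224.2.177.155", asm_joins))  # any sender is accepted
print(receiver_wants("128.163.3.214", "233.2.171.1", ssm_joins))    # unlisted sender is filtered
print(receiver_wants("140.221.34.1", "233.2.171.1", ssm_joins))     # listed sender is accepted
```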
Reachability
NOT DEFINED BY INTERNET STANDARDS
Reachability (Where To?)
• NOT DEFINED BY INTERNET STANDARDS
• Unicast reachability is interpreted by implementation
and practice as: Send me IP packets with destination
addresses that match this advertisement.
• Think ‘show ip route’
Reachability (Whence?)
• NOT DEFINED BY INTERNET STANDARDS
• Multicast reachability is interpreted by implementation
and practice as: Here’s where to get IP packets from
sources that match this advertisement.
• Think ‘show ip rpf’
Reachability Examples
terra% netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags Iface
140.221.11.103  0.0.0.0         255.255.255.255 UH    eth0
140.221.8.0     0.0.0.0         255.255.252.0   U     eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     lo
224.0.0.0       0.0.0.0         240.0.0.0       U     eth0
0.0.0.0         140.221.11.253  0.0.0.0         UG    eth0
Reachability Examples
Kiwi#show ip route 140.221.11.103
Routing entry for 140.221.8.0/22
  Known via "ospf 683", distance 110, metric 1117, type intra area
  Last update from 140.221.20.124 on GigabitEthernet5/0, 03:35:56 ago
  Routing Descriptor Blocks:
  * 140.221.20.124, from 140.221.47.6, 03:35:56 ago, via GigabitEthernet5/0
      Route metric is 1117, traffic share count is 1
Reachability Examples
Kiwi#show ip rpf 140.221.11.103
RPF information for terra.mcs.anl.gov (140.221.11.103)
RPF interface: GigabitEthernet5/0
RPF neighbor: stardust-msfc-20.mcs.anl.gov (140.221.20.124)
RPF route/mask: 140.221.8.0/22
RPF type: unicast (ospf 683)
RPF recursion count: 0
Doing distance-preferred lookups across tables
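The "whence" interpretation can be sketched as a longest-prefix lookup on the source address, followed by a comparison with the interface the packet arrived on. The sketch below is a simplification under stated assumptions: the table is a plain dictionary, the 140.221.8.0/22 entry is taken from the output above, and the default-route entry and the names `RPF_TABLE`, `rpf_lookup` and `rpf_check` are hypothetical.

```python
import ipaddress

# prefix -> (RPF interface, RPF neighbor)
RPF_TABLE = {
    # From the 'show ip rpf' output above.
    ipaddress.ip_network("140.221.8.0/22"): ("GigabitEthernet5/0", "140.221.20.124"),
    # Hypothetical default entry, added only to exercise longest-prefix matching.
    ipaddress.ip_network("0.0.0.0/0"): ("ATM3/0.145", "192.5.170.130"),
}


def rpf_lookup(source):
    """Return (interface, neighbor) for the longest prefix covering source."""
    addr = ipaddress.ip_address(source)
    matches = [net for net in RPF_TABLE if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return RPF_TABLE[best]


def rpf_check(source, arrival_interface):
    """Accept a multicast packet only if it arrived on the RPF interface."""
    entry = rpf_lookup(source)
    return entry is not None and entry[0] == arrival_interface


print(rpf_lookup("140.221.11.103"))
print(rpf_check("140.221.11.103", "GigabitEthernet5/0"))
print(rpf_check("140.221.11.103", "ATM3/0.145"))
```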
The Old MBONE
• Excellent first approximation.
• Used tunnels to encapsulate multicast traffic over
unicast paths.
• Routing done by user-space daemons running on
general purpose Unix boxes.
• Internet Group Management Protocol (IGMP)
(Think Multicast ARP)
• Pre-dates the World Wide Web (hence SDR)
Lessons Learned from MBONE
• Distance Vector Multicast Routing Protocol
(DVMRP) does not scale
– Easy to create IP Multicast “amplifiers”.
– Separate tunneled routing infrastructure not aligned
with modern BGP Internetworking.
• Flood & Prune does not scale
– Examples: PIM-Dense Mode, DVMRP.
– Not sensitive to available bandwidth.
– Requires downstream routers that are smart and
powerful enough to send prune messages.
Applying Those Lessons
• Multicast Border Gateway Protocol.
– Provides reachability and policy control for multicast
routing, just as BGP does for unicast.
• Protocol Independent Multicast
(Sparse Mode)
– Listeners receive traffic only when requested.
– Forms multicast distribution trees.
• Multicast Source Discovery Protocol
– Finding active sources in other PIM Sparse Mode
domains (usually other ASes).
Setting Reachability Policy:
Multicast Border Gateway Protocol
• RFC 2283 adds the MP_REACH_NLRI attribute to BGP-4.
– Identifies a BGP route as unicast, multicast, or both
• When implemented in a router, all the standard BGP
machinery is available for prefix filtering, preference setting,
MEDs, AS path length comparisons, etc.
• M-BGP routes can be independent of BGP, allowing for
different inter-AS unicast/multicast reachability.
Cisco M-BGP Configuration
router bgp 683
 network 130.202.0.0 nlri unicast multicast
 network 140.221.0.0 nlri unicast multicast
 neighbor 192.5.170.130 remote-as 145 nlri unicast multicast
 neighbor 192.5.170.130 description vBNS
 neighbor 192.5.170.130 soft-reconfiguration inbound
 neighbor 192.5.170.130 route-map from-vbns-lp-400 in
 neighbor 192.5.170.130 route-map to-vbns-med-10 out
Cisco M-BGP Configuration
route-map from-vbns-lp-400 permit 10
 match nlri unicast
 set local-preference 400
!
route-map from-vbns-lp-400 permit 15
 match as-path 145
 match nlri multicast
 set local-preference 400
!
route-map to-vbns-med-10 permit 10
 match ip address 50
 set metric 10
Cisco M-BGP Configuration
access-list 50 permit 140.221.0.0
access-list 50 permit 130.202.0.0
!
ip as-path access-list 145 deny _24_
ip as-path access-list 145 deny _293_
ip as-path access-list 145 deny _11537_
ip as-path access-list 145 permit .*
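To show how as-path access-list 145 is read, the sketch below evaluates the four entries in order against an AS path string. It is an approximation, not IOS behavior in full: the Cisco `_` metacharacter is translated to a Python pattern matching the start or end of the string or a delimiter, and the names `AS_PATH_LIST_145`, `to_python_regex` and `evaluate` are made up for this example.

```python
import re

# (action, Cisco-style regular expression), in configured order.
AS_PATH_LIST_145 = [
    ("deny", "_24_"),
    ("deny", "_293_"),
    ("deny", "_11537_"),
    ("permit", ".*"),
]

# Approximation of the Cisco "_" metacharacter: start, end, or a delimiter.
DELIM = "(?:^|$|[ ,{}()])"


def to_python_regex(cisco_pattern):
    """Translate the '_' metacharacter; other syntax is passed through."""
    return cisco_pattern.replace("_", DELIM)


def evaluate(as_path, entries=AS_PATH_LIST_145):
    """Return the action of the first matching entry (implicit deny at end)."""
    for action, pattern in entries:
        if re.search(to_python_regex(pattern), as_path):
            return action
    return "deny"


for path in ("145 10490 10437", "24 145 10490 10437", "11537 683", "124 145"):
    print(path, "->", evaluate(path))
```

Under this reading, a path that transits AS 24, 293 or 11537 is rejected by the multicast clause of from-vbns-lp-400, while a path such as "124 145" is still permitted because `_24_` only matches a whole AS number.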
Juniper M-BGP Configuration
routing-options {
    rib inet.2 {
        static {
            route 141.142.0.0/16 reject;
            route 141.142.109.0/25 next-hop 141.142.11.74;
            route 141.142.109.128/25 next-hop 141.142.11.74;
            route 141.142.104.0/24 next-hop 141.142.11.74;
            route 141.142.105.0/24 next-hop 141.142.11.74;
            route 141.142.108.0/24 next-hop 141.142.11.74;
        }
    }
}
Juniper M-BGP Configuration
routing-options {
    rib-groups {
        ifrg {
            import-rib [ inet.0 inet.2 ];
        }
        mcrg {
            export-rib inet.2;
            import-rib inet.2;
        }
        igp-rg {
            export-rib inet.0;
            import-rib [ inet.0 inet.2 ];
        }
    }
}
Juniper M-BGP Configuration
protocols {
    bgp {
        group anl {
            import [ bgp-anl-accept reject-all ];
            family inet {
                any;
            }
            export [ bgp-announce-ncsa reject-all ];
            peer-as 683;
            neighbor 206.220.243.21;
        }
    }
}
Monitoring M-BGP (Cisco)
Kiwi#show ip mbgp sum
BGP router identifier 192.5.170.2, local AS number 683
MBGP table version is 324285
4121 network entries and 12621 paths using 862335 bytes of memory
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
192.5.170.130   4   145   53420   20497   324285    0    0 5d14h    346
Kiwi#show ip mbgp 128.163.3.214
MBGP routing table entry for 128.163.0.0/16, version 323761
Paths: (3 available, best #2)
24 145 10490 10437, (aggregated by 10437 128.163.55.253),
(received-only)
192.12.123.10 from 192.12.123.10 (198.10.80.66)
Origin IGP, localpref 100, valid, external, atomic-aggregate
145 10490 10437, (aggregated by 10437 128.163.55.253)
192.5.170.130 from 192.5.170.130 (204.147.135.241)
Origin IGP, localpref 400, valid, external,
atomic-aggregate, best
145 10490 10437, (aggregated by 10437 128.163.55.253),
(received-only)
192.5.170.130 from 192.5.170.130 (204.147.135.241)
Origin IGP, localpref 100, valid, external, atomic-aggregate
Monitoring M-BGP (Juniper)
nickless@charlie> show bgp neighbor 206.220.243.21
Peer: 206.220.243.21+179 AS 683 Local: 206.220.243.160+1969 AS 1224
[. . .]
NLRI advertised by peer: inet-unicast inet-multicast
NLRI for this session: inet-unicast inet-multicast
Peer supports Refresh capability (2)
Table inet.0 Bit: 10006
Active Prefixes: 13
Received Prefixes: 13
Suppressed due to damping: 0
Table inet.2 Bit: 20006
Active Prefixes: 9
Received Prefixes: 9
Suppressed due to damping: 0
nickless@charlie> show route table inet.2 140.221.34.1
inet.2: 5046 destinations, 5046 routes (5045 active, 0 holddown, 1 hidden)
+ = Active Route, - = Last Active, * = Both
140.221.0.0/16
*[BGP/170] 2w5d 19:24:04, MED 0, localpref 1000
AS path: 683 I
> to 206.220.243.21 via at-1/0/0.683
[BGP/170] 3d 04:38:22, MED 0, localpref 60
AS path: 11537 683 I
> to 141.142.11.246 via so-2/2/0.0
[BGP/170] 1w0d 11:18:35, localpref 60
AS path: 145 683 I
> to 141.142.11.1 via at-1/0/0.145
[BGP/170] 2w5d 19:23:42, localpref 60
AS path: 38 683 I
> to 192.17.8.32 via at-1/0/0.38
[BGP/170] 4d 05:55:21, MED 5, localpref 20
AS path: 2914 683 I
> to 192.17.8.34 via at-1/0/0.2914
PIM Sparse Mode
• RFC 2362 defines PIM Sparse Mode.
• No PIM-SM activity until:
– A host starts transmitting traffic (or)
– A host subscribes to a group.
• A Rendezvous Point (RP) is the root of the shared
distribution tree for multicast traffic within a PIM Domain.
• Given enough traffic, a source-based distribution tree is
created. (Enough is typically anything greater than zero).
• Inter-PIM Domain distribution trees are all source-based.
PIM Sparse Mode [figure]
Multicast Source Discovery
Protocol (MSDP)
• Not yet an RFC (in Last Call stage). See
http://www.ietf.org/html.charters/msdp-charter.html
and
ftp://ftp.ietf.org/internet-drafts/draft-ietf-msdp-spec-09.txt
• Currently only covers IPv4.
• PIM-SM RPs communicate through MSDP to find active
multicast sources.
• If “interested”, the RP initiates a PIM-SM Join towards each
active source.
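The last two bullets can be sketched as a small state machine: the RP caches every Source Active announcement it hears, and sends a PIM-SM Join towards a source only when it has receivers for that group. This is a toy model under assumptions, not the protocol's message handling; the class `RendezvousPoint` and its field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RendezvousPoint:
    # Groups for which this RP currently has interested receivers.
    groups_with_receivers: set = field(default_factory=set)
    # (source, group) pairs learned from MSDP Source Active messages.
    sa_cache: set = field(default_factory=set)
    # (source, group) Joins this RP has initiated, in order.
    joins_sent: list = field(default_factory=list)

    def receive_sa(self, source, group):
        """Cache the announcement; join only if the RP is 'interested'."""
        self.sa_cache.add((source, group))
        interested = group in self.groups_with_receivers
        if interested and (source, group) not in self.joins_sent:
            self.joins_sent.append((source, group))


rp = RendezvousPoint(groups_with_receivers={"224.2.177.155"})
rp.receive_sa("128.197.160.27", "224.2.177.155")  # receivers exist: join
rp.receive_sa("140.221.34.1", "233.2.171.1")      # no receivers: cache only
print(sorted(rp.sa_cache))
print(rp.joins_sent)
```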
Reachability Redux
• A BGP NLRI=Multicast route is a statement of reachability.
• Inter-domain PIM-Sparse Mode Joins follow the BGP
reachability topology.
• MSDP forwarding between RPs follows the BGP
reachability topology.
• Not doing MSDP where you do M-BGP means that you’ve
formed an MSDP “black hole”.
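One way to look for such black holes is to compare the set of neighbor ASes that have an M-BGP session with the set that have an MSDP session. The sketch below does only that set comparison. Comparing by AS number is an assumption of this example (peer addresses for the two protocols can differ), AS 64512 is a made-up private AS used for illustration, and `msdp_black_holes` is a hypothetical helper name.

```python
def msdp_black_holes(mbgp_peer_ases, msdp_peer_ases):
    """Return neighbor ASes with an M-BGP session but no MSDP session."""
    return sorted(set(mbgp_peer_ases) - set(msdp_peer_ases))


# AS 145 appears in both roles in this deck; AS 64512 is hypothetical.
mbgp_peers = {145, 64512}
msdp_peers = {145}

print(msdp_black_holes(mbgp_peers, msdp_peers))
```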
Cisco PIM-SM w/ MSDP Configuration
interface ATM3/0.145 point-to-point
 description vBNS MBGP+PIM-SM+MSDP
 ip address 192.5.170.129 255.255.255.252
 ip pim border
 ip pim sparse-mode
 ip multicast ttl-threshold 32
 ip multicast boundary 10
!
ip msdp peer 204.147.128.141
ip msdp description 204.147.128.141 vBNS
ip msdp sa-filter in 204.147.128.141 list 111
ip msdp sa-filter out 204.147.128.141 list 111
ip msdp sa-request 204.147.128.141
ip msdp ttl-threshold 204.147.128.141 32
ip msdp cache-sa-state
access-list 10 deny   224.0.1.39 ! CISCO-RP-ANNOUNCE.MCAST.NET
access-list 10 deny   224.0.1.40 ! CISCO-RP-DISCOVERY.MCAST.NET
access-list 10 deny   239.0.0.0 0.255.255.255
access-list 10 permit 224.0.0.0 15.255.255.255
!
access-list 111 deny   ip any host 224.0.2.2 ! SUN-RPC.MCAST.NET
access-list 111 deny   ip any host 224.0.1.3 ! RWHOD.MCAST.NET
access-list 111 deny   ip any host 224.0.1.24 ! MICROSOFT-DS.MCAST.NET
access-list 111 deny   ip any host 224.0.1.22 ! SVRLOC.MCAST.NET
access-list 111 deny   ip any host 224.0.1.2 ! SGI-DOG.MCAST.NET
access-list 111 deny   ip any host 224.0.1.35 ! SVRLOC-DA.MCAST.NET
access-list 111 deny   ip any host 224.0.1.60 ! HP-DEVICE-DISC.MCAST.NET
access-list 111 deny   ip any host 224.0.1.39 ! CISCO-RP-ANNOUNCE.MCAST.NET
access-list 111 deny   ip any host 224.0.1.40 ! CISCO-RP-DISCOVERY.MCAST.NET
access-list 111 deny   ip any 239.0.0.0 0.255.255.255
access-list 111 deny   ip 10.0.0.0 0.255.255.255 any
access-list 111 deny   ip 127.0.0.0 0.255.255.255 any
access-list 111 deny   ip 172.16.0.0 0.15.255.255 any
access-list 111 deny   ip 192.168.0.0 0.0.255.255 any
access-list 111 permit ip any any
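The multicast boundary list (access-list 10) relies on Cisco wildcard masks, where a 1 bit means "ignore this bit". The following sketch evaluates the four entries of access-list 10 in order, first match wins, with an implicit deny at the end. It is a simplified model for reading the list, not a router implementation; the names `ACCESS_LIST_10`, `wildcard_match` and `evaluate` are made up for this example, and the two host entries are written with an explicit 0.0.0.0 wildcard.

```python
import ipaddress

# (action, base address, wildcard mask), in configured order.
ACCESS_LIST_10 = [
    ("deny", "224.0.1.39", "0.0.0.0"),          # CISCO-RP-ANNOUNCE
    ("deny", "224.0.1.40", "0.0.0.0"),          # CISCO-RP-DISCOVERY
    ("deny", "239.0.0.0", "0.255.255.255"),
    ("permit", "224.0.0.0", "15.255.255.255"),
]


def wildcard_match(addr, base, wildcard):
    """Compare only the bits where the wildcard mask is 0."""
    a = int(ipaddress.IPv4Address(addr))
    b = int(ipaddress.IPv4Address(base))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (a & care) == (b & care)


def evaluate(addr, acl=ACCESS_LIST_10):
    """Return the action of the first matching entry (implicit deny)."""
    for action, base, wildcard in acl:
        if wildcard_match(addr, base, wildcard):
            return action
    return "deny"


for group in ("224.0.1.39", "239.1.2.3", "233.2.171.1", "224.2.177.155"):
    print(group, "->", evaluate(group))
```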
Juniper PIM-SM w/ MSDP Config
protocols {
    pim {
        rib-group mcrg;
        rp {
            local {
                address 141.142.12.1;
            }
        }
        interface all {
            mode sparse;
            version 2;
        }
    }
}
Juniper PIM-SM w/ MSDP Config
protocols {
    msdp {
        rib-group mcrg;
        group anl {
            /* kiwi-loop.anchor.anl.gov */
            peer 192.5.170.2 {
                local-address 141.142.12.1;
            }
        }
    }
}
Monitoring MSDP and PIM-Sparse
• Verify that MSDP session has come up with your
peer:
Kiwi#show ip msdp sum
MSDP Peer Status Summary
Peer Address     AS   State  Uptime/   Reset  Peer Name
                             Downtime  Count
204.147.128.141  145  Up     1d12h     11     cs.dng.vbns.net
nickless@charlie> show msdp peer 192.5.170.2
Peer address    Local address   State        Last up/down  Peer-Group
192.5.170.2     141.142.12.1    Established  2w5d18h       anl
Monitoring MSDP and PIM-Sparse
• Verify that active sources are being discovered:
Kiwi#show ip msdp sa-cache 224.2.177.155
MSDP Source-Active Cache - 4020 entries
(128.197.160.27, 224.2.177.155), RP 204.147.128.141, MBGP/AS 145, 03:40:18/00:05:03
[…etc]
nickless@charlie> show msdp source-active group 233.2.171.1
Group address   Source address  Peer address     Originator   Flags
233.2.171.1     140.221.34.1    141.142.11.246   192.5.170.2  Accept
                                192.5.170.2      192.5.170.2  Accept
                                192.17.8.32      192.5.170.2  Accept
                                204.147.128.141  192.5.170.2  Accept
Monitoring MSDP and PIM-Sparse
• Verify that you are receiving traffic from those
active sources, and are forwarding:
Kiwi#show ip mroute count 224.2.177.155 128.163.3.214
Forwarding Counts: Pkt Count/Pkts per second/
Avg Pkt Size/Kilobits per second
Other counts: Total/RPF failed/
Other drops(OIF-null, rate-limit etc)
Group: 224.2.177.155, Source count: 26,
Group pkt count: 31060731
RP-tree: Forwarding: 159/0/429/0, Other: 72/0/0
Source: 128.163.3.214/32, Forwarding: 7089/0/480/0,
Other: 6/0/0
Kiwi#show ip mroute 224.2.177.155 128.163.3.214
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, C - Connected, L - Local,
P - Pruned R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry,
X - Proxy Join Timer Running
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
(128.163.3.214, 224.2.177.155), 03:55:28/00:03:22, flags: MT
Incoming interface: ATM3/0.145, RPF nbr 192.5.170.130, Mbgp
Outgoing interface list:
ATM0/0.216, Forward/Sparse, 03:55:28/00:03:08
ATM0/0.200, Forward/Sparse, 03:55:28/00:02:04
nickless@charlie> show multicast route group 233.2.171.1 \
source-prefix 140.221.34.1 extensive
Group           Source prefix      Act Pru NHid  Packets   IfMismatch T/O
233.2.171.1     140.221.34.1   /32 A   F   68    1829657   0          355
    Upstream interface: at-1/0/0.683
    Session name: Static Allocations
nickless@charlie> show multicast route group 233.2.171.1 \
source-prefix 140.221.34.1 extensive
Group           Source prefix      Act Pru NHid  Packets   IfMismatch T/O
233.2.171.1     140.221.34.1   /32 A   F   68    1830512   0          355
    Upstream interface: at-1/0/0.683
    Session name: Static Allocations
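Running the same command twice and comparing the Packets column is a quick check that traffic is actually being forwarded. The sketch below just computes the difference between the two samples shown above. The time between the two runs is not given in the slides, so no rate is derived; `packets_forwarded` is a hypothetical helper name.

```python
def packets_forwarded(first_sample, second_sample):
    """Return how many packets were counted between two readings."""
    if second_sample < first_sample:
        raise ValueError("counter went backwards (reset or wrap?)")
    return second_sample - first_sample


# Packets column from the two 'show multicast route' runs above.
delta = packets_forwarded(1829657, 1830512)
print(delta)  # 855
```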
nickless@charlie> show pim join 233.2.171.1 extensive
Group           Source          RP              Flags
[. . .]
233.2.171.1     140.221.34.1                    sparse,spt-pending
    Upstream interface: at-1/0/0.683
    Upstream State: Local RP, Join to Source
    Downstream Neighbors:
        Interface: ge-1/1/0.103
            141.142.0.14 State: Join Flags: S Timeout: 182
        Interface: gr-1/2/0.0
            141.142.11.74 State: Join Flags: S Timeout: 208
Other Tips
• ATM peerings are best done with point-to-point
subinterfaces. (What’s a Designated Router in the
context of an ATM exchange point, anyway?)
• MSDP Source Actives are made from PIM Register
messages. If you’re not sending MSDP SA
messages for a source, you may have a problem
with the Designated Router for that source.
More Tips
• MSDP encapsulates data in its Source Active
messages (just like they were encapsulated in the
PIM Sparse Mode Register messages). This was
done primarily to support SDR.
• It is possible for MSDP to work while PIM-SM is not
working, so you can’t always count on SDR to verify
multicast routing.
Debugging Multicast
• You must have:
– at least one constantly active source
– at least one constantly active receiver
• Start near the receiver
– Identify the PIM-SM Designated Router
– Verify IGMP state in the Designated Router
– Look for (S,G) state in the Designated Router
Debugging Multicast
• Follow the Reverse Path Forwarding (RPF) from the
Designated Router back towards the source
• Verify PIM-SM has been configured on each
interface along the RPF, because that determines
the forwarding tree topology.
• Check (S,G) state in each router.
• Check (S,G) counters in each router.
Debugging Multicast
• If the source is external to your PIM Domain:
– Verify that you have an MSDP SA for that source.
– Verify that the M-BGP Next Hop is:
• A PIM Sparse Mode neighbor
• An MSDP peer
– Verify that you’re actually choosing the NLRI=Multicast
route as your preferred RPF path. (hello BGP distance)
Debugging Multicast
• What if nobody can hear your source?
– Verify that the (S,G) shows up at your RP.
– Verify that your RP is MSDP announcing the source, and
that it shows up in your peer’s MSDP SA cache.
– Verify your PIM-SM adjacency with your peer.
– Verify that you have your peer’s interface in the outgoing
list for the (S,G).
– Verify that packet counters show traffic going out.
The Beacon: Test Signal
• Testing Multicast requires active sessions
• http://dast.nlanr.net/projects/beacon
• In Java, so runs anywhere
The Beacon: Issues
• Shows current state only.
– Archive state over time?
– How to visualize evolving state? Inherently a 3-dimensional problem, since state is 2D already.
• Server scaling problems with O(40) beacons.
– Currently seeing O(70) beacons at any time.
• Assumes Any Source Multicast model.
Core Multicast Building Blocks
• M-BGP: RFC 2283 is implemented by Juniper and
Cisco in all major releases. AG community has
used Juniper/Cisco the most.
• MSDP: Implemented by Juniper, Cisco, Foundry...
• PIM-Sparse Mode: RFC 2362 is implemented by a
whole raft of vendors, including Cisco, Juniper,
Foundry, Extreme, Marconi, etc.
Edge Multicast Building Blocks
• IGMPv2 is widely available in Layer 2 and Layer 3
devices, and in most host operating systems.
• IGMPv3 is coming soon to support SSM:
– Available in Layer 3 devices from Cisco and Juniper.
– IGMPv3 will be available in Windows XP (Whistler).
– Ugly hack workarounds exist (URD et al).
North American IP Multicast Status
• ESNet, Abilene, vBNS+, and NREN all running M-BGP,
MSDP, and PIM-SM amongst themselves and with their
customers/peers.
• Regional and Institutional networks are currently the
most common stumbling blocks for multicast apps.
• STARTAP in Chicago is an international IP multicast
meeting point.
• International / commercial networks are coming online.