Transcript PRESENTATION TITLE/SIZE 30 - Asia Pacific Regional
IP Multicast update Apricot 2006
Toerless Eckert [email protected]
Session Number Presentation_ID © 2004 Cisco Systems, Inc. All rights reserved.
Agenda
• Network Services
  • Traditional ASM “Any Source Multicast”
  • Source Specific Multicast (SSM) with source redundancy
• IPv4 multicast protocols
  • IGMP, PIM-SM/MSDP, Bidir-PIM, PIM SSM, …
• IPv6 multicast
  • Addressing, Embedded-RP
• Multicast RPF
  • ECMP, MP-BGP, IGP incongruency: one/two topologies
• Reliable multicast transport protocols
  • PGM, ALC/Tornado-Codecs – content preprovisioning/nVoD
IP Multicast Network Services
ASM, SSM, Source redundancy
IP Multicast Network Services
ASM
IP multicast service models describe how applications can send and receive multicast packets.
Everything application developers need to know about IP multicast (“protocol stuff is for network operators”).
• ASM: Classical IP Multicast service (RFC 1112, ~1990)
  • Called “Any Source Multicast” today
  • Sources send IP multicast packets to an IP multicast group.
  • Receivers “join the IP multicast group”.
  • The network delivers packets sent by any source to an IP multicast group to all receivers that have joined that group.
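From the application side, an ASM join is a single socket option on the receiver. A minimal Python sketch, assuming an IPv4 host with a multicast-capable interface; the helper names and the group/port in the commented example are hypothetical values for illustration only:

```python
import socket
import struct

def asm_mreq(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack struct ip_mreq: group address followed by the local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))

def join_asm_group(group: str, port: int) -> socket.socket:
    """Open a UDP socket and join `group`; packets from ANY source are delivered."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, asm_mreq(group))
    return sock

# Example (hypothetical group and port, needs a multicast-capable interface):
# sock = join_asm_group("239.1.1.1", 5000)
# data, sender = sock.recvfrom(2048)
```

Note that the receiver names only the group, never a source; that is the whole ASM contract.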
IP multicast services
SSM and Source redundancy
• SSM: Source Specific Multicast (~2000)
  • Source(s) still send IP multicast to an IP multicast group address – but this is called “send packet to (S,G) channel”!
  • Receivers need to “subscribe to (S,G) channel” – indicate to the network not only the IP multicast group but also the source (S)!
  • Network will deliver packets on a per-channel basis only
  • Need application-based source discovery mechanisms for multi-source applications
• “Redundant IP address” for source redundancy:
  • Primary target for SSM: “Single-Source” – TV/Audio/Data “broadcast” applications
  • These require source redundancy – use a single IP address (with anycast/prioritycast) to avoid dynamic source discovery
• But why SSM, is ASM not good enough or better?
  • ASM is simpler for application developers!
  • Reluctance to adopt SSM
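The difference between the two service models can be shown with a toy delivery model. This is only a sketch of the semantics, not of any protocol; the class name, receiver names and addresses are made up for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class ToyNetwork:
    asm_joins: dict = field(default_factory=dict)   # G -> set of receivers
    ssm_subs: dict = field(default_factory=dict)    # (S, G) -> set of receivers

    def join(self, receiver, group):
        """ASM: receiver joins a group, regardless of source."""
        self.asm_joins.setdefault(group, set()).add(receiver)

    def subscribe(self, receiver, source, group):
        """SSM: receiver subscribes to one (S,G) channel."""
        self.ssm_subs.setdefault((source, group), set()).add(receiver)

    def deliver(self, source, group):
        """Return the receivers that get a packet sent by `source` to `group`."""
        asm = self.asm_joins.get(group, set())
        ssm = self.ssm_subs.get((source, group), set())
        return asm | ssm

net = ToyNetwork()
net.join("rcvr1", "239.1.1.1")                      # ASM receiver
net.subscribe("rcvr2", "10.0.0.1", "232.1.1.1")     # SSM receiver

asm_wanted = net.deliver("10.0.0.1", "239.1.1.1")
asm_unwanted = net.deliver("192.0.2.66", "239.1.1.1")   # any source reaches rcvr1
ssm_wanted = net.deliver("10.0.0.1", "232.1.1.1")
ssm_unwanted = net.deliver("192.0.2.66", "232.1.1.1")   # dropped: no such channel
```

In the ASM case the unwanted source still reaches the receiver; in the SSM case the network has nothing to forward.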
IP multicast network services
Issues with ASM – resolved with SSM
• DoS attacks by unwanted sources
  • Receivers can ignore packets, but network resources can only be protected by extensive network source access control == network level application control.
• Address allocation
  • Try to get “global scope” IPv4 multicast address (GLOP, …) – “Oh, let’s do multicast group NAT then…”
• Complexity of protocol operations required
  • PIM-SM (Shared trees, shortest path trees, RPT/SPT switchover)/MSDP, RP announcement (AutoRP/BSR), RP placement, RP redundancy
  • Operating PIM-SM over core networks (MVPN, Multicast and MPLS)
  • Futures: Bandwidth reservation (RSVP, per group? Per source?), Link/Node Protection with PIM-SM
• Scalability, Speed of protocol operations (convergence)
  • Operations for both SPT and RPT needed – and their interaction
IP multicast network services Summary
• SSM is a key recent enhancement to IP multicast
  • Network operators should be very interested in using it / promoting its use over ASM (where appropriate) to provide better (manageable/scalable) multicast services
• SSM cannot replace ASM in all applications
  • Many-source applications
  • Source-discovery with IP multicast
  • ASM and SSM can coexist
• Recent means of improvement / simplification of ASM
  • Easier protocols for ASM
    • Bidir-PIM (intradomain only today)
    • Easier RP-redundancy (PIM-Anycast-RP, Prioritycast)
  • IPv6 multicast (address allocation, embedded-RP)
IPv4 multicast protocols
Protocols for IP multicast
Host to Router signaling: membership reporting
• IPv4: IGMP “Internet Group Management Protocol”
• IPv6: MLD “Multicast Listener Discovery”
• MLD
  • Sufficient for ASM
• IGMPv3/MLDv2: group and source memberships
  • Required for SSM, also supports ASM
  • No IGMPv2/MLDv1 report suppression:
    • Enables tracking per-receiver on a LAN
    • Enables null leave latency
  • IGMPv3/MLDv2 fully backward compatible (router/host)
    • But not snooping devices – must support IGMPv3/MLDv2!
• IGMPv3/MLD support in host != SSM support
  • OS may support IGMPv3, but the application may still only signal group membership reports
  • SSM transition solutions available to map group membership reports to (S,G) channel subscriptions (eg: SSM mapping)
[Figure: Rcvr sends Membership Reports to the Router; the Router sends Membership Queries]
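The idea behind SSM mapping can be sketched in a few lines: the router turns a group-only report into (S,G) subscriptions using an operator-provided table. The table contents and the function name below are hypothetical; real implementations typically take the mapping from static configuration or a lookup service:

```python
# Hypothetical static mapping table: group -> sources configured by the operator.
SSM_MAP = {
    "232.1.1.1": ["10.0.0.1"],
    "232.1.1.2": ["10.0.0.1", "10.0.0.2"],
}

def map_report(group, sources=None):
    """Translate a membership report into (S,G) channel subscriptions."""
    if sources:
        # IGMPv3/MLDv2 report that already names its sources: use as-is.
        return [(s, group) for s in sources]
    # Group-only report (legacy application): look the sources up.
    return [(s, group) for s in SSM_MAP.get(group, [])]

legacy = map_report("232.1.1.2")
native = map_report("232.1.1.1", ["10.9.9.9"])
unknown = map_report("232.9.9.9")
```

A group without a mapping yields no subscription at all in this sketch; what a router does in that case is a policy decision.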
Protocols for IP multicast
Host to Router signaling: redundant source reporting
• No IETF standard available
  • The “Multicast and Anycast Group Membership” (MAGMA) IETF WG never got around to working on it … and is now getting disbanded ;-(
• Pragmatic solution: source announces redundant source address via RIP:
  • Easily done from application (RIP uses UDP)
  • No protocol machinery required – only periodic sending.
  • Fast periodic sending for fast source failure detection
  • All routers support RIP, but RIP is seldom used in production networks
    • Allows router to be configured to easily limit RIP to only redundant source announcements
• Already used in MPEG video sourcing products
[Figure: Src sends RIP(v2) Report (UDP) to the Router]
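How little “protocol machinery” this needs can be seen from a sketch that builds a single-route RIPv2 response by hand. The packet layout follows the RIPv2 specification (RFC 2453); the function names, the announced address and the 1-second interval are assumptions for illustration, and the products mentioned above may well do it differently:

```python
import socket
import struct
import time

RIP_PORT = 520
RIP_GROUP = "224.0.0.9"   # RIPv2 routers multicast address

def rip_v2_response(address: str, mask: str, metric: int = 1) -> bytes:
    """Build a RIPv2 response carrying one route entry."""
    header = struct.pack("!BBH", 2, 2, 0)    # command=2 (response), version=2, zero
    entry = struct.pack(
        "!HH4s4s4sI",
        2,                                   # address family identifier: IP
        0,                                   # route tag
        socket.inet_aton(address),
        socket.inet_aton(mask),
        socket.inet_aton("0.0.0.0"),         # next hop 0.0.0.0 = via the sender
        metric,
    )
    return header + entry

def announce_forever(address: str, mask: str, interval: float = 1.0) -> None:
    """Periodically announce the redundant source address (never returns)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Routers ignore responses that do not come from the RIP port;
    # binding to it usually needs elevated privileges.
    sock.bind(("", RIP_PORT))
    while True:
        sock.sendto(rip_v2_response(address, mask), (RIP_GROUP, RIP_PORT))
        time.sleep(interval)

packet = rip_v2_response("10.2.3.4", "255.255.255.255")
```

When the source fails, the announcements stop and the route ages out on the router; the sending interval bounds how fast that can be detected.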
Protocols for IP multicast
Roadmap? of IP multicast protocol evolution
[Figure: roadmap of protocols: BGP, IGPs (eg: OSPF), MOSPF, RPF-flooding, PIM-DM, DVMRP, Spanning-Tree-flooding, CBT, IPv4 PIM-SM / MSDP, MP-BGP (SAFI2), AutoRP/BSR, Anycast-RP]
“eierlegende Wollmilchsau” (German) == egg-laying wool-milk-sow (the pig that gives meat, sausage, wool, feathers, milk and eggs). Aka: the universal solution that can do everything.
• RPFv4v6: Multi-topologies, IGP + BGP
• ASMv6: PIM-SM + Embedded-RP (Intra/Interdomain)
• ASMv4/v6: Bidir-PIM (Intradomain only)
• SSMv4v6: PIM-SSM (Intra/Interdomain)
Protocols for IP multicast
IPv4 ASM standard model protocols: PIM-SM / MSDP
• PIM-SM
  • Shared-Tree and Shortest-Path-Tree forwarding
  • Efficient traffic pruning in all cases
  • Complex operations: Register-tunnel operations, RPT/SPT switchover, RP placement, RP announcement/RP redundancy
• BSR and AutoRP
  • RP announcements
  • RP redundancy
• MSDP
  • For interdomain connecting of PIM-SM domains
    • Complex set of MSDP-RPF rules. Works correctly only if MSDP overlay topology is matched with unicast/BGP routing information
  • For RP redundancy (“MSDP mesh-group”) – no need for MSDP-RPF check.
• Anycast-RP
  • For RP redundancy with MSDP mesh groups – faster than AutoRP/BSR
  • Typically uses static-RP config for RP announcements
Protocols for IP multicast
IPv4 recent additions to the protocols
• PIM-SSM
  • Protocol used for SSM: builds P2MP (S,G) SPTs rooted in the source S
  • No separate spec! Subset of PIM-SM: No RP or RP = 0.0.0.0!
  • Very simple: no RP == no register-tunnel/first-hop-DR, RP placement, announcement, redundancy, no RPT operations, no RPT/SPT switchover
  • Separate PIM-SSM spec would be 1/10th of PIM-SM spec?
• Bidir-PIM
  • New PIM family protocol: shared tree ((*,G)) building only
  • Very good for enterprise applications with many sources per group (scalability, convergence)
• RP-redundancy
  • PIM-Anycast-RP: functions like MSDP mesh-group – without MSDP
  • Prioritycast-RP: Eg: for Bidir-PIM – RP redundancy without any protocol between redundant instances – because only one is active
Protocols for IP multicast
Redundant IP address policies and support
• Different policies possible:
  • Anycast: clients connect to the closest instance of the redundant IP address
  • Prioritycast: clients connect to the highest priority instance of the redundant IP address
• IP multicast: Redundant IP addresses for sources (SSM) and RP (PIM-SM, Bidir-PIM).
  • Bidir today supports only one active RP == needs prioritycast.
  • Sources (video) may have different quality – prioritycast.
• Redundant IP addresses implemented by redistributing them into IGP
  • Anycast comes for free (closest instance = SPF)
  • Prioritycast requires engineering.
  • Elegant solution: prefix lengths
Example: prioritycast with prefix length announcement
[Figure: Src A (primary) announces 10.2.3.4/32, Src B (secondary) announces 10.2.3.4/31; Rcvr 1 and Rcvr 2]
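Why the prefix-length trick works follows directly from longest-prefix-match routing. A small sketch using the two announcements from the example; the function name and the route list representation are illustrative assumptions:

```python
import ipaddress

def best_route(routes, destination):
    """Longest-prefix match; `routes` is a list of (prefix, announcing source)."""
    dst = ipaddress.ip_address(destination)
    matches = [(ipaddress.ip_network(prefix), who) for prefix, who in routes
               if dst in ipaddress.ip_network(prefix)]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]

routes = [("10.2.3.4/32", "Src A"), ("10.2.3.4/31", "Src B")]

# Both sources alive: the /32 is more specific, so the primary wins everywhere.
both_up = best_route(routes, "10.2.3.4")

# Primary fails and its /32 is withdrawn: the /31 of the secondary takes over.
a_failed = best_route([r for r in routes if r[1] != "Src A"], "10.2.3.4")
```

Unlike anycast, the choice does not depend on where in the network the lookup happens, which is what makes it a priority policy.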
IPv6 multicast
IPv6 multicast Summary
• Everything like IPv4…
  • ASM/SSM service models
  • Everything PIM: PIM-SM (PIM-SSM), Bidir-PIM, (PIM-DM), BSR
• Except:
  • IPv6 multicast addressing
    • Global IPv6 multicast addresses for free (with the purchase of your IPv6 unicast addresses).
    • Scoping is easy and free!
  • MLDv
  • No MSDP
    • Use PIM-anycast or Prioritycast-RP for RP redundancy
    • Use embedded-RP (IPv6 specific!) for Interdomain PIM-SM
  • No “old BGP4”, but only MP-BGP SAFI2
    • Will discuss in RPF section of presentation
IPv6 Multicast Addresses (RFC 2373)
Address layout (128 bits in total):
  Bits:   8              |   4   |   4   | 112
  Field:  1111 1111 (FF) | Flags | Scope | group ID

Flags = 00PT
  T = 0: permanent IANA groups
  T = 1: FF1X::/12 -> user groups
  P: proposed for unicast-based assignments

Scope
  1 = Interface-local
  2 = Link
  4 = Admin-local
  5 = Site
  8 = Organization
  E = Global
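The flag and scope fields can be read straight out of the second byte of the address. A minimal sketch with Python's standard `ipaddress` module; the function name and the returned dictionary layout are illustrative choices:

```python
import ipaddress

SCOPES = {0x1: "interface-local", 0x2: "link", 0x4: "admin-local",
          0x5: "site", 0x8: "organization", 0xE: "global"}

def parse_multicast(addr: str) -> dict:
    """Split an IPv6 multicast address into flags, scope and group ID."""
    raw = ipaddress.IPv6Address(addr).packed
    if raw[0] != 0xFF:
        raise ValueError("not an IPv6 multicast address")
    flags, scope = raw[1] >> 4, raw[1] & 0x0F
    return {
        "P": bool(flags & 0b0010),          # unicast-prefix based
        "T": bool(flags & 0b0001),          # 0 = permanent IANA, 1 = user group
        "scope": SCOPES.get(scope, "unassigned/reserved"),
        "group_id": int.from_bytes(raw[2:], "big"),   # remaining 112 bits
    }

all_nodes = parse_multicast("ff02::1")      # permanent, link scope
user_group = parse_multicast("ff1e::1234")  # user group, global scope
```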
IPv6 Unicast Based Multicast Addresses (RFC3306)
• Solves the old IPv4 address assignment problem: How can I get global IPv4 multicast addresses?
• In IPv6, if you own an IPv6 unicast address prefix you implicitly own an RFC3306 IPv6 multicast address prefix:

  Bits:   8  |   4   |   4   |  8   |  8   |       64       |    32
  Field:  FF | Flags | Scope | Rsvd | Plen | Network prefix | Group id

Example: FF3E:0040:3FFE:0C15:C003:1109:0000:1111
  3 hex = unicast-prefix based, E hex = global scope, 40 hex = prefix length 64
• Flags = 00PT, P = 1, T = 1 => unicast-based address
• SSM: special case of unicast prefix-based addresses: P=T=1, plen=0, network prefix=0, i.e. FF3x::/96
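The field layout can be checked against the example address with a short decoder. A sketch using only the standard library; the function name and return format are illustrative:

```python
import ipaddress

def parse_rfc3306(addr: str) -> dict:
    """Decode a unicast-prefix-based IPv6 multicast address (RFC 3306 layout)."""
    value = int(ipaddress.IPv6Address(addr))
    if value >> 120 != 0xFF:
        raise ValueError("not an IPv6 multicast address")
    plen = (value >> 96) & 0xFF
    prefix = ((value >> 32) & ((1 << 64) - 1)) << 64      # 64-bit prefix field
    prefix &= ~((1 << (128 - plen)) - 1)                  # keep only plen bits
    return {
        "flags": (value >> 116) & 0xF,
        "scope": (value >> 112) & 0xF,
        "plen": plen,
        "network": str(ipaddress.IPv6Network((prefix, plen))),
        "group_id": value & 0xFFFFFFFF,
    }

info = parse_rfc3306("FF3E:0040:3FFE:0C15:C003:1109:0000:1111")
```

For the example this recovers flags 3 (P = T = 1), scope E, prefix length 64 and the owning unicast prefix 3ffe:c15:c003:1109::/64.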
Embedded Rendezvous Point Addresses
Multicast address with embedded Rendezvous Point address:

  Bits:   8  |   4   |   4   |  4   |  4   |  8   |       64       |    32
  Field:  FF | Flags | Scope | Rsvd | RPad | Plen | Network Prefix | Group-ID

Example: FF76:0130:1234:5678:9abc::4321
Resulting Rendezvous Point address: 1234:5678:9abc::1
• Special case of unicast prefix-based addresses
• Flags = 0RPT, R = 1, P = 1, T = 1 => Rendezvous Point address embedded
• Rendezvous Point address = network prefix + RPad
• Sixteen Rendezvous Point addresses per network prefix
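The derivation of the Rendezvous Point address from the group address is mechanical, which is the whole point of the scheme. A sketch of the extraction, assuming the field layout shown above (the function name is an illustrative choice):

```python
import ipaddress

def embedded_rp(group: str) -> ipaddress.IPv6Address:
    """Derive the Rendezvous Point address embedded in a multicast group address."""
    value = int(ipaddress.IPv6Address(group))
    flags = (value >> 116) & 0xF
    if (value >> 120) != 0xFF or (flags & 0b0111) != 0b0111:
        raise ValueError("not an embedded-RP multicast address (need R=P=T=1)")
    rpad = (value >> 104) & 0xF               # last 4 bits of the RP address
    plen = (value >> 96) & 0xFF
    if not 1 <= plen <= 64:
        raise ValueError("prefix length must be 1..64")
    prefix = (value >> 32) & ((1 << 64) - 1)
    prefix &= ~((1 << (64 - plen)) - 1)       # keep only the first plen bits
    return ipaddress.IPv6Address((prefix << 64) | rpad)

rp = embedded_rp("FF76:0130:1234:5678:9abc::4321")
```

Every router that sees the group address can run the same computation, so no Group-to-RP mapping has to be distributed.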
Embedded – Rendezvous Point Usage
• PIM-SM protocol operations with embedded Rendezvous Point:
  • No change in PIM-SM protocol operations
  • Just an automatic replacement for static Rendezvous Point configuration
• Can replace BSR for Group-to-RP mapping
• Method requires large IPv6 addresses
  • No equivalent possible in IPv4
  • Not usable for Bidir-PIM either ;-(
[Figure: ASM across single shared PIM domain, one Rendezvous Point; nodes S, DR, RP]
IP multicast RPF
IP multicast RPF Overview
• RPF – Reverse Path Selection
  • Unlike DVMRP, PIM-based multicast tree building does not come with its own protocol to determine the shortest paths towards an RP or a source. Instead it relies on unicast routing protocols.
  • Initial PIM architects assumed that exactly the same routes (IGP/BGP) as for unicast could be used. And then there was reality…
• Static multicast routes
• ECMP (Equal Cost Multipath)
  • Necessarily per multicast-flow, not per-packet, tailend driven
• MP-BGP (MBGP)
  • For interdomain incongruency (IPv4/IPv6)
• Separate topology for multicast in IGPs
  • When asymmetric metrics are required
  • For cost optimization
• Dual topologies for multicast
  • For live-live traffic redundancy
IP multicast RPF ECMP – Equal cost multipath
• Decision in unicast is made by upstream (sending) router
• Decision in multicast is made during RPF selection on the downstream router. Router local choice – no network dependency
• Polarizing: i = hash(S) % n
  • …but good for L3 link bundles – predictable traffic distribution
• Non-polarizing: i = the neighbor index with max( hash(S, Nbr-i) )
  • Also stable in case of link failure for unaffected flows.
Given 1..n (eg: 2) ECMPs, if all routers select the same neighbor i for a source S, then polarization may happen: rtr2 will only be joined to by rtr1 for sources that its own ECMP would RPF to rtr4, but never to rtr5!
[Figure: Multicast RPF selection for different source addresses, polarizing vs. non-polarizing, routers 1 to 7]
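Both selection rules fit in a few lines. In this sketch SHA-256 merely stands in for whatever hash a router actually uses, and the neighbor names and source addresses are made up; the point is only the structure of the two formulas:

```python
import hashlib

def h(*parts: str) -> int:
    """Deterministic hash of the given strings (stand-in for a router's hash)."""
    digest = hashlib.sha256("|".join(parts).encode()).digest()
    return int.from_bytes(digest[:8], "big")

def polarizing(source, neighbors):
    """i = hash(S) % n: every router with n neighbors picks the same index."""
    return neighbors[h(source) % len(neighbors)]

def non_polarizing(source, neighbors):
    """Pick the neighbor that maximizes hash(S, Nbr-i)."""
    return max(neighbors, key=lambda nbr: h(source, nbr))

sources = [f"10.0.0.{i}" for i in range(1, 51)]
before = {s: non_polarizing(s, ["nbr1", "nbr2", "nbr3"]) for s in sources}
after = {s: non_polarizing(s, ["nbr1", "nbr3"]) for s in sources}   # nbr2 fails

# Flows that were not using nbr2 keep their RPF neighbor after the failure.
unaffected_moved = [s for s in sources
                    if before[s] != "nbr2" and before[s] != after[s]]
```

Because the neighbor identity enters the hash, two routers with different neighbor sets decide independently, and removing one neighbor only moves the flows that were using it. With the modulo rule, changing n can move any flow.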
IP multicast RPF BGP – IPv4 / IPv6
• IPv4: MP-BGP introduced SAFI:
  • SAFI1 = unicast only, SAFI2 = multicast only, SAFI3 = both
  • Traditional implementations only use SAFI2 and non-SAFI BGP4 (unicast)
  • Laziness (not wanting to have all multicast routes in SAFI2) requires complex route preference rules – prefer shorter prefix SAFI2 over longer prefix non-SAFI2
• IPv6: Uses latest version of MP-BGP (RFC2858)
  • No non-SAFI route announcements (never defined), no SAFI 3 (removed by IETF)
  • Only SAFI 1 for unicast, SAFI 2 for multicast
  • Should never use SAFI 1 routes for multicast to keep RPF rules simple (and to use MP-BGP as intended by wording of BGP spec)
[Figure: AS1, AS2, AS3 (unicast only) and AS4; BGP4 route 131.188.1.0/24, BGP SAFI2 route 131.188.0.0/16; source S = 131.188.1.1]
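The preference rule “shorter prefix SAFI2 over longer prefix non-SAFI2” can be sketched as a two-stage lookup. The table representation and the next-hop labels are hypothetical; the prefixes and the source address are the ones from the figure:

```python
import ipaddress

def longest_match(table, address):
    """Longest-prefix match in one table (dict of prefix -> next hop)."""
    addr = ipaddress.ip_address(address)
    candidates = [(ipaddress.ip_network(prefix), next_hop)
                  for prefix, next_hop in table.items()
                  if addr in ipaddress.ip_network(prefix)]
    return max(candidates, key=lambda c: c[0].prefixlen, default=None)

def rpf_lookup(source, safi2_table, unicast_table):
    """Any matching SAFI2 route wins; unicast BGP4 routes are only a fallback."""
    match = longest_match(safi2_table, source) or longest_match(unicast_table, source)
    return None if match is None else (str(match[0]), match[1])

unicast = {"131.188.1.0/24": "unicast-only path"}
safi2 = {"131.188.0.0/16": "multicast-capable path"}

# The /24 is more specific, but it is not a multicast route: the /16 is used.
chosen = rpf_lookup("131.188.1.1", safi2, unicast)

# No SAFI2 route at all: fall back to the unicast table.
fallback = rpf_lookup("192.0.2.1", safi2, {"192.0.2.0/24": "unicast path"})
```

A plain longest-match across both tables would have selected the /24 and sent the join towards a unicast-only path.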
IP multicast RPF Separate multicast topology for cost optimization
[Figure: three core POPs (POP1 with A1/B1, POP2 with A2/B2, POP3 with A3/B3) connected by WAN links; Region1 to Region3 with receivers; sources Src1 and Src2]
• Consider a simplified example core/distribution network topology
• Core POPs have redundant core routers, connected via redundant (10 Gbps) WAN links. Simple setup: A/B core routers, A/B links
• Regions use ring(s) for redundant connectivity
IP multicast RPF Separate multicast topology for cost optimization
[Figure: same topology; caption: Load splitting across WAN links]
• IGP metrics are set to achieve good load distribution across the redundant core.
  • Manual IGP metric setting and/or tools
  • Assume a cost of 1 on all links in the idealized topology.
• Result: Unicast traffic is load split across redundant core links
IP multicast RPF Separate multicast topology for cost optimization
[Figure: same topology; caption: Unnecessary use of WAN links]
• The same metrics that are good for unicast load splitting cause multicast traffic to go unnecessarily across both the A and B WAN links.
• 10 Gbps WAN links, 1..2 Gbps multicast => 10..20% WAN waste (cost factor)
• Cannot resolve the problem well without a multicast specific topology
IP multicast RPF Separate multicast topology for cost optimization
[Figure: same topology; caption: Efficient use of WAN links]
• Simple? to minimize tree costs with a multicast specific topology
  • Manual or tool based
• Example topology: make B links very expensive for multicast (cost 100), so they are only used as a last resort (no A connectivity)
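The effect of the second metric set can be reproduced on a much smaller, hypothetical topology: one source behind A1/B1, one receiver behind A2 and one behind B2, with two WAN links. The sketch treats links as symmetric, so the shortest path from the source equals the RPF path from the receiver:

```python
import heapq

def shortest_path(links, src, dst):
    """Dijkstra over an undirected graph; `links` maps (a, b) -> cost."""
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    queue, seen = [(0, src, [src])], set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt, cost in graph.get(node, []):
            if nxt not in seen:
                heapq.heappush(queue, (dist + cost, nxt, path + [nxt]))
    return None

def wan_links_used(links, src, receivers, wan):
    """WAN links on the union of shortest paths (the multicast tree)."""
    used = set()
    for rcvr in receivers:
        path = shortest_path(links, src, rcvr)
        for hop in zip(path, path[1:]):
            if hop in wan or hop[::-1] in wan:
                used.add(tuple(sorted(hop)))
    return used

WAN = {("A1", "A2"), ("B1", "B2")}
unicast = {("Src", "A1"): 1, ("Src", "B1"): 1, ("A1", "B1"): 1,
           ("A1", "A2"): 1, ("B1", "B2"): 1, ("A2", "B2"): 1,
           ("A2", "RcvrA"): 1, ("B2", "RcvrB"): 1}
multicast = dict(unicast)
multicast[("B1", "B2")] = 100          # B WAN link only as last resort

receivers = ["RcvrA", "RcvrB"]
shared = wan_links_used(unicast, "Src", receivers, WAN)
separate = wan_links_used(multicast, "Src", receivers, WAN)

# A WAN link fails: the expensive B link is still there as backup.
degraded = {k: v for k, v in multicast.items() if k != ("A1", "A2")}
last_resort = wan_links_used(degraded, "Src", receivers, WAN)
```

With the unicast metrics the tree occupies both WAN links; with the multicast metrics it occupies only the A link, and the B link carries multicast only after the A link is gone.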
IP multicast RPF Dual multicast topologies for resiliency
[Figure: Redundant Encoder/Multiplexer, Redundant Decoder / Ad-Inserter/.., HFC, STBs]
• Send traffic twice to different multicast groups (eg: green = 232.1.0.1, red = 232.1.0.2)
• Use path separation in network to pass red/green across different paths
  • Note: dual topologies just one solution (VRFs, virtual routers, …)
• Receivers receive both copies, remove duplicates by sequence numbers (eg: MPEG timestamp).
• No single network failure will cause any service interruption
• Same bandwidth allocation needed as in traditional SONET rings, but solution even better: 0 loss instead of 50 msec.
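The receiver side of live-live is a simple duplicate filter. A minimal sketch, assuming each packet carries a sequence number that is identical on both copies; the tuple format and names are made up, and a real receiver would bound the `seen` set with a sliding window:

```python
def merge_live_live(packets):
    """Yield each sequence number once, from whichever copy arrives first.

    `packets` is an iterable of (path, seq, payload) in arrival order.
    """
    seen = set()
    for path, seq, payload in packets:
        if seq in seen:
            continue              # duplicate from the other path
        seen.add(seq)
        yield seq, payload

arrivals = [
    ("green", 1, "a"), ("red", 1, "a"),
    ("red", 2, "b"),                      # green copy of packet 2 was lost
    ("green", 3, "c"), ("red", 3, "c"),
]
delivered = list(merge_live_live(arrivals))
```

The loss on the green path is invisible to the application, with no failure detection or switchover involved.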
IP multicast RPF Dual multicast topologies for resiliency
• Traditional application:
  • Market data distribution in finance networks (NASDAQ, etc.)
  • Based on two completely separate networks!
• With two topology solution
  • No separate physical networks required
  • Can provide different subsets of the network to different classes of traffic.
  • Can share links to reduce cost (two unidirectional links).
  • Can share nodes to reduce cost.
• Vs virtual routers or similar “virtual network”:
  • No need for subnet encaps for multiple topologies
• Vs. RSVP-TE P2MP
  • Diffserv type approach (not per flow/tree)
IP multicast RPF Dual multicast topologies for resiliency
[Figure: Redundant Encoder/Multiplexer feeding a ring of receivers; IGP cost in different topologies: each shared link has a small metric in one direction and an infinite/large metric in the other]
• Topology sharing of links: particularly useful in rings.
• Two topologies also useful for unicast (eg: VoD load splitting)
• Requires unidirectional “infinite” link metric to avoid reconvergence of topologies (if wanted)
  • Available in ISIS today, not in OSPF
  • Part of OSPF/UDLR draft in IETF
• Multicast traffic flows in the reverse direction of unicast
Reliable multicast transport protocols
Reliable Multicast Overview
• PGM (Pragmatic General Multicast)
  • Near-real-time delivery with TCP-compliant flow control, optional network level scalability support (DLR, NE)
  • NAK-based: preferred for small to mid-size receiver groups with tight delivery schedules
  • Used in many finance/commercial applications today, supported by Windows XP and later, etc. Router support little used.
• ALC (Asynchronous Layered Coding)
  • Non-real-time delivery without any feedback from receivers – can support arbitrarily large receiver populations (eg: STBs).
  • Relies on FEC. Interesting with “Tornado Codecs” (large block codecs).
  • Target applications eg: Content distribution to VoD servers, STB with HD, PC software upgrade, nVoD
Reliable multicast PGM components
[Figure: Server with PGM (Server/Source) stack; Network with optional PGM functions; Host with PGM (Host/Receiver) stack]
• DLR: Designated Local Repairer
• Network Element = Router Assist
Reliable multicast
content carousels – before ALC/Tornado-Codecs
[Figure: File, Receive, Received File]
• In a traditional carousel, a file is sent repeatedly; receivers that start receiving in the middle receive the tail of the file and then continue until they have received the head from the next iteration.
• Works well only if the network has little packet loss
• May need to receive content for many iterations in the face of higher packet loss
Reliable multicast
content carousels – with ALC/Tornado-Codecs
[Figure: File, Tornado Encode, Receive arbitrary packets, Tornado Decode, Received File]
• ALC encodes file into eg: 2^32 different packets.
• Receiver needs to receive just sufficiently many arbitrary packets from the encoding to reconstruct the file (original file size + 5% overhead)
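The difference between the two carousels under packet loss can be illustrated with a small simulation. This does not implement any FEC code; it only counts transmissions, assuming independent random loss and a code where any 1.05 × k received packets suffice. The function names, file size and loss rate are arbitrary example values:

```python
import math
import random

def packets_sent_plain_carousel(k, loss, rng):
    """Packets transmitted until the receiver holds all k specific blocks."""
    missing, sent = set(range(k)), 0
    while missing:
        index = sent % k              # the carousel cycles through the file
        sent += 1
        if rng.random() >= loss:      # packet survived the network
            missing.discard(index)
    return sent

def packets_sent_encoded_carousel(k, loss, rng, overhead=0.05):
    """Packets transmitted until ANY ceil(k * (1 + overhead)) packets arrived."""
    needed, received, sent = math.ceil(k * (1 + overhead)), 0, 0
    while received < needed:
        sent += 1
        if rng.random() >= loss:
            received += 1
    return sent

rng = random.Random(2006)             # fixed seed for repeatability
plain = packets_sent_plain_carousel(1000, 0.10, rng)
encoded = packets_sent_encoded_carousel(1000, 0.10, rng)
```

In the plain carousel every lost block costs a wait for that exact block to come around again; with the encoded carousel a lost packet is replaced by whichever packet arrives next.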
Reliable multicast
content carousels – with ALC/Tornado-Codecs
• Allows carrying reliable multicast content as scavenger class traffic (less than best effort).
  • Use free bandwidth in network!
• Limit of basic carousel:
  • Can only start encoding after whole file is available
  • Not directly usable for real-time transmission
  • Break up file into segments, apply ALC encoding separately, start transmission after first segment.
Reliable multicast
nVoD – with ALC/Tornado-Codecs
1. Segment movie: S1 = 1 min, S2 = 2 min, S3 = 4 min, 8, 16, …
2. Carousel each segment simultaneously, ALC encoded, at double speed
[Figure: movie split into segments S1, S2, S3, S4, S5, S6, each carouseled in parallel]
Reliable multicast
nVoD – with ALC/Tornado-Codecs
3. Receiver IGMP joins one segment at a time. Once a segment is fully received, it is decoded and the receiver receives the next segment. Because segments are transmitted faster than realtime (example: factor 2), playout takes as long as receiving the next segment.
Reliable multicast
nVoD – with ALC/Tornado-Codecs
• Never more traffic than unicast, never more traffic than 14 unicast users (in example)
  • VoD starts after 30 seconds
  • Total bandwidth in network is 7 segments * double speed == same bandwidth as 14 unicast VoD viewers require.
  • As soon as more than 14 users watch the same content (independent of their starting time), no more bandwidth is required.
  • If fewer than 14 users watch, bandwidth utilized is still the same as in unicast (because only traffic joined to by IGMP is being forwarded).
  • Just transmission is more bursty than unicast!
• Parameters can easily be varied
• Beneficial for top 3?% of VoD library
  • Zipf distribution – majority of market share
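The numbers in the example (30 second start, 14 streams, no stalls) follow from the doubling segment sizes and can be recomputed. A sketch assuming 7 segments, S1 = 1 minute, double-speed carousels and a receiver that fetches one segment at a time; the function and key names are illustrative:

```python
def nvod_plan(num_segments=7, first_minutes=1.0, speedup=2.0):
    """Timeline of a receiver that downloads segments back to back."""
    durations = [first_minutes * 2 ** i for i in range(num_segments)]
    download_done, playback_start, stalls = [], [], False
    clock = 0.0
    for i, duration in enumerate(durations):
        clock += duration / speedup                   # receive segment i
        download_done.append(clock)
        # Segment 0 plays as soon as it is complete; later segments
        # start when the previous one has finished playing.
        start = clock if i == 0 else playback_start[-1] + durations[i - 1]
        playback_start.append(start)
        stalls = stalls or download_done[i] > start   # data late for playout?
    return {
        "movie_minutes": sum(durations),
        "startup_delay_minutes": download_done[0],
        "bandwidth_in_unicast_streams": num_segments * speedup,
        "stalls": stalls,
    }

plan = nvod_plan()
```

With these parameters each segment finishes downloading exactly when the previous one finishes playing, which is why doubling the segment length pairs with a speed factor of 2. Other speed factors need a different growth rate for the segments.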
Questions
?