PRESENTATION TITLE/SIZE 30 - Asia Pacific Regional


IP Multicast update Apricot 2006

Toerless Eckert [email protected]

Session Number Presentation_ID © 2004 Cisco Systems, Inc. All rights reserved.


Agenda

Network Services

• Traditional ASM “Any Source Multicast”
• Source Specific Multicast (SSM) with source redundancy

IPv4 multicast protocols

• IGMP, PIM-SM/MSDP, Bidir-PIM, PIM SSM, …

IPv6 multicast

• Addressing, Embedded-RP

Multicast RPF

• ECMP, MP-BGP, IGP incongruency: one/two topologies

Reliable multicast transport protocols

• PGM, ALC/Tornado-Codecs – content preprovisioning/nVoD

IP Multicast Network Services

ASM, SSM, Source redundancy


IP Multicast Network Services

ASM

IP multicast service models describe how applications can send and receive multicast packets.

Everything application developers need to know about IP multicast (“protocol stuff is for network operators”).

ASM: Classical IP Multicast service (RFC 1112, ~1990)

• Called “Any Source Multicast” today
• Sources send IP multicast packets to an IP multicast group
• Receivers “join” an IP multicast group
• The network delivers packets sent by any source to an IP multicast group to all receivers that have joined that group

IP multicast services

SSM and Source redundancy

SSM: Source Specific Multicast (~2000)

• Source(s) still send IP multicast packets to an IP multicast group address – but this is now called “sending to an (S,G) channel”!
• Receivers need to “subscribe to an (S,G) channel” – they indicate to the network not only the IP multicast group but also the source (S)!
• The network delivers packets on a per-channel basis only
• Multi-source applications need application-based source discovery mechanisms
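The difference between the two service models can be sketched with a toy delivery model. This is only an illustration of the semantics described above, not of any protocol; the class name `ToyNetwork`, the receiver names and the addresses are all made up for the example.

```python
from collections import defaultdict


class ToyNetwork:
    """Toy model of multicast delivery semantics (hypothetical, not a protocol)."""

    def __init__(self):
        self.asm_members = defaultdict(set)  # G -> receivers that joined group G
        self.ssm_members = defaultdict(set)  # (S, G) -> receivers subscribed to the channel

    def join(self, receiver, group):
        """ASM: receiver joins a group and accepts any source."""
        self.asm_members[group].add(receiver)

    def subscribe(self, receiver, source, group):
        """SSM: receiver subscribes to one (S,G) channel."""
        self.ssm_members[(source, group)].add(receiver)

    def deliver(self, source, group):
        """Return the receivers that get a packet sent by `source` to `group`."""
        return self.asm_members[group] | self.ssm_members[(source, group)]


net = ToyNetwork()
net.join("rcvr1", "239.1.1.1")                   # ASM join
net.subscribe("rcvr2", "10.0.0.1", "232.1.1.1")  # SSM channel subscription
```

In this model a packet from an unwanted source such as 10.9.9.9 still reaches `rcvr1` through the ASM group, while `rcvr2` only ever receives traffic from the source it named.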

“Redundant IP address” for source redundancy:

• Primary target for SSM: “single-source” TV/audio/data “broadcast” applications
• These require source redundancy – use a single IP address (with anycast/prioritycast) to avoid dynamic source discovery

But why SSM? Is ASM not good enough, or even better?

ASM is simpler for application developers!

Reluctance to adopt SSM

IP multicast network services

Issues with ASM – resolved with SSM

DoS attacks by unwanted sources

• Receivers can ignore packets, but network resources can only be protected by extensive network source access control == network level application control.

Address allocation

• Try to get a “global scope” IPv4 multicast address (GLOP, …) – “Oh, let’s do multicast group NAT then…”

Complexity of protocol operations required

• PIM-SM (shared trees, shortest path trees, RPT/SPT switchover)/MSDP, RP announcement (AutoRP/BSR), RP placement, RP redundancy
• Operating PIM-SM over core networks (MVPN, multicast and MPLS)
• Futures: bandwidth reservation (RSVP: per group? per source?), link/node protection with PIM-SM

Scalability, speed of protocol operations (convergence)

• Operations for both SPT and RPT needed – and their interaction

IP multicast network services Summary

SSM is a key recent enhancement to IP multicast

• Network operators should be very interested in using it and promoting its use over ASM (where appropriate) to provide better (more manageable/scalable) multicast services

SSM cannot replace ASM in all applications

• Many-source applications
• Source discovery with IP multicast
• ASM and SSM can coexist

Recent means of improvement / simplification of ASM

• Easier protocols for ASM
  • Bidir-PIM (intradomain only today)
  • Easier RP redundancy (PIM-Anycast-RP, Prioritycast)
• IPv6 multicast (address allocation, embedded-RP)

IPv4 multicast protocols


Protocols for IP multicast

Host to Router signaling: membership reporting

IPv4: IGMP “Internet Group Management Protocol”

IPv6: MLD “Multicast Listener Discovery”

MLD = IGMP

IGMPv2/MLDv1: group memberships

• Sufficient for ASM

IGMPv3/MLDv2: group and source memberships

• Required for SSM, also supports ASM
• No IGMPv2/MLDv1 report suppression:
  • Enables per-receiver tracking on a LAN
  • Enables zero leave latency
• IGMPv3/MLDv2 is fully backward compatible (router/host)
  • But not snooping devices – they must support IGMPv3/MLDv2!

IGMPv3/MLDv2 support in the host != SSM support

• The OS may support IGMPv3, but the application may still only signal group membership reports
• SSM transition solutions are available to map group membership reports to (S,G) channel subscriptions (eg: SSM mapping)

[Figure: receiver (Rcvr) sends Membership Reports to the router; the router sends Membership Queries]

Protocols for IP multicast

Host to Router signaling: redundant source reporting

No IETF standard available

• The “Multicast and Anycast Group Membership” (MAGMA) IETF WG never got around to working on it … and is now being disbanded ;-(

Pragmatic solution: the source announces the redundant source address via RIP

• Easily done from the application (RIP uses UDP)
• No protocol machinery required – only periodic sending
• Fast periodic sending for fast source failure detection
• All routers support RIP, but RIP is seldom used in production networks
  • Allows the router to be easily configured to limit RIP to redundant source announcements only
• Already used in MPEG video sourcing products

[Figure: source (Src) sends RIP(v2) Report (UDP) to the router]
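As a rough sketch of how little an application needs for such an announcement, the following builds a single-route RIPv2 response. The slide gives no packet details; the field layout follows the standard RIPv2 format (RFC 2453), and the function name, default metric and the example address 10.2.3.4/32 are assumptions for illustration.

```python
import ipaddress
import struct

RIP_PORT = 520           # UDP port used by RIP
RIP_GROUP = "224.0.0.9"  # RIPv2 multicast destination address


def build_ripv2_response(prefix: str, metric: int = 1) -> bytes:
    """Build a RIPv2 response carrying one route entry (hypothetical helper)."""
    net = ipaddress.ip_network(prefix)
    # Header: command = 2 (response), version = 2, two must-be-zero bytes.
    header = struct.pack("!BBH", 2, 2, 0)
    entry = struct.pack(
        "!HH4s4s4sI",
        2,                           # address family identifier: IP
        0,                           # route tag
        net.network_address.packed,  # announced address
        net.netmask.packed,          # subnet mask
        bytes(4),                    # next hop 0.0.0.0: route via the sender
        metric,
    )
    return header + entry


packet = build_ripv2_response("10.2.3.4/32")
```

An application would then send `packet` periodically with a plain UDP socket, for example `sock.sendto(packet, (RIP_GROUP, RIP_PORT))`; the sending interval determines how fast a source failure is detected.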

Protocols for IP multicast

Roadmap? of IP multicast protocol evolution

[Diagram: evolution of IP multicast protocols; labels recovered from the figure]

• Unicast routing: BGP, IGPs (eg: OSPF)
• Earlier multicast protocols: MOSPF, RPF-flooding, PIM-DM, DVMRP, Spanning-Tree-flooding, CBT
• IPv4: PIM-SM / MSDP, MP-BGP (SAFI2), AutoRP/BSR, Anycast-RP
• “Eierlegende Wollmilchsau” (German) == egg-laying wool-milk-sow (the pig that gives meat, sausage, wool, feathers, milk and eggs), aka the universal solution that can do everything
• RPFv4v6: multi-topologies, IGP + BGP
• ASMv6: PIM-SM + Embedded-RP (intra/interdomain)
• ASMv4/v6: Bidir-PIM (intradomain only)
• SSMv4v6: PIM-SSM (intra/interdomain)

Protocols for IP multicast

IPv4 ASM standard model protocols: PIM-SM / MSDP

PIM-SM

• Shared-Tree and Shortest-Path-Tree forwarding
• Efficient traffic pruning in all cases
• Complex operations: Register-tunnel operations, RPT/SPT switchover, RP placement, RP announcement/RP redundancy

BSR and AutoRP

• RP announcements
• RP redundancy

MSDP

• For interdomain connection of PIM-SM domains
  • Complex set of MSDP-RPF rules. Works correctly only if the MSDP overlay topology matches the unicast/BGP routing information
• For RP redundancy (“MSDP mesh-group”) – no need for MSDP-RPF check

Anycast-RP

• For RP redundancy with MSDP mesh groups – faster than AutoRP/BSR
• Typically uses static-RP config for RP announcements

Protocols for IP multicast

IPv4 recent additions to the protocols

PIM-SSM

• Protocol used for SSM: builds P2MP (S,G) SPTs rooted at the source S
• No separate spec! Subset of PIM-SM: no RP, or RP = 0.0.0.0!
• Very simple: no RP == no register-tunnel/first-hop-DR, no RP placement, announcement or redundancy, no RPT operations, no RPT/SPT switchover
• A separate PIM-SSM spec would be 1/10th of the PIM-SM spec?

Bidir-PIM

• New PIM family protocol: shared tree ((*,G)) building only
• Very good for enterprise applications with many sources per group (scalability, convergence)

RP-redundancy

• PIM-Anycast-RP: functions like MSDP mesh-group – without MSDP
• Prioritycast-RP: eg: for Bidir-PIM – RP redundancy without any protocol between redundant instances – because only one is active

Protocols for IP multicast

Redundant IP address policies and support

Different policies possible:

• Anycast: clients connect to the closest instance of the redundant IP address
• Prioritycast: clients connect to the highest-priority instance of the redundant IP address

IP multicast: redundant IP addresses for sources (SSM) and RPs (PIM-SM, Bidir-PIM)

• Bidir-PIM today supports only one active RP == needs prioritycast
• Sources (video) may have different quality – prioritycast

Redundant IP addresses are implemented by redistributing them into the IGP

• Anycast comes for free (closest instance = SPF)
• Prioritycast requires engineering
• Elegant solution: prefix lengths (longest-prefix match prefers the more specific announcement while it exists)

Example: prioritycast with prefix-length announcement

[Figure: Src A (primary) announces 10.2.3.4/32, Src B (secondary) announces 10.2.3.4/31; receivers Rcvr 1 and Rcvr 2]
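The prefix-length trick relies on ordinary longest-prefix matching, which can be sketched as follows. The route table is a plain dictionary and the `lookup` helper is hypothetical; only the two prefixes come from the slide.

```python
import ipaddress


def lookup(routes, destination):
    """Longest-prefix match: return the origin of the most specific covering route."""
    dest = ipaddress.ip_address(destination)
    matching = [(net, origin) for net, origin in routes.items() if dest in net]
    if not matching:
        return None
    return max(matching, key=lambda item: item[0].prefixlen)[1]


routes = {
    ipaddress.ip_network("10.2.3.4/32"): "Src A (primary)",
    ipaddress.ip_network("10.2.3.4/31"): "Src B (secondary)",
}

active = lookup(routes, "10.2.3.4")  # the /32 wins while Src A announces it
```

If Src A fails and its /32 is withdrawn, the same lookup falls back to the /31 announced by Src B, without any protocol running between the two sources.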


IPv6 multicast


IPv6 multicast Summary

Everything like IPv4…

• ASM/SSM service models
• Everything PIM: PIM-SM (PIM-SSM), Bidir-PIM, (PIM-DM), BSR

Except:

• IPv6 multicast addressing
  • Global IPv6 multicast addresses for free (with the purchase of your IPv6 unicast addresses)
  • Scoping is easy and free!
• MLDv1 = IGMPv2, MLDv2 = IGMPv3
• No AutoRP (use BSR)
• No MSDP
  • Use PIM-anycast or Prioritycast-RP for RP redundancy
  • Use embedded-RP (IPv6 specific!) for interdomain PIM-SM
• No “old BGP4”, but only MP-BGP SAFI2
  • Will discuss in RPF section of presentation

IPv6 Multicast Addresses (RFC 2373)

Address format (128 bits in total):

1111 1111 (FF, 8 bits) | Flags (4 bits) | Scope (4 bits) | Group ID (112 bits)

Flags = 00PT

• T = 0: permanent IANA groups
• T = 1: FF1X::/12 -> user groups
• P: proposed for unicast-based assignments

Scope

• 1 = Interface-local
• 2 = Link
• 4 = Admin-local
• 5 = Site
• 8 = Organization
• E = Global

IPv6 Unicast Based Multicast Addresses (RFC3306)

Solves the old IPv4 address assignment problem: How can I get global IPv4 multicast addresses?

In IPv6, if you own an IPv6 unicast address prefix you implicitly own an RFC3306 IPv6 multicast address prefix:

Bits:   8  |   4   |   4   |  8   |  8   |       64       |    32
Field:  FF | Flags | Scope | Rsvd | Plen | Network prefix | Group ID

Example: FF3E:0040:3FFE:0C15:C003:1109:0000:1111

• 3 hex = flags: unicast-prefix-based
• E hex = scope: global
• 40 hex = Plen: prefix length 64
• 3FFE:0C15:C003:1109 = network prefix, 0000:1111 = group ID

Flags = 00PT, P = 1, T = 1 => unicast-based address
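The field layout can be checked by assembling the example address from its parts. This is a minimal sketch; the function name and its defaults are made up for illustration, and only the bit layout and the example values come from the slide.

```python
import ipaddress


def unicast_prefix_based_group(prefix: str, group_id: int, scope: int = 0xE) -> ipaddress.IPv6Address:
    """Build an RFC 3306 style multicast address from a unicast prefix (hypothetical helper)."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen > 64:
        raise ValueError("the network prefix field holds at most 64 bits")
    if not 0 <= group_id < 2**32:
        raise ValueError("the group ID field is 32 bits wide")
    flags = 0x3                                # 00PT with P = 1, T = 1
    prefix64 = int(net.network_address) >> 64  # upper 64 bits, zero padded
    value = (
        (0xFF << 120)             # multicast marker
        | (flags << 116)
        | (scope << 112)
        | (net.prefixlen << 96)   # Plen; the 8 Rsvd bits above it stay zero
        | (prefix64 << 32)
        | group_id
    )
    return ipaddress.IPv6Address(value)


group = unicast_prefix_based_group("3ffe:c15:c003:1109::/64", 0x1111)
```

With the prefix and group ID from the example above, `group` is the same address as FF3E:0040:3FFE:0C15:C003:1109:0000:1111.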

SSM:

Special case of unicast-prefix-based addresses: P = T = 1, plen = 0, network prefix = 0, i.e. FF3x::/96

Embedded Rendezvous Point Addresses

Multicast address with embedded Rendezvous Point address:

Bits:   8  |   4   |   4   |  4   |  4   |  8   |       64       |    32
Field:  FF | Flags | Scope | Rsvd | RPad | Plen | Network prefix | Group ID

Example: FF76:0130:1234:5678:9abc::4321 => resulting Rendezvous Point address 1234:5678:9abc::1

• Special case of unicast-prefix-based addresses
• Flags = 0RPT, R = 1, P = 1, T = 1 => Rendezvous Point address embedded
• Rendezvous Point address = network prefix + RPad (in the example: prefix 1234:5678:9abc::/48, RPad = 1)
• Sixteen Rendezvous Point addresses per network prefix
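How a router could derive the Rendezvous Point from such a group address can be sketched as below. The function name is hypothetical and error handling is minimal; the bit positions follow the embedded-RP layout (4 reserved bits, 4-bit RP identifier, 8-bit prefix length) and the example address from this slide.

```python
import ipaddress


def embedded_rp(group: str) -> ipaddress.IPv6Address:
    """Extract the Rendezvous Point address embedded in a multicast group address."""
    value = int(ipaddress.IPv6Address(group))
    if value >> 120 != 0xFF:
        raise ValueError("not an IPv6 multicast address")
    flags = (value >> 116) & 0xF
    if flags != 0x7:
        raise ValueError("flags must be 0RPT with R = P = T = 1")
    rpad = (value >> 104) & 0xF  # 4-bit RP identifier
    plen = (value >> 96) & 0xFF  # prefix length in bits
    if not 1 <= plen <= 64:
        raise ValueError("prefix length must be between 1 and 64")
    prefix64 = (value >> 32) & (2**64 - 1)
    prefix64 &= ~((1 << (64 - plen)) - 1)  # keep only the first plen bits
    # RP address = network prefix in the upper bits, RPad in the lowest 4 bits.
    return ipaddress.IPv6Address((prefix64 << 64) | rpad)


rp = embedded_rp("FF76:0130:1234:5678:9abc::4321")
```

For the example group address this yields 1234:5678:9abc::1, so no separate group-to-RP configuration is needed.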

Embedded – Rendezvous Point Usage

PIM-SM protocol operations with embedded-Rendezvous Point:

• No change in PIM-SM protocol operations
• Just an automatic replacement for static Rendezvous Point configuration
• Can replace BSR for group-to-RP mapping
• Method requires large IPv6 addresses
  • No equivalent possible in IPv4
• Not usable for Bidir-PIM either ;-(

[Figure: ASM across a single shared PIM domain with one Rendezvous Point: source (S), DR, RP]

IP multicast RPF


IP multicast RPF Overview

RPF – Reverse Path Selection

• Unlike DVMRP, PIM-based multicast tree building does not come with its own protocol to determine the shortest paths towards an RP or a source. Instead it relies on unicast routing protocols
• The initial PIM architects assumed that exactly the same routes (IGP/BGP) as for unicast could be used. And then there was reality…

Static multicast routes

ECMP (Equal Cost Multipath)

• Necessarily per multicast flow, not per packet; tail-end driven

MP-BGP (MBGP)

• For interdomain incongruency (IPv4/IPv6)

Separate topology for multicast in IGPs

• When asymmetric metrics are required
• For cost optimization

Dual topologies for multicast

• For live-live traffic redundancy

IP multicast RPF ECMP – Equal cost multipath

• Decision in unicast is made by the upstream (sending) router
• Decision in multicast is made during RPF selection on the downstream router: a router-local choice – no network dependency
• Polarizing: i = hash(S) % n
  • ..but good for L3 link bundles – predictable traffic distribution
• Non-polarizing: choose the i for which hash(S, Nbr-i) is largest
  • Also stable in case of link failure for unaffected flows
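The two selection rules can be compared in a small sketch. The hash function here (SHA-256 over strings) and the router names are assumptions chosen for the example; real implementations use their own hash functions and neighbor identifiers.

```python
import hashlib


def _hash(*parts: str) -> int:
    """Deterministic stand-in for a router's hash function (assumption)."""
    digest = hashlib.sha256("|".join(parts).encode()).digest()
    return int.from_bytes(digest[:8], "big")


def rpf_polarizing(source: str, neighbors: list) -> str:
    """i = hash(S) % n: every router derives the same index from S alone."""
    return neighbors[_hash(source) % len(neighbors)]


def rpf_non_polarizing(source: str, neighbors: list) -> str:
    """Pick the neighbor with the highest hash(S, Nbr-i)."""
    return max(neighbors, key=lambda nbr: _hash(source, nbr))


neighbors = ["rtr4", "rtr5", "rtr6"]
chosen = rpf_non_polarizing("10.0.0.1", neighbors)
```

Because the non-polarizing choice depends on the neighbor identity and not on its position, removing a neighbor that was not selected leaves the choice for that source unchanged, whereas the modulo rule may remap flows whenever n changes.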

[Figure: multicast RPF selection for different source addresses, polarizing vs. non-polarizing, routers 1 to 7]

Given 1..n (eg: 2) ECMPs, if all routers select the same neighbor i for a source S, then polarization may happen: rtr2 will only be joined by rtr1 for sources that its own ECMP would RPF to rtr4, but never to rtr5!

IP multicast RPF BGP – IPv4 / IPv6

IPv4: MP-BGP introduced SAFI:

• SAFI1 = unicast only, SAFI2 = multicast only, SAFI3 = both
• Traditional implementations only use SAFI2 and non-SAFI BGP4 (unicast)
• Laziness (not wanting to have all multicast routes in SAFI2) requires complex route preference rules – prefer shorter-prefix SAFI2 over longer-prefix non-SAFI2

IPv6: uses the latest version of MP-BGP (RFC2858)

• No non-SAFI route announcements (never defined), no SAFI 3 (removed by IETF)
• Only SAFI 1 for unicast, SAFI 2 for multicast
• Should never use SAFI 1 routes for multicast, to keep RPF rules simple (and to use MP-BGP as intended by the wording of the BGP spec)
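The IPv4 preference rule (a shorter SAFI2 prefix beats a longer non-SAFI2 prefix) might look like the following sketch. The function names and the list-based tables are hypothetical, and actual route preference rules are implementation specific; the prefixes are the ones shown in the slide's figure.

```python
import ipaddress


def longest_match(table, source):
    """Return the most specific prefix in `table` covering `source`, or None."""
    src = ipaddress.ip_address(source)
    matches = [net for net in table if src in net]
    return max(matches, key=lambda net: net.prefixlen, default=None)


def rpf_route(safi2_table, unicast_table, source):
    """Prefer any SAFI2 match over unicast BGP4, even if the unicast prefix is longer."""
    route = longest_match(safi2_table, source)
    if route is not None:
        return ("SAFI2", route)
    route = longest_match(unicast_table, source)
    return ("unicast", route) if route is not None else None


safi2 = [ipaddress.ip_network("131.188.0.0/16")]
unicast = [ipaddress.ip_network("131.188.1.0/24")]
selected = rpf_route(safi2, unicast, "131.188.1.1")
```

For source 131.188.1.1 the /16 from SAFI2 is selected although the unicast table holds a more specific /24; the unicast route is only used when no SAFI2 route covers the source.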

[Figure: AS1 to AS4 with source S = 131.188.1.1; AS3 is unicast only. BGP4 announces 131.188.1.0/24, BGP SAFI2 announces 131.188.0.0/16]

IP multicast RPF Separate multicast topology for cost optimization

[Figure: core/distribution topology with sources Src1/Src2, three core POPs (routers A1/B1, A2/B2, A3/B3) connected by WAN links, and Regions 1 to 3 with receivers]

• Consider a simplified example core/distribution network topology
• Core POPs have redundant core routers, with redundant connectivity via (10 Gbps) WAN links. Simple setup: A/B core routers, A/B links
• Regions use ring(s) for redundant connectivity

IP multicast RPF Separate multicast topology for cost optimization

[Figure: same topology, showing load splitting across WAN links]

IGP metrics are set to achieve good load distribution across the redundant core.

• Manual IGP metric setting and/or tools
• Assume a cost of 1 on all links in the idealized topology

Result: unicast traffic is load split across redundant core links

IP multicast RPF Separate multicast topology for cost optimization

[Figure: same topology, showing unnecessary use of WAN links]

The same metrics that are good for unicast load splitting cause multicast traffic to go unnecessarily across both the A and B WAN links.

• 10 Gbps WAN links, 1..2 Gbps multicast => 10..20% WAN waste (cost factor)
• Cannot resolve the problem well without a multicast-specific topology

IP multicast RPF Separate multicast topology for cost optimization

[Figure: same topology, showing efficient use of WAN links]

Simple? to minimize tree costs with a multicast-specific topology

• Manual or tool based
• Example topology: make B links very expensive for multicast (cost 100), so they are only used as a last resort (no A connectivity)

IP multicast RPF Dual multicast topologies for resiliency

[Figure: redundant encoder/multiplexer feeding, over two network paths, a redundant decoder/ad-inserter/.. and the HFC plant with STBs]

• Send traffic twice to different multicast groups (eg: green = 232.1.0.1, red = 232.1.0.2)
• Use path separation in the network to pass red/green across different paths
  • Note: dual topologies are just one solution (VRFs, virtual routers, …)
• Receivers receive both copies and remove duplicates by sequence numbers (eg: MPEG timestamp)
• No single network failure will cause any service interruption
• Same bandwidth allocation needed as in traditional SONET rings, but the solution is even better: 0 loss instead of 50 msec
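Duplicate removal at the receiver can be sketched as follows, assuming each packet carries a sequence number that is identical in the red and green copies. The function name and the tuple representation are made up for the example, and a real receiver would bound the state with a sliding window instead of remembering every sequence number.

```python
def merge_live_live(packets):
    """Yield each sequence number once, from whichever copy arrives first.

    `packets` is an iterable of (seq, payload) tuples in arrival order,
    interleaving the red and the green stream.
    """
    seen = set()
    for seq, payload in packets:
        if seq in seen:
            continue  # second copy of a packet that was already delivered
        seen.add(seq)
        yield seq, payload


# Red path loses packet 2; the green copy fills the gap without retransmission.
arrivals = [(1, "a"), (1, "a"), (2, "b"), (3, "c"), (3, "c")]
merged = list(merge_live_live(arrivals))
```

As long as at least one of the two copies of every packet arrives, the merged stream is complete, which is why a single network failure causes no loss.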


IP multicast RPF Dual multicast topologies for resiliency

Traditional application:

• Market data distribution in finance networks (NASDAQ, etc.)
• Based on two completely separate networks!

With the two-topology solution

• No separate physical networks required
• Can provide different subsets of the network to different classes of traffic
• Can share links to reduce cost (two unidirectional links)
• Can share nodes to reduce cost

Vs. virtual routers or similar “virtual network”:

• No need for subnet encaps for multiple topologies

Vs. RSVP-TE P2MP

• Diffserv-type approach (not per flow/tree)

IP multicast RPF Dual multicast topologies for resiliency

[Figure: ring of receivers (Rcvr) fed by a redundant encoder/multiplexer; each link direction carries a small metric in one topology and an infinite/large metric in the other]

IGP cost in different topologies:

• Topology sharing of links: particularly useful in rings
• Two topologies are also useful for unicast (eg: VoD load splitting)
• Requires unidirectional “infinite” link metric to avoid reconvergence of topologies (if wanted)
  • Available in ISIS today, not in OSPF
  • Part of OSPF/UDLR draft in IETF
• Multicast traffic flows in the reverse direction of unicast

Reliable multicast transport protocols

Reliable Multicast Overview

PGM (Pragmatic General Multicast)

• Near-real-time delivery with TCP-compliant flow control, optional network-level scalability support (DLR, NE)
• NAK-based: preferred for small to mid-size receiver groups with tight delivery schedules
• Used in many finance/commercial applications today, supported by Windows XP and later, etc. Router support little used.

ALC (Asynchronous Layered Coding)

• Non-real-time delivery without any feedback from receivers – can support arbitrarily large receiver populations (eg: STBs)
• Relies on FEC. Interesting with “Tornado Codecs” (large block codecs)
• Target applications, eg: content distribution to VoD servers, STBs with HD, PC software upgrades, nVoD

Reliable multicast PGM components

[Figure: PGM components: server with PGM (server/source) stack; network with optional PGM functions; host with PGM (host/receiver) stack]

Optional PGM functions in the network:

• DLR: Designated Local Repairer
• NE: Network Element = router assist

Reliable multicast

content carousels – before ALC/Tornado-Codecs

[Figure: File -> Receive -> Received File]

• In a traditional carousel, a file is sent repeatedly. Receivers that start receiving in the middle receive the tail of the file and then continue receiving until they have received the head from the next iteration.
• Works well only if the network has little packet loss
• With higher packet loss, receivers may need to receive content for many iterations

Reliable multicast

content carousels – with ALC/Tornado-Codecs

[Figure: File -> Tornado Encode -> Receive arbitrary packets -> Tornado Decode -> Received File]

• ALC encodes the file into eg: 2^32 different packets
• The receiver needs to receive just sufficiently many arbitrary packets from the encoding to reconstruct the file (original file size + 5% overhead)

Reliable multicast

content carousels – with ALC/Tornado-Codecs

• Allows reliable multicast content to be carried as scavenger-class traffic (less than best effort)
  • Use free bandwidth in the network!
• Limits of the basic carousel:
  • Can only start encoding after the whole file is available
  • Not directly usable for real-time transmission
• Break up the file into segments, apply ALC encoding separately to each, and start transmission after the first segment

Reliable multicast

nVoD – with ALC/Tornado-Codecs

1. Segment the movie: S1 = 1 min, S2 = 2 min, S3 = 4 min, 8, 16, …

2. Carousel each segment simultaneously, ALC encoded, at double speed

[Figure: movie split into segments S1 … S6 of doubling length, each carouseled in parallel]

Reliable multicast

nVoD – with ALC/Tornado-Codecs

3. The receiver IGMP-joins one segment at a time. Once a segment is fully received, it is decoded and the receiver starts receiving the next segment. Because segments are transmitted faster than real time (example: factor 2), playing out a segment takes as long as receiving the next one.

Reliable multicast

nVoD – with ALC/Tornado-Codecs

Never more traffic than unicast, never more traffic than 14 unicast users (in example)

• VoD starts after 30 seconds
• Total bandwidth in the network is 7 segments * double speed == the same bandwidth as 14 unicast VoD viewers require
• As soon as more than 14 users watch the same content (independent of their starting time), no more bandwidth is required
• If fewer than 14 users watch, the bandwidth utilized is still the same as in unicast (because only traffic joined via IGMP is forwarded)
• Just the transmission is more bursty than unicast!
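The numbers in this example can be reproduced with a short calculation. The sketch assumes 7 segments of doubling length starting at 1 minute, a carousel speed of twice real time, and ignores the FEC reception overhead; the function name and the returned keys are made up for illustration.

```python
def nvod_schedule(num_segments=7, first_minutes=1.0, speedup=2.0):
    """Walk through one viewer's download/playout timeline (times in minutes)."""
    lengths = [first_minutes * 2**k for k in range(num_segments)]
    clock = 0.0      # time at which the current segment is fully received
    play_end = None  # time at which playout of the previous segment ends
    startup = None
    stalls = 0
    for length in lengths:
        clock += length / speedup  # segments are received one after another
        if play_end is None:
            startup = clock        # playout starts once S1 is complete
            play_start = clock
        else:
            if clock > play_end:   # next segment not ready in time
                stalls += 1
            play_start = max(clock, play_end)
        play_end = play_start + length
    return {
        "startup_seconds": startup * 60,
        "movie_minutes": sum(lengths),
        "stalls": stalls,
        "unicast_equivalent_streams": num_segments * speedup,
    }


result = nvod_schedule()
```

With these parameters playout starts after 30 seconds, a 127 minute movie plays without stalls, and the carousels together consume the bandwidth of 14 real-time streams; changing `num_segments`, `first_minutes` or `speedup` shows how the trade-off between startup delay and bandwidth shifts.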

Parameters can easily be varied

Beneficial for top 3?% of VoD library

Zipf distribution – majority of market share


Questions

?
