Speech Coders – a VoIP perspective
Roar Hagen, CTO
SIP/email: [email protected]

Agenda
• Speech Coders – a VoIP perspective
• Demo
• Q&A
QoS – (endpoints) status
“A lot of talk, ... but not much work”
• Year after year the same story
• More than 3000 papers since 1984
• Limited ToS support at the end points

QoS – status: Industry’s perspective
[Figure: bar chart, percentage of respondents (axis 0 to 60) citing each concern: quality concerns, unproven technology, PSTN works fine, too busy to switch, not compelling economics. Source: Forrester Research/AT&T (2000)]

Background - Diverse Environment
[Figure: network diagram with labels PSTN, Managed network, Public Internet, Wireless]
Next generation codecs should address the needs of all applications

Packet Loss San Francisco – Hong Kong
[Figure: packet loss, San Francisco – Hong Kong]

Jitter San Francisco – Hong Kong
[Figure: jitter, San Francisco – Hong Kong]

Hong Kong to China VoIP Call
[Figure: Hong Kong to China VoIP call]

Wireless VoIP – The Big Unknown?
[Figure: mobility (Fixed, Walk, Vehicle) versus data rate (0.1 to 100 Mbps) for 2G/2.5G, 3G, WLAN, LAN, and Bluetooth]

Approach
We need a holistic view/approach for both
• Horizontal (end-to-end) perspective
• Vertical (top-down) perspective

Vertical (Top Down) Perspective
Presentation   Speech Codecs/…
Session        SIP/H.323
Transport      RTP/UDP/RSVP
Network        IP/WFQ/IP-prec
Link           MLPPP/FR/ATM AAL1
Physical

VoIP Aspirations
• IP innovation rather than PSTN replication
• New features and services through voice and data convergence
• End-to-end IP
• Better than PSTN sound quality

MOS* = USER EXPERIENCE
Current speech processing technology is not designed for packet switched environments
“FALL OFF A CLIFF” shape of curve forces over provisioning
[Figure: MOS curve running from OVERPROVISIONED NETWORK to CONGESTED NETWORK]
* MEAN OPINION SCORE

MOS* = USER EXPERIENCE
…congestion related VoIP QoS problems can be solved without over provisioning…
Operate AT and ABOVE congestion point without customer knowing
[Figure: MOS curves running from OVERPROVISIONED NETWORK to CONGESTED NETWORK; labels: narrow band sound quality equal to PSTN, wide band sound quality, Better Than PSTN Quality, Matching PSTN Quality]
* MEAN OPINION SCORE

Telephony bandwidth speech test result / Wideband speech
[Figure: two panels of MOS (1.0 to 5) versus NETWORK CONDITION (% PACKET LOSS). Telephony bandwidth panel (0% to 30% loss): GIPS Enhanced G.711 + GIPS NetEQ™, G.711 + GIPS NetEQ™, G.711 + ITU PLC, G.729A, G.711 + No PLC. Wideband panel (0% to 25% loss): GIPS iPCM™-wb + GIPS NetEQ™-wb, G.722 + GIPS NetEQ™-wb, G.722.1, Source + no PLC. Source: Lockheed Martin Global Telecommunications (COMSAT)]

Jitter Buffer/PLC Enhancements
Delay gain with NetEQ™ approx. 30-60 ms compared to traditional jitter buffers
[Figure: Delay (ms, 0 to 140) versus Packet number (0 to 2000) for Jitter, Adaptive jitter buffer, NetEQ™, and Fixed jitter buffer. Source: Lockheed Martin Global Telecommunications (COMSAT)]

The NextGen Speech Codec Ideal
• Need one concept that will work for a long time
  – footprint importance
• Need to handle large diversity of transport network
  – low rate
  – high quality, high rate
  – packet loss
  – jitter
  – low delay
• Manageable IPR situation
• Signal Robustness
  – speech
  – music
• Suitable for variety of applications, e.g. IP video-conferencing

iLBC (internet Low Bitrate Codec)
• Speech sampled at 8 kHz, using a block-independent linear-predictive coding (LPC) algorithm
• Bit rate 13.867 kbps (52 bytes per 30 ms)
• Frame size 30 ms (support for 20 ms in the next revision)
• Complexity and memory requirements are similar to ITU G.729A
• Basic quality is equal to or better than G.729. Packet loss robustness is significantly better than G.729.
• Packet loss concealment - integrated example solution

MOS Results
[Figure: MOS (1.5 to 4.0) versus Packet Loss [%] (0 to 15) for G.729A, G.723.1, and iLBC. Source: Dynastat Inc.]

iLBC - IETF work
• IETF deliverables, submitted during February ’02:
  – iLBC codec specification draft - experimental standards track
  – iLBC RTP Payload Profile - regular standards track (AVT)
  – Statement about IPRs in iLBC and its “freeware nature”
• MOS results submission to the AVT mailing list during March ’02

Why iLBC!?
• Current low bit rate codecs (ITU G.729, G.723.1, GSM-EFR, and 3GPP-AMR) were developed for circuit switched and wireless telephony, and all are based on the CELP (Code Excited Linear Prediction) paradigm.
• CELP coders are stateful: they have memory, so lost or delayed packets cause error propagation.
• iLBC treats every packet individually, making it suitable for packet communications.

More information
• Coming soon - web site www.ilbcfreeware.org with:
  – Info about initiative
  – Info about codec
  – Latest iLBC IETF drafts (spec and payload format)
  – Latest iLBC floating point source code
  – FAQ list
• IETF drafts:
  – draft-andersen-ilbc-00.txt - codec spec (exper. stds track)
  – draft-duric-rtp-ilbc-00.txt - RTP payload profile (AVT group)
• Web site www.globalipsound.com
• Free demo SIP client available, please request at: SIP/email: [email protected]

Summary
• Current speech coding technology not suited for VoIP
• VoIP opens possibilities
  – Move quality experience to the next level with wideband coders
• NGN will not be NGN unless we move a step forward on all of its fields
• iLBC – internet Low Bit Rate Codec
  – Provide an open standard “the Internet way” for coders

Demo
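
Worked example: iLBC bit rate
The iLBC figure quoted on the codec slide (52 bytes per 30 ms frame) can be checked with a short calculation. The sketch below also estimates the rate on the wire when frames travel in RTP/UDP/IPv4 packets. The 40 byte header total (20 IPv4 + 8 UDP + 12 RTP, with no options or extensions), the frames-per-packet parameter, and the function names are assumptions made for illustration; they do not come from the slides.

```python
# Assumed header sizes (not from the slides): IPv4 without options,
# UDP, and RTP without CSRC entries or header extensions.
IPV4_HEADER_BYTES = 20
UDP_HEADER_BYTES = 8
RTP_HEADER_BYTES = 12


def codec_bitrate_kbps(payload_bytes: int, frame_ms: float) -> float:
    """Bit rate of the encoded speech alone, in kbit/s."""
    # Bits per millisecond is numerically equal to kbit per second.
    return payload_bytes * 8 / frame_ms


def wire_bitrate_kbps(payload_bytes: int, frame_ms: float,
                      frames_per_packet: int = 1) -> float:
    """Bit rate including the assumed IPv4/UDP/RTP headers, in kbit/s."""
    header_bytes = IPV4_HEADER_BYTES + UDP_HEADER_BYTES + RTP_HEADER_BYTES
    packet_bytes = header_bytes + payload_bytes * frames_per_packet
    packet_ms = frame_ms * frames_per_packet
    return packet_bytes * 8 / packet_ms


if __name__ == "__main__":
    # Values from the slide: 52 bytes per 30 ms frame.
    print(f"codec: {codec_bitrate_kbps(52, 30):.3f} kbps")
    print(f"wire, 1 frame/packet: {wire_bitrate_kbps(52, 30):.3f} kbps")
    print(f"wire, 2 frames/packet: {wire_bitrate_kbps(52, 30, 2):.3f} kbps")
```

With the slide's values this gives 13.867 kbps for the codec alone. Under the assumed headers, sending one frame per packet costs roughly 24.5 kbps on the wire, and packing two frames per packet lowers that to 19.2 kbps at the price of 30 ms of extra packetization delay.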