Slides - TNC 2008
Live Music Performances over High-Speed IP Networks
Stefan Karapetkov
Director, Emerging Technologies
TERENA Networking Conference
Bruges, Belgium, May 20, 2008
Agenda
Manhattan School of Music
Audio-Video Networks
Audio Technology
Voice-specific Codec Functions
Adjustments for Live Music Mode
Video Technology
Transmission Technology
Live Music Mode Demo
MSM Testimonial
Audio-Video Networks Today
Video Endpoints
IM/Presence and IP-PBX Integration
Conference Servers
Call Control, Management & Scheduling
Video Recording, Streaming & Content Management
Security & NAT/FW Traversal
H.323 Architecture
Components: User Database, Gatekeeper, Terminal A, Terminal B, IP Network
1) H.225 SETUP
2) H.225 SETUP
3) H.225 CONNECT
4) H.225 CONNECT
5) H.245 CAPS, MS
6) H.245 CAPS, MS
7) H.245 CAPS, MS
8) H.245 CAPS, MS
9) H.245 OLC
10) H.245 OLC
RTP/RTCP Stream
SIP Architecture
User Database
SIP Proxy
Registrar
2) 302 Moved Temporarily
SIP Redirect Server
IP Network
RTP/RTCP Stream
SIP User Agent A
SIP User Agent B
Audio and Video Compression
Concert Site
Remote Site
Advanced Audio Compression Technology
[Chart: Audio Fidelity vs. Data Bit Rate (4 kbps, 64 kbps, 128 kbps)]
Super Wide: Siren 22 stereo, Siren 14 stereo, G.722.1C
Wideband: G.722.1, G.722, G.722.2
Narrowband: G.711, G.728, G.729A, AMR-NB
Siren™22 Stereo Codec Highlights
                  Siren™22 Stereo                    MP3 Stereo
Latency           Optimized for low latency: 40 ms   High latency: 54-81 ms
Frequency band    22 kHz                             18 kHz
Complexity        Low: 15 MIPS                       High: 100 MIPS
Bit rate          Low bit rate: max. 128 kbps        Optimized for storage: bit rates > 128 kbps
Siren™22 on the Road to Standardization
ITU-T G.719 full-band codec approved in May 2008
Based on Polycom Siren™22 and Ericsson’s advanced audio
G.719 number for higher visibility
ITU-T cited the strong and increasing demand for audio
coding providing the full human auditory bandwidth
Conferencing systems are increasingly used for more elaborate
presentations, often including music and sound effects
In today’s multimedia presentations, playback of audio and video
from DVDs and PCs is becoming a common practice
New Telepresence systems provide High Definition video and
audio quality to the user, and require high-quality media delivery
to create the immersive experience
Extending the quality of remote meetings helps reduce travel
which in turn reduces greenhouse gas emission and limits
climate change.
Automatic Gain Control (AGC)
AGC adds 0dB
AGC adds 3dB
AGC adds 6dB
Signal strength
Nominal is 2 feet from microphone
Max. 12 feet from microphone
Automatic Gain Control (AGC)
Activated by speech and music
Ignores white noise, e.g. if a fan is running nearby,
AGC will not ramp up the gain based on fan noise
AGC destroys the natural dynamic range of music
If the music is loud, AGC decreases the volume
If the music is quiet, AGC increases the volume
Therefore, AGC must be completely disabled in the
codec for live music
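The effect on dynamic range can be illustrated with a minimal sketch. It models an idealized AGC that pulls every passage toward a target level; the function name, the target level of -20 dB, the 6 dB boost limit and the two passage levels are all assumptions chosen for illustration, not values from the codec.

```python
def agc_output_db(level_db, target_db=-20.0, max_boost_db=6.0):
    """Output level of an idealized AGC (hypothetical model).

    The gain pulls the passage toward target_db. Boost is capped at
    max_boost_db; attenuation is not capped in this simple model.
    """
    gain_db = min(target_db - level_db, max_boost_db)
    return level_db + gain_db


# Hypothetical input levels for a loud and a quiet passage (dB full scale).
forte_db, piano_db = -10.0, -40.0

range_in = forte_db - piano_db                                  # dynamic range before AGC
range_out = agc_output_db(forte_db) - agc_output_db(piano_db)   # dynamic range after AGC

print(f"dynamic range before AGC: {range_in} dB, after AGC: {range_out} dB")
```

Under these assumptions the loud passage is attenuated and the quiet one boosted, so the performer's intended contrast shrinks, which is why the slide calls for AGC to be switched off.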
Automatic Noise Suppression (ANS) & Noise Fill
ANS
Noise Fill
Signal
Signal
White noise
Signal
Comfort Noise
Acoustic Echo Cancellation (AEC)
Hears echo
Acoustic Coupling
AEC
Stereo Acoustic Echo Cancellation (AEC)
50-22,000 Hz operating range
Adaptive filter length of 260ms
This number is the maximum delay of the echo that can be compensated
This is the room response – it includes many audio wave reflections
No learning sequence needed
Algorithm trains quickly on speech
No need to send out white noise to train it
Stereo echo canceller identifies multiple paths of the stereo
loudspeakers
Adapts within two words of speech when a microphone
is moved
Moving the microphone changes the echo path and the adaptive filter has to
learn the new path.
Echo comes back for a short time (1-2 words); then the canceller adjusts.
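The adaptive-filter idea can be sketched with a normalized LMS (NLMS) canceller. NLMS is a textbook algorithm used here for illustration only; the slides do not say which algorithm the product uses. The 48 kHz sample rate, the 4-tap room response and the function name are assumptions.

```python
import random


def nlms_echo_canceller(far_end, mic, taps, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the loudspeaker echo from the mic signal."""
    weights = [0.0] * taps
    history = [0.0] * taps                 # most recent far-end sample first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate               # what remains after cancellation
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        residual.append(error)
    return residual


# Taps needed to cover a 260 ms echo tail at an assumed 48 kHz sample rate.
SAMPLE_RATE_HZ = 48_000
TAPS_FOR_260_MS = round(0.260 * SAMPLE_RATE_HZ)

# Toy demonstration with a short, hypothetical room response.
random.seed(0)
room = [0.6, -0.3, 0.15, 0.05]
far = [random.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [sum(room[k] * far[n - k] for k in range(len(room)) if n - k >= 0)
       for n in range(len(far))]
residual = nlms_echo_canceller(far, mic, taps=8)

echo_energy = sum(m * m for m in mic[-200:])
residual_energy = sum(r * r for r in residual[-200:])
print(TAPS_FOR_260_MS, residual_energy < 0.01 * echo_energy)
```

The filter trains on the far-end signal itself, so no white-noise learning sequence is needed. If the room response changes (a moved microphone), the error rises briefly until the weights re-converge.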
Stereo AEC in Live Music Mode (LMM)
Standard AEC leads to audio artifacts; low notes can be cut
The main complaint from MSM is that a sustained note (e.g. with the
sustain pedal pressed on a piano) cannot be heard all the way, even if
it is just 1 dB over the noise floor
AEC settings in LMM prevent very quiet musical sounds from
being cut out
LMM assumes a quiet environment without
background noise
We changed the thresholds for signal detection to be more
aggressive (lower)
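A minimal sketch of the threshold change, assuming the signal detector compares the input level with the noise floor. The two threshold values and the function name are hypothetical; the slides state only that the LMM thresholds are lower.

```python
def passes_signal_detector(level_db, noise_floor_db, threshold_db):
    """True if the input is treated as signal rather than background noise."""
    return level_db - noise_floor_db >= threshold_db


STANDARD_THRESHOLD_DB = 6.0   # hypothetical speech-oriented setting
LMM_THRESHOLD_DB = 0.5        # hypothetical Live Music Mode setting

noise_floor_db = -60.0
sustained_note_db = -59.0     # decaying piano note, 1 dB over the noise floor

kept_standard = passes_signal_detector(sustained_note_db, noise_floor_db,
                                       STANDARD_THRESHOLD_DB)
kept_lmm = passes_signal_detector(sustained_note_db, noise_floor_db,
                                  LMM_THRESHOLD_DB)
print(kept_standard, kept_lmm)
```

With the lower threshold the decaying note is still classified as signal. The trade-off is that real background noise would also pass, which is why a quiet room is assumed.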
Installed Audio
Definition: rack-mounted systems that process all the
audio in a conference room or large meeting room
Microphones
DVD
Video System
SoundStructure
Speakers
Telephony
Interworking: Installed Audio & Video Endpoints
SoundStructure adds 8/12/16 additional inputs/outputs
Digital connectivity with Polycom Video Endpoints
Fully digital audio for better quality
Bi-directional stereo between SoundStructure and HDX
Full 22kHz stereo AEC compatible with Siren 22 audio codec
Shared mute and volume control
Auto-discovery between the devices – automatic configuration
SoundStructure
HDX
Advanced Video Technology: High Definition
[Chart: Quality vs. Bandwidth]
Quality: CIF (352x288), 480p SD (704x480), 720p HD (1280x720)
Bandwidth: 384 kbps, 512 kbps, 1 Mbps, 6 Mbps
Advanced Video Technology: Camera Control
FECC
Res 1280x720p
50/60FPS
Aspect ratio 16:9
Pan +/- 100°
Tilt +20° to -30°
12x optical zoom
FECC
Advanced Video Technology: Far End Camera Control (FECC)
In H.323, FECC uses H.281 (binary data) over H.224 (frames)
RFC 4573, MIME Type Registration for RTP Payload Format for H.224
Advanced Video Technology: Multiple Streams
‘Live’ Stream
‘Presentation’ Stream
ITU-T Recommendation H.239
RFC 4796, SDP Content Attribute
RFC 4574, The SDP Label Attribute
RFC 3388, Grouping of Media Lines in SDP
RFC 4582, Binary Flow Control Protocol (BFCP)
RFC 4583, SDP Format for BFCP Streams
draft-even-xcon-pnc-01, Role Mgmt & Multiple Streams
Transmission Technology
H.323 Domain
SIP Domain
Audio Precedence in Codec Negotiation
High priority
Audio
Video
Bandwidth (kbps)   Standard Setting      LMM Setting
> 1024             Siren22 Stereo 128    Siren22 Stereo 128
768 - 1024         Siren22 Stereo 96     Siren22 Stereo 128
512 - 768          Siren22 Stereo 96     Siren22 Stereo 128
384 - 512          Siren22 Stereo 96     Siren22 Stereo 128
256 - 384          Siren14 Stereo 48     Siren14 Stereo 48
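The precedence table can be read as a lookup from call bandwidth to audio codec. The sketch below encodes it under stated assumptions: bandwidth is the call rate in kbps, a boundary value such as 384 belongs to the higher row (except 1024, which stays in the 768 - 1024 row), and the function name is hypothetical. Rates below 256 are not covered by the table, so the sketch refuses them rather than guessing.

```python
def select_audio_codec(call_rate_kbps, live_music_mode=False):
    """Return (codec name, audio bit rate in kbps) for a given call rate."""
    if call_rate_kbps < 256:
        raise ValueError("the table does not cover call rates below 256 kbps")
    if call_rate_kbps < 384:
        # Lowest row: identical in both settings.
        return ("Siren14 Stereo", 48)
    if live_music_mode or call_rate_kbps > 1024:
        # LMM gives audio precedence: full-rate Siren22 from 384 kbps upward.
        return ("Siren22 Stereo", 128)
    return ("Siren22 Stereo", 96)


for rate in (256, 384, 768, 1024, 2048):
    print(rate, select_audio_codec(rate), select_audio_codec(rate, live_music_mode=True))
```

In the standard setting, audio gives up 32 kbps to video at mid-range call rates. In LMM the audio bit rate is held at 128 kbps and video absorbs the difference.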
Keeping Quality Up in Transmission
Video Error Concealment (PVEC)
Video
Lost Packet Recovery (LPR)
Audio
Video
IP Network
Audio
Video
LPR Definitions
LPR is a new method of error
concealment for packet-based networks
that is based upon Forward Error
Correction (FEC)
LPR constantly adjusts the video bit rate to
reduce the amount of loss in a packet-based
network
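Forward Error Correction in general can be illustrated with the simplest scheme, a single XOR parity packet per group. This is a generic sketch, not Polycom's LPR algorithm, whose details the slides do not give. It assumes equal-length packets and at most one loss per group; the function names are hypothetical.

```python
def make_parity(packets):
    """Build one recovery packet as the byte-wise XOR of all media packets."""
    parity = bytearray(len(packets[0]))
    for packet in packets:
        for i, byte in enumerate(packet):
            parity[i] ^= byte
    return bytes(parity)


def recover_lost(received, parity):
    """Rebuild the single missing packet (marked None) from the survivors."""
    missing_index = received.index(None)
    rebuilt = bytearray(parity)
    for packet in received:
        if packet is not None:
            for i, byte in enumerate(packet):
                rebuilt[i] ^= byte
    return missing_index, bytes(rebuilt)


group = [b"vid0", b"vid1", b"vid2", b"vid3"]
parity = make_parity(group)

# Simulate the network dropping the third packet of the group.
received = [group[0], group[1], None, group[3]]
index, rebuilt = recover_lost(received, parity)
print(index, rebuilt)
```

The receiver repairs the loss without asking for a retransmission, which matters for live performance where a round trip would add latency. The cost is the extra bandwidth for recovery packets, which is why the scheme is paired with bit rate adjustment.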
Lost Packet Recovery (LPR)
Video Encoder
Encryption
RTP
Sender
LPR Packetizer
LPR DBA Mode Decision
LPR Recovery Packet Generator
RTCP
RTCP
LPR Recovery
RTP Reordering Buffer
LPR Regeneration
Decryption
Video Decoder
LPR DBA Example
[Chart: video bit rate as a percentage of full bandwidth over time, in intervals of X ms and Y ms]
Full Bandwidth: 100%
Down Speeding
Packet loss 25%, FEC on: bit rate drop 26%
Packet loss 4%, FEC on: bit rate drop 5%
Packet loss 15%: bit rate drop 16%
Up Speeding
No packet loss, FEC off: bit rate increase, e.g. 10%
Bit rate levels shown: 100%, 77%, 74%, 72%, 70%, 64%, 58%
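The numbers in the example suggest a simple rule: when packets are lost, drop the bit rate by the loss percentage plus one point (25% loss gives a 26% drop); when there is no loss, raise it by a fixed step such as 10%. The sketch below encodes that reading. The rule is an inference from the example figures, not a documented algorithm, and the function name, default step and ceiling are assumptions.

```python
def next_bit_rate(current_pct, packet_loss_pct, increase_pct=10.0, ceiling_pct=100.0):
    """Next video bit rate, as a percentage of full bandwidth."""
    if packet_loss_pct > 0:
        # Down speeding: back off slightly more than the observed loss.
        return current_pct * (1 - (packet_loss_pct + 1) / 100)
    # Up speeding: probe for more bandwidth, never above the full rate.
    return min(current_pct * (1 + increase_pct / 100), ceiling_pct)


rate = 100.0
trace = [rate]
for loss in (25, 4, 15, 0, 0):
    rate = next_bit_rate(rate, loss)
    trace.append(rate)
print([round(r, 1) for r in trace])
```

Backing off by a little more than the measured loss aims the new rate just under what the network is delivering. The small upward steps then test whether capacity has returned.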
Technology Summary
Flexible Networking – H.323 and SIP
Advanced Audio Technologies
Audio Compression
Automatic Gain Control (AGC)
Automatic Noise Suppression (ANS) and Noise Fill
Stereo Acoustic Echo Cancellation (AEC)
Advanced Video Technology
High Definition
Camera Control
Multiple Streams
Advanced Transmission Technology
Lost Packet Recovery (LPR)
Live Music Mode Demo