Speech Quality

Overcoming VoIP Quality Challenges
Dr. Jan Linden, VP of Engineering
Global IP Solutions
Outline
 VoIP Quality Challenges
 Latency
 Codec Choice
 Conferencing
 How to Measure Speech Quality
VoIP Design Considerations
 Speech Quality
 Time to Market
 Ease of Use
 Flexibility
 Network Impairments
 Power Consumption
 Cost
 Quality
 Signaling
 Infrastructure
 Features
 Device Considerations
Major Challenges for VoIP End-point Design
Both Sides of the Call Need to be Considered
VoIP Design Challenges:
 Codec: Speech Codec
 Hardware: Hardware Issues (Processor, OS, Acoustics, etc.)
 Network: Coping with Network Degradation
 Power: Power Consumption
 Echo: Echo Cancellation
 Voice: Additional Voice Processing Components
 Environment: Background Noise, Room Acoustics, etc.
Impact of IP Networks
Delay
 Major effect is “stepping on each other’s talk”
 Usage scenario affects annoyance factor – higher delay can be tolerated for mobile devices
 Long delays make echo more annoying
Packet Loss
 Smooth concealment necessary
Network Jitter
 Jitter buffer necessary to ensure continuous playout
 Trade-off between delay and quality
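The jitter-buffer trade-off can be illustrated with a small sketch. It assumes a fixed playout delay measured from the fastest packet and treats any packet that arrives after its playout deadline as lost. The function name `late_fraction` and the delay values are hypothetical and chosen only for illustration.

```python
def late_fraction(network_delays_ms, buffer_ms):
    """Fraction of packets that miss a fixed playout deadline.

    The deadline is the smallest observed network delay plus buffer_ms;
    a packet whose delay exceeds the deadline counts as late (lost).
    """
    deadline = min(network_delays_ms) + buffer_ms
    late = sum(1 for d in network_delays_ms if d > deadline)
    return late / len(network_delays_ms)


# Hypothetical one-way network delays (ms) for ten consecutive packets.
delays = [40, 42, 41, 65, 43, 40, 90, 44, 41, 55]

for buffer_ms in (5, 20, 60):
    # A larger buffer adds delay but lets fewer packets arrive too late.
    print(buffer_ms, late_fraction(delays, buffer_ms))
```

With these made-up delays, a 5 ms buffer drops three packets out of ten, while a 60 ms buffer drops none at the cost of 55 ms more delay.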
Sources of Latency
 Codec
 Capture
 Playout
 Network delay
 Jitter buffer
 OS interaction
 Transcoding
[Diagram: end-to-end signal path. Sender: A/D -> Preprocessing -> Speech encoding -> IP interface -> IP network. Receiver: IP network -> Jitter buffer -> Speech decoding -> Postprocessing -> D/A.]
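A latency budget adds up the sources listed above. The sketch below assumes purely illustrative per-stage delays; none of the numbers come from the presentation, and real values depend on the codec, device and network.

```python
# Hypothetical one-way delay per stage, in milliseconds (illustrative only).
budget_ms = {
    "capture": 10,
    "codec": 25,
    "network delay": 50,
    "jitter buffer": 40,
    "playout": 10,
    "OS interaction": 5,
    "transcoding": 0,   # no transcoding on this hypothetical path
}


def total_latency(stages):
    """Total one-way (mouth-to-ear) delay as the sum of all stages."""
    return sum(stages.values())


def largest_contributor(stages):
    """Name of the stage that adds the most delay."""
    return max(stages, key=stages.get)


print(total_latency(budget_ms), largest_contributor(budget_ms))
```

Listing the stages this way makes it easy to see which one to attack first when the total is too high.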
Impact of Delay on Voice Quality
[Figure: Mean Opinion Score versus one-way transmission time [ms], with axis ticks at 250, 500 and 750 ms. Data from ITU-T G.114.]
 ITU-T (G.114) recommends:
– Less than 150 ms one-way delay for most applications (up to 400 ms acceptable in special cases)
 Users have got used to longer delays
– Still, low delay very important for high quality
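The two thresholds quoted from G.114 can be turned into a simple check. This is a minimal sketch; the function name and the category labels are assumptions made for illustration, and only the 150 ms and 400 ms limits come from the slide.

```python
def classify_one_way_delay(delay_ms):
    """Classify a one-way delay against the G.114 limits quoted above."""
    if delay_ms < 0:
        raise ValueError("delay must be non-negative")
    if delay_ms < 150:
        return "recommended"
    if delay_ms <= 400:
        return "acceptable in special cases"
    return "above the quoted limits"


for delay in (80, 250, 600):
    print(delay, classify_one_way_delay(delay))
```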
Speech Codec
 Many conflicting parameters affect choice of codec
 Determines upper limit of quality
 Support of several codecs necessary
– Interoperability
– Usage scenario
 IPR issues a significant concern
[Diagram: parameters surrounding "Speech Codec": Complexity, Memory, Delay, Input Signal Robustness, Bit-rate, Packet-loss Robustness, Quality, Sampling Rate]
Audio Spectrum
 Better than PSTN quality is achievable in VoIP
– Utilizing full 0 – 4 kHz band in narrowband
– Wideband coding offers more natural and crisper voice
[Figure: audio spectrum with the telephony band marked]
Audio Spectrum vs. Speech Quality
[Figure: speech quality versus audio bandwidth. Labels: Narrowband Speech (PSTN), Wideband Speech, Super Wideband Speech, CD. Frequency axis ticks: 4 kHz, 8 kHz, 10 kHz, 16 kHz, 22.1 kHz.]
Speech Codec Design for VoIP
 Many standard codecs designed for bit errors, not packet loss
– Error propagation issue for CELP codecs
 Variable bit rate attractive for IP networks
 Packet overhead significant (5 – 32 kb/s)
– Makes low bit rate codecs less attractive
 Packet loss concealment a must
 Jitter buffer design has significant impact on quality
 Alternatives to standards
– De-facto standards like iSAC
– Open source like Speex
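The packet-overhead figure can be checked with a short calculation. The sketch assumes a 40-byte header per packet (20 bytes IPv4, 8 bytes UDP, 12 bytes RTP, no options or extensions) and packet durations between 10 ms and 60 ms. These assumptions are not stated on the slide, but they reproduce its 5 – 32 kb/s range.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP, assumed header sizes


def header_overhead_kbps(packet_ms, header_bytes=IP_UDP_RTP_HEADER_BYTES):
    """Bit rate spent on headers when one packet is sent every packet_ms."""
    packets_per_second = 1000 / packet_ms
    return header_bytes * 8 * packets_per_second / 1000


def total_kbps(codec_kbps, packet_ms):
    """Codec payload rate plus header overhead."""
    return codec_kbps + header_overhead_kbps(packet_ms)


for packet_ms in (10, 20, 60):
    print(packet_ms, round(header_overhead_kbps(packet_ms), 2))

# A hypothetical 8 kb/s codec with 20 ms packets needs 24 kb/s on the wire.
print(total_kbps(8, 20))
```

At 20 ms per packet the headers alone cost 16 kb/s, twice the payload of the hypothetical 8 kb/s codec, which is why a very low codec bit rate buys little on an IP network.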
Echo Cancellation
 High delay in VoIP makes echo problem more prominent
 Network/Line echo cancellation for gateways
 Acoustic echo cancellation
– Hands-free/speakerphone
– Small devices
 Biggest challenge is AEC for PC
– Acoustic setup unknown and changing
– Wideband speech
– Very few solutions on the market
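To make acoustic echo cancellation concrete, the sketch below uses a normalized LMS (NLMS) adaptive filter, a common textbook technique. It is not necessarily the method behind any product mentioned in this presentation. The simulated room, the filter length and the step size are all assumptions, and the sketch ignores near-end speech (double talk), which a real canceller must handle.

```python
import random


def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-6):
    """Subtract an adaptive estimate of the far-end echo from the mic signal.

    Returns (residual, weights). taps, mu and eps are illustrative defaults.
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate  # signal that would be sent to the far end
        power = sum(h * h for h in history) + eps
        # NLMS update: step size normalized by the input power.
        weights = [w + mu * e * h / power for w, h in zip(weights, history)]
        residual.append(e)
    return residual, weights


def simulate_echo(far_end, path):
    """Hypothetical room: the mic picks up far_end filtered by a short echo path."""
    return [
        sum(path[k] * far_end[n - k] for k in range(len(path)) if n - k >= 0)
        for n in range(len(far_end))
    ]


random.seed(0)
far_end = [random.uniform(-1.0, 1.0) for _ in range(2000)]
mic = simulate_echo(far_end, [0.5, 0.3, -0.2])
residual, weights = nlms_echo_cancel(far_end, mic)
```

In this idealized setup the filter weights converge to the simulated echo path. On a PC the echo path is unknown and changes over time, which is part of what makes the problem hard.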
Effects of Transcoding
 Transcoding occurs when the endpoints are using different codecs
– Every transcoding introduces distortion
– Low bit-rate codecs very sensitive to transcoding
 Transcoding between networks
– VoIP to PSTN: limited quality degradation since G.711 used on the PSTN side
– VoIP to Cellular: severe quality degradation common since low bit-rate codecs typically used on both sides
– VoIP to VoIP: usually occurs in Session Border Controllers; can normally be avoided
 Transcoding in conferencing
– Mixing done in decoded domain results in transcoding
How to Make the VoIP Software Robust?
 Very Quick Jitter Buffer Adaptation – Conditions Change Very Rapidly (on a millisecond basis)
 Minimize Delay Everywhere – every millisecond counts
 Spot Jitter Patterns
 Increase Delay to Keep Good Quality when Unavoidable
 Packet Loss Concealment – Capable of Handling Several Lost Packets in a Row
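A very basic form of packet loss concealment that copes with several lost packets in a row is to repeat the last good frame with a gain that decays on each consecutive loss. The sketch below illustrates the idea only. Production concealment, such as the NetEQ jitter buffer named in this presentation, is far more sophisticated. The function name, the `None` marker for a lost frame and the attenuation factor are assumptions.

```python
def conceal(frames, frame_len, attenuation=0.5):
    """Replace lost frames (None) with a fading copy of the last good frame.

    Each consecutive loss multiplies the gain by `attenuation`, so a long
    burst fades towards silence instead of repeating the frame at full level.
    """
    out = []
    last_good = [0.0] * frame_len  # silence if the very first frame is lost
    gain = 1.0
    for frame in frames:
        if frame is None:
            gain *= attenuation
            out.append([gain * s for s in last_good])
        else:
            last_good = list(frame)
            gain = 1.0  # a good frame resets the fade
            out.append(last_good)
    return out


# Two frames lost in a row between two received frames.
received = [[1.0, -1.0], None, None, [0.5, 0.5]]
print(conceal(received, frame_len=2))
```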
Measuring Voice Quality
Subjective Methods
 Test the “right thing”, i.e. subjective quality
 Takes all types of degradation into account
 Time consuming and costly
 Lack of repeatability
Objective Methods
 Simple and affordable
 Inaccurate but repeatable results
 Sensitive to any processing (nonlinear filtering, echo cancellation, time warping etc.)
– Time synchronization major challenge not yet solved
 Sensitive to background and equipment impairments
 One step behind development of codecs and error concealment
 Next generation algorithm in standardization process (P.OLQA)
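For the subjective side, a Mean Opinion Score is the average of listener ratings, and reporting it with a confidence interval shows how much a result may vary between test runs. The sketch below assumes ratings on a 1 to 5 absolute category rating scale and a normal-approximation 95% interval. The ratings are invented for illustration.

```python
import statistics


def mean_opinion_score(ratings):
    """Return (MOS, half-width of an approximate 95% confidence interval).

    Assumes integer ratings from 1 (bad) to 5 (excellent) and uses the
    normal approximation 1.96 * s / sqrt(n), which is rough for small panels.
    """
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    mos = statistics.mean(ratings)
    half_width = 1.96 * statistics.stdev(ratings) / len(ratings) ** 0.5
    return mos, half_width


# Hypothetical ratings from eight listeners for one test condition.
ratings = [4, 4, 3, 5, 4, 3, 4, 4]
mos, half_width = mean_opinion_score(ratings)
print(round(mos, 3), round(half_width, 3))
```

A small panel gives a wide interval, which is one reason subjective tests need many listeners and are time consuming and costly.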
Audio Conferencing
 Design includes a trade-off between quality and scalability
 Client based or server based
– Server based offers better scalability than client based
– Can be combined
 Transcoding often unavoidable
 Two strategies:
– Mix incoming signals to form one output signal
– Only relay packets and mix at client side
 Multi-codec support
– In relay mode all endpoints need to support all codecs
 Narrowband and wideband
– Both can be present in a conference
– Narrowband participant will hear everything in narrowband
– Wideband participant hears others in narrowband or wideband
[Diagram: conference with participants A to E; each participant receives the mix of the other four (A receives B+C+D+E, B receives A+C+D+E, C receives A+B+D+E, D receives A+B+C+E, E receives A+B+C+D).]
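The mixing strategy in the diagram, where each participant hears everyone except themselves, can be sketched as follows. It assumes already decoded, time-aligned sample lists of equal length. The function name `mix_minus` and the sample values are hypothetical, and a real mixer would also limit or scale the sum to avoid clipping.

```python
def mix_minus(streams):
    """For each participant, sum the decoded samples of all other participants.

    streams maps a participant name to a list of samples of equal length.
    """
    if not streams:
        return {}
    length = len(next(iter(streams.values())))
    # Sum everyone once, then subtract each participant's own signal.
    total = [sum(s[i] for s in streams.values()) for i in range(length)]
    return {
        name: [total[i] - own[i] for i in range(length)]
        for name, own in streams.items()
    }


# One decoded sample per participant, chosen so each sum is easy to read.
decoded = {"A": [1], "B": [2], "C": [4], "D": [8], "E": [16]}
print(mix_minus(decoded))
```

Because the mix is formed from decoded signals and must be encoded again for each participant, this strategy implies transcoding, as noted on the transcoding slide.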
Conclusions
 Latency has a significant impact on the perceived quality in VoIP
– Low latency, high quality (e.g. NetEQ) jitter buffer necessary
 Choose the right codec for the usage scenario
– Or a codec that can adapt like iSAC
 Transcoding should be avoided, if possible
 Significantly better quality than PSTN possible
– Wideband coding
 No good objective measure for speech quality exists
– Always combine with subjective evaluation