NC3A Technical Presentation 001 (Transcript)

The NATO Post-2000 Narrow Band Voice Coder: Test and Selection of STANAG 4591
Technical Presentation-001
[email protected]
CIS Division, NATO C3 Agency
NATO UNCLASSIFIED
Abstract and Conditions of Release
Abstract
The work described in this presentation was carried out under customer
funded projects 25.12.00 and N.25.12.00, conducted by NC3A on behalf of
AC322(SC/6-AHWG/3).
This presentation gives a general introduction to the work, which is
documented in NC3A Technical Note-881 and NC3A Technical
Memorandum-946.
This presentation is a working paper that may not be cited as representing
formally approved NC3A opinions, conclusions or recommendations.
NBVC and NC3A

[Diagram: how the NBVC work relates to NC3A]
Customers: the NATO Infrastructure Committee (customer funding), the voice coder developers, the NATO Narrow Band Voice Coder Ad-Hoc Working Group and the Host Nation.
NC3A sites: NC3A-NL, The Hague and NC3A-BE, Brussels.
Acquisition staff: equipment acquisition and contractual issues.
Scientific staff: set up the voice coding testbed, process input data, blind and deblind data, and support the AHWG NBVC, the test labs and the coder developers.
Introduction to STANAG 4591
Background
• Voice coding technology is constantly improving
  • driven by mobile telephony (narrow band, wireless channels)
  • new coders outperform existing NATO voice coders

STANAG 4198 - LPC10e:
  + low rate (2.4 kbps)
  - low speech quality
  - low resilience to noise

STANAG 4209 - CVSD:
  + good resilience to noise
  - poor speech quality in the absence of noise
  - high rate (16 kbps)

• AHWG NBVC tasked by NATO to select a future Narrow Band Voice Coder for NATO use at 1.2 kbps and 2.4 kbps
Voice Coders Tested
• NATO requested candidates to be submitted by member nations
• Three candidates submitted:
  • France: HSX (Harmonic Stochastic eXcitation)
  • Turkey: SB-LPC (Split-Band Linear Predictive Coding)
  • USA: MELP (Mixed Excitation Linear Prediction)
  (each candidate operates at both 1.2 kbps and 2.4 kbps)
• plus LPC-10e (2.4 kbps), CELP (4.8 kbps) and CVSD (16 kbps) as known reference coders
Test Resources and Responsibilities
• Project was ‘customer funded’ by the NATO Infrastructure Committee and the nations submitting coders
• NC3A was the host nation, but worked with specialist speech processing labs
• NC3A ran raw audio data through the coders and ‘blinded’ all output
• National test labs analysed the raw audio from NC3A. The test labs were:
  • TNO, NL
  • CELAR, FR
  • Arcon, US
• NC3A impartially collated the results
[Photos: the TNO test laboratory at Soesterberg, NL; NATO data being analysed at TNO]
NATO NBVC tests - Phase 1
• Floating point vocoder implementations
• Performance
  • Intelligibility
  • Quality
• Noise conditions
  • Quiet
  • Modern office
  • Acoustic noise (6 dB, 12 dB)
• 5488 Mb of processed audio in 5848 files
[Photo: a typical test booth where subjects listen to speech for analysis]
Processing by NC3A
[Diagram: the NC3A processing chain]
Each raw audio file (8 kHz sample rate, 16-bit samples) was passed through the encoder and decoder of each coder under test (LPC10e, CVSD, CELP, FR1200, FR2400, TU1200, TU2400, US1200 and US2400), producing nine raw audio output files per input file, which were sent to the test labs for analysis.
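A minimal sketch of this batch processing step, assuming each coder is wrapped as a command-line encoder/decoder pair; the executable names, flags and directory layout below are hypothetical, not the actual NC3A tooling.

import subprocess
from pathlib import Path

# Hypothetical encoder/decoder wrappers for each coder under test; the real
# tools and their options are not given in the presentation.
CODERS = {
    "LPC10e": ("lpc10e_enc", "lpc10e_dec"),
    "CVSD":   ("cvsd_enc",   "cvsd_dec"),
    "CELP":   ("celp_enc",   "celp_dec"),
    "US2400": ("melp_enc",   "melp_dec"),
    # ... remaining candidates at 1.2 and 2.4 kbps
}

def process_file(raw_file: Path, out_dir: Path) -> None:
    """Encode and decode one raw audio file (8 kHz, 16-bit) with every coder."""
    for name, (enc, dec) in CODERS.items():
        bitstream = out_dir / f"{raw_file.stem}.{name}.bit"
        decoded = out_dir / f"{raw_file.stem}.{name}.raw"
        subprocess.run([enc, str(raw_file), str(bitstream)], check=True)
        subprocess.run([dec, str(bitstream), str(decoded)], check=True)

if __name__ == "__main__":
    out = Path("processed")
    out.mkdir(exist_ok=True)
    for f in sorted(Path("raw_audio").glob("*.raw")):
        process_file(f, out)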
Double blinding process
[Diagram: double blinding of the decoded output files]
The nine decoded output files per condition (LPC10e, CVSD, CELP, FR1200, FR2400, TU1200, TU2400, US1200, US2400) were first blinded by NC3A (renamed Coder1 to Coder9) and then blinded again by DSTL (renamed Vocoder1 to Vocoder9) before being sent to the test labs.
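A minimal sketch of one blinding pass, under the assumption that the coder name is embedded in each file name; the helper and key-file format are illustrative, not the actual NC3A/DSTL procedure. Running the same routine twice, with the two key files held by different organisations, gives the double blind.

import csv
import random
from pathlib import Path

def blind(files_by_coder: dict[str, list[Path]], label: str, key_file: Path) -> None:
    """One blinding pass: assign each real coder a random anonymous label
    (e.g. Coder1..Coder9), rename its files accordingly, and record the
    mapping in a key file withheld from the test labs."""
    coders = sorted(files_by_coder)
    labels = [f"{label}{i + 1}" for i in range(len(coders))]
    random.shuffle(labels)
    with key_file.open("w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["real_name", "blinded_name"])
        for coder, blinded in zip(coders, labels):
            writer.writerow([coder, blinded])
            for f in files_by_coder[coder]:
                f.rename(f.with_name(f.name.replace(coder, blinded)))

# First pass (NC3A):  blind(files, "Coder", Path("nc3a_key.csv"))
# Second pass (DSTL): blind(blinded_files, "Vocoder", Path("dstl_key.csv"))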
Modulated Noise Reference Unit
• MNRU is a standard method to apply known levels of noise. It provides known references against which listeners can compare vocoder outputs.
[Diagram: MNRU processing]
In addition to the nine coded/decoded output files, each raw audio file was passed through the MNRU at eight levels (5, 10, 15, 20, 25, 30, 35 and 40 dB), giving 17 raw audio output files per input file. The MNRU files were sent to the test labs as references for analysing speech quality.
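For reference, the MNRU (ITU-T Rec. P.810) adds speech-correlated, multiplicative noise at a chosen ratio Q. A minimal numpy sketch of the commonly cited form, assuming 16-bit linear PCM input; the function name and the use of Gaussian noise are assumptions of this sketch, not the exact unit used in the tests.

import numpy as np

def mnru(speech: np.ndarray, q_db: float, rng=np.random.default_rng()) -> np.ndarray:
    """Apply speech-correlated noise at a ratio of q_db:
    y[n] = x[n] * (1 + 10**(-Q/20) * noise[n])."""
    x = speech.astype(np.float64)
    noise = rng.standard_normal(x.size)      # unit-variance noise
    y = x * (1.0 + 10.0 ** (-q_db / 20.0) * noise)
    return np.clip(y, -32768, 32767).astype(np.int16)

# The eight reference conditions used in the tests, 5 dB to 40 dB in 5 dB steps:
# references = {q: mnru(raw_samples, q) for q in range(5, 45, 5)}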
NATO NBVC tests - Phase 2
• Fixed point implementation
  • C plus ETSI libraries
• Performance measurements
  • Intelligibility, quality
  • Speaker recognition
  • Language dependency
• 10 acoustic noise environments
• Transmission channel
  • 1% BER
• Tandem
  • 16 kbps CVSD - vocoder
• Whispered speech
  – English, French, German, Dutch, Polish, Turkish
Phase 2 additional test conditions

Test configuration: 1% bit error rate
[Diagram] Audio input file -> Coder n -> bitstream with 1% random bit errors -> Decoder n -> audio output file

Test configuration: voice coder tandem
[Diagram] Audio input file -> CVSD coder -> bits -> CVSD decoder -> audio -> Coder n -> bits -> Decoder n -> audio output file
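A minimal sketch of how the 1% random bit error condition can be simulated on a coded bitstream; the function and the use of numpy are illustrative assumptions, not the tooling actually used in the tests.

import numpy as np

def add_bit_errors(bitstream: bytes, ber: float = 0.01,
                   rng=np.random.default_rng()) -> bytes:
    """Flip each bit of the coded bitstream independently with probability
    `ber`, modelling the 1% bit error rate channel condition."""
    bits = np.unpackbits(np.frombuffer(bitstream, dtype=np.uint8))
    flips = (rng.random(bits.size) < ber).astype(np.uint8)
    return np.packbits(bits ^ flips).tobytes()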
NATO NBVC tests - Phase 2
Noise conditions: Phase 1 plus
• MCE field shelter
• HMMWV
• Bradley Fighting Vehicle
• Leclerc tank
• Volvo (staff car)
• Black Hawk helicopter
• Mirage 2000
• F-15
NATO NBVC Phase 2
3 test labs
x 9 coders (+ 8 MNRU levels)
x 5 tests
x 12 noise conditions
x 88 files per test
• Over 36,000 files
• Over 30 GB of processed speech data
• Approximately 500 hours of speech
  – some voice coders ran at approximately 10 times real time
Need for Precision Weighted Ranking
• Graphs show variation between intelligibility tests performed by the 3 test labs (Arcon, CELAR, TNO)
• General trends are the same
• Absolute scores vary
• Need to combine all results accurately and fairly
• Simple scaling is not sufficient
[Graphs: intelligibility scores per lab for each coder (US24, CELP, FR24, CVSD, TU24, US12, LPC, TU12, FR12) in the Quiet and Black Hawk conditions]
Precision Weighted Ranking
• Range of test results divided into segments or bins
• The resolution (or 95% confidence interval) of the test determines bin size
• Coder scores are determined by which bin their test result falls into
• Worst coder always scores 1. In this test Vocoder 7 came last
• Coders in subsequent intervals score the bin number
• Scores for vocoders 6, 8 and 9 were 4 - 5 confidence intervals above that of V7. They all score 5
[Chart: score vs interval for vocoders V1-V9, with one confidence interval per bin and trend line y = 0.0341x + 0.1949]
Bin boundaries in this example:
  Bin 1: 0.2238 - 0.4263
  Bin 2: 0.4263 - 0.6357
  Bin 3: 0.6357 - 0.8522
  Bin 4: 0.8522 - 1.0762
  Bin 5: 1.0762 - 1.3077
  Bin 6: 1.3077 - 1.5472
  Bin 7: 1.5472 - 1.7948
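A minimal sketch of the scoring rule described above, under the assumption that the bins start at the worst coder's result and that each bin is one 95% confidence interval wide, with the interval allowed to vary with the result (as the trend line on the chart suggests). Higher results are assumed to be better.

def precision_weighted_rank(results: dict[str, float], conf_interval) -> dict[str, int]:
    """Divide the range of results into bins one confidence interval wide,
    starting at the worst result; the worst coder scores 1 and every other
    coder scores the number of the bin its result falls into.
    `conf_interval` is a function giving the test's 95% confidence interval
    at a given result value (it may ignore its argument if constant)."""
    worst, best = min(results.values()), max(results.values())
    edges = [worst]
    while edges[-1] <= best:
        edges.append(edges[-1] + conf_interval(edges[-1]))
    def score(x: float) -> int:
        return next(i for i, edge in enumerate(edges[1:], start=1) if x < edge)
    return {coder: score(x) for coder, x in results.items()}

# Example with the interval taken from the chart's trend line (an assumption
# about how the bin boundaries in the table were generated):
# ranks = precision_weighted_rank(lab_scores, lambda x: 0.0341 * x + 0.1949)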
Combined Performance Index
[Table: the weighting scheme used to combine all Phase 2 test results into a single Combined Performance Index per coder]

Coder weights: 2400 bps score 60%, 1200 bps score 40%.

Performance characteristics, weights and test methods:
• Intelligibility, 41.8% - DRT (US), CVC (NL), Inteltrans (FR)
• Quality, 34.2% - MOS (US), MOS (NL)
• Whispered speech, 2.2% - SRT (NL)
• Quality under bit errors, 1.8% - MOS (NL)

Condition weights for intelligibility:
• Baseline 27.4% (Quiet 100%)
• Acoustic noise 56.8% (SNR 12 dB, SNR 6 dB, office, MCE field shelter, HMMWV or P4, M2A2 Bradley or Leclerc, UH60 Black Hawk, F15 or Mirage 2000, Volvo: 11.1% each)
• Transmission channel 7.4% (1% random bit errors)
• Tandem 8.4% (CVSD => coder)

Condition weights for quality:
• Baseline 42.1% (Quiet 100%)
• Acoustic noise 52.6% (SNR 12 dB 14%, SNR 6 dB 10%, office 20%, MCE field shelter 16%, HMMWV 10%, F15 10%, Volvo 20%)
• Tandem 5.3% (CVSD => coder)

Whispered speech and quality under bit errors each use a single condition (100%). The table also gives the resulting weight of every individual condition within the overall index (condition weight x condition-type weight x characteristic weight x coder weight).
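A minimal sketch of how such a weighting table rolls up into a single figure per coder. The characteristic weights are those shown above; the data layout and the assumption that each characteristic's score has already been combined over its conditions are illustrative only.

# Characteristic weights taken from the table above (any remaining weight
# belongs to characteristics not reproduced here).
CHARACTERISTIC_WEIGHTS = {
    "intelligibility": 0.418,
    "quality": 0.342,
    "whispered_speech": 0.022,
    "quality_bit_errors": 0.018,
}

def combined_performance_index(ranked_2400: dict[str, float],
                               ranked_1200: dict[str, float]) -> float:
    """Combined Performance Index for one coder: 60% of its 2400 bps score
    plus 40% of its 1200 bps score, each rate's score being the weighted sum
    of the precision weighted ranks per performance characteristic."""
    def rate_score(ranked: dict[str, float]) -> float:
        return sum(CHARACTERISTIC_WEIGHTS[c] * r for c, r in ranked.items())
    return 0.6 * rate_score(ranked_2400) + 0.4 * rate_score(ranked_1200)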
Phase 2 Combined Performance Index
[Bar chart: combined performance index of the US, FR and TU candidates]
• Selection made on combined scores at 2400 and 1200 bps
  • 60% - 2400 bps score
  • 40% - 1200 bps score
Phase 2 Combined Performance Index
[Bar chart: Phase 2 combined performance index scores]
Specific Results - Intelligibility
• Results of all coders in all noise conditions (TNO CVC test)
[Graph: TNO CVC intelligibility score (%) per coder (US24, CELP, FR24, CVSD, TU24, US12, LPC, TU12, FR12) in the Quiet, 6 dB babble, 12 dB babble, Tandem and BER conditions]
[Graph: TNO CVC intelligibility score (%) per coder in the Office, MCE, HMMWV, Bradley, Black Hawk, F15 and Auto noise conditions]
Specific Results - Speech Quality
• Results of all coders in all noise conditions (ARCON MOS test)
• Range of the Mean Opinion Score test:
  • 1 (Bad)
  • 2 (Poor)
  • 3 (Fair)
  • 4 (Good)
  • 5 (Excellent)
[Graph: ARCON Mean Opinion Score per coder (US24, CELP, FR24, CVSD, TU24, US12, LPC, TU12, FR12) in the Quiet, 6 dB babble, 12 dB babble and Tandem conditions]
[Graph: ARCON Mean Opinion Score per coder in the Office, MCE, HMMWV, Bradley, Black Hawk, F15 and Auto noise conditions]
Specific Results - Language Dependency
• Language dependency of all tested coders (French-English)
• The closer a point lies to the x=y diagonal, the less language dependent the voice coder
[Scatter plot: English SRT (dB SNR) versus French SRT (dB SNR) for each coder, R² = 0.71]
Current position
• Phase 1
  • Completed
  • Results available in NC3A Technical Note-881
• Phase 2
  • All material processed and analysed
  • Results collated
  • Results analysed and blinding removed
  • Coder selected on 24 October 2001
  • STANAG 4591 coder now known: MELPe
NC3A - Current activity
• Test Process Phase 3
  • Real-time implementation of Phase 2 winner
  • Communicability tests
    – real-life communication problem
    – end-to-end delay effects
• Assist in drafting STANAG 4591
• Advise on the use and implementation of STANAG 4591
STANAG 4591 vs COTS voice coders
[Bar chart: scores for COTS X, COTS Y, COTS Z and MELPe in quiet and at 12 dB and 6 dB SNR, for a male and a female speaker]
COTS X = 6 kbps
COTS Y = 4.56 kbps
COTS Z = 4.56 kbps
MELPe = 2.4 kbps
Conclusion
• STANAG 4591 provides
• substantially improved performance
– speech quality
– intelligibility
– noise immunity
• reduced throughput requirements
• interoperability
Further information
STANAG 4591 test and selection process
Street MD, “Future NATO narrow band voice coder selection: Stanag 4591”, NC3A Technical Note 881,
The Hague, December 2001
http://nc3a.info/Voice
Street MD and Collura JS, “Interoperable Voice Communications: test and selection of STANAG 4591”,
RTA IST Symposium - NATO Research and Technology Agency (Information Systems and Technology
panel) Tactical Military Communications symposium, Warsaw, October 2001
http://www.rta.nato.int/IST.htm
Street MD and Collura JS, “The test and selection of the future NATO narrow band voice coder”, RCMCIS
- NATO Regional Conference on Military CIS, Warsaw, Zegrze, October 2001.
http://www.wil.waw.pl/ses3.htm
MELPe: the selected voice coder
Collura JS and Rahikka DJ, “Interoperable secure voice communications in tactical systems”, IEE coll. on
Speech coding algorithms for radio channels, London, February 2000.
An overview of the MELP voice coder and its use in military environments
http://www.iee.org/OnComms/pn/communications
Collura JS, Rahikka DJ, Fuja TE, Sridhara D and Fazel T, “Error coding strategies for MELP vocoder in
wireless and ATM environments”, IEE coll. on Speech coding algorithms for radio channels, London,
February 2000.
Performance of MELP with a variety of different error correction mechanisms
http://www.iee.org/OnComms/pn/communications
Information and Source Code available from:
http://elayne.nc3a.nato.int/S4591/
Applied Communication Technologies Branch
CIS Division
NATO C3 Agency
PO Box 174
2501 CD, The Hague
The Netherlands
Tel: +31 70 374 3043
Fax: +31 70 374 3049
Email: [email protected]