Towards Synthesis of Focus in Mandarin Text-to-speech System Dr. Dezhi HUANG [email protected] SNLP Unit, FTRD Beijing 2005/11/2 V1.1
Towards Synthesis of Focus in Mandarin
Text-to-speech System
Dr. Dezhi HUANG
[email protected]
SNLP Unit, FTRD Beijing
2005/11/2 V1.1
Table of Contents
1 Synthesis of focus
2 Proposal for SSML
3 Examples with <focus>
Humans have a strong ability to reconstruct information
Evidence from music perception
The “Butterfly Lovers” violin concerto
Humans have a strong ability to reconstruct information (Cont.)
Evidence from human vision
Application model of Mandarin Text-to-speech (Cont.)
Spoken dialog system
Information query at the roadside
[Diagram: PSTN/Wireless, Mandarin Voice-enabled Service Gateway, Mandarin TTS Engine; labels: Angry, Environment Noise]
Why do we fail?
The important content is not as prominent as we expect
Weaken the background noise (noise reduction)
Improve the prominence of the information that we need
Utilize the human ability to reconstruct information
What do we need in speech communication?
The key information is always contained in a phrase or word in a sentence
Do you see Prof. Zhao often?
No, I have seen him only once.
The container of the key information is called the focus.
It is the semantic centre of a sentence
The value of synthesis of focus
It is helpful for
Analyzing the syntactic structure of a sentence
Understanding the meaning of an utterance
Capturing turn-taking
Comprehending the intent and emotion of the speaker
Improving the acceptance of TTS
Key challenges in synthesis of focus
It is difficult to locate a focus in a sentence
Some focuses can be found from the syntactic structure
明天你准备去买什么?我要去买红色的帽子。 (What are you going to buy tomorrow? I am going to buy a red hat.)
Other focuses are decided by the context of the sentence
老王去年退休了。
老王去年退休了。
老王去年退休了。
(Lao Wang retired last year.)
Markup Language for Focus
There is no appropriate acoustic model to realize a focus
Pitch accent
Duration
Energy
Pause
Weakness
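The acoustic cues listed above can be pictured as a per-word prosody plan. The following is only a minimal sketch under stated assumptions: the `Prosody` class, the `plan_prosody` function, and every numeric value are hypothetical placeholders rather than figures from the slides, and "Weakness" is read here as weakening the non-focused words.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Prosody:
    pitch_scale: float = 1.0     # multiplier on the F0 contour
    duration_scale: float = 1.0  # multiplier on segment duration
    energy_scale: float = 1.0    # multiplier on amplitude
    pause_before_ms: int = 0     # silence inserted before the word


# Placeholder settings chosen for illustration only (assumptions, not measurements).
FOCUS = Prosody(pitch_scale=1.2, duration_scale=1.15, energy_scale=1.1)
BACKGROUND = Prosody(pitch_scale=0.95, duration_scale=0.95, energy_scale=0.9)


def plan_prosody(words, focus_index, pause_ms=0):
    """Return (word, Prosody) pairs with the focused word made prominent."""
    plan = []
    for i, word in enumerate(words):
        if i == focus_index:
            # The focused word gets the boosted settings plus an optional pause.
            plan.append((word, replace(FOCUS, pause_before_ms=pause_ms)))
        else:
            # Every other word is slightly weakened.
            plan.append((word, BACKGROUND))
    return plan


# Focus placed on the second word purely as an example.
for word, prosody in plan_prosody(["老王", "去年", "退休了"], focus_index=1, pause_ms=150):
    print(word, prosody)
```

Moving `focus_index` to another word yields a different plan for the same sentence, which is the situation where context rather than syntax decides the focus.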
Table of Contents
1 Synthesis of focus
Make the synthesized speech clear
Improve the validity of speech communication with TTS
2 Proposal for SSML
3 Examples with <focus>
What is SSML?
It is designed to provide a rich, XML-based markup language for assisting the generation of synthetic speech in Web and other applications
[Diagram labels: Natural Language Processing and Understanding, SSML, Speech Synthesis]
<EMPHASIS> in SSML
The emphasis element requests that the contained text be spoken with emphasis (also referred to as prominence or stress)
Level: strong, moderate, none and reduced
For the synthesizer, it is then easy to know which word carries sentence stress
老王买了车。
老王买了车。
(Lao Wang bought a car.)
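As a rough illustration of the emphasis element, the sketch below builds a small SSML fragment with Python's standard `xml.etree.ElementTree`. The helper name `emphasize` is hypothetical, the choice of emphasized word is an assumption (the slide's highlighting is not recoverable), and the fragment omits the SSML namespace declaration for brevity.

```python
import xml.etree.ElementTree as ET

# Level values defined for <emphasis> in SSML 1.0.
LEVELS = {"strong", "moderate", "none", "reduced"}


def emphasize(before, word, after, level="strong"):
    """Wrap `word` in an <emphasis> element inside a simplified <speak> fragment."""
    if level not in LEVELS:
        raise ValueError(f"unknown emphasis level: {level!r}")
    speak = ET.Element("speak", {"version": "1.0", "xml:lang": "zh-CN"})
    sentence = ET.SubElement(speak, "s")
    sentence.text = before               # text preceding the emphasized word
    em = ET.SubElement(sentence, "emphasis", {"level": level})
    em.text = word                       # the emphasized word itself
    em.tail = after                      # text following the emphasized word
    return ET.tostring(speak, encoding="unicode")


# Two readings of the same sentence, differing only in which word is stressed.
print(emphasize("", "老王", "买了车。"))
print(emphasize("老王买了", "车", "。", level="moderate"))
```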
The proposed <focus> element
The focus element indicates that the contained text is the semantic centre of a sentence and the carrier of its important information
From the perspective of pragmatics
Contrastive focus (also referred to as identificational focus)
Informational focus (also referred to as presentational focus or natural focus)
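Samples of focus
(1) 你经常见赵教授吗? (Do you see Prof. Zhao often?)
我见过他一次。 (I have seen him once.)
(2) 昨天老张干什么了? (What did Lao Zhang do yesterday?)
昨天老张去看病。 (Yesterday Lao Zhang went to see a doctor.)
(3) 是老张帮我修了车。 (It was Lao Zhang who repaired my car.)
(4) 他连我也不相信。 (He does not trust even me.)
(5) 他经常和我打球。 (He often plays ball with me.)
(6) 他居然卖了房子。 (He actually sold his house.)
(7) 我们去钓鱼吧。 (Let's go fishing.)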
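A focus in Mandarin does not correspond one-to-one with an emphasis
Most focuses are realized by stress
是老张退休了。 (It is Lao Zhang who retired.)
明天最高气温多少度?明天最高气温30度。 (What is tomorrow's highest temperature? Tomorrow's highest temperature is 30 degrees.)
Some of them are realized by pause or intonation
你常常见赵老师吗?我见过他一次。 (Do you see Teacher Zhao often? I have seen him once.)
我们下象棋吧。 (Let's play Chinese chess.)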
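Differences between focus and emphasis
Focus is a concept of semantics and pragmatics
We can mark up the focus without the speech signal
国家工商总局昨天发出紧急通知强调,全国大中城市、边境地区、发生过疫情的地区、养殖大省四类区域必须建立健全禽类产品“挂牌经营”制度,市场内禽类产品要标明禽类生产地、动物检验检疫证明及销售承诺。
(The State Administration for Industry and Commerce issued an urgent notice yesterday stressing that four categories of areas, namely large and medium-sized cities nationwide, border regions, regions where outbreaks have occurred, and major poultry-farming provinces, must establish and improve a “listed operation” system for poultry products; poultry products in markets must be labelled with their place of production, animal inspection and quarantine certificate, and sales commitment.)
Emphasis is a concept of psychoacoustics
Consistent emphasis labeling is relatively difficult to achieve without the speech signal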
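Differences between focus and emphasis (Cont.)
Focus always carries the purpose of the utterance
We can know exactly what the sentence means
Emphasis is not directly linked to the purpose of the utterance
The emphasized word may be trivial
黄菊强调,认真学习贯彻五中全会精神,继续推进国有商业银行改革。 (Huang Ju stressed the need to earnestly study and implement the spirit of the Fifth Plenary Session and to continue advancing the reform of the state-owned commercial banks.)
他经常和我打球。 (He often plays ball with me.)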
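What can we gain from focus labeling?
Improve the intelligibility of synthesized speech, especially in noisy communication environments
Q:明天最晚一班到北京的飞机是几点? (When is the last flight to Beijing tomorrow?)
A:在晚上9点钟有一班CZ8071的飞机飞往北京。 (There is a flight, CZ8071, to Beijing at 9 p.m.)
Q:几点钟? (What time?)
A:是9点。 (At 9 o'clock.)
Q:哪一班? (Which flight?)
A:是CZ8071。 (CZ8071.)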
What can we gain from focus labeling? (Cont.)
Focus labeling can be directly applied to text information processing
The next generation of search engines will need to know
which is the topic of a paragraph
which are the focuses of a sentence
Text highlighting is an important step in information retrieval
Keywords in an automatic digest are always the focuses
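To make the text-processing use concrete, here is a small sketch of pulling focus spans out of a sentence annotated with the proposed <focus> element, so that they could serve as highlight or digest keywords. The function name `focus_keywords` is hypothetical, and the placement of the <focus> tags in the sample sentence is an assumed annotation, not one given on the slides.

```python
import xml.etree.ElementTree as ET


def focus_keywords(marked_sentence):
    """Return the text of every <focus> span, in document order."""
    # Wrap the fragment in a dummy root so it parses as well-formed XML.
    root = ET.fromstring(f"<s>{marked_sentence}</s>")
    return ["".join(el.itertext()) for el in root.iter("focus")]


# Assumed annotation of the flight-query answer from the earlier dialog.
answer = (
    '在<focus type="informational">晚上9点钟</focus>有一班'
    '<focus type="informational">CZ8071</focus>的飞机飞往北京。'
)
print(focus_keywords(answer))
```

Under this assumed annotation, the extracted spans are the same items the user asks about in the follow-up questions of the dialog.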
Table of Contents
1 Synthesis of focus
2 Proposal for SSML
<focus> indicates which text is the semantic centre
<focus> solves the problem of focus location
3 Examples with <focus>
Attributes of <focus>
Type
informational
contrastive
Method
StrongStress
ModerateStress
None
Pause
Intonation
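The attribute values above lend themselves to a simple check before synthesis. This is a minimal sketch, assuming that both attributes are required and that values are case-sensitive; neither point is stated on the slide, and the name `validate_focus` is hypothetical.

```python
ALLOWED_TYPES = {"informational", "contrastive"}
ALLOWED_METHODS = {"StrongStress", "ModerateStress", "None", "Pause", "Intonation"}


def validate_focus(attrs):
    """Return a list of problems found in a <focus> element's attributes."""
    problems = []
    focus_type = attrs.get("type")
    method = attrs.get("method")
    # Assumption: a missing attribute is reported as a problem.
    if focus_type not in ALLOWED_TYPES:
        problems.append(f"invalid type: {focus_type!r}")
    if method not in ALLOWED_METHODS:
        problems.append(f"invalid method: {method!r}")
    return problems


print(validate_focus({"type": "contrastive", "method": "StrongStress"}))
print(validate_focus({"type": "informational", "method": "StrongStress "}))
```

The second call shows why stray whitespace inside an attribute value matters: with strict matching, "StrongStress " is not the same value as "StrongStress".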
Samples of <focus>
(1) 你经常见<focus type="informational" method="StrongStress">赵教授</focus>吗?
我见过他<focus type="informational" method="Pause">一次</focus>。
(2) 昨天老张干什么了?
昨天老张<focus type="informational" method="ModerateStress">去看病</focus>。
(3) 是<focus type="contrastive" method="StrongStress">老张</focus>帮我修了车。
Samples of <focus> (Cont.)
(4) 他连<focus type="contrastive" method="StrongStress">我</focus>也不相信。
(5) 他经常<focus type="informational" method="Pause">和我打球</focus>。
(6) 他居然<focus type="informational" method="ModerateStress">卖了房子</focus>。
(7) 我们<focus type="informational" method="Intonation">去钓鱼</focus>吧。
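Since <focus> is a proposal rather than part of SSML 1.0, one way to experiment with such samples on an existing synthesizer would be to rewrite each <focus> into standard SSML elements. The sketch below is only one possible fallback: the mapping from `method` values to <emphasis>, <break> and <prosody>, the 200 ms pause, and the `pitch="high"` setting are all assumptions made for illustration, not part of the proposal.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from stress-like methods to SSML emphasis levels.
STRESS_LEVELS = {"StrongStress": "strong", "ModerateStress": "moderate", "None": "none"}


def focus_to_ssml(marked_sentence, pause_ms=200):
    """Rewrite <focus> elements into standard SSML 1.0 elements."""
    root = ET.fromstring(f"<s>{marked_sentence}</s>")
    for el in list(root.iter("focus")):
        method = el.attrib.get("method", "None").strip()
        el.attrib.clear()  # drop the focus-specific attributes
        if method in STRESS_LEVELS:
            el.tag = "emphasis"
            el.set("level", STRESS_LEVELS[method])
        elif method == "Pause":
            # Keep a neutral wrapper and put a <break> in front of the text.
            el.tag = "emphasis"
            el.set("level", "none")
            brk = ET.Element("break", {"time": f"{pause_ms}ms"})
            brk.tail, el.text = el.text, None
            el.insert(0, brk)
        elif method == "Intonation":
            # Placeholder: a raised pitch stands in for an intonation change.
            el.tag = "prosody"
            el.set("pitch", "high")
        else:
            raise ValueError(f"unknown method: {method!r}")
    return ET.tostring(root, encoding="unicode")


print(focus_to_ssml('是<focus type="contrastive" method="StrongStress">老张</focus>帮我修了车。'))
print(focus_to_ssml('我见过他<focus type="informational" method="Pause">一次</focus>。'))
```

Such a rewrite discards the `type` attribute, so the distinction between informational and contrastive focus is lost; it preserves only the requested realization.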
Thank you!