Towards Synthesis of Focus in Mandarin Text-to-speech System Dr. Dezhi HUANG [email protected] SNLP Unit, FTRD Beijing 2005/11/2 V1.1
Table of Contents
  1 Synthesis of focus
  2 Proposal for SSML
  3 Examples with <focus>

Humans have a strong ability to reconstruct information
  Evidence from music perception
  The "Butterfly Lovers" violin concerto

Humans have a strong ability to reconstruct information (Cont.)
  Evidence from human vision

Application model of Mandarin Text-to-speech (Cont.)
  Spoken dialog system
  Information query by the side of the road
  [Diagram labels: Mandarin Voice-enabled Service Gateway; PSTN/Wireless; Mandarin TTS Engine; Angry; Environment Noise]

Why do we fail?
  The important content is not as prominent as we expect
  Weaken the background noise (noise reduction)
  Improve the prominence of the information that we need
  Utilize the human ability to reconstruct information

What do we need in speech communication?
  The key information is always contained in a phrase or word of the sentence
    Do you see Prof. Zhao often? No, I have seen him only once.
  The container of the key information is called the focus
  The focus is the semantic centre of a sentence

The value of synthesis of focus
  It is helpful for
    Analyzing the syntax of a sentence
    Understanding the meaning of an utterance
    Capturing turn-taking
    Comprehending the intent and emotion of the speaker
  It improves the acceptance of TTS

Key challenges in synthesis of focus
  It is difficult to locate a focus in a sentence
    Some focuses can be found from the syntactic structure
      明天你准备去买什么?我要去买红色的帽子。 (What are you going to buy tomorrow? I am going to buy a red hat.)
    Other focuses are decided by the context of the sentence
      老王去年退休了。 老王去年退休了。 老王去年退休了。 (Lao Wang retired last year; the focus can fall on a different word depending on the context.)
    Markup Language for Focus
  There is no appropriate acoustic model to realize a focus
    Pitch accent
    Duration
    Energy
    Pause
    Weakness

Table of Contents
  1 Synthesis of focus
    Make the synthesized speech clear
    Improve the validity of speech communication with TTS
  2 Proposal for SSML
  3 Examples with <focus>

What is SSML?
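  It is designed to provide a rich, XML-based markup language for assisting the generation of synthetic speech in Web and other applications
  [Diagram labels: SSML; Natural Language Processing and Understanding; Speech Synthesis]

<emphasis> in SSML
  The emphasis element requests that the contained text be spoken with emphasis (also referred to as prominence or stress)
  Level: strong, moderate and none
  For the synthesizer, it is then easy to know which word carries sentence stress
    老王买了车。 老王买了车。 (Lao Wang bought a car.)

The proposed <focus> element
  The focus element indicates that the contained text is the semantic centre of a sentence and the carrier of its important information
  From the perspective of pragmatics
    Contrastive focus (also referred to as identificational focus)
    Informational focus (also referred to as presentational focus or natural focus)

Samples of focus
  (1) 你经常见赵教授吗? 我见过他一次。 (Do you often see Prof. Zhao? I have seen him once.)
  (2) 昨天老张干什么了? 昨天老张去看病。 (What did Lao Zhang do yesterday? Yesterday Lao Zhang went to see a doctor.)
  (3) 是老张帮我修了车。 (It was Lao Zhang who fixed my car.)
  (4) 他连我也不相信。 (He does not trust even me.)
  (5) 他经常和我打球。 (He often plays ball with me.)
  (6) 他居然卖了房子。 (He actually sold his house.)
  (7) 我们去钓鱼吧。 (Let's go fishing.)

A focus in Mandarin does not correspond one-to-one with an emphasis
  Most focuses are realized by stress
    是老张退休了。 (It is Lao Zhang who retired.)
    明天最高气温多少度?明天最高气温30度。 (What is tomorrow's highest temperature? Tomorrow's highest temperature is 30 degrees.)
  Some are realized by pause or intonation
    你常常见赵老师吗?我见过他一次。 (Do you often see Teacher Zhao? I have seen him once.)
    我们下象棋吧。 (Let's play Chinese chess.)

Differences between focus and emphasis
  Focus is a concept of semantics and pragmatics
    We can mark up the focus without the speech signal
    国家工商总局昨天发出紧急通知强调,全国大中城市、边境地区、发生过疫情的地区、养殖大省四类区域必须建立健全禽类产品“挂牌经营”制度,市场内禽类产品要标明禽类生产地、动物检验检疫证明及销售承诺。
    (The State Administration for Industry and Commerce issued an urgent notice yesterday stressing that four types of areas, namely large and medium-sized cities nationwide, border areas, areas where outbreaks have occurred, and major poultry-farming provinces, must establish a sound "labeled sales" system for poultry products, and that poultry products in markets must indicate their place of production, animal inspection and quarantine certificate, and sales commitment.)
  Emphasis is a concept of psychoacoustics
    Consistent emphasis labels are relatively difficult to achieve without the speech signal

Differences between focus and emphasis (Cont.)
  Focus always carries the purpose of the utterance
    We can know exactly what the sentence means
  Emphasis is not directly linked to the purpose of the utterance
    The emphasized word may be trivial
    黄菊强调,认真学习贯彻五中全会精神,继续推进国有商业银行改革。 (Huang Ju stressed the need to earnestly study and implement the spirit of the Fifth Plenary Session and to continue advancing the reform of state-owned commercial banks.)
    他经常和我打球。 (He often plays ball with me.)

What can we gain from focus labeling?
  Improve the intelligibility of synthesized speech, especially in noisy communication environments
  Q: 明天最晚一班到北京的飞机是几点? (What time is the last flight to Beijing tomorrow?)
  A: 在晚上9点钟有一班CZ8071的飞机飞往北京。 (There is a flight, CZ8071, to Beijing at 9 p.m.)
  Q: 几点钟? (What time?)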
  A: 是9点。 (At 9 o'clock.)
  Q: 哪一班? (Which flight?)
  A: 是CZ8071。 (CZ8071.)

What can we gain from focus labeling? (Cont.)
  Focus labeling can be directly applied to text information processing
  The next generation of search engines will need to know
    which is the topic of a paragraph
    which are the focuses of a sentence
  Text highlighting is an important step for information retrieval
  Keywords in automatic digests are always the focuses

Table of Contents
  1 Synthesis of focus
  2 Proposal for SSML
    <focus> indicates what the semantic centre is
    <focus> solves the problem of focus location
  3 Examples with <focus>

Attributes of <focus>
  Type: informational, contrastive
  Method: StrongStress, ModerateStress, None, Pause, Intonation

Samples of <focus>
  (1) 你经常见<focus type="informational" method="StrongStress">赵教授</focus>吗? 我见过他<focus type="informational" method="Pause">一次</focus>。
  (2) 昨天老张干什么了? 昨天老张<focus type="informational" method="ModerateStress">去看病</focus>。
  (3) 是<focus type="contrastive" method="StrongStress">老张</focus>帮我修了车。

Samples of <focus> (Cont.)
  (4) 他连<focus type="contrastive" method="StrongStress">我</focus>也不相信。
  (5) 他经常<focus type="informational" method="Pause">和我打球</focus>。
  (6) 他居然<focus type="informational" method="ModerateStress">卖了房子</focus>。
  (7) 我们<focus type="informational" method="Intonation">去钓鱼</focus>吧。

Thank you!
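
As a rough illustration of how a front end might consume the proposed markup, the sketch below reads one sentence annotated with <focus>, checks the type and method values against those listed on the attributes slide, and returns the plain text together with the focus spans. It is not part of the slides: the function name extract_focus, the wrapping <s> root element, and the returned dictionary layout are assumptions made for this example, and <focus> itself is the proposed element rather than part of the SSML standard. Nested <focus> elements are not handled.

```python
import xml.etree.ElementTree as ET

# Attribute values listed on the "Attributes of <focus>" slide.
FOCUS_TYPES = {"informational", "contrastive"}
FOCUS_METHODS = {"StrongStress", "ModerateStress", "None", "Pause", "Intonation"}


def extract_focus(marked_sentence):
    """Return (plain_text, spans) for one sentence annotated with <focus>.

    Hypothetical helper: each span is a dict holding the focused text, its
    type and method, and its character offsets in the plain text.
    """
    # Wrap the fragment in a root element so that it is well-formed XML.
    root = ET.fromstring("<s>" + marked_sentence + "</s>")
    plain = root.text or ""
    spans = []
    for child in root:
        if child.tag != "focus":
            raise ValueError("unexpected element: " + child.tag)
        ftype = child.get("type")
        method = child.get("method")
        if ftype not in FOCUS_TYPES:
            raise ValueError("unknown focus type: %r" % ftype)
        if method not in FOCUS_METHODS:
            raise ValueError("unknown focus method: %r" % method)
        text = "".join(child.itertext())
        start = len(plain)
        plain += text
        spans.append({"text": text, "type": ftype, "method": method,
                      "start": start, "end": len(plain)})
        # Text that follows the closing </focus> tag belongs to the sentence.
        plain += child.tail or ""
    return plain, spans


sample = '是<focus type="contrastive" method="StrongStress">老张</focus>帮我修了车。'
plain, spans = extract_focus(sample)
print(plain)
print(spans)
```

Keeping the character offsets alongside the attribute values means that a later prosody stage could look up which characters to stress or pause around without re-parsing the markup; how those methods are realized acoustically is left open here, as it is in the slides.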