Suggestions on Tone and Word Boundary of Mandarin for SSML LOU Xiaoyan, LI Jian Research and Development Center, Toshiba China Toshiba (China) R&D Center.


Suggestions on Tone and Word Boundary
of Mandarin for SSML
LOU Xiaoyan, LI Jian
Research and Development Center, Toshiba
China
Toshiba (China) R&D Center
Outline


Tone
Word boundary
Tone (cont…)

Importance


Tones are as important as phonemes in a tonal language
The same syllable with different tones carries different
meanings:
妈(mā) 麻(má) 马(mǎ) 骂(mà)

Tone sandhi occurs in tonal languages
你好 ni3 hao3 → ni2 hao3


Synthesis with correct tones helps the listener catch the
meaning of the speech
Non-markup behavior


Tones can be obtained by looking up a dictionary or
applying rules.
Errors may occur, especially when dealing with sandhi
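A rule-based approach to the 你好 example can be sketched as follows. This is a minimal illustration only: the function name and the data layout (pairs of Pinyin syllable and tone number) are assumptions, and the rule is deliberately simplified. Real third-tone sandhi also depends on word and phrase structure, which is one reason rule-based results can be wrong.

```python
def apply_third_tone_sandhi(syllables):
    """Apply a simplified third-tone sandhi rule.

    syllables: list of (pinyin, tone) pairs, tone in 1..5.
    Simplified rule (an assumption for illustration): a tone 3 that is
    immediately followed by a syllable whose lexical tone is 3 becomes tone 2.
    """
    result = []
    for i, (pinyin, tone) in enumerate(syllables):
        # Look at the lexical (unchanged) tone of the next syllable, if any.
        next_tone = syllables[i + 1][1] if i + 1 < len(syllables) else None
        if tone == 3 and next_tone == 3:
            tone = 2
        result.append((pinyin, tone))
    return result


print(apply_third_tone_sandhi([("ni", 3), ("hao", 3)]))
# [('ni', 2), ('hao', 3)]
```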
Suggestion on Tone (cont…)

Our suggestions


Use the Pinyin sequence as the value of the phoneme
element
Use the numbers 1, 2, 3, 4 and 5 to stand for the tones “yin
ping”, “yang ping”, “shang sheng”, “qu sheng” and the
neutral tone in Mandarin:
Text: 大都(dàdōu)
Pinyin sequence + tone: /da 4/dou 1/

Solution 1: a new tone element (optional), with a
required attribute detail:
<tone detail="4 1">大都</tone>

Solution 2: new values “t” and “pt” for the alphabet
attribute of the phoneme element
<phoneme alphabet="t" ph="4 1">大都</phoneme>
<phoneme alphabet="pt" ph="da 4/dou 1">大都</phoneme>
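To make the two proposed ph formats concrete, the sketch below shows how a processor might read them. The function names are hypothetical, and the parsing rules (syllables separated by "/", syllable and tone separated by a space) are inferred from the examples above rather than taken from any specification.

```python
def parse_t(ph):
    """Parse a tone-only string such as "4 1" into [4, 1]."""
    return [int(t) for t in ph.split()]


def parse_pt(ph):
    """Parse a Pinyin-plus-tone string such as "da 4/dou 1".

    Returns a list of (pinyin, tone) pairs. Leading and trailing
    slashes, as in "/da 4/dou 1/", are tolerated.
    """
    pairs = []
    for chunk in ph.strip("/ ").split("/"):
        pinyin, tone = chunk.split()
        pairs.append((pinyin, int(tone)))
    return pairs


print(parse_t("4 1"))            # [4, 1]
print(parse_pt("da 4/dou 1"))    # [('da', 4), ('dou', 1)]
```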
Note on Tone Markup

Possible influence on SSML 1.0

Solution 1: the tone element cannot contain other
elements, and can be enclosed by the p, s and w (if
defined) elements
Solution 2: only the phoneme element is modified; its
relation to other elements should not change

The tone strings given by markup cannot be changed
in the text normalization step
in the result of looking up the lexicon

Tone markup should be neglected in case of
a value error in a tone
an unmatched length of the tone sequence
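The two conditions for neglecting tone markup could be checked along these lines. This is a sketch under stated assumptions: the function name is hypothetical, the contained text is assumed to consist only of Chinese characters, and exactly one tone is expected per character.

```python
VALID_TONES = {1, 2, 3, 4, 5}  # 5 stands for the neutral tone


def usable_tones(text, detail):
    """Return the tone list if the markup is usable, else None.

    None means the markup should be neglected and the synthesizer
    should fall back to its non-markup behavior.
    """
    text = text.strip()
    try:
        tones = [int(t) for t in detail.split()]
    except ValueError:
        return None  # value error: not a number
    if any(t not in VALID_TONES for t in tones):
        return None  # value error: outside 1..5
    if len(tones) != len(text):
        return None  # unmatched length of tone sequence
    return tones


print(usable_tones("大都", "4 1"))  # [4, 1]
print(usable_tones("大都", "4 6"))  # None
print(usable_tones("大都", "4"))    # None
```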
Word Boundary (cont…)



The word is the basic unit for sentence parsing and
understanding.
Chinese sentences are composed of sequences of
Chinese characters, without blanks or spaces to mark
word boundaries.
Difficulties:
Complex words, such as reduplications and derived words, e.g.
“简简单单” (very simple), “非物质” (non-material)
Proper nouns, such as location names and person names
Ambiguous word segmentations:
A: 上海 是 个 大都会。(Shanghai is a metropolis)
B: 上海人 大都 会 那么 说。(Most Shanghainese will say that)


Non-markup behavior


Determine the boundary using language-specific knowledge
Errors may occur
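To illustrate how automatic segmentation can go wrong on sentence B, here is a sketch using forward maximum matching, a common dictionary-based heuristic. It is chosen only for illustration and is not claimed to be the method used by any particular synthesizer; the tiny lexicon is made up for this example.

```python
def forward_max_match(sentence, lexicon, max_len=4):
    """Greedy segmentation: at each position take the longest lexicon word.

    Single characters are always accepted as a fallback.
    """
    words = []
    i = 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy lexicon covering sentences A and B.
LEXICON = {"上海", "上海人", "大都", "大都会", "是", "个", "会", "那么", "说"}

print(forward_max_match("上海是个大都会", LEXICON))
# ['上海', '是', '个', '大都会']  (matches A)
print(forward_max_match("上海人大都会那么说", LEXICON))
# ['上海人', '大都会', '那么', '说']  (wrong: B needs 大都 / 会)
```

The greedy choice of 大都会 is right for A and wrong for B, which is the kind of error explicit word-boundary markup is meant to prevent.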
Suggestions on Word Boundary (cont…)

A new element w is suggested
<w>都会</w>

An optional attribute detail is also recommended
to mark phrases
<w detail="3 2 1">上海人大都会</w>
Here, the phrase is split into three words, and the numbers of Chinese
characters in these words are 3, 2 and 1.
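Going the other way, an authoring tool that already knows the correct segmentation could generate the proposed markup. The helper below is a hypothetical sketch; it assumes each word consists only of Chinese characters, so that Python's string length equals the character count used in detail.

```python
from xml.sax.saxutils import escape


def build_w_markup(words):
    """Build a <w detail="..."> element from a list of segmented words."""
    detail = " ".join(str(len(word)) for word in words)
    text = "".join(words)
    return f'<w detail="{detail}">{escape(text)}</w>'


print(build_w_markup(["上海人", "大都", "会"]))
# <w detail="3 2 1">上海人大都会</w>
```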
Suggestion on Word Boundary (cont…)

Legal values of the optional attribute detail

The values must not exceed the length of the contained text
<w detail="3">上海</w> (illegal: 3 exceeds the 2 contained characters)

The default value is the length of the contained text
<w>上海</w>

When the sum of the values is smaller than the length of
the contained text, the remaining part is regarded as one
word
<w detail="3">上海人大都会</w>
The first 3 Chinese characters “上海人” are regarded as one
word and the remaining “大都会” is regarded as another word

When the sum of the values is bigger than the length of
the contained text, this markup should be neglected
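The rules for interpreting detail can be sketched as a single function. The function name and the use of None to signal "neglect this markup" are assumptions for illustration; the slides do not say how non-positive or non-numeric values should be treated, so the sketch assumes positive integers.

```python
def split_by_detail(text, detail=None):
    """Split the text of a w element according to its detail attribute.

    Returns a list of words, or None when the markup should be neglected.
    Assumes detail holds positive integers separated by spaces.
    """
    if detail is None:
        return [text]  # default: the whole contained text is one word
    lengths = [int(n) for n in detail.split()]
    if sum(lengths) > len(text):
        return None  # sum exceeds the text length: neglect the markup
    words = []
    pos = 0
    for n in lengths:
        words.append(text[pos:pos + n])
        pos += n
    if pos < len(text):
        words.append(text[pos:])  # the remaining part is one more word
    return words


print(split_by_detail("上海人大都会", "3 2 1"))  # ['上海人', '大都', '会']
print(split_by_detail("上海人大都会", "3"))      # ['上海人', '大都会']
print(split_by_detail("上海"))                   # ['上海']
print(split_by_detail("上海", "3"))              # None
```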
Possible Influence on SSML 1.0

Influence on speech synthesis steps


Word segmentation is suggested to be done before
text parsing and structure analysis
Relation between SSML 1.0 markup and the word
segmentation markup w (needs more discussion)


The p and s elements can contain the w element;
the w element can contain audio, emphasis,
phoneme, prosody, say-as, sub, voice and t (if defined)
<p>
<w detail="2">上海</w>
</p>
<w detail="2"><prosody rate="-10%">上海</prosody></w>大都会
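A processor could check the proposed nesting for w with a few lines of code. The sketch below uses Python's standard xml.etree.ElementTree; the function name is hypothetical, namespaces are ignored, and the allowed set simply copies the list above, which the authors mark as still needing discussion.

```python
import xml.etree.ElementTree as ET

# Child elements the proposal allows inside w ("t" only if it is defined).
ALLOWED_IN_W = {"audio", "emphasis", "phoneme", "prosody",
                "say-as", "sub", "voice", "t"}


def disallowed_children_of_w(fragment):
    """Return the tags of child elements of w that the proposal does not list."""
    root = ET.fromstring(fragment)
    problems = []
    for w in root.iter("w"):  # includes the root itself when it is a w element
        for child in w:
            if child.tag not in ALLOWED_IN_W:
                problems.append(child.tag)
    return problems


print(disallowed_children_of_w(
    '<w detail="2"><prosody rate="-10%">上海</prosody></w>'))  # []
print(disallowed_children_of_w('<w><p>上海</p></w>'))           # ['p']
```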
Thank you!