0x0141 0x0142 Przemysław Zdroik [ pshemýswuf zdróik ] IPA: pʃɛ.mɨ’.swaf zdrɔ’.ik [ pshemek ] TTS system development France Telecom owns Polish Telecom Speech technologies testing and evaluating „advanced user”

0x0141
0x0142
Przemysław Zdroik
[ pshemýswuf zdróik ]
IPA:
pʃɛ.mɨ’.swaf zdrɔ’.ik
[ pshemek ]
TTS system development
France Telecom
owns
Polish Telecom
Speech technologies testing and evaluating
„advanced user”
FT R&D France / Poland
Missing diacritics
Tokenisation
Custom word (lexical) stress
Przemysław Zdroik, Paul Bagshaw
2nd Workshop on Internationalizing SSML, Crete, 31st May 2006
Plan of the presentation
● A few words about diacritics
● Usefulness of tokenisation
● Word stress
● Discussion
Regarding: FT R&D position paper
Text stripped of diacritics
● Occurs in SMSes, e-mails, IM, newsgroup posts
● Additional pronunciation ambiguities appear:
French:
cure => cure vs. curé
Polish:
maki => maki vs. mąki
Czech, Slovak, Slovenian …
Similar problems (incorrect TTS input):
● Lack of accents in Russian texts
● Informal „romanisations” used in SMSes
(Greek, Russian – volapuk encoding)
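The ambiguity above can be illustrated with a minimal sketch. It assumes a toy lexicon containing only the slide's two examples; the names `strip_diacritics`, `build_index` and `LEXICON` are hypothetical and not part of any TTS product. Stripping diacritics maps several words onto one form, so a TTS front end that receives the stripped form has more than one candidate to choose from.

```python
import unicodedata
from collections import defaultdict

# Letters without a canonical decomposition (such as Polish "ł")
# need an explicit mapping; this table is illustrative, not complete.
EXTRA = str.maketrans({"ł": "l", "Ł": "L"})

def strip_diacritics(text):
    """Return text with combining marks removed (as in SMS-style input)."""
    decomposed = unicodedata.normalize("NFD", text)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.translate(EXTRA)

def build_index(lexicon):
    """Map each stripped form to every lexicon word that collapses onto it."""
    index = defaultdict(list)
    for word in lexicon:
        index[strip_diacritics(word)].append(word)
    return index

# Toy lexicon: only the examples from the slide.
LEXICON = ["cure", "curé", "maki", "mąki"]
index = build_index(LEXICON)

print(index["cure"])  # ['cure', 'curé']
print(index["maki"])  # ['maki', 'mąki']
```

Choosing among the candidates would need context (for instance part-of-speech or language-model information), which this sketch does not attempt.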
Question
Should we allow incorrect text as input for TTS?
In other words: should we expect a TTS system to try to correct
input marked as incorrect (e.g. as „sms_content”
or „email”)?
IMHO: it is a „nice to have” feature of TTS/SSML 1.1, but not
an essential one.
Examples of tokenisation usefulness (1/3)
French:
● The word couple “bien que” may be either a locution (to be
considered as a single word; POS=conjunction) or two separate
words (bien POS=adverb; que POS=conjunction).
Il continue <token>bien que</token> ça soit perdu.
Il faut <token>bien</token> <token>que</token> jeunesse se passe.
● The phrase “rendez-vous” may also be considered as one word
(POS=noun) or two words (rendez POS=verb; vous POS=pronoun)
<token>Rendez-vous</token> de la semaine dernière.
<token>Rendez</token><token>-vous</token> au 1 avril?
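To make the effect of the markup concrete, here is a minimal sketch of how a front end might read the examples above. It assumes that text outside `<token>` is split naively on whitespace and that each `<token>` is kept as one unit; the function name `units` and the wrapping `<s>` element are choices made for this illustration only.

```python
import xml.etree.ElementTree as ET

def units(fragment):
    """Split a marked-up fragment into word units.

    Text inside <token> is kept as a single unit; all other text is
    split on whitespace (a simplifying assumption of this sketch).
    """
    root = ET.fromstring(f"<s>{fragment}</s>")
    out = []
    if root.text:
        out.extend(root.text.split())
    for child in root:
        if child.tag == "token":
            out.append("".join(child.itertext()).strip())
        if child.tail:
            out.extend(child.tail.split())
    return out

print(units("Il continue <token>bien que</token> ça soit perdu."))
# ['Il', 'continue', 'bien que', 'ça', 'soit', 'perdu.']
print(units("Il faut <token>bien</token> <token>que</token> jeunesse se passe."))
# ['Il', 'faut', 'bien', 'que', 'jeunesse', 'se', 'passe.']
```

With the locution marked as one token, a later stage can look up “bien que” as a single conjunction instead of an adverb followed by a conjunction.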
Examples of tokenisation usefulness (2/3)
Arabic:
The word فقد (fqd) can be segmented and vowelised in several ways:
● Segmentation 1: f + qd = ‘faqad’ (conjunction ‘fa’ + particle ‘qad’)
● Segmentation 2: fqd = ‘faqada’ (he has lost), active verb; ‘faqida’ (he
was lost), passive verb; ‘fuqdi’ (the loss), noun; etc.
<token>ف</token><token>قد</token> لعب في أكبر النوادي
(He has played in the greatest clubs)
Segmentation, in general, does not have to solve the ambiguities, but
at least, it can decrease the number of possible pronunciations.
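The narrowing effect can be sketched with a small lookup table. The readings are only those listed on the slide (which ends in "etc.", so the table is incomplete), written in transliteration; `READINGS` and `candidates` are hypothetical names used for this illustration.

```python
# Readings from the slide, keyed by segmentation (transliterated).
READINGS = {
    ("fqd",): ["faqada", "faqida", "fuqdi"],
    ("f", "qd"): ["faqad"],
}

def candidates(segmentation=None):
    """Return possible vowelisations.

    Without a segmentation, every reading of every segmentation is
    possible; with one, only the readings of that segmentation remain.
    """
    if segmentation is None:
        merged = []
        for readings in READINGS.values():
            for reading in readings:
                if reading not in merged:
                    merged.append(reading)
        return merged
    return list(READINGS.get(tuple(segmentation), []))

print(len(candidates()))          # 4
print(candidates(["f", "qd"]))    # ['faqad']
print(len(candidates(["fqd"])))   # 3
```

Marking the segmentation f + qd leaves a single reading, while marking fqd as one token still leaves three, which matches the remark that segmentation reduces ambiguity without always resolving it.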
Examples of tokenisation usefulness (3/3)
English:
<token>with</token><token>or</token> without your help
Polish:
<token>z</token><token>lub</token> bez twojej pomocy
Question
How should tokenisation markup affect the rendering of languages that
use spaces for word demarcation?
Word (lexical) stress
● Czech – almost exception-free first-syllable stress
● Slovak – first-syllable stress, some exceptions
● Polish – general rule of penultimate stress,
many exceptional rules
● Russian
● „moving” stress
● no accentuation rules
● acute as a stress indicator - Unicode combining acute accent
U+0301 (e.g. ы́ э́ ю́ я́ )
● In most texts, the acute is omitted
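The Russian case can be sketched in a few lines. The example word pair за́мок (castle) / замо́к (lock) is not from the slides; it is a well-known minimal pair added here for illustration, and the helper names `mark_stress` and `strip_stress` are hypothetical. The sketch shows that once the combining acute (U+0301) is omitted, both words collapse to the same string.

```python
import unicodedata

ACUTE = "\u0301"  # Unicode combining acute accent

def mark_stress(word, vowel_index):
    """Insert the combining acute after the vowel at vowel_index."""
    return word[:vowel_index + 1] + ACUTE + word[vowel_index + 1:]

def strip_stress(text):
    """Remove combining acutes, as most Russian texts do.

    Intended for Cyrillic input: in Latin text this would also strip
    the accent from letters such as "é".
    """
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFC", decomposed.replace(ACUTE, ""))

castle = mark_stress("замок", 1)  # stress on the first syllable
lock = mark_stress("замок", 3)    # stress on the second syllable

print(castle != lock)                              # True
print(strip_stress(castle) == strip_stress(lock))  # True
```

Normalising back to NFC after the removal keeps letters such as й intact, because only U+0301 is dropped.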
Word stress – Polish TTS engines
In the three commercial TTS systems, custom stress can be
indicated in a non-standard way, by annotating the accented
vowel/syllable with special character(s)
e.g.
gramatyka (grammar), irregular stress - on the third syllable from
the end
RealSpeak, Sayso:
grama’tyka
Ivona:
gram~!atyka
(None of these TTS systems supports the IPA alphabet)
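Because each engine uses its own notation, text prepared for one engine has to be rewritten for another. The sketch below converts between the two notations as they appear in the single example above; it assumes that the RealSpeak/Sayso mark follows a one-letter stressed vowel and that the Ivona mark precedes it, which is inferred from this example only and not from vendor documentation. The function names are hypothetical.

```python
SAYSO_MARK = "\u2019"  # the ’ character, written after the stressed vowel
IVONA_MARK = "~!"      # written before the stressed vowel

def sayso_to_ivona(word):
    """Rewrite grama’tyka-style stress marking as gram~!atyka-style."""
    pos = word.find(SAYSO_MARK)
    if pos <= 0:
        return word  # no mark, or nothing before it to attach to
    return word[:pos - 1] + IVONA_MARK + word[pos - 1] + word[pos + 1:]

def ivona_to_sayso(word):
    """Rewrite gram~!atyka-style stress marking as grama’tyka-style."""
    pos = word.find(IVONA_MARK)
    end = pos + len(IVONA_MARK)
    if pos < 0 or end >= len(word):
        return word  # no mark, or no vowel after it
    return word[:pos] + word[end] + SAYSO_MARK + word[end + 1:]

print(sayso_to_ivona("grama\u2019tyka"))  # gram~!atyka
print(ivona_to_sayso("gram~!atyka"))      # grama’tyka
```

A shared SSML representation of stress would make this kind of per-engine rewriting unnecessary.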
Question:
How can custom word (lexical) stress be represented in SSML?
● By a dedicated tag?
<stress primary="~" secondary="*">gram~atyka</stress>
● By using a special „phonetic” alphabet within the <phoneme> tag?
● Other proposals?
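As a discussion aid, the dedicated-tag option could be interpreted as in the sketch below. The `<stress>` element is only the proposal from this slide and is not part of SSML; the sketch assumes that each marker character precedes the vowel it applies to, and the function name `parse_stress` is hypothetical.

```python
import xml.etree.ElementTree as ET

def parse_stress(markup):
    """Parse the proposed <stress> element.

    Returns the word with markers removed and a list of
    (character index, stress level) pairs, where the index points at
    the character that followed the marker.
    """
    element = ET.fromstring(markup)
    primary = element.get("primary")
    secondary = element.get("secondary")
    letters = []
    marks = []
    for ch in element.text or "":
        if primary and ch == primary:
            marks.append((len(letters), "primary"))
        elif secondary and ch == secondary:
            marks.append((len(letters), "secondary"))
        else:
            letters.append(ch)
    return "".join(letters), marks

word, marks = parse_stress(
    '<stress primary="~" secondary="*">gram~atyka</stress>'
)
print(word)   # gramatyka
print(marks)  # [(4, 'primary')]
```

Declaring the marker characters as attributes lets the author pick symbols that do not occur in the text itself, at the cost of one more element for engines to support.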
Dziękujemy
Thank you & let’s discuss
Prepared by:
Name:
Przemysław Zdroik
Division:
Department:
Vocal Services Section
TP S.A. Research and Development Centre
Phone#:
(+ 48) 22 699 56 06
E-mail:
[email protected]
Name:
Krzysztof Majewski
Division:
Department:
Vocal Services Section
TP S.A. Research and Development Centre
Phone#:
(+ 48) 22 699 55 64
E-mail:
[email protected]