Portable Text to Speech for
Indonesian Language (Bahasa)
Ilham Ari Elbaith Zaeni
安啓聖
DA220207
Presented in Seminar Class
March 18, 2014
Dept of Electrical Engineering
Southern Taiwan University of Science and Technology
Outline
• Definition
• Background
• System Design
• Algorithm
• Hardware Implementation
• Result
Definition
• A text-to-speech (TTS) system converts normal language text into speech [wikipedia].
• Syllable: a unit of pronunciation having one vowel sound, with or without surrounding consonants, forming the whole or a part of a word.
• Languages of Indonesia
– Indonesia has more than 700 languages.
– The official language is Indonesian (known as Bahasa Indonesia), a variety of Malay with words borrowed from other languages [wikipedia].
Language | Number (millions) | Year surveyed | Main areas where spoken
Indonesian/Malay | 210 | 2010 | throughout Indonesia
Javanese | 84.3 | 2000 (census) | Northern Banten, Northern West Java, Yogyakarta, Central Java and East Java
Sundanese | 34 | 2000 (census) | West Java, Banten
Madurese | 13.6 | 2000 (census) | Madura Island (East Java)
Minangkabau | 5.5 | 2007 | West Sumatra, Riau
Musi (Palembang Malay) | 3.9 | 2000 (census) | South Sumatra
Manado Malay (Minahasan) | 3.8 | 2001 | Minahasa, North Sulawesi
Bugis | 3.5 | 1991 | South Sulawesi
Banjarese | 3.5 | 2000 (census) | South Kalimantan, East Kalimantan, Central Kalimantan
Acehnese | 3.5 | 2000 (census) | Aceh
Balinese | 3.3 | 2000 (census) | Bali Island and Lombok Island
Betawi | 2.7 | 1993 | Jakarta
etc.
English | one | two | water | person | house | dog | coconut
Kutainese | satu | due | ranam | urang | rumah | koyok | nyiur
Indonesian/Malay | satu | dua | air | orang | rumah | anjing | kelapa
Javanese | siji | loro | banyu | uwong | omah | asu | kambil
Sundanese | hiji | dua | cai/ci | jalma | imah | anjing | kalapa
Madurese | settong | dhuwa' | âên | oreng | roma | pate' | nyior
Minangkabau | cie' | duo | aie | urang | rumah | anjiang | karambia
Palembang Malay | sikok | duo | banyu | wong | rumah | anjing | kelapo

Loan words of Sanskrit Origin
◦ भाषा bahasa (language),
◦ काच kaca (glass, mirror),
◦ राज- raja (king),
◦ मनुष्य manusia (mankind),
◦ भूमि bumi (earth/world),
◦ आगम agama (religion),
◦ स्त्री istri (wife/woman)

Loan words of Arabic Origin
◦ selamat (السلام salaam = peace),
◦ dunia (دنيا dunya = the present world),
◦ Sabtu (السبت as-sabt = Saturday),
◦ kabar (خبر ḵabar = news),
◦ ijazah (إجازة ijāza = vacation),
◦ kitab (كتاب kitāb = book),
◦ tertib (ترتيب tartīb = orderly),
◦ kamus (قاموس qāmūs = dictionary)

Loan words of Chinese Origin
◦ pisau (匕首 bǐshǒu – knife),
◦ loteng (樓/層 lóu/céng – [upper] floor/level),
◦ mie (麵 > 面, Hokkien mī – noodles),
◦ lumpia (潤餅, Hokkien lūn-piáⁿ – spring roll),
◦ cawan (茶碗 cháwǎn – teacup),
◦ teko (茶壺 > 茶壶 cháhú [Mandarin], teh-ko [Hokkien] – teapot),
◦ kuli (苦力 = 苦 khu [bitter] and 力 li [energy])

Loan words of Portuguese Origin
◦ meja (from mesa = table),
◦ boneka (from boneca = doll),
◦ jendela (from janela = window),
◦ gereja (from igreja = church),
◦ bendera (from bandeira = flag),
◦ sepatu (from sapato = shoes),
◦ keju (from queijo = cheese),
◦ mentega (from manteiga = butter),
◦ Minggu (from domingo = Sunday)

Loan words of Dutch Origin
◦ polisi (from politie = police),
◦ kualitas (from kwaliteit = quality),
◦ rokok (from roken = smoking cigarettes),
◦ korupsi (from corruptie = corruption),
◦ kantor (from kantoor = office),
◦ resleting (from ritssluiting = zipper),
◦ gratis (from gratis = free)
Background
• Developed in 2005 for
– the LCEN competition
– a bachelor's degree final project in the Dept of Electrical Engineering, Brawijaya University.
• Purpose: to help people who cannot speak
• It should be portable
• Why was it not implemented on a smartphone?
– The iPhone was released in 2007.
– The first Android phone (HTC Dream) was released in 2008 [wikipedia_Smartphone].
System Design
• Required processing steps:
– The user types text using a keypad/keyboard.
– The text is stored in memory.
– The text is converted into syllables.
– The sound corresponding to each syllable is played.
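The last step, playing the sound that corresponds to each syllable, can be pictured as a table lookup. The sketch below is only an illustration under assumptions: the table `SOUND_SLOTS`, its slot numbers, and the function `playback_plan` are hypothetical names, not taken from the slides, and real slot numbers would depend on how the syllables were recorded.

```python
# Hypothetical table: syllable -> index of the recorded sound slot.
# The slot numbers are invented for illustration only.
SOUND_SLOTS = {"sa": 0, "ya": 1, "ka": 2, "mu": 3}


def playback_plan(syllables, slots=SOUND_SLOTS):
    """Return (slot indices to play in order, syllables with no recording)."""
    plan = []
    missing = []
    for syllable in syllables:
        if syllable in slots:
            plan.append(slots[syllable])
        else:
            missing.append(syllable)
    return plan, missing


if __name__ == "__main__":
    print(playback_plan(["sa", "ya"]))
    print(playback_plan(["ka", "mu", "di"]))
```

Reporting syllables without a recording separately is one possible design choice; a real device might instead skip them silently or spell them letter by letter.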
System Design
• What the system needs
– Text editor
  • Input: more than 26 buttons → two 4x4 keypads
  • Display → LCD 2x16
  • Processor → AT89S8252 microcontroller
– Memory for the sound
  • ISD25120 → ChipCorder from Winbond
  • Audio amplifier
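Two 4x4 keypads give 32 keys, enough for the 26 letters plus 6 extra keys. The sketch below shows one way a scanned key position could be decoded into a character. The layout is an assumption: the slides do not say which key carries which letter, and the control key names in `CONTROL_KEYS` are hypothetical.

```python
import string

KEYPADS, ROWS, COLS = 2, 4, 4  # two 4x4 keypads = 32 keys

# Hypothetical control keys filling the 6 positions left after the letters.
CONTROL_KEYS = ["SPACE", "BACKSPACE", "CLEAR", "LEFT", "RIGHT", "PLAY"]

# 26 letters + 6 control keys = 32 entries, one per physical key.
KEYMAP = list(string.ascii_lowercase) + CONTROL_KEYS


def key_index(keypad, row, col):
    """Flatten (keypad, row, col) into a single index from 0 to 31."""
    if not (0 <= keypad < KEYPADS and 0 <= row < ROWS and 0 <= col < COLS):
        raise ValueError("key position out of range")
    return keypad * ROWS * COLS + row * COLS + col


def decode_key(keypad, row, col):
    """Return the letter or control key name assigned to a key position."""
    return KEYMAP[key_index(keypad, row, col)]


if __name__ == "__main__":
    print(decode_key(0, 0, 0))
    print(decode_key(1, 3, 3))
```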
Algorithm
Syllable spelling (in Indonesian)
• Indonesian has:
– vowels (a, e, i, o, u)
– consonants (b, c, d, f, g, h, j, k, l, m, n, p, q, r, s, t, v, w, x, y, z)
– diphthongs (ai, au, oi)
– consonant combinations (kh, ng, ny, sy, etc.)
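The letter classes above can be made concrete with a small sketch that splits a word into units and labels each as vowel (V) or consonant (C). It treats the four listed consonant combinations as single consonants. Diphthongs are deliberately not merged here, because the examples in these slides split "ai" (ma-in, sya-ir). The function names are illustrative, not taken from the original implementation.

```python
VOWELS = set("aeiou")
DIGRAPHS = ("kh", "ng", "ny", "sy")  # consonant combinations listed above


def letter_units(word):
    """Split a word into letters, keeping each consonant combination whole."""
    word = word.lower()
    units = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            units.append(pair)
            i += 2
        else:
            units.append(word[i])
            i += 1
    return units


def cv_pattern(word):
    """Return the word's vowel/consonant pattern, for example 'CVCV'."""
    return "".join("V" if unit in VOWELS else "C" for unit in letter_units(word))


if __name__ == "__main__":
    for w in ("saya", "angka", "makhluk"):
        print(w, letter_units(w), cv_pattern(w))
```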
Rules of spelling (in Indonesian)
• If two vowels occur in sequence, the word is split between the two vowels.
– Ex: ma-in (play), bu-ah (fruit)
• If one consonant stands between two vowels, the word is split before the consonant.
– Ex: sa-ya (I), ka-mu (you)
• If two consonants stand between two vowels, the word is split between the two consonants. A consonant combination such as kh counts as a single consonant.
– Ex: man-di (take a bath), makh-luk (creature)
• If three consonants stand between two vowels, the word is split after the first consonant.
– Ex: ul-tra, in-fra
[wikisource]
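The four rules can be sketched in a few lines of Python. This is an illustration of the rules as stated, not the microcontroller code used in the project. It assumes that only kh, ng, ny and sy are treated as single consonants, that diphthongs are not kept together, and that consonants before the first vowel or after the last vowel stay attached to the nearest syllable. The names `to_units` and `syllabify` are hypothetical.

```python
VOWELS = set("aeiou")
DIGRAPHS = ("kh", "ng", "ny", "sy")  # each counts as one consonant


def to_units(word):
    """Split a word into letters, keeping consonant combinations whole."""
    units, i = [], 0
    while i < len(word):
        step = 2 if word[i:i + 2] in DIGRAPHS else 1
        units.append(word[i:i + step])
        i += step
    return units


def syllabify(word):
    """Split a word into syllables using the four spelling rules."""
    units = to_units(word.lower())
    if not units:
        return []
    vowel_pos = [i for i, u in enumerate(units) if u in VOWELS]
    cuts = []
    for p, q in zip(vowel_pos, vowel_pos[1:]):
        gap = q - p - 1          # consonants between two neighbouring vowels
        if gap == 0:
            cuts.append(q)       # rule 1: split between the two vowels
        elif gap == 1:
            cuts.append(q - 1)   # rule 2: split before the consonant
        else:
            cuts.append(p + 2)   # rules 3 and 4: split after the first consonant
    syllables, start = [], 0
    for cut in cuts:
        syllables.append("".join(units[start:cut]))
        start = cut
    syllables.append("".join(units[start:]))
    return syllables


if __name__ == "__main__":
    for w in ("main", "saya", "mandi", "makhluk", "ultra"):
        print(w, "-".join(syllabify(w)))
```

For two consonants, splitting after the first one is the same as splitting between them, which is why rules 3 and 4 share one branch.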
Hardware Implementation
Result
No. | Input text | Meaning | Expected output | Output | Correct/Wrong
1 | saya | I | /sa/ /ya/ | /sa/ /ya/ | correct
2 | anda | you | /an/ /da/ | /an/ /da/ | correct
3 | dia | he/she | /di/ /a/ | /di/ /a/ | correct
4 | intra | intra | /in/ /tra/ | - | wrong
5 | mereka | they | /me/ /re/ /ka/ | /me/ /re/ /ka/ | correct
6 | keluar | out | /ke/ /lu/ /ar/ | /ke/ /lu/ /ar/ | correct
7 | alias | alias | /a/ /li/ /as/ | /a/ /li/ /as/ | correct
8 | anak | child | /a/ /nak/ | /a/ /na/ /ak/ | wrong
9 | tragedi | tragedy | /tra/ /ge/ /di/ | /us/ /ra/ /ge/ /di/ | wrong
10 | buka | open | /bu/ /ka/ | /bu/ /ka/ | correct
11 | angka | number | /ang/ /ka/ | /ang/ /ka/ | correct
12 | pramuka | scout | /pra/ /mu/ /ka/ | /pra/ /mu/ /ka/ | correct
13 | syair | poem | /sya/ /ir/ | /sya/ /ir/ | correct
14 | karena | because | /ka/ /re/ /na/ | /ka/ /re/ /na/ | correct
15 | artikulasi | articulation | /ar/ /ti/ /ku/ /la/ /si/ | /ar/ /ti/ /ku/ /la/ /si/ | correct
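As a quick check on the result table, the Correct/Wrong column can be tallied. The sketch below only restates the 15 rows listed above (True for "correct", False for "wrong"). The names `RESULTS` and `accuracy` are illustrative.

```python
# (input word, whether the device output matched the expected output)
RESULTS = [
    ("saya", True), ("anda", True), ("dia", True), ("intra", False),
    ("mereka", True), ("keluar", True), ("alias", True), ("anak", False),
    ("tragedi", False), ("buka", True), ("angka", True), ("pramuka", True),
    ("syair", True), ("karena", True), ("artikulasi", True),
]


def accuracy(results):
    """Return (number correct, total, fraction correct)."""
    correct = sum(1 for _, ok in results if ok)
    total = len(results)
    return correct, total, correct / total


if __name__ == "__main__":
    correct, total, fraction = accuracy(RESULTS)
    print(f"{correct} of {total} words correct ({fraction:.0%})")
```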
Reference
• http://en.wikipedia.org/wiki/Smartphone
• http://en.wikipedia.org/wiki/Text_to_Speech
• http://en.wikipedia.org/wiki/Languages_of_Indonesia
• http://en.wikipedia.org/wiki/Indonesian_language
• http://id.wikisource.org/wiki/Pedoman_Umum_Ejaan_Bahasa_Indonesia_yang_Disempurnakan_(1987)/Bab_I