Portable Text to Speech for Indonesian Language (Bahasa)
Ilham Ari Elbaith Zaeni 安啓聖 DA220207
Presented in Seminar Class, March 18, 2014
Dept. of Electrical Engineering, Southern Taiwan University of Science and Technology

Outline
• Definition
• Background
• System Design
• Algorithm
• Hardware Implementation
• Result

Definition
• Text-to-speech (TTS) system: converts normal language text into speech [wikipedia]
• Syllable: a unit of pronunciation having one vowel sound, with or without surrounding consonants, forming the whole or a part of a word
• Languages of Indonesia
  – Indonesia has more than 700 languages.
  – The official language is Indonesian (known as Bahasa Indonesia), a variant of Malay + other languages. [wikipedia]

Language | Number (millions) | Year surveyed | Main areas where spoken
Indonesian/Malay | 210 | 2010 | throughout Indonesia
Javanese | 84.3 | 2000 (census) | Northern Banten, Northern West Java, Yogyakarta, Central Java and East Java
Sundanese | 34 | 2000 (census) | West Java, Banten
Madurese | 13.6 | 2000 (census) | Madura Island (East Java)
Minangkabau | 5.5 | 2007 | West Sumatra, Riau
Musi (Palembang Malay) | 3.9 | 2000 (census) | South Sumatra
Manado Malay (Minahasan) | 3.8 | 2001 | Minahasa, North Sulawesi
Bugis | 3.5 | 1991 | South Sulawesi
Banjarese | 3.5 | 2000 (census) | South Kalimantan, East Kalimantan, Central Kalimantan
Acehnese | 3.5 | 2000 (census) | Aceh
Balinese | 3.3 | 2000 (census) | Bali Island and Lombok Island
Betawi | 2.7 | 1993 | Jakarta
etc.

English | Kutainese | Indonesian/Malay | Javanese | Sundanese | Madurese | Minangkabau | Palembang Malay
one | satu | satu | siji | hiji | settong | cie' | sikok
two | due | dua | loro | dua | dhuwa' | duo | duo
water | ranam | air | banyu | cai/ci | âên | aie | banyu
person | urang | orang | uwong | jalma | oreng | urang | wong
house | rumah | rumah | omah | imah | roma | rumah | rumah
dog | koyok | anjing | asu | anjing | pate' | anjiang | anjing
coconut | nyiur | kelapa | kambil | kalapa | nyior | karambia | kelapo

Loan words of Sanskrit Origin
◦ भाषा bahasa (language),
◦ काच kaca (glass, mirror),
◦ राज- raja (king),
◦ मनुष्य manusia (mankind),
◦ भूमि bumi (earth/world),
◦ आगम agama (religion),
◦ स्त्री istri (wife/woman),

Loan words of Arabic Origin
◦ selamat (السلام salaam = peace),
◦ dunia (دنيا dunya = the present world),
◦ Sabtu (السبت as-sabt = Saturday),
◦ kabar (خبر ḵabar = news),
◦ ijazah (إجازة ijāza = vacation),
◦ kitab (كتاب kitāb = book),
◦ tertib (ترتيب tartīb = orderly),
◦ kamus (قاموس qāmūs = dictionary)

Loan words of Chinese Origin
◦ pisau (匕首 bǐshǒu – knife),
◦ loteng (樓/層 lóu/céng – [upper] floor/level),
◦ mie (麵 > 面, Hokkien mī – noodles),
◦ lumpia (潤餅, Hokkien lūn-piáⁿ – spring roll),
◦ cawan (茶碗 cháwǎn – teacup),
◦ teko (茶壺 > 茶壶, cháhú [Mandarin], teh-ko [Hokkien] – teapot),
◦ kuli (苦力 = 苦 khu (bitter) and 力 li (energy))

Loan words of Portuguese Origin
◦ meja (from mesa = table),
◦ boneka (from boneca = doll),
◦ jendela (from janela = window),
◦ gereja (from igreja = church),
◦ bendera (from bandeira = flag),
◦ sepatu (from sapato = shoes),
◦ keju (from queijo = cheese),
◦ mentega (from manteiga = butter),
◦ Minggu (from domingo = Sunday)

Loan words of Dutch Origin
◦ polisi (from politie = police),
◦ kualitas (from kwaliteit = quality),
◦ rokok (from roken = smoking cigarettes),
◦ korupsi (from corruptie = corruption),
◦ kantor (from kantoor = office),
◦ resleting (from ritssluiting = zipper),
◦ gratis (from gratis = free)

Background
• Developed in 2005 for
  – the LCEN competition
  – a bachelor degree final project at the Dept. of Electrical Engineering, Brawijaya University
• To help people who cannot speak
• It should be portable
• Why was it not implemented on a smartphone?
  – The iPhone was released in 2007.
  – The first Android phone (HTC Dream) was released in 2008 [wikipedia_Smartphone].

System Design
• Processes the system must perform:
  – The user types text using a keypad/keyboard.
  – The text is stored in memory.
  – The text is converted into syllables.
  – The sound corresponding to each syllable is played.
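The four processing steps above can be pictured as a small data flow. The following is only an illustrative Python sketch, not the original firmware: the slot table `SOUND_SLOTS`, the stand-in splitter `demo_split`, and the function `run_pipeline` are hypothetical names introduced here, and the syllable splitter is passed in as a parameter because this slide does not define one.

```python
# Hypothetical table: syllable -> slot number of its pre-recorded sound.
SOUND_SLOTS = {"sa": 0, "ya": 1, "ka": 2, "mu": 3}

# Stand-in splitter for this demo only: a tiny fixed lookup table.
DEMO_SPLITS = {"saya": ["sa", "ya"], "kamu": ["ka", "mu"]}


def demo_split(word):
    """Return the stored split of a word, or the whole word if unknown."""
    return DEMO_SPLITS.get(word, [word])


def run_pipeline(keys, split_into_syllables, sound_slots):
    """Turn a sequence of typed keys into the list of sound slots to play."""
    text = "".join(keys)                    # steps 1-2: typed keys kept in memory
    slots = []
    for word in text.split():
        for syllable in split_into_syllables(word):   # step 3: text -> syllables
            # Step 4: look up the recording; None marks a missing recording.
            slots.append(sound_slots.get(syllable))
    return slots


if __name__ == "__main__":
    print(run_pipeline(list("saya kamu"), demo_split, SOUND_SLOTS))
```

Keeping the splitter as a parameter means the lookup-and-play step can be checked separately from the syllable rules.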
System Design
• What the system needs
  – Text editor
    • Input: more than 26 buttons, using two 4x4 keypads
    • Display: 2x16 LCD
    • Processor: AT89S8252 microcontroller
  – Memory for the sound
    • ISD25120 ChipCorder from Winbond
    • Audio amplifier

Algorithm
Syllable spelling (in Bahasa)
Bahasa has:
• vowels (a, e, i, o, u)
• consonants (b, c, d, f, g, h, j, k, l, m, n, p, q, r, s, t, v, w, x, y, z)
• diphthongs (ai, au, oi)
• consonant combinations (kh, ng, ny, sy, etc.)

Rules of spelling (in Bahasa)
• If there are two vowels in sequence, the word is split between the two vowels.
  Ex: ma-in (play), bu-ah (fruit)
• If there is one consonant between two vowels, the word is split before the consonant.
  Ex: sa-ya (I), ka-mu (you)
• If there are two consonants between two vowels, the word is split between the consonants. A consonant combination such as kh counts as one consonant.
  Ex: man-di (take a bath), makh-luk (creature)
• If there are three consonants between two vowels, the word is split after the first consonant.
  Ex: ul-tra, in-fra
[wikisource]

Hardware Implementation
[Figures]

Result
No. | Input Text | Meaning | Expected Output | Output | Correct/Wrong
1 | saya | I | /sa/ /ya/ | /sa/ /ya/ | correct
2 | anda | you | /an/ /da/ | /an/ /da/ | correct
3 | dia | he/she | /di/ /a/ | /di/ /a/ | correct
4 | intra | intra | /in/ /tra/ | - | wrong
5 | mereka | they | /me/ /re/ /ka/ | /me/ /re/ /ka/ | correct
6 | keluar | out | /ke/ /lu/ /ar/ | /ke/ /lu/ /ar/ | correct
7 | alias | alias | /a/ /li/ /as/ | /a/ /li/ /as/ | correct
8 | anak | child | /a/ /nak/ | /a/ /na/ /ak/ | wrong
9 | tragedi | tragedy | /tra/ /ge/ /di/ | /us/ /ra/ /ge/ /di/ | wrong
10 | buka | open | /bu/ /ka/ | /bu/ /ka/ | correct
11 | angka | number | /ang/ /ka/ | /ang/ /ka/ | correct
12 | pramuka | scout | /pra/ /mu/ /ka/ | /pra/ /mu/ /ka/ | correct
13 | syair | poem | /sya/ /ir/ | /sya/ /ir/ | correct
14 | karena | because | /ka/ /re/ /na/ | /ka/ /re/ /na/ | correct
15 | artikulasi | articulation | /ar/ /ti/ /ku/ /la/ /si/ | /ar/ /ti/ /ku/ /la/ /si/ | correct

Reference
• http://en.wikipedia.org/wiki/Smartphone
• http://en.wikipedia.org/wiki/Text_to_Speech
• http://en.wikipedia.org/wiki/Languages_of_Indonesia
• http://en.wikipedia.org/wiki/Indonesian_language
• http://id.wikisource.org/wiki/Pedoman_Umum_Ejaan_Bahasa_Indonesia_yang_Disempurnakan_(1987)/Bab_I
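The spelling rules from the Algorithm slides can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code that ran on the AT89S8252: the names `tokenize` and `syllabify` are hypothetical, only the four consonant combinations listed on the slide (kh, ng, ny, sy) are treated as single consonants, and diphthongs are not kept together, so every vowel pair is split as in ma-in.

```python
VOWELS = set("aeiou")
DIGRAPHS = ("kh", "ng", "ny", "sy")  # consonant combinations named on the slide


def tokenize(word):
    """Split a word into letters, keeping each consonant combination as one unit."""
    tokens = []
    i = 0
    while i < len(word):
        if word[i:i + 2] in DIGRAPHS:
            tokens.append(word[i:i + 2])
            i += 2
        else:
            tokens.append(word[i])
            i += 1
    return tokens


def syllabify(word):
    """Apply the four slide rules to every pair of neighbouring vowels."""
    tokens = tokenize(word.lower())
    if not tokens:
        return []
    vowel_pos = [k for k, t in enumerate(tokens) if t in VOWELS]
    cuts = []
    for left, right in zip(vowel_pos, vowel_pos[1:]):
        between = right - left - 1          # consonant units between the vowels
        if between == 0:
            cuts.append(right)              # V-V: split between the vowels
        elif between == 1:
            cuts.append(left + 1)           # V-CV: split before the consonant
        else:
            cuts.append(left + 2)           # VC-CV and VC-CCV: after first consonant
    bounds = [0] + cuts + [len(tokens)]
    return ["".join(tokens[a:b]) for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    for w in ("saya", "makhluk", "intra", "artikulasi"):
        print(w, syllabify(w))
```

Consonants before the first vowel stay with the first syllable, which is why tragedi and pramuka begin with /tra/ and /pra/. On the fifteen words of the Result table, this sketch returns the Expected Output column.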