Arabic Natural Language Processing:
State of the Art and Prospects
Rached Zantout, Ph.D.
Electrical and Computer Engineering Department
Hariri Canadian University
Mechref, Chouf, Lebanon
Outline
• What is NLP?
• Why NLP?
• MT as a case study!
– Problems solved by MT.
– Main players in MT.
• How does Arabic compare to other languages as far as NLP is concerned?
– MT as a case study.
• What kind of research is being conducted in ANLP?
• Recommendations!
11/2006UOB
Zantout ANLP: State of the Art and Prospects
2
Tracing the history of NLP
Mid-90s to present
1. Coming together of Symbolic and Statistical traditions;
2. Increased focus on functionality and less on representation; NL and Speech Applications
3. Availability of large corpora, large disk space, commoditization of computing resources.
4. Emergence of the World Wide Web, continues to change the field….
Mid-80s to mid-90s
1. Revival of Finite-state Models for NL processing; XEROX, AT&T
2. Computational implementation of large NL Grammars in different grammatical frameworks and
3. Development of Penn Treebank; a parse-annotated corpus
4. Machine learning coming of age;
1970s to mid-1980s
1. Symbolic tradition: (CS)
a. How much representational power is needed for NL? Grammars of increasing power to describe NL: Joshi, Gazdar, Bresnan, Kaplan, Pereira, Warren, Shieber.
b. AI Researchers: Winograd, Schank, Wilks, Lehnert, Woods: Understanding systems; SHRDLU, scripts, plans and goals, LUNAR
c. Discourse and Dialog Structure: Grosz, Sidner, Hobbs, Perrault, Cohen
2. Statistical tradition: (EE) Hidden Markov models for speech recognition;
1960s
1. Symbolic tradition: (CS) Generative grammar; parsing algorithms;
Newell, Simon, Shannon, McCarthy, Minsky, Rochester: Birth of AI; pattern-matching-based NL understanding system.
2. Statistical tradition: (EE) Probabilistic inferences for OCR; representationally-light models
1950s
1. Kleene’s and Shannon’s probabilistic finite automaton; Chomsky’s context-free grammars; programming languages; formal language theory
2. Shannon’s information theory – information can be measured; decoding paradigm
1939-1945
World War II; Need for code-breaking algorithms; ENIAC
1936
Turing’s model of computation; theoretical basis for computer science
NL and NLP definitions
adapted from http://www.cs.bham.ac.uk/~pxc/nlpa/index02.htm
• 'natural language' (NL):
– Any of the languages naturally used by humans,
– not an artificial or man-made language such as a
programming language.
– (Arabic, English, Chinese, Swahili, etc.)
– evolved over thousands of years.
– efficient vehicles for human to human communication.
• 'Natural language processing' (NLP):
– attempts to use computers to process a NL.
– Enter computers.
• What's the connection?
Why ?
adapted from http://www.cs.utexas.edu/users/ear/cs378NLP/
• Is there any reason a computer should know
English or Chinese or Swahili?
• Yes. There are several "killer apps" for
NLP:
– retrieving information from the web,
– translating documents from one language to
another, and
– spoken front ends to all kinds of application
programs.
NLP includes
adapted from http://www.cs.bham.ac.uk/~pxc/nlpa/index02.htm
• Speech synthesis:
– is this very 'intelligent'?
– synthesis of natural-sounding speech is technically complex:
• requires some 'understanding' of what is being spoken to ensure, for
example, correct intonation. (bear vs. dear)
• Speech recognition:
– reduction of continuous sound waves to discrete words.
• Natural language understanding:
– moving from isolated words (written or via speech recognition)
– to 'meaning'.
• Natural language generation:
– generating appropriate NL responses to unpredictable inputs.
• Machine translation (MT): translating one NL into another
Areas Related to NLP
• Input:
– Speech Recognition.
– Natural Language Understanding.
• Lip Reading ?
• Processing:
– Information Retrieval:
• Finding where textual resources reside.
– Information Extraction:
• Extracting pertinent facts from textual resources.
– Inference: Drawing conclusions based on known facts.
– Spelling Correction.
– Grammar Checking.
• Output:
– Natural Language Generation.
– Speech Synthesis.
• Machine Translation.
• Conversational Agents.
NLP
taken from http://tangra.si.umich.edu/~radev/NLP/notes/1.ppt
• Information extraction
• Named entity recognition
• Trend analysis
• Subjectivity analysis
• Text classification
• Anaphora resolution, alias resolution
• Cross-document cross-reference
• Parsing
• Semantic analysis
• Word sense disambiguation
• Word clustering
• Question answering
• Summarization
• Document retrieval (filtering, routing)
• Structured text (relational tables)
• Paraphrasing and paraphrasing/entailment ID
• Text generation
• Machine translation
Sample projects
• Noun phrase parser
• Paraphrase identification
• Question answering
• NL access to databases
• Named entity tagging
• Rhetorical parsing
• Anaphora resolution, entity cross-reference
• Document and sentence alignment
• Using bioinformatics methods
• Encyclopedia
• Information extraction
• Speech processing
• Sentence normalization
• Text summarization
• Sentence compression
• Definition extraction
• Crossword puzzle generation
• Prepositional phrase attachment
• Machine translation
• Generation
• Semi-structured document parsing
• Semantic analysis of short queries
• User-friendly summarization
• Number classification
• Domain-specific PP attachment
• Time-dependent fact extraction
Main research forums and other pointers
• Conferences: ACL/NAACL, SIGIR, AAAI/IJCAI, ANLP,
Coling, HLT, EACL/NAACL, AMTA/MT Summit,
ICSLP/Eurospeech
• Journals: Computational Linguistics, Natural Language
Engineering, Information Retrieval, Information Processing and
Management, ACM Transactions on Information Systems,
ACM TALIP, ACM TSLP
• University centers: Columbia, CMU, JHU, Brown, UMass,
MIT, UPenn, USC/ISI, NMSU, Michigan, Maryland,
Edinburgh, Cambridge, Saarland, Sheffield, and many others
• Industrial research sites: IBM, SRI, BBN, MITRE, MSR,
(AT&T, Bell Labs, PARC)
• Startups: Language Weaver, Ask.com, LCC
• The Anthology: http://www.aclweb.org/anthology
NLP Sources
• Journals:
– Artificial Intelligence.
– Computational Intelligence.
– IEEE Transactions on Intelligent Systems.
– Journal of Artificial Intelligence Research.
– Cognitive Science.
– Machine Translation.
• Conferences:
– AAAI: American Association for Artificial Intelligence.
– IJCAI: International Joint Conference on Artificial Intelligence.
– Cognitive Science Society Conferences.
– DARPA Speech and Natural Language Processing Workshop.
– ARPA Workshop on Human Language Technology.
– Machine Translation Summit series of conferences.
– TALN series of conferences.
– COLING series of conferences.
• Collection of papers:
– Readings in Natural Language Processing.
Why NLP?
Numbers
Information age! Information revolution!
• Cheaper PCs
• Advances in networking
• Internet/www central pillar of modern societies
• Massive production of information
• Growth of www?
Year       # Sites
92         50
93         250
94         2000
96         >100K
Sep. 99    43 M
• 800 Million Documents as of Sep. 1999
• People?
US: 6.5 M new adult users between 2/99 & 5/99
World: 26 Million in 1995; 163.25 Million as of 9/99
More Recent Statistics (2006)
Web Characterization: Country Statistics
http://www.oclc.org/research/projects/archive/wcp/stats/intnl.htm
1999                                       2002
Country      Percent of public sites       Country        Percent of public sites
US           49%                           US             55%
Germany      5%                            Germany        6%
UK           5%                            Japan          5%
Canada       4%                            UK             3%
Japan        3%                            Canada         3%
Australia    2%                            Italy          2%
Brazil       2%                            France         2%
Italy        2%                            Netherlands    2%
France       2%                            Others         18%
Others       16%                           Unknown        4%
Unknown      10%
Web Characterization: Language Statistics
http://www.oclc.org/research/projects/archive/wcp/stats/intnl.htm
1999                                       2002
Language     Percent of public sites       Language       Percent of public sites
English      72%                           English        72%
German       7%                            German         7%
French       3%                            Japanese       6%
Japanese     3%                            Spanish        3%
Spanish      3%                            French         3%
Chinese      2%                            Italian        2%
Italian      2%                            Dutch          2%
Portuguese   2%                            Chinese        2%
Dutch        1%                            Korean         1%
Finnish      1%                            Portuguese     1%
Russian      1%                            Russian        1%
Swedish      1%                            Polish         1%
What’s the Use of the Numbers?
• Prove that there is a “Linguistic Problem”:
– Domination of the English Language.
– Alienates non-English Speakers.
– Computers are our interface to the internet:
• Computers do not understand a Natural Language.
• We do not have enough time to guide computers to
do what is required of them
– E.g. Search for all presentations about NLP on the
internet.
– Digest them and produce one presentation appropriate for
my talk at UOB ;-)
What’s the Use of the Numbers?
• Middle-East is a growing internet market:
– Growing very fast.
– Lots of Arabs (read non-English speakers).
– Need to communicate in my own language.
– Need computer to save time for me while searching for information.
– Dream: computer could do most of my work and I can just relax :-)
• Introducing the A into ANLP.
The Linguistic Problem
Machine Translation (MT) a Case Study
English: the de-facto international language
• Internet and www (“CyberEnglish”!)
• Science and Technology
• Trade and Industry
• Politics and Media
• Tourism
• Etc.
English = key to accessing Knowledge in all walks of life!
Alienation of the HUGE majority of world population
Impoverishment of world cultures
The Linguistic Challenge
France:
• 1997: 7% French presence on www
• Legislation introduced (forcing Internet content providers to translate web sites into French)
• Pres. Chirac: “If in the new media, our language, our
programs, our creations, are not strongly present, the
young generation of our country will be economically
and culturally marginalized”
• “I do not want to see the European Culture sterilized or
obliterated by the American Culture”
French is stronger than Arabic on the internet and the PC.
If not General NLP!
How about at least MT?
Languages in the world
• 6,800 living languages
• 600 with written tradition
• 95% of world population
speaks 100 languages
Translation Market
• $8 Billion Global Market
• Doubling every five years
(Donald Barabé, invited talk, MT Summit 2003)
The Problem
• Coping with the huge amount of articles, books,
patents in all disciplines (Assimilation)
• Coping with the www massive volume
• Exporting economic products (Dissemination)
• Facing the Omnipresence of English
50% of all scientific and technical references
Linguistic, cultural, social, educational,
economic, and political factors
Human Translation too limited → MT
Translation Cost in EU is $1 Billion
Official Languages: from 11 to 20
1600 Human Translators
Why Machine Translation?
• Full Translation
– Domain specific
• Weather reports
• Machine-aided Translation
– Translation dictionaries
– Translation memories
– Requires post-editing
• Cross-lingual NLP applications
– Cross-language IR
– Cross-language Summarization
MT: A Strategic Choice
• USA: FCCSET report on MT (1993) on the
president’s request.
• Japan: $200 Million during 15 years till
1991. (Asian Multilingual MTS since 87)
• EU: since 1991, 220 projects on Language
Technology ($30 million on Eurotra!)
1996 report on the state of MT
MT Players
• Governments:
US, Europe, Japan, Canada, ex-USSR (cold war), Korea, Malaysia, Indonesia, Thailand, etc.
• International institutions:
– UN, E. Commission (12 languages; soon to be 22/23!!), etc.
• Companies (R&D): Microsoft, Siemens, Fujitsu, Hitachi, Toshiba, Oki, NEC, Mitsubishi, Sharp…
MT Market
• World: estimated at $20 billion in 1991
• MT Tools Market: $20 million in 1994
• > 160 language pairs
• > 60 MTSs being developed (as of 98)
• Globalink claims 600 K users of its MTS
• Lang. Eng. Corp. income (LogoVista): $2M
• Smart Communications (Smart Translator): $6M
• Systran (12 languages): 60,000 pages/year
AMT
ANLP
Asharq Al-Awsat (الشرق الأوسط) 09.10.03
ANLP State Compared to General NLP
• Script problem:
– Arabic characters are nowhere near Latin-based characters.
• Lack of funding:
– Governments.
– Pan-Arab Organizations.
– Industry ?! Private Sector.
– Research ???
– Infrastructure !
Progress in Western MT
Statistical MT example
2002:
insistent Wednesday may recurred her trips to Libya tomorrow for flying
Cairo 6-4 ( AFP ) - an official announced today in the Egyptian lines company for flying Tuesday is a company " insistent for flying " may resumed a consideration of a day Wednesday tomorrow her trips to Libya of Security Council decision trace international the imposed ban comment .
2003:
Egyptair Has Tomorrow to Resume Its Flights to Libya
Cairo 4-6 (AFP) said an official at the Egyptian Aviation Company today that the company egyptair may resume as of tomorrow, Wednesday its flights to Libya after the International Security Council resolution to the suspension of the embargo imposed on Libya.
Human Translation:
Egypt Air May Resume its Flights to Libya Tomorrow
Cairo, April 6 (AFP) - An Egypt Air official announced, on Tuesday, that Egypt Air will resume its flights to Libya as of tomorrow, Wednesday, after the UN Security Council had announced the suspension of the embargo imposed on Libya.
From a talk by Charles Wayne, DARPA
A First taste of
Arabic Machine Translation
• English Text:
– Before more than 30,000 fans who headed to the Cite
Sportive from all Lebanese region on Sunday Nejmeh
drew 1-1 with their traditional rivals Ansar in a
breathtaking showdown, which saw both teams
performing their best.
• Human Translation:
– أمام أكثر من 30.000 متفرج زحفوا إلى ملعب المدينة الرياضية نهار الأحد تعادل النجمة و الانصار 1-1 في مباراة مثيرة قدّم خلالها الفريقان عرضاً طيّباً افتقدته الملاعب اللبنانية منذ فترة طويلة.
• Ajeeb Translation:
– قبل أكثر من 30،000 معجب الّذين اتّجهوا إلى اليذكر لعوب من كلّ المنطقة اللّبنانيّة يوم الأحد نجمة رسم 1-1 مع ترادي
A 1st Taste of Arabic MT
• A sample of sentences to be translated:
• Quite disappointing!
• But, need for a more formal assessment and
closer scrutiny
Multilingual Challenges
Morphological Variations
• Affixation vs. Root+Pattern
write → written    كتب → مكتوب
kill → killed      قتل → مقتول
do → done          فعل → مفعول
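The root-and-pattern examples on this slide can be sketched in a few lines of Python. This is only an illustration, not a real morphological analyzer: the function name `passive_participle` is made up here, and the sketch assumes a sound three-consonant root written without diacritics. It interleaves the root consonants with the fixed letters of the مفعول pattern, which is something a plain suffix rule (as in kill → killed) cannot express.

```python
def passive_participle(root: str) -> str:
    """Apply the مفعول pattern to a three-consonant Arabic root.

    Assumption: `root` is a sound triliteral root with no diacritics
    (for example كتب). Weak or doubled roots need extra rules that
    this sketch does not cover.
    """
    if len(root) != 3:
        raise ValueError("expected a three-consonant root")
    c1, c2, c3 = root
    # Pattern: م + C1 + C2 + و + C3
    return "م" + c1 + c2 + "و" + c3


# The three roots shown on the slide.
for root in ("كتب", "قتل", "فعل"):
    print(root, "->", passive_participle(root))
```

The pattern letters (م and و) are inserted around and between the root consonants, so the stem itself changes shape instead of simply receiving an ending.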
Translation Divergences
conflation
Arabic: لست هنا (I-am-not here); tree nodes: ليس, انا, هنا
English: I am not here; tree nodes: be, I, not, here
French: Je ne suis pas ici (I not be not here); tree nodes: etre, Je, ne pas, ici
Translation Divergences
categorial, thematic and structural
Arabic: انا بردان (I cold); tree nodes: *, انا, بردان
English: I am cold; tree nodes: be, I, cold
Translation Divergences
head swap and categorial
Arabic: اسرعت عبور النهر سباحة (I-sped crossing the-river swimming); tree nodes: اسرع, انا, عبور, سباحة, نهر
English: I swam across the river quickly; tree nodes: swim, I, across, river, quickly
Translation Divergences
head swap and categorial
[Diagram: the two dependency trees side by side. Arabic nodes: انا, اسرع (verb), عبور, سباحة, نهر. English nodes: swim (verb), I, across, quickly, river.]
Fluency vs. Accuracy
[Chart: axes Fluency and Accuracy; plotted labels: FAHQ MT, conMT, Prof. MT, Info. MT]
Evaluation of MTSs
• Various methodologies put forward
• Various aspects considered:
Intelligibility, Fidelity, and other software
engineering features
• Mostly human-centered:
– Get users to compare Human and M. T.
– Get users to evaluate MT output on a scale (e.g. 1-5)
• Subjective to a large extent
Automatic Evaluation Example
Bleu Metric
Test Sentence:
colorless green ideas sleep furiously
Gold Standard References:
all dull jade ideas sleep irately
drab emerald concepts sleep furiously
colorless immature thoughts nap angrily
Unigram precision = 4 / 5 = 0.8
Bigram precision = 2 / 4 = 0.5
Bleu Score $= (a_1 a_2 \cdots a_n)^{1/n} = (0.8 \times 0.5)^{1/2} = 0.6325 \rightarrow 63.25$
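The arithmetic on this slide can be reproduced with a short Python sketch. It follows the simplified form shown here: n-gram precisions against the set of references, combined by a geometric mean over unigrams and bigrams. The function names (`ngrams`, `clipped_precision`, `simple_bleu`) are illustrative, and the sketch assumes whitespace tokenization and omits the brevity penalty used by the full Bleu metric.

```python
from collections import Counter
from math import prod


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, references, n):
    """Fraction of candidate n-grams found in any reference.

    Each n-gram is credited at most as many times as it occurs
    in the single reference where it is most frequent.
    """
    cand_counts = Counter(ngrams(candidate, n))
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    matched = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    total = sum(cand_counts.values())
    return matched / total if total else 0.0


def simple_bleu(candidate, references, max_n=2):
    """Geometric mean of the 1..max_n gram precisions (no brevity penalty)."""
    precisions = [clipped_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    return prod(precisions) ** (1 / max_n)


test = "colorless green ideas sleep furiously".split()
refs = [
    "all dull jade ideas sleep irately".split(),
    "drab emerald concepts sleep furiously".split(),
    "colorless immature thoughts nap angrily".split(),
]
print(clipped_precision(test, refs, 1))  # unigram precision
print(clipped_precision(test, refs, 2))  # bigram precision
print(round(simple_bleu(test, refs), 4))
```

Only "green" is missing from the references at the unigram level, and only "ideas sleep" and "sleep furiously" match at the bigram level, which gives the 4/5 and 2/4 figures above.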
Evaluating AMT’s
• 3 Arabic MT systems tested:
- Al-Mutarjim Al-Arabey (ATA Software Tech.)
- Al-Wafi (by ATA Software Tech.)
- Arabtrans (by Arab.Net Tech.)
• Sample texts translated.
• Scoring by a human (1 or 0.5 or 0 )
• Results:
Analysis of the results
• Poor AMT systems overall
• Good Lexicon coverage in the domain
“Internet and Arabisation”
• Very Poor Grammatical results:
– detailed analysis focuses on bad areas.
– Pronoun resolution and semantic correctness
• barely above average
– (almost 1 error out of each 2 cases!)
• The technology used in AMTS’s is “outdated”
Future Work
• Develop awareness of the importance of MT and
NLP for Arabic.
• Develop our own MT system based on all that we have learned from the evaluation
– Focus on Statistical techniques:
• Speed of Implementation.
• Obtaining better results.
AMT and Lebanon
ECOMLEB, no.2, 1st Quarter 2005
• “How can you explain why so many in the IT Field can’t find a job in
Lebanon when we keep hearing that we are the best in the region?”,
Reader’s Comments, P. 02.
• “Khan Al-Saboun”, a local soap maker in Tripoli now sells soaps all
over the world. “University Series, p. 05”
• “… Lebanon has one of the highest rates of internet usage in the area,
a good PC penetration, abundant human talent and resources in IT and
particularly software and web design, and no money transfer
restrictions” Interview with Minister of Economy and Trade, H.E.
Adnan Kassar, p. 16.
• “…[Lebanon needs to] reduce brain drain” Interview with Minister of
Economy and Trade, H.E. Adnan Kassar, p. 17.
• “…[Lebanon has] a multilingual and highly educated human resource
[base]” Interview with Minister of Economy and Trade, H.E. Adnan
Kassar, p. 17.
• “B2C e-commerce is expected to cross US$ 1 Billion mark by 2008 in
GCC countries … particularly in e-shopping … mainly in Saudi Arabia
and the UAE … compound average growth of 22% over 5 years … >
33.33% of transactions are booking for airline and hotels.”
Recommendations
• Develop Arab acceptance of the strategic nature of ANLP/AMT
• Establish an Arab Centre for Arabic language processing and AMT
– Gather Arab researchers
– Host and sponsor research:
• Morphology,
• Parsing,
• Speech
• semantics, pragmatics
– Building a central repository:
• software,
• lexicons,
• corpora,
• Tools
– and archive (literature)
Recommendations (cont.)
• Strengthen ties between Academia, research centers, and
industry
• Sponsor Pan-Arab projects (ESPRIT-like)
• Sponsor conferences, exhibitions, and trade shows:
– Coordinate Different Conferences:
• 2 upcoming ANLP conferences AT THE SAME TIME in 2 Different
places (KSA and Morocco)
• Plan for a third (UAE).
• Strengthen links with western institutions (on NLP/MT):
– Already western researchers are active in ANLP:
– A workshop in London in the same time frame as both
conferences in KSA and Morocco.
Thank you for your patience!
• References:
– Ahmed Guessoum, Rached Zantout, A Methodology for Evaluating Arabic Machine Translation Systems, Machine Translation, Volume 18, Issue 4, Dec 2004, Pages 299-335.
– R. Zantout and A. Guessoum, An Automatic English-Arabic HTML Page Translation System, Journal of Network and Computer Applications, vol. 4, no. 24, October 2001.
– A. Guessoum and R. Zantout, A Methodology for a semi-automatic evaluation of the language coverage of machine translation system lexicons, The Journal of Machine Translation, Kluwer Academic Publishers, The Netherlands, vol. 16, October 2001.
– Zantout, Rached and Guessoum, Ahmed, Arabic Machine Translation: A Strategic Choice for the Arab World, Journal of King Saud University, Vol. 12, Computer and Information Sciences, pp. 117-144, A.H. 1420-2000.
– Ahmed Guessoum, Rached Zantout, Machine Translation, A Strategic Dimension for the Arab World, University Forum, University of Sharjah, Issue 41, Year 6, Muharram 1427, February 2006, pp. 32-37.
– Guessoum, Ahmed and Zantout, Rached, Arabizing the Internet and its effect on the development of the Kingdom of Saudi Arabia, The 100 years symposium of the King Saud University, Riyadh, Saudi Arabia, 18-19/10/1999.
– Guessoum, Ahmed and Zantout, Rached, Towards a Strategic Effort, with a Central Theme of Machine Translation, to meet the challenges of the Information Revolution, 1998 Symposium of Proliferation of Arabization and Development of Translation in the Kingdom of Saudi Arabia, King Saud University, Riyadh.
– Nizar Habash (Post-doctoral Fellow, Center for Computational Learning Systems, Columbia University), “Machine Translation: Challenges and Approaches,” Invited Lecture, CS 4705: Introduction to Natural Language Processing, Fall 2004.