Towards Resolving Morphological Ambiguity in Arabic


Towards Resolving Morphological Ambiguity in Arabic Intelligent Language Tutoring Framework
Khaled Shaalan
Doaa Samy
Marwa Magdy
Outline
• Introduction
• Arabic Morphological Ambiguity Problem
• The Proposed Disambiguation System
• Evaluation & Results
• Conclusions
Introduction
Intelligent Language Tutoring System (ILTS)
• A computer-based educational system that simulates a human tutor.
• Its objective is to enhance the teaching and learning of foreign languages.
[Figure: an Intelligent Language Tutoring System surrounded by its features: curriculum sequence, adaptive navigation, adaptive presentation, error remediation, and intelligent feedback to student solutions.]
Main Challenges
• Lack of resources, such as a learner corpus for the Arabic language.
• Lack of tools that deal with ill-formed input.
• In an ILTS, relaxing the constraints of the language to analyze the learner's answer means handling more interpretations than systems designed only for well-formed input.
Main Challenges (cont'd)
• To analyze an erroneous learner answer, an ILTS uses techniques such as constraint relaxation, which yield more interpretations.
• Example: analyses of قالت (said)
- In well-formed systems: 3rd person feminine sg past verb
- In ILTS:
  • 3rd person feminine sg past verb
  • 1st person sg past verb
  • 2nd person masculine sg past verb
  • 2nd person feminine sg past verb
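The effect of constraint relaxation on the number of analyses can be sketched in a few lines of Python. This is a toy model, not the authors' analyzer: the names `analyze`, `LONG_STEM`, and `SHORT_STEM` are hypothetical, words are written in Buckwalter-style ASCII (`qAlt` for قالت), and the single relaxed rule (accepting the long stem where the short stem is required) is an assumption chosen to reproduce the four readings on the slide.

```python
# Toy model of the hollow verb qAl (root q-w-l), in Buckwalter-style ASCII.
# Illustrative only: the stems and the rule are assumptions, not the authors' analyzer.
LONG_STEM = "qAl"   # stem before a vowel-initial suffix (3rd person feminine sg)
SHORT_STEM = "ql"   # stem before a consonant-initial suffix (1st/2nd person)

# The undiacritized suffix "t" is shared by four perfect-tense readings;
# each reading is paired with the stem that the grammar requires.
T_SUFFIX_READINGS = [
    ("3rd person feminine sg past verb", LONG_STEM),
    ("1st person sg past verb", SHORT_STEM),
    ("2nd person masculine sg past verb", SHORT_STEM),
    ("2nd person feminine sg past verb", SHORT_STEM),
]


def analyze(word, relax=False):
    """Return the readings of `word`. With relax=True, also accept the long
    stem where the grammar requires the short one (a plausible learner error)."""
    if not word.endswith("t"):
        return []
    stem = word[:-1]
    readings = []
    for reading, required_stem in T_SUFFIX_READINGS:
        if stem == required_stem:
            readings.append(reading)
        elif relax and stem == LONG_STEM:
            readings.append(reading + " (middle letter not deleted)")
    return readings


strict = analyze("qAlt")               # well-formed system
relaxed = analyze("qAlt", relax=True)  # ILTS with one relaxed constraint
```

With the strict grammar only the third person feminine reading survives; relaxing the stem constraint admits the other three readings as possible learner errors.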
Arabic Morphological Ambiguity
• Arabic is a Semitic language whose script is diacritized.
• Unfortunately, diacritics are rarely used in current Arabic writing conventions.
• As a result, two or more words in Arabic can be homographic.
Homographic Example
Word: يعد
Lemma and interpretation:
• أعاد: يعِد (bring back)
• عاد: يعُد (return)
• وعد: يعِد (promise)
• عد: يَعُدّ (count)
• أعد: يُعِدّ (prepare)
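A minimal Python sketch of why these readings collide in writing: once the diacritics are dropped, every form reduces to the same three letters. The function name `strip_diacritics` and the dictionary `READINGS` are hypothetical; the forms are spelled with Unicode escapes so the code points are unambiguous, and they carry only the diacritics that survive in the homographic example.

```python
import unicodedata

# Arabic diacritics and base letters as explicit code points.
FATHA, DAMMA, KASRA, SHADDA = "\u064E", "\u064F", "\u0650", "\u0651"
YA, AIN, DAL = "\u064A", "\u0639", "\u062F"


def strip_diacritics(word):
    """Drop combining marks (short vowels, shadda) and keep the base letters."""
    return "".join(ch for ch in word if not unicodedata.combining(ch))


# Readings from the homographic example, keyed by their English gloss.
READINGS = {
    "bring back": YA + AIN + KASRA + DAL,
    "return": YA + AIN + DAMMA + DAL,
    "promise": YA + AIN + KASRA + DAL,
    "count": YA + FATHA + AIN + DAMMA + DAL + SHADDA,
    "prepare": YA + DAMMA + AIN + KASRA + DAL + SHADDA,
}

undiacritized = {gloss: strip_diacritics(form) for gloss, form in READINGS.items()}
distinct_written_forms = set(undiacritized.values())
```

The text is deliberately not normalized to NFD before filtering, because decomposition would also split letters such as أ into a base letter plus a combining hamza, which the filter would then remove.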
Factors of Arabic Ambiguity
• Orthographic alteration operations such as deletion. For example, the letter (و) is deleted when taking the present (imperfect) tense of the trilateral root و-ع-د /w-E-d/, so it appears in written texts as يعد (promise).
• Some verb prefixes and suffixes can be homographic. For example, the perfect verb suffix ت can indicate either: 1) first person singular, 2) second person singular masculine, 3) second person singular feminine, or 4) third person singular feminine.
• Ambiguity of undiacritized Arabic verb patterns. For example, the patterns فعل /faEala/ and فعّل /faE~ala/.
• Prefixes and suffixes can produce a form homographic with another word class. For example, the word أسد can be interpreted as أسد (lion) or أسدّ (I block).
The Proposed System
[Figure: architecture of the Arabic ILTS, with the components Item Banking, Word Analyzer, Disambiguation Module, Error Classification Module, and Tutoring Module (Feedback Message).]
Question: Build a sentence using the following roots: ق-و-ل، ح-ق، د-و-م /q-w-l, H-q, d-w-m/
The question goal is: hollow verb conjugation in imperfect tense, active voice
Learner Answer: قالتو الحق دائما (I always said the truth)
Word analysis of أقول:
• Prefix: أ
• Root: ق-و-ل
• Suffix: ''
• Lexical Category: verb
• Pattern: فعل
• Verb Type: hollow
• Tense: imperfect
• Voice: active
• Mood: indicative
• Subject Features: 1st sg
• Object Features: ''
Selected Word Analysis:
• Prefix: ''
• Stem: قال
• Root: ق-و-ل
• Suffix: ت
• Lexical Category: verb
• Pattern: فعل
• Verb Type: hollow
• Tense: perfect
• Voice: active
• Subject Features: 1st sg
• Object Features: ''
Error Representation (Feature: Value):
• Affix: added final letter
• Stem: added middle letter
Error Type:
• Verb tense error: incorrect use of perfect verb instead of imperfect
• Verb conjugation
• Vowel letters
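One plausible way to derive an error type from two feature bundles is to compare them feature by feature. The sketch below is an assumption about how such a comparison could work, not the authors' Error Classification Module: the class `WordAnalysis`, the function `feature_mismatches`, and the pairing of feature values with the target word and the learner's word are all illustrative, and Arabic values are transliterated in Buckwalter-style ASCII.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class WordAnalysis:
    """Feature bundle for one analysis; field names mirror the slide's features."""
    prefix: str
    root: str
    suffix: str
    category: str
    pattern: str
    verb_type: str
    tense: str
    voice: str
    subject: str


def feature_mismatches(expected, observed, ignore=("prefix", "suffix")):
    """Map each differing feature to its (expected, observed) pair.
    Surface affixes are ignored by default because they follow from the tense."""
    diffs = {}
    for f in fields(expected):
        if f.name in ignore:
            continue
        want, got = getattr(expected, f.name), getattr(observed, f.name)
        if want != got:
            diffs[f.name] = (want, got)
    return diffs


# Assumed reading of the slide: the target is an imperfect 1st sg verb,
# while the learner's word was analyzed as a perfect 1st sg verb.
expected = WordAnalysis(">", "q-w-l", "", "verb", "faEala", "hollow",
                        "imperfect", "active", "1st sg")
observed = WordAnalysis("", "q-w-l", "t", "verb", "faEala", "hollow",
                        "perfect", "active", "1st sg")
mismatches = feature_mismatches(expected, observed)
```

Here the only mismatch is the tense, which corresponds to the slide's error type "incorrect use of perfect verb instead of imperfect".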
Disambiguation Module
[Figure: Multiple Analyses enter the module, which contains three stages (Prioritized Conditions, Affix Collection, and Pattern Collection) plus a No Action branch, and produces the Selected Analysis.]
Prioritized Conditions
Input: Multiple Analyses, together with the question goal from Item Banking.
• If the question goal is to test passive voice: select the passive analysis. Otherwise: select the active analysis.
• If the question goal is to test imperative tense: select the imperative verb analysis. Otherwise: select the perfect or imperfect verb analysis.
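The two prioritized conditions can be sketched as a small filter over candidate analyses. This is a hedged illustration, not the system's code: the function `select_by_goal`, the dictionary keys `"voice"` and `"tense"`, the goal strings, the decision to apply both conditions in sequence, and the fallback that keeps the input when no candidate matches are all assumptions.

```python
def select_by_goal(analyses, goal):
    """Filter candidate analyses using the question goal.

    `analyses` is a list of dicts with hypothetical keys "voice" and "tense".
    """
    # Condition 1: prefer passive only when the question tests passive voice.
    wanted_voice = "passive" if goal == "passive voice" else "active"
    by_voice = [a for a in analyses if a["voice"] == wanted_voice]
    by_voice = by_voice or analyses  # assumed fallback: keep all if none match

    # Condition 2: prefer imperative only when the question tests it.
    if goal == "imperative tense":
        by_tense = [a for a in by_voice if a["tense"] == "imperative"]
    else:
        by_tense = [a for a in by_voice if a["tense"] in ("perfect", "imperfect")]
    return by_tense or by_voice


# Two candidate analyses of one imperfect verb form: active (with an error)
# and passive.
candidates = [
    {"voice": "active", "tense": "imperfect", "note": "converted middle letter"},
    {"voice": "passive", "tense": "imperfect", "note": ""},
]
for_passive_question = select_by_goal(candidates, "passive voice")
for_other_question = select_by_goal(candidates, "hollow verb conjugation")
```

When the question targets the passive voice the passive reading is kept; for any other goal the active reading, which implies a learner error, is preferred.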
Example
If the learner writes the following sentence:
تباع جدتي الارز (My-grandmother sells the-rice)
The system produces two analyses:
• Third person singular feminine imperfect verb in the active voice with converted middle letter
• Third person singular feminine imperfect verb in the passive voice
Disambiguation Module (cont'd)
Example
If the learner writes the following sentence:
محمد تورطت في جريمة قتل (Mohamed was-involved in murder crime)
The system produces four analyses:
• First person singular perfect verb in the active voice
• Second person singular masculine perfect verb in the active voice
• Second person singular feminine perfect verb in the active voice
• Third person singular feminine perfect verb in the active voice
[Slide callout: singular perfect verb conjugation in the active voice]
Disambiguation Module (cont'd)
Example
If the learner writes the following sentence:
جدي وجدتي نقلوا الي بيت جديد (my-grandfather and my-grandmother moved to a new house)
The system produces two analyses:
• Third person masculine plural perfect verb in the active voice following the pattern 'فعل'.
• Third person masculine plural perfect verb in the active voice following the pattern 'فعّل'.
[Slide callout: third person masculine plural perfect verb in the active voice instead of dual]
Evaluation & Results
• A test set of 116 real Arabic sentences was collected.
• The system successfully resolved 60% of the cases.
Evaluation Problems Classification
• Orthographic match between undiacritized forms. For example, the word تناول can be: 1) the noun تناول /tanAwul/ (dealing with/eating), 2) the perfect verb تناول /tanAwala/ (he/it-dealt with/ate), or 3) the imperfect verb تناول /tunAwil/ (hand over/deliver).
• Additional orthographic matches as a result of relaxing a constraint. For example, the erroneous word أجوب can be: 1) the imperfect verb أجيب />u-jiyb/ (I-answer), or 2) the imperfect verb أجوب />a-juwb/ (I-explore).
Conclusions
• The ambiguity problem presents a challenge to ILTS.
• The preferred method in ILTS for disambiguating multiple readings should consider the likelihood of an error and the difficulty of concepts.
• If a large tagged learner corpus existed, the ambiguity problem could be resolved by considering the likelihood of errors.
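A minimal sketch of what "considering the likelihood of errors" could look like if such a corpus were available. Everything here is hypothetical: the function `most_likely_analysis`, the `"error"` key, and above all the counts, which are invented for illustration and are not results from the evaluation.

```python
from collections import Counter


def most_likely_analysis(analyses, error_counts):
    """Pick the analysis whose implied error type is most frequent in the corpus."""
    total = sum(error_counts.values())

    def likelihood(analysis):
        # Relative frequency of the error type; unseen types count as zero.
        return error_counts[analysis["error"]] / total if total else 0.0

    return max(analyses, key=likelihood)


# Hypothetical error-tag counts, made up purely for illustration.
error_counts = Counter({"verb tense": 40, "verb conjugation": 25, "vowel letters": 10})

candidates = [
    {"reading": "perfect verb, 1st sg", "error": "verb tense"},
    {"reading": "imperfect verb, 3rd sg", "error": "verb conjugation"},
]
best = most_likely_analysis(candidates, error_counts)
```

A fuller treatment would also weigh the difficulty of the concept being tested, which the second conclusion names as the other factor.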
Thank you