
Learning Subjective Nouns using
Extraction Pattern Bootstrapping
Ellen Riloff
School of Computing, University of Utah
Janyce Wiebe, Theresa Wilson
Computer Science, University of Pittsburgh
CoNLL-03
Introduction (1/2)

Many Natural Language Processing applications can benefit from being able to distinguish between factual and subjective information.

Subjective remarks come in a variety of forms, including opinions, rants, allegations, accusations, and speculation.
Question answering (QA) systems should distinguish between factual and speculative answers.
Multi-document summarization systems need to summarize different opinions and perspectives.
Spam filtering systems must recognize rants and emotional tirades, among other things.
Introduction (2/2)

In this paper, we use the Meta-Bootstrapping (Riloff and Jones 1999) and Basilisk (Thelen and Riloff 2002) algorithms to learn lists of subjective nouns:

Both bootstrapping algorithms automatically generate extraction patterns to identify words belonging to a semantic category.
We hypothesize that extraction patterns can also identify subjective words.
The pattern "expressed <direct_object>" often extracts subjective nouns, such as "concern", "hope", and "support".
Both bootstrapping algorithms require only a handful of seed words and unannotated texts for training; no annotated data is needed at all.
Annotation Scheme

The goal of the annotation scheme is to identify and characterize expressions of private states in a sentence.
"Private state" is a general covering term for opinions, evaluations, emotions, and speculations.
"The time has come, gentlemen, for Sharon, the assassin, to realize that injustice cannot last long" -> the writer expresses a negative evaluation.
Annotators are also asked to judge the strength of each private state. A private state can have low, medium, high, or extreme strength.
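An individual private-state annotation under the scheme above can be pictured as a small record. A minimal sketch; the field names are illustrative, not the scheme's actual markup:

```python
from dataclasses import dataclass

# Hypothetical record for one private-state annotation: the annotated
# expression, whose private state it is, and the judged strength.
@dataclass
class PrivateState:
    text_span: str   # the expression, e.g. "the assassin"
    source: str      # whose private state it is, e.g. "writer"
    strength: str    # one of: "low", "medium", "high", "extreme"

ann = PrivateState("the assassin", "writer", "high")
print(ann.strength)  # high
```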
Corpus, Agreement Results

Our data consist of English-language versions of foreign news documents from FBIS.
The annotated corpus used to train and test our subjective classifiers (the experiment corpus) consists of 109 documents with a total of 2197 sentences.
We use a separate, annotated tuning corpus to establish experiment parameters.
Extraction Patterns

In the last few years, two bootstrapping algorithms have been developed to create semantic dictionaries by exploiting extraction patterns.
Extraction patterns represent lexico-syntactic expressions that typically rely on shallow parsing and syntactic role assignment, e.g., "<subject> was hired".
A bootstrapping process looks for words that appear in the same extraction patterns as the seeds and hypothesizes that those words belong to the same semantic category.
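The idea of a pattern with one open slot can be sketched as follows. The clause representation (subject / verb phrase / direct object triples) is a simplification for illustration; real systems operate on shallow-parser output:

```python
# A pattern is a (slot, verb_phrase) pair: "<subject> was hired" becomes
# ("subject", "was hired"), "expressed <direct_object>" becomes
# ("dobj", "expressed"). Applying it collects the nouns filling the slot.
def apply_pattern(pattern, clauses):
    slot, verb_phrase = pattern
    return [c[slot] for c in clauses if c["verb_phrase"] == verb_phrase]

# Toy shallow-parsed clauses (invented for illustration).
clauses = [
    {"subject": "manager", "verb_phrase": "was hired", "dobj": None},
    {"subject": "government", "verb_phrase": "expressed", "dobj": "concern"},
    {"subject": "minister", "verb_phrase": "expressed", "dobj": "hope"},
]

print(apply_pattern(("subject", "was hired"), clauses))  # ['manager']
print(apply_pattern(("dobj", "expressed"), clauses))     # ['concern', 'hope']
```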
Meta-Bootstrapping (1/2)

The Meta-Bootstrapping process begins with a small set of seed words that represent a targeted semantic category (e.g., "seashore" is a location) and an unannotated corpus.
Step 1: MetaBoot automatically creates a set of extraction patterns for the corpus by applying syntactic templates.
Step 2: MetaBoot computes a score for each pattern based on the number of seed words among its extractions.
The best pattern is saved, and all of its extracted noun phrases are automatically labeled with the targeted semantic category.
Meta-Bootstrapping (2/2)

MetaBoot then re-scores the extraction patterns, using the original seed words plus the newly labeled words, and the process repeats. (Mutual Bootstrapping)
When the mutual bootstrapping process is finished, all nouns that were put into the semantic dictionary are re-evaluated:
Each noun is assigned a score based on how many different patterns extracted it.
Only the five best nouns are allowed to remain in the dictionary.
The mutual bootstrapping process then starts over again using the revised semantic dictionary.
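One pass of the mutual bootstrapping loop above can be sketched as follows. The RlogF pattern score (R * log2(F), where F counts the distinct category words a pattern extracts and R = F/N is its precision over its N distinct extractions) follows Riloff and Jones (1999); the toy patterns and seeds are invented for illustration:

```python
import math

def rlogf(extractions, seeds):
    """Score a pattern by RlogF over its unique extractions."""
    unique = set(extractions)
    f = len(unique & seeds)
    if f == 0:
        return 0.0
    return (f / len(unique)) * math.log2(f)

def metaboot_step(patterns, seeds):
    """Pick the best pattern; label everything it extracts as the category."""
    best = max(patterns, key=lambda p: rlogf(patterns[p], seeds))
    return best, set(patterns[best]) - seeds

# Toy data: each pattern maps to the noun phrases it extracted.
patterns = {
    "traveled to <np>": ["japan", "seashore", "beach", "minister"],
    "hired <np>": ["manager", "clerk"],
    "lives near <np>": ["seashore", "coast", "beach"],
}
seeds = {"seashore", "beach"}

best, new_words = metaboot_step(patterns, seeds)
print(best, new_words)  # lives near <np> {'coast'}
```

In the full algorithm the new words are added to the seed set and the patterns are re-scored, so the dictionary and the pattern rankings grow together.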
Basilisk (1/2)

Step 1: Basilisk automatically creates a set of extraction patterns for the corpus and scores each pattern based on the number of seed words among its extractions. Basilisk puts the best patterns into a Pattern Pool.
Step 2: All nouns extracted by a pattern in the Pattern Pool are put into a Candidate Word Pool. Basilisk scores each noun based on the set of patterns that extracted it and their collective association with the seed words.
Step 3: The top 10 nouns are labeled with the targeted semantic class and are added to the dictionary.
Basilisk (2/2)

The bootstrapping process then repeats, using the original seeds and the newly labeled words.
The major difference between Basilisk and Meta-Bootstrapping:
Basilisk scores each noun based on collective information gathered from all patterns that extracted it.
Meta-Bootstrapping identifies a single best pattern and assumes that everything it extracts belongs to the same semantic category.
In comparative experiments, Basilisk outperformed Meta-Bootstrapping.
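Basilisk's collective noun score can be sketched with the AvgLog measure from Thelen and Riloff (2002): average, over the patterns that extract a word, of log2(F_j + 1), where F_j counts the distinct seeds pattern j extracts. The toy patterns and seeds are invented for illustration:

```python
import math

def avg_log(word, patterns, seeds):
    """AvgLog(word): mean of log2(F_j + 1) over patterns extracting word,
    where F_j is the number of distinct seeds pattern j extracts."""
    extracting = [p for p, nps in patterns.items() if word in nps]
    if not extracting:
        return 0.0
    logs = [math.log2(len(seeds & patterns[p]) + 1) for p in extracting]
    return sum(logs) / len(extracting)

# Toy data: each pattern maps to the set of nouns it extracted.
patterns = {
    "expressed <dobj>": {"concern", "hope", "support"},
    "voiced <dobj>": {"concern", "anger"},
    "built <dobj>": {"bridge", "road"},
}
seeds = {"concern", "hope"}

# "support" shares a pattern with two seeds, "anger" with one,
# "bridge" with none, so they rank in that order.
for w in ("support", "anger", "bridge"):
    print(w, round(avg_log(w, patterns, seeds), 3))
```

Because the score pools evidence from every pattern that extracted a word, one noisy pattern cannot by itself promote a bad candidate, which is the contrast with Meta-Bootstrapping noted above.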
Experimental Results (1/2)

We created the bootstrapping corpus by gathering 950 new texts from FBIS and manually selecting 20 high-frequency words as seed words.
We ran each bootstrapping algorithm for 400 iterations, generating 5 words per iteration. Basilisk generated 2000 nouns and Meta-Bootstrapping generated 1996 nouns.
Experimental Results (2/2)

Next, we manually reviewed the 3996 words proposed by the algorithms and classified each word as StrongSubjective, WeakSubjective, or Objective.
[Figure: x-axis = number of words generated; y-axis = percentage of those words manually classified as subjective]
Subjective Classifier (1/3)

To evaluate the subjective nouns, we train a Naive Bayes classifier using the nouns as features. We also incorporate previously established subjectivity clues and add some new discourse features.
Subjective Noun Features:
We define four features, BA-Strong, BA-Weak, MB-Strong, and MB-Weak, to represent the sets of subjective nouns produced by the bootstrapping algorithms.
For each set, we create a three-valued feature based on the presence of 0, 1, or >=2 words from that set.
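The three-valued encoding above can be sketched directly. The set name and example sentence are illustrative:

```python
# For a given bootstrapped noun set, a sentence's feature value records
# whether it contains 0, 1, or >=2 words from that set.
def subjnoun_feature(sentence_words, noun_set):
    n = sum(1 for w in sentence_words if w in noun_set)
    return "0" if n == 0 else "1" if n == 1 else ">=2"

ba_strong = {"concern", "outrage", "hope"}  # hypothetical BA-Strong set
sent = "they expressed concern and outrage over the decision".split()

print(subjnoun_feature(sent, ba_strong))            # >=2
print(subjnoun_feature(["the", "report"], ba_strong))  # 0
```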
Subjective Classifier (2/3)

WBO Features:
Features from Wiebe, Bruce and O'Hara (1999), a machine learning system for classifying subjective sentences.
Manual Features:
Entries from (Levin 1993; Ballmer and Brennenstuhl 1981).
FrameNet lemmas with the frame element "experiencer" (Baker et al. 1998).
Adjectives manually annotated for polarity (Hatzivassiloglou and McKeown 1997).
Some subjective clues listed in (Wiebe 1990).
Subjective Classifier (3/3)

Discourse Features:
We use discourse features to capture the density of clues in the text surrounding a sentence.
First, we compute the average number of subjective clues and objective clues per sentence.
Next, we characterize the number of subjective and objective clues in the previous and next sentences as higher-than-expected (high), lower-than-expected (low), or expected (medium).
We also define a feature for sentence length.
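The steps above can be sketched as follows. Binning an exact match as "medium" and using "none" at document boundaries are assumptions for illustration, not necessarily the paper's exact choices:

```python
# Given per-sentence subjective-clue counts for a document, characterize
# the previous and next sentences of sentence i relative to the
# document's expected (average) clue density.
def discourse_features(counts, i):
    expected = sum(counts) / len(counts)
    def bin_(c):
        return "high" if c > expected else "low" if c < expected else "medium"
    prev = bin_(counts[i - 1]) if i > 0 else "none"
    nxt = bin_(counts[i + 1]) if i + 1 < len(counts) else "none"
    return prev, nxt

clue_counts = [3, 0, 2, 1]  # toy subjective-clue counts per sentence
print(discourse_features(clue_counts, 1))  # ('high', 'high')
print(discourse_features(clue_counts, 3))  # ('high', 'none')
```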
Classification Results (1/3)

We evaluate each classifier using 25-fold cross-validation on the experiment corpus and use paired t-tests to measure significance at the 95% confidence level.
We compute Accuracy (Acc) as the percentage of sentences that match the gold standard, and Precision (Prec) and Recall (Rec) with respect to subjective sentences.
Gold standard: a sentence is subjective if it contains at least one private-state expression of medium or higher strength; the objective class consists of everything else.
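The evaluation measures can be sketched with respect to the subjective class, on invented labels:

```python
# Accuracy over all sentences; precision and recall over the sentences
# predicted / labeled subjective, respectively.
def evaluate(gold, pred):
    tp = sum(g == p == "subj" for g, p in zip(gold, pred))
    acc = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    prec = tp / sum(p == "subj" for p in pred)
    rec = tp / sum(g == "subj" for g in gold)
    return acc, prec, rec

gold = ["subj", "subj", "obj", "subj", "obj"]
pred = ["subj", "obj", "obj", "subj", "subj"]
print(evaluate(gold, pred))  # acc 0.6, prec 2/3, rec 2/3
```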
Classification Results (2/3)

We train a Naive Bayes classifier using only the SubjNoun features. This classifier achieves good precision (77%) but only moderate recall (64%).
We find that the subjective nouns are good indicators when they appear, but not every subjective sentence contains a subjective noun.
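A Naive Bayes classifier over discrete features such as the three-valued SubjNoun values can be sketched as below. Laplace smoothing, the two-feature toy vectors, and the class labels are generic assumptions, not the paper's exact setup:

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Naive Bayes over categorical feature vectors, Laplace-smoothed."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.prior = {c: math.log(sum(lbl == c for lbl in y) / len(y))
                      for c in self.classes}
        self.counts = defaultdict(Counter)  # (class, position) -> value counts
        self.values = defaultdict(set)      # position -> observed values
        for xs, c in zip(X, y):
            for pos, v in enumerate(xs):
                self.counts[(c, pos)][v] += 1
                self.values[pos].add(v)
        return self

    def predict(self, xs):
        def loglik(c):
            total = self.prior[c]
            for pos, v in enumerate(xs):
                cnt = self.counts[(c, pos)]
                total += math.log((cnt[v] + 1) /
                                  (sum(cnt.values()) + len(self.values[pos])))
            return total
        return max(self.classes, key=loglik)

# Toy training data: two SubjNoun-style features per sentence.
X = [[">=2", "1"], ["1", "0"], ["0", "0"], ["0", "1"]]
y = ["subj", "subj", "obj", "obj"]
clf = NaiveBayes().fit(X, y)

print(clf.predict([">=2", "0"]))  # subj
print(clf.predict(["0", "0"]))    # obj
```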
Classification Results (3/3)

There is a synergy between these feature sets: using both types of features achieves better performance than either one alone.
In Table 8, Row 1, we use the WBO + SubjNoun + manual + discourse features. This classifier achieves 81.3% precision, 77.4% recall, and 76.1% accuracy.
Conclusion

We demonstrate that weakly supervised bootstrapping techniques can learn subjective terms from unannotated texts.
Bootstrapping algorithms can learn not only general semantic categories, but any category for which words appear in similar linguistic phrases.
The experiments suggest that reliable subjective classification requires a broad array of features.