A Database of Narrative Schemas
Nate Chambers and Dan Jurafsky
Stanford University
Two Joint Tasks
Events in a Narrative
Semantic Roles
suspect, criminal, client, immigrant, journalist, government, …
police, agent, officer, authorities, troops, official, investigator, …
Scripts
Schank and Abelson. 1977. Scripts, Plans, Goals and Understanding. Lawrence Erlbaum.
Mooney and DeJong. 1985. Learning Schemata for Natural Language Processing. IJCAI-85.
• Background knowledge for language understanding
Restaurant Script
• Hand-coded
• Domain dependent
Applications
• Coreference
• Argument prediction
• Summarization
• Inform sentence selection with event confidence scores
• Textual Inference
• Does a document entail other events?
• Selectional Preferences
• Use chains to inform argument types
• Aberration Detection
• Detect surprise/unexpected events in text
• Story Generation
The Protagonist
protagonist: (noun)
1. the principal character in a drama or other literary work
2. a leading actor, character, or participant in a literary work or real event
Narrative Event Chains
ACL-2008
Narrative Event Chains
(1) Narrative relations
(2) Single arguments
(3) Temporal ordering
Inducing Narrative Relations
Chambers and Jurafsky. Unsupervised Learning of Narrative Event Chains. ACL-08
Narrative Coherence Assumption
Verbs sharing coreferring arguments are semantically connected
by virtue of narrative discourse structure.
1. Dependency parse a document.
2. Run coreference to cluster entity mentions.
3. Count pairs of verbs with coreferring arguments.
4. Use pointwise mutual information to measure relatedness.
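As a rough illustration of steps 3 and 4, here is a minimal Python sketch, not the authors' released code: it assumes the parsing and coreference steps have already reduced each document to (verb, entity) pairs, and all names (pmi_scores, docs) are illustrative.

```python
import math
from collections import Counter
from itertools import combinations

def pmi_scores(docs):
    """Steps 3-4: count verb pairs that share a coreferring entity,
    then score each pair by pointwise mutual information.

    docs: list of documents, each a list of (verb, entity_id)
    tuples produced by parsing and coreference (steps 1-2)."""
    pair_counts, verb_counts = Counter(), Counter()
    for doc in docs:
        # Group the verbs attached to each coreference cluster.
        by_entity = {}
        for verb, entity in doc:
            by_entity.setdefault(entity, set()).add(verb)
            verb_counts[verb] += 1
        for verbs in by_entity.values():
            for v1, v2 in combinations(sorted(verbs), 2):
                pair_counts[(v1, v2)] += 1

    # PMI(v1, v2) = log P(v1, v2) / (P(v1) P(v2))
    total_pairs = sum(pair_counts.values())
    total_verbs = sum(verb_counts.values())
    return {
        (v1, v2): math.log((c / total_pairs) /
                           ((verb_counts[v1] / total_verbs) *
                            (verb_counts[v2] / total_verbs)))
        for (v1, v2), c in pair_counts.items()
    }
```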
Chain Example (ACL-08)
Schema Example
Police, Agent, Authorities
Prosecutor, Attorney
Judge, Official
Plea, Guilty, Innocent
Suspect, Criminal, Terrorist, …
Narrative Schemas
N = (E, C)
E = {arrest, charge, plead, convict, sentence}
C = {C1, C2, C3}
Learning Schemas
narsim(N, v_j) = Σ_{d ∈ D_{v_j}} max_{C_i ∈ C} chainsim(C_i, ⟨v_j, d⟩)
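A minimal sketch of how the narsim score above could be computed, assuming the chainsim function from the ACL-08/09 papers is available as a black box; the function and variable names are illustrative, not the released code.

```python
def narsim(schema_chains, verb, dependencies, chainsim):
    """narsim(N, v_j): for each grammatical dependency d of the verb
    (e.g. subject, object), take the best-scoring chain in the schema
    for <verb, d>, then sum those maxima over all dependencies."""
    return sum(
        max(chainsim(chain, verb, d) for chain in schema_chains)
        for d in dependencies
    )

# Toy usage with a stub chainsim: each chain is modeled as the
# set of (verb, dependency) slots it already contains.
def chainsim_stub(chain, verb, dep):
    return 1.0 if (verb, dep) in chain else 0.0

chains = [{("arrest", "subj"), ("charge", "subj")},
          {("arrest", "obj"), ("plead", "subj")}]
print(narsim(chains, "convict", ["subj", "obj"], chainsim_stub))  # 0.0
```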
Training Data
• NYT portion of the Gigaword Corpus
• David Graff. 2002. English Gigaword. Linguistic Data Consortium.
• 1.2 million documents
• Stanford Parser
• http://nlp.stanford.edu/software/lex-parser.shtml
• OpenNLP coreference
• http://opennlp.sourceforge.net
• Lemmatize verbs and noun arguments.
Viral Example
virus, disease, bacteria, cancer, toxoplasma, strain
mosquito, aids, virus, tick, catastrophe, disease
Authorship Example
book, report, novel, article, story, letter, magazine
company, author, group, year, microsoft, magazine
Temporal Ordering
• Supervised classifier for before/after relations
• Chambers and Jurafsky, EMNLP 2008.
• Chambers et al., ACL 2007.
• Classify all pairs of verbs in Gigaword
• Record counts of before and after relations
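A sketch of the counting step only, assuming the pairwise before/after classifier has already labeled each verb-pair occurrence; labeled_pairs and order_counts are hypothetical names, not the authors' code.

```python
from collections import Counter

def order_counts(labeled_pairs):
    """Aggregate classifier decisions into before/after counts and
    a simple directional confidence per verb pair.

    labeled_pairs: iterable of (verb1, verb2, label) tuples, where
    label is 'before' or 'after' as decided by the supervised
    pairwise classifier."""
    counts = Counter(labeled_pairs)
    confidence = {}
    for v1, v2 in {(a, b) for a, b, _ in counts}:
        before = counts[(v1, v2, "before")]
        after = counts[(v1, v2, "after")]
        confidence[(v1, v2)] = before / (before + after)
    return counts, confidence

# Toy usage: 'arrest' precedes 'convict' in 3 of 4 decisions.
pairs = [("arrest", "convict", "before")] * 3 + \
        [("arrest", "convict", "after")]
counts, conf = order_counts(pairs)
print(conf[("arrest", "convict")])  # 0.75
```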
The Database
1. Narrative Schemas (unordered)
1. Various sizes of schemas (6, 8, 10, 12)
2. 1813 base verbs
2. Temporal Orderings
1. Pairs of verbs
2. Counts of before and after relations
• http://cs.stanford.edu/people/nc/schemas/
Evaluations
• The Cloze Test
• Chambers, Jurafsky. ACL-2008.
• Chambers, Jurafsky. ACL-2009.
• Comparison to FrameNet
• Chambers, Jurafsky. ACL-2009.
• Corpus Coverage
• LREC 2010
Comparison to FrameNet
• Narrative Schemas
• Focuses on events that occur together in a narrative.
• Schemas represent larger situations.
• FrameNet (Baker et al., 1998)
• Focuses on events that share core roles.
• Frames typically represent single events.
Comparison to FrameNet
1. How similar are schemas to frames?
• Find “best” FrameNet frame by event overlap
2. How similar are schema roles to frame elements?
• Evaluate argument types as FrameNet frame elements.
FrameNet Schema Similarity
1. How many schemas map to frames?
• 13 of 20 schemas mapped to a frame
• 26 of 78 (33%) verbs are not in FrameNet
2. Verbs present in FrameNet
• 35 of 52 (67%) matched the frame
• 17 of 52 (33%) did not match
FrameNet Schema Similarity
• Why are 33% unaligned?
• FrameNet represents subevents as separate frames
• Schemas model sequences of events.
One Schema, Multiple FrameNet Frames
• Schema events: trade, rise, fall, slip, quote
• FrameNet frames matched: Exchange, Change Position on a Scale, Undressing, n/a
• The narrative relations within one schema span several frames, and some events match no frame at all.
Evaluations
• The Cloze Test
• Chambers, Jurafsky. ACL-2008.
• Chambers, Jurafsky. ACL-2009.
• Comparison to FrameNet
• Chambers, Jurafsky. ACL-2009.
• Corpus Coverage
• LREC 2010
Corpus Coverage Evaluation
• Narrative Schemas are generalized knowledge structures.
• Newspaper articles discuss specific scenarios.
• How many events in an article’s description are stereotypical events in narrative schemas?
Coverage Example
He is painfully aware
aware that
that ifif he
he sold
soldhis
his four-bedroom
four-bedroom brick
brick suburban
suburban
home for the $220,000 that he thinks
thinks he
he can
can get
get for
for itit and
and then
then paid
paidoff
off
his mortgage,
his
mortgage, he
he would
would walk
walk away with, as he
he puts
puts it,
it, ….
….
Article Text
• aware
• sell
• think
• pay
• walk
• put
Investing Schema
• invest
• sell
• take_out
• buy
• pay
• withdraw
• pull
• pull_out
• put
• put_in
Coverage Score
• Largest Connected Component
• Largest subset of vertices such that a path exists between all vertices
• Events are connected if there exists some schema such that both events are members.
Article Text
• aware
• sell
• think
• pay
• walk
• put
3 of the 6 are connected
50% coverage
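A minimal Python sketch of the coverage score under the definition above: build a graph whose edges connect events that co-occur in some schema, find the largest connected component by graph search, and divide by the number of events. The names and data layout are illustrative, not the evaluation code.

```python
from itertools import combinations

def coverage(article_events, schemas):
    """Coverage: size of the largest connected component over the
    article's events, divided by the number of events. Two events
    are connected when some schema contains both of them."""
    events = list(article_events)
    # Add an edge for every pair of events sharing a schema.
    adj = {e: set() for e in events}
    for schema in schemas:
        members = [e for e in events if e in schema]
        for e1, e2 in combinations(members, 2):
            adj[e1].add(e2)
            adj[e2].add(e1)

    # Find the largest connected component by graph search.
    seen, best = set(), 0
    for start in events:
        if start in seen:
            continue
        component, frontier = set(), [start]
        while frontier:
            e = frontier.pop()
            if e in component:
                continue
            component.add(e)
            frontier.extend(adj[e] - component)
        seen |= component
        best = max(best, len(component))
    return best / len(events)

# The worked example above: 3 of the 6 events connect -> 50%.
article = ["aware", "sell", "think", "pay", "walk", "put"]
investing = {"invest", "sell", "take_out", "buy", "pay",
             "withdraw", "pull", "pull_out", "put", "put_in"}
print(coverage(article, [investing]))  # 0.5
```

On the article above, sell, pay, and put all belong to the Investing schema and so form one component of size 3, giving 3/6 = 50%.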
Coverage Results
• 69 documents
• 740 events
• Macro-average document coverage
• Final coverage: 34%
“One third of a document’s events are part of a self-contained narrative schema.”
Evaluation Results
• The Cloze Test
• Schemas improve 36% on event prediction over verb-based similarity
• Comparison to FrameNet
• 65% of schemas match FrameNet frames
• 33% of schema events are novel to FrameNet
• Corpus Coverage
• 96% of events are connected in the space of events
• 34% of events are connected by self-contained schemas
Schemas Online
• Narrative Schemas
• http://cs.stanford.edu/people/nc/schemas/
• Coverage Evaluation (Cloze Test)
• http://cs.stanford.edu/people/nc/data/chains/
Nate Chambers and Dan Jurafsky
FrameNet Argument Similarity
2. Argument role mapping to frame elements
• 72% of arguments appropriate as frame elements
• Example: FrameNet frame Enforcing, frame element Rule
• Learned arguments: law, ban, rule, constitutionality, conviction, ruling, lawmaker, tax
• Arguments that do not fit the element are marked INCORRECT
Coverage Evaluation
1. Choose a news article at random.
2. Identify the protagonist.
3. Extract the narrative event chain.
• Match the chain to the narrative schema with the largest event overlap (sketched below).
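The overlap match in step 3 could be as simple as the following sketch; schemas here is a hypothetical mapping from schema name to its event set, not the authors' implementation.

```python
def best_schema(chain_events, schemas):
    """Return the (name, events) pair of the schema sharing the
    most events with the extracted chain; ties go to an arbitrary
    maximal schema."""
    return max(schemas.items(),
               key=lambda item: len(chain_events & item[1]))

# Example: the chain {sell, pay, put} matches Investing best.
schemas = {"investing": {"invest", "sell", "buy", "pay", "put"},
           "arrest":    {"arrest", "charge", "convict"}}
print(best_schema({"sell", "pay", "put"}, schemas)[0])  # investing
```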
Coverage Dataset
• NYT portion of the Gigaword Corpus
• Randomly selected the year 2001
• 69 random newspaper articles within 2001
• 100 were initially chosen; 31 that were not news articles were removed
• Identified the protagonist and events by hand
The Resource
• 3.5% of events in new documents are disconnected from the space of narrative relations
• Chambers and Jurafsky. ACL 2008.
• 66% of events in new documents are not clustered into generalized narrative schemas.
• The extent to which a document discusses new variants of known schemas remains for future work.