Event-Centric Summary Generation

Lucy Vanderwende, Michele Banko and
Arul Menezes
One Microsoft Way, WA, USA
DUC 2004
Abstract
• Our primary interest is twofold:
– To explore an event-centric approach to
summarization
– To explore a generation approach to summary
realization
Introduction
• Identifying important events, as opposed
to entities
• Generation component
– Human-authored summaries rely less on sentence
extraction
• Graph-scoring algorithm
– To identify the highest-weighted nodes to guide
content selection
System Description
• MSR-NLP
– Analysis component
• Rule-based syntactic analysis component
• Produces a logical form
– Normalizes syntactic variations and labels the relations between words
– Generation component
• Syntactic realization component
• Produces a syntactic tree
Creating document representations
• Cluster sentences
• Analyze each sentence to obtain its logical form
Creating document representations
• Produce triples from the logical form
– (LFNode_i, rel, LFNode_j)
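
The triple representation can be sketched as follows. This is a minimal illustration, not the MSR-NLP implementation: the nested-dict logical form, the `Triple` type, the function name, the "Tsub" label, and the sample words are all assumptions made for the example. Only the (node, rel, node) shape and the "Tobj" label come from the slides.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    head: str       # governing logical-form node (LFNode_i)
    rel: str        # semantic relation label
    dependent: str  # dependent logical-form node (LFNode_j)


def triples_from_logical_form(lf):
    """Flatten a nested logical form into (head, rel, dependent) triples.

    `lf` is a hypothetical structure: {"node": str, "args": [(rel, child), ...]}.
    """
    triples = []
    for rel, child in lf.get("args", []):
        triples.append(Triple(lf["node"], rel, child["node"]))
        # Recurse so that relations nested below the child are also emitted.
        triples.extend(triples_from_logical_form(child))
    return triples


# Made-up logical form for illustration only.
lf = {
    "node": "leave",
    "args": [
        ("Tsub", {"node": "patient"}),
        ("Tobj", {"node": "London_Bridge_Hospital"}),
    ],
}
triples = triples_from_logical_form(lf)
```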
Forming Document Graph
• Take those triples and join nodes by way
of their semantic relation using a
bidirectional link structure
• Keep track of how many times we observe
the relationship
• Stop words are not included in the graph
construction
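
A rough sketch of the graph construction described above, under simplifying assumptions: the stop-word list is a placeholder, the sample triples are invented, and the relation label is dropped so that only node pairs are counted.

```python
from collections import defaultdict

# Placeholder stop-word list; the list actually used is not given in the slides.
STOP_WORDS = {"the", "a", "of", "be"}


def build_graph(triples, stop_words=STOP_WORDS):
    """Return graph[u][v] = number of times a relation between u and v was seen."""
    graph = defaultdict(lambda: defaultdict(int))
    for head, _rel, dependent in triples:
        if head in stop_words or dependent in stop_words:
            continue  # stop words never become graph nodes
        graph[head][dependent] += 1  # forward link
        graph[dependent][head] += 1  # backward link (bidirectional structure)
    return graph


# Invented triples for illustration only.
triples = [
    ("leave", "Tobj", "London_Bridge_Hospital"),
    ("leave", "Tobj", "London_Bridge_Hospital"),
    ("leave", "Tobj", "government"),
    ("be", "Tobj", "government"),
]
graph = build_graph(triples)
```

Keeping the counts on both directions of a link means either endpoint can later be asked how often the relationship was observed.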
Node Scoring Using PageRank
• Use the PageRank algorithm
– Originally designed for hyperlinked structures such as the WWW
– A link between two nodes counts as a vote for the linked-to node
Node Scoring Using PageRank
• PageRank framework
– “Pages” correspond to the base forms of words in the
documents
– “Hyperlinks” correspond to semantic relationships
– Verbs identify events
– Nouns identify entities
– Use events to identify summary content
• Typically, the algorithm converges in around 40
iterations
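
The scoring step can be illustrated with a plain power-iteration PageRank over a word graph. This is a generic textbook formulation, not necessarily the exact variant used in the system: the damping factor, tolerance, iteration cap, and the toy graph are assumptions, and the number of iterations it takes depends on those choices.

```python
def pagerank(graph, damping=0.85, tol=1e-6, max_iter=200):
    """Score nodes of `graph` ({node: set of linked nodes}) by power iteration.

    Assumes every node appears as a key and that links are bidirectional,
    so no node with incoming links lacks outgoing links.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    iterations = 0
    for iterations in range(1, max_iter + 1):
        new_rank = {}
        for v in nodes:
            # Each node u linking to v "votes" with its rank split over its links.
            votes = sum(rank[u] / len(graph[u]) for u in nodes if v in graph[u])
            new_rank[v] = (1 - damping) / n + damping * votes
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank, iterations


# Toy bidirectional graph: one verb (event) linked to two nouns (entities).
graph = {
    "leave": {"hospital", "government"},
    "hospital": {"leave"},
    "government": {"leave"},
}
rank, iterations = pagerank(graph)
```

In this toy graph the verb node receives votes from both nouns and therefore ends up with the highest score.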
Graph Scoring
• Use PageRank scores to assess the link
weight LW(i→n)
Summary Generation
• Summaries are generated by extracting and merging
logical forms
– Identify important triples
• Defined as a highly weighted node together with
its most highly weighted links
• (leave, Tobj, London_Bridge_Hospital)
• Not (leave, Tobj, government)
– Extracted fragments are divided into “event” and
“entity” fragments
• Event fragments are used to generate the summary
• Entity fragments are used to expand upon references to the
same entity within the selected event fragment
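
The selection of important triples might look like the sketch below. The node scores and link weights are invented numbers supplied as inputs, since the link-weight formula itself is not reproduced here. The function name and the `top_k` parameter are likewise hypothetical.

```python
def important_triples(node_scores, link_weights, top_k=1):
    """Pick, for each of the top_k highest-scoring nodes, its heaviest link.

    node_scores:  {node: score}
    link_weights: {(head, rel, dependent): weight}
    """
    top_nodes = sorted(node_scores, key=node_scores.get, reverse=True)[:top_k]
    selected = []
    for node in top_nodes:
        candidates = [t for t in link_weights if t[0] == node]
        if candidates:
            selected.append(max(candidates, key=link_weights.get))
    return selected


# Invented scores and weights for illustration only.
node_scores = {"leave": 0.5, "London_Bridge_Hospital": 0.3, "government": 0.2}
link_weights = {
    ("leave", "Tobj", "London_Bridge_Hospital"): 0.7,
    ("leave", "Tobj", "government"): 0.1,
}
selected = important_triples(node_scores, link_weights)
```

With these numbers the triple involving the hospital is kept and the one involving the government is not, mirroring the example on the slide.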
Summary Generation
• Event fragment ordering
– Cluster event fragments by the event they refer to
– From each cluster, choose the fragment with the greatest
number of argument nodes for the event
– Order the selected event fragments
• Group sentences referring to the same entity
together
• Order sentences that exhibit event coreference
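
The steps above can be sketched roughly as follows. The fragment format, the use of the first argument as "the" entity of a fragment, and the sample data are assumptions made for illustration. Event-coreference ordering is not modeled.

```python
from collections import defaultdict


def select_and_order(fragments):
    """fragments: list of (event, [argument nodes]) pairs, each with >= 1 argument."""
    # 1. Cluster fragments by the event they refer to.
    clusters = defaultdict(list)
    for event, args in fragments:
        clusters[event].append(args)
    # 2. Per event, keep the fragment with the greatest number of argument nodes.
    chosen = [(event, max(arg_lists, key=len)) for event, arg_lists in clusters.items()]
    # 3. Group fragments about the same entity together, keeping entities in
    #    order of first appearance (sorted() is stable within a group).
    first_seen = {}
    for _event, args in chosen:
        first_seen.setdefault(args[0], len(first_seen))
    return sorted(chosen, key=lambda fragment: first_seen[fragment[1][0]])


# Invented fragments for illustration only.
fragments = [
    ("leave", ["patient", "London_Bridge_Hospital"]),
    ("leave", ["patient"]),
    ("announce", ["government"]),
    ("return", ["patient"]),
]
ordered = select_and_order(fragments)
```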
Experiments and Evaluation
• (Rule-based pronoun resolution method, 75% accuracy)
Experiments and Evaluation
• Reason: the potential to introduce disfluent text
Directions and Future Work
• Produce more human-like generated
summaries
• Further study the impact of anaphora
resolution
• Study new page-ranking algorithms
• Although the ordering step groups event fragments
that mention the same entity, we have not
yet implemented a system to combine
them into larger logical form constructions