Product Review Summarization - Web Information Retrieval

Product Review Summarization
Ly Duy Khang
Outline
1. Motivation
2. Problem statement
3. Related works
4. Baseline
5. Discussion
1. Motivation (1)
• A rapid expansion of e-commerce, where
more and more products are sold via online
portals (Amazon, eBay … )
• Online product reviews thus become an
important resource:
– Customers to share and find opinions about
products easily
– Producers to get certain degrees of feedback
1. Motivation (2)
2. Problem statement
• Given a set of reviews of a product, produce
an abstractive summary that captures users’
opinions about that product
3. Related works (1)
• Single-document summarization
– Extractive-based approach
• Sentence score + ranking
• Machine learning technique
– Abstractive-based approach
• Template
• Concept hierarchy
3. Related works (2)
• Multi-document summarization
– Extractive-based approach
• Sentence score + ranking + MMR + Ordering
– Abstractive-based approach
• Template
• Concept hierarchy
• Sentence fusion with paraphrasing rules
3. Related works (3)
• Sentiment analysis
– Review polarity classification
– PROS/CONS identification
– Mining review opinions
• Identify product facets
• Identify opinion orientation on the facet
4. Baseline (1)
• Extractive-based summary
• An integration of Hu and Liu (2004) [6] and
NUS at DUC 2005 [8]
4. Baseline (2)
4. Baseline (3)
• Product facets identification
– Association rule mining
• Each transaction consists of the nouns/noun phrases from a single
sentence
• The frequent itemsets are the candidate product facets
– Redundancy pruning
• Removing redundant single-word facets that are covered by a
longer facet (e.g. life -> battery life)
– Compactness pruning
• Removing meaningless multi-word facets whose words do not
occur close together in the sentences
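The facet identification steps above can be sketched as follows. This is a minimal illustration, not the baseline's actual implementation: it counts itemsets of at most two nouns by brute force instead of running a full association rule miner, and the function names, the `min_support` and `min_pure_support` thresholds, and the toy transactions are all assumptions made for the example. Compactness pruning is left out because it needs word positions, which this toy input does not carry.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Count noun itemsets of up to max_size items and keep the frequent ones."""
    counts = Counter()
    for items in transactions:
        unique = sorted(set(items))
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[frozenset(combo)] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


def redundancy_prune(facets, transactions, min_pure_support):
    """Drop single-word facets that rarely occur without a longer facet containing them."""
    kept = {}
    for facet, support in facets.items():
        supersets = [other for other in facets if facet < other]
        if len(facet) == 1 and supersets:
            # Pure support: sentences with this word but without any longer facet.
            pure = sum(
                1
                for items in transactions
                if facet <= set(items)
                and not any(s <= set(items) for s in supersets)
            )
            if pure < min_pure_support:
                continue
        kept[facet] = support
    return kept


# Hypothetical input: one list of nouns per review sentence.
transactions = [
    ["battery", "life"],
    ["battery", "life", "screen"],
    ["battery", "life"],
    ["screen"],
    ["screen", "price"],
    ["life"],
]
candidates = frequent_itemsets(transactions, min_support=2)
facets = redundancy_prune(candidates, transactions, min_pure_support=2)
```

With these toy sentences, "life" and "battery" are dropped because they almost always appear as part of "battery life", while "screen" and "battery life" survive as facets.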
4. Baseline (4)
• Sentiment classification
– Use WordNet to grow seed lists of positive (+) and negative (-)
adjectives
– Adjectives share the same orientation as their synonyms
and the opposite orientation to their antonyms
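The seed-growing idea can be illustrated with a small breadth-first propagation. This is a sketch under stated assumptions: the `synonyms` and `antonyms` dictionaries below are tiny hand-written stand-ins for WordNet lookups, and `grow_orientation` is a hypothetical helper name, not part of any library.

```python
from collections import deque


def grow_orientation(seeds, synonyms, antonyms):
    """Propagate +1/-1 orientation from seed adjectives to related adjectives."""
    orientation = dict(seeds)
    queue = deque(seeds)
    while queue:
        word = queue.popleft()
        sign = orientation[word]
        # Synonyms keep the orientation, antonyms flip it.
        neighbours = [(w, sign) for w in synonyms.get(word, [])]
        neighbours += [(w, -sign) for w in antonyms.get(word, [])]
        for other, other_sign in neighbours:
            if other not in orientation:  # first assignment wins
                orientation[other] = other_sign
                queue.append(other)
    return orientation


# Hypothetical stand-ins for WordNet synonym and antonym lookups.
seeds = {"good": 1, "bad": -1}
synonyms = {"good": ["great"], "great": ["superb"], "bad": ["poor"]}
antonyms = {"great": ["awful"], "poor": ["excellent"]}
lexicon = grow_orientation(seeds, synonyms, antonyms)
```

Here "superb" and "excellent" end up positive and "poor" and "awful" negative. Keeping the first orientation a word receives is a simplification; conflicting paths would need a tie-breaking rule in practice.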
4. Baseline (5)
• Reviews labeling with facets and polarity
– The unit of labeling is sentence
– The sum of the polarities of the opinion words in a sentence
yields the polarity of the whole sentence
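A sentence-level labeling step of this kind might look like the sketch below. It assumes a word-to-orientation lexicon (+1 or -1) and a list of single-word facets are already available; whitespace tokenization, the handling of a zero sum as neutral, and the function names are illustrative assumptions rather than the baseline's actual rules.

```python
def sentence_polarity(sentence, orientation):
    """Sum the orientations of known opinion words; return +1, 0 or -1."""
    tokens = [tok.strip(".,!?;:") for tok in sentence.lower().split()]
    total = sum(orientation.get(tok, 0) for tok in tokens)
    return (total > 0) - (total < 0)


def label_sentence(sentence, facets, orientation):
    """Attach the facets mentioned in the sentence and its overall polarity."""
    tokens = {tok.strip(".,!?;:") for tok in sentence.lower().split()}
    mentioned = [facet for facet in facets if facet in tokens]
    return mentioned, sentence_polarity(sentence, orientation)


# Hypothetical lexicon and facet list.
orientation = {"great": 1, "poor": -1, "awful": -1}
facets = ["screen", "battery"]
label = label_sentence(
    "The screen is great but the battery is poor and awful.",
    facets,
    orientation,
)
```

The example sentence contains one positive and two negative opinion words, so it is labeled with both facets and an overall negative polarity, which also shows why the sentence is a coarse unit of labeling.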
4. Baseline (6)
• Summary generation
– Sentences are clustered based on their labeling
– For each facet, we produce a summary
• Sentences are scored based on concept link similarity
• MMR ranks the sentences
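The ranking step can be sketched with a generic Maximal Marginal Relevance loop. Note the assumptions: word-overlap (Jaccard) similarity is swapped in for the concept link similarity used by the baseline, the relevance scores are supplied by the caller, and the `lam` trade-off value is arbitrary.

```python
def jaccard(a, b):
    """Word-overlap similarity; a stand-in for concept link similarity."""
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def mmr_rank(sentences, relevance, k, lam=0.7, sim=jaccard):
    """Greedily pick k sentences, trading relevance against redundancy."""
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:

        def score(i):
            # Penalize similarity to the closest already-selected sentence.
            redundancy = max(
                (sim(sentences[i], sentences[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in selected]


# Hypothetical sentences for one facet, with made-up relevance scores.
sentences = [
    "battery life is great",
    "battery life is really great",
    "battery drains fast",
]
summary = mmr_rank(sentences, relevance=[0.9, 0.85, 0.5], k=2, lam=0.5)
```

In this toy run the second sentence is skipped even though it is more relevant than the third, because it nearly repeats the first one.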
5. Discussion (1)
• Evaluation
– We plan to carry out a human evaluation.
5. Discussion (2)
• In the baseline,
– It inherits all the problems of extractive-based summaries
– The sentence is too coarse-grained a unit
– Relationships between facets are not addressed
References
[1] V. Hatzivassiloglou, J. L. Klavans, M. L. Holcombe, R. Barzilay, M. Y. Kan, and
K. R. McKeown. SimFinder: A Flexible Clustering Tool for Summarization.
Machine Learning, 1999.
[2] R. Barzilay, K. R. McKeown, and M. Elhadad. Information fusion in the
context of multidocument summarization. Proceedings of the 37th annual
meeting of the Association for Computational Linguistics on
Computational Linguistics, pages 550-557, 1999.
[3] I. Mani and M. T. Maybury. Advances in automatic text summarization.
1999.
[4] R. Mooney and G. DeJong. Learning schemata for natural language
processing. Strategies for Natural Language Processing, pages 146-176.
[5] E. Hovy and C. Lin. Automated text summarization in SUMMARIST.
Advances in Automatic Text Summarization, 94, 1999.
[6] M. Hu and B. Liu. Mining and summarizing customer reviews. Proceedings
of the tenth ACM SIGKDD international conference on Knowledge
discovery and data mining, pages 168-177, 2004.
[7] M. Hu and B. Liu. Mining opinion features in customer reviews.
Proceedings of the National Conference on Artificial Intelligence, pages
755-760, 2004.
[8] S. Ye, L. Qiu, T. S. Chua, and M. Y. Kan. NUS at DUC 2005: Understanding
Documents via Concept Links. Document Understanding Conference
(DUC05), 2005.
[9] X. Ding, B. Liu, and P. S. Yu. A holistic lexicon-based approach to opinion
mining. Proceedings of the international conference on Web search and
web data mining (WSDM '08), page 231, 2008.