Query-based Opinion Summarization
for Legal Blog Entries
Jack G. Conrad, Jochen L. Leidner, Frank Schilder, Ravi Kondadadi
Corporate Technology Research & Development
Twelfth International Conference on Artificial Intelligence & Law (ICAIL 2009)
Barcelona, Spain
8-12 June 2009
OUTLINE
• INTRODUCTION
• RELATED WORK
• SYSTEM
• METHODOLOGY
• RESULTS
• CONCLUSIONS
• FUTURE WORK
• AI & LAW PROPOSAL
INTRODUCTION (1/4)
— Motivations
• Amount and rate of legal information flow increasing
• Demands on attorneys for work products very high
• Essential for productivity tools to be efficient
• Legal blogs provide a more immediate forum
– Unmoderated, instantaneous, candid, terse
– Contain rich viewpoints, individual or in aggregate
• Missing piece: ability to summarize blog entries
– Legal professionals busy synthesizing traditional legal
materials (cases, statutes, analytical documents)
• Pressures due to case load and schedules immense
• Increasingly impossible to keep up with information bandwidth
• Means of consolidating, summarizing artifacts invaluable
J.G.Conrad, ICAIL09, 11 June 2009
INTRODUCTION (3/4)
• Key contributions
1. First work to perform multi-document opinion-based
summarization on legal blog entries
2. Extends the TAC evaluation of opinion summarization
task to assess the accuracy of measured polarity, using
expert reviewers
3. Presents a proposal to the AI & Law community — host a
formal track to pursue the topic in a more structured,
in-depth manner
INTRODUCTION (4/4)
• Opinion Mining for Legal Blogs
– Prospective Applications
1. Monitoring — follow what communities are saying about
firms, products, services, topics
2. Alerting — inform subscribers of unfavorable developments
3. Profiling — represent litigation patterns of attorneys, courts ...
4. Tracking — study decisions of judges, reputations of firms ...
5. Exploration/Education — present law students with
contrasting opinions
RELATED WORK (1/2)
Diagram: prior work across three overlapping areas
• Legal Domain (ICAIL, JURIX)
• Summarization (TREC, TAC, et al.)
• Sentiment Analysis (ICWSM)
– Ashley & Aleven (1991 ff.): intelligent tutoring
– Hachey & Grover (2006): argumentative zoning
– Saravanan & Raman (2006): Conditional Random Fields
– Lerman & McDonald (2009): sentiment-modeled summarizers
– Conrad & Schilder (2007): blawg polarity classification
– Conrad, Leidner, Schilder and Kondadadi (2009): blawg sentiment summarization
RELATED WORK (2/2)
— TAC, the Text Analysis Conference (www.nist.gov/tac/)
– a new annual international workshop sponsored by NIST
– the US National Institute of Standards & Technology
• organizers disseminate NLP-type tasks and datasets
• participants develop systems that solve the tasks
– submit their results to NIST for evaluation
• members can also propose new tasks for future workshops
– the sentiment summarization pilot task consisted of
producing short, coherent sentiment summaries of blog
text
• Thomson Reuters R&D addressed the task
• system produced multi-document summaries
SYSTEM (1/3)
— Workflow Diagram for Blawg Opinion Summarization
Topic: Google Net Neutrality
Sample Query: Has Google been a
consistent supporter of Net neutrality?
Sample Target: Google Net Neutrality
They want freedom yet support net neutrality? But Black doesn't believe that issues like net neutrality
or privacy or copyrights can be considered in isolation; they're all of a piece. The Wall Street Journal
attempted to kick up a controversy a couple weeks back with its pronouncement that major tech
companies—including Microsoft and Google—were backing away from their commitment to
network neutrality. As Ed Black, the group's president, puts it," Since we represent innovators, we
have continually taken a stand for competition policy that makes it possible for the next YouTube to
make it out of the dormitory or garage—so that the best technology can prevail over current business
models." “ When it comes to broadband deployment, CCIA wants to see federal money only going to
companies that roll out high-speed infrastructure: 25Mbps fiber links to the home or 2 4Mbps wireless
links in areas where fiber laying might be too expensive. Freedom on the Internet is critical to vibrant
communication and information exchange, which foster innovation and help drive our economy. Net
Neutrality seeks to treat all traffic on the internet the same and prevent service providers from
regulating availability or content.  # 6 Internet service providers are chomping at the bit to begin
charging for priority access on the Internet. “At the core of these issues,” he writes, “is the
question of how firmly we are committed to a common ethic of promoting Internet openness,
freedom,
SYSTEM (2/3)
— FastSum, design and application
• TR’s legal blog opinion summarization system
– multi-document summarization system
– harnesses regression Support Vector Machine
(SVM) for ranking candidate sentences
– original system extended to sentiment
– current system applied to legal domain (blawgs)
– Timeline: Summarization (2007), Sentiment (2008), Legal (2009)
SYSTEM (3/3)
FastSum Blog Opinion
Summarization Processing
Key Modifications
• A.1 HTML parsing &
clean-up module
• B.1 Question sentiment
& target analyzer
• C.1 Sentence tagger
• C.2 Target overlap
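As a rough illustration of what an HTML parsing and clean-up step such as A.1 might involve, the sketch below extracts visible text from a blog page with Python's standard `html.parser`. It is not the FastSum module: the names `TextExtractor` and `clean_html`, the choice of block-level tags, and the sample markup are all assumptions made for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""

    SKIP = {"script", "style"}
    BLOCK = {"p", "div", "br", "li"}  # assumed set of tags that separate text

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append(" ")  # keep adjacent blocks from fusing

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def clean_html(page: str) -> str:
    """Return the page's visible text with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(page)
    parser.close()
    return " ".join("".join(parser.parts).split())


# Made-up sample page, for illustration only.
sample = (
    "<html><head><style>p {color: red}</style></head><body>"
    "<p>Net neutrality &amp; privacy are <b>linked</b>.</p>"
    "<script>track();</script></body></html>"
)
print(clean_html(sample))
```

Real blog pages would also need navigation, archive lists and comment headers removed, which this sketch does not attempt.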
METHODOLOGY (1/7)
• Application of Thomson Reuters’ legal blog
opinion summarization system
1. Data collection via Web-based queries
• submitted to Web Search Engine, Blog Search Engine
2. Summary generation
• via modified FastSum System
3. Evaluation
• human assessment
– two assessors rated each summary
– measures modeled on TAC metrics
METHODOLOGY (2/7)
Scope                          Engine                     Properties (selected)
General Blog Search Engines    technorati.com             Includes authority score
(Focus: Blogosphere)           blogsearch.google.com      Date or relevancy ranking
                               www.blogsearchengine.com   Focus on higher quality content
Legal Blog Search Engines      www.blawg.com              Generally shorter entries
(Focus: Blawgosphere)          blawgsearch.justia.com     Date or relevancy ranking
                               www.blawgrepublic.com      Generally shorter entries
Blog Search Engines Examined along with Their Properties
METHODOLOGY (5/7)
— Evaluation
• Metrics used modeled on TAC (et al.)
evaluation
– Two metrics used:
1. Responsiveness
2. Linguistic Quality
– Scale: Five-point Likert [1-5]
• 5 = high
• 1 = low
– Scores generally track those of TAC, though task not
completely identical
METHODOLOGY (6/7)
— Evaluation: Responsiveness
Grade   Meaning     Interpretation
(5)     Very good   On point relative to question, including polarity
(4)     Good        Addresses question, including at least partially the polarity
(3)     Adequate    Marginally relevant to the question, independent of polarity
(2)     Poor        May have overlap with question topic, and its polarity
(1)     Very poor   Misses the general point of question, polarity aside
Reviewer Guidelines for Responsiveness [1-5]
METHODOLOGY (7/7)
— Evaluation: Linguistic Quality
Dimensions                Essential Considerations
Grammaticality            no datelines, system internal formatting, fragments, omissions, capitalization errors, etc.
Non-redundancy            no unnecessary repetition, especially among complete sentences, facts, noun phrases
Referential Clarity       easily identifiable pronouns and noun phrases, same with role in summary
Focus                     should have clear focus, sentences’ information should relate only to rest of summary
Structure and Coherence   should be well-structured and organized, sentences tied together, not an information heap
Reviewer Guidelines for Linguistic Quality [1-5]
RESULTS (1/2)
— Baseline Averages
No.       Topic               Polarity      Blogs per   Responsiveness      Linguistic Quality
Queries                                     Summary     Rater A   Rater B   Rater A   Rater B
2         Training Average:   +             4.5         2.5       3.0       3.0       2.5
10        Testing Average:    +/-/neut      4.4         2.1       2.0       2.0       2.3
• Scores comparable to those of TAC 2008 (in 2-3 range)
• Caveat — we scored for correct sentiment polarity; TAC didn’t
• Kappa statistic for inter-rater agreement between the pair of raters, κ = 0.75
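For readers unfamiliar with the kappa statistic, the sketch below shows how an agreement score of this kind can be computed for two raters. The slides do not say which kappa variant was used, so unweighted Cohen's kappa is assumed here, and the ratings in the example are invented for illustration; they are not the study's data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical 1-5 Likert ratings for eight summaries (not the study's data).
ratings_a = [2, 3, 2, 1, 4, 2, 3, 2]
ratings_b = [2, 3, 2, 2, 4, 2, 3, 1]
kappa = cohens_kappa(ratings_a, ratings_b)
print(f"kappa = {kappa:.2f}")
```

For ordinal scales such as a five-point Likert scale, a weighted kappa that gives partial credit for near-misses is a common alternative; the unweighted form above treats a 2-versus-3 disagreement the same as a 1-versus-5 one.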
RESULTS (2/2)
— Sample FastSum Summary
Topic: Anonymous Internet Query Logs
to shorten the time they keep information about their users. Under the new policy, Yahoo will delete the last
eight bits of the Internet Protocol, or I.P., address associated with a search query after 90 days. To start
with, there’s often not a one-to-one correspondence between IP addresses and Internet users. So, before
you search, think. It will also alter so-called cookie data related to each search log and strip out any
personal information, like a name, phone number, address or Social Security number, from the query. 2008
December 17 Internet WEB2.0 Internet news Categories Internet Archives: January 2009 December 2008
November 2008 October 2008 September 2008 Yahoo Limits Retention of Personal Data Posted by admin|
Internet| Wednesday 17 December 2008 4: 16 am, the Internet search company, said that it would limit the
time it holds identifiable personal information related to searches to 90 days to address the growing
concerns of privacy advocates and government regulators. It turns out that Yahoo won't be deleting the
contents of its search logs. Whether or not a search engine does this is usually disclosed in the search
engine’s privacy policy. If Yahoo's logs include information linking each user's various searches together,
then even deleting the IP address entirely probably won't be enough to safeguard user privacy. Ms. Toth
said she hoped that the new policy would make Yahoo’s search service more attractive with users
concerned about privacy. Obviously, Yahoo’s new policy will do little to allay such concerns.
RESULTS (2/2)
— Sample FastSum Summary (annotated)
Topic: Anonymous Internet Query Logs
Reviewer callouts on the sample summary:
• Topical overlap
• Deficient
• Useful to researcher
• Display of sentiment
CONCLUSIONS (1/1)
• Amount, rate of legal information flow growing
– Summarization, identification of trends increasingly valuable
• Forums like TAC-opinion summarization beginning to study topic
• For certain legal research, such synopses can be very helpful
• Viewpoints, individually or in aggregate, can expand arguments,
comprehension of underlying legal issues
• First effort to produce automatic opinion summaries
for entries in legal blog space
• Based on multiple documents
• For pre-specified polarity
– Trained on general, homogeneous news documents (okay)
– Trained on specific heterogeneous legal blogs (better)
• Assessed by expert legal reviewers
– Baseline scores in the low 2.0s out of 5 (comparable to TAC)
FUTURE WORK (1/1)
• Compare to other summarization systems/techniques
– From TAC or elsewhere
• Test against model summaries and use the nugget
pyramid evaluation method
• Train the ML component of FastSum on various blog
entries, rather than general news
• Formalize the role input data has on result sets; and
the impact output length has on results
• Incorporate more structure
– Qualitative — best template to harness?
– Quantitative — optimal length for each section?
• Leverage features from the legal domain
– E.g., use a legal dictionary to help rank sentences
AI & LAW PROPOSAL (1/1)
• For AI & Law (IAAIL) and TAC (NIST)
– NIST offers research groups shared task in multi-document
summarization
– Why not focus on a shared task in the legal domain?
• Needs to be assessed by the IAAIL and NIST communities to determine
interest
– Who would benefit?
1. Legal practitioners — potentially highly beneficial results
2. Legal researchers — thanks to valuable testbed
3. AI & Law Community — can breathe in new life, new members
• What data collections could be used?
(i) TAC uses the very large BLOG06 collection
(ii) Text Entailment uses the RTE collection; a hybrid also possible
Query-based Opinion Summarization
for Legal Blog Entries
Jack G. Conrad, Jochen L. Leidner, Frank Schilder, Ravi Kondadadi
Research & Development
Thank you!
Twelfth International Conference on Artificial Intelligence & Law (ICAIL 2009)
Barcelona, Spain
8-12 June 2009
Questions?
INTRODUCTION (1/4)
— Motivations
• Modern legal information environment increasingly
dynamic, fast-paced
– Blawgs (legal blogs) provide a more immediate forum
• Generally unmoderated, instantaneous, candid, terse
• Viewpoints to be gleaned, in aggregate or individually, are
rich
• Missing piece: ability to summarize blog entries
– Legal professionals busy simply synthesizing traditional
legal materials (cases, statutes, analytical documents)
• Pressures due to case load and schedules immense
• Increasingly impossible to keep up with information
bandwidth
• Means of consolidating, summarizing artifacts invaluable
AI & LAW PROPOSAL (1/1)
• For AI & Law (IAAIL) and TAC (NIST)
– NIST offers research groups a shared task in multi-document summarization
– Why not focus on such a shared task in the legal
domain?
– Needs to be assessed by the IAAIL and NIST communities to
determine interest
– Potentially of great benefit to legal practitioners
– Could raise the bar on current baseline system
– Could use:
• Blog-based data set like BLOG06, as used in TAC 2008,
with a legal component
• RTE (Recognizing Text Entailment) data set, again with a
legal component
• a combination of the two
RELATED WORK (1/2)
– Ashley & Aleven (1991 ff.) — produce intelligent tutoring
applications to teach law students how to argue in the context
of caselaw
– Farzindar & Lapalme (2004) — present the LetSum system
to summarize Canadian court decisions
– Hachey & Grover (2006) — apply argumentative zoning to
summarize decisions from the House of Lords
– Saravanan & Raman (2006) — use statistical graphical
models (CRFs) for legal summarization, while extracting
rhetorical roles
– Lerman, Blair-Goldensohn, and McDonald (2009) — show users have a
strong preference for summarizers that model sentiment over
non-sentiment baselines
SYSTEM (4/5)
• FastSum’s legal blog opinion summarization system
– Sequence of operation
1. Pre-processing
(a) tokenization
(b) sentence splitting
(c) boilerplate expression removal (e.g., ‘Response by ...’)
2. Question analysis
(a) sentiment analysis (tagging)
(b) target analysis (matching)
3. Sentiment filter
– sentences with proper polarity selected; else, filtered out
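The first three stages above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not FastSum's code: the word lists in `POSITIVE` and `NEGATIVE`, the `BOILERPLATE` pattern, the naive sentence splitter and all function names are invented for the example, and the question's polarity (the output of the question analysis stage) is simply passed in as `wanted_polarity`.

```python
import re

# Hypothetical mini-lexicons; a real system would use a much larger resource.
POSITIVE = {"support", "supports", "good", "helpful", "protects"}
NEGATIVE = {"oppose", "opposes", "bad", "harmful", "threatens"}
# Assumed boilerplate pattern, modeled on the slide's 'Response by ...' example.
BOILERPLATE = re.compile(r"^(response by|posted by)\b", re.IGNORECASE)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercased word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def polarity(tokens):
    """Return '+', '-' or 'neut' from lexicon hit counts."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "+" if score > 0 else "-" if score < 0 else "neut"


def sentiment_filter(text, wanted_polarity):
    """Keep non-boilerplate sentences whose polarity matches the question's."""
    kept = []
    for sentence in split_sentences(text):
        if BOILERPLATE.match(sentence):
            continue  # pre-processing step (c): drop boilerplate expressions
        if polarity(tokenize(sentence)) == wanted_polarity:
            kept.append(sentence)
    return kept


# Made-up blog text, for illustration only.
blog = (
    "Posted by admin on Monday. The new policy protects user privacy. "
    "Critics say it threatens innovation. Response by a reader follows."
)
print(sentiment_filter(blog, "+"))
print(sentiment_filter(blog, "-"))
```

A lexicon count like this ignores negation and target matching; it only shows where the filter sits in the sequence.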
RELATED WORK (2/2)
— TAC, the Text Analysis Conference
– a new annual international workshop sponsored by NIST
– the National Institute of Standards & Technology
• organizers disseminate NLP-type tasks and datasets
• participants develop systems that solve the tasks
– submit their results to NIST for evaluation
• members can also propose new tasks for future workshops
– the sentiment summarization pilot task consisted of
producing short, coherent sentiment summaries of blog
text
• our system produced multi-document summaries
– Related Conferences:
• TREC — the Text Retrieval Conference (started in mid-90s)
• DUC — Document Understanding Conference (from 2001-07)
SYSTEM (5/5)
• FastSum’s legal blog opinion summarization system
– Sequence of operation (cont.)
4. Feature extraction
– focus largely on correspondence with terms in query
• at different levels of granularity: title, description, document
– also harness sentence-based features
• length, position
5. Sentence ranker
– trained regression SVM on feature set — goal: summary worthiness
6. Redundancy removal
– basic idea — change relative importance of remaining sentences
w.r.t. currently selected sentences
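Steps 4 to 6 can be illustrated with a small, self-contained sketch. Note the substitution: the slides describe a trained regression SVM, whereas this example uses a fixed linear scoring function with hand-picked `WEIGHTS`, purely so that it runs without a trained model. The feature definitions, the `penalty` parameter and all function names are likewise assumptions for illustration.

```python
import re


def tokens(text):
    """Lowercased word tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def features(sentence, position, query_terms):
    """Query overlap, a length cue and a position cue, each in [0, 1]."""
    words = tokens(sentence)
    overlap = len(words & query_terms) / len(query_terms) if query_terms else 0.0
    length = min(len(words), 20) / 20.0
    earliness = 1.0 / (1 + position)
    return [overlap, length, earliness]


# Hand-picked weights standing in for a trained regression SVM (assumption).
WEIGHTS = [0.7, 0.1, 0.2]


def score(feature_vector):
    """Linear score playing the role of predicted 'summary worthiness'."""
    return sum(w * x for w, x in zip(WEIGHTS, feature_vector))


def summarize(sentences, query, max_sentences=2, penalty=0.8):
    """Greedy selection; after each pick, discount sentences that overlap it."""
    query_terms = tokens(query)
    scored = {i: score(features(s, i, query_terms)) for i, s in enumerate(sentences)}
    selected = []
    while scored and len(selected) < max_sentences:
        best = max(scored, key=scored.get)
        selected.append(best)
        del scored[best]
        best_words = tokens(sentences[best])
        for i in scored:
            words = tokens(sentences[i])
            shared = len(words & best_words) / max(len(words), 1)
            scored[i] *= 1.0 - penalty * shared  # redundancy discount
    return [sentences[i] for i in selected]


# Made-up candidate sentences, for illustration only.
sentences = [
    "Google supports net neutrality.",
    "Google supports net neutrality again.",
    "Yahoo will delete search logs after 90 days.",
    "Net neutrality rules may change soon.",
]
print(summarize(sentences, "Google net neutrality"))
```

In this example the second sentence scores well on its own but is almost entirely covered by the first pick, so the discount lets the fourth sentence take the second slot.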
SYSTEM (5/5)
• FastSum’s legal blog opinion summarization system
– Sequence of operation (cont.)
4. Feature extraction
– topic word frequency (title, description)
– content word frequency
– document frequency
– headline frequency
– sentence-based features (length, position)
<topic>
<num> D0703A </num>
<title> age
discrimination </title>
<narr>
This expose documents the
increasing occurrence of
age discrimination in the
workplace in Canada ...
</narr>
</topic>
5. Sentence ranker
– trained regression SVM on feature set — goal: summary worthiness
6. Redundancy removal
– basic idea — change relative importance of remaining sentences
w.r.t. currently selected sentences
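To make the topic word frequency feature concrete, the sketch below parses the sample topic shown on this slide and scores one candidate sentence against its title and narrative fields. The feature is computed here as the fraction of a sentence's tokens that also appear in the topic field, which is one plausible reading of the slide rather than a confirmed definition. The candidate sentence and the function names are invented for illustration, and the narrative text is kept elided exactly as on the slide.

```python
import re
import xml.etree.ElementTree as ET

# Sample topic copied from the slide; the narrative is elided there too.
TOPIC_XML = """
<topic>
<num> D0703A </num>
<title> age discrimination </title>
<narr>
This expose documents the increasing occurrence of
age discrimination in the workplace in Canada ...
</narr>
</topic>
"""


def words(text):
    """Lowercased word tokens, in order."""
    return re.findall(r"[a-z0-9]+", text.lower())


def topic_word_frequency(sentence, topic_words):
    """Fraction of the sentence's tokens that also occur in the topic field."""
    sentence_words = words(sentence)
    if not sentence_words:
        return 0.0
    hits = sum(1 for w in sentence_words if w in topic_words)
    return hits / len(sentence_words)


topic = ET.fromstring(TOPIC_XML.strip())
title_words = set(words(topic.findtext("title")))
narr_words = set(words(topic.findtext("narr")))

# Hypothetical candidate sentence, not taken from any corpus.
candidate = "Age discrimination complaints rose in Canada last year."
print(topic_word_frequency(candidate, title_words))
print(topic_word_frequency(candidate, narr_words))
```

No stop-word removal is applied, so function words such as "in" count as matches against the narrative; a production feature extractor would likely filter them.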