Transcript ICoS-4

© Johan Bos
November 2005
Carol Beer (Little Britain)
Computer says “no”
Question Answering
• Lecture 1 (Today):
Introduction; History of QA; Architecture of a QA system;
Evaluation.
• Lecture 2 (Friday):
Question Classification; NLP techniques for question analysis;
POS-tagging; Parsing; Semantic analysis; WordNet.
• Lecture 3 (Next Monday):
Retrieving Answers; Document pre-processing; Tokenisation;
Stemming; Lemmatisation; Named Entity Recognition;
Anaphora Resolution; Matching; Use of knowledge resources;
Reranking; Sanity checking.
What is Question Answering?
Information Pinpointing
Information required:
Average number of car accidents per
year in Sweden.
Two ways of getting this information:
- Ask Google or a similar search engine
(good luck!)
- Ask a QA system the question:
What’s the rate of car accidents in
Sweden?
QA vs IR
• Traditional method for information
access: IR (Information Retrieval)
– Think of IR as finding the “right book in a
library”
– Think of QA as a “librarian giving you the
book and opening it on the page with the
information you’re looking for”
QA vs IE
• Traditional method for information
access: IE (Information Extraction)
– Think of IE as finding answers to a predefined question (i.e., a template)
– Think of QA as asking any question you
like
What is Question Answering?
• Questions in natural language,
not queries!
• Answers, not documents!
Why do we need QA?
• Information overload problem
• Accessing information using traditional
methods such as IR and IE is limited
• QA is increasingly important because:
– The amount of available information keeps growing
– There is duplicate information
– There is false information
– More and more “computer illiterates” are
accessing electronically stored information
Information Avalanche
• Available information is growing*:
– 1999: 250 MB per person on earth
– 2002: 800 MB per person on earth
• People want specific information
* source: M. de Rijke 2005
People ask Questions*
* source: M. de Rijke 2005
Why is QA hard? (1/3)
• Questions are expressed in natural
language (such as English or Italian)
• Unlike formal languages, natural languages
allow a great deal of flexibility
• Example:
– What is the population of Rome?
– How many people live in Rome?
– What’s the size of Rome?
– How many inhabitants does Rome have?
Why is QA hard? (2/3)
• Answers are expressed in natural language
(such as English or Italian)
• Unlike formal languages, natural languages
allow a great deal of flexibility
• Example:
…is estimated at 2.5 million residents…
… current population of Rome is 2817000…
…Rome housed over 1 million inhabitants…
Why is QA hard? (3/3)
• Answers could be spread across different
documents
• Examples:
– Which European countries produce wine?
[Document A contains information about Italy, and
document B about France]
– What does Bill Clinton’s wife do for a living?
[Document A explains that Bill Clinton’s wife is
Hillary Clinton, and Document B tells us that she’s
a politician]
History of QA (de Rijke & Webber 2003)
• QA is by no means a new area!
• Simmons (1965) reviews 15
implemented and working systems
• Many ingredients of today’s QA
systems are rooted in these early
approaches
• Database-oriented, domain-specific systems,
as opposed to today’s systems that work on
large collections of unstructured text
Examples of early QA systems
• BASEBALL (Green et al. 1963)
Answers English questions about scores,
locations and dates of baseball games
• LUNAR (Woods 1977)
Accesses chemical data on lunar material
compiled during the Apollo missions
• PHLIQA1 (Scha et al. 1980)
Answers short questions against a database
of computer installations in Europe
Recent work in QA
• Since the 1990s research in QA has by
and large focused on open-domain
applications
• Recently interest in restricted-domain
QA has increased, in particular in
commercial applications
– Banking, entertainment, etc.
Architecture of a QA system
[Diagram: the question enters Question Analysis, which produces a query, an
answer-type and a question representation. The query is run by IR over the
corpus, returning documents/passages; Document Analysis turns these into
passage representations. Answer Extraction combines the answer-type, the
question representation and the passage representations into the answers.]
Question Analysis
• Input:
Natural Language Question
• Output:
Expected Answer Type
(Formal) Representation of Question
• Techniques used:
Machine learning, parsing
Document Analysis
• Input:
Documents or Passages
• Output:
(Formal) Representation of Passages
that might contain the answer
• Techniques used:
Tokenisation, Named Entity
Recognition, Parsing
Answer Retrieval
• Input:
Expected Answer Type
Question (formal representation)
Passages (formal representation)
• Output:
Ranked list of answers
• Techniques used:
Matching, Re-ranking, Validation
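
To make the division of labour concrete, here is a deliberately naive end-to-end sketch in Python of the three components just described, run on the Thames question from the example that follows. The function names, stopword list, unit pattern and toy corpus are all illustrative assumptions, not part of any actual system.

# Toy QA pipeline in the shape of the architecture above; every component
# is a crude stand-in (keyword IR, regex-based answer extraction).
import re

STOPWORDS = {"how", "long", "is", "the", "what", "of", "a", "an", "does"}

def analyse_question(question):
    """Question Analysis: keyword query plus a crude expected answer type."""
    tokens = re.findall(r"\w+", question.lower())
    query = [t for t in tokens if t not in STOPWORDS]
    answer_type = "MEASURE" if question.lower().startswith("how long") else "OTHER"
    return query, answer_type

def retrieve(query, corpus, k=3):
    """IR: rank passages by simple keyword overlap with the query."""
    ranked = sorted(corpus, key=lambda p: -sum(w in p.lower() for w in query))
    return [p for p in ranked[:k] if any(w in p.lower() for w in query)]

def extract_answers(answer_type, passages):
    """Answer Extraction: pull number+unit expressions if a MEASURE is expected."""
    if answer_type != "MEASURE":
        return []
    unit = r"\d+\s*(?:miles?|kilomet\w+|centimet\w+|km)"
    return [m for p in passages for m in re.findall(unit, p, re.IGNORECASE)]

corpus = [                                   # made-up passages echoing the slides
    "The river Thames has a length of 230 kilometers.",
    "The dog has a height of 60 centimeters.",
]
query, atype = analyse_question("How long is the river Thames?")
print(extract_answers(atype, retrieve(query, corpus)))   # ['230 kilometers']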
Example Run
[The architecture diagram again; the next slides step through it with an
example question.]
Example Run
Question (input to the system):
How long is the river Thames?
Example Run
Query produced by Question Analysis:
length river thames
Example Run
Answer-type produced by Question Analysis:
MEASURE
Example Run
Question representation produced by Question Analysis:
Answer(x) & length(y,x) & river(y) & named(y,thames)
Example Run
Documents/passages returned by IR:
A: NYT199802-31
B: APW199805-12
C: NYT200011-07
Example Run
Passage representations produced by Document Analysis:
A: 30(u) & mile(u) & length(v,u) & river(y)
B: 60(z) & centimeter(z) & height(v,z) & dog(z)
C: 230(u) & kilometer(u) & length(x,u) & river(x)
Example Run
Answers produced by Answer Extraction:
A: 30 miles
B: 60 centimeter
C: 230 kilometer
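
The extraction step above hinges on matching the question representation against the passage representations. The sketch below shows one way this could work under strong simplifying assumptions: both representations are flat lists of predicates, and a passage scores by how many question predicates it covers under a consistent variable binding. The data mirrors the slides; the greedy matching procedure is illustrative, not the algorithm of any particular system.

# Match the question representation against the passage representations from
# the example run; Answer(x) is modelled by remembering the answer variable.
QUESTION = [("length", "Y", "X"), ("river", "Y"), ("named", "Y", "thames")]
ANSWER_VAR = "X"

PASSAGES = {
    "A": [("30", "u"), ("mile", "u"), ("length", "v", "u"), ("river", "y")],
    "B": [("60", "z"), ("centimeter", "z"), ("height", "v", "z"), ("dog", "z")],
    "C": [("230", "u"), ("kilometer", "u"), ("length", "x", "u"), ("river", "x")],
}

def match(question, facts):
    """Greedily bind question variables to passage terms, predicate by
    predicate, and count how many question predicates are covered."""
    binding, score = {}, 0
    for pred, *args in question:
        for fpred, *fargs in facts:
            if fpred != pred or len(fargs) != len(args):
                continue
            trial = dict(binding)
            if all(trial.setdefault(a, f) == f for a, f in zip(args, fargs)):
                binding, score = trial, score + 1
                break
    return score, binding

for name, facts in PASSAGES.items():
    score, binding = match(QUESTION, facts)
    answer = binding.get(ANSWER_VAR)
    value = " ".join(p for p, *args in facts if answer in args and p != "length")
    print(name, score, value)   # C scores 2 with value "230 kilometer"

Here the overlap score already prefers passage C, whose length is predicated of the river, over A, where it is not, and rejects B entirely; re-ranking and sanity checking (Lecture 3) refine such decisions further.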
Evaluating QA systems
• International evaluation campaigns for
QA systems (open domain QA):
– TREC (Text Retrieval Conference)
http://trec.nist.gov/
– CLEF (Cross Language Evaluation Forum)
http://clef-qa.itc.it/
– NTCIR (NII Test Collection for IR Systems)
http://www.slt.atr.jp/CLQA/
TREC-QA (organised by NIST)
• Annual event, started in 1999
• Difficulty of the QA task increased over
the years:
– 1999: Answers in snippets, ranked list of
answers;
– 2005: Exact answers, only one answer.
• Three types of questions:
– Factoid questions
– List questions
– Definition questions
QA@CLEF
• CLEF is the “European edition” of TREC
• Monolingual (non-English) QA
– Bulgarian (BG), German (DE), Spanish (ES),
Finnish (FI), French (FR), Italian (IT), Dutch (NL),
Portuguese (PT)
• Cross-Lingual QA
– Questions posed in source language, answer
searched in documents of target language
– All combinations possible
Open-Domain QA
• QA at TREC is considered
“Open-Domain” QA
– Document collection is the AQUAINT corpus
(over a million documents)
– Questions can be about anything
• Restricted-Domain QA
– Documents describe a specific domain
– Detailed questions
– Less redundancy of answers!
TREC-type questions
• Factoid questions
– Where is the Taj Mahal?
• List questions
– What actors have played Tevye in
`Fiddler on the Roof'?
• Definition/biographical questions
– What is a golden parachute?
– Who is Vlad the Impaler?
What is a correct answer?
• Example Factoid Question
– When did Franz Kafka die?
• Possible Answers:
– Kafka died in 1923.
– Kafka died in 1924.
– Kafka died on June 3, 1924 from
complications related to Tuberculosis.
– Ernest Watz was born June 3, 1924.
– Kafka died on June 3, 1924.
What is a correct answer?
• Example Factoid Question
– When did Franz Kafka die?
• Possible Answers:
– Kafka died in 1923. [Incorrect]
– Kafka died in 1924.
– Kafka died on June 3, 1924 from
complications related to Tuberculosis.
– Ernest Watz was born June 3, 1924.
– Kafka died on June 3, 1924.
What is a correct answer?
• Example Factoid Question
– When did Franz Kafka die?
• Possible Answers:
– Kafka died in 1923.
– Kafka died in 1924. [Inexact (under-informative)]
– Kafka died on June 3, 1924 from
complications related to Tuberculosis.
– Ernest Watz was born June 3, 1924.
– Kafka died on June 3, 1924.
What is a correct answer?
• Example Question
– When did Franz Kafka die?
• Possible Answers:
– Kafka died in 1923.
– Kafka died in 1924.
– Kafka died on June 3, 1924 from
complications related to Tuberculosis. [Inexact (over-informative)]
– Ernest Watz was born June 3, 1924.
– Kafka died on June 3, 1924.
What is a correct answer?
• Example Question
– When did Franz Kafka die?
• Possible Answers:
– Kafka died in 1923.
– Kafka died in 1924.
– Kafka died on June 3, 1924 from
complications related to Tuberculosis.
– Ernest Watz was born June 3, 1924. [Unsupported]
– Kafka died on June 3, 1924.
What is a correct answer?
• Example Question
– When did Franz Kafka die?
• Possible Answers:
– Kafka died in 1923.
– Kafka died in 1924.
– Kafka died on June 3, 1924 from
complications related to Tuberculosis.
– Ernest Watz was born June 3, 1924.
– Kafka died on June 3, 1924. [Correct]
Answer Accuracy

                        # correct answers
  Answer Accuracy = -----------------------
                         # questions
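
As a quick sanity check, a one-line implementation of this measure, using the ranked-answer example shown later in this lecture (System A has a correct rank-1 answer for 2 of its 10 questions):

def accuracy(num_correct, num_questions):
    # answer accuracy as defined above
    return num_correct / num_questions

print(accuracy(2, 10))   # System A in the later example: 0.2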
Correct answers to list questions
Example List Question
Which European countries produce wine?
System A:
France
Italy
System B:
Scotland
France
Germany
Italy
Spain
Iceland
Greece
the Netherlands
Japan
Turkey
Estonia
Evaluation metrics for list questions
• Precision (P):

        # answers judged correct & distinct
  P = ---------------------------------------
               # answers returned

• Recall (R):

        # answers judged correct & distinct
  R = ---------------------------------------
          # total known correct answers

• F-Score (F):

        2 * P * R
  F = -------------
          P + R
Correct answers to list questions
Example List Question
Which European countries produce wine?

System A:
France
Italy
P = 1.00
R = 0.25
F = 0.40

System B:
Scotland
France
Germany
Italy
Spain
Iceland
Greece
the Netherlands
Japan
Turkey
Estonia
P = 0.64
R = 0.88
F = 0.74
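
A small sketch of the three list-question metrics, checked against the wine example above; it assumes, as the slide’s scores imply, that 8 correct answers are known in total and that 7 of System B’s 11 answers were judged correct and distinct.

def list_scores(correct_distinct, returned, known):
    # precision, recall and F-score for a list question, as defined above
    p = correct_distinct / returned
    r = correct_distinct / known
    return p, r, 2 * p * r / (p + r)

print(list_scores(2, 2, 8))    # System A: (1.0, 0.25, 0.4)
print(list_scores(7, 11, 8))   # System B: roughly (0.64, 0.88, 0.74)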
Other evaluation metrics

System A: Ranked answers (Accuracy = 0.2)

      Q1  Q2  Q3  Q4  Q6  Q7  Q8  Q9  ….  Qn
  A1  W   W   C   W   C   W   W   W   ….  W
  A2  W   W   W   W   W   W   W   W   ….  W
  A3  W   W   W   W   W   W   W   W   ….  W
  A4  W   W   W   W   W   W   W   W   ….  W
  A5  W   C   W   W   W   C   W   W   ….  W

System B: Ranked answers (Accuracy = 0.1)

      Q1  Q2  Q3  Q4  Q6  Q7  Q8  Q9  ….  Qn
  A1  W   W   W   W   C   W   W   W   ….  W
  A2  C   W   C   W   W   C   C   W   ….  C
  A3  W   C   W   W   W   W   W   W   ….  W
  A4  W   W   W   C   W   W   W   W   ….  W
  A5  W   W   W   W   W   W   W   W   ….  W

(C = correct, W = wrong; row Ai is the answer a system returned at rank i.
Accuracy counts only the rank-1 answers, so System A scores higher here.)
Mean Reciprocal Rank (MRR)
• Score for an individual question:
– The reciprocal of the rank at which
the first correct answer is returned
– 0 if no correct response is returned
• The score for a run:
– Mean over the set of questions in the test
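
A direct implementation of this definition; the list below encodes System A from the next slide, whose first correct answers appear at ranks 5, 1, 1 and 5, with no correct answer for the remaining questions (None).

def mrr(first_correct_ranks):
    # reciprocal rank of the first correct answer per question
    # (None = no correct answer, contributing 0), averaged over all questions
    return sum(1 / r for r in first_correct_ranks if r) / len(first_correct_ranks)

print(mrr([None, 5, 1, None, 1, 5, None, None, None, None]))   # 0.24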
MRR in action

System A: MRR = (.2+1+1+.2)/10 = 0.24

      Q1  Q2  Q3  Q4  Q6  Q7  Q8  Q9  ….  Qn
  A1  W   W   C   W   C   W   W   W   ….  W
  A2  W   W   W   W   W   W   W   W   ….  W
  A3  W   W   W   W   W   W   W   W   ….  W
  A4  W   W   W   W   W   W   W   W   ….  W
  A5  W   C   W   W   W   C   W   W   ….  W

System B: MRR = (.5+.33+.5+.25+1+.5+.5+.5)/10 = 0.42

      Q1  Q2  Q3  Q4  Q6  Q7  Q8  Q9  ….  Qn
  A1  W   W   W   W   C   W   W   W   ….  W
  A2  C   W   C   W   W   C   C   W   ….  C
  A3  W   C   W   W   W   W   W   W   ….  W
  A4  W   W   W   C   W   W   W   W   ….  W
  A5  W   W   W   W   W   W   W   W   ….  W
Open-Domain Question Answering
• TREC QA Track
– Factoid questions
– List questions
– Definition questions
• State-of-the-Art
– Hard problem
– Only a few systems achieve
good results
Accuracy TREC 2004 (n=28)
[Bar chart: answer accuracy of the 28 participating systems, binned from
0.0-0.1 up to 0.6-0.7, with the number of systems per bin on a 0-10 axis]
Friday
• QA Lecture 2:
– Question Classification
– NLP techniques for question analysis
– POS-tagging
– Parsing
– Semantic analysis
– Use of lexical resources such as WordNet
Question Classification
(preview)
• How many islands does Italy have?
• When did Inter win the Scudetto?
• What are the colours of the Lithuanian flag?
• Where is St. Andrews located?
• Why does oil float in water?
• How did Frank Zappa die?
• Name the Baltic countries.
• Which seabird was declared extinct in the 1840s?
• Who is Noam Chomsky?
• List names of Russian composers.
• Edison is the inventor of what?
• How far is the moon from the sun?
• What is the distance from New York to Boston?
• How many planets are there?
• What is the exchange rate of the Euro to the Dollar?
• What does SPQR stand for?
• What is the nickname of Totti?
• What does the Scottish word “bonnie” mean?
• Who wrote the song “Paranoid Android”?
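
As a taste of what question classification involves, here is a deliberately simple pattern-based classifier for questions like those above; the label set and rules are illustrative assumptions, not the scheme used in this course.

# Naive answer-type classification by question-word patterns.
import re

RULES = [
    (r"^how (many|much)\b", "NUMBER"),
    (r"^(how far|how long|what is the distance)\b", "MEASURE"),
    (r"^when\b", "DATE"),
    (r"^where\b", "LOCATION"),
    (r"^who\b", "PERSON"),
    (r"^why\b", "REASON"),
    (r"^(name|list)\b", "LIST"),
]

def answer_type(question):
    q = question.lower()
    for pattern, label in RULES:
        if re.search(pattern, q):
            return label
    return "OTHER"           # e.g. definition or "What does X stand for?"

print(answer_type("When did Inter win the Scudetto?"))     # DATE
print(answer_type("How many islands does Italy have?"))    # NUMBER
print(answer_type("Where is St. Andrews located?"))        # LOCATION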