Information Extraction Junichi Tsujii Graduate School of Science University of Tokyo Japan Ronen Feldman Bar Ilan University Israel.

Information Extraction
Junichi Tsujii
Graduate School of Science
University of Tokyo
Japan
Ronen Feldman
Bar Ilan University
Israel
Application Tasks of NLP
(1) Information Retrieval/Detection
To search and retrieve documents in response to queries for information
(2) Passage Retrieval
To search and retrieve parts of documents in response to queries for information
(3) Information Extraction
To extract information that fits pre-defined database schemas or templates, which specify the output formats
(4) Question/Answering Tasks
To answer general questions by using texts as a knowledge base: fact retrieval, combination of IR and IE
(5) Text Understanding
To understand texts as people do: Artificial Intelligence
Ranges of Queries
(1) Information Retrieval/Detection
(2) Passage Retrieval
(3) Information Extraction
Pre-defined: fixed aspects of information carried in texts
(4) Question/Answering Tasks
(5) Text Understanding
IE definitions
Entity: an object of interest such as a person or organization
Attribute: a property of an entity such as name, alias, descriptor or type
Fact: a relationship held between two or more entities such as Position of Person in Company
Event: an activity involving several entities such as terrorist act, airline crash, product information
IE accuracy: typical figures by information type
Entity: 90-98%
Attribute: 80%
Fact: 60-70%
Event: 50-60%
MUC conferences
MUC 1 to MUC 7
1987 to 1997
Topics:
- Naval operations (2)
- Terrorist activity (2)
- Joint ventures and microelectronics
- Management changes
- Space vehicles and missile launches
The ACE Evaluation
The ACE program: the challenge of extracting content from human language. The research effort is directed to master
- first the extraction of “entities”
- then the extraction of “relations” among these entities
- finally the extraction of “events” that are causally related sets of relations
After two years, top systems successfully capture well over 50% of the value at the entity level.
Applications of IE
- Routing of information
- Infrastructure for IR and categorization (higher level features)
- Event based summarization
- Automatic creation of databases and knowledge bases
Where would IE be useful?
- Semi-structured text
- Generic documents like news articles
- Documents in which most of the information is centred around a set of easily identifiable entities
Example of IE: FASTUS(1993)
Bridgestone Sports Co. said Friday it had set up a joint venture
in Taiwan with a local concern and a Japanese trading house to
produce golf clubs to be supplied to Japan.
The joint venture, Bridgestone Sports Taiwan Co., capitalized at 20
million new Taiwan dollars, will start production in January 1990
with production of 20,000 iron and “metal wood” clubs a month.
TIE-UP-1
Relationship: TIE-UP
Entities: “Bridgestone Sports Co.”
          “a local concern”
          “a Japanese trading house”
Joint Venture Company: “Bridgestone Sports Taiwan Co.”
Activity: ACTIVITY-1
Amount: NT$20000000
ACTIVITY-1
Activity: PRODUCTION
Company: “Bridgestone Sports Taiwan Co.”
Product: “iron and ‘metal wood’ clubs”
Start Date: DURING: January 1990
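The two linked templates above can be pictured as plain records. The sketch below is only an illustration of that idea, not FASTUS's own data format: the class names `Activity` and `TieUp`, the field names, and the integer `amount_ntd` field are assumptions made here. The amount is taken from the sentence "capitalized at 20 million new Taiwan dollars".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Activity:
    """Hypothetical record for an ACTIVITY template."""
    activity: str
    company: str
    product: str
    start_date: str


@dataclass
class TieUp:
    """Hypothetical record for a TIE-UP template that points to an Activity."""
    relationship: str
    entities: List[str]
    joint_venture_company: str
    activity: Activity
    amount_ntd: int  # assumption: amount stored as an integer number of NT$


# Slot values copied from the Bridgestone example in the text.
activity_1 = Activity(
    activity="PRODUCTION",
    company="Bridgestone Sports Taiwan Co.",
    product="iron and 'metal wood' clubs",
    start_date="DURING: January 1990",
)

tie_up_1 = TieUp(
    relationship="TIE-UP",
    entities=[
        "Bridgestone Sports Co.",
        "a local concern",
        "a Japanese trading house",
    ],
    joint_venture_company="Bridgestone Sports Taiwan Co.",
    activity=activity_1,
    amount_ntd=20_000_000,  # "20 million new Taiwan dollars"
)

print(tie_up_1.relationship, len(tie_up_1.entities), tie_up_1.activity.activity)
```

Keeping the activity as a separate record mirrors the way the TIE-UP template refers to ACTIVITY-1 by name instead of embedding its slots.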
FASTUS
Based on finite-state automata (FSA)
1. Complex Words:
Recognition of multi-words and proper names
(e.g. “set up”, “new Taiwan dollars”)
2. Basic Phrases:
Simple noun groups, verb groups and particles
(e.g. “a Japanese trading house”, “had set up”)
3. Complex Phrases:
Complex noun groups and verb groups
(e.g. “production of 20,000 iron and metal wood clubs”)
4. Domain Events:
Patterns for events of interest to the application
Basic templates are to be built.
(e.g. [company] [set up] [Joint-Venture] with [company])
5. Merging Structures:
Templates from different parts of the texts are merged if they provide information about the same entity or event.
Information Extraction
……….
Jurgen Pfrang, 51, reportedly stumbled upon the robbers on the
second floor of his Nanjing home early on Sunday.
The deputy general manager of Yaxing Benz, a Sino-German
joint venture that makes buses and bus chassis in nearby Yangzhou,
was hacked to death with 45 cm watermelon knives.
……….
Name of the Venture: Yaxing Benz
Products: buses and bus chassis
Location: Yangzhou, China
Companies involved:
(1) Name: X?
    Country: Germany
(2) Name: Y?
    Country: China
Information Extraction
A German vehicle-firm executive was stabbed to death ….
……….
Jurgen Pfrang, 51, reportedly stumbled upon the robbers on the
second floor of his Nanjing home early on Sunday.
The deputy general manager of Yaxing Benz, a Sino-German
joint venture that makes buses and bus chassis in nearby Yangzhou,
was hacked to death with 45 cm watermelon knives.
……….
Different template for crimes
Crime-Type: Murder
Type: Stabbing
The killed: Name: Jurgen Pfrang
            Age: 51
            Profession: Deputy general manager
Location: Nanjing, China
Interpretation of Texts
(1) Information Retrieval/Detection: User
(2) Passage Retrieval: User
(3) Information Extraction: System
(4) Question/Answering Tasks: System
(5) Text Understanding: System
Characterization of Texts
[Figure: IR system. Labels: Queries, Collection of Texts, Knowledge, Interpretation]
[Figure: passage IR system. Labels: Passage, Queries, Collection of Texts, Knowledge, Interpretation]
[Figure: IE system. Labels: NLP, Structures of Sentences, Texts, Templates, Knowledge, Interpretation]
IE as compromise NLP
[Figure: IE system within the general framework of NLP/NLU. Labels: Texts, Templates (predefined)]
Performance Evaluation
(1) Information Retrieval/Detection: Rather clear
(2) Passage Retrieval: A bit vague
(3) Information Extraction: Rather clear
(4) Question/Answering Tasks: A bit vague
(5) Text Understanding: Very vague
Query
N: Correct Documents
M: Retrieved Documents
C: Correct Documents that are actually retrieved
[Figure: the sets N and M within the collection of documents]
Precision: P = C / M
Recall: R = C / N
F-Value: F = 2PR / (P + R)
Query
N: Correct Templates
M: Retrieved Templates
C: Correct Templates that are actually retrieved
[Figure: the sets N and M within the collection of documents]
Precision: P = C / M
Recall: R = C / N
F-Value: F = 2PR / (P + R)
More complicated due to partially filled templates
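The three measures can be computed directly from the counts N, M and C defined above. The following is a minimal sketch; the function name `precision_recall_f` and the choice to return 0.0 when a denominator is zero are assumptions made here, and partially filled templates are not handled.

```python
from typing import Tuple


def precision_recall_f(n_correct: int, m_retrieved: int, c_hits: int) -> Tuple[float, float, float]:
    """Return (P, R, F) for N correct items, M retrieved items, C correct retrieved items."""
    # Assumption: an empty denominator yields 0.0 instead of raising an error.
    precision = c_hits / m_retrieved if m_retrieved else 0.0
    recall = c_hits / n_correct if n_correct else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_value = 2 * precision * recall / (precision + recall)
    return precision, recall, f_value


# Example: 10 correct templates exist, 8 were retrieved, 6 of those are correct.
p, r, f = precision_recall_f(n_correct=10, m_retrieved=8, c_hits=6)
print(p, r, f)
```

With these counts, P = 6/8 and R = 6/10, and F is their harmonic mean.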
Framework of IE
IE as compromise NLP
Difficulties of NLP
(1) Robustness: Incomplete Knowledge
Incomplete: Domain Knowledge, Interpretation Rules
General Framework of NLP
- Morphological and Lexical Processing
- Syntactic Analysis
- Semantic Analysis
- Context processing, Interpretation
Predefined Aspects of Information
Approaches for building IE systems
Knowledge Engineering Approach
- Rules crafted by linguists in cooperation with domain experts
- Most of the work is done by inspecting a set of relevant documents
Automatically trainable systems
- Techniques based on statistics and almost no linguistic knowledge
- Language independent
- Main input: annotated corpus
- Small effort for creating rules, but creating the annotated corpus is laborious
Techniques in IE
(1) Domain Specific Partial Knowledge:
Knowledge relevant to information to be extracted
(2) Ambiguities:
Ignoring irrelevant ambiguities
Simpler NLP techniques
(3) Robustness:
Coping with Incomplete dictionaries
(open class words)
Ignoring irrelevant parts of sentences
(4) Adaptation Techniques:
Machine Learning, Trainable systems
General Framework of NLP
- Morphological and Lexical Processing
- Syntactic Analysis
- Semantic Analysis
- Context processing, Interpretation
95 %
Part of Speech Tagger: FSA rules, statistic taggers
Open class words: Named entity recognition
(ex) Locations, Persons, Companies, Organizations, Position names
Local Context, Statistical Bias
Domain specific rules:
<Word><Word>, Inc.
Mr. <Cpt-L>. <Word>
Machine Learning: HMM, Decision Trees
Rules + Machine Learning
F-Value 90
Domain Dependent
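The two domain specific rules on this slide can be approximated with regular expressions. This is a rough sketch only: reading `<Word>` as a capitalized word and `<Cpt-L>` as a single capital letter is an interpretation made here, and the labels `COMPANY` and `PERSON` and the function `find_named_entities` are hypothetical names.

```python
import re
from typing import List, Tuple

# Assumption: <Word> is a capitalized word, <Cpt-L> is one capital letter.
RULES = [
    ("COMPANY", re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+, Inc\.")),  # <Word><Word>, Inc.
    ("PERSON", re.compile(r"\bMr\. [A-Z]\. [A-Z][a-z]+")),         # Mr. <Cpt-L>. <Word>
]


def find_named_entities(text: str) -> List[Tuple[str, str]]:
    """Apply every rule to the text and return (label, matched string) in text order."""
    found = []
    for label, pattern in RULES:
        for match in pattern.finditer(text):
            found.append((match.start(), label, match.group(0)))
    found.sort()  # order by position in the text
    return [(label, span) for _, label, span in found]


sentence = "Mr. J. Smith joined Acme Widgets, Inc. last year."
print(find_named_entities(sentence))
```

Rules of this kind rely only on local context, which is why they cope with open class words that no dictionary lists.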
FASTUS and the General Framework of NLP
FASTUS is based on finite-state automata (FSA). Its stages are shown next to the corresponding steps of the general framework.
Morphological and Lexical Processing
1. Complex Words:
Recognition of multi-words and proper names
Syntactic Analysis
2. Basic Phrases:
Simple noun groups, verb groups and particles
3. Complex Phrases:
Complex noun groups and verb groups
Semantic Analysis
4. Domain Events:
Patterns for events of interest to the application
Basic templates are to be built.
Context processing, Interpretation
5. Merging Structures:
Templates from different parts of the texts are merged if they provide information about the same entity or event.
Chomsky Hierarchy of Grammar / Hierarchy of Automata
Regular Grammar: Finite State Automata
Context Free Grammar: Push Down Automata
Context Sensitive Grammar: Linear Bounded Automata
Type 0 Grammar: Turing Machine
Going down the list, the formalisms become computationally more complex and less efficient.
Example: the language A^n B^n is not regular, so it cannot be recognized by a finite state automaton; it requires at least a context free grammar.
[Figure: finite-state automaton with states 0 to 4 and arcs labelled PN, ’s, Art, ADJ, N and P, shown with the input “John’s interesting book with a nice cover”]
Pattern matching
{PN ’s / Art} (ADJ)* N (P Art (ADJ)* N)*
PN ’s (ADJ)* N P Art (ADJ)* N
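The noun-group pattern can be tried out on a sequence of part-of-speech tags. The sketch below is an illustration under stated assumptions: it uses Python's `re` engine over a space-joined tag string instead of an explicit finite-state automaton, the tag `'s` is written with an ASCII apostrophe, and the tag sequence for the example phrase is assigned by hand.

```python
import re
from typing import List

# {PN 's / Art} (ADJ)* N (P Art (ADJ)* N)*, written over space-separated tags.
NOUN_GROUP = re.compile(r"(?:PN 's|Art)(?: ADJ)* N(?: P Art(?: ADJ)* N)*")


def is_noun_group(tags: List[str]) -> bool:
    """Return True if the whole tag sequence matches the noun-group pattern."""
    return NOUN_GROUP.fullmatch(" ".join(tags)) is not None


# Hand-assigned tags for "John's interesting book with a nice cover".
tags = ["PN", "'s", "ADJ", "N", "P", "Art", "ADJ", "N"]
print(is_noun_group(tags))
print(is_noun_group(["Art", "N"]))
print(is_noun_group(["ADJ", "N"]))
```

Because the pattern is a regular expression without nesting, it stays within the regular-grammar level of the hierarchy and can be compiled into a finite-state automaton.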
Example of IE: FASTUS(1993)
Bridgestone Sports Co. said Friday it had set up a joint venture
in Taiwan with a local concern and a Japanese trading house to
produce golf clubs to be supplied to Japan.
The joint venture, Bridgestone Sports Taiwan Co., capitalized at 20
million new Taiwan dollars, will start production in January 1990
with production of 20,000 “metal wood” clubs a month.
1. Complex words
Attachment ambiguities are not made explicit.
2. Basic Phrases:
Bridgestone Sports Co.: Company name
said: Verb Group
Friday: Noun Group
it: Noun Group
had set up: Verb Group
a joint venture: Noun Group
in: Preposition
Taiwan: Location
1. Complex words
A noun phrase such as “a Japanese trading house” is structurally ambiguous. Compare:
a Japanese tea house
a [Japanese tea] house
a Japanese [tea house]
Structural ambiguities of NP are ignored.
Example of IE: FASTUS(1993)
[COMPANY] said Friday it [SET-UP] [JOINT-VENTURE]
in [LOCATION] with [COMPANY] and [COMPANY] to
produce [PRODUCT] to be supplied to [LOCATION].
[JOINT-VENTURE], [COMPANY], capitalized at 20 million
[CURRENCY-UNIT] [START] production in [TIME]
with production of 20,000 [PRODUCT] a month.
3. Complex Phrases
Some syntactic structures
like …
Example of IE: FASTUS(1993)
[COMPANY] said Friday it [SET-UP] [JOINT-VENTURE]
in [LOCATION] with [COMPANY] to
produce [PRODUCT] to be supplied to [LOCATION].
[JOINT-VENTURE] capitalized at [CURRENCY] [START]
production in [TIME]
with production of [PRODUCT] a month.
3. Complex Phrases
Syntactic structures relevant to information to be extracted are dealt with.
Syntactic variations
GM set up a joint venture with Toyota.
GM announced it was setting up a joint venture with Toyota.
GM signed an agreement setting up a joint venture with Toyota.
GM announced it was signing an agreement to set up a joint
venture with Toyota.
[Figure: parse tree with nodes S, NP (GM), VP, V and N, in which the verb groups “signed ... agreement ... setting up” and “set up” are both reduced to [SET-UP]]
GM plans to set up a joint venture with Toyota.
GM expects to set up a joint venture with Toyota.
Example of IE: FASTUS(1993)
[COMPANY] [SET-UP] [JOINT-VENTURE]
in [LOCATION] with [COMPANY] to
produce [PRODUCT] to be supplied to [LOCATION].
[JOINT-VENTURE] capitalized at [CURRENCY] [START]
production in [TIME]
with production of [PRODUCT] a month.
3. Complex Phrases
4. Domain Events
[COMPANY] [SET-UP] [JOINT-VENTURE] with [COMPANY]
[COMPANY] [SET-UP] [JOINT-VENTURE] (others)* with [COMPANY]
The attachment positions of PPs are determined at this stage.
Irrelevant parts of sentences are ignored.
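The second domain-event pattern, with its skipped "(others)*" material, can be sketched as a scan over labelled chunks. This is a simplified illustration, not the FASTUS implementation: the chunk representation as (label, text) pairs, the function name `match_tie_up` and the output dictionary are assumptions made here.

```python
from typing import Dict, List, Optional, Tuple

Chunk = Tuple[str, str]  # (label, surface text), as produced by earlier stages


def match_tie_up(chunks: List[Chunk]) -> Optional[Dict[str, object]]:
    """Match COMPANY SET-UP JOINT-VENTURE (others)* with COMPANY and build a basic template."""
    labels = [label for label, _ in chunks]
    head = ["COMPANY", "SET-UP", "JOINT-VENTURE"]
    for i in range(len(labels) - 2):
        if labels[i:i + 3] != head:
            continue
        # Skip irrelevant chunks until "with" is directly followed by a COMPANY.
        for j in range(i + 3, len(labels) - 1):
            if labels[j] == "with" and labels[j + 1] == "COMPANY":
                return {
                    "Relationship": "TIE-UP",
                    "Entities": [chunks[i][1], chunks[j + 1][1]],
                }
    return None


chunks = [
    ("COMPANY", "Bridgestone Sports Co."),
    ("SET-UP", "had set up"),
    ("JOINT-VENTURE", "a joint venture"),
    ("in", "in"),
    ("LOCATION", "Taiwan"),
    ("with", "with"),
    ("COMPANY", "a local concern"),
]
print(match_tie_up(chunks))
```

Skipping "in Taiwan" here is what attaches the "with" phrase to the set-up event, which is the PP attachment decision made at this stage.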
Complications caused by syntactic variations
Relative clause
The mayor, who was kidnapped yesterday, was found dead today.
[NG] Relpro {NG/others}* [VG] {NG/others}*[VG]
[NG] Relpro {NG/others}* [VG]
Basic patterns -> Surface Pattern Generator -> Patterns used by Domain Event
The Surface Pattern Generator covers relative clause construction, passivization, etc.
FASTUS
Based on finite-state automata (FSA)
Example: NP, who was kidnapped, was found.
1. Complex Words:
2. Basic Phrases:
3. Complex Phrases:
4. Domain Events:
Patterns for events of interest to the application
Basic templates are to be built.
Piece-wise recognition of basic templates
5. Merging Structures:
Templates from different parts of the texts are merged if they provide information about the same entity or event.
Reconstructing information carried via syntactic structures by merging basic templates
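The merging step can be illustrated with templates held as dictionaries. This is a much simplified sketch and not the merging procedure of FASTUS itself: the function `merge_templates`, the use of a single key slot to decide that two templates describe the same entity, and the rule that conflicting slot values block the merge are all assumptions made here.

```python
from typing import Dict, Optional

Template = Dict[str, Optional[str]]


def merge_templates(a: Template, b: Template, key: str = "Company") -> Optional[Template]:
    """Merge two basic templates if they share the key slot and no filled slots conflict."""
    if a.get(key) is None or a.get(key) != b.get(key):
        return None  # not about the same entity
    merged = dict(a)
    for slot, value in b.items():
        current = merged.get(slot)
        if current is not None and value is not None and current != value:
            return None  # conflicting information, keep the templates apart
        if current is None:
            merged[slot] = value
    return merged


# Two partial templates recognized piece-wise in different parts of a text.
first = {"Company": "Bridgestone Sports Taiwan Co.", "Activity": "PRODUCTION", "Product": None}
second = {"Company": "Bridgestone Sports Taiwan Co.", "Product": "golf clubs", "Start Date": "January 1990"}
print(merge_templates(first, second))
```

Each piece-wise template carries only part of the information; merging is what puts the product and start date back together with the activity they belong to.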
Current state of the art of IE
1. Carefully constructed IE systems
F-60 level (inter-annotator agreement: 60-80%)
Domains:
- Telegraphic messages about naval operations (MUC-1: 1987, MUC-2: 1989)
- News articles and transcriptions of radio broadcasts about Latin American terrorism (MUC-3: 1991, MUC-4: 1992)
- News articles about joint ventures (MUC-5: 1993)
- News articles about management changes (MUC-6: 1995)
- News articles about space vehicles (MUC-7: 1997)
2. Handcrafted rules (named entity recognition, domain events, etc.)
Automatic learning from texts:
- Supervised learning: corpus preparation
- Non-supervised, or controlled learning