Transcript Slide 1
Building Ontologies Automatically
Theory and Demonstration
Dan Moldovan
Human Language Technology Research Institute
University of Texas at Dallas
Outline
Introduction to Ontologies
Automatic Ontology Building
Applications
OWL/RDF Representation
Jaguar-Jager Demo
CHiPS Demo
ABBYY - 2012
Ontology
An ontology is an organization of concepts and semantic relations
within a given domain
Ontologies explicitly represent knowledge about domains of interest;
i.e. what concepts are important and how they relate to each other
Ontologies serve as the backbone of semantic technologies and
applications
Ontologies can help users achieve a unified understanding of
concepts
Ontologies facilitate dealing with acronyms
Ontologies can be used as interchange formats to enable common
access to data
Ontology
Ontologies facilitate exchange of knowledge between machines and
between people and machines
Ontologies allow easier visualization of documents; i.e. which
concepts are important and how semantically distant they are
Once an ontology is created, it can be used to tag new texts to enable
better retrieval and further processing [this is the idea of the semantic
web]
Ontologies help browsing, searching and question answering; it is
possible to understand questions and provide semantic connections
between question concepts and text words
Ontologies for Question Answering
QP (Question Processing): determine the expected answer type and select the keywords
used to retrieve relevant passages
PR (Passage Retrieval): retrieve and rank passages that are relevant to the input question
Question classification
Answer type detection
Query formulation
Keyword expansion
AP (Answer Processing): extract an exact answer by evaluating all answer candidates
Answer surface form
Answer redundancy
[Figure: QA pipeline: Questions → Question Processing → Passage Retrieval (over Documents) → Answer Processing → Answers]
How to Create an Ontology?
Manual ontology creation:
Time consuming
Error prone
Requires subject matter experts
The end product is difficult to maintain
Hard to cope with the rapidly changing and vast amount of information available for a domain
Automatic/Semi-automatic ontology generation:
Leverage existing domain models to seed the process of extracting semantically rich ontologies from unstructured text
Automatically update the ontology when new documents are made available or the domain model changes
Communicate ontology content across multiple applications using OWL/RDF as the common interchange format
Allow the user to easily review, update, and maintain the ontology
Customize ontology relations using semantic calculus and/or user defined rules
Ontologies for Question Answering
QA system integrated with an automatic ontology building system
[Figure: architecture diagram with the components Documents, Ontology builder, Document stream, Indexer, Questions, Question-answering system, and Answers]
Outline
Introduction to Ontologies
Ontologies for Question Answering
Automatic Ontology Building
Applications
OWL/RDF Representation
Jaguar-Jager Demo
CHiPS Demo
Knowledge Acquisition from Text
KAT: automatically builds ontologies and knowledge bases (KBs) from
concepts and semantic relationships found in text
Constituents of an ontology/KB
Concepts/Vocabulary
Key domain concepts (often missing from general purpose machine-readable dictionaries, e.g., WordNet)
“weapon”, “WMD”, “launcher”
Relations between ontological concepts
“anthrax” ISA “biological weapon”, “anthrax” CAUSE “death”
Organization of Relations
Hierarchical (universally true transitive relations, e.g. ISA, PART-WHOLE)
Contextual (text-conveying relations identified by a semantic parser)
Types of Knowledge
Universal (or ontological)
Represented in hierarchies
Simple binary relations between concepts
“Chemical weapons such as nerve gas, …”
[Figure: nerve gas linked to chemical weapon by ISA]
Contextual
Represented in individual (semantic) contexts
Groups of relations centered on a common concept
“The forces launched a full-scale attack on Monday”
[Figure: launch linked to forces (AGENT), full-scale attack (THEME), and Monday (DURING)]
Knowledge Base Constituents
[Figure: a Knowledge Base made of an Ontology and Contextual knowledge. The Ontology holds a Concept set (C1 to C7) and a Hierarchy that links those concepts with ISA and PW relations (example: anthrax, biological weapon). Contextual knowledge holds groups of relations (R1 to R5) attached to concepts (example: assassinate, with AGT, THM, and TMP links to rebel, political leader, and may 21)]
Knowledge Acquisition from Text
Functionality
1. Produce ontologies
2. Link concepts and relations to text
3. Visualize ontology
4. Edit ontology
5. Enhance an existing ontology
6. Merge two ontologies into a consistent ontology
[Figure: KAT takes Documents and Seed concepts as input and produces an Ontology/KB (structured knowledge) containing concepts, universal knowledge, contextual knowledge, and pointers to text]
Automatically Building Ontologies
Ontology/KB creation
Knowledge extraction from text
Pattern recognition; semantic parsing
Knowledge representation and storage
Contextual vs. universal
XML; relational database
Knowledge base maintenance
Conflict resolution
Ontology mapping; ontology merging
User interaction; ontology modification
KAT Modules – Text Processing
KAT modules: 1. Text Processing; 2. Classification/Hierarchy Creation; 3. Knowledge Base Maintenance
Input: Documents, Seeds
Extract “concepts” of interest
Extract binary relations (universal)
Use semantic parser to obtain contextual knowledge
Output: Concepts, Contexts, Binary Relations
Example: “The rebels had access to chemical weapons, such as nerve gas and other poisonous gases.”
Text Processing
1. Candidate concepts: NPs that contain seed concepts (e.g., <modifier> <seed_word>) and NPs semantically linked to seed concepts
2. Concept selection: discard candidates that match certain criteria (e.g., <modifier_descriptive_adjective> <seed_word>)
3. Seed enrichment: enhance the current set of seeds with Step 2’s domain concepts and return to Step 1
4. Relation selection: collect all semantic relations that link domain concepts with other concepts (in- or out-of-the-domain). The relations between domain concepts will become part of the ontology.
[Figure: Documents and Seeds (keyword list or ontology) enter Text Processing, which consists of: Text extraction from HTML, MS Word, and PDF documents; Tokenization; Part of speech tagging; Named entity recognizer; Syntactic parser; Word sense disambiguation; Semantic parser; Concept extractor. The resulting Concepts and Relations feed Concept selection based on semantic links to seeds, Seed set augmentation, and Relation selection]
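The seed enrichment loop (Steps 1 to 3) can be sketched as a small fixed-point iteration. This is only an illustration under simplifying assumptions: noun phrases are already extracted, a candidate is any phrase that ends with a known seed, and "descriptive adjective" is a plain lookup set. The function name and its parameters are hypothetical and are not part of KAT.

```python
def enrich_seeds(noun_phrases, seeds, descriptive_adjectives, max_rounds=10):
    """Grow a seed set from noun phrases (hypothetical helper, not KAT's API)."""
    seed_set = {tuple(s.lower().split()) for s in seeds}
    phrases = [tuple(p.lower().split()) for p in noun_phrases]
    for _ in range(max_rounds):
        accepted = set()
        for tokens in phrases:
            for seed in seed_set:
                n = len(seed)
                # Step 1: candidate NPs follow the <modifier> <seed_word> pattern.
                if len(tokens) > n and tokens[-n:] == seed:
                    modifiers = tokens[:-n]
                    # Step 2: discard candidates modified by a descriptive adjective.
                    if not any(m in descriptive_adjectives for m in modifiers):
                        accepted.add(tokens)
        accepted -= seed_set
        if not accepted:
            break  # no new domain concepts were found, so stop iterating
        # Step 3: the accepted concepts become seeds for the next round.
        seed_set |= accepted
    return sorted(" ".join(s) for s in seed_set)


phrases = ["chemical weapon", "dangerous weapon", "Iraqi chemical weapon", "nerve gas"]
print(enrich_seeds(phrases, ["weapon"], {"dangerous"}))
```

Note that "nerve gas" is not picked up here: finding it requires the second kind of candidate (NPs semantically linked to seeds), which needs a semantic parser and is outside this sketch.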
Semantic Relations Stored in KB
Relation (Code) | Definition | Example
Agent (AGT) | X is the agent for Y; X is prototypically a person. | [XY] [John] [eats] eggs and ham
Cause (CAU) | X causes Y | [XY] [Drinking] causes [accidents].
Influence (IFL) | X caused something to happen to Y | [XY] [The war] had an impact on [the economy]
Instrument (INS) | X is an instrument in Y | [YX] John [broke] the window with [a hammer]. [YX] John [played] the Brandenburg Concerto on [the harmonica]
ISA | X is a (kind of) Y | [XY] [John] is a [person].
Location/Direction/Source/Path (LOC) | X is the location of Y or where Y takes place | [YX] There is [a cat] on [the roof] [YX] The hurricane [passes] through [Galveston].
Make-Produce (MAK) | X makes Y | [XY] [GM] manufactures [cars].
Manner (MNR) | X is the manner in which Y happens | [YX] John [read] [carefully]; [ran] [quickly]; [spoke] [hastily]
Part-Whole (PW) | X is a part of Y | [YX] [faculty] [professor]; [XY] [door] of the [car]
Property Type (PRO) | X is a property type of Y | [XY] [The color] of [the car] is blue.
Attribute/Value (VAL) | X is an attribute/value of Y | [YX] [The car] is [blue] [YX] [The color] of the car is [blue].
Purpose (PRP) | X is the purpose for Y; Y did something because this person wanted X | [YX] John [swims] for [fun]; Mary [works] part-time [to earn some extra money]
Quantification/Extent (QNT) | X is a quantification of Y; Y can be an entity or event | [XY] John saw [three] [hurricanes]. [YX] The budget [increased] with [10%]
Synonymy/Name (SYN) | X is a synonym/name/equal for/to Y | [XY] [FBI] ([Federal Bureau of Investigation]) [YX] [This car] is called ["Johann"]
Temporal (TMP) | X is the time of Y (when Y takes place) | [XY] John [woke up] at [noon]
Theme/Patient/Result/Consumed (THM) | X is the theme/patient/result/consumed in/from/of Y | [YX] John [painted] [his truck]. [YX] John [baked] [a cake].
Examples of Semantic Relations in text
Semantic Relations are the interconnections between words or
concepts that define the meaning of text. They are used as
elements of knowledge bases.
Example:
John went to the park yesterday because he saw hot air balloons taking off from there
Agent(John, went)
At-Location(went, to the park)
At-Time(went, yesterday)
Cause(saw, went)
Experiencer(He, saw)
Stimulus(hot air balloons taking off from there, saw)
Value(hot, air)
Part-Whole(hot air, balloons)
Is-A(hot air balloons, balloons)
Experiencer(hot air balloons, taking off)
At-Location(taking off, from there)
[Figure: the example sentence split into the segments John went / to the park / yesterday / because / he saw / hot air / balloons / taking off from there, with arcs labeled by the relations listed above]
Semantic Parser
Various syntactic patterns: verb-argument, complex nominals,
genitives, adjectival phrases/clauses, etc.
Semantic restrictions on relation arguments R(x,y)
Domain and range restrictions defined using an ontology of sorts
KINSHIP: [AnimateConcreteObject] [AnimateConcreteObject]
Filter relations that cannot exist between certain arguments
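A minimal sketch of domain and range filtering, assuming a toy "ontology of sorts" stored as child-to-parent links. Only the KINSHIP restriction comes from the slide; the sort names other than AnimateConcreteObject, the dictionary layout, and both function names are illustrative assumptions.

```python
# Hypothetical ontology of sorts: each sort points to its parent sort.
SORT_PARENT = {
    "Person": "AnimateConcreteObject",
    "Animal": "AnimateConcreteObject",
    "Vehicle": "InanimateConcreteObject",
    "AnimateConcreteObject": "ConcreteObject",
    "InanimateConcreteObject": "ConcreteObject",
}

# Relation -> (required sort of first argument, required sort of second argument).
# Only KINSHIP is taken from the slide.
RESTRICTIONS = {
    "KINSHIP": ("AnimateConcreteObject", "AnimateConcreteObject"),
}


def is_sort(sort, required):
    """Walk up the sort hierarchy until `required` is found or the root is passed."""
    while sort is not None:
        if sort == required:
            return True
        sort = SORT_PARENT.get(sort)
    return False


def relation_allowed(relation, first_sort, second_sort):
    """Return False when a restricted relation gets arguments of the wrong sort."""
    if relation not in RESTRICTIONS:
        return True  # no restriction declared for this relation in the sketch
    first_required, second_required = RESTRICTIONS[relation]
    return is_sort(first_sort, first_required) and is_sort(second_sort, second_required)


print(relation_allowed("KINSHIP", "Person", "Person"))
print(relation_allowed("KINSHIP", "Person", "Vehicle"))
```

The filter runs before classification, so a candidate pair such as a person and a vehicle never reaches the KINSHIP classifier.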
Semantic Parser
Bracketer – determine semantic dependencies between compound nouns with three or more nouns
Sugar industry analyst vs. Female industry analyst
Argument detection – identify argument pairs likely to encode a semantic relation based on lexico-syntactic patterns
Domain and range filtering – filter candidate arguments based on their semantic classes and relation definitions
Feature extraction – extract features corresponding to each pattern
Semantic class of modifier noun, syntactic path, voice, etc.
Machine learning classifiers – per-relation and per-pattern approaches
Support vector machines, Decision trees, Naïve Bayes, Semantic Scattering
Conflict resolution – resolve relation conflicts between classifiers
KAT Modules – Classification/Hierarchy Creation
KAT modules: Text Processing; Classification/Hierarchy Creation; Knowledge Base Maintenance
Input: Concepts, Binary Relations
Classify each concept against every other using defined procedures, obtaining a set of ISA relations
Add all ISA and other binary relations to the hierarchy using conflict resolution
Output: Hierarchy of relations
Examples:
“Scud missile” ISA “missile”
“Iraqi standing_army” ISA “Asian army”
“weapons inspection team” ISA “inspection team”
Subsumption used for Knowledge Classification
Proposition
Let C = A1 ⊓ ⋯ ⊓ Am ⊓ ∀R1.C1 ⊓ ⋯ ⊓ ∀Rn.Cn
be the normal form of the concept description C, and
D = B1 ⊓ ⋯ ⊓ Bk ⊓ ∀S1.D1 ⊓ ⋯ ⊓ ∀Sl.Dl
be the normal form of the concept description D.
Then C ⊑ D iff both of the following conditions hold:
(1) For all i, 1 ≤ i ≤ k, there exists j, 1 ≤ j ≤ m, such that Bi = Aj
(2) For all i, 1 ≤ i ≤ l, there exists j, 1 ≤ j ≤ n, such that Si = Rj and Cj ⊑ Di
This formulation of subsumption is
Sound (the “if” part holds)
Complete (the “only if” part holds)
The algorithm has polynomial complexity.
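The two conditions translate directly into a recursive check. The sketch below assumes a concept in normal form is stored as a set of atomic concept names plus a mapping from role name to filler concept, with each role appearing at most once. The Concept class and the function name are illustrative, not taken from KAT.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Normal form: A1 ⊓ ... ⊓ Am ⊓ ∀R1.C1 ⊓ ... ⊓ ∀Rn.Cn (illustrative layout)."""
    atoms: set = field(default_factory=set)            # atomic concepts A1..Am
    restrictions: dict = field(default_factory=dict)   # role name -> filler Concept


def is_subsumed_by(c, d):
    """Return True iff C ⊑ D according to conditions (1) and (2)."""
    # Condition (1): every atomic concept of D also occurs in C.
    if not d.atoms <= c.atoms:
        return False
    # Condition (2): every value restriction ∀S.D' of D is matched by some
    # ∀R.C' of C with R = S and C' ⊑ D'.
    for role, d_filler in d.restrictions.items():
        c_filler = c.restrictions.get(role)
        if c_filler is None or not is_subsumed_by(c_filler, d_filler):
            return False
    return True


weapon = Concept({"Weapon"})
bio_weapon = Concept({"Weapon", "Biological"}, {"causes": Concept({"Death"})})
print(is_subsumed_by(bio_weapon, weapon))
print(is_subsumed_by(weapon, bio_weapon))
```

Each pair of sub-descriptions is compared at most once per matching role, which is consistent with the polynomial bound stated above.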
Classification/Hierarchy Creation
Classification procedures
For domain concepts modifier1 head1 and modifier2 head2, create relations as follows:
If ISA(modifier1,modifier2) and ISA(head1,head2), then ISA(modifier1 head1, modifier2 head2)
Japan discount rate ISA Asian country interest rate
If ISA(modifier1,modifier2) and SYNONYMY(head1,head2), then ISA(modifier1 head1, modifier2 head2)
Japan discount rate ISA Asian country discount rate
If SYNONYMY(modifier1,modifier2) and ISA(head1,head2), then ISA(modifier1 head1, modifier2 head2)
Japan discount rate ISA Japan interest rate
If SYNONYMY(modifier1,modifier2) and SYNONYMY(head1,head2), then SYNONYMY(modifier1 head1, modifier2 head2)
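The four rules collapse into a small decision function. In this sketch the ISA and SYNONYMY facts are toy sets, and identical strings are treated as synonymous, an assumption that the "Japan discount rate" examples seem to rely on. All names are illustrative.

```python
# Toy background knowledge (assumed for the example, mirroring the slide's examples).
ISA_FACTS = {("japan", "asian country"), ("discount rate", "interest rate")}
SYNONYM_FACTS = set()


def isa(a, b):
    return (a, b) in ISA_FACTS


def synonymous(a, b):
    # Assumption: a term counts as a synonym of itself.
    return a == b or (a, b) in SYNONYM_FACTS or (b, a) in SYNONYM_FACTS


def classify_pair(concept1, concept2):
    """Each concept is a (modifier, head) pair; returns 'ISA', 'SYNONYMY', or None."""
    (mod1, head1), (mod2, head2) = concept1, concept2
    mod_syn, head_syn = synonymous(mod1, mod2), synonymous(head1, head2)
    mod_isa, head_isa = isa(mod1, mod2), isa(head1, head2)
    if mod_syn and head_syn:
        return "SYNONYMY"
    if (mod_isa or mod_syn) and (head_isa or head_syn):
        return "ISA"  # covers the three ISA-producing rules
    return None


japan_discount = ("japan", "discount rate")
print(classify_pair(japan_discount, ("asian country", "interest rate")))
print(classify_pair(japan_discount, ("asian country", "discount rate")))
print(classify_pair(japan_discount, ("japan", "interest rate")))
```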
Classification/Hierarchy Creation
Classification procedures
For domain concepts modifier head and head, create the ISA(modifier head, head) relation
nontaxable dividends ISA dividends
For domain concepts modifier1 modifier2 head, create relations as follows:
If modifier1 head exists, then ISA(modifier1 modifier2 head, modifier1 head)
nuclear weapon testing ISA nuclear testing
If modifier2 head exists, then ISA(modifier1 modifier2 head, modifier2 head)
nuclear weapon testing ISA weapon testing
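These head/modifier procedures only need the set of known domain concepts. The sketch below assumes whitespace tokenization with single-word modifiers and heads, which is a simplification; the function name is hypothetical.

```python
def head_modifier_isa(concepts):
    """Derive ISA pairs (specific, general) from the surface form of concepts."""
    known = {tuple(c.split()) for c in concepts}
    relations = set()
    for tokens in known:
        if len(tokens) == 2:
            # "modifier head" ISA "head", when the bare head is a domain concept.
            head = tokens[1]
            if (head,) in known:
                relations.add((" ".join(tokens), head))
        elif len(tokens) == 3:
            mod1, mod2, head = tokens
            # "modifier1 modifier2 head" ISA "modifier1 head" / "modifier2 head",
            # but only for the shorter concepts that actually exist.
            for shorter in ((mod1, head), (mod2, head)):
                if shorter in known:
                    relations.add((" ".join(tokens), " ".join(shorter)))
    return relations


domain = ["dividends", "nontaxable dividends",
          "nuclear weapon testing", "nuclear testing", "weapon testing"]
for specific, general in sorted(head_modifier_isa(domain)):
    print(specific, "ISA", general)
```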
Classification/Hierarchy Creation
Textual entailment for concept subsumption
monetary policy ? fiscal policy ISA economic policy ISA policy (WordNet hierarchy)
economic policy: (a government policy for maintaining economic growth and tax revenues)
fiscal policy: (a government policy for dealing with the budget (especially with taxation and borrowing))
monetary policy: (policy followed by the federal government through the Bank of Canada for controlling credit and the money supply in the economy [24])
Legend: IFL = INFLUENCE, MAK = MAKE-PRODUCE, PW = PART-WHOLE
[Figure: semantic graph connecting policy, economic policy, fiscal policy, monetary policy, government, federal government, budget, economy, and money supply through ISA, IFL, MAK, and PW links]
Domain Ontology/KB Creation - Example
“Our Balancing Act”
Quantity
Making sure that the available information is actually extracted
Beauty
Making sure that the ontology concepts are real concepts, not just sentence fragments
Relevance
Not including every concept mentioned in a sentence
“Striking the Balance”
Tuning text exploration aggressiveness
Pruning sentence phrases down to the “real concept”
Filtering out “ugly” sentence fragments
Handling conjunctions
“Tom and Bill” went to “Dallas and Fort Worth”
“Hank or Susan” went to “Chicago or New York”
Ontology - Example
Document collection: International Economics Book
2.8 MB of plain text
Seed ontology: economics reference taxonomy
558 seed concepts, e.g. aggregate demand, ATC curve, budget deficit, commodity money, etc.
791 semantic relations
International Economics Ontology
5,678 ontological concepts
13,878 semantic relations
AGENT, CAUSE, INFLUENCE, INSTRUMENT, ISA, AT-LOCATION, MAKE-PRODUCE, MANNER, PROPERTY, PURPOSE, PART-WHOLE, QUANTITY, SYNONYMY, THEME, AT-TIME, VALUE
KAT Modules – Knowledge Base Maintenance
KAT modules: Text Processing; Classification/Hierarchy Creation; Knowledge Base Maintenance
Knowledge base merging
Visualization
Knowledge base editing
User interaction
Modifications
Knowledge Base Maintenance
New concept integration: concepts and relations extracted from
incoming documents are added to the existing ontology
Establish a mapping between the new set of concepts/relations and the
existing ontology
Add non-mapped concepts and relations to the ontology
Ontology mapping: identify a set of rules that link concepts from one
ontology to analogous concepts (in another ontology)
Calculate semantic similarity of concepts
Similarity between the semantic models of concepts
Degree of textual entailment between the concepts’ glosses
Concept label-based similarity
Calculate semantic similarity of relations
Function of their arguments’ similarity degree
Knowledge Base Maintenance
Ontology merging: create a new ontology by combining information
from two or more ontologies
Map the ontologies (two at a time)
Combine domain concepts (use a single copy for mapped concepts)
Merge the relation sets of mapped concepts
Conflict resolution algorithm
Re-classify the new set of ontological concepts
Classification/hierarchy creation procedures
Conflict Resolution
Approach used – prevention
Start from an empty hierarchy and an input relation set
Add a relation from the input set to the hierarchy, if
It does not form a cycle
It is not redundant (does not duplicate a path)
Remove jump links
Properties of hierarchical relations
Transitive
If R(A,B) and R(B,C), then R(A,C)
ISA(cat,mammal) and ISA(mammal,animal) ⇒ ISA(cat,animal)
Strictly non-symmetric
If R(A,B), then NOT R(B,A)
ISA(cat,mammal) ⇒ ¬ISA(mammal,cat)
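A sketch of the prevention approach: relations are added one at a time and rejected when they would close a cycle or duplicate an existing path. The jump-link step follows the rule stated for direct descendants of B on the Jump Links slide. The adjacency layout (child mapped to a set of parents) and the function names are assumptions made for this example.

```python
def has_path(parents, start, goal):
    """Depth-first search along child -> parent links."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return False


def build_hierarchy(relations):
    """relations: iterable of (A, B) pairs meaning R(A, B), e.g. ISA(A, B)."""
    parents = {}  # child -> set of direct parents
    for a, b in relations:
        if has_path(parents, b, a):
            continue  # adding R(A,B) would form a cycle (also covers A == B)
        if has_path(parents, a, b):
            continue  # redundant: a path from A to B already exists
        # Jump link removal: a direct descendant of B that can reach A
        # no longer needs its own direct link to B.
        for child, direct in parents.items():
            if child != a and b in direct and has_path(parents, child, a):
                direct.discard(b)
        parents.setdefault(a, set()).add(b)
    return {child: direct for child, direct in parents.items() if direct}


pairs = [("cat", "animal"), ("cat", "mammal"), ("mammal", "animal"),
         ("animal", "cat"), ("cat", "animal")]
print(build_hierarchy(pairs))
```

In the example, the early link from cat to animal is dropped as a jump link once mammal is placed between them, the reversed pair is rejected as a cycle, and the repeated pair is rejected as redundant.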
Types of Conflict
Inconsistencies
Simple loops
Cycles
Redundancies
Duplicate relations
Jump links
[Figure: one small example graph per conflict type, drawn over the nodes a, b, and c]
Jump Links
Multiple paths from one node to another are acceptable, as long as no single link duplicates a path
Jump link removal
When it is safe to add R(A,B), remove links from direct descendants of B to B, if they have a path to A
[Figure: example graphs over the nodes a to f illustrating acceptable multiple paths and the removal of a jump link]
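Do fewer links mean less knowledge?
Number of links: 4
Assertions
1. a → b
2. a → c
3. b → d
4. c → d
5. a → d
Number of links: 3
Assertions
1. a → b
2. b → c
3. c → d
4. a → c
5. b → d
6. a → d
[Figure: two graphs over the nodes a, b, c, and d, one with four links and one with three links]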
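The assertion counts on this slide are the sizes of the transitive closure of each link set, reading each pair such as "ab" as a directed link from a to b. A short sketch that recomputes them (the function name is illustrative):

```python
def transitive_closure(links):
    """Return every assertion implied by a set of transitive links."""
    closure = set(links)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # R(a,b) and R(b,d) imply R(a,d).
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


four_links = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
three_links = [("a", "b"), ("b", "c"), ("c", "d")]
print(len(transitive_closure(four_links)))   # 5
print(len(transitive_closure(three_links)))  # 6
```

The graph with fewer links yields more assertions, so the number of stored links is not a measure of how much knowledge the hierarchy encodes.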
Ontology Merging - Example
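[Figure: two input hierarchies combined (+) into one merged hierarchy (=). Concepts shown in the inputs: work place, industry, exchange, market, stock exchange, money market, financial market, capital market, stock market. Concepts shown in the merged result: industry, market, work place, exchange, financial market, capital market, money market, and a single merged concept "stock exchange, stock market"]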
Domain Ontology/KB Evaluation
Compare KAT’s automatically generated ontologies against gold
annotations
Evaluation focuses on
Lexical level
Vocabulary/data layer level
Other semantic relations level
Viewing an ontology as a set of semantic relations between two
concepts, the human annotators:
Labeled an entry correct if the concepts and the semantic relation are
correctly detected by the system, else marked the entry as incorrect
Labeled a correct entry as irrelevant if any of the concepts or the semantic
relation are irrelevant to the domain
Added new entries for concepts and semantic relations omitted by KAT
(from input documents)
Ontology/KB Evaluation - Metrics
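N_K(*) gives the counts from KAT’s output
N_G(*) corresponds to counts from gold annotations
\[ \mathrm{Pr}(\text{Correctness}) = \frac{N_K(\text{correct}) + N_K(\text{irrelevant})}{N_K(\text{correct}) + N_K(\text{irrelevant}) + N_K(\text{incorrect})} \]
\[ \mathrm{Pr}(\text{Correctness} + \text{Relevance}) = \frac{N_K(\text{correct})}{N_K(\text{correct}) + N_K(\text{irrelevant}) + N_K(\text{incorrect})} \]
\[ \mathrm{Cvg}(\text{Correctness}) = \frac{N_K(\text{correct}) + N_K(\text{irrelevant})}{N_G(\text{correct}) + N_G(\text{irrelevant}) + N_G(\text{added})} \]
\[ \mathrm{Cvg}(\text{Correctness} + \text{Relevance}) = \frac{N_K(\text{correct})}{N_G(\text{correct}) + N_G(\text{added})} \]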
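A small sketch of how the four scores could be computed, assuming each one is a ratio of sums of the counts listed on the slide (precision over KAT's own entries, coverage over the gold entries). The counts in the example are invented for illustration and are not evaluation results.

```python
def evaluation_scores(nk, ng):
    """nk: counts from KAT's output; ng: counts from the gold annotations."""
    kat_total = nk["correct"] + nk["irrelevant"] + nk["incorrect"]
    gold_total = ng["correct"] + ng["irrelevant"] + ng["added"]
    return {
        # Precision: share of KAT's entries judged correct.
        "Pr(Correctness)": (nk["correct"] + nk["irrelevant"]) / kat_total,
        # Precision when an entry must be both correct and relevant.
        "Pr(Correctness+Relevance)": nk["correct"] / kat_total,
        # Coverage: share of the gold entries recovered by KAT.
        "Cvg(Correctness)": (nk["correct"] + nk["irrelevant"]) / gold_total,
        # Coverage restricted to relevant gold entries.
        "Cvg(Correctness+Relevance)": nk["correct"] / (ng["correct"] + ng["added"]),
    }


# Invented counts, for illustration only.
kat_counts = {"correct": 60, "irrelevant": 20, "incorrect": 20}
gold_counts = {"correct": 60, "irrelevant": 20, "added": 40}
for name, value in evaluation_scores(kat_counts, gold_counts).items():
    print(f"{name}: {value:.3f}")
```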
Domain Ontology/KB Evaluation - Results
Jager™: Ontology Visualization and Editing
Web application - scalable, multi-user visualization and editing of
KAT’s ontologies/KBs
Based on the Django framework and written in a mix of Python, HTML and
JavaScript
Jager (pronounced yeager) is a corruption of the German word Jäger
(hunter)
Capabilities
Jager admin tool
Import/Export/Delete/Trim ontology
Compare two ontologies
Edit ontology name
For a given ontology
Edit/Delete/Insert concept/semantic relation
Collaborative High Precision Search
CHiPS™: ontology-guided search
More powerful than keyword search
Search from the perspective of a given ontology
Document matching
Semantic profiles are generated for documents based on a given ontology
Ontology concepts are identified in the text
Each identified concept is assigned a weight
Semantic profile matching
Semantic profiles for each document in a repository are generated in
advance
Semantic profile for input search text is generated on the fly
Search algorithm finds a list of repository documents whose profiles
most closely match the profile of the input search text
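Profile matching can be sketched as follows. The slides do not say how concept weights or profile similarity are computed, so this example assumes the simplest choices: a concept's weight is its occurrence count, and profiles are compared with cosine similarity. The function names and the sample documents are invented for illustration.

```python
import math
import re


def semantic_profile(text, ontology_concepts):
    """Map each ontology concept found in the text to a weight (here: its count)."""
    lowered = text.lower()
    profile = {}
    for concept in ontology_concepts:
        pattern = r"\b" + re.escape(concept.lower()) + r"\b"
        hits = len(re.findall(pattern, lowered))
        if hits:
            profile[concept] = float(hits)
    return profile


def cosine(p, q):
    dot = sum(weight * q.get(concept, 0.0) for concept, weight in p.items())
    norm = math.sqrt(sum(w * w for w in p.values())) * math.sqrt(sum(w * w for w in q.values()))
    return dot / norm if norm else 0.0


def rank(query_text, repository_profiles, ontology_concepts):
    """The query profile is built on the fly; repository profiles are precomputed."""
    query_profile = semantic_profile(query_text, ontology_concepts)
    return sorted(repository_profiles,
                  key=lambda doc_id: cosine(query_profile, repository_profiles[doc_id]),
                  reverse=True)


concepts = ["eye", "pain", "neuralgia", "implant"]
documents = {
    "d1": "Patient reported eye pain after the implant.",
    "d2": "The device battery failed.",
}
# Generated in advance, once per repository document.
profiles = {doc_id: semantic_profile(text, concepts) for doc_id, text in documents.items()}
print(rank("eye pain after surgery", profiles, concepts))
```

A fuller version would also use the ontology's relations, so that a query about pain could match a document that only mentions neuralgia; this sketch matches concept labels only.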
CHiPS™ Architecture
Document Similarity
Possible applications in medical domain
For diagnosis – patient data vs medical knowledge
For research – text snippet vs Medline
Match decision rules to KB
Others
Approaches
Statistical approaches: Latent Dirichlet Allocation, Pachinko Allocation,
others
Semantic approaches:
Event based
Ontology based – outlined here
Others
Sample Search
Search: The patient’s eye pain was associated with the surgical procedure and poly-L-lactic acid
Result: She describes this area as looking like a "bug bite" & was located "on top of"
(above) gortex implant, near the lateral canthus. Its shape is round about one-fourth
inch in diameter w/a rise w/a peak "maybe" one-eighth of an inch in height total. She
said her phys has treated the "bug bite" area w/an unknown type of steroid injection,
w/o effect. He now wants to remove this surgically, however, she is not certain if she
wants this done. She noted that she did not massage for first week, as had no
instruction to do so; she also had lid lift surgery at the time (of the face lift,) & surgeon
did not want any pressure on surgical site. She reported her concomitant medications
as estradiol, gabapentin (neurontin), for trigeminal neuralgia & facial non-specific
neuralgia; also a multivitamin. Add'l medical history included trigeminal neuralgia &
facial non-specific neuralgia both following the accident. No further medical info
reported. Add'l info for sculptra from ptc report case (b)(4) dated (b)(6)2008, received
by (b)(6) on 25mar08: b/c no lot # is available, an investigation has been performed on
the documentation of all potentially involved manufactured batches. The review of the
device history reports & of the analytical results of these batches did not show any
anomaly that could be related to the event which occurred.
Repository: Manufacturer and User Facility Device Experience (MAUDE)
Sample Search – Supporting Ontologies
Medical Subject Headings (MeSH) controlled vocabulary
Encyclopedic knowledge
[Figure: concept graph linking pain, angina, neuralgia, trigeminal neuralgia, face, eye, lid, canthus, lateral canthus, and medial canthus with ISA and PW (PART-WHOLE) relations]
CHiPS™ Demo
Hybrid MeSH-MedDRA ontology
NIH Medical Subject Headings (MeSH) taxonomy: http://www.nlm.nih.gov/mesh/
Medical Dictionary for Regulatory Activities (MedDRA): http://www.meddramsso.com/
29,302 concepts
38,828 semantic relations (ISA)
Document repositories
FDA MAUDE document repository
Manufacturer And User facility Device Experience
Database of adverse medical events
http://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfmaude/search.cfm
NIH MEDLINE document repository
journal citations and abstracts for biomedical literature from around the world
http://www.nlm.nih.gov/bsd/pmresources.html
Conversion to OWL/RDF
World Wide Web Consortium (W3C) standard formats
Resource Description Framework (RDF) XML/N-Triples
http://www.w3.org/TR/rdf-syntax-grammar
Subject-predicate-object expressions (triples) to represent information
“The sky is blue” becomes the triple (sky, hasColor, blue)
Web Ontology Language (OWL)
http://www.w3.org/TR/owl-features
Designed to represent ontologies; creates RDF-XML-compatible semantic models
Goal: Define a schema that encodes the semantic markup without creating
an intractable number of RDF and OWL relations
Increase interoperability
Facilitate integration of KAT’s ontologies into application systems
Ontology to OWL Translation
Definition of domain concepts and properties of concepts (lexeme, sense number)
<owl:Class rdf:ID="DomainConcept"/>
<owl:Class rdf:ID="OtherConcept">
  <rdfs:subClassOf rdf:resource="#DomainConcept"/>
</owl:Class>
<owl:Class rdf:ID="HierarchyConcept">
  <rdfs:subClassOf rdf:resource="#DomainConcept"/>
</owl:Class>
<owl:FunctionalProperty rdf:ID="lexeme">
  <rdfs:range rdf:resource="&xsd;string"/>
  <rdf:type rdf:resource="&owl;DatatypeProperty"/>
  <rdf:type rdf:resource="&owl;AnnotationProperty"/>
</owl:FunctionalProperty>
<owl:FunctionalProperty rdf:ID="sense">
  <rdf:type rdf:resource="&owl;DatatypeProperty"/>
  <rdf:type rdf:resource="&owl;AnnotationProperty"/>
  <rdfs:range rdf:resource="&xsd;int"/>
</owl:FunctionalProperty>
Ontology to OWL Translation
Definition for concept part-of-speech
<owl:FunctionalProperty rdf:ID="pos">
  <rdf:type rdf:resource="&owl;DatatypeProperty"/>
  <rdf:type rdf:resource="&owl;AnnotationProperty"/>
  <rdfs:range>
    <owl:DataRange>
      <owl:oneOf rdf:parseType="Resource">
        <rdf:first rdf:datatype="&xsd;string">noun</rdf:first>
        <rdf:rest rdf:parseType="Resource">
          <rdf:first rdf:datatype="&xsd;string">verb</rdf:first>
          <rdf:rest rdf:parseType="Resource">
            <rdf:first rdf:datatype="&xsd;string">adjective</rdf:first>
            <rdf:rest rdf:parseType="Resource">
              <rdf:first rdf:datatype="&xsd;string">adverb</rdf:first>
              <rdf:rest rdf:resource="&rdf;nil"/>
            </rdf:rest>
          </rdf:rest>
        </rdf:rest>
      </owl:oneOf>
    </owl:DataRange>
  </rdfs:range>
</owl:FunctionalProperty>
Ontology to OWL Translation
Definition for PART-WHOLE semantic relation
<owl:ObjectProperty rdf:ID="isPartOf">
  <owl:inverseOf>
    <owl:ObjectProperty rdf:ID="hasPart"/>
  </owl:inverseOf>
  <rdfs:range rdf:resource="#DomainConcept"/>
  <rdfs:domain rdf:resource="#DomainConcept"/>
</owl:ObjectProperty>
Ontology to OWL - Example
ISA(F-16,fighter_aircraft)
<owl:Class rdf:about="&wn20instances;synset-fighter_aircraft-noun-1">
  <lexeme>fighter aircraft</lexeme>
  <pos>noun</pos>
  <sense>1</sense>
  <conceptcount>1</conceptcount>
  <doccount>1</doccount>
  <netag></netag>
</owl:Class>
<owl:Class rdf:ID="synset-f_16-noun-1">
  <lexeme>F-16</lexeme>
  <pos>noun</pos>
  <sense>1</sense>
  <conceptcount>1</conceptcount>
  <doccount>1</doccount>
  <netag></netag>
</owl:Class>
<owl:Class rdf:about="#synset-f_16-noun-1">
  <rdfs:subClassOf rdf:resource="&wn20instances;synset-fighter_aircraft-noun-1"/>
</owl:Class>
Converting Relations into RDF
Ontology is transformed into RDF triples
Semantic relations from text are transformed into RDF triples
Millions of Americans went to the polls on Tuesday to elect a president.
MEASURE(Millions, American)
<utdns#adj-million-1><utdkatowl#ismeasureof><utdns#noun-american-1>
AGENT(American, go)
<utdns#noun-american-1><utdkatowl#isagentof><utdns#verb-go-1>
AGENT(American, elect)
<utdns#noun-american-1><utdkatowl#isagentof><utdns#verb-elect-1>
THEME(elect, president)
<utdns#noun-president-1><utdkatowl#isthemeof><utdns#verb-elect-1>
PURPOSE(go, elect)
<utdns#verb-elect-1><utdkatowl#ispurposeof><utdns#verb-go-1>
TEMPORAL(go, Tuesday)
<utdns#noun-tuesday-1><utdkatowl#istimeof><utdns#verb-go-1>
LOCATION(go, poll)
<utdns#noun-poll-1><utdkatowl#islocationof><utdns#verb-go-1>
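The mapping from a relation to a triple can be sketched with a lookup table. The table below is read off the example on this slide: for each relation it records the predicate name and which argument becomes the triple's subject (the argument order differs between relations). The abbreviated prefixes utdns# and utdkatowl# are copied as they appear on the slide. The table layout and function name are assumptions, not KAT's implementation.

```python
# relation -> (predicate name, index of the argument used as the triple's subject)
PREDICATES = {
    "MEASURE": ("ismeasureof", 0),
    "AGENT": ("isagentof", 0),
    "THEME": ("isthemeof", 1),
    "PURPOSE": ("ispurposeof", 1),
    "TEMPORAL": ("istimeof", 1),
    "LOCATION": ("islocationof", 1),
}


def to_triple(relation, first, second):
    """Arguments are already normalized terms such as 'verb-go-1'."""
    predicate, subject_index = PREDICATES[relation]
    arguments = (first, second)
    subject = arguments[subject_index]
    obj = arguments[1 - subject_index]
    return f"<utdns#{subject}><utdkatowl#{predicate}><utdns#{obj}>"


print(to_triple("PURPOSE", "verb-go-1", "verb-elect-1"))
print(to_triple("AGENT", "noun-american-1", "verb-go-1"))
```

Turning a surface word into a term such as verb-go-1 requires lemmatization and word sense disambiguation, which this sketch takes as given.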
Conclusions
We presented a generalized and improved procedure to automatically
extract deep semantic information from text resources
A methodology to rapidly create semantically-rich domain ontologies
while keeping the manual intervention to a minimum
We defined evaluation metrics to assess the quality of the ontologies
and presented evaluation results for a subset of the intelligence and
financial ontology libraries, semi-automatically created using freely available textual resources from the Web
The results show that a decent amount of knowledge can be
accurately extracted while keeping the manual intervention in the
process to a minimum.
Thank You!
Discussion