Ontologies for Information Fusion

Deborah L. McGuinness
Associate Director and Senior Research Scientist
Knowledge Systems Laboratory
Stanford University
Stanford, CA 94305 USA
650-723-9770
What is an Ontology?
[Figure: a spectrum of ontology types, from simple to expressive: Catalog/ID; Terms/glossary; Thesauri (“narrower term” relation); Term Hierarchy (e.g. Yahoo!); Formal taxonomy; Formal instance; Frames (properties); Value Restrs.; Description Logics*; General Logic]
*based on AAAI ’99 Ontologies panel – Gruninger, Lehmann, McGuinness, Uschold, Welty
Updated by McGuinness, additional input from Gruninger, Uschold, and Rockmore
July 9, 2003
Deborah L. McGuinness
1
Some uses of (simple) Ontologies
Simple ontologies (taxonomies) provide:
• Controlled shared vocabulary (search engines, authors, users, databases, programs/agents all speak the same language)
• Site Organization, Navigation Support, Expectation setting
• “Umbrella” Upper Level Structures (for extension, e.g., UNSPSC)
• Browsing support (tagged structures such as Yahoo!)
• Search support (query expansion approaches such as FindUR, e-Cyc)
• Sense disambiguation (e.g., TAP)
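The query expansion use listed above can be illustrated with a minimal sketch. It assumes a toy taxonomy stored as a Python dictionary; the terms and the `expand_query` helper are hypothetical and are not taken from FindUR or e-Cyc. A search for a term is widened to include every narrower term beneath it.

```python
# Hypothetical toy taxonomy: each term maps to its direct narrower terms.
NARROWER = {
    "wine": ["red wine", "white wine"],
    "white wine": ["chardonnay", "riesling"],
    "red wine": ["merlot"],
}


def expand_query(term, taxonomy):
    """Return the term plus all transitively narrower terms, breadth first."""
    expanded = [term]
    queue = [term]
    seen = {term}
    while queue:
        current = queue.pop(0)
        for child in taxonomy.get(current, []):
            if child not in seen:  # guard against repeated or cyclic entries
                seen.add(child)
                expanded.append(child)
                queue.append(child)
    return expanded


if __name__ == "__main__":
    # A query for "white wine" also matches documents that mention its subtypes.
    print(expand_query("white wine", NARROWER))
```

A real system would attach the expanded terms to the search engine query, possibly with lower weights for terms further down the hierarchy.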
Uses of Ontologies II
• Interoperability Support
• Consistency Checking
• Completion
• Support for validation and verification testing (e.g. http://ksl.stanford.edu/projects/DAML/chimaera-jtp-cardinalitytest1.daml )
• Configuration support
• Structured, “surgical” comparative customized search
• Generalization / Specialization
• Query and answer analysis and refinement
See pedagogical wine agent example at:
http://www.ksl.stanford.edu/people/dlm/webont/wineAgent/

KSL Wine Agent
Semantic Web Integration
Agent receives an analysis and retrieval task description and uses emerging web
standards to provide answer description and return specific answers. (Given a meal
description, describe the class(es) of matching wines and retrieve some from web.)
• DAML+OIL / OWL for representing a domain ontology of foods, wines,
their properties, and relationships between them
• JTP theorem prover for deriving appropriate pairings
• DQL for querying a knowledge base consisting of the above information
• Inference Web for explaining and validating answers (descriptions or
instances)
• [Web Services for interfacing with vendors]
• Connections to online web agents/information services
• Utilities for conducting and caching the above transactions
Info: http://www.ksl.stanford.edu/people/dlm/webont/wineAgent/
Implications and Needs for Ontology-enhanced applications
• Ontology language syntax and semantics (DAML+OIL, OWL)
• Upper level/core ontologies for reuse/extension (Cyc, SUMO, CNS coalition, DAML-S…)
• Environments for creation of ontologies (Protégé, Sandpiper, Construct, OilEd, …)
• Environments for maintenance of ontologies: evolution, diagnostics, merging, partitioning, views, versions (Chimaera, OntoBuilder, Prompt, …)
• Reasoning environments (Cerebra, Fact, JTP, Snark, …)
• Distributed explanation support facilitating trust (Inference Web)
• Surrounding tools – semantic search (TAP, FindUR, …), agent platforms
• Training (conceptual modeling, reasoning usage, tutorials – OWL Guide, Ontologies 101, OWL Tutorial, …)
Inference Web
Infrastructure for trust. Supports explanation of reasoning and retrieval tasks by storing, exchanging, combining, annotating, filtering, segmenting, comparing, and rendering proofs and proof fragments
• DAML+OIL/OWL specification of proofs is an interlingua for proof interchange
• Proof browser for displaying IW proofs and their explanations (possibly from multiple inference engines)
• Registration for inference engines/rules/languages; pedigree
• Proof explainer for abstracting proofs into more understandable formats
• Proof generation service to facilitate the creation of IW proofs by inference engines
• Hosted service available, integrated with Stanford’s JTP reasoner and SRI’s SNARK reasoner. Integrated in DQL Client/Server, Wine Agent, …
• Discussions with Boeing, Cycorp, Fetch, ISI, Northwestern, SRI, UT, UW, W3C, …
DAML/OWL Language
• Extends vocabulary of XML and RDF/S
• Rich ontology representation language
• Language features chosen for efficient implementations
[Figure: language lineage. Labels: Frame Systems; Web Languages (RDF/S, XML); DAML-ONT; OIL; DAML+OIL; OWL; Formal Foundations; Description Logics (FACT, CLASSIC, DLP, …)]
Discussion/Conclusion
• Ontologies are exploding; core of many applications as seen at IF2003
• Business/govt. “pull” is driving ontology tools and languages
• New generation applications need more expressive ontologies and more
back end reasoning
• User base is broader thus tools are providing support aimed at audience
larger than KR&R-trained people
• Distributed ontologies motivating more supporting tools: merging,
analysis, explanation support, incompleteness techniques, versioning, etc.
• Scale and distribution of the web force mind shift (no longer monolithic
single ontologies)
• Everyone is in the game – Government (DARPA, NSF, NIST, ARDA…),
DSTO, EU, W3C, consortiums, business, …
• Consulting and product companies are in the space (not just academics)
Good time to bring ontologies into Info. Fusion in a larger way
A few US Govt. Programs

DARPA:
• DAML – DARPA Agent Markup Language
• RKF – Rapid Knowledge Formation
• HPKB – High Performance Knowledge Base
• PBA – Predictive Battle Space Awareness
• EPCA – Enduring Personalized Cognitive Assistant/PAL/CALO, KnowledgePad
ARDA:
• AQUAINT – Question Answering
• NIMD – Novel Intelligence for Massive Data
Pointers
Selected Papers:
- McGuinness. Ontologies come of age, 2003
- Das, Wei, McGuinness, Industrial Strength Ontology Evolution Environments, 2002.
- Kendall, Dutra, McGuinness. Towards a Commercial Strength Ontology Development Environment, 2002.
- McGuinness. Description Logics Emerge from Ivory Towers, 2001.
- McGuinness. Ontologies and Online Commerce, 2001.
- McGuinness. Conceptual Modeling for Distributed Ontology Environments, 2000.
- McGuinness, Fikes, Rice, Wilder. An Environment for Merging and Testing Large Ontologies, 2000.
- Brachman, Borgida, McGuinness, Patel-Schneider. Knowledge Representation meets Reality, 1999.
- McGuinness. Ontological Issues for Knowledge-Enhanced Search, 1998.
- McGuinness and Wright. Conceptual Modeling for Configuration, 1998.
Selected Tutorials:
- Smith, Welty, McGuinness. OWL Web Ontology Language Guide, 2003.
- Noy, McGuinness. Ontology Development 101: A Guide to Creating your First Ontology, 2001.
- Brachman, McGuinness, Resnick, Borgida. How and When to Use a KL-ONE-like System, 1991.
Languages, Environments, Software:
- OWL - http://www.w3.org/TR/owl-features/ , http://www.w3.org/TR/owl-guide/
- DAML+OIL: http://www.daml.org/
- Inference Web - http://www.ksl.stanford.edu/software/iw/
- Chimaera - http://www.ksl.stanford.edu/software/chimaera/
- FindUR - http://www.research.att.com/people/~dlm/findur/
- TAP - http://tap.stanford.edu/
- DQL - http://www.ksl.stanford.edu/projects/dql/
Extras
General Nature of Descriptions
[Figure: an example description annotated with the labels class, superclass, number/card restrictions, roles/properties, and value restrictions]
a WINE
• general categories: a LIQUID, a POTABLE-THING
• structured components:
  - grape: chardonnay, ... [>= 1]
  - sugar-content: dry, sweet, off-dry
  - color: red, white, rose
  - price: a PRICE
  - winery: a WINERY
• interconnections between parts:
  - grape dictates color (modulo skin)
  - harvest time and sugar are related
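The structure of such a description can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Role` and `ClassDescription` types are hypothetical, the reading of `[>= 1]` as a minimum-cardinality restriction and of the enumerated values as value restrictions is an interpretation of the slide, and the interconnections between parts are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Role:
    """A role/property with an optional value restriction and number restriction."""
    name: str
    allowed: frozenset = frozenset()  # value restriction; empty means unrestricted
    min_card: int = 0                 # number restriction, e.g. [>= 1]


@dataclass
class ClassDescription:
    name: str
    superclasses: list
    roles: list

    def violations(self, instance):
        """List the restrictions that an instance (role name -> values) breaks."""
        found = []
        for role in self.roles:
            values = instance.get(role.name, [])
            if len(values) < role.min_card:
                found.append(f"{role.name}: needs at least {role.min_card} value(s)")
            for value in values:
                if role.allowed and value not in role.allowed:
                    found.append(f"{role.name}: {value!r} is not an allowed value")
        return found


# The WINE description from the slide, restricted to the roles it enumerates.
WINE = ClassDescription(
    name="WINE",
    superclasses=["LIQUID", "POTABLE-THING"],
    roles=[
        Role("grape", min_card=1),
        Role("sugar-content", allowed=frozenset({"dry", "sweet", "off-dry"})),
        Role("color", allowed=frozenset({"red", "white", "rose"})),
    ],
)

if __name__ == "__main__":
    print(WINE.violations({"grape": ["chardonnay"], "color": ["white"]}))
    print(WINE.violations({"color": ["green"]}))
```

A description logic reasoner does considerably more than this check (for example, classifying descriptions relative to one another), so the sketch shows only the shape of the data.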
A Few Observations about Ontologies
Ontologies can be built by non-experts with COTS and academic tools
• Verity’s Topic Editor, Constructor, Collaborative Topic Builder, GFP, Chimaera, Protégé, OIL-ED, etc.
Ontologies can be semi-automatically generated
• from crawls of sites such as Yahoo!, Amazon, Excite, etc.
• Semi-structured sites can provide starting points
Ontologies are exploding (business pull instead of technology push)
• e-commerce - MySimon, Amazon, Yahoo! Shopping, VerticalNet, …
• Controlled vocabularies (for the web) abound - SIC codes, UMLS, UNSPSC, Open Directory (DMOZ), Rosetta Net, SUMO
• Business interest expanding: ontology directors, business ontologies are becoming more complicated (roles, value restrictions, …), VC firms interested - Vulcan’s HALO project
• Markup languages growing: XML, RDF, DAML, OWL, RuleML, xxML
• “Real” ontologies are becoming more central to applications
• Search companies moving towards them – Yahoo, recently Google
Processing

Given a description of a meal,
• Use DQL to state a premise (the meal) and query the knowledge base for a suggestion for a wine description or set of instances
• Use JTP Theorem Prover to deduce answers (and proofs)
• Use Inference Web to explain results (descriptions, instances, provenance, reasoning engines, etc.)
• Access relevant web sites (wine.com, wine commune, …) to access current information
• Use DAML-S for markup and protocol*
http://www.ksl.stanford.edu/projects/wine/explanation.html
Querying multiple online sources
FindUR Architecture
Content to Search: Research Site; Technical Memorandum; Calendars (Summit 2005, Research); Yellow Pages (Directory Westfield); Newspapers (Leader); Internal Sites (Rapid Prototyping); AT&T Solutions; Worldnet Customer Care; Medical Information
[Figure: architecture diagram with the column headings Search Technology and User Interface. Component labels: Content (Web Pages or Databases); Content Classification; Search Engine; Domain Knowledge; CLASSIC Knowledge Representation System; GUI supporting browsing and selection; Results (standard format); Results (domain specific); Verity (and topic sets); Collaborative Topic Set Tool; Verity SearchScript, Javascript, HTML, CGI, CLASSIC]
<rdfs:Class rdf:ID="BLAND-FISH-COURSE">
  <daml:intersectionOf rdf:parseType="daml:collection">
    <rdfs:Class rdf:about="#MEAL-COURSE"/>
    <daml:Restriction>
      <daml:onProperty rdf:resource="#FOOD"/>
      <daml:toClass rdf:resource="#BLAND-FISH"/>
    </daml:Restriction>
  </daml:intersectionOf>
  <rdfs:subClassOf rdf:resource="#DRINK-HAS-DELICATE-FLAVOR-RESTRICTION"/>
</rdfs:Class>
<rdfs:Class rdf:ID="BLAND-FISH">
  <rdfs:subClassOf rdf:resource="#FISH"/>
  <daml:disjointWith rdf:resource="#NON-BLAND-FISH"/>
</rdfs:Class>
<rdf:Description rdf:ID="FLOUNDER">
  <rdf:type rdf:resource="#BLAND-FISH"/>
</rdf:Description>
<rdfs:Class rdf:ID="CHARDONNAY">
  <rdfs:subClassOf rdf:resource="#WHITE-COLOR-RESTRICTION"/>
  <rdfs:subClassOf rdf:resource="#MEDIUM-OR-FULL-BODY-RESTRICTION"/>
  <rdfs:subClassOf rdf:resource="#MODERATE-OR-STRONG-FLAVOR-RESTRICTION"/> […]
</rdfs:Class>
<rdf:Description rdf:ID="BANCROFT-CHARDONNAY">
  <rdf:type rdf:resource="#CHARDONNAY"/>
  <REGION rdf:resource="#NAPA"/>
  <MAKER rdf:resource="#BANCROFT"/>
  <SUGAR rdf:resource="#DRY"/> […]
</rdf:Description>
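To show the kind of conclusion a reasoner can draw from this markup, here is a minimal Python sketch. It is not JTP and not a full RDF or DAML parser: it reads a cut-down copy of the fragment with the standard library XML parser, wraps it in an `rdf:RDF` root with namespace declarations (an assumption, since the slide omits them), and follows only `rdfs:subClassOf` links. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# A cut-down copy of the slide's markup, wrapped in a root element so it parses.
DOC = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <rdfs:Class rdf:ID="BLAND-FISH">
    <rdfs:subClassOf rdf:resource="#FISH"/>
  </rdfs:Class>
  <rdf:Description rdf:ID="FLOUNDER">
    <rdf:type rdf:resource="#BLAND-FISH"/>
  </rdf:Description>
</rdf:RDF>
"""


def local(ref):
    """Turn a '#NAME' reference into 'NAME'."""
    return ref.lstrip("#")


root = ET.fromstring(DOC.strip())

# Class name -> set of direct superclasses.
superclasses = {}
for cls in root.findall(f"{{{RDFS}}}Class"):
    name = cls.get(f"{{{RDF}}}ID")
    for sup in cls.findall(f"{{{RDFS}}}subClassOf"):
        superclasses.setdefault(name, set()).add(local(sup.get(f"{{{RDF}}}resource")))

# Individual name -> set of asserted types.
asserted = {}
for desc in root.findall(f"{{{RDF}}}Description"):
    name = desc.get(f"{{{RDF}}}ID")
    for typ in desc.findall(f"{{{RDF}}}type"):
        asserted.setdefault(name, set()).add(local(typ.get(f"{{{RDF}}}resource")))


def inferred_types(individual):
    """Asserted types plus every class reachable through subClassOf."""
    result = set()
    stack = list(asserted.get(individual, ()))
    while stack:
        cls = stack.pop()
        if cls not in result:
            result.add(cls)
            stack.extend(superclasses.get(cls, ()))
    return result


if __name__ == "__main__":
    print(sorted(inferred_types("FLOUNDER")))
```

FLOUNDER is asserted only to be a BLAND-FISH, but the subclass link lets the sketch conclude that it is also a FISH. Pairing a BLAND-FISH-COURSE with a suitable wine additionally requires the restriction and intersection semantics, which this sketch does not attempt.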
Issues
• Collaboration among distributed teams
• Interconnectivity with many systems/standards
• Analysis and diagnosis
• Scale
• Versioning
• Security
• Ease of use
• Diverse training levels / user support
• Presentation style
• Lifecycle
• Extensibility

Services Ontologies
DAML-S http://www.daml.org/services/
• publication references
• ontology specifications
• examples
A few interesting projects using DAML-S:
• MyGrid (http://mygrid.man.ac.uk)
• AgentCities (http://www.agentcities.org)
• Services composer (http://www.mindswap.org/~evren/composer/)
SUMO
• Available in KIF (first order logic), DAML, LOOM and XML
• May be used without fee for any purpose (including for profit)
• Mapped by hand to 100,000 synsets of WordNet lexicon
• Validated with formal theorem proving
• 52 publicly released versions created over two years (approximately 1,000 concepts, 4000 assertions, and 750 rules so far)
• Specialized with dozens of free domain ontologies
• In use by companies, universities and government around the world: Academia Sinica (Taiwan), U Arizona, lookwayup.com, NIST, etc.
• Available at http://ontology.teknowledge.com
Chimaera – An Ontology Environment Tool
An interactive web-based tool aimed at supporting:
•Ontology analysis (correctness, completeness, style, …)
•Merging of ontological terms from varied sources
•Maintaining ontologies over time
•Validation of input
• Features: multiple I/O languages, loading and merging into multiple
namespaces, collaborative distributed environment support, integrated
browsing/editing environment, extensible diagnostic rule language
• Used in commercial and academic environments; used in HORUS to
support counter-terrorism ontology generation
• Available as a hosted service from www-ksl-svc.stanford.edu
• Information: www.ksl.stanford.edu/software/chimaera
Layer Cake Foundation
Some Pointers
• Ontologies Come of Age Paper: http://www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html
• Ontologies and Online Commerce Paper: http://www.ksl.stanford.edu/people/dlm/papers/ontologies-and-online-commerce-abstract.html
• DAML+OIL: http://www.daml.org/
• WEBONT: http://www.w3.org/2001/sw/WebOnt/
• OWL: http://www.w3.org/TR/owl-features/
E-Commerce Search
(starting point Forrester Research modified by McGuinness)
• Ask Queries
- multiple search interfaces (surgical shoppers, advice seekers, window shoppers)
- set user expectations (interactive query refinement)
- anticipate anomalies
• Get Answers
- basic information (multiple sorts, filtering, structuring)
- modify results (user defined parameters for refining, user profile info, narrow query, broaden query, disambiguate query)
- suggest alternatives (suggest other comparable products even from competitor’s sites)
• Make Decisions
- manipulate results (enable side by side comparison)
- dive deeper (provide additional info, multimedia, other views)
- take action (buy)

The Need For KB Analysis
• Large-scale knowledge repositories will necessarily contain KBs produced by multiple authors in multiple settings
• KBs for applications will typically be built by assembling and extending multiple modular KBs from repositories that may not be consistent
• KBs developed by multiple authors will frequently
- Express overlapping knowledge in different, possibly contradictory ways
- Use differing assumptions and styles
• For such KBs to be used as building blocks, they must be reviewed for appropriateness and “correctness”
- That is, they must be analyzed
Our KB Analysis Task
• Review KBs that:
- Were developed using differing standards
- May be syntactically but not semantically validated
- May use differing modeling representations
• Produce KB logs (in interactive environments)
• Identify provable problems
• Suggest possible problems in style and/or modeling
• Are extensible by being user programmable
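The analysis task above (a log of provable problems and possible problems, produced by user-programmable checks) can be sketched as a small rule registry. This is a hypothetical illustration, not Chimaera's diagnostic rule language: the KB is reduced to a dictionary from class names to direct superclasses, and the two rules and all names are assumptions made for the example.

```python
from typing import Callable, Dict, List

KB = Dict[str, List[str]]        # class name -> direct superclasses (toy format)
Rule = Callable[[KB], List[str]]

RULES: List[Rule] = []


def rule(fn: Rule) -> Rule:
    """Register a diagnostic; users extend the analysis by adding more rules."""
    RULES.append(fn)
    return fn


@rule
def subclass_cycles(kb: KB) -> List[str]:
    """Provable problem: a class that is its own transitive superclass."""
    problems = []
    for start in kb:
        stack = list(kb[start])
        seen = set()
        while stack:
            cls = stack.pop()
            if cls == start:
                problems.append(f"ERROR: cycle through {start}")
                break
            if cls not in seen:
                seen.add(cls)
                stack.extend(kb.get(cls, []))
    return problems


@rule
def undefined_superclasses(kb: KB) -> List[str]:
    """Possible problem: a superclass that is referenced but never defined."""
    return [
        f"WARNING: {cls} refers to undefined superclass {sup}"
        for cls, sups in kb.items()
        for sup in sups
        if sup not in kb
    ]


def analyze(kb: KB) -> List[str]:
    """Run every registered rule and collect the messages into one log."""
    log = []
    for check in RULES:
        log.extend(check(kb))
    return log


if __name__ == "__main__":
    merged = {"WINE": ["LIQUID"], "LIQUID": ["WINE"], "CHARDONNAY": ["WINE", "WHITE-THING"]}
    for line in analyze(merged):
        print(line)
```

Separating errors from warnings mirrors the distinction the slide draws between provable problems and suggested problems in style or modeling.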