Foundations I: Methodologies, Knowledge Representation Deborah McGuinness and Joanne Luciano CSCI/ITEC-6962-01 Week 2, September 13, 2010

Foundations I: Methodologies,
Knowledge Representation
Deborah McGuinness and Joanne Luciano
CSCI/ITEC-6962-01
Week 2, September 13, 2010
1
Review of reading Assignment 1
• Ontologies 101, Semantic Web, e-Science,
RDFS, OWL guide
• Any comments, questions?
2
Contents
• Review of methodologies
• Elements of KR in semantic web context
• And in e-Science
• Choices of representation, models
• Examples of KR
• Encoding and understanding representations
• Assignment 1
3
Semantic Web Methodology and
Technology Development Process
• Establish and improve a well-defined methodology vision for
Semantic Technology based application development
• Leverage controlled vocabularies, etc.
[Figure: iterative development cycle with the stages Use Case; Small Team, mixed skills; Analysis; Develop model/ontology; Use Tools; Science/Expert Review & Iteration; Adopt Technology Approach; Leverage Technology Infrastructure; Rapid Prototype; Open World: Evolve, Iterate, Redesign, Redeploy; Evaluation]
4
KR and methodologies
• Procedural Knowledge: Knowledge is encoded in
functions/procedures.
This can be viewed as hard-coded and less flexible.
E.g.:
function Person(X) return boolean is
  if (X = "Socrates") or (X = "Hillary")
  then return true else return false;
OR
function Mortal(X) return boolean is return Person(X);
• Networks: A compromise between declarative and procedural
schemes. Knowledge is represented in a labeled, directed graph
whose nodes represent concepts and entities, while its arcs represent
relationships between these entities and concepts.
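A minimal sketch of the network scheme described above, assuming a plain in-memory graph. The class name SemanticNetwork and its methods are illustrative only and do not come from any library; the Socrates example is reused from the procedural bullet.

```python
from collections import defaultdict


class SemanticNetwork:
    """A labeled, directed graph: nodes are concepts/entities, arcs are relations."""

    def __init__(self):
        # (source node, arc label) -> set of target nodes
        self._arcs = defaultdict(set)

    def add(self, source, label, target):
        self._arcs[(source, label)].add(target)

    def reachable(self, source, label):
        """Follow arcs carrying one label transitively from a start node."""
        seen, stack = set(), [source]
        while stack:
            node = stack.pop()
            for nxt in self._arcs.get((node, label), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen


net = SemanticNetwork()
net.add("Socrates", "is-a", "Person")
net.add("Person", "is-a", "Mortal")
print(net.reachable("Socrates", "is-a"))
```

Unlike the hard-coded Person function, a new individual here is added as data (one more arc) rather than by editing a procedure.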
5
KR and methodologies
• Frames: Much like a semantic network except each node
represents prototypical concepts and/or situations. Each
node has several property slots whose values may be
specified or inherited.
• Logic: A way of declaratively representing knowledge. For
example:
– person(Socrates).
– person(Hillary).
– forall X [person(X) ---> mortal(X)]
– DL, FOL, HOL
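The logic example can be made computable with a few lines of forward chaining. This is a sketch under simplifying assumptions: only unary predicates and single-premise rules are handled, and the function name forward_chain is made up for illustration.

```python
# Facts are (predicate, individual) pairs, as in person(Socrates).
facts = {("person", "Socrates"), ("person", "Hillary")}

# Each rule (p, q) stands for: forall X [p(X) ---> q(X)]
rules = [("person", "mortal")]


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for predicate, individual in list(derived):
                new_fact = (conclusion, individual)
                if predicate == premise and new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


closure = forward_chain(facts, rules)
print(sorted(closure))
```

The knowledge stays declarative: adding person(Plato) or a new rule changes only the data, not the inference procedure.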
6
KR and methodologies
• Decision Trees: Concepts are organized in the form
of a tree.
• Statistical Knowledge: The use of certainty factors,
Bayesian Networks, Dempster-Shafer Theory,
Fuzzy Logics, ..., etc.
• Rules: The use of Production Systems to encode
condition-action rules (as in expert systems).
7
KR and methodologies
• Parallel Distributed processing: The use of
connectionist models.
• Subsumption Architectures: Behaviors are
encoded (represented) using layers of simple
(numeric) finite-state machine elements.
• Hybrid Schemes: Any representation
formalism employing a combination of KR
schemes.
8
Remember, in science!
• Some of the knowledge is lost when it is
placed into any particular representation
structure, or may not be reusable (e.g.
Frames)
• So, you may ask something that cannot be
answered or inferred
• Knowledge evolves, i.e. changes
• Knowledge and understanding is very often
context dependent (and discipline, language,
and skill-level dependent, and …)
9
And, if you are used to logic
• You are working mostly within the world of
logic, whereas we are trying to represent
knowledge with logic and we are usually
dealing with tangible objects, such as trees,
clouds, rock, storms, etc.
• Because of this, we have to be very careful
when translating real things into logical
symbols - this can, surprisingly, be a difficult
challenge.
• Consider your method of representation (yes,
we do want to compute with it)
10
Thus
• A person who wants to encode knowledge
needs to decouple the ambiguities of
interpretation from the mathematical certainty
of (any form of) logic.
• The nature of interpretation is critical in formal
knowledge representation and is carefully
formalized by KR scientists in order to
guarantee that no ambiguity exists in the
logical structure of the represented
knowledge.
11
Representing Knowledge With Objects
• Take all individuals that we need to keep track of and
place them into different buckets based on how similar
they are to each other. Each bucket is given a
descriptive name based on what objects it contains.
• Since the individuals in a given bucket are at least
somewhat similar, we can avoid needing to describe
every inconsequential detail about each individual.
Instead, properties that are common to all individuals
in a bucket can just be assigned to the entire bucket at
once. Properties are typically either primitive values
(such as numbers or text strings) or may be
references to other buckets.
12
Representing Knowledge With Objects
• Some buckets will be more similar to each other than
others and we can arrange the buckets into a
hierarchy based on the similarity.
• If all buckets in a branch in the tree of buckets share a
property, the information can be further simplified by
assigning the property only to the parent bucket. Other
buckets (and individuals) are said to inherit that
property.
• Buckets may have different names: e.g. Classes,
Frames, or Nodes
• BUT, once we move to (e.g.) DL, not all object rules
apply, e.g. cannot override properties
• Multiple inheritance is not always obvious to people
13
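The bucket hierarchy maps directly onto classes in an object language. The sketch below uses instrument names as illustration; ZenithInterferometer is a hypothetical class added only to show overriding, which an object language permits but a DL representation, as noted above, does not.

```python
class Instrument:
    """Parent bucket: properties set here are inherited by every sub-bucket."""
    operating_mode = "unspecified"


class OpticalInstrument(Instrument):
    has_focal_length = True  # assigned once, shared by all optical instruments


class Interferometer(OpticalInstrument):
    pass  # adds nothing of its own; inherits everything above


class ZenithInterferometer(Interferometer):
    # Hypothetical bucket: overriding an inherited value is legal here,
    # but has no counterpart once the model moves to description logic.
    operating_mode = "vertical"


fabry_perot = Interferometer()
print(fabry_perot.has_focal_length, fabry_perot.operating_mode)
print([cls.__name__ for cls in Interferometer.__mro__])
```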
Re-enter Semantic Web
At its core, the Semantic Web can be
thought of as a methodology for
linking up pieces of structured and
unstructured information into
commonly-shared description logics
ontologies.
14
Semantic Web Layers
15
http://www.w3.org/2003/Talks/1023-iswc-tbl/slide26-0.html, http://flickr.com/photos/pshab/291147522/
Elements of KR in Semantic Web
• Declarative Knowledge
• Statements as triples: {subject-predicate-object}
interferometer is-a optical instrument
Fabry-Perot is-a interferometer
Optical instrument has focal length
Optical instrument is-a instrument
Instrument has instrument operating mode
Instrument has measured parameter
Instrument operating mode has measured parameter
NeutralTemperature is-a temperature
Temperature is-a parameter
• A query: select all optical instruments which have operating
mode vertical
• An inference: infer operating modes for a Fabry-Perot
Interferometer which measures neutral temperature
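A small sketch of how the triples above support both a query and an inference, assuming a plain Python set as the triple store. The CamelCase identifiers and the helper names superclasses and inherited_properties are illustrative choices, not part of RDF or any toolkit.

```python
# The slide's statements as (subject, predicate, object) triples.
triples = {
    ("Interferometer", "is-a", "OpticalInstrument"),
    ("FabryPerot", "is-a", "Interferometer"),
    ("OpticalInstrument", "has", "FocalLength"),
    ("OpticalInstrument", "is-a", "Instrument"),
    ("Instrument", "has", "InstrumentOperatingMode"),
    ("Instrument", "has", "MeasuredParameter"),
    ("InstrumentOperatingMode", "has", "MeasuredParameter"),
    ("NeutralTemperature", "is-a", "Temperature"),
    ("Temperature", "is-a", "Parameter"),
}


def superclasses(cls):
    """Inference: follow is-a transitively to collect every ancestor."""
    found, frontier = set(), [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "is-a" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def inherited_properties(cls):
    """Query: everything the class 'has', directly or via its ancestors."""
    classes = {cls} | superclasses(cls)
    return {o for s, p, o in triples if p == "has" and s in classes}


print(sorted(superclasses("FabryPerot")))
print(sorted(inherited_properties("FabryPerot")))
```

Nothing states directly that a Fabry-Perot has an operating mode; that follows from the is-a chain up to Instrument.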
16
Ontology Spectrum
[Figure: ontology spectrum, in order of increasing expressiveness: Catalog/ID; Terms/glossary; Thesauri "narrower term" relation; Informal is-a; Formal is-a; Formal instance; Frames (properties); Value Restrs.; Selected Logical Constraints (disjointness, inverse, …); General Logical constraints]
Originally from AAAI 1999- Ontologies Panel by Gruninger, Lehmann, McGuinness, Uschold, Welty;
– updated by McGuinness.
Description in: www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html
17
OWL or RDF or OWL 2 RL?
• In representing knowledge you will need to
balance expressivity with implementability
• OWL (Lite, DL, Full) 1 or 2?
• RDF and RDFS
• Rules, e.g. SWRL or OWL 2 RL
• You will need to consider the sources of your
knowledge
• You will need to consider what you want to do
with the represented knowledge
18
The knowledge base
• Using, Re-using, Re-purposing, Extending,
Subsetting
• Approach:
– Bottom-up (instance level or vocabularies)
– Top-down (upper-level or foundational)
– Mid-level (use case)
• Coding and testing (understanding)
• Using tools (some this class, more over the next two
classes)
• Iterating (later)
• Maintaining and evolving (curation, preservation) (later)
19
‘Collecting’ the ‘data’
• Part of the (meta)data information is present in tools ... but thrown away
at output. E.g., a business chart can be generated by a tool: it ‘knows’ the
structure, the classification, etc. of the chart, but, usually, this information
is lost; storing it in web data would be easy!
• Semantic Web-aware tools are around (even if you do not know it...),
though more would be good:
– Photoshop CS stores metadata in RDF in, say, jpg files (using XMP)
– RSS 1.0 feeds are generated by (almost) all blogging systems (a huge
amount of RDF data!)
• Scraping - different tools, services, etc, come around every day:
– get RDF data associated with images, for example: service to get RDF from
flickr images
– service to get RDF from XMP
– XSLT scripts to retrieve microformat data from XHTML files
– RSS scraping in use in Virtual Observatory projects in Japan
– scripts to convert spreadsheets to RDF
• SQL - A huge amount of data in Relational Databases
– Although tools exist, it is not feasible to convert that data into RDF
– Instead: SQL ⇋ RDF ‘bridges’ are being developed: a query to RDF data is
transformed into SQL on-the-fly
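A rough sketch of the bridge idea: the data stays in a relational table, and a triple pattern is rewritten into SQL when it is asked. The table, its rows, and the function match are invented for illustration; real SQL-to-RDF bridges use mapping languages and SPARQL rather than this hand-written translation.

```python
import sqlite3

# Hypothetical relational data; it is never converted to RDF up front.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instrument (id TEXT PRIMARY KEY, kind TEXT, observatory TEXT)"
)
conn.executemany(
    "INSERT INTO instrument VALUES (?, ?, ?)",
    [("fpi1", "FabryPerot", "MillstoneHill"), ("isr1", "Radar", "MillstoneHill")],
)

# Predicates the bridge exposes; each maps to a column of the same name.
COLUMNS = {"kind", "observatory"}


def match(predicate, obj=None):
    """Answer the triple pattern (?s, predicate, obj) by generating SQL on the fly."""
    if predicate not in COLUMNS:
        raise ValueError(f"unknown predicate: {predicate}")
    sql = f"SELECT id, {predicate} FROM instrument"
    params = ()
    if obj is not None:
        sql += f" WHERE {predicate} = ?"
        params = (obj,)
    sql += " ORDER BY id"
    return [(s, predicate, o) for s, o in conn.execute(sql, params)]


print(match("kind", "FabryPerot"))
print(match("observatory"))
```

The column whitelist is what keeps the generated SQL safe; the object value is always passed as a bound parameter.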
20
More Collecting
• RDFa (formerly known as RDF/A) extends XHTML
by:
– extending the link and meta to include child elements
– add metadata to any elements (a bit like the class in
microformats, but via dedicated properties)
• It is very similar to microformats, but with more
rigor:
– it is a general framework (instead of an “agreement” on
the meaning of, say, a class attribute value)
– terminologies can be mixed more easily
• GRDDL - Gleaning Resource Descriptions from
Dialects of Languages
• ATOM - XML-based Web content and metadata
syndication format (used with RSS)
21
Foundational Ontologies
• Domain independent concepts and relations
– physical object, process, event, …, participates, …
• (Usually) Rigorously defined
– formal logic, philosophical principles, highly structured
• Examples
– DOLCE – Descriptive Ontology for Linguistic and Cognitive Engineering
– SUMO – Suggested Upper Merged Ontology
– CYC Upper Level Ontology
– BFO – Basic Formal Ontology
– GFO – General Formal Ontology (developed by Onto-Med)
22
Foundational Ontologies
PURPOSE: help integrate domain ontologies
“…and then there was one…”
[Figure: a single foundational ontology integrating domain ontologies: Geology, Struc, Rock, Geophysics, Marine, Water, Planetary]
23
Courtesy: Boyan Brodaric
Foundational Ontologies
PURPOSE: help organize domain ontologies
“…a place for everything, and everything in its place…”
Foundational ontology
shale
rock
formation
lithification
24
Courtesy: Boyan Brodaric
Problem scenario
• Little work done on linking foundational
ontologies with geoscience ontologies
• Such linkage might benefit various scenarios
requiring cross-disciplinary knowledge, e.g.:
– water budgets: groundwater (geology) and surface water (hydro)
– hazards risk: hazard potential (geology, geophysics) and items at
threat (infrastructure, people, environment, economic)
– health: toxic substances (geochemistry) and people, wildlife
– many others…
25
Courtesy: Boyan Brodaric
DOLCE - Descriptive Ontology for Linguistic and
Cognitive Engineering
26
SUMO - Suggested Upper Merged Ontology
• Physical
  • Object
    • SelfConnectedObject
      • ContinuousObject
      • CorpuscularObject
    • Collection
  • Process
• Abstract
  • SetClass
    • Relation
  • Proposition
  • Quantity
    • Number
    • PhysicalQuantity
  • Attribute
27
http://www.ifomis.org/Research/IFOMISReports/IFOMIS%20Report%2005_2003.pdf
BFO – Basic Formal Ontology
Snap comes from a snapshot at any given time
28
Span comes from spanning time;
sometimes considered a 4D description
29
Using SNAP/ SPAN
30
31
SWEET 2.0 Modular Design
• Supports easy extension by domain specialists
• Organized by subject (theoretical to applied)
• Reorganization of classes, but no significant changes to content
• Importation is unidirectional
[Figure: modules linked by importation: Math, Time, Space; Basic Science; Geoscience Processes; Geophysical Phenomena; Applications]
32
SWEET 2.0 Ontologies
33
Using SWEET
• Plug-in (import) domain detailed modules
• Lots of classes, few relations (properties)
• Version 2.0 is re-usable and extensible
34
Mix-n-Match
• The hybrid example:
– Collect a lot of different ontologies representing
different terms, levels of concepts, etc. into a
base form: RDF
35
Mid-Level: Developing ontologies
• Use cases and small team (7-8; 2-3 domain experts,
2 knowledge experts, 1 software engineer,
1 facilitator, 1 scribe)
• Identify classes and properties (leverage controlled
vocab.)
– Start with narrower terms, generalize when needed or
possible
– Adopt a suitable conceptual decomposition (e.g. SWEET)
– Import modules when concepts are orthogonal
• Review, vet, publish
• Only code them (in RDF or OWL) when needed
(CMAP, …)
• Ontologies: small and modular
36
Use Case example
• Plot the neutral temperature from the Millstone-Hill
Fabry Perot, operating in the non-vertical mode
during January 2000 as a time series.
• Objects:
– Neutral temperature is a (temperature is a) parameter
– Millstone Hill is a (ground-based observatory is a) observatory
– Fabry-Perot is a interferometer is a optical instrument is a instrument
– Non-vertical mode is a instrument operating mode
– January 2000 is a date-time range
– Time is a independent variable/ coordinate
– Time series is a data plot is a data product
37
Class and property example
• Parameter
– Has coordinates (independent variables)
• Observatory
– Operates instruments
• Instrument
– Has operating mode
• Instrument operating mode
– Has measured parameters
• Date-time interval
• Data product
38
39
40
41
Higher level use case
• Find data which represents the state of the
neutral atmosphere above 100km, toward the
arctic circle at any time of high geomagnetic
activity
42
Extending the KR for a purpose
• Input
– Physical properties: State of neutral atmosphere
– Spatial: Above 100km; Toward arctic circle (above 45N)
– Conditions: High geomagnetic activity
– Action: Return Data
• GeoMagneticActivity has ProxyRepresentation
• GeophysicalIndex is a ProxyRepresentation (in Realm of Neutral Atmosphere)
• Kp is a GeophysicalIndex
– hasTemporalDomain: “daily”
– hasHighThreshold: xsd_number = 8
• Date/time when Kp >= 8
• Specification needed for query to CEDARWEB
– Instrument
– Parameter(s)
– Operating Mode
– Observatory
– Date/time
– Return-type: data
43
Translating the Use-Case ctd.
• Input
– Physical properties: State of neutral atmosphere
– Spatial: Above 100km; Toward arctic circle (above 45N)
– Conditions: High geomagnetic activity
– Action: Return Data
• NeutralAtmosphere is a subRealm of TerrestrialAtmosphere
– hasPhysicalProperties: NeutralTemperature, Neutral Wind, etc.
– hasSpatialDomain: [0,360],[0,180],[100,150]
– hasTemporalDomain:
• NeutralTemperature is a Temperature (which) is a Parameter
• GeoMagneticActivity has ProxyRepresentation
• GeophysicalIndex is a ProxyRepresentation (in Realm of Neutral Atmosphere)
• Kp is a GeophysicalIndex
– hasTemporalDomain: “daily”
– hasHighThreshold: xsd_number = 8
• Date/time when Kp >= 8
• FabryPerotInterferometer is a Interferometer, (which) is a Optical Instrument (which) is a Instrument
– hasFilterCentralWavelength: Wavelength
– hasLowerBoundFormationHeight: Height
• ArcticCircle is a GeographicRegion
– hasLatitudeBoundary:
– hasLatitudeUpperBoundary:
• Specification needed for query to CEDARWEB
– Instrument
– Parameter(s)
– Operating Mode
– Observatory
– Date/time
– Return-type: data
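One step of this translation can be sketched in code: turning the vague condition "high geomagnetic activity" into concrete dates by applying the high threshold to a daily Kp series. The Kp values below are made up for illustration, and the function name high_activity_dates is hypothetical; only the threshold of 8 and the daily temporal domain come from the slide.

```python
from datetime import date

# hasHighThreshold: xsd_number = 8 (from the slide)
KP_HIGH_THRESHOLD = 8

# Illustrative daily Kp values, NOT real observations.
daily_kp = {
    date(2000, 1, 1): 3,
    date(2000, 1, 2): 8,
    date(2000, 1, 3): 9,
    date(2000, 1, 4): 5,
}


def high_activity_dates(kp_by_day, threshold=KP_HIGH_THRESHOLD):
    """Return the dates on which the proxy index meets the high threshold."""
    return sorted(day for day, kp in kp_by_day.items() if kp >= threshold)


# These dates would fill the Date/time slot of the data query.
print(high_activity_dates(daily_kp))
```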
44
Knowledge representation - visual
• UML – Unified Modeling Language
– Ontology Definition Metamodel/Meta Object
Facility (OMG) for UML
– Provides standardized notation
• CMAP Ontology Editor (concept mapping tool
from IHMC - http://cmap.ihmc.us/coe )
– Drag/drop visual development of classes,
subclass (is-a) and property relationships
– Reads and writes OWL
– Formal convention (OWL/RDF tags, etc.)
• White board, text file
45
46
Representing processes
47
Is OWL/RDF the only option? No…
• SKOS - Simple Knowledge Organization System for
Taxonomies http://www.w3.org/2004/02/skos/
• Annotations (RDFa) – for un- or semi-structured
information sources http://www.w3.org/TR/xhtml-rdfa-primer/ http://rdfa.info
• Atom (and RSS) – for representing syndication feeds –
structured http://tools.ietf.org/html/rfc4287
• More expressive languages IKL, CL, …
• Languages aimed at different paradigms – e.g., rule
languages
48
Query
• Querying knowledge representations in OWL and/or
RDF
• SPARQL for RDF http://www.sparql.org/ and
http://www.w3.org/TR/rdf-sparql-query/
• OWL-QL (for OWL)
http://projects.semwebcentral.org/projects/owl-ql/
• XQuery (for XML)
• SeRQL (for Sesame)
• RDFQuery (RDF)
• Few as yet for natural language representations
49
Best practices (some)
• Ontologies/ vocabularies must be shared and
reused - swoogle.umbc.edu, bioportal, OOR
• Examine ‘core vocabularies’ to start with
– SKOS Core: about knowledge systems
– Dublin Core: about information resources, digital libraries,
with extensions for rights, permissions, digital rights
management
– FOAF: about people and their organizations
– SIOC: about communities
– DOAP: on the descriptions of software projects
– DOLCE seems the most promising to match science
ontologies
• Go “Lite” as much as possible, then increasing logic
- balancing expressibility vs. implementability
• Minimal properties to start, add only when needed
50
Assembling Knowledge
Aggregation, Integration, Inference
“When it comes to data cleaning, there’s no such thing as a free lunch.” Tim Berners-Lee
• Case Study: BioDASH Aggregation
• Case Study: BioPAX Integration
• Case Study: Flux Balance Analysis
Some tasks are specific to a use case, some are
common to more than one and there’s no
escaping others.
The Siderean Demo
Aggregation Case Study
• Question: What drugs can be used as
candidates for treating B-cell
lymphoma patients?
• By comparing gene expression patterns
between patients with and without
B-cell lymphoma, a top biomarker was
found: BRKCB-1
Seamark Demo: Background & Concepts
• Demonstration premise
– RDF offers high value during early stage research
• Leveraging strengths of Oracle 10g & Seamark v3.6
– Oracle – large datasets / scalability
– Seamark – useful subsets / flexible navigation
• Project elapsed time - about one week
– Locating and identifying data sources represented the
greatest time element
– Data sources in RDF required minimal integration time
– Non-RDF data sources required transformation and linking
values (non-trivial but straightforward)
53
Seamark Demonstration: Identification of new drug candidates
[Figure: linked data sources and the entities that connect them. Data sources: GO2Keyword.rdf, Keywords.rdf, ProbeSet.rdf, GO2UniProt.rdf, GO2OMIM.rdf, IntAct.rdf, OMIM.rdf, GO.rdf, UniProt.rdf, GO2Enzyme.rdf, Taxonomy.rdf, PubMed.xml, Enzymes.rdf, KEGG.rdf. Entities: Keyword, Probe, Protein, Gene, MIM Id, Organism, Enzyme, Citation, Compound, Pathway]
1. Differentiate different forms of disease
2. Identify patient subgroups
3. Identify top biomarkers
4. Identify function
5. Identify biological and chemical properties and disease associations of biomarker
6. Identify documents
7. Identify role in metabolic pathways
8. Identify compounds that interact
9. Identify and compare function in other organisms
10. Identify any prior art
Siderean Seamark Demonstration in collaboration with Joanne Luciano, Predictive Medicine, Inc.
54
BioPAX
Biological PAthway eXchange
An abstract data model for biological
pathway integration
Initiative arose from the community
Biological Pathways of the Cell
• Metabolic Pathways: BioPAX Level 1
• Molecular Interaction Networks: BioPAX Level 2
• Signaling Pathways: BioPAX Level 3
• Gene Regulation: BioPAX Level 4
Different representations of the
same pathways
Reactions clickable but...
Does not
compute.
Pretty,
but useless
Starts at Glucose (but it doesn’t matter)
BioCarta Reference Pathway GLYCOLYSIS
Pathway Data (domain)
How bad is it?
Pathway Databases
So many pathway databases, so little time.
Graphic from Mike Cary and Gary Bader
Exchange Formats in Pathway Data Space
[Figure: scope of exchange formats across pathway data. Database exchange formats: BioPAX, PSI-MI 2. Simulation model exchange formats: SBML, CellML. Region labels: Genetic Interactions; Molecular Interactions (Pro:Pro, All:All); Interaction Networks (Molecular, Non-molecular, Pro:Pro, TF:Gene, Genetic); Regulatory Pathways; Metabolic Pathways; Biochemical Reactions; Small Molecules; Rate Formulas. Detail scales run from Low Detail to High Detail]
Graphic from Mike Cary & Gary Bader
BioPAX Motivation
>180 DBs and tools
Application
Database
User
Before BioPAX
With BioPAX
Common format will make data more accessible,
promoting data sharing and distributed curation efforts
BioPAX Objectives
• Accommodate existing database
representations
• Integration and exchange of pathway
data
• Interchange through a common
(standard) representation
• Provide a basis for future databases
• Enable development of tools for searching
and reasoning over the data
Data Aggregation, Integration
and Inference with BioPAX
1. Multiple kinds of pathway databases
– metabolic
– molecular interactions
– signal transduction
2. Constructs designed for integration
– DB References
– XRefs (Publication, Unification, Relationship)
– synonyms
– provenance
3. OWL DL – to enable reasoning
BioPAX Biochemical Reaction
OWL
(schema)
Instances
(Individuals)
(data)
phosphoglucose
isomerase
5.3.1.9
BioPAX Ontology: Overview
a set of
interactions
parts
how the parts are known to interact
Level 1 v1.0 (July 7th, 2004)
BioDASH
Bridging Chemistry and Molecular Biology
• Different views have different semantics: Lenses
• When there is a correspondence between
objects, a semantic binding is possible
Uniprot:P49841
Apply Correspondence Rule:
if ?target.xref.lsid == ?bpx:prot.xref.lsid
then ?target.correspondsTo.?bpx:prot
Source: Eric Neumann
Haystack BioDASH Demo http://www.w3.org/2005/04/swls/BioDash/Demo/
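The correspondence rule can be read as a join on a shared identifier. The sketch below is one possible reading, assuming each object carries a single xref LSID; the Entity class, the record names, and the second identifier are invented for illustration, and only Uniprot:P49841 appears on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str
    xref_lsid: str


def apply_correspondence_rule(targets, proteins):
    """if ?target.xref.lsid == ?bpx:prot.xref.lsid
    then ?target.correspondsTo.?bpx:prot"""
    by_lsid = {}
    for protein in proteins:
        by_lsid.setdefault(protein.xref_lsid, []).append(protein)
    return [
        (target.name, "correspondsTo", protein.name)
        for target in targets
        for protein in by_lsid.get(target.xref_lsid, [])
    ]


# Hypothetical records from two different views of the data.
targets = [
    Entity("target-1", "Uniprot:P49841"),
    Entity("target-2", "Uniprot:EXAMPLE"),  # placeholder id with no counterpart
]
proteins = [Entity("bpx-protein-1", "Uniprot:P49841")]

links = apply_correspondence_rule(targets, proteins)
print(links)
```

The semantic binding exists only where both views reference the same identifier; objects without a shared identifier stay unlinked.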
Summary
• The science of knowledge representation has, throughout its
history, consisted of a compromise between pragmatism,
scientific rigor, and accessibility to domain experts
• Many different options for ontology development and
encoding, i.e. knowledge representation
• Sometimes, your choice of representation may need to
change based on language and tools availability/
capability…
• Balancing expressivity and implementability means we favor
an object-type, e.g. DL representation (but also suggests the
need for a meta-representation: e.g. KIF – Knowledge
Interchange Format)
• Next class (3) – ontology engineering
• Use cases should drive the functional requirements of both
your ontology and how you will ‘build’ one (see class 4)
66
Assignment for Week 2
• Reading:
– Semantic Web for the Working Ontologist
– Alternate reading: Pizza Tutorial
• Assignment 1:
Representing Knowledge and Understanding
Representations
67
Extras
68
DOLCE + SWEET
[Table: DOLCE categories aligned with SWEET classes, in columns "DOLCE", "= SWEET" and "< SWEET"]
DOLCE: Physical-body, Material-Artifact, Physical-Object, Amount-of-Matter, Activity, Physical-Phenomenon, Process, State, Quality, Physical-Region, Temporal-Region
SWEET: BodyofGround, BodyofWater, …; Infrastructure, Dam, Product, …; LivingThing, MarineAnimal; Substance; HumanActivity; Phenomena; Process; StateOfMatter; Quantity, Moisture, …; Basalt, …; Ordovician, …
Benefits
• full coverage
• rich relations
• home for orphans
• single superclasses
Issues
• individuals (e.g. Planet Earth)
• roles (contaminant)
• features (SeaFloor)
69
Courtesy: Boyan Brodaric
Conclusions
• Surprisingly good fit amongst ontologies
– so far: no show-stopper conflicts, a few difficult conflicts
• DOLCE richness benefits geoscience ontologies
– good conceptual foundation helps clear some existing problems
• Unresolved issues in modeling science entities
– modeling classifications, interpretations, theories, models, …
• Same procedure with GeoSciML
70
Courtesy: Boyan Brodaric
[Figure: sources of attributes and terms: NC basic attributes; CF attributes; IRIDL attributes/objects; CF data objects; SWEET Ontologies (OWL); CF Standard Names (RDF object); Location; CF Standard Names As Terms; IRIDL Terms; SWEET as Terms; Search Terms; Gazetteer Terms]
71
Blumenthal
IRI RDF Architecture
[Figure: architecture components: MMI; Data Servers; Ontologies; JPL; bibliography; Start Point; Standards Organizations; RDF Crawler; RDFS Semantics; Owl Semantics; SWRL Rules; SeRQL CONSTRUCT; Sesame; Location Canonicalizer; Time Canonicalizer; Search Queries; Search Interface]
72
Blumenthal
CLCE - Common Logic Controlled English
CLCE: If a set x is the set of (a cat, a
dog, and an elephant), then the cat is an
element of x, the dog is an element of x,
and the elephant is an element of x.
PC: ~(∃x:Set)(∃x1:Cat)(∃x2:Dog)(∃x3:Elephant)(Set(x,x1,x2,x3) ∧ ~(x1∈x ∧ x2∈x ∧ x3∈x))
73
Use Case
• Provide a decision support capability for an
analyst to determine an individual’s
susceptibility to avian flu without having to be
precise in terminology (-nyms)
74
75
76
Building SKOS
• ThManager
• Protégé (4) plugin for SKOS
77
Is OWL the only option II? No…
• Natural Language (NL)
– Read results from a web search and transform to a
usable form
– Find/filter out inconsistencies, concepts/relations that
cannot be represented
• Popular options
– CLCE (Common Logic Controlled English)
– Rabbit, e.g. ShellfishCourse is a Meal Course that (if has
drink) always has drink Potable Liquid that has Full body
and which either has Moderate or Strong flavour
– PENG (Processable English)
• Really need PSCI - process-able science but that’s
another story (research project)
78
Sydney syntax
If X has Y as a father then Y is the
only father of X.
The class person is equivalent to
male or female, and male and
female are mutually exclusive.
equivalent to
The classes male and female are
mutually exclusive. The class
person is fully defined as anything
that is a male or a female.
79
PENG - Processable English
1. If X is a research programmer then
X is a programmer.
2. Bill Smith is a research
programmer who works at the CLT.
3. Who is a programmer and works at
the CLT?
80
Rules (aka ‘Logic’)
• OWL is based on Description Logic
• OWL DL follows it precisely
• There are things that DL cannot express
(though there are things that are difficult to
express with rules and easy in DL...)
– A well-known example is Horn rules (e.g., the
‘uncle’ relationship): (P1 ∧ P2 ∧ ...) → C
– e.g.: parent(?x,?y) ∧ brother(?y,?z) ⇒
uncle(?x,?z)
– Or, for any X, Y and Z: if Y is a parent of X, and Z
is a brother of Y then Z is the uncle of X
81
Examples from
http://www.w3.org/Submission/SWRL/
• A simple use of these rules would be to assert that
the combination of the hasParent and
hasBrother properties implies the hasUncle
property. Informally, this rule could be written as:
– hasParent(?x1,?x2) ∧ hasBrother(?x2,?x3) ⇒
hasUncle(?x1,?x3)
• In the abstract syntax the rule would be written like:
– Implies(Antecedent(hasParent(I-variable(x1) I-variable(x2))
hasBrother(I-variable(x2) I-variable(x3)))
Consequent(hasUncle(I-variable(x1) I-variable(x3))))
• From this rule, if John has Mary as a parent and
Mary has Bill as a brother then John has Bill as an
uncle.
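The uncle rule amounts to a join of two relations on the shared variable ?x2. A minimal sketch, assuming each property is stored as a set of (subject, object) pairs; the function name infer_uncles is illustrative and is not part of SWRL or any rule engine.

```python
# hasParent(John, Mary) and hasBrother(Mary, Bill), from the example above.
has_parent = {("John", "Mary")}
has_brother = {("Mary", "Bill")}


def infer_uncles(has_parent, has_brother):
    """hasParent(?x1,?x2) AND hasBrother(?x2,?x3) => hasUncle(?x1,?x3)"""
    return {
        (x1, x3)
        for (x1, x2) in has_parent
        for (sibling_of, x3) in has_brother
        if x2 == sibling_of  # both atoms must bind ?x2 to the same individual
    }


has_uncle = infer_uncles(has_parent, has_brother)
print(has_uncle)
```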
82
Examples
• An even simpler rule would be to assert that
Students are Persons, as in
– Student(?x1) ⇒ Person(?x1).
– Implies(Antecedent(Student(I-variable(x1))) Consequent(Person(I-variable(x1))))
– However, this kind of use for rules in OWL just duplicates
the OWL subclass facility. It is logically equivalent to write
instead
• Class(Student partial Person) or
• SubClassOf(Student Person)
– which would make the information directly available to an
OWL reasoner.
83
Semantic Web with Rules
• Metalog
• RuleML
• SWRL
• RIF
• OWL 2 RL
• WRL
• Cwm
• Jess - rules engine
84
Developing a service ontology
• Use case: find and display in the same projection,
sea surface temperature and land surface
temperature from a global climate model.
• Classes/ concepts:
– Temperature
– Surface (sea/ land)
– Model
– Climate
– Global
– Projection
– Display …
85
Service ontology
• Climate model is a model
• Model has domain
• Climate Model has component representation
• Land surface is-a component representation
• Ocean is-a component representation
• Sea surface is part of ocean
• Model has spatial representation (and temporal)
• Spatial representation has dimensions
• Latitude-longitude is a horizontal spatial representation
• Displaced pole is a horizontal spatial representation
• Ocean model has displaced pole representation
• Land surface model has latitude-longitude representation
• Lambert conformal is a geographic spatial representation
• Reprojection is a transform between spatial representation
• …
86
Service ontology
• A sea surface model has grid representation displaced pole
and land surface model has grid representation latitude-longitude,
and both must be transformed to Lambert
conformal for display
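A small sketch of how a service could act on this knowledge: look up each model's native grid representation and plan a reprojection wherever it differs from the display projection. The dictionary and the function name reprojection_plan are illustrative assumptions; the representations themselves come from the statement above.

```python
# Native grid representation of each model, as stated on the slide.
native_representation = {
    "sea surface model": "displaced pole",
    "land surface model": "latitude-longitude",
}

DISPLAY_PROJECTION = "Lambert conformal"


def reprojection_plan(models, display=DISPLAY_PROJECTION):
    """List the (model, from, to) transforms needed for a common display."""
    return [
        (model, native_representation[model], display)
        for model in models
        if native_representation[model] != display
    ]


plan = reprojection_plan(["sea surface model", "land surface model"])
for model, source, target in plan:
    print(f"{model}: reproject {source} -> {target}")
```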
87