The Semantic Web and Efficient Reuse of Ontology Modules
MSc CO3701 Advanced Database Systems Research Topics
5 March 2008
David George, Dept. Computing, UCLan.
What is the Semantic Web?
• A project that aims to make web pages machine-understandable.
“An extension of the current Web, … information given well-defined
meaning, …enabling computers and people to work in co-operation”
(Berners-Lee et al, 2001)
• A universal medium for information integration and exchange.
• Uses Ontology – a formal domain representation that specifies
the meaning (semantics) of a domain or context.
We have the Web: a Global Information Space
Some current Web statistics
• Approx. 70m web sites
• Circa 15-20 billion pages (files)
Semantic Web share
• 0.004% usable Semantic Web files (800k)
• 0.00005% are Ontology files (10k)
Swoogle
Visualising the Semantic Web?
DE BRUIJN, J. (2003) Using Ontologies - Enabling Knowledge Sharing and Reuse on the Semantic Web [online]. DERI – Digital
Enterprise Research Institute. Available from: http://www.deri.ie/publications/techpapers/documents/DERI-TR-2003-10-29.pdf.
[Accessed 4 March 2008].
What is an Ontology?
• “An Ontology is a formal, explicit specification of a shared
conceptualization” (Gruber, 1993 & Borst, 1997).
• Ontology specifies the vocabulary of a “Domain”
– concepts and their attributes
– relationships between concepts
– constraints on those relationships
Example: moonRocket [ hasWeight = 100t ] makesSolarFlightTo Moon
Constraint: Every moonRocket makesSolarFlightTo only Moon
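The moonRocket example can be mimicked in a few lines of Python. This is only a rough sketch, not OWL: the dictionary layout, the `violations` helper and the extra "Mars" destination are assumptions introduced for illustration, and the check only approximates the "only Moon" constraint.

```python
# Hypothetical in-memory model of the slide's example (not OWL).
attributes = {"moonRocket": {"hasWeight": "100t"}}

# Relationship assertions as (subject, relation, object).
flights = [("moonRocket", "makesSolarFlightTo", "Moon")]

# Constraint: every moonRocket makesSolarFlightTo only Moon.
allowed_targets = {("moonRocket", "makesSolarFlightTo"): {"Moon"}}


def violations(assertions, constraints):
    """Return the assertions whose object falls outside the allowed set."""
    return [
        (s, r, o)
        for (s, r, o) in assertions
        if (s, r) in constraints and o not in constraints[(s, r)]
    ]


# "Mars" is an invented destination used only to trigger the constraint.
bad_flight = ("moonRocket", "makesSolarFlightTo", "Mars")
print(violations(flights, allowed_targets))
print(violations(flights + [bad_flight], allowed_targets))
```

The first call reports nothing, because the only asserted flight goes to Moon; the second reports the invented Mars flight.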
What does an Ontology look like?
An example of a Bibliographic Ontology
[Figure: class diagram of the bibliographic ontology. Classes shown: Biblio-Thing, Agent, Document, Person, Book, Thesis, Periodical-Publication, Journal, Author, Publisher, Organisation, University, Doctoral-Thesis, Newspaper, Master-Thesis, Magazine]
A useful source for ontologies:
http://protegewiki.stanford.edu/index.php/Protege_Ontology_Library
Semantic Web Technologies
• Based on “XML-based” RDF (Resource Description Framework)
and OWL Ontology languages (W3C, 2004).
• OWL has foundations in Description Logics (DL)
– decidable fragments of First Order Logic.
• OWL can be reasoned over using DIG reasoners (DIG is short for DL
Implementation Group)
– A reasoner can establish the subclass/superclass relationships of a concept.
– It can infer equivalence and transitivity of classes and relations.
– It can determine ontology consistency.
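To give a feel for what establishing subclass/superclass relationships means, here is a minimal Python sketch of transitive subclass inference. It is not a DL reasoner and does not use the DIG interface. The class names come from the bibliographic example, but the parent links and the single-parent simplification are assumptions made for illustration.

```python
# Assumed direct subclass links (child -> parent). The slides list these
# class names; the exact hierarchy here is a guess for illustration.
direct_parent = {
    "Doctoral-Thesis": "Thesis",
    "Thesis": "Document",
    "Document": "Biblio-Thing",
}


def superclasses(cls, parents):
    """Follow child -> parent links to collect all inferred superclasses."""
    found = []
    while cls in parents:
        cls = parents[cls]
        found.append(cls)
    return found


def is_subclass_of(child, ancestor, parents):
    """True if ancestor is reachable from child via parent links."""
    return ancestor in superclasses(child, parents)


print(superclasses("Doctoral-Thesis", direct_parent))
print(is_subclass_of("Doctoral-Thesis", "Biblio-Thing", direct_parent))
```

Only the direct links are stated; the fact that a Doctoral-Thesis is also a Biblio-Thing is inferred by following them transitively.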
RDF (Resource Description Framework)
• Built on subject, predicate, object triples [a statement]
• A statement may say: <student> <lastname> is <George>
• For example: [Figure: a triple with its subject, predicate and object labelled]
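The triple structure can be sketched with plain Python tuples. This deliberately avoids any RDF library; the `Triple` type and the `objects_of` helper are hypothetical names chosen for this illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# The statement from the slide: <student> <lastname> is <George>.
statement = Triple("student", "lastname", "George")


def objects_of(triples, subject, predicate):
    """Return every object asserted for a given subject and predicate."""
    return [
        t.obj for t in triples
        if t.subject == subject and t.predicate == predicate
    ]


print(objects_of([statement], "student", "lastname"))
```

A real RDF store would identify the subject and predicate by URIs rather than bare strings; plain strings are used here only to keep the sketch short.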
Uses for the Semantic Web?
• Data integration e.g. integrating heterogeneous database
structures/schemas and semantics?
• Annotation of Internet resources, i.e. Web pages – to assist Web
crawlers/robots/spiders. Semantic (Shadow) Web?
• Support Search Engine queries – to improve relevance of retrieval
hits?
• Facilitation of understanding between e-government portal
terminology and users’ natural language?
Typical Search Engine Query
[Figure: search hits for a typical query; diagram labels: Search Hits, Semantic, DB, KB]
Ontology Development
One large ontology or many?
– Complexity of ontology specification makes one large ontology impractical.
– How do you describe the world!!
– Ontologies conceptualised by domain specialists.
– Applications will require ontology integration capability.
– Fulfils Reuse capability
– Risk of redundancy through overlapping class sets.
So let’s consider Land Transport ….
Our Transport Ontology
Possible Application uses:
– Public transport services
– Commercial Freight services
– Linking towns and cities by road and rail
– We may need to consider bringing together road, rail and
population centre ontologies.
But first, why not use an existing ontology?
• Reuse via Ontology imports
– E.g. if OTN [1] is imported: what do we see?
– Small Ontology but describes multiple sub-domains
• Potential redundancy
• Vulnerability to change
• How relevant are they?
• Only for an application that uses ALL concepts
[1] OTN - Ontology of Transportation Networks (Lorenz et al, 2005)
Our Land Transport scenario
The Channel Tunnel Rail Link (CTRL) is the international connection and, whilst essentially a single mode of transport, it
also interfaces with road transport. Other road-rail interfaces, not shown, might be level crossings and transport
interchanges.
Cheriton Channel Tunnel Terminal © OS Get-a-Map.
The multimodal element of CTRL operation is the Channel Tunnel transport interchange in Cheriton, accessed by road and
rail for its drive-on/drive-off service.
Let us assume the Rail domain contains various concept and relationship statements:
• RailRoute startsFrom RailwayStation
• RailwayStation locatedIn City
• RailRoute hasRailComponent RailwayLine
• RailwayLine meetsObstacle LevelCrossing
• LevelCrossing intersectionBetween (RailwayLine ⊓ Highway)
• RailwayStation accessedVia Highway
These statements are combined to form the Rail model
NB: certain concepts (City, Highway) are likely to be logical concepts in Road and
PopGroup sub-domains.
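As a rough sketch, the six Rail statements can be held as Python triples and the concepts they mention collected automatically. The tuple encoding of the intersection (⊓) and the `concepts` helper are assumptions made for this illustration, not part of the original model.

```python
# The six Rail statements from the slide as (subject, relation, object).
# The intersection RailwayLine ⊓ Highway is stored as a tuple of names.
rail_statements = [
    ("RailRoute", "startsFrom", "RailwayStation"),
    ("RailwayStation", "locatedIn", "City"),
    ("RailRoute", "hasRailComponent", "RailwayLine"),
    ("RailwayLine", "meetsObstacle", "LevelCrossing"),
    ("LevelCrossing", "intersectionBetween", ("RailwayLine", "Highway")),
    ("RailwayStation", "accessedVia", "Highway"),
]


def concepts(statements):
    """Collect every concept name used as a subject or an object."""
    found = set()
    for subject, _relation, obj in statements:
        found.add(subject)
        found.update(obj if isinstance(obj, tuple) else (obj,))
    return found


rail_concepts = concepts(rail_statements)

# Concepts the slide flags as likely to belong to Road and PopGroup.
shared_candidates = {"City", "Highway"}
print(sorted(rail_concepts & shared_candidates))
```

Both City and Highway turn up in the Rail model even though they are not rail-specific, which is the overlap the note above points out.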
For the Road module fragment we have described that:
– a highway provides access to a city and transport facility
– a drive-on/drive-off facility is available at a transport interchange
– our highway encounters a railway (level crossing)
– various operators use the transport interchange.
Again, for the Road domain we see that certain concepts (City, Highway)
replicate those in the Rail sub-domain.
The PopGroup sub-domain shows various travel relationships involving City and
Town, and the DormitoryTownRole that a Town may fulfil.
PopGroup would specify how concepts might be accessed from each other, again
resulting in relationships similar to those in Rail and Road.
LandTransport Ontology
We have three sub-domains created as modules or contexts.
These Contexts might now be logically clustered within a multimodal
LandTransport application ontology, itself containing general transport
concepts: TransportInterchange, TransportOperator and
various transport roles.
Implications of importing the Road, Rail and PopGroup modules into
LandTransport?
We see:
Concept duplication and redundancy, e.g. rail:RailwayStation,
road:RailwayStation and pop:RailwayStation; also between
rail:City, road:City and pop:City.
Relation duplications, such as rail:encountersHazard and
road:encountersHazard.
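The duplication can be detected mechanically by grouping prefixed names by their local part. The sketch below uses only the names listed above; the `duplicated_local_names` helper is a hypothetical name, and treating an identical local name as the same concept is a simplifying assumption.

```python
from collections import defaultdict

# Prefixed names exactly as listed on the slide.
imported_names = [
    "rail:RailwayStation", "road:RailwayStation", "pop:RailwayStation",
    "rail:City", "road:City", "pop:City",
    "rail:encountersHazard", "road:encountersHazard",
]


def duplicated_local_names(qualified_names):
    """Map each local name to the modules defining it, keeping only
    names that appear in more than one module."""
    modules_by_name = defaultdict(set)
    for qname in qualified_names:
        prefix, local = qname.split(":", 1)
        modules_by_name[local].add(prefix)
    return {
        local: sorted(prefixes)
        for local, prefixes in modules_by_name.items()
        if len(prefixes) > 1
    }


duplicates = duplicated_local_names(imported_names)
for local, prefixes in duplicates.items():
    print(local, "defined in", prefixes)
```

In practice, matching on names alone is not enough, since two modules may use the same label for different things. The sketch shows only the bookkeeping, not the semantic comparison.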
Land Transport + Contexts
So how do we develop “Geo-Modules”?
• Need to “de-integrate” to allow low-cost integration
• Aim towards “effectively” disjoint domains
• Deliver by removing concept duplication (redundancy) between
modules
• Need to promote/relegate multi-context or single-context concepts
and relations
Visualising de-integration of domains
Semantic layering is a conceptual process of module de-integration
that distinguishes the concepts belonging to each context.
We reduce each sub-domain to a single context; the Rail model,
for example, is depicted on the next page.
Rail is now stated more formally:
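Let a domain ontology O, which contains concepts C and relations R and has a
domain context CT, be the tuple
$O = \langle \langle C_1, \ldots, C_n \rangle, \langle R_1, \ldots, R_n \rangle, CT \rangle$.
The multi sub-domain ontology set can then be represented as:
$O = \langle \langle (CP_1, \ldots, CP_n), (CS_1, \ldots, CS_n) \rangle, \langle (RP_1, \ldots, RP_n), (RS_1, \ldots, RS_n) \rangle, \langle CT_P, (CT_{S_1}, \ldots, CT_{S_n}) \rangle \rangle$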
Transportation Domain Layers
In each sub-domain we differentiate between primary and secondary concepts and
relations. Any secondary concepts and relations are removed and become primary
concepts and relations in their own contexts.
Comparisons show a reduction in classes (from 17 to 11, roughly 35%) and in relations (from 34 to 20, roughly 41%).
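The primary/secondary split can be sketched as a filter that keeps each concept only in its home context. The module contents below are partly taken from the slides (the Rail statements and the duplicated names) and partly guessed. The `home_context` assignments and the `de_integrate` helper are assumptions for illustration, and the counts do not reproduce the 17-to-11 figure.

```python
# Concepts per module before de-integration. The Rail set follows the
# Rail statements; the Road and PopGroup sets are illustrative guesses.
modules = {
    "rail": {"RailRoute", "RailwayStation", "RailwayLine",
             "LevelCrossing", "City", "Highway"},
    "road": {"Highway", "City", "RailwayStation"},
    "pop": {"City", "Town", "RailwayStation"},
}

# Assumed home (primary) context for each multi-context concept.
home_context = {"Highway": "road", "City": "pop", "RailwayStation": "rail"}


def de_integrate(modules, home_context):
    """Keep a concept only in its home context; concepts with no listed
    home are treated as primary wherever they appear."""
    return {
        name: {c for c in concepts if home_context.get(c, name) == name}
        for name, concepts in modules.items()
    }


layered = de_integrate(modules, home_context)
before = sum(len(c) for c in modules.values())
after = sum(len(c) for c in layered.values())
print(before, "->", after)
```

After filtering, each concept appears in exactly one module, so the modules are effectively disjoint and can be recombined by import without duplication.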
End