Semantic Web 2 - Riga Technical University


Semantic Web
Andrejs Lesovskis
This lecture’s agenda
Metadata
Introduction to Resource Description Framework
RDF triples
RDF serialization formats
RDF Schema
RDF and relational databases
Metadata
The term "meta" comes from a Greek
word that denotes something of a
higher or more fundamental nature.
Metadata, then, is data about other
data.
The term refers to any data used to aid
the identification, description and
location of networked electronic
resources.
Metadata
For example, we have a file that contains
some image data. Then file metadata could
be the following:
 name of an author,
 the date and time a picture was taken,
 location where a picture was taken,
 model of a camera that was used,
 etc.
Metadata example (1)
Metadata example (2)
Metadata types
Metadata can describe:
 data contents (short summary, limitations, etc.);
 data access history;
 access rights;
 relations between data.
Uses of metadata
Metadata can be used in:
 content ranking;
 resource searching;
 resource integration;
 defining relations between intelligent agents.
Metadata facts to remember
Metadata does not have to be digital.
Metadata relates to more than the description of an object.
Metadata can come from a variety of sources.
Metadata continues to accrue during the life of an information object or system.
One information object's metadata can
simultaneously be another information
object's data.
Semantic Web layers
W3C Semantic Web Activity Statement
"The Resource Description Framework (RDF)
is a language for representing information
about resources in the World Wide Web. It is
particularly intended for representing
metadata about Web resources, such as the
title, author, and modification date of a Web
page, copyright and licensing information
about a Web document, or the availability
schedule for some shared resource."
W3C
Resource Description Framework
A framework (not a language) for representing information on the Web.
RDF is a standard model for data interchange on the Web.
It provides a syntax that allows information stored in various locations to be exchanged and used.
The point is to facilitate reading and correct use of information by machines, not necessarily by people.
What is a resource?
A resource is anything that can be identified and described.
 A resource can be identified by a URI, or it can be a blank node.
 A resource can be abstract.

The first precise definition of a resource can
be found in the RFC 2396 standard:
http://tools.ietf.org/html/rfc2396
Main goals of RDF

Integrate data from multiple sources.
Allow the re-use of data in different projects and organizations.
Decentralize data in a way that no single party "owns" all the data.
XML and RDF (1)
XML (Yangtze.xml):
<?xml version="1.0"?>
<River id="Yangtze"
xmlns="http://www.geodesy.org/river">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>
Can be converted to RDF (Yangtze.rdf):
<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>
XML and RDF (2)
1 RDF provides an ID attribute for identifying the resource being described.
2 The ID attribute is in the RDF namespace.
3 Add the "fragment identifier symbol" (#) to the namespace.
<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau
</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>
XML and RDF (3)
1 The element name (River) identifies the type (class) of the resource being described.
2 rdf:ID identifies the resource being described. This resource is an instance of River.
3 The child elements (length, startingLocation, endingLocation) are properties, or attributes, of the type (class).
4 The element contents are the values of the properties.
<?xml version="1.0"?>
<River rdf:ID="Yangtze"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.geodesy.org/river#">
<length>6300 kilometers</length>
<startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
<endingLocation>East China Sea</endingLocation>
</River>
RDF triple structure (1)
The RDF data model is based upon the idea of
making statements about resources in
the form of subject-predicate-object
expressions (triples).
[Diagram: resource -- property --> value]
RDF triple structure (2)
Every triple contains some statement.
[Diagram labels: Statement, Resource, Property, Value (which may itself be a Resource)]
RDF triple structure (3)
In RDF, the English statement
"The owner of the web-site Gmail at
http://www.gmail.com is Google."
would look like this:
[Graph labels: Gmail, owned by, url, http://www.gmail.com]
Binary predicates
RDF offers only
binary predicates.
Think of them as
P(x,y) where P is the
relationship between
the objects x and y.
[Graph: http://www.w3schools.com/RDF -- author --> Jan Egil Refsnes]
From the example,
X = http://www.w3schools.com/RDF
Y = Jan Egil Refsnes
P = author
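The P(x, y) view of a triple can be sketched in a few lines of Python. This is only an illustration: the `Triple` type and the `holds` helper are names invented here, the short predicate labels are informal rather than real vocabulary URIs, and "Google" as the owner of Gmail is taken from the English statement on the earlier slide.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Statements from the slides, written with informal predicate labels.
statements = [
    Triple("http://www.w3schools.com/RDF", "author", "Jan Egil Refsnes"),
    Triple("Gmail", "owned by", "Google"),
    Triple("Gmail", "url", "http://www.gmail.com"),
]


def holds(p: str, x: str, y: str) -> bool:
    """Return True if the binary predicate P(x, y) has been asserted."""
    return Triple(x, p, y) in statements


print(holds("author", "http://www.w3schools.com/RDF", "Jan Egil Refsnes"))
```

Because every predicate is binary, a statement involving more than two things has to be broken into several triples that share a subject, as with the two Gmail statements here.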
RDF triple example (1)
RDF graph example:
RDF triple example (2)
RDF graph example:
RDF triple example (3)
RDF/XML code that corresponds to the graph on the previous slide:
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact#">
<contact:Person
rdf:about="http://www.w3.org/People/EM/contact#me">
<contact:fullName>Eric Miller</contact:fullName>
<contact:mailbox rdf:resource="mailto:em@w3.org"/>
<contact:personalTitle>Dr.</contact:personalTitle>
</contact:Person>
</rdf:RDF>
URIs and RDF
RDF uses URI references to define its
subjects, predicates, and objects.
A URI reference (or URIref) is a URI, together
with an optional fragment identifier at the end.
E.g., the URI
http://www.example.org/index.html#section2
consists of:
the URI http://www.example.org/index.html
the fragment identifier: section2.
A resource is identifiable by a URI reference
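As a small sketch of the split described above, Python's standard `urllib.parse.urldefrag` separates a URI reference into the URI and its fragment identifier. The variable names are chosen here for illustration.

```python
from urllib.parse import urldefrag

uriref = "http://www.example.org/index.html#section2"

# urldefrag returns a (url, fragment) pair.
uri, fragment = urldefrag(uriref)

print(uri)       # the URI without the fragment
print(fragment)  # the fragment identifier, without the leading '#'
```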
RDF serialization formats

RDF/XML;

Notation 3 (N3);

Turtle;

N-Triples.
RDF/XML
 RDF/XML is a syntax, defined by the W3C, to
express (serialize) an RDF graph as an XML
document;
 According to the W3C, "RDF/XML is the
normative syntax for writing RDF";
 Was endorsed as a recommendation on
February 10, 2004.
RDF/XML example
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
xmlns:edu="http://www.example.org/">
<rdf:Description rdf:about="http://www.princeton.edu">
<geo:lat>40.35</geo:lat>
<geo:long>-74.66</geo:long>
<edu:hasDept rdf:resource="http://www.cs.princeton.edu"/>
</rdf:Description>
<rdf:Description rdf:about="http://www.cs.princeton.edu">
<dc:title>Department of Computer Science</dc:title>
</rdf:Description>
</rdf:RDF>
Notation3 (N3)
 A much more compact and readable format that doesn’t use XML syntax;
 Is being developed by Tim Berners-Lee and
Semantic Web community members;
 N3 files use UTF-8 encoding.
Example (graph)
Example in RDF/XML
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:m="http://www.example.org/meeting_organization#"
xmlns="http://www.example.org/people#"
xmlns:p="http://www.example.org/personal_details#">
<rdf:Description rdf:about="http://meetings.example.com/cal#m1">
<m:homePage rdf:resource="http://meetings.example.com/m1/hp"/>
</rdf:Description>
<rdf:Description rdf:about="http://www.example.org/people#fred">
<m:attending rdf:resource="http://meetings.example.com/cal#m1"/>
<p:GivenName>Fred</p:GivenName>
<p:hasEmail rdf:resource="mailto:[email protected]"/>
</rdf:Description>
</rdf:RDF>
Example in Notation3
@prefix p: <http://www.example.org/personal_details#> .
@prefix m: <http://www.example.org/meeting_organization#> .
<http://www.example.org/people#fred>
  p:GivenName "Fred" ;
  p:hasEmail <mailto:[email protected]> ;
  m:attending <http://meetings.example.com/cal#m1> .
<http://meetings.example.com/cal#m1>
  m:homePage <http://meetings.example.com/m1/hp> .
Turtle
 Is a subset of Notation3 that also uses a non-XML syntax;
 Very popular RDF serialization format;
 Was developed by Dave Beckett;
 Turtle files use UTF-8 encoding.
1st example (RDF/XML)
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:ex="http://example.org/stuff/1.0/">
<rdf:Description
rdf:about="http://www.w3.org/TR/rdf-syntax-grammar"
dc:title="RDF/XML Syntax Specification (Revised)">
<ex:editor>
<rdf:Description ex:fullName="Dave Beckett">
<ex:homePage rdf:resource="http://purl.org/net/dajobe/" />
</rdf:Description>
</ex:editor>
</rdf:Description>
</rdf:RDF>
1st example (Turtle)
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix ex: <http://example.org/stuff/1.0/> .
<http://www.w3.org/TR/rdf-syntax-grammar>
dc:title "RDF/XML Syntax Specification (Revised)" ;
ex:editor [
ex:fullName "Dave Beckett";
ex:homePage <http://purl.org/net/dajobe/>
].
2nd example (graph)
2nd example (Turtle)
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix contact: <http://www.w3.org/2000/10/swap/pim/contact#>.
<http://www.w3.org/People/EM/contact#me>
rdf:type contact:Person;
contact:fullName "Eric Miller";
contact:mailbox <mailto:em@w3.org>;
contact:personalTitle "Dr.".
N-Triples
 Is a subset of Turtle and Notation3;
 Developed by Dave Beckett and Art Barstow (W3C);
 N-Triples is a simpler, line-based format: each line holds one complete triple.
Example (RDF/XML)
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<rdf:Description
rdf:about="http://www.w3.org/2001/sw/RDFCore/ntriples/">
<dc:creator>Art Barstow</dc:creator>
<dc:creator>Dave Beckett</dc:creator>
<dc:publisher rdf:resource="http://www.w3.org/"/>
</rdf:Description>
</rdf:RDF>
Example (N-Triples)
<http://www.w3.org/2001/sw/RDFCore/ntriples/> <http://purl.org/dc/elements/1.1/creator> "Dave Beckett" .
<http://www.w3.org/2001/sw/RDFCore/ntriples/> <http://purl.org/dc/elements/1.1/creator> "Art Barstow" .
<http://www.w3.org/2001/sw/RDFCore/ntriples/> <http://purl.org/dc/elements/1.1/publisher> <http://www.w3.org/> .
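Because N-Triples puts one complete triple on each line, writing it out is straightforward. The following is a minimal sketch, not a complete serializer: the `IRI` and `Literal` classes and both functions are hypothetical names introduced here, only plain string literals are handled (no language tags or datatypes), and IRIs are assumed to need no escaping.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple, Union


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


Term = Union[IRI, Literal]
Triple = Tuple[IRI, IRI, Term]


def term_to_nt(term: Term) -> str:
    """Render one term: IRIs in angle brackets, literals in double quotes."""
    if isinstance(term, IRI):
        return f"<{term.value}>"
    # Escape the backslash first, then quotes and line breaks.
    escaped = (
        term.value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_ntriples(triples: Iterable[Triple]) -> str:
    """Write each triple on its own line, terminated by ' .'."""
    return "".join(
        f"{term_to_nt(s)} {term_to_nt(p)} {term_to_nt(o)} .\n" for s, p, o in triples
    )


DOC = IRI("http://www.w3.org/2001/sw/RDFCore/ntriples/")
DC = "http://purl.org/dc/elements/1.1/"

triples = [
    (DOC, IRI(DC + "creator"), Literal("Dave Beckett")),
    (DOC, IRI(DC + "creator"), Literal("Art Barstow")),
    (DOC, IRI(DC + "publisher"), IRI("http://www.w3.org/")),
]

print(to_ntriples(triples), end="")
```

Note that the subject IRI is repeated in full on every line; N-Triples has no prefixes and no `;` or `,` abbreviations, which is what makes it simpler than Turtle.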
Notation3 subsets
What is schema?
The word schema comes from the
Greek word "σχήμα" (skhēma),
which means shape, or more
generally, plan.
A schema (in computer science) is a logical
description of the data in a database,
including definitions and
relationships of data.
RDF Schema (RDFS)
RDF Schema provides a way to express:
simple statements defining classes of resources, including subclass relationships,
statements defining properties, including subproperty relationships,
statements about the domain and range of a property.
RDF Schema: A meta-language
RDF Schema's type system is similar to those
of object-oriented programming languages.
RDF Schema allows resources to be defined as
instances of one or more classes.
Classes can be organized in a hierarchical
fashion; for example, a class ex:Dog can be
defined as a subclass of ex:Mammal, meaning
that any resource which is in class ex:Dog is
also in class ex:Mammal.
The RDF Schema vocabulary (prefix rdfs:) is defined in a namespace whose URI is:
http://www.w3.org/2000/01/rdf-schema#
Sample case: MotorVehicle
To say that ex:MotorVehicle is a class, write:
ex:MotorVehicle rdf:type rdfs:Class .
To create an instance of ex:MotorVehicle, write:
exthings:companyCar rdf:type ex:MotorVehicle .
Naming convention:
class names start with an uppercase letter;
property and instance names start with a lowercase letter.
A resource may be an instance of more than
one class.
Defining Subclasses
Using the rdfs:subClassOf property we can define specialized kinds of motor vehicles
(e.g., passenger vehicles, vans, minivans, etc.).
ex:Van rdf:type rdfs:Class .
ex:Van rdfs:subClassOf ex:MotorVehicle .
ex:Truck rdf:type rdfs:Class .
ex:Truck rdfs:subClassOf ex:MotorVehicle .
Meaning of Subclass
subClassOf means if ex:myVan is an
instance of ex:Van, then ex:myVan is also,
by inference, an instance of ex:MotorVehicle.
subClassOf is (obviously) transitive:
If ex:Van rdfs:subClassOf ex:MotorVehicle .
and ex:MiniVan rdfs:subClassOf ex:Van .
then ex:MiniVan is implicitly a subclass of
ex:MotorVehicle.
A class may be a subclass of more than one
class. All classes are implicitly subclasses of
class rdfs:Resource.
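The transitivity of subClassOf and the resulting type inference can be sketched as a graph walk. This is an illustrative sketch only, not a full RDFS reasoner: the function names are invented here, prefixed names are treated as plain strings, and the implicit superclass rdfs:Resource is left out for brevity.

```python
from typing import Dict, Iterable, Set, Tuple

# Direct rdfs:subClassOf statements from the slides: (subclass, superclass).
SUBCLASS_OF = [
    ("ex:Van", "ex:MotorVehicle"),
    ("ex:Truck", "ex:MotorVehicle"),
    ("ex:MiniVan", "ex:Van"),
]

# Direct rdf:type statements: (instance, class).
TYPES = [("ex:myVan", "ex:Van")]


def superclasses(cls: str, subclass_of: Iterable[Tuple[str, str]]) -> Set[str]:
    """Return every class reachable from cls by following subClassOf links."""
    parents: Dict[str, Set[str]] = {}
    for sub, sup in subclass_of:
        parents.setdefault(sub, set()).add(sup)

    found: Set[str] = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def inferred_types(instance: str,
                   types: Iterable[Tuple[str, str]],
                   subclass_of: Iterable[Tuple[str, str]]) -> Set[str]:
    """Return the asserted classes of an instance plus all their superclasses."""
    subclass_of = list(subclass_of)
    result: Set[str] = set()
    for inst, cls in types:
        if inst == instance:
            result.add(cls)
            result |= superclasses(cls, subclass_of)
    return result


print(sorted(superclasses("ex:MiniVan", SUBCLASS_OF)))
print(sorted(inferred_types("ex:myVan", TYPES, SUBCLASS_OF)))
```

The `found` set also guards against cycles, which RDFS permits (two classes may be declared subclasses of each other).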
A Full Class Hierarchy
The (ex:Truck rdf:type rdfs:Class) part of the graph is not
shown.
Notice that MiniVan is a subclass of two classes.
Vehicle Hierarchy in RDF/XML
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
<rdf:Description rdf:ID="MotorVehicle">
<rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
</rdf:Description>
<rdfs:Class rdf:ID="PassengerVehicle">
<rdfs:subClassOf rdf:resource="#MotorVehicle"/>
</rdfs:Class>
...
<rdfs:Class rdf:ID="MiniVan">
<rdfs:subClassOf rdf:resource="#Van"/>
<rdfs:subClassOf rdf:resource="#PassengerVehicle"/>
</rdfs:Class>
</rdf:RDF>
Class Naming
Fragment identifiers such as MotorVehicle, introduced with rdf:ID, have the effect of "assigning" URIrefs relative to the schema document.
Relative URIrefs based on these names can then be used in other class definitions within the same schema, e.g., #MotorVehicle.
If the schema document is located at http://example.org/schemas/vehicles, the full URIref of this class would be:
http://example.org/schemas/vehicles#MotorVehicle
We could also include an explicit declaration:
xml:base="http://example.org/schemas/vehicles"
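The way an rdf:ID value turns into a full URIref can be imitated with standard relative-URI resolution. This sketch assumes the base URI from the xml:base declaration above; the helper name `resolve_rdf_id` is made up for illustration.

```python
from urllib.parse import urljoin

# Base URI of the schema document, as in the xml:base declaration.
BASE = "http://example.org/schemas/vehicles"


def resolve_rdf_id(base: str, rdf_id: str) -> str:
    """Resolve rdf:ID="X" the same way as the relative URIref "#X"."""
    return urljoin(base, "#" + rdf_id)


print(resolve_rdf_id(BASE, "MotorVehicle"))
print(urljoin(BASE, "#Van"))  # a relative reference used inside the same schema
```

Declaring xml:base explicitly keeps these URIrefs stable even if the schema file is later moved or served from a different location.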
Different instance creation approaches
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:ex="http://example.org/schemas/vehicles#">
<rdf:Description rdf:ID="companyCar">
<rdf:type
rdf:resource="http://example.org/schemas/vehicles#MotorVehicle"/>
</rdf:Description>
<ex:MotorVehicle rdf:ID="anotherCar">
…
</ex:MotorVehicle>
</rdf:RDF>
Properties
All properties in RDF are described as
instances of class rdf:Property, e.g.
exterms:weightInKg rdf:type rdf:Property .
RDF Schema provides rdfs:range to define
valid fillers for a triple’s Object.
RDF Schema provides rdfs:domain to
define valid fillers for a triple’s Subject.
rdfs:range property
If the property ex:author has values that are
instances of class ex:Person, we would write:
ex:Person rdf:type rdfs:Class .
ex:author rdf:type rdf:Property .
ex:author rdfs:range ex:Person .
If a property has more than one range, then its filler
must be an instance of all of the classes specified as
the ranges:
ex:hasMother rdf:type rdf:Property .
ex:hasMother rdfs:range ex:Person .
ex:hasMother rdfs:range ex:Female .
ex:Sally ex:hasMother exstaff:frances .
exstaff:frances must be both a Female and a
Person.
Typed literals as ranges
To say that the range of ex:age is an integer:
ex:age rdf:type rdf:Property .
ex:age rdfs:range xsd:integer .
The datatype xsd:integer is identified by its
URIref (http://www.w3.org/2001/XMLSchema#integer).
It is optional, but “useful” to declare:
xsd:integer rdf:type rdfs:Datatype .
This statement documents the existence of the
datatype, and indicates explicitly that it is being
used in this schema.
rdfs:domain property
rdfs:domain indicates that a particular property
applies to a class.
Suppose books have authors. In RDF:
ex:Book rdf:type rdfs:Class .
ex:author rdf:type rdf:Property .
ex:author rdfs:domain ex:Book .
If a property has more than one domain, then
any subject instance of that property must be an
instance of each named domain.
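In practice an RDFS processor reads rdfs:domain and rdfs:range as rules for inferring rdf:type statements rather than as validation checks. The sketch below illustrates that reading with the ex:author and ex:hasMother declarations from the slides; the function name, the dictionary layout, and the instances ex:someBook and ex:someAuthor are assumptions made up for this example.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]

# Schema statements from the slides, keyed by property.
DOMAIN: Dict[str, Set[str]] = {"ex:author": {"ex:Book"}}
RANGE: Dict[str, Set[str]] = {
    "ex:author": {"ex:Person"},
    "ex:hasMother": {"ex:Person", "ex:Female"},
}

# Instance data; ex:someBook and ex:someAuthor are hypothetical.
DATA = [
    ("ex:Sally", "ex:hasMother", "exstaff:frances"),
    ("ex:someBook", "ex:author", "ex:someAuthor"),
]


def infer_types(triples: Iterable[Triple],
                domain: Dict[str, Set[str]],
                range_: Dict[str, Set[str]]) -> Set[Triple]:
    """Derive rdf:type triples: subjects get domain classes, objects get range classes."""
    inferred: Set[Triple] = set()
    for s, p, o in triples:
        for cls in domain.get(p, ()):
            inferred.add((s, "rdf:type", cls))
        for cls in range_.get(p, ()):
            inferred.add((o, "rdf:type", cls))
    return inferred


for triple in sorted(infer_types(DATA, DOMAIN, RANGE)):
    print(triple)
```

With two ranges declared for ex:hasMother, exstaff:frances receives both ex:Person and ex:Female, which matches the "instance of all of the classes" rule stated on the slides.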
Part of the Vehicle schema in RDF/XML
<rdf:Description rdf:ID="registeredTo">
<rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"/>
<rdfs:domain rdf:resource="#MotorVehicle"/>
<rdfs:range rdf:resource="#Person"/>
</rdf:Description>
<rdf:Property rdf:ID="rearSeatLegRoom">
<rdfs:domain rdf:resource="#PassengerVehicle"/>
<rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#integer"/>
</rdf:Property>
<rdfs:Class rdf:ID="Person" />
Specializing Properties
Like rdfs:subClassOf, property rdfs:subPropertyOf
is used to define a property hierarchy.
For example, to say that the property
ex:primaryDriver is a kind of ex:driver, write:
ex:driver rdf:type rdf:Property .
ex:primaryDriver rdf:type rdf:Property .
ex:primaryDriver rdfs:subPropertyOf ex:driver .
This means that if an instance ex:fred is an
ex:primaryDriver of the instance ex:companyVan,
then ex:fred is also an ex:driver of
ex:companyVan.
subPropertyOf property
A property may be a subPropertyOf zero,
one or more properties.
All rdfs:range and rdfs:domain
properties that apply to an RDF property
also apply to each of its subproperties.
Therefore, because of its subproperty
relationship to ex:driver, ex:primaryDriver
implicitly also has an rdfs:domain of
ex:MotorVehicle.
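Both consequences of rdfs:subPropertyOf described above (the extra ex:driver statement and the inherited domain) can be sketched as follows. The helper names are invented for this illustration, and the triple direction (vehicle, property, person) is an assumption that follows from ex:driver having the domain ex:MotorVehicle.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]

# Schema statements from the slides.
SUBPROPERTY_OF: Dict[str, Set[str]] = {"ex:primaryDriver": {"ex:driver"}}
DOMAIN: Dict[str, Set[str]] = {"ex:driver": {"ex:MotorVehicle"}}


def superproperties(prop: str) -> Set[str]:
    """Return all properties reachable from prop via subPropertyOf links."""
    found: Set[str] = set()
    stack = [prop]
    while stack:
        for parent in SUBPROPERTY_OF.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def infer_superproperty_triples(triples: Iterable[Triple]) -> Set[Triple]:
    """For each (s, p, o), also assert (s, q, o) for every superproperty q of p."""
    return {(s, q, o) for s, p, o in triples for q in superproperties(p)}


def effective_domain(prop: str) -> Set[str]:
    """Domain classes declared on prop itself or on any of its superproperties."""
    classes = set(DOMAIN.get(prop, ()))
    for parent in superproperties(prop):
        classes |= DOMAIN.get(parent, set())
    return classes


data = [("ex:companyVan", "ex:primaryDriver", "ex:fred")]
print(infer_superproperty_triples(data))
print(effective_domain("ex:primaryDriver"))
```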
subPropertyOf example (RDF/XML)
<rdf:Description rdf:ID="driver">
<rdf:type
rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"/>
<rdfs:domain rdf:resource="#MotorVehicle"/>
</rdf:Description>
<rdf:Property rdf:ID="primaryDriver">
<rdfs:subPropertyOf rdf:resource="#driver"/>
</rdf:Property>
An Instance of ex:PassengerVehicle
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:ex="http://example.org/schemas/vehicles#">
<rdf:Description rdf:ID="johnSmithsCar">
<rdf:type rdf:resource=
"http://example.org/schemas/vehicles#PassengerVehicle"/>
<ex:registeredTo
rdf:resource="http://www.example.org/staffid/85740"/>
<ex:rearSeatLegRoom rdf:datatype=
"http://www.w3.org/2001/XMLSchema#integer">127</ex:rearSeatLegRoom>
<ex:primaryDriver
rdf:resource="http://www.example.org/staffid/85740"/>
</rdf:Description>
</rdf:RDF>
Example
<!-- Top level class 'Staff' -->
<rdfs:Class rdf:ID="Staff" rdfs:comment="A Staff member">
<rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
</rdfs:Class>
<!-- Subclasses of Staff -->
<rdfs:Class rdf:ID="Researcher" rdfs:comment="A Researcher">
<rdfs:subClassOf rdf:resource="#Staff"/>
</rdfs:Class>
<rdfs:Class rdf:ID="Intern">
<rdfs:subClassOf rdf:resource="#Researcher"/>
</rdfs:Class>
Example
<rdf:Property rdf:ID="LName" rdfs:comment="Last Name">
<rdfs:domain rdf:resource="#Staff"/>
<rdfs:range rdf:resource="&rdfs;Literal"/>
</rdf:Property>
<rdf:Property rdf:ID="Author" rdfs:comment="Authors of the paper">
<rdfs:domain rdf:resource="#Paper"/>
</rdf:Property>
RDF Container Elements
RDF containers are used to describe groups
of things (resources or literals):
 rdf:Bag – used to describe a list of
values that do not have to be in a specific
order.
 rdf:Seq – used to describe an ordered
list of values (for example, in alphabetical
order).
 rdf:Alt – used to describe a list of
alternative values (the user can select
only one of the values).
RDF Container Elements (rdf:Bag)
<rdf:Description
rdf:about="http://www.recshop.fake/cd/Beatles">
<cd:artist>
<rdf:Bag>
<rdf:li>John</rdf:li>
<rdf:li>Paul</rdf:li>
<rdf:li>George</rdf:li>
<rdf:li>Ringo</rdf:li>
</rdf:Bag>
</cd:artist>
</rdf:Description>
RDF Container Elements (rdf:Seq)
<rdf:Description
rdf:about="http://www.recshop.fake/cd/Beatles">
<cd:artist>
<rdf:Seq>
<rdf:li>George</rdf:li>
<rdf:li>John</rdf:li>
<rdf:li>Paul</rdf:li>
<rdf:li>Ringo</rdf:li>
</rdf:Seq>
</cd:artist>
</rdf:Description>
RDF Container Elements (rdf:Alt)
<rdf:Description
rdf:about="http://www.recshop.fake/cd/Beatles">
<cd:format>
<rdf:Alt>
<rdf:li>CD</rdf:li>
<rdf:li>Record</rdf:li>
<rdf:li>Tape</rdf:li>
</rdf:Alt>
</cd:format>
</rdf:Description>
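In the RDF graph, each rdf:li element is shorthand for a numbered membership property (rdf:_1, rdf:_2, and so on, in document order). The sketch below shows that expansion for the rdf:Bag example using Python's standard XML parser; the function name is hypothetical, and only the container element itself is parsed, with the rdf namespace declared inline so the fragment is well-formed on its own.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The rdf:Bag from the slide, made standalone by declaring the rdf namespace.
BAG_XML = f"""
<rdf:Bag xmlns:rdf="{RDF_NS}">
  <rdf:li>John</rdf:li>
  <rdf:li>Paul</rdf:li>
  <rdf:li>George</rdf:li>
  <rdf:li>Ringo</rdf:li>
</rdf:Bag>
""".strip()


def container_members(xml_text: str) -> List[Tuple[str, str]]:
    """Pair each rdf:li value with its membership property rdf:_1, rdf:_2, ..."""
    root = ET.fromstring(xml_text)
    items = root.findall(f"{{{RDF_NS}}}li")
    return [(f"rdf:_{i}", li.text) for i, li in enumerate(items, start=1)]


for prop, value in container_members(BAG_XML):
    print(prop, value)
```

The numbering is produced for rdf:Bag, rdf:Seq and rdf:Alt alike; whether the order carries meaning depends only on the container type.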
IsaViz: A Visual Authoring Tool for RDF
Jena - open source Semantic Web framework
RDF storage
RDF triple store is a system that provides a
mechanism for persistent storage and
access of RDF graphs.
Different Architectures
Based on their implementation, triple stores can be divided into 3 broad categories: in-memory, native, and non-native non-memory.
In-memory: the RDF graph is stored as triples in main memory. For example, storing an RDF graph using the Jena API or the Sesame API.
Native: persistent storage systems with their own database implementation. E.g., Sesame Native, Virtuoso, AllegroGraph, Oracle 11g.
Non-native, non-memory: persistent storage systems set up to run on third-party databases. E.g., Jena SDB.
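To make the in-memory category concrete, here is a toy triple store with wildcard pattern matching. It is a teaching sketch under simple assumptions (terms are plain strings, no indexes, no persistence) and does not reflect how Jena or Sesame are implemented; the class and method names are invented here.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class InMemoryTripleStore:
    """Keeps an RDF graph as a set of (subject, predicate, object) tuples."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, s: str, p: str, o: str) -> None:
        # A graph is a set, so adding the same triple twice has no effect.
        self._triples.add((s, p, o))

    def match(self,
              s: Optional[str] = None,
              p: Optional[str] = None,
              o: Optional[str] = None) -> List[Triple]:
        """Return triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )

    def __len__(self) -> int:
        return len(self._triples)


store = InMemoryTripleStore()
store.add("ex:Van", "rdfs:subClassOf", "ex:MotorVehicle")
store.add("ex:Truck", "rdfs:subClassOf", "ex:MotorVehicle")
store.add("ex:myVan", "rdf:type", "ex:Van")
store.add("ex:myVan", "rdf:type", "ex:Van")  # duplicate, ignored

print(len(store))
print(store.match(p="rdfs:subClassOf"))
```

Every query here scans the whole set; real stores typically maintain several indexes over subject, predicate and object so that such patterns can be answered without a full scan.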
Graph databases
Graph database example
Oracle’s RDF/OWL Architectural Overview
[Architecture diagram; labels recoverable from the slide:]
Security: Oracle Label Security
Core functionality: LOAD (bulk load, incremental DML); INFER (RDF/S, OWL subsets, user-defined rules); QUERY (SPARQL in SQL; ontology-assisted query of enterprise data)
Stored in Oracle DB: RDF/OWL data and ontologies; rulebases (OWL, RDF/S, user-defined); inferred RDF/OWL data; semantic indexes; enterprise (relational) data
Programming interface: SQL interface (SQLplus, PL/SQL, SQLdev., JDBC); Java API support (Java programs); SPARQL endpoints (Joseki / Sesame); SPARQL: Jena / Sesame
Tools: Cytoscape-based visualizer; OBIEE via SPARQL Gateway
3rd-party callouts: NLP extractors; reasoners (Pellet)
Thank you for your attention!