
An Introduction to the Semantic Web and Ontology Technology
Zhisheng Huang
Vrije Universiteit Amsterdam
The Netherlands
[email protected]
China 2005
http://sekt.semanticweb.org/
Starting from Google
Existing Problems
Can we do better?
• Semantics-based search
• Concept combination specification
• Domain-specific search
• Approximate search
• Search agents
The Semantic Web
• Core idea: give information on the Web a well-defined meaning, that is, semantics.
"The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation."
[Berners-Lee et al., 2001]
What is Semantics?
• Frege (1848-1925): Reference and Sense
• Syntax, Semantics, Pragmatics
• Denotational Semantics vs. Operational Semantics
Main features
• Denotation
• Uniqueness
• Relatedness
What does the Semantic Web want to do?
• Make content automatically processable by machines
• Make content machine-understandable
Content is machine-understandable if it is bound to some formal description of itself (i.e. metadata).
HTML Markup
……
<h2>Zhisheng Huang</h2>
<b>Affiliation</b>:
Department of Computer Science<br>
Faculty of Sciences<br>
Vrije University Amsterdam<p>
<b>Email</b>: huang @ cs.vu.nl<br>
<b>Phone</b>: 31-20-4447740(office)
……
XML Annotations
<researcher><name>Zhisheng Huang</name>
<affiliation>
<department>Department of Computer
Science</department>
<faculty>Faculty of Sciences</faculty>
<university>Vrije University Amsterdam</university>
</affiliation>
<email>huang @ cs.vu.nl</email>
<phone id="office"> (31)-20-4447740</phone>
……</researcher>
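Because the XML annotation names each field, a program can pick fields out by name instead of guessing from layout. A minimal sketch with Python's standard xml.etree.ElementTree, assuming the elided parts of the slide are simply absent; the function name read_researcher is hypothetical:

```python
import xml.etree.ElementTree as ET

# The researcher record from the slide, with the elided parts left out.
doc = """
<researcher>
  <name>Zhisheng Huang</name>
  <affiliation>
    <department>Department of Computer Science</department>
    <faculty>Faculty of Sciences</faculty>
    <university>Vrije University Amsterdam</university>
  </affiliation>
  <email>huang @ cs.vu.nl</email>
  <phone id="office"> (31)-20-4447740</phone>
</researcher>
"""

def read_researcher(xml_text):
    """Extract a few named fields from a <researcher> element."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.findtext("name"),
        "university": root.findtext("affiliation/university"),
        # Map each phone's id attribute to its (whitespace-trimmed) number.
        "phones": {p.get("id"): p.text.strip() for p in root.findall("phone")},
    }

record = read_researcher(doc)
```

The same lookup against the HTML version would have to rely on bold labels and line breaks, which carry no machine-readable meaning.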
Data Structures
• Structured Data:
  • Database
• Semi-structured Data:
  • HTML, XML, BibTeX
• Unstructured Data:
  • Text
XML representation of a
relational database
Table: AI group
member id | name | phone
001 | John | 1234567
002 | Mary | 7654321
… | … | …
<group name="AI">
<member id="001">
<name>John</name>
<phone>1234567</phone>
</member>
<member id="002">
<name>Mary</name>
<phone>7654321</phone>
</member>
…..
</group>
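The mapping from table rows to XML elements can be sketched in a few lines of Python. This is only an illustration under the assumption that the key column becomes an attribute and every other column becomes a child element, as on the slide; the function name table_to_xml is hypothetical:

```python
import xml.etree.ElementTree as ET

# The two rows of the "AI group" table shown on the slide.
rows = [
    {"id": "001", "name": "John", "phone": "1234567"},
    {"id": "002", "name": "Mary", "phone": "7654321"},
]

def table_to_xml(group_name, rows):
    """Build a <group> element with one <member> child per table row."""
    group = ET.Element("group", {"name": group_name})
    for row in rows:
        # The key column becomes an attribute, the others become child elements.
        member = ET.SubElement(group, "member", {"id": row["id"]})
        ET.SubElement(member, "name").text = row["name"]
        ET.SubElement(member, "phone").text = row["phone"]
    return group

group = table_to_xml("AI", rows)
xml_text = ET.tostring(group, encoding="unicode")
```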
Document Type Definition (DTD)
<!DOCTYPE researcher [
<!ELEMENT researcher (name, affiliation, email,
phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ATTLIST phone id CDATA #REQUIRED >
<!ELEMENT affiliation (department, faculty,
university)>
… ]>
Data Model
[Diagram: Researcher (attributes Name, Phone, eMail) is linked by a "has" relation to Affiliation (attributes Department, Faculty, University); the relation carries the cardinality labels n and 1]
XML Schema
• The purpose of an XML Schema is to
define the legal building blocks of an
XML document, just like a DTD.
Why XML Schemas
• XML Schemas are extensible to
future additions
• XML Schemas are richer and more
useful than DTDs
• XML Schemas are written in XML
• XML Schemas support data types
• XML Schemas support namespaces
Name Conflicts
• Since element names in XML are not fixed, a name conflict often occurs when two different documents use the same name to describe two different types of elements.
• If two such XML documents were combined, there would be an element name conflict, because both documents contain an element with the same name but different content and definition.
XML Namespaces
• Using namespaces to solve name conflicts
Examples:
• xmlns:prefix="namespaceURI"
• xmlns:xsd="http://www.w3.org/2001/XMLSchema"
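A small sketch of how namespaces keep two same-named elements apart, using Python's standard xml.etree.ElementTree. The document and both http://example.org/... namespace URIs are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Two <title> elements with different meanings, kept apart by their namespaces.
doc = """
<catalog xmlns:bk="http://example.org/book" xmlns:ppl="http://example.org/person">
  <bk:title>Semantic Web Primer</bk:title>
  <ppl:title>Professor</ppl:title>
</catalog>
"""

root = ET.fromstring(doc)
# Prefix-to-URI map used only for the lookups below.
ns = {"bk": "http://example.org/book", "ppl": "http://example.org/person"}

book_title = root.findtext("bk:title", namespaces=ns)
person_title = root.findtext("ppl:title", namespaces=ns)

# Internally each tag is expanded to {namespaceURI}localname, so the two differ.
expanded_tags = [child.tag for child in root]
```

The prefix itself is arbitrary; only the namespace URI it is bound to identifies the vocabulary.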
XML Schema
<xsd:element name="researcher">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="affiliation" type="affil"
                   minOccurs="1" maxOccurs="unbounded"/>
      <xsd:element name="phone" type="xsd:string"/>
      <xsd:element name="email" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
<xsd:complexType name="affil">
  <xsd:sequence>
    <xsd:element name="department" type="xsd:string"/>
    <xsd:element name="faculty" type="xsd:string"/>
    <xsd:element name="university" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>
Resource Description Framework (RDF)
• Metadata is machine-understandable information about web resources, or about anything else that has a URI. It is represented as a set of independent assertions:
Triple: T(subject, attribute, value)
[Diagram: the resource http://wasp.cs.vu.nl/sekt/dig/dig.pdf has two Creator arcs, one pointing to Zhisheng and one to Cees]
<rdf:Description rdf:about="http://wasp.cs.vu.nl/sekt/dig/dig.pdf">
<dc:creator rdf:resource="http://www.cs.vu.nl/~huang"/>
<dc:creator rdf:resource="mailto:[email protected]"/>
</rdf:Description>
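The triple view can be made concrete with plain Python tuples. This is a sketch, not an RDF library: the helper match is hypothetical, and the URI used for the second creator is an invented placeholder, since the slide's own value is not reproduced here:

```python
DOC = "http://wasp.cs.vu.nl/sekt/dig/dig.pdf"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

# Each assertion is an independent (subject, attribute, value) triple.
triples = {
    (DOC, DC_CREATOR, "http://www.cs.vu.nl/~huang"),
    (DOC, DC_CREATOR, "http://example.org/people/cees"),  # hypothetical URI
}

def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# "Who created dig.pdf?" is a pattern with the value position left open.
creators = [o for _, _, o in match(triples, s=DOC, p=DC_CREATOR)]
```

Because the assertions are independent, adding a third creator is just one more triple; nothing else in the set has to change.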
RDF: Dublin Core
• The Dublin Core provides properties for
describing network objects, suitable for use by
network search engines.
• The Dublin Core is a set of predefined
properties for describing documents.
• The first Dublin Core properties were defined
at the Metadata Workshop in Dublin, Ohio in
1995 and are currently maintained by the Dublin
Core Metadata Initiative.
Dublin Core Metadata Initiative
• The Dublin Core Metadata Initiative is
an open forum engaged in the
development of interoperable online
metadata standards that support a broad
range of purposes and business models.
• http://dublincore.org/
Annotating Metadata
<rdf:Description rdf:about=…dc-rdf/">
<dc:title>
Guidance on expressing the Dublin Core within the
Resource
Description Framework (RDF)
</dc:title>
<dc:creator> Eric Miller </dc:creator>
<dc:creator> Paul Miller </dc:creator>
<dc:creator> Dan Brickley </dc:creator>
<dc:subject> Dublin Core; RDF; XML </dc:subject>
<dc:publisher> Dublin Core Metadata Initiative
</dc:publisher>
<dc:contributor> Dublin Core Data Model Working
Group </dc:contributor>
<dc:date> 1999-07-01 </dc:date>
<dc:format> text/html </dc:format>
<dc:language> en </dc:language>
</rdf:Description>
RDF Schema (RDFS)
• RDFS defines a vocabulary for RDF
• Organizes this vocabulary in a typed hierarchy
  • Class, subClassOf, type
  • Property, subPropertyOf
  • domain, range
RDFS
[Diagram: Student and Professor are each subClassOf Person; the property hasSuperVisor has domain Student and range Professor; Wang is of type Student and Prof. Ma is of type Professor]
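What the RDFS vocabulary buys can be sketched as a tiny inference routine: domain and range give types to the two ends of a property, and subClassOf propagates types upward. The statement "Wang hasSuperVisor Prof. Ma" is an assumption added for illustration, and the function names are hypothetical:

```python
# Schema from the RDFS example: class hierarchy plus domain and range.
SUBCLASS = {("Student", "Person"), ("Professor", "Person")}
DOMAIN = {"hasSuperVisor": "Student"}
RANGE = {"hasSuperVisor": "Professor"}

def superclasses(cls):
    """All classes reachable from cls by following subClassOf upward."""
    result, frontier = set(), [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in SUBCLASS:
            if sub == current and sup not in result:
                result.add(sup)
                frontier.append(sup)
    return result

def infer_types(statements):
    """Derive the types of resources from (subject, property, object) statements."""
    types = {}
    for s, p, o in statements:
        if p == "type":
            types.setdefault(s, set()).add(o)
            continue
        if p in DOMAIN:   # the subject of p belongs to p's domain
            types.setdefault(s, set()).add(DOMAIN[p])
        if p in RANGE:    # the object of p belongs to p's range
            types.setdefault(o, set()).add(RANGE[p])
    for classes in types.values():
        for cls in list(classes):
            classes |= superclasses(cls)   # subClassOf propagates membership
    return types

# Assumed statement, not on the slide: Wang is supervised by Prof. Ma.
types = infer_types([("Wang", "hasSuperVisor", "Prof. Ma")])
```

No type statement was given, yet both resources end up typed, which is the kind of conclusion the schema layer is meant to license.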
Concepts and Ontologies
• Philosophical discipline, branch of
philosophy that deals with the nature and
the organisation of reality.
• Science of Being (Aristotle, Metaphysics,
IV,1)
• What is being?
• What are the features common to all
beings?
Vocabulary and Ontology
• Controlled vocabulary (Jernst 2003) :
• a list of controlled terms
• unambiguous
• non-redundant definition
• Ontology: a controlled vocabulary
expressed in an ontology
representation language (Jernst 2003)
In computer science …
• An ontology is an explicit specification of a conceptualization. [Gruber93]
• An ontology is a shared understanding of some domain of interest.
[Uschold, Gruninger96]
• There are many definitions:
  • a formal specification (EXECUTABLE)
  • of a conceptualization of a domain (COMMUNITY)
  • of some part of the world that is of interest (APPLICATION)
• Defines
• A common vocabulary of terms
• Some specification of the meaning of the terms
• A shared understanding for people and machines
Why develop an ontology?
• To make domain assumptions explicit
• Easier to change domain assumptions
• Easier to understand and update legacy data
• To separate domain knowledge from operational
knowledge
• Re-use domain and operational knowledge
separately
• A community reference for applications
• To share a consistent understanding of what
information means.
Key features of an Ontology
• Concept hierarchy
  • Concept subsumption
• InstanceOf relation (instances)
• PartOf relation (property)
Why not other alternatives?
• First-order predicate logic
• Set theory
• Programming languages
RDF(S) Reconsideration
• Next step up from plain XML:
• (small) ontological commitment to modeling
primitives
• possible to define vocabulary
• However:
• no precisely described meaning
• unclear semantics, no clean separation
between:
• Instances
• Concepts
• Meta-ontologies (e.g. RDFS language itself)
• no inference model
Web Ontology Language (OWL)
• OWL is built on top of RDF
• OWL is for processing information on the web
• OWL was designed to be interpreted by computers
• OWL was not designed for being read by people
• OWL is written in XML
• OWL is a web standard
Design Goals for OWL
Layered language
• OWL Lite:
  • Classification hierarchy
  • Simple constraints
• OWL DL:
  • Maximal expressiveness
  • While maintaining tractability
  • Standard formalisation
• OWL Full:
  • Very high expressiveness
  • Losing tractability
  • Non-standard formalisation
  • All syntactic freedom of RDF (self-modifying)
[Diagram: the three layers Full, DL and Lite, labelled "Syntactic layering" and "Semantic layering"]
OWL Example: animals
<?xml version="1.0"?><rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xml:base="http://wasp.cs.vu.nl/sekt/ontology/animal">
<owl:Ontology rdf:about="animal"/><owl:Class rdf:ID="Eagle">
<rdfs:subClassOf><owl:Class rdf:about="#Bird"/>
</rdfs:subClassOf></owl:Class><owl:Class rdf:ID="Animal"/>
<owl:Class rdf:ID="Fly"><owl:disjointWith>
<owl:Class rdf:about="#Penguin"/></owl:disjointWith>
<rdfs:subClassOf rdf:resource="#Animal"/>
</owl:Class><owl:Class rdf:ID="Bird">
<rdfs:subClassOf rdf:resource="#Fly"/>
</owl:Class>
<owl:Class rdf:ID="Penguin">
<rdfs:subClassOf rdf:resource="#Bird"/>
<owl:disjointWith rdf:resource="#Fly"/>
</owl:Class>
</rdf:RDF>
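The animal ontology contains a clash: Penguin is a subclass of Bird, Bird is a subclass of Fly, yet Penguin is declared disjoint with Fly. A minimal sketch of how such a clash can be found, with the axioms transcribed by hand from the example. This is a simple reachability check and not a full OWL reasoner, and the function names are hypothetical:

```python
# Axioms transcribed from the animal ontology: direct superclasses per class.
SUBCLASS_OF = {
    "Eagle": {"Bird"},
    "Fly": {"Animal"},
    "Bird": {"Fly"},
    "Penguin": {"Bird"},
    "Animal": set(),
}
# Fly and Penguin are declared disjoint (stated in both directions in the OWL).
DISJOINT = {frozenset({"Fly", "Penguin"})}

def ancestors_or_self(cls):
    """The class itself plus everything reachable via subClassOf."""
    seen, stack = {cls}, [cls]
    while stack:
        for parent in SUBCLASS_OF.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def unsatisfiable_classes():
    """Classes whose members would have to belong to two disjoint classes."""
    bad = []
    for cls in sorted(SUBCLASS_OF):
        up = ancestors_or_self(cls)
        if any(pair <= up for pair in DISJOINT):
            bad.append(cls)
    return bad

clashes = unsatisfiable_classes()
```

Under these axioms Penguin can have no instances; asserting any individual as a Penguin would make the whole ontology inconsistent.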
Semantic Web Layers
Logical Foundation of the Semantic Web
• Description Logic vs. Frame-Logic
• Closed world assumption vs. Open world assumption
• Unique name assumption vs. Non-unique name assumption
• Object-oriented vs. non-object-oriented
• …..
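The difference between the closed and open world assumptions shows up as soon as a query asks about something the knowledge base does not mention. A minimal sketch; the fact and both function names are invented for illustration:

```python
# A single known fact; the predicate and names are made up for illustration.
facts = {("hasChild", "John", "Mary")}

def ask_closed_world(fact):
    """Closed world: whatever is not stated is taken to be false."""
    return fact in facts

def ask_open_world(fact):
    """Open world: whatever is not stated is unknown (None), not false."""
    return True if fact in facts else None

query = ("hasChild", "John", "Peter")
closed_answer = ask_closed_world(query)   # treated as false
open_answer = ask_open_world(query)       # left undetermined
```

Databases and rule systems with negation as failure typically answer like the first function, while description-logic reasoners answer like the second.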
Description Logic
Knowledge Base
Tbox (schema)
Man ≡ Human ⊓ Male
Happy-Father ≡ Man ⊓ ∃has-child.Female ⊓ …
Abox (data)
John : Happy-Father
⟨John, Mary⟩ : has-child
[Diagram: Knowledge Base (Tbox and Abox), Inference System, Interface]
Basic Description Logic: AL
• Concept Expressions:
  • A (atomic concept)
  • ⊤ (universal concept)
  • ⊥ (bottom concept)
  • ¬A (atomic negation)
  • C ⊓ D (intersection)
  • ∀R.C (value restriction)
  • ∃R.⊤ (limited existential quantification)
where A is a concept name, C and D are concept expressions, and R is a role expression
Family of AL languages
• C ⊔ D (Union)
• ∃R.C (Full Existential Quantification)
• ¬C (Complement)
• Number restriction
  • (≥ n R) (at least restriction)
  • (≤ n R) (at most restriction)
• Qualified number restriction
  • (≥ n R.C) (at least restriction)
  • (≤ n R.C) (at most restriction)
• Transitive Role: R+
• Inverse of Role: I
• Role Hierarchies (R ⊑ S): H
Examples
• woman ≡ person ⊓ female
• man ≡ person ⊓ ¬woman
• mother ≡ woman ⊓ ∃hasChild.person
• father ≡ man ⊓ ∃hasChild.person
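These definitions can be read as set operations over a domain of individuals: ⊓ is intersection, ¬ is complement, and ∃R.C collects everything that has an R-successor in C. A small sketch over an invented four-person domain; all individuals and the helper exists are made up for illustration:

```python
# A made-up interpretation: a domain of individuals plus two atomic concepts.
domain = {"anna", "bob", "carl", "dora"}
person = {"anna", "bob", "carl", "dora"}
female = {"anna", "dora"}
# The role hasChild as a set of (parent, child) pairs.
has_child = {("anna", "bob"), ("carl", "dora")}

def exists(role, concept):
    """Extension of ∃role.concept: individuals with a role-successor in concept."""
    return {x for (x, y) in role if y in concept}

woman = person & female                        # person ⊓ female
man = person & (domain - woman)                # person ⊓ ¬woman
mother = woman & exists(has_child, person)     # woman ⊓ ∃hasChild.person
father = man & exists(has_child, person)       # man ⊓ ∃hasChild.person
```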
Description Logics
• Decidable Subset of First-Order Logic
  • Equivalent to 3 Variable Fragment (Borgida 1996)
  • Model theoretic semantics by mapping to abstract domain
• Provides Primitives for defining Conceptual Knowledge
  • Concept Expressions (Formulas with 1 free variable) for describing Sets of Objects
    • Boolean Operators: C ⊓ D, C ⊔ D, ¬C
    • Quantifiers: (∀R.C), (∃P.C)
    • Cardinality Constraints: (= n R), (> n R), (< n R), (≥ n R), (≤ n R)
  • Axioms define relations between concepts
    • Subsumption: C ⊑ D
    • Equivalence: C ≡ D
    • Disjointness: C ⊓ D ⊑ ⊥
DL Semantics
• Interpretation function extends to
concept expressions in an obvious(ish)
way, i.e.:
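For reference, the usual set-theoretic extension can be written as follows. This is the standard textbook formulation, assuming an interpretation with domain Δ^I in which concept names map to subsets of the domain and role names map to binary relations over it; it is not copied from the slide:

```latex
\begin{align*}
\top^{\mathcal{I}} &= \Delta^{\mathcal{I}} \\
\bot^{\mathcal{I}} &= \emptyset \\
(\neg C)^{\mathcal{I}} &= \Delta^{\mathcal{I}} \setminus C^{\mathcal{I}} \\
(C \sqcap D)^{\mathcal{I}} &= C^{\mathcal{I}} \cap D^{\mathcal{I}} \\
(C \sqcup D)^{\mathcal{I}} &= C^{\mathcal{I}} \cup D^{\mathcal{I}} \\
(\forall R.C)^{\mathcal{I}} &= \{\, x \mid \forall y.\ (x,y) \in R^{\mathcal{I}} \rightarrow y \in C^{\mathcal{I}} \,\} \\
(\exists R.C)^{\mathcal{I}} &= \{\, x \mid \exists y.\ (x,y) \in R^{\mathcal{I}} \wedge y \in C^{\mathcal{I}} \,\}
\end{align*}
```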
DL for OWL: SHIQ
• SHIQ = ALCQHIR+
Frame-logic (F-logic)
• Object oriented
• Frame based
• Rule-based
• …
• Negation as failure
Example
/* facts */
abraham:man.
sarah:woman.
isaac:man[father->abraham; mother->sarah].
ishmael:man[father->abraham; mother->hagar:woman].
jacob:man[father->isaac; mother->rebekah:woman].
esau:man[father->isaac; mother->rebekah].
/* rules consisting of a rule head and a rule body */
FORALL X,Y X[son->>Y] <- Y:man[father->X].
FORALL X,Y X[son->>Y] <- Y:man[mother->X].
FORALL X,Y X[daughter->>Y] <- Y:woman[father->X].
FORALL X,Y X[daughter->>Y] <- Y:woman[mother->X].
/* query */
FORALL X,Y <- X:woman[son->>Y[father->abraham]].
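To see what the query returns, the facts and the two son rules can be emulated in Python. This is only an illustrative re-encoding with dictionaries and sets, not an F-logic engine; the function name sons is hypothetical:

```python
# Class memberships from the facts (including the inline hagar:woman, rebekah:woman).
men = {"abraham", "isaac", "ishmael", "jacob", "esau"}
women = {"sarah", "hagar", "rebekah"}
# The father and mother attributes of each frame.
father = {"isaac": "abraham", "ishmael": "abraham", "jacob": "isaac", "esau": "isaac"}
mother = {"isaac": "sarah", "ishmael": "hagar", "jacob": "rebekah", "esau": "rebekah"}

def sons(parent):
    """Both son rules: Y is a son of X if Y is a man with father X or mother X."""
    return {c for c in men if father.get(c) == parent or mother.get(c) == parent}

# The query: women X with a son Y whose father is abraham.
answers = sorted(
    (x, y) for x in women for y in sons(x) if father.get(y) == "abraham"
)
```

The answer pairs Sarah with Isaac and Hagar with Ishmael; Rebekah is excluded because her sons have Isaac, not Abraham, as father.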
Semantic Web Application:
the FOAF project
• The Friend of a Friend (FOAF) project is about creating a Web of machine-readable homepages describing people, the links between them and the things they create and do.
• http://www.foaf-project.org/
Foaf.rdf
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
<foaf:Person>
<foaf:surname>Huang</foaf:surname>
<foaf:name>Zhisheng Huang</foaf:name>
<foaf:firstName>Zhisheng</foaf:firstName>
<foaf:gender>male</foaf:gender>
<foaf:img rdf:resource="http://wasp.cs.vu.nl/~huang/images/huang02.jpg" />
<foaf:homepage rdf:resource="http://wasp.cs.vu.nl/~huang/" />
<foaf:mbox_sha1sum>238a59a17bd96fbb93f39aa9dba2f6847a8d261c</foaf:mbox_sha1sum>
<foaf:workplaceHomepage rdf:resource="http://www.vu.nl/" />
<foaf:mbox>mailto:[email protected]</foaf:mbox>
<foaf:knows>
<foaf:Person>
<foaf:name>Annette ten Teije</foaf:name>
<foaf:mbox_sha1sum>c10984c365331f1d38f649adccbb7aac5873aed2</foaf:mbox_sha1sum>
</foaf:Person>
</foaf:knows>
<foaf:knows>
<foaf:Person>
……
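The foaf:mbox_sha1sum values identify a person by mailbox without publishing the address: the value is the SHA-1 hash of the mailbox URI, including the mailto: prefix. A minimal sketch with Python's hashlib; the function name is hypothetical and the address is an invented placeholder, not one from the file above:

```python
import hashlib

def mbox_sha1sum(email):
    """Hex SHA-1 digest of the mailbox URI, i.e. of 'mailto:' + address."""
    uri = "mailto:" + email
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()

# Placeholder address, used only to show the shape of the result.
digest = mbox_sha1sum("alice@example.org")
```

Two FOAF files that carry the same digest can be taken to describe the same person, which is how the descriptions of known people are linked across files.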
Add the FOAF information to the
homepage
<html> <head> ...
<link rel="meta"
type="application/rdf+xml"
title="FOAF" href="foaf.rdf" />
</head> <body> ... </body> </html>
• FOAF Agents on the Internet will now
be able to locate the FOAF entry.
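How an agent might locate the FOAF entry can be sketched with Python's standard html.parser: scan the page for a link element with rel="meta" and the RDF/XML media type. The class name FoafLinkFinder is hypothetical, and real agents may use other discovery rules:

```python
from html.parser import HTMLParser

class FoafLinkFinder(HTMLParser):
    """Collect the href of every <link rel="meta" type="application/rdf+xml">."""

    def __init__(self):
        super().__init__()
        self.foaf_links = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        a = dict(attrs)
        if (tag == "link" and a.get("rel") == "meta"
                and a.get("type") == "application/rdf+xml"):
            self.foaf_links.append(a.get("href"))

page = (
    '<html><head>'
    '<link rel="meta" type="application/rdf+xml" title="FOAF" href="foaf.rdf" />'
    '</head><body></body></html>'
)

finder = FoafLinkFinder()
finder.feed(page)
```

The href is relative, so an agent would resolve it against the URL of the homepage before fetching the file.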
FOAFBot: IRC Community
Support Agent
• FOAFBot is an IRC bot that provides
access to a knowledge base created
by spidering FOAF files.
• It can sit on an IRC channel and
provide basic informational help
about the members of a community.
DOPE
• The DOPE Browser is a deliverable
created by Aduna BV for the Drug
Ontology Project for Elsevier
(DOPE), a project funded by the
Elsevier Advanced Technology
Group.
Variants of ontologies
• Domain ontology: an ontology specific to
one domain.
• Upper Ontology: limited to concepts that
are meta, generic, abstract and
philosophical, general enough to
address (at a high level) a broad range
of domain areas.
Suggested Upper Merged
Ontology (SUMO)
• http://www.ontologyportal.org/
• SUMO is written in the SUO-KIF
language
• Largest free, formal ontology
available, with 20,000 terms and
60,000 axioms when all domain
ontologies are combined.
SUMO
These consist of SUMO itself, the Mid-Level
Ontology (MILO), and ontologies of
• Communications
• Countries and Regions
• Distributed computing
• Economy
• Finance
• Engineering components
• Geography
• Government
• Military
• People
• Transportation
Gene Ontology
• http://www.geneontology.org/
• Controlled vocabulary to describe gene
and gene product attributes in any
organism.
• Updated every 30 minutes
• Terms as of 16/8/2005: 9759 biological_process, 1574 cellular_component, 7076 molecular_function
• Formats: OBO, GO, OWL
Core Semantic Web Research Topics:
SEKT Project
• Semantically Enabled Knowledge
Technologies (SEKT)
• A European research and
development project launched under
the EU Sixth Framework Programme.
Duration and Partners
• Three-year project: January 2004 to December 2006.
• 13 partners:
Companies: BT (British Telecom), Empolis GmbH, iSOCO (Spain), Kea-pro GmbH, Ontoprise, Sirma AI EOOD (Bulgaria), (+ Siemens)
Universities: Jozef Stefan Institute (Slovenia), Univ. Karlsruhe (Germany), Univ. Sheffield (U.K.), Univ. Innsbruck (Austria), Univ. Autonoma Barcelona (Spain), Vrije Universiteit Amsterdam (The Netherlands)
Case Studies
• Legal Domain (iSOCO)
• Telecom Domain (BT)
• Siemens
SEKT Activities and Relationships
Core Tasks: WP3
Main Goals of WP3
AIFB
• Enable and greatly facilitate setting
up, usage and maintenance of
Ontologies and related Metadata
• Combine manual and (semi-) automatic
approaches for evolution of Ontologies
and related Metadata
• Make extensive use of reasoning
Task Overview
AIFB
• Incremental Ontology Evolution
• Usage Tracking for Ontologies and Metadata
• Data-driven Change Discovery
• Reasoning with inconsistent Models
• Multi-Version Reasoning
• Inconsistency Diagnosis and Repair
WP3.4 Reasoning with
Inconsistency
• Milestone 3.4 – Software Prototypes
• D3.4.1: Reasoning with Inconsistent Models. V1.
P/PU/Month 12
What We are Expecting
• Given an inconsistent ontology, return
meaningful partial answers to queries
(given that fully logically correct answers
are not possible)
• Use nonstandard reasoning to deal with
inconsistency
WP3.5 Multi-Version Reasoning
• Main task: given two versions of an ontology
and a query, indicate how the changes in
the ontology have affected the answer to the
query.
• Milestone 3.5 – Software Prototypes
• D3.5.1: Multi-version reasoning V1. P/PU/Month
18
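The main task can be illustrated on a toy scale: run the same subsumption query against two versions of a class hierarchy and report whether the answer was gained, lost or unchanged. This is a sketch under strong simplifying assumptions (ontologies reduced to direct subclass pairs, invented versions v1 and v2, hypothetical function names), not the SEKT prototype:

```python
def subsumes(ontology, sub, sup):
    """True if sup is reachable from sub via the direct (sub, sup) subclass pairs."""
    seen, stack = {sub}, [sub]
    while stack:
        current = stack.pop()
        if current == sup:
            return True
        for a, b in ontology:
            if a == current and b not in seen:
                seen.add(b)
                stack.append(b)
    return False

def compare_versions(old, new, query):
    """Report how the answer to a subsumption query changed between versions."""
    before, after = subsumes(old, *query), subsumes(new, *query)
    if before == after:
        return "unchanged"
    return "gained" if after else "lost"

# Two invented versions: the second one drops the axiom Bird ⊑ Fly.
v1 = {("Penguin", "Bird"), ("Bird", "Fly")}
v2 = {("Penguin", "Bird")}

fly_change = compare_versions(v1, v2, ("Penguin", "Fly"))
bird_change = compare_versions(v1, v2, ("Penguin", "Bird"))
```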
WP3.6 Inconsistency Diagnosis
and Repair
• Main task: given an inconsistent ontology,
locate possible sources of inconsistencies
and offer them to the user (a knowledge
engineer) for repair.
• Milestone 3.6 – Software Prototypes
• D3.6.1: Inconsistency Diagnosis and
Repair V1. P/PU/Month 21
Topics in Ontology
Management
• Ontology reasoning
• Ontology change and evolution
• Ontology merging
• Ontology mapping
• Multi-version ontology reasoning and management
• Inconsistent ontology processing