Transcript Document
An Introduction to the Semantic Web and Ontology Technology
黄智生 (Zhisheng Huang)
Vrije Universiteit Amsterdam, The Netherlands
[email protected]
China 2005 http://sekt.semanticweb.org/

Starting from Google

Existing Problems

Can we do better?
• Semantics-based search
• Concept combination specification
• Domain-specific search
• Approximate search
• Search agents

The Semantic Web
• Core idea: give information on the Web a precisely defined meaning, i.e. semantics.
"The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in co-operation." [Berners-Lee et al., 2001]

What is Semantics?
• Frege (1848-1925): Reference and Sense
• Syntax, Semantics, Pragmatics
• Denotational Semantics vs. Operational Semantics
Main features
• Denotation
• Uniqueness
• Relatedness

What does the Semantic Web want to do?
• Make Web content automatically processable by machines
• Make Web content machine-understandable
Content is machine-understandable if it is bound to some formal description of itself (i.e. metadata).

HTML Markup
……
<h2>Zhisheng Huang</h2>
<b>Affiliation</b>: Department of Computer Science<br>
Faculty of Sciences<br>
Vrije University Amsterdam<p>
<b>Email</b>: huang @ cs.vu.nl<br>
<b>Phone</b>: 31-20-4447740(office)
……

XML Annotations
<researcher>
  <name>Zhisheng Huang</name>
  <affiliation>
    <department>Department of Computer Science</department>
    <faculty>Faculty of Sciences</faculty>
    <university>Vrije University Amsterdam</university>
  </affiliation>
  <email>huang @ cs.vu.nl</email>
  <phone id="office">(31)-20-4447740</phone>
  ……
</researcher>

Data Structures
• Structured data: databases
• Semi-structured data: HTML, XML, BibTeX
• Non-structured data: text

XML representation of a relational database

AI group member
id    name   phone
001   John   1234567
002   Mary   7654321
…     …      …

<group name="AI">
  <member id="001">
    <name>John</name>
    <phone>1234567</phone>
  </member>
  <member id="002">
    <name>Mary</name>
    <phone>7654321</phone>
  </member>
  …..
</group>

Document Type Definition (DTD)
<!DOCTYPE researcher [
  <!ELEMENT researcher (name, affiliation, email, phone)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT email (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
  <!ATTLIST phone id CDATA #REQUIRED>
  <!ELEMENT affiliation (department, faculty, university)>
  …
]>

Data Model
[Diagram: a Researcher has a Name, a Phone and an eMail; n Researchers have 1 Affiliation, which consists of a Department, a Faculty and a University.]

XML Schema
• The purpose of an XML Schema is to define the legal building blocks of an XML document, just like a DTD.
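
To make the idea of "legal building blocks" concrete, here is a minimal Python sketch that checks the researcher record against the content models stated in the DTD slide. It is an illustration only, not a real DTD or XML Schema validator: the `CONTENT_MODEL` table and the `conforms` function are hypothetical names, the table is a hand-written mirror of the two `<!ELEMENT …>` sequences, and the record is the slide's example with the elided part left out.

```python
import xml.etree.ElementTree as ET

# Researcher record from the XML annotation slide (elided part omitted).
RECORD = """
<researcher>
  <name>Zhisheng Huang</name>
  <affiliation>
    <department>Department of Computer Science</department>
    <faculty>Faculty of Sciences</faculty>
    <university>Vrije University Amsterdam</university>
  </affiliation>
  <email>huang @ cs.vu.nl</email>
  <phone id="office">(31)-20-4447740</phone>
</researcher>
"""

# Hypothetical hand-written mirror of the DTD content models:
# element name -> required sequence of child elements.
CONTENT_MODEL = {
    "researcher": ["name", "affiliation", "email", "phone"],
    "affiliation": ["department", "faculty", "university"],
}


def conforms(element):
    """Return True if the element and its descendants follow CONTENT_MODEL."""
    expected = CONTENT_MODEL.get(element.tag)
    if expected is None:
        # Elements declared as (#PCDATA) must not contain child elements.
        return len(element) == 0
    children = [child.tag for child in element]
    return children == expected and all(conforms(child) for child in element)


root = ET.fromstring(RECORD)
print(conforms(root))                # does the record match the content models?
print(root.find("phone").get("id"))  # the attribute the DTD marks as #REQUIRED
```

A production setup would hand the DTD or schema itself to a validating parser; the point of the sketch is only that a schema turns "what may appear where" into something a program can check.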

Why XML Schemas
• XML Schemas are extensible to future additions
• XML Schemas are richer and more useful than DTDs
• XML Schemas are written in XML
• XML Schemas support data types
• XML Schemas support namespaces

Name Conflicts
• Since element names in XML are not fixed, a name conflict often occurs when two different documents use the same name to describe two different types of elements.
• If two such XML documents were merged, there would be an element name conflict, because both documents contain an element with the same name but different content and definition.

XML Namespaces
• Namespaces are used to solve name conflicts
Examples:
• xmlns:namespace-prefix="namespace"
• xmlns:xsd="http://www.w3.org/2001/XMLSchema"

XML Schema
<xsd:element name="researcher">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="affiliation" type="affil" minOccurs="1" maxOccurs="unbounded"/>
      <xsd:element name="phone" type="xsd:string"/>
      <xsd:element name="email" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
<xsd:complexType name="affil">
  <xsd:sequence>
    <xsd:element name="department" type="xsd:string"/>
    <xsd:element name="faculty" type="xsd:string"/>
    <xsd:element name="university" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>

Resource Description Framework (RDF)
• Metadata is machine-understandable information about web resources, or about anything else that has a URI. It is represented as a set of independent assertions:
Triple: T(subject, attribute, value)
[Diagram: the resource http://wasp.cs.vu.nl/sekt/dig/dig.pdf has Creator Zhisheng and Creator Cees.]
<rdf:Description rdf:about="http://wasp.cs.vu.nl/sekt/dig/dig.pdf">
  <dc:creator rdf:resource="http://www.cs.vu.nl/~huang"/>
  <dc:creator rdf:resource="mailto:[email protected]"/>
</rdf:Description>
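
The triple view of the RDF description above can be sketched in a few lines of Python. This is a toy in-memory store rather than an RDF library: `triples` and `match` are illustrative names, the property is written as the plain string `dc:creator`, and the second creator's address is a made-up placeholder because the original address is not legible in the transcript.

```python
# Each assertion is one (subject, property, value) triple.
DOC = "http://wasp.cs.vu.nl/sekt/dig/dig.pdf"

triples = {
    (DOC, "dc:creator", "http://www.cs.vu.nl/~huang"),
    (DOC, "dc:creator", "mailto:cees@example.org"),  # hypothetical placeholder
}


def match(store, subject=None, prop=None, value=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (subject is None or t[0] == subject)
        and (prop is None or t[1] == prop)
        and (value is None or t[2] == value)
    )


# "Who created dig.pdf?" becomes a pattern with the value left open.
creators = [value for _, _, value in match(triples, subject=DOC, prop="dc:creator")]
print(creators)
```

Because every assertion is independent, adding a new fact about the document is just adding one more triple; nothing else in the store has to change.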

RDF: Dublin Core
• The Dublin Core provides properties for describing network objects, suitable for use by network search engines.
• The Dublin Core is a set of predefined properties for describing documents.
• The first Dublin Core properties were defined at the Metadata Workshop in Dublin, Ohio in 1995 and are currently maintained by the Dublin Core Metadata Initiative.

Dublin Core Metadata Initiative
• The Dublin Core Metadata Initiative is an open forum engaged in the development of interoperable online metadata standards that support a broad range of purposes and business models.
• http://dublincore.org/

Annotating Metadata
<rdf:Description rdf:about="…dc-rdf/">
  <dc:title>Guidance on expressing the Dublin Core within the Resource Description Framework (RDF)</dc:title>
  <dc:creator>Eric Miller</dc:creator>
  <dc:creator>Paul Miller</dc:creator>
  <dc:creator>Dan Brickley</dc:creator>
  <dc:subject>Dublin Core; RDF; XML</dc:subject>
  <dc:publisher>Dublin Core Metadata Initiative</dc:publisher>
  <dc:contributor>Dublin Core Data Model Working Group</dc:contributor>
  <dc:date>1999-07-01</dc:date>
  <dc:format>text/html</dc:format>
  <dc:language>en</dc:language>
</rdf:Description>

RDF Schema (RDFS)
• RDFS defines vocabulary for RDF
• Organizes this vocabulary in a typed hierarchy
• Class, subClassOf, type
• Property, subPropertyOf
• domain, range

RDFS
[Diagram: an RDFS graph over Person, Student, Professor, hasSuperVisor, Prof. Ma and Wang, with edges labelled subClassOf, domain, range and type.]

Concepts and Ontologies
• Ontology as a philosophical discipline: the branch of philosophy that deals with the nature and the organisation of reality.
• Science of Being (Aristotle, Metaphysics, IV,1)
• What is being?
• What are the features common to all beings?
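
Returning briefly to the RDFS slide: the sketch below shows, in Python, how the RDFS vocabulary (subClassOf, domain, range, type) lets a program derive types that were never stated. It assumes one plausible reading of the diagram, namely that Student and Professor are subclasses of Person, that hasSuperVisor has domain Student and range Professor, that Wang is a Student and that Prof. Ma is a Professor. The individual "Li" and both hasSuperVisor facts are hypothetical additions for illustration.

```python
# Assumed reading of the RDFS diagram (see the lead-in above).
SUBCLASS_OF = {"Student": "Person", "Professor": "Person"}
TYPE = {"Wang": "Student", "Prof. Ma": "Professor"}
DOMAIN = {"hasSuperVisor": "Student"}
RANGE = {"hasSuperVisor": "Professor"}

# Hypothetical facts; "Li" has no declared type at all.
FACTS = [
    ("Wang", "hasSuperVisor", "Prof. Ma"),
    ("Li", "hasSuperVisor", "Prof. Ma"),
]


def superclasses(cls):
    """Follow subClassOf links upwards and return every class passed."""
    found = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        found.append(cls)
    return found


def inferred_types(individual):
    """Combine declared types with types implied by domain and range."""
    types = set()
    if individual in TYPE:
        types.add(TYPE[individual])
    for subject, prop, obj in FACTS:
        if subject == individual and prop in DOMAIN:
            types.add(DOMAIN[prop])
        if obj == individual and prop in RANGE:
            types.add(RANGE[prop])
    for cls in list(types):
        types.update(superclasses(cls))
    return types


print(inferred_types("Li"))  # types derived purely from the schema
```

Nothing says directly that Li is a Student or a Person; both follow from the domain of hasSuperVisor and the class hierarchy.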

Vocabulary and Ontology
• Controlled vocabulary (Jernst 2003):
    • a list of controlled terms
    • unambiguous
    • non-redundant definition
• Ontology: a controlled vocabulary expressed in an ontology representation language (Jernst 2003)

In computer science …
• An ontology is an explicit specification of a conceptualization. [Gruber93]
• An ontology is a shared understanding of some domain of interest. [Uschold, Gruninger96]
• There are many definitions:
    • a formal specification (EXECUTABLE)
    • of a conceptualization of a domain (COMMUNITY)
    • of some part of the world that is of interest (APPLICATION)
• An ontology defines:
    • A common vocabulary of terms
    • Some specification of the meaning of the terms
    • A shared understanding for people and machines

Why develop an ontology?
• To make domain assumptions explicit
    • Easier to change domain assumptions
    • Easier to understand and update legacy data
• To separate domain knowledge from operational knowledge
    • Re-use domain and operational knowledge separately
• A community reference for applications
• To share a consistent understanding of what information means.

Key features of an Ontology
• Concept hierarchy
• Concept subsumption
• InstanceOf relation between the specific and the general (instances)
• PartOf relation between part and whole (property)

Why not other alternatives?
• First-order predicate logic
• Set theory
• Programming languages

RDF(S) Reconsideration
• Next step up from plain XML:
    • (small) ontological commitment to modeling primitives
    • possible to define vocabulary
• However:
    • no precisely described meaning
    • unclear semantics, no clean separation between:
        • Instances
        • Concepts
        • Meta-ontologies (e.g.
          RDFS language itself)
    • no inference model

Web Ontology Language (OWL)
• OWL is built on top of RDF
• OWL is for processing information on the web
• OWL was designed to be interpreted by computers
• OWL was not designed for being read by people
• OWL is written in XML
• OWL is a web standard

Design Goals for OWL

Layered language
• OWL Lite:
    • Classification hierarchy
    • Simple constraints
• OWL DL:
    • Maximal expressiveness
    • While maintaining tractability
    • Standard formalisation
• OWL Full:
    • Very high expressiveness
    • Losing tractability
    • Non-standard formalisation
    • All syntactic freedom of RDF (self-modifying)
[Diagram: nested layers Full, DL, Lite; syntactic layering and semantic layering.]

OWL Example: animals
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xml:base="http://wasp.cs.vu.nl/sekt/ontology/animal">
  <owl:Ontology rdf:about="animal"/>
  <owl:Class rdf:ID="Eagle">
    <rdfs:subClassOf><owl:Class rdf:about="#Bird"/></rdfs:subClassOf>
  </owl:Class>
  <owl:Class rdf:ID="Animal"/>
  <owl:Class rdf:ID="Fly">
    <owl:disjointWith><owl:Class rdf:about="#Penguin"/></owl:disjointWith>
    <rdfs:subClassOf rdf:resource="#Animal"/>
  </owl:Class>
  <owl:Class rdf:ID="Bird">
    <rdfs:subClassOf rdf:resource="#Fly"/>
  </owl:Class>
  <owl:Class rdf:ID="Penguin">
    <rdfs:subClassOf rdf:resource="#Bird"/>
    <owl:disjointWith rdf:resource="#Fly"/>
  </owl:Class>
</rdf:RDF>

Semantic Web Layers
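
The animal ontology shown earlier is a deliberately problematic one: Penguin is a subclass of Bird, Bird is a subclass of Fly, and yet Penguin and Fly are declared disjoint, so no individual can ever be a Penguin. The Python sketch below shows one way a tool might spot such an unsatisfiable class. It is an assumption-laden illustration, not a description-logic reasoner: it only follows the stated subClassOf and disjointWith axioms, and the names `SUBCLASS_OF`, `DISJOINT`, `ancestors` and `unsatisfiable` are hypothetical.

```python
# Axioms transcribed from the OWL animal example.
SUBCLASS_OF = {
    "Eagle": {"Bird"},
    "Bird": {"Fly"},
    "Fly": {"Animal"},
    "Penguin": {"Bird"},
}
DISJOINT = {frozenset({"Fly", "Penguin"})}
CLASSES = ["Animal", "Bird", "Eagle", "Fly", "Penguin"]


def ancestors(cls):
    """Return cls together with every class reachable via subClassOf."""
    reached, stack = {cls}, [cls]
    while stack:
        for parent in SUBCLASS_OF.get(stack.pop(), ()):
            if parent not in reached:
                reached.add(parent)
                stack.append(parent)
    return reached


def unsatisfiable(cls):
    """A class is unsatisfiable if it falls under two disjoint classes."""
    above = ancestors(cls)
    return any(pair <= above for pair in DISJOINT)


problem_classes = [c for c in CLASSES if unsatisfiable(c)]
print(problem_classes)
```

The check is sound for these two axiom types but far from complete; OWL DL reasoners handle the full language. It does, however, show why reasoning over an ontology can reveal modelling errors that are invisible in the XML syntax.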
Logical Foundation of the Semantic Web
The debate between Description Logic and Frame Logic (F-logic):
• Closed world assumption vs. open world assumption
• Unique name assumption vs. non-unique name assumption
• Object-oriented vs. non-object-oriented
• …..

Description Logic
TBox (schema):
    Man ≡ Human ⊓ Male
    Happy-Father ≡ Man ⊓ ∃has-child.Female ⊓ …
ABox (data):
    John : Happy-Father
    ⟨John, Mary⟩ : has-child
[Diagram: a Knowledge Base consisting of TBox and ABox, connected to an Inference System and an Interface.]

Basic Description Logic: AL
• Concept expressions:
    • A (atomic concept)
    • ⊤ (universal concept)
    • ⊥ (bottom concept)
    • ¬A (atomic negation)
    • C ⊓ D (intersection)
    • ∀R.C (value restriction)
    • ∃R.⊤ (limited existential quantification)
where A is a concept name, C and D are concept expressions, and R is a role expression.

Family of AL languages
• C ⊔ D (union)
• ∃R.C (full existential quantification)
• ¬C (complement)
• Number restriction
    • (≥ n R) (at-least restriction)
    • (≤ n R) (at-most restriction)
• Qualified number restriction
    • (≥ n R.C) (at-least restriction)
    • (≤ n R.C) (at-most restriction)
• Transitive role: R+
• Inverse of role: I
• Role hierarchies (R ⊑ S): H

Examples
woman ≡ person ⊓ female
man ≡ person ⊓ ¬woman
mother ≡ woman ⊓ ∃hasChild.person
father ≡ man ⊓ ∃hasChild.person

Description Logics
• Decidable subset of first-order logic
    • Equivalent to 3 Variable Fragment (Borgida 1996)
    • Model-theoretic semantics by mapping to an abstract domain
• Provides primitives for defining conceptual knowledge
    • Concept expressions (formulas with 1 free variable) for describing sets of objects
        • Boolean operators: C ⊓ D, C ⊔ D, ¬C
        • Quantifiers: (∃R.C), (∀P.C)
        • Cardinality constraints: (= n R), (> n R), (< n R), (≥ n R), (≤ n R)
    • Axioms define relations between concepts
        • Subsumption: C ⊑ D
        • Equivalence: C ≡ D
        • Disjointness: C ⊓ D ⊑ ⊥

DL Semantics
• The interpretation function extends to concept expressions in an obvious(ish) way.
[The defining equations on this slide are not preserved in the transcript.]

DL for OWL: SHIQ
• SHIQ = ALCQHIR+

Frame-logic (F-logic)
• Object-oriented
• Frame-based
• Rule-based
• …
• Negation as failure

Example
/* facts */
abraham:man.
sarah:woman.
isaac:man[father->abraham; mother->sarah].
ishmael:man[father->abraham; mother->hagar:woman].
jacob:man[father->isaac; mother->rebekah:woman].
esau:man[father->isaac; mother->rebekah].
/* rules consisting of a rule head and a rule body */
FORALL X,Y X[son->>Y] <- Y:man[father->X].
FORALL X,Y X[son->>Y] <- Y:man[mother->X].
FORALL X,Y X[daughter->>Y] <- Y:woman[father->X].
FORALL X,Y X[daughter->>Y] <- Y:woman[mother->X].
/* query */
FORALL X,Y <- X:woman[son->>Y[father->abraham]].

Semantic Web Application: the FOAF project
• The Friend of a Friend (FOAF) project is about creating a Web of machine-readable homepages describing people, the links between them and the things they create and do.
• http://www.foaf-project.org/

foaf.rdf
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <foaf:Person>
    <foaf:surname>Huang</foaf:surname>
    <foaf:name>Zhisheng Huang</foaf:name>
    <foaf:firstName>Zhisheng</foaf:firstName>
    <foaf:gender>male</foaf:gender>
    <foaf:img rdf:resource="http://wasp.cs.vu.nl/~huang/images/huang02.jpg" />
    <foaf:homepage rdf:resource="http://wasp.cs.vu.nl/~huang/" />
    <foaf:mbox_sha1sum>238a59a17bd96fbb93f39aa9dba2f6847a8d261c</foaf:mbox_sha1sum>
    <foaf:workplaceHomepage rdf:resource="http://www.vu.nl/" />
    <foaf:mbox>mailto:[email protected]</foaf:mbox>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Annette ten Teije</foaf:name>
        <foaf:mbox_sha1sum>c10984c365331f1d38f649adccbb7aac5873aed2</foaf:mbox_sha1sum>
      </foaf:Person>
    </foaf:knows>
    <foaf:knows>
      <foaf:Person>
      ……

Add the FOAF information to the homepage
<html>
<head>
  ...
  <link rel="meta" type="application/rdf+xml" title="FOAF" href="foaf.rdf" />
</head>
<body> ... </body>
</html>
• FOAF agents on the Internet will now be able to locate the FOAF entry.

FOAFBot: IRC Community Support Agent
• FOAFBot is an IRC bot that provides access to a knowledge base created by spidering FOAF files.
• It can sit on an IRC channel and provide basic informational help about the members of a community.

DOPE
• The DOPE Browser is a deliverable created by Aduna BV for the Drug Ontology Project for Elsevier (DOPE), a project funded by the Elsevier Advanced Technology Group.
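
As an illustration of what a FOAF agent such as FOAFBot has to do with a spidered file, the following Python sketch reads a person's name, homepage and acquaintances from a FOAF document using only the standard library. It is a simplified assumption of how such an agent could work, not FOAFBot's actual code; the embedded document is a shortened, well-formed excerpt of the foaf.rdf example, and a real agent would more likely use a proper RDF parser than plain XML navigation.

```python
import xml.etree.ElementTree as ET

FOAF = "http://xmlns.com/foaf/0.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
NS = {"foaf": FOAF, "rdf": RDF}

# Shortened, well-formed excerpt of the foaf.rdf example.
DOC = """
<rdf:RDF xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <foaf:Person>
    <foaf:name>Zhisheng Huang</foaf:name>
    <foaf:homepage rdf:resource="http://wasp.cs.vu.nl/~huang/"/>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Annette ten Teije</foaf:name>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>
"""

root = ET.fromstring(DOC)
person = root.find("foaf:Person", NS)

# Direct properties of the top-level person.
name = person.findtext("foaf:name", namespaces=NS)
homepage = person.find("foaf:homepage", NS).get(f"{{{RDF}}}resource")

# People reachable through foaf:knows.
friends = [
    friend.findtext("foaf:name", namespaces=NS)
    for friend in person.findall("foaf:knows/foaf:Person", NS)
]

print(name, homepage, friends)
```

Because every FOAF file uses the same vocabulary, the same few lines work on any member's file, which is what allows a bot to build a community knowledge base by crawling.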

Variants of ontologies
• Domain ontology: an ontology specific to a particular domain.
• Upper ontology: limited to concepts that are meta, generic, abstract and philosophical, general enough to address (at a high level) a broad range of domain areas.

Suggested Upper Merged Ontology (SUMO)
• http://www.ontologyportal.org/
• SUMO is written in the SUO-KIF language
• Largest free, formal ontology available, with 20,000 terms and 60,000 axioms when all domain ontologies are combined.

SUMO
These consist of SUMO itself, the Mid-Level Ontology (MILO), and ontologies of:
• Communications
• Countries and Regions
• Distributed computing
• Economy
• Finance
• Engineering components
• Geography
• Government
• Military
• People
• Transportation

Gene Ontology
• http://www.geneontology.org/
• Controlled vocabulary to describe gene and gene product attributes in any organism.
• Updated every 30 minutes
• Terms as of 16/8/2005: 9759 biological_process, 1574 cellular_component, 7076 molecular_function
• Formats: OBO, GO, OWL

A core Semantic Web research project: SEKT
• Semantically Enabled Knowledge Technologies (SEKT)
• A European research and development project launched under the EU Sixth Framework Programme.

Duration and Partners
• Three-year project: January 2004 – December 2006.
• 13 partners:
Companies: BT (British Telecom), Empolis GmbH, iSOCO (Spain), Kea-pro GmbH, Ontoprise, Sirma AI EOOD (Bulgaria), (+ Siemens)
Universities: Jozef Stefan Institute (Slovenia), Univ. Karlsruhe (Germany), Univ. Sheffield (U.K.), Univ. Innsbruck (Austria), Univ.
Autonoma Barcelona (Spain), Vrije Universiteit Amsterdam (The Netherlands)

Case Studies
• Legal Domain (iSOCO)
• Telecom Domain (BT)
• Siemens

SEKT Activities and Relationships

Core Tasks: WP3

Main Goals of WP3
• Enable and greatly facilitate the setting up, usage and maintenance of ontologies and related metadata
• Combine manual and (semi-)automatic approaches for the evolution of ontologies and related metadata
• Make extensive use of reasoning

Task Overview
• Incremental Ontology Evolution
• Usage Tracking for Ontologies and Metadata
• Data-driven Change Discovery
• Reasoning with Inconsistent Models
• Multi-Version Reasoning
• Inconsistency Diagnosis and Repair

WP3.4 Reasoning with Inconsistency
• Milestone 3.4: Software Prototypes
• D3.4.1: Reasoning with Inconsistent Models. V1. P/PU/Month 12

What We are Expecting
• Given an inconsistent ontology, return meaningful partial answers to queries (given that fully logically correct answers are not possible)
• Use nonstandard reasoning to deal with inconsistency

WP3.5 Multi-Version Reasoning
• Main task: given two versions of an ontology and a query, indicate how the changes in the ontology have affected the answer to the query.
• Milestone 3.5: Software Prototypes
• D3.5.1: Multi-version reasoning V1. P/PU/Month 18

WP3.6 Inconsistency Diagnosis and Repair
• Main task: given an inconsistent ontology, locate possible sources of inconsistencies and offer them to the user (a knowledge engineer) for repair.
• Milestone 3.6: Software Prototypes
• D3.6.1: Inconsistency Diagnosis and Repair V1.
P/PU/Month 21

Topics in Ontology Management
• Ontology reasoning
• Ontology change and evolution
• Ontology merging
• Ontology mapping
• Multi-version ontology reasoning and management
• Inconsistent ontology processing