Semantic Web Overview


Intelligent Systems Lab.
Semantic Web:
Overviews and Trends
Prof. Joongmin Choi
Intelligent Systems Laboratory
Dept. of Computer Science and Engineering
Hanyang University
Korean Information Processing Tutorial '04 (한국어정보처리튜토리얼 '04)
Contents
Semantic Web in General
- Motivations, Definitions, Goals
Roles of XML and RDF in Semantic Web
- XML – syntactical foundation
- RDF – semantic foundation
Ontologies
- Definitions, Languages, Tools
Semantic Web Services
Applications
Trends
Semantic Web in General
Motivations
Definitions
Goals
Current Web

Current Web is for humans
- Most of the Web’s content today is designed for humans to read, not for computer programs to manipulate meaningfully.
- Computers can adeptly parse Web pages for layout and routine processing, but, in general, computers have no reliable way to process the semantics.
- Can easily lead to information overload and poor content aggregation
- Need a new paradigm to make the Web usable by machines (programs, or software agents)
Original Web Envisioned by Berners-Lee
Before and After?
What is the Semantic Web?

T. Berners-Lee, J. Hendler, O. Lassila
- The Semantic Web is not a separate Web, but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation.
W3C (World Wide Web Consortium)
- The Semantic Web is the abstract representation of data on the World Wide Web, based on the RDF standards and other standards to be defined.
Goals of Semantic Web

To develop enabling standards and technologies designed to help machines understand more information on the Web

To support richer discovery, data integration, navigation, and automation of tasks
Semantic Web Layers
Smart Data Continuum

Semantic Web Technologies
- Are the foundations of a systematic approach to creating “smart data”
Smart Data Continuum
- Text: documents and database records
- XML: documents using single vocabularies
- XML: taxonomies and docs with mixed vocabularies
- XML: ontology and automated reasoning
XML and RDF
XML – Syntactical Foundation
RDF – Semantic Foundation
XML

eXtensible Markup Language
- XML is a markup language for the representation of documents which contain structured information
- Structured information contains both content and some indication of what role that content plays
- A markup language is a mechanism to identify structures in a document
- The XML specification defines a standard way to add markup to documents
How Does XML Fit into the Semantic Web?

Characteristics of XML
- XML creates application-independent documents and data
- XML has a standard syntax for metadata
- XML has a standard structure for both documents and data
- XML is the syntactical foundation layer of the Semantic Web
XML is not enough

Limitations of XML
- Many ways to say the same thing
  - Multiple valid structures for the same data
- Does not impose a common interpretation of the data
  - heading vs. title
  - price vs. cost
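The two limitations above can be illustrated with a minimal Python sketch. The two documents, the book data, and the SYNONYMS table are hypothetical and chosen only for illustration; the point is that XML itself carries no hint that "heading" and "title" mean the same thing, so the mapping has to be supplied from outside.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents that describe the same book with
# different structures (child elements vs. attributes) and different names.
doc_a = "<book><title>Weaving the Web</title><price>15</price></book>"
doc_b = '<book heading="Weaving the Web" cost="15"/>'

def flatten(xml_text):
    """Collect (name, value) pairs from child elements and from attributes."""
    root = ET.fromstring(xml_text)
    pairs = {(child.tag, child.text) for child in root}
    pairs |= set(root.attrib.items())
    return pairs

# Hypothetical, hand-written agreement on meaning; XML does not provide it.
SYNONYMS = {"heading": "title", "cost": "price"}

def normalize(pairs):
    """Rename each field to its agreed common name."""
    return {(SYNONYMS.get(name, name), value) for name, value in pairs}

a = flatten(doc_a)
b = flatten(doc_b)
same_without_mapping = a == b
same_with_mapping = normalize(a) == normalize(b)
```

Without the external mapping the two documents compare as different; with it they compare as equal.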
RDF

Resource Description Framework
- RDF is an infrastructure that enables the encoding, exchange and reuse of structured metadata
- RDF is a foundation for processing metadata
  - Provides interoperability between applications that exchange machine-understandable information on the Web
- RDF is an application of XML that imposes needed structural constraints to provide unambiguous methods of expressing semantics
- RDF is the semantic foundation layer of the Semantic Web
RDF Model

Resources
- All things being described by RDF expressions
- A resource may be an entire Web page, a part of a Web page, or a whole collection of pages
Properties
- A specific aspect, characteristic, attribute, or relation used to describe a resource
- Each property has a specific meaning
  - It defines its permitted values, the types of resources it can describe, and its relationship with other properties
Statements
- A specific resource together with a named property plus the value of that property for that resource is an RDF statement
RDF Graph Model
Statement: SPO triple
[Figure: Resource (subject, URI) -- Property (predicate) --> Resource (object, URI)]
Example : RDF Model
“The individual referred to by employee id 85740 is named Ora Lassila
and has the email address [email protected].
The resource http://www.w3.org/Home/Lassila was created by this individual.”
Structured value with identifier
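As a rough sketch of this model, the statements can be written as (subject, predicate, object) triples in Python. The property names "Creator", "Name" and "Email" and the identifier URI used for employee id 85740 are assumptions made for illustration; the email address is kept exactly as it appears (redacted) in the source.

```python
# Each statement is a (subject, predicate, object) triple.
PAGE = "http://www.w3.org/Home/Lassila"
# Assumed identifier for "the individual referred to by employee id 85740".
EMPLOYEE = "http://www.w3.org/staffId/85740"

triples = {
    (PAGE, "Creator", EMPLOYEE),                # the page was created by this individual
    (EMPLOYEE, "Name", "Ora Lassila"),
    (EMPLOYEE, "Email", "[email protected]"),   # address is redacted in the source
}

def objects(subject, predicate):
    """Return every value of `predicate` for `subject`."""
    return {o for (s, p, o) in triples if s == subject and p == predicate}

# Because the creator is itself a resource with an identifier,
# a query can follow the link from the page to the person's name.
creator_names = {
    name
    for creator in objects(PAGE, "Creator")
    for name in objects(creator, "Name")
}
```

Giving the structured value its own identifier is what lets other statements refer to the same person.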
Example: XML Serialization
“The individual referred to by employee id 85740 is named Ora Lassila
and has the email address [email protected].
The resource http://www.w3.org/Home/Lassila was created by this individual.”
RDF Schema

RDF Schemas are used to declare vocabularies
- A schema defines the terms that will be used in RDF statements and gives specific meanings to them.
- Provides a basic type system for use in RDF models
  - Defines resources and properties such as Class and subClassOf that are used in specifying application-specific schemas

XML DTD vs. RDF Schemas
- An XML DTD gives specific constraints on the structure of a document
- An RDF Schema provides information about the interpretation of the statements given in an RDF data model
XML vs. RDF

XML was designed for documents, not data.
- Many features (like attributes and entities) are document-oriented, not for expressing data
- There are many ways to say the same thing in XML
- Hybrid tree structure: confusing and nonstandard
- Makes basic operations more complex (e.g. merging)
RDF was designed for data
- The number of changes you can make to a triple is fairly small
- Simple structure: triples
- Merging two documents is simply combining the two into one
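The merging claim can be sketched in a few lines of Python, treating each RDF document as a set of triples. The "ex:" names and the example.org address are hypothetical, and the sketch ignores blank nodes, which would need renaming before a real merge.

```python
# Two hypothetical RDF documents, each a set of (subject, predicate, object) triples.
graph_a = {
    ("ex:page1", "ex:creator", "ex:alice"),
    ("ex:alice", "ex:name", "Alice"),
}
graph_b = {
    ("ex:alice", "ex:name", "Alice"),               # also stated in graph_a
    ("ex:alice", "ex:email", "alice@example.org"),
}

# Merging is set union: duplicate statements collapse, nothing is restructured.
merged = graph_a | graph_b
```

Merging two XML trees, by contrast, requires deciding where each element of one tree belongs in the other.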
Ontology
Definitions
Languages
Tools
Ontology – Definition (I)

R. Neches et al. (1991)
An ontology defines the basic terms and relations comprising the vocabulary of a topic area, as well as the rules for combining terms and relations to define extensions to the vocabulary
Ontology – Definition (II)

An ontology is a formal, explicit specification of a shared conceptualization. (T. Gruber, 1993; Studer et al., 1998)
- Conceptualization: an abstract model of how people think about things in the world, usually restricted to a particular subject area
- Explicit specification: the type of concepts used and the constraints on their use are explicitly defined
- Formal: the ontology should be machine understandable
  - different degrees of formality are possible
- Shared: an ontology captures consensual knowledge; it is not restricted to some individual but is accepted by a group
Ontology for Semantic Web

Web Ontology
- A program that wants to compare or combine information across two databases has to know that two terms are being used to mean the same thing.
- Ontologies can play a crucial role in enabling Web-based knowledge processing, sharing, and reuse between applications.
- The most typical kind of ontology for the Web has a taxonomy and a set of inference rules.
Taxonomy and Inference Rules

Taxonomy
- Defines classes of objects and relations among them
- Classes, subclasses and relations among entities are a very powerful tool for Web use
Inference rules
- A program can deduce new instances.
- The computer doesn’t truly understand any of this information, but it can now manipulate the terms much more effectively in ways that are useful and meaningful to the human user.
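A minimal Python sketch of how a taxonomy plus one inference rule lets a program deduce new instances. The class names and individuals are hypothetical; the single rule applied is "an instance of a class is also an instance of every superclass".

```python
# Hypothetical taxonomy: each class maps to its direct superclass.
SUBCLASS_OF = {"Professor": "Faculty", "Faculty": "Person", "Student": "Person"}
# Hypothetical asserted facts: each individual maps to its direct class.
INSTANCE_OF = {"kim": "Professor", "lee": "Student"}

def superclasses(cls):
    """Walk up the taxonomy and list every ancestor of `cls`."""
    result = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result

def instances_of(cls):
    """Asserted instances of `cls` plus those deduced through subclasses."""
    return sorted(
        individual
        for individual, direct in INSTANCE_OF.items()
        if direct == cls or cls in superclasses(direct)
    )

people = instances_of("Person")    # nobody was asserted to be a Person directly
faculty = instances_of("Faculty")
```

The program manipulates only the terms and their links; no understanding of what a "Person" is enters the computation.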
Libraries of Ontologies
Ontology Example – Human Resource
Ontology Languages

“Ontology Languages for the Semantic Web” paper
- IEEE Intelligent Systems, Jan/Feb 2002, pp 54-60
- Analyzes the most representative ontology languages created for the Web, and compares them using a common framework
  - XOL (XML-based Ontology Exchange Language)
  - SHOE (Simple HTML Ontology Extension)
  - OML (Ontology Markup Language)
  - RDF(S) (Resource Description Framework (Schema))
  - OIL (Ontology Interchange Language)
  - DAML+OIL (DARPA Agent Markup Language + OIL)
DAML+OIL

DARPA Agent Markup Language + OIL (Ontology Interchange Language) (www.w3.org/TR/daml+oil-reference)
- A semantic markup language for Web resources
- Builds on earlier W3C standards such as RDF and RDF Schema, and extends these languages with richer modeling primitives
- Provides modeling primitives commonly found in frame-based languages
- A DAML+OIL ontology consists of headers, class elements, property elements, and instances
- OWL is a revision of DAML+OIL
Defining Classes

In order to describe objects, it is useful to define some basic types. This is done by giving a name for a class, which is the subset of the universe which contains all objects of that type.
subClassOf

The subClassOf element asserts that its subject is a subclass of its object

Multiple superclasses
Defining Properties
Object property
Datatype property
Defining Property Restrictions

Restriction defines an anonymous class, namely the class of all
things that satisfy the restriction.
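One way to picture a restriction as an anonymous class is as a membership test with no name of its own. The Python sketch below is a simplification under stated assumptions: the facts and property names are hypothetical, and it counts only asserted facts (a closed-world shortcut), whereas a DAML+OIL reasoner works under open-world semantics.

```python
# Hypothetical facts as (subject, property, value) triples.
facts = {
    ("tom", "hasChild", "ann"),
    ("tom", "hasChild", "bob"),
    ("sue", "hasChild", "joe"),
    ("ann", "hasPet", "rex"),
}

def restriction(on_property, min_cardinality):
    """Return the membership test of an anonymous class: every individual
    with at least `min_cardinality` asserted values for `on_property`."""
    def satisfies(individual):
        values = {o for (s, p, o) in facts if s == individual and p == on_property}
        return len(values) >= min_cardinality
    return satisfies

# The class has no name; it is defined only by the restriction it satisfies.
has_two_children = restriction("hasChild", 2)
members = sorted({s for (s, _, _) in facts if has_two_children(s)})
```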
Notations for Properties
UniqueProperty
inverseOf
TransitiveProperty
samePropertyAs
Notations for Classes
complementOf
disjointUnionOf
intersectionOf
sameClassAs
Defining Individuals
Description Logic

Description Logic
- Also known as terminological logics
- Evolved from semantic networks
- Express and reason with complex definitions of, and relations among, objects and classes
- Designed to focus on categories and their definitions
Inference Tasks
- Subsumption: checking if one category is a subset of another based on their definitions
- Classification: checking if an object belongs to a category based on its description
Description logic and DAML+OIL
- intersectionOf, unionOf, complementOf, cardinality, …
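The two inference tasks can be sketched with a toy model in Python in which a category is defined purely as a conjunction of required features. This is an assumption made for illustration (the category names and features are hypothetical); real description logics support much richer constructors such as unions, complements and cardinality.

```python
# Hypothetical definitions: a category is the set of features it requires.
DEFINITIONS = {
    "Person": frozenset({"human"}),
    "Parent": frozenset({"human", "has_child"}),
    "Mother": frozenset({"human", "has_child", "female"}),
}

def subsumes(general, specific):
    """Subsumption: `general` subsumes `specific` when everything that
    `general` requires is also required by `specific`."""
    return DEFINITIONS[general] <= DEFINITIONS[specific]

def classify(features):
    """Classification: every category whose requirements the object meets."""
    return sorted(
        name for name, required in DEFINITIONS.items() if required <= features
    )

parent_covers_mother = subsumes("Parent", "Mother")
mother_covers_parent = subsumes("Mother", "Parent")
categories = classify({"human", "has_child"})
```

Both answers come from the definitions alone, without anyone asserting that Mother is a kind of Parent.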
Comparison Criteria of Ontology Languages

Concepts
- concepts, attributes, facets
Taxonomies
- subclass, disjoint decomposition, exhaustive decomposition, NOT subclass of
Relations and functions
- n-ary relations or functions, type constraints, integrity constraints, operational definitions
Axioms
- first-order logic, second-order logic, independent axioms, embedded axioms
Instances
- instances of concepts, facts, claims
Ontology Tools

COHSE: Conceptual Open Hypermedia Service Environment
KAON: RDF-based ontology tools
OntoEdit: development environment for designing and applying ontology models
Protégé-2000: built-in ontology editor (Text/RDF/JDBC)
SiLRI: inference engine written in Java
Cerebra
OILed
Ontoprise
Flogic
SymOntos
Semantic Web Services
Web Service
Semantic Web Service
DAML-S
Web Service - Definition
“Web services are a new breed of Web application.
They are self-contained, self-describing, modular
applications that can be published, located, and
invoked across the Web. Web services perform
functions, which can be anything from simple
requests to complicated business processes. …
Once a Web service is deployed, other applications
(and other Web services) can discover and invoke
the deployed service.”
IBM web service tutorial
Web Service - Objectives
So far the Web has provided
- Application-to-human interaction by way of a browser
- Browsing of linked documents
- User-initiated purchases and transactions
- File download

Web services provide infrastructure for business on the Web
- Transactions initiated automatically by a program
- Can be described, published, discovered, and invoked dynamically in a distributed computing environment
- Support intelligent agents, marketplaces, auctions …
- Built on XML and other internet standards

Content-based web complemented by a services-based web
Traditional Web vs. Web Service
Service-Oriented Architecture
Web Service Standards

Messaging protocol
- SOAP (Simple Object Access Protocol)

Service description language
- WSDL (Web Services Description Language)

Service registry
- UDDI (Universal Description, Discovery and Integration)
Semantic Web Service

Web service description for the Semantic Web
- WSDL provides a communication level description of the messages and protocols used by a Web service
- Need to develop semantic markup at the application level above WSDL
- Describe what is being sent across the wires and why, not just how it is being sent
DAML-S
- DAML-based Web service ontology
- Describing the properties and capabilities of Web services in unambiguous, computer-interpretable form
Web, Semantic Web, Web Service
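[Figure: two-by-two diagram with axes Static/Dynamic and Less Semantic/Semantic]
- Static, less semantic: WWW (URI, HTML, HTTP)
- Dynamic, less semantic: Web Services (SOAP, WSDL, UDDI)
- Static, semantic: Semantic Web (RDF, DAML+OIL, OWL)
- Dynamic, semantic: Semantic Web Service (DAML-S)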
Automation Enabled by DAML-S

Web service discovery
- Find me a shipping service that transports goods to Dubai
Web service invocation
- Buy me 500 lbs. powdered milk from www.acmemoo.com
Web service selection & composition
- Arrange food for 500 people for 2 weeks in Dubai
Web service execution monitoring
- Has the powdered milk been ordered and paid for yet?
Upper Ontology of Services

Three essential types of knowledge about a service

Abstract representation
- Service Profile
  - What does the service do?
  - It gives the types of information needed by a service-seeking agent to determine whether the service meets its needs
- Service Model
  - How does the service work?
  - It describes what happens when the service is carried out
Physical representation
- Service Grounding
  - How is it accessed?
  - It specifies the details of how an agent can access a service
  - Typically a grounding will specify a communication protocol, message formats, and other service-specific details
Applications
SHOE
ITTalks
Semantic Search
Annotations
SHOE

Simple HTML Ontology Extension (www.cs.umd.edu/projects/plus/SHOE/)
- Created as an extension of HTML, incorporating machine-readable semantic knowledge in HTML documents or other Web documents
- Makes it possible for agents to gather meaningful information about Web pages and documents, improving search mechanisms and knowledge gathering
- This process consists of three phases:
  - define an ontology
  - annotate HTML pages with ontological information
  - have an agent semantically retrieve information
Example: SHOE Ontology

CS-DEPT-Ontology
Relationships
Categories
SHOE Example
SHOE Example (cont.)
[Figure labels: SHOE metadata; HTML data]
SHOE Applications

Semantic Search
- SHOE search engine
The Knowledge Annotator
- A Java program which allows you to annotate your web pages with SHOE graphically
Exposé
- A web robot which searches out web pages with SHOE entries and gathers the associated knowledge
PIQ (Parka Interface for Queries)
- A Java tool that allows you to visually query the SHOE information that has been discovered by Exposé
ITTalks

ITTalks (umbc.ittalks.org)
- A database driven web site of IT related talks at UMBC and other institutions.
- The database contains information on
  - Seminar events
  - People (speakers, hosts, users, …)
  - Places (rooms, institutions, …)
- The database is used to dynamically generate web pages and DAML descriptions for the talks and related information
- Serves as a focal point for agent-based services relating to these talks
ITTalks – Main Interface
ITTalks – Seminar Announcement
ITTalks – DAML+OIL Representation
ITTalks - Ontologies

Defines and uses the following ontologies (http://daml.umbc.edu/ontologies)
- calendar-ont.daml – calendar and schedule info
- classification.daml – ACM CCS topics
- person-ont.daml – people and their attributes
- place-ont.daml – talk locations
- profile-ont.daml – user modeling info
- talk-ont.daml – talks info
- topic-ont.daml – topics and interests
ITTalks - Topics Ontology
ITTalks - Two Capabilities

Two advanced capabilities facilitated by DAML
- Classifying talk topics and user interests using DAML ontologies
- Using DAML as a communication language among software agents
ITTalks Agents
[Figure: ITTalks agent architecture. Components shown: ITTALKS app, ITTALKS agent, User agent, Travel agent, Calendar agent, Broker Agent, Agent Name Server, DAML reasoning engine (DAML reasoner), mapquest, user’s daml profile, user’s calendar app. Agents communicate over a common agent infrastructure using the KQML communication protocol and an API; the interactions are numbered 1 to 18.]
Semantic Search

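TAP-based semantic search systems (Stanford Univ.)
- ABS (Activity Based Search)
  - Displays semantic data search results alongside keyword search engine results
  - Searches for information on musicians, athletes, movie actors, places, products, etc.
- W3C Semantic Search
  - Retrieves only content that is semantically related to the query
  - Searches the RDF files stored on the W3C server
DAML-based ASCS system (Teknowledge)
- SSA (Semantic Search Agent)
  - Information retrieval agent based on shared ontologies
- STS (Semantic Translation Service)
  - Communication between agents with different ontologies; ontology translation
  - Uses Prolog assertions and Prolog queries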
Activity Based Search (1)
Activity Based Search (2)
DAML ASCS: Architecture
DAML ASCS: Ontologies


Suggested Upper Merged Ontology (SUMO), from the IEEE Standard Upper Ontology effort
Domain specific ontologies
- Quality of Service ontology, covering computer systems and networks
- Ecommerce Services ontology
- Ontology of biological viruses
- Financial ontology
- Ontology of terrain features
- Ontologies of weapons of mass destruction and terrorism
- An ontology of Government
- Periodic table of the elements
DAML Semantic Search Query
DAML Semantic Search Results
Broadening Search
samePropertyAs, sameClassAs, (sameIndividualAs)
inverseOf
- Search for (?X childOf ?Y) should also return results when content is coded as (?Y parentOf ?X)
subPropertyOf, subClassOf
- Generalization and specialization
Search is ordered so that exact matches are returned first and broadening happens next
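The inverseOf case and the result ordering can be sketched in Python as follows. The facts and the INVERSE_OF table are hypothetical; the sketch covers only inverse properties, not the same-as or subclass forms of broadening.

```python
# Hypothetical content: one fact coded with parentOf, one with childOf.
FACTS = {("bob", "parentOf", "ann"), ("cat", "childOf", "dan")}
# Hypothetical inverseOf declarations taken from an ontology.
INVERSE_OF = {"childOf": "parentOf", "parentOf": "childOf"}

def search(predicate):
    """Return (X, Y) pairs for (?X predicate ?Y): exact matches first,
    then matches obtained by reading the inverse property backwards."""
    exact = sorted((s, o) for (s, p, o) in FACTS if p == predicate)
    inverse = INVERSE_OF.get(predicate)
    broadened = sorted((o, s) for (s, p, o) in FACTS if p == inverse)
    return exact + [pair for pair in broadened if pair not in exact]

results = search("childOf")
```

Here the pair found through parentOf is placed after the exact childOf match, which mirrors the ordering described above.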
Web Annotations

Web Annotations
- Annotations can be viewed as statements made by an author about a Web document.
- Annotations are external to the documents and can be stored in one or more annotation servers.
- Annotations are shared in that everyone having access to an annotation server should be able to consult the annotations associated with a given document and add their own annotations.
- Tools: Annotea, SHOE, Ont-O-Mat
Annotea Project

Annotea (www.w3.org/2001/Annotea/)
- Uses an RDF based annotation schema for describing annotations as metadata
- Uses XPointer for locating the annotations in the annotated document
Annotation server
- The annotations are stored in servers as metadata
Annotation client (editor/browser)
- capable of understanding the annotation metadata
- capable of interacting with an annotation server using the HTTP service protocol
Architecture of Annotea
Amaya Editor/Browser
Amaya is a client implementation of Annotea
Trends and Future
Logic, Proof, Trust
Activities
Future
Moving to the Future of the Web
Logic

Semantic Web Logic
- Parts of the Semantic Web that haven’t been developed yet
- State any logical principle and permit the computer to reason (by inference) using these principles
- Example: Let’s say one company decides that if someone sells more than 100 of our products, then they are a member of the Super Salesman club.
Proof
Once we begin to build systems that follow logic, it makes sense to use them to prove things
Example:
- Corporate sales records show that Jane has sold 55 widgets and 66 sprockets. The inventory system states that widgets and sprockets are both different company products. The built-in math rules state that 55 + 66 = 121 and that 121 is more than 100. And, as we know, someone who sells more than 100 products is a member of the Super Salesman club. The computer puts all these logical rules together into a proof that Jane is a Super Salesman.
Proofs are difficult to create, but very easy to check
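The Jane example can be turned into a small Python sketch that builds a proof as a list of steps and then checks it independently. The step format and the function names are assumptions made for illustration, not part of any Semantic Web standard; the numbers come from the example above.

```python
# Facts from the example: sales records and the product inventory.
sales = {"Jane": {"widgets": 55, "sprockets": 66}}
company_products = {"widgets", "sprockets"}
THRESHOLD = 100  # the rule: more than 100 products sold

def prove_super_salesman(person):
    """Build a proof as a list of steps, or return None if the rule fails."""
    steps, total = [], 0
    for product, count in sorted(sales[person].items()):
        if product in company_products:
            steps.append(("sold", person, product, count))
            total += count
    steps.append(("sum", total))
    if total > THRESHOLD:
        steps.append(("member", person, "Super Salesman"))
        return steps
    return None

def check_proof(steps):
    """Verify every step against the facts without searching for a proof."""
    sold = [s for s in steps if s[0] == "sold"]
    conclusion = steps[-1]
    facts_ok = all(
        sales.get(person, {}).get(product) == count and product in company_products
        for (_, person, product, count) in sold
    )
    total = sum(count for (_, _, _, count) in sold)
    return (
        facts_ok
        and ("sum", total) in steps
        and total > THRESHOLD
        and conclusion[0] == "member"
        and all(person == conclusion[1] for (_, person, _, _) in sold)
    )

proof = prove_super_salesman("Jane")
valid = check_proof(proof)
```

In this toy case building and checking cost about the same; the asymmetry shows up when finding the proof requires a search over many rules, while the checker still only walks the listed steps.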
Trust

Problems
- Anyone can say anything about anything
- Issue of whether we can trust RDF statements
Digital signatures
- Provide proof that a certain person wrote a document
Web of trust
- RDF trust is based on the way human trust works
- Picking a bunch of friends (who we mostly trust), and then their friends (who we trust a little less), and then their friends (who we trust even less), and so on.
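The friends-of-friends idea can be sketched in Python as a breadth-first walk in which trust shrinks at every step. The friendship data and the DECAY factor of 0.5 are hypothetical; the slides do not specify how much trust is lost per hop.

```python
from collections import deque

# Hypothetical "who trusts whom" links.
TRUSTS = {"me": ["ann", "bob"], "ann": ["cat"], "cat": ["dan"], "bob": []}
DECAY = 0.5  # assumed loss of trust per step away from the starting person

def trust_levels(start):
    """Breadth-first walk: each person gets the trust level of the
    shortest chain of friends that reaches them."""
    levels = {start: 1.0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for friend in TRUSTS.get(person, []):
            if friend not in levels:
                levels[friend] = levels[person] * DECAY
                queue.append(friend)
    return levels

levels = trust_levels("me")
```

A statement signed by "dan" would then carry less weight than one signed by "ann".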
W3C Semantic Web Activity

RDF Interest Group
- A forum for W3C Members and non-Members to discuss innovative applications of RDF
RDF Core Working Group
- Chartered to consider updates to the RDF Model and Syntax Recommendation, and a few revisions to the RDF Schema specification
Web Ontology Working Group
- Chartered to build upon the RDF Core work a language for defining structured web based ontologies which will provide richer integration and interoperability of data among descriptive communities
Semantic Web Trends

DAML+OIL is already the most used ontology language ever!!
- 3.5M statements on 25,000 web pages
Gaining acceptance by web players
- Semantic Web Track being offered at WWW 2002
- 3x more people attended WWW2002 Developer Day on SW than attended KR
Significant (international) government support
- US DARPA/NSF; EU IST Framework 5,6
- Japan, Germany, Australia considering significant investments
- US National Cancer Institute to publish cancer vocabulary in DAML+OIL
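Semantic Web Research Directions in Korea

Semantic Web researchers in Korea
- AI field
- E-Business and database fields
- Numerous application fields: medicine, bioinformatics, ..
Growing interest from industry, government, and research institutes
Standardization efforts
- W3C Korea branch
- Standardization of Korean terminology for the Semantic Web
- Semantic Web related mailing lists
  - [email protected]
  - [email protected]
  - [email protected]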
Conclusions


It is no longer a question of whether the semantic web will come into being; it is already here!
We’re already well past the starting gate
- Web ontologies, term languages, “shims” to DB and services, research in proofs/rules/trust
- Standardization providing a common denominator for KR researchers as well as web developers
- Small companies starting to form, big companies starting to move
The current environment is open, encouraging, moving fast, and exciting
People
- Tim Berners-Lee
- James Hendler
- Eric Miller
- Deborah McGuinness
- Ora Lassila
- Dan Brickley
- Jeff Heflin
- Ian Horrocks
- Frank van Harmelen
- Jeremy Carroll
- and many more