Semantic Web
Andrejs Lesovskis
Agenda
• Syntax and semantics
• Introduction to Semantic Web
• Semantic Web layers
• Projects that use Semantic Web technologies
Syntax and semantics (1)
Syntax is a term for the study of the rules governing the way words are combined to form sentences in a language.
In computer science it refers to the ways symbols may be combined to create well-formed programs in the language.
It defines the formal relations between the constituents of a language.
Syntax and semantics (2)
Semantics is the study of the meaning of linguistic expressions. The language can be a natural language, such as English or Navajo, or an artificial language, like a computer programming language.
Natural-language semantics is important in trying to make computers better able to deal directly with human languages.
What is Semantic Web?
"The Semantic Web is not a separate Web but an extension of the current one (World Wide Web – WWW), in which information is given well-defined meaning, better enabling computers and people to work in cooperation.
... a web of data that can be processed directly and indirectly by machines." Tim Berners-Lee, James Hendler, and Ora Lassila.
Semantic Web and World Wide Web
Semantic Web and World Wide Web
Semantic Web and Beyond
[Diagram labels: Creators; Semantic Web content; Users; applications; agents; Semantic Web; Semantic Annotations; Languages]
WWW and Beyond
[Diagram labels: Creators; Ontologies; Tools; Web content; Logical Support; Applications / Services; Users]
Resource Integration
[Diagram labels: Semantic annotations; Web resources, services, databases; Shared ontology]
Resource integration
[Diagram labels: Industrial and business processes; External resources; Web resources, services, databases; Web users; Shared ontology; Multimedia resources; Mobile devices; Machines and devices; Web agents/applications]
Semantic Web and semantic network (1)
Semantic Web and semantic network (2)
Semantic Web inventor
Sir Timothy Berners-Lee is best known as the inventor of the World Wide Web.
Berners-Lee is the director of the World Wide Web Consortium (W3C), which oversees the Web's continued development.
Semantic Web layers (1)
Semantic Web layers (2)
• URI and Unicode
• XML (eXtensible Markup Language)
• RDF (Resource Description Framework)
• Ontology
• Logic
• Proof
• Trust
• User interface and applications
Semantic Web layers (3)
XML and Semantic Web Standards Timeline
Project OpenCalais (Thomson Reuters)
• Thomson Reuters launched project Calais in January 2008.
• Calais Web Service processes unstructured text (like news articles, blog postings, scientific papers, etc.) and it returns semantic metadata in RDF format.
• Uses natural language processing and machine learning techniques to examine the text and locate the entities, facts, and events.
Swoogle search engine
Project DBPedia.org (1)
DBpedia is a project that aims to extract structured content from the information created as part of the Wikipedia project ("infobox" tables).
This structured information is then made available on the World Wide Web.
The DBpedia knowledge base allows users to query relationships and properties associated with the Wikipedia resources, including links to other related datasets.
Used technologies: Scala, Java, Virtuoso Universal Server.
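Querying relationships and properties can be pictured as matching patterns against (subject, predicate, object) triples. The sketch below is a toy in-memory illustration only: the `ex:` identifiers, the `TRIPLES` list, and the `match` helper are assumptions made for this example and are not DBpedia data or the DBpedia API.

```python
# Toy triples (subject, predicate, object); illustrative values, not real DBpedia data.
TRIPLES = [
    ("ex:Riga", "ex:country", "ex:Latvia"),
    ("ex:Riga", "rdf:type", "ex:City"),
    ("ex:Latvia", "rdf:type", "ex:Country"),
]


def match(pattern, triples=TRIPLES):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        triple
        for triple in triples
        if all(p is None or p == value for p, value in zip(pattern, triple))
    ]


if __name__ == "__main__":
    # All properties of one resource.
    for triple in match(("ex:Riga", None, None)):
        print(triple)
```

A real knowledge base would typically be queried with SPARQL; the wildcard pattern here only mirrors the idea of a triple pattern.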
Project DBPedia.org (2)
Project DBPedia.org (3)
Project DBPedia.org (4)
DBPedia project results:
• Data extraction from 97 languages;
• The English version of the DBpedia knowledge base currently describes 3.77 million things, including 764,000 persons, 573,000 places, 333,000 creative works, 192,000 organizations, 202,000 species and 5,500 diseases;
• Contains more than 672 million RDF triples;
• Tests show 87% precision;
• Developed a large multi-domain ontology.
RDF Site Summary (RSS)
RSS (Really Simple Syndication) is a family of web feed formats used to publish frequently updated works, such as blog entries, news headlines, audio, and video, in a standardized format.
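As a rough sketch of what "a standardized format" means in practice, the snippet below reads item titles from a small RSS 2.0 style feed using Python's standard library. The feed text, its URLs, and the `item_titles` helper are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed; titles and links are made up for this example.
FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://www.example.org/</link>
    <description>Hypothetical feed used for illustration</description>
    <item><title>First post</title><link>http://www.example.org/1</link></item>
    <item><title>Second post</title><link>http://www.example.org/2</link></item>
  </channel>
</rss>"""


def item_titles(feed_text):
    """Return the title of every <item> inside the feed's <channel>."""
    channel = ET.fromstring(feed_text).find("channel")
    return [item.findtext("title") for item in channel.findall("item")]


if __name__ == "__main__":
    print(item_titles(FEED))
```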
Really Simple Syndication (RSS)
URI and Unicode
Unicode is a computing industry standard for the consistent encoding, representation and handling of text expressed in most of the world's writing systems.
URI (Uniform Resource Identifier)
• URL (Uniform Resource Locator):
  http://www.google.com
  mailto:[email protected]
• URN (Uniform Resource Name):
  URN of "Spider-Man" movie: urn:isan:0000-0000-9E59-0000-O-0000-0000-2
  URN of "Science of Computer Programming" magazine: urn:issn:0167-6423
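The difference between the two URI forms can be sketched with Python's `urllib.parse`. The `describe` helper is a hypothetical name, and treating every non-`urn` scheme as a URL is a simplification made for this example.

```python
from urllib.parse import urlsplit


def describe(uri):
    """Split a URI into a few parts (simplified: any non-urn scheme counts as a URL)."""
    parts = urlsplit(uri)
    if parts.scheme == "urn":
        # URN syntax is urn:<namespace>:<namespace-specific string>
        namespace, _, specific = parts.path.partition(":")
        return {"kind": "URN", "namespace": namespace, "name": specific}
    return {"kind": "URL", "scheme": parts.scheme, "host": parts.netloc}


if __name__ == "__main__":
    print(describe("http://www.google.com"))
    print(describe("urn:issn:0167-6423"))
```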
XML (1)
XML (eXtensible Markup Language) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.
It uses tags for markup:
XML (2)
Example:
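A minimal sketch of tag-based markup and how a program can read it, using Python's standard library. The document, its element and attribute names (`library`, `book`, `id`, `title`), and the `list_titles` helper are assumptions made for illustration and do not come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element and attribute names are illustrative only.
DOCUMENT = """<?xml version="1.0"?>
<library>
  <book id="b1">
    <title>First example title</title>
  </book>
  <book id="b2">
    <title>Second example title</title>
  </book>
</library>"""


def list_titles(xml_text):
    """Return (id attribute, title text) for every <book> element."""
    root = ET.fromstring(xml_text)
    return [(book.get("id"), book.findtext("title")) for book in root.iter("book")]


if __name__ == "__main__":
    for book_id, title in list_titles(DOCUMENT):
        print(book_id, title)
```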
Scalable Vector Graphics (SVG)
Scalable Vector Graphics (SVG)
Simple Object Access Protocol (SOAP)
SOAP Version 1.2 (SOAP) is a lightweight protocol intended for exchanging structured information in a decentralized, distributed environment.
It uses XML technologies to define an extensible messaging framework providing a message construct that can be exchanged over a variety of underlying protocols.
SOAP 1.2 became a W3C recommendation in 2003 (second edition in 2007).
SOAP envelope
SOAP example
POST /InStock HTTP/1.1
Host: www.example.org
Content-Type: application/soap+xml; charset=utf-8
Content-Length: 299
SOAPAction: "http://www.w3.org/2003/05/soap-envelope"
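The message body that would follow such headers is a SOAP envelope. The sketch below builds one with Python's standard library; only the envelope namespace is taken from the headers above, while the application namespace `APP_NS`, the `GetStockPrice` and `StockName` element names, and the `build_envelope` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
APP_NS = "http://www.example.org/stock"  # hypothetical application namespace


def build_envelope(stock_name):
    """Build a SOAP 1.2 envelope with an empty Header and one request in the Body."""
    ET.register_namespace("soap", SOAP_NS)
    ET.register_namespace("m", APP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # Hypothetical operation and parameter names.
    request = ET.SubElement(body, f"{{{APP_NS}}}GetStockPrice")
    ET.SubElement(request, f"{{{APP_NS}}}StockName").text = stock_name
    return ET.tostring(envelope, encoding="utf-8")


if __name__ == "__main__":
    print(build_envelope("IBM").decode("utf-8"))
```

The bytes returned here are what a client would send as the HTTP request body, with Content-Length set to their length.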
Web Services Description Language (WSDL)
Web Services Description Language is an XML-based interface description language that is used for describing the functionality offered by a web service.
A WSDL description of a web service (also referred to as a WSDL file) provides a machine-readable description of how the service can be called, what parameters it expects, and what data structures it returns.
WSDL 2.0 became a W3C recommendation in June 2007.
Web Services Description Language (WSDL)
Semantic Web service architecture
Simple Semantic Web Architecture and Protocol
Simple Semantic Web Architecture and Protocol (2)
The SSWAP architecture is based on the following five basic concepts:
• Provider – corresponds to the organizations that own and publish resources;
• Resource – arbitrary resources (for example, web pages, ontologies, or datasets), but they are primarily used to describe web services;
• Graph – concept that describes transformations performed by the service;
• Subject – input data that is given to the service;
• Object – service execution result.
Document Type Definition (DTD)
Document Type Definition (DTD) is a set of markup declarations that define a document type for an SGML-family markup language (SGML, XML, HTML). DTD is part of the XML 1.0 specification.
Example (DTD and XML):
DTD elements
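External DTD declaration: <!DOCTYPE doc_elem SYSTEM/PUBLIC dtd_addr>
Element type declaration: <!ELEMENT name content_model>
• Any content: <!ELEMENT name ANY>
• Children elements: <!ELEMENT name (child1, child2)>
• Parsed character data: <!ELEMENT name (#PCDATA)>
• Empty (has no content): <!ELEMENT name EMPTY>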
DTD quantifiers
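• a+ (one or more occurrences of a)
• a* (zero or more occurrences of a)
• a? (zero or one occurrence of a)
• a, b (a followed by b)
• a | b (either a or b)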
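The DTD quantifiers behave much like regular-expression operators, which the sketch below uses to check the order and number of child elements. The content model `(title, author+, (note | summary)?, tag*)`, the element names, and the `matches_model` helper are hypothetical; this is an illustration of the quantifier semantics, not a DTD validator.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content model: (title, author+, (note | summary)?, tag*)
# Each child tag is encoded as "name;" so regex operators line up with DTD ones.
BOOK_MODEL = re.compile(r"title;(author;)+((note;)|(summary;))?(tag;)*")


def matches_model(element):
    """Check whether the element's direct children follow BOOK_MODEL."""
    sequence = "".join(child.tag + ";" for child in element)
    return BOOK_MODEL.fullmatch(sequence) is not None


if __name__ == "__main__":
    ok = ET.fromstring("<book><title/><author/><author/><tag/></book>")
    bad = ET.fromstring("<book><title/><note/></book>")  # author+ needs at least one
    print(matches_model(ok), matches_model(bad))
```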
DTD attributes
Attribute declaration template:
<!ATTLIST element_name
attribute_name type default_value
...
attribute_name type default_value>
Example:
XML Schema
XML Schema 1.0 was approved as a W3C Recommendation in 2001 and it was the first separate schema language for XML to receive this status.
A schema is an abstract collection of metadata that includes the following components: element and attribute declarations, and complex and simple type definitions.
Schema definition example:
Reference to an XML Schema:
Simple elements don’t contain child elements or attributes:
XML Schema example
XML Schema elements
Element types
Primitive types
string,
boolean,
decimal,
float,
double,
duration,
dateTime, time, date,
gYearMonth, gYear,
gMonthDay, gDay,
gMonth,
hexBinary, base64Binary,
anyURI,
QName,
NOTATION.
Derived types
normalizedString,
token,
language,
NMTOKEN, NMTOKENS,
Name, NCName,
ID, IDREF, IDREFS,
ENTITY, ENTITIES,
integer,
nonPositiveInteger,
negativeInteger,
long, int, short, byte,
unsignedLong,
unsignedInt,
unsignedShort,
unsignedByte.
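To make the type names more concrete, the sketch below checks text values against a few of the listed types: the primitive `boolean` and four integer types derived from `integer`. The `is_valid` helper and the choice of types are assumptions for illustration; a real schema processor covers far more rules.

```python
import re

# Value ranges of a few derived integer types.
INTEGER_RANGES = {
    "byte": (-128, 127),
    "short": (-32768, 32767),
    "unsignedByte": (0, 255),
    "unsignedShort": (0, 65535),
}
_INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


def is_valid(type_name, text):
    """Return True if text is an acceptable value for the given type name."""
    if type_name == "boolean":
        return text in {"true", "false", "1", "0"}
    if _INTEGER_LEXICAL.fullmatch(text) is None:
        return False
    low, high = INTEGER_RANGES[type_name]
    return low <= int(text) <= high


if __name__ == "__main__":
    print(is_valid("unsignedByte", "255"), is_valid("unsignedByte", "256"))
```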
Element occurrence indicators
The minOccurs indicator specifies the minimum number of times an element can occur. If minOccurs is equal to 0, then the element is optional.
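The effect of an occurrence indicator can be sketched as a simple count check. The `occurrence_ok` helper and the sample `order` document are hypothetical; the `max_occurs` parameter stands for the companion maxOccurs indicator, which is an addition for this example, and both default to 1 as in XML Schema.

```python
import xml.etree.ElementTree as ET


def occurrence_ok(parent, tag, min_occurs=1, max_occurs=1):
    """Check how often tag occurs as a direct child; max_occurs=None means unbounded."""
    count = len(parent.findall(tag))
    if count < min_occurs:
        return False
    return max_occurs is None or count <= max_occurs


if __name__ == "__main__":
    order = ET.fromstring("<order><item/><item/></order>")
    # <note> is absent: fine when minOccurs is 0, an error with the default of 1.
    print(occurrence_ok(order, "note", min_occurs=0), occurrence_ok(order, "note"))
```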
XML Schema attributes
Attribute declaration template:
Example:
DTD vs XML Schema (1)
DTD pros
• It's been around longer than XML Schema;
• It is part of the XML 1.0 specification.
DTD cons
• Uses a syntax different from XML;
• Doesn’t support namespaces;
• Limited number of types;
• A DTD describes the whole document.
DTD vs XML Schema (2)
XML Schema pros
• Uses XML syntax (schemas themselves are XML documents);
• Supports more data types and allows you to define your own types;
• A schema can define portions of the document.
XML Schema cons
• Pretty much none these days.