The Future of the Web
The Semantic Web
Deborah McGuinness
Associate Director and Senior Research Scientist
Knowledge Systems Laboratory
Stanford University
Stanford, CA USA
http://www.ksl.stanford.edu/people/dlm
Today: Rich Information Source for Human Manipulation/Interpretation
[Diagram: human to human; “I know what was input”]
• Global documents and terms indexed and available for search
• Search engine interfaces
• Entire documents retrieved according to relevance (instead of
answers)
• Human input, review, assimilation, integration, action, etc.
• Special purpose interfaces required for user friendly applications
The web knows what was input but does little interpretation,
manipulation, integration, and action
Information Discovery… but not much more
• Human intensive (requiring input reformulation
and interpretation)
• Display intensive (requiring filtering)
• Not interoperable
• Not agent-operational
• Not adaptive
• Limited context
• Limited service
Analogous to a new assistant who is thorough yet
lacks common sense, context, and adaptability
Future: Rich Information Source for Agent Manipulation/Interpretation
[Diagram: human, agent, agent; “I know what was meant”]
• Understand term meaning and user background
• Interoperable (can translate between applications)
• Programmable (thus agent operational)
• Explainable (thus maintains context and can adapt)
• Capable of filtering (thus limiting display and human intervention requirements)
• Capable of executing services
Semantic Markup
Languages such as DAML+OIL (www.daml.org)
• Encoding background info
• User modeling info
• Annotating web pages
• Annotating services
[Diagram: ontologies and DAML-enabled web pages]
(thereby limiting needs for human disambiguation input, human interpretation, multiple answer display, translation assistance, agent assistance, adaptivity support, etc.)
The Semantic Web Enables…
• New models of intelligent services
– E-commerce solutions
– M-commerce
– Web assistants
– …
New forms of web assistants/agents that act on a human’s behalf requiring less from humans and their communication devices…
Under the covers
Meaning needs to be encoded, understood, and
reasoned with.
-- Ontologies capture meanings of terms and their
interrelationships
What is an Ontology?
[Figure: spectrum of ontology expressiveness, from simplest to most expressive: Catalog/ID; Terms/glossary; Thesauri (“narrower term” relation); Informal is-a; Formal is-a; Formal instance; Frames (properties); Value Restrs.; Disjointness, Inverse, part-of…; General Logical constraints]
Ontologies and importance to E-Commerce
Simple ontologies (taxonomies) provide:
• Controlled shared vocabulary (search engines,
authors, users, databases, programs/agents all speak
same language)
• Site Organization and Navigation Support
• Expectation setting (left side of many web pages)
• “Umbrella” Upper Level Structures (for extension)
• Browsing support (tagged structures such as Yahoo!)
• Search support (query expansion approaches such as
FindUR, e-Cyc)
• Sense disambiguation
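The search-support point above can be illustrated with a minimal sketch of taxonomy-based query expansion. The taxonomy, its terms, and the function name `expand_query` are hypothetical and chosen only for illustration; this is not how FindUR or e-Cyc is implemented.

```python
# Hypothetical toy taxonomy: each term maps to its narrower terms.
TAXONOMY = {
    "computer": ["laptop", "desktop"],
    "laptop": ["ultrabook"],
    "desktop": [],
    "ultrabook": [],
}


def expand_query(term, taxonomy):
    """Return the term followed by all of its narrower terms, without duplicates."""
    expanded = []
    stack = [term]
    while stack:
        current = stack.pop()
        if current in expanded:
            continue  # already visited; also guards against cycles
        expanded.append(current)
        # Push narrower terms in reverse so they are visited in listed order.
        stack.extend(reversed(taxonomy.get(current, [])))
    return expanded


print(expand_query("computer", TAXONOMY))
```

A search for "computer" would then also match pages that mention only "laptop" or "ultrabook", because the shared vocabulary records that those are narrower terms.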
Ontologies and importance to E-Commerce II
• Consistency Checking
• Completion
• Interoperability Support
• Support for validation and verification testing (e.g.
http://ksl.stanford.edu/projects/DAML/chimaera-jtpcardinality-test1.daml )
• Configuration support
• Structured, “surgical” comparative customized
search
• Generalization/ Specialization
• … Foundation for expansion and leverage
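As a rough illustration of the consistency-checking point above, the sketch below flags an individual asserted to belong to two classes that the ontology declares disjoint. The class names, the `SUBCLASS_OF` and `DISJOINT` tables, and both function names are assumptions made for this example; real reasoners handle far richer constraints.

```python
# Hypothetical ontology fragment: child class -> parent class.
SUBCLASS_OF = {"RedWine": "Wine", "WhiteWine": "Wine", "Wine": "Drink"}
# Hypothetical disjointness axioms: no individual may belong to both classes.
DISJOINT = [("RedWine", "WhiteWine")]


def ancestors(cls, subclass_of):
    """Return cls together with all of its superclasses."""
    result = {cls}
    while cls in subclass_of and subclass_of[cls] not in result:
        cls = subclass_of[cls]
        result.add(cls)
    return result


def find_conflicts(asserted_types, subclass_of, disjoint):
    """Return the disjoint pairs violated by an individual's asserted types."""
    all_types = set()
    for cls in asserted_types:
        all_types |= ancestors(cls, subclass_of)
    return [(a, b) for a, b in disjoint if a in all_types and b in all_types]


print(find_conflicts(["RedWine", "WhiteWine"], SUBCLASS_OF, DISJOINT))
print(find_conflicts(["RedWine"], SUBCLASS_OF, DISJOINT))
```

The first call reports the violated pair; the second returns an empty list because a single class membership is consistent.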
A Few Observations about Ontologies
– Simple ontologies can be built by non-experts
• Verity’s Topic Editor, Collaborative Topic Builder, GFP, Chimaera, Protégé,
OIL-ED, etc.
– Ontologies can be semi-automatically generated
• from crawls of sites such as Yahoo!, Amazon, Excite, etc.
• Semi-structured sites can provide starting points
– Ontologies are exploding (business pull instead of technology push)
• most e-commerce sites are using them - MySimon, Amazon, Yahoo! Shopping,
VerticalNet, etc.
• Controlled vocabularies (for the web) abound - SIC codes, UMLS, UN/SPSC,
Open Directory (DMOZ), Rosetta Net, SUO
• Business interest expanding: ontology directors; business ontologies are
becoming more complicated (roles, value restrictions, …); VC firms interested
• DTDs are making more ontology information available
• Markup languages growing: XML, RDF, DAML, RuleML, xxML
• “Real” ontologies are becoming more central to applications
Implications and Needs
• Ontology Language Syntax and Semantics
(DAML+OIL)
• Environments for Creation and Maintenance
of Ontologies
• Training (Conceptual Modeling, reasoning
implications, …)
Issues
– Collaboration among distributed teams
– Interconnectivity with many systems/standards
– Analysis and diagnosis
– Scale
– Versioning
– Security
– Ease of use
– Diverse training levels /user support
– Presentation style
– Lifecycle
– Extensibility
Chimaera – An Ontology Environment Tool
An interactive web-based tool aimed at supporting:
• Ontology analysis (correctness, completeness, style, …)
• Merging of ontological terms from varied sources
• Maintaining ontologies over time
• Validation of input
• Features: multiple I/O languages, loading and merging into multiple
namespaces, collaborative distributed environment support, integrated
browsing/editing environment, extensible diagnostic rule language
• Used in commercial and academic environments
• Available as a hosted service from www-ksl-svc.stanford.edu
• Information: www.ksl.stanford.edu/software/chimaera
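To give a feel for what merging terms from varied sources involves, here is a minimal sketch that proposes merge candidates by comparing normalized term names. This is an illustrative assumption only, not Chimaera's actual method; the function names and sample terms are hypothetical.

```python
def normalize(term):
    """Lower-case a term and treat hyphens and underscores as spaces."""
    return term.lower().replace("-", " ").replace("_", " ").strip()


def merge_candidates(terms_a, terms_b):
    """Pair terms from two ontologies whose normalized names coincide."""
    index = {normalize(term): term for term in terms_a}
    return [
        (index[normalize(term)], term)
        for term in terms_b
        if normalize(term) in index
    ]


print(merge_candidates(["Red-Wine", "Cheese"], ["red wine", "Bread"]))
```

Name matching only suggests candidates; a person (or a richer diagnostic) still has to confirm that two similarly named terms really mean the same thing.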
XML
• World Wide Web Consortium (W3C) standard
• Provides important solution to syntax problem and
simple semantics and schemas:
<SSN>444-23-2656</SSN>
• Now we can describe the meaning of words
• Many applications of XML appearing:
– Geographic Markup Language (GML)
– Extensible rights Markup Language (XrML)
– Chemical Markup Language (CML)
Problem: Limited semantics and ontology
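The limited-semantics problem can be sketched as follows, using Python's standard `xml.etree.ElementTree`. The second tag name `SocialSecurityNumber` and the `SYNONYMS` table are hypothetical: they stand in for the hand-written mapping an application needs when two XML vocabularies name the same concept differently.

```python
import xml.etree.ElementTree as ET

# Two documents that mean the same thing but use different tag names.
doc_a = ET.fromstring("<SSN>444-23-2656</SSN>")
doc_b = ET.fromstring("<SocialSecurityNumber>444-23-2656</SocialSecurityNumber>")

# XML alone only lets us compare the tags as strings.
same_tag = doc_a.tag == doc_b.tag

# Hypothetical hand-written mapping; XML itself carries no such knowledge.
SYNONYMS = {"SSN": "ssn", "SocialSecurityNumber": "ssn"}


def canonical(element):
    """Map an element's tag to a shared concept name, if one is known."""
    return SYNONYMS.get(element.tag, element.tag)


same_meaning = canonical(doc_a) == canonical(doc_b) and doc_a.text == doc_b.text

print(same_tag, same_meaning)
```

The parser handles the syntax of both documents, but the knowledge that the two tags denote one concept has to be supplied from outside, which is the gap an ontology language is meant to fill.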
DARPA Agent Markup Language
• Builds on top of XML and RDF
• Provides rich ontology representation
• Key starting point for W3C Semantic Web activity
• Future releases will provide logic and rules capabilities
Problem: Tools to help create DAML ontologies, markup,
and to facilitate access are still emerging
EXAMPLES
HTML
<html>
<head> <TITLE>Fred Jones</TITLE> </head>
<body> <H1>Information About Fred Jones</H1>
<P>Fred Jones is in the U.S. Air Force. He is a Captain stationed at AFRL. </P>
</body>
</html>
XML
<person>
<name>Fred Jones</name>
<employer>U.S. Air Force</employer>
<station>AFRL</station>
<rank>Captain</rank>
</person>
DAML
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:daml="http://www.daml.org/2001/03/daml+oil#"
xmlns:dod="http://www.dod.mil/personnel#"
xmlns:af="http://www.af.mil/personnel#"
xmlns:afrl="http://www.rl.af.mil/personnel#">
<dod:Officer rdf:ID="fsmith">
<dod:givenName>Fred</dod:givenName>
<dod:surname>Smith</dod:surname>
<dod:service rdf:resource="http://www.dod.mil/services#AirForce"/>
<af:rank rdf:resource="http://www.af.mil/personnel#Captain"/>
<af:station rdf:resource="http://www.af.mil/stations#AFRL_Rome"/>
<daml:equivalentTo rdf:resource="ssn:123-45-6789"/>
</dod:Officer>
</rdf:RDF>
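One way to see what the DAML markup adds is to read it as subject/predicate/object statements whose names are full URIs. The sketch below does this for a shortened copy of the example using only Python's standard library. It is a simplified reading under stated assumptions, not a complete RDF/XML parser, and the function name `triples` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Shortened copy of the DAML example above.
DAML_DOC = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dod="http://www.dod.mil/personnel#"
  xmlns:af="http://www.af.mil/personnel#">
  <dod:Officer rdf:ID="fsmith">
    <dod:surname>Smith</dod:surname>
    <af:rank rdf:resource="http://www.af.mil/personnel#Captain"/>
  </dod:Officer>
</rdf:RDF>"""


def full_uri(tag):
    """Turn ElementTree's '{namespace}local' tag form into 'namespacelocal'."""
    return tag[1:].replace("}", "")


def triples(xml_text):
    """Read typed nodes and their direct properties as (s, p, o) tuples."""
    statements = []
    for node in ET.fromstring(xml_text):
        subject = "#" + node.get(f"{{{RDF}}}ID", "")
        statements.append((subject, RDF + "type", full_uri(node.tag)))
        for prop in node:
            # A property value is either a linked resource or literal text.
            value = prop.get(f"{{{RDF}}}resource") or (prop.text or "").strip()
            statements.append((subject, full_uri(prop.tag), value))
    return statements


for statement in triples(DAML_DOC):
    print(statement)
```

Because every class and property resolves to a URI, an agent can look up the ontology behind `rank` or `Officer`, which the plain `<person>` XML record does not allow.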
DAML Status
• DAML ontology language specification released and in use
• DAML services language specification draft released
• http://www.daml.org provides public Web site with DAML
information
• Research and corporate teams are developing DAML tools
• Supported by W3C in the Semantic Web Activity
• Endorsed by companies and interest growing
[Figure: web technology layers over time (from Berners-Lee, Hendler; Nature, 4/01)]
• 1990, Foundation of the Current Web: HyperText Markup Language, HyperText Transfer Protocol
• 2000, Self-Describing Documents: eXtensible Markup Language, Resource Description Framework
• 2010?, Trustworthy Web Resources: Proof, Logic and Ontology Languages; shared terms/terminology; machine-machine communication
Discussion/Conclusion
• Ontologies are exploding; core of many applications
• Business “pull” is driving ontology languages and tools
• New generation applications need more expressive ontologies
and more back end reasoning
• New generation users (the general public) need more support
than previous users of KR&R systems
• Scale and distribution of the web force mind shift
• Markup languages will revolutionize web applications
• Agents can be human proxies enabling new applications and
modes of interaction
Some Pointers
• Ontologies Come of Age Paper:
http://www.ksl.stanford.edu/people/dlm/papers/
ontologies-come-of-age-abstract.html
• Ontologies and Online Commerce Paper:
http://www.ksl.stanford.edu/people/dlm/papers/
ontologies-and-online-commerce-abstract.html
• DAML+OIL: http://www.daml.org/
Extras
What Is An Agent?
• Software module
• Intended to act as a proxy for you in some way
• May be:
– Tightly controlled
– Autonomous
– Mobile
Why Is This Important?
• Humans work sequentially
• Agents work in parallel and 24x7
• Therefore, agents can be a major
productivity multiplier
Web Trends
• Web is evolving from a provider of documents and images
(information retrieval) to a provider of services
• Web service discovery -
Find me an airline service that offers flights to Singapore
• Web service execution -
Buy me “Harry Potter and the Sorcerer’s Stone” at
www.amazon.com
• Web service selection, composition and interoperation -
Make my travel arrangements for my Internet World
conference trip
• Both retrieval and services lend themselves to agent
technologies
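The discovery step above ("find me an airline service that offers flights to Singapore") can be sketched as a lookup over machine-readable service descriptions. The `Service` record, the registry entries, and the `discover` function are all hypothetical; real semantic service descriptions would be far richer than a category and a destination set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """Hypothetical machine-readable description of a web service."""
    name: str
    category: str
    destinations: frozenset


# Made-up registry entries for illustration only.
REGISTRY = [
    Service("AirExample", "airline", frozenset({"Singapore", "Tokyo"})),
    Service("RailExample", "rail", frozenset({"Paris"})),
    Service("FlyDemo", "airline", frozenset({"London"})),
]


def discover(registry, category, destination):
    """Return names of services in a category that serve a destination."""
    return [
        service.name
        for service in registry
        if service.category == category and destination in service.destinations
    ]


print(discover(REGISTRY, "airline", "Singapore"))
```

An agent can only answer such a request if services publish descriptions in shared terms, which is where annotated services and ontologies come in.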
Problems
• Average Web searches examine only 25% of
available information
• Web searches return a lot of unwanted
information
• Information content of the Web doubles
approximately every six months
• Problem continues to worsen as Web grows
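A quick worked example of the doubling claim: if content doubles every six months, it grows by a factor of 2^(months/6). The sketch below also shows how coverage would shrink under the added assumption (not stated in the slides) that the amount of information a search examines stays fixed while the web grows.

```python
def growth_factor(months, doubling_period_months=6):
    """Factor by which content grows if it doubles every doubling period."""
    return 2 ** (months / doubling_period_months)


def coverage_after(months, initial_coverage_percent=25.0):
    """Coverage if the examined amount stays fixed while the web grows.

    The fixed-amount premise is an assumption made for illustration.
    """
    return initial_coverage_percent / growth_factor(months)


print(growth_factor(12))   # two doublings in a year
print(coverage_after(6))   # one doubling halves the 25% coverage
```

Under these assumptions the web is four times larger after a year, and coverage falls from 25% to 12.5% after six months.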