Accessing Cultural Heritage using
Semantic Web Techniques
Antoine ISAAC
VU Amsterdam - KB
Digital Access to Cultural Heritage Master
March 20th, 2008
Accessing CH using Semantic Web techniques
Background
• CATCH (NWO)
• Continuous Access To Cultural Heritage
• Computer science research projects
• Applied to Cultural Heritage
• STITCH
• SemanTic Interoperability To access Cultural Heritage
• Exchanging and integrating metadata
Agenda
• Cultural Heritage interoperability problems
• Why Semantic Web techniques can be relevant
• Porting CH vocabularies to the Semantic Web
• Vocabulary alignment
• Demo?
The Interoperability Problem in Cultural Heritage
• Trend: simultaneous access to different collections
• The European Library, Memory of the Netherlands
• Problem: how to access different collections seamlessly?
• Traditional solution: using object metadata
• For instance subjects coming from controlled vocabularies
• But…
Interoperability Problems
From syntactic to semantic
• Different formats
• Different metadata schemas
• Different conceptual vocabularies
Interoperability Solutions?
From syntactic to semantic
• Different formats
• “We have a solution!”
• XML as a standard for data exchange
• Different metadata schemas
• “Something could be used…”
• Dublin Core for simple metadata publication & exchange
• Different conceptual vocabularies
• “Do you really want to discuss it now?”
• No standard vocabulary
• DDC, UDC, SWD, LCSH, AAT, Iconclass
• and myriads of others…
• Not even a common model: classes, terms, concepts…
• Even worse: there are reasons for this!
KB Illustrated Manuscripts
KB Illustrated Manuscripts: Iconclass
Mandragore
What we have
[Figure: two metadata schemas, MDS 1 and MDS 2, each with its own hierarchy of fields (Field 1, Field 1.1, Field 2, …)]
What we want
CH Interoperability ProblemS
[Figure: the field hierarchies of metadata schemas MDS 1 and MDS 2, each shown twice]
What is the Semantic Web?
• Pushed by the World Wide Web Consortium
http://www.w3.org/2001/sw/
• “The Semantic Web is a web of data”
• “It is about common formats for integration and combination of data drawn from diverse sources”
SW Problem: The Web for Humans
• A city
• A flag
• The city’s location
Meaning
SW Problem: The Web for Computers?
Where is meaning?
The Semantic Web Approach: A Web of (Meta)data
[Figure: graph with the nodes Article, Document, file1, paragraph3, Amsterdam, City and The_Netherlands, linked by the relations type, subClassOf, partOf, defines and hasCapital]
The Semantic Web (1/2)
• Pointing at resources
• What? Knowledge objects
• everything that we may want to refer to
• including documents, persons…
• How? Uniform Resource Identifiers
• HTTP URLs: http://www.few.vu.nl/~aisaac/
• urn:isbn:0-395-36341-1
• mailto:[email protected]
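The identifiers above differ mainly in their scheme (the part before the first colon). A minimal sketch, assuming Python's standard `urllib.parse` module is an acceptable way to inspect them; the helper name `describe` is illustrative and does not come from the slides.

```python
from urllib.parse import urlparse

def describe(uri):
    """Split a URI into its scheme and the scheme-specific remainder."""
    parts = urlparse(uri)
    remainder = parts.netloc + parts.path
    return parts.scheme, remainder

# Two of the identifiers quoted on the slide.
for uri in ["http://www.few.vu.nl/~aisaac/", "urn:isbn:0-395-36341-1"]:
    print(describe(uri))
```

Whatever the scheme, the whole string acts as one global name for a resource, which is what later statements point at.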
A Web of Resources
myVoc1:Article
http://ex.org/files/file1
myVoc2:Amsterdam
http://www.ned.nl/rep321
The Semantic Web (2/2)
• Pointing at resources: URIs
• Creating structured assertions involving resources
• What? Typed links between resources
• How? RDF (Resource Description Framework)
• Statements subject-predicate-object
Data in an RDF “Graph”
myVoc1:Article
rdf:type
http://ex.org/files/file1
myVoc1:defines
myVoc2:Amsterdam
http://www.ned.nl/rep321
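An RDF graph is a set of subject-predicate-object statements. The sketch below stores such statements as plain Python tuples, with no RDF library. The two statements are one plausible reading of the diagram, not a confirmed transcription, and the helper `objects` is hypothetical.

```python
# Assumption: file1 is the subject of both links shown in the diagram.
# The prefixes myVoc1/myVoc2/rdf are kept as written on the slide.
graph = {
    ("http://ex.org/files/file1", "rdf:type", "myVoc1:Article"),
    ("http://ex.org/files/file1", "myVoc1:defines", "myVoc2:Amsterdam"),
}

def objects(graph, subject, predicate):
    """Return every object linked to `subject` by `predicate`, sorted."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)

print(objects(graph, "http://ex.org/files/file1", "rdf:type"))
```

Because every node and link is named by a URI (or a prefixed abbreviation of one), statements from different sources can be merged by taking the union of their sets.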
Building on Top of the Web
• Web-based resources allow distribution/sharing of
• documents
• description vocabularies
• (meta)data
http://www.geo.org/voc/
(par3, defines, Amsterdam)
http://www.kb.nl/eDepot
http://www.ned.nl/rep321
different owners & locations
CH Interoperability ProblemS (reminder)
[Figure: the field hierarchies of metadata schemas MDS 1 and MDS 2, each shown twice]
What do we need then?
• Porting vocabularies to the Semantic Web
• Aligning vocabularies
SKOS (Simple Knowledge Organization System)
• Model to represent KOSs on the SW
• In a simple way
• In a standard way
• Comparable to Dublin Core, for conceptual vocabularies
• Still being elaborated by W3C
http://www.w3.org/2004/02/skos/
• SKOS offers building blocks to represent KOSs in RDF
• Objects: Concept and ConceptScheme
• Lexical properties (multilingual)
• prefLabel
• altLabel
• Semantic relations
• broader, narrower
• related
• Notes
• scopeNote
• definition
…
SKOS: Example
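(http://www.iconclass.nl/, rdf:type, skos:ConceptScheme)
(http://www.iconclass.nl/s_11F, rdf:type, skos:Concept)
(http://www.iconclass.nl/s_11F, skos:inScheme, http://www.iconclass.nl/)
(http://www.iconclass.nl/s_11F, skos:prefLabel, “the Virgin Mary”@en)
(http://www.iconclass.nl/s_11F, skos:prefLabel, “la Vierge Marie”@fr)
(http://www.iconclass.nl/s_11F, skos:broader, http://www.iconclass.nl/s_11)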
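The SKOS example can be written down as statements and queried by language. This is a minimal sketch in plain Python, assuming language-tagged labels are stored as (text, language) pairs; the helper `pref_label` is hypothetical and no SKOS or RDF library is involved.

```python
ICONCLASS = "http://www.iconclass.nl/"
CONCEPT = ICONCLASS + "s_11F"

# Statements read from the example; literals are (text, language) pairs.
graph = [
    (ICONCLASS, "rdf:type", "skos:ConceptScheme"),
    (CONCEPT, "rdf:type", "skos:Concept"),
    (CONCEPT, "skos:inScheme", ICONCLASS),
    (CONCEPT, "skos:prefLabel", ("the Virgin Mary", "en")),
    (CONCEPT, "skos:prefLabel", ("la Vierge Marie", "fr")),
    (CONCEPT, "skos:broader", ICONCLASS + "s_11"),
]

def pref_label(graph, concept, lang):
    """Return the preferred label of `concept` in `lang`, or None."""
    for s, p, o in graph:
        if s == concept and p == "skos:prefLabel" and o[1] == lang:
            return o[0]
    return None

print(pref_label(graph, CONCEPT, "fr"))
```

The concept keeps one identifier while its labels vary by language, which is what makes a multilingual vocabulary usable across collections.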
The semantic interoperability problem
• There is no standard vocabulary
• We don’t really want one: different vocabularies serve different expertise domains, traditions and tasks
• Consequence:
• “klassieke ruïnes” (classical ruins) vs. “landschap met ruïnes” (landscape with ruins)
• “maagd Maria” (Virgin Mary) vs. “Heilige Moeder” (Holy Mother)
Vocabulary alignment
• Aim: finding semantic correspondences between vocabulary elements
• “klassieke ruïnes” ≈ “landschap met ruïnes”
• “maagd Maria” = “Heilige Moeder”
• Doing it (semi-)automatically
• Vocabularies are big (tens of thousands of concepts)
• They change
Automatic alignment techniques
• Lexical: labels of entities and textual definitions
• Structural: structure of the vocabularies
• Background knowledge: using a shared conceptual reference to find links
• Extensional: object information (e.g. book indexing)
[Figure: example with the labels “Long”, “brain” and “tumor”]
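The lexical technique can be sketched as a comparison of the words in two labels. The token-overlap (Jaccard) score below is a deliberately simple stand-in chosen for illustration; the slides do not say which lexical measure was used, and the function names are hypothetical.

```python
def tokens(label):
    """Lower-case a label and split it into a set of words."""
    return set(label.lower().split())

def lexical_similarity(a, b):
    """Share of words the two labels have in common (Jaccard overlap)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

# Label pairs quoted elsewhere in the presentation.
print(lexical_similarity("klassieke ruïnes", "landschap met ruïnes"))
print(lexical_similarity("maagd Maria", "Heilige Moeder"))
```

The second pair shares no word although the two labels denote the same subject, so a purely lexical score misses it. This is one reason the other techniques in the list are needed.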
Extensional Statistical Alignment
• Object information (e.g. book indexing)
[Figure: a collection of books linked to Thesaurus 1 and Thesaurus 2, with the example concepts “Dutch Literature” and “Dutch”]
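Extensional alignment compares concepts through the objects they describe: two concepts that index largely the same books are candidate matches. In the sketch below the book identifiers, the index contents and the concepts “Painting” and “Art” are invented toy data, and the Jaccard overlap is a stand-in, since the slides do not name the statistical measure behind the scores they report.

```python
# Hypothetical toy data: book id -> concepts assigned from each thesaurus.
index_1 = {"b1": {"Dutch Literature"}, "b2": {"Dutch Literature"}, "b3": {"Painting"}}
index_2 = {"b1": {"Dutch"}, "b2": {"Dutch"}, "b3": {"Dutch"}, "b4": {"Art"}}

def extension(index, concept):
    """Set of books indexed with `concept`."""
    return {book for book, concepts in index.items() if concept in concepts}

def overlap(concept_1, concept_2):
    """Jaccard overlap between the book sets of two concepts."""
    e1 = extension(index_1, concept_1)
    e2 = extension(index_2, concept_2)
    union = e1 | e2
    return len(e1 & e2) / len(union) if union else 0.0

print(overlap("Dutch Literature", "Dutch"))
print(overlap("Painting", "Art"))
```

A measure of this kind only works where the same objects carry annotations from both vocabularies, which is why a doubly indexed book collection is the starting point.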
Results
1: 9132.9 (1704 3479 976) Schilderijen - schilderkunst
2: 8088.5 (1204 2330 767) Kwaliteitszorg - kwaliteitsmanagement
3: 6232.7 (820 1572 543) Personeelsmanagement - personeelsbeleid
4: 5392.1 (1399 3271 622) Beeldende kunsten - beeldende kunst
5: 5063.1 (4951 1152 613) Nederlands - Nederlandse taalkunde
17: 3421.8 (280 714 243) Diabetes mellitus - suikerziekte
Alignment: no Trivial Solution
• Current techniques are not reliable as the sole source of knowledge
• A workflow would require checking and completion by a human
• Combination of techniques is required
• Alignment is a difficult research problem
Demo
• KB Illuminated Manuscripts
• BNF Mandragore Manuscripts
• http://galjas.cs.vu.nl:33333/MANDRA-SV-ICEmandraNewNONE , amphibians
• Wheat
Message
Semantic Web techniques
• Representation of collections and vocabularies
• Alignment of vocabularies
can help solve Cultural Heritage problems
• Semantic integration
• Publication and access
[And more: semantic query expansion, clustering…]
Thanks!
Links
• Semantic Web at W3C
• http://www.w3.org/2001/sw/
• SKOS
• http://www.w3.org/2004/02/skos/
• Cultural Heritage and Semantic Web projects
• MuseumFinland, http://www.museosuomi.fi/
• eCulture, http://e-culture.multimedian.nl/
• STITCH, http://www.cs.vu.nl/STITCH/
• CATCH, http://www.nwo.nl/catch