Thesauri, Terminologies and the
Semantic Web
A J Miles
Rutherford Appleton Laboratory
CCLRC
• Council for the Central Laboratory of the
Research Councils (CCLRC)
• Big Science
– Synchrotron Radiation Sources
– Lasers
– Pulsed Neutron Source
• Large-scale IT demands: tera-scale data,
computation
• Strong IT R&D programme
• BITD: Business and Information
Technology Department
SWAD-E
• Semantic Web Advanced Development for
Europe (SWAD-E)
• EU Project
• W3C Semantic Web Activity
– R&D
– Demos & Apps
– Guidelines & Best Practices
• Partners: HP Labs, ILRT, Stilo, ERCIM
(INRIA), CCLRC
Semantic Web
• Current Web:
– Web of information for humans
• Semantic Web:
– Web of data for computers
• Why?
– Automation, organisation, search
• Enabling technologies:
– RDF: Resource Description Framework
• Data linking, low-level semantics
– OWL: Web Ontology Language
• High-level semantics, inference
SWAD-E Thesaurus Activity
• Why is SWAD-E interested in thesauri?
– Large body of well-engineered knowledge
=> Enrich & bootstrap semantic web
• SWAD-E Thesaurus Activity
– Design Schemas for thesaurus data
– Guidelines for use and migration
– Supporting technologies and demos
Exploiting the Semantic Web
• What can you get out of the semantic
web?
– Integration & Connectivity
– Data Interoperability
– Application Interoperability
• N.B. The Semantic Web is a Tool.
Integration & Connectivity
• Recurring use-case: high/low-level
thesauri
– E.g. GCL & Cultural Heritage thesauri
– E.g. Gerry’s macro/micro thesauri
• RDF => Data linking
=> Can create large linked thesaurus structures – superthesauri
=> Linking solves problem of concurrent versions
Data Interoperability
• Interoperability is a major goal
• Move to XML technologies is a step in the
right direction
• But …
– XML does not equal Interoperability
• [N.B. plethora of XML formats for thesauri]
SKOS-Core: RDF Schema for Thesauri
• Introducing: SKOS-Core 1.0
– RDF Schema for Thesauri
– SKOS: Simple Knowledge Organisation Systems
• The First Challenge: coping with variety in
thesaurus design and structure
– Allow unique features
– Support interoperability
• The Second Challenge: interoperating
with/migrating to ontologies, taxonomies,
classification schemes etc.
• N.B. The semantic web is where data collide
SKOS Meta-Model
• SKOS Meta-Model: Concept-orientation
– Concepts (given URIs)
– Labels (pref, alt) (symbols)
– Concept Schemes
– Semantic Relations (extensible set)
• Broader, narrower, related …
– Semantic Mappings (extensible set)
• Exact, inexact …
– Scope notes, definitions, depictions
[Diagram: skos:broaderInstantive is a sub-property of skos:broader, which is a sub-property of skos:semanticRelation]
=> Infer meaning of concept from labels, scope notes, definitions, depictions & neighbours.
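The concept-oriented meta-model can be sketched as a small data structure. This is a minimal illustration in Python, not the SKOS-Core schema itself: the class and field names and the example.org URIs are assumptions made for this sketch, and the "economy" concept is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A unit of thought identified by a URI, not by any one label."""
    uri: str
    pref_labels: dict = field(default_factory=dict)  # language tag -> preferred label
    alt_labels: dict = field(default_factory=dict)   # language tag -> list of alternative labels
    scope_note: str = ""
    broader: set = field(default_factory=set)        # URIs of broader concepts
    narrower: set = field(default_factory=set)       # URIs of narrower concepts
    related: set = field(default_factory=set)        # URIs of related concepts


@dataclass
class ConceptScheme:
    """A collection of concepts, keyed by URI."""
    uri: str
    concepts: dict = field(default_factory=dict)

    def add(self, concept):
        self.concepts[concept.uri] = concept
        return concept

    def link_broader(self, narrow_uri, broad_uri):
        """Record a broader/narrower pair in both directions."""
        self.concepts[narrow_uri].broader.add(broad_uri)
        self.concepts[broad_uri].narrower.add(narrow_uri)


# Hypothetical scheme and URIs, for illustration only.
scheme = ConceptScheme("http://example.org/scheme")
finances = scheme.add(Concept("http://example.org/scheme/finances",
                              pref_labels={"en-GB": "finances"}))
economy = scheme.add(Concept("http://example.org/scheme/economy",
                             pref_labels={"en-GB": "economy"}))
scheme.link_broader(finances.uri, economy.uri)
```

Because relations point at URIs and not at label strings, a label can change or gain translations without breaking any link.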
Example: GEMET
• Non-Standard Features:
– Groups & Super-Groups
– Themes
• Solution: Extend SKOS-Core:
– Class gemet:Group
• Sub-class of skos:Concept
– Class gemet:Theme
• Sub-class of skos:Concept
– Property gemet:broaderTheme
• Sub-property of skos:broader
– Property gemet:broaderGroup
• Sub-property of skos:broader
[Diagram: gemet:broaderGroup is a sub-property of skos:broader, which is a sub-property of skos:semanticRelation]
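The effect of declaring the GEMET properties as sub-properties can be sketched as follows. This is a minimal illustration of RDFS-style sub-property inference in plain Python, not the behaviour of any particular RDF toolkit; the function names are assumptions, and each property is assumed to have at most one direct super-property.

```python
# Direct sub-property declarations, as listed on the slides.
SUB_PROPERTY_OF = {
    "gemet:broaderGroup": "skos:broader",
    "gemet:broaderTheme": "skos:broader",
    "skos:broaderInstantive": "skos:broader",
    "skos:broader": "skos:semanticRelation",
}


def super_properties(prop):
    """Return all super-properties of prop, nearest first."""
    chain = []
    while prop in SUB_PROPERTY_OF:
        prop = SUB_PROPERTY_OF[prop]
        chain.append(prop)
    return chain


def entailed(triples):
    """Add, for every statement, the same statement under each super-property."""
    result = set(triples)
    for subject, prop, obj in triples:
        for sup in super_properties(prop):
            result.add((subject, sup, obj))
    return result


# One statement taken from the GEMET backbone example.
inferred = entailed({("c_3194", "gemet:broaderGroup", "g_2504")})
```

An application that only understands skos:broader can still find the statement ("c_3194", "skos:broader", "g_2504") in `inferred`, while the GEMET-specific property is preserved alongside it.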
Approach to Non-Standard Thesauri
• Design schema as extension to SKOS-Core
=> Preserve unique features
=> Can interoperate with anything based on SKOS-Core
=> Have your cake and eat it!
Example: MeSH
• Medical Subject Headings
• Thesaurus features:
– Semantically ambiguous concept hierarchy
• Ontology features:
– Semantic typing of concepts, e.g. Calcimycin type Antibiotic
=> Use combination of SKOS and OWL to represent this hybrid structure
• RDF, SKOS & OWL means you can migrate a
thesaurus to ontology merely by adding
statements
• (No re-engineering or transformation is required)
Interoperability: Thesaurus Mapping
• Common use-case: overlapping thesauri
– Mappings support interchangeable use of overlapping
thesauri
• Introducing: SKOS-Mapping
– RDF Schema for inter-thesaurus mapping
SKOS-Mapping
[Figure: sets of resources in which concept occurs; labels: Exact, Inexact, Major, Minor, Partial, Broad, Narrow, Ordered]
• Combinators: AND, OR, NOT
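One reading of the figure is that a mapping type follows from how the sets of resources indexed under two concepts overlap. The sketch below illustrates that reading in Python. The function name, the 50% threshold for a major match, the treatment of exactly 50% as minor, and the direction chosen for broad and narrow are all assumptions made for this sketch, not definitions taken from the slides.

```python
def classify_mapping(resources_a, resources_b):
    """Classify the mapping from concept A to concept B by set overlap."""
    a, b = set(resources_a), set(resources_b)
    if not a or not b:
        raise ValueError("both concepts need at least one indexed resource")
    if a == b:
        return "exact"
    if a < b:           # everything under A is also under B: B is broader
        return "broad"
    if a > b:           # everything under B is also under A: B is narrower
        return "narrow"
    shared = len(a & b) / len(a)
    if shared > 0.5:    # assumed threshold
        return "major"
    if shared > 0:
        return "minor"
    return "none"


# Combinators as set operations over hypothetical resource identifiers.
b1, b2 = {1, 2}, {3, 4}
b1_or_b2 = b1 | b2      # OR: resources under either concept
b1_and_b2 = b1 & b2     # AND: resources under both concepts
```

With these definitions, a concept indexing {1, 2, 3, 4} has no exact match with `b1` or `b2` alone, but does with the combination `b1_or_b2`. NOT would be a set difference against some agreed universe of resources, which is left out here.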
Data Interop Summary
• SKOS-Core supports interoperability of
KOS with variable structures
• SKOS-Mapping supports interoperability of
overlapping KOS
Multilinguality
• Analyse multilingual thesaurus into
language components
• Multilingual Labelling
– Use SKOS-Core
– E.g. GEMET
• Inter-lingual Mapping
– Use SKOS-Core + SKOS-Mapping
– E.g. AAT, Merimee
Multilinguality (2)
[Figure: analysis of each language component, showing multilingual labelling and interlingual mapping]
GEMET Backbone in SKOS-RDF
<skos:Concept rdf:about="c_3194">
  <gemet:broaderTheme rdf:resource="t_9"/>
  <skos:narrower rdf:resource="c_1824"/>
  <skos:narrower rdf:resource="c_13150"/>
  <skos:narrower rdf:resource="c_13260"/>
  <skos:narrower rdf:resource="c_13177"/>
  <skos:narrower rdf:resource="c_13152"/>
  <skos:narrower rdf:resource="c_8308"/>
  <skos:inScheme rdf:resource="../GEMET"/>
  <skos:narrower rdf:resource="c_6793"/>
  <gemet:broaderGroup rdf:resource="g_2504"/>
  <skos:narrower rdf:resource="c_3205"/>
  <skos:narrower rdf:resource="c_1025"/>
  <skos:narrower rdf:resource="c_13151"/>
  <skos:narrower rdf:resource="c_11044"/>
  <skos:narrower rdf:resource="c_3201"/>
  <skos:narrower rdf:resource="c_4478"/>
  <skos:narrower rdf:resource="c_6602"/>
</skos:Concept>
GEMET en labels in SKOS-RDF
<skos:Concept rdf:about="c_3194">
  <skos:prefLabel xml:lang="en-GB">finances</skos:prefLabel>
  <skos:definition xml:lang="en-GB">The monetary resources or revenue of a government, company, organization or individual. (Source: RHW)</skos:definition>
</skos:Concept>
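The backbone and the English labels describe the same concept (the same rdf:about value), so the two components can be merged by identifier. The following is a minimal sketch using only the Python standard library. The enclosing rdf:RDF element, the GEMET namespace URI and the function name are assumptions added to make the fragments parseable; the SKOS namespace URI is assumed to be the one used by SKOS-Core, and the backbone is abbreviated to two narrower links.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"   # assumed SKOS-Core namespace
GEMET = "http://example.org/gemet#"             # hypothetical namespace
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Abbreviated backbone fragment plus the label fragment, wrapped in one root.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}" xmlns:gemet="{GEMET}">
  <skos:Concept rdf:about="c_3194">
    <gemet:broaderGroup rdf:resource="g_2504"/>
    <skos:narrower rdf:resource="c_1824"/>
    <skos:narrower rdf:resource="c_13150"/>
  </skos:Concept>
  <skos:Concept rdf:about="c_3194">
    <skos:prefLabel xml:lang="en-GB">finances</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>"""


def merge_concepts(xml_text):
    """Collect labels and narrower links per concept identifier."""
    merged = {}
    root = ET.fromstring(xml_text)
    for node in root.findall(f"{{{SKOS}}}Concept"):
        about = node.get(f"{{{RDF}}}about")
        entry = merged.setdefault(about, {"labels": {}, "narrower": []})
        for child in node:
            if child.tag == f"{{{SKOS}}}prefLabel":
                entry["labels"][child.get(XML_LANG)] = child.text
            elif child.tag == f"{{{SKOS}}}narrower":
                entry["narrower"].append(child.get(f"{{{RDF}}}resource"))
    return merged


merged = merge_concepts(DOC)
```

A label file for another language would add a further entry under `labels` for the same identifier without touching the backbone.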
Application Interoperability
• Move to modularisation and distribution via
web services is a step in the right direction
• But …
– Web services do not equal application
interoperability
• Web service API
– Community driven design
– Endorsed by wider community
SKOS API
• SKOS API
– Interface to terminology web service
– Under development (pre-release is online)
• Participate on public mailing list:
[email protected]
Web Service Implementation
• Reference Implementation of SKOS API
– Leverage power of sem-web tools
• E.g. RDF Query
• E.g. transitive closure of broader concepts
• SOAP Service
– Back end: Sesame RDF Store
– Modular design
• Semantic Web Services
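The transitive closure of broader concepts mentioned above can be sketched in a few lines. This is a plain Python illustration over an in-memory mapping, not the reference implementation or a Sesame query; the function name and the concept identifiers are hypothetical.

```python
def broader_closure(concept, broader):
    """Return every concept reachable from `concept` via broader links."""
    seen = set()
    stack = list(broader.get(concept, ()))
    while stack:
        current = stack.pop()
        if current in seen:
            continue            # already visited; also guards against cycles
        seen.add(current)
        stack.extend(broader.get(current, ()))
    return seen


# Hypothetical hierarchy: a -> b -> c, and d -> c.
BROADER = {"a": {"b"}, "b": {"c"}, "c": set(), "d": {"c"}}
ancestors_of_a = broader_closure("a", BROADER)
```

A service could use such a closure to expand a search on a narrow concept to everything indexed under its ancestors.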
Future Issues
• Social aspects
– Semantic web technology supports community driven
development of thesauri
– But … validation?
• Change Management
– Versioning
– Evolution
Summary
• Semantic Web Technologies are tools
– SKOS-Core & SKOS-Mapping support data
interoperability
– SKOS API supports application interoperability
• W3C Semantic Web Best Practices
Working Group
– Thesaurus Task Force
SWAD-E & Eco-informatics
• Work with thesaurus developers and
managers
– Publish RDF encodings of current thesauri
– Test design and coverage of SKOS-Core
• Community building for development of
web service API
Thank You
• Links:
SWAD-Europe Thesaurus Activity
http://www.w3.org/2001/sw/Europe/reports/thes/
SKOS-Core 1.0 Guide
http://www.w3.org/2001/sw/Europe/reports/thes/1.0/guide/
SKOS API
http://www.w3.org/2001/sw/Europe/reports/thes/skosapi.html
SKOS-Core 1.0 Guidelines for Migration
http://www.w3.org/2001/sw/Europe/reports/thes/1.0/migrate/
Public Thesaurus Mailing list
[email protected]
W3C Semantic Web Activity
http://www.w3.org/2001/sw/
SWAD-E Project
http://www.w3.org/2001/sw/Europe/