Linked (Open) Data
Opportunities and challenges
Makx Dekkers
[email protected]
Outline
• Basic notions
• Recent developments
• Comparing objectives
• Opportunities and risks
• Conclusions
© 2011 Makx Dekkers
Journées ABES 2011
BASIC NOTIONS
The idea and its history
• 1989: Tim Berners-Lee already talked about
linking documents and data together
(http://www.w3.org/History/1989/proposal.html)
• 2001: Tim Berners-Lee and Ora Lassila
introduced the “Semantic Web”
(http://www.scientificamerican.com/article.cfm?id=the-semantic-web)
• 2006: Tim Berners-Lee presented the initial
design issues (rules) for Linked Data
(http://www.w3.org/DesignIssues/LinkedData.html)
W3C Semantic Web initiative
• Objective
– to create a universal medium for the exchange of
data […] to smoothly interconnect personal
information management, enterprise application
integration, and the global sharing of commercial,
scientific and cultural data
• Main results
– Resource Description Framework (RDF), RDFa
(RDF-in-HTML), SPARQL Query Language
Core Linked Data Specifications
• Transport
– HTTP Hypertext Transfer Protocol
• Identification
– URI Uniform Resource Identifier
• Description and linking
– RDF Resource Description Framework
• Search and access
– SPARQL Query Language for RDF
The four rules of Linked Data
• TBL’s recommendations:
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those
names
3. When someone looks up a URI, provide useful
information, using the standards (RDF*, SPARQL)
4. Include links to other URIs so that they can
discover more things
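As a rough illustration of rules 2 and 3, the sketch below builds (but does not send) an HTTP request that looks up a URI and asks for RDF. The AGROVOC URI is the one listed later in this talk; the `text/turtle` media type and the helper name `build_lookup_request` are assumptions made for the example, and whether a given server honours the Accept header is not guaranteed.

```python
from urllib.request import Request


def build_lookup_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Build (without sending) an HTTP GET request asking for RDF about `uri`.

    `build_lookup_request` is a hypothetical helper for this sketch.
    """
    # Rule 2: the name is an HTTP URI, so an ordinary HTTP GET can look it up.
    # Rule 3: the Accept header asks the server for RDF rather than HTML
    # (assumption: the server supports content negotiation).
    return Request(uri, headers={"Accept": media_type}, method="GET")


request = build_lookup_request("http://aims.fao.org/aos/agrovoc/c_550")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual lookup; that step is left out so the sketch runs without network access.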
The basic model of RDF
• Resource Description Framework “triple”:
– Subject: the “thing” (resource) described
– Predicate: the characteristic of the resource
– Object: the value of the characteristic
[Diagram: Subject --Predicate--> Object]
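The triple model can be sketched in a few lines of Python. This is a minimal illustration, not an RDF library: the `Triple` class, the `to_ntriples` helper and the `http://example.org/presentation` URI are hypothetical, the object is assumed to be a plain string literal without line breaks, and the output follows the N-Triples line format.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the "thing" (resource) described
    predicate: str  # URI of the characteristic
    obj: str        # value of the characteristic (here: a plain literal)


def to_ntriples(t: Triple) -> str:
    """Write one triple with a literal object as a single N-Triples line."""
    # Escape backslashes and double quotes inside the literal value.
    escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{t.subject}> <{t.predicate}> "{escaped}" .'


example = Triple(
    subject="http://example.org/presentation",             # hypothetical URI
    predicate="http://purl.org/dc/elements/1.1/creator",   # Dublin Core "creator"
    obj="Makx Dekkers",
)
print(to_ntriples(example))
```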
Complex structures in RDF
[Diagram: an RDF graph connecting the nodes "This presentation", "Makx Dekkers", "Barcelona", "Journées ABES", "ABES", "Montpellier" and "17-18 May 2011" through the predicates presenter, hometown, partOf, organizer, location (twice) and date]
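A graph like this one is simply a set of triples that share nodes. The sketch below is an illustration under assumptions: the node and predicate labels come from the slide, but which nodes each predicate connects is a guess at the diagram, and plain labels stand in for URIs. The `match` helper imitates the idea of a single triple pattern with wildcards, and `reachable` follows links from node to node.

```python
from typing import Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Assumed reading of the diagram: edge directions are a reconstruction.
GRAPH: Set[Triple] = {
    ("This presentation", "presenter", "Makx Dekkers"),
    ("Makx Dekkers", "hometown", "Barcelona"),
    ("This presentation", "partOf", "Journées ABES"),
    ("Journées ABES", "organizer", "ABES"),
    ("Journées ABES", "location", "Montpellier"),
    ("Journées ABES", "date", "17-18 May 2011"),
    ("ABES", "location", "Montpellier"),
}


def match(graph: Set[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in graph:
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple


def reachable(graph: Set[Triple], start: str) -> Set[str]:
    """Collect every node that can be reached by following links from `start`."""
    seen: Set[str] = set()
    todo: List[str] = [start]
    while todo:
        node = todo.pop()
        for _, _, obj in match(graph, s=node):
            if obj not in seen:
                seen.add(obj)
                todo.append(obj)
    return seen


# What is located where?
for subject, _, place in sorted(match(GRAPH, p="location")):
    print(subject, "is located in", place)

# Everything that can be discovered starting from the presentation.
print(sorted(reachable(GRAPH, "This presentation")))
```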
Linked (Open / Enterprise) Data
• Commonalities
– Using Semantic Web technologies (RDF)
– Linking information resources, people, places
• Differences
– Open Data with open licenses; Enterprise Data
mostly for closed, controlled environments
– Open Data links to other Open Data and is available for
external use; Enterprise Data may link to external
data but is not openly available for external use
Linked Data -- Open Data
• Linked Data: focus on technology
– Semantic Web: Resource Description Framework,
and other Web standards
– Final solutions still under development
• Open Data: focus on strategy
– Based on notion that sharing is important and
benefits all
– Technology is secondary
The five-star system
Source: http://inkdroid.org/journal/2010/06/04/the-5-stars-of-open-linked-data/
The LOD diagram: 2007
25 datasets
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
The LOD diagram: 2008
45 datasets
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
The LOD diagram: 2009
95 datasets
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
The LOD diagram: 2010
203 datasets
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
RECENT DEVELOPMENTS
W3C communities
• LinkingOpenData SWEO Community Project
– Goal: to extend the Web with a data commons by
publishing various open data sets as RDF on the
Web and by setting RDF links between data items
from different data sources
(http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData)
• Library Linked Data Incubator Group
– to help increase global interoperability of library
data on the Web (http://www.w3.org/2005/Incubator/lld/)
More W3C communities
• Government Linked Data Working Group
– to provide standards and other information which
help governments around the world publish their
data as effective and usable linked data
(http://www.w3.org/2011/gld/charter)
• Semantic Web Health Care and Life Sciences
(HCLS) Interest Group
– to develop, advocate for, and support the use of
Semantic Web technologies for health care and
life science (e.g. biology, medicine)
(http://www.w3.org/2001/sw/hcls/)
Open Knowledge Foundation, okfn.org
• not-for-profit organization promoting open
knowledge: any kind of data and content that
can be freely used, reused, and redistributed
• Working and Interest Groups, e.g.
– Open Data in Science, Open Government Data,
Open Bibliographic Data, Cultural Heritage etc.
• CKAN.net: registry of open datasets and other
“knowledge resources”
Linked Data initiatives
• Predicate vocabularies (descriptors)
– Resource Description and Access (RDA): http://metadataregistry.org/rdabrowse.htm
– The Bibliographic Ontology (BIBO): http://bibliontology.com/
– Dublin Core: http://dublincore.org/
• Object vocabularies (values)
– Virtual International Authority File (VIAF): http://viaf.org/
– Library of Congress authorities: http://id.loc.gov/authorities/
– AGROVOC (agricultural terminology), e.g. http://aims.fao.org/aos/agrovoc/c_550
– DBpedia (based on Wikipedia), e.g. http://dbpedia.org/page/Montpellier
• Bibliographic data
– LIBRIS Sweden, e.g. http://libris.kb.se/library/S
– British Library: http://www.bl.uk/bibliographic/datasamples.html
– CrossRef (DOI metadata): http://www.crossref.org/CrossTech/linked_data/
More Linked Data initiatives
• Broadcasting, publishing
– BBC: http://www.bbc.co.uk/blogs/bbcinternet/linked_data/
– New York Times: http://data.nytimes.com/
• Governments (small sample)
– USA: http://data.gov/
– France: http://opendata.paris.fr/
– Finland: http://data.suomi.fi/
– UK: http://data.gov.uk/
– Spain (Cataluña): http://dadesobertes.gencat.cat/
– Norway: http://data.norge.no
– Netherlands: http://www.overheid.nl/opendata
– Australia: http://data.gov.au/
COMPARING OBJECTIVES
Strategic aspects Linked Data
• Achieving global interoperability with minimal
coordination
• Aggregating human knowledge
• Supporting democracy, transparency and
accountability
• Enhancing and enriching information
• Enabling user-driven and user-generated
applications
Strategic aspects libraries
• Organizing information for use by specific
users for specific goals
• Ensuring and maintaining quality
• Sustaining services economically
• Preserving information for the long term
• Providing trusted services
Functional aspects Linked Data
• Searching distributed collections
• “Following your nose” – navigating links
between pieces of content
• Distributing responsibility for making
statements about things
• Leaving to the user whom and what to trust
• Leaving development of products and services
to an open market (apps)
Functional aspects libraries
• Describing information by professionals
• Bringing together and managing aggregations
of information
• Selecting relevant information
• Mixing analogue and digital resources
Technical aspects Linked Data
• Publishing and using machine-readable
statements (“data that speak for themselves”)
• Focusing on Semantic Web technology
• Enabling inferences across large distributed
data sets
• (Still to be done) Solving issues around
harvesting, caching and real-time updating
Technical aspects libraries
• Using proven technology to provide high-quality services
• Managing production systems and services
• Guaranteeing performance, uptime,
consistency across data
Agility versus sustainability
• In the Linked Data space:
– Things move fast
– Trial-and-error
– Lots of development by volunteers (hackers)
• In the library domain:
– Operational systems need to evolve
– Need to handle legacy data
– Development by professionals in managed
projects
Data versus services
• In the Linked Data space:
– Focus on availability of “raw data”
– Quality is secondary
– Data and technology should lead to useful results
• In the library domain:
– Focus on services
– Quality is essential
– Data and technology in support of the service
Economic aspects
• In the Linked Data space:
– “Information wants to be free” – a human right?
– Short-term thinking: today is hot, yesterday is not
– Focus on applications to create value out of data
• In the library domain:
– Long-term view: sustainability is crucial
– Public money to provide community services
– Expected to do more with less money
OPPORTUNITIES AND RISKS
Strong points Linked Data
• Attempt to create a common technical
platform for machine-readable data
• Lots of enthusiasm in publishing open data
• Promise of global interoperability
• Mix of researchers, user communities,
hackers, professional data providers
• High visibility on political level
Risks Linked Data
• Driven by technology, not by requirements
• Technology may not (yet) be stable – RDF 2.0?
• Operational issues far from solved (reliability,
performance, quality, security, trust)
• Hope for general agreement across domains
may not be realistic
• Promise may turn into disappointment
Strong points libraries
• Long-standing operational experience in managing
information
• Professional intermediaries between users
and the information they need
• Sustainable business models (albeit with
eternally shrinking budgets)
• Long-term perspective: the past (legacy data)
as well as the future (preservation)
Risks libraries
• Technologies change rapidly
• New skills difficult to spread through the
organization
• Some people see libraries as a thing of the
past (“the book museum”)
• Underestimation of information handling skills
• Information overload: human intervention
does not scale, so better tools are needed
Meeting both worlds
• An example: Europeana.eu
– Started out with domain perspectives (libraries,
archives, museums, audiovisual archives)
– “Traditional” approach (metadata mappings)
works but is insufficient
– Using Linked Data approach preserves domain
specifics but allows for generalization to support
common services
– Cross-domain (but co-ordinated) interoperability
Europeana Data Model
[Figures: EDM classes, EDM properties, a simple example and a complex example]
Source: http://version1.europeana.eu/web/europeana-project/technicaldocuments/
CONCLUSION
Libraries and Linked Data
• Using Linked Data technology as the next step
in connecting services
• Offering information management skills to the
technology domain
• Creating a quality hub in the Linked Data
space
Best of both worlds
• Libraries providing stability and sustainability
to Linked Data spaces
• Library professionals helping to manage the
distributed collections
• Libraries delivering high-quality linked data to
the Web
• Technologists providing the next generation
of systems and tools
Linked (Open) Data:
opportunity for libraries!
Thank you!
Makx Dekkers
[email protected]