Introduction to DataCite Adam Farquhar, PhD Head of Digital Library Technology, The British Library President, DataCite June, 2010
Introduction to DataCite
Adam Farquhar, PhD
Head of Digital Library Technology, The British Library
President, DataCite
June, 2010
The British Library
Exists for everyone who wants to do
research – for academic, personal, and
commercial purposes.
Covers all subject areas – sciences,
technology, medicine, arts, humanities,
social sciences…
Receives a copy of every item
published in the UK.
Holds over 150 million items, with 3
million items added each year.
Used by over 16,000 people each day
(on site and online).
2
Data and the Digital Landscape
Seismic measurements taken by a
geologist.
Genetic data collected by a medical
researcher.
A survey of public opinions collected
by a sociologist.
3
Data: The Foundation of Research
Data is a crucial component of the scholarly record
Re-acquisition may be impossible
Datasets are essential to the British Library’s mission
to advance the world’s knowledge
4
Widening Gap
Articles
Underlying
data
No effective way to link
between datasets and
articles
No widely used method to
identify datasets
No widely used method to
cite datasets
5
As a result…
Datasets are
Difficult to discover
Difficult to access
Being lost
6
Datasets – First Class Citizens?
Data is difficult to manage after
project funding ceases
Informal networks provide the
primary means of sharing
Only 21% use a national or
international facility
Datasets are not included in
impact analysis
Good luck finding it or getting
permission to use it (your
discipline may vary)
Source: UKRDS Study
7
DataCite – An Award-Winning Global Consortium
DataCite aims to:
Establish easier access to scientific research data
Increase acceptance of research data
Support archiving of data for verification and re-use
8
DataCite – Supporting the Research Community
DataCite:
Supports researchers by enabling them to locate,
identify, and cite research datasets with
confidence
Supports data centres by providing persistent
identifiers for datasets, workflows and standards
for data publication
Supports publishers by enabling research articles
to be linked to the underlying data
9
DataCite uses DOIs for Data:
DataCite : Data Centres :: CrossRef : Publishers
URLs are not persistent
(e.g. Wren JD: URL decay in MEDLINE - a 4-year follow-up study.
Bioinformatics. 2008 Jun 1;24(11):1381-5).
Digital Object Identifiers (DOIs)
offer a solution
Most widely used identifier for
scientific articles
Researchers, authors, publishers
know how to use them
Put datasets on the same playing
field as articles
Dataset
Yancheva et al (2007). Analyses
on sediment of Lake Maar.
PANGAEA.
doi:10.1594/PANGAEA.587840
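The citation above can be handled mechanically. What follows is a minimal sketch, not DataCite's own tooling: the function names are hypothetical, the regular expression is a loose approximation of DOI syntax, and the resolver address https://doi.org/ is an assumption that does not appear on the slides.

```python
import re

# Loose pattern (an assumption, not the formal DOI syntax):
# optional "doi:" label, a "10.<registrant>" prefix, a slash, then the suffix.
DOI_PATTERN = re.compile(r"^(?:doi:)?(10\.\d{4,9})/(\S+)$", re.IGNORECASE)


def parse_doi(text):
    """Split a DOI string into (prefix, suffix); raise ValueError if malformed."""
    match = DOI_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a DOI: {text!r}")
    return match.group(1), match.group(2)


def resolver_url(text, resolver="https://doi.org/"):
    """Build the link a reader would follow (resolver address is assumed)."""
    prefix, suffix = parse_doi(text)
    return f"{resolver}{prefix}/{suffix}"


def format_citation(creators, year, title, publisher, doi):
    """Format a dataset citation in the same shape as the slide's example."""
    prefix, suffix = parse_doi(doi)
    return f"{creators} ({year}). {title}. {publisher}. doi:{prefix}/{suffix}"


prefix, suffix = parse_doi("doi:10.1594/PANGAEA.587840")
url = resolver_url("doi:10.1594/PANGAEA.587840")
citation = format_citation(
    "Yancheva et al", 2007, "Analyses on sediment of Lake Maar",
    "PANGAEA", "10.1594/PANGAEA.587840",
)
print(url)
print(citation)
```

Because the citation carries the identifier rather than a location, the link stays valid for as long as the DOI record is maintained, even if the dataset moves.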
10
Membership
AUS Australian National Data Service (ANDS)
CAN Canada Institute for Scientific and Technical Information
CH Library of the ETH Zurich
DK Technical Information Center of Denmark
FR Institute for Scientific and Technical Information
GER German National Library of Science and Technology (TIB)
GER German National Library of Medicine (ZB MED)
GER GESIS - Leibniz Institute for Social Science
NL TU Delft Library
UK The British Library
USA California Digital Library (CDL)
USA Purdue University Libraries
From Canada to Australia
Currently twelve members
across nine countries
Over 800,000 records
registered with DOI names
so far
Associated Members
Digital Curation Centre
Microsoft Research
11
Rapid Progress Builds on Foundational Work
05: TIB begins to issue DOIs for datasets
03.09: Paris Memorandum
12.09: DataCite Association founded in London; 7 members
06.10: 12 members; all members assigned DOIs; over 800,000 items registered; pilot projects with Data Centres
12.10: Production services with Data Centres; shared technical infrastructure; integrated services with key partners
12
DataCite – Roles and Responsibilities
The DataCite registration agency
Maintains the resolution infrastructure
Maintains a searchable database of metadata
Manages identifiers over the long term
Establishes and shares best practice
Publishing agents (data centres, research institutes, publishers) are
responsible for
Quality assurance
Content storage and access
Creating the identifier
Creating and updating metadata
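The split of responsibilities above can be illustrated with a small sketch: the publishing agent creates a metadata record and later updates it, while the identifier itself never changes. The class, its field names, and the example.org addresses are hypothetical and chosen only for illustration; they are not the DataCite metadata schema or API.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical metadata record kept by a publishing agent."""
    identifier: str        # the DOI name; fixed once registered
    creator: str
    title: str
    publisher: str
    publication_year: int
    url: str               # current location of the dataset

    def missing_fields(self):
        """Return names of fields left empty (a simple quality-assurance check)."""
        return [name for name, value in asdict(self).items()
                if value in ("", None)]


def update_location(record, new_url):
    """Return a copy with a new location; the identifier is left untouched."""
    return replace(record, url=new_url)


record = DatasetRecord(
    identifier="10.1594/PANGAEA.587840",
    creator="Yancheva et al",
    title="Analyses on sediment of Lake Maar",
    publisher="PANGAEA",
    publication_year=2007,
    url="https://example.org/datasets/587840",   # placeholder address
)
moved = update_location(record, "https://example.org/archive/587840")
print(moved.identifier == record.identifier)
print(record.missing_fields())
```

The record is immutable on purpose: every update yields a new copy, which makes it explicit that only the metadata changes and never the identifier.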
13
DataCite Structure
[Diagram. Labels: International DOI Foundation; Global Handle System; DataCite (Member); Member Institutions; Data Centres (Works with); Associate Stakeholder]
14
Strengths and Weaknesses of DOI
DOIs have some strong advantages
Accepted by researchers and scientists
Mature infrastructure
Put datasets on the same playing field as articles
But perceived as
Expensive
The current IDF business model favours larger
registration agencies
Publisher oriented
The largest registration agency is the publisher-oriented
CrossRef
15
The Cost of Visibility
DOI Assignment: €0.01 – €1
Management, Storage, Quality Assurance, Metadata: €50 – €500
Collection, Production: €5,000 – €5,000,000
(approx 1% of data creation cost)
16
BL – Search Our Catalogue
17
DE Service – Elsevier Science Direct
18
Research Data in Articles
19
Publishing Primary Data
20
Rapidly Growing Ecosystem
Microsoft works with CDL to embed DataCite into Excel
plug-in
UK National Sound Archive assigns DataCite DOIs to
archival recordings
Dryad integrates DataCite DOIs into publisher workflows
for supplementary material and datasets in US
ANDS integrates DataCite DOIs into dataset services
Thieme Publishing Group uses DataCite DOIs to link
articles and primary research data (at FIZ)
Active discussions with key research information service
providers and data centres
21
What Next?
Require clear unambiguous
citations for datasets
Integrate links to datasets into
delivery platforms
Integrate into workflows for
researchers, data centres,
and publishers
Collaborate to understand
roles and responsibilities
among publishers, data
centres, and libraries
Improve attribution and credit
for data producers
Roll out services
DataCite supports researchers
by enabling them to locate,
identify, and cite research
datasets with confidence
We welcome your comments,
questions, and ideas!
Contact:
www.datacite.org
adam.farquhar {@} bl.uk
jan.brase {@} tib.unihannover.de
22