PIDs and DOI Registration with DataCite

Frauke Ziedorn
IATUL Workshop 2013
Research Data Management: Finding our Role
6 December 2013
Background:
Why publish and cite research data?
• Easier re-usability and verification of data
• Recognition for collection and documentation of data (Citation
Indices)
• Compliance with funders' requirements (e.g. German Research
Foundation)
• Avoiding duplication
• Motivation for new research
Persistent Identifier I
• DOI
• Citation of scientific publications
• Established in the scientific community
• Global resolving via any handle server or dx.doi.org/{doi}
• Persistence and data quality are guaranteed
• Handle
• Global referencing of data before publication
• Global resolving via any handle server
• No persistence or quality management
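To illustrate global resolving, here is a minimal Python sketch that turns a DOI name into a resolver URL of the form dx.doi.org/{doi}. The helper name `doi_to_url` and the handling of a leading "doi:" prefix are illustrative assumptions, not part of any DOI or DataCite tooling.

```python
from urllib.parse import quote

RESOLVER = "http://dx.doi.org/"  # resolver host named on the slide


def doi_to_url(doi: str) -> str:
    """Return a resolver URL for a DOI name (hypothetical helper)."""
    name = doi.strip()
    # Accept the common "doi:" citation prefix and drop it.
    if name.lower().startswith("doi:"):
        name = name[4:]
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return RESOLVER + quote(name, safe="/")


print(doi_to_url("doi:10.1594/PANGAEA.727522"))
```

The same function works for any DOI name, because resolution does not depend on which registration agency issued it.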
Persistent Identifier II
• URN
• Referencing of local documents in a closed system (e.g.
dissertations, theses)
• Resolving only on server of publisher
• Persistence and data quality are guaranteed
• ARK
• Documents of all work stages
• Persistence declaration available; may be deleted
• Resolving free of charge only on server of publisher
• Quality management (metadata), no standards for
persistence
The DOI® System
• International DOI Foundation was founded in 1998.
• The DOI system offers long-term persistence and
accessibility of data.
• Based on the Handle system.
• In May 2012 the DOI System ISO Standard 26324 was
published.
• Part of the quality control is mandatory metadata for
each object registered with a DOI.
DOI®, DOI.ORG® and shortDOI® are trademarks of the International DOI Foundation
A little History
• 2003:
DFG-funded project of the TIB with World Data Centres
regarding the publication of research data.
• 2005:
TIB becomes the first DOI registration agency for research
data. From the beginning, grey literature is also registered.
• 2009-03:
Paris Memorandum regarding the cooperation of 6 European
information providers.
• 2009-12:
DataCite is founded in London with 7 members.
DataCite
• Growing demand to make data citable.
• DataCite is an international consortium whose aims are
• to establish easier access to research data on the Internet
• to increase acceptance of research data as legitimate,
citable contributions to the scholarly record
• to support data archiving that will permit results to be
verified and re-purposed for future study.
• Development of standards, workflows, and best practices.
• 2013:
• 18 members from 13 countries,
• 9 associated members,
• ~2.2 Million DOIs
DataCite Members
DOI System Infrastructure
[Diagram: International DOI Foundation > 9 DOI Registration Agencies (members), among them DataCite > DataCite Managing Agent (TIB), DataCite Members, and Associate Members > Data Centres]
DataCite Services
Metadata Store (MDS)
• Registration and updating of DOI names.
• Storage of metadata.
• Accessible via UI or API.
DataCite and Metadata
• Metadata make data discoverable.
• Long-term maintenance of metadata is an
important part of the persistence of an identifier.
• Schema is inspired by Dublin Core.
• Core value of the DataCite Metadata Schema:
Linking between data and related objects.
• Future vision:
Links between all related publications and objects.
DataCite Metadata Schema
Mandatory Properties
• Identifier (with type attribute)
• Creator (with type and nameIdentifier attributes)
• Title (with optional type attribute)
• Publisher
• PublicationYear
• Citation:
Creator (PublicationYear): Title. Publisher. Identifier
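The citation pattern can be sketched in a few lines of Python. The `Record` class and its field names are illustrative assumptions that simply mirror the five mandatory properties; joining several creators with "; " is also an assumption, chosen to match the citation style used elsewhere in these slides.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """The five mandatory properties (class and field names are illustrative)."""
    identifier: str
    creators: list
    title: str
    publisher: str
    publication_year: int


def format_citation(rec: Record) -> str:
    # Pattern from the slide: Creator (PublicationYear): Title. Publisher. Identifier
    creators = "; ".join(rec.creators)
    return (f"{creators} ({rec.publication_year}): {rec.title}. "
            f"{rec.publisher}. {rec.identifier}")


example = Record(
    identifier="http://dx.doi.org/10.5524/100005",
    creators=["Li, J", "Zhang, G", "Lambert, D", "Wang, J"],
    title="Genomic data from the Emperor penguin (Aptenodytes forsteri)",
    publisher="GigaScience",
    publication_year=2011,
)
print(format_citation(example))
```

Because all five properties are mandatory, a citation in this form can be generated for every registered object.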
Citation
Creator (PublicationYear): Title. Publisher. Identifier
Dataset:
Kuhlmann, H et al. (2009):
Age models, iron intensity, magnetic susceptibility records and dry bulk density of
sediment cores from around the Canary Islands. PANGAEA - Data Publisher for
Earth & Environmental Science.
doi:10.1594/PANGAEA.727522,
Is supplement to this article:
Kuhlmann, Holger; Freudenthal, Tim; Helmke, Peer; Meggers, Helge
(2004): Reconstruction of paleoceanography off NW Africa during the last 40,000
years: influence of local and regional factors on sediment accumulation.
Marine Geology, 207(1-4), 209-224,
doi:10.1016/j.margeo.2004.03.017
DataCite Metadata Schema
Optional Properties
• Subject (with scheme attribute)
• Contributor (with type and nameIdentifier attributes)
• Date (with type attribute)
• Language
• ResourceType (with description attribute)
• AlternateIdentifier (with type attribute)
• RelatedIdentifier (with type and relationType attributes)
• Size
• Format
• Version
• Rights
• Description (with type attribute)
• GeoLocation (with point, box, and place)
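As a rough illustration of how the mandatory properties and a RelatedIdentifier fit together in one record, the Python sketch below builds a small XML document. It is a simplification under stated assumptions: the helper `build_record` is hypothetical, and the XML namespace and schema location that a valid record needs are omitted. The element and attribute names are written from memory of the DataCite schema, so check them against the documentation at http://schema.datacite.org.

```python
import xml.etree.ElementTree as ET


def build_record(doi, creators, title, publisher, year, supplement_to=None):
    """Build a simplified metadata record as an XML string (illustrative helper)."""
    resource = ET.Element("resource")
    ET.SubElement(resource, "identifier", identifierType="DOI").text = doi
    creators_el = ET.SubElement(resource, "creators")
    for name in creators:
        creator = ET.SubElement(creators_el, "creator")
        ET.SubElement(creator, "creatorName").text = name
    titles = ET.SubElement(resource, "titles")
    ET.SubElement(titles, "title").text = title
    ET.SubElement(resource, "publisher").text = publisher
    ET.SubElement(resource, "publicationYear").text = str(year)
    if supplement_to:
        # Optional link from the dataset to the article it supplements.
        related = ET.SubElement(resource, "relatedIdentifiers")
        ET.SubElement(related, "relatedIdentifier",
                      relatedIdentifierType="DOI",
                      relationType="IsSupplementTo").text = supplement_to
    return ET.tostring(resource, encoding="unicode")


print(build_record(
    "10.1594/PANGAEA.727522",
    ["Kuhlmann, H"],
    "Age models, iron intensity, magnetic susceptibility records and dry bulk "
    "density of sediment cores from around the Canary Islands",
    "PANGAEA - Data Publisher for Earth & Environmental Science",
    2009,
    supplement_to="10.1016/j.margeo.2004.03.017",
))
```

The relatedIdentifier entry carries the "Is supplement to" relation from the citation example, which is the kind of link between data and publications that the schema is designed to express.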
DataCite Services
Metadata Search
• Search engine for all metadata stored in the MDS.
• Filter options to refine the search.
DataCite Services
OAI-PMH Data Provider
• Exposes metadata stored in the MDS using the Open Archives
Initiative Protocol for Metadata Harvesting (OAI-PMH).
• Different metadata formats are available.
• Service is open to everyone; harvesters include:
• TIB (GetInfo)
• Thomson Reuters (Data Citation Index)
• Elsevier (Exlibris)
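A harvester talks to an OAI-PMH provider with plain HTTP requests. The Python sketch below only builds such a request URL. The `verb`, `metadataPrefix`, `from`, and `set` parameters and the `oai_dc` format come from the OAI-PMH protocol itself; the endpoint path `/oai` and the helper name `list_records_url` are assumptions, since the slides only name the host.

```python
from urllib.parse import urlencode

# Assumed endpoint path; the slides only name the host oai.datacite.org.
BASE_URL = "http://oai.datacite.org/oai"


def list_records_url(metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # selective harvesting by date, e.g. "2013-01-01"
    if set_spec:
        params["set"] = set_spec    # selective harvesting by set
    return BASE_URL + "?" + urlencode(params)


print(list_records_url(from_date="2013-01-01"))
```

A real harvester would also follow the resumptionToken that the protocol uses to page through large result lists; that step is left out here.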
Services in Cooperation with CrossRef
• http://crosscite.org/citeproc/
A citation formatter that provides over 100 different citation
formats.
• http://crosscite.org/cn/
With Content Negotiation it is possible to access different
media types of a registered object (machine-to-machine only).
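Content negotiation means that a client states the media type it wants in the HTTP Accept header when it resolves a DOI. The Python sketch below prepares such a request without sending it. The helper name `negotiation_request` is hypothetical, and which media types a given DOI supports is not something this sketch can guarantee.

```python
from urllib.request import Request


def negotiation_request(doi, media_type):
    """Prepare (but do not send) a request asking for one representation."""
    return Request("http://dx.doi.org/" + doi, headers={"Accept": media_type})


req = negotiation_request("10.5524/100005", "application/rdf+xml")
print(req.full_url, req.get_header("Accept"))

# Sending the request requires network access, for example:
# from urllib.request import urlopen
# body = urlopen(req).read().decode("utf-8")
```

The URL stays the same for every representation; only the Accept header changes, which is why the service suits machine-to-machine use.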
Content Negotiation
Resolving to a citation:
http://data.datacite.org/application/xdatacite+text/10.5524/100005
Li, J; Zhang, G; Lambert, D; Wang, J (2011): Genomic data from
Emperor penguin. GigaScience.
http://dx.doi.org/10.5524/100005
Content Negotiation
Resolving to RDF metadata:
http://data.datacite.org/application/rdf+xml/10.5524/100005
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:j.0="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="http://dx.doi.org/10.5524/100005">
    <j.0:identifier>10.5524/100005</j.0:identifier>
    <j.0:creator>Li, J</j.0:creator>
    <j.0:creator>Zhang, G</j.0:creator>
    <j.0:creator>Wang, J</j.0:creator>
    <owl:sameAs>doi:10.5524/100005</owl:sameAs>
    <owl:sameAs>info:doi/10.5524/100005</owl:sameAs>
    <j.0:publisher>GigaScience</j.0:publisher>
    <j.0:creator>Lambert, D</j.0:creator>
    <j.0:date>2011</j.0:date>
    <j.0:title>Genomic data from the Emperor penguin (Aptenodytes forsteri)</j.0:title>
  </rdf:Description>
</rdf:RDF>
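A client can read such a response with any XML parser. The Python sketch below parses an abbreviated copy of the RDF above and extracts creators, title, and year. Two things are assumptions for illustration: the helper name `summarize`, and the prefix `dcterms`, which replaces the generated prefix `j.0` but is bound to the same namespace URI.

```python
import xml.etree.ElementTree as ET

RDF_NS = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
DCTERMS = "{http://purl.org/dc/terms/}"

# Abbreviated copy of the response shown on the slide.
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="http://dx.doi.org/10.5524/100005">
    <dcterms:identifier>10.5524/100005</dcterms:identifier>
    <dcterms:creator>Li, J</dcterms:creator>
    <dcterms:creator>Zhang, G</dcterms:creator>
    <dcterms:publisher>GigaScience</dcterms:publisher>
    <dcterms:date>2011</dcterms:date>
    <dcterms:title>Genomic data from the Emperor penguin (Aptenodytes forsteri)</dcterms:title>
  </rdf:Description>
</rdf:RDF>"""


def summarize(rdf_xml):
    """Extract a few Dublin Core terms from an RDF/XML description."""
    root = ET.fromstring(rdf_xml)
    desc = root.find(RDF_NS + "Description")
    return {
        "creators": [c.text for c in desc.findall(DCTERMS + "creator")],
        "title": desc.findtext(DCTERMS + "title"),
        "year": desc.findtext(DCTERMS + "date"),
    }


print(summarize(RDF_XML))
```

Matching on the namespace URI rather than the prefix keeps the code working regardless of which prefix the server happens to generate.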
New DataCite Developments
• ORCID and DataCite Interoperability Network
(http://odin-project.eu/)
• http://datacite.labs.orcid-eu.org/
Link datasets to your ORCID profile
• Inclusion of a DataCite interface in the next version of DSpace
Links
• http://schema.datacite.org
Access to all versions of the DataCite metadata schema, with documentation,
schema definition, and examples.
• http://search.datacite.org
Search engine for all metadata stored by DataCite.
• http://oai.datacite.org
DataCite's OAI-PMH service, which allows access to the metadata.
• http://data.datacite.org
DataCite Content Service exposes metadata using multiple formats.
• http://test.datacite.org
DataCite's test system includes all services, such as MDS, Search, and Content
Negotiation.
• http://stats.datacite.org
Display of registration and resolving statistics.
Thank you for your attention!