Slides - California Digital Library

SCIENTIFIC DATA
Presentation to the California Digital Library, 20th June 2014
Ruth Wilson – Head of Publishing Services
Andrew Hufton – Managing Editor
Iain Hrynaszkiewicz – Head of Data and HSS
Introduction
• Open Access at NPG
• Drivers for data publication
• Scientific Data
• Next steps
Development of Open Access
General landscape
2000: PubMed Central launched
2001: Nature, Science, and the Third World Academy of Sciences launch SciDev, a free online source of science news and research
2003: The Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities signed
2005: The Wellcome Trust introduced its open access mandate to Wellcome-funded research
2005: National Institutes of Health adopted NIH Public Access Policy
2006: RCUK open access mandates come into effect
2009: First international Open Access Week
2013: Obama administration (US) and HEFCE (UK) both introduce open access mandates for taxpayer-funded research
2014: Chinese science research funding agencies mandate open access
Open Access at NPG
2002: NPG ceases to require copyright transfer on research articles
2005: First full OA title launched, Molecular Systems Biology
2009-2011: All non-Nature journals offer OA option
2011: Scientific Reports launched
2013: Nature Publishing Group partners with open access publisher Frontiers
2014: Launch of Scientific Data
2014: Launch of Nature Partner Journals
2014: 51% of NPG and Frontiers content is published open access
Open Access at Nature Publishing Group
Nature Communications
Launched in 2010, NatComms
now has an impact factor of
10.015 and receives more
submissions than Nature
Frontiers
A community-oriented open-access
academic publisher and research
network.
Scientific Data
Open access publication for Data Descriptors:
peer-reviewed scientific publications that
provide detailed descriptions of datasets.
Scientific Reports
Fully open access, Scientific
Reports is a primary research
publication covering all areas of the
natural sciences.
Society open access
journals
We publish 18 fully open access
titles with society partners
Nature Partner Journals
A new series of online open access
journals, published in collaboration
with world-renowned international
partners.
Subscription journals offering open access
option
Over 40 journals in the NPG family offer an open access
option.
Concept development
Drivers for data publication
Two important factors are driving efforts to make research data more
available and reusable:
• To ensure the scientific process is transparent and can be
scrutinised and research results reproduced
• To speed the scientific process, lead to new insights and reduce
duplicated and repeated work
To achieve this, research data needs to be:
Available, Discoverable, Interpretable, Re-usable, Citable
Stakeholders
Funders/researchers/research institutes/data
repositories/libraries/learned societies/publishers/standards
groups/curators
Researchers and data
What do researchers do with their data?
~75% of researchers store their data locally and do not publish it
~17% publish data in supplementary info
~14% delete research data
~10% deposit data in a public repository
A strong collaborative culture exists among researchers:
They share 60% of their data with their colleagues
50% look at other researchers’ datasets at least once a month
Researchers are supportive of Scientific Data:
Over 90% reacted positively to the concept of Scientific Data
80% believed that Scientific Data would increase repository deposition rates
What was important to them?
• 96%: increased visibility and discovery of their research data
• 95%: increased usability of their research data
• 93%: credit mechanism for those who take the time to deposit and explain their data
• 80%: peer review of content/datasets
A new open-access publication for descriptions
of scientifically valuable datasets
Now Live!
Get Credit for Sharing Your Data
Publications will be indexed and citable.
Open-access
Authors select from three Creative Commons licenses for the main Data
Descriptor. Each publication is supported by CC0 metadata.
Focused on Data Reuse
All the information others need to reuse the data; no interpretative analysis,
or hypothesis testing
Peer-reviewed
Rigorous peer-review focused on technical data quality and reuse value
Promoting Community Data Repositories
Not a new data repository; data stored in community data repositories
Data Descriptor
Focus on data reuse
Detailed descriptions of the methods and technical analyses
supporting the quality of the measurements.
Does not contain tests of new scientific hypotheses
Sections:
• Title
• Abstract
• Background & Summary
• Methods
• Technical Validation
• Data Records
• Usage Notes
• Figures & Tables
• References
• Data Citations
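A minimal sketch of how a draft could be checked against the section list above. The section names come from the slide; the function name, the input format (a list of heading strings) and the idea that every listed section is expected in every draft are assumptions made for illustration, not a description of the journal's submission system.

```python
# Section names copied from the slide, in slide order.
# Treating all of them as expected in every draft is an assumption.
EXPECTED_SECTIONS = [
    "Title",
    "Abstract",
    "Background & Summary",
    "Methods",
    "Technical Validation",
    "Data Records",
    "Usage Notes",
    "Figures & Tables",
    "References",
    "Data Citations",
]


def missing_sections(draft_headings):
    """Return the expected sections absent from a draft (hypothetical helper).

    Matching ignores case and surrounding whitespace.
    """
    present = {heading.strip().lower() for heading in draft_headings}
    return [s for s in EXPECTED_SECTIONS if s.lower() not in present]


# Hypothetical draft that lacks several sections.
draft = ["Title", "Abstract", "Methods", "Data Records", "References"]
print(missing_sections(draft))
# ['Background & Summary', 'Technical Validation', 'Usage Notes',
#  'Figures & Tables', 'Data Citations']
```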
Data Descriptor
• Article or narrative component (PDF and HTML)
• Experimental metadata or structured component (in-house curated, machine-readable formats)
Data Citations
Formally link Data Descriptor to external data records
Joint Declaration of Data Citation Principles
by the Data Citation Synthesis Group, incl.:
- CODATA
- Research Data Alliance
- Force11
Data Descriptor
structured metadata (CC0)
In-house curation team:
• assists users to submit the
structured content via simple
templates and an internal
authoring tool
• performs value-added semantic
annotation of the experimental
metadata
For advanced users/service
providers willing to export ISA-Tab
for direct submission, we have
released a technical specification:
(Diagram labels: analysis, method, script, data file or record in a database)
Our data policies
Clear data sharing policies
• Data must be deposited to an approved data repository
before manuscript submission, prior to peer-review.
• If datasets are private, they must be made accessible to
editors and referees in a secure and confidential manner.
• Authors must agree to release data to the public, without undue
restrictions, at the time of publication.
• Reasonable controls allowed for datasets with human privacy
restrictions.
Data repositories criteria
1. Broad support and recognition within their scientific
community
2. Ensure long-term persistence and preservation of datasets
in their published form
3. Provide expert curation
4. Implement relevant, community-endorsed reporting
requirements
5. Provide for confidential review of submitted datasets
6. Provide stable identifiers for submitted datasets
7. Allow public access to data without unnecessary
restrictions
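The seven criteria above can be read as a checklist. The sketch below encodes them as boolean fields and reports which ones a candidate repository does not meet. The class name, field names and the example values are hypothetical; how the journal actually assesses repositories is not described in the slides.

```python
from dataclasses import dataclass, fields


@dataclass
class RepositoryChecklist:
    """One flag per criterion on the slide (field names are illustrative)."""
    community_recognition: bool             # 1. broad support and recognition
    long_term_preservation: bool            # 2. persistence of published datasets
    expert_curation: bool                   # 3. expert curation
    community_reporting_requirements: bool  # 4. community-endorsed reporting
    confidential_review: bool               # 5. confidential review of submissions
    stable_identifiers: bool                # 6. stable dataset identifiers
    unrestricted_public_access: bool        # 7. public access without undue limits


def unmet_criteria(checklist):
    """Return the names of criteria that are not satisfied, in slide order."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]


# Hypothetical repository that lacks a confidential review workflow.
candidate = RepositoryChecklist(
    community_recognition=True,
    long_term_preservation=True,
    expert_curation=True,
    community_reporting_requirements=True,
    confidential_review=False,
    stable_identifiers=True,
    unrestricted_public_access=True,
)
print(unmet_criteria(candidate))  # ['confidential_review']
```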
Our recommended repositories
• We currently recognize over 60 public data repositories.
• We have integrated systems with both figshare and
Dryad
• No institutional repositories yet, but we are open to
adding them …
The right licence for the right content
Data Descriptor article: Licensed under one of
three Creative Commons licenses, by author
choice:
Metadata: released under the CC0 waiver to
maximize reuse and aid data miners
Data: the primary datasets will reside in public
repositories. Partnering with figshare and Dryad,
which both use the CC0 waiver.
Diverse content from across the natural sciences
Ecology
• Associated Nature
articles
• Data in figshare
• Integrated figshare
data viewer
• Citizen science
project
Neuroscience
• New Dataset
• Data in OpenfMRI
• Source code in GitHub
• Big Data
• Code in GitHub
Data Descriptor: relation to traditional articles
Methods and technical analyses supporting the quality of the measurements.
Do not contain tests of new scientific hypotheses
What did I do to generate the
data?
How was the data processed?
Where is the data?
Who did what, and when?
Synthesis
Analysis
Conclusions
Community
Advisory Panel
• Guide the development, policies, standards and
editorial scope of Scientific Data.
• Senior scientists from academia and industry along
with representatives from the data repository,
librarian, biocurator and funder communities.
Editorial Board
Active scientists oversee peer-review
Peer-review assesses
• The completeness of the description
• Alignment with community standards
• Data deposition in an appropriate repository
• Technical quality of the measurements
• Reuse value
Scientific Data & the University of California
Advisory Panel
Patricia Cruse, CDL
Joseph Ecker, Salk & UCSD
Editorial Board
Michelle Arkin, UCSF
Trey Ideker, UCSD
Maryann Martone, UCSD
Adam Renslo, UCSF
Amir AghaKouchak, UCI
Thanks for your time!
Helping you publish, discover
and reuse research data
Now launched!
Honorary Academic Editor
Susanna-Assunta Sansone
Advisory Panel and Editorial
Board including senior researchers,
funders, librarians and curators
Supported by
Visit
nature.com/scientificdata
Email
[email protected]
Tweet @ScientificData