Friday 1 June 2012 RSP Scholarly Communications: New Developments in Open Access, RIBA Encouraging data publication - the JISC Managing Research Data Programme Simon.

Friday 1 June 2012
RSP Scholarly Communications: New Developments in Open Access, RIBA
Encouraging data publication - the JISC Managing
Research Data Programme
Simon Hodson
JISC Programme Manager, Managing Research Data
Finally, a lamentable element of the culture in social psychology
and psychology research is for everyone to keep their own data
and not make them available to a public archive. This is a problem
on a much larger scale, as has recently become apparent.
Even where a journal demands data accessibility, authors usually do
not comply (Wicherts et al. 2006). Archiving and public access to
research data not only makes this kind of data fabrication more
visible, it is also a condition for worthwhile replication and meta-analysis.
Recommendation
Far more than is customary in psychology research practice,
research replication must be made part of the basic instruments of
the discipline. Research data that underlie psychology publications
must be held on file for at least five years after publication, and be
made available on request to other scientific practitioners. This rule
is to apply not only to raw laboratory data, but also to completed
questionnaires, audio and video recordings, etc. The publication
must state where the raw data reside and how to access them.
INTERIM REPORT REGARDING THE BREACH OF SCIENTIFIC INTEGRITY
COMMITTED BY PROF. D.A. STAPEL
Tilburg, 31 October 2011
Data Reuse: asking new questions
 Papers based upon reuse of archived observations now exceed those
based on the use described in the original proposal.
– http://archive.stsci.edu/hst/bibliography/pubstat.html
Combining data from disparate sources
 ‘New technologies for sharing data
and for combining data from
disparate sources are particularly
valuable in multidisciplinary fields such
as earth science and nanoscience. ...
The challenge of federating, mining,
analysing and interpreting these data
will be a key focus in coming years.’
http://www.rin.ac.uk/ourwork/using-and-accessinginformation-resources/physicalsciences-case-studies-use-anddiscovery-
Data Management and Data Publication
 Good data management is good for research
– More efficient research process, avoidance of data loss
 Data sharing / data publication is good for research
– Verification of research findings
– Benefits of data reuse: new questions
– Metastudies; integration of data in interdisciplinary research
 Research funder policies, legislative frameworks, good practice, open data
agenda
– The outputs of publicly funded research should be publicly available.
– The evidence underpinning research findings should be available for
validation
 Alignment with university missions.
– Universities want to provide excellent research infrastructure.
– Universities want to have better oversight of research outputs.
Research Data Principles (1)
 Publicly funded research data are a public good, produced in the public
interest, which should be made openly available in a timely manner with as
few restrictions as possible. (RCUK, EPSRC)
 Data with acknowledged long-term value should be preserved and remain
accessible and usable for future research. (RCUK, EPSRC)
 Sharing research data is an important contributor to the impact of publicly
funded research. (EPSRC)
 Research data of future historical interest, and all research data that
represent records of the University, including data that substantiate
research findings should be deposited. (Edinburgh).
 Published results should always include information on how to access
the supporting data. (RCUK)
Research Data Principles (2)
 To recognise the intellectual contributions of researchers who generate,
preserve and share key research datasets, all users of research data
should acknowledge the sources of their data and abide by the terms and
conditions under which they are accessed. (RCUK, EPSRC)
 There are legitimate reasons for restricting access to data (legal, ethical,
commercial). Researchers and research institutions should ensure data is
well managed throughout the lifecycle in order to guard against inappropriate
release of research data. (RCUK, EPSRC)
 RCUK Common Principle on Research Data
http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
 EPSRC Research Data Principles
http://www.epsrc.ac.uk/about/standards/researchdata/Pages/principles.aspx
 University of Edinburgh RDM Policy http://www.ed.ac.uk/schoolsdepartments/information-services/about/policies-and-regulations/researchdata-policy
EPSRC Research Data Policy Expectations
 Research organisations to have RDM policy, advocacy and support
functions. (i, iii)
 Research data to be effectively managed and curated throughout the
life-cycle (viii)
 Research organisations to maintain public catalogue of research
data holdings, adequate metadata and permanent identifier (v)
 Publications to indicate how research data can be accessed (ii)
 Data to be retained for 10 years from last access (vii)
 Research data management to be adequately resourced from
appropriate funding streams (ix)
 Roadmap in place by 1 May 2012
 Compliance by 1 May 2015
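As a rough illustration of the retention expectation (vii) summarised above, the sketch below computes the date until which a dataset would need to be kept, counting 10 years from the last access. The function name and the handling of 29 February are assumptions made for this example; the policy text itself should be consulted for the precise definition of when the clock starts.

```python
from datetime import date

# Retention period as summarised on the slide for EPSRC expectation (vii).
RETENTION_YEARS = 10


def retain_until(last_access: date, years: int = RETENTION_YEARS) -> date:
    """Return the date on which the retention period ends (hypothetical helper)."""
    target_year = last_access.year + years
    try:
        return last_access.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year,
        # so this sketch falls back to 28 February (an assumption).
        return last_access.replace(year=target_year, day=28)


if __name__ == "__main__":
    print(retain_until(date(2012, 6, 1)))
```

Because the period runs from the last access, each new access request would push the retention date forward.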
Dryad Data Repository
JDAP: Joint Data Archiving Policy
 Joint Data Archiving Policy: http://datadryad.org/jdap
 Joint declarations, Feb 2010, in American Naturalist, Evolution, the Journal of Evolutionary
Biology, Molecular Ecology, Heredity, and other key journals in evolution and ecology:
http://www.journals.uchicago.edu/doi/full/10.1086/650340
 This journal requires, as a condition for publication, that data supporting the results
in the paper should be archived in an appropriate public archive, such as GenBank,
TreeBASE, Dryad, or the Knowledge Network for Biocomplexity.
 Allows embargos of up to one year;
allows exceptions for, e.g., sensitive
information such as human subject
data or the location of endangered
species.
 ‘Data that have an established
standard repository, such as DNA
sequences, should continue to be
archived in the appropriate repository,
such as GenBank. For more
idiosyncratic data, the data can be
placed in a more flexible digital data
library such as the National Science
Foundation-sponsored Dryad archive
at http://datadryad.org.'
‘Some BioMed Central journals
now additionally encourage or
require authors, as a condition of
publication, to include in some
article types a section that
provides a permanent link to
the data supporting the results
reported in the article. … The
aim is to provide links in a
consistent place within an article
to supporting data - regardless of
the location or format of the data - and to make it clear to readers
when they can also access the
data as well as the article.’
Adopted by 20 BMC journals
between Aug 2011 and Mar
2012
(most encourage…)
http://www.biomedcentral.com/about/supportingdata
Challenges and Questions…
 What research data should be kept and for how long?
– Relates to the use cases: verification and reuse…
 How do we best support the good management and publication of
research data?
 Where should research data be archived?
– Discipline data centres, institutional data repositories…
– How do we ensure discoverability?
 Where this requires change in practice, how do we motivate
researchers to make data available in a usable form?
Barriers to data sharing…
Recognition
Recognition
Recognition
 Practical:
– Lack of infrastructure.
– Lack of support or expertise.
– Technical challenges (data types, metadata, systems to support RDM)
 Behavioural:
– Concern that data may be misused.
– Concern that they will lose their scientific edge.
– Concern that they will not be credited.
– Lack of career rewards for data publication.

See ODE report, using Parse.Insight findings: http://www.alliancepermanentaccess.org/wp-content/uploads/downloads/2011/11/ODE-ReportOnIntegrationOfDataAndPublications-1_1.pdf

RIN Report, ‘To Share or not to share’, http://www.rin.ac.uk/our-work/data-management-and-curation/share-or-notshare-research-data-outputs
Supporting the Research Data Lifecycle
[Diagram: research data lifecycle, with stages labelled Plan, Create, Use, Appraise, Select, Discard, Hand Over?, Store, Annotate, Describe, Identify, Publish, Discover, Access and Reuse]
A holistic approach…
• Leadership and Policy Development
• Publication, Citation and Discovery Mechanisms
• RDM Systems and Infrastructure
• Guidance and Training
• Support for Data Management Planning
Citing and linking to research data
DCC Briefing Paper:
Ball, A., Duke, M. (2011). ‘Data Citation
and Linking’. DCC Briefing Papers.
Edinburgh: Digital Curation Centre.
Available online:
http://www.dcc.ac.uk/resources/briefingpapers/
DCC How to Guide:
Ball, A. & Duke, M. (2011). ‘How to Cite
Datasets and Link to Publications’. DCC
How-to Guides. Edinburgh: Digital
Curation Centre. Available online:
http://www.dcc.ac.uk/resources/howguides
Dryad-UK Project
Dryad-UK
 Expand the number of journals: BMJ Open, titles from PLoS and BioMed Central.
 Prepare a business model for long term funding of the data repository: e.g. supported
by payments from journals, in turn recouped from subscription or author-pays OA fees.
[Diagram labels: Research Funder, Research Project, Gold OA Fee, Journal, Repository]
Costs
 Estimate costs of archiving (curation and preservation) datasets in Dryad: $25-75 per
publication
 Estimate full costs of research and publication per OA article: $2500
 Costs of data archiving in Dryad 1-3% of costs of producing the article.
See Piwowar http://researchremix.wordpress.com/page/2/ and Vision ‘Open Data and the
Social Contract of Scientific Publishing’
http://www.bioone.org/doi/full/10.1525/bio.2010.60.5.2
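The 1-3% figure follows directly from the two estimates quoted on the slide. The short sketch below reproduces the arithmetic; the constant and function names are chosen only for this illustration.

```python
# Estimates quoted on the slide, in US dollars.
ARCHIVING_COST_RANGE = (25, 75)  # Dryad curation and preservation, per publication
ARTICLE_COST = 2500              # full cost of research and publication, per OA article


def archiving_share(archiving_cost: float, article_cost: float = ARTICLE_COST) -> float:
    """Return the archiving cost as a percentage of the article cost."""
    return 100 * archiving_cost / article_cost


low, high = (archiving_share(cost) for cost in ARCHIVING_COST_RANGE)
print(f"Archiving is {low:.0f}% to {high:.0f}% of the cost of producing the article")
```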
New Funding Model
(officially adopted May 2012)
A range of payment plans to be offered in order to account for differing usage models:
1. Journal-based
• One or more journals from a publisher prepay based on number of
research articles published
• Deposit price of ca. $25 per research paper
2. Vouchers
• Any organisation may pay in advance for a fixed number of deposits
• Deposit price of ca. $50 per research paper
3. Pay-as-you-go
• Any organisation may pay retrospectively for deposits
• Deposit price of ca. $50 + surcharge per research paper
4. Author-pays
• For authors submitting via journals not having one of the above plans
• Deposit price of ca. $50 + surcharge per research paper
• Additional curation charges if not involving an integrated journal
Slide Credit: Brian Hole
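The four payment plans can be read as a small pricing table. The sketch below is one way to express it, assuming the approximate base prices from the slide; the slide gives no figure for the surcharge or the additional curation charge, so both are left as caller-supplied parameters, and every name here is hypothetical.

```python
from enum import Enum


class Plan(Enum):
    JOURNAL_BASED = "journal-based"
    VOUCHER = "voucher"
    PAY_AS_YOU_GO = "pay-as-you-go"
    AUTHOR_PAYS = "author-pays"


# Approximate ("ca.") deposit prices per research paper from the slide, in US dollars.
BASE_PRICE = {
    Plan.JOURNAL_BASED: 25,
    Plan.VOUCHER: 50,
    Plan.PAY_AS_YOU_GO: 50,
    Plan.AUTHOR_PAYS: 50,
}

# Plans that the slide describes as carrying a surcharge.
SURCHARGED_PLANS = {Plan.PAY_AS_YOU_GO, Plan.AUTHOR_PAYS}


def deposit_price(plan: Plan, surcharge: float = 0.0,
                  curation_charge: float = 0.0,
                  integrated_journal: bool = True) -> float:
    """Estimate the deposit price for one research paper.

    The surcharge and curation_charge amounts are not stated on the slide,
    so they must be supplied by the caller (assumption of this sketch).
    """
    price = float(BASE_PRICE[plan])
    if plan in SURCHARGED_PLANS:
        price += surcharge
    # Extra curation applies to author-pays deposits without an integrated journal.
    if plan is Plan.AUTHOR_PAYS and not integrated_journal:
        price += curation_charge
    return price
```

For example, with a placeholder surcharge of 10, `deposit_price(Plan.PAY_AS_YOU_GO, surcharge=10)` adds that amount to the base price of 50, while a voucher deposit ignores the surcharge.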
New Governance Model
(officially adopted May 2012)
1. Incorporation of Dryad as a US tax-exempt not-for-profit organisation.
2. A twelve-person Board of Directors, vested with legal and financial responsibility
for Dryad
a. Directors to serve for staggered three-year terms and able to serve multiple
terms
b. Directors to be elected by (but not necessarily from) the members
c. Board members to appoint the officers of the Board, as well as to appoint ex officio members as required.
3. Dryad executive staff to be hired as employees, reporting to the Board of
Directors.
4. Membership of Dryad to be open to any legitimate organisation that supports its
mission.
5. Members to pay a modest annual fee, not to be depended on for revenue.
6. Members to vote on amendments to the Articles and By Laws, and serve as an
advisory body to the Board of Directors.
Slide Credit: Brian Hole
Data and publications
ODE Report on integration of data and publications:
http://www.alliancepermanentaccess.org/wp-content/uploads/downloads/2011/11/ODE-ReportOnIntegrationOfDataAndPublications-1_1.pdf
Linking / integrating data and publications:
 makes the data more discoverable
 makes the data easier to interpret
 gives the author better credit for the data
 and conversely: the data add depth to the article and facilitate
better understanding.
 ParseInsight findings: 85% of researchers are in favour of linking data
with literature.
 Recognition of data as a ‘first class research object’, requiring
preservation, recognition, validation and dissemination just like
articles.
Data Publication Initiatives
 Initiatives for linking data to literature, for data papers
and for peer review of data.
 BMC Research Notes (‘encourages the publication of
software tools, databases and data sets and a key
objective of the journal is to ensure that associated data
files will, wherever possible, be published in standard,
reusable formats’)
http://www.biomedcentral.com/bmcresnotes/
 Earth System Science Data aims ‘to publish data
according to the conventional fashion of publishing
articles, applying the established principles of quality
assessment through peer-review to datasets’
http://www.earth-system-science-data.net/
 RDMF8 ‘Engaging with the publishers’:
http://www.dcc.ac.uk/events/research-datamanagement-forum-rdmf/rdmf8-engaging-publishers
PRIME
Publisher, Repository and Institutional Metadata Exchange
UCL LIBRARY SERVICES
[email protected]
2012
INSTITUTE OF ARCHAEOLOGY
75 YEARS OF LEADING GLOBAL ARCHAEOLOGY
www.ubiquitypress.com / @ubiquitypress
PRIME: Project focus
• Developing a system to exchange metadata between:
• the UCL Discovery EPrints institutional repository
• the Archaeology Data Service subject repository
• the Journal of Open Archaeology Data
• Focusing on archaeology data only to pilot the system
• Building on other successful JISC projects:
• DryadUK
• REWARD
• SWORD-ARM
PRIME: Use Case #1
• A UCL Researcher deposits data in an external subject repository.
• The subject repository sends the metadata and DOI of the data to the
UCL institutional repository so that it has a record of the output.
PRIME: Use Case #2
• A UCL Researcher deposits data in their institutional repository.
• The institutional repository sends the metadata and DOI of the data to
the appropriate subject repository so that it has a record of the output.
PRIME: Use Case #3
• A UCL Researcher submits an article to a journal, and is asked to archive the data
as a precondition of publication.
• The journal sends the metadata to the subject repository so that the author does
not have to re-enter it.
• The subject repository sends the metadata and DOI of the data to the
institutional repository so that it has a record of the output, and the DOI back to
the journal to link the article with the data.
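The message flow in Use Case #3 can be sketched as three cooperating objects. This is only an illustrative model of who sends what to whom, not the PRIME implementation: the class names, the metadata fields and the DOI format (which uses the placeholder prefix 10.5555) are all assumptions made for the example.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DatasetMetadata:
    title: str
    creator: str
    doi: Optional[str] = None  # filled in by the subject repository on deposit


@dataclass
class InstitutionalRepository:
    records: List[DatasetMetadata] = field(default_factory=list)

    def record_output(self, metadata: DatasetMetadata) -> None:
        """Keep a record of a dataset held elsewhere."""
        self.records.append(metadata)


@dataclass
class SubjectRepository:
    institutional_repo: InstitutionalRepository
    doi_prefix: str = "10.5555"  # placeholder prefix, not a real allocation
    holdings: List[DatasetMetadata] = field(default_factory=list)

    def deposit(self, metadata: DatasetMetadata) -> str:
        """Archive the dataset, mint a DOI, and notify the institutional repository."""
        doi = f"{self.doi_prefix}/dataset-{len(self.holdings) + 1}"
        registered = replace(metadata, doi=doi)
        self.holdings.append(registered)
        self.institutional_repo.record_output(registered)
        return doi  # returned to the journal so it can link article and data


@dataclass
class Journal:
    subject_repo: SubjectRepository
    article_data_links: Dict[str, str] = field(default_factory=dict)

    def submit_article(self, article_id: str, metadata: DatasetMetadata) -> str:
        """Forward the author's metadata so it does not have to be re-entered."""
        doi = self.subject_repo.deposit(metadata)
        self.article_data_links[article_id] = doi
        return doi
```

In this model the author enters the metadata once, at the journal, and both repositories and the article end up holding the same DOI.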
PREPARDE: Peer REview for Publication & Accreditation of
Research Data in the Earth sciences
• Lead Institution: University of Leicester
• Partners
– British Atmospheric Data Centre (BADC)
– US National Centre for Atmospheric Research (NCAR)
– California Digital Library (CDL)
– Digital Curation Centre (DCC)
– University of Reading
– Wiley-Blackwell
– Faculty of 1000 Ltd
• Project Lead: Dr Jonathan Tedds (University of Leicester, [email protected])
• Project Manager: Dr Sarah Callaghan (BADC, [email protected])
• Length of Project: 12 months
• Project Start Date: 1st July 2012
• Project End Date: 30th June 2013
• Total Funding Requested from JISC: £135,025
• Total Institutional Contributions: £80,207
Geoscience Data Journal, Wiley-Blackwell and the
Royal Meteorological Society
• supported by NERC – in particular the British Atmospheric Data Centre
• partnership formed between Royal Meteorological Society & academic publishers Wiley-Blackwell
• develop a mechanism for the formal publication of data in the Open Access Geoscience
Data Journal
• builds on JISC funded OJIMS (Overlay Journal Infrastructure for Meteorological Sciences)
project
Example of (potential)
steps/workflow required for a
researcher to publish a data paper
Items in orange will be investigated
in PREPARDE
• Author guidelines for data
papers and submissions
• Repository accreditation
• Scientific review of data
• Linking mechanisms
• Divisions of responsibility
between journals and data
repositories.
Solutions will be tested with
partners.
Three workshops in early 2013.
Recommendations to broader
community
PREPARDE objectives
• capture and manage workflows required to operate the Geoscience Data Journal
– from submission of a new data paper and dataset, through review and to publication
• develop procedures and policies for authors, reviewers and editors
– allow the Geoscience Data Journal to accept data papers as submissions for publication
– focus on guidelines for scientific reviewers who will review the datasets
• incorporate some technical developments at the point of submission
– data visualisation checks
– interface improvements
– enhance the resulting data publications
• put in place procedures needed for data publication in the California Digital Library
• interact with the wider scientific and data community
– provide recommendations on accreditation requirements for data repositories
• engage the user and stakeholder community
– promote long-term sustainability and governance of data journals
Project team, roles and responsibilities
• University of Leicester (UoL): project lead, academic liaison and administration
– liaise with range of academic stakeholders internationally => input to workshops and guidelines
• British Atmospheric Data Centre (BADC): project management
– provide technical input into the cross-linking, workflows and scientific review work packages
• F1000: contribute broader perspective from the biomedical sciences
– extend the impact from PREPARDE to the biomedical community through launch of F1000 Research
– use of F1000 Advisory Panel and F1000 Faculty & co-organise a workshop
• US National Centre for Atmospheric Research (NCAR): use cases of data review methods
– feedback on development of peer review guidelines and workflows for the Geoscience Data Journal
– contribute to the data repository accreditation report and workshops
• California Digital Library (CDL): investigate a lightweight data paper convention
– informal (crowd-sourced) review
– contribute a broader international perspective from outside the Earth Sciences
• Wiley-Blackwell: provide publishing platform
– work with partners to develop and implement workflows & support technical enhancements to enhance authorial
and readership experiences
• Digital Curation Centre: assess peer review models & overlap with data repository appraisal
and ingest processes
– respective roles of main stakeholders
– Workflows and interoperability with data centres and publishers
– Trusted Digital Repository certification.
Journal Data Availability Policies
 Research shows that journal data availability and sharing policies are
influential upon researcher data archiving.
– Piwowar HA (2011) Who Shares? Who Doesn't? Factors Associated with
Openly Archiving Raw Research Data. PLoS ONE 6(7): e18657;
doi:10.1371/journal.pone.0018657
 Useful for researchers, research support staff, data repository
managers, librarians and policy makers to have readily accessible
summary of journal policies (à la Sherpa RoMEO).
 Feasibility study and analysis of business models for a registry
of journal research data availability and sharing policies.
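To make the registry idea concrete, the sketch below shows what one record in such a registry might hold, using the Joint Data Archiving Policy details given earlier in this talk (archiving required, embargoes of up to one year, the named public archives). The record structure and field names are purely hypothetical; the feasibility study described above would determine what a real registry captures.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Requirement(Enum):
    ENCOURAGES = "encourages"
    REQUIRES = "requires"


@dataclass(frozen=True)
class JournalDataPolicy:
    journal: str
    requirement: Requirement
    max_embargo_months: Optional[int]   # None if the policy states no embargo limit
    suggested_archives: Tuple[str, ...]


# Archives named in the JDAP statement quoted earlier in the talk.
JDAP_ARCHIVES = ("GenBank", "TreeBASE", "Dryad", "Knowledge Network for Biocomplexity")

# Journals named on the JDAP slide; JDAP allows embargoes of up to one year.
REGISTRY: List[JournalDataPolicy] = [
    JournalDataPolicy(name, Requirement.REQUIRES, 12, JDAP_ARCHIVES)
    for name in (
        "American Naturalist",
        "Evolution",
        "Journal of Evolutionary Biology",
        "Molecular Ecology",
        "Heredity",
    )
]


def journals_requiring_archiving(registry: List[JournalDataPolicy]) -> List[str]:
    """List the journals whose policy makes data archiving a condition of publication."""
    return [p.journal for p in registry if p.requirement is Requirement.REQUIRES]
```

A lookup like `journals_requiring_archiving(REGISTRY)` is the kind of query a researcher or repository manager might run against such a summary.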
Thank You!
 First JISC MRD Programme, 2009-11: http://bit.ly/jiscmrd2009-11
 JISC MRD Outputs Page: http://bit.ly/jiscmrd2009-11-outputs
 Second JISC MRD Programme, 2011-13: http://bit.ly/jiscmrd2009-11
 Programme Blog: http://researchdata.jiscinvolve.org/
 E-mail: [email protected]
 Twitter: simonhodson99
 Acknowledgements for slides, content: Andrew Treloar, Eefke Smit, Brian Hole (DryadUK, PRIME), Jonathan Tedds (and others from PREPARDE).