Data citation in CSIRO - Australian National Data Service

Data citation in CSIRO
Building a culture of data citation
Anne Stevenson | Research Data Services Support
Adapted and presented by David Benn, 14th March 2014
CSIRO INFORMATION MANAGEMENT & TECHNOLOGY
Image: http://xkcd.com/285/
The big picture
• Research data is a critical enabler of CSIRO’s collaborative
science.
• CSIRO IM&T recognises that research data is a valuable
organisational asset that needs to be managed.
• Data citation is part of this bigger picture.
• Global movement from funders and journal publishers to
support and promote data sharing and citation
• Nature, Science, PLoS, ...
2 | Data citation in CSIRO| Anne Stevenson
The big picture
• Appeal to researchers to describe and manage their data, for:
• Accountability: publicly funded research, responsibility to
provide access to data
• Access to data is one step towards reproducibility
• Reuse: data product used by other researchers; same
datasets, multiple analyses
• Reputation: protection against loss of data, broken links to
data, unreadable media
The big picture
• Research data is a critical enabler of CSIRO’s collaborative
science.
• CSIRO IM&T recognises that research data is a valuable
organisational asset that needs to be managed.
• 20TB of Parkes pulsar data is one example of data-driven
discovery.
• Needs to be managed, protected against loss, versioned,
described, made broadly available.
• Work is underway to process Parkes pulsar observations in
DAP, in order to detect differences in pulse arrival times from
an array of pulsars in an attempt to detect gravitational
waves.
The big picture: Carrots
• Recognition for managing data: get your data cited
• Carrot: linking to data from publications can raise citation rates
Piwowar HA, Vision TJ. (2013) Data reuse and the open data
citation advantage. PeerJ 1:e175
http://dx.doi.org/10.7717/peerj.175
The big picture
• Data reuse and the open data citation advantage
• Piwowar HA, Vision TJ. (2013) Data reuse and the open data citation
advantage. PeerJ 1:e175
• http://dx.doi.org/10.7717/peerj.175
• (gene expression microarray data)
• After accounting for other factors affecting citation rate, we find a robust
citation benefit from open data, although a smaller one than previously
reported. We conclude there is a direct effect of third-party data reuse that
persists for years beyond the time when researchers have published most of
the papers reusing their own data.
The big picture
• Linking to Data - Effect on Citation Rates in Astronomy
• Edwin A. Henneken, Alberto Accomazzi
• http://arxiv.org/abs/1111.3618
• Is there a difference in citation rates between articles that were published
with links to data and articles that were not? Besides being interesting from a
purely academic point of view, this question is also highly relevant for the
process of furthering science. Data sharing not only helps the process of
verification of claims, but also the discovery of new findings in archival data.
The big picture
• http://blogs.loc.gov/digitalpreservation/2013/11/astronomical-data-and-astronomical-digital-stewardship-an-interview-with-brian-schmidt/
• What do you think the role of traditional libraries, museums and archives
should be when dealing with astronomical data and artifacts?
• Our work could be substantially aided with libraries providing systems
for working with and curating data. Libraries need to figure out how to
help curate and make available data and data products. Ideally, we
would have librarians taking on increasingly specialist niches, across
many institutions. In our library, we are bringing in more staff who have
expertise in data management – trained astronomers who decide they
want to be exporting data to the masses. I think training people in
library science curation is important too, and I imagine we will
increasingly see individuals with these skill sets and background
embedded in the teams that produce, maintain, and provide access to
various data products.
In place
An automated process to:
• Deliver a DOI for a publicly available dataset
– a PID (a handle) for a “non-public” dataset
• Present an attribution statement within the collection:
– Visible during the creation of the record in the Data Access Portal (DAP)
– Any changes made to the record during the deposit process are reflected in
the attribution
– Can be copied for use in publications
Facility to provide bidirectional linking between data and
publications
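The process above can be sketched in a few lines. This is a minimal illustration only, not the DAP implementation: the DatasetRecord fields, the function names and the statement layout (loosely following the DataCite-recommended "Creator (Year): Title. Publisher. Identifier" pattern) are all assumptions made for the example. It shows the two ideas on this slide: a public dataset is identified by a DOI and a non-public one by a handle, and the attribution statement is derived from the record so that any change to the record is reflected the next time the statement is generated.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical, minimal stand-in for a deposit record; not the DAP schema."""
    creators: List[str]
    year: int
    title: str
    publisher: str
    is_public: bool
    doi: Optional[str] = None      # expected for public datasets
    handle: Optional[str] = None   # expected for non-public datasets


def persistent_identifier(record: DatasetRecord) -> str:
    """Return a DOI URL for public data, otherwise a handle URL."""
    if record.is_public:
        if not record.doi:
            raise ValueError("public dataset has no DOI")
        return f"https://doi.org/{record.doi}"
    if not record.handle:
        raise ValueError("non-public dataset has no handle")
    return f"https://hdl.handle.net/{record.handle}"


def attribution_statement(record: DatasetRecord) -> str:
    """Build the statement from the record, so edits to the record flow through."""
    creators = "; ".join(record.creators)
    return (f"{creators} ({record.year}): {record.title}. "
            f"{record.publisher}. {persistent_identifier(record)}")


# Placeholder values for illustration only; the DOI is not a real one.
example = DatasetRecord(
    creators=["Smith, J.", "Lee, K."],
    year=2014,
    title="Example dataset",
    publisher="CSIRO",
    is_public=True,
    doi="10.1234/example",
)
print(attribution_statement(example))
```

Because the statement is computed rather than stored, correcting the title or adding a creator during deposit needs no separate update to the attribution text.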
Attribution statement in the Data Access Portal
Bidirectional linking
In place
• Training sessions for depositing and managing data sets in the
Data Access Portal (DAP)
– We can almost hear the attention level increase when we mention DOI
minting
• Distributing the ANDS Data Citation leaflet into CSIRO’s libraries,
information centres and canteens
• Adding Google Analytics reports to wiki (shows most viewed, most
downloaded)
• Providing altmetric data for inclusion in annual reports
• Exploring tracking and metrics options such as ImpactStory and
Thomson Reuters Data Citation Index
• Exploring the export of data set citations into reference
management software
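As an illustration of the last point, one possible route into reference management software is the RIS tagged format, which tools such as EndNote and Zotero can import. The sketch below is an assumption about how such an export might look, not a description of what CSIRO has built: the to_ris function name and the choice of fields are hypothetical, and the values shown are placeholders.

```python
from typing import List


def to_ris(creators: List[str], year: int, title: str,
           publisher: str, doi: str) -> str:
    """Serialise a dataset citation as one RIS record (type DATA)."""
    lines = ["TY  - DATA"]                            # record type: data file
    lines += [f"AU  - {name}" for name in creators]   # one AU line per creator
    lines += [
        f"PY  - {year}",
        f"TI  - {title}",
        f"PB  - {publisher}",
        f"DO  - {doi}",
        f"UR  - https://doi.org/{doi}",
        "ER  - ",                                     # end-of-record marker
    ]
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only; the DOI is not a real one.
print(to_ris(["Smith, J.", "Lee, K."], 2014,
             "Example dataset", "CSIRO", "10.1234/example"))
```

Saving the output to a .ris file and importing it should yield a dataset entry carrying the DOI, so the citation travels with the reference into a manuscript.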
In place
• Contacting authors who have journal articles to submit:
In place
• Tapping into the internal professional writing and proposal
preparation sessions to encourage referral of data sources
• A dedicated intranet section (this is currently brief and under
review)
• Using internal media channels such as Yammer and newsletters
opportunistically
• Participating in ANDS’ Project Columbus
But … resistance and reservations
“Do I publish my data then cite it in my journal article, or do I
publish my journal article and then publish my data?”
“I wouldn’t release the data before I publish the related article,
would I?”
Image: Turnauckas, Mark. Chicken And Egg, April 3, 2011. http://www.flickr.com/photos/marktee/5586165599/.
Challenges
DOI management
• Researchers “get” DOIs; they understand their relevance &
importance for publications – Natasha Simons
• Object level, collection level, changes to the data, additions to the collection
(e.g. time series)
• Collaborative data & CSIRO data stored outside CSIRO
Cultural change
• Tracking citations is a great objective, however it’s important that
researchers are citing the data they use, and they aren’t
necessarily aware that they should – Steve McEachern
• Pose the question in discussions with researchers: what happens if
you can’t find your data? – Steve McEachern
Plans
• Feed the beast!
• Include data collections/citations in internal assessments (Science
reviews) and planning
– Project proposals should include data management planning from start
• Expand intranet content on citing publications to include data
• Attempt to recruit “champions” as we are aware that researchers
are more likely to hear this message from other researchers than
from support staff
Plans
• Communications
– Link to publications AND data in media releases, news items
– Use the Google Analytics reports to notify top-viewed, top-downloaded
depositors
– News items on data reuse
• Reviewing our DOI resolution to link to CSIRO data regardless of
its location, to ensure that ALL CSIRO data can be cited easily.
Further plans
• Provenance and its relationship to citation
– Work on integrating provenance into research and data workflow
• Software
– Its relationship to the data. How it too can improve data use and thus
citation rate
– Software citation in relation to data citation
• In-context exposure – increases the likelihood of reuse, and hence citation
• Linking up our systems
– ePublish, DAP, Staff profiles, Research profiles
– Automating feeds to other portals
CSIRO Research Data Support Service
Anne Stevenson
Cynthia Love
David Benn
Dominic Hogan
Sue Cook
Plus roving data advocate: John Morrissey
Thank you
CSIRO IM&T
Anne Stevenson
Information Specialist
t +61 2 4960 6087
e [email protected]
w www.csiro.au
CSIRO INFORMATION MANAGEMENT & TECHNOLOGY