RDMRose: Research Data Management for LIS
Session 6 Managing Data
Session 6.4 Metadata and data citation
Nov-15
Learning material produced by RDMRose
http://www.sheffield.ac.uk/is/research/projects/rdmrose
Learning Outcomes
By the end of this session you will be able to
• Discuss the varying requirements of metadata
that will enable researchers to identify the
potential of a particular dataset
• Evaluate ways of citing data
• Articulate and reflect upon some of the issues
involved with citing data and datasets
Session overview
• EPSRC principles and expectations
• What is sufficient metadata?
• How to cite data?
EPSRC Principle 6
• “Sufficient metadata should be recorded and made openly
available to enable other researchers to understand the
potential for further research and re-use of the data.
Published results should always include information on how
to access the supporting data.”
• http://www.epsrc.ac.uk/about/standards/researchdata/Pages/principles.aspx
EPSRC Expectation 5
• “Research organisations will ensure that appropriately
structured metadata describing the research data they hold
is published (normally within 12 months of the data being
generated) and made freely accessible on the internet; in
each case the metadata must be sufficient to allow others to
understand what research data exists, why, when and how it
was generated, and how to access it. Where the research
data referred to in the metadata is a digital object it is
expected that the metadata will include use of a robust
digital object identifier (For example as available through the
DataCite organisation - http://datacite.org).”
• http://www.epsrc.ac.uk/about/standards/researchdata/Pages/expectations.aspx
Metadata
ACTIVITY 6.4.1
Activity 6.4.1 Metadata
• What is “sufficient metadata” that enables
“other researchers to understand the
potential for further research and re-use of
the data”?
Activity 6.4.1 Metadata
The University of Poppleton holds a dataset with meteorological
observations, taken at the university’s weather station. In particular, it
contains a set of precipitation measurements since the foundation of
the university. A climatologist, Jenny Fairweather, is interested in this
dataset for her research into climate change. She is looking for trends
in the weather. A meteorologist, Wilson Rainbird, who works for the
UK Met Office, wants to use these data for weather
prediction. He is mainly interested in combining these precipitation
measurements with other similar datasets. A researcher, Alice Snowe,
from another university’s Accident Research Unit conducts most of her
research in the area of road traffic accidents. She would like to map
the precipitation measurements to another dataset containing
information on road accidents in order to analyse possible
correlations. Lastly, the university’s data repository manager, John
Shower, is concerned with issues regarding data access and IPR.
Activity 6.4.1 Metadata
• What is “sufficient metadata” for each of
these stakeholders “to understand the
potential for further research and re-use of
the data”?
Example
• The DaMaRO project at the University of Oxford is developing a
metadata schema for its DataFinder (Rumsey, 2012).
• A three-tier metadata approach:
– Mandatory minimal metadata to enable basic discovery, such as
Creator, Title, Publisher, Date, Location, Access terms & conditions
– Mandatory contextual metadata (mostly administrative and partly
based on EPSRC expectations), such as Funding Agency, Grant Number,
Last access request date, Project Information, Data Generation
Process, Why the data was generated, Date (range) of data collection,
Reasons for embargo
– Optional metadata (including discipline-specific metadata) to enable
reuse, such as machine settings and experimental conditions under
which the data were gathered
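As a rough illustration of the three-tier approach, the sketch below checks a metadata record against the two mandatory tiers. The field names are loosely based on the examples listed above, but the snake_case keys, the `missing_mandatory` helper, and the sample record are assumptions made for illustration only; they are not the actual DaMaRO schema.

```python
# Illustrative only: keys follow the slide's examples, not the real DaMaRO schema.
MINIMAL_FIELDS = ["creator", "title", "publisher", "date", "location", "access_terms"]
CONTEXTUAL_FIELDS = ["funding_agency", "grant_number", "data_generation_process",
                     "why_generated", "collection_date_range"]


def missing_mandatory(record):
    """Return the mandatory fields that are absent or empty, grouped by tier."""
    def absent(fields):
        return [f for f in fields if not record.get(f)]
    return {"minimal": absent(MINIMAL_FIELDS), "contextual": absent(CONTEXTUAL_FIELDS)}


# Hypothetical record for the Poppleton precipitation dataset.
record = {
    "creator": "University of Poppleton",
    "title": "Precipitation measurements 1905-2010 taken at Western Bank weather station",
    "publisher": "The University of Poppleton, Meteorological Service",
    "date": "2011",
    "location": "Poppleton",
    "access_terms": "",          # left empty, so it is reported as missing
    "why_generated": "Routine meteorological observation",
    "collection_date_range": "1905-2010",
    "instrument": "rain gauge",  # optional, discipline-specific tier (not checked)
}

gaps = missing_mandatory(record)
print(gaps["minimal"])
print(gaps["contextual"])
```

A repository could run a check of this kind at deposit time, so that records lacking the mandatory tiers are flagged before they are published; the optional tier is deliberately left unchecked here.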
Data citation
ACTIVITY 6.4.2
Activity 6.4.2 Data citation
• How should data be cited?
• There are no established standards for data
citation yet, although some style manuals
such as the APA’s (in the 5th and 6th editions)
and some repositories such as the UK Data
Archive do provide instructions.
Activity 6.4.2 Data citation
• A researcher, Alice Snowe, from another university’s
Accident Research Unit is seeking to use the dataset
with precipitation measurements going back to the
foundation of the University. This dataset was
deposited in 2011 by the University’s meteorologist,
Christopher Oldman Frost, and covers all years up to
and including 2010. It consists of data subsets that are
organised per year, each consisting of several files,
including Excel spreadsheets, Word files, and image
files (digitised observations written down on paper). Of
course, Mr Oldman Frost is not the only meteorologist
who has been involved in taking the measurements
that make up this dataset.
Activity 6.4.2 Data citation
• Alice Snowe is now writing a research paper for Science
called ‘The correlation between bicycle accidents and
precipitation in urban centres during the rush hour’.
She needs to cite our institutional repository’s dataset.
In particular she will need to refer to the precipitation
measurements of 4 May 1979. Elsewhere in her article
she also needs to refer to a subset covering the winter
months of the years 1981-1985.
• Write down the references that Alice Snowe needs to
give in her article.
APA
Basic form:
• Rightsholder. (Year). Title of data set (Version number)
[Description of form]. Location: Name of producer.
or
• Rightsholder. (Year). Title of data set (Version number)
[Description of form]. Retrieved from http://
Example:
• University of Poppleton. (2011). Precipitation
measurements 1905-2010 taken at Western Bank
weather station [Data files and documentation].
Poppleton: The University of Poppleton,
Meteorological Service.
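The APA basic form can be sketched as a small formatting function. This is a minimal illustration, not an official APA tool: the function name `apa_data_citation` and its parameter names are assumptions, and it covers only the two variants of the basic form shown above (producer location, or retrieval URL).

```python
def apa_data_citation(rightsholder, year, title, form, location=None,
                      producer=None, url=None, version=None):
    """Format a dataset reference following the APA basic form for data sets."""
    # The version number is optional and goes in parentheses after the title.
    version_part = f" ({version})" if version else ""
    head = f"{rightsholder}. ({year}). {title}{version_part} [{form}]."
    if url:
        # Variant for data retrieved online.
        return f"{head} Retrieved from {url}"
    # Variant naming the producer and its location.
    return f"{head} {location}: {producer}."


print(apa_data_citation(
    "University of Poppleton", 2011,
    "Precipitation measurements 1905-2010 taken at Western Bank weather station",
    "Data files and documentation",
    location="Poppleton",
    producer="The University of Poppleton, Meteorological Service",
))
```

Generating references from structured metadata in this way keeps citations consistent, but it does not settle how to cite a single day's measurements or a multi-year subset.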
DataCite
• DataCite (http://www.datacite.org) is a not-for-profit
organisation that aims to promote and support the
sharing of research data
• They are developing an infrastructure that supports
methods of data citation, discovery, and access
• They are currently leveraging the DOI (Digital Object
Identifier) infrastructure, which is also used for
research articles
• They can provide DOIs for datasets
• DataCite DOIs have to resolve to a public landing page
with information about the dataset and a direct link to
it
DataCite
Basic form:
• Creator (PublicationYear): Title. Publisher. Identifier
• Version and ResourceType are optional extra elements
• For citation purposes, DataCite recommends that DOI
names are displayed as linkable, permanent URLs
• More info in DataCite (2011)
Example:
• University of Poppleton (2011): Precipitation
measurements 1905-2010 taken at Western Bank
weather station. Meteorological Service, The University
of Poppleton. http://dx.doi.org/10.1594/UoP.MS.298
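The DataCite form can be sketched in the same way, with the DOI name turned into a linkable URL. The function name `datacite_citation`, its parameters, and the exact position of the optional Version and ResourceType elements are assumptions for illustration; the resolver prefix is the one used in the example above.

```python
def datacite_citation(creator, year, title, publisher, doi,
                      version=None, resource_type=None,
                      resolver="http://dx.doi.org/"):
    """Format: Creator (PublicationYear): Title. Publisher. Identifier"""
    parts = [f"{creator} ({year}): {title}."]
    if version:
        # Optional element; its position here is an assumption.
        parts.append(f"{version}.")
    parts.append(f"{publisher}.")
    if resource_type:
        # Optional element; its position here is an assumption.
        parts.append(f"{resource_type}.")
    # Display the DOI name as a linkable, permanent URL.
    parts.append(resolver + doi)
    return " ".join(parts)


print(datacite_citation(
    "University of Poppleton", 2011,
    "Precipitation measurements 1905-2010 taken at Western Bank weather station",
    "Meteorological Service, The University of Poppleton",
    "10.1594/UoP.MS.298",
))
```

Because the identifier is stored as a bare DOI name and only expanded to a URL at display time, the same record can be rendered with a different resolver prefix if needed.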
Activity 6.4.2 Data citation
• What practical issues did you encounter when
writing the references for Alice Snowe’s
research paper? How could these issues be
solved?
Data Citation
• Issues include (Ball & Duke, 2011a, 2011b):
– At what granularity should data be made citeable?
– How to credit each contributor in a dataset that is
assembled from very many contributions?
– Where in a research paper should a data citation
be given (e.g. a paper describing a dataset versus
subsequent papers using it)?
– What to do with frequently updated data?
REFERENCES
References
• American Psychological Association (2010). Publication
Manual of the American Psychological Association (6th
edition). Washington, DC: American Psychological Association,
pp. 210-211.
• Ball, A., & Duke, M. (2011a). Data Citation and Linking. DCC
Briefing Papers. Edinburgh: Digital Curation Centre. Retrieved
from http://www.dcc.ac.uk/resources/briefing-papers/introduction-curation/data-citation-and-linking
• Ball, A., & Duke, M. (2011b). How to Cite Datasets and Link to
Publications. DCC How-To Guides. Edinburgh: Digital Curation
Centre. Retrieved from http://www.dcc.ac.uk/resources/how-guides/cite-datasets
• DataCite (2011). DataCite Metadata Schema for the Publication and
Citation of Research Data. Version 2.2. London: DataCite. Retrieved
from http://schema.datacite.org/meta/kernel-2.2/doc/DataCiteMetadataKernel_v2.2.pdf. doi:10.5438/0005
• DataCite (n.d.). Why cite data? Hannover. Retrieved from
http://datacite.org/whycitedata
• Rumsey, S. (2012). Just enough metadata: Metadata for research
datasets in institutional data repositories [PowerPoint
presentation]. Oxford: The University of Oxford. Retrieved from
http://damaro.oucs.ox.ac.uk/docs/Just%20enough%20metadata%20v3-1.pdf
• UK Data Archive (n.d.). Citing Data. Colchester. Retrieved from
http://www.data-archive.ac.uk/conditions/citing-data