RDMRose: Research Data Management for LIS
Session 6: Managing Data
Session 6.4: Metadata and data citation
Nov-15 Learning material produced by RDMRose http://www.sheffield.ac.uk/is/research/projects/rdmrose

Learning Outcomes
By the end of this session you will be able to:
• Discuss the varying requirements of metadata that will enable researchers to identify the potential of a particular dataset
• Evaluate ways of citing data
• Articulate and reflect upon some of the issues involved with citing data and datasets

Session overview
• EPSRC principles and expectations
• What is sufficient metadata?
• How to cite data?

EPSRC Principle 6
“Sufficient metadata should be recorded and made openly available to enable other researchers to understand the potential for further research and re-use of the data. Published results should always include information on how to access the supporting data.”
http://www.epsrc.ac.uk/about/standards/researchdata/Pages/principles.aspx

EPSRC Expectation 5
“Research organisations will ensure that appropriately structured metadata describing the research data they hold is published (normally within 12 months of the data being generated) and made freely accessible on the internet; in each case the metadata must be sufficient to allow others to understand what research data exists, why, when and how it was generated, and how to access it.
Where the research data referred to in the metadata is a digital object it is expected that the metadata will include use of a robust digital object identifier (For example as available through the DataCite organisation - http://datacite.org).”
http://www.epsrc.ac.uk/about/standards/researchdata/Pages/expectations.aspx

Metadata
ACTIVITY 6.4.1

Activity 6.4.1 Metadata
• What is “sufficient metadata” that enables “other researchers to understand the potential for further research and re-use of the data”?

Activity 6.4.1 Metadata
The University of Poppleton holds a dataset with meteorological observations, taken at the university’s weather station. In particular, it contains a set of precipitation measurements since the foundation of the university.

A climatologist, Jenny Fairweather, is interested in this dataset for her research into climate change. She is looking for trends in the weather.

A meteorologist, Wilson Rainbird, who works for the UK Met Office wants to use these data for the purposes of weather prediction. He is mainly interested in combining these precipitation measurements with other similar datasets.

A researcher, Alice Snowe, from another university’s Accident Research Unit conducts most of her research in the area of road traffic accidents. She would like to map the precipitation measurements to another dataset containing information on road accidents in order to analyse possible correlations.

Lastly, the university’s data repository manager, John Shower, is concerned with issues regarding data access and IPR.
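
As a starting point for the discussion, the questions named in EPSRC Expectation 5 (what research data exists, why, when and how it was generated, and how to access it) can be turned into a simple checklist. The sketch below is only an illustration: the field names ("what", "why", "when", "how", "access") and the sample record are assumptions made for this example, not part of any EPSRC or repository schema.

```python
# Questions taken from EPSRC Expectation 5. The short field names used as
# keys are hypothetical; the expectation names the questions, not a schema.
REQUIRED_QUESTIONS = {
    "what": "what research data exists",
    "why": "why it was generated",
    "when": "when it was generated",
    "how": "how it was generated",
    "access": "how to access it",
}


def unanswered_questions(record):
    """Return the Expectation 5 questions that a metadata record leaves blank."""
    return [
        question
        for field, question in REQUIRED_QUESTIONS.items()
        if not str(record.get(field) or "").strip()
    ]


# Illustrative, incomplete record for the Poppleton precipitation dataset.
poppleton_record = {
    "what": "Precipitation measurements taken at the university's weather station",
    "when": "Since the foundation of the university",
    "access": "",  # access conditions not yet recorded
}

for question in unanswered_questions(poppleton_record):
    print("Missing:", question)
```

A check like this only shows that a field has been filled in. Whether the content is sufficient for a climatologist, a meteorologist, an accident researcher or a repository manager is the question the activity asks you to consider.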
Activity 6.4.1 Metadata
• What is “sufficient metadata” for each of these stakeholders “to understand the potential for further research and re-use of the data”?

Example
• The DaMaRO project at the University of Oxford is developing a metadata schema for its DataFinder (Rumsey, 2012).
• A three-tier metadata approach:
  – Mandatory minimal metadata to enable basic discovery, such as Creator, Title, Publisher, Date, Location, Access terms & conditions
  – Mandatory contextual metadata (mostly administrative and partly based on EPSRC expectations), such as Funding Agency, Grant Number, Last access request date, Project Information, Data Generation Process, Why the data was generated, Date (range) of data collection, Reasons for embargo
  – Optional metadata (including discipline-specific metadata) to enable reuse, such as machine settings and experimental conditions under which the data were gathered

Data citation
ACTIVITY 6.4.2

Activity 6.4.2 Data citation
• How should data be cited?
• There are no established standards for data citation yet, although some style manuals such as the APA’s (in the 5th and 6th editions) and some repositories such as the UK Data Archive do provide instructions.

Activity 6.4.2 Data citation
• Researcher, Alice Snowe, from another university’s Accident Research Unit is seeking to use the dataset with precipitation measurements going back to the foundation of the University.
This dataset was deposited in 2011 by the University’s meteorologist, Christopher Oldman Frost, and covers all years up to and including 2010. It consists of data subsets that are organised per year, each consisting of several files, including Excel spreadsheets, Word files, and image files (digitised observations written down on paper). Of course, Mr Oldman Frost is not the only meteorologist who has been involved in taking the measurements that make up this dataset.

Activity 6.4.2 Data citation
• Alice Snowe is now writing a research paper for Science called ‘The correlation between bicycle accidents and precipitation in urban centres during the rush hour’. She needs to cite our institutional repository’s dataset. In particular she will need to refer to the precipitation measurements of 4 May 1979. Elsewhere in her article she also needs to refer to a subset covering the winter months of the years 1981-1985.
• Write down the references that Alice Snowe needs to give in her article.

APA
Basic form:
• Rightsholder. (Year). Title of data set (Version number) [Description of form]. Location: Name of producer.
or
• Rightsholder. (Year). Title of data set (Version number) [Description of form]. Retrieved from http://
Example:
• University of Poppleton. (2011). Precipitation measurements 1905-2010 taken at Western Bank weather station [Data files and documentation]. Poppleton: The University of Poppleton, Meteorological Service.
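
The APA basic form above can be assembled mechanically from its parts, which makes it easier to see which elements are fixed and which are optional. The following is a minimal sketch under stated assumptions: the function name and parameter names are invented for this example, and it implements only the two variants shown on the slide, not the full APA manual.

```python
def apa_data_citation(rightsholder, year, title, form,
                      location=None, producer=None, url=None, version=None):
    """Build a data set reference following the APA basic form on the slide.

    Either `url` or both `location` and `producer` must be given.
    All parameter names are hypothetical labels for the slide's elements.
    """
    version_part = f" ({version})" if version else ""
    head = f"{rightsholder}. ({year}). {title}{version_part} [{form}]."
    if url:
        return f"{head} Retrieved from {url}"
    if location and producer:
        return f"{head} {location}: {producer}."
    raise ValueError("give either url, or both location and producer")


print(apa_data_citation(
    "University of Poppleton",
    2011,
    "Precipitation measurements 1905-2010 taken at Western Bank weather station",
    "Data files and documentation",
    location="Poppleton",
    producer="The University of Poppleton, Meteorological Service",
))
```

Note that the function cites the dataset as a whole. The form does not say how Alice Snowe should point to a single day or to the winter subsets, which is one of the issues the activity is meant to raise.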
DataCite
• DataCite (http://www.datacite.org) is a not-for-profit organisation that aims to promote and support the sharing of research data
• They are developing an infrastructure that supports methods of data citation, discovery, and access
• They are currently leveraging the DOI (Digital Object Identifier) infrastructure, which is also used for research articles
• They can provide DOIs for datasets
• DataCite DOIs have to resolve to a public landing page with information about the dataset and a direct link to it

DataCite
Basic form:
• Creator (PublicationYear): Title. Publisher. Identifier
• Version and ResourceType are optional extra elements
• For citation purposes, DataCite recommends that DOI names are displayed as linkable, permanent URLs
• More info in DataCite (2011)
Example:
• University of Poppleton (2011): Precipitation measurements 1905-2010 taken at Western Bank weather station. Meteorological service, The University of Poppleton. http://dx.doi.org/10.1594/UoP.MS.298

Activity 6.4.2 Data citation
• What practical issues did you encounter when writing the references for Alice Snowe’s research paper? How could these issues be solved?

Data Citation
• Issues include (Ball & Duke, 2011a and b):
  – At what granularity should data be made citeable?
  – How to credit each contributor in a dataset that is assembled from very many contributions?
  – Where in a research paper should a data citation be given (e.g. a paper describing a dataset versus subsequent papers using it)?
  – What to do with frequently updated data?
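
For comparison with the issues listed above, the DataCite basic form given earlier in this session (Creator (PublicationYear): Title. Publisher. Identifier) can be sketched in the same way. This is an illustration only: the function and parameter names are invented, and the positions chosen for the optional Version and ResourceType elements (Version after Title, ResourceType after Publisher) are an assumption, since the slide does not say where they go. The DOI is shown as a linkable URL using the resolver address from the slide's example.

```python
def datacite_citation(creator, publication_year, title, publisher, doi,
                      version=None, resource_type=None):
    """Build a citation in the DataCite basic form shown on the slide.

    Placement of the optional `version` and `resource_type` elements is
    an assumption made for this sketch.
    """
    # Display the DOI name as a linkable URL, as DataCite recommends.
    url = doi if doi.startswith("http") else f"http://dx.doi.org/{doi}"

    parts = [f"{creator} ({publication_year}): {title}."]
    if version:
        parts.append(f"{version}.")
    parts.append(f"{publisher}.")
    if resource_type:
        parts.append(f"{resource_type}.")
    parts.append(url)
    return " ".join(parts)


print(datacite_citation(
    "University of Poppleton",
    2011,
    "Precipitation measurements 1905-2010 taken at Western Bank weather station",
    "Meteorological service, The University of Poppleton",
    "10.1594/UoP.MS.298",
))
```

The sketch takes a single creator string and a single identifier, so it leaves the granularity and contributor-credit questions above unanswered. Citing the 1981-1985 winter subsets separately would, for instance, depend on whether the repository has assigned identifiers at that level.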
REFERENCES

References
• American Psychological Association (2010). Publication Manual of the American Psychological Association (6th edition). Washington, DC: American Psychological Association, pp. 210-211.
• Ball, A., & Duke, M. (2011a). Data Citation and Linking. DCC Briefing Papers. Edinburgh: Digital Curation Centre. Retrieved from http://www.dcc.ac.uk/resources/briefing-papers/introduction-curation/data-citation-and-linking
• Ball, A., & Duke, M. (2011b). How to Cite Datasets and Link to Publications. DCC How-To Guides. Edinburgh: Digital Curation Centre. Retrieved from http://www.dcc.ac.uk/resources/how-guides/cite-datasets
• DataCite (2011). DataCite Metadata Schema for the Publication and Citation of Research Data. Version 2.2. London: DataCite. Retrieved from http://schema.datacite.org/meta/kernel-2.2/doc/DataCiteMetadataKernel_v2.2.pdf. doi:10.5438/0005
• DataCite (n.d.). Why cite data? Hannover. Retrieved from http://datacite.org/whycitedata
• Rumsey, S. (2012). Just enough metadata: Metadata for research datasets in institutional data repositories [PowerPoint presentation]. Oxford: The University of Oxford. Retrieved from http://damaro.oucs.ox.ac.uk/docs/Just%20enough%20metadata%20v3-1.pdf
• UK Data Archive (n.d.). Citing Data. Colchester. Retrieved from http://www.data-archive.ac.uk/conditions/citing-data