user engagement in research data curation

Edinburgh DataShare –
A DSpace Data Repository: Achievements and Aspirations
Stuart Macdonald
EDINA National Data Centre &
Edinburgh University Data Library
Fedora-UK&I&EU Meeting,
Oxford, 8 December
Flickr CC Image –
http://www.flickr.com/photos/laszlo-photo/1899390628/
Flickr CC Image - http://www.flickr.com/photos/artimagesmarkcummins/300173269/
• EDINA and Data Library (EDL) together are a division within Information Services of the University of Edinburgh.
• EDINA is a JISC-funded National Data Centre providing national online resources for education and research.
• The Data Library service (established in 1983) assists Edinburgh University users in the discovery, access, use and management of research data assets.
• Building relationships with researchers via postgraduate teaching activities, IS Skills workshops, Research Data Management training and through traditional reference interviews.
• Edinburgh DataShare is a digital repository of multi-disciplinary research datasets produced at the University of Edinburgh, hosted by the Data Library.
DISC-UK DataShare Project
DISC-UK DataShare Project – funded by JISC (March 2007 – March 2009) – a collaborative project which investigated the legal, cultural and technical issues surrounding research data sharing within the UK tertiary education community
Flickr CC Image - http://www.flickr.com/photos/ronin691/2285257955/
• Explore new pathways to assist academics wishing to share their data over the Internet via Institutional Repositories (IRs)
• Policy-Making for Research Data in Repositories: A Guide - Green, A., Macdonald, S. and Rice, R. (2009).
• Edinburgh DataShare digital repository – post project output – embedded into University Information Services policy
Edinburgh Repository Landscape
Informatics Publications Repository
incl. Artificial Intelligence Applications Institute,
Institute for Adaptive and Neural Computation,
Centre for Speech Technology Research,
Institute for Computing Systems Architecture
Open Access
Open Access Publications Policy
passed by Senate, February 2009
Launch Jan. 2010
Research Publication Service
Publications Repository
DSpace, closed
Online remote-storage services
e.g. Dropbox, Mozy, Humyo
Cloud
Edinburgh Research Archive
(ERA) - DSpace, Open Access
Edinburgh DataShare
DSpace, Open Access
Remote storage
e.g. Andrew File System
Restricted
‘Named entity’ recognition
Hosted DSpace Repositories:
St Andrews, (Heriot Watt / UHI / Aberdeen?)
EDINA repository projects
JORUM
DSpace, Open Access
Grid, iRODS etc
ShareGeo
DSpace, Restricted
Infrastructure SAN
ECDF SAN
(Edinburgh Compute Data Facility)
Parallel processing
Repository Junction
Edinburgh DataShare – technical development
DSpace v.1.5.1 with new theme aligned with University corporate style
• Embargo option - coded to restrict full data download until a specified date, while the metadata stays open
• Open Data Commons License option (PDDL)
• Dynamically queries Geonames, a community-generated spatial database, to ensure consistency in metadata entry for the Spatial Coverage field
• Extension to DSpace to record bitstream downloads in usage statistics
• Implementation of JACS for assigning keywords to content
• Download All option (zip file of all item components)
• Citation field automatically generated based on specified metadata values
• Dublin Core-based metadata schema for datasets
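The automatically generated citation field can be illustrated with a minimal sketch. The slide does not say which metadata values DataShare uses or how the citation is laid out, so the field names (`dc.creator`, `dc.date.issued`, `dc.title`, `dc.publisher`, `dc.identifier.uri`), the function `build_citation` and the output format below are all assumptions made for illustration only.

```python
from typing import Mapping, Sequence


def build_citation(metadata: Mapping[str, Sequence[str]]) -> str:
    """Assemble a citation string from Dublin Core-style fields.

    The field names and the citation layout are hypothetical; they are
    not taken from the DataShare implementation.
    """
    # Join multiple creators with semicolons; fall back to a placeholder.
    creators = "; ".join(metadata.get("dc.creator", [])) or "Unknown creator"
    # Keep only the year part of an ISO-style date such as 2009-11-30.
    year = (metadata.get("dc.date.issued") or ["n.d."])[0][:4]
    title = (metadata.get("dc.title") or ["Untitled dataset"])[0]
    publisher = (metadata.get("dc.publisher") or [""])[0]
    identifier = (metadata.get("dc.identifier.uri") or [""])[0]

    parts = [f"{creators} ({year}). {title} [dataset]."]
    if publisher:
        parts.append(f"{publisher}.")
    if identifier:
        parts.append(identifier)
    return " ".join(parts)


# Illustrative record; the values are invented for the example.
record = {
    "dc.creator": ["Smith, J.", "Jones, A."],
    "dc.date.issued": ["2009-11-30"],
    "dc.title": ["Example survey data"],
    "dc.publisher": ["University of Edinburgh"],
    "dc.identifier.uri": ["http://example.org/handle/123"],
}
citation = build_citation(record)
```

Deriving the citation from fields the depositor has already filled in keeps it consistent with the item record, which is presumably why the field is generated rather than typed by hand.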
Flickr CC Image –
http://www.flickr.com/photos/59414209@N00/3367225630/
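The Geonames lookup for the Spatial Coverage field, listed on the technical development slide, might look roughly like the sketch below. It assumes the public GeoNames `searchJSON` web service and its `q`, `maxRows` and `username` parameters; the slide does not say which endpoint DataShare calls, and the helper names `build_search_url` and `parse_candidates` and the sample response are hypothetical.

```python
import json
from typing import List
from urllib.parse import urlencode

# Public GeoNames search endpoint (assumed; not confirmed by the slides).
GEONAMES_SEARCH = "http://api.geonames.org/searchJSON"


def build_search_url(place: str, username: str, max_rows: int = 5) -> str:
    """Build a place-name search URL; `username` is a GeoNames account name."""
    query = urlencode({"q": place, "maxRows": max_rows, "username": username})
    return f"{GEONAMES_SEARCH}?{query}"


def parse_candidates(response_text: str) -> List[str]:
    """Turn a search response into 'name, country' labels for a pick list."""
    payload = json.loads(response_text)
    labels = []
    for hit in payload.get("geonames", []):
        # Some features (e.g. seas) have no country, so default to empty.
        country = hit.get("countryName", "")
        labels.append(f"{hit['name']}, {country}" if country else hit["name"])
    return labels


# Offline sample shaped like a GeoNames search response (values invented).
sample = json.dumps({
    "totalResultsCount": 2,
    "geonames": [
        {"name": "Edinburgh", "countryName": "United Kingdom"},
        {"name": "North Sea"},
    ],
})
url = build_search_url("Edinburgh", "placeholder_user")
labels = parse_candidates(sample)
```

Offering the depositor a list of canonical place names, rather than free text, is what gives the consistency in metadata entry that the slide mentions. In a live deployment the URL would be fetched with an HTTP client such as `urllib.request.urlopen`.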
Potential Next Steps:
• Semantification of content – Open Linked Data & MIT’s SIMILE project (which aims to leverage and extend DSpace by supporting semantic web techniques)
• Visualisation tools – APIs that can be utilised within the repository environment (Timeplot, Timeline) unlike existing open utilities
• Spatial analysis using open geo-browsers and mapping utilities
• Annotation, tagging, data citation
Flickr CC Image - http://www.flickr.com/photos/silvertje/3512611046/
• Implementation of Deposit tool(s) – SWORD
• Streaming & viewing heterogeneous content
Research Data Management Training
Developed on the back of the Data Audit Framework led by HATII (Edinburgh being one of the pilot implementation projects) – developing online tools and methodologies which enable information specialists to engage and build relationships with researchers in order to ascertain the data holdings in a research school / department / group
Developing online and face-to-face modules in conjunction with the Postgraduate Transferable Skills Unit http://www.transkills.ed.ac.uk/ and PG Essentials http://www.transkills.ed.ac.uk/pgessentials.htm
Flickr CC Image –
http://www.flickr.com/photos/sgrantarch/3563676104/
Research data management guidance &
Data sharing and preservation
– http://www.ed.ac.uk/is/data-management
Engaging researchers through RDM exercises, teaching, reference interviews and funded projects is crucial to the bottom-up development of the tools, services and infrastructures meant to serve them, which ultimately improve both the efficiency and effectiveness of their research.
Engaging with the Research Community (1)
Human Geography (Professor Jane Jacobs)
High Rise project - an interdisciplinary research programme involving
architects and geographers –
“It investigates two cases that encapsulate the varied fortunes
of the highrise experience: the UK, where the form is routinely
condemned, even demolished; and Singapore, where it is
embraced enthusiastically and continues to be built at greater
heights and densities - http://www.ace.ed.ac.uk/highrise/”
• Content of the High Rise Digital Archive includes images, sound recordings, video, transcripts, architectural drawings
• Customise DataShare to allow streaming or embedded player functionality, ingest heterogeneous content, employ multi-media metadata standards
• Develop customised learning and teaching materials for deposit in JORUM Open
Image courtesy of the periodic table printmaking project –
http://azuregrackle.com/periodictable/table/58.html
Engaging with the Research Community (2)
Centre for Earth System Dynamics (Dr Mike Mineter, Dr Magnus Hagdorn)
The aim of the CESD is to develop climate models across SAGES and other multi-disciplinary and international partners to quantify and predict climate and environmental change
• Use DataShare to provide federated access via Shibboleth for international partners for ‘working’ datasets (e.g. from the Arctic Biosphere Atmosphere Coupling at Multiple Scales (ABACUS) programme)
• Store data via DataShare but also link to large datasets (c.1TB) stored on remote storage (Andrew File System, ECDF SAN)
• Content for ingestion includes large climate models/simulations, fieldwork and experimental output in proprietary formats
• Employ discipline-specific metadata standards to describe content
END
Thank You
[email protected]
Creative Commons images from Flickr (unless otherwise stated)
Flickr CC Image –
http://www.flickr.com/photos/hippie/2556161507/