user engagement in research data curation

User engagement in research data
curation
Stuart Macdonald
EDINA National Data Centre, University of Edinburgh
Luis Martinez-Uribe
Oxford e-Research Centre, University of Oxford
ECDL
Corfu, 30 September 2009
Data deluge
An updated IDC white paper estimated the digital
universe at 281 exabytes in 2007 and forecast that it would
reach 1,800 exabytes in 2011 (10 times the amount produced in 2006).
* “The Diverse and Exploding Digital Universe: an updated forecast of worldwide information growth through
2011” (Mar. 2008) - http://www.emc.com/collateral/analyst-reports/diverse-exploding-digital-universe.pdf
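As a rough illustration of the growth implied by the two figures quoted above, the sketch below computes the constant yearly growth rate that would take 281 exabytes (2007) to 1,800 exabytes (2011). The function name and the assumption of steady compound growth are illustrative and are not taken from the IDC paper.

```python
# Figures quoted on the slide (IDC white paper, Mar. 2008), in exabytes.
SIZE_2007_EB = 281.0
SIZE_2011_EB = 1800.0  # forecast value, not a measurement
YEARS = 2011 - 2007


def compound_annual_growth(start: float, end: float, years: int) -> float:
    """Return the constant yearly growth rate that takes `start` to `end`.

    Assumes steady compound growth, which is a simplification made here
    for illustration only.
    """
    return (end / start) ** (1.0 / years) - 1.0


rate = compound_annual_growth(SIZE_2007_EB, SIZE_2011_EB, YEARS)
print(f"Implied annual growth 2007-2011: {rate:.1%}")
```

Under these assumptions the implied growth is a little under 60% per year, i.e. the digital universe would more than double every two years.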
BBSRC strategic plan
(2010-2015)
consultation document
Research data definitions
US Office of Management and Budget defines research
data as “the recorded factual material commonly accepted
in the scientific community as necessary to validate
research findings”
Words, pictures, numbers, sounds
Workflows, methodologies, protocols, standard operating
procedures, instrumentation, models, questionnaires, code
books, set-up files, algorithms, transcripts
Growing importance of curating research
data
“it is becoming increasingly clear that effective and efficient management and
reuse of research data will be a key component in the UK knowledge economy
in years to come, essential for the efficient conduct of research ….”
* JISC (2008) “Identifying the benefits of curating and sharing research data” - http://www.jisc.ac.uk/whatwedo/programmes/digitalrepositories2007/databenefits.aspx
• Research methods experiencing a radical
transformation
• New tools & infrastructures generating
research data
• New ways to use, share and re-use
Data deposition and publication
• Departmental websites
• Domain-specific repositories
• Centralised data repositories (UKDA, NERC, MRC)
Libraries and computing/IT services within academic
institutions working together to develop and customise
institutional repositories to curate research data
Researchers – key user community
overlooked
Institutional Repositories:
• open access
• built for academic publications
• technology led
No formal requirements analysis procedures
User engagement required to develop systems that will
meet researchers’ needs
Bottom up approach to inform top-down thinking
Open data – realism versus altruism
DISC-UK DataShare - legal, cultural, technical issues surrounding
the sharing of research data in institutional settings
Barriers to sharing:
• time taken to prepare datasets for deposit
• concerns over making data available before full academic
exploitation
• misuse / misinterpretation (journalists, non-academics)
• loss of ownership, loss of commercial or competitive advantage
• repositories will cease to exist
• unwillingness to change working practices
• uncertainty about IPR and confidentiality
RIN-funded Disciplinary case studies
Charting individual researchers’ information practices across 7 sub-disciplines of the life sciences - http://www.rin.ac.uk/case-studies
DCC / ISSTI (University of Edinburgh)
Deployed a range of methodologies and tools including short-term
ethnographic techniques and semi-structured instruments:
Diaries (x55),
Face-to-face interviews (x24),
Cognitive mapping (1 per case),
Focus groups (1 per case)
Some findings from RIN Disciplinary case
studies project:
• Some disciplines lend themselves more than others to
‘open’ data sharing
• Research data are varied, specific and complex
• Data curation and/or sharing only becomes crucial at
certain stages of research lifecycle
• A feeling that only researchers have the subject knowledge to
curate their own data
• Keen sense of ‘ownership’ and protectiveness towards
data
Scoping digital repository services for
research data management http://www.ict.ox.ac.uk/odit/projects/digitalrepository/
Scope requirements for services to manage research data
generated by Oxford researchers from a variety of disciplines:
Interviews (x37) conducted to learn about data management
practices and identify top requirements
Workshop (x46) held to complement findings and to gather
examples of good practice regarding use of repository services
for research data management
Consultation with service units (ORA, Data Library, NGS, Oxford
Digital Library) to identify gaps in service and validate researchers’
requirements
Scoping digital repository services - top
requirements
• Advice on practical issues related to managing data across their life
cycle incl. data management plans, assistance with formatting
• Secure storage required for large datasets generated by high
throughput instruments
• Sustainable & authenticated infrastructure that allows publication and
long-term preservation of research data
This work is now being followed up by the intra-institutional, JISC-funded Embedding
Institutional Data Curation Services in Research (EIDCSR) project - http://eidcsr.oucs.ox.ac.uk/
Tools – Data Audit Framework
http://www.data-audit.eu/
"staff had numerous comments and
suggestions for improvement of data
management at different levels
indicating an awareness of the issues,
even where it had not been made a
priority to address" - Edinburgh Data
Audit Implementation Project
• DAF helps to establish relationships with research communities around
the issues of data curation
• Allows institutions to identify, locate, describe and assess how they are
managing their research data
• Provides information specialists who wish to extend support for research
data with a vehicle for engaging with researchers e.g. through local
research data management training
Summary
• Repository development distant from current research
needs
- due to lack of iterative requirements analysis with researchers
• Open data ethos detached from disciplinary research
needs
• Trusted relationships
- dialogue with researchers early in research process
Thank you
All images - creative commons courtesy of Flickr