Database as a Service (DaaS) for 2nd Sudamih Workshop


Managing Research Data – The Organisational Challenge at Oxford
Friday 6th December, 2013
James A J Wilson
[email protected]
The Growing Importance of Research Data Management
• Rise of data-driven research
– Challenge to existing academic practices
– Opportunities for new kinds of research
• Increasing recognition of need to manage research data better
– Opportunities for research communities
– Concern for reputations
– Mandates from research funders
Damaro Objectives
• Institutional RDM Policy
• Better understanding of researchers’ requirements
• Improved training & support materials – embedded in existing delivery channels
• Design for connected RDM infrastructure, from planning to re-use
• ‘DataFinder’ software – to act as a catalogue of research data outputs
• Outputs that can be taken and adapted by other institutions (project was part of the JISC MRD Programme)
• Sustainability
What is Research Data Management?
[Diagram: research data lifecycle. Stages labelled: Idea; Literature / data review; [Funding bid]; Planning; Data gathering; File organisation & local storage; Data analysis & research outputs; Documentation; Data deposit; Repository storage; Long-term curation; Discovery; Access; Re-use]
Principles behind Oxford’s infrastructure
• Modular
– Different business models for different components
– May be extended (or reduced)
• Researcher-focused
– Caters for different disciplines and working practices
• Intra-institutional
– Requires input from multiple support departments and Academic Divisions
Demand
Demand for support with RDM from researchers
Importance of RDM
• Essential -- My research would suffer significantly if my data were not properly managed
• Important -- My research benefits from the time spent managing data
• Helpful up to a point -- Time spent managing research data can make life easier further down the line, but it's not a very significant aspect of research
• Not important -- Devoting time to managing research data would be a distraction from the real work of research
But fewer than a quarter had received any information about RDM from the University
“My supervisor doesn’t want the whole dataset to be made publicly available as it is. However, he is very keen that whenever research papers based on the data are published, relevant portions of the data that support the findings are also published.”
“Having a secure and fairly straightforward means by which to share data with selected collaborators around the world would be extremely useful.”
“It would be useful for graduate students to learn to pick the appropriate tool for the appropriate question and the appropriate data … to know what their options are.”
Training Desired
Common RDM tasks ranked by mean level of desire for training (5 = most desired, 1 = least desired)
Rank. Task: mean score
1. Dealing with copyright, licensing, or other IP (intellectual property) issues relating to datasets: 3.55
2. Preparing datasets for long-term preservation: 3.42
3. Data documentation: 3.34
4. Preparing datasets for sharing with researchers outside your research group: 3.27
5. Storing data securely and backing up: 3.13
6. Data management planning: 3.13
7. Determining whether research datasets ought to be preserved after the end of a particular project: 3.11
8. Organizing and structuring data within files (e.g. for analysis): 3.02
9. Version control: 2.97
10. Managing bibliographic data: 2.73
11. Organizing, structuring, and naming files and folders: 2.66
Demand for support with RDM from above
“Publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner.”
RCUK Common Principles on Data Policy
“data must be accessible and readily located; they must be intelligible to those who wish to scrutinise them; data must be assessable so that judgments can be made about their reliability and the competence of those who created them; and they must be usable by others. For data to meet these requirements it must be supported by explanatory metadata (data about data).”
Royal Society – Data as an Open Enterprise
Challenges
Diverse practices
• Principle of subsidiarity
• 45% of Departmental IT Managers reported that ‘every researcher / research group is completely free to choose how they manage their research data’
• 70% offer some departmental infrastructure to encourage a degree of standard practice (e.g. shared drives, data deposit guidelines)
• 15% of departments have a departmental policy mandating particular tools and processes that researchers should use for managing their data
• University RDM policy ratified in 2012, setting out responsibilities of researchers and institution
Disciplinary requirements differ
• Significant differences in how researchers work
• Wide range of experience and confidence amongst researchers
• Some disciplines already have good RDM infrastructure in place, some keen for central support
– “The University should have a dedicated central repository”
– “[The University should] develop a data management service or be in a position to know what to recommend to our researchers”
– “The desire to centralize … may work at the lower end of the data requirements, but at the higher end is rather naïve”
[Chart: responses by Academic Division (Social Sciences; Medical Sciences; Mathematical, Physical and Life Sciences; Humanities), axis 0% to 100%. Response categories: As part of a team, with our research data managed by the team; As part of a team, but each member of the team looks after their own data; As an individual; Some of my research is undertaken as part of a team, but I also conduct some research independently]
Researchers unclear where to go for support
IT Services
Departmental staff (including IT staff)
Academics & Colleagues
Research Services (including divisional & departmental)
National bodies (e.g. UKDA)
General web search
Libraries
OeRC
University website
Funders
RDM website
Other (suggestions with only 1 response)
Lack of staff confidence with RDM issues
[Chart: confidence rated on a scale from 1 (Not Confident) to 7 (Completely Confident)]
Solutions
Who should support research data management?
[Diagram: research data lifecycle stages (Idea; Literature / data review; [Funding bid]; Planning; Data gathering; File organisation & local storage; Data analysis & research outputs; Documentation; Data deposit; Repository storage; Long-term curation; Discovery; Access and re-use) shown alongside the supporting units: IT Services; Academic Divisions & Departments; Oxford eResearch Centre (OeRC); Research Services; Library Services]
Role of Libraries
• Metadata
• Access
• Workflows
• Collection management
• Collection curation and preservation
• Service provision
• Systems
• But also contributions to training and good practice in earlier parts of research life-cycle
Ongoing work
• Research services
– OxfordDMPOnline & 20 questions for RDM
– Involvement of research facilitators
• IT Services
– Implementing services for ‘live’ data (HFS, Servers and VMs, Supercomputing, ORDS)
– Research Support Group
• Libraries
– DataBank
– DataFinder
– Involvement of Subject Librarians
• University coordination
– Research Data Management and Open Data Working Group
Coordination
• Single point of contact
– Central RDM website
• Associated challenges
– Information / data / metadata flows
– RT systems
– Resourcing
– More organisational than technical
Questions?