UK Perspectives on the Curation and Preservation of Scientific Data Dr Liz Lyon, Director, UKOLN, University of Bath, UK Associate Director, UK Digital Curation.

UK Perspectives on the
Curation and Preservation of
Scientific Data
Dr Liz Lyon, Director, UKOLN, University of Bath, UK
Associate Director, UK Digital Curation Centre
AAAS Annual Meeting Boston, February 2008.
UKOLN is supported by:
This work is licensed under a Creative Commons Licence
Attribution-ShareAlike 2.0
www.ukoln.ac.uk
A centre of expertise in digital information management
Overview
1. Context: “big science, small
science & open science”
2. Strategy, Policy, Planning:
Dealing with Data Report
3. Practice & Futures
Big versus Small Science
“Data from Big Science is … easier to handle, understand and
archive. Small Science is horribly heterogeneous and far more
vast. In time Small Science will generate 2-3 times more data than
Big Science.”
“Lost in some research assistant’s computer, the data are often
irretrievable or an undecipherable string of digits”
“To vet experiments, correct errors, or find new breakthroughs,
scientists desperately need better ways to store and retrieve
research data”
‘Lost in a Sea of Science Data’, S. Carlson, The Chronicle of Higher Education (23/06/2006)
Open Science
Millennials as native data scientists
Social networks for scientists
Second Life: virtual worlds
Community repositories for data
Tagging and sharing workflows
Open Notebook Science (ONS)
Data Curation and
Preservation choices?
1. Disciplinary data centre
2. Institutional / departmental / lab repository
3. Repository federation or network
4. National library or national archive
5. “Public” data repository or service
6. Web archiving services
7. Commercial data store - Amazon S3
8. Ecosystem of hosted lifebits services (Jon Udell)
9. None of these?
10. All of these?
Strategic approaches & policy
UKOLN Liz Lyon June 2007
35 Recommendations for JISC
Roles, Rights, Responsibilities,
Relationships: scientist,
institution, data centre, user,
funder, publisher
Research Information Network
RIN January 2008
5 Principles: Roles &
responsibilities, standards &
QA, access, usage & credit,
benefits & cost-effectiveness,
preservation & sustainability
Report Recommendations 1
• DataSets Mapping and Gap Analysis (UK)
• Data Curation & Preservation Strategy (UK)
• Data Audit Framework (HE Institutions)
• Institutional Data Management,
Preservation & Sharing Policy
• Data Management & Sharing Policy
(Funders)
• Data Management Plan (Projects)
• Data Networking Forum (People)
Recommendations 2
Digital Curation Centre
• Co-ordinated advocacy programmes
• Co-ordinated training programmes
• Disciplinary Data Case Studies (SCARP)
Roles, Rights, Responsibilities, Relationships:
• scientist
• institution
• data centre
• user
• funder
• publisher
Digital Curation Centre
http://www.dcc.ac.uk/
• Policy & Advocacy: briefing
papers, curation manual
• Audit & Certification: DRAMBORA
• Community Development:
Research Data Forum with RIN
• Training and skills: workshops,
summer school
• Research: database archiving
• Dissemination: International
Conference, e-journal IJDC
Draft DCC Curation Lifecycle Model
[Diagram: a lifecycle centred on “Digital Objects, Data or Databases”. Labels: Description and Representation Information; Preservation Planning; Curate; Preserve; Conceptualise; Create; Appraise and Select; Ingest; Preservation Action; Store; Access and Use; Access and Reuse; Transform; Reappraise; Destroy]
Institutional case study:
eCrystals data repository
Started as JISC-funded
eBank-UK Project Sept 2003
ePrints.org @ Southampton +
aggregator service @ UKOLN
• Metadata schema Application
Profile, DOIs, InChIs, Rights &
Citation Policy
• Embedded in workflow
http://ecrystals.chem.soton.ac.uk
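The metadata elements named on this slide (DOIs, InChIs, rights and citation information) can be pictured as a simple deposit record. The following is a minimal sketch only: the class name, field names and checks are hypothetical and are not taken from the eBank-UK application profile, and the example DOI is a placeholder.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CrystalRecord:
    """Hypothetical deposit record; field names are illustrative only."""
    title: str
    doi: str       # persistent identifier for the dataset
    inchi: str     # IUPAC International Chemical Identifier of the compound
    rights: str    # rights statement shown to users
    citation: str  # how the dataset should be cited


def problems(record: CrystalRecord) -> List[str]:
    """Return basic problems found in a record; an empty list means none."""
    found = []
    # DOI names begin with the "10." directory indicator.
    if not record.doi.startswith("10."):
        found.append("DOI should start with '10.'")
    # InChI strings begin with the literal prefix "InChI=".
    if not record.inchi.startswith("InChI="):
        found.append("InChI should start with 'InChI='")
    # Descriptive fields must not be blank.
    for name in ("title", "rights", "citation"):
        if not getattr(record, name).strip():
            found.append(f"{name} is empty")
    return found


# Placeholder values: the DOI is not a real identifier; the InChI is water.
example = CrystalRecord(
    title="Example crystal structure",
    doi="10.1234/example",
    inchi="InChI=1S/H2O/h1H2",
    rights="Hypothetical rights statement",
    citation="Hypothetical citation text",
)
print(problems(example))
```

Checks like these only confirm that identifiers look well formed; they do not confirm that a DOI resolves or that an InChI matches the deposited structure.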
eCrystals Federation Data Deposit Model
(based on model in Dealing with Data Report, UKOLN 2007)
[Diagram. Actors: Scientist; IR Federation; Data centres / aggregator services; Publishers; User; Funder. Activities: Create; Deposit; Curate; Preserve; Collaborate; Share; Harvest; Link; Discover; Re-use. Other labels: Advisory; Policy; Advocacy; Standards; Training]
eCrystals Curation &
Preservation Study
Examined four main areas
1. Audit and certification
(TRAC, DRAMBORA,
NESTOR, ISO International
repository audit and
certification BOF Group)
2. The Open Archival
Information System (OAIS)
and Representation
Information (RI)
3. eBank-UK application profile
and preservation metadata
4. ePrints.org repository
platform
http://www.ukoln.ac.uk/projects/ebankuk/curation/eBank3-WP4-Report%20(Revised).pdf
Recommendations
eCrystals Federation: Preservation &
sustainability Recommendations
Data repositories
• Use DRAMBORA
Interactive for self-assessment
• Add PREMIS preservation
metadata
• Collect eCrystals
representation information
• Examine repository
platform conformance to
OAIS Reference Model
• Survey partner
preservation policies
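One concrete kind of preservation metadata is fixity information: a checksum recorded at deposit and re-computed later to detect corruption. The sketch below is illustrative only. The key names follow the PREMIS fixity semantic units (messageDigestAlgorithm, messageDigest), but the plain dictionary is a simplified stand-in for a real PREMIS record, and the function names are hypothetical.

```python
import hashlib


def fixity_record(data: bytes, algorithm: str = "sha256") -> dict:
    """Compute a checksum and return it in a PREMIS-like fixity structure."""
    digest = hashlib.new(algorithm, data).hexdigest()
    return {"messageDigestAlgorithm": algorithm, "messageDigest": digest}


def verify(data: bytes, record: dict) -> bool:
    """Re-compute the checksum and compare it with the stored value."""
    current = hashlib.new(record["messageDigestAlgorithm"], data).hexdigest()
    return current == record["messageDigest"]


# Record fixity at deposit, then check the same bytes and an altered copy.
original = b"example crystallography data"
stored = fixity_record(original)
print(verify(original, stored))           # unchanged content
print(verify(original + b"x", stored))    # altered content
```

A repository would normally run such a check on a schedule and log each result, so that silent corruption is noticed while a good copy still exists.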
Dealing with Data Report
Future challenges?
• Instrumentation and laboratory equipment
• Dataset re-use: significant properties
• Versions, identifiers, citation
• Robust bi-directional linking
• IPR and model licences for data
• Cost-benefits of data curation
• Careers, specialist skills and capacity
• Data curation within the curriculum
Future UK developments
• JISC initiatives
– Data Audit Framework Study
– Research Data Preservation Costs Study
– Institutional Preservation Policy Study
– Preservation of Web Resources Workshops
– Data curators: professional development and careers Study
• Shared Research Data Service Feasibility Study: report
January 2009
• Research Information Network (RIN) Publication of Data
Outputs study pending
• Open Repositories Conference April 2008 University of
Southampton
• 4th International Digital Curation Conference, December 2008,
Edinburgh
Slides will be available at:
http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/presentations.html