Digital Preservation Tools for Repository Managers
Transforming repositories: from repository managers to institutional data managers
ECA 2010, 8th European Conference on Digital Archiving, Geneva, 28 - 30 April 2010
Steve Hitchcock, David Tarrant and Leslie Carr
School of Electronics & Computer Science Steve Hitchcock
Not everyone can be a digital archiving specialist
Growth of Open Access Digital Repositories
Generated from Directory of Open Access Repositories (OpenDOAR) http://www.opendoar.org/
Generated from Registry of Open Access Repositories (ROAR) http://roar.eprints.org/
Both charts generated 23 April 2010
ROAR repository format profile
From Registry of Open Access Repositories (ROAR) http://roar.eprints.org/
This profile is for the Australian Research Online repository.
To access a format profile: find the chosen repository in ROAR, then open [Record Details].
Format profiles are not available for all repositories in ROAR.
ROAR disclaimer: "Full-text formats is based on automatic file-format identification and is prone to errors"
Digital repositories diversifying: institution-wide outputs
KeepIt exemplar preservation repositories
Research, Arts, Science, Teaching
Digital Preservation Tools for Repository Managers
A practical course in five parts presented by the
KeepIt project
in association with
Module 1, Organisational issues, audit, selection and appraisal
School of ECS, University of Southampton, 19 January 2010 Twitter hashtag #dprc (digital preservation repository course)
Module 2, institutional and lifecycle preservation costs
School of ECS, University of Southampton, 5 February 2010
Module 3, Primer on preservation workflow, formats and characterisation
Westminster-Kingsway College, London, 2 March 2010
Module 4, Putting storage, format management and preservation planning in the repository
University of Southampton, 18-19 March 2010
Module 5, Trust, of the repository, of the tools and services it chooses
University of Northampton, 30 March 2010
Course structure
• Module 1. Organisational issues: scoping, selection, assessment, institutional parameters (19 January 2010)
• Module 2. Costs: lifecycle costs for managing digital objects, based on the LIFE approach, and institutional costs (5 February)
• Module 3. Description: describing content for preservation: provenance, significant properties and preservation metadata (2 March)
• Module 4. Preservation workflow: tools available in EPrints for format management, risk assessment and storage, and linked to the Plato planning tool from Planets (18-19 March)
• Module 5. Trust: (by others) of the repository’s approach to preservation; trust (by the repository) of the tools and services it chooses (30 March)
KeepIt course participant numbers
• jisckeepit: KeepIt course 3: thanks as well to all participants: 16 for course 1 (from 11 institutions), 15 (from 11) yesterday. Great commitment (03 Mar 2010)
• jisckeepit: KeepIt course: "did you really think it would only be you left by the last module". Yes, but I was wrong. Course 1, 16; course 5, 16 (31 Mar 2010)
Source: Twitter @jisckeepit
Evaluation: course structure
“Structure and development through the course was excellent. Presentations and practicals gave good introductions to all the tools. Applicability sometimes focussed too much on IRs and research data”
Course evaluation summary
“Many of these tools are, of necessity, complex in scope and time consuming. The challenge is to understand which one to use in which situation and to what depth to engage with it.”
Tools module 1
• The Data Asset Framework (DAF), Sarah Jones, University of Glasgow, and Harry Gibbs, University of Southampton
• The AIDA toolkit: Assessing Institutional Digital Assets, Ed Pinsent, University of London Computer Centre
Evaluation module 1
Tools module 2
• Keeping Research Data Safe (KRDS), Costs, Policy, and Benefits in Long-term Digital Preservation, Neil Beagrie, Charles Beagrie Ltd consultancy
• LIFE3: Predicting Long Term Preservation Costs, Brian Hole, The British Library
Evaluation module 2
“Impressed with LIFE3 tool. I hope it is further developed. I like the way it works and can provide, comparatively quickly, some indication of likely costs. Useful and practical”
Tools module 3
• Significant characteristics, Stephen Grace and Gareth Knight, King's College London
• PREMIS, Open Provenance Model
Figure: properties listed under Behaviour and Structure: subject; message text; line break; paragraph; underline; strikethrough; body background; body text colour; in-reply-to; references; message-id; trace-route; sender display-name, local-part and domain-part; recipient display name, local-part and domain part
Preservation workflow
• Check: format identification, versioning; file validation; virus check; bit checking and checksum calculation. Tools: e.g. DROID, JHOVE, FITS
• Analyse: preservation planning; characterisation (significant properties and technical characteristics, provenance, format, risk factors); risk analysis. Tools: Plato (Planets), PRONOM (TNA), P2 risk registry (KeepIt), INFORM (U Illinois), KB
• Action: migration; emulation; storage selection
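The "bit checking and checksum calculation" step of the Check stage can be illustrated with a minimal sketch. It assumes only Python's standard hashlib module; the function names (file_checksum, build_manifest, check_fixity) are hypothetical and are not taken from EPrints, DROID, JHOVE or FITS.

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory, algorithm="sha256"):
    """Record a checksum for every regular file under a directory (hypothetical helper)."""
    root = Path(directory)
    return {
        path.relative_to(root).as_posix(): file_checksum(path, algorithm)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def check_fixity(directory, manifest, algorithm="sha256"):
    """Compare current checksums with a stored manifest.

    Returns a dict mapping each problem file to "missing" or "changed".
    An empty dict means every recorded file is still bit-identical.
    """
    root = Path(directory)
    problems = {}
    for relative_name, expected in manifest.items():
        target = root / relative_name
        if not target.is_file():
            problems[relative_name] = "missing"
        elif file_checksum(target, algorithm) != expected:
            problems[relative_name] = "changed"
    return problems
```

In use, a manifest would be built once at ingest and stored separately from the content; running check_fixity on a schedule then flags files that have been lost or silently altered. The sketch does not detect files added after the manifest was made.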
Tools module 4
• EPrints preservation apps, including the storage controller, Dave Tarrant and Adam Field, University of Southampton
• Plato, preservation planning tool from the Planets project, Andreas Rauber and Hannes Kulovits, TU Wien
Steve Jobs launches Apple iPad
“75 million people already own iPod Touches and iPhones. That's all people who already know how to use the iPad.”
Picture by curiouslee http://www.flickr.com/photos/curiouslee/4320074421/
Evaluation module 4
“This part of the course made me appreciate how big the area of preservation is and also the amount of research undertaken in this area.”
Tools module 5
• TRAC, Trustworthy Repositories Audit & Certification: Criteria and Checklist
• DRAMBORA, Digital Repository Audit Method Based On Risk Assessment, Martin Donnelly, Digital Curation Centre, University of Edinburgh
Evaluation: DRAMBORA
“We will definitely be investigating DRAMBORA further”
KeepIt course time
Module            Total time     Group work
Module 1          5h 20 mins     3h (56%)
Module 2          5h 05 mins     2h 30 mins (49%)
Module 3          5h 20 mins     3h (56%)
Module 4 day 1    5h 15 mins     3h (57%)
Module 4 day 2    4h 30 mins     3h 30 mins (inc. panel) (78%)
Module 5          5h 15 mins     2h (38%)
Total             30h 45 mins    17h (55%)
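The totals and group-work percentages in the course times above can be rechecked with a few lines of arithmetic. This is a small sketch, with the durations copied from the figures above; the names SESSIONS, as_hours and group_share are illustrative only.

```python
# (label, total minutes, group-work minutes), copied from the course times above
SESSIONS = [
    ("Module 1", 5 * 60 + 20, 3 * 60),
    ("Module 2", 5 * 60 + 5, 2 * 60 + 30),
    ("Module 3", 5 * 60 + 20, 3 * 60),
    ("Module 4 day 1", 5 * 60 + 15, 3 * 60),
    ("Module 4 day 2", 4 * 60 + 30, 3 * 60 + 30),
    ("Module 5", 5 * 60 + 15, 2 * 60),
]


def as_hours(minutes):
    """Format a number of minutes in the 'Xh YY mins' style used on the slide."""
    hours, mins = divmod(minutes, 60)
    return f"{hours}h {mins:02d} mins"


def group_share(total, group):
    """Group work as a whole-number percentage of session time."""
    return round(100 * group / total)


total_minutes = sum(total for _, total, _ in SESSIONS)
group_minutes = sum(group for _, _, group in SESSIONS)

for label, total, group in SESSIONS:
    print(f"{label}: {as_hours(total)}, group work {as_hours(group)} "
          f"({group_share(total, group)}%)")
print(f"Total: {as_hours(total_minutes)}, group work {as_hours(group_minutes)} "
      f"({group_share(total_minutes, group_minutes)}%)")
```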
KeepIt course summary in tweets
• jisckeepit: KeepIt course 1, result 1: senior directors at Northampton U. support use of DAF http://bit.ly/cFcsas (Mon, 08 Feb 2010)
• digitalfay: uploaded my first file to the cloud using #eprints next stop: comprehensive bitstream preservation policies for repository content (Thu, 18 Mar 2010)
• digitalfay: very impressed with end-to-end logical preservation process #eprints3.2 (risk audit) to #planetsway (planning) & back again (action) (Thu, 18 Mar 2010)
• jisckeepit: KeepIt course 4: practical-make a preservation plan in Plato, upload it to EPrints and it enacts the plan on your collection. Magic! (Mon, 22 Mar 2010)
• jisckeepit: KeepIt course: There's now a substantial group of repository managers out there ready and able to apply appropriate preservation tools (Wed, 31 Mar 2010)
• clairemparry: @jisckeepit absolutely - thanks to all the tutors & organisers for a fantastic course which made all the long train journeys worth it (Tue, 06 Apr 2010)
• jisckeepit: KeepIt course 5: revision, evaluation and concluding thoughts - the last hurrah. Complete course slides now at http://bit.ly/8XMesd (Thu, 08 Apr 2010)
Selected tweets from Twapperkeeper for #dprc http://twapperkeeper.com/hashtag/dprc
Lessons from the KeepIt course
• The digital preservation community has produced an array of tools covering most requirements.
• Repository managers have responded positively to practice with these tools.
• Repository managers need to act to shape their repositories for the next phase of development: expansion; diversification or focus.
• These tools support this process, as well as the technical management of digital preservation.
• Still need to reduce complexity and make tools simpler for non-specialists.
• One approach is to integrate tools into familiar interfaces, such as repositories.
• This is a great story for digital preservation
Credits
• KeepIt team at the University of Southampton: Les Carr, PI; Steve Hitchcock, project manager; David Tarrant, developer
• KeepIt preservation exemplar repositories are led by: Simon Coles (eCrystals, University of Southampton); Stephanie Meece (University of the Arts London); Debra Morris (EdShare, University of Southampton); Miggie Pickton (Nectar, University of Northampton)
Thanks to all KeepIt course tutors and tutees
KeepIt is funded by JISC (to Sept. 2010) as part of its Information Environment Programme 2009-11
http://preservation.eprints.org/keepit/