Preservation4Rookies..

Ensuring Long-term Access to ETDs
Distributed Preservation Network
Gail McMillan
Director, Digital Library and Archives Virginia Tech
[email protected]
Newcomers’ Workshop @ ETD 2009
University of Pittsburgh
What is Digital Preservation?

Systematic management of digital works
over an indefinite period of time
 Processes and activities that ensure the
continued access to works in digital formats
 Requires ongoing attention: a constant input
of resources (effort, time, money)
 Technological and organizational change are
obstacles to preserving works beyond a few years.
Backup/IRs vs. Digital Preservation
Backups are tactical measures.
Make copies to restore the originals after a data
loss event.
Typically stored in a single location
– often nearby
– collocated with the servers backed up
Address short-term data loss with minimal
investment of resources
Backups are not a comprehensive solution to the
problem of preserving information over time.
Digital Preservation is Strategic

Long-term, error-free storage with means for
retrieval and interpretation for the entire time span
the information is required.
 To realistically address the issues involved in
preserving information over time, a true digital
preservation program requires
– Multi-institutional collaboration
– Geographically dispersed set of secure caches
– Some ongoing investment
Distributed Digital Preservation Network

Security reduces the likelihood that any single cache will
be compromised.
 Distribution reduces the likelihood that the loss of any
single cache will lead to a loss of the preserved
information.
 A single organization is unlikely to have the capability to
operate several geographically dispersed and securely
maintained servers.
 Inter-institutional agreements will ensure commitment to
act in concert over time.
Effective preservation succeeds by replicating copies of
content in secure, distributed locations over time.
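As a rough illustration of why replication across caches helps, suppose (purely as an assumption for this sketch) that each cache is lost independently with the same probability over some period. The chance that every copy is lost is then that probability raised to the number of caches. The function name and the 5% figure are hypothetical and do not come from the NDLTD or MetaArchive.

```python
def prob_all_copies_lost(p_single: float, n_caches: int) -> float:
    """Probability that all caches are lost, ASSUMING independent failures.

    p_single: assumed probability that any one cache is lost.
    n_caches: number of caches holding a full copy.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_caches < 1:
        raise ValueError("n_caches must be at least 1")
    return p_single ** n_caches


if __name__ == "__main__":
    # Hypothetical 5% loss probability per cache.
    for n in (1, 3, 6):
        print(n, prob_all_copies_lost(0.05, n))
```

Real failures are rarely fully independent (shared software, shared region, shared funding), which is one reason the caches are meant to be geographically dispersed and run by different institutions.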
NDLTD Preservation Strategy:
MetaArchive Cooperative

Programmatically harvests ETDs at host
universities
– Secure access: only authorized partners’ servers

Distributes content among partners
– Digital preservation network around the world
– Standard hardware, free software
– Audits and repairs content as needed from host or
partners
– Low cost to administer and run

Dark Archive only
http://scholar.lib.vt.edu/theses/NDLTD/NDLTDPreservationPlan200906.pdf
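The "audits and repairs content" step can be sketched as a checksum comparison across caches. This is a simplified majority vote written for illustration only; it is not the LOCKSS polling protocol that the network actually runs, and the cache names and the `audit_and_repair` function are hypothetical.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the content as a hex string."""
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(replicas: Dict[str, bytes]) -> List[str]:
    """Compare every replica's digest and repair copies that disagree.

    replicas maps a (hypothetical) cache name to the bytes it holds.
    Copies whose digest differs from the strict-majority digest are
    overwritten in place with the majority content.
    Returns the sorted names of the caches that were repaired.
    """
    if not replicas:
        return []
    digests = {name: sha256_hex(data) for name, data in replicas.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    # Require a strict majority; otherwise we cannot tell which copy is good.
    if votes * 2 <= len(replicas):
        raise RuntimeError("no majority agreement; manual review needed")
    good = next(data for name, data in replicas.items()
                if digests[name] == winner)
    repaired = []
    for name in replicas:
        if digests[name] != winner:
            replicas[name] = good
            repaired.append(name)
    return sorted(repaired)


if __name__ == "__main__":
    caches = {
        "cache_a": b"%PDF-1.4 thesis",
        "cache_b": b"%PDF-1.4 thesis",
        "cache_c": b"%PDF-1.4 th\x00sis",  # simulated bit rot
    }
    print(audit_and_repair(caches))
```

With only two copies that disagree, the sketch refuses to guess, which illustrates why a network of several partners is more useful than a single mirror.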
ETD File Formats

85% PDF
30% JPG
27% WAV
24% GIF
23% HTML, MOV
21% AVI, MP3
MetaArchive Conspectus Database
Start ETD Preservation Planning Now

Thursday, 10:45 am
– Breakout Session 2A: New Trends

Avoiding the Calf-Path: Digital Preservation
Readiness for Growing Collections and
Distributed Preservation Networks
– Best practices for preservation readiness
• How to organize an ETD collection
• Metadata discipline: ETD-MS
• Live vs. static media
– ETD-specific LOCKSS-based collaborative distributed
archive sponsored by the NDLTD: www.metaarchive.org