Digital Preservation through Regional Cooperation: LOCKSS

Digital Preservation through
Cooperation: LOCKSS
Gail McMillan
Digital Library and Archives, University Libraries
Virginia Polytechnic Institute and State University
VIVA Steering Committee and SCHEV LAC
Virginia State University
June 10, 2005
Libraries: Collections, not just Links
• Libraries should own, as well as manage, their digital
collections, including
– Content currently leased: VIVA examples
• BioOne, Cambridge University Press, Nature Publishing Group, Project Muse
• See http://lockss.stanford.edu/about/titles.htm
– LOCKSS prevents the publisher from revoking access rights to
back content
– Open-access web resources, for example
• Abbey's Web: Provides links to biographical information,
bibliographies, articles, and other resources about the environmental
writer, Edward Abbey: http://www.abbeyweb.net/
LOCKSS Basics
Library uses inexpensive computer and free software
• Programmatically collects content from publisher
• Preserves content among LOCKSS servers
– Periodically audits content and repairs as needed from other
LOCKSS servers
• Disseminates content to library’s appropriate users
– Host library’s readers see the content from publisher’s URL
– Unless it isn’t available from there
• It is then delivered from the library's own LOCKSS-preserved content.
• It doesn’t look any different.
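The audit, repair, and fallback behavior described above can be illustrated with a small sketch. This is a simplified illustration only, not the actual LOCKSS protocol or software: the names `audit_and_repair` and `serve`, the in-memory dictionaries standing in for LOCKSS servers and the publisher site, and the simple majority-vote rule are all assumptions made for the example.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return a SHA-256 hex digest used to compare copies."""
    return hashlib.sha256(content).hexdigest()


def audit_and_repair(peers: dict, url: str) -> list:
    """Compare each peer's copy of `url` and repair copies that disagree.

    `peers` maps a (hypothetical) peer name to a dict of url -> bytes.
    A copy is trusted only if a strict majority of existing copies share
    its digest. Returns the names of the peers that were repaired.
    """
    votes = Counter(digest(store[url]) for store in peers.values() if url in store)
    if not votes:
        return []  # nobody holds this URL, nothing to audit
    winning_digest, count = votes.most_common(1)[0]
    if count * 2 <= sum(votes.values()):
        return []  # no clear majority, so do not overwrite anything
    good_copy = next(
        store[url]
        for store in peers.values()
        if url in store and digest(store[url]) == winning_digest
    )
    repaired = []
    for name, store in peers.items():
        if url not in store or digest(store[url]) != winning_digest:
            store[url] = good_copy  # repair from an agreeing peer
            repaired.append(name)
    return repaired


def serve(url: str, publisher: dict, local_store: dict):
    """Serve from the publisher if possible, else from the preserved copy."""
    content = publisher.get(url)
    if content is not None:
        return content
    return local_store.get(url)
```

In this sketch the reader calls `serve` with the same URL in both cases, which mirrors the point that preserved content "doesn't look any different" from the publisher's copy.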
LOCKSS and EJournals
• Library (consortium) negotiates with publishers
• Publishers trust LOCKSS
– Collections begin with subscriptions, not retrospectively
– Libraries have access to their collections in perpetuity
– Outside the appropriate user community, access is granted only to audit and
repair files
• Low cost to administer and run
– Less than 1 hour per month
– 95% of systems patched within 48 hours
• Low storage costs (2003): $0.70 to store one year of one journal
(~0.5 GB)
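As a rough worked example of the storage figures above, the sketch below scales the 2003 numbers to a larger collection. It assumes the $0.70 and ~0.5 GB figures apply per journal per year and scale linearly; the function name `storage_estimate` and the collection sizes are hypothetical.

```python
COST_PER_JOURNAL_YEAR_USD = 0.70  # 2003 figure quoted on the slide
GB_PER_JOURNAL_YEAR = 0.5         # approximate size quoted on the slide


def storage_estimate(journals: int, years: int) -> tuple:
    """Return (cost in USD, size in GB) assuming linear scaling."""
    journal_years = journals * years
    cost = journal_years * COST_PER_JOURNAL_YEAR_USD
    size_gb = journal_years * GB_PER_JOURNAL_YEAR
    return cost, size_gb
```

Under these assumptions, ten years of 100 journals would come to roughly $700 and 500 GB at 2003 prices.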
LOCKSS software turns a PC
into a preservation tool
One PC holds >3,000 years of an
average electronic journal (2005)
600 MHz CPU, 128 MB RAM, bootable CD drive, floppy disk drive
LOCKSS and Publishers
• Suggested license language permits
libraries to
– Collect and preserve currently accessible
materials, i.e., subscription-based content
– Use materials consistent with original
license terms
– Provide copies to others for purposes of
audit and repair
Review of Writing and Photography of Appalachia
LOCKSS is for more than just ejournals
• MetaArchive of Southern Digital Culture
• ETDs: Electronic Theses and Dissertations
– ASERL: Association of SouthEastern Research
Libraries
• 9/11 web sites -- NYPL
• Newspapers -- University of Utah
• Government Documents
NDIIPP
National Digital Information Infrastructure and
Preservation Program
• Created by federal legislation in December 2000
• Support preservation of significant “born-digital”
content at risk
• Three areas of focus
– Network of preservation partners: Clear instructions
from legislators that LC should work with others
– Architectural framework for preservation
– Digital preservation research
MetaArchive NDIIPP Network
University of Louisville
Va Tech
Emory University
Ga Tech
Florida State University
Auburn University
http://www.metaarchive.org
Key Features of a Secure MetaArchive
1. Distributed preservation strategy
2. Flexible organizational model
3. Formal content selection process
4. Capability for migrating archives
5. Dim archiving strategy
6. Low cost of deployment
7. Self-sustaining incentives
8. Simple preservation exchange mechanisms with
the Library of Congress
MetaArchive Project Goals
1. Create a conspectus of digital content within the
subject domain held by the partner sites
2. Harvest a body of the most critical content to be
preserved (3 terabytes, with the capability to expand)
3. Develop a model cooperative agreement for
ongoing collaboration and sustainability
4. Build a distributed preservation network infrastructure
based on the LOCKSS software
MetaArchive: Deliverables,
more than CLOCKSS
• Define the Scope of the Content
– What is Southern digital culture?
– What is “at risk?”
• Developing a Conspectus: Content Selection
– What collections will be preserved?
– Metadata
• Adaptations showing any unique or qualified tags
• Rights issues: harvesting for preservation vs. user
access
MetaArchive’s CLOCKSS
(Collecting Lots of Copies Keeps Stuff Safe)
• Diversifying LOCKSS
– Software, hardware, collections, communities
• Study problems
– Dynamic content
– Format migration (next grant)
• Cooperative agreement model
– Provide not only an effective preservation network for
one body of digital content, but also enable the
creation of many others for this important
purpose.
http://www.lockss.org