Gail McMillan
Digital Library and Archives, University Libraries
Virginia Polytechnic Institute and State University
VIVA Steering Committee and SCHEV LAC
Virginia State University
June 10, 2005
Libraries: Collections, not just Links
• Libraries should own, as well as manage, their digital
collections, including
– Content currently leased: VIVA examples
• BioOne, Cambridge University Press, Nature Publishing Group, Project Muse
• See
– LOCKSS prevents the publisher from revoking access rights to
back content
– Open-access web resources, for example
• Abbey's Web: Provides links to biographical information,
bibliographies, articles, and other resources about the environmental
writer, Edward Abbey:
Library uses an inexpensive computer and free software
• Programmatically collects content from publisher
• Preserves content among LOCKSS servers
– Periodically audits content and repairs as needed from other
LOCKSS servers
• Disseminates content to library’s appropriate users
– Host library’s readers see the content at the publisher’s URL
– If the publisher’s copy is unavailable
• Content is delivered from the library’s own LOCKSS-preserved copy.
• It looks no different to the reader.
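The audit, repair, and delivery steps on this slide can be sketched in a few lines of Python. This is a simplified illustration only, not the actual LOCKSS polling protocol or software: it assumes a plain majority vote over SHA-256 hashes, and the names `audit_and_repair`, `serve`, `fetch_from_publisher`, and `preserved` are hypothetical.

```python
import hashlib
from collections import Counter


def digest(content):
    """Hash a copy so boxes can compare content without trusting each other."""
    return hashlib.sha256(content).hexdigest()


def audit_and_repair(local, peer_copies):
    """Return the copy this box should keep after one audit round.

    Assumption: a strict majority of all copies (local plus peers) decides
    which version is correct. Without a majority the local copy is kept.
    """
    all_copies = [local] + list(peer_copies)
    votes = Counter(digest(c) for c in all_copies)
    winning_hash, count = votes.most_common(1)[0]
    if count * 2 <= len(all_copies):
        return local  # no strict majority: do not overwrite anything
    if digest(local) == winning_hash:
        return local  # local copy agrees with the majority
    # Local copy is damaged: repair it from a peer holding the majority version.
    return next(c for c in peer_copies if digest(c) == winning_hash)


def serve(url, fetch_from_publisher, preserved):
    """Deliver from the publisher first; fall back to the preserved copy."""
    try:
        return fetch_from_publisher(url)
    except OSError:
        # Publisher unavailable: the reader gets the library's own copy.
        return preserved[url]
```

Because `serve` returns the same bytes in either case, the reader cannot tell which source answered, which is the point of the last bullet above.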
LOCKSS and EJournals
• Library (consortium) negotiates with publishers
• Publishers trust LOCKSS
– Collections begin with subscriptions, not retrospectively
– Libraries have access to their collections in perpetuity
– Outside the appropriate user community, access only to audit and
repair files
• Low cost to administer and run
– Less than 1 hour per month
– 95% of systems patched within 48 hours
• Low storage costs: 2003: $0.70 = one year, one journal,
LOCKSS software turns a PC
into a preservation tool
One PC holds >3,000 years of an
average electronic journal (2005)
600 MHz CPU, 128 MB RAM, bootable CD drive, floppy disk drive
LOCKSS and Publishers
• Suggested license language permits
libraries to
– Collect and preserve currently accessible
materials, i.e., subscription-based content
– Use materials consistent with original
license terms
– Provide copies to others for purposes of
audit and repair
Review of Writing and Photography of Appalachia
LOCKSS is for more than just ejournals
• MetaArchive of Southern Digital Culture
• ETDs: Electronic Theses and Dissertations
– ASERL: Association of Southeastern Research Libraries
• 9/11 web sites -- NYPL
• Newspapers -- University of Utah
• Government Documents
National Digital Information Infrastructure and
Preservation Program
• Created by federal legislation in December 2000
• Support preservation of significant “born-digital”
content at risk
• Three areas of focus
– Network of preservation partners: Clear instructions
from legislators that LC should work with others
– Architectural framework for preservation
– Digital preservation research
MetaArchive NDIIPP Network
University of Louisville
Virginia Tech
Emory University
Georgia Tech
Florida State University
Auburn University
Key Features of a Secure MetaArchive
Distributed preservation strategy
Flexible organizational model
Formal content selection process
Capability for migrating archives
Dim archiving strategy
Low cost of deployment
Self-sustaining incentives
Simple preservation exchange mechanisms with
the Library of Congress
MetaArchive Project Goals
1. Create a conspectus of digital content within the
subject domain held by the partner sites
2. Harvest a body of the most critical content to be
preserved (3 terabytes, with capability to expand)
3. Develop a model cooperative agreement for
ongoing collaboration and sustainability
4. Build a distributed preservation network infrastructure
based on the LOCKSS software
MetaArchive: Deliverables,
more than CLOCKSS
• Define the Scope of the Content
– What is Southern digital culture?
– What is “at risk?”
• Developing a Conspectus: Content Selection
– What collections will be preserved?
– Metadata
• Adaptations showing any unique or qualified tags
• Rights issues: harvesting for preservation vs. user
MetaArchive’s CLOCKSS
(Collecting Lots of Copies Keeps Stuff Safe)
• Diversifying LOCKSS
– Software, hardware, collections, communities
• Study problems
– Dynamic content
– Format migration (next grant)
• Cooperative agreement model
– Not only an effective preservation network for
one body of digital content, but enable the
creation of many others for this important