Electronic Publishing
Archiving
What is it and why should it be
important to me?
John Shaw
Director, Publishing Technologies
SAGE Publications, U.S.
I. Archiving Overview
II. Types of Archives
III. A SAGE Example
IV. Risks, Questions, and More Questions
Archiving Part I:
Archiving Overview
What is an Archive?
An authoritative collection
Preserved and professionally managed in perpetuity
History, institutional commitment & policy, integrity re:
preservation
“…information needed for society’s memory.” "Schellenberg
in Cyberspace," American Archivist 61:2 (Fall 1998), p. 309-327.
Preservation first
What is a Repository?
“A place where things can be stored and maintained; a
storehouse.”
[Society of American Archivists Glossary]
“Depository” means the same;
it also denotes a library that receives government documents for public
access
Not all repositories are archives
Why Care?
“Preserving information for decades or even centuries has
proved important. Shang dynasty (12th century BC)
Chinese astronomers inscribed eclipse observations on
“oracle bones” (animal bones and tortoise shells).
About 3200 years later researchers used these records,
together with one from 1302BC, to estimate that the
accumulated clock error was just over 7 hours, and
from this derived a value for the viscosity of the Earth's
mantle as it rebounds from the weight of the glaciers.”
Why Care?
“These timescales of many decades, even centuries,
contrast with the typical 5-year lifetime for computing
hardware and digital media”
“A Fresh Look at the Reliability of Long-term Digital Storage.” Baker, Mary, et al.,
EuroSys '06, April 18-21, 2006
Why Care?
Preservation: Digital information is impermanent
Publisher: Safety
to ensure ongoing availability of your content
Your library customers: Custodianship
to ensure continuity of the record of scientific
progress
Very long view: epistemology, history of
science and culture
What Should be Preserved?
Scholarly content
Research materials
Web-based, digitally born content
How e-Archives Differ
Mission: collection v. preservation
Access control, dark v. light
Deposits
Why: voluntary v. mandated
Who: author v. publisher
What: manuscripts v. final work
When: backfile v. current content
Future format migration
Rights transfer
Costs
Archiving Part II:
Types of Archives
Types of Archives:
National archives
Institutional repositories
Community-based archives
Product solution archives
Types of Archives:
National
Dutch National Library
Koninklijke Bibliotheek (KB)
British Library
NIH – PubMed Central?
“NIH’s digital repository for biomedical research”
Library of Congress?
KB: Dutch National Library
Mission: Legal deposit library
Deposits: Source files from publishers
“…collect, catalogue and preserve all publications
appearing in the Netherlands. ”
Capable of ingesting 60,000 articles/day
Automated, strict
Costs?
Access Control:
Local patron access
Publisher sets remote access rules
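To put the quoted ingest capacity in perspective, here is a small arithmetic sketch in Python. Only the 60,000 articles/day figure comes from the slide; the backfile size used in the usage note is a hypothetical number chosen for illustration, and the helper names are assumptions.

```python
import math

# Figure from the slide: KB can ingest up to 60,000 articles per day.
ARTICLES_PER_DAY = 60_000
SECONDS_PER_DAY = 24 * 60 * 60


def articles_per_second(per_day: int = ARTICLES_PER_DAY) -> float:
    """Average sustained ingest rate implied by a daily capacity."""
    return per_day / SECONDS_PER_DAY


def days_to_load(backfile_articles: int, per_day: int = ARTICLES_PER_DAY) -> int:
    """Whole days needed to load a backfile at full daily capacity."""
    return math.ceil(backfile_articles / per_day)
```

For a hypothetical backfile of 500,000 articles, `days_to_load(500_000)` gives the number of full-capacity days a publisher would need to allow for the initial load.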
KB: Dutch National Library
Migration: Preservation research leader
Committed to format migration
Archiving agreements with:
OUP, SAGE, Blackwell, Elsevier, Kluwer Academic, etc.
The British Library
Legal Deposit Pilot
Mission: Legal deposit library
UK-published (to start)
Pilot: Legal deposit for e-journals
23 volunteer publishers
Secure infrastructure
Uses DigiTool by Ex Libris
Shared with the other UK legal deposit libraries
To “scope and test” ingest, storage, retrieval
Cost?
The British Library:
Preservation and Migration
BL’s future for managing digital assets
Migration
preserve any type of digital material in perpetuity
ensure that users can view the material with contemporary
applications
preserve the original look-and-feel where possible
Access Control
“appropriate permissions”
PMC: US National Library of
Medicine Journal Archive
Mission: Make research more accessible
Free full-text archive of 230 journals
Deposit: publishers submit source files
Migration
Access Control
Cost?
PMC: Depository for
NIH-Funded Research Articles
Authors of NIH-funded articles “encouraged” to
deposit final manuscript
“After all modifications due to …peer review”
MS Word, PDF, etc.
With supplementary information
Publisher can replace with published version
To be required soon?
Library of Congress
National Digital Information Infrastructure and Preservation
Program (NDIIPP) – formed in 2000
Members: National Library of Medicine, the National
Agricultural Library, the National Institute of Standards
and Technology, the Research Libraries Group, the OCLC
Online Computer Library Center, and the Council on
Library and Information Resources
Preliminary investigation and software development phase
Primarily e-journal deposit
Future …???
Types of Archives:
Institutional
University with expansive focus
Stanford Digital Repository
Automated
LOCKSS
Stanford Digital Repository
Stanford Univ. Libraries initiative
Digital preservation serving
Stanford University
Broader academic community
Publishers
Principles: Trust, Security, Transparency
Costs?
LOCKSS
Technology to preserve local library collection
Automated, self-correcting cache servers
Requires LOCKSS server at library
Requires publisher participation
Builds collection of all resources which the institution
licenses
Goes online to users if data source becomes
unavailable
Provides access to static “HTML images” of source
Costs
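One way to picture "self-correcting" caches is as peers that compare content hashes and repair any copy that disagrees with the majority. The Python sketch below works under that assumption only. The simple majority vote, the function names, and the library names are illustrative and are not the actual LOCKSS protocol.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def digest(content: bytes) -> str:
    """Fingerprint of one cached copy."""
    return hashlib.sha256(content).hexdigest()


def repair_caches(caches: Dict[str, bytes]) -> List[str]:
    """Replace copies that disagree with the majority; return the repaired cache names."""
    votes = Counter(digest(content) for content in caches.values())
    winner, count = votes.most_common(1)[0]
    if count <= len(caches) / 2:
        # Without a strict majority there is no basis for choosing a good copy.
        raise ValueError("no majority among caches")
    good_copy = next(c for c in caches.values() if digest(c) == winner)
    repaired = []
    for name, content in list(caches.items()):
        if digest(content) != winner:
            caches[name] = good_copy  # overwrite the damaged copy
            repaired.append(name)
    return repaired


# Hypothetical example: three library caches, one with a corrupted copy.
caches = {
    "library_a": b"article text",
    "library_b": b"article text",
    "library_c": b"artic1e text",
}
repaired = repair_caches(caches)
```

After the call, `repaired` names the cache that was overwritten and all three caches hold identical content.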
Types of Archives:
Product Solution
Non-profit organization
Portico
Portico
Mission: scholarly preservation
Standalone archive
Initiated by JSTOR, with grant funding
Deposits: source files from publisher
Migration: planned
Costs
Publishers: annual fee of $250 to $75,000,
based on annual revenue
Libraries: annual fee of $1,500 to $24,000,
based on Library Materials Expenditure
Portico: Access Control
Member libraries get access:
“when specific trigger events occur, and when titles are
no longer available from the publisher or other source.”
Trigger events include:
Publisher stops operations
Publisher ceases to publish a title
Publisher no longer offers back issues
Catastrophic and sustained failure of a publisher’s delivery
platform
Can also fulfill “perpetual access” subscription
obligations
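The access rule above can be read as a simple check: a member library gets access when a trigger event has occurred and the title is no longer available elsewhere. A minimal Python sketch of that reading follows. The class, field, and function names are hypothetical, and the perpetual-access case is left out.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Set


class TriggerEvent(Enum):
    """The four trigger events listed on the slide."""
    PUBLISHER_STOPS_OPERATIONS = auto()
    TITLE_CEASED = auto()
    BACK_ISSUES_NO_LONGER_OFFERED = auto()
    PLATFORM_FAILURE = auto()


@dataclass
class Title:
    name: str
    available_elsewhere: bool  # still offered by the publisher or another source
    events: Set[TriggerEvent] = field(default_factory=set)


def archive_access_open(title: Title, is_member_library: bool) -> bool:
    """True only for member libraries, after a trigger event, when no other source remains."""
    return is_member_library and bool(title.events) and not title.available_elsewhere


# Hypothetical titles for illustration.
ceased = Title("Journal A", available_elsewhere=False, events={TriggerEvent.TITLE_CEASED})
active = Title("Journal B", available_elsewhere=True)
```

Here `archive_access_open(ceased, True)` is true, while the same call for `active`, or for a non-member library, is false. The archive stays dark until both conditions hold.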
Types of Archives:
Community
Community based and openly run
CLOCKSS
CLOCKSS (Controlled LOCKSS)
Long-term global archiving solution
A small number of library participants maintain the archive on behalf
of the larger community
Libraries preserve member publisher content whether they subscribe or
not
Release only after a trigger event
Community-managed, failsafe repository for scholarly content
Serve libraries & publishers in the event of a long-term business
interruption
Publisher participation is voluntary
Release is a collaborative decision among publishers, libraries, and societies
“cost sharing” for system, not access
Costs?
Summary Table
          Agency   Primary Mission   Data                Access Control   Migration
KB        Gov’t    Preservation      Publisher           Twilight         Yes
BL        Gov’t    Preservation      Publisher           ?                Yes
Portico   Ind.     Failsafe          Publisher           Dark             Yes
PMC       Gov’t    Access            Publisher, Author   Light            Yes
LoC       Gov’t    Preservation      Publisher           ?                ?
SDR       Inst.    Preservation      Publisher           Twilight         Yes
LOCKSS    Inst.    Failsafe          Publisher           Dark             -
CLOCKSS   Comm.    Failsafe          Publisher           Dark             -
Summary:
How Repositories Differ
Stated purpose
Dark v. light
Complete backfile v. current only
Deposits
Who: author v. publisher
What: manuscripts v. final work
Why: voluntary v. mandated
Rights transfer
Access control
Costs
Archiving Part III:
A SAGE Example
Why Archive?
SAGE’s commitment to customers and partners
Critical to society arrangements
Essential for new e-sales (consortia + single
institutions) – Perpetual access
Business continuity
Long-term preservation
We are not archiving experts!
Where to Archive?
Dutch KB
CLOCKSS
LOCKSS
Portico
Library of Congress
British Library
How to Archive?
Provide details of digital availability
Provide sample of content
Provide details of content format (DTD)
Send all backfile for loading
Set up content flow for ongoing content
SAGE Experience with
Dutch KB
Contract and negotiation
Contact with technical team
Delivery of samples and details of scope
Follow-up questions
Visit KB – Find out what’s happening
Delivery of back content
Delivery of ongoing issues
Ongoing issue discrepancies
Archiving Part IV:
Questions, Questions and
More Questions
Measurements of Success
Who is overseeing the archiving process and
governance?
Compliance?
Accuracy and legitimacy?
Financial stability?
Resources
“Archiving should be done by librarians and archivists, period.” Gordon Tibbitts, Blackwell
Publishing. April 4, 2006, UKSG
Portico - http://www.portico.org/
LOCKSS - http://lockss.stanford.edu
CLOCKSS - http://www.lockss.org/clockss/Home
KB E-Depot - http://www.kb.nl/index-en.html
Depot Digital Archiving at the National Library of the Netherlands - http://www5.ibm.com/be/pdf/en/events/nextlevel/presentation_kb_den_haag_edepot_ibm_brussels_v03.pdf
“A Fresh Look at the Reliability of Long-term Digital Storage.” Baker, Mary, et al., EuroSys '06,
April 18-21, 2006
Digital Archives & Repositories: Why should I care? – Bernard Hecker, HighWire Press,
Publishers Meeting, October 2004
Archive Overview, – Bernard Hecker, HighWire Press, Publishers Meeting, April 2006
Trusted Digital Repositories: Attributes and Responsibilities. An RLG-OCLC Report. © 2002
Research Libraries Group
British Library: Project: JCLD Pilot Project in Anticipation of E-Journals, June 2005 Simon Inger
Note: Presentation based on Digital Archives & Repositories: Why should I care? – Bernard Hecker, HighWire Press,
Publishers Meeting, October 2004; Archive Overview. Bernard Hecker, HighWire Press, Publishers Meeting, April 2006;
Archiving: A SAGE Example. John Shaw. Publishers Meeting, April 2006
Thank You!
Contact info:
[email protected]
www.sagepub.com