Electronic Publishing


Archiving
What is it and why should it be
important to me?
John Shaw
Director, Publishing Technologies
SAGE Publications, U.S.
I. Archiving Overview
II. Types of Archives
III. A SAGE Example
IV. Risks, Questions, and More Questions
Archiving Part I:
Archiving Overview
What is an Archive?


An authoritative collection
Preserved and professionally managed in perpetuity


History, institutional commitment & policy, integrity re:
preservation
“…information needed for society’s memory.” "Schellenberg
in Cyberspace," American Archivist 61:2 (Fall 1998), p. 309-327.

Preservation first
What is a Repository?

“A place where things can be stored and maintained; a
storehouse.”
[Society of American Archivists Glossary]

“Depository” means the same thing
also: a library that receives government documents for public access
Not all repositories are archives
Why Care?
“Preserving information for decades or even centuries has
proved important. Shang dynasty (12th century BC)
Chinese astronomers inscribed eclipse observations on
"oracle bones" (animal bones and tortoise shells).
About 3200 years later researchers used these records,
together with one from 1302BC, to estimate that the
accumulated clock error was just over 7 hours, and
from this derived a value for the viscosity of the Earth's
mantle as it rebounds from the weight of the glaciers.”
Why Care?
“These timescales of many decades, even centuries,
contrast with the typical 5-year lifetime for computing
hardware and digital media”
“A Fresh Look at the Reliability of Long-term Digital Storage.” Baker, Mary, et al.
EuroSys '06, April 18-21, 2006
Why Care?
Preservation: Digital information is impermanent
Publisher: Safety
to ensure ongoing availability of your content
Your library customers: Custodianship
to ensure continuity of the record of scientific progress
Very long view: epistemology, history of science and culture
What Should be Preserved?



Scholarly content
Research materials
Web-based, digitally born content
How e-Archives Differ



Mission: collection v. preservation
Access control, dark v. light
Deposits







Why: voluntary v. mandated
Who: author v. publisher
What: manuscripts v. final work
When: backfile v. current content
Future format migration
Rights transfer
Costs
Archiving Part II:
Types of Archives
Types of Archives:




National archives
Institutional repositories
Community-based archives
Product solution archives
Types of Archives:
National




Dutch National Library
Koninklijke Bibliotheek (KB)
British Library
NIH – PubMed Central?
“NIH’s digital repository for biomedical research”
Library of Congress?
KB: Dutch National Library

Mission: Legal deposit library
“…collect, catalogue and preserve all publications appearing in the Netherlands.”
Deposits: Source files from publishers
Capable of ingesting 60,000 articles/day
Automated, strict
Costs?
Access Control:


Local patron access
Publisher sets remote access rules
KB: Dutch National Library

Migration: Preservation research leader


Committed to format migration
Archiving agreements with:

OUP, Sage, Blackwell, Elsevier, Kluwer Academic, etc.
The British Library
Legal Deposit Pilot

Mission: Legal deposit library


UK-published (to start)
Pilot: Legal deposit for e-journals


23 volunteer publishers
Secure infrastructure
Uses DigiTool by Ex Libris
Shared with the other UK legal deposit libraries



To “scope and test” ingest, storage, retrieval
Cost?
The British Library:
Preservation and Migration

BL’s future for managing digital assets


Migration



preserve any type of digital material in perpetuity
ensure that users can view the material with contemporary
applications
preserve the original look-and-feel where possible
Access Control

“appropriate permissions”
PMC: US National Library of
Medicine Journal Archive






Mission: Make research more accessible
Free full-text archive of 230 journals
Deposit: publishers submit source files
Migration
Access Control
Cost?
PMC: Depository for
NIH-Funded Research Articles

Authors of NIH-funded articles “encouraged” to
deposit final manuscript





“After all modifications due to …peer review”
MS Word, PDF, etc.
With supplementary information
Publisher can replace with published version
To be required soon?
Library of Congress




National Digital Information Infrastructure and Preservation
Program (NDIIPP) – formed in 2000
Members: National Library of Medicine, the National
Agricultural Library, the National Institute of Standards
and Technology, the Research Libraries Group, the OCLC
Online Computer Library Center, and the Council on
Library and Information Resources
Preliminary investigation and software development phase
Primarily e-journal deposit
Future …???
Types of Archives:
Institutional


University with expansive focus
Stanford Digital Repository
Automated
LOCKSS
Stanford Digital Repository


Stanford Univ. Libraries initiative
Digital preservation serving





Stanford University
Broader academic community
Publishers
Principles: Trust, Security, Transparency
Costs?
LOCKSS






Technology to preserve local library collection
Automated, self-correcting cache servers
Requires LOCKSS server at library
Requires publisher participation
Builds a collection of all resources that the institution licenses
Goes online to users if the data source becomes unavailable
Provides access to static “HTML images” of source
Costs
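One way to picture "self-correcting" cache servers: each library's cache hashes its copy of an item, the caches compare hashes, and a cache whose copy disagrees with the majority replaces it with a good copy. The sketch below illustrates that idea only; it is not the LOCKSS protocol itself. The function names `digest` and `audit_and_repair` and the sample library names are hypothetical.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return a SHA-256 hex digest used to compare copies."""
    return hashlib.sha256(content).hexdigest()


def audit_and_repair(caches: dict[str, bytes]) -> list[str]:
    """Compare every cache's copy and overwrite minority copies.

    `caches` maps a (hypothetical) library name to its stored copy.
    Returns the names of the caches that were repaired.
    Raises ValueError when no copy holds a strict majority.
    """
    votes = Counter(digest(copy) for copy in caches.values())
    winning_digest, count = votes.most_common(1)[0]
    if count * 2 <= len(caches):
        raise ValueError("no majority copy; cannot repair safely")

    # Any copy matching the majority digest can serve as the repair source.
    good_copy = next(c for c in caches.values() if digest(c) == winning_digest)

    repaired = []
    for name, copy in list(caches.items()):
        if digest(copy) != winning_digest:
            caches[name] = good_copy
            repaired.append(name)
    return repaired


caches = {
    "library_a": b"<html>Article 1</html>",
    "library_b": b"<html>Article 1</html>",
    "library_c": b"<html>Artic\x00e 1</html>",  # simulated bit rot
}
repaired = audit_and_repair(caches)
```

Requiring a strict majority is a deliberate choice in this sketch: with an even split there is no way to tell which copy is damaged, so the function refuses to overwrite anything.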
Types of Archives:
Product Solution

Non-profit organization
Portico
Portico

Mission: scholarly preservation





Standalone archive
Initiated by JSTOR, with grant funding
Deposits: source files from publisher
Migration: planned
Costs

Publishers: annual fee of $250 to $75,000, based on annual revenue
Libraries: annual fee of $1,500 to $24,000, based on Library Materials Expenditure
Portico: Access Control

Member libraries get access:


“when specific trigger events occur, and when titles are
no longer available from the publisher or other source.”
Trigger events include:





Publisher stops operations
Publisher ceases to publish a title
Publisher no longer offers back issues
Catastrophic and sustained failure of a publisher’s delivery
platform
Can also fulfill “perpetual access” subscription
obligations
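The access rule quoted above combines two conditions: a trigger event has occurred, and the title is no longer available from the publisher or another source. A minimal sketch of that rule follows, assuming a simple per-title record. The names `TriggerEvent`, `TitleStatus`, and `library_has_access` are hypothetical, and the "perpetual access" case is not modeled.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class TriggerEvent(Enum):
    """The four trigger events listed on the slide."""
    PUBLISHER_STOPS_OPERATIONS = auto()
    PUBLISHER_CEASES_TITLE = auto()
    BACK_ISSUES_NO_LONGER_OFFERED = auto()
    DELIVERY_PLATFORM_FAILURE = auto()


@dataclass
class TitleStatus:
    title: str
    # Trigger events recorded for this title (hypothetical bookkeeping).
    trigger_events: set = field(default_factory=set)
    # True while the publisher or another source still offers the title.
    available_elsewhere: bool = True


def library_has_access(is_member: bool, status: TitleStatus) -> bool:
    """Member libraries get access only when a trigger event has occurred
    and the title is unavailable from the publisher or any other source."""
    return (
        is_member
        and bool(status.trigger_events)
        and not status.available_elsewhere
    )


active = TitleStatus("Journal A")
orphaned = TitleStatus(
    "Journal B",
    trigger_events={TriggerEvent.PUBLISHER_STOPS_OPERATIONS},
    available_elsewhere=False,
)
```

Under these assumptions, `library_has_access(True, orphaned)` is true, while the same call for `active`, or for a non-member library, is false.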
Types of Archives:
Community

Community based and openly run

CLOCKSS
CLOCKSS (Controlled LOCKSS)

Long-term global archiving solution




A small number of library participants maintain the archive on behalf
of the larger community



libraries preserve member publisher content whether they subscribe or
not
Release only after a trigger event


Community-managed, failsafe repository for scholarly content
Serve libraries & publishers in the event of a long-term business
interruption
Publisher participation is voluntary
Release is a collaborative decision among publishers, libraries, and societies
“cost sharing” for system, not access
Costs?
Summary Table

         Agency  Primary Mission  Data         A/C       Migration
KB       Gov’t   Preserv          Pub          Twilight  Yes
BL       Gov’t   Preserv          Pub          ?         Yes
Portico  Ind.    Failsafe         Pub          Dark      Yes
PMC      Gov’t   Access           Pub, Author  Light     Yes
LoC      Gov’t   Preserv          Pub          ?         ?
SDR      Inst.   Preserv          Pub          Twilight  Yes
LOCKSS   Inst.   Failsafe         Pub          Dark      -
CLOCKSS  Comm.   Failsafe         Pub          Dark      -
Summary:
How Repositories Differ




Stated purpose
Dark v. light
Complete backfile v. current only
Deposits






Who: author v. publisher
What: manuscripts v. final work
Why: voluntary v. mandated
Rights transfer
Access control
Costs
Archiving Part III:
A SAGE Example
Why Archive?






SAGE’s commitment to customers and partners
Critical to society arrangements
Essential for new e-sales (consortia + single
institutions) – Perpetual access
Business continuity
Long-term preservation
We are not archiving experts!
Where to Archive?






Dutch KB
CLOCKSS
LOCKSS
Portico
Library of Congress
British Library
How to Archive?





Provide details of digital availability
Provide sample of content
Provide details of content format (DTD)
Send all backfile for loading
Set up content flow for ongoing content
SAGE Experience with Dutch KB
Contract and negotiation
Contact with technical team
Delivery of samples and details of scope
Follow-up questions
Visit KB – Find out what’s happening
Delivery of back content
Delivery of ongoing issues
Ongoing issue discrepancies
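Issue discrepancies of the kind listed above can be caught by comparing what the publisher delivered with what the archive reports holding. The sketch below assumes both sides can produce a list of issue identifiers. The function `find_discrepancies` and the identifier format are hypothetical and do not describe the KB's actual process.

```python
def find_discrepancies(sent: set[str], held: set[str]) -> dict[str, list[str]]:
    """Compare issue identifiers delivered by the publisher with those
    the archive reports holding, and list the differences both ways."""
    return {
        # Delivered by the publisher but not found at the archive.
        "missing_at_archive": sorted(sent - held),
        # Held by the archive but absent from the publisher's delivery log.
        "unexpected_at_archive": sorted(held - sent),
    }


# Hypothetical identifiers: journal code, volume, issue.
sent_issues = {"JRN-12-1", "JRN-12-2", "JRN-12-3"}
held_issues = {"JRN-12-1", "JRN-12-3", "JRN-11-4"}
report = find_discrepancies(sent_issues, held_issues)
```

Running the check after each delivery keeps the follow-up list short: here the archive is missing one issue and holds one that the delivery log does not account for.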

Archiving Part IV:
Questions, Questions and
More Questions
Measurements of Success




Who is overseeing the archiving process and
governance?
Compliance?
Accuracy and legitimacy?
Financial stability?
Resources











Archiving should be done by librarians and archivists, period. Gordon Tibbitts, Blackwell
Publishing. April 4, 2006, UKSG
Portico - http://www.portico.org/
LOCKSS - http://lockss.stanford.edu
CLOCKSS - http://www.lockss.org/clockss/Home
KB E-Depot - http://www.kb.nl/index-en.html
Depot Digital Archiving at the National Library of the Netherlands - http://www5.ibm.com/be/pdf/en/events/nextlevel/presentation_kb_den_haag_edepot_ibm_brussels_v03.pdf
“A Fresh Look at the Reliability of Long-term Digital Storage.” Baker, Mary, et al. EuroSys '06,
April 18-21, 2006
Digital Archives & Repositories: Why should I care? – Bernard Hecker, HighWire Press,
Publishers Meeting, October 2004
Archive Overview – Bernard Hecker, HighWire Press, Publishers Meeting, April 2006
Trusted Digital Repositories: Attributes and Responsibilities. An RLG-OCLC Report. © 2002
Research Libraries Group
British Library: Project: JCLD Pilot Project in Anticipation of E-Journals, June 2005 Simon Inger
Note: Presentation based on Digital Archives & Repositories: Why should I care? – Bernard Hecker, HighWire Press,
Publishers Meeting, October 2004; Archive Overview. Bernard Hecker, HighWire Press, Publishers Meeting, April 2006;
Archiving: A SAGE Example. John Shaw. Publishers Meeting, April 2006
Thank You!
Contact info:
[email protected]
www.sagepub.com