a centre of expertise in data curation and preservation
Introduction to Digital Archives
Maureen Pennock
EAOLUG Spring/Summer Meeting 2006
Funded by:
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland License.
To view a copy of this license, (a) visit http://creativecommons.org/licenses/by-nc-sa/2.5/scotland/; or, (b) send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.
EAOLUG :: RSC :: Cambridge
23 May 2006
Today’s talk
• The DCC
  • Background & Context
  • What We Do
• Digital Archives & Archiving
  • Definitions
  • Main Issues
  • OAIS
  • Systems
UK Digital Curation Centre
• JISC Circular 6/03 called for bids in digital curation
• JISC and the e-Science Core Programme funding
  • for development, services and outreach in digital curation
  • for a research programme
• Impetus to action
  • Growth in e-Science activity and data creation
  • Recognition that continuing access to digital information is needed
Partners
• University of Edinburgh (lead site)
  • Chris Rusbridge, Prof Peter Buneman
• University of Glasgow - HATII
  • Prof Seamus Ross, Director of HATII and ERPANET
• University of Bath - UKOLN
  • Dr Liz Lyon, Director of UKOLN
• Council for the Central Laboratory of the Research Councils (CCLRC)
  • Dr David Giaretta
Objectives
• Lead a vibrant international research programme to improve quality in data curation and digital preservation
• Deliver effective, efficient and high demand services
  • undertake evaluation of tools, methods, standards and policies
  • work with the community to establish registries of tools and technical information
• Create an active, innovative and collaborative Associates Network
• Connect communities
  • Universities and Research institutions
  • Scientific data and documents
  • International & cross-sector
Research
• Annotation in Databases
• Data archiving
• Socio-economic and legal issues
• Metadata extraction and curation
• Provenance and databases
• Data transformation, integration and publishing
• Security
• Supporting technologies
• Organisational and cultural challenges to digital curation
Development
• DCC Approach to Digital Curation (white paper), which sets out the path for development activities:
  • Monitoring international standards
  • Development of a Representation Information Registry/Repository (DCC RIR)
  • Development of recommendations for tools and methods for generating Representation Information
  • Creating testbeds for digital curation tools
  • Creating auditing and certification processes for trusted repositories
Services
• Information Services
  • Community-developed Digital Curation Manual
  • Briefing Papers & FAQs
  • Technology Watch
  • Case Studies
  • Best Practice Checklists
• Advisory Services
• Events: information days, workshops, training, conferences
• Helpdesk
• Audit and Certification Services
Summary
• Support and promote continuing improvement in the quality of data curation and preservation activity
• Nurture strong community relationships between practitioners, researchers, and curators
• Address digital curation from all aspects of the records life-cycle
• Develop and promote curation knowledge, tools and techniques
• Identify and research new organisational, technical, and supporting curation challenges
Digital Curation
• Digital curation is all about maintaining and adding value to a trusted body of digital information for current and future use; specifically, we mean the active management and appraisal of data over the life-cycle of scholarly and scientific materials.
• Digital curation brings a whole host of challenges
• The range of stakeholders that affect the survival of digital material cuts across the whole life-cycle
• Everyone plays an important role
Digital Archiving
• Digital archiving is a curation activity
• Ensures that:
  • Data is properly selected
  • Data is properly stored
  • Data can be accessed
  • The logical and physical integrity of the data is maintained over time
  • Data is secure and authentic *
* Lord & MacDonald, e-Science Data Curation Report, 2003
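One common way to check that the physical integrity of stored data is maintained over time is a fixity check: record a checksum for each file at ingest and recompute it later. The sketch below is illustrative only and is not taken from the talk; the helper names `build_manifest` and `verify_manifest` and the choice of SHA-256 are assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory: Path) -> dict[str, str]:
    """Record a checksum for every file under `directory` (hypothetical helper)."""
    return {
        str(path.relative_to(directory)): sha256_of(path)
        for path in sorted(directory.rglob("*"))
        if path.is_file()
    }


def verify_manifest(directory: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose content has changed."""
    problems = []
    for relative_path, expected in manifest.items():
        candidate = directory / relative_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            problems.append(relative_path)
    return problems


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp)
        (store / "report.txt").write_text("e-Science data", encoding="utf-8")
        manifest = build_manifest(store)
        print(verify_manifest(store, manifest))  # [] while nothing has changed
        (store / "report.txt").write_text("corrupted", encoding="utf-8")
        print(verify_manifest(store, manifest))  # ['report.txt']
```

A check like this only detects change; repairing a damaged file still depends on having a second, verified copy.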
Digital Preservation
• Digital preservation is an archiving activity
• Ensures that specific items of data are maintained over time so that they can still be accessed and understood through changes in technology *
• Includes content files and associated metadata
• Combats digital obsolescence
• Keeps data authentic despite technological change
• Has technical, organisational, and cultural challenges
* Lord & MacDonald, e-Science Data Curation Report, 2003
What is a Digital Archive?
• Inconsistency in use of the terms digital archive, digital repository, and digital library
• Task Force on Archiving Digital Information 1996: “Defines digital archives strictly in functional terms as repositories of digital information that are collectively responsible for ensuring, through the exercise of various migration strategies, the integrity and long-term accessibility of the nation’s social, economic, cultural and intellectual heritage instantiated in digital form.”
• Provide reliable solutions for life-cycle and long-term management of digital archival materials
• System driver is Preservation, leading to Access
What is a Digital Repository?
• Collections of digital objects: content + metadata
• Cross-domain implementation
• Offer minimum set of basic services – Get, Search, Access control
• Sustainable & trusted; well-supported and managed
  • Policies, processes, services, people
• Overall commitment to stewardship of digital materials
• Enables quick & remote access to digital materials
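The minimum service set named above (Get, Search, Access control) over objects made of content plus metadata can be sketched as a small in-memory model. This is an illustration under stated assumptions, not the interface of DSpace, EPrints, Fedora or any other real system; the class names, the role-based access rule and the `deposit` method are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """A digital object: content plus descriptive metadata."""
    identifier: str
    content: bytes
    metadata: dict[str, str]
    # Roles allowed to read the object (assumed access-control model).
    allowed_roles: set[str] = field(default_factory=lambda: {"public"})


class Repository:
    """Toy repository offering Get, Search and Access control."""

    def __init__(self) -> None:
        self._objects: dict[str, DigitalObject] = {}

    def deposit(self, obj: DigitalObject) -> None:
        self._objects[obj.identifier] = obj

    def _can_read(self, obj: DigitalObject, role: str) -> bool:
        return role in obj.allowed_roles

    def get(self, identifier: str, role: str = "public") -> DigitalObject:
        """Get: return one object, enforcing access control."""
        obj = self._objects[identifier]  # raises KeyError if unknown
        if not self._can_read(obj, role):
            raise PermissionError(f"role {role!r} may not read {identifier!r}")
        return obj

    def search(self, term: str, role: str = "public") -> list[str]:
        """Search: identifiers of readable objects whose metadata mentions `term`."""
        needle = term.lower()
        return sorted(
            obj.identifier
            for obj in self._objects.values()
            if self._can_read(obj, role)
            and any(needle in value.lower() for value in obj.metadata.values())
        )


if __name__ == "__main__":
    repo = Repository()
    repo.deposit(DigitalObject("obj-1", b"...", {"title": "Crystal structure report"}))
    repo.deposit(DigitalObject("obj-2", b"...", {"title": "Draft thesis"}, {"staff"}))
    print(repo.search("thesis"))           # [] because obj-2 is staff-only
    print(repo.search("thesis", "staff"))  # ['obj-2']
```

Search applies the same access rule as Get, so restricted objects do not leak through result lists.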
Main Issues for Digital Archives
• User Requirements
• Transfer & Ingest
• Metadata
• Standards
• Digital preservation strategies
• Linkage
• Audit and Certification
• Legal Issues
• Access restrictions
OAIS
• Open Archival Information System Reference Model
• ISO 14721:2003
• "An archive, consisting of an organisation of people and systems, that has accepted the responsibility to preserve information and make it available for a Designated Community"
• Establishes a common framework of terms and concepts
• Defines an Information Model
• Identifies basic Functions of an OAIS
OAIS Functional Model
• Functional model has six entities:
  • Ingest
  • Archival Storage
  • Data Management
  • Administration
  • Preservation Planning
  • Access
• Described using UML diagrams
OAIS Functional Entities
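[Figure: the OAIS functional entities. The Producer submits SIPs to Ingest; Ingest passes AIPs to Archival Storage and descriptive information to Data Management; Access answers Consumer queries with result sets and fulfils orders by delivering DIPs built from AIPs; Preservation Planning and Administration support these entities, with Management overseeing the archive.]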
OAIS Functional Entities (Figure 4-1)
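The package flow in the figure (SIP in, AIP stored, DIP out) can be sketched in a few lines. This is a deliberately reduced illustration, not an implementation of ISO 14721: the class and method names, the identifier scheme and the SHA-256 checksum are assumptions, and Administration and Preservation Planning are not modelled.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SIP:
    """Submission Information Package, as sent by a Producer."""
    producer: str
    content: bytes
    description: str


@dataclass(frozen=True)
class AIP:
    """Archival Information Package, as held in Archival Storage."""
    identifier: str
    content: bytes
    description: str
    checksum: str


@dataclass(frozen=True)
class DIP:
    """Dissemination Information Package, as delivered to a Consumer."""
    identifier: str
    content: bytes
    description: str


class Archive:
    def __init__(self) -> None:
        self._storage: dict[str, AIP] = {}    # stands in for Archival Storage
        self._catalogue: dict[str, str] = {}  # stands in for Data Management

    def ingest(self, sip: SIP) -> str:
        """Ingest: turn a SIP into an AIP and register its descriptive info."""
        identifier = f"aip-{len(self._storage) + 1}"  # hypothetical ID scheme
        checksum = hashlib.sha256(sip.content).hexdigest()
        self._storage[identifier] = AIP(identifier, sip.content, sip.description, checksum)
        self._catalogue[identifier] = sip.description
        return identifier

    def query(self, term: str) -> list[str]:
        """Access: answer a Consumer query with a result set of identifiers."""
        needle = term.lower()
        return sorted(i for i, text in self._catalogue.items() if needle in text.lower())

    def order(self, identifier: str) -> DIP:
        """Access: fulfil an order by building a DIP from the stored AIP."""
        aip = self._storage[identifier]  # raises KeyError if unknown
        if hashlib.sha256(aip.content).hexdigest() != aip.checksum:
            raise ValueError(f"fixity check failed for {identifier}")
        return DIP(aip.identifier, aip.content, aip.description)


if __name__ == "__main__":
    archive = Archive()
    new_id = archive.ingest(SIP("Chemistry lab", b"raw data", "Crystal structure report"))
    print(archive.query("crystal"))       # ['aip-1']
    print(archive.order(new_id).content)  # b'raw data'
```

Keeping the three package types as separate classes mirrors the point of the model: what a Producer submits, what the archive preserves, and what a Consumer receives need not be the same thing.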
DSpace
• DSpace: “DSpace is a groundbreaking digital repository system that captures, stores, indexes, preserves, and redistributes an organization's research data [...] the DSpace software platform serves a variety of digital archiving needs.”
• Open source software
• Example uses:
  • American Museum of Natural History Research Library
  • Chapel Hill, SILS, Theses & Dissertations
  • University of Cambridge – Academic & related content
  • Edinburgh Research Archive (ERA)
EPrints
• EPrints: “GNU EPrints is generic archive software under development by the University of Southampton. It is intended to create a highly configurable web-based archive.”
• Open source software
• Example uses:
  • Southampton Crystal Structure Report Archive
  • Central Connecticut State University Digital Archive
  • Central European University – Preprint Archive
  • Curtin Institute of Technology Institutional Repository
  • DLIST – Digital Library of Information Science & Technology
Fedora
• Fedora: “Open source software that gives organisations a flexible service-oriented architecture for managing and delivering their digital content.”
• Open source software
• Example uses:
  • Digital Case, Case Western Reserve University's electronic repository and archive: stores, disseminates, and preserves faculty research in digital formats (both born digital and digitised)
  • University of Queensland eSpace – research digital repository with published articles and conference papers, book chapters, theses and other forms of written research
Others
• Other systems such as Digital Commons institutional repository service
• Other, custom-built systems
  • NARA Electronic Records Archives (ERA) project
  • UK National Archives
  • Public Record Office, Victoria
  • KB eDepot, Netherlands
  • Several other large bodies whose archive predates development of aforementioned repository software
• Commercial systems
In conclusion
• There is much in common between digital archives, libraries, and repositories
• Intention and subsequent functionality are the key to defining digital storage systems
• Digital Archives offer a framework for maintaining & preserving the authenticity and integrity of records over time
• Several software solutions are available
  • Development is ongoing
  • Need technical know-how to implement
• There is still a lot of work to do...
Thank you.
Questions?
Maureen Pennock
[email protected]
Join the DCC Associates Network at
http://www.dcc.ac.uk