Introduction to Digital Archives
Maureen Pennock
EAOLUG Spring/Summer Meeting 2006
a centre of expertise in data curation and preservation
EAOLUG :: RSC :: Cambridge, 23 May 2006

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland License. To view a copy of this license, (a) visit http://creativecommons.org/licenses/by-ncsa/2.5/scotland/ ; or (b) send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.

Today’s talk
• The DCC
  • Background & Context
  • What We Do
• Digital Archives & Archiving
  • Definitions
  • Main Issues
  • OAIS
  • Systems

UK Digital Curation Centre
• JISC Circular 6/03 called for bids in digital curation
• JISC and the e-Science Core Programme funding
  • for development, services and outreach in digital curation
  • for a research programme
• Impetus to action
  • Growth in e-Science activity and data creation
  • Recognition that continuing access to digital information is needed

Partners
• University of Edinburgh (lead site)
  • Chris Rusbridge, Prof Peter Buneman
• University of Glasgow - HATII
  • Prof Seamus Ross, Director of HATII and Erpanet
• University of Bath - UKOLN
  • Dr Liz Lyon, Director of UKOLN
• Council for the Central Laboratory of the Research Councils (CCLRC)
  • Dr David Giaretta

Objectives
• Lead a vibrant international research programme to improve quality in data curation and digital preservation
• Deliver effective, efficient and high demand services
  • undertake evaluation of tools, methods, standards and policies
  • work with the community to establish registries of tools and technical information
• Create an active, innovative and collaborative Associates Network
• Connect communities
  • Universities and Research institutions
  • Scientific data and documents
  • International & cross-sector

Research
• Annotation in Databases
• Data archiving
• Socio-economic and legal issues
• Metadata extraction and curation
• Provenance and databases
• Data transformation, integration and publishing
• Security
• Supporting technologies
• Organisational and cultural challenges to digital curation

Development
• DCC Approach to Digital Curation (white paper) sets out the path for development activities:
  • Monitoring international standards
  • Development of a Representation Information Registry/Repository (DCC RIR)
  • Development of recommendations for tools and methods for generating Representation Information
  • Creating testbeds for digital curation tools
  • Creating auditing and certification processes for trusted repositories

Services
• Information Services
  • Community-developed Digital Curation Manual
  • Briefing Papers & FAQs
  • Technology Watch
  • Case Studies
  • Best Practice Checklists
• Advisory Services
• Events: information days, workshops, training, conferences
• Helpdesk
• Audit and Certification Services

Summary
• Support and promote continuing improvement in the quality of data curation and preservation activity
• Nurture strong community relationships between practitioners, researchers, and curators
• Address digital curation from all aspects of the records life-cycle
• Develop and promote curation knowledge, tools and techniques
• Identify and research new organisational, technical, and supporting curation challenges

Digital Curation
• Digital curation is all about maintaining and adding value to a trusted body of digital information for current and future use; specifically, we mean the active management and appraisal of data over the life-cycle of scholarly and scientific materials.
• Digital Curation brings a whole host of challenges
• The range of stakeholders that affect the survival of digital material cuts across the whole life-cycle
• Everyone plays an important role

Digital Archiving
• Digital archiving is a curation activity
• Ensures that
  • Data is properly selected
  • Data is properly stored
  • Data can be accessed
  • The logical and physical integrity of the data is maintained over time
  • Data is secure and authentic *
* Lord & MacDonald, e-Science Data Curation Report, 2003

Digital Preservation
• Digital preservation is an archiving activity
• Ensures that specific items of data are maintained over time so that they can still be accessed and understood through changes in technology *
• Includes content files and associated metadata
• Combats digital obsolescence
• Keeps data authentic despite technological change
• Has technical, organisational, and cultural challenges
* Lord & MacDonald, e-Science Data Curation Report, 2003

What is a Digital Archive?
• Inconsistency in use of the terms digital archive, digital repository, and digital library
• Task Force on Archiving Digital Information 1996: “Defines digital archives strictly in functional terms as repositories of digital information that are collectively responsible for ensuring, through the exercise of various migration strategies, the integrity and long-term accessibility of the nation’s social, economic, cultural and intellectual heritage instantiated in digital form.”
• Provide reliable solutions for life-cycle and long-term management of digital archival materials
• System driver is Preservation, leading to Access

What is a Digital Repository?
• Collections of digital objects: content + metadata
• Cross-domain implementation
• Offer minimum set of basic services: Get, Search, Access control
• Sustainable & trusted; well-supported and managed
• Policies, processes, services, people
• Overall commitment to stewardship of digital materials
• Enables quick & remote access to digital materials

Main Issues for Digital Archives
• User Requirements
• Transfer & Ingest
• Metadata
• Standards
• Digital preservation strategies
• Linkage
• Audit and Certification
• Legal Issues
• Access restrictions

OAIS
• Open Archival Information System Reference Model
• ISO 14721:2003
• "An archive, consisting of an organisation of people and systems, that has accepted the responsibility to preserve information and make it available for a Designated Community"
• Establishes a common framework of terms and concepts
• Defines an Information Model
• Identifies basic Functions of an OAIS

OAIS Functional Model
• Functional model has six entities:
  • Ingest
  • Archival Storage
  • Data Management
  • Administration
  • Preservation Planning
  • Access
• Described using UML diagrams

OAIS Functional Entities
[Figure: OAIS Functional Entities (Figure 4-1). The Producer, the Consumer and Management surround the six entities: Ingest, Archival Storage, Data Management, Administration, Preservation Planning and Access. Labelled flows: SIP, AIP, DIP, descriptive info., queries, result sets, orders.]

DSpace
• DSpace: “DSpace is a groundbreaking digital repository system that captures, stores, indexes, preserves, and redistributes an organization's research data [...] the DSpace software platform serves a variety of digital archiving needs.”
• Open source software
• Example uses:
  • American Museum of Natural History Research Library
  • Chapel Hill, SILS, Theses & Dissertations
  • University of Cambridge: academic & related content
  • Edinburgh Research Archive (ERA)

EPrints
• EPrints: “GNU EPrints is generic archive software under development by the University of Southampton. It is intended to create a highly configurable web-based archive.”
• Open source software
• Example uses:
  • Southampton Crystal Structure Report Archive
  • Central Connecticut State University Digital Archive
  • Central European University Preprint Archive
  • Curtin Institute of Technology Institutional Repository
  • DLIST (Digital Library of Information Science & Technology)

Fedora
• Fedora: “Open source software that gives organisations a flexible service-oriented architecture for managing and delivering their digital content.”
• Open source software
• Example uses:
  • Digital Case, Case Western Reserve University's electronic repository and archive: stores, disseminates, and preserves faculty research in digital formats (both born digital and digitised)
  • University of Queensland eSpace: research digital repository with published articles and conference papers, book chapters, theses and other forms of written research

Others
• Other systems such as Digital Commons institutional repository service
• Other, custom-built systems
  • NARA Electronic Records Archives (ERA) project
  • UK National Archives
  • Public Record Office, Victoria
  • KB eDepot, Netherlands
  • Several other large bodies whose archive predates development of aforementioned repository software
• Commercial systems

In conclusion
• There is much in common between digital archives, libraries, and repositories
• Intention and subsequent functionality are the key to defining digital storage systems
• Digital Archives offer a framework for maintaining & preserving the authenticity and integrity of records over time
• Several software solutions are available
  • Development is ongoing
  • Need technical know-how to implement
• There is still a lot of work to do...

Thank you. Questions?
Maureen Pennock [email protected]
Join the DCC Associates Network at http://www.dcc.ac.uk
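
The OAIS package flow named in the functional model (a Producer submits a SIP, Ingest turns it into an AIP for Archival Storage, and Access delivers a DIP to a Consumer) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class and method names (`ToyArchive`, `ingest`, `search`, `verify`, `access`) are hypothetical and come neither from ISO 14721 nor from DSpace, EPrints or Fedora, and the SHA-256 checksum is only one possible way to check that the integrity of stored data has been maintained.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SIP:
    """Submission Information Package handed over by a Producer."""
    identifier: str
    content: bytes
    description: str


@dataclass
class AIP:
    """Archival Information Package held in Archival Storage."""
    identifier: str
    content: bytes
    sha256: str  # fixity value recorded at ingest (illustrative choice)


@dataclass
class DIP:
    """Dissemination Information Package delivered to a Consumer."""
    identifier: str
    content: bytes
    description: str


@dataclass
class ToyArchive:
    """Hypothetical in-memory stand-in for three OAIS functional entities."""
    storage: dict = field(default_factory=dict)       # Archival Storage: id -> AIP
    descriptions: dict = field(default_factory=dict)  # Data Management: id -> descriptive info

    def ingest(self, sip: SIP) -> AIP:
        """Ingest: build an AIP from a SIP and record its descriptive info."""
        digest = hashlib.sha256(sip.content).hexdigest()
        aip = AIP(sip.identifier, sip.content, digest)
        self.storage[aip.identifier] = aip
        self.descriptions[aip.identifier] = sip.description
        return aip

    def search(self, term: str) -> list:
        """Data Management: answer a query with matching identifiers."""
        term = term.lower()
        return sorted(i for i, d in self.descriptions.items() if term in d.lower())

    def verify(self, identifier: str) -> bool:
        """Recompute the checksum and compare it with the value stored at ingest."""
        aip = self.storage[identifier]
        return hashlib.sha256(aip.content).hexdigest() == aip.sha256

    def access(self, identifier: str) -> DIP:
        """Access: deliver a DIP, refusing content that fails the fixity check."""
        if not self.verify(identifier):
            raise ValueError(f"fixity check failed for {identifier}")
        aip = self.storage[identifier]
        return DIP(aip.identifier, aip.content, self.descriptions[identifier])


# Example run: one record goes from Producer to Consumer.
archive = ToyArchive()
archive.ingest(SIP("rec-001", b"minutes of meeting", "Committee minutes, May 2006"))
hits = archive.search("committee")
package = archive.access(hits[0])
```

Administration and Preservation Planning are left out of the sketch; a real archive would also carry representation and preservation metadata inside each AIP rather than a single checksum.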