Interpretation of the OAIS Model
Derek Sergeant
http://www.leeds.ac.uk/camileon/
Overview of the OAIS Model
In order to become familiar with the
OAIS Reference Model
 When Cedars staff first encountered the
model it took them several months to
start grasping it
 Re-iterate some of the things already
said

Overview of the OAIS Model
Specific vocabulary for Digital
Preservation practitioners
 Specific advice on how to sub-divide a
complex task
 Provides logic and structure to allow the
digital holdings to be visualised and
processed

Overview of the OAIS Model
Much of the OAIS reference model does
not need to be understood by the
majority of people working in digital
preservation
 Some detail is only necessary to
implement a solution (the low-level
understanding)

Key concepts of the OAIS Model
[Diagram: Producer, OAIS, Consumer, Management]
Key concepts of the OAIS Model
The Producer creates and delivers the
digital objects which go into the OAIS
 The Consumer asks for and receives
digital objects from the OAIS
 The Management deals with high level
OAIS policy and monitors the OAIS

Key concepts of the OAIS Model

The OAIS receives the digital objects
from the producer, archives them, and
supplies them to the consumer.
Key concepts of the OAIS Model
[Diagram: Producer, OAIS, Consumer and Management, with SIPs, AIPs and DIPs]
Key concepts of the OAIS Model
There are three basic types of
Information Package
 The Producer and the OAIS
communicate with Submission IPs
 The OAIS and the Consumer
communicate with Dissemination IPs
 The OAIS preserves Archive IPs

Key concepts of the OAIS Model
[Diagram: SIPs, AIPs (Content Information and PDI), DIPs]
Key concepts of the OAIS Model
Archival Information Packages contain
both Content Information and
Preservation Description Information
 Content Information is the digital object
that you need to preserve
 PDI is description and information to
explain what the Content actually is

Key concepts of the OAIS Model
[Diagram: AIP = Content Information (Content Data Object and RI) + PDI]
Key concepts of the OAIS Model

The Content Information part of an AIP
contains (very tightly coupled) the actual
data object and the Representation
Information that makes the object
meaningful
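
The AIP structure described on these slides can be sketched as a small data model. This is only an illustration: the class names, field names and the `is_complete` helper are assumptions made for the example and are not defined by the OAIS Reference Model or by Cedars.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentInformation:
    """The data object, tightly coupled with its Representation Information."""
    data_object: bytes               # the Content Data Object (raw bytes)
    representation_information: str  # what makes the bytes meaningful


@dataclass(frozen=True)
class ArchivalInformationPackage:
    """An AIP: Content Information plus Preservation Description Information."""
    content: ContentInformation
    pdi: dict                        # description of what the Content actually is


def is_complete(aip: ArchivalInformationPackage) -> bool:
    """Return True only if every AIP component is present and non-empty."""
    return bool(
        aip.content.data_object
        and aip.content.representation_information
        and aip.pdi
    )


# Hypothetical holdings: an e-thesis with all components, and a
# CD-ROM supplement for which only the data object exists so far.
thesis = ArchivalInformationPackage(
    content=ContentInformation(
        data_object=b"%PDF-1.4 ...",
        representation_information="PDF 1.4 document",
    ),
    pdi={"title": "An example e-thesis", "origin": "Postgrad Computing"},
)
cd_supplement = ArchivalInformationPackage(
    content=ContentInformation(data_object=b"\x00\x01", representation_information=""),
    pdi={},
)

print(is_complete(thesis), is_complete(cd_supplement))
```

A check of this kind makes it visible which holdings still lack Representation Information or PDI.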
Key concepts of the OAIS Model
Content Data Object + RI = Intellectual Content (genuine information)
Key concepts of the OAIS Model
Long Term
 (The Representation Information needs
to keep the Content Data
understandable in the Long Term)
 The knowledge base of the designated
community (and the archive) needs to
be monitored in the Long Term

Key concepts of the OAIS Model
[Diagram: the functional entities Ingest, Archival Storage, Data Management, Access, Administration and Preservation Planning, shown with the Producer, Consumer and Management]
Key concepts of the OAIS Model
Ingest gets digital objects from the
Producer into the OAIS
 Access passes digital objects to the
Consumer
 Data Management keeps track of the
OAIS holdings
 Archival Storage preserves AIPs in the
Long Term

The Scenario
The Library that I work for has realised
that over the past five years we have been
receiving an increasing number of
digital items
 At the last University Senate meeting
the Pro-Vice Chancellor for Information
Technology declared that we would
keep these and make them available

The Scenario
In order to do this it was realised that
we need to develop a computer system
capable of storing these electronic
objects in a convenient form (to us)
 Making them available should be just a
case of duplicating the storage copy
and allowing a library user to download
the object

The Scenario

At the moment the digital objects that
we have consist of
• CD-ROM supplements that arrive with a
conventional book
• Electronic theses from Postgrad Computing
• e-journal subscriptions
The Scenario
Upon investigation, we found a
Reference Model that describes exactly
what we need to do in order to preserve
and make available all of our digital
objects
 The OAIS Reference Model

Interpreting the OAIS Model
Given that we have established a need
to preserve the digital objects from our
library, and that we shall be archiving
them ourselves - in a newly formed
library centre for preservation of
electronic holdings
 We revisit the basic OAIS diagram

Basic OAIS Relationships
[Diagram: Producer, OAIS, Consumer, Management]
Interpreting the OAIS Model
Identifying the Producers:
 due to the number of types and sources
of digital objects there are many

• e-journal publishers
• CD-ROM book supplement publishers
• Other Departments (e-thesis)

Are there emerging trends, with new
Producers in the future?
Interpreting the OAIS Model
Identifying the Consumers:
 We inherit the same Consumers as the
library

• University students
• University staff/researchers

Are there going to be new Consumer
groups in the future?
Interpreting the OAIS Model
Identifying the Management:
 Looking at the OAIS Model, we
determine the roles of Management:

• Long term equipment planning
• Review of OAIS performance
• Ratify pricing policy
• Relationship development
– between Producer, OAIS and Consumer
• Promote OAIS uptake
– (within spheres of funding)
Interpreting the OAIS Model
Some of the roles of Management are
very close to the current roles of the
library management
 Nobody currently performs the
other roles
 We will form a new Management group
with some existing library management
and other senior university strategy
managers

Interpreting the OAIS Model
Identify the OAIS:
 Since we are intending to preserve our
digital objects ourselves, we provide the
role of the OAIS
 This covers both the Archival Storage
and the Administration

Interpreting the OAIS Model

Identify the archive holdings:
• Both present holdings and future holdings

Present:
• e-thesis
• CD-ROM book supplements
• (2 e-journal subscriptions)

Future:
• more internal publications
• more e-journals
Structural Components of an AIP
[Diagram: AIP = Content Information (Content Data Object and Representation Information) + Preservation Description Information]
Interpreting the OAIS Model
We do not have all of the components
that are needed for an AIP
 In the beginning, we have the Content
Data Object for everything
 For our e-thesis objects we also have a
small amount of PDI

Lesson from the Cedars project
Determine the Significant Properties for
the digital objects
 This should be done as early as
possible
 Significant Properties are those
attributes of an object that constitute the
complete (for the intended Consumer)
intellectual content of that object

Lesson from the Cedars project
E.g. Significant Properties for an e-thesis
 The complete text, including divisions
into chapters and sections
 The layout and style - particular fonts
and spacing are essential
 Diagrams
 (perhaps web adverts are not
Significant for our e-journals)

Interpreting the OAIS Model
We have now established who we are
working with
 We have also established what data
objects there are
 We have moved into OAIS vocabulary
 Examples of old vocabulary

• Publishers, Readers
• Electronic records
Functional Entities Diagram
[Diagram: the functional entities Ingest, Archival Storage, Data Management, Access, Administration and Preservation Planning, shown with the Producer, Consumer and Management]
Interpreting the OAIS Model
Ingest
 Establish agreements with Producers

• Record assumptions about Producer and
our (the OAIS) knowledge base
Take the digital data (SIPs)
 Process the SIPs into AIPs

• Record any current software dependencies
needed to use the Content Data Object
Interpreting the OAIS Model
Archival Storage
 Put the AIPs into Archival Storage from
Ingest

• Update the Data Management database to
keep track of the OAIS holdings

NB: The Archival Storage system that
we procure will be capable of storing
and retrieving an AIP without loss
• Storage, maintenance, retrieval of AIPs
Interpreting the OAIS Model
Data Management
 As well as keeping track of the AIPs
currently in Archival Storage this entity
produces Discovery Information
 This can be passed to the Consumer
to allow them to choose suitable AIPs
for viewing

Interpreting the OAIS Model
Access
 This provides support for the
Consumers
 It delivers DIPs (in an appropriate form
for the particular Consumer)
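
The flow through Ingest, Archival Storage, Data Management and Access described on these slides can be sketched as a toy in-memory archive. Everything here is an assumption made for illustration: the `MiniOAIS` class, its method names, the dictionary layout of a SIP and the `aip-N` identifiers are hypothetical and do not come from the OAIS model itself.

```python
class MiniOAIS:
    """Toy in-memory archive; all names here are illustrative."""

    def __init__(self):
        self.archival_storage = {}   # AIP id -> AIP (a plain dict)
        self.data_management = {}    # AIP id -> Discovery Information

    def ingest(self, sip):
        """Turn a SIP into an AIP, store it, and update Data Management."""
        aip_id = f"aip-{len(self.archival_storage) + 1}"
        aip = {
            "content": sip["data"],
            "representation_information": sip["format"],
            "pdi": {"producer": sip["producer"], "title": sip["title"]},
        }
        self.archival_storage[aip_id] = aip
        self.data_management[aip_id] = {
            "title": sip["title"],
            "format": sip["format"],
        }
        return aip_id

    def discover(self, keyword):
        """Return ids of AIPs whose title contains the keyword."""
        return [
            aip_id
            for aip_id, info in self.data_management.items()
            if keyword.lower() in info["title"].lower()
        ]

    def access(self, aip_id):
        """Build a DIP for a Consumer from the stored AIP."""
        aip = self.archival_storage[aip_id]
        return {"title": aip["pdi"]["title"], "data": aip["content"]}


archive = MiniOAIS()
first_id = archive.ingest(
    {"producer": "Other Departments", "title": "E-thesis on compilers",
     "format": "PDF", "data": b"thesis bytes"}
)
archive.ingest(
    {"producer": "e-journal publisher", "title": "Journal issue 1",
     "format": "HTML", "data": b"<html></html>"}
)
dip = archive.access(first_id)
print(archive.discover("thesis"), dip["title"])
```

In this sketch the Consumer never touches Archival Storage directly: discovery goes through Data Management and delivery goes through Access.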

Interpreting the OAIS Model
Administration
 Overall operational control of the OAIS
 Records and makes submission
agreements (with Producers)
 Records and implements archiving
standards and policies

Interpreting the OAIS Model
Preservation Planning
 Monitors the environment of the OAIS
 Ensures that AIPs remain accessible

• I.e. remain understandable to current
Consumers

Develops templates for SIPs and DIPs
and other assistance for working with
Producers and Consumers
Responsibilities of an OAIS
Negotiate and accept information from
Producers
 Determine which community should
become the Designated Community
 Ensure that Information Packages are
independently understandable
 Ensure IPs are preserved
 Make preserved IPs available

Organisational views
Establishing your Designated
Community
 The people you serve by
preserving information for them
 Determining the knowledge base of the
Designated Community and monitoring
changes to this knowledge base

Organisational views
The Perspective of Preservation
 Long Term
 To do a preservation job which takes
into account

• Changing technology
• Changing user community
Organisational views
Deciding whether Digital Objects need
to be transformed (migrated)
 If they do, ensuring that nothing
significant to future Consumers is lost
 Are there alternatives to transforming?

• Source code for original software
• Emulation
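
One way to check that a transformation loses nothing significant is to compare the Significant Properties before and after migration. The sketch below assumes that properties have already been extracted into plain dictionaries; the property names and the `lost_properties` helper are hypothetical.

```python
def lost_properties(before, after, significant):
    """Return the significant properties whose values changed or vanished."""
    return sorted(
        name for name in significant
        if before.get(name) != after.get(name)
    )


# Properties judged significant for an e-thesis (illustrative names).
significant = {"text", "chapter_count", "fonts", "diagram_count"}

original = {
    "text": "Chapter 1 ...",
    "chapter_count": 6,
    "fonts": ("Times",),
    "diagram_count": 12,
    "adverts": 3,          # not significant, so ignored by the check
}
migrated = {
    "text": "Chapter 1 ...",
    "chapter_count": 6,
    "fonts": ("Helvetica",),   # the migration substituted the font
    "diagram_count": 12,
}

lost = lost_properties(original, migrated, significant)
print(lost)
```

An empty result would suggest the migration is acceptable for the intended Consumers; here the font change would need a decision.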
Organisational views
Archive Interoperability
 The drivers for interoperability come
from:

• The Consumers
• The Producers
• The Management
Organisational views
Four basic models for interoperating in
the OAIS Reference Model
 Independent - no interoperating
 Co-operating - common producers,
common dissemination standards
 Federated - the most interoperating
 Shared Resource - reduce costs by
sharing equipment

Organisational views
Federated archives
 Central site?
 Distributed Finding Aids
 Distributed Access Aids
 Issues:

• Unique AIP Names - hierarchical
naming scheme
• Duplicate AIPs

Management - level of autonomy
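
The two naming issues above can be illustrated with a short sketch. It assumes a hypothetical site/collection/serial naming scheme and uses SHA-256 checksums to spot duplicate AIPs held under different names; neither choice is prescribed by the OAIS model or taken from Cedars.

```python
import hashlib


def aip_name(site, collection, serial):
    """Build a hierarchical AIP name that is unique across the federation."""
    return f"{site}/{collection}/{serial:06d}"


def find_duplicates(holdings):
    """holdings maps AIP name -> bytes; return groups of names with identical bytes."""
    by_digest = {}
    for name, payload in holdings.items():
        digest = hashlib.sha256(payload).hexdigest()
        by_digest.setdefault(digest, []).append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


holdings = {
    aip_name("site-a", "theses", 1): b"same thesis bytes",
    aip_name("site-b", "theses", 7): b"same thesis bytes",
    aip_name("site-c", "journals", 1): b"a different object",
}
duplicates = find_duplicates(holdings)
print(duplicates)
```

Prefixing each name with the site keeps names unique without central coordination, while the checksum comparison finds the same object archived at more than one site.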
Summary and Questions
Federated archives: Cedars
[Diagram: Site A, Site B, Site C]
How Can a Digital Resource be
prepared for good/lasting
preservation?

Give it a unique name

Metadata

Significant Properties
OAIS Representation Information
[Diagram (OAIS fig 4-10): Representation Information comprises Structure Information and Semantic Information, which adds meaning]
Cedars Representation Net
[Diagram labels: AIP (PDI, Representation Information, Primary Digital Object); four RAE nodes; UAF; Transformer; Format Description; Software; Input format; Output format; Platform]
Gödel’s Theorem
• Some representations (e.g. plain ASCII
text, MS-WORD, HTML) are defined
outside the system
• All references to such a format are via the
same CRID
• The ends of representation nets must be
managed, to look out for obsolescence
• Replace the CRID destination with a converter
facility
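
Managing the ends of a representation net can be sketched as a graph walk. The sketch assumes the net is available as a mapping from each CRID to the CRIDs it refers to; the CRID strings, the `net_ends` function and the `at_risk` watch list are all hypothetical and do not reflect the actual Cedars data structures.

```python
def net_ends(net, start):
    """Walk the net from `start` and return the CRIDs with no onward links."""
    seen, stack, ends = set(), [start], set()
    while stack:
        crid = stack.pop()
        if crid in seen:
            continue                 # already visited via another path
        seen.add(crid)
        links = net.get(crid, [])
        if not links:
            ends.add(crid)           # defined outside the system: an end
        stack.extend(links)
    return ends


# Illustrative net: each CRID maps to the CRIDs it depends on.
net = {
    "crid:thesis-42": ["crid:format/word-doc"],
    "crid:format/word-doc": ["crid:software/converter", "crid:format/ascii"],
    "crid:software/converter": ["crid:platform/pc"],
    "crid:platform/pc": [],
    "crid:format/ascii": [],
}

ends = net_ends(net, "crid:thesis-42")
at_risk = {"crid:platform/pc"}       # hypothetical obsolescence watch list
needs_attention = sorted(ends & at_risk)
print(sorted(ends), needs_attention)
```

Because every reference to a format goes through the same CRID, redirecting one end to a converter facility would update every object that depends on it.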
Evolution of the Representation Net
[Diagram labels: AIP (PDI, Representation Information, Primary Digital Object); four RAE nodes; UAF; Transformer; Format Description; Software; Input format; Output format; Platform]
Evolution of the Representation Net
[Diagram labels: AIP (PDI, Representation Information, Primary Digital Object); five RAE nodes; UAF; Transformer; Format Description; Software; Input format; Output format; two Platform nodes]
Evolution of the Representation Net
[Diagram labels: AIP (PDI, Representation Information, Primary Digital Object); six RAE nodes; UAF; Transformer; Format Description; Software; Input format; Output format; two Platform nodes]
Obsolete data formats
Keep the original byte-streams
 Representation info leads to software
capable of rendering the information
 Archive management must look out for
dependence on rendering software
that is about to become obsolete.

• Can use software preservation
techniques to preserve rendering
software
Emulation of Yesteryear
Today’s desktop machine far exceeds
the mainframe of the 1970s or even 80s
 George3 (1970s UK system)

• Emulate the George3 executive
– i.e. order code + system calls

Constructing RI for obsolete materials
proves a valuable test-bed for the model
Vital concepts
CRIDs - give everything a unique name
 A byte-stream can be stored for ever

• Complex data streams must be mapped
into byte-streams, and mapped back again
for use

Representation Information preserves
access to intellectual content
• makes emulation possible

Gödel Ends are monitored for
obsolescence
The Archival Information Package
[Diagram labels: Preservation Description Information; XML; Representation Information; Primary Digital Object; Property list; Packed into bytestream]
Packed together into one AIP bytestream using ASN.1
• Links to Representation Network
• Links for other purposes
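
Packing the parts of an AIP into a single bytestream, and mapping them back again for use, can be sketched as follows. Note the substitution: the slide names ASN.1, but this sketch uses a simple length-prefixed layout built with Python's standard `struct` module as a stand-in, and the function names and sample values are hypothetical.

```python
import struct


def pack_aip(pdi, representation_information, primary_digital_object):
    """Join the three parts into one bytestream, each with a 4-byte length prefix."""
    parts = [pdi, representation_information, primary_digital_object]
    return b"".join(struct.pack(">I", len(p)) + p for p in parts)


def unpack_aip(stream):
    """Reverse pack_aip: read each length prefix, then that many bytes."""
    parts, offset = [], 0
    while offset < len(stream):
        (length,) = struct.unpack_from(">I", stream, offset)
        offset += 4
        parts.append(stream[offset:offset + length])
        offset += length
    return tuple(parts)


pdi = b"<pdi><title>Example thesis</title></pdi>"   # PDI as XML (illustrative)
ri = b"crid:format/pdf"                             # link into the representation net
pdo = b"%PDF-1.4 example bytes"                     # Primary Digital Object

stream = pack_aip(pdi, ri, pdo)
print(unpack_aip(stream) == (pdi, ri, pdo))
```

The round trip is the property that matters: whatever encoding is chosen, unpacking must return exactly the bytes that were packed.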
Choices at Creation of AIP
Geared towards easy/low maintenance
 Identify which parts of PDI are fixed/static
 Use current best archival method to map
the digital resource into a bytestream
(PDO then remains static)
 For common (esp. changing) metadata
use indirection

Rendering Instructions
 Format Descriptions


Representation Information

Technical Metadata

Evolving Technology

Representation Networks
Controversy Three

A Digital Message can be Preserved
Indefinitely

This is media-less

The Preserved resource hops media
long before temporal effects lose it

Digitisation and Access have a place