Interpretation of the OAIS Model
Derek Sergeant
http://www.leeds.ac.uk/camileon/
Overview of the OAIS Model
Aim: to become familiar with the
OAIS Reference Model
When Cedars staff first encountered the
model it took them several months to
start grasping it
Re-iterate some of the things already
said
Overview of the OAIS Model
Specific vocabulary for Digital
Preservation practitioners
Specific advice on how to sub-divide a
complex task
Provides logic and structure to allow the
digital holdings to be visualised and
processed
Overview of the OAIS Model
Much of the OAIS reference model does
not need to be understood by the
majority of people working in digital
preservation
Some detail is only necessary to
implement a solution (the low-level
understanding)
Key concepts of the OAIS Model
[Diagram: the OAIS and the three parties around it: Producer, Management and Consumer]
Key concepts of the OAIS Model
The Producer creates and delivers the
digital objects which go into the OAIS
The Consumer asks for and receives
digital objects from the OAIS
The Management deals with high level
OAIS policy and monitors the OAIS
Key concepts of the OAIS Model
The OAIS receives the digital objects
from the producer, archives them, and
supplies them to the consumer.
Key concepts of the OAIS Model
[Diagram: the Producer sends SIPs to the OAIS; the OAIS holds AIPs and sends DIPs to the Consumer; Management oversees the OAIS]
Key concepts of the OAIS Model
There are three basic types of
Information Package
The Producer and the OAIS
communicate with Submission IPs
The OAIS and the Consumer
communicate with Dissemination IPs
The OAIS preserves Archive IPs
Key concepts of the OAIS Model
[Diagram: SIPs, AIPs and DIPs; an AIP holds Content Information and PDI]
Key concepts of the OAIS Model
Archival Information Packages contain
both Content Information and
Preservation Description Information
Content Information is the digital object
that you need to preserve
PDI is description and information to
explain what the Content actually is
Key concepts of the OAIS Model
[Diagram: an AIP holds Content Information (Content Data Object plus RI) and PDI]
Key concepts of the OAIS Model
The Content Information part of an AIP
contains (very tightly coupled) the actual
data object and the Representation
Information that makes the object
meaningful
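The nesting described on these slides can be sketched as a small data structure. This is only an illustration, not part of the OAIS standard or the Cedars implementation: the class names, the field names and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentInformation:
    """Data object and its Representation Information, kept together."""
    data_object: bytes          # the Content Data Object as a raw byte-stream
    representation_info: str    # what is needed to make the bytes meaningful


@dataclass(frozen=True)
class ArchivalInformationPackage:
    """An AIP: Content Information plus Preservation Description Information."""
    content: ContentInformation
    pdi: dict                   # PDI entries; the keys used below are illustrative


# Hypothetical e-thesis, with made-up values.
thesis = ArchivalInformationPackage(
    content=ContentInformation(
        data_object=b"%PDF-1.3 ...",
        representation_info="PDF 1.3 document",
    ),
    pdi={"title": "Example e-thesis", "provenance": "Postgrad Computing"},
)
```

Keeping the data object and its Representation Information in one object mirrors the "very tightly coupled" point above: neither can be handed on without the other.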
Key concepts of the OAIS Model
Content Data Object + RI = Intellectual Content (genuine information)
Key concepts of the OAIS Model
Long Term
(The Representation Information needs
to keep the Content Data
understandable in the Long Term)
The knowledge base of the designated
community (and the archive) needs to
be monitored in the Long Term
Key concepts of the OAIS Model
[Diagram: the six functional entities (Ingest, Archival Storage, Data Management, Administration, Preservation Planning, Access) sit between the Producer, the Consumer and Management]
Key concepts of the OAIS Model
Ingest gets digital objects from the
Producer into the OAIS
Access passes digital objects to the
Consumer
Data Management keeps track of the
OAIS holdings
Archival Storage preserves AIPs in the
Long Term
The Scenario
The Library that I work for has realised
that over the past five years we have
been receiving an increasing number of
items that are digital
At the last University Senate meeting
the Pro-Vice Chancellor for Information
Technology declared that we would
keep these and make them available
The Scenario
In order to do this it was realised that
we need to develop a computer system
capable of storing these electronic
objects in a convenient form (to us)
Making them available should be just a
case of duplicating the storage copy
and allowing a library user to download
the object
The Scenario
At the moment the digital objects that
we have consist of
• CD-ROM supplements that arrive with a
conventional book
• Electronic theses from Postgrad Computing
• e-journal subscriptions
The Scenario
Upon investigation, we found a
Reference Model that describes exactly
what we need to do in order to preserve
and make available all of our digital
objects
The OAIS Reference Model
Interpreting the OAIS Model
Given that we have established a need
to preserve the digital objects from our
library, and that we shall be archiving
them ourselves - in a newly formed
library centre for preservation of
electronic holdings,
we revisit the basic OAIS diagram
Basic OAIS Relationships
[Diagram: the OAIS and the three parties around it: Producer, Management and Consumer]
Interpreting the OAIS Model
Identifying the Producers:
Because of the number of types and sources
of digital objects, there are many:
• e-journal publishers
• CD-ROM book supplement publishers
• Other Departments (e-thesis)
Are there emerging trends - new
Producers in the future?
Interpreting the OAIS Model
Identifying the Consumers:
We inherit the same Consumers as the
library
• University students
• University staff/researchers
Are there going to be new Consumer
groups in the future?
Interpreting the OAIS Model
Identifying the Management:
Looking at the OAIS Model, we
determine the roles of Management:
• Long term equipment planning
• Review of OAIS performance
• Ratify pricing policy
• Relationship development
– Producer OAIS Consumer
• Promote OAIS uptake
– (within spheres of funding)
Interpreting the OAIS Model
Some of the roles of Management are
very close to the current roles of the
library management
Nobody currently performs the
other roles
We will form a new Management group
with some existing library management
and other senior university strategy
managers
Interpreting the OAIS Model
Identify the OAIS:
Since we are intending to preserve our
digital objects ourselves, we provide the
role of the OAIS
Both the Archival store and the
administration
Interpreting the OAIS Model
Identify the archive holdings:
• Both present holdings and future holdings
Present:
• e-thesis
• CD-ROM book supplements
• (2 e-journal subscriptions)
Future:
• more internal publications
• more e-journals
Structural Components of an AIP
[Diagram: an AIP consists of Content Information (Content Data Object plus Representation Information) and Preservation Description Information]
Interpreting the OAIS Model
We do not have all of the components
that are needed for an AIP
In the beginning, we have the Content
Data Object for everything
For our e-thesis objects we also have a
small amount of PDI
Lesson from the Cedars project
Determine the Significant Properties for
the digital objects
This should be done as early as
possible
Significant Properties are those
attributes of an object that constitute the
complete (for the intended Consumer)
intellectual content of that object
Lesson from the Cedars project
E.g. Significant Properties for an e-thesis
The complete text, including divisions
into chapters and sections
The layout and style - particular fonts
and spacing are essential
Diagrams
(perhaps web adverts are not
Significant for our e-journals)
Interpreting the OAIS Model
We have now established who we are
working with
We have also established what data
objects there are
We have moved into OAIS vocabulary
Examples of old vocabulary
• Publishers, Readers
• Electronic records
Functional Entities Diagram
[Diagram: the six functional entities (Ingest, Archival Storage, Data Management, Administration, Preservation Planning, Access) sit between the Producer, the Consumer and Management]
Interpreting the OAIS Model
Ingest
Establish agreements with Producers
• Record assumptions about Producer and
our (the OAIS) knowledge base
Take the digital data (SIPs)
Process the SIPs into AIPs
• Record any current software dependencies
to use the Content Data Object
Interpreting the OAIS Model
Archival Storage
Put the AIPs into Archival Storage from
Ingest
• Update the Data Management database to
keep track of the OAIS holdings
NB: The Archival Storage system that
we procure will be capable of storing
and retrieving an AIP without loss
• Storage, maintenance, retrieval of AIPs
Interpreting the OAIS Model
Data Management
As well as keeping track of the AIPs
currently in Archival Storage this entity
produces Discovery Information
This can be passed to the Consumer
to allow them to choose suitable AIPs
for viewing
Interpreting the OAIS Model
Access
This provides support for the
Consumers
It delivers DIPs (in an appropriate form
for the particular Consumer)
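The path from Ingest through Archival Storage and Data Management to Access can be sketched in a few lines. This is a toy model under stated assumptions, not a real archive: `MiniArchive`, the `aip-0001` identifier scheme and the field names are invented for illustration, and a real OAIS would add fixity checks, agreements and policy.

```python
from dataclasses import dataclass


@dataclass
class SIP:
    """What a Producer submits (hypothetical shape)."""
    title: str
    payload: bytes
    software_dependencies: list


@dataclass
class AIP:
    aip_id: str
    content: bytes
    pdi: dict


class MiniArchive:
    def __init__(self):
        self._storage = {}    # Archival Storage: aip_id -> AIP
        self._catalogue = {}  # Data Management: aip_id -> Discovery Information

    def ingest(self, sip):
        """Ingest: turn a SIP into an AIP, recording software dependencies."""
        aip_id = f"aip-{len(self._storage) + 1:04d}"
        pdi = {"software_dependencies": list(sip.software_dependencies)}
        self._storage[aip_id] = AIP(aip_id, sip.payload, pdi)
        self._catalogue[aip_id] = {"title": sip.title}
        return aip_id

    def discover(self, term):
        """Data Management: find AIP ids whose title contains the term."""
        return [aip_id for aip_id, info in self._catalogue.items()
                if term.lower() in info["title"].lower()]

    def access(self, aip_id):
        """Access: deliver a DIP, here simply a copy of the stored content."""
        return bytes(self._storage[aip_id].content)


archive = MiniArchive()
thesis_id = archive.ingest(SIP("E-thesis on compilers", b"abc", ["PDF reader"]))
```

A Consumer would first call `archive.discover("thesis")` to choose an AIP and then `archive.access(thesis_id)` to receive the DIP.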
Interpreting the OAIS Model
Administration
Overall operational control of the OAIS
Records and makes submission
agreements (with Producers)
Records and implements archiving
standards and policies
Interpreting the OAIS Model
Preservation Planning
Monitors the environment of the OAIS
Ensures that AIPs remain accessible
• I.e. remain understandable to current
Consumers
Develops templates for SIPs and DIPs
and other assistance for working with
Producers and Consumers
Responsibilities of an OAIS
Negotiate and accept information from
Producers
Determine which community should
become the Designated Community
Ensure that Information Packages are
independently understandable
Ensure IPs are preserved
Make preserved IPs available
Organisational views
Establishing your Designated
Community
The people who you service by
preserving information for them
Determining the knowledge base of the
Designated Community and monitoring
changes to this knowledge base
Organisational views
The Perspective of Preservation
Long Term
To do a preservation job which takes
into account
• Changing technology
• Changing user community
Organisational views
Deciding whether Digital Objects need
to be transformed (migrated)
If they do, ensuring that nothing
significant to future Consumers is lost
Are there alternatives to transforming
• Source code for original software
• Emulation
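One way to check that a transformation has lost nothing significant is to compare the Significant Properties of the object before and after migration. The sketch below assumes each object can be summarised as a dictionary of properties; the property names and values are made up for the example.

```python
def lost_properties(before, after, significant):
    """Return the significant properties whose value changed or vanished."""
    return sorted(name for name in significant
                  if before.get(name) != after.get(name))


# Hypothetical e-thesis, described before and after a format migration.
before = {"text": "full text", "chapters": 5, "font": "Times", "adverts": True}
after = {"text": "full text", "chapters": 5, "font": "Arial", "adverts": False}

# Adverts are deliberately left out: they are not significant here.
significant = {"text", "chapters", "font"}

problems = lost_properties(before, after, significant)
```

Here `problems` lists only `font`: the change in adverts is ignored because adverts were never declared significant.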
Organisational views
Archive Interoperability
The drivers for interoperability come
from:
• The Consumers
• The Producers
• The Management
Organisational views
Four basic models for interoperating in
the OAIS Reference Model
Independent - no interoperating
Co-operating - common producers,
common dissemination standards
Federated - the most interoperating
Shared Resource - reduce costs by
sharing equipment
Organisational views
Federated archives
Central site?
Distributed Finding Aids
Distributed Access Aids
Issues:
• Unique AIP Names - hierarchical
namescheme
• Duplicate AIPs
Management - level of autonomy
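The two issues listed for federated archives can be illustrated briefly. The sketch assumes a hierarchical name of the form site/collection/local-id and detects duplicate AIPs by comparing SHA-256 digests of their content; both choices are assumptions for illustration, not the scheme used by Cedars.

```python
import hashlib


def make_aip_name(site, collection, local_id):
    """Build a hierarchical name that is unique across federated sites."""
    return "/".join([site, collection, local_id])


def find_duplicates(holdings):
    """Group AIP names that hold byte-identical content.

    holdings maps an AIP name to its content byte-stream.
    """
    by_digest = {}
    for name, content in holdings.items():
        digest = hashlib.sha256(content).hexdigest()
        by_digest.setdefault(digest, []).append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Hypothetical holdings at three sites.
holdings = {
    make_aip_name("site-a", "theses", "0001"): b"same thesis",
    make_aip_name("site-b", "theses", "0042"): b"same thesis",
    make_aip_name("site-c", "cdrom", "0007"): b"something else",
}
duplicates = find_duplicates(holdings)
```

Prefixing each name with the site keeps local numbering autonomous, which ties in with the question of how much autonomy each site's Management retains.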
Summary and Questions
Federated archives: Cedars
[Diagram: Site A, Site B, Site C]
How Can a Digital Resource be
prepared for good/lasting
preservation?
Give it a unique name
Metadata
Significant Properties
OAIS Representation Information
[Diagram (OAIS fig 4-10): Representation Information, Structure Information, Semantic Information (adds meaning)]
Cedars Representation Net
[Diagram labels: AIP, PDI, Representation Information, Primary Digital Object, RAE and UAF nodes, Transformer, Format Description, Software, Input format, Output format, Platform]
Gödel’s Theorem
Some representations (e.g. plain ASCII
text, MS-WORD, HTML) are defined
outside the system
All references to such a format are via the
same CRID
The ends of representation nets must be
managed, to look out for obsolescence
replace CRID destination with converter
facility
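The value of referring to a format only through its CRID is that the destination can be changed once, in one place. The sketch below is a guess at the idea rather than the Cedars design: the registry layout, the `crid:` strings and the converter name are all hypothetical.

```python
# Hypothetical registry: CRID -> what the reference currently resolves to.
registry = {
    "crid:format/html": {
        "kind": "external-definition",
        "note": "HTML, defined outside the archive",
    },
}

# AIPs never describe the format themselves; they only hold the CRID.
aips = {
    "aip-0001": {"format": "crid:format/html"},
    "aip-0002": {"format": "crid:format/html"},
}


def resolve(aip_id):
    """Follow an AIP's CRID to its current destination."""
    return registry[aips[aip_id]["format"]]


def retarget(crid, converter, target_format):
    """When a format nears obsolescence, point its CRID at a converter."""
    registry[crid] = {
        "kind": "converter",
        "converter": converter,
        "target": target_format,
    }


def aips_affected_by(crid):
    """List the AIPs whose access depends on this end of the net."""
    return sorted(aip_id for aip_id, record in aips.items()
                  if record["format"] == crid)
```

Calling `retarget("crid:format/html", "html-converter", "crid:format/successor")` changes what every AIP resolves to without touching any stored AIP.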
Evolution of the Representation Net
[Diagram labels: AIP, PDI, Representation Information, Primary Digital Object, RAE and UAF nodes, Transformer, Format Description, Software, Input format, Output format, Platform]
Evolution of the Representation Net
[Diagram labels: AIP, PDI, Representation Information, Primary Digital Object, RAE and UAF nodes (one more RAE than before), Transformer, Format Description, Software, Input format, Output format, two Platforms]
Evolution of the Representation Net
[Diagram labels: AIP, PDI, Representation Information, Primary Digital Object, RAE and UAF nodes (a further RAE added), Transformer, Format Description, Software, Input format, Output format, two Platforms]
Obsolete data formats
Keep the original byte-streams
Representation info leads to software
capable of rendering the information
Archive management must look out for
dependence on rendering software
that is about to become obsolete.
• Can use software preservation
techniques to preserve rendering
software
Emulation of Yesteryear
Today’s desktop machine far exceeds
the mainframe of the 1970s or even 80s
George3 (1970s UK system)
• Emulate the George3 executive
– i.e. order code + system calls
Constructing RI for obsolete materials
proves a valuable test-bed for the model
Vital concepts
CRIDs - give everything a unique name
A byte-stream can be stored for ever
• Complex data streams must be mapped
into byte-streams, and mapped back again
for use
Representation Information preserves
access to intellectual content
• makes emulation possible
Gödel Ends are monitored for
obsolescence
The Archival Information Package
Preservation Description Information
XML
Representation Information
Primary Digital Object
Property list
Packed into bytestream
Packed together into one AIP bytestream using ASN.1
• Links to Representation Network
• Links for other purposes
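Packing several parts into one byte-stream, and getting them back unchanged, can be shown with a simple length-prefixed layout. This is deliberately not ASN.1: it is a stand-in using Python's `struct` module, and the three sample parts are invented for the example.

```python
import struct


def pack_parts(parts):
    """Concatenate byte-strings, each preceded by a 4-byte big-endian length."""
    out = bytearray()
    for part in parts:
        out += struct.pack(">I", len(part))
        out += part
    return bytes(out)


def unpack_parts(blob):
    """Reverse pack_parts: split one byte-stream back into its parts."""
    parts = []
    offset = 0
    while offset < len(blob):
        (length,) = struct.unpack_from(">I", blob, offset)
        offset += 4
        parts.append(blob[offset:offset + length])
        offset += length
    return parts


# Hypothetical AIP parts: PDI as XML, a link into the representation net,
# and the Primary Digital Object itself.
pdi_xml = b"<pdi><title>Example</title></pdi>"
ri_link = b"crid:format/example"
pdo = b"\x00\x01binary object\xff"

aip_bytestream = pack_parts([pdi_xml, ri_link, pdo])
```

The round trip `unpack_parts(aip_bytestream)` returns the three parts byte for byte, which is the property that matters for mapping a complex object into a byte-stream and back.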
Choices at Creation of AIP
Geared towards easy/low maintenance
Identify which parts of PDI are fixed/static
Use current best archival method to map
the digital resource into a bytestream
(PDO then remains static)
For common (esp. changing) metadata
use indirection
Rendering Instructions
Format Descriptions
Representation Information
Technical Metadata
Evolving Technology
Representation Networks
Controversy Three
A Digital Message can be Preserved
Indefinitely
This is media-less
The Preserved resource hops media
long before temporal effects lose it
Digitisation and Access have a place