USING METADATA TO FACILITATE
UNDERSTANDING AND CERTIFICATION ABOUT
THE PRESERVATION PROPERTIES OF A
PRESERVATION SYSTEM
Jewel H. Ward, Hao Xu, Mike C. Conway, Terrell G.
Russell, and Antoine de Torcy
CRADLE Seminar, 20 September 2013
University of North Carolina at Chapel Hill
School of Information and Library Science
I. Introduction
II. Background
III. Method (Mapping a Data Grid to the OAIS Functional Model)
IV. Conclusion
V. Future Work
Introduction
• Need to provide internal audit mechanisms to verify
assertions about preservation capabilities
• OAIS RM → TRAC → TDR Certification → Certifying the
Certifiers
• This project grew out of Moore’s work, “Towards a Theory
of Digital Preservation” & work for SAA in 2010:
• “A preservation environment manages communication from the past while
communicating with the future. Information generated in the past is sent into the
future by the current preservation environment. The proof that the preservation
environment preserves authenticity and integrity while performing the
communication constitutes a theory of digital preservation. We examine the
representation information that is needed about the preservation environment for a
theory of digital preservation. The representation information includes descriptions
of the preservation management policies, the preservation processes, and the state
information that is needed to verify the correct working behavior of the system.”
IJDC, 2008, Vol. 3, No. 1, pp. 63-75
Introduction
• We propose to do this by:
• Metadata-as-a-preservation task: mapping computer functions and
procedures to the OAIS model and ISO 16363 (TRAC)
• Metadata-as-state: use state transition systems to connect
computer code & data management policies
• Our method bridges the gap between computer code &
human-readable policies
• Facilitates certifying preservation repository properties
• Not a funded project per se, but a melding of dissertation
(Jewel Ward), Master’s (Mike Conway), and postdoctoral
(Hao Xu) work.
Background
• Metadata
• Bibliographic, administrative
• IP numbers, date, time
• Event data, state
• The OAIS Composite of Functional Entities
• Common Services, Administration, Ingest, Data Management,
Archival Storage, Preservation Planning, Access
• Human- vs Machine-readable Policies
“The repository shall comply with Access Policies” (ISO 16363)
Vs.
acAclPolicy { msiAclPolicy("STRICT"); } # computer code
• “Bottom Up” vs. “Top Down”
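The contrast between the two policy forms can be sketched as a small lookup that links a human-readable requirement to the machine-readable rules meant to enforce it. This is only an illustration: the registry `POLICY_MAP` and the helpers `rules_for` and `unmapped` are hypothetical names, and the single pairing shown is the one the slide juxtaposes, not a verified compliance mapping.

```python
# Hypothetical registry linking human-readable requirements to rule text.
# Only the single pair below comes from the slides; everything else is assumed.
POLICY_MAP = {
    "The repository shall comply with Access Policies": [
        'acAclPolicy { msiAclPolicy("STRICT"); }',
    ],
}


def rules_for(requirement):
    """Return the machine-readable rules recorded for a requirement."""
    return POLICY_MAP.get(requirement, [])


def unmapped(requirements):
    """Return the requirements that have no machine-readable rule yet."""
    return [r for r in requirements if not rules_for(r)]
```

A list of requirements with no recorded rule is one way to see where the gap between "top down" standards and "bottom up" code remains open.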
Method 1
Method 2
• State transition system captures
• State of data objects
• Their transitions
Example: SIP → AIP → DIP
• Data management policies determine which sequences of
states are compliant or not
• These refinements yield a formal structure for data
management policies
• The state transition system thus obtained specifies
which metadata are necessary if the preservation
properties are going to be certified.
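A minimal sketch of the idea, assuming a policy can be written as a set of allowed transitions between package states. The table `ALLOWED` and the function names are illustrative assumptions, not the authors' implementation; the states SIP, AIP, and DIP are the ones named in the slides.

```python
# Hypothetical policy: the set of transitions this sketch treats as allowed.
ALLOWED = {
    ("SIP", "AIP"),  # ingest
    ("AIP", "AIP"),  # a change that stays within the archive
    ("AIP", "DIP"),  # dissemination
}


def is_compliant(states):
    """Return True if every consecutive pair of states is an allowed transition."""
    return all((a, b) in ALLOWED for a, b in zip(states, states[1:]))


def required_metadata(states):
    """List the transitions that must be recorded to check this sequence."""
    return list(zip(states, states[1:]))
```

Under this policy `["SIP", "AIP", "DIP"]` is a compliant sequence, while `["SIP", "DIP"]` is not, because it skips the archival state. The output of `required_metadata` illustrates the last bullet: the transitions themselves are the metadata needed for certification.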
Method 3
• Data system consists of:
• Actions that change states
• Events that trigger rules
• Rules that consist of actions
• Need to identify all states and actions, and the
non-determinism
• What events trigger rules?
• Need a set of metadata attributes == states of the state transition
system
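The relationship among events, rules, and actions might be sketched as follows. This is a toy model under stated assumptions: `RuleEngine`, the `"ingest"` event name, and the `checksum_recorded` attribute are hypothetical, and an object's state is represented simply as a dictionary of metadata attributes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# An action changes an object's state, modelled here as its metadata dict.
Action = Callable[[dict], None]


@dataclass
class RuleEngine:
    """Hypothetical engine: an event triggers a rule, a rule is a list of actions."""
    rules: Dict[str, List[Action]] = field(default_factory=dict)

    def on(self, event: str, *actions: Action) -> None:
        """Register actions to run when the named event occurs."""
        self.rules.setdefault(event, []).extend(actions)

    def fire(self, event: str, obj: dict) -> dict:
        """Apply, in order, every action registered for the event."""
        for action in self.rules.get(event, []):
            action(obj)
        return obj


def mark_as_aip(obj: dict) -> None:
    obj["package"] = "AIP"


def record_checksum_flag(obj: dict) -> None:
    obj["checksum_recorded"] = True  # illustrative attribute, not from the slides


engine = RuleEngine()
engine.on("ingest", mark_as_aip, record_checksum_flag)
obj = engine.fire("ingest", {"package": "SIP"})
```

In this sketch the metadata attributes on `obj` are the state, which mirrors the last bullet above: identifying every attribute an action can touch is the same task as enumerating the states of the transition system.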
Method 3
• Mapping
• States
• State Transitions
• Non-determinism
• Interpreting the components of the OAIS Reference
Model as a state transition system allows us to map
the high level recommendations to machine code in a
rigorous manner
• State transitions provide log information; log information
== metadata; these attributes are stored in the metadata
catalog
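One way to picture "log information == metadata" is to append a record to a catalog each time an object changes state. The sketch below uses an in-memory dictionary as a stand-in for the metadata catalog; the record fields and function names are assumptions made for illustration.

```python
from datetime import datetime, timezone

# Stand-in for the metadata catalog: object id -> list of attribute records.
catalog = {}


def log_transition(catalog, object_id, old, new, when=None):
    """Store one state transition as a metadata record on the object."""
    record = {
        "attribute": "state_transition",
        "from": old,
        "to": new,
        "time": (when or datetime.now(timezone.utc)).isoformat(),
    }
    catalog.setdefault(object_id, []).append(record)
    return record


def history(catalog, object_id):
    """Return the object's recorded transitions in the order they were logged."""
    return [(r["from"], r["to"]) for r in catalog.get(object_id, [])]
```

Calling `log_transition` for SIP to AIP and then AIP to DIP leaves a history that an auditor could replay against the policy, without needing access to the code that performed the transitions.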
Method 4
• If not a 1:1 mapping, need additional metadata
• Example: SIP → AIP; AIP → AIP; AIP → DIP
• Use the OAIS Composite of Functional Entities to create
“policy domains” in the preservation system. E.g., the
OAIS Reference Model as a “state transition system”.
• State transitions provide logging information about each
transition that may be preserved as metadata
• For example, whether an object is part of a SIP or an AIP can be
stored as a metadata attribute attached to that object
• No change = compliant
• Change = not compliant
Method 4
• Attach metadata for each area of the data grid
• Similar to the OAIS Composite of Functional Entities
• Combined with Customized Policies = a “policy domain”
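A policy domain of this kind might be modelled as a named area of the data grid plus a set of checks over the metadata attached to each object. The sketch below is hypothetical: `PolicyDomain`, the attribute names `package` and `domain`, and the two example checks are assumptions, with "Archival Storage" borrowed from the OAIS functional entities.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A policy is a check over one object's metadata attributes.
Policy = Callable[[dict], bool]


@dataclass
class PolicyDomain:
    """Hypothetical policy domain: a named area plus its customized policies."""
    name: str
    policies: Dict[str, Policy]

    def assess(self, objects: Dict[str, dict]) -> Dict[str, List[str]]:
        """Map each non-compliant object id to the names of the policies it fails."""
        report = {}
        for object_id, metadata in objects.items():
            failed = [n for n, check in self.policies.items() if not check(metadata)]
            if failed:
                report[object_id] = failed
        return report


archival_storage = PolicyDomain(
    name="Archival Storage",
    policies={
        "is_aip": lambda m: m.get("package") == "AIP",
        "has_domain_tag": lambda m: m.get("domain") == "Archival Storage",
    },
)

objects = {
    "obj1": {"package": "AIP", "domain": "Archival Storage"},
    "obj2": {"package": "SIP", "domain": "Archival Storage"},
}
report = archival_storage.assess(objects)
```

Here `report` flags only `obj2`, which still carries the SIP attribute inside the archival domain. An empty report would mean every object in the domain satisfies its policies.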
Conclusion 1
“Basing policy domains on [the OAIS Reference Model
Composite of Functional Entities] enables the development
of assessment criteria that evaluate the recorded state of
each object and allow automated validation of the
properties such as authenticity, integrity, chain of custody,
and trustworthiness. Our approach makes the assessment
of the preservation state of the archive amenable to formal,
automated validation via state transition events metadata.”
Summary
• We examined the “state of the state”
• We detailed a method for abstracting state transition
systems from data management policies
“…this approach bridged the gap between concrete
computer code and human-readable, abstract standards
such as the OAIS Reference Model Functional Model by
using metadata. We posit that this method may be applied
to any preservation system in order to facilitate the
certification of a trustworthy digital repository.”
Future Work
• I finish my dissertation, which examines the “state of the
state”.
• Capture how information objects in the OAIS RM may be
transformed, combined, or divided
• Capture the static and dynamic aspects of the OAIS RM
• Map iRODS policy enforcement points to the OAIS RM
semantics (static vs. dynamic)
• Classify the policies required for repositories to create a
library of policies by domain or function
• Define a formalism to verify these
states/transitions/actions
Acknowledgements
This research is partially supported by NSF grant #0940841 “DataNet
Federation Consortium” and NSF grant #1032732 “SDCI Data
Improvement: Improvement and Sustainability of iRODS Data Grid
Software for Multi-Disciplinary Community Driven Application”.
Questions? Comments?