Archival Information Packages for NASA HDF-EOS Data R. Duerr, Kent Yang, Azhar Sikander.

Archival Information Packages for
NASA HDF-EOS Data
R. Duerr, Kent Yang, Azhar Sikander
Outline
• What is an Archival Information Package?
 HDF-AIP
• Standards? What Standards?
 METS
 DIF/FGDC/ISO 19115-2
 PREMIS
• Results
• Next Steps
Archival Information Packages for NASA HDF-EOS Data, presented 11/4/09 by R. Duerr at the HDF and HDF-EOS Workshop XIII
OAIS Reference Model [1]
Archive Information Package
[1] Reference Model for an Open Archival Information System (OAIS), CCSDS 650.0-B-1, Blue Book, January 2002.
Archival Information Package Contents
• Content Information
 The data object to be preserved
 Information that describes the data object
o Typically interpreted as the syntax and semantics of the file
structure
• Preservation Description Information
 Provenance – Origin or source of the data, any changes that have taken place since,
and who has had custody of it
 Fixity – the authentication mechanisms (with keys) needed to ensure that the data
object has not been altered in an undocumented manner
 Reference – identification mechanisms and values
 Context – relation of the object to its environment
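Of the four Preservation Description Information components, fixity is the easiest to illustrate in code. The following is a minimal sketch, assuming SHA-256 as the digest (the slides do not name an algorithm); the function names `compute_fixity` and `verify_fixity` are hypothetical and not part of any HDF or OAIS tooling.

```python
import hashlib


def compute_fixity(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks so that
    large data files do not have to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, expected_digest, algorithm="sha256"):
    """Return True when the file still matches the digest recorded at ingest."""
    return compute_fixity(path, algorithm) == expected_digest.lower()
```

An archive would typically record the digest when the data object is ingested and re-run the verification on a schedule or after any migration.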
HDF-Archive Information Packages
• The HDF Group was funded to investigate and propose a design for a complete archival information package for HDF data files
• The result was a METS metadata file to accompany the HDF data file
http://www.hdfgroup.org/projects/hdf5_aip/hdf5_aip_wp.html
Metadata Standards - METS
• Metadata Encoding and Transmission Standard
• An initiative of the Digital Library Federation
• Provides the means to convey the metadata
necessary for
 management of digital objects within a repository
 exchange of objects between repositories (or between
repositories and their users)
• Designed to facilitate
 shared development of information management
tools/services
 interoperable exchange of digital materials
METS - A very brief overview
• Header (metsHdr): describes the METS document itself, e.g., creator or editor
• Descriptive metadata (dmdSec): describes the object using some external standard, e.g., MARC, FGDC, Dublin Core
• Administrative metadata (amdSec): describes object creation, storage, intellectual property rights, source info, provenance, etc., e.g., PREMIS
• File section (fileSec): provides an inventory of all of the files that are part of the object described
• Structural map (structMap): a physical or logical map of the organization of the materials described
• Structural links (structLink): allows specification of hyperlinks between parts of the map (mostly useful when preserving websites)
• Behavior section (behaviorSec): used to associate executable code with parts of the content
Metadata Standards - Descriptive Metadata
Derived from
• Discovery, Assessment, and Access Metadata
 GCMD DIF
 FGDC CSDGM
 ISO 19115
Metadata Standards - ISO 19115:2003
• The international equivalent of the FGDC standard
• Most fields can be mapped or generated from
FGDC metadata
• The exception is the Dataset Topic Keywords
• Allows for national profiles
Is there a metadata standard for AIP information?
Archive Information Package [1]
[1] Reference Model for an Open Archival Information System (OAIS), CCSDS 650.0-B-1, Blue Book, January 2002.
Preservation Metadata Implementation Strategies (PREMIS)
• Provides a core preservation metadata set with broad
applicability across the digital preservation
community
• Developed by an OCLC- and RLG-sponsored
international working group
 Representatives from libraries, museums, archives,
government, and the private sector.
• Maintained by the Library of Congress
• Based on the OAIS reference model
PREMIS - Entity-Relationship Diagram
• Intellectual Entities: “a coherent set of content that is reasonably described as a unit”
 For example, a web site, data set or collection of data sets
• Objects: “a discrete unit of information in digital form”
 For example, a data file
• Events: “an action that involves at least one object or agent known to the preservation repository”
 e.g., created, archived, migrated
• Agents: “a person, organization, or software program associated with preservation events in the life of an object”
 e.g., Dr. Spock donated it
• Rights: “assertions of one or more rights or permissions pertaining to an object or an agent”
 e.g., copyright notice, legal statute, deposit agreement
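The relationships among the five PREMIS entities can be sketched as a small in-memory data model. This is only an illustration under stated assumptions: the class and field names below are hypothetical, and the sketch does not follow the PREMIS XML schema or data dictionary field by field.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Agent:
    """A person, organization, or software program."""
    name: str
    agent_type: str  # e.g. "person", "organization", "software"


@dataclass
class Event:
    """An action involving at least one object or agent."""
    event_type: str  # e.g. "created", "archived", "migrated"
    date_time: str
    agents: List[Agent] = field(default_factory=list)


@dataclass
class RightsStatement:
    """An assertion of rights or permissions."""
    basis: str  # e.g. "copyright notice", "deposit agreement"
    note: str = ""


@dataclass
class DigitalObject:
    """A discrete unit of information in digital form, e.g. a data file."""
    identifier: str
    events: List[Event] = field(default_factory=list)
    rights: List[RightsStatement] = field(default_factory=list)

    def record_event(self, event_type, date_time, agent):
        """Attach a new preservation event to this object and return it."""
        event = Event(event_type, date_time, [agent])
        self.events.append(event)
        return event


@dataclass
class IntellectualEntity:
    """A coherent set of content, e.g. a data set made of many files."""
    title: str
    objects: List[DigitalObject] = field(default_factory=list)
```

For example, archiving one granule of a data set would add an `Event` of type "archived" to the corresponding `DigitalObject`, linked to the `Agent` (here, most likely the ingest software) that performed it.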
Is there a metadata standard for AIP information? [1]
• PREMIS
• ISO 19115
[1] Reference Model for an Open Archival Information System (OAIS), CCSDS 650.0-B-1, Blue Book, January 2002.
NOAA Data Stewardship Prototype
• NSIDC and The HDF Group (THG) demonstrated the feasibility of migrating NASA data to a standard HDF-AIP format
• Motivation:
 Technologies change regularly and organizations come and go, but data must survive
 Preserving data takes more than just preserving the bits; all the components of an AIP are critical
Project Goals
• Prototype development of Archive Information
Packages for HDF data:
 For entire data sets
 For individual “granules”
• Test usability of digital library standards with
geospatial data
Program Plan (Modified)
[Figure: program plan flow diagram. Labels: NSIDC/ECS HDF4 data; H4toH5; NetCDF4/HDF5 data; CDM/NetCDF4; NSIDC/ECS metadata (granule); ISO-19115 (data set); ECS to METS; METS; HDF5-AIP]
HDF5 Granule Level Archive Information Packages
HDF5 AIP Components
• HDF5 data file
• Metadata file (METS)
Primary Schema (METS)                      Extension Schema
|<mets>
|---<dmdSec>-------------------------------<ISO 19115>
|---<amdSec>--------------|--<techMD>
|                         |--<rightsMD>      PREMIS
|                         |--<sourceMD>
|----<fileGrp>
|----<structMap>
http://www.hdfgroup.uiuc.edu/papers/papers/AIP/HDF5_AIP_White_Paper.pdf
File Level AIP Activity Status
• Developed a map from NSIDC/ECS metadata to
METS/PREMIS/ISO 19115 components
• Prototype software completed
• Issues
 What goes in PREMIS vs ISO 19115?
 Auxiliary file handling - own AIP or not?
o E.g., browse files, processing history, PGEs
 Granules vs files
Issues and Questions
• Inconsistent use of terminology between standards
– for example, what is a data set?
• Many of the standards care about distribution
formats
 Are these even relevant concepts any more?
 Do you really want to have to update the metadata record
just because a new distribution format was added?
 What about new access services?
Next Steps
• NSIDC is updating the metadata handling of our non-ECS data systems, including support for PREMIS and related metadata on all holdings
• Work is underway to upgrade granule-level metadata for NSIDC flagship sea ice products (PREMIS/METS/ISO AIP packages)
• Work to improve the archivability of data stored in HDF formats is ongoing; NASA is implementing a standard XML description of contents across its archives
Acknowledgement
This work was supported under NOAA Scientific
Stewardship Program grant number
NA07OAR4310286. Any opinions, findings,
and conclusions or recommendations
expressed in this material are those of the
author(s) and do not necessarily reflect the
views of NOAA.