USGS Proposal to Collaborate as a NARA Affiliated Archives


Selection, Appraisal and Retention
of Digital Scientific Data
Presented at
Workshop on Strategies for Permanent Access to Scientific Information in
Southern Africa: Focus on Health and Environmental Information for
Sustainable Development
By
John Faundeen
Archivist, U.S. Geological Survey
September 2005 - Pretoria, South Africa
Outline
• Context
• Selection
• Appraisal
• Retention
• Case Study
• Proposal
• Summary
Context
• U.S. Geological Survey
– Center for Earth Resources Observation and Science
• Land Applications
– Satellite and Aerial Remote Sensing
– Topography
– Land Cover
• Earth Observation Holdings
– 2.8 Petabytes from 1972 to Present (1 TB/day)
– 110,000 Film Rolls from 1937 to Present (4 types)
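As a rough sense of scale, the growth rate quoted above can be turned into yearly figures. This is a back-of-the-envelope sketch only: it assumes the 1 TB/day rate stays constant and uses decimal units (1 PB = 1,000 TB), neither of which the slide states.

```python
# Back-of-the-envelope growth estimate for the digital holdings.
# Assumptions (not from the slide): constant ingest rate, decimal units.
HOLDINGS_PB = 2.8        # stated size of the digital archive
INGEST_TB_PER_DAY = 1.0  # stated growth rate
TB_PER_PB = 1000         # assumption: decimal petabytes


def yearly_growth_pb(tb_per_day: float = INGEST_TB_PER_DAY) -> float:
    """Petabytes added in a 365-day year at a constant daily rate."""
    return tb_per_day * 365 / TB_PER_PB


def years_to_double(holdings_pb: float = HOLDINGS_PB) -> float:
    """Years for the archive to double in size at the constant rate."""
    return holdings_pb / yearly_growth_pb()


if __name__ == "__main__":
    print(f"Growth per year: {yearly_growth_pb():.3f} PB")
    print(f"Years to double: {years_to_double():.1f}")
```

Under these assumptions the archive grows by roughly a third of a petabyte per year, which is why the choice of archival media and retention policy matters at this scale.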
Context
• Scientific Data
– Observations
– Experimental Outputs
– Models
• “Long-Term” Preservation
“A period of time long enough for there to be
concern about the impacts of changing
technologies, including support for new
media and data formats, and of a changing
user community, on the information being
held in a repository. This period extends into
the indefinite future.”
- Consultative Committee for Space Data Systems
[Diagram: Selection, Appraisal, Retention]
Selection
“The process of identifying materials to be
preserved because of their enduring value,
especially those materials to be physically
transferred to an archives.”
- Society of American Archivists Glossary
Selection
• Inventory (verb)
– Magnitude
– Breadth
• Political
• Daunting
• Time-Consuming
• Necessary
Appraisal
“The process of determining the value and thus the
disposition of records based upon their
[informational/scientific] use.”
- Society of American Archivists Glossary
“The process of determining the value and thus the
final disposition of Federal records, making them
either temporary or permanent.”
- U.S. National Archives and Records Administration
- http://www.archives.gov/records-mgmt/initiatives/appraisal.html
Appraisal
• Judgmental and Subjective
• Organizational Mission Alignment
• Agreements or Legislative Mandates
• Collection Policy
• Continuing Value
– Secondary Uses
– Science Data Interdependences
– Future Research Use … Predicting
Appraisal
• Reproducible? At what Cost?
• Chain of Custody
– Authenticity
– Reliability
– Integrity
– Usability
• Advisory Committees
• Funding
– Lifecycle
• Creation, Maintenance and Use, Disposition
Appraisal - Criteria
• U.S. National Archives and Records Administration
– How significant are the records for research?
– How significant is the source and context of the records?
– Is the information unique?
– How useable are the records?
– Do the records document decisions that set precedents?
– Are the records related to other permanent records?
– What is the time frame covered by the information?
– What are the cost considerations for permanent
maintenance of the records?
Appraisal - Criteria
• U.S. Department of Energy Sandia National
Laboratories (extracts from a questionnaire to archives)
– Who really needs this spatial data (who is the intended
user group of the spatial data sets in the archives)?
– What are the purposes for keeping this data?
– How do you intend to use the data (general science
based, legal reasons, basic reference)?
– Does the data really need to be digital?
– Are the data indexed in any way?
– Are the data stored by date, sensor type, geography,
theme or project or in a combination?
Appraisal - Criteria
• U.S. National Research Council
“As a general rule, all observational data that are nonredundant, useful,
and documented well enough for most primary uses should be
permanently maintained. Laboratory data sets are candidates for long-term
preservation if there is no realistic chance of repeating the
experiment, or if the cost and intellectual effort required to collect and
validate the data were so great that long-term retention is clearly justified.
For both observational and experimental data, the following retention
criteria should be used to determine whether a data set should be saved:
uniqueness, adequacy of documentation (metadata), availability of
hardware to read the data records, cost of replacement, and evaluation
by peer review. Complete metadata should define the content, format or
representation, structure, and context of a data set.”
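The general rule quoted above can be read as a simple decision procedure. The sketch below is one possible encoding, not an NRC or USGS tool: the `DataSet` fields and the `retain_permanently` function are hypothetical names, and each flag stands in for a judgment that in practice would come from peer review.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    """Hypothetical record of the judgments the quoted rule relies on."""
    kind: str                         # "observational" or "laboratory"
    nonredundant: bool = False
    useful: bool = False
    well_documented: bool = False     # adequate for most primary uses
    repeatable: bool = True           # laboratory only: could the experiment be rerun?
    costly_to_recreate: bool = False  # laboratory only: cost/effort justifies retention


def retain_permanently(ds: DataSet) -> bool:
    """Apply the quoted general rule; True marks a retention candidate."""
    if ds.kind == "observational":
        return ds.nonredundant and ds.useful and ds.well_documented
    if ds.kind == "laboratory":
        return (not ds.repeatable) or ds.costly_to_recreate
    raise ValueError(f"unknown data set kind: {ds.kind!r}")
```

For example, `retain_permanently(DataSet("laboratory", repeatable=False))` evaluates to `True`, since an experiment that cannot be repeated is a candidate for long-term preservation under the quoted rule.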
Appraisal
• Process Flow for USGS EROS
– Appraisal Requested by Archivist, a Program, our
Science Council or the Director of EROS.
– Appraisal Team Assembled
• Archivist
• Records Managers
• Relevant Scientists
– Appraisal Conducted and Documented
• Appraisal Tool Utilized
– Recommendations Briefed to Requestor and Director
of EROS
• Physical Transfers Reinforce Appraisal Criteria
– Requirements & Checklist
Appraisal – USGS EROS Tool
• Over 70 Categorized Questions
– Mission Relevancy
– General Policy (ISO Standard)
– Physical (Media)
– Metadata
– Cost / Benefit
http://edc2.usgs.gov/government/RAT/tool.asp
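To illustrate how a categorized questionnaire of this kind might be tallied, here is a minimal sketch. It is not the USGS EROS tool: the five category names come from the slide, but the sample questions, the yes/no answer format, and the `summarize` function are all assumptions made for illustration.

```python
from collections import defaultdict

# Category names are from the slide; the questions are invented examples.
QUESTIONS = [
    ("Mission Relevancy", "Does the collection support the organization's mission?"),
    ("General Policy", "Is the collection covered by a records schedule?"),
    ("Physical (Media)", "Is the media readable with available hardware?"),
    ("Metadata", "Is the documentation sufficient for a new user?"),
    ("Cost / Benefit", "Is preservation cheaper than re-acquisition?"),
]


def summarize(answers: dict[str, bool]) -> dict[str, tuple[int, int]]:
    """Return {category: (yes_count, answered_count)} for answered questions."""
    tally = defaultdict(lambda: [0, 0])
    for category, question in QUESTIONS:
        if question in answers:
            tally[category][1] += 1                     # one more answered question
            tally[category][0] += int(answers[question])  # count "yes" answers
    return {cat: (yes, total) for cat, (yes, total) in tally.items()}
```

A per-category summary like this keeps the appraisal team's discussion organized by topic; it does not replace the judgment the team still has to exercise when writing its recommendation.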
Retention [Period]
“The length of time, usually based upon an
estimate of the frequency of use of current
and anticipated business, that records
should be retained … before they are
transferred to archives or otherwise
disposed of.”
- Society of American Archivists Glossary
Retention
• Records Schedules
– Draft USGS Geography
• For Long-Term Preservation
– Archival Media Studies (Fiscal Year 2004)
• Near/On, Off-line, Off-site = Objective
• Cost
• Disposition
– Preparation
– Transfer
– Destruction
– Cost
Case Study
• Committee on Earth Observation Satellites
– 1999 Working Group Meeting in Nairobi, Kenya
– Developing Countries Support Workshop
– Environmental Agencies Around Kenya
• Aging 9-Track Tapes
• Contained Satellite Observations
– Requested Data Rescue Effort
• Desired Data on CDs
– Transfer and Copy Process Developed
• 53% Success Rate
Proposal
• USGS Willing to Extend The Effort
– We Best Know Satellite Data
– Willing to Attempt Other Data
• Case-by-Case Basis
– John Faundeen
• [email protected]
• Fax 707-222-0223
• U.S. Geological Survey, EROS, Sioux Falls, SD 57198 USA
Summary
• Selection, Appraisal & Retention Work
Together
• Appraisal is Subjective, But Must be Done!
• Retention Leads to Dispositions
– May Result in Transfers, not Just Destruction
• Critical to Institutionalize these Processes
• Start Small, But do Start!
“The collections of scientific data acquired
with government and private support are
the foundation for our understanding of the
physical world and for our capabilities to
predict changes in that world.”
- U.S. National Research Council
References
• Gutmann, M., Schurer, K., Donakowski, D., Beedham, H., “The Selection, Appraisal, and
Retention of Digital Social Science Data.” in Data Science Journal, Volume 3, December 30,
2004, pp. 209-221.
• Pearce-Moses, R., “A Glossary of Archival and Records Terminology.” Society of American
Archivists, Chicago, IL, 2004.
• Bellardo, L. J., Bellardo, L. L., “A Glossary for Archivists, Manuscript Curators, and Records
Managers.” Society of American Archivists, Chicago, IL, 1992.
• National Archives and Records Administration (2003) Strategic Directions: Appraisal Policy.
Retrieved on August 4, 2005 from the National Archives and Records Administration site:
http://www.archives.gov/records-mgmt/initiatives/appraisal.html
• National Archives and Records Administration (2003) Appraisal Policy of the National Archives
and Records Administration. Retrieved on August 4, 2005 from the National Archives and
Records Administration site: http://www.archives.gov/records-mgmt/initiatives/appraisal.html
• Esanu, J., Davidson, J., Ross, S., Anderson, W., “Selection, Appraisal, and Retention of
Digital Scientific Data: Highlights of an ERPANET/CODATA Workshop.” December 15-17,
2003, Lisbon, Portugal.
• International Standards Organization, “Information and Documentation – Records
Management.” ISO 15489-1:2001(E), Geneva, Switzerland, 2001.
References
• Eastwood, T., “Appraising Digital Records for Long-Term Preservation.” in Data Science Journal,
Volume 3, December 30, 2004, pp. 202-208.
• Faundeen, J., “Interdisciplinary Case Study 2: Earth & Environmental Sciences U.S. Geological
Survey/EROS Data Center Archiving Perspective.” Presentation to the ERPANET/CODATA
Workshop on the Selection, Appraisal, and Retention of Digital Scientific Data, December 15-17,
2003, Lisbon, Portugal.
• Anderson, W. L., “Some Challenges and Issues in Managing, and Preserving Access to,
Long-Lived Collections of Digital Scientific and Technical Data.” in Data Science Journal, Volume 3,
December 30, 2004, pp. 191-202.
• Bleakly, D. R., “Long-Term Spatial Data Preservation and Archiving: What are the Issues?” Sand
Report 2002-0107, Prepared by Sandia National Laboratories, Albuquerque, NM, January 2002.
• National Research Council, “Preserving Scientific Data on Our Physical Universe: A New Strategy
for Archiving the Nation’s Scientific Information Resources.” Steering Committee for the Study on
the Long-term Retention of Selected Scientific and Technical Records of the Federal Government,
Commission on Physical Sciences, Mathematics, and Applications, National Academy Press,
Washington, D.C., 1995.
• ERPANET, “ErpaTools: Digital Preservation Policy Tool.” Electronic Resource Preservation and
Access Network, September 2003. Retrieved on August 4, 2005 from the Electronic Resource
Preservation and Access Network site:
http://www.erpanet.org/guidance/docs/ERPANETPolicyTool.pdf
• Consultative Committee for Space Data Systems, “Reference Model for an Open Archival
Information System (OAIS).” Retrieved on August 4 from the Consultative Committee for Space
Data Systems site: http://www.ccsds.org/CCSDS/documents/650x0b1.pdf
Backup Slides
Retention
• ERPANET Digital Preservation “Areas of Coverage”
– Authority and Responsibility
– Conversion and Reformatting
– Appraisal, Selection and Acquisition
– Storage and Maintenance
– Access and Dissemination
– Implementation
– Standards
– Procedures
– Quality Control, Auditing and Benchmarking
– Cooperation
– Technical Infrastructure
[Images of the Pretoria, South Africa area: Landsat MSS 1972, Declassified 1973, Landsat TM 1985, Landsat ETM+ 2000, EO-1 Hyperion 2002]
Case Study
• Process Developed
– UNEP Collected Tapes from Environmental
Agencies
– UNEP Shipped Tapes to USGS
– USGS Cleaned and Copied Tapes to CDs
– CDs Sent Back to UNEP for Redistribution
– Old 9-Track Tapes Destroyed
Case Study
• 448 Tapes Received
DATE RECEIVED      NUMBER OF TAPES    DELIVERED to UNEP     NUMBER OF TAPES CONVERTED
1 August 2000      52                 9 May 2001            28
31 August 2001     207                15 April 2002         69
23 May 2002        40                 31 March 2003         23
17 March 2004      149                16 September 2004     118
Totals:            448                                      238 (53%)
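The totals on this slide can be checked from the per-shipment counts. The short sketch below does that arithmetic; the grouping of the numbers into "received" and "converted" lists is inferred from the stated totals (448 and 238), so treat it as an assumption.

```python
# Per-shipment counts from the slide; grouping inferred from the totals.
received = [52, 207, 40, 149]
converted = [28, 69, 23, 118]

total_received = sum(received)
total_converted = sum(converted)
success_rate = total_converted / total_received

# Prints: 238 of 448 tapes converted (53%)
print(f"{total_converted} of {total_received} tapes converted "
      f"({success_rate:.0%})")
```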
“DO RIGHT BY THE RECORDS”