Long-Term Preservation of Digital Geospatial Data: A Cooperative Project with Library of Congress Steve Morris North Carolina State University Libraries Mountain Region GIS Advisory.

Long-Term Preservation of Digital Geospatial Data: A
Cooperative Project with Library of Congress
Steve Morris
North Carolina State University Libraries
Mountain Region GIS Advisory Council Meeting
September 15, 2006
Overview
Spatial Data Preservation: Values and Considerations
NC Geospatial Data Archiving Project
Approaches to Preservation
Challenges
Workflow
NC Spatial Data Infrastructure
NC OneMap
Regional/Local Partnerships and Data Sharing
Coordinated Content Transfer
Industry Engagement
Historic and Geologic Map Preservation Project
2
Today’s geospatial data as tomorrow’s cultural heritage
Future uses of data are difficult to
anticipate (as with Sanborn Maps).
3
Temporal Data Supports Decision Making
Land use change analysis
Real estate trend analysis
Site selection
(past uses?)
Forecasting
Parcel Boundary Changes
2001-2004
North Raleigh, NC
4
Time series – Ortho imagery
Vicinity of Raleigh-Durham International Airport 1993-2002
5
Geospatial Data: Risks
Producer focus on current data
Future support of data formats in
question
Shift to web services- and API-based
access
Inadequate or nonexistent metadata
Increasing use of spatial databases
for data management
Many digital archiving challenges …
6
Geospatial data types: Aerial imagery
85+ NC counties with orthophotos
1-5 flights per county
30-300 GB per flight
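The figures above imply a wide range of total imagery volume. A rough back-of-the-envelope sketch follows, assuming exactly 85 counties (the slide says "85+", so both bounds are floors) and decimal terabytes; the function and constant names are illustrative only.

```python
# Rough bounds on orthophoto storage, using only the slide's figures.
# Assumption: "85+" is treated as exactly 85 counties, so these are floor values.
COUNTIES = 85
FLIGHTS_PER_COUNTY = (1, 5)   # min, max flights per county
GB_PER_FLIGHT = (30, 300)     # min, max gigabytes per flight


def storage_bounds_gb(counties, flights, gb_per_flight):
    """Return (low, high) total gigabytes across all counties."""
    low = counties * flights[0] * gb_per_flight[0]
    high = counties * flights[1] * gb_per_flight[1]
    return low, high


low_gb, high_gb = storage_bounds_gb(COUNTIES, FLIGHTS_PER_COUNTY, GB_PER_FLIGHT)
print(f"{low_gb:,} GB to {high_gb:,} GB")
print(f"{low_gb / 1000:.2f} TB to {high_gb / 1000:.1f} TB (decimal terabytes)")
```

Under these assumptions the range runs from about 2.55 TB to about 127.5 TB, which is why content transfer and storage planning matter for imagery.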
7
Geospatial data types: Vector & Tabular
Economic, infrastructure,
and ethnographic data
8
Geospatial data types: Cartographic Project Files
Counterpart to the map is not
just the dataset but also models,
symbolization, classification,
annotation, etc.
9
How would you describe your current geospatial
archive?
Bob’s hard drive
Last week’s set of nightly tape backups
Several boxes of CDs and DVDs
The data back-end for our internet mapping application
A collection of files in our “GIS Folder”
A stand-alone spatial database
An enterprise GIS
10
NC Geospatial Data Archiving Project (NCGDAP)
Partnership between university library (NCSU) and state agency
(NCCGIA), with Library of Congress under the National Digital
Information Infrastructure and Preservation Program (NDIIPP)
One of 8 initial NDIIPP partnerships
Focus on state and local geospatial content in North Carolina
(state demonstration)
Tied to NC OneMap initiative, which provides for seamless access
to data, metadata, and inventories
Objective: engage existing state/federal geospatial data
infrastructures in preservation
Serve as catalyst for discussion within industry
11
12
NDIIPP Overview
National Digital Information Infrastructure and Preservation Program
Congress appropriated $100 million for this effort; the legislation instructs the Library to
spend an initial $25 million to develop and execute a congressionally approved
strategic plan
Eight initial projects, 2004-2007:
web pages, cultural heritage, numeric data, video, business records,
mixed content, geospatial (2)
Developing partnerships and identifying issues
Extensive interaction among NDIIPP projects
13
Different Ways to Approach Preservation
Technical solutions: How do we archive acquired content over the long term?
Tools
Hardware
Software
Cultural/Organizational solutions: How do we make the data more
preservable—and more prone to be archived—from point of production?
Collaboration
Education
Feedback
14
Technical Approaches
Receive data as is – variety of distribution methods
Migration of some at-risk formats
Metadata remediation, standardization, and synchronization
Distilling complex objects into repository ingest items (not
easy)
Using DSpace for demonstration purposes
In development: use a METS record as the dormant item “brain”
within the repository
Some unsustainable activities – for learning experience
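To make the METS idea concrete, here is a minimal sketch of a METS wrapper that lists the files belonging to one archived item. It is not the project's actual METS profile: the item identifier, file IDs, paths, MIME types, and the build_mets helper are all hypothetical, and only the METS and XLink namespaces and the basic fileSec/structMap elements come from the METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_mets(object_id, files):
    """Build a minimal METS element tree for one archived item.

    files: list of dicts with hypothetical keys 'id', 'href', 'mimetype'.
    """
    mets = ET.Element(f"{{{METS_NS}}}mets", {"OBJID": object_id})
    file_sec = ET.SubElement(mets, f"{{{METS_NS}}}fileSec")
    file_grp = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp", {"USE": "original"})
    struct_map = ET.SubElement(mets, f"{{{METS_NS}}}structMap", {"TYPE": "physical"})
    root_div = ET.SubElement(struct_map, f"{{{METS_NS}}}div", {"LABEL": object_id})
    for f in files:
        # Each file is declared once in fileSec and referenced from structMap.
        file_el = ET.SubElement(
            file_grp, f"{{{METS_NS}}}file",
            {"ID": f["id"], "MIMETYPE": f["mimetype"]},
        )
        ET.SubElement(
            file_el, f"{{{METS_NS}}}FLocat",
            {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": f["href"]},
        )
        ET.SubElement(root_div, f"{{{METS_NS}}}fptr", {"FILEID": f["id"]})
    return mets


# Hypothetical item: one data file plus its metadata record.
record = build_mets(
    "county-parcels-2004",
    [
        {"id": "FILE001", "href": "original/parcels.shp", "mimetype": "application/octet-stream"},
        {"id": "FILE002", "href": "metadata/parcels.xml", "mimetype": "text/xml"},
    ],
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

The record stays "dormant" in the sense that it travels with the item and describes its parts without the repository needing to interpret it.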
15
Cultural/Organizational Approaches
Feedback to metadata outreach program
Feedback to coordinating bodies on adherence to content
standards
Engage existing spatial data infrastructure in archiving and
preservation
Engage software vendors and standards community
Cross-fertilize with other national archiving efforts
Current use and data sharing requirements – not archiving needs –
drive improved preservability of content and improvement of metadata
16
Challenge: Vector Data Formats
No widely supported, open vector formats for geospatial data
Spatial Data Transfer Standard (SDTS) not widely supported
Geography Markup Language (GML) – diversity of application
schemas and profiles threatens permanent access
Spatial Databases
The whole is more than the sum of the parts, and the whole is
very difficult to preserve
Can export individual data layers for curation
Some thinking of using the spatial database as the primary
archival platform
17
Challenge: Geospatial Web Services
• How to capture records from decision-making processes?
• Possible: Atlas collections from automated
image capture
• Web 2.0 impact: Emerging tiling and
caching schemes (archive target?)
18
Challenge: Preserving Cartographic Representation
19
General Workflow
1) Receive Data from Agency
2) Copy data from agency source to NCSU workstation
3) Create DSpace collection “space” for the data
4) Create administrative metadata
5) Process geospatial metadata
6) Scan geospatial formats and migrate to archival format
7) Ingest original and archival data objects, plus geospatial and
administrative metadata, into DSpace
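Parts of the workflow above can be sketched in code. The following is a minimal illustration, not the project's tooling: it covers only the copy step, a simple administrative manifest with checksums, and a scan that flags files for migration. The stage_delivery function, the manifest layout, and the list of at-risk extensions are hypothetical, and format migration and repository ingest are left out.

```python
import hashlib
import json
import shutil
from pathlib import Path

# Hypothetical list of extensions treated as needing migration; illustrative only.
AT_RISK_EXTENSIONS = {".e00", ".mif"}


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stage_delivery(source_dir, work_dir, agency):
    """Copy an agency delivery to a working area and write a manifest.

    Covers steps 2, 4, and the scanning half of step 6.
    """
    source_dir, work_dir = Path(source_dir), Path(work_dir)
    originals = work_dir / "original"
    shutil.copytree(source_dir, originals)  # step 2: keep the data as received
    entries = []
    for path in sorted(p for p in originals.rglob("*") if p.is_file()):
        entries.append({
            "file": path.relative_to(originals).as_posix(),
            "bytes": path.stat().st_size,
            "sha256": sha256_of(path),
            # step 6 (scan only): flag formats on the hypothetical at-risk list
            "needs_migration": path.suffix.lower() in AT_RISK_EXTENSIONS,
        })
    manifest = {"agency": agency, "files": entries}  # step 4: administrative metadata
    (work_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest
```

A call such as stage_delivery("incoming/example_county", "work/example_county", "Example County") would leave an untouched copy under work/example_county/original and a manifest.json beside it; the directory names here are placeholders.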
20
NCGDAP Leveraging Existing Spatial Data
Infrastructure (NC OneMap)
NC OneMap: “Historic and temporal data will be maintained and
available”
RAMONA
Metadata outreach and content standards
Regional Partnerships
WGRT and other Coordination Efforts
Data Sharing Agreements
Frequent communication and discussion among geospatial data
community
21
Challenge: Coordinated Content Transfer
How to allow one data snapshot to be accessible by multiple
agencies – more compelling use cases than preservation can put
the data in motion (business continuity, disaster preparedness,
etc.)
Question: Capture frequency of data snapshot?
Survey in progress to identify local government best
practices and consumer agencies’ needs
Working Group for Roads and Transportation (WGRT)
Stakeholder group working to build data depository for
statewide local road data
First serious effort to develop a plan for local-to-state data
sharing on a regular basis
Other Activities? (DHS, State Archives, Census, etc.)
22
Partnership Activity
ESRI
Discussing software requirements: meetings with development teams
April 2005
Open Geospatial Consortium (OGC)
Presented to Architecture Working Group Nov. 2005
National Archives and Records Administration
Investigations into GML for archiving; presentation to NARA technology
team Dec. 2005
FGDC Historical Data Working Group
Ongoing, general geospatial data preservation issues
23
Partnership Activity
EDINA (University of Edinburgh, UK)
NCSU is Associate Partner on UK project for geospatial institutional
repositories
UC Santa Barbara & Stanford University
Collaboration with other NDIIPP geospatial project
EROS Data Center
Planned site visit
Project visits to regional GIS groups
24
Preservation of Digital Geologic and Historic Maps
Georeferenced over 450 maps scanned by NC Geological Survey
Maps are available for download at
http://wfs.enr.state.nc.us/NCGeologicMaps
15-min topo maps
1:31,680 – 1:430,000
1:1,200 – 1:24,000
1:500,000 – 1:2,500,000
25
Questions?
Contact:
Steve Morris
Head, Digital Library Initiatives
NCSU Libraries
[email protected]
Web site: http://www.lib.ncsu.edu/ncgdap/
26
NC Spatial Data Infrastructure: NC OneMap
NC OneMap is a next-generation mechanism to coordinate and
disseminate geographic information in North Carolina and
interact with the NSDI.
Objectives:
• Build a common
understanding of North
Carolina data resources
• Enable widespread
access and distribution
of geospatial data
27
NC OneMap Viewer
28
NC OneMap
Objectives (cont.):
• Develop ongoing data
inventory for all geospatial data
holdings: RAMONA –
http://nc.gisinventory.net
• Develop content standards
for key data themes
NC Geographic Information
Coordinating Council (GICC)
One of the defining characteristics of NC OneMap is that “Historic and temporal
data will be maintained and available”.
29
Emerging Regional Partnerships
Focused on development of shared infrastructure for cultivating access
to data
Becoming test beds for innovation in the area of data sharing and data
management, including archiving
30
Local Govt. Data Sharing
Becoming more open, fewer agreements to sign
Recent survey: over 20 state and federal agencies use local data
Problem of local governments being swamped by requests
Many requests are more compelling than “archiving”
Content transfer is non-trivial – large dataset sizes, small rural
staffs, technical limitations
31
Earlier NCSU Acquisition Efforts
NCSU University Extension project 2000-2001
Target: County/city data in eastern NC
“Digital rescue” not “digital preservation”
Project learning outcomes
Confirmed concerns about long-term access
Need for efficient inventory/acquisition
Wide range in rights/licensing
Need to work within statewide infrastructure
Acquired experience; unanticipated collaboration
32
Big Geoarchiving Challenges
Format migration paths
Management of data versions over time
Preservation metadata
Preserving cartographic representation
Keeping content repository-agnostic
Preserving geodatabases
Harnessing geospatial web services
More …
33