The Fifth China - U.S. Roundtable on
Scientific Data Cooperation
Global Science Needs Global Data
A Case for Data Sharing
E. Lynn Usery
U.S. Department of the Interior
U.S. Geological Survey
http://cegis.usgs.gov
Objectives
I will focus on the need for complete global geospatial
datasets at high resolution to support global science
modeling and analysis
Example Science Issues and Global Data Needs
Global Climate and Land Cover Change
Global Ecosystem Modeling
Global Hazards
Earthquakes
Sea Level Rise
Volcanism
Research on the Semantic Web as a platform for data sharing
2
USGS Science Strategy
http://www.usgs.gov/science_strategy/
3
USGS Science
Understanding Ecosystems and Predicting
Ecosystem Change
Climate Variability and Change
Energy and Minerals for America’s Future
A National Hazards, Risk, and Resilience
Assessment Program
The Role of Environment and Wildlife in Human
Health
A Water Census of the United States
4
USGS Science
Data Integration and Beyond
The USGS will use its information resources to
create a more integrated and accessible
environment for its vast resources of past and
future data. It will invest in cyberinfrastructure,
nurture and cultivate programs in natural-science
informatics, and participate in efforts to build a
global integrated science and computing platform.
5
Global Climate and Land Cover Change
Data Needs – One Example
High resolution (30 m or smaller pixels) satellite
images for land cover extraction
The U.S. Landsat archive does not include all
scenes from receiving stations outside the U.S.
Extracted land cover
Classes must match, i.e., same classification
system and same level of detail
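The matching requirement can be illustrated by collapsing detailed class codes to a shared, coarser level before two land cover products are compared. This is a minimal sketch, assuming a hypothetical hierarchical scheme in which the first digit of a two-digit code is the coarse class. The codes and function names are illustrative and do not come from the presentation.

```python
def to_coarse(code: int) -> int:
    """Collapse a hypothetical two-digit class code to its first digit."""
    return code // 10 if code >= 10 else code


def harmonize(pixels):
    """Bring every pixel of a product to the coarse level."""
    return [to_coarse(code) for code in pixels]


def agreement(product_a, product_b):
    """Fraction of pixels that agree once both products are harmonized."""
    pairs = list(zip(harmonize(product_a), harmonize(product_b)))
    return sum(1 for a, b in pairs if a == b) / len(pairs)


# Hypothetical codes: 41/42/43 are detailed forest classes, 4 is coarse forest.
detailed = [41, 42, 21]
coarse = [4, 43, 3]
print(agreement(detailed, coarse))
```

Comparing the raw codes would report no agreement at all in this example, although two of the three pixels describe the same coarse class.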
6
Worldwide Usage of Landsat Imagery
[Figure: worldwide usage of Landsat imagery; label: 1M]
7
Online Data Search, Browse and Order Tools
Earth Explorer
http://earthexplorer.usgs.gov
GLOVIS
http://glovis.usgs.gov
8
Landsat International Data Usage (FY10)
Top 20 Countries
(Excludes United States)
Country               Usage
MALAYSIA               4793
COLOMBIA               5091
ARGENTINA              5729
NETHERLANDS            6962
CONGO                  9124
ITALY                  9167
FRANCE                10107
KOREA, REPUBLIC OF    11265
INDIA                 14042
KAZAKHSTAN            14146
JAPAN                 16525
UNITED KINGDOM        22074
GERMANY               23185
CANADA                24850
MEXICO                25604
INDONESIA             29234
AUSTRALIA             37487
SPAIN                 41108
RUSSIAN FEDERATION    60257
CHINA                104734
9
Landsat Global Archive Consolidation (LGAC)
Goal is to consolidate the entire Landsat archive
5 million scenes held internationally vs. 2 million in the USGS
archive
From current stations as well as historical stations
Each station has data that will enhance the USGS archive
Enables scientific analysis of most complete time-series
of images for global land change
Facilitates large scale scene selection and data mining
capability
Recover data not currently available to users
Some data at risk due to aging media and drive obsolescence
Provides data to global user community as standard
product like current Landsat data from US archive
10
LGAC International Data Holdings
IC                         Scenes (in 1,000s)   Unique to IC   % Unique
Europe (ESA)                            2,089          1,883        90%
Australia (GA-NEO)                        629            463        74%
Canada (CCRS)                             532            242        45%
China (CEODE)                             449            409        91%
Japan (RESTEC)                            275            261        95%
Brazil (INPE)                             234            210        90%
Thailand (GISTDA)                          90             77        86%
Germany (DLR)                              87             61        70%
India (NRSC)                               51             50        98%
Ecuador (CLIRSEN)                          40             30        75%
Pakistan (SUPARCO)                         24             20        84%
Japan (HIT)                                21              7        35%
Indonesia (LAPAN)                           8              3        63%
Argentina (CONAE)                         253            162        64%
South Africa (CSIR-SAC)                   137            122        89%
Subtotal                                4,919          3,704        75%
Saudi Arabia (KACST)                       69
Puerto Rico (UPR)                          42
Grand Total (in 1,000s)                 5,030
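The "% Unique" column and the totals can be recomputed from the scene counts. The sketch below is a small check, assuming that "% Unique" is the unique count divided by the holding and rounded to a whole percent. It uses only a few rows from the table. Because counts are given in thousands, recomputed values for very small holdings may not match the slide.

```python
def percent_unique(scenes: int, unique: int) -> int:
    """Share of a holding that is unique to the cooperator, in whole percent."""
    return round(100 * unique / scenes)


# (scenes, unique) in thousands, copied from the table above.
holdings = {
    "Europe (ESA)": (2089, 1883),
    "China (CEODE)": (449, 409),
    "Canada (CCRS)": (532, 242),
    "Subtotal": (4919, 3704),
}

for name, (scenes, unique) in holdings.items():
    print(f"{name}: {percent_unique(scenes, unique)}%")

# Subtotal plus the two holdings listed without a unique count.
grand_total = 4919 + 69 + 42
print(grand_total)
```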
11
Landsat 8
Similar requirement for global data from all
receiving stations to be archived and made
freely available to support global science
12
Global Ecosystem Modeling – Data Needs
Global species data
Invasive species – cost in U.S. is billions of
dollars each year – similar in other countries
Global succession data
Global climate records
How do species adapt as climate changes? This
is a global problem that requires global data
sharing
13
Global Hazards – Earthquakes
Locations, epicenters, seismic wave data,
exchanged in real time
Soil effects data
Infrastructure damage
Relief efforts and support depend on data
availability
14
Global Hazards – Volcanism
Volcano locations, eruption histories, types,
distributed in real time
Ash cloud distribution and models
15
Global Hazards – Sea Level Rise
High-resolution global elevation data
ASTER Global DEM (30 m posting) is a start
Need lidar/IfSAR along all coasts
Corresponding population data
(current highest resolution is 30 arc-sec)
Corresponding land cover data
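The reason elevation and population data must correspond can be shown with a simple "bathtub" exposure estimate: sum the population of every cell whose elevation is at or below a given rise. This is only a sketch under stated assumptions. The two grids are hypothetical, already co-registered cell for cell, and the method ignores whether a low cell is actually connected to the sea.

```python
def exposed_population(elevation_m, population, rise_m):
    """Sum population in cells at or below rise_m (simple bathtub model).

    Both grids are lists of rows with identical shape; None marks no data.
    """
    total = 0
    for elev_row, pop_row in zip(elevation_m, population):
        for elev, pop in zip(elev_row, pop_row):
            if elev is not None and elev <= rise_m:
                total += pop
    return total


# Hypothetical 2 x 3 coastal grids (elevation in meters, people per cell).
elevation = [[0.5, 1.2, 3.0],
             [0.8, 2.5, 6.0]]
people = [[100, 200, 50],
          [10, 400, 30]]

print(exposed_population(elevation, people, rise_m=1.0))
```

With population available only at 30 arc-sec, each population cell covers many elevation cells, so a real analysis would first have to resample one grid to the other.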
16
DATA SHARING ISSUES
Volume – multiple global datasets at high resolution
Structure – variety of structures, vector and raster, many
different formats
Semantics – various attribution and relation schemes, some
feature-based, some layer-based
Integration of multiple datasets – for maximum utility all
datasets should be able to be integrated to produce
new data and information
17
Dataset                                                Geometry/Format                     Attribution/Scaling
National Hydrography Dataset (NHD)                     Vector                              Discrete/nominal
National Transportation Dataset                        Vector; tables                      Discrete/nominal
National Boundaries Dataset                            Vector                              Discrete/nominal
National Structures Dataset                            Vector                              Discrete/nominal
Geographic Names Information System (GNIS)             Vector                              Discrete/nominal
National Elevation Dataset (NED)                       Raster                              Continuous/ratio
National Digital Orthophotos                           Raster                              Continuous/interval
National Land Cover Dataset (NLCD)                     Raster                              Discrete/nominal
Global Land Cover Dataset                              Raster                              Discrete/nominal
LiDAR                                                  Point                               Continuous/ratio
Satellite images                                       Raster                              Continuous/interval
Hazards (Earthquakes, Volcanoes)                       Graphics                            Multiple forms
Minerals                                               Vector; text                        Discrete/nominal
Energy                                                 Vector; databases                   Discrete/nominal
Landscapes and Coasts                                  Reports
Astrogeology                                           Databases                           Discrete/nominal
Geologic Map Database                                  Vector; maps; text                  Discrete/nominal
Geologic Data Digital Data Series                      Maps; tables                        Discrete/nominal
National Water Information System                      Graphics; tables                    Continuous/ratio
Floods and High Flow                                   Graphics; tables                    Continuous/ratio
Drought                                                Graphics; tables                    Continuous/ratio
Monthly Stream Flow                                    Graphics; tables                    Continuous/ratio
Ground Water                                           Vector; tables; graphics            Continuous/ratio
Water Quality                                          Graphics; vector; geodatabases      Continuous/ratio
National Biological Information Infrastructure (NBII)  Vector; databases                   Multiple forms
Vegetation Characterization                            Vector; text; video                 Multiple forms
Wildlife                                               Multiple forms                      Multiple forms
Invasive Species                                       Vector; databases; graphics, image  Multiple forms

Dataset                                                URL
National Hydrography Dataset (NHD)                     http://viewer.nationalmap.gov/viewer/nhd.html?p=nhd
National Transportation Dataset                        http://viewer.nationalmap.gov/viewer/
National Boundaries Dataset                            http://viewer.nationalmap.gov/viewer/
National Structures Dataset                            http://viewer.nationalmap.gov/viewer/
Geographic Names Information System (GNIS)             http://geonames.usgs.gov/domestic/download_data.htm
National Elevation Dataset (NED)                       http://viewer.nationalmap.gov/viewer/; http://seamless.usgs.gov/website/seamless/viewer.htm
National Digital Orthophotos                           http://www.ndop.gov/data.html; http://viewer.nationalmap.gov/viewer/
National Land Cover Dataset (NLCD)                     http://gisdata.usgs.net/website/MRLC/viewer.htm; http://viewer.nationalmap.gov/viewer/
Global Land Cover Dataset                              http://landcover.usgs.gov/landcoverdata.php
LiDAR                                                  http://viewer.nationalmap.gov/viewer/
Satellite images                                       http://edcsns17.cr.usgs.gov/NewEarthExplorer/; http://glovis.usgs.gov/
Hazards (Earthquakes, Volcanoes)                       http://earthquake.usgs.gov/hazards/; http://volcanoes.usgs.gov/activity/status.php
Minerals                                               http://mrdata.usgs.gov/; http://tin.er.usgs.gov/mrds/; http://tin.er.usgs.gov/geochem/; http://crustal.usgs.gov/geophysics/index.html
Energy                                                 http://energy.usgs.gov/search.html
Landscapes and Coasts                                  http://geochange.er.usgs.gov/info/holdings.html
Astrogeology                                           http://astrogeology.usgs.gov/DataAndInformation/
Geologic Map Database                                  http://ngmdb.usgs.gov/
Geologic Data Digital Data Series                      http://pubs.usgs.gov/dds/dds-060/
National Water Information System                      http://wdr.water.usgs.gov/nwisgmap/
Floods and High Flow                                   http://waterwatch.usgs.gov/new/index.php?id=ww
Drought                                                http://waterwatch.usgs.gov/new/index.php?id=ww
Monthly Stream Flow                                    http://waterwatch.usgs.gov/new/index.php?id=ww
Ground Water                                           http://waterdata.usgs.gov/nwis/gw/; http://groundwaterwatch.usgs.gov/
Water Quality                                          http://waterdata.usgs.gov/nwis/qw/; http://waterwatch.usgs.gov/wqwatch/
National Biological Information Infrastructure (NBII)  http://www.nbii.gov/portal/server.pt/community/nbii_home/236
Vegetation Characterization                            http://biology.usgs.gov/npsveg/
Wildlife                                               http://www.nwhc.usgs.gov/
Invasive Species                                       http://www.nbii.gov/portal/server.pt/community/invasive_species/221
18
Volunteered Geographic Information/User Generated Content
USGS “Did You Feel It?”
Open Street Map (OSM)
USGS is now researching the use of OSM for its
transportation and structures data
VGI/UGC rivals traditional geospatial data
sources and provides a new basis for data
sharing
19
Technical problems
Compatible data models
Resolution, accuracy issues
Attribution issues – need an ontology that allows matching
across data schemas
Data sharing is more than making data available for
download over the Web
Requires standards
USGS data meets Federal Geographic Data
Committee and Open Geospatial Consortium
standards for metadata and packaging
20
Semantics – Intelligence
USGS is exploring Semantic Web for data
sharing; globally linked data
Requirements:
Ontology of features, attributes, and
relationships: currently being developed.
Semantic Web triple format: Conversion for
selected test areas is in progress.
Uniform Resource Identifiers (URIs) for
individual features, i.e., each geographic
feature has a unique URI
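The three requirements can be pictured together: a feature gets a URI, and a statement about it becomes a subject-predicate-object triple. The sketch below serializes one triple in N-Triples style. It reuses the feature URI pattern and the ogc:hasGeometry predicate that appear in the query later in this presentation. The geometry URI under example.org and the helper names are hypothetical.

```python
NHD_FEATURE_BASE = "http://cegis.usgs.gov/rdf/nhd/featureID#"
OGC = "http://www.opengis.net/rdf#"


def feature_uri(feature_id: int) -> str:
    """Build the URI of one feature from its numeric ID."""
    return f"{NHD_FEATURE_BASE}_{feature_id}"


def triple(subject: str, predicate: str, obj: str) -> str:
    """Serialize one statement with URI subject, predicate, and object."""
    return f"<{subject}> <{predicate}> <{obj}> ."


stream = feature_uri(102217454)
# Hypothetical geometry URI, used only to complete the example.
geometry = "http://example.org/geometry#_102217454"

print(triple(stream, OGC + "hasGeometry", geometry))
```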
21
USGS Semantic Web SPARQL Endpoint for Data Access
http://usgs-ybother.srv.mst.edu:8890/sparql
22
Query – Find the tributaries of West Hunter Creek
Default Graph URI: http://cegis.usgs.gov/rdf/ontologytest/
PREFIX ogc: <http://www.opengis.net/rdf#>
PREFIX fid: <http://cegis.usgs.gov/rdf/nhd/featureID#>
SELECT ?feature ?type
WHERE {
  fid:_102217454 ogc:hasGeometry ?geo1 .
  ?geo1 ogc:touches ?geo2 .
  ?feature ogc:hasGeometry ?geo2 .
  ?feature a ?type
}
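A query like this is normally sent to an endpoint as an HTTP request. The sketch below only builds the request URL and does not contact the server. It assumes the endpoint accepts the standard SPARQL protocol parameters `query` and `default-graph-uri`, and that it may no longer be reachable. The function name is illustrative.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://usgs-ybother.srv.mst.edu:8890/sparql"
DEFAULT_GRAPH = "http://cegis.usgs.gov/rdf/ontologytest/"

QUERY = """\
PREFIX ogc: <http://www.opengis.net/rdf#>
PREFIX fid: <http://cegis.usgs.gov/rdf/nhd/featureID#>
SELECT ?feature ?type
WHERE {
  fid:_102217454 ogc:hasGeometry ?geo1 .
  ?geo1 ogc:touches ?geo2 .
  ?feature ogc:hasGeometry ?geo2 .
  ?feature a ?type
}
"""


def build_request_url(endpoint: str, query: str, default_graph: str) -> str:
    """Percent-encode the query and graph URI into a GET request URL."""
    params = {"default-graph-uri": default_graph, "query": query}
    return endpoint + "?" + urlencode(params)


url = build_request_url(ENDPOINT, QUERY, DEFAULT_GRAPH)

# Decode the URL again to confirm the query survives encoding unchanged.
decoded = parse_qs(urlsplit(url).query)
print(decoded["default-graph-uri"][0])
```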
23
Query Result
http://cegis.usgs.gov/rdf/nhd/featureID#_102216432
http://cegis.usgs.gov/rdf/nhd/featureID#_102216448
http://cegis.usgs.gov/rdf/nhd/featureID#_102216340
http://cegis.usgs.gov/rdf/nhd/featureID#_102216320
http://cegis.usgs.gov/rdf/nhd/featureID#_102217454
http://cegis.usgs.gov/rdf/nhd/featureID#_102216276
http://cegis.usgs.gov/rdf/nhd/featureID#_102216358
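The result list includes the URI of the queried feature itself (_102217454). A client that wants only the other features can drop it and keep the numeric IDs, as in this short sketch. The helper names are illustrative and the URIs are copied from the result above.

```python
QUERIED = "http://cegis.usgs.gov/rdf/nhd/featureID#_102217454"

RESULT_URIS = [
    "http://cegis.usgs.gov/rdf/nhd/featureID#_102216432",
    "http://cegis.usgs.gov/rdf/nhd/featureID#_102216448",
    "http://cegis.usgs.gov/rdf/nhd/featureID#_102216340",
    "http://cegis.usgs.gov/rdf/nhd/featureID#_102216320",
    "http://cegis.usgs.gov/rdf/nhd/featureID#_102217454",
    "http://cegis.usgs.gov/rdf/nhd/featureID#_102216276",
    "http://cegis.usgs.gov/rdf/nhd/featureID#_102216358",
]


def local_id(uri: str) -> str:
    """Return the numeric ID after the '#_' part of a feature URI."""
    return uri.rsplit("#", 1)[1].lstrip("_")


def other_features(uris, queried):
    """IDs of all returned features except the queried one."""
    return [local_id(uri) for uri in uris if uri != queried]


ids = other_features(RESULT_URIS, QUERIED)
print(len(ids), ids[0])
```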
24
Major Challenges for Geospatial Data Sharing with Semantics
Semantic spatial data model
Coordinates on the Semantic Web in RDF
Geospatial feature ontologies
Ontology-driven geospatial operators
Moving multi-GB to TB of data to grid/cloud
Implementing spatial operators on Semantic Web and in
grid/cloud environment
Interfacing Semantic Web and grid/cloud capabilities
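The data-movement challenge can be sized with simple arithmetic: transfer time is the data volume in bits divided by the usable link rate. The sketch below is illustrative only. The 100 Mbit/s link rate and the efficiency factor are assumptions, not figures from the presentation.

```python
def transfer_hours(size_bytes: float, link_bits_per_second: float,
                   efficiency: float = 1.0) -> float:
    """Hours needed to move size_bytes over a link at the given rate.

    efficiency is the usable fraction of the nominal link rate (0 to 1].
    """
    seconds = size_bytes * 8 / (link_bits_per_second * efficiency)
    return seconds / 3600


ONE_TB = 1e12          # bytes
ASSUMED_LINK = 100e6   # bits per second, an assumed rate

print(round(transfer_hours(ONE_TB, ASSUMED_LINK), 1))
```

Under these assumptions one terabyte takes most of a day to move, which is why moving the computation to the data is often considered alongside moving the data itself.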
25