The Fifth China - U.S. Roundtable on Scientific Data Cooperation
Global Science Needs Global Data
A Case for Data Sharing
E. Lynn Usery
U.S. Department of the Interior
U.S. Geological Survey
[email protected]
http://cegis.usgs.gov

Objectives
  I will focus on the need for complete global geospatial datasets at high resolution to support global science modeling and analysis
  Example Science Issues and Global Data Needs
    Global Climate and Land Cover Change
    Global Ecosystem Modeling
    Global Hazards
      Earthquakes
      Sea Level Rise
      Volcanism
  Research on Semantic Web; platform for data sharing

USGS Science Strategy
  http://www.usgs.gov/science_strategy/

USGS Science
  Understanding Ecosystems and Predicting Ecosystem Change
  Climate Variability and Change
  Energy and Minerals for America’s Future
  A National Hazards, Risk, and Resilience Assessment Program
  The Role of Environment and Wildlife in Human Health
  A Water Census of the United States

USGS Science: Data Integration and Beyond
  The USGS will use its information resources to create a more integrated and accessible environment for its vast resources of past and future data. It will invest in cyberinfrastructure, nurture and cultivate programs in natural-science informatics, and participate in efforts to build a global integrated science and computing platform.

Global Climate and Land Cover Change Data Needs – One Example
  High resolution (30 m or smaller pixels) satellite images for land cover extraction
    The U.S. has a Landsat archive, but it does not include all scenes from non-U.S.-based receiving stations
  Extracted land cover
    Classes must match, i.e., same classification system and same level of detail

Worldwide Usage of Landsat Imagery
  [Figure: worldwide usage of Landsat imagery; surviving label "1M"]

Online Data Search, Browse and Order Tools
  Earth Explorer: http://earthexplorer.usgs.gov
  GLOVIS: http://glovis.usgs.gov

Landsat International Data Usage (FY10)
Top 20 Countries (Excludes United States)
  MALAYSIA              4793
  COLOMBIA              5091
  ARGENTINA             5729
  NETHERLANDS           6962
  CONGO                 9124
  ITALY                 9167
  FRANCE                10107
  KOREA, REPUBLIC OF    11265
  INDIA                 14042
  KAZAKHSTAN            14146
  JAPAN                 16525
  UNITED KINGDOM        22074
  GERMANY               23185
  CANADA                24850
  MEXICO                25604
  INDONESIA             29234
  AUSTRALIA             37487
  SPAIN                 41108
  RUSSIAN FEDERATION    60257
  CHINA                 104734

Landsat Global Archive Consolidation (LGAC)
  Goal is to consolidate the entire Landsat archive
    5 million scenes held internationally vs. 2 million in the USGS archive
    From current stations as well as historical stations
    Each station has data that will enhance the USGS archive
  Enables scientific analysis of the most complete time series of images for global land change
  Facilitates large-scale scene selection and data mining capability
  Recovers data not currently available to users
    Some data at risk due to aging media and drive obsolescence
  Provides data to the global user community as a standard product, like current Landsat data from the U.S. archive

LGAC International Data Holdings
  ICs                       Scenes (in 1,000s)   Unique to IC   % Unique
  Europe (ESA)              2,089                1,883          90%
  Australia (GA-NEO)        629                  463            74%
  Canada (CCRS)             532                  242            45%
  China (CEODE)             449                  409            91%
  Japan (RESTEC)            275                  261            95%
  Brazil (INPE)             234                  210            90%
  Thailand (GISTDA)         90                   77             86%
  Germany (DLR)             87                   61             70%
  India (NRSC)              51                   50             98%
  Ecuador (CLIRSEN)         40                   30             75%
  Pakistan (SUPARCO)        24                   20             84%
  Japan (HIT)               21                   7              35%
  Indonesia (LAPAN)         8                    3              63%
  Argentina (CONAE)         253                  162            64%
  South Africa (CSIR-SAC)   137                  122            89%
  Subtotal                  4,919                3,704          75%
  Saudi Arabia (KACST)      69
  Puerto Rico (UPR)         42
  Grand Total (in 1,000s)   5,030

Landsat 8
  Similar requirement for global data from all receiving stations to be archived and made freely available to support global science

Global Ecosystem Modeling – Data Needs
  Global species data
    Invasive species – cost in U.S. is billions of dollars each year – similar in other countries
  Global succession data
  Global climate records
  How do species adapt as climate changes? This is a global problem and requires global data sharing

Global Hazards – Earthquakes
  Locations, epicenters, seismic wave data, exchanged in real time
  Soil effects data
  Infrastructure damage
  Relief effort and support depend on data availability

Global Hazards – Volcanism
  Volcano locations, eruption histories, types, distributed in real time
  Ash cloud distribution and models

Global Hazards – Sea Level Rise
  High-resolution global elevation
    ASTER Global DEM (15 m resolution) is a start
    Need lidar/IFSAR along all coasts
  Corresponding population data (current highest resolution is 30 arc-sec)
  Corresponding land cover data

DATA SHARING ISSUES
  Volume – multiple global datasets at high resolution
  Structure – variety of structures, vector and raster, many different formats
  Semantics – various attribution and relation schemes, some feature-based, some layers
  Integration of multiple datasets – for maximum utility all datasets should be able to be integrated to produce new data and information

  Dataset                                                 Geometry/Format                     Attribution/Scaling
  National Hydrography Dataset (NHD)                      Vector                              Discrete/nominal
  National Transportation Dataset                         Vector; tables                      Discrete/nominal
  National Boundaries Dataset                             Vector                              Discrete/nominal
  National Structures Dataset                             Vector                              Discrete/nominal
  Geographic Names Information System (GNIS)              Vector                              Discrete/nominal
  National Elevation Dataset (NED)                        Raster                              Continuous/ratio
  National Digital Orthophotos                            Raster                              Continuous/interval
  National Land Cover Dataset (NLCD)                      Raster                              Discrete/nominal
  Global Land Cover Dataset                               Raster                              Discrete/nominal
  LiDAR                                                   Point                               Continuous/ratio
  Satellite images                                        Raster                              Continuous/interval
  Hazards (Earthquakes, Volcanoes)                        Graphics                            Multiple forms
  Minerals                                                Vector; text                        Discrete/nominal
  Energy                                                  Vector; databases                   Multiple forms
  Landscapes and Coasts                                   Reports                             Discrete/nominal
  Astrogeology                                            Databases                           Discrete/nominal
  Geologic Map Database                                   Vector; maps; text                  Discrete/nominal
  Geologic Data Digital Data Series                       Maps; tables                        Discrete/nominal
  National Water Information System                       Graphics; tables                    Continuous/ratio
  Floods and High Flow                                    Graphics; tables                    Continuous/ratio
  Drought                                                 Graphics; tables                    Continuous/ratio
  Monthly Stream Flow                                     Graphics; tables                    Continuous/ratio
  Ground Water                                            Vector; tables; graphics            Continuous/ratio
  Water Quality                                           Graphics; vector; geodatabases      Continuous/ratio
  National Biological Information Infrastructure (NBII)   Vector; databases                   Multiple forms
  Vegetation Characterization                             Vector; text; video                 Multiple forms
  Wildlife                                                                                    Multiple forms
  Invasive Species                                        Vector; databases; graphics, image  Multiple forms

  Dataset URL (in slide order)
    http://viewer.nationalmap.gov/viewer/nhd.html?p=nhd
    http://viewer.nationalmap.gov/viewer/
    http://gisdata.usgs.net/website/MRLC/viewer.htm
    http://viewer.nationalmap.gov/viewer/
    http://viewer.nationalmap.gov/viewer/
    http://geonames.usgs.gov/domestic/download_data.htm
    http://viewer.nationalmap.gov/viewer/
    http://seamless.usgs.gov/website/seamless/viewer.htm
    http://www.ndop.gov/data.html; http://viewer.nationalmap.gov/viewer/
    http://gisdata.usgs.net/website/MRLC/viewer.htm
    http://viewer.nationalmap.gov/viewer/
    http://gisdata.usgs.net/website/MRLC/viewer.htm
    http://landcover.usgs.gov/landcoverdata.php
    http://viewer.nationalmap.gov/viewer/
    http://edcsns17.cr.usgs.gov/NewEarthExplorer/; http://glovis.usgs.gov/
    http://earthquake.usgs.gov/hazards/; http://volcanoes.usgs.gov/activity/status.php
    http://mrdata.usgs.gov/; http://tin.er.usgs.gov/mrds/
    http://tin.er.usgs.gov/geochem/; http://crustal.usgs.gov/geophysics/index.html
    http://energy.usgs.gov/search.html
    http://geochange.er.usgs.gov/info/holdings.html
    http://astrogeology.usgs.gov/DataAndInformation/
    http://ngmdb.usgs.gov/
    http://pubs.usgs.gov/dds/dds-060/
    http://wdr.water.usgs.gov/nwisgmap/
    http://waterwatch.usgs.gov/new/index.php?id=ww
    http://waterwatch.usgs.gov/new/index.php?id=ww
    http://waterwatch.usgs.gov/new/index.php?id=ww
    http://waterdata.usgs.gov/nwis/gw/; http://groundwaterwatch.usgs.gov/
    http://waterdata.usgs.gov/nwis/qw/; http://waterwatch.usgs.gov/wqwatch/
    http://www.nbii.gov/portal/server.pt/community/nbii_home/236
    http://biology.usgs.gov/npsveg/
    http://www.nwhc.usgs.gov/
    http://www.nbii.gov/portal/server.pt/community/invasive_species/221

Volunteered Geographic Information/User Generated Content
  USGS “Did You Feel It?”
  Open Street Map (OSM)
    USGS is now researching use of OSM for our transportation and structures data
  VGI/UGC rivals traditional geospatial data sources and provides a new basis for data sharing

Technical problems
  Compatible data models
  Resolution, accuracy issues
  Attribution issues – need ontology that allows matching across data schema
  Data sharing is more than making data available for download over the Web
  Requires standards
    USGS data meets Federal Geographic Data Committee and Open Geospatial Consortium standards for metadata and packaging

Semantics – Intelligence
  USGS is exploring Semantic Web for data sharing; globally linked data
  Requirements:
    Ontology of features, attributes, and relationships: currently being developed.
    Semantic Web triple format: Conversion for selected test areas is in progress.
    Uniform Resource Identifiers (URIs) for individual features, i.e., each geographic feature has a unique URI

USGS Semantic Web SPARQL Endpoint for Data Access
  http://usgs-ybother.srv.mst.edu:8890/sparql

Query – Find the tributaries of West Hunter Creek
  Default Graph URI: http://cegis.usgs.gov/rdf/ontologytest/

  PREFIX ogc: <http://www.opengis.net/rdf#>
  PREFIX fid: <http://cegis.usgs.gov/rdf/nhd/featureID#>
  SELECT ?feature ?type
  WHERE {
    fid:_102217454 ogc:hasGeometry ?geo1 .
    ?geo1 ogc:touches ?geo2 .
    ?feature ogc:hasGeometry ?geo2 .
    ?feature a ?type
  }

Query Result
  http://cegis.usgs.gov/rdf/nhd/featureID#_102216432
  http://cegis.usgs.gov/rdf/nhd/featureID#_102216448
  http://cegis.usgs.gov/rdf/nhd/featureID#_102216340
  http://cegis.usgs.gov/rdf/nhd/featureID#_102216320
  http://cegis.usgs.gov/rdf/nhd/featureID#_102217454
  http://cegis.usgs.gov/rdf/nhd/featureID#_102216276
  http://cegis.usgs.gov/rdf/nhd/featureID#_102216358

Major Challenges for Geospatial Data Sharing with Semantics
  Semantic spatial data model
  Coordinates on the Semantic Web in RDF
  Geospatial feature ontologies
  Ontology-driven geospatial operators
  Moving multi-GB to TB of data to grid/cloud
  Implementing spatial operators on Semantic Web and in grid/cloud environment
  Interfacing Semantic Web and grid/cloud capabilities
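As a rough sketch of how the West Hunter Creek tributary query from the SPARQL slides might be submitted programmatically, the Python below packages the query and the default graph URI into a standard SPARQL Protocol GET request. The helper names build_sparql_request and feature_uris are hypothetical, the JSON result format is an assumption about what the endpoint supports, and the endpoint address is simply the one listed on the slide, which may no longer be reachable. No network call is made here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

# Endpoint and default graph as listed on the slides (may no longer be live).
ENDPOINT = "http://usgs-ybother.srv.mst.edu:8890/sparql"
DEFAULT_GRAPH = "http://cegis.usgs.gov/rdf/ontologytest/"

# The tributary query from the slides, reformatted only for whitespace.
TRIBUTARY_QUERY = """\
PREFIX ogc: <http://www.opengis.net/rdf#>
PREFIX fid: <http://cegis.usgs.gov/rdf/nhd/featureID#>
SELECT ?feature ?type
WHERE {
  fid:_102217454 ogc:hasGeometry ?geo1 .
  ?geo1 ogc:touches ?geo2 .
  ?feature ogc:hasGeometry ?geo2 .
  ?feature a ?type
}
"""


def build_sparql_request(endpoint, query, default_graph=None):
    """Hypothetical helper: build a SPARQL Protocol GET request.

    'query' and 'default-graph-uri' are the parameter names defined by
    the SPARQL Protocol; the Accept header asks for JSON results, which
    is an assumption about the endpoint's capabilities.
    """
    params = {"query": query}
    if default_graph:
        params["default-graph-uri"] = default_graph
    url = endpoint + "?" + urlencode(params)
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def feature_uris(results):
    """Hypothetical helper: pull ?feature values out of a decoded
    SPARQL JSON results document."""
    return [row["feature"]["value"] for row in results["results"]["bindings"]]


request = build_sparql_request(ENDPOINT, TRIBUTARY_QUERY, DEFAULT_GRAPH)

# Inspect what would be sent, without contacting the server.
sent = parse_qs(urlsplit(request.full_url).query)
print(sorted(sent))  # ['default-graph-uri', 'query']
```

If the endpoint responds, passing the request to urllib.request.urlopen and the decoded JSON body to feature_uris would be expected to return feature URIs of the kind shown on the Query Result slide.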