Large-Scale Data Management Challenges Climate, Water, and Weather Data Kenneth Galluppi Director, Disaster and Environmental Programs Renaissance Computing Institute University of North Carolina at Chapel.


Large-Scale Data Management Challenges

Climate, Water, and Weather Data Kenneth Galluppi

Director, Disaster and Environmental Programs Renaissance Computing Institute University of North Carolina at Chapel Hill

NOAA - National Climatic Data Center

Ed Kearns

Goal of Collaborations

• Enable cutting edge, Grand Challenge multidisciplinary science through the federation of data-grids of climate, hydrological, and weather data, with other geospatially and socially relevant datasets.

– Understanding of regional impacts of climate change on water availability and societal trends
– Understanding and prediction of catastrophic weather-driven events under climatic change
– Communicate risk/crisis knowledge to non-specialists

Challenges of Data

• Integration of Large, Multidisciplinary Datasets
– NCDC and NOAA Centers, SDSC, and others
– Discover, access, integration, utility [not store/retrieve]

• Linkage of Datasets to Computational Models
– Input/outputs for real-time model forecasting
– Model-to-observation comparison
– Climatic models for reanalysis and prediction

• Access to Large Reference Data
– Climate Reanalysis Datasets, 1 PetaByte

Collaboration and Datagrids

Academic Research Federal Agencies National Climatic Data Center Research Program Emergency Management

Data Supports NOAA/NCDC Mission

NCDC’s Place in NOAA’s Mission

NOAA Mission: To understand and predict changes in Earth’s environment and conserve and manage coastal and marine resources to meet our nation’s economic, social, and environmental needs

NOAA Goals:

• Climate: Understand Climate Variability and Change to Enhance Society’s Ability to Plan and Respond
• Weather & Water: Serve Society’s Needs for Weather and Water Information
• Commerce & Transportation: Support the Nation’s Commerce with Information for Safe, Efficient, and Environmentally Sound Transportation
• Ecosystems: Protect, Restore, and Manage the Use of Coastal and Ocean Resources through an Ecosystem Approach to Management
• Mission Support: Provide Critical Support for NOAA’s Mission

Data supports NOAA/NCDC Mission

• NCDC will need to function in a wider information landscape with a NOAA Federated Archive
– Support distributed data management and services
• Interoperable with DataNet, Earth System Grid, GEO-IDE, EOSDIS, etc.
– netCDF, LDM, CF conventions, ISO 19115-2
• Move out of the Box and into the Cloud (networked)
– Utilize highly distributed storage and computing (RENCI, Oak Ridge National Lab)
• Implement supporting technologies to enable interoperability with Designated Communities (OGC, WMS/WFS)
• Institute rules-based data management to enable true federation of NOAA Centers of Data
– iRODS
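The idea of rules-based data management named above can be illustrated with a minimal sketch. This is not iRODS rule syntax or any iRODS API. The `Rule` class, the event name, the file name, and the size threshold are all hypothetical, chosen only to show how policies can be written once as data and then applied the same way at every participating center.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical illustration only: this is NOT iRODS rule syntax or its API.

@dataclass
class Rule:
    event: str                          # event that triggers the rule, e.g. "ingest"
    condition: Callable[[Dict], bool]   # test applied to the data object's metadata
    action: Callable[[Dict], str]       # returns a description of the action to take

def apply_rules(rules: List[Rule], event: str, obj: Dict) -> List[str]:
    """Return the actions triggered for a data object on a given event."""
    return [r.action(obj) for r in rules if r.event == event and r.condition(obj)]

# Two example policies (assumed, not taken from the slides).
rules = [
    Rule("ingest", lambda o: o["format"] == "netCDF",
         lambda o: f"extract CF metadata from {o['name']}"),
    Rule("ingest", lambda o: o["size_gb"] > 100,
         lambda o: f"replicate {o['name']} to a second site"),
]

obj = {"name": "reanalysis_1980.nc", "format": "netCDF", "size_gb": 250}
actions = apply_rules(rules, "ingest", obj)
for a in actions:
    print(a)
```

Keeping the policies in a list, separate from the code that applies them, is what would let several data centers share one rule set while storing their data independently.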

The National Environmental Data Archive

Comprehensive Large Array data Stewardship System (CLASS) Storage

(reanalysis)

NOAA’s Data Centers Will Function in a Wider Information Landscape

IPCC ORNL, ESG DAPs NEAAT NSF DataNet Data Mgmt International Sources

Climate Services using Federated DB’s

• NOAA’s Data Centers will need to provide access to petabytes of data that are distributed across multiple NOAA facilities
• Be able to integrate these data with data from other disciplines (environmental, biological, social, etc.) that are distributed on other databases both in the public and private sector domain
• Export data to common data formats: Shapefile, Well-Known Text, Arc/Info ASCII GRID, Gridded and Raw NetCDF, GeoTIFF and KMZ (Google Earth)
• Support:
– Disaster reduction
– Human Health
– Climate
– Water Resources
– Weather
– Ocean Resources
– Agriculture & Land-Use
– Ecosystems
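Of the export formats listed above, Well-Known Text is simple enough to show in a few lines. The sketch below writes a point and a bounding box as WKT strings using only the Python standard library. The function names and the coordinates are assumptions made for illustration and do not come from any NOAA service.

```python
def point_to_wkt(lon: float, lat: float) -> str:
    """Format a longitude/latitude pair as a WKT POINT (x = lon, y = lat)."""
    return f"POINT ({lon} {lat})"

def bbox_to_wkt(west: float, south: float, east: float, north: float) -> str:
    """Format a bounding box as a WKT POLYGON with a closed exterior ring."""
    ring = [(west, south), (east, south), (east, north), (west, north), (west, south)]
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON (({coords}))"

# Hypothetical station location and region of interest.
station_wkt = point_to_wkt(-79.05, 35.91)
region_wkt = bbox_to_wkt(-84, 33, -75, 37)
print(station_wkt)
print(region_wkt)
```

The ring repeats its first vertex at the end, which WKT requires for a polygon to be closed.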

Discipline-Specific View

Atmospheric Observations Land Surface Observation Ocean Observations Space Observations Data Systems

Current systems are program-specific, focused, individually efficient, but incompatible, not integrated, isolated from one another and from the wider environmental community.

Whole-System View

Coordinated, efficient, integrated, interoperable

NOAA/NCDC Climate Services

NCDC-RENCI Potential Use Cases

• Catastrophic Event Modeling and Observations
• Climate Reanalysis Datasets
– Climate records everywhere, for 30 years
– 1 PetaByte
– Regional and local sub-setting
– Tens of thousands of users
• Multi-sensed Gridded Precipitation Climatology
• Extreme Event Climatology
• Green Energy, physical-social science Integration
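The "regional and local sub-setting" use case above amounts to cutting a small window out of a much larger grid, so that a user never has to download the full archive. A minimal sketch follows, assuming a regular latitude/longitude grid held in plain Python lists. The grid, the values, and the function name are invented for illustration; a production system would more likely do this against netCDF files on the server side.

```python
def subset_grid(lats, lons, values, south, north, west, east):
    """Return (lats, lons, values) restricted to an inclusive bounding box."""
    rows = [i for i, lat in enumerate(lats) if south <= lat <= north]
    cols = [j for j, lon in enumerate(lons) if west <= lon <= east]
    return (
        [lats[i] for i in rows],
        [lons[j] for j in cols],
        [[values[i][j] for j in cols] for i in rows],
    )

# Hypothetical 5 x 5 grid; each cell holds 10 * row + column so cells are easy to trace.
lats = [30.0, 32.5, 35.0, 37.5, 40.0]
lons = [-85.0, -82.5, -80.0, -77.5, -75.0]
values = [[10 * i + j for j in range(len(lons))] for i in range(len(lats))]

sub_lats, sub_lons, sub_vals = subset_grid(
    lats, lons, values, south=33.0, north=37.5, west=-82.5, east=-77.5
)
print(sub_lats, sub_lons, sub_vals)
```

Here the request returns 6 of the 25 cells. The same ratio applied to a petabyte-scale archive is what makes serving tens of thousands of users plausible.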

As of October 2009, 1,867,108 sites and 4,336,790,286 data values were available through the HIS from federal, state, and academic data providers.

There were 543,144 “GetValues” data requests from Feb 2008 to Oct 2009.

http://his.cuahsi.org
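To put the HIS figures above in proportion, the short calculation below derives two averages from them. The three counts are the ones quoted in the text. The month count assumes that both February 2008 and October 2009 are included, which the source does not state.

```python
sites = 1_867_108
data_values = 4_336_790_286
get_values_requests = 543_144

# Months from Feb 2008 through Oct 2009, counting both endpoints (an assumption).
months = (2009 - 2008) * 12 + (10 - 2) + 1

values_per_site = data_values / sites
requests_per_month = get_values_requests / months

print(f"{months} months")
print(f"{values_per_site:.0f} data values per site on average")
print(f"{requests_per_month:.0f} GetValues requests per month on average")
```

These are simple averages. They say nothing about how unevenly data values or requests may be spread across sites and months.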

Hydrology Community

High Level View of HIS Service Oriented Architecture

HIS Service Oriented Architecture


Maximize Data Access and Utility

Data and Model Integration Needed to Support Hydrologic Science

DFC Physical Data Observations Hydrologic Models Weather and Climate Models Socioeconomic Data CUAHSI HIS

Meteorology, Hydrology, Ecological Models

ADAS, WRF, CHPS, RHESSys

Scientific Research, Historical Re-Analysis, Disaster Planning, Disaster Response, Agricultural Forecasts, Ag Decision Support, Public Dissemination, Economic Planning, etc.

Sensor Data Bus State Climate Office Sensor Cloud

• National Weather Service
• Department of Transportation / FAA
• USGS NWIS, USFS
• Buoys, Stream Gauges, Soil Moisture
• People with mobile devices
• etc.

Enablement

Use Case: National Water Model

Hydrologic scientists have expressed a “grand research challenge” of building a National Water Model for flood and drought applications.

Flooding in the Mississippi River Basin, August 1993, observed from satellite imagery

Terrain in the Neuse River Basin, NC, constructed from 390 million LiDAR measurements. Source: terrain.cs.duke.edu

Achieving this goal will require a system like DFC to handle the massive data requirements.

Source: nasa.gov

CUAHSI Case Study

• Hydrology Grand Challenge Problem: National Water Model
– How much water is available in the Nation’s water resources?
– Currently, hydrologic models are implemented at the watershed-scale (county)
– Hydrologists plan to scale physically-based models to national level

• Provide CI, Policies & Sustainability for Water Model Data
– Gathering, analysis, dissemination and preservation
– Policies for quality control, metadata harvesting, versioning and usage
– Enables the data required for real-time analysis for flood and drought modeling
– Enables integrating data from “new sources”
– Enables new science, outreach, decision making and disaster recovery
– Integration of Predictive Models, Real-time Data and Historic Data

• Technical Solutions
– Too many systems/solutions, home grown to programs (CUAHSI)
– Standards (ODM, OGC, Virtual USA, etc.)
– Federal enterprises: NOAA, CLASS general, heavy system
– Oracle front end to large tape system

• Unique
– Handling large sets with limited skills
– Multidisciplinary, formats are not enough, but knowledge
– Federal: has to work, has to preserve
– Observation systems are getting more complex
– Users are more sophisticated and demanding more