Data for Climate and Energy Studies

Steven Worley
Computational and Information Systems Laboratory
NCAR
Topics
Scope of the NCAR Research Data Archive (RDA)
Discovery and Access Highlights
User ranked popular datasets
Examples
Near-term service improvements
7 May 2010
NCAR-CSM Symposium on Climate and Energy
Scope of the NCAR Research Data Archive (RDA)
Focus on atmospheric, oceanographic, and related geo-sciences observational data and derived analyses.
• Some weather forecast data
• Do not specialize in climate prediction datasets
Active stewardship program to maintain and grow the RDA for 40+ years.
• Large variety, 600+ datasets, ~400 TB, 4M files
Discovery and Access Highlights
Primary design feature for web portal
• Data Discovery – Find Data!
Discovery and Access Highlights
Multiple Methods - simple to interoperable
1. Find the files in our lists and download
• Through your browser (limit 2 GB)
• We create a ‘wget’ script for you to run in the background on your machine (no limit)
2. You select temporal, spatial, and parameter domains
• We build a file list for you
• Download options as in 1
3. Data is not online to the web but is on archive storage
• We automatically stage data to online, then download
4. You select temporal, spatial, and parameter domains; we build CURL commands, and you get only the grids you select
• About CURL
• Client URL Library functions
• Readily available on Linux OS
• We use HTTPS protocols; others are available
• Applies well to the WMO GRIB data format
• Users modify the CURL commands and script them to perform routine data extractions from the RDA
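A minimal sketch of how a user might script method 4 for routine extractions. The slides do not give the real RDA subset URL or its query parameters, so the endpoint `rda.example.org/subset`, the parameter names (`date`, `param`, `bbox`), and the helper `build_curl_commands` are all hypothetical. Only the curl flags (`--fail`, `-sS`, `-o`) are real.

```python
from datetime import date, timedelta
from urllib.parse import urlencode
import shlex

# Hypothetical endpoint: the real RDA subset URL is not given in the slides.
BASE_URL = "https://rda.example.org/subset"


def build_curl_commands(start, end, parameters, bbox, base_url=BASE_URL):
    """Return one curl command string per day in the inclusive range [start, end]."""
    commands = []
    day = start
    while day <= end:
        # Query parameter names are illustrative assumptions, not the RDA API.
        query = urlencode({
            "date": day.isoformat(),
            "param": ",".join(parameters),
            "bbox": ",".join(str(v) for v in bbox),  # west, south, east, north
        })
        url = f"{base_url}?{query}"
        outfile = f"subset_{day:%Y%m%d}.grb"
        # --fail: non-zero exit status on HTTP errors; -sS: quiet, but show errors
        commands.append(f"curl --fail -sS -o {outfile} {shlex.quote(url)}")
        day += timedelta(days=1)
    return commands


if __name__ == "__main__":
    for cmd in build_curl_commands(date(2010, 5, 1), date(2010, 5, 3),
                                   ["TMP", "UGRD"], (-110, 35, -100, 45)):
        print(cmd)
```

Writing the printed commands to a shell script and running it on a schedule would give the kind of routine extraction the slide describes. Authentication is omitted here because the slides do not describe it.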
User ranked popular datasets
Top 30 datasets/groups FY09
~6000 Unique Users Annually
(Users = unique users, FY09)

Users  Datasets                                     Title
 2878  ds082.0, ds083.2, ds083.0                    NCEP FNL Operational Model Global Tropospheric Analyses
  924  ds090.0                                      NCEP/NCAR Global Reanalysis Products
  510  ds758.0, ds759.3, ds759.2                    NGDC Global 2' and 5' Elevations, USGS 30 ARC-second
  477  ds461.0, ds351.0, ds337.0, ds464.0, ds353.4  NCEP ADP/PREPBUFR Global Surface and Upper Air Observations
  358  ds608.0                                      NCEP North American Regional Reanalysis (NARR)
  264  ds609.2                                      GCIP NCEP ETA model output
  262  ds540.1, ds540.0                             International Comprehensive Ocean-Atmosphere Data Set (ICOADS)
  190  ds744.4                                      QSCAT/NCEP Blended Ocean Winds
  173  ds277.0                                      NCEP V2.0 OI Global SST, V3.0 Extended Reconstructed Analyses
  153  ds335.0, ds336.0                             Unidata (IDD) Observations and Model Data
  106  ds091.0                                      NCEP/DOE Reanalysis II
  106  ds552.1, ds552.0, ds556.0                    River Discharge Data
   91  ds277.3                                      Hadley Centre Global Sea Ice and Sea Surface Temperature (HadISST)
   89  ds824.1, ds330.3                             Global Tropical Cyclone "Best Track" Position and Intensity Data, TIGGE Cyclone Tracks
   72  ds570.0                                      World Monthly Surface Station Climatology
   69  ds314.0                                      Global Meteorological Forcing Dataset for Land Surface Modeling
   68  ds900.0                                      U.S. AFGWC Station (Surface and Upper Air) Library
   61  ds260.3                                      NOCS Surface Flux Dataset v2.0
   58  ds285.3                                      Japanese Subsurface Temperature And Salinity Analyses V6.7
   56  ds512.0                                      CPC Global Summary of Day/Month Observations
   56  ds625.0                                      Japanese 25-year Reanalysis Project
   55  ds578.1, ds485.0                             China Monthly Station Precipitation and Temperature, Daily Precip. and Monthly Soil Temperature
   53  ds285.0                                      World Ocean Database and World Ocean Atlas
   47  ds770.0                                      GISS Soil and Surface Slope
   45  ds215.0                                      Global Monthly Surface Temperature Anomalies (1856-2005), Precipitation (1900-1998), and Sea Level Pressure (1873-2000) from the University of East Anglia Climatic Research Unit
   42  ds277.7                                      NOAA OI 1/4 Degree Daily SST Analysis
   42  ds330.2                                      TIGGE Near Real-time
   40  ds472.0                                      TDL U.S. and Canada Surface Hourly Observations
   36  ds232.2                                      Scatterometer Climatology of Ocean Winds
   32  ds131.1, ds131.0                             NOAA-CIRES Twentieth Century Global Reanalysis Version I and II
   30  ds260.2                                      CORE.2 Global Air-Sea Flux Dataset
   27  ds885.1                                      NCDC TD9640 U.S. Palmer Drought Indices
   25  ds627.0                                      ERA-Interim Project
   25  ds510.0                                      NCDC TD3200 U.S. Cooperative Summary of Day
   24  ds564.0                                      Global Historical Climatology Network (GHCN) Temperature, Precipitation, Pressure
 5921  All DSS datasets                             All Datasets
One example
Final Global Analysis from NOAA/NCEP
• 4x Daily
• Updated in the RDA 1x/day
• 1° horizontal resolution
• 26 vertical pressure levels, plus surface
• Series starts in 1999
• Over 55 parameter fields
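To give a feel for the data volume these numbers imply, here is a rough arithmetic sketch. It assumes a regular global latitude-longitude grid that includes both poles, which the slide does not state; the function name `grid_points` is illustrative only.

```python
def grid_points(resolution_deg):
    """Points on a global regular lat-lon grid that includes both poles (assumed layout)."""
    n_lon = round(360 / resolution_deg)      # longitudes wrap around, so no duplicate column
    n_lat = round(180 / resolution_deg) + 1  # +1 because both poles are counted
    return n_lon * n_lat


ANALYSES_PER_DAY = 4   # from the slide: 4x daily
PRESSURE_LEVELS = 26   # from the slide; surface fields are extra

points = grid_points(1.0)                       # 360 * 181 grid points per level
values_per_3d_field = points * PRESSURE_LEVELS  # one parameter on all pressure levels
analyses_per_year = ANALYSES_PER_DAY * 365      # non-leap year

print(points, values_per_3d_field, analyses_per_year)
```

Under these assumptions a single 1° field has 65,160 points per level, which is part of why selecting only the needed grids, as in access method 4, can matter for download size.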
Re-analyses
Table 1: Global atmospheric and oceanographic reanalyses are one of many valuable data resources provided by external organizations that employ the expertise of RDA consultants. Those listed are the most recent major reanalyses available in the Research Data Archive. Most time periods are ongoing; that is, providers continue to produce the products going forward in time. In general, all reanalyses are also available at lower temporal and horizontal resolutions than those shown in the table. Most reanalyses also have variables on vertical model coordinate levels, as well as large numbers of surface-specific fields and vertically integrated values.
http://www.earthobservations.org/documents/geonewsletter/art008001_trenberth_article.pdf
Near-term service improvements
• Current and soon-to-be workflow
HPC User Community
Advantages:
• Access to full RDA
• Fast computing
• No login required
Disadvantages:
• No access to online data
• Use MSS as a file server
• No direct access to RDA metadata
• No direct access to RDA data processing services
• Require separate account to access RDA web server

Complete User Community
Advantages:
• Fast access to online data – limited part of RDA
• Access to all RDA content metadata
• Access to RDA data processing services
Disadvantages:
• Slow access to MSS data – delayed mode
• Have to create a separate RDA account and log in
• Data processing requests take a long time to finish
• Slow download speeds for some users

Complete User Community
Improvements:
• Fast access to full RDA
• Expanded data processing services available
• Single CISL account - no separate RDA account
• Faster download speeds – grid-based tools, e.g. GRIDFTP
• Single “first point of contact” for user support

HPC User Community
Improvements:
• Fast access to full RDA
• Access to all RDA content metadata
• Access to RDA data processing services
• Single CISL account
• Single “first point of contact”
Resolved all the disadvantages

New Challenges:
• GPFS and HPSS don’t have generic file use logging
  - Need for metrics & services
• HPSS doesn’t have sophisticated file access control
  - Some RDA assets have limited access policies
• Abandon a functional RDA registration system – retool a 20K+ user DB
Of course, there will be more!
Big transition while maintaining RDA content building and services
End
• Scope of the NCAR Research Data Archive (RDA)
• Discovery and Access Highlights
• User ranked popular datasets
• Examples
• Near-term service improvements
http://dss.ucar.edu/