LEAD - Unidata

LEAD: An Overview for the
Unidata Users Committee
7 October 2004
Linked Environments for Atmospheric Discovery
Motivation for LEAD
• Each year, mesoscale weather – floods, tornadoes,
hail, strong winds, lightning, and winter storms –
causes hundreds of deaths, routinely disrupts
transportation and commerce, and results in annual
economic losses > $13B.
What Would You Do???
The Roadblock
• The study of events responsible for these losses is
stifled by rigid information technology frameworks
that cannot accommodate the
– real time, on-demand, and dynamically-adaptive needs of
mesoscale weather research;
– its disparate, high volume data sets and streams; and
– its tremendous computational demands, which are
among the greatest in all areas of science and
engineering
The LEAD Goal
Provide the IT necessary to allow
People (scientists, students, operational
practitioners)
and
Technologies (models, sensors, data
mining)
TO INTERACT WITH WEATHER
Traditional Methodology
• STATIC OBSERVATIONS: Radar Data; Mobile Mesonets; Surface Observations; Upper-Air Balloons; Commercial Aircraft; Geostationary and Polar Orbiting Satellite; Wind Profilers; GPS Satellites
• Analysis/Assimilation: Quality Control; Retrieval of Unobserved Quantities; Creation of Gridded Fields
• Prediction/Detection: PCs to Teraflop Systems
• Product Generation, Display, Dissemination
• End Users: NWS; Private Companies; Students
• The Process is Entirely Serial and Static (Pre-Scheduled): No Response to the Weather!
Current WRF Capability
The Consequence: Current
Prediction Models
The Desired Approach: Dynamic
Adaptivity
Grid spacings shown: 20 km CONUS Ensembles; 10 km; 3 km; 1 km
The Limitations of Today’s Research
Environment: Example #2
• Mesoscale forecast models
are being run by universities,
in real time, at dozens of sites
around the country, often in
collaboration with local NWS
offices
– Tremendous value
– Leading to the notion of “distributed” NWP
• Yet only a few (OU, U of Wash, Utah)
are actually assimilating local
observations – which is one of
the fundamental reasons for
such models!
•Applied Modeling Inc. (Vietnam) MM5
•Atmospheric and Environmental Research MM5
•Colorado State University RAMS
•Florida Division of Forestry MM5
•Geophysical Institute of Peru MM5
•Hong Kong University of Science and Technology MM5
•IMTA/SMN, Mexico MM5
•India's NCMRWF MM5
•Iowa State University MM5
•Jackson State University MM5
•Korea Meteorological Administration MM5
•Maui High Performance Computing Center MM5
•MESO, Inc. MM5
•Mexico / CCA-UNAM MM5
•NASA/MSFC Global Hydrology and Climate Center, Huntsville, AL MM5
•National Observatory of Athens MM5
•Naval Postgraduate School MM5
•Naval Research Laboratory COAMPS
•National Taiwan Normal University MM5
•NOAA Air Resources Laboratory RAMS
•NOAA Forecast Systems Laboratory LAPS, MM5, RAMS
•NCAR/MMM MM5
•North Carolina State University MASS
•Environmental Modeling Center of MCNC MM5
•NSSL MM5
•NWS-BGM MM5
•NWS-BUF (COMET) MM5
•NWS-CTP (Penn State) MM5
•NWS-LBB RAMS
•Ohio State University MM5
•Penn State University MM5
•Penn State University MM5 Tropical Prediction System
•RED IBERICA MM5 (Consortium of Iberian modelers) MM5
•Saint Louis University MASS
•State University of New York - Stony Brook MM5
•Taiwan Civil Aeronautics Administration MM5
•Texas A&M University MM5
•Technical University of Madrid MM5
•United States Air Force, Air Force Weather Agency MM5
•University of L'Aquila MM5
•University of Alaska MM5
•University of Arizona / NWS-TUS MM5
•University of British Columbia UW-NMS/MC2
•University of California, Santa Barbara MM5
•Universidad de Chile, Department of Geophysics MM5
•University of Hawaii MM5
•University of Hawaii RSM
•University of Hawaii MM5
•University of Illinois MM5, workstation Eta, RSM, and WRF
•University of Maryland MM5
•University of Northern Iowa Eta
•University of Oklahoma/CAPS ARPS
•University of Utah MM5
•University of Washington MM5 36km, 12km, 4km
•University of Wisconsin-Madison UW-NMS
•University of Wisconsin-Madison MM5
•University of Wisconsin-Milwaukee MM5
The LEAD Vision: No Longer Serial or Static
• STATIC OBSERVATIONS: Radar Data; Mobile Mesonets; Surface Observations; Upper-Air Balloons; Commercial Aircraft; Geostationary and Polar Orbiting Satellite; Wind Profilers; GPS Satellites
• Analysis/Assimilation: Quality Control; Retrieval of Unobserved Quantities; Creation of Gridded Fields
• Prediction/Detection: PCs to Teraflop Systems
• Product Generation, Display, Dissemination
• End Users: NWS; Private Companies; Students
The LEAD Vision: No Longer Serial or Static
• DYNAMIC OBSERVATIONS
• Analysis/Assimilation: Quality Control; Retrieval of Unobserved Quantities; Creation of Gridded Fields
• Prediction/Detection: PCs to Teraflop Systems
• Product Generation, Display, Dissemination
• End Users: NWS; Private Companies; Students
LEAD: Users INTERACTING with Weather
Interaction Level I
Diagram components: Mesoscale Weather; NWS National Static Observations & Grids; Local Observations; Virtual/Digital Resources and Services; Tools (ADaM, ADAS); Users; MyLEAD; Portal; Remote Physical (Grid) Resources; Local Physical Resources
LEAD: Users INTERACTING with Weather
Interaction Level II
Diagram components: Mesoscale Weather; NWS National Static Observations & Grids; Experimental Dynamic Observations; Local Observations; Virtual/Digital Resources and Services; Tools (ADaM, ADAS); Users; MyLEAD; Portal; Remote Physical (Grid) Resources; Local Physical Resources
The LEAD Goal
• To create an integrated, scalable framework that
allows analysis tools, forecast models, and data
repositories to be used as dynamically adaptive,
on-demand systems that can
– change configuration rapidly and automatically in
response to weather;
– continually be steered by new data (i.e., the weather);
– respond to decision-driven inputs from users;
– initiate other processes automatically;
– steer remote observing technologies to optimize data
collection for the problem at hand; and
– operate independently of data formats and the physical
location of data or computing resources.
The LEAD Foundation
WOORDS
Workflow Orchestration for On-Demand, Real-time,
Dynamically-Adaptive Systems
And This Means…
• Workflow Orchestration -- The automation of a process, in whole or
part, during which tasks or information are passed from one or more
components of a system to others -- for specific action -- according to
a set of procedural rules.
• On-Demand – The capability to perform action immediately, with or
without prior planning or notification.
• Real-Time -- The transmission or receipt of information about an event
nearly simultaneously with its occurrence, or the processing of data
immediately upon receipt or request.
• Dynamically-Adaptive – The ability of a system, or any of its
components, to respond automatically, in a coordinated manner, to
both internal and external influences in a manner that optimizes
overall system performance.
• System – A group of independent but interrelated components that
operate in a unified holistic manner.
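The definitions above can be made concrete with a small sketch. It is illustrative only: the task names, the 50 dBZ threshold, and the rule table are hypothetical and are not taken from LEAD. It shows tasks passing information to one another according to procedural rules, with one rule that changes the path automatically in response to the data (the 20 km and 3 km grid spacings echo the values on the dynamic-adaptivity slide).

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

State = Dict[str, Any]

# Hypothetical tasks: each receives the shared state and returns an updated copy.
def decode(state: State) -> State:
    return {**state, "decoded": True}

def analyze(state: State) -> State:
    # Assumption: 50 dBZ is an arbitrary illustrative threshold, not a LEAD setting.
    return {**state, "storm_detected": state["max_reflectivity_dbz"] >= 50}

def coarse_forecast(state: State) -> State:
    return {**state, "forecast_grid_km": 20}

def fine_forecast(state: State) -> State:
    return {**state, "forecast_grid_km": 3}

TASKS: Dict[str, Callable[[State], State]] = {
    "decode": decode,
    "analyze": analyze,
    "coarse_forecast": coarse_forecast,
    "fine_forecast": fine_forecast,
}

def next_task(finished: str, state: State) -> Optional[str]:
    """Procedural rules: name the task that follows the one just finished."""
    if finished == "decode":
        return "analyze"
    if finished == "analyze":
        # The adaptive step: the data decide which forecast runs next.
        return "fine_forecast" if state["storm_detected"] else "coarse_forecast"
    return None  # no rule applies, so the workflow ends

def run_workflow(state: State, start: str = "decode") -> Tuple[State, List[str]]:
    """Run tasks in the order the rules dictate; return final state and path taken."""
    path: List[str] = []
    current: Optional[str] = start
    while current is not None:
        state = TASKS[current](state)
        path.append(current)
        current = next_task(current, state)
    return state, path
```

Calling `run_workflow({"max_reflectivity_dbz": 55})` follows the fine-grid branch, while a value below the threshold follows the coarse-grid branch. Keeping the rules in one function separate from the tasks is a design choice for the sketch, so that the routing can change without touching any task.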
LEAD is a Unique “Poster Child”
Because it is End-to-End and Contains Just
About Everything in Cyberinfrastructure
• Collection of data by remote sensors
• Analysis and prediction of physical phenomena
• Huge data sets and streaming data
• Visualization
• On-demand, real time, dynamic adaptability
• Resource prediction and scheduling
• Fault tolerance
• Remote and local resource usage
• Interoperability
• Grid and Web services
• Personal virtual spaces
• Education
• An extremely broad user base (students, researchers, operational
practitioners) that is in place
• A long-standing mechanism for deployment (Unidata)
LEAD Grid and Web Services
Testbeds
• There will be five testbed sites at
– Unidata
– University of Oklahoma
– Indiana
– University of Alabama, Huntsville
– NCSA/University of Illinois
• They will be using the Grid framework
• Initially, it will be a nearly homogeneous
environment, running the same software
stack
Diagram: LEAD sub-systems and their components
• User Sub-System: Portal; MyLEAD Workspace; Geo-reference GUI
• Tools Sub-System: ADaM; ADAS; WRF; IDV; Detection Algorithms; User-Specified
• Data Sub-System: Personal Catalogs; Servers and Live Feeds; THREDDS Catalogs; Storage; Semantics & Interchange Technologies; Controllable Devices
• Orchestration Sub-System: Workflow GUI; Workflow Engine; Task Design; Allocation & Scheduling; Monitoring; Estimation
• Local Resources and Services
• Grid Resources and Services
• Grid and Web Services Test Beds
The Grid
• Refers to an infrastructure that enables the integrated,
collaborative use of computers, networks, databases, and
scientific instruments owned and managed by distributed
organizations.
• The terminology originates from a crude analogy to the
electrical power grid: most users do not care about the details
of power generation, distribution, etc., but their appliances
work when plugged into the socket.
• Grid applications often involve large amounts of data and/or
computing and require secure resource sharing across
organizational boundaries.
• Grid services are essentially web services running in a Grid
framework.
LEAD CS/IT Research
• Workflow orchestration – the construction and scheduling of
execution task graphs with data sources drawn from real-time
sensor streams and outputs
• Data streaming – to support robust, high bandwidth transmission
of multi-sensor data.
• Distributed monitoring and performance evaluation -- to enable soft
real-time performance guarantees by estimating resource behavior.
• Data management – for storage and cataloging of observational
data, model output and results from data mining.
• Data mining tools – that detect faults, allow incremental processing
(interrupt / resume), and estimate run time and memory
requirements based on properties of the data (e.g., number of
samples, dimensionality).
• Semantic and data interchange technologies – to enable use of
heterogeneous data by diverse tools and applications.
LEAD Meteorology Research
• ARPS Data Assimilation System (ADAS) for the WRF model – adaptation of
the CAPS ADAS to the WRF model to allow users to assimilate a wide
variety of observations in real time, especially those obtained locally
• Orchestration system for the WRF model – to allow users to manage flows
of data, model execution streams, creation and mining of output, and
linkages to other software and processes for continuous or on-demand
application, including steering of remote observing systems
• Fault tolerance in the WRF model for on-demand, interrupt-driven
utilization – to accommodate interrupts in streaming data and user
execution commands
• Continuous model updating – to allow numerical models to be steered
continually by observations and thus be dynamically responsive to them
• Hazardous weather detection – to identify hazardous features in gridded
forecasts and assimilated data sets, using data mining technologies, for
comparison with sensor-only approaches
• Storm-scale ensemble forecasting – to create multiple, concurrently valid
forecasts from slightly different initial conditions, from different models, or
by using different options within the same or multiple models.
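As a hedged illustration of the ensemble idea in the last item, the sketch below runs one toy "model" many times from slightly different initial conditions. The model is a logistic map chosen only because it is sensitive to initial conditions; it is a stand-in, not WRF or any LEAD component, and the function names, spread, and seed are assumptions made for the example.

```python
import random
from typing import List

def toy_model(x0: float, steps: int, r: float = 3.9) -> float:
    """Stand-in 'forecast model': iterate the logistic map x -> r*x*(1-x)."""
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x

def ensemble_forecast(x0: float, members: int, steps: int,
                      spread: float = 1e-4, seed: int = 0) -> List[float]:
    """Run the same model from slightly perturbed initial conditions.

    Each member starts within +/- spread of x0; the seed makes the
    perturbations reproducible.
    """
    rng = random.Random(seed)
    return [toy_model(x0 + rng.uniform(-spread, spread), steps)
            for _ in range(members)]

if __name__ == "__main__":
    forecasts = ensemble_forecast(x0=0.4, members=10, steps=50)
    print("min:", min(forecasts), "max:", max(forecasts))
```

The range between the smallest and largest member gives a rough sense of forecast uncertainty. The other two ensemble strategies named in the item (different models, or different options within a model) would vary `toy_model` or its parameter `r` instead of `x0`.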
LEAD Design Features
• Entirely web- and web service-based; only requires a browser and Java
Web Start
• Minimum local resources needed to have significant functionality (to
empower grades 6-12 and reach underprivileged areas)
• Highly intuitive functionality with separate portal interfaces for different
classes of users
• Grid tool kit for security, authentication, job management, resource
allocation, replication, etc.
• Maximum utilization of existing capabilities (OpenDAP, THREDDS, Globus,
DLESE)
• Transparent access to all requisite resources (data, tools, computing,
visualization)
• Minimum depth accessibility (fewest number of mouse clicks)
• Backward software compatibility
• Scalable to large numbers of users
• User extensibility
• Ability to use each service in a stand-alone manner, outside of the
orchestration and portal infrastructures
The 5 Canonical Problems
• #1. Create a 10-year detailed climatology of thunderstorm
characteristics across the U.S. using historical and streaming
NEXRAD radar data. This could be expanded to a fine-scale hourly
re-analysis using ADAS.
• #2. Run a broad parameter suite of convective storm simulations to
relate storm characteristics to the environments in which they
form/move
• #3. Produce high-resolution nested WRF forecasts that respond
dynamically to prevailing and predicted weather conditions and
compare with single static forecasts
• #4. Dynamically re-task a Doppler radar to optimally sense
atmospheric targets based upon a continuous interrogation of
streaming data
• #5. Produce weather analyses and ensemble forecasts on demand –
in response to the evolving weather and to the forecasts themselves
LEAD Technology Roadmap
Roadmap chart (axis: Technology & Capability, Years 1 to 5):
• Generation 1: Static Workflow
• Generation 2: Dynamic Workflow
• Generation 3: Adaptive Sensing
• Look-Ahead Research
In LEAD, Everything is a Web Service
• Finite number of services – they’re the “low-level” elements but
consist of lots of hidden pieces…services within services.
– Service A (ADAS)
– Service B (WRF)
– Service C (NEXRAD Stream)
– Service D (MyLEAD)
– Service E (VO Catalog)
– Service F (IDV)
– Service G (Monitoring)
– Service H (Scheduling)
– Service I (ESML)
– Service J (Repository)
– Service K (Ontology)
– Service L (Decoder)
– Many others…
Web Services
• They are self-contained, self-describing, modular applications
that can be published, located, and invoked across the Web.
• XML-based Web Services are emerging as tools for
creating next-generation distributed systems, which are expected
to support program-to-program interaction without requiring
user-to-program interaction.
• They treat heterogeneity as a fundamental ingredient:
independent of platform and environment, they can be packaged
and published on the Internet because they communicate with
other systems using common protocols.
• Emerging web services standards such as SOAP, WSDL and
UDDI are enabling much easier system-to-system integration.
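To make "self-describing" and "program-to-program" a little more tangible, here is a minimal sketch that builds and parses a SOAP 1.1 envelope using only the Python standard library. The envelope namespace is the standard SOAP 1.1 one; the service namespace `urn:example:decoder`, the operation name, and the parameter name are hypothetical and do not describe any actual LEAD service.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SVC_NS = "urn:example:decoder"  # hypothetical service namespace (assumption)

def build_request(operation: str, params: Dict[str, str]) -> bytes:
    """Wrap an operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SVC_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SVC_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")

def parse_request(xml_bytes: bytes) -> Tuple[str, Dict[str, str]]:
    """Recover the operation name and parameters from a SOAP envelope."""
    envelope = ET.fromstring(xml_bytes)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    call = list(body)[0]  # first child of Body is the operation element
    operation = call.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params
```

A receiving program needs only the message itself to recover the call, for example `parse_request(build_request("Decode", {"sourceUrl": "http://example.org/in.raw"}))`. In practice a WSDL document would describe the available operations and a toolkit would generate this plumbing.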
Decoder & Data Mover Service
Diagram labels: Source URL; Destination URL; Decoder & Data mover service
• When it receives two messages (source and
destination URL), it decodes the file and invokes
GridFTP to move it.
– Note: each message also identifies the user and experiment
name.
• When the move is complete, it sends a message
indicating that it is done and where to find the file.
• It also sends a “notification event” with the same
information (explained in a later slide).
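The behavior described above can be sketched as a small message-driven class. This is an assumption-laden illustration, not the LEAD implementation: the `Message` fields, the class name, and the reply format are invented for the example, and the decode and move steps are injected as plain callables so that no real decoder or GridFTP client is implied.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

@dataclass(frozen=True)
class Message:
    kind: str        # "source" or "destination"
    url: str
    user: str        # each message identifies the user...
    experiment: str  # ...and the experiment name

class DecoderMoverService:
    """Waits for a source and a destination message, then decodes and moves."""

    def __init__(self, decode: Callable[[str], str],
                 move: Callable[[str, str], None]) -> None:
        self._decode = decode  # hypothetical decoder: source URL -> decoded file
        self._move = move      # hypothetical mover standing in for GridFTP
        self._pending: Dict[Tuple[str, str], Dict[str, str]] = {}
        self.notifications: List[dict] = []

    def receive(self, msg: Message) -> Optional[dict]:
        # Pair messages by (user, experiment) so concurrent requests do not mix.
        key = (msg.user, msg.experiment)
        urls = self._pending.setdefault(key, {})
        urls[msg.kind] = msg.url
        if "source" not in urls or "destination" not in urls:
            return None  # still waiting for the other message
        del self._pending[key]
        decoded = self._decode(urls["source"])
        self._move(decoded, urls["destination"])
        done = {"status": "done", "location": urls["destination"],
                "user": msg.user, "experiment": msg.experiment}
        # The notification event carries the same information as the reply.
        self.notifications.append(dict(done))
        return done
```

A caller would send the two messages in either order; the first returns `None` and the second returns the "done" reply. Injecting `decode` and `move` keeps the sketch testable without network access.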
Start by Building Simple Prototypes to
Establish the Services/Other Capabilities…
Prototype Z: Service A (ADAS); Service F (IDV); Service E (VO Catalog); Service D (MyLEAD); Service L (Decoder)
Solve General Problems
by Linking Services Together in Workflows
Services shown: Service D (MyLEAD); Service C (NEXRAD Stream); Service L (Decoder); Service A (ADAS); Service B (WRF); Service L (Mining); Service J (Repository)
• Note that these services can be used as stand-alone
capabilities, independent of the LEAD infrastructure
(e.g., portal)
LEAD Prototype 4
Diagram components: IDD Data Stream; GWSTBs; Decoders; ADAS; ADAS Output; WRF Model; WRF Output; IDV, NCL; ADaM
• Employ components of WRF prediction as a series of
linked web services in a Grid Environment.
The LEAD E&O Goal
• To scale, integrate, and make extensible the
new opportunities and environments for
teaching and learning created by a Web
Services framework that brings data
accessibility, sharing, analysis and
visualization tools to end users at different
educational levels through the development
of:
– LEAD LEARNING COMMUNITIES (LLC)
– LEAD-TO-LEARN Modules
– Evaluation and assessment rubrics
– Outreach activities
LEAD-TO-LEARN Modules
• Using technology tools in collecting, processing, analyzing,
evaluating, visualizing, and interpreting data
• Using technology tools to enhance learning, increase
productivity, and promote creativity
• Conducting scientific investigations
• Predicting and explaining using evidence
• Understanding the 1) concept of models as a method of
representing processes; 2) application of scientific and
technological concepts; 3) use of models in prediction
• Recognizing and analyzing alternative explanations and models
• Identifying questions and concepts that guide scientific
investigations
• Employing technology to improve investigations and develop
problem-solving strategies
• Evaluating and selecting new information resources based on
specific tasks
• Communicating and defending a scientific argument
• Conducting outcomes assessment
LEAD LEARNING COMMUNITIES
• Pre-College
• Undergraduate
• Graduate
• Meteorology/Computer Science Research
This Can Only Be Achieved With
Broad Deployment and Sustainability
• LEAD’s audience: higher education, operations research, grades 6-12
• LEAD will be integrated into dozens of universities and operational
research centers via the UCAR Unidata Program
– includes 150 organizations
– 21,000 university students
– 1800 faculty
– hundreds of operational practitioners.
• Unidata will serve as the focal point for community-wide deployment,
updates, training, integration
• LEAD will accelerate the transfer of WRF-based research results into
operations via its links with the
– NOAA Forecast Systems Laboratory
– National Center for Atmospheric Research Developmental Test Bed Center
• LEAD will draw in traditionally underrepresented groups via its education
programs (HU is a major player in WRF via NOAA center)
Prototype 1a Demo
• Now the dreaded demo
• Keep your fingers crossed!!!!
Prototype 1a
Diagram labels: IDD; Eta Grids; GWSTBs; Automatic; Decoder; User driven; IDV
LEAD Contact Information
• LEAD PI: Prof. Kelvin Droegemeier, [email protected]
• Project Coordinator: Terri Leyton, [email protected]
• LEAD/UCAR PI: Mohan Ramamurthy
Other LEAD UPC staff: Doug Lindholm, Tom Baltzer, Brian Kelly, Ben
Domenico, Anne Wilson, Don Murray and
http://lead.ou.edu/