Network and Grid Computing
Geoffrey Fox
Andrea Donnellan
May 3, 2004
Computational Geoinformatics Workshop
Solid Earth Science Questions
1. What is the nature of deformation at plate boundaries and what are the implications for earthquake hazards?
2. How do tectonics and climate interact to shape the Earth’s surface and create natural hazards?
3. What are the interactions among ice masses, oceans, and the solid Earth and their implications for sea level change?
4. How do magmatic systems evolve and under what conditions do volcanoes erupt?
5. What are the dynamics of the mantle and crust and how does the Earth’s surface respond?
6. What are the dynamics of the Earth’s magnetic field and its interactions with the Earth system?
From NASA’s Solid Earth Science Working Group Report, Living on a Restless Planet, Nov. 2002
The Solid Earth is:
Complex, Nonlinear, and Self-Organizing
SESWG fed into the NASA ESE Computational Technology Requirements Workshop, May 2002.
Relevant questions that computational technologies can help answer:
1. How can the study of strongly correlated solid earth systems be enabled by space-based data sets?
2. What can numerical simulations reveal about the physical processes that characterize these systems?
3. How do interactions in these systems lead to space-time correlations and patterns?
4. What are the important feedback loops that mode-lock the system behavior?
5. How do processes on a multiplicity of different scales interact to produce the emergent structures that are observed?
6. Do the strong correlations allow the capability to forecast the system behavior in any sense?
Characteristics of Computing for Solid Earth Science
• Widely distributed heterogeneous datasets
• Multiplicity of time and spatial scales
• Decomposable problems requiring interoperability for full models
• Distributed models and expertise
Enabled by Grids and Networks
Objectives
• IT approaches: integrate multiple scales into computer simulations.
• Web services: simplified access to data, simulation codes, and flow between simulations of varying types.
What are Grids Good for?
• They are “Internet-scale distributed computing” and support the linking of globally distributed entities in the e-Science concept:
  – Computers
  – Data from repositories and sensors
  – People
• Early Grids focused on metacomputing (linking computers together), but recently e-Science has highlighted integration of data and building communities
• Grid technology naturally builds Problem Solving Environments
Some Relevant Grid/Framework Projects
• QuakeSim and Solid Earth Research Virtual Observatory SERVOGrid (JPL …)
• GEON: Cyberinfrastructure for the Geosciences (San Diego, Missouri, USGS …)
• CME: Community Modeling Environment from SCEC
• CIG: Computational Infrastructure for Geodynamics
• Geoframework.org (Caltech/VPAC)
• ESMF: Earth System Modeling Framework (NASA)
• NERCGrid: Natural Environment Research Council UK e-Science
• Earth Systems Grid in the DoE Science Grid
Earth Science Computing
[Diagram: Earth science data analysis and visualization linking advanced visualization and analysis to computational resources: large disks, a metacomputing Grid, large-scale databases, and large-scale parallel computers.]
• NO Capability: spreading a single large problem over multiple supercomputers
• YES Capacity: seamless access to multiple computers
Geoscience Research and Education Grids
[Diagram: a research SERVOGrid draws on repositories, federated databases, sensors with streaming data, and field trip databases through data filter services to drive research simulations; customization and discovery services bridge “From Research to Education?”, where an analysis and visualization portal fronts an education Grid and computer farm.]
More General Material on Grids
• Grids today are built in terms of Web Services, a technology designed to support enterprise software and e-Business
  – Provides wonderful support tools
  – Provides a new software engineering model supporting interoperability
• Grids do not compete with parallel computing
  – They let MPI run untouched, so your parallel codes run as fast as they used to
• Grids do “control/management/metadata management,” where higher latency (around 10 milliseconds, a thousand times worse than MPI) is acceptable
• The Global Grid Forum, W3C, and OASIS set the relevant standards and support the community
[Diagram: multi-tier Grid architecture. User services, portal services, and Grid Computing Environments sit above middleware system services, including an application metadata service fronting the actual application; beneath them are the “core” Grid, raw (HPC) resources, and databases, with system services at every tier.]
Grids provide
• “Service Oriented Architecture” supporting distributed programs in scalable fashion with clean software engineering
• “Multi-tier” architecture supporting seamless access, with brokers mediating access to diverse computers and data sources
• “Workflow” integrating different distributed services in a single application
• Event services to notify computers and people of issues (earthquake struck, job completed)
• Easy support of parameter searches and other pleasingly parallel applications with many related non-communicating jobs (see the sketch after this list)
• Security (Web Services), database access (OGSA-DAI), collaboration (Access Grid, GlobalMMCS)
• File, data, and metadata management
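To make the pleasingly parallel bullet concrete, here is a minimal sketch in Python. A local process pool stands in for Grid job submission, and the run_model function with its parameter grid is invented for illustration; the point is simply that the jobs are independent and non-communicating.

# Toy sketch of a pleasingly parallel parameter search.
# A local process pool stands in for Grid job submission;
# run_model and the parameter values are invented examples.
from concurrent.futures import ProcessPoolExecutor

def run_model(friction):
    # Pretend simulation: each job is independent and returns a residual.
    return friction, (friction - 0.6) ** 2  # stand-in for a misfit value

if __name__ == "__main__":
    params = [i / 100.0 for i in range(0, 101, 5)]  # candidate values
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(run_model, params))  # one "job" per parameter
    best = min(results, key=lambda r: r[1])
    print("best parameter:", best[0], "residual:", best[1])

On a real Grid the pool.map call would become many batch submissions through a job submission service, but the structure of the computation is unchanged.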
Web Services
• Web services are the fundamental pieces of distributed Service Oriented Architectures.
• We should define lots of useful services that are remotely available
  – Archival data access services supporting queries, real-time sensor access, and mesh generation all seem to be popular choices.
• Web services have two important parts:
  – Distributed services
  – Client applications
• These two pieces are decoupled: one can build clients to remote services without caring about the programming-language implementation of the remote service (Java, C++, Python).
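A minimal sketch of this decoupling in Python, using only the standard library: the client posts a SOAP envelope over HTTP and never learns what language the service is written in. The endpoint URL, namespace, and getFault operation are hypothetical, not a real SERVOGrid interface.

# Minimal SOAP-style web service call using only the standard library.
# The endpoint URL, namespace, and getFault operation are hypothetical.
import urllib.request

ENDPOINT = "http://example.org/servo/FaultService"

envelope = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <getFault xmlns="http://example.org/servo">
      <faultName>Northridge</faultName>
    </getFault>
  </soap:Body>
</soap:Envelope>"""

request = urllib.request.Request(
    ENDPOINT,
    data=envelope.encode("utf-8"),
    headers={"Content-Type": "text/xml; charset=utf-8",
             "SOAPAction": "getFault"},
)
with urllib.request.urlopen(request) as response:
    print(response.read().decode("utf-8"))  # raw XML reply from the service

The same envelope could just as well be sent from Java or C++; only the service’s published interface matters.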
Web Services, Continued
• Clients can be built in any number of styles
  – We build portlet clients: ubiquitous, and they can be combined
  – One can build fancier GUI client applications
  – You can even embed Web service client stubs (library routines) in your application code, so that your code can make direct calls to remote data sources, etc.
• Regardless of the client one builds, the services are the same in all cases:
  – my portal and your application code may each use the same service to talk to the same database.
• So we need to concentrate on services and let clients bloom as they may:
  – Client applications (portals, GUIs, etc.) will have a much shorter lifecycle than service interface definitions, if we do our job correctly.
  – Client applications that are locked into particular services, use proprietary data formats and wire protocols, etc., are at risk. Use the WSRF/JSR-168 portlet standards.
Data Deluged Science
• During the HPCC Initiative (1990-2000), we worried about data in the form of parallel I/O or MPI-IO, but we didn’t consider it as an enabler of new algorithms and new ways of computing
• Data assimilation was not central to HPCC
• DoE ASCI (Stockpile Stewardship) was set up because they didn’t want (or have) test data!
• Now particle physics will get 100 petabytes from the CERN LHC
  – Nuclear physics (Jefferson Lab) is in the same situation
  – Uses ~30,000 CPUs continuously, 24x7
• Weather, climate, solid earth (EarthScope)
• Bioinformatics curated databases
• Virtual Observatory and SkyServer in astronomy
• Environmental sensor nets
Data Deluged Science Computing Paradigm
[Diagram: Data flows through assimilation and datamining into Information; Ideas flow through computational science and informatics into Models; simulation and reasoning link the two sides.]
Data Deluged Science Computing Architecture
[Diagram: OGSA-DAI Grid services expose databases to the Grid; Grid data assimilation uses distributed filters to massage data for simulation; HPC simulation is coupled to analysis, control, and visualization.]
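Data assimilation is the hinge of the architecture above. As a toy illustration, not any SERVOGrid algorithm, here is a scalar Kalman-style filter in Python that blends a model forecast with a noisy observation stream; this is the role the distributed filters play at far larger scale, and all the numbers are invented.

# Toy scalar data assimilation: a one-dimensional Kalman-style filter
# blending model forecasts with noisy observations. All values invented.
import random

x, p = 0.0, 1.0        # state estimate and its variance
q, r = 0.01, 0.25      # process and observation noise variances
truth = 1.0            # hidden "true" signal the observations sample

for step in range(20):
    # Forecast: the model persists the state; uncertainty grows.
    p += q
    # Observe: a noisy measurement of the truth.
    z = truth + random.gauss(0.0, r ** 0.5)
    # Analysis: weight the observation by the Kalman gain.
    k = p / (p + r)
    x += k * (z - x)
    p *= 1.0 - k
    print(f"step {step:2d}: estimate {x:.3f} (variance {p:.4f})")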
Some Questions for Data Deluged Science
• A new trade-off: how to split funds between sensors and simulation engines
• No systematic study of how best to represent data-deluged sciences without known equations at the resolution of interest
• Data assimilation is very relevant
• Relationship to “just” interpolating data and then extrapolating a little
• Role of uncertainty analysis: everything (equations, model, data) is uncertain!
• Relationship of data mining and simulation
• Growing interest in data curation and provenance
• Role of Cellular Automata (CA), Potts models, and neural networks, which are “fundamental-equation-free” approaches (see the sketch after this list)
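To illustrate what an “equation-free” approach looks like, here is a minimal one-dimensional cellular automaton in Python, with elementary rule 110 chosen arbitrarily: the dynamics come from a local update table rather than from governing differential equations. It is a generic CA sketch, not any specific geophysical model.

# Minimal 1-D cellular automaton: dynamics come from a local rule
# table, not from governing equations. Rule 110 is an arbitrary choice.
RULE = 110
TABLE = {(a, b, c): (RULE >> (a * 4 + b * 2 + c)) & 1
         for a in (0, 1) for b in (0, 1) for c in (0, 1)}

cells = [0] * 31
cells[15] = 1  # single seed in the middle

for _ in range(15):
    print("".join("#" if v else "." for v in cells))
    # Each cell's next state depends only on its local neighborhood.
    cells = [TABLE[(cells[i - 1], cells[i], cells[(i + 1) % len(cells)])]
             for i in range(len(cells))]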
Recommendations of NASA’s Computational Technologies Workshop (May 2002)
1. Create a Solid Earth Research Virtual Observatory (SERVO)
• Numerous distributed heterogeneous real-time datasets
• Seamless access to large distributed volumes of data
• Data handling and archiving part of the framework
• Tools for visualization, datamining, pattern recognition, and data fusion
2. Develop a Solid Earth Science Problem Solving Environment (PSE)
• Addresses the NASA-specific challenges of multiscale modeling
• Model and algorithm development and testing, visualization, and data assimilation
• Scalable from workstations to supercomputers depending on the size of the problem
• Numerical libraries existing within a compatible framework
3. Improve the Computational Environment
• PetaFLOP computers with terabytes of RAM
• Distributed and cluster computers for decomposable problems
• Development of Grid technologies
SERVOGrid Requirements
• Seamless access to data repositories and large-scale computers
• Integration of multiple data sources, including sensors, databases, and file systems, with the analysis system
  – Including filtered OGSA-DAI (Grid database access)
• Rich metadata generation and access, with SERVOGrid-specific schema extending OpenGIS (geography as a Web service) standards and using the Semantic Grid
• Portals with a component model for user interfaces and web control of all capabilities
• Collaboration to support world-wide work
• Basic Grid tools: workflow and notification (see the sketch after this list)
• Not metacomputing
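A hedged sketch of the notification pattern named above: a tiny publish/subscribe notifier in Python. The topics, subscribers, and message are invented examples; real Grid event services provide the same pattern across the network.

# Toy publish/subscribe notifier, the pattern behind Grid event
# services. Topics, subscribers, and the message are invented.
from collections import defaultdict

class Notifier:
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver the event to every subscriber of the topic.
        for callback in self.subscribers[topic]:
            callback(message)

notifier = Notifier()
notifier.subscribe("job.completed", lambda m: print("portal:", m))
notifier.subscribe("job.completed", lambda m: print("email:", m))
notifier.publish("job.completed", "GeoFEST run 42 finished")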
Solid Earth Research Virtual Observatory (SERVO)
[Diagram: a tiered data and computing hierarchy. Observations flow through downlinks into archives (~TBytes/day); Tier 0+1 (Goddard, Ames, JPL) hosts SERVO with 100 TeraFLOPs sustained and a fully functional problem solving environment; Tier 2 centers, Tier 3 institutes with data caches, and Tier 4 workstations and other portals connect at 100-1000 Mbits/sec.]
• 1 PB per year data rate in 2010
• Distributed heterogeneous real-time datasets
• Program-to-program communication in milliseconds
• Approximately 100 model codes
• Plug-and-play composing of parallel programs from algorithmic modules
• On-demand downloads of 100 GB in 5 minutes
• 10^6 volume elements rendering in real time
Virtual Observatory Project
Capability timeline, 2003-2010 (NASA CT Workshop, May 2002):
• Decomposition into services with requirements; architecture and technology approach
• Prototype cooperative federated database service integrating 5 datasets of 10 TB each, later scaled to 100 sites
• Prototype modeling service capable of integrating 5 modules
• Prototype data analysis service
• Prototype visualization service at 1920x1080 pixels and 120 frames per second
• Solid Earth Research Virtual Observatory (SERVO)
• On-demand downloads of 100 GB files from 40 TB datasets within 5 minutes
• Uniform access to 1000 archive sites with volumes from 1 TB to 1 PB
Problem Solving Environment Project
Capability timeline, 2003-2010 (NASA CT Workshop, May 2002):
• Isolated platform-dependent code fragments (starting point)
• Prototype PSE front end (portal) integrating 10 local and remote services
• Plug-and-play composing of sequential programs from algorithmic modules
• Integrated visualization service with volumetric rendering
• Extended PSE including a 20-user collaboratory with shared windows and seamless access to high-performance computers linking remote processes over Gb data channels
• Fully functional PSE used to develop models as building blocks for simulations; program-to-program communication in milliseconds using staging, streaming, and advanced cache replication; integrated with SERVO; plug-and-play composing of parallel programs from algorithmic modules
Computational Environment
Capability timeline, 2003-2010 (NASA CT Workshop, May 2002):
• 100’s of GigaFLOPs, 40 GB RAM, 1 Gb/s network bandwidth (starting point)
• Access to a mixture of platforms, from low-cost clusters (20-100 processors) to supercomputers with massive memory and thousands of processors
• ~100 TeraFLOPs sustained capability per model
• ~100 model codes with parallel scaled efficiency of 50%
• ~10^4 PetaFLOPs throughput per subfield per year
• ~10^6 volume elements rendering in real time
(Speaker note: this slide appears inconsistent with slide 8.)
Solid Earth Research Virtual Observatory (iSERVO)
• Web-services (portal) based Problem Solving Environment (PSE)
• Couples data with simulation, pattern recognition software, and visualization software
• Enables investigators to seamlessly merge multiple data sets and models, and create new queries
Data
• Space-based observational data
• Ground-based sensor data (GPS, seismicity)
• Simulation data
• Published/historical fault measurements
Analysis Software
• Earthquake fault modeling
• Lithospheric modeling
• Pattern recognition software
Philosophy
• Store simulated and observed data
• Archive simulation data with the original simulation code and analysis tools
• Access heterogeneous distributed data through cooperative federated databases
• Couple distributed data sources, applications, and hardware resources through an XML-based Web Services framework
• Users access the services (and thus distributed resources) through Web browser-based Problem Solving Environment clients
• The Web services approach defines standard, programming-language-independent application programming interfaces, so non-browser client applications may also be built
SERVOGrid Basics
• Under development in collaboration with researchers at JPL, UC-Davis, USC, and Brown University.
• Geoscientists develop simulation codes, analysis tools, and visualization tools.
• We need a way to bind distributed codes, tools, and data sets.
• We need a way to deliver them to a larger audience
  – Instead of downloading and installing the code, use it as a remote service.
SERVOGrid Application Descriptions
• Codes range from simple “rough estimate” codes to parallel, high-performance applications.
  – Disloc: handles multiple arbitrarily dipping dislocations (faults) in an elastic half-space.
  – Simplex: inverts surface geodetic displacements for fault parameters using simulated annealing downhill residual minimization (see the sketch after this list).
  – GeoFEST: three-dimensional viscoelastic finite element model for calculating nodal displacements and tractions. Allows for realistic fault geometry and characteristics, material properties, and body forces.
  – Virtual California: program to simulate interactions between vertical strike-slip faults using an elastic layer over a viscoelastic half-space.
  – RDAHMM: time series analysis program based on Hidden Markov Modeling. Produces feature vectors and probabilities for transitioning from one class to another.
  – PARK: boundary element program to calculate fault slip velocity history based on fault frictional properties; a model for unstable slip on a single earthquake fault.
• Preprocessors, mesh generators
• Visualization tools: RIVA, GMT
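To show the style of inversion Simplex performs, here is a toy simulated annealing loop in Python minimizing an invented residual for a single “fault parameter.” The objective function, cooling schedule, and values are made up for illustration and are not Simplex’s actual algorithm.

# Toy simulated annealing: minimize an invented residual over one
# "fault parameter". Objective, schedule, and values are made up.
import math
import random

def residual(slip):
    # Pretend misfit between predicted and observed displacements.
    return (slip - 2.3) ** 2 + 0.1 * math.sin(5.0 * slip)

current = random.uniform(0.0, 5.0)
best = current
temperature = 1.0

while temperature > 1e-3:
    candidate = current + random.gauss(0.0, 0.5)
    delta = residual(candidate) - residual(current)
    # Accept downhill moves always, uphill moves with Boltzmann probability.
    if delta < 0 or random.random() < math.exp(-delta / temperature):
        current = candidate
        if residual(current) < residual(best):
            best = current
    temperature *= 0.99  # geometric cooling

print(f"estimated slip: {best:.3f}, residual: {residual(best):.4f}")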
iSERVO Web Services
• Job submission: supports remote batch and shell invocations
  – Used to execute simulation codes (VC suite, GeoFEST, etc.), mesh generation (Akira/Apollo), and visualization packages (RIVA, GMT).
• File management:
  – Uploading, downloading, backend crossloading (i.e., moving files between remote servers)
  – Remote copies, renames, etc.
• Job monitoring
• Apache Ant-based remote service orchestration
  – For coupling related sequences of remote actions, such as RIVA movie generation.
• Database services: support SQL queries
• Data services: support interactions with XML-based fault and surface observation data (see the sketch after this list).
  – For simulation-generated faults (i.e., from Simplex)
  – XML data model being adopted for common formats, with translation services to “legacy” formats.
  – Migrating to Geography Markup Language (GML) descriptions.
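As a hedged sketch of what a GML-flavored fault description might look like to a client, here is a Python snippet that parses a tiny hand-written XML fragment with the standard library. The element names and fault values are hypothetical, not the actual SERVOGrid or GML schema.

# Parse a tiny GML-flavored fault record with the standard library.
# Element names and values are hypothetical, not the real schema.
import xml.etree.ElementTree as ET

FAULT_XML = """
<Fault xmlns:gml="http://www.opengis.net/gml">
  <name>Example Fault</name>
  <dip>30.0</dip>
  <gml:posList>-118.5 34.2 -118.3 34.4</gml:posList>
</Fault>
"""

ns = {"gml": "http://www.opengis.net/gml"}
fault = ET.fromstring(FAULT_XML)
name = fault.findtext("name")
dip = float(fault.findtext("dip"))
coords = [float(v) for v in fault.find("gml:posList", ns).text.split()]

print(f"{name}: dip {dip} degrees, trace {coords}")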
Some Conclusions
• Grids facilitate support of:
  – International collaborations
  – Integration of computing with distributed data repositories and real-time sensors
  – Web services from a variety of fields (e.g., map services from OpenGIS)
  – Seamless access to multiple networked compute resources, including computational steering
  – Software infrastructure for Problem Solving Environments