Grid Infrastructure at CCT

UCoMS: Grid Computing Framework
for Petroleum Engineering
UCoMS
• Ubiquitous Computing and Monitoring
System for Discovery and Management of
Energy Resources.
• UCoMS, sponsored by Department of Energy
and the Louisiana Board of Regents, is a
project to research and develop new Grid
computing and sensor network technologies
for the management of energy resources.
• Three Louisiana Universities are involved in
this project: Louisiana State University (LSU),
University of Louisiana at Lafayette (ULL) and
Southern University at Baton Rouge (SUBR).
UCoMS Application Areas
• Reservoir simulation on computing
grids.
• Real-time well surveillance.
• Drilling performance analysis with high-rate data coming from real-time drilling.
ResGrid
• Massive reservoir simulation tool for
uncertainty analysis.
• Leverages the data archiving tool, the Condor-based task farming
framework, the information service, and GridSphere.
• Basic functions implemented.
Why ResGrid?
• Reservoir simulation is one of the largest
users of computing power
– Large, complex, uncertainty models
– High risks and high rewards
• Where is the gain?
– Moderate-sized jobs can be farmed out onto a
heterogeneous grid.
– Large jobs can be run in parallel on a grid.
• Efficiency gains can help to assess risks,
estimate parameters, run larger and more
complex models.
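As a rough illustration of the first gain, the sketch below estimates how long a batch of independent simulation runs would take when farmed out to machines of different speeds. The job sizes, the relative speeds, and the function name `farm_makespan` are assumptions for illustration, not ResGrid measurements or code.

```python
def farm_makespan(job_hours, speeds):
    """Estimate the finish time (hours) of a batch of independent jobs.

    job_hours: run time of each job on a reference machine (speed 1.0).
    speeds: relative speed of each worker; 2.0 means twice as fast.
    Greedy rule: take jobs longest first and give each one to the worker
    that would finish it earliest.
    """
    if not speeds:
        raise ValueError("need at least one worker")
    finish = [0.0] * len(speeds)  # time at which each worker becomes idle
    for hours in sorted(job_hours, reverse=True):
        # Pick the worker with the earliest completion time for this job.
        best = min(range(len(speeds)),
                   key=lambda w: finish[w] + hours / speeds[w])
        finish[best] += hours / speeds[best]
    return max(finish)


if __name__ == "__main__":
    jobs = [4.0] * 4                        # hypothetical: four 4-hour runs
    print(farm_makespan(jobs, [1.0]))       # one reference machine: 16.0
    print(farm_makespan(jobs, [2.0, 1.0]))  # two unequal machines: 6.0
```

The greedy rule is only a heuristic; a real scheduler would also have to account for queue waits and data transfer.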
Computational Details
• Two Parts of the Problem:
– Constructing grid blocks as input for reservoir simulation.
– The reservoir simulation itself.
• Geostatistical modeling is used to generate input data.
• UTChem is the reservoir simulation app.
• Computational complexity can be increased by
increasing the granularity of the grid, increasing
number of simulations for better estimates and
increasing the complexity of the algorithm itself.
• Currently, a typical problem requires 55 days on a
single 3 GHz P4 system.
• UTChem’s memory requirement is small.
• GSLIB is memory intensive. A typical run needs 8 GB
of RAM. Work is under way to move to algorithms using
sparse matrices to reduce the memory requirement.
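To make the memory argument concrete, the following sketch compares the storage needed for a dense n x n matrix of 8-byte floats with a compressed sparse row (CSR) layout. The matrix size and the number of non-zeros per row are illustrative assumptions, not GSLIB figures.

```python
def dense_bytes(n, value_bytes=8):
    """Bytes needed to store every entry of an n x n matrix."""
    return n * n * value_bytes


def csr_bytes(n, nnz_per_row, value_bytes=8, index_bytes=4):
    """Bytes needed for a CSR layout: values, column indices, row pointers."""
    nnz = n * nnz_per_row
    return nnz * (value_bytes + index_bytes) + (n + 1) * index_bytes


if __name__ == "__main__":
    n = 30_000   # hypothetical number of unknowns
    per_row = 7  # hypothetical non-zeros per row
    print(f"dense: {dense_bytes(n) / 1e9:.2f} GB")         # dense: 7.20 GB
    print(f"CSR:   {csr_bytes(n, per_row) / 1e6:.2f} MB")  # CSR:   2.64 MB
```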
Typical Reservoir Simulation Workflow
ResGrid Workflow
UCoMS Grid-Enabled Workflow
Grid Computing in UCoMS
• Grid Application Toolkit/Simple API
for Grid Applications (GAT/SAGA)
– Many new adaptors/examples
• Grid portals using GridSphere
• Data Archiving Toolkit based on
GAT
• Task Farming Framework using
Condor.
• Information service based on
MDS/GPIR
• Mobile Grid Computing (Migration)
• ResGrid: a tool for reservoir
uncertainty analysis.
General Components:
• APIs
• Replica location
• Job submission
• Data transfer
• Metadata
• Workflow
• Task farming
• Visualization
Grid Application Toolkit (GAT)
• Abstract programming interface between applications and Grid
services.
• Designed for applications (move file, run remote task, migrate,
write to remote file).
• Led to the GGF Simple API for Grid Applications.
• Default Adaptors: basic functionality; will work on a single isolated
machine (e.g. cp, fork/exec).
• Globus Adaptors: core Globus functionality (GRAM, MDS, GTRLS, GridFTP).
• GridLab Adaptors: GRMS, Mercury, Delphoi, iGrid.
• Under Development: Scp, DRMAA, Condor, SGE, SRB, Curl, RFT.
www.gridlab.org/GAT
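The adaptor idea can be sketched as follows: the application calls one abstract operation, and the toolkit tries the available adaptors in turn until one succeeds. This is not the GAT API. The class and function names (`CopyAdaptor`, `LocalCopyAdaptor`, `UnavailableAdaptor`, `copy_file`) are hypothetical, and only a local adaptor in the spirit of the default cp adaptor is implemented.

```python
import shutil
from abc import ABC, abstractmethod


class CopyAdaptor(ABC):
    """Hypothetical adaptor interface: one way of moving a file."""

    @abstractmethod
    def copy(self, src: str, dst: str) -> None:
        ...


class LocalCopyAdaptor(CopyAdaptor):
    """Works on a single isolated machine, like a plain cp."""

    def copy(self, src: str, dst: str) -> None:
        shutil.copyfile(src, dst)


class UnavailableAdaptor(CopyAdaptor):
    """Stand-in for a Grid service that cannot be reached."""

    def copy(self, src: str, dst: str) -> None:
        raise ConnectionError("service not reachable")


def copy_file(src: str, dst: str, adaptors: list[CopyAdaptor]) -> str:
    """Try each adaptor in order; return the name of the one that worked."""
    errors = []
    for adaptor in adaptors:
        try:
            adaptor.copy(src, dst)
            return type(adaptor).__name__
        except OSError as exc:  # ConnectionError is a subclass of OSError
            errors.append(f"{type(adaptor).__name__}: {exc}")
    raise RuntimeError("no adaptor succeeded: " + "; ".join(errors))
```

With a list such as `[UnavailableAdaptor(), LocalCopyAdaptor()]`, the call falls through to the local copy, so the application code stays the same whether or not a Grid service is reachable.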
Grid Portals
• GridSphere Portal
used to provide
portal interfaces
• HPC Portal to
monitor and submit
to HPC resources at
CCT and other
national and
international Grids.
Data Archiving
• Grid-enabled data archiving tool (server and clients) for
large scale data management.
• Based on GAT.
• Essential for large-scale reservoir simulations and drilling
applications.
Task Farming
Need flexible mechanisms
for scheduling and
deploying our
applications
• Fault tolerance,
scheduling, data
integration, …
Two approaches
• Cactus-based task
farming framework.
• Condor-based task
farming framework.
[Figure: task farming architecture; labels: Replica Catalog, SMS Server, “The Grid”, GridSphere Portal, Mail Server, and several TFM nodes]
• Task farming infrastructure implemented in Cactus.
• GAT used for starting remote TFMs.
• Designed for the Grid.
• Tasks can be anything (MPI, single processor).
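The fault-tolerance requirement can be illustrated with a minimal task farm that reruns failed tasks a bounded number of times. It uses a local thread pool as a stand-in for remote resources. It is not the Cactus or Condor framework, and `run_farm` and its parameters are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_farm(task, inputs, workers=4, max_attempts=3):
    """Run task(x) for every x in inputs, retrying failures.

    Inputs must be hashable. Returns (results, failed): results maps each
    input to its output, failed maps each input that never succeeded to
    its last error.
    """
    results, failed = {}, {}
    pending = list(inputs)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for attempt in range(1, max_attempts + 1):
            if not pending:
                break
            futures = {pool.submit(task, x): x for x in pending}
            pending = []
            for fut in as_completed(futures):
                x = futures[fut]
                try:
                    results[x] = fut.result()
                    failed.pop(x, None)  # an earlier failure was recovered
                except Exception as exc:
                    failed[x] = exc
                    if attempt < max_attempts:
                        pending.append(x)  # resubmit in the next round
    return results, failed
```

Because each task is an ordinary callable, the same loop could in principle wrap a single-process run or a launcher for an MPI job.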
Current Status
• ResGrid running locally.
• Using two Grids: CCT Grid and ULL Grid.
• Also running on KISTI machines in Korea.
• Interested in connections to other external Grids.
Grid Infrastructure: CCT Grid
Grid Infrastructure: Services
• CCT Certificate Authority
• GSI-OpenSSH access to all machines
• GridSphere portals
• PBSPro, Maui
• Condor, GAT, Globus
• GridHub
– Grid Services Server hosting various Grid Services
– GridLab Resource Management System (GRMS)
– iGrid
– MDS
– RLS
– GPIR
Requirements from Sites
• Users in GridMap files.
• C and Fortran compilers.
• Globus 3.x/4.x
– GridFTP Server
– GRAM
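For the first requirement, a site can verify that a user's certificate subject is mapped to a local account. The sketch below parses text in the usual grid-mapfile layout (a quoted distinguished name followed by a local user name). The sample entries and the function name `parse_gridmap` are made up for illustration.

```python
import shlex


def parse_gridmap(text):
    """Map each certificate subject (DN) to its list of local accounts.

    Expects lines such as:  "/O=Grid/CN=Jane Doe" jdoe
    Blank lines and lines starting with '#' are ignored.
    """
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = shlex.split(line)  # honours the quotes around the DN
        if len(parts) != 2:
            raise ValueError(f"malformed grid-mapfile line: {line!r}")
        subject, accounts = parts
        mapping[subject] = accounts.split(",")
    return mapping


if __name__ == "__main__":
    sample = '''
    # hypothetical entries
    "/O=Example/OU=CCT/CN=Jane Doe" jdoe
    "/O=Example/OU=PE/CN=John Roe" jroe,resgrid
    '''
    print(parse_gridmap(sample))
```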
Other Applications/Services from CCT
• SCOOP WW3 – Wave Watch 3
• SURA Archive – Provides the archiving service and
GAT-based getdata clients that facilitate storage
and retrieval of data for various applications.
Also provides metadata and logical file services.
ADCIRC can already use the existing SCOOP Archive.
Credits
• Investigators
– Edward Seidel (CCT)
– Gabrielle Allen (CCT)
– Christopher D. White (PE)
– John R. Smith (PE)
– Zhou Lei (CCT)
– Hartmut Kaiser (CCT)
– Archit Kulshrestha (CCT)
• Ph.D. Candidates
– Richard Duff (PE)
– Xin Li (PE)
– Dayong Huang (CS)
– Santiago Pena (CS)
– Promita Chakraborty (CS)
• M.S. Students
– Chongjie Zhang (CS)
• Undergraduate Students
– John Lewis (CS)
– Yunan Yuan (EE)