GridChem A Computational Chemistry Cyber

Cloud Resources in Production
Cyberenvironments for E-Science Virtual
Organizations: GridChem/ParamChem
Interoperability
NSF Cloud Computing Workshop
Arlington, VA
17-18 Mar 2011
Sudhakar Pamidighantam
NCSA, University of Illinois at
Urbana-Champaign
National Center for Supercomputing Applications
Acknowledgements
• Jayeeta Ghosh, NCSA, ParamChem
• Suresh Marru, Indiana U., OGCE
• Ye Fan, Indiana U., OGCE
• Kenno Vanommeslaeghe, U. Maryland, ParamChem
• Narendra Polani, UKy, Middleware/ParamChem
• Michael Sheetz, UKy, Application Interfaces/ParamChem
• Vikram Gazula, UKy, Server Administration
• Tom Roney, NCSA, Server and Database Maintenance
• Nikhil Singh, NCSA, ParamChem
• Liu Yang, NCSA, GridChem
• Scott Brozell, OSC, Applications and Testing
• Rion Dooley, TACC, Middleware Infrastructure
• Stelios Kyriacou, OSC, Middleware Scripts
• Chona Guiang, TACC, Databases and Applications
• Kent Milfeld, TACC, Database Integration
• Kailash Kotwani, NCSA, Applications and Middleware
Outline
• Historical Background: Grid Computational Chemistry
• Production Environments
• Current Status: Web Services
• Usage: Grid and Science Achievements
• Cloud in Hybrid Environments
• Interoperability
• Future
Motivation
Integrating Services for E-Science and
Engineering in
Research, Education and Training
Software
- Reasonably mature and easy to use for addressing
chemists' questions of interest
Community of Users
- Need the software and are capable of using it
- Some are non-traditional computational chemists
Resources
- Varied in capacity and capability
- Distributed and heterogeneous
Extended TeraGrid Facility
www.teragrid.org
NSF Petascale Road Map
• Track 1: multi-petaflop single-site system to be deployed by
2011 at NCSA
Blue Waters http://www.ncsa.illinois.edu/BlueWaters/
• Track 2: sub-petaflop systems
Several to be deployed until Track 1 is online

System                       Architecture   Cores
Dell PowerEdge (NCSA)        EM64T          9600
SGI Altix (PSC)              IA64           768
SGI UV-Ice (NCSA)            EM64T          1568
IBM Power4 Cluster (NCSA)    Pwr4           48
IBM PowerPC (Indiana)        Pwr4           1536
Sun Constellation (TACC)     EM64T          50000

Additional systems to be online soon (currently being allocated)
SGI UV-Ice (PSC)             EM64T          4096
FutureGrid                   Diverse        on demand
Grids and New Opportunities
Alliance to TeraGrid
A homogeneous grid with a predefined, fixed software and
system stack was planned (TeraGrid), but it proved difficult
to keep it homogeneous
Local preferences and diversity lead to
heterogeneous grids now!
(Operating Systems, Schedulers, Policies, Software and Services)
[Diagram labels: Grid Hardware, Middleware, Interfaces, Scientific Applications]
Openness and standards that lead to interoperability are
critical for successful services
User Community
Chemistry and Computational Biology allocations as of Oct 04
         NRAC         AAB          Small Allocations
#PIs     26           23           64
#SUs     5,953,100    1,374,100    640,000

TeraGrid Allocations in 2010
Discipline                   # PIs    Initial Alloc. SUs
Physics                      125      920,254,700
Molecular Biosciences        308      689,733,465
Chemistry                    264      255,479,494
Chemical, Thermal Systems    143      232,905,769
Materials Research           207      210,602,367

2101 Users using Chemistry Software
230 ASC 30 AST 18 ATM 8 BCS 30 CCR 28 CDA 653 CHE 11 CTS
1 DBS 2 DEB 805 DMR 10 DMS 18 EAR 1 ECS 23 IBN 2 IRI
153 MCB 10 MSS 3 NCR 4 OCE 37 PHY 6 SEE 5 SES 3 STA
Computational Chemistry Grid
This is a Virtual Organization
Integrated Cyber Infrastructure for
Computational Chemistry
Integrates Applications, Middleware, HPC
resources, Scheduling and Data
management
Allocations, User services and Training
Other Resources
Extant HPC resources at various
Supercomputer Centers, Cloud resources
(Interoperable)
Optionally Other Grids and Hubs/local/personal
resources
These may require existing
allocations/Authorization
GridChem System
[Diagram labels: users, applications, Portal Client, Grid Middleware, Proxy Server, Grid Services, Grid, Mass Storage]
http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0438312
Applications
• GridChem already supports several applications
– Gaussian, GAMESS, NWChem, Aces3, Molpro, ADF, Quild,
QMCPack, Castep, DMol3, Amber, Charmm
• Schedule of integration of additional software
– Crystal
– Q-Chem
– Wien2K
– MCCCS Towhee
– Others...
Workflows
GridChem Middleware Service (GMS)
GridChem Resources Monitoring
http://portal.gridchem.org:8080/gridsphere/gridsphere?cid=home
Application Software Resources
Currently Supported
Suite          Version      Location
Gaussian 03    C.02/D.01    Many Platforms
MolPro         2006.1       NCSA
NWChem         5.0/4.7      Many Platforms
Gamess         Jan 06       Many Platforms
Amber          8.0          Many Platforms
QMCPack        2.0          NCSA
GridChem Software Resources
New Applications
Integration Underway
• ADF: Amsterdam Density Functional Theory
• Wien2K: Linearized Augmented Plane Wave (DFT)
• CPMD: Car-Parrinello Molecular Dynamics
• QChem: Molecular Energetics (Quantum Chemistry)
• Aces3: Parallel Coupled Cluster Quantum Chemistry
• Gromacs: Nano/Bio Simulations (Molecular Dynamics)
• NAMD: Molecular Dynamics
• DMol3: Periodic Molecular Systems (Quantum Chemistry)
• Castep: Quantum Chemistry
• MCCCS-Towhee: Molecular Conformation Sampling (Monte Carlo)
• Crystal98/06: Crystal Optimizations (Quantum Chemistry)
• ...
GridChem User Services
• Allocation
https://www.gridchem.org/allocations/index.shtml
Community and External Registration
Reviews, PI Registration and Access Creation
Community User Norms Established
• Consulting/User Services
https://www.gridchem.org/consult
Ticket tracking, Allocation Management
• Documentation, Training and Outreach
https://www.gridchem.org/doc_train/index.shtml
FAQ Extraction, Tutorials, Dissemination
Help is integrated into the GridChem client
Users and Usage
• 433 users under 221 projects
Includes academic PIs, two graduate classes,
and about 15 training users
• More than 2,000,000 CPU wall hours
• More than 35,500 jobs processed
• 5 dissertations, more than 50 publications
User Research
Diversity of User Research
[Figure labels:]
Phosphinoborane pericyclics
CytP450 catalysis
FTIR of heptanedione on Si
Semiquinone reactions
NH3 on Si surfaces
Si surface IR
Disulfide cleavage by P-
Zeolite chemistry
V in photocatalysts
Thiolate –SS interchange
PES of diphenylbutadienes
Science Enabled
• Azide Reactions for Controlling Clean Silicon Surface
Chemistry: Benzylazide on Si(100)-2 × 1
Semyon Bocharov et al.
J. Am. Chem. Soc., 128 (29), 9300-9301, 2006
• Chemistry of Diffusion Barrier Film Formation: Adsorption
and Dissociation of Tetrakis(dimethylamino)titanium on
Si(100)-2 × 1
Rodriguez-Reyes, J. C. F.; Teplyakov, A. V.
J. Phys. Chem. C, 111 (12), 4800-4808, 2007
• Computational Studies of [2+2] and [4+2] Pericyclic
Reactions between Phosphinoboranes and Alkenes. Steric
and Electronic Effects in Identifying a Reactive
Phosphinoborane that Should Avoid Dimerization
Thomas M. Gilbert and Steven M. Bachrach
Organometallics, 26 (10), 2672-2678, 2007
Science Enabled
• Chemical Reactivity of the Biradicaloid (HO...ONO) Singlet
States of Peroxynitrous Acid. The Oxidation of
Hydrocarbons, Sulfides, and Selenides.
Bach, R. D. et al.
J. Am. Chem. Soc. 2005, 127, 3140-3155.
• The "Somersault" Mechanism for the P-450 Hydroxylation
of Hydrocarbons. The Intervention of Transient Inverted
Metastable Hydroperoxides.
Bach, R. D.; Dmitrenko, O.
J. Am. Chem. Soc. 2006, 128 (5), 1474-1488.
• The Effect of Carbonyl Substitution on the Strain Energy of
Small Ring Compounds and their Six-member Ring
Reference Compounds.
Bach, R. D.; Dmitrenko, O.
J. Am. Chem. Soc. 2006, 128 (14), 4598.
Distribution of GridChem User Community
Job Distribution
System Wide Usage
HPC System           Usage (SUs)
Tungsten (NCSA)      5507
Copper (NCSA)        86484
CCGcluster (NCSA)    55709
Condor (NCSA)        30
SDX (UKy)            116143
CCGCluster (UKy)     0.5
Longhorn (TACC)      54
CCGCluster (OSC)     62000
TGCluster (OSC)      36936
Cobalt (NCSA)        2485
Champion (TACC)      11
Mike4 (LSU)          14537
Force Field Parameterization
Molecular force fields require constant improvement
as new reference data become available (data that cannot
be accommodated easily with existing parameter sets)
New molecular systems become amenable to
computational analysis
New models/potential energy functions/Hamiltonians for
force fields are established
Coverage of force fields should constantly be extended
to cover new fields of research/new functionality
(nanomaterials, biomaterials and medicine, ...)
Cyberenvironments for Molecular
Force Fields
• Extension of currently available models, with the
resulting parameter sets to be made available publicly
• Databases of experimental and quantum mechanical
reference data to be used in the parameterization
process
• Integration of computational resources for data
acquisition, automation of QM reference data generation
• Automation: extensible infrastructure for parameterization
management for rapid and systematic parameterization
of novel Hamiltonians (empirical and semi-empirical)
• Systematic improvement of parameter optimization
processes
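As a minimal sketch of what one parameterization step might look like, the example below fits a single harmonic bond force constant to reference energies by least squares. The functional form E(r) = 0.5 k (r - r0)^2, the function name `fit_force_constant`, and the reference data are all illustrative assumptions. The energies are synthetic values generated inside the example, not QM results, and this is not the ParamChem procedure itself.

```python
def fit_force_constant(distances, energies, r0):
    """Least-squares fit of k in E(r) = 0.5 * k * (r - r0)**2.

    Minimizing sum_i (E_i - 0.5 * k * x_i)**2 with x_i = (r_i - r0)**2
    gives the closed form k = 2 * sum(E_i * x_i) / sum(x_i**2).
    """
    x = [(r - r0) ** 2 for r in distances]
    numerator = sum(e * xi for e, xi in zip(energies, x))
    denominator = sum(xi * xi for xi in x)
    return 2.0 * numerator / denominator


def rms_error(distances, energies, r0, k):
    """Root-mean-square deviation between the model and the reference energies."""
    residuals = [e - 0.5 * k * (r - r0) ** 2 for r, e in zip(distances, energies)]
    return (sum(res * res for res in residuals) / len(residuals)) ** 0.5


# Synthetic reference scan (hypothetical values standing in for QM data).
r0 = 1.09                      # assumed equilibrium bond length
k_true = 600.0                 # assumed "true" force constant used to build the data
distances = [0.99, 1.04, 1.09, 1.14, 1.19]
energies = [0.5 * k_true * (r - r0) ** 2 for r in distances]

k_fit = fit_force_constant(distances, energies, r0)
fit_quality = rms_error(distances, energies, r0, k_fit)
```

In a real workflow the reference energies would come from the QM reference database described above, and many coupled parameters would be optimized together rather than one at a time.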
Accurate Force Fields Are Needed
Fig. 1. Errors (V) in electrostatic potential on a surface at 1.8 times van der Waals radii around N-methyl
propanamide for two models. (Left) Point charges; (right) charge, dipole, and quadrupole on C, N, and O; charge and
dipole on H. The errors are much reduced in the multipole approach
A. J. Stone Science 321, 787 -789 (2008)
Published by AAAS
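To illustrate why adding higher multipole terms reduces the error in the electrostatic potential, here is a small sketch in atomic units. It compares the exact Coulomb potential of two point charges with a single-center expansion truncated at the total charge, and then at the dipole. This is a simplified single-center example with made-up charges, not the distributed multipole model of the cited figure; the function names are assumptions for illustration.

```python
import math


def exact_potential(charges, point):
    """Coulomb potential of point charges given as [(q, (x, y, z)), ...]."""
    return sum(q / math.dist(pos, point) for q, pos in charges)


def multipole_potential(charges, point, order):
    """Expansion about the origin: total charge (order 0), plus dipole (order 1)."""
    r = math.dist((0.0, 0.0, 0.0), point)
    total_charge = sum(q for q, _ in charges)
    potential = total_charge / r
    if order >= 1:
        # Dipole moment components mu_i = sum_k q_k * x_k,i
        dipole = [sum(q * pos[i] for q, pos in charges) for i in range(3)]
        potential += sum(m * p for m, p in zip(dipole, point)) / r ** 3
    return potential


# Hypothetical neutral pair of charges separated along z.
charges = [(0.5, (0.0, 0.0, 0.5)), (-0.5, (0.0, 0.0, -0.5))]
probe = (0.0, 0.0, 4.0)

exact = exact_potential(charges, probe)
error_charge_only = abs(multipole_potential(charges, probe, order=0) - exact)
error_with_dipole = abs(multipole_potential(charges, probe, order=1) - exact)
```

For this neutral pair the charge-only model predicts zero potential, so the whole potential is error, while including the dipole term recovers most of it.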
Science Gateways Layer Cake
[Diagram, by layer]
User Interfaces: Web/Gadget Container; Web/Gadget Interfaces; Web Enabled Desktop Applications
Gateway Abstraction Interfaces
Gateway Services: Application Abstractions; Fault Tolerance; Workflow System; Auditing & Reporting; Application Monitoring; User Management; Information Services; Security; Provenance & Metadata Management; Registry
Resource Middleware: Cloud Interfaces; Grid Middleware; SSH & Resource Managers
Compute Resources: Computational Clouds; Computational Grids; Local Resources
Color Coding: OGCE Gateway Components; Complementary Gateway Components; Dependent resource provider components
XSUL/Apache Axis2
GFac Current & Future Features
[Diagram]
Components: Input Handlers; Registry Interface; Scheduling Interface; Monitoring Interface; Output Handlers; Fault Tolerance; Data Management Abstraction; Auditing; Checkpoint Support; Job Management Abstraction
Resource back ends: Globus; Campus Resources; Amazon; Eucalyptus; Unicore; Condor
Color Coding: Existing Features; Planned/Requested Features
OGCE Layered Workflow Architecture:
Derived from LEAD Workflow System
[Diagram, by layer]
Workflow Interfaces (Design & Definition): XBaya GUI (Composition, Deploying, Steering & Monitoring); Flex/Web Composition; Gadget Interface for Input Binding
Workflow Specification: BPEL 2.0; BPEL 1.0; Python; Java Code; Scufl; Pegasus DAG
Workflow Execution & Control Engines: Apache ODE; GBPEL; Dynamic Enactor; Condor DAGMan; Jython Interpreter; Taverna
Putting It All Together
Pegasus WMS
ParamChem-Xbaya-Pegasus
• The input workflow for GridChem/ParamChem is created using the Pegasus
Java DAX API
-- A DAX can combine tasks (such as a Charmm task and multiple
Gaussian tasks), each taking its own input file.
• Each task can be mapped to a specific application
(such as charmm, amber, g03 or g09) based on a simple configuration.
• Input data (instructions, structure, topology, parameters) will be
staged from the middleware to the execute clusters using GridFTP
(such as the TeraGrid systems Mercury and Abe at NCSA).
• Jobs will be distributed across the multiple execute clusters using
round-robin or another scheme.
-- Any heuristics-based scheduling is also possible.
• Output files will be staged back from the execute clusters to the middleware
using GridFTP for post-processing/archiving.
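The flow described above (tasks with their own input files, round-robin placement on execute clusters, stage-in and stage-out around each run) can be sketched in plain Python. This is only an illustration of the scheduling idea and does not use the Pegasus DAX API. The `Task` class, the function names, the file names, and the step tuples are hypothetical; the cluster names are taken from the slide.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Task:
    name: str
    application: str                      # e.g. "g09" or "charmm"
    input_file: str
    parents: list = field(default_factory=list)  # names of prerequisite tasks


def topological_order(tasks):
    """Order tasks so each one comes after all of its parents (assumes no cycles)."""
    by_name = {t.name: t for t in tasks}
    ordered, seen = [], set()

    def visit(task):
        if task.name in seen:
            return
        for parent in task.parents:
            visit(by_name[parent])
        seen.add(task.name)
        ordered.append(task)

    for t in tasks:
        visit(t)
    return ordered


def assign_round_robin(tasks, clusters):
    """Map each task to an execute cluster, cycling through the cluster list."""
    sites = cycle(clusters)
    return {t.name: next(sites) for t in topological_order(tasks)}


def plan(tasks, clusters):
    """Expand every task into stage-in, run, and stage-out steps on its cluster."""
    placement = assign_round_robin(tasks, clusters)
    steps = []
    for t in topological_order(tasks):
        site = placement[t.name]
        steps.append(("stage_in", t.input_file, site))
        steps.append(("run", t.application, site))
        steps.append(("stage_out", t.name + ".out", site))
    return steps


# Hypothetical workflow: two Gaussian tasks feeding one Charmm task.
tasks = [
    Task("qm_opt", "g09", "mol_opt.com"),
    Task("qm_scan", "g09", "mol_scan.com"),
    Task("mm_fit", "charmm", "fit.inp", parents=["qm_opt", "qm_scan"]),
]
clusters = ["mercury", "abe"]
placement = assign_round_robin(tasks, clusters)
steps = plan(tasks, clusters)
```

Swapping `assign_round_robin` for another function with the same signature is one way a heuristics-based scheduler could be plugged in.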
Some New GridChem Infrastructure
• Workflow Editors
• Coupled Application Execution
• Large Scale Computing
• Metadata and Archiving
• Rich Client Platform Refactorization
• Intergrid Interactions
• Open Source Distribution
http://cvs.gridchem.org/cvs/
• Open Architecture and Implementation details
http://www.gridchem.org/wiki
ParamChem Apache Axis2 Services
• NotificationService
• ResourceService
• TriggerService
• SessionService
• SoftwareService
• JobService
• WorkflowService
• FileService
• UserService
• ProjectService
Cloud HPC Interoperability
• The cloud in our case is one part of the overall resources for
computing and storage
• Cloud resources have to be usable interoperably alongside other
HPC and local resources
• Particular uses will be on-demand computing and
high-throughput computing
• Certain routine, sensor-enabled, data-dependent
computing, such as hydrological event monitoring and simulation,
could be handled by clouds for rapid on-demand
prediction of short-term events
• The interoperability requirements that enable data and
computation movement from one resource to another should
be explored.
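One way to picture interoperable use of cloud and HPC resources is a common interface that both kinds of resource implement, with a simple policy that sends on-demand jobs wherever they can start soonest. The sketch below is a hypothetical illustration under that assumption. The class names, the wait-time figures, and the selection policy are invented for the example and do not describe the GridChem middleware.

```python
from abc import ABC, abstractmethod


class ComputeResource(ABC):
    """Common interface so HPC and cloud resources can be used interchangeably."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def expected_wait_minutes(self):
        """Estimated delay before a newly submitted job starts running."""

    def submit(self, job_name):
        # Placeholder for a real submission call; returns a description only.
        return f"{job_name} -> {self.name}"


class HPCCluster(ComputeResource):
    def __init__(self, name, queue_wait_minutes):
        super().__init__(name)
        self.queue_wait_minutes = queue_wait_minutes

    def expected_wait_minutes(self):
        return self.queue_wait_minutes


class CloudResource(ComputeResource):
    def __init__(self, name, boot_minutes):
        super().__init__(name)
        self.boot_minutes = boot_minutes

    def expected_wait_minutes(self):
        return self.boot_minutes


def choose_resource(resources, on_demand):
    """On-demand jobs take the fastest-starting resource; others prefer HPC queues."""
    if on_demand:
        return min(resources, key=lambda r: r.expected_wait_minutes())
    batch = [r for r in resources if isinstance(r, HPCCluster)]
    return batch[0] if batch else resources[0]


# Hypothetical resources and wait times.
resources = [HPCCluster("hpc_queue", 240.0), CloudResource("cloud_pool", 5.0)]
urgent = choose_resource(resources, on_demand=True).submit("flood_forecast")
routine = choose_resource(resources, on_demand=False).submit("qm_scan")
```

Because both resource types share one interface, the calling code does not change when a job moves from a batch queue to a cloud pool.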
sarvE janAh SukhinO bhavantu
May every person be happy
Questions?
Imaginations unbound