
Tools and methods for multiscale biomolecular
simulations
Celeste Sagui
Department of Physics, NC State University, Raleigh NC
We represent a partnership of 7 researchers located at three
universities and the National Institute of Environmental Health
Sciences (NIEHS), all within North Carolina's Research
Triangle
NC State Members: Jerry Bernholc, Lubos Mitas, Christopher
Roland and Celeste Sagui (PI) – Physics
UNC Member: Lee Pedersen – Chemistry and NIEHS
Duke Member: John Board – Computer Science
NIEHS Collaborator: Tom Darden – Structural Biology Lab
ITR Scientific Aims
• explore the science needed to enable a set of scalable
computational tools for large-scale, multiscale
biomolecular simulations
• the multiscale methods range from Quantum Monte Carlo
(QMC) to continuum methods
• codes will be based on real-space grids, with multigrid
acceleration and convergence
• electrostatics will be treated in a highly efficient and accurate
manner
• codes will be used to solve paradigmatic biomolecular
problems
• codes will ultimately be distributed under the Open Source
GPL license
Some Highlights of Current Progress
1. Accurate and efficient electrostatics for large-scale biomolecular simulations (Sagui, Darden, Roland)
2. Coupling of QMC and MD (Mitas)
3. Thomas-Fermi and DFT (Bernholc)
4. PMEMD and AMBER 8.0 (Pedersen, Duke)
5. Applications
Classical Molecular Dynamics
Developments: Accurate Electrostatics
Accurate and Efficient Electrostatics for Large-Scale
Biomolecular Simulations
Accurate electrostatics is absolutely essential for
meaningful biomolecular simulations: electrostatic interactions
stabilize the delicate 3-d structures, bind complexes together,
and represent the computational bottleneck in current
simulations
Key Challenges:
(a) More accurate description of electrostatics with
higher-order multipoles is needed (Our solution:
Wannier functions)
(b) Computationally efficient ways of simulating such
systems are needed (Our solution: advanced PME/
Multigrid approaches)
Limitations of Current Modeling of Electrostatic Fields
Higher-order multipoles have not been implemented
due to their overwhelming cost:
1. Multipoles up to hexadecapoles have 35 degrees of
freedom, so the interaction matrix between them has 1225
components, i.e., the cost of a fixed-cutoff implementation is 3
orders of magnitude more than for charges alone!
2. The Ewald implementation grows like O(N²)
3. "Use of cutoffs alleviates the problem": WRONG!
→ much of the cost originates in the direct part
→ truncation leads to artifactual behavior unless cutoffs on
the order of 25 Å are used
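These counts follow from simple tensor arithmetic; a minimal Python check of the numbers in point 1 (illustration only):

```python
# A symmetric rank-n Cartesian multipole tensor has (n+1)(n+2)/2
# independent components: rank 0 = charge, ..., rank 4 = hexadecapole.
def n_components(rank):
    return (rank + 1) * (rank + 2) // 2

per_rank = [n_components(n) for n in range(5)]
print(per_rank)             # [1, 3, 6, 10, 15]
print(sum(per_rank))        # 35 degrees of freedom per site
print(sum(per_rank) ** 2)   # 1225 pairwise interaction-matrix components
```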
Our Solution
1. Implement a McMurchie-Davidson formalism for the
direct part of the Ewald summation
2. Switch most of the calculation to reciprocal space
3. Implement a Particle-Mesh Ewald (PME)-based
approach for single-processor machines
4. Implement a multigrid-based approach for parallel
machines
Sagui, Pedersen, and Darden, J. Chem. Phys. 120, 73 (2004)
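To make step 2 concrete: for point charges the Ewald split is 1/r = erfc(βr)/r + erf(βr)/r, and raising β shrinks the short-ranged direct sum, moving work into reciprocal space. A minimal sketch of the charge-charge case (the multipolar direct part uses the McMurchie-Davidson recursions and is not shown; names and numbers here are illustrative):

```python
import math

def ewald_direct_pair(qi, qj, r, beta):
    """Short-range (direct-space) Ewald pair energy of two point charges."""
    return qi * qj * math.erfc(beta * r) / r

# With beta = 0.50 1/Ang and r = Rc = 5.63 Ang (the charge row of the PME
# table below), the remaining direct-space interaction is already negligible:
print(ewald_direct_pair(1.0, -1.0, 5.63, 0.50))  # on the order of -1e-5
```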
PME-based results for 4096 water molecules
               β       Rc      Spline   hx      Direct   Reciprocal   Overall
               (Å⁻¹)   (Å)     order    (Å)     (sec)    (sec)        (sec)
charges        0.50    5.63    5        0.775   0.21     0.18         0.42
dipoles        0.50    5.60    5        0.775   0.33     0.21         0.58
quadrupoles    0.55    5.10    6        0.668   0.57     0.32         0.96
octupoles      0.70    4.25    8        0.620   0.93     0.60         1.70
hexadecapoles  0.85    3.60    8        0.459   1.54     1.12         3.05
Relative RMS force error: 5×10⁻⁴; for an error of 5×10⁻⁵, the hexadecapole cost is
4.4 sec; for 5×10⁻⁶, the cost is 5 sec
With an Rc = 8 Å cutoff, the cost is 6 times more than with PME, with an RMS
error of about 0.05
Single processor Intel Xeon, 3.06 GHz, 512kB cache, 2GB memory, g77 compiler
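For reference, the relative RMS force error quoted with these timings is presumably the standard metric; a small NumPy sketch of its assumed definition (the paper's exact normalization may differ):

```python
import numpy as np

def relative_rms_force_error(f_approx, f_ref):
    # sqrt( sum |F_approx - F_ref|^2 / sum |F_ref|^2 ) over all atoms
    return np.sqrt(np.sum((f_approx - f_ref) ** 2) / np.sum(f_ref ** 2))

# Synthetic check: perturbing unit-variance reference forces with noise of
# RMS amplitude 5e-4 reproduces an error of ~5e-4.
rng = np.random.default_rng(0)
f_ref = rng.normal(size=(4096, 3))
f_approx = f_ref + 5e-4 * rng.normal(size=f_ref.shape)
print(relative_rms_force_error(f_approx, f_ref))  # ~5e-4
```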
Multigrid-based results for 4096 water molecules
               β       Rc      RG      hx      Direct   Reciprocal
               (Å⁻¹)   (Å)     (Å)     (Å)     (sec)    (sec)
charges        0.60    5.20    3.50    0.620   0.20     2.30
dipoles        0.61    5.20    3.63    0.620   0.29     2.66
quadrupoles    0.70    4.80    3.45    0.516   0.52     4.64
octupoles      0.75    4.25    3.10    0.443   0.94     8.42
hexadecapoles  0.79    4.25    3.05    0.388   2.21     15.71
Relative RMS force error: 5×10⁻⁴
Single processor Intel Xeon, 3.06 GHz, 512kB cache, 2GB memory,
g77 compiler
MD simulation of DNA decamer d(CCAACGTTGG)2 in a
crystal environment
J. Baucom et al, submitted to J. Chem. Phys. 2004
[Figure: RMS deviation of constant-pressure crystal simulations with respect
to the crystal structure, comparing charges only vs. charges and induced dipoles]
Calculating Multipole Moments via Wannier
Functions (WFs)
• to partition the charge density and calculate the
multipole moments, we use a WANNIER FUNCTION
(WF) approach
• this has several advantages:
1. WFs provide a chemically and physically intuitive
way of partitioning the charge (ref. Marzari and
Vanderbilt, PRB 56, 12847 (1997))
2. WFs are distributed in space, which allows for a more
faithful representation of the electrostatic potential
3. no ad hoc assignment of the charge
4. the procedure is numerically quite stable
Ref: C. Sagui, P. Pomorski, T. Darden and C. Roland, J. Chem. Phys. 120,
4530 (2004)
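Schematically, and assuming the usual convention of spin-paired, doubly occupied WFs (a sketch of the standard construction; the paper's exact conventions may differ), each WF $w_n$ centered at $\mathbf{R}_n$ contributes site multipoles

$$ q_n = -2e, \qquad \mu_{n,\alpha} = -2e \int |w_n(\mathbf{r})|^2 \,(r_\alpha - R_{n,\alpha})\, d^3r, $$

with higher moments defined analogously; adding the ionic core charges at the nuclear positions yields a distributed multipole model of the molecular electrostatic potential.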
Maximally localized Wannier functions for water
• the water molecule has 4 WFs
• 2 associated with the OH bonds (light blue)
• 2 associated with the O lone pairs (dark blue)
Electrostatic potential for single water molecule as
generated by WFs
[Figure panels: ab initio, m, mdq, mdqo, mdqoh (WF multipole expansions
including monopoles through hexadecapoles)]
Wannier functions for carbon dioxide
• CO2 has 8 WFs
• 6 associated with the CO bonds (light blue)
• 2 associated with O (dark blue)
Electrostatic potential for carbon dioxide molecule as
generated by WFs
[Figure panels: ab initio, m, mdq, mdqo, mdqoh (WF multipole expansions
including monopoles through hexadecapoles)]
Quantum Monte Carlo Developments
New continuous quantum Monte Carlo/molecular
dynamics method
• we propose a new method for coupling ab initio
molecular dynamics ionic steps with stochastic DMC
electronic steps, to provide accurate DMC energies
"on-the-fly"
• exploits the slowness of the MD evolution, which enables
the QMC sampling process to be updated very efficiently
• accurate for both thermal averages and the description of
energies along the pathways
• we have carried out the first QMC/MD simulations
using both forces and energies from QMC
Ref: J. Grossmann and L. Mitas, preprint 2004
Coupling of QMC and MD: Basic Idea
average distance traveled by an ion in one MD time step:
10⁻⁴ … 10⁻³ a.u.
average distance traveled by an electron in a typical DMC time step:
10⁻² … 10⁻¹ a.u.
Instead of sampling each point discretely with a new QMC run:
calculate QMC energies "on-the-fly" during the dynamics
simulation!
Continuously update the DMC walkers so that they correctly
represent the evolving wave function (CDMC method)
Evolution of both configuration spaces is coupled: as the ionic
dynamical trajectories evolve, so does the population of DMC
electrons
[Figure: stable CDMC simulation]
Successful CDMC Algorithm
1. ab initio MD step → new ionic positions R and wavefunction Ψ(R)
2. compute orbital overlaps with the current DMC walkers
3. orbital swapping or rotation?
   yes → do VMC, then DMC
   no → check for node crossings, compute weights
4. take DMC step(s) and calculate the energy
Validation: • stable DMC population • How accurate is it? •
Benchmark against discrete DMC
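To make the control flow concrete, a schematic Python sketch of the CDMC loop described above (all function names here are hypothetical placeholders, not an actual API):

```python
def cdmc_run(md_state, walkers, n_md_steps, n_dmc_per_md=3):
    """Schematic CDMC driver: couple ab initio MD steps to DMC updates."""
    energies = []
    for _ in range(n_md_steps):
        # 1. Ab initio MD step: new ionic positions R and wavefunction Psi(R).
        R, psi = md_step(md_state)

        # 2. Overlaps of the updated orbitals with the current DMC walkers.
        overlaps = orbital_overlaps(psi, walkers)

        # 3. If orbitals swapped or rotated, regenerate the population
        #    (VMC then DMC); otherwise reweight, checking node crossings.
        if orbitals_swapped_or_rotated(overlaps):
            walkers = vmc_then_dmc(psi, walkers)
        else:
            walkers = reweight_walkers(psi, walkers)

        # 4. A few DMC steps keep walkers in sync with the slowly evolving
        #    wavefunction (3 steps sufficed for SiH4 and Si2H6, see below).
        for _ in range(n_dmc_per_md):
            walkers, energy = dmc_step(psi, walkers)
        energies.append(energy)
    return energies
```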
CDMC: Number of DMC steps needed per MD step
• use large discretely sampled runs (1000 steps each)
for comparison
• as the simulation progresses, 1-step CDMC energies
begin to differ significantly from discrete DMC
• using 3 steps corrects the time "lag"
Thermal Averages (over 1 ps)
E(discrete DMC)        = -6.228(2)
E(1 step continuous)   = -6.220(2)
E(2 steps continuous)  = -6.220(2)
E(3 steps continuous)  = -6.226(2)
E(10 steps continuous) = -6.230(2)
E(20 steps continuous) = -6.228(2)
• 33 times more efficient than discrete sampling
• thermal averages are converged for N ≥ 3
• the same convergence (3 CDMC steps) is observed
for Si2H6 and Si5H12
CDMC: Si2H6
As for SiH4, the molecule is asymmetrically
stretched and released
Average temperature ~ 1500 K
CDMC: Si2H6 Results
For Si2H6, 3 steps appear to lead to
stability, as for SiH4
The number of steps looks like a
function of the dynamics
rather than of system size
Can pinpoint specific
types of strain that
lead to wavefunction lag
Test of quantum Monte Carlo/molecular dynamics
method on water dissociation
“QMC only” molecular dynamics, with no external input from DFT
SiH4 at 1500 K
• DMC forces in very good
agreement with DFT forces
H2O Dissociation
• DMC-MD and DFT-MD
trajectories are in excellent
agreement
Density Functional Theory Developments
Development of hybrid QM calculations: interfacing
DFT with Thomas-Fermi calculations
Idea: in many biological systems,
only part of the system is
chemically active
→ use ab initio methods for this
part (real-space, multigrid-based
code)
→ use more approximate
methods for the rest of the
system (in this case a Thomas-Fermi
approach, with frozen
density for molecules, and
gradient corrections)
Ref: M. Hodak, W. Lu, and J. Bernholc, in preparation
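For context, the orbital-free Thomas-Fermi kinetic-energy functional, written here with a von Weizsäcker-type gradient correction (the standard form; the slide does not specify which correction the code uses), is, in atomic units,

$$ T[\rho] = C_F \int \rho^{5/3}(\mathbf{r})\, d^3r + \frac{\lambda}{8} \int \frac{|\nabla\rho(\mathbf{r})|^2}{\rho(\mathbf{r})}\, d^3r, \qquad C_F = \frac{3}{10}\,(3\pi^2)^{2/3}. $$

Because no orbitals are needed, each Thomas-Fermi evaluation is far cheaper than a full DFT calculation, which is the origin of the speed-up quoted on the next slide.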
Hybrid calculation tests
● Interaction of two water
molecules in a hybrid calculation
- hydrogen bonding test
- molecule 1: ab initio
- molecule 2: Thomas-Fermi
Gives an estimated speed-up of 500 times!
Parallelization and Coding Advances
Improvements in performance and parallel scalability
of AMBER MD Software
A redesign of the AMBER SANDER program, along with a rewrite
in FORTRAN 90, was undertaken with the goal of substantially
improving the practicality of multi-nanosecond PME simulations
(i.e., in the 100,000 to 300,000 atom range)
The resulting software has been released to the AMBER community in
3 phases; the new software is named PMEMD, for PARTICLE
MESH EWALD MOLECULAR DYNAMICS
Release dates:
July 2003 – PMEMD 3.0
October 2003 – PMEMD 3.1
March 2004 – PMEMD 8.0 (part of AMBER 8.0)
PMEMD is now the primary high-performance MD
modeling tool in AMBER!
Representative Benchmark
Results for the Factor IX constant-pressure system from Dr. Lalith
Perera, a solvated system with a total of 90,906 atoms. The time step was
1.5 fs, the direct force cutoff was 8.0 Å, and all simulations used PME.
The runs were done on the IBM 1.3 GHz p690 Regatta at the Edinburgh Parallel
Computing Centre.
#procs   PMEMD 8      PMEMD 3.1    PMEMD 3.0    Sander 8     Sander 7     Sander 6
         (psec/day)   (psec/day)   (psec/day)   (psec/day)   (psec/day)   (psec/day)
8        nd           346          353          nd           233          182
16       672          607          594          nd           279          258
32       1125         1035         929          nd           306          297
64       1975         1770         1127         369          318          339
96       2743         2304         nd           nd           nd           nd
112      2945         2631         nd           nd           nd           nd
128      2516*        2864         nd           339          nd           nd
* Performance falloff observed here. Max performance obtained at a higher
processor count was 3600 psec/day, but required using only 4 of the 8 CPUs
on each 8-CPU multi-chip module of the SP4.
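One way to read the table is in terms of parallel efficiency relative to the 8-processor run; a small script over the PMEMD 3.1 column (pure arithmetic on the numbers above):

```python
# Throughputs (psec/day) from the PMEMD 3.1 column of the benchmark table.
procs      = [8, 16, 32, 64, 96, 112, 128]
throughput = [346, 607, 1035, 1770, 2304, 2631, 2864]

base_p, base_t = procs[0], throughput[0]
for p, t in zip(procs, throughput):
    speedup = t / base_t
    efficiency = speedup / (p / base_p)
    print(f"{p:4d} procs: speedup {speedup:5.2f}x, efficiency {efficiency:6.1%}")
# At 128 procs: ~8.3x the 8-proc throughput on 16x the CPUs, ~52% efficiency.
```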
Software References
The maximum throughput obtainable on the SP4 is up to an order
of magnitude higher for PMEMD 8 than for any other version of
SANDER
Software Publication References:
R.E. Duke and L.G. Pedersen, PMEMD 3.0 (2003)
R.E. Duke and L.G. Pedersen, PMEMD 3.1 (2003)
D.A. Case et al, AMBER 8.0 (2004)
Some Applications …
1. QM/MM studies of enzymatic sulfate transfer in the heparan sulfate
precursor
2. QM/MM studies of enzymatic sulfate transfer in estrogen sulfation
3. PMEMD study of the mammalian P450 enzyme and the ternary blood
coagulation complex tissue factors
4. Protein folding study on ionic domains of the coagulation protein
prothrombin
5. Solvation and deprotonation of formic acid
6. Crystallographic studies of DNA
7. Binding of vancomycin and teicoplanin antibiotics to bacterial cell
wall termini
8. Structure and function of serine proteases
9. QM/MM study of the role of Mg ions in the mechanism of DNA polymerase
Mixed Quantum and Molecular Mechanics Simulations of Sulfuryl
Transfer Reaction Catalyzed by Human Estrogen
Sulfotransferase
P. Lin and L. Pedersen
[Figure: QM/MM active site, with K47, PAPS, E2, and H107 labeled]
Mixed Quantum and Molecular Mechanics Simulations of Sulfuryl
Transfer Reaction Catalyzed by Human Estrogen Sulfotransferase
P. Lin and L. Pedersen
• estrogen is one of the most
important hormones found in the
human body
• it is extremely important that the
body regulate estrogen, turning its
activity both on and off
• the deactivation of estrogen takes
place by transferring a sulfate
group to the hormone
• the details of this important reaction
were investigated by means of a mixed
quantum and classical molecular
dynamics simulation, as shown in the
movie
• the movie shows how the sulfate
group is placed on the
estrogen
Summary
Scientific aims are to produce a set of scalable and portable
computational tools for multiscale biomolecular calculations
Considerable progress in a number of areas:
1. Development of accurate and efficient methods for treatment
of long-range electrostatic forces
2. Development of QMC and MD methods
3. Development of DFT and TF interface
4. PMEMD and AMBER 8.0
5. Applications