Tools and Methods for Multiscale Biomolecular Simulations
Celeste Sagui, Department of Physics, NC State University, Raleigh, NC

We are a partnership of seven researchers located at three universities and the National Institute of Environmental Health Sciences (NIEHS), all within North Carolina's Research Triangle.
NC State members: Jerry Bernholc, Lubos Mitas, Christopher Roland and Celeste Sagui (PI) – Physics
UNC member: Lee Pedersen – Chemistry and NIEHS
Duke member: John Board – Computer Science
NIEHS collaborator: Tom Darden – Structural Biology Lab

ITR Scientific Aims
• explore the science needed to enable a set of scalable computational tools for large-scale, multiscale biomolecular simulations
• the multiscale methods are to range from quantum Monte Carlo (QMC) to continuum methods
• codes will be based on real-space grids, with multigrid acceleration and convergence
• electrostatics will be treated in a highly efficient and accurate manner
• codes will be used to solve paradigmatic biomolecular problems
• codes will ultimately be distributed under the Open Source GPL license

Some Highlights of Current Progress
1. Accurate and efficient electrostatics for large-scale biomolecular simulations (Sagui, Darden, Roland)
2. Coupling of QMC and MD (Mitas)
3. Thomas-Fermi and DFT (Bernholc)
4. PMEMD and AMBER 8.0 (Pedersen, Duke)
5. Applications

Classical Molecular Dynamics Developments: Accurate Electrostatics

Accurate and Efficient Electrostatics for Large-Scale Biomolecular Simulations
Accurate electrostatics is absolutely essential for meaningful biomolecular simulations: electrostatic interactions stabilize the delicate 3-d structures, bind complexes together, and represent the computational bottleneck in current simulations.
Key challenges:
(a) a more accurate description of electrostatics with higher-order multipoles is needed (our solution: Wannier functions)
(b) computationally efficient ways of simulating such systems are needed (our solution: advanced PME / multigrid approaches)

Limitations of Current Modeling of Electrostatic Fields
Higher-order multipoles have not been implemented because of their overwhelming cost:
1. Multipoles up to hexadecapoles have 35 degrees of freedom, so the interaction matrix between two such sites has 1225 components; a fixed-cutoff implementation therefore costs three orders of magnitude more than charges alone.
2. An Ewald implementation grows like O(N²).
3. "Use of cutoffs alleviates the problem": WRONG! Much of the cost originates in the direct part, and truncation leads to artifactual behavior unless cutoffs of the order of 25 Å are used.

Our Solution
1. Implement a McMurchie-Davidson formalism for the direct part of the Ewald summation
2. Switch most of the calculation to reciprocal space
3. Implement a Particle-Mesh Ewald (PME)-based approach for single-processor machines
4. Implement a multigrid-based approach for parallel machines
Reference: Sagui, Pedersen, and Darden, J. Chem. Phys. 120, 73 (2004)
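To make the cost argument above concrete, here is a minimal Python sketch (an illustration of the arithmetic only, not part of the ITR codes): a symmetric rank-l Cartesian tensor has (l+1)(l+2)/2 components, so the moments through hexadecapoles carry 1 + 3 + 6 + 10 + 15 = 35 degrees of freedom per site, and a pairwise interaction matrix has 35 × 35 = 1225 entries.

```python
# Count the independent components of symmetric Cartesian multipole tensors
# and the size of the resulting pair-interaction matrix (illustration only).

def n_components(rank):
    """Number of components of a symmetric rank-`rank` Cartesian tensor in 3-d."""
    return (rank + 1) * (rank + 2) // 2

levels = ["charges", "dipoles", "quadrupoles", "octupoles", "hexadecapoles"]
moments_per_site = 0
for rank, name in enumerate(levels):
    moments_per_site += n_components(rank)
    pair_matrix = moments_per_site ** 2
    print(f"up to {name:13s}: {moments_per_site:2d} moments/site, "
          f"{pair_matrix:4d} pair-interaction components")
```

The last line recovers the 35 and 1225 figures quoted in the slide, i.e., roughly three orders of magnitude more pairwise work than for point charges alone.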
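The reason "switch most of the calculation to reciprocal space" pays off is that the direct-space Ewald term is screened by erfc(βr): a larger Ewald coefficient β lets the real-space sum be truncated at a shorter cutoff Rc, which is the β/Rc trade-off visible in the tables that follow. Below is a small sketch of the charge-charge direct term only (my own illustration; the multipolar direct-space terms handled by the McMurchie-Davidson recursions are not reproduced here), in units where the Coulomb constant is 1.

```python
import math

def ewald_direct_pair(q1, q2, r, beta):
    """Direct-space Ewald energy of two point charges separated by r:
    q1*q2*erfc(beta*r)/r (Coulomb constant set to 1)."""
    return q1 * q2 * math.erfc(beta * r) / r

# A larger beta screens the direct term faster, so a shorter real-space cutoff
# suffices and more of the work shifts to the reciprocal-space (mesh) sum.
for beta in (0.50, 0.85):   # in 1/Angstrom, the same range as in the tables below
    tail = ewald_direct_pair(1.0, 1.0, 6.0, beta)
    print(f"beta = {beta:.2f} 1/Ang: direct-space pair energy at r = 6 Ang is {tail:.1e}")
```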
PME-based results for 4096 water molecules

                β (Å⁻¹)   Rc (Å)   Spline order   hx (Å)   Direct (sec)   Reciprocal (sec)   Overall (sec)
charges          0.50      5.63         5          0.775       0.21            0.18              0.42
dipoles          0.50      5.60         5          0.775       0.33            0.21              0.58
quadrupoles      0.55      5.10         6          0.668       0.57            0.32              0.96
octupoles        0.70      4.25         8          0.620       0.93            0.60              1.70
hexadecapoles    0.85      3.60         8          0.459       1.54            1.12              3.05

Relative RMS force error: 5×10⁻⁴. For an error of 5×10⁻⁵ the hexadecapole cost is 4.4 sec; for 5×10⁻⁶ the cost is 5 sec.
With an Rc = 8 Å cutoff (no PME), the cost is 6 times higher and the RMS force error is about 0.05.
Single processor: Intel Xeon, 3.06 GHz, 512 kB cache, 2 GB memory, g77 compiler.

Multigrid-based results for 4096 water molecules

                β (Å⁻¹)   Rc (Å)   RG (Å)   hx (Å)   Direct (sec)   Reciprocal (sec)
charges          0.60      5.20     3.50     0.620       0.20            2.30
dipoles          0.61      5.20     3.63     0.620       0.29            2.66
quadrupoles      0.70      4.80     3.45     0.516       0.52            4.64
octupoles        0.75      4.25     3.10     0.443       0.94            8.42
hexadecapoles    0.79      4.25     3.05     0.388       2.21           15.71

Relative RMS force error: 5×10⁻⁴.
Single processor: Intel Xeon, 3.06 GHz, 512 kB cache, 2 GB memory, g77 compiler.

MD simulation of the DNA decamer d(CCAACGTTGG)2 in a crystal environment
J. Baucom et al., submitted to J. Chem. Phys., 2004
[Figure: RMS deviation of the constant-pressure crystal simulations with respect to the crystal structure, comparing "charges only" with "charges and induced dipoles"]

Calculating Multipole Moments via Wannier Functions (WFs)
• to partition the charge density and calculate the multipole moments, we use a Wannier function (WF) approach (a crude point-charge sketch of the moment calculation follows the CO2 example below)
• this has several advantages:
  1. WFs provide a chemically and physically intuitive way of partitioning the charge (ref.: Marzari and Vanderbilt, PRB 56, 12847 (1997))
  2. WFs are distributed in space, which allows for a more faithful representation of the electrostatic potential
  3. no ad hoc assignment of the charge
  4. the procedure is numerically quite stable
Ref: C. Sagui, P. Pomorski, T. Darden and C. Roland, J. Chem. Phys. 120, 4530 (2004)

Maximally localized Wannier functions for water
• the water molecule has 4 WFs
• 2 are associated with the OH bonds (light blue)
• 2 are associated with the O lone pairs (dark blue)

Electrostatic potential for a single water molecule as generated by WFs
[Figure panels: ab initio, m, mdq, mdqo, mdqoh, i.e., multipole expansions of increasing order]

Wannier functions for carbon dioxide
• CO2 has 8 WFs
• 6 are associated with the CO bonds (light blue)
• 2 are associated with O (dark blue)

Electrostatic potential for the carbon dioxide molecule as generated by WFs
[Figure panels: ab initio, m, mdq, mdqo, mdqoh, i.e., multipole expansions of increasing order]
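As a rough illustration of the moment calculation mentioned in the Wannier-function slides (my own sketch; the published approach of Sagui, Pomorski, Darden and Roland expands the actual WF charge densities rather than collapsing them to points): if each doubly occupied maximally localized WF is represented as a point charge of -2e at its center, the molecular charge and multipole moments follow from simple sums over ionic cores and WF centers. The geometry and WF-center coordinates below are hypothetical placeholders, not computed values.

```python
import numpy as np

def multipole_moments(charges, positions):
    """Charge, dipole, and traceless quadrupole of a set of point charges.

    charges   : (N,)   charges in units of e
    positions : (N, 3) coordinates in Angstrom
    Returns (total charge, dipole vector, 3x3 traceless quadrupole),
    all taken about the origin.
    """
    q = np.asarray(charges, dtype=float)
    r = np.asarray(positions, dtype=float)
    total = q.sum()
    dipole = (q[:, None] * r).sum(axis=0)
    # Q_ab = (1/2) * sum_i q_i * (3 r_a r_b - r^2 delta_ab)
    r2 = (r * r).sum(axis=1)
    quad = 0.5 * np.einsum("i,iab->ab", q,
                           3.0 * r[:, :, None] * r[:, None, :]
                           - r2[:, None, None] * np.eye(3))
    return total, dipole, quad

# Hypothetical water-like example: valence ionic cores (O: +6e, H: +1e) plus
# four doubly occupied WFs of charge -2e; all coordinates are placeholders.
charges = [6.0, 1.0, 1.0, -2.0, -2.0, -2.0, -2.0]
positions = [[0.00,  0.00,  0.00],   # O core
             [0.76,  0.59,  0.00],   # H core
             [-0.76, 0.59,  0.00],   # H core
             [0.37,  0.29,  0.00],   # WF center on one OH bond
             [-0.37, 0.29,  0.00],   # WF center on the other OH bond
             [0.00, -0.20,  0.25],   # WF center near one O lone pair
             [0.00, -0.20, -0.25]]   # WF center near the other O lone pair
q_tot, mu, theta = multipole_moments(charges, positions)
print("total charge (e):", q_tot)
print("dipole (e*Ang):  ", mu)
```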
Quantum Monte Carlo Developments

New continuous quantum Monte Carlo / molecular dynamics method
• we propose a new method that couples ab initio molecular dynamics ionic steps with stochastic DMC electronic steps to provide accurate DMC energies "on the fly"
• it exploits the slowness of the MD evolution, which makes it possible to update the QMC sampling process very efficiently
• it is accurate both for thermal averages and for the description of energies along the pathways
• we have carried out the first QMC/MD simulations using both forces and energies from QMC
Ref: J. Grossman and L. Mitas, preprint 2004

Coupling of QMC and MD: Basic Idea
• average distance traveled by an ion in one MD time step: 10⁻⁴ – 10⁻³ a.u.
• average distance traveled by an electron in a typical DMC time step: 10⁻² – 10⁻¹ a.u.
• instead of discretely sampling each point with a new QMC run, calculate QMC energies "on the fly" during the dynamical simulation
• continuously update the DMC walkers so that they correctly represent the evolving wave function (the CDMC method)
• the evolution of the two configuration spaces is coupled: as the ionic dynamical trajectories evolve, so does the population of DMC electrons
• the result is a stable CDMC simulation

Successful CDMC Algorithm (a schematic code sketch of this loop follows the QMC results below)
1. Take an ab initio MD step, giving new ionic positions R and wave function Ψ(R)
2. Compute the orbital overlaps with the current DMC walkers
3. If the orbitals have swapped or rotated, re-equilibrate: do VMC, then DMC
4. Otherwise, check for node crossings and compute the walker weights
5. Take DMC step(s) and calculate the energy
Design checks: keep the DMC population stable; assess accuracy by benchmarking against discrete DMC

CDMC: number of DMC steps needed per MD step
• use large discretely sampled runs (1000 steps each) for comparison
• as the simulation progresses, 1-step CDMC energies begin to differ significantly from discrete DMC
• using 3 steps corrects the time "lag"

Thermal averages (over 1 ps):
E(discrete DMC) = -6.228(2)
E(1-step continuous) = -6.220(2)
E(2-step continuous) = -6.220(2)
E(3-step continuous) = -6.226(2)
E(10-step continuous) = -6.230(2)
E(20-step continuous) = -6.228(2)
• 33 times more efficient than discrete sampling
• thermal averages are converged for N ≥ 3
• the same convergence (3 CDMC steps) is observed for Si2H6 and Si5H12

CDMC: Si2H6
As for SiH4, asymmetrically stretch the molecule and let it go; the average temperature is ~1500 K.

CDMC: Si2H6 results
• for Si2H6, 3 steps appear to lead to stability, as for SiH4
• the number of steps appears to depend on the dynamics rather than on system size
• specific types of strain that lead to wave-function lag can be pinpointed

Test of the quantum Monte Carlo / molecular dynamics method on water dissociation
"QMC-only" molecular dynamics, with no external input from DFT:
• SiH4 at 1500 K: DMC forces are in very good agreement with DFT forces
• H2O dissociation: the DMC-MD and DFT-MD trajectories are in excellent agreement
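The Python skeleton below is only a schematic rendering of the decision flow on the "Successful CDMC Algorithm" slide, referenced there; every QMC/MD ingredient is passed in as a user-supplied callable, and all of the names are hypothetical placeholders rather than the actual routines of Grossman and Mitas.

```python
def cdmc_md_loop(n_md_steps, md_step, orbital_overlaps, orbitals_swapped,
                 reequilibrate_vmc_dmc, node_crossing_weights, dmc_step,
                 walkers, n_dmc_steps_per_md=3):
    """Schematic CDMC/MD driver mirroring the slide's flowchart.

    The callables are hypothetical placeholders:
      md_step()                        -> (R, psi): ionic positions, trial wave function
      orbital_overlaps(psi, walkers)   -> overlap data between psi and the walkers
      orbitals_swapped(overlaps)       -> True if orbitals swapped or rotated
      reequilibrate_vmc_dmc(psi, w)    -> walkers re-equilibrated by VMC, then DMC
      node_crossing_weights(psi, w, o) -> walkers reweighted after node-crossing checks
      dmc_step(psi, walkers)           -> (walkers, energy) for one DMC step
    """
    energies = []
    for _ in range(n_md_steps):
        R, psi = md_step()                             # ab initio MD ionic step
        overlaps = orbital_overlaps(psi, walkers)      # compare psi with current walkers
        if orbitals_swapped(overlaps):                 # orbital swapping or rotation?
            walkers = reequilibrate_vmc_dmc(psi, walkers)           # yes: VMC, then DMC
        else:
            walkers = node_crossing_weights(psi, walkers, overlaps)  # no: reweight walkers
        for _ in range(n_dmc_steps_per_md):            # a few DMC steps per MD step
            walkers, energy = dmc_step(psi, walkers)
        energies.append(energy)                        # accumulate on-the-fly DMC energies
    return walkers, energies
```

The default of three DMC steps per MD step mirrors the convergence reported in the slides (thermal averages converged for N ≥ 3).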
Density Functional Theory Developments

Development of hybrid QM calculations: interfacing DFT with Thomas-Fermi calculations
Idea: in many biological systems, only part of the system is chemically active.
• use ab initio methods for this part (real-space, multigrid-based code)
• use more approximate methods for the rest of the system (here, a Thomas-Fermi approach with frozen molecular densities and gradient corrections)
Ref: M. Hodak, W. Lu, and J. Bernholc, in preparation

Hybrid calculation tests
• interaction of two water molecules in a hybrid calculation (a hydrogen-bonding test)
• molecule 1: ab initio; molecule 2: Thomas-Fermi
• gives an estimated speed-up of 500 times!

Parallelization and Coding Advances

Improvements in performance and parallel scalability of the AMBER MD software
A redesign of the AMBER SANDER program, along with a rewrite in Fortran 90, was undertaken with the goal of substantially improving the practicality of multi-nanosecond PME simulations (i.e., in the 100,000 to 300,000 atom range).
The resulting software has been released to the AMBER community in 3 phases; the new software is named PMEMD, for Particle Mesh Ewald Molecular Dynamics.
Release dates:
July 2003 – PMEMD 3.00
October 2003 – PMEMD 3.10
March 2004 – PMEMD 8.0 (part of AMBER 8.0)
PMEMD is now the primary high-performance MD modeling tool in AMBER!

Representative benchmark results for the Factor IX constant-pressure system from Dr. Lalith Perera, a solvated system with a total of 90,906 atoms. The time step was 1.5 fs, the direct force cutoff was 8.0 Å, and all simulations used PME. The runs were done on the IBM 1.3 GHz p690 Regatta at the Edinburgh Parallel Computing Centre. Throughput is given in psec/day (nd: no data).

#procs         8     16     32     64     96    112    128
PMEMD 8       nd    672   1125   1975   2743   2945   2516*
PMEMD 3.1    346    607   1035   1770   2304   2631   2864
PMEMD 3.0    353    594    929   1127     nd     nd     nd
Sander 8      nd     nd     nd    369     nd     nd    339
Sander 7     233    279    306    318     nd     nd     nd
Sander 6     182    258    297    339     nd     nd     nd

* Performance falloff observed here. The maximum performance obtained at higher processor counts was 3600 psec/day, but it required using only 4 of the 8 CPUs on each 8-CPU multi-chip module of the SP4.

Software References
The maximum throughput obtainable on the SP4 with PMEMD 8 is an order of magnitude higher than with any other version of SANDER.
Software publication references:
R.E. Duke and L.G. Pedersen, PMEMD 3.0 (2003)
R.E. Duke and L.G. Pedersen, PMEMD 3.1 (2003)
D.A. Case et al., AMBER 8.0 (2004)

Some Applications
1. QM/MM studies of enzymatic sulfate transfer in the heparan sulfate precursor
2. QM/MM studies of enzymatic sulfate transfer in estrogen sulfation
3. PMEMD study of the mammalian P450 enzyme and the ternary blood coagulation tissue factor complex
4. Protein folding study of the ionic domains of the coagulation protein prothrombin
5. Solvation and deprotonation of formic acid
6. Crystallographic studies of DNA
7. Binding of the vancomycin and teicoplanin antibiotics to bacterial cell wall termini
8. Structure and function of serine proteases
9. QM/MM study of the role of Mg ions in the mechanism of DNA polymerase

Mixed Quantum and Molecular Mechanics Simulations of the Sulfuryl Transfer Reaction Catalyzed by Human Estrogen Sulfotransferase
P. Lin and L. Pedersen
[Figure: active-site view labeled K47, PAPS, E2, H107]
• estrogen is one of the most important hormones found in the human body
• it is extremely important that the body be able to regulate estrogen, turning it both on and off
• the deactivation of estrogen takes place by transferring a sulfate group to the hormone
• the details of this important reaction were investigated by means of a mixed quantum and classical molecular dynamics simulation, as shown in the movie
• the movie shows how the sulfate group is placed on the estrogen

Summary
The scientific aim is to produce a set of scalable and portable computational tools for multiscale biomolecular calculations.
Considerable progress has been made in a number of areas:
1. Development of accurate and efficient methods for the treatment of long-range electrostatic forces
2. Development of coupled QMC and MD methods
3. Development of a DFT / Thomas-Fermi interface
4. PMEMD and AMBER 8.0
5. Applications