Parallel Computing in Chemistry

Download Report

Transcript Parallel Computing in Chemistry

Molecular Simulation at MCSR

Brian W. Hopkins Mississippi Center for Supercomputing Research

9 October 2008

What We’re Doing Here

• Talk about molecular simulation procedures.

• Talk about available simulation programs.

• Discuss user simulation needs and program options.

Why Simulation

• NOT so we can watch pretty pictures of atoms moving around!

• Many systems have very large numbers of internal DoF.

• It’s simply not possible to determine what is “the minimum” or whatever on these surfaces.

• To get an accurate thermodynamic & kinetic picture of these systems, it’s necessary to sample a medium->large region of phase space.

Simulation: Anatomy of A Program

• Read in sim paramters, atomic coordinates and topology. • Begin Timestepping – Apply periodic boundary conditions.

– Build neighbor list.

– Compute bonded forces.

– Compute short-rang nonbonded forces.

– Compute electrostatic forces.

– Integrate forward in time.

• Write atomic positions, velocities and forces.

• Take data for statistical accumulators. – Repeat.

• End timestepping.

• Print restart data.

Program Anatomy: Read Control

nafion #restart temperature 298.00

steps 200000 timestep 0.00050 ps ensemble nve evbconv 1e-6 movlp 1 cutoff 10.0 angstrom rvdw 9.0 angstrom delr 0.00 angstrom

Program Anatomy: Read Config

hcl in water 2 1 2000000 0.001

21.5518556898 0 0 0 21.5518556898 0 0 0 21.5518556898

…… OW 7 15.994900 -0.820000

4.5231E-01 4.3198E+00 -6.6975E+00 4.7263E+00 3.4690E-01 -6.3309E+00 -1.9170E+04 5.2043E+03 -2.1576E+03 HW 8 1.007800 0.410000

5.1477E-01 5.3695E+00 -6.6887E+00 1.0886E+01 -6.6502E+00 2.6571E+01 -6.1635E+02 -8.2402E+03 2.6958E+03 HW 9 1.007800 0.410000

-5.5940E-01 4.0344E+00 -6.5708E+00 7.6145E+00 -7.4611E-01 -1.0319E+01 1.8438E+04 1.8167E+03 8.6264E+02 …

Program Anatomy: Read Topology

UNITS kcal MOLECULAR TYPES 4 FLEXIBLE-TIP3P-Water NUMMOLS 332 atoms 3 OW 15.9949 -0.8200

HW 1.0078 0.4100

HW 1.0078 0.4100

BONDS 2 harm 1 2 1059.162 1.012

harm 1 3 1059.162 1.012

ANGLES 1 harm 2 1 3 75.900 113.240

FINISH …… vdw 9 OW OW lj 0.155425 3.16549

OT OW lj 0.1238000 3.1420000

HT OW lj 0.0025115 1.5827460

Program Anatomy: Set up Cell

Program Anatomy: Build Neighbor List

• Apply PBC to cell • Begin at an atom • Step out, forming thin shells.

• When an atom falls within the thin shell, add it to the neighbor list.

• Continue until cutoff is reached.

• Move on to the next atom.

Pairlisting

Bonded Forces

• Identify atoms that are connected by bonds, angles, dihedrals and impropers.

• Calculate energies and forces from functions, parameters in input: – Bonds: harmonic, morse – Angles: harmonic – Dihedrals: cosine – Impropers: cosine

Short Range Forces

• Loop over atoms. For each: • Loop over pairs • Calculate van der Waals potential, force • Calculate short-range (real) ES potential, force • Scale for 1-4 interactions?

Long-Range Forces

• Because vdW forces obey a 6-12 potential, they fall away very quickly wrt distance.

• Consequently, as you move away from an atom through a simulation cell you quickly come to a point where including additional vdW terms is irrelevant. Hence, the cutoff.

• Coulomb forces and energies, however, fall off with r 2 and r respectively.

• As a result, no reasonable cutoff will be satisfactory; results will always be strange.

Solution: Ewald Summation

• The coulomb equations converged much more quickly in Fourier space.

• So, we can transform all atomic coordinates to Fourier space, compute ES energies until some convergence criterion is reached, then transform the forces and energies back to real space.

• This is called an Ewald sum; almost all modern simulations use some form of it.

Ewald Sums Are Expensive

• Ewald sums still require computing interactions between atoms and many layers of periodic images.

• Fourier transforms are demanding.

• As much as 80% of a modern simulation is spent in the Ewald routines.

– 15% in pairlisting/vdw and just 5% everywhere else

Speeding up the Ewald sum

• Computational approaches: – Smarter decompositions – Faster comms – Single precision FFT • Physical approximations: – “Multiple timestepping” – Particle mesh approaches

The Particle Mesh Ewald Sum

• Atomic charges are first mapped onto a regular grid.

• The gridpoints are transformed, worked on, and transformed back.

• Forces are then decomposed from grid back onto atoms.

PME Costs and Benefits

• Scales with

N

log

N

rather than

N

2 • Does entail extra cost to map/unmap atomic charges to grid.

– Crossover point?

• Some energetic slop is associated with these methods.

• Some flavors (PPPM, &c.) may interfere with thermo and barostats.

Integration

• Once a full set of forces has been calculated, the atoms are allowed to move in response to those forces • This is done by calculating integral forms of Newton’s Laws.

– Verlet, leapfrog, velocity verlet, &c.

• How far can you integrate forward?

Accumulation

• Recall that the whole point here is to take large scale statistical averages of thermdynamic properties.

• Some of these properties are a lot easier to calculate inside the program than they will be to calculate from trajectories later.

• So, every so often, the program needs to take a second and compute – Properties in the current step – Rolling average from the previous steps.

Output: Accumulators

--------------------------------------------------------------------------------------------------------------------------------- step eng_tot temp_tot eng_cfg eng_vdw eng_cou eng_bnd eng_ang eng_dih eng_tet eng_kin time(ps) eng_pv temp_rot vir_cfg vir_vdw vir_cou vir_bnd vir_ang vir_con vir_tet conserv cpu (s) volume temp_shl eng_shl vir_shl alpha beta gamma vir_pmf press --------------------------------------------------------------------------------------------------------------------------------- 1 8.7047E+01 8.9004E+02 -3.4994E+01 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 1.2204E+02 0.001 8.7047E+01 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 1.98 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 9.0000E+01 9.0000E+01 9.0000E+01 0.0000E+00 0.0000E+00 rolling 8.7047E+01 8.9004E+02 -3.4994E+01 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 1.2204E+02 averages 8.7047E+01 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 9.0000E+01 9.0000E+01 9.0000E+01 0.0000E+00 0.0000E+00 ----------------------------------------------------------------------------------------------------------------------------------

Output: Trajectory

timestep 1 48 2 0 0.001000

OW 1 15.994900 -0.820000

0.0000E+00 0.0000E+00 0.0000E+00 4.4176E-01 -2.3695E+00 -4.4226E+00 -1.1626E+04 -3.2094E+04 -1.7497E+04 HW 2 1.007800 0.410000

0.0000E+00 0.0000E+00 9.4700E-01 1.7204E-01 1.3940E+01 5.0345E+01 -4.1073E+03 1.5125E+03 3.6299E+04 HW 3 1.007800 0.410000

8.9567E-01 0.0000E+00 -3.1667E-01 7.7140E+01 4.7094E+00 -3.8531E+01 3.2399E+04 -1.4520E+03 -1.4832E+04 OW 4 15.994900 -0.820000

2.4602E+00 -1.0000E-06 -9.0073E-01 -1.1793E+01 -1.9654E-01 -5.1510E-01 -3.5980E+04 7.3873E+02 8.1414E+03 HW 5 1.007800 0.410000

3.0701E+00 -1.0000E-06 -1.7237E-01 9.2834E+00 -2.3632E+00 6.3046E+01 2.2288E+04 -5.2163E+01 2.1425E+04

Anatomy of a Simulation Project

• Identify a problem – Molecular system – Region of phase space • Build a system • Minimize • Equilibrate • Run MD • Analyze

Choosing a Program

• How intense is your job?

• Do you need particular features?

– Force functions – Thermostats – Barostat – Sampling methods – &c.

• Ask for help!!

Building Your System

• It’s nontrivial to place 1.5 million atoms in a cubic simulation cell.

• Biomolecules: PDB, homology modelling, &c.

• Others: replication of small boxes.

• Lots of tools for more complicated cases.

• Sometimes this is the hardest part.

Minimization

• You probably didn’t do a very good job building a box.

• If anything about your box makes the forcefield angry, it can explode under dynamics.

• Solution is to run minimization for some time to eliminate bad contacts.

Equilibration

• Minimization brings your box down to ~0K.

• You need to heat it back up to the intersting temperature.

• Assign velocities to atoms from a Boltzmann distribution.

• This will be wacky; temperature and energy will be unstable.

• To fix, run some “equilibration time” – Don’t take statistics – Periodically scale atomic velocities

Run

• Typical MD runs use a 0.5-2.0 fs timestep and run for 100ps-2ns.

• This means anywhere from 50,000 to 4 million timesteps.

• Trajectories are typically output every100-1000 timesteps.

• Statistics are taken every 100 or so timesteps.

How Do I Know My Simulation Doesn’t Suck?

• Is it conserving what it’s supposed to be conserving?

• Is it conserving what it’s not really supposed to be conserving?

• Are you in a region of phase space that interests you?

• Have you checked for common failures?

– Flying icecube, &c.

My Simulation is Over. What Now?

• Now you have large amounts of – thermochemical data – atomistic trajectories • From this you can calculate stuff – rates of reactions – diffusion coefficients – structural properties (rdf, sf, &c.)

How Do I Do That?

• Canned analysis programs – VMD!

• Write your own programs – Everybody does this!

• We may talk about this another time.

What We Can Do For You

• Help you select simulation software – Almost all MD software is free, so we can get and install most packages you could want.

• Help you set up your simulation cell.

• Help you manage your jobs as they run.

• Help you write, parallelize, and run analysis code.