Transcript week2

Week 2
Basics of MD Simulations
•
Lecture 3: Force fields (empirical potential functions among atoms)
Boundaries and computation of long-range interactions
•
Lecture 4: Molecular mechanics (T=0, energy minimization)
MD simulations (integration algorithms, constrained dynamics,
ensembles)
Empirical force fields (potential functions)
Problem: Reduce the ab initio potential energy surface among the atoms
U R i  

i j
zi z j e
2
Ri  R
 E e R i 
j
To a classical many-body interaction in the form
U R i  
 U 2 (R i , R j )   U 3 (R i , R j , R k )  
i j
i jk
Such a program has been tried for water but so far it has failed to produce a
classical potential that works. In strongly interacting systems, it is difficult to
describe the collective effects by summing up the many-body terms.
Practical solution: Truncate the series at the two-body level and assume
(hope!) that the effects of higher order terms can be absorbed in U2 by
reparametrizing it.
Non-bonded interactions
Interaction of two atoms at a distance R can be decomposed into 4 pieces
1. Coulomb potential (1/R)
2. Induced polarization (1/R2)
3. Dispersion (van der Waals) (1/R6)
4. Short range repulsion (eR/a)
The first two can be described using classical electromagnetism
1. Coulomb potential:
2. Induced polarization:
Dipole field
Total polarization int.
qi q j
U Coul 
R
,
R  Ri  R
j
p i   iE i
E pol 

3 ( p  Rˆ ) Rˆ   p 
3
1
R
U pol  
1
 iE i  E i

2
(0)
i
(Initial and final E fields
in iteration)
The last two interactions are quantum mechanical in origin.
Dispersion is a dipole-dipole interaction that arises from quantum
fluctuations (electronic excitations to non-spherical states)
U disp   3  
2
4R
6
Short range repulsion arises from Pauli exclusion principle (electron clouds
of two atoms avoid each other), so it is proportional to the electron density
U sr  Ae
R / a
The two interactions are combined using a 12-6 Lennard-Jones (LJ) potential

U LJ  4   R 
12
  R 
6

Here  corresponds to the depth of the potential at the minimum; 21/6σ
Combination rules for different atoms:  ij 
 i
j
 ij  ( i   j ) / 2
2.5
2
 
  3A
kT ,
4
1.5
U(r)
1
1
0.5
0
-0.5
3
3.5
4
4.5
5
r
12-6 Lennard-Jones potential (U is in kT, r in Å)
Because the polarization interaction is many-body and requires iterations,
it has been neglected in most force fields.
The non-bonded interactions are thus represented by the Coulomb and
12-6 LJ potentials.
Popular water models (rigid)
Model RO-H (Å) qHOH
qH (e)
 (kT)
 (Å)
m (D)
k (T=298 C)
SPC
1.0
109.5 0.410
0.262
3.166
2.27
65±5
TIP3P
0.957
104.5 0.417
0.257
3.151
2.35
97±7
Exp.
Ab initio
Gas
1.86
Water 3.00
SPC: simple point charge
TIP3P: transferable intermolecular potential with 3 points
80
There are hundreds of water models in the market, some explicitly
including polarization interaction. But as yet there is no model that
can describe all the properties of water successfully.
SPC and TIP3P have been quite successful in simulation of biomolecules,
and have become industry standards.
However, the mean field description of the polarization interaction is bound
to break in certain situations:
1. Ion channels, cavities
2. Interfaces
3. Divalent ions
To deal with these, we need polarizable force fields.
Grafting point polarizabilities on atoms and fitting the model to data has not
worked well. Ab initio methods are needed to make further progress.
Covalent bonds
In molecules, atoms are bonded together with covalent bonds, which
arise from sharing of electrons in partially occupied orbitals. In order to
account for the dipole moment of a molecule in a classical representation,
partial charges are assigned to the atoms. If the bonds are very strong,
the molecule can be treated as rigid as in water molecule. In most large
molecules, however, the structure is quite flexible and this must be taken
into account for a proper description of the molecules. This is literally
done by replacing the bonds by springs. The nearest neighbour
interactions involving 2, 3 and 4 atoms are described by harmonic and
periodic potentials
r
U bond 

ij
k ij
2
rij  r0 2  
bond stretching
ijk
q
k ijk
2
q ijk
q0  
bending
2
 V ijkl 1  cos n  ijkl
ijkl
torsion (not very good)

 0 
Interaction of two H atoms in the ground (1s) state can be described
using Linear Combinations of Atomic Orbitals (LCAO)
  c1 A (1s )  c 2 B (1s )
From symmetry, two solutions with lowest and highest energies are:
 
1
2
 A (1 s )   B (1 s ) 
+ Symmetric
- Anti-symmetric
The R dependence of the potential energy is approximately given by the
Morse potential

U Morse  D e 1  e
k
Where
De: dissociation energy
Re: equilibrium bond distance
 
k / 2 De
controls the width of the potential
Classical representation:
U bond 

 ( R  R0 ) 2
1
2
k ( R  R0 )
2
Electronic wave functions for the 1s, 2s and 2p states in H atom
H2 molecule:
In carbon, 4 electrons (in 2s and 2p) occupy 4 hybridized sp3 orbitals
1 
1

1
2

2
2
3 
1

1
4

2
2
 s   px
  py   px

 s   px
  py   px

 s   px
  py   px

 s   px
  py   px

Tetrahedral structure of sp3
Thus carbon needs 4 bonds to have a stable configuration.
Nitrogen has an extra electron, so needs 3 bonds.
Oxygen has 2 extra electrons, so needs 2 bonds.
For a complete description of flexibility, we need, besides bond stretching,
bending and torsion. The former can be described with a harmonic form:
U bend 
1
2
k q (q ijk  q 0 )
2
While the latter is represented with a periodic function
U torsion  V 1  cos n  ijkl   0 
Bond stretching and bending interactions are quite strong and well
determined from spectroscopy. Torsion is much weaker compared to
the other two and is not as well determined. Following QM calculations,
they have been revamped recently (e.g., CMAP corrections in the
CHARMM force field), which resulted in a better description of proteins.
An in-depth look at the force fields
Three force fields, constructed in the eighties, have come to dominate
the MD simulations
1. CHARMM (Karplus @ Harvard)
Optimized for proteins, works well also for lipids and nucleic acids
2. AMBER (Kollman & Case @ UCSF)
Optimized for nucleic acids, otherwise quite similar to CHARMM
3. GROMOS (Berendsen & van Gunsteren @ Groningen)
Initially optimized for lipids and did not work very well for proteins
(smaller partial charges in the carbonyl and amide groups) but it has
been corrected in the more recent versions.
The first two use the TIP3P water model and the last one, SPC model.
They all ignore the polarization interaction. Polarizable versions have
been under construction for over a decade but no working code yet.
Charm parameters for alanine
Partial charge (e)
ATOM N
-0.47
|
ATOM HN
0.31
ATOM CA
0.07
|
HB1
ATOM HA
0.09
|
/
GROUP
HN—N
HA—CA—CB—HB2
ATOM CB
-0.27
|
\
ATOM HB1
0.09
|
HB3
ATOM HB2
0.09
ATOM HB3
0.09
ATOM C
0.51
ATOM O
-0.51
Total charge:
0.00
O=C
|
U bond  k r ( r  r0 )
Bond lengths :
kr (kcal/mol/Å2)
N
CA
2
r0 (Å)
320.
1.430
(1)
CA C
250.
1.490
(2)
C
N
370.
1.345
(2)
(peptide bond)
O
C
620.
1.230
(2)
(double bond)
N
H
440.
0.997
(2)
HA CA
330.
1.080
(2)
CB CA
222.
1.538
(3)
HB CB
322.
1.111
(3)
1. NMA (N methyl acetamide) vibrational spectra
2. Alanine dipeptide ab initio calculations
3. From alkanes
Bond angles :
U angle  kq (q  q 0 )
kq (kcal/mol/rad2)
2
q0 (deg)
C
N
CA
50.
120.
(1)
C
N
H
34.
123.
(1)
H
N
CA
35.
117.
(1)
N
CA C
50.
107.
(2)
N
CA CB
70.
113.5
(2)
N
CA HA
48.
108.
(2)
HA CA CB
35.
111.
(2)
HA CA C
50.
109.5
(2)
CB CA C
52.
108.
(2)
N
C
CA
80.
116.5
(1)
O
C
CA
80.
121.
(2)
O
C
N
80.
122.5
(1)
Total 360 deg.
Total 360 deg.
Dihedrals:
U dihed 
Vn
2
Definition of the dihedral
angle for 4 atoms A-B-C-D
1  cos( n    0 ) 
Basic dihedral configurations
trans
cis
Dihedral parameters:
U dihed 
Vn
2
Vn (kcal/mol) n
C N
CA C
1  cos( n    0 ) 
0 (deg)
Name
0.2
1
180.

N CA C
N
0.6
1
0.

CA C N
CA
1.6
1
0.

H N CA CB
0.0
1
0.
H N CA HA
0.0
1
0.
C N CA CB
1.8
1
0.
CA C N H
2.5
2
180.
O
C N H
2.5
2
180.
O
C N CA
2.5
2
180.
 –helix structure
(Sasisekharan)

-helix
 ~ 57o
 ~ 47o

When one of the 4 atoms is not in a chain (e.g. O bond with C in CA-C-N),
then an improper dihedral interaction is used to constrain that atom.
Most of the time, off-the-chain atom is constrained to a plane using:
U imp  V 0 1  cos( q  q 0 ) 
CA
C
N
O
Where q is the angle C=O bond makes
with the CA-C-N plane, and q0 = 180
which enforces a planar configuration.
Boundaries
In macroscopic systems, the effect of boundaries on the dynamics of
biomolecules is minimal. In MD simulations, however, the system size
is much smaller and one has to worry about the boundary effects.
• Using nothing (vacuum) is not realistic for bulk simulations.
• Minimum requirement: water beyond the simulation box must be
treated using a continuum representation (reaction field). An
intermediate zone is treated using stochastic boundary conditions.
• Most common solution: periodic boundary conditions.
The simulation box is replicated in all directions just like in a crystal.
The cube and rectangular prism are the obvious choices for a box
shape, though there are other shapes that can fill the space (e.g.
hexagonal prism and rhombic dodecahedron).
Periodic boundary conditions in two dimensions: 8 nearest neighbours
Particles in the box freely move to the next box, which means they
appear from the opposite side of the same box.
In 3-D, there are 26 nearest neighbours.
Treatment of long-range interactions
Problem: the number of non-bonded interactions grows as N2 where N is
the number of particles. This is the main computational bottle neck that
limits the system size. A simple way to deal with this problem is to
introduce a cutoff radius for pairwise interactions (together with a
switching function), and calculate the potential only for those within the
cutoff sphere. This is further facilitated by creating non-bonded pair lists,
which are updated every 10-20 steps.
• For Lennard-Jones (6-12) interaction, which is weak and falls rapidly,
this works fine and is adapted in all MD codes.
• Coulomb interaction, however, is very strong and falls very slowly.
[Recall the Coulomb potential between two unit charges, U=560 kT/r (Å)]
Hence use of any cutoff is problematic and should be avoided.
This problem has been solved by introducing the Ewald sum method.
All the modern MD codes use the Ewald sum method to treat the longrange interactions without cutoffs.
Here one replaces the point charges with Gaussian distributions, which
leads to a much faster convergence of the sum in the reciprocal (Fourier)
space.
The remaining part (point particle – Gaussian) falls exponentially in real
space, hence can be computed easily using a cutoff radius (~10 Å).
The mathematical details of the Ewald sum method is given in the
following appendix. Please read it to understand how it works in practice.
Appendix: Ewald summation method
The coulomb energy of N charged particles in a box of volume V=LxLyLz
U Coul 
1
N
q i  ( ri ) ,

2
N
 ( ri ) 
i 1
Here
rn  ( n x L x , n y L y , n z L z )
'r
j 1 n
qj
ij
 rn
with integer values for n.
The prime on the sum signifies that i=j is not included for n=0.
Ewald’s observation (1921): in calculating the Coulomb energy of ionic
crystals, replacing the point charges with Gaussian distributions leads to
a much faster convergence of the sum in the reciprocal (Fourier) space.
The remaining part (point particle – Gaussian) falls exponentially in real
space, hence can be computed easily.
The Poisson equation and it’s solution:
    4 
2
 

 (r ' )
 r  r'
3
d r'
For a point charge q at the origin:
 point  q  (r )
 

q
r
When the charge q has a Gaussian distribution:
 Gauss ( r )  q
 (r )  q
where




e
3/2

e
 r
2
 r '
2
3
3/2
3
3
2

  Gauss ( r ) d r  q
3
,
2
d r'
r  r'
erf( x ) 
2
q
erf(  r )
r
x
e
0
u
2
du
is the error function
This result is obtained most easily by integrating the Poisson equation
r
d
1 d
2
r dr
2
r    4 q


3/2

2
3
3
 r
2
e
r
 dr ' 2 r ' dr '   4  q  3 / 2  e

 r '
2
2
r ' dr '

d
r    4 q
dr
Integrate 0 to r :
2
r (r )  q
 (r ) 
q
r
2



3
3/2
r
e
1
2
 r '
0
erf(  r )
2
2
2
 r
2
e
dr '  q
2
2

r
e
0
u
2
du  q erf(  r )
Writing the charge density as
 point   Gauss  (  point   Gauss )
For the second part, the potential due to a charge qj at rj is given by:
 ( ri ) 
qj
rij
1  erf(  rij )  
qj
rij
erfc(  rij )
 r
2
Where erfc(x) is the complementary error function which falls as e
2
Thus choosing 1/ about an Angstrom, this potential converges quickly.
Typically, it is evaluated using a cutoff radius of ~10 Å, so the original box
with N particles is sufficient for this purpose (with the nearest image
convention). The direct (short-range) part of the energy of the system is:
U direct 
1
N
q i  ( ri ) ,

2
i 1
N
 ( ri ) 

qj
r
j 1 ij
erfc(  rij ),
The Gaussian part converges faster in the reciprocal (Fourier) space
hence best evaluated as a Fourier series
1
f (r ) 
V
~
f (k ) 
V

~
i k r
f (k ) e
,
k  2  n x L x , n y L y , n z L z 
n
f (r ) e
 i k r
3
d r
The Poisson equation in the Fourier space
2
1


V

1
i k r 
~

  ( k ) e    4 V

n

~ ( k ) e
i k r
n
2
k ~ ( k )  4  ~ ( k )
For a point charge q at the origin:
~ ( k ) 

q  (r ) e
 i k r
3
d rq

4 q
~
 (k ) 
2
k
When the charge q has a Gaussian distribution:
~ ( k )  q
q


3/2


3
3
3/2

 r
2
e

2
e

 i k r
3
d r

 dx  dy  dz e


Multiply each integral by
2
2
2
2
 ( x  y  z )  i ( k x x  k y y  k z z )
e

e
k
2
4
2
to compete the square
Each Gaussian integral then gives
The corresponding potential in the Fourier space
4 q  k 2
~
 (k ) 
e
2
k
4
2
The Gaussian charge density for the periodic box

N
 Gauss ( r ) 
qj
j 1 n
~ Gauss ( k ) 
V
3
d re
3/2
 i k r
e


r  ( r j  rn )
N
qj
3
d re
 i k r
qje
N
 i k r j
e
k
2


qj
4
2
3
3/2

e

3

3/2
 i k r j
k
j 1
all space
N
2
j 1 n



3
e
2
r  ( r j  rn )

2
4
2
2
j 1
Which yields for the potential
~
Gauss
(k ) 
4
k
2
N
qje
j 1
e
2
r r j
2
2
Transforming back to the real space
 Gauss ( r ) 
1
V
1

V
i k r
 ~Gauss ( k ) e
n0
N

4 q j
k
n  0 j 1
2
e
k
2
4
2
e
i k ( r  r j )
The reciprocal space (long-range) part of the system’s energy:
U recip 
N
1
q i  Gauss

2
( ri )
i 1

1
2V
N
 
n  0 i , j 1
4 q i q j
k
2
e
k
4 ~
2 k 2

 (k ) e

2
2V n  0 k
1
2
4
4
2
2
e
i k  ( ri  r j )
This energy includes the self-energy of the Gaussian density which needs
to be removed
 Gauss ( r ) 
q
erf(  r )
r
2q
 self ( r  0 ) 
U self 

N
1
q i  self

2
( ri )
i 1

So the total energy is:


N

2
qi
i 1
U total  U direct  U recip  U self
Molecular mechanics
Molecular mechanics deals with the static features of biomolecular
systems at T=0 K, that is, particles do not have any kinetic energy.
[Note that in molecular dynamics simulations, particles have an average
kinetic energy of (3/2)kT, which is substantial at room temp., T=300 K]
Thus the energy is given solely by the potential energy of the system.
Two important applications of molecular mechanics are:
1. Energy minimization (geometry optimization):
Find the coordinates for the configuration that minimizes the potential
energy (local or absolute minimum).
2. Normal mode analysis:
Find the vibrational excitations of the system built on the absolute
minimum using the harmonic approximation.
Energy minimization
The initial configuration of a biomolecule  whether experimentally
determined or otherwise  does not usually correspond to the minimum
of the potential energy function. It may have large strains in the
backbone that would take a long time to equilibrate in MD simulations,
or worse, it may have bad contacts (i.e. overlapping van der Waals
radii), which would crash the MD simulation in the first step!
To avoid such mishaps, it is a standard practice to minimize the energy
before starting MD simulations.
Three popular methods for analytical forms:
• Steepest descent (first derivative)
• Conjugate gradient (first derivative)
• Newton-Raphson (second derivative)
Steepest descent:
Follows the gradient of the potential energy function U(r1, …,rN) at each
step of the iteration
n 1
ri
 ri   i f i ,
n
n
n
f i   i U ( r1 ,  , r N ) n
where i is the step size. The step size can be adjusted until the minimum
of the energy along the line is found. If this is expensive, a single step is
used in each iteration, whose size is adjusted for faster convergence.
• Works best when the gradient is large (far from a minimum), but tends
to have poor convergence as a minimum is approached because the
gradient becomes smaller.
• Successive steps are always mutually perpendicular, which can lead to
oscillations and backtracking.
A simple illustration of steepest descent with a fixed step size in
a 2D energy surface (contour plots)
Conjugate gradient:
Similar to steepest descent but the gradients calculated from previous
steps are used to help reduce oscillations and backtracking
n 1
ri
 ri   i δ i ,
n
n
n
n
n 1
δ i  fi  δ i
n
fi
2
n 1
fi
2
,
n
f i   i U ( r1 ,  , r N ) n
(For the first step,  is set to zero)
• Generally one finds a minimum in fewer steps than steepest descent,
e.g. it takes 2 steps for the 2D quadratic function, and ~n steps for nD.
• But conjugate gradient may have problems when the initial
conformation is far from a minimum in a complex energy surface.
• Thus a better strategy is to start with steepest descent and switch to
conjugate gradient near the minimum.
Newton-Raphson:
Requires the second derivatives (Hessian) besides the first.
Predicts the location of a minimum, and heads in that direction.
To see how it works in a simple situation, consider the quadratic 1D case
f ( x )  ax
2
x min  x 0 
 bx  c ,
f'
f"
 x0 
x0
f ' 0
2 ax 0  b

x min  

2a
b
2a
b
2a
In general
n 1
ri
n
n 1 n
 ri  [ H i ] f i ,
2
n
[ H i ] kl 
 U ( r1 ,  , r N )
x k xl
n
For a quadratic energy surface, this method will find the minimum in one
step starting from any configuration.
• Construction and inversion of the 3Nx3N Hessian matrix is
computationally demanding for large systems (N>100).
• It will find a minimum in fewer steps than the gradient-only methods in
the vicinity of the minimum.
• But it may encounter serious problems when the initial conformation is
far from a minimum.
• A good strategy is to start with steepest descent and then switch to
alternate methods as the calculations progress, so that each algorithm
operates in the regime for which it was designed.
Using the above methods, one can only find a local minimum.
To search for an absolute minimum, Monte Carlo methods are more
suitable. Alternatively, one can heat the system in MD simulations,
which will facilitate transitions to other minima.
Normal mode analysis
Assume that the minimum energy of the system is given by the 3N
coordinates, {r0i}. Expanding the potential energy around the
equilibrium configuration gives
 ri  ri
N
0
U ( r1 ,  , r N )  U ({ ri }) 
0
i 1

1
2
   iU { r }  0
0
i
0
0



r

r



U

r

r
 i i
i
j
j
j  
{r }
N
N
i 1 j 1
0
i
Ignoring the constant energy, the leading term is that of a system of
coupled harmonic oscillators. In a normal mode, all the particles in the
system oscillate with the same frequency . To find the normal modes,
first express the 3N coordinates as {xi, i=1,…,3N}.
The potential energy becomes
U ( x1 ,  , x 3 N ) 
x i  x i


2
1
3N 3N
0
K ij x j  x 0j 
i 1 j 1
where the spring constants are given by the Hessian as
2
K ij 
 U
xi x j
0
{ xi }
Introducing the 3Nx3N diagonal mass matrix M
M  diag( m1 , m1 , m1 , m 2 , m 2 , m 2 ,  , m N , m N , m N )
The secular equation for the normal modes is given by
M
1 / 2
KM
1 / 2
 I  0
2
For a 3Nx3N matrix, solution of the secular equation will yield 3N
eigenvalues, i and the corresponding eigenvectors, i
Of these, 3 correspond to translations and 3 to rotations of the system.
Thus there are 3N-6 intrinsic vibrational modes of the system.
At a given temperature T, the motion of the i’th coordinate is given by
 kT
xi (t )   
2

j  7  m i j
3N




1/ 2
 ij cos(  j t   j )
The mean square displacement of the coordinates and atoms:
(  xi )
(  rn )
2
2
 (  x3n )

1

kT
2
j
 (  x 3 n 1 )
2
j7
 ij
2
m i
2
2
3N
 (  x3n  2 )
2
A simple example: normal modes of water molecule
Water molecule has 9-6=3 intrinsic vibrations, which correspond to
symmetric and anti-symmetric stretching of H atoms and bending.
Because of H-bonding, water molecules in water cannot freely rotate
but rather librate (wag or twist).
Wave numbers: 3652 cm-1
3756 cm-1
1595 cm-1
(200 cm-1~1 kT)
Excitation energies >> kT, which justifies the use of a rigid model for water.
Normal modes of a small protein BPTI (Bovine pancreatic tyripsin inhibitor)
Some characteristic frequencies (in cm-1)
Stretching
H-N: 3400-3500
H-C: 2900-3000
C=C, C=O: 1700-1800
C-C, C-N: 1000-1250
Bending
H-C-H: 1500
H-N-H: 1500
C-C=O: 500
S-S-C: 300
Torsion
C=C: 1000
C-O: 300-600
C-C: 300
C-S: 200
Applications to domain motions of proteins
• Many functional proteins have two main configurations called, for
example, resting & excited states, or open & closed states.
• Proteins can be crystallized in one of these states – the other
configuration need to be found by other means.
• The configurational changes that connect these two states usually
involve large domain motions that could take milli to micro seconds,
which is outside the reach of current MD simulations.
• Normal mode analysis can be used to identify such collective motions
in proteins and predict the missing state that is crucial for description of
the protein function.
• Examples:
1. Gating of ion channels (open & closed states)
2. Opening and closing of gates in transporters
Molecular dynamics
In MD simulations, one follows the trajectories of N particles according
to Newton’s equation of motion:
mi
d
2
dt
2
ri  F i ,
Fi   iU ( ri ,  , r N )
where U(r1,…, rN) is the potential energy function consisting of bonded
and non-bonded interactions (Coulomb and LJ 6-12).
We have already discussed force fields and boundary conditions in some
detail. Here, we will consider:
• Integration algorithms, e.g., Verlet, Leap-frog
• Initial conditions and choice of the time step
• Constrained dynamics for rigid molecules (SHAKE and RATTLE)
• MD simulations at constant temperature and pressure
Integration algorithms
Given the position and velocities of N particles at time t, a straightforward
integration of Newton’s equation of motion yields at t+t
v i (t   t )  v i (t ) 
Fi ( t )
t
(1)
mi
ri ( t   t )  ri ( t )  v i ( t )  t 
Fi ( t )
t
(2)
2
2mi
In practice, variations of these equations are implemented in MD codes
In the popular Verlet algorithm, one eliminates velocities using the
positions at t-t,
ri ( t   t )  ri ( t )  v i ( t )  t 
Adding with eq. (2), yields:
Fi ( t )
t
2
2mi
ri ( t   t )  2 ri ( t )  ri ( t   t ) 
Fi ( t )
mi
t
2
This is especially useful in situations where one is interested only in the
positions of the atoms. Velocities can be calculated from
v i (t ) 
or
v i (t ) 
1
2t
1
t
[ ri ( t   t )  ri ( t   t )]
[ ri ( t   t )  ri ( t )]
Some drawbacks of the Verlet algorithm:
• Positions are obtained by adding a small quantity (order t2) to large
ones, which may lead to a loss of precision.
• Velocity at time t is available only at the next time step t+t
• It is not self starting. At t=0, there is no position at t-t. It is usually
estimated using
ri (   t )  ri ( 0 )  v i ( 0 )  t
In the Leap-frog algorithm, the positions and velocities are calculated
at different times separated by t/2
v i (t   t / 2 )  v i (t   t / 2 ) 
Fi ( t )
t
mi
ri ( t   t )  ri ( t )  v i ( t   t / 2 )  t
To show its equivalence to the Verlet algorithm, consider
ri ( t   t )  ri ( t )  v i ( t   t / 2 )  t 
Fi ( t )
t
2
mi
ri ( t )  ri ( t   t )  v i ( t   t / 2 )  t
Subtracting the two equations yields the Verlet result.
If required, velocity
at t is obtained from:
v i (t ) 
1
2
[ v i ( t   t / 2 )  v i ( t   t / 2 )]
To iterate these equations, we need to specify the initial conditions.
• The initial configuration of biomolecules can be taken from the Protein
Data Bank (PDB) (if available).
• In addition, membrane proteins need to be embedded in a lipid bilayer.
VMD has a facility that will perform this step with minimal effort.
• All the MD codes have facilities to hydrate a biomolecule, i.e., fill the
void in the simulation box with water molecules at the correct density.
• Ions can be added at random positions. Alternatively, VMD solves the
Poisson-Boltzmann equation and places the ions where the potential
energy are at minimum.
After energy minimization, these coordinates provide the positions at t=0.
Initial velocities are sampled from a Maxwell-Boltzmann distribution:
 mi 
P ( v ix )  

 2  kT 
1/ 2

2
exp  m i v ix 2 kT

Choosing the time step:
In choosing a time step, one has to compromise between two conflicting
demands:
• In order to obtain statistically significant results and access biologically
relevant time scales, one needs long simulation times, which requires
long time steps
• To avoid instabilities, conserve energy, etc., one needs to integrate the
equations as accurately as possible, which requires using short time
steps.
System
Types of motion
Recom. time step (fs)
Atoms
translation
10 fs
Rigid molecules
… + rotation
5 fs
Flex. mol’s, rigid bonds
… + torsion
2 fs
Completely flex. mol’s
….+ vibrations
1 - 0.5 fs
Constrained dynamics
MD simulations are carried out most efficiently using the 3N Cartesian
coordinates of the N atoms in the system. This is fine if there are no rigid
bonds among the atoms in the system. For rigid molecules (e.g. water),
one needs to impose constraints on their coordinates so that their rigid
geometry is preserved during the integration. This is achieved using
Lagrange multipliers. Consider the case of a rigid diatomic molecule,
where the bond length between atoms i and j is fixed at d, which imposes
the following constraint on their coordinates
 ij  ( ri  r j )  d
2
2
0
The force on atom i due to this constraint can be written as
Fi    i  ij  2  ( ri  r j )
Note that
F j    j  ij   2  ( ri  r j )   Fi
Thus the total constraint force on the molecule is zero, and the constraint
forces do not do any work. Incorporating the constraint forces in the
Verlet algorithm, the positions of the atoms are given by
ri ( t   t )  2 ri ( t )  ri ( t   t ) 
Fi ( t )
2
t 
mi
r j (t   t )  2 r j (t )  r j (t   t ) 
mi
F j (t )
m
2
2
t 
2
m
j
2
 t [ ri ( t )  r j ( t )]
2
 t [ ri ( t )  r j ( t )]
j
To simplify, combine the known first three term on the r.h.s. as
ri ( t   t )  ri ' ( t ) 
2
r j (t   t )  r j ' (t ) 
mi
2
 t [ ri ( t )  r j ( t )]
2
m
j
2
 t [ ri ( t )  r j ( t )]
(1)
(2)
A third equation is given by the constraint condition
ri ( t   t )  r j ( t   t )
2
 ri ( t )  r j ( t )
2
d
2
Substituting eqs. (1) and (2) above yields
 1
1  2

  t [ ri ( t )  r j ( t )]
ri ' ( t )  r j ' ( t )  2 

m

 i mj
2
d
2
This quadratic equation can be easily solved to obtain the Lagrange
multiplier . Substituting  back in eqs. (1) and (2), one finds the positions
of the atoms at t+t.
For a rigid many-atomic molecule, one needs to employ a constraint for
every fixed bond length and angle. (Bond and angle constraints among
three atoms (i, j, k) can be written as three bond constraints.)
Assuming k constraints in total, this will lead to k coupled quadratic
equations, which are not easy to solve.
A common approximation is to exploit the fact that  is small, hence the
quadratic terms in  can be neglected. This leads to a system of k linear
equations, which can be solved by matrix inversion.
In the SHAKE algorithm, a further simplification is introduced by solving
the constraint equations sequentially one by one. This process is then
iterated to correct any constraint violations that arise from the neglect of
the coupling in the constraint equations.
SHAKE is commonly used in fixing the bond lengths of H atoms.
Another well-known constraint algorithm is RATTLE, which is designed
for the velocity-Verlet integrator, where velocities as well as positions are
corrected for constraints.
MD simulations at constant temperature and pressure
MD simulations are typically performed in the NVE ensemble, where all 3
quantities are constant. Due to truncation errors, keeping the energy
constant in long simulations can be problematic. To avoid this problem,
the alternative NVT and NPT ensembles can be employed. The
temperature of the system can be obtained from the average K. E.
K 
3
NkT
2
Thus an obvious way to keep the temperature constant at T0 is to scale
the velocities as:
v i ( t )   v i ( t ),
 
T0 T (t )
Because K. E. has considerable fluctuations, this is a rather crude method.
A better method which achieves the same result more smoothly is the
Berendsen method, where the atoms are weakly coupled an external
heat bath with the desired temperature T0
mi
d
2
dt
2
 T0
 d ri
ri  F i  m i  i 
 1
 T (t )
 dt
If T(t) > T0 , the coefficient of the coupling term is negative, which
invokes a viscous force slowing the velocity, and vice-versa for T(t) < T0
Similarly in the NPT ensemble, the pressure can be kept constant by
simply scaling the volume. Again a better method (Langevin piston), is
to weakly couple the pressure difference to atoms using a force as in
above, which will maintain the pressure at the desired value (~1 atm).
Monte Carlo (MC)
Metropolis algorithm:
Given N particles interacting with a potential energy function U(r1,…, rN)
Probability:
P  e
 U / kT
• Assume some initial configuration with energy U0
• Move the particles by small increments to new positions with energy U1
• If U1 < U0, accept the new configuration
• If U1 > U0 , select a random number r between 0 and 1, and accept the
new configuration if
exp  (U 1  U 0 ) / kT   r
• Keep iterating until the minimum energy is found