Structure and Energetics of Proteins (Struktura i Energetyka Białek)


THEORETICAL METHODS TO STUDY
PROTEIN FOLDING: EMPIRICAL
FORCE FIELDS
[Figure: hierarchy of description levels. Fully detailed: QM, QM/MM. Atomistically detailed: all-atom, united-atom. Coarse-grained: residue level, molecule/domain level. System level (networks): network graphs, PDEs to describe reaction/diffusion. Arrows: averaging over "less important" degrees of freedom; averaging over individual components (from individual components to the system level).]
Anfinsen’s thermodynamic hypothesis.
“The studies on the renaturation of fully denatured ribonuclease required many
supporting investigations to establish, finally, the generality which we have
occasionally called the ‘thermodynamic hypothesis’. This hypothesis states that the
three-dimensional structure of a native protein in its normal physiological milieu
(solvent, pH, ionic strength, presence of other components such as metal ions or
prosthetic group, temperature and other) is the one in which the Gibbs free energy of
the whole system is lowest; that is, the native conformation is determined by the
totality of interatomic interactions and hence by the amino acid sequence in a given
environment.”
C.B. Anfinsen, Science, 181, 223-230, 1973.
To facilitate the implementation of this hypothesis in protein-structure
prediction, “free energy” was replaced with “potential energy”.
“Potential energy” or “free energy”?
Nature (and a canonical simulation) finds the basin with the lowest free energy at a given temperature. That basin might happen to contain the conformation with the lowest potential energy, but it does not have to.
Global-optimization methods are designed to find the structures with the lowest potential energy, and thus ignore conformational entropy. Technically, this corresponds to a canonical simulation at 0 K.
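The distinction can be illustrated with a toy two-basin model. This is only a sketch: the basin energies and the numbers of states are hypothetical values chosen for illustration, and the entropy of a basin is approximated as R ln(number of states).

```python
import math

R = 0.0019872  # gas constant in kcal/(mol K)

def basin_free_energy(energy, n_states, temperature):
    """F = E - T*S, with S = R ln(n_states) for a basin of degenerate states."""
    return energy - R * temperature * math.log(n_states)

# Hypothetical basins: A is deeper but narrow, B is shallower but much broader.
basins = {"A": (-10.0, 1), "B": (-8.0, 1000)}

def favored_basin(temperature):
    """Name of the basin with the lowest free energy at this temperature."""
    return min(basins, key=lambda name: basin_free_energy(*basins[name], temperature))

for T in (0.0, 100.0, 300.0):
    print(T, favored_basin(T))
```

At 0 K the deeper basin A wins, which is what a global-optimization search would report; at 300 K the entropy term makes the broader basin B the free-energy minimum.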
The stability of the structures of biological macromolecules results from the special structure of their energy landscapes, which can be termed “minimal frustration” or a “funnel-like structure”. A good analogy is the pit dug by an antlion larva.
Theoretical studies of protein structure and protein folding
• Need to express the energy of a system as a function of coordinates
• Need an algorithm to explore the conformational space
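From the Schrödinger equation to analytical all-atom potentials

$$\hat{H}\Psi = E\Psi, \qquad \Psi = \Psi(\mathbf{R}_1,\mathbf{R}_2,\ldots,\mathbf{R}_N;\mathbf{r}_1,\mathbf{r}_2,\ldots,\mathbf{r}_n)$$

$$E(\mathbf{R}_1,\mathbf{R}_2,\ldots,\mathbf{R}_N;\mathbf{r}_1,\mathbf{r}_2,\ldots,\mathbf{r}_n) = \frac{\langle\Psi|\hat{H}|\Psi\rangle}{\langle\Psi|\Psi\rangle}$$

$$\hat{H} = -\frac{1}{2}\sum_a \frac{1}{m_a}\nabla_a^2 - \frac{1}{2}\sum_i \nabla_i^2 + \sum_{a<b}\frac{Z_a Z_b}{r_{ab}} - \sum_{a,i}\frac{Z_a}{r_{ai}} + \sum_{i<j}\frac{1}{r_{ij}} = \hat{H}_N + \hat{H}_{el}$$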
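The Born-Oppenheimer approximation

$$\Psi(\mathbf{R}_1,\ldots,\mathbf{R}_N;\mathbf{r}_1,\ldots,\mathbf{r}_n) \approx \Psi_N(\mathbf{R}_1,\ldots,\mathbf{R}_N)\,\Psi_{el}(\mathbf{R}_1,\ldots,\mathbf{R}_N;\mathbf{r}_1,\ldots,\mathbf{r}_n)$$

$$\Psi_N(\mathbf{R}_1,\mathbf{R}_2,\ldots,\mathbf{R}_N) = \delta(\mathbf{R}_1)\,\delta(\mathbf{R}_2)\cdots\delta(\mathbf{R}_N)$$

$$\hat{H}_{el}\Psi_{el} = E_{el}\Psi_{el}$$

$$E = \underbrace{\sum_{a<b}\frac{Z_a Z_b}{r_{ab}}}_{E_N} + \underbrace{\left\langle\Psi_{el}\left|-\frac{1}{2}\sum_i\nabla_i^2 - \sum_{a,i}\frac{Z_a}{r_{ai}} + \sum_{i<j}\frac{1}{r_{ij}}\right|\Psi_{el}\right\rangle}_{E_{el}}$$

$$E_N + E_{el} = E(\mathbf{R}_1,\mathbf{R}_2,\ldots,\mathbf{R}_N)$$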
What is a force field?
A set of formulas (usually explicit) and parameters to
express the conformational energy of a given class of
molecules as a function of coordinates (Cartesian, internal,
etc.) that define the geometry of a molecule or a molecular
system.
Features:
• Cheap
• Fast
• Easy to program
• Restricted to conformational analysis
• Non-transferable
• Results sometimes unreliable
All-atom empirical force fields: a very simplified
representation of the potential energy surfaces
Class I force fields
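$$E = \frac{1}{2}\sum_{\text{bonds}} k_i^{d}\,(d_i - d_i^{0})^2 + \frac{1}{2}\sum_{\text{angles}} k_i^{\theta}\,(\theta_i - \theta_i^{0})^2 + \sum_{\substack{\text{dihedral}\\ \text{angles}}}\sum_n \frac{V_n}{2}\left[1 + \cos(n\varphi - \delta)\right] + \sum_{i<j}\varepsilon_{ij}\left[\left(\frac{r_{ij}^{0}}{r_{ij}}\right)^{12} - 2\left(\frac{r_{ij}^{0}}{r_{ij}}\right)^{6}\right] + \sum_{i<j}\frac{q_i q_j}{r_{ij}}$$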
Multiplication of atom types in empirical force fields
Force fields commonly used for protein simulations
Name | Potential type | References
AMBER/OPLS | all-atom, united-atom | Weiner et al., 1984; 1986; Cornell et al., 1995; Jorgensen et al., 1996; http://ambermd.org/
CHARMm | all-atom | Brooks et al., 1983; MacKerell et al., 1998; 2001; http://www.charmm.org/
GROMOS | all-atom | van Gunsteren & Berendsen, 1987; Scott et al., 1999; http://www.gromos.net/
ECEPP/3 | all-atom; rigid valence geometry | Nemethy et al., 1995; Ripoll et al., 1995; http://cbsu.tc.cornell.edu/software/eceppak/; http://www.icm.edu.pl/kdm/ECEPPAK
DISCOVER (CVFF) | all-atom | Dauber-Osguthorpe, 1988; Maple et al., 1998
Bond distortion energy

$$E_s(d) = \frac{1}{2}k_d\,(d - d_0)^2$$

[Figure: E_s(d) plotted against d; the minimum lies at d = d0.]
Typical values of d0 and kd
Bond | d0 [Å] | kd [kcal/(mol Å²)]
Csp3-Csp3 | 1.523 | 317
Csp3-Csp2 | 1.497 | 317
Csp2=Csp2 | 1.337 | 690
Csp2=O | 1.208 | 777
Csp2-Nsp3 | 1.438 | 367
C-N (amide) | 1.345 | 719
Comparison of the actual bond-energy curve with that of
the harmonic approximation
Potentials that take into account the asymmetry of the bond-energy curve

Anharmonic potential:
$$E_s(d) = \frac{1}{2}k_d\,(d - d_0)^2 + \frac{1}{6}k'_d\,(d - d_0)^3$$

Morse potential (CVFF force field):
$$E_s(d) = D_e\left\{1 - e^{-b\,(d - d_e)}\right\}^2$$

[Figure: E [kcal/mol] plotted against d [Å] for the harmonic, anharmonic, and Morse potentials.]
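A short numerical comparison shows where the harmonic approximation breaks down. This is a sketch only: the Csp3-Csp3 values of d0 and kd are the typical values listed earlier, while the well depth De is an assumed illustrative number, and b is chosen here so that the Morse curve has the same curvature at the minimum as the harmonic one (kd = 2 De b²).

```python
import math

def harmonic(d, k, d0):
    """E_s(d) = 1/2 k (d - d0)^2."""
    return 0.5 * k * (d - d0) ** 2

def morse(d, De, b, de):
    """E_s(d) = De * (1 - exp(-b (d - de)))^2."""
    return De * (1.0 - math.exp(-b * (d - de))) ** 2

k, d0 = 317.0, 1.523            # Csp3-Csp3: kcal/(mol A^2), A
De = 88.0                       # assumed well depth in kcal/mol (illustrative)
b = math.sqrt(k / (2.0 * De))   # same curvature at the minimum as the harmonic term

for d in (1.45, 1.523, 1.60, 2.0, 5.0):
    print(f"{d:5.3f}  {harmonic(d, k, d0):9.2f}  {morse(d, De, b, d0):9.2f}")
```

Close to d0 the two curves agree; on stretching, the harmonic energy grows without bound, whereas the Morse energy levels off at De.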
Energy of bond-angle distortion

$$E_b(\theta) = \frac{1}{2}k_\theta\,(\theta - \theta_0)^2$$

[Figure: E_b(θ) plotted against θ; the minimum lies at θ = θ0.]
Typical values of θ0 and kθ
Angle | θ0 [degrees] | kθ [kcal/(mol degree²)]
Csp3-Csp3-Csp3 | 109.47 | 0.0099
Csp3-Csp3-H | 109.47 | 0.0079
H-Csp3-H | 109.47 | 0.0070
Csp3-Csp2-Csp3 | 117.2 | 0.0099
Csp3-Csp2=Csp2 | 121.4 | 0.0121
Csp3-Csp2=O | 122.5 | 0.0101
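Because the angle force constants are given per degree squared, they look deceptively small. The sketch below evaluates the harmonic bending energy for a Csp3-Csp3-Csp3 angle and converts the force constant to the per-radian units used by many programs; the function names are illustrative, not taken from any package.

```python
import math

def angle_energy(theta_deg, k_deg, theta0_deg):
    """E_b = 1/2 k_theta (theta - theta0)^2, angles in degrees."""
    return 0.5 * k_deg * (theta_deg - theta0_deg) ** 2

def k_per_rad2(k_deg):
    """Convert a force constant from kcal/(mol degree^2) to kcal/(mol rad^2)."""
    return k_deg * (180.0 / math.pi) ** 2

# Csp3-Csp3-Csp3: theta0 = 109.47 degrees, k = 0.0099 kcal/(mol degree^2)
print(angle_energy(114.47, 0.0099, 109.47))  # energy of a 5-degree distortion
print(k_per_rad2(0.0099))                    # same force constant per rad^2
```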
Basic types of torsional potentials

Single bond between sp3 carbons or between an sp3 carbon and nitrogen
Example: C-C-C-C quadruplet
$$E_{tor}(\varphi) = 1.6\,(1 + \cos 3\varphi)$$

Double or partially double bonds
Example: C-C(carboxyl)-C(amide)-C quadruplet
$$E_{tor}(\varphi) = 20\,(1 - \cos 2\varphi)$$
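Single bond between electronegative atoms (oxygens, sulfurs, etc.)
Example: C-S-S-C quadruplet
$$E_{tor}(\varphi) = 3.5\,(1 + \cos 2\varphi) + 0.6\,(1 + \cos\varphi)$$

[Figure: E_tor [kcal/mol] plotted against the dihedral angle [deg] for the three potentials.]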
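The first two torsional potentials can be checked numerically. The sketch below assumes the sign conventions that put the minima of the C-C-C-C term at the staggered angles (60°, 180°, 300°) and those of the double-bond term at the planar angles (0°, 180°); the grid search for minima is a simple illustration, not a general optimizer.

```python
import math

def e_tor_single(phi_deg):
    """C-C-C-C: E_tor = 1.6 (1 + cos 3 phi), in kcal/mol."""
    return 1.6 * (1.0 + math.cos(3.0 * math.radians(phi_deg)))

def e_tor_double(phi_deg):
    """Double or partially double bond: E_tor = 20 (1 - cos 2 phi), in kcal/mol."""
    return 20.0 * (1.0 - math.cos(2.0 * math.radians(phi_deg)))

def minima(potential, step=1):
    """Grid angles in [0, 360) that are local minima of the potential."""
    return [a for a in range(0, 360, step)
            if potential(a) <= potential(a - step)
            and potential(a) <= potential(a + step)]

print(minima(e_tor_single), e_tor_single(0))    # staggered minima, eclipsed barrier
print(minima(e_tor_double), e_tor_double(90))   # planar minima, rotation barrier
```

The barrier heights (3.2 and 40 kcal/mol) show why rotation about a single bond is easy at room temperature while rotation about a double or partially double bond is not.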
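Potentials imposed on improper torsional angles

[Figure: definition of the improper torsional angle ω for atoms A, B, X, X.]

$$E_{tor} = \begin{cases} V_2\,(1 \pm \cos 2\omega) \\ V_3\,(1 \pm \cos 3\omega) \end{cases}$$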
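Nonbonded Lennard-Jones (6-12) potential

$$E_{nb}(r) = \varepsilon\left[\left(\frac{r^0}{r}\right)^{12} - 2\left(\frac{r^0}{r}\right)^{6}\right] = 4\varepsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right], \qquad r^0 = 2^{1/6}\sigma$$

Lorentz-Berthelot combining rules:
$$r^0_{ij} = r^0_i + r^0_j, \qquad \varepsilon_{ij} = \sqrt{\varepsilon_i\,\varepsilon_j}$$

[Figure: E_nb [kcal/mol] plotted against r [Å]; the minimum of depth -ε lies at r = r0.]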
Sample values of εi and r0i
Atom type | r0 [Å] | ε [kcal/mol]
C(carbonyl) | 1.85 | 0.12
C(sp3) | 1.80 | 0.06
N(sp3) | 1.85 | 0.12
O(carbonyl) | 1.60 | 0.20
H(bonded with C) | 1.00 | 0.02
S | 2.00 | 0.20
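The combining rules and the two equivalent forms of the 6-12 potential can be sketched as follows. The per-atom values are the sample values listed above (assumed here to be in Å and kcal/mol); the dictionary and function names are illustrative only.

```python
import math

# (r0_i, eps_i) sample values for a few atom types
params = {
    "C(carbonyl)": (1.85, 0.12),
    "C(sp3)": (1.80, 0.06),
    "O(carbonyl)": (1.60, 0.20),
    "H(bonded with C)": (1.00, 0.02),
}

def combine(a, b):
    """Pair parameters: r0_ij = r0_i + r0_j, eps_ij = sqrt(eps_i eps_j)."""
    r0a, ea = params[a]
    r0b, eb = params[b]
    return r0a + r0b, math.sqrt(ea * eb)

def lj_r0(r, r0, eps):
    """E = eps [(r0/r)^12 - 2 (r0/r)^6]; minimum of depth -eps at r = r0."""
    return eps * ((r0 / r) ** 12 - 2.0 * (r0 / r) ** 6)

def lj_sigma(r, sigma, eps):
    """E = 4 eps [(sigma/r)^12 - (sigma/r)^6]; zero crossing at r = sigma."""
    return 4.0 * eps * ((sigma / r) ** 12 - (sigma / r) ** 6)

r0, eps = combine("C(carbonyl)", "O(carbonyl)")
sigma = r0 / 2.0 ** (1.0 / 6.0)
print(r0, eps, lj_r0(r0, r0, eps), lj_sigma(3.0, sigma, eps))
```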
Other nonbonded potentials

Buckingham potential:
$$E_{nb}(r) = A\exp\left(-\frac{r}{\rho}\right) - \frac{C}{r^6}$$

10-12 potential, used in some force fields (e.g., ECEPP) for proton…proton-acceptor pairs:
$$E_{hb}(r) = \frac{C}{r^{12}} - \frac{D}{r^{10}}$$
Coulombic (electrostatic) potential
Charge determination
• Mulliken population charges (ECEPP/3, other early force fields).
• Fitting to molecular electrostatic potentials, with subsequent adjustment to reproduce potential-energy surfaces, experimental association energies, etc.
• Based on atomic electronegativities, with corrections for topology and geometry (No and coworkers, J. Phys. Chem. B, 105, 3624–3634, 2001; Koca and coworkers, J. Chem. Inf. Model., 53, 2548–2558, 2013).
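Charge determination: fitting to molecular electrostatic potential (MEP) maps

$$V^{ab\ initio}(\mathbf{R}) = \sum_a \frac{Z_a}{|\mathbf{R} - \mathbf{R}_a|} - \int \frac{\rho_{el}(\mathbf{r})}{|\mathbf{r} - \mathbf{R}|}\,dV$$

$$V^{Coulomb}(\mathbf{R}; q_1,\ldots,q_N) = \sum_a \frac{q_a}{|\mathbf{R} - \mathbf{R}_a|}$$

$$F(q_1,\ldots,q_N) = \sum_i \left[V^{ab\ initio}(\mathbf{R}_i) - V^{Coulomb}(\mathbf{R}_i; q_1,\ldots,q_N)\right]^2 \to \min, \qquad \sum_{j=1}^{N} q_j = Q$$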
Charge determination: fitting to molecular
electrostatic potential (MEP) maps
Ab initio calculations
Fitted by using CHELP-SV
Francl et al., J. Comput. Chem., 17, 367-383 (1996)
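The constrained least-squares fit can be sketched in plain Python. Everything below is an illustration under stated assumptions: the reference potential is generated from known point charges rather than from an ab initio calculation, the grid is a small cube of points, units are arbitrary but consistent, and the total-charge constraint is imposed with a Lagrange multiplier. This is not the CHELP-SV procedure itself.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def dist(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def fit_charges(atoms, grid, v_ref, total_charge):
    """Minimize sum_i [v_ref_i - sum_a q_a/|R_i - R_a|]^2 subject to sum_a q_a = Q."""
    n = len(atoms)
    d = [[1.0 / dist(g, a) for a in atoms] for g in grid]  # design matrix
    # Normal equations bordered by the Lagrange-multiplier row and column.
    lhs = [[sum(row[a] * row[b] for row in d) for b in range(n)] + [1.0]
           for a in range(n)]
    lhs.append([1.0] * n + [0.0])
    rhs = [sum(row[a] * v for row, v in zip(d, v_ref)) for a in range(n)]
    rhs.append(total_charge)
    return solve(lhs, rhs)[:n]

# Hypothetical water-like test case: O at the origin and two H atoms.
atoms = [(0.0, 0.0, 0.0), (0.757, 0.586, 0.0), (-0.757, 0.586, 0.0)]
true_q = [-0.834, 0.417, 0.417]
grid = [(x, y, z) for x in (-3.0, 0.0, 3.0) for y in (-3.0, 0.0, 3.0)
        for z in (-3.0, 0.0, 3.0) if (x, y, z) != (0.0, 0.0, 0.0)]
v_ref = [sum(q / dist(g, a) for q, a in zip(true_q, atoms)) for g in grid]

fitted = fit_charges(atoms, grid, v_ref, total_charge=0.0)
print(fitted)
```

With a noise-free reference potential the fit recovers the generating charges; with a real ab initio map the residual F stays nonzero and the result depends on where the grid points are placed.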
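Polarizable force fields

$$U_{pol} = -\frac{1}{2}\sum_i \boldsymbol{\mu}_i^{ind}\cdot\mathbf{E}_i = -\frac{1}{2}\sum_i \boldsymbol{\mu}_i^{ind}\cdot\left(\mathbf{E}_i^{0} - \sum_{j\neq i}\mathbf{T}_{ij}\,\boldsymbol{\mu}_j^{ind}\right)$$

$$\boldsymbol{\mu}_i^{ind} = \alpha_i\left(\mathbf{E}_i^{0} - \sum_{j\neq i}\mathbf{T}_{ij}\,\boldsymbol{\mu}_j^{ind}\right)$$

$$\mathbf{T}_{ij} = \frac{1}{r_{ij}^{3}}\left(\mathbf{I} - 3\,\hat{\mathbf{r}}_{ij}\hat{\mathbf{r}}_{ij}^{T}\right)$$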
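Because each induced dipole depends on all the others, the dipole equation is usually solved iteratively. The sketch below uses a plain fixed-point (Jacobi) iteration and assumes the convention T_ij = (I - 3 r̂ r̂ᵀ)/r³ with the field of dipole j at site i equal to -T_ij μ_j; the two-site test system, the polarizabilities, and the units are hypothetical.

```python
import math

def dipole_tensor(ri, rj):
    """T_ij = (I - 3 r^ r^T) / r^3 as a 3x3 nested list."""
    d = [a - b for a, b in zip(ri, rj)]
    r = math.sqrt(sum(c * c for c in d))
    u = [c / r for c in d]
    return [[((1.0 if a == b else 0.0) - 3.0 * u[a] * u[b]) / r ** 3
             for b in range(3)] for a in range(3)]

def induced_dipoles(positions, alphas, e0, tol=1e-12, max_iter=1000):
    """Iterate mu_i = alpha_i (E0_i - sum_{j != i} T_ij mu_j) to self-consistency."""
    n = len(positions)
    mu = [[a * e for e in field] for a, field in zip(alphas, e0)]
    for _ in range(max_iter):
        new = []
        for i in range(n):
            field = list(e0[i])
            for j in range(n):
                if j != i:
                    t = dipole_tensor(positions[i], positions[j])
                    for a in range(3):
                        field[a] -= sum(t[a][b] * mu[j][b] for b in range(3))
            new.append([alphas[i] * f for f in field])
        change = max(abs(x - y) for p, q in zip(new, mu) for x, y in zip(p, q))
        mu = new
        if change < tol:
            break
    return mu

# Two identical sites on the z axis in a uniform field along z.
positions = [(0.0, 0.0, 0.0), (0.0, 0.0, 3.0)]
mu = induced_dipoles(positions, [1.0, 1.0], [(0.0, 0.0, 0.1)] * 2)
print(mu)
```

For this collinear arrangement the self-consistent dipole is α E0 / (1 - 2α/r³), larger than the non-interacting value α E0, because each dipole reinforces the field at the other site.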
Sources of parameters
Energy contribution | Source of parameters
Bond and bond-angle distortion | Crystal and neutron-diffraction data, IR spectroscopy
Torsional | NMR and FTIR spectroscopy
Nonbonded interactions | Polarizabilities, crystal and neutron-diffraction data
Electrostatic energy | Molecular electrostatic potentials
All | Energy surfaces of model systems calculated with molecular quantum mechanics
Class II force fields (MM3, MMFF, UFF, CFF)
Maple et al., J. Comput. Chem., 15, 162-182 (1994)
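Parameterization of class II force fields

$$F(p_1, p_2, \ldots, p_m) = \sum_n w_n^{(E)}\left[E'_n(\mathbf{p}) - E_n^{\prime\,QM}\right]^2 + \sum_n w_n^{(f)}\sum_i\left[\frac{\partial E_n(\mathbf{p})}{\partial x_i} - \frac{\partial E_n^{QM}}{\partial x_i}\right]^2 + \sum_n w_n^{(h)}\sum_i\sum_{j\le i}\left[\frac{\partial^2 E_n(\mathbf{p})}{\partial x_i\,\partial x_j} - \frac{\partial^2 E_n^{QM}}{\partial x_i\,\partial x_j}\right]^2$$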
Solvent in simulations
• Explicit water
  • TIP3P
  • TIP4P
  • TIP5P
  • SPC
• Implicit water
  • Solvent accessible surface area (SASA) models
  • Molecular surface area models
  • Poisson-Boltzmann approach
  • Generalized Born surface area (GBSA) model
  • Polarizable continuum model (PCM)
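TIP3P and TIP4P models (H-O-H angle: 104.52°)
Parameter | TIP3P | TIP4P
q(O) | -0.834 e | 0.00 e
q(H) | 0.417 e | 0.520 e
q(M) | (no M site) | -1.040 e
O-M distance | (no M site) | 0.15 Å
σ(O) | 3.1507 Å | 3.1535 Å
ε(O) | 0.1521 kcal/mol | 0.1550 kcal/mol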
Solvent accessible surface area (SASA) models

$$F_{solv} = \sum_{i}^{atoms} \sigma_i A_i$$

σi: free energy of solvation of atom i per unit area
Ai: solvent-accessible surface area of atom i
Vila et al., Proteins: Structure, Function, and Genetics, 1991, 10, 199-218.
Comparison of the lowest-energy conformations of [Met5]enkephalin (H-Tyr-Gly-Gly-Phe-Met-OH) obtained with the ECEPP/3 force field in vacuo and with the SRFOPT model
[Figure panels: vacuum; SRFOPT]
Comparison of the molecular surfaces of the lowest-energy conformations of [Met5]enkephalin obtained without and with the SRFOPT model
[Figure panels: vacuum; SRFOPT]
Molecular surface area model

$$F_{cav} = \gamma A$$

γ: surface tension
A: molecular surface area
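Generalized Born molecular surface (GBSA) model

$$F_{solv} = F_{cav} + E_{pol}^{GB}$$

$$E_{pol}^{GB} = -332\sum_{i<j} q_i q_j\left(\frac{1}{\varepsilon_{in}} - \frac{1}{\varepsilon_{out}}\right)\frac{1}{f_{GB}(r_{ij})}$$

$$f_{GB}(r_{ij}) = \sqrt{r_{ij}^{2} + R_i R_j \exp\left(-\frac{r_{ij}^{2}}{4 R_i R_j}\right)}$$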
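The role of the interpolating function f_GB is easiest to see from its two limits. The sketch below evaluates f_GB and a single pair term of the polarization energy; the default dielectric constants (1 inside, 80 outside) and the Born radii used in the example are assumptions made for illustration, and self terms are not included.

```python
import math

def f_gb(r, ri, rj):
    """f_GB = sqrt(r^2 + Ri Rj exp(-r^2 / (4 Ri Rj)))."""
    return math.sqrt(r * r + ri * rj * math.exp(-r * r / (4.0 * ri * rj)))

def e_gb_pair(qi, qj, r, ri, rj, eps_in=1.0, eps_out=80.0):
    """Pair term: -332 qi qj (1/eps_in - 1/eps_out) / f_GB, in kcal/mol."""
    return -332.0 * (1.0 / eps_in - 1.0 / eps_out) * qi * qj / f_gb(r, ri, rj)

print(f_gb(0.0, 2.0, 2.0))    # r -> 0: f_GB -> sqrt(Ri Rj), the Born limit
print(f_gb(50.0, 2.0, 2.0))   # large r: f_GB -> r, the Coulomb limit
print(e_gb_pair(1.0, -1.0, 50.0, 2.0, 2.0))
```

At large separation the pair term reduces to the difference between the Coulomb energies in the outer and inner dielectrics, which is the screening that explicit solvent would otherwise provide.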
Protein structure calculation/prediction and folding
simulations
• Single energy minimization (wishful thinking at the early stage of force-field development).
• Global optimization of the PES (ignores conformational entropy).
• Molecular dynamics/Monte Carlo (take entropy into account, but are slow and liable to non-convergence).
• Generalized ensemble sampling (MREMD).
Force field validation
Structure of gramicidin S predicted by using the build-up procedure with energy minimization with the ECEPP/3 force field (M. Dygert, N. Go, H.A. Scheraga, Macromolecules, 8, 750-761 (1975)). The structure turned out to be effectively identical with the NMR structure determined later.
Global optimization of the energy surface of the N-terminal portion of
the B-domain of staphylococcal protein A with all-atom ECEPP/3 force
field + SRFOPT mean-field solvation model (Vila et al., PNAS, 2003,
100, 14812–14816)
Superposition of the native fold (cyan) and the conformation (red) with the lowest Cα RMSD (2.85 Å) from the native fold
Energy-RMSD diagram
First successful folding simulation of a globular protein by
molecular dynamics
Duan and Kollman, Science, 282 (5389), 740-744 (1998)
Folding proteins at x-ray resolution using the specially designed ANTON machine (blue: x-ray structure; red: last frame of the MD simulation): villin headpiece (left), 88 ns of simulation; WW domain (right), 58 ms of simulation. Good symplectic algorithm; up to 20 fs time step.
D.E. Shaw et al., Science, 2010, 330, 341-346