Molecular Modeling and Informatics. C371 Introduction to Cheminformatics. Kelsey Forsythe.


Molecular Modeling
and Informatics
C371
Introduction to Cheminformatics
Kelsey Forsythe
Characteristics of Molecular Modeling
• Representing behavior of molecular systems
• Visual rendering of molecules (tinker toys to LCDs)
• Mathematical rendering (differential equations, matrix algebra) of molecular interactions
• Time-dependent and time-independent realms

Molecular Modeling
[Figure: pictorial equation, image + image = "Valence Bond Theory"]
Underlying equations:
• Empirical (approximate, soluble)
  – Morse potential: $V_{HH}(R) = D_0 \left(1 - e^{-a(R - R_0)}\right)^2$
• Ab initio (exact, but insoluble except for the hydrogen atom)
  – Schrödinger wave equation: $\hat{H}\Psi = E\Psi$

[Figure: "Empirical Potential for Hydrogen Molecule", potential energy curve; vertical axis ticks from 0 to 1.4E-18, horizontal axis ticks from 0 to 4]
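The Morse form named on the "Underlying equations" slide can be evaluated directly. The sketch below is a minimal illustration, not the calculation behind the plotted curve: the parameter values (D0 = 4.75 eV, a = 1.94 per angstrom, R0 = 0.741 angstrom) are assumed, roughly H2-like numbers, and the harmonic comparison is added only to show where the simple spring picture departs from the Morse curve.

```python
import math

def morse(r, d0=4.75, a=1.94, r0=0.741):
    """Morse potential V(R) = D0 * (1 - exp(-a * (R - R0)))**2.

    Defaults are assumed, roughly H2-like values (eV, 1/angstrom, angstrom).
    """
    return d0 * (1.0 - math.exp(-a * (r - r0))) ** 2

def harmonic(r, d0=4.75, a=1.94, r0=0.741):
    """Harmonic approximation with the same curvature at R0 (k = 2 * D0 * a**2)."""
    k = 2.0 * d0 * a ** 2
    return 0.5 * k * (r - r0) ** 2

# Compare the two curves at a few separations (angstrom).
for r in (0.5, 0.741, 1.0, 2.0, 5.0):
    print(f"R = {r:5.3f}  Morse = {morse(r):7.3f}  harmonic = {harmonic(r):8.3f}")
```

At large separation the Morse energy levels off at D0 (dissociation), while the harmonic curve keeps rising.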
Empirical Models
• Simple/Elegant?
• Intuitive? Vibrations ($F = -kr$)
• Major drawbacks:
  – Does not include quantum mechanical effects
  – No information about bonding (re)
  – Not generic (organic vs. inorganic)

Informatics
• Interface between parameter data sets and systems of interest
• Teaching computers to develop new potentials from existing math templates
MMFF Potential
• $E = E_{bond} + E_{angle} + E_{angle-bond} + E_{torsion} + E_{VDW} + E_{electrostatic}$
Atomistic Model History
• Atomic Spectra
  – Balmer (1885)
• Plum-Pudding Model
  – J. J. Thomson (circa 1900)
• Quantization
  – Planck (circa 1900)
• Planetary Model
  – Niels Bohr (circa 1913)
• Wave-Particle Duality
  – De Broglie (circa 1924)
• Schrödinger Wave Equation
  – Erwin Schrödinger and Werner Heisenberg
Classical vs. Quantum
Classical:
• Trajectory
• Real numbers
• Deterministic ("The value is ___")
• Variables
• Continuous energy spectrum
Quantum:
• Wavefunction
• Complex (real and imaginary components)
• Probabilistic ("The average value is ___")
• Operators
• Discrete/quantized energy
• Tunneling
• Zero-point energy
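Schrödinger's Equation
$$\hat{H}\Psi = E\Psi$$
$\hat{H}$ – Hamiltonian operator
$$\hat{H} = \hat{T} + \hat{V} = -\sum_{i}^{N} \frac{\hbar^2}{2m_i}\nabla_i^2 + C\sum_{i<j}^{N} \frac{e_i e_j}{|r_i - r_j|}$$
Gravity?
[Equation: one-dimensional form of $\hat{H}$ acting on a function of $r$, involving $\partial^2/\partial r^2$; not fully recoverable]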
[Figure: "Empirical Potential for Hydrogen Molecule", potential energy curve; vertical axis ticks from 0 to 1.4E-18, horizontal axis ticks from 0 to 4]
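Hydrogen Molecule Hamiltonian
$$\hat{H} = \hat{T} + \hat{V}$$
$$\hat{H} = -\frac{\hbar^2}{2}\left(\frac{\nabla_{p1}^2}{m_p} + \frac{\nabla_{p2}^2}{m_p} + \frac{\nabla_{e1}^2}{m_e} + \frac{\nabla_{e2}^2}{m_e}\right) + C\left(\frac{1}{r_{e1e2}} + \frac{1}{r_{p1p2}} - \frac{1}{r_{p1e1}} - \frac{1}{r_{p1e2}} - \frac{1}{r_{p2e1}} - \frac{1}{r_{p2e2}}\right)$$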
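Born-Oppenheimer Approximation
$$\hat{H}_{el} = \hat{T}_{el} + \hat{V}_{el-nuclei} + V_{nuclei}$$
$$\hat{H}_{el} = -\frac{\hbar^2}{2}\left(\frac{\nabla_{e1}^2}{m_e} + \frac{\nabla_{e2}^2}{m_e}\right) + C\left(\frac{1}{r_{e1e2}} - \frac{1}{r_{p1e1}} - \frac{1}{r_{p1e2}} - \frac{1}{r_{p2e1}} - \frac{1}{r_{p2e2}}\right) + C\,\frac{1}{r_{p1p2}}$$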
Now Solve Electronic Problem
Electronic Schrödinger Equation
• Solutions: $\Phi(r) = \sum_m c_m \phi_m(r)$
  – $\{\phi_m(r)\}$, the basis set, are of a known form
  – Need to determine the coefficients ($c_m$)
• The wavefunction gives the probability of finding electrons in space (e.g. s, p, d and f orbitals)
• Molecular orbitals are formed by linear combinations of atomic orbitals (LCAO)
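To make "determine the coefficients" concrete, here is a minimal sketch for the smallest possible LCAO problem: two identical atomic orbitals, as in a minimal-basis picture of H2. By symmetry the secular equations have closed-form solutions E = (alpha ± beta) / (1 ± S). The names `alpha`, `beta` and `s`, and the numbers used, are illustrative assumptions rather than values from the lecture.

```python
import math

def lcao_two_orbital(alpha, beta, s):
    """Bonding/antibonding solutions for two identical atomic orbitals.

    alpha: on-site energy <phi_1|H|phi_1> = <phi_2|H|phi_2>
    beta:  coupling <phi_1|H|phi_2>
    s:     overlap <phi_1|phi_2>
    Returns a list of (energy, (c1, c2)) sorted by energy.
    """
    solutions = []
    for sign in (+1, -1):
        energy = (alpha + sign * beta) / (1.0 + sign * s)
        # Normalization follows from c1**2 + c2**2 + 2*c1*c2*s = 1.
        norm = 1.0 / math.sqrt(2.0 * (1.0 + sign * s))
        solutions.append((energy, (norm, sign * norm)))
    return sorted(solutions, key=lambda item: item[0])

# Illustrative numbers only (arbitrary energy units).
(e_bond, c_bond), (e_anti, c_anti) = lcao_two_orbital(alpha=-0.5, beta=-0.4, s=0.6)
print("bonding:    ", e_bond, c_bond)
print("antibonding:", e_anti, c_anti)
```

The lower-energy solution has equal coefficients of the same sign (the bonding orbital); the higher one has opposite signs (the antibonding orbital).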

Hydrogen Molecule
[Figures: HOMO and LUMO]
Hydrogen Molecule
[Figure: Bond Density]
Ab Initio/DFT
• Complete description!
• Generic!
• Major drawbacks:
  – Mathematics can be cumbersome
  – Exact solution only for hydrogen
• Informatics
  – Approximate solution is time and storage intensive
  – Acquisition, manipulation and dissemination problems
Approximate Methods
• SCF (Self-Consistent Field) method (a.k.a. mean field or Hartree-Fock)
  – Pick a single electron and average the influence of the remaining electrons as a single force field (V0, external)
  – Then solve the Schrödinger equation for the single electron in the presence of that field (e.g. H-atom problem with an extra force field)
  – Perform for all electrons in the system
  – Combine to give the system wavefunction and energy (E)
  – Repeat to error tolerance ($E_{i+1} - E_i$)
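The loop described above can be sketched on a deliberately tiny model. This is not a Hartree-Fock code for a real molecule: it assumes a hypothetical two-site system with site energies `e1` and `e2`, hopping `t` and on-site repulsion `u`, in which two electrons share one orbital and each feels the averaged repulsion of the other. The damping factor `mix` is also an assumption, added to keep the iteration stable.

```python
import math

def scf_two_site(e1=-1.0, e2=0.0, t=1.0, u=1.0, tol=1e-10, mix=0.5, max_iter=200):
    """Toy restricted mean-field loop: two electrons share one orbital on two sites.

    Each electron feels the averaged repulsion u * n_i of the other electron.
    Returns (energy, (n1, n2), iterations, converged).
    """
    n1, n2 = 0.5, 0.5          # initial guess for the per-electron site densities
    energy_old = None
    for iteration in range(1, max_iter + 1):
        # Effective one-electron matrix [[a, b], [b, d]] in the current mean field.
        a, b, d = e1 + u * n1, -t, e2 + u * n2
        # Lowest eigenvalue and eigenvector of the symmetric 2x2 matrix.
        lam = 0.5 * (a + d) - math.sqrt(0.25 * (a - d) ** 2 + b ** 2)
        x, y = b, lam - a
        norm = math.hypot(x, y)
        c1, c2 = x / norm, y / norm
        new_n1, new_n2 = c1 ** 2, c2 ** 2
        # Total energy: one-electron part for both electrons plus their repulsion.
        one_electron = e1 * new_n1 + e2 * new_n2 - 2.0 * t * c1 * c2
        energy = 2.0 * one_electron + u * (new_n1 ** 2 + new_n2 ** 2)
        # Stop when the energy change between iterations is below tolerance.
        if energy_old is not None and abs(energy - energy_old) < tol:
            return energy, (new_n1, new_n2), iteration, True
        energy_old = energy
        # Damped update of the mean field (simple linear mixing).
        n1 = (1.0 - mix) * n1 + mix * new_n1
        n2 = (1.0 - mix) * n2 + mix * new_n2
    return energy_old, (n1, n2), max_iter, False

energy, densities, iterations, converged = scf_two_site()
print(energy, densities, iterations, converged)
```

The structure mirrors the slide: build the averaged field, solve the one-electron problem, recompute the energy, and repeat until the change $E_{i+1} - E_i$ falls below the tolerance.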
Correcting Approximations
• Accounting for electron correlations
  – DFT (Density Functional Theory)
  – Møller-Plesset (perturbation theory)
  – Configuration Interaction (coupling single-electron problems)

Geometry Optimization
• First derivative is zero: $\frac{dV(r)}{dr} = 0$
• As N increases, so does dimensionality/complexity/beauty/difficulty
• Multi-dimensional (macromolecules, proteins)
  – Conjugate gradient methods
  – Monte Carlo methods
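As a one-dimensional illustration of the condition dV/dr = 0, the sketch below walks downhill on a Morse curve until the gradient vanishes. Plain steepest descent is used here for brevity in place of the conjugate gradient and Monte Carlo methods named above, and the Morse parameters are assumed, roughly H2-like values.

```python
import math

D0, A, R0 = 4.75, 1.94, 0.741   # assumed Morse parameters (eV, 1/angstrom, angstrom)

def morse_gradient(r):
    """Analytic dV/dR of V = D0 * (1 - exp(-A * (R - R0)))**2."""
    u = math.exp(-A * (r - R0))
    return 2.0 * D0 * A * (1.0 - u) * u

def minimize(gradient, r_start, step=0.01, tol=1e-8, max_iter=10000):
    """Steepest descent in one dimension: step against the gradient until |dV/dR| < tol."""
    r = r_start
    for _ in range(max_iter):
        g = gradient(r)
        if abs(g) < tol:
            break
        r -= step * g
    return r

r_opt = minimize(morse_gradient, r_start=1.0)
print(f"optimized bond length: {r_opt:.6f} angstrom")
```

For a molecule with N atoms the same idea applies to a gradient vector with 3N components, which is where the more efficient multi-dimensional methods become necessary.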

Modeling Programs
• Observables
  – Equilibrium bond lengths and angles
  – Vibrational frequencies, UV-VIS, NMR shifts
  – Solvent effects (e.g. LogP)
  – Dipole moments, atomic charges
  – Electron density maps
  – Reaction energies

Comparison to Experiments
• The electronic Schrödinger equation gives bonding energies for non-vibrating molecules (nuclei fixed at equilibrium geometry) at 0 K
  – Can estimate $G = H - TS$ using frequencies
  – $E_{out}$ is NOT $\Delta H_f$!
• Bond separation reactions (simplest 2-heavy-atom components) provide a path to heats of formation
$$CH_3CH_2CH_3 + CH_4 \rightarrow 2\,CH_3CH_3$$
$$\Delta H_f(CH_3CH_2CH_3) = -\Delta E_{bond\,separation} - \Delta H_f(CH_4) + 2\,\Delta H_f(CH_3CH_3)$$
$$\Delta E_{bond\,separation} = E^{QM}_{prod} - E^{QM}_{react} = 2E^{QM}_{CH_3CH_3} - \left(E^{QM}_{CH_3CH_2CH_3} + E^{QM}_{CH_4}\right)$$
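The bookkeeping for the propane example can be written out as a short function. It takes the bond separation energy as products minus reactants and treats it as the reaction enthalpy, which is an assumption that ignores zero-point and thermal corrections. The input numbers are placeholders chosen only to exercise the arithmetic; they are not computed or measured values.

```python
def heat_of_formation_from_bond_separation(e_qm, hf_known):
    """Estimate dHf(propane) from CH3CH2CH3 + CH4 -> 2 CH3CH3.

    e_qm:     computed total energies keyed by species
    hf_known: known heats of formation for CH4 and CH3CH3
    All values must share one energy unit (e.g. kcal/mol).
    """
    # Bond separation energy: products minus reactants.
    delta_e = 2.0 * e_qm["CH3CH3"] - (e_qm["CH3CH2CH3"] + e_qm["CH4"])
    # Rearranged reaction enthalpy: dHf(propane) = -dE - dHf(CH4) + 2 dHf(CH3CH3).
    return -delta_e - hf_known["CH4"] + 2.0 * hf_known["CH3CH3"]

# Placeholder inputs, not real data.
e_qm = {"CH3CH2CH3": -300.0, "CH4": -100.0, "CH3CH3": -199.0}
hf_known = {"CH4": -18.0, "CH3CH3": -20.0}
hf_propane = heat_of_formation_from_bond_separation(e_qm, hf_known)
print(hf_propane)
```

The point of the construction is that errors in the absolute QM energies largely cancel across the reaction, because the same bond types appear on both sides.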

Ab Initio Modeling Limits
• Function of basis and method used
• Accuracy
  – ~0.02 angstroms
  – ~2-4 kcal/mol
• N
  – HF: 50-100 atoms
  – DFT: 500-1000 atoms
Semi-Empirical Methods
• Neglect inner core electrons
• Neglect of Diatomic Differential Overlap (NDDO)
  – Atomic orbitals on two different atomic centers do not overlap
• Reduces computation time dramatically

Other Methods
• Energetics
  – Monte Carlo
  – Genetic algorithms
  – Maximum entropy methods
  – Simulated annealing
• Dynamics
  – Finite difference
  – Monte Carlo
  – Fourier analysis
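As one example of the finite difference entry under Dynamics, the sketch below integrates Newton's second law with the velocity Verlet scheme. The slides do not name a specific integrator, so this choice is an assumption, as are the harmonic test force and the unit spring constant and mass.

```python
def velocity_verlet(force, x, v, mass, dt, n_steps):
    """Integrate Newton's second law with the velocity Verlet finite-difference scheme."""
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * (f / mass) * dt ** 2   # position update
        f_new = force(x)                              # force at the new position
        v = v + 0.5 * (f + f_new) / mass * dt         # velocity from averaged force
        f = f_new
        trajectory.append((x, v))
    return trajectory

K, MASS = 1.0, 1.0   # illustrative spring constant and mass

def total_energy(x, v):
    """Kinetic plus potential energy of the harmonic test system."""
    return 0.5 * MASS * v ** 2 + 0.5 * K * x ** 2

traj = velocity_verlet(lambda x: -K * x, x=1.0, v=0.0, mass=MASS, dt=0.01, n_steps=1000)
drift = abs(total_energy(*traj[-1]) - total_energy(*traj[0]))
print(f"energy drift after {len(traj) - 1} steps: {drift:.2e}")
```

Monitoring the total energy is a common sanity check on the time step: a drift that grows steadily usually means the step is too large for the fastest motion in the system.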
Large Scale Modeling (>1000 atoms)
• Challenges
  – Many bodies (Avogadro's number!!)
  – Multi-faceted interactions (heterogeneous, solute-solvent, long- and short-range interactions, multiple time scales)
• Informatics
  – Split problem into a set of smaller problems (e.g. grid analysis, popular in engineering)
  – Periodic boundary conditions
  – Connection tables
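Periodic boundary conditions are commonly paired with the minimum image convention, in which each particle interacts with the nearest periodic copy of every other particle. The sketch below assumes an orthorhombic box; the function name and box size are illustrative.

```python
def minimum_image_distance(p, q, box):
    """Distance between points p and q in an orthorhombic periodic box.

    Each coordinate difference is wrapped to the nearest periodic image.
    """
    total = 0.0
    for a, b, length in zip(p, q, box):
        d = a - b
        d -= length * round(d / length)   # wrap into [-length/2, length/2]
        total += d * d
    return total ** 0.5

box = (10.0, 10.0, 10.0)   # illustrative box edge lengths
d = minimum_image_distance((0.5, 0.5, 0.5), (9.5, 0.5, 0.5), box)
print(d)
```

Two particles near opposite faces of the box are treated as close neighbors through the boundary, so a small simulated cell can stand in for a bulk system.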
Large Scale Modeling
Hybrid Methods
• Different spatial realms
  – Treat part of the system (e.g. solvent) as classical point particles and the remainder (e.g. solute) as quantum particles
• Different time domains
  – Vibrations (pico- to femtoseconds) vs. sliding (microseconds)
  – Classical (Newton's 2nd law) vs. quantum (TDSE)

Reference Materials
• Journal of Molecular Graphics and Modelling
• Journal of Molecular Modeling
• Journal of Chemical Physics
• THEOCHEM
• Molecular Graphics and Modelling Society
• NIH Center for Molecular Modeling
• “Quantum Mechanics” by McQuarrie
• “Computer Simulation of Liquids” by Allen and Tildesley
Modeling Programs
• Spartan (www.wavefun.com)
• MacroModel (www.schrodinger.com)
• Sybyl (www.tripos.com)
• Gaussian (www.gaussian.com)
• Jaguar (www.schrodinger.com)
• Cerius2 and Insight II (www.accelrys.com)
• Quanta
• CharMM
• GAMESS
• PCModel
• Amber
Summary
• Types of Models
  – Tinker toys
  – Empirical/classical (Newtonian physics)
  – Quantal (Schrödinger equation)
  – Semi-empirical
• Informatic Modeling
  – Conformational searching (QSAR, CoMFA)
  – Generating new potentials
  – Quantum informatics
Next Time

QSAR (Read Chapter 4)
MMFF Energy
• Stretching
$$E_{bond} = K_{bond}\,(r_{ij} - r_{ij}^0)^2 \left[1 + cs\,(r_{ij} - r_{ij}^0) + \frac{7}{12}\,cs^2\,(r_{ij} - r_{ij}^0)^2\right]$$
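A minimal sketch of the stretching term, assuming the quartic form E = K (r - r0)^2 [1 + cs (r - r0) + (7/12) cs^2 (r - r0)^2] with no unit-conversion prefactor. The default cubic-stretch constant `cs = -2.0` per angstrom is an assumption, and the force constant and bond lengths in the example are placeholders rather than MMFF parameters.

```python
def mmff_style_bond_energy(r, r0, k_bond, cs=-2.0):
    """Quartic bond-stretch term (no unit-conversion prefactor).

    E = k_bond * dr**2 * (1 + cs*dr + (7/12) * cs**2 * dr**2), with dr = r - r0.
    cs = -2.0 per angstrom is an assumed cubic-stretch constant.
    """
    dr = r - r0
    return k_bond * dr ** 2 * (1.0 + cs * dr + (7.0 / 12.0) * cs ** 2 * dr ** 2)

# Placeholder parameters: compare stretching and compressing by the same amount.
stretched = mmff_style_bond_energy(1.1, r0=1.0, k_bond=1.0)
compressed = mmff_style_bond_energy(0.9, r0=1.0, k_bond=1.0)
print(stretched, compressed)
```

With a negative `cs`, compressing the bond costs more energy than stretching it by the same amount, which a purely quadratic spring cannot reproduce.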
MMFF Energy
• Bending
$$E_{angle} = K_\theta\,(\theta_{ijk} - \theta_{ijk}^0)^2 \left[1 + cb\,(\theta_{ijk} - \theta_{ijk}^0)\right]$$
MMFF Energy
• Stretch-Bend Interactions
$$E_{bond-angle} = \left[K_{ijk}\,(r_{ij} - r_{ij}^0) + K_{kji}\,(r_{kj} - r_{kj}^0)\right](\theta_{ijk} - \theta_{ijk}^0)$$
MMFF Energy
• Torsion (4-atom bending)
$$E_{torsion} = 0.5\left[V_1(1 + \cos\phi) + V_2(1 - \cos 2\phi) + V_3(1 + \cos 3\phi)\right]$$
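The three-term cosine series can be scanned over the dihedral angle to see the shape of a rotational barrier. The sketch assumes the form 0.5 [V1 (1 + cos phi) + V2 (1 - cos 2 phi) + V3 (1 + cos 3 phi)]; the barrier height used in the example is a placeholder, not a fitted parameter.

```python
import math

def torsion_energy(phi_deg, v1, v2, v3):
    """Three-term cosine torsion: 0.5*[V1(1+cos phi) + V2(1-cos 2phi) + V3(1+cos 3phi)]."""
    phi = math.radians(phi_deg)
    return 0.5 * (v1 * (1.0 + math.cos(phi))
                  + v2 * (1.0 - math.cos(2.0 * phi))
                  + v3 * (1.0 + math.cos(3.0 * phi)))

# Pure threefold term with a placeholder V3: maxima at 0 and 120, minima at 60 and 180.
profile = {angle: torsion_energy(angle, v1=0.0, v2=0.0, v3=3.0) for angle in (0, 60, 120, 180)}
print(profile)
```

A pure V3 term gives the threefold pattern typical of rotation about a single bond between two tetrahedral centers; V1 and V2 add onefold and twofold contributions.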
MMFF Energy
• Analogous to Lennard-Jones 6-12 potential
  – London dispersion forces
  – Van der Waals repulsions
$$E_{VDW} = \varepsilon_{ij}\left(\frac{1.07\,R_{ij}^*}{R_{ij} + 0.07\,R_{ij}^*}\right)^7 \left(\frac{1.12\,R_{ij}^{*7}}{R_{ij}^7 + 0.12\,R_{ij}^{*7}} - 2\right)$$
Intermolecular/atomic models
• General form:
$$V = V(r) = \sum_{i<j}^{N} V(r_i, r_j) + \sum_{i<j<k}^{N} V(r_i, r_j, r_k) + \cdots$$
• Lennard-Jones:
$$V(r_{ij}) = 4\varepsilon\left[\left(\frac{\sigma}{r_{ij}}\right)^{12} - \left(\frac{\sigma}{r_{ij}}\right)^{6}\right]$$
  – $(\sigma/r_{ij})^{12}$ term: Van der Waals repulsion
  – $(\sigma/r_{ij})^{6}$ term: London attraction
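The two-body part of the general expansion is a sum of a pair potential over all pairs i < j. The sketch below applies that sum with the Lennard-Jones form and, for comparison, with a buffered 14-7 form of the kind used for the MMFF van der Waals term. The buffering constants (1.07, 0.07, 1.12, 0.12) are taken here as the published MMFF94 values, which is an assumption about what the slide intended; the reduced units and the three-atom geometry are illustrative.

```python
import math
from itertools import combinations

def lennard_jones(r, epsilon, sigma):
    """Lennard-Jones 12-6 pair energy: 4*eps*[(sigma/r)**12 - (sigma/r)**6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def buffered_14_7(r, epsilon, r_star):
    """Buffered 14-7 pair energy; r_star is the minimum-energy separation."""
    size = (1.07 * r_star / (r + 0.07 * r_star)) ** 7
    shape = 1.12 * r_star ** 7 / (r ** 7 + 0.12 * r_star ** 7) - 2.0
    return epsilon * size * shape

def pairwise_energy(coords, pair_potential):
    """Two-body term of the expansion: sum of V(r_i, r_j) over pairs i < j."""
    return sum(pair_potential(math.dist(a, b)) for a, b in combinations(coords, 2))

# Reduced units (epsilon = sigma = 1): the LJ minimum sits at 2**(1/6).
r_min = 2.0 ** (1.0 / 6.0)
triangle = [(0.0, 0.0, 0.0),
            (r_min, 0.0, 0.0),
            (0.5 * r_min, 0.5 * math.sqrt(3.0) * r_min, 0.0)]
e_lj = pairwise_energy(triangle, lambda r: lennard_jones(r, 1.0, 1.0))
e_b147 = pairwise_energy(triangle, lambda r: buffered_14_7(r, 1.0, r_min))
print(e_lj, e_b147)
```

Unlike the 12-6 form, the buffered form stays finite as the separation goes to zero, which is the practical reason for the buffering constants.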
MMFF Energy
• Electrostatics (ionic compounds)
  – D: dielectric constant
  – d: electrostatic buffering constant
$$E_{electrostatic} = \frac{q_i q_j}{D\,(R_{ij} + d)^n}$$
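A minimal sketch of the buffered Coulomb term, assuming the form q_i q_j / (D (R_ij + d)^n) with no unit-conversion prefactor. The default values (`dielectric=1.0`, `d=0.05`, `n=1`) are assumptions for illustration.

```python
def buffered_coulomb(q_i, q_j, r_ij, dielectric=1.0, d=0.05, n=1):
    """Buffered Coulomb term E = q_i*q_j / (D * (R_ij + d)**n).

    d is the buffering constant; the default 0.05 is an assumed value.
    No unit-conversion prefactor is applied.
    """
    return q_i * q_j / (dielectric * (r_ij + d) ** n)

# Opposite unit charges at two separations, and like charges at zero separation.
print(buffered_coulomb(1.0, -1.0, 0.95))
print(buffered_coulomb(1.0, -1.0, 2.95))
print(buffered_coulomb(1.0, 1.0, 0.0))
```

The buffering constant keeps the energy finite when two charges coincide, and choosing n = 2 would correspond to a distance-dependent dielectric.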


