Molecular Docking
G. Schaftenaar

Docking Challenge
• Identification of the ligand’s correct binding geometry in the binding site (binding mode)
• Observation:
  – Similar ligands can bind in quite different orientations in the active site.

Two main tasks of Docking Tools
• Sampling of conformational (ligand) space
• Scoring protein-ligand complexes

Rigid-body docking algorithms
• Historically the first approaches.
• Protein and ligand fixed.
• Search for the relative orientation of the two molecules with lowest energy.
• FLOG (Flexible Ligands Oriented on Grid): each ligand represented by up to 25 low energy conformations.

Introducing flexibility: Whole molecule docking
• Monte Carlo methods (MC)
• Molecular Dynamics (MD)
• Simulated Annealing (SA)
• Genetic Algorithms (GA)
Available in packages:
AutoDock (MC, GA, SA)
GOLD (GA)
Sybyl (MD)

Monte Carlo
• Start with configuration A (energy E_A)
• Make random move to configuration B (energy E_B)
• Accept move when E_B < E_A; if E_B > E_A, accept with probability P:
\[ P = \exp\left(\frac{E_A - E_B}{kT}\right) \]

Molecular Dynamics
• A force field is used to calculate the forces on each atom of the simulated system
• Following Newtonian mechanics, accelerations and velocities are calculated from the forces (force = mass times acceleration)
• The atoms are then moved a small distance, determined by the chosen time step

Simulated Annealing
Finding a global minimum by gradually lowering the temperature during the Monte Carlo/MD simulation

Genetic Algorithms
• Ligand translation, rotation and configuration variables constitute the genes
• Crossovers mix ligand variables from parent configurations
• Mutations randomly change variables
• Natural selection of current generation based on fitness
• Energy scoring function determines fitness

Introducing flexibility: Fragment Based Methods
• Build small molecules inside defined binding sites while maximizing favorable contacts.
• De Novo methods construct new molecules in the site.
• Division into two major groups:
  – Incremental construction (FlexX, Dock)
  – Place & join.

Placing Fragments and Rigid Molecules
• All rigid-body docking methods have in common that superposition of point sets is a fundamental sub-problem that has to be solved efficiently:
  – Geometric hashing
  – Pose clustering
  – Clique detection

Geometric hashing
• Originates from computer vision
• Given a picture of a scene and a set of objects (models), both represented by points in 2D space, the goal is to recognize some of the models in the scene

Pose-Clustering
• For each triangle of receptor points, compute the transformation to each matching ligand triangle.
• Cluster transformations.
• Score the results.

Clique-Detection
• Nodes consist of matches between protein and ligand
• Edges connect distance-compatible pairs of nodes
• In a clique, all pairs of nodes are connected

Scoring Functions
• Shape & Chemical Complementarity Scores
• Empirical Scoring
• Force Field Scoring
• Knowledge-based Scoring
• Consensus Scoring

Shape & Chemical Complementarity Scores
• Divide accessible protein surface into zones:
  – Hydrophobic
  – Hydrogen-bond donating
  – Hydrogen-bond accepting
• Do the same for the ligand surface
• Find ligand orientation with best complementarity score

Empirical Scoring
Scoring parameters fit to reproduce measured binding affinities (FlexX, LUDI, Hammerhead)

Force Field Scoring (Dock)
\[ E_{nonbond} = \sum_{i}^{lig} \sum_{j}^{prot} \left[ \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}} + c\,\frac{q_i q_j}{r_{ij}} \right] \]
Nonbonding interactions (ligand-protein):
  – van der Waals
  – electrostatics
Amber force field

Knowledge-based Scoring Function
Free energies of molecular interactions derived from structural information on protein-ligand complexes contained in the PDB.
Boltzmann-like statistics of interatomic contacts:
\[ P(s_p, s_l) = P_{ref} \exp\left[-\beta F(s_p, s_l)\right] \]
The distribution of interatomic distances is converted into energy functions by inverting Boltzmann’s law.
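As a minimal sketch of what inverting Boltzmann's law means in practice: a histogram of observed interatomic distances is compared with a reference distribution, and each bin is converted to an energy via F = -kT ln(P / P_ref). The function name, the toy histograms and the choice of kcal/mol units are assumptions made for illustration; they are not taken from any particular knowledge-based scoring function.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol*K)

def inverse_boltzmann(observed_counts, reference_counts, temperature=298.0):
    """Convert distance-bin counts into energies via F = -kT ln(P / P_ref)."""
    total_obs = sum(observed_counts)
    total_ref = sum(reference_counts)
    energies = []
    for obs, ref in zip(observed_counts, reference_counts):
        if obs == 0 or ref == 0:
            energies.append(None)  # no statistics available for this bin
            continue
        p = obs / total_obs        # observed probability of this distance bin
        p_ref = ref / total_ref    # reference probability of this distance bin
        energies.append(-K_B * temperature * math.log(p / p_ref))
    return energies

# Toy data (hypothetical): the second bin is over-represented relative to
# a flat reference, the first bin is under-represented.
observed = [5, 60, 25, 10]
reference = [25, 25, 25, 25]
energies = inverse_boltzmann(observed, reference)
```

Bins that occur more often than in the reference come out with negative (favourable) energies, and bins that occur less often come out positive. Bins without counts are left undefined here rather than assigned an arbitrary penalty.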
Potential of Mean Force (PMF)
\[ A_{ij}(r) = -k_B T \ln\left[ f_{Vol\_corr}^{j}(r)\, \frac{\rho_{seg}^{ij}(r)}{\rho_{bulk}^{ij}} \right] \]
\rho_{seg}^{ij}(r): number density of atom pairs of type ij at atom pair distance r
\rho_{bulk}^{ij}: number density of atom pairs of type ij in reference sphere with radius R

Consensus Scoring
Cscore: integrates multiple scoring functions to produce a consensus score that is more accurate than any single function for predicting binding affinity.

Virtual screening by Docking
• Find weak binders in pool of nonbinders
• Many false positives (96-100%)
• Consensus scoring reduces rate of false positives

Concluding remarks
Scoring functions are the Achilles’ heel of docking programs.
False-positive rates can be reduced by using several scoring functions in a consensus-scoring strategy.
Although the reliability of docking methods is not so high, they can provide new suggestions for protein-ligand interactions that otherwise may be overlooked.

Docking programs
• DOCK
• FlexX
• GOLD
• AutoDOCK
• Hammerhead
• FLOG

FLEXX
• Receptor is treated as rigid
• Incremental construction algorithm:
  – Break ligand up into rigid fragments
  – Dock fragments into pocket of receptor
  – Reassemble ligand from fragments in low energy conformations

How DOCK works
• Generate molecular surface of protein
Cavities in the receptor are used to define spheres (blue); the centres are potential locations for ligand atoms.
Sphere centres are matched to ligand atoms, to determine possible orientations for the ligand.
10^4 orientations generated
(Figure: thioketal in the HIV1-protease active site)

GOLD (Genetic Optimisation for Ligand Docking)
Performs automated docking with full acyclic ligand flexibility, partial cyclic ligand flexibility and partial protein flexibility in and around the active site.
Scoring: includes an H-bonding term, a pairwise dispersion potential (hydrophobic interactions), and a molecular mechanics term for internal energy.
Analysis shows the algorithm is more likely to fail if the ligand is large or highly flexible, and more likely to succeed if the ligand is polar.
• The GA is encoded to search for H-bonding networks first;
• The fitness function contains a term for dispersive interactions but takes no account of desolvation, and thus underestimates the hydrophobic effect.
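The genetic-algorithm search that GOLD relies on (genes, crossover, mutation, selection by fitness) can be illustrated with a toy sketch. This is not GOLD's implementation: the six-variable pose encoding, the quadratic stand-in for an energy function, and all parameter values are assumptions chosen only to show the loop.

```python
import random

def energy(genes, target):
    """Toy stand-in for a scoring function: squared distance to a target pose."""
    return sum((g - t) ** 2 for g, t in zip(genes, target))

def crossover(parent_a, parent_b, rng):
    """Single-point crossover: mix variables from two parent configurations."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def mutate(genes, rng, rate=0.2, step=0.5):
    """Randomly perturb each variable with probability `rate`."""
    return [g + rng.gauss(0.0, step) if rng.random() < rate else g for g in genes]

def evolve(target, pop_size=30, generations=50, seed=0):
    rng = random.Random(seed)
    n = len(target)
    population = [[rng.uniform(-10, 10) for _ in range(n)] for _ in range(pop_size)]
    history = []  # best energy seen in each generation
    for _ in range(generations):
        # Selection: lower energy means higher fitness; keep the better half.
        population.sort(key=lambda g: energy(g, target))
        history.append(energy(population[0], target))
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = survivors + children
    population.sort(key=lambda g: energy(g, target))
    history.append(energy(population[0], target))
    return population[0], history

# Hypothetical encoding: three translations followed by three rotation angles.
target_pose = [1.0, -2.0, 0.5, 0.3, 1.2, -0.7]
best, history = evolve(target_pose)
```

Because the surviving half is carried over unchanged, the best energy never gets worse from one generation to the next. A real docking GA would add torsion angles to the genes and replace the toy energy with the program's fitness function.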