Pharmacophores in Chemoinformatics:
1. Pharmacophore Patterns & Topological
Fingerprints
Dragos Horvath
Laboratoire d’InfoChimie
UMR 7177 CNRS – Université de Strasbourg
The Pharmacophore Way of Life – A Medicinal
Chemist’s Dream
• (Bio)Molecular Recognition is based on ligand-site
interactions of extremely complicated nature
– Understanding them requires a solid knowledge of statistical
physics and, therefore, of higher maths…
– But medicinal chemists hate maths… so they developed a
simplified rule set to rationalize ligand binding.
• Functional groups of similar physicochemical behavior
represent pharmacophore types:
– Hydrophobic, Aromatic, Hydrogen Bond (HB) donors, Cations,
HB Acceptors, Anions.
– Now, we just need to know how each of the six types interacts
with the site… welcome to the “pharmacophore” paradigm,
farewell higher maths (for the moment, at least)
The Interaction Saga: (1) van der Waals
Interactions
• Atoms are more or less hard spheres – squeezing them
against each other causes a sharp rise in energy:
– $E_{rep} = A_{ij}\, d^{-12}$
• At distances larger than the sum of their « van der Waals
spheres », an attractive term due to dipole-induced dipole
interactions (London dispersion term) is predominant…
– $E_{att} = -B_{ij}\, d^{-6}$
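As a rough illustration of how the repulsive and attractive terms combine, the sketch below sums them and locates the distance of minimal energy. The coefficients are hypothetical values chosen for readability, not force-field parameters, and the function names are illustrative only.

```python
def vdw_energy(d, a_ij, b_ij):
    """Sum of the d^-12 repulsion and d^-6 dispersion terms for one atom pair."""
    e_rep = a_ij / d**12   # steep repulsive wall at short distance
    e_att = -b_ij / d**6   # London dispersion attraction
    return e_rep + e_att


def optimal_distance(a_ij, b_ij):
    """Distance where dE/dd = 0, i.e. d* = (2A/B)^(1/6)."""
    return (2.0 * a_ij / b_ij) ** (1.0 / 6.0)


# Hypothetical coefficients (assumption, for illustration only)
A, B = 1.0, 1.0
d_star = optimal_distance(A, B)
for d in (0.9 * d_star, d_star, 1.5 * d_star):
    print(f"d = {d:.3f}  E = {vdw_energy(d, A, B):+.4f}")
```

Squeezing the pair to 90% of the optimal distance already costs energy, while beyond the optimum the attraction fades quickly.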
The Interaction Saga: (2) Electrostatics &
Solvation
• Coulomb charge-charge interactions are easy to compute,
once the partial charges Qk are assigned on the atoms…
– $E_{Coul} = Q_i Q_j / (4\pi\varepsilon d)$
• … and the solvent molecules are explicitly modeled, accounting for all
possible solvation shell structures, in order to estimate a solvation
free energy.
• Alternatively, a continuum solvent model may be employed.
[Figure: continuum solvent model scheme with interior and exterior dielectric constants]
D. Horvath et al., J. Chem. Phys. 104, 6679 (1996)
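A minimal numerical sketch of the Coulomb term, in practical units: with charges in elementary units and distances in Å, the prefactor 1/(4πε0) is about 332.06 kcal·Å/(mol·e²). Scaling by a single uniform relative permittivity `eps_r` is used here only as a crude stand-in for solvent screening; it is an assumption of this sketch and not the continuum model of the cited paper.

```python
COULOMB_KCAL = 332.06  # 1/(4*pi*eps0) in kcal*Angstrom/(mol*e^2)


def coulomb_energy(q_i, q_j, d, eps_r=1.0):
    """Charge-charge energy (kcal/mol) for partial charges q_i, q_j at distance d (Angstrom).

    eps_r is a uniform relative permittivity (assumption: 1 = vacuum, ~80 = water-like).
    """
    return COULOMB_KCAL * q_i * q_j / (eps_r * d)


# A cation-anion pair at 3 Angstrom, unscreened and with water-like screening
print(coulomb_energy(+1.0, -1.0, 3.0))
print(coulomb_energy(+1.0, -1.0, 3.0, eps_r=80.0))
```

The same ion pair is worth roughly eighty times less in a water-like medium, which is why solvation cannot be ignored when ranking polar contacts.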
The Interaction Saga: (2bis) The Hydrophobic
Effect
• The mysterious force that separates grease and water is
not due to grease-grease van der Waals interactions being
stronger than grease-water attraction!
• It is not of electrostatic nature either, because greasy alkyl
chains have no charges!
• Actually, it’s not a force at all, but the consequence of the
drift towards a more probable state of matter (?!)
• For practical purposes, however, it makes sense to believe
that hydrophobes « attract » each other – for making
hydrophobic contacts significantly improves binding
affinity!
Physical Chemistry For Dummies: The Rules
• Hydrophobes make favorable contacts with other
hydrophobes (we do not want to know why!). Assume
strength proportional to the buried hydrophobic area.
• Hydrophobes in close contact to polar groups cause
frustration, for they chase away the water molecules
favorably solvating the latter and offer no substitute
interactions
• Hydrogen bond donors seek to pair with acceptors, so that
they may reestablish the water hydrogen bonds they lost
• Cations seek to pair with anions and avoid hydrophobes.
• Shape is of paramount importance: groups of a same kind
may replace each other if they are similarly shaped
BioIsoSteres – Equivalent Functional Groups
• Wikipedia: bioisosteres are substituents or groups with
similar physical or chemical properties that impart similar
biological properties to a chemical compound
Pharmacophore Patterns
• The pharmacophore pattern of a molecule
characterizes the relative arrangement of all its
pharmacophore types
– What pharmacophore types are represented?
– How are they arranged (spatially, topologically) with
respect to each other ?
– How can these aspects be captured numerically to yield
molecular descriptors of the pharmacophore pattern?
• Note: Pharmacophore patterns are essentially 3D.
Since geometry is determined by connectivity, 2D
“pharmacophore patterns” also make sense!
Exploiting pharmacophore patterns…
• N-dimensional vector D(M)=[D1(M), D2(M), …,DN(M)];
each Di encodes an element of the pharmacophore pattern
– Allows meaningful quantitative definitions of molecular
similarity:
• Neighborhood Behavior: Similar molecules - characterized by covariant
vectors - are likely to display similar biological properties
• As chemists do not easily perceive the pharmacophore pattern, such
covariance may reveal hidden but real molecular relatedness…
– May serve as starting point for searching a binding
pharmacophore – the subset of features that really
participate in binding to a receptor
• Machine learning to select those elements Di that are systematically
present in actives, but not in inactives of a molecular learning set!
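One way to make "covariant vectors" concrete is the Pearson correlation between two descriptor vectors, used below to rank a small library against a query. This is only a sketch: the molecule names and descriptor values are hypothetical, and the choice of Pearson correlation as the similarity score is an assumption, not necessarily the metric used by the author.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation of two equally long descriptor vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a constant vector carries no pattern to correlate
    return cov / (norm_u * norm_v)


def nearest_neighbors(query, library, top=3):
    """Return the `top` library entries most correlated with the query."""
    scored = [(pearson(query, vec), name) for name, vec in library.items()]
    return sorted(scored, reverse=True)[:top]


# Hypothetical pharmacophore-pattern descriptors D(M)
library = {
    "mol_A": [2, 0, 1, 3, 0],
    "mol_B": [0, 3, 0, 0, 2],
    "mol_C": [4, 0, 2, 6, 1],
}
query = [1, 0, 1, 2, 0]
for score, name in nearest_neighbors(query, library):
    print(f"{name}: {score:+.2f}")
```

Under the neighborhood-behavior hypothesis, the top-ranked entries are the ones expected to share the query's biological properties.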
Some examples of "hidden similarity"
[Figure: compound structures shown alongside their activity profiles (scale 0 to 100) over a panel of targets: CGRP, MAPkin, IL-8, NEUPTh, HIVP, PK55fyn, EGF-TK, PKC, PDEIV, PDEII, Elast, CatB, Cl, K-ATP, V1Ah, Sigma1, 5HTUpt, 5HT6h, 5HT3h, 5HT2ch, 5HT1D, 5HT1Ah, Muh, NPY, NK1h, M3h, M1h, ML1, H1c, Galan, ETAh, DaUpt, D2h, D1h, CCKAh, B2h, Bomb, BZDc, AT1h, Beta1h, Alpha2, Alpha1, A1h]
Tricentric Pharmacophore Fingerprints:
monitoring feature arrangement
• Topological: the distance between two features equals the
(minimal) number of chemical bonds between them
[Figure: example triplet with topological edge lengths 9, 4 and 11]
• Spatial: if stable conformers are known, use the distance in
Å between two features
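The topological distance defined above is a shortest path in the molecular graph, which a breadth-first search delivers directly. The sketch below assumes a hydrogen-suppressed graph stored as an adjacency dictionary; the example graph (a six-membered ring carrying one substituent) and the function name are hypothetical.

```python
from collections import deque


def topological_distances(adjacency, source):
    """Minimal number of bonds from `source` to every reachable atom (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:  # first visit is the shortest path
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist


# Hypothetical graph: ring atoms 0..5, substituent atom 6 bonded to atom 0
adjacency = {
    0: [1, 5, 6], 1: [0, 2], 2: [1, 3], 3: [2, 4],
    4: [3, 5], 5: [4, 0], 6: [0],
}
print(topological_distances(adjacency, 6))
```

Running the search once per feature atom yields the full feature-feature distance matrix needed to build triplets.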
Example: Binary Pharmacophore Triplets
Basis Triplets:
• all possible feature combinations
• at a given series of distances…
[Figure: basis triplets with their edge lengths, and the resulting binary fingerprint marking which basis triplets occur in the molecule]
Pickett, Mason & McLay, J. Chem. Inf. Comp. Sci. 36:1214-1223 (1996)
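A binary triplet fingerprint can be sketched as the set of basis triplets that occur at least once in the molecule. The code below is an illustration under stated assumptions: features are given as a dictionary of atom index to type label, distances as bond counts per atom pair, and each triplet is named by listing its corners as type plus opposite edge length, sorted alphabetically. The input data are hypothetical.

```python
from itertools import combinations


def binary_triplets(features, dist, allowed_edges):
    """Return the set of triplet labels present in a molecule.

    features: {atom_index: type_label}
    dist: {frozenset({i, j}): number_of_bonds}
    allowed_edges: set of edge lengths covered by the basis triplets
    """
    present = set()
    for i, j, k in combinations(sorted(features), 3):
        d_ij = dist[frozenset((i, j))]
        d_ik = dist[frozenset((i, k))]
        d_jk = dist[frozenset((j, k))]
        if not {d_ij, d_ik, d_jk} <= allowed_edges:
            continue  # triplet falls outside the basis set
        # Each corner is paired with the edge opposite to it, then sorted
        corners = sorted([(features[i], d_jk), (features[j], d_ik), (features[k], d_ij)])
        present.add("-".join(f"{t}{d}" for t, d in corners))
    return present


# Hypothetical molecule with three typed atoms
features = {0: "Hp", 4: "Ar", 9: "PC"}
dist = {frozenset((0, 4)): 6, frozenset((0, 9)): 4, frozenset((4, 9)): 7}
print(binary_triplets(features, dist, allowed_edges=set(range(3, 14))))
```

Because only presence is recorded, a triplet that just misses a basis edge length contributes nothing, which is the limitation fuzzy mapping addresses.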
First key improvement: Fuzzy mapping of
atom triplets onto basis triplets in 2D-FPT
[Figure: atom triplets of a molecule mapped fuzzily onto neighboring basis triplets, each match incrementing the occupancy of a basis triplet (e.g. +6, +3)]
Di(m) = total occupancy of basis triplet i in molecule m.
Combinatorial enumeration of basis triplets
• Example: there are 36796 basis triplets satisfying the triangle
inequalities, when considering 6 pharmacophore types and
11 edge lengths from Emin=3 to Emax=13 with an
increment of Estep=1: (3, 4, 5, …, 13)
– Canonical representation: T1d23-T2d13-T3d12 with T3≥T2≥T1
(alphabetically); each corner is labeled with the length of the edge opposite to it.
Example (edges 4, 7, 6): Hp7-Ar4-PC6 is not canonical; Ar4-Hp7-PC6 is.
– Out of two corners of a same type, priority is given to the one
opposed to the shorter edge.
Example (edges 4, 7, 6): Ar4-Hp7-Hp6 is not canonical; Ar4-Hp6-Hp7 is.
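The enumeration can be sketched by noting that a triplet is fully described by three (type, opposite edge) pairs, so canonical triplets are the sorted combinations with repetition of such pairs. The code below is a sketch, not the author's implementation; it assumes the six type codes Ar, HA, HD, Hp, NC, PC and a strict triangle inequality (longest edge shorter than the sum of the other two). Under that assumption it reproduces the count of 36796 quoted above.

```python
from itertools import combinations_with_replacement

TYPES = ["Ar", "HA", "HD", "Hp", "NC", "PC"]  # six pharmacophore type codes


def enumerate_basis_triplets(types, e_min, e_max, e_step):
    """List canonical basis triplets as labels such as 'Ar4-Hp7-PC6'."""
    edges = range(e_min, e_max + 1, e_step)
    # Sorted (type, opposite-edge) pairs: alphabetical type, then shorter edge first
    corners = sorted((t, d) for t in types for d in edges)
    triplets = []
    for combo in combinations_with_replacement(corners, 3):
        d1, d2, d3 = sorted(d for _, d in combo)
        if d3 < d1 + d2:  # strict triangle inequality (assumption)
            triplets.append("-".join(f"{t}{d}" for t, d in combo))
    return triplets


basis = enumerate_basis_triplets(TYPES, e_min=3, e_max=13, e_step=1)
print(len(basis))  # 36796
```

Sorting the pairs implements both canonical rules at once: types in alphabetical order, and, among corners of the same type, the one opposite the shorter edge first.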
Triplet matching procedure
• The triplet matching score represents the optimal degree of
pharmacophore field overlap:
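– if corner k of the triplet is of pharmacophore type T, i.e. F(k,T)=1,
then it contributes to the total pharmacophore field of type T,
observed at a point P of the plane:
$\Phi_T(P) = \sum_{k=1}^{3} F(k,T)\,\exp\left(-\rho_T\, d_{k,P}^{2}\right)$
where $d_{k,P}$ is the distance from corner k to P and $\rho_T$ is the Gaussian fuzziness parameter of type T.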
Horvath, D. ComPharm pp. 395-439; in "QSPR /QSAR Studies by Molecular Descriptors", Diudea, M.,
Editor, Nova Science Publishers, Inc., New York, 2001
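The field evaluation can be sketched as follows, assuming a Gaussian decay exp(-rho_T * d^2) around each corner. The name `rho` for the type-specific fuzziness coefficient, the negative sign of the exponent, the corner coordinates and the parameter values are assumptions of this sketch.

```python
from math import exp


def pharmacophore_field(point, corners, rho):
    """Field of each pharmacophore type at `point`.

    corners: list of (x, y, type_label) for the three triangle corners
    rho: {type_label: Gaussian fuzziness coefficient}
    Only corners of type T contribute to the field of type T (F(k,T) = 1).
    """
    px, py = point
    field = {}
    for x, y, type_label in corners:
        d2 = (x - px) ** 2 + (y - py) ** 2  # squared distance corner -> point
        field[type_label] = field.get(type_label, 0.0) + exp(-rho[type_label] * d2)
    return field


# Hypothetical triangle with two hydrophobic corners and one aromatic corner
corners = [(0.0, 0.0, "Hp"), (3.0, 0.0, "Ar"), (0.0, 4.0, "Hp")]
rho = {"Hp": 0.6, "Ar": 0.6}
print(pharmacophore_field((0.0, 0.0), corners, rho))
```

A matching score between two triplets would then compare such fields over the plane; how that overlap is integrated and optimized is not specified here.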
Control parameters for triplet enumeration &
matching in two 2D-FPT versions.
Parameter | Description | FPT-1 | FPT-2
Emin | Minimal edge length of basis triangles (number of bonds between two pharmacophore types) | 2 | 4
Emax | Maximal edge length of basis triangles | 12 | 15
Estep | Edge length increment for enumeration of basis triangles | 2 | 2
e | Edge length excess parameter: in a molecule, triplets with edge length > Emax+e are ignored | 0 | 2
Δ | Maximal edge length discrepancy tolerated when attempting to overlay a molecular triplet atop a basis triangle | 2 | 2
ρHp = ρAr | Gaussian fuzziness parameter for apolar (Hydrophobic and Aromatic) types | 0.6 | 0.9
ρPC = ρNC | Gaussian fuzziness parameter for charged (Positive and Negative Charge) types | 0.6 | 0.8
ρHA = ρHD | Gaussian fuzziness parameter for polar (Hydrogen bond Donor and Acceptor) types | 0.6 | 0.7
l | Aromatic-Hydrophobic interchangeability level | 0.6 | 0.5
(none) | Number of basis triplets at given setup | 4494 | 7155
Second key improvement: Protolytic
equilibrium dependence of 2D-FPT
[Figure: a compound shown as two protonation states, populated at 12% and 88%]
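One plausible way to make a fingerprint depend on the ionization equilibrium is to average the fingerprints of the individual microspecies, weighted by their populations. The sketch below assumes exactly that; the weighting scheme, the triplet labels and the occupancy values are hypothetical, and only the 12% / 88% split is taken from the slide.

```python
def weighted_fingerprint(microspecies):
    """Population-weighted average of microspecies fingerprints.

    microspecies: list of (population, {triplet_label: occupancy})
    Populations are renormalized, so percentages or fractions both work.
    """
    total = sum(population for population, _ in microspecies)
    averaged = {}
    for population, fingerprint in microspecies:
        weight = population / total
        for label, occupancy in fingerprint.items():
            averaged[label] = averaged.get(label, 0.0) + weight * occupancy
    return averaged


# Hypothetical amine: neutral form (acceptor) vs. protonated form (cation)
neutral = {"Ar4-HA6-Hp5": 2.0}
protonated = {"Ar4-Hp5-PC6": 2.0}
print(weighted_fingerprint([(12, neutral), (88, protonated)]))
```

With such weighting, a small pKa shift changes the fingerprint gradually instead of flipping a feature from acceptor to cation outright.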
Some ‘activity cliffs’ in rule-based descriptor
space are smoothed out in 2D-FPT-space
Pharmacophore Pattern-Based Similarity
Queries: Lead Hopping!
[Workflow diagram, labels: Pharmacophore Hypothesis; Reference Fingerprint; Automated Fingerprint Matching; Potential Pharmacophore Fingerprint Library; Nearest Neighbors; Best Matching Candidates; Superposition-based Similarity Scoring; Docking]
Successful Virtual Screening Simulations
[Figure: retrieval curves for targets D2 and TK, plotting % retrieved seed compounds against selection size (0 to 200) for confirmed actives and confirmed inactives, comparing PF with FPT-2 (OPT3)]
Successful QSAR model construction with 2D-FPT: predicting c-Met TK activity
[Figure: experimental vs. calculated pIC50 (both axes from 4 to 9) for learning set and validation set compounds]
• 25 variables entering nonlinear model
• 153 molecules for training: RMSE=0.4 (log units), R2=0.82
• 40 molecules for validation: RMSE=0.8 (log units), R2=0.53
• 8 validation molecules out of 40 mispredicted by more than 1 log unit
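For reference, the quality measures quoted for the model can be computed as in the sketch below. The pIC50 values are hypothetical, and R2 is taken here as the coefficient of determination (1 minus residual over total sum of squares), which is an assumption: the slide does not say which definition was used.

```python
from math import sqrt


def rmse(experimental, calculated):
    """Root-mean-square error, in the units of the input (log units for pIC50)."""
    n = len(experimental)
    return sqrt(sum((e - c) ** 2 for e, c in zip(experimental, calculated)) / n)


def r_squared(experimental, calculated):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_exp = sum(experimental) / len(experimental)
    ss_res = sum((e - c) ** 2 for e, c in zip(experimental, calculated))
    ss_tot = sum((e - mean_exp) ** 2 for e in experimental)
    return 1.0 - ss_res / ss_tot


def count_mispredicted(experimental, calculated, tolerance=1.0):
    """Number of compounds whose error exceeds `tolerance` log units."""
    return sum(abs(e - c) > tolerance for e, c in zip(experimental, calculated))


# Hypothetical pIC50 values (not data from the study)
exp_pic50 = [5.0, 6.0, 7.0, 8.0]
calc_pic50 = [5.5, 6.0, 6.5, 9.5]
print(rmse(exp_pic50, calc_pic50))
print(r_squared(exp_pic50, calc_pic50))
print(count_mispredicted(exp_pic50, calc_pic50))
```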
What more could be done?
• 3D FPT version under study
– does it pay off to generate conformers? How many would you
need to get better results than with 2D-FPT? What’s the best
conformational sampler to use?
• Accessibility-weighted fingerprints?
– class to return (topological and/or 3D) estimate of the solvent-accessible fraction of an atom?
• Tautomer-dependent fingerprints?
– if tautomers and their percentage were enumerated like any other
microspecies…
THE END
Pharmacophore Hypotheses
(A): From individual Active Leads: 2D/3D
• ALL features in the Lead assumed relevant for binding
(B): Consensus hypotheses from set of Leads: 2D/3D
• Ignore features that can be deleted without losing activity
(C): Site-Ligand interaction models: 3D*
• Select Ligand features shown to interact with the site in the
3D X-ray structure of the site-ligand complex.
(D): Active Site filling models: 3D*
• Design a pharmacophoric feature distribution complementary to the groups available in the active site
* In these cases, docking may be performed starting from pharmacophore-based
overlays
ComPharm Overlay…
GA-controlled overlay optimization, acting on:
- chosen conformer of the reference
- chosen conformer of the candidate
- pair of matching atoms
- 3 Euler angles
- mirroring toggle
ComPharm Pharmacophoric Fields
Reference atoms (rows) vs. pharmacophoric features (columns):
Atom | Alk. | Aro. | HBA | HBD | (+) | (-)
1 | X11 | X12 | X13 | X14 | X15 | X16
2 | X21 | X22 | X23 | X24 | X25 | X26
3 | X31 | X32 | X33 | X34 | X35 | X36
4 | X41 | X42 | X43 | X44 | X45 | X46
5 | X51 | X52 | X53 | X54 | X55 | X56
• A descriptor of the nature of the molecule’s pharmacophoric
neighborhood “seen” by every reference atom, assuming an optimal
overlay of the molecule on the reference...