Presentation - Chemical Information BULLETIN
Privileged Substructures Revisited:
Target Community-Selective Scaffolds
Jürgen Bajorath
Life Science Informatics
University of Bonn
Privileged Substructures
First postulated by Evans et al. in 1988 based on the observation that
many cholecystokinin antagonists contained conserved substructures
not frequently seen in other active compounds
Since then the search for target class-privileged chemotypes has
continued in medicinal chemistry
Generally accepted definition:
- Recurrent fragments in ligands of a given target family
- Selective at the family level, but not for individual targets
Evans BE et al. J. Med. Chem. 1988, 31, 2235-2246
Privileged Substructures
Existence of truly target family-privileged substructures has
remained controversial
Intrinsic limitation: Search for privileged substructures has
been based on frequency of occurrence analysis of preselected substructures
Often drawn conclusion: Substructure might occur with high
frequency among ligands of a particular target family but also
act on other families
Privileged Substructures
Are target family-privileged substructures truly privileged?
Target Family Set                   # Compounds   # Substructures
GPCR class A                        21620         1190
Ligand gated ion channels           3792          297
Nuclear hormone receptors (NHRs)    2176          121
Protein kinases                     1079          101
Serine proteases                    3015          323
Schnur DM et al. J. Med. Chem. 2006, 49, 2000-2009
Privileged Substructures
Are target family-privileged substructures truly privileged?
Rows: target family substructure sets. Columns: ligand sets.
Substructure set                    GPCR   Ion channels   NHRs   Protein kinases   Serine proteases   Random cpd sets
GPCR class A                        -      26%            10%    11%               17%                46%
Ligand gated ion channels           47%    -              15%    19%               92%                99%
Nuclear hormone receptors (NHRs)    40%    30%            -      17%               15%                45%
Protein kinases                     48%    34%            16%    -                 20%                57%
Serine proteases                    25%    11%            7%     91%               -                  37%
Schnur DM et al. J. Med. Chem. 2006, 49, 2000-2009
Changing the Analysis Concept
Do molecular scaffolds exist that exclusively occur in
ligands of individual target families ?
Peptidases
GPCRs
Kinases
...
- Bemis & Murcko framework (scaffold)
- Large-scale distribution in target families
Departing from frequency of occurrence analysis of preselected substructures
Systematic compound data mining taking all available
activity annotations into account
Hierarchical Scaffolds
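[Figure: a compound decomposed into (1) R-groups, (2) ring systems, and (3) linkers; the framework consists of the ring systems and the linkers connecting them]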
Bemis GW and Murcko MA. J. Med. Chem. 1996, 39, 2887-2893
Public Data Source - BindingDB
BindingDB database:
- Public repository of activity information of small molecules
- ~31,000 compound entries with ~57,000 activity annotations
- 17,745 compounds active against human targets extracted
Analysis Strategy - Compound Sets
Target pair sets:
- Active compounds are organized into target pair sets
- A set contains all compounds active against both targets of a pair (i.e. compounds might belong to multiple sets)
BindingDB target pair sets:
- Sets obtained for 520 pairs of targets that share >= 5 compounds
- 6,343 compounds active against 259 human targets
PubChem confirmatory bioassays:
- Only 3 relevant human target pairs meet the >= 5 compound criterion
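The grouping into target pair sets can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input format (a list of compound/target tuples), the function name `build_target_pair_sets`, and the toy data are hypothetical and not taken from the presentation; only the >= 5 shared compounds criterion comes from the slides.

```python
from collections import defaultdict
from itertools import combinations


def build_target_pair_sets(activities, min_shared=5):
    """Group compounds into target pair sets.

    activities: iterable of (compound_id, target_id) tuples (assumed format).
    Returns {(target_a, target_b): set of compounds active against both},
    keeping only pairs that share at least `min_shared` compounds.
    """
    targets_by_compound = defaultdict(set)
    for compound, target in activities:
        targets_by_compound[compound].add(target)

    pair_sets = defaultdict(set)
    for compound, targets in targets_by_compound.items():
        # A compound active against n targets joins n*(n-1)/2 pair sets.
        for pair in combinations(sorted(targets), 2):
            pair_sets[pair].add(compound)

    return {pair: cpds for pair, cpds in pair_sets.items()
            if len(cpds) >= min_shared}


# Toy data (hypothetical): five compounds hit both FXa and thrombin,
# two of them also hit trypsin.
activities = [(f"c{i}", t) for i in range(5) for t in ("FXa", "Thrombin")]
activities += [("c0", "Trypsin"), ("c1", "Trypsin")]
pair_sets = build_target_pair_sets(activities)
```

In this toy example only the FXa/Thrombin pair survives the filter, because the pairs involving trypsin share just two compounds.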
Compound-Based Target Network
520 target pairs are visualized in a network representation
- Nodes: targets
- Edges: target pair sets
- Edge width: number of shared compounds
Densely connected communities
- 18 communities
- >= 4 targets
- Different target families
[Figure: compound-based target network with communities labeled 1 to 18]
Community-Selective Scaffolds
520 human target pair sets (6,343 BDB compounds;
259 targets); 18 target communities
206 community-selective scaffolds:
- Exclusively act in a single community
- With 5 - 45 compounds/scaffold (av. ~12)
- Yielding 147 distinct carbon skeletons (topological
diversity)
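The selection of community-selective scaffolds can be illustrated as follows. This is a minimal sketch, assuming each record links a scaffold, a compound, and the community of the target the compound is active against; the record format, the function name, and the use of 5 compounds as a lower bound (read off the "5 - 45 compounds/scaffold" range above) are assumptions, not the authors' implementation.

```python
from collections import defaultdict


def community_selective_scaffolds(records, min_compounds=5):
    """Return {scaffold: community} for scaffolds found in exactly one community.

    records: iterable of (scaffold, compound_id, community_id) tuples
    (assumed format). A scaffold qualifies if all of its compounds act in a
    single community and it represents at least `min_compounds` compounds.
    """
    communities = defaultdict(set)
    compounds = defaultdict(set)
    for scaffold, compound, community in records:
        communities[scaffold].add(community)
        compounds[scaffold].add(compound)

    return {
        scaffold: next(iter(comms))
        for scaffold, comms in communities.items()
        if len(comms) == 1 and len(compounds[scaffold]) >= min_compounds
    }


# Toy data (hypothetical): S1 occurs only in community 3, S2 occurs in
# communities 3 and 7, S3 has too few compounds.
records = [("S1", f"a{i}", 3) for i in range(5)]
records += [("S2", f"b{i}", 3) for i in range(3)]
records += [("S2", f"b{i}", 7) for i in range(3, 6)]
records += [("S3", f"d{i}", 1) for i in range(2)]
selective = community_selective_scaffolds(records)
```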
Adding Selectivity Information
For each compound active against a target pair, its target
selectivity (TS) is calculated as:
$\mathrm{TS} = \mathrm{p}K_i(A) - \mathrm{p}K_i(B)$
Compound |TS| values range from 0 to 6.86
- 0: equal potency, no selectivity
- 6.86: potency difference of nearly 7 orders of magnitude, i.e.
highly selective for one target over another
Selectivity profiles of scaffolds
- Community-based
- Target-based
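As a worked example of the TS calculation, the sketch below converts Ki values to pKi and takes the difference. The helper names and the choice of nanomolar input are assumptions made for illustration; the relation pKi = -log10(Ki in mol/L) is the standard definition.

```python
import math


def pki_from_ki_nm(ki_nm):
    """pKi = -log10(Ki in mol/L); with Ki given in nM this is 9 - log10(Ki)."""
    return 9.0 - math.log10(ki_nm)


def target_selectivity(pki_a, pki_b):
    """TS = pKi(A) - pKi(B); positive values favor target A."""
    return pki_a - pki_b


# Hypothetical compound: Ki = 1 nM against target A, 100 nM against target B.
pki_a = pki_from_ki_nm(1.0)
pki_b = pki_from_ki_nm(100.0)
ts = target_selectivity(pki_a, pki_b)
fold = 10 ** abs(ts)  # potency difference expressed as a fold change
```

Here TS is 2, which corresponds to a 100-fold potency difference in favor of target A. On the same scale, the maximum |TS| of 6.86 corresponds to a difference of nearly seven orders of magnitude.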
Selectivity Profiles
Community-based selectivity profile:
- For each scaffold found in a given community
- All corresponding compounds active against any target pair in this community are pooled
- Median of their absolute TS values is determined (median |TS|)
Target-based selectivity profile:
- For each scaffold active against a given target
- All corresponding compounds active against this target are pooled
- Selectivity against any other target is calculated
- Median of their TS values is determined (median TS)
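The two profile types can be sketched as follows. This is an illustration only: the record layout (one tuple per compound and target pair, all from one community) and the function names are assumptions, and the slides do not specify details such as how compounds with several measurements are handled.

```python
from collections import defaultdict
from statistics import median


def community_based_profile(records):
    """Median |TS| per scaffold over all pooled compound/target-pair records.

    records: iterable of (scaffold, target_a, pki_a, target_b, pki_b) tuples
    from a single community (assumed format).
    """
    pooled = defaultdict(list)
    for scaffold, _, pki_a, _, pki_b in records:
        pooled[scaffold].append(abs(pki_a - pki_b))
    return {scaffold: median(values) for scaffold, values in pooled.items()}


def target_based_profile(records):
    """Median signed TS per (scaffold, target), with that target as reference."""
    pooled = defaultdict(list)
    for scaffold, target_a, pki_a, target_b, pki_b in records:
        pooled[(scaffold, target_a)].append(pki_a - pki_b)
        pooled[(scaffold, target_b)].append(pki_b - pki_a)
    return {key: median(values) for key, values in pooled.items()}


# Toy data (hypothetical pKi values) for one scaffold and one target pair.
records = [
    ("S1", "FXa", 9.0, "Thrombin", 6.0),
    ("S1", "FXa", 8.0, "Thrombin", 7.0),
    ("S1", "FXa", 8.5, "Thrombin", 6.5),
]
community_profile = community_based_profile(records)
target_profile = target_based_profile(records)
```

With these values the scaffold has a median |TS| of 2.0, a median TS of 2.0 for FXa, and a median TS of -2.0 for thrombin, which mirrors the positive/negative reading of the target-based heat map.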
Community Selectivity of Scaffolds
Scaffold / Community heat map:
- Columns: target communities
- Rows: scaffolds
- Color spectrum: median |TS|
Red: scaffold yields many
compounds with different potency
against individual targets
Yellow: scaffold does not yield
selective compounds
Non-selective scaffolds
- Occur in multiple communities
Community-selective scaffolds
- Exclusively occur in one community
Target Selectivity of Scaffolds
Scaffold / Target heat map:
- Columns: targets in a community
- Rows: scaffolds
- Cell: the scaffold represents >= 5
compounds active against the target
- Color spectrum: median TS
Red (positive): more selective for the
target over others in the community
Yellow (negative): more selective for
other members of the community
Target Selectivity of Scaffolds
Community 3: 16 serine proteases
- Different scaffolds display the same selectivity profile, e.g. Factor Xa/Thrombin
- Scaffolds with no apparent target selectivity
- Number of scaffolds per target varies: Factor Xa: 17; Thrombin: 18; Tryptase: 0; Hepsin: 0
Target Selectivity Ranking
Community-selective scaffolds are ranked according to median |TS|
- 37 scaffolds: at least half of the compounds have >= 100-fold potency differences against >= 2 community targets
- 111 scaffolds with target-selective tendency
[Figure: scaffold ranking by median |TS|; axis ticks at 0, 1, 2, and 5.2]
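The ranking step can be sketched as a simple sort on median |TS|. Reading the 100-fold criterion as a cutoff of median |TS| >= 2 is an interpretation (a 100-fold potency difference equals 2 log units, and a median at or above 2 means at least half of the pooled values reach it), not something the slides state explicitly. The scaffold names and values below are hypothetical.

```python
def rank_scaffolds(median_abs_ts):
    """Sort (scaffold, median |TS|) pairs from most to least selective."""
    return sorted(median_abs_ts.items(), key=lambda item: item[1], reverse=True)


# Hypothetical median |TS| values per scaffold.
median_abs_ts = {"S1": 4.03, "S2": 1.10, "S3": 0.2, "S4": 2.5}
ranked = rank_scaffolds(median_abs_ts)

# Assumed cutoff: 2 log units, i.e. a 100-fold potency difference.
cutoff = 2.0
highly_selective = [scaffold for scaffold, value in ranked if value >= cutoff]
```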
Community-Selective Scaffolds
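[Figure: two example community-selective scaffolds with their rank and median |TS| (rank 3: 4.03; rank 98: 1.10), shown with target-based heat maps. Targets shown: DPP4, DPP8; CA1, CA2, CA3, CA4, CA5A, CA5B, CA6, CA7, CA9, CA12, CA14]
Color spectrum: median TS
Red: high potential to yield target-selective compounds
Yellow: low potential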
Selectivity Searching (MDDR)
Thrombin
FXa
Highly selective for
FXa over other serine
proteases
Selectivity Searching
Caspase 7
Caspase 3
Inhibit both caspase 3
and 7 with nM potency;
~200-fold selective over
caspases 1, 6, 8
Extending the Analysis: ChemblDB
Recent public domain database: ChemblDB
- ~500,000 compounds with activity information
- 32,848 compounds with high-confidence annotations active against 671 human targets
High-confidence activity annotations:
- Target confidence level: 9
- Interaction type: D(irect)
ftp://ftp.ebi.ac.uk/pub/databases/chembl/latest/
ChemblDB vs. BindingDB
Comparison at different levels
- Active compounds (human targets)
- Scaffolds
- Network
- Community-selective scaffolds
- Topologically distinct scaffolds
             ChemblDB   BDB      Shared
Compounds    32,848     17,745   3,589
Scaffolds    12,902     6,291    1,409
ChemblDB vs. BindingDB
Network level:
[Figure: combined target network of BDB and ChemblDB (CDB) distinguishing shared targets from unique targets; labeled regions include GPCRs and tyrosine kinases]
ChemblDB vs. BindingDB
Community-selective and topologically distinct scaffolds:
                          ChemblDB   BDB   Shared
Community-selective       311        206   34
Topologically distinct    227        147   85
Community-Selective Scaffolds
Distribution in drugs?
- DrugBank: 1,247 approved drugs with 726 unique scaffolds
- Only 11 overlap with 206 community-selective BDB scaffolds
- Community-selective scaffolds currently underrepresented in
drugs; opportunities for further chemical exploration
Conclusions
The existence of target class-privileged substructures has
remained controversial over the years
From putative privileged substructures to confirmed target
community-selective scaffolds through systematic data mining
Community-selective scaffolds are abundant and topologically
diverse
A subset of community-selective scaffolds displays a notable
tendency to produce compounds with different target selectivity
BDB and CDB contain complementary target and scaffold
information
Acknowledgments
Ye Hu
Anne Mai Wassermann
Eugen Lounkine