Transcript Document

Tuberculosis Research – An Indian Perspective (TRIP)
AstraZeneca India
20 Oct 2005
In silico identification of novel biosynthetic pathways in Mycobacteria
D. Mohanty
National Institute of Immunology
New Delhi
MYCOBACTERIUM TUBERCULOSIS
[Figure: structures of M. tuberculosis metabolites: mycobactin, phosphatidylinositol mannoside, mycolic acids, sulfolipid, trehalose, PDIM, mycoketide]
Genes
Metabolites
Substrate Specificity of Catalytic Domains
PKS: Acyl Transferase (AT), Keto Synthases (KS), Chalcone Synthases (CHS)
NRPS: Adenylation (A), Condensation (C)
Modifying Domains: Acyl CoA Synthetases (ACS), Glycosyl Transferases (GTr), N-Acyl Transferases (NAT)
Domains Involved in Protein-Protein Interactions: KS – ACP, AT – ACP, A – PCP, C – PCP, PapA5 – ACP
Levels of Functional Annotation
Sequence based methods: Fundamental for functional annotation
Drawback: Cannot predict substrate specificity
[Figure: related reactions on different substrates: luciferin to oxyluciferin; amino acid to amino acyl PCP; coumarate to coumaroyl CoA; fatty acid to acyl CoA]
Computational Chemistry
DFGHYKLMVC
Homology modeling: model protein based on known structure of a similar protein
Find the substrate which binds to the model protein
Range of possible substrates
Knowledge Based Approach
Applications of comparative modeling. The potential uses of a
comparative model depend on its accuracy. This in turn depends
significantly on the sequence identity between the target and the
template structure on which the model was based.
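Since model accuracy is tied to target-template sequence identity, a minimal sketch of how that identity could be computed from a pairwise alignment is given here. The function name and the template string are hypothetical; the target string is the example sequence from the homology modeling slide, and the convention of ignoring gapped columns is an assumption (other conventions exist).

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent identity over the aligned columns that contain no gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_target.upper(), aligned_template.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Target from the slide; the template alignment is made up for illustration.
TARGET = "DFGHYKLMVC"
TEMPLATE = "DFGHWKLM-C"
IDENTITY = percent_identity(TARGET, TEMPLATE)  # 8 identical of 9 compared columns
```

In practice the alignment itself would come from an alignment program; only the counting step is shown.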
Knowledge Based Approach
Sequence/structure information for a large number of proteins with known specificity
Sequence-Product correlation
Predictive rules
Predicting substrate specificity of new members of the family using evolutionary information
In silico identification of PKS/NRPS products
Design of novel proteins with altered specificity
Design of novel polyketides/nonribosomal peptides
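The step from sequence-product correlation to predictive rules can be illustrated with a small sketch. It is not the method used in the cited work: it simply looks for alignment positions whose residue is fixed within each specificity class but differs between classes, then lets those positions vote. All names and the four-residue example strings are hypothetical.

```python
from collections import defaultdict


def derive_rules(examples):
    """Return {position: {substrate: residue}} for discriminating positions.

    examples: list of (aligned_residues, substrate), all strings equal length.
    A position qualifies if every class shows a single residue there and
    at least two classes show different residues.
    """
    by_class = defaultdict(list)
    for residues, substrate in examples:
        by_class[substrate].append(residues)
    rules = {}
    for pos in range(len(examples[0][0])):
        per_class = {}
        for substrate, seqs in by_class.items():
            observed = {s[pos] for s in seqs}
            if len(observed) != 1:
                break  # position is not conserved within this class
            per_class[substrate] = observed.pop()
        else:
            if len(set(per_class.values())) > 1:
                rules[pos] = per_class
    return rules


def predict(residues, rules):
    """Pick the substrate whose rule residues match at the most positions."""
    votes = defaultdict(int)
    for pos, per_class in rules.items():
        for substrate, residue in per_class.items():
            if residues[pos] == residue:
                votes[substrate] += 1
    return max(votes, key=votes.get) if votes else None


# Hypothetical training set: residues lining a binding pocket, with known substrate.
EXAMPLES = [
    ("QAHF", "malonyl"),
    ("QGHF", "malonyl"),
    ("QASH", "methylmalonyl"),
    ("QGSH", "methylmalonyl"),
]
RULES = derive_rules(EXAMPLES)      # positions 2 and 3 discriminate
PREDICTION = predict("QTSH", RULES)  # "methylmalonyl"
```

With real families the training set would be the large collection of domains of known specificity mentioned above, and ties or unseen residues would need explicit handling.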
FUNCTIONAL IMPORTANCE OF PPS CLUSTER
44.199 kb
26.751 kb
6.333 kb
KOLATTUKUDY et al. MOL. MICROBIOL. (1997) 24, 263-270
COX et al. NATURE (1999) 402, 79-83
CAMACHO et al. J. BIOL. CHEM. (2001) 276, 19845
GENETIC STUDIES: pps CLUSTER REQUIRED FOR PDIMs BIOSYNTHESIS
FadD KNOCK-OUTS DISRUPT PDIM BIOSYNTHESIS
[Figure: structure of phthiocerol dimycocerosates (PDIMs)]
ACYL TRANSFERASE (AT) DOMAIN
Involved in selection of starter and extender units during biosynthesis of fatty acids and polyketides
POSSIBLE STARTER AND EXTENDER UNITS
[Figure: acyl CoA starter and extender units for FAS and PKS: acetyl CoA, malonyl CoA, propionyl CoA, butyryl CoA, isobutyryl CoA, methylmalonyl CoA, benzoyl CoA, acetoacetyl CoA]
Yadav G, Gokhale R.S and Mohanty D. (2003) Nucl. Acids Res. 31:3654-3658
PKSDB
Table 3 (a)
Substrate Specificity of AT domains of PKS
Yadav, G., Gokhale, R. S. and Mohanty, D. (2003) J. Mol. Biol. 328, 335-363.
Trivedi et al. Mol Cell 2005, 17: 631-643
RETROBIOSYNTHESIS OF PDIM (16 ORFS)
[Figure: pps gene cluster and retrobiosynthetic scheme for PDIM]
Genes: Rv2929, tesA, fadD26, ppsA, ppsB, ppsC, ppsD, ppsE, drrABC, papA5, mas, fadD28, mmpL7
Domain labels: PCP, ACP, KS, AT, KR, DH, ER, C
Enzymes in the scheme: FadD26 (ATP to AMP + PPi), PpsA-E (phthiocerol), Mas (mycocerosic acids), PapA5
R1 = -(CH2)3-7-CH3
R2 = -H, -CH3
R3 = -CH2-CH3 or -CH3
R4 = -OCH3 or =O
3 (n-C16-C20 acid) + 3 Malonyl-CoA + 10 Methylmalonyl-CoA + NADPH + ATP + CoASH → PDIM
CELL-FREE RECONSTITUTION OF 26 CATALYTIC STEPS
TRIVEDI et al., Molecular Cell (2005) 17 631-643.
PREDICT STRUCTURAL FOLD AND CORRELATE WITH KNOWN CHEMISTRY
[Figure: reaction schemes of enzymes with the CAT fold: CAT, CRAT, E2p (E2-Lys), BAHD, NRPS domains acting on PCP-bound amino acids (C, Cy, E), and PapA5 with ACP]
Threading or Fold Recognition
• Proteins often adopt similar folds despite no significant sequence or functional similarity.
• For many proteins there will be suitable template structures in PDB.
• Unfortunately, lack of sequence similarity will mean that many of these are undetected by sequence-only comparison done in homology modelling.
CAT Superfamily
[Figure: CAT superfamily members: BAHD, CRAT, CAT and NRPS domains (Con, Epi, Cyc); labels D-X and L-X]
Identification of crucial residues involved in protein-protein interaction
PapA5 (Crystal Structure)
MAS ACP (Homology Model)
Protein Docking
Mutational studies of these crucial residues: WT, R234E, R312E
Trivedi et al. Mol. Cell (2005) 17, 631-643
Mbt BIOSYNTHETIC GENE CLUSTER
NON-RIBOSOMAL PEPTIDE SYNTHASES (NRPSs)
POLYKETIDE SYNTHASES (PKSs)
[Figure: structure of the mycobactin core with substituents R1-R5]
MYCOBACTIN CORE
BIOCHEMICAL AND GENETIC STUDIES SUGGESTED INVOLVEMENT OF THIS CLUSTER IN BIOSYNTHESIS OF MYCOBACTIN CORE
Quadri et al., Chem Biol. 1998 Nov;5(11):631-45.
De Voss et al., Proc Natl Acad Sci U S A. 2000 Feb 1;97(3):1252-7.
GENETIC LOCUS INVOLVED IN TAILORING THE MYCOBACTIN PEPTIDIC CORE TO PRODUCE FUNCTIONAL SIDEROPHORE NOT KNOWN?
IRON-DEPENDENT TRANSCRIPTIONAL PROFILING USING MICROARRAYS
Rodriguez & Smith; Mol Microbiol. 2003 Mar;47(6):1485-94.
acp fadD33 fadE14
(SIMILARITY TO PROTEINS INVOLVED IN β-OXIDATION)
IDENTIFIED NUMBER OF GENES
Rv1347c
(HOMOLOGY WITH HISTONE ACETYL TRANSFERASE)
BIOCHEMICAL PATHWAY INVOLVED IN TAILORING MYCOBACTIN CORE
[Figure: MS/MS spectra, Rel. Int. (%) versus m/z (amu), with fragment structures carrying a C9H19 chain and fragments numbered 1-3]
MS/MS of 477.31: labelled peaks at m/z 142.09, 187.11, 234.14, 251.17, 266.24, 295.16, 369.26, 433.32, 477.31
MS/MS of 475.32: labelled peaks at m/z 144.14, 181.15, 197.17, 251.24, 295.23, 367.33, 431.45, 475.32
Fatty acids are transferred as acyl-S-enzyme intermediates by Rv1347c
Novel acyl-ACP dehydrogenase generates unsaturation in the lipidic chain
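The two precursor ions reported for the MS/MS experiments (477.31 and 475.32) differ by about 2 mass units, which is what one would expect if the second species has one additional double bond. A minimal sketch of that arithmetic follows; it assumes both ions are singly charged and carry the same adduct, which the transcript does not state. The function name is hypothetical.

```python
H_MASS = 1.007825  # monoisotopic mass of a hydrogen atom, in Da


def hydrogens_between(mz_heavier, mz_lighter, charge=1):
    """Express the mass gap between two ions as a number of hydrogen atoms."""
    mass_difference = (mz_heavier - mz_lighter) * charge
    return mass_difference / H_MASS


# Precursor m/z values taken from the two MS/MS experiments.
ESTIMATE = hydrogens_between(477.31, 475.32)
NEAREST_WHOLE = round(ESTIMATE)  # two hydrogens, i.e. one unsaturation
```

The estimate is slightly below 2 because the reported m/z values carry measurement error at the second decimal place.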
Aryl-N-acetyl transferase
Rv1347c (1YK3)
Docking of Myristic Acid
Rv1347c + CoA (Transformed)
Myristic Acid (C-14)
(Ligand)
Docking (AutoDock)
30 member cluster (COO- Flipped)
29 member cluster
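Docking runs are usually summarised by grouping similar poses, which is where cluster sizes such as 30 and 29 come from. The sketch below is a simplified illustration of energy-ranked clustering by RMSD, not AutoDock's own code: poses are visited from lowest to highest energy, and each joins the first cluster whose seed lies within a tolerance. The 2.0 Å default, the function names and the toy coordinates are assumptions.

```python
import math


def rmsd(pose_a, pose_b):
    """RMSD between two poses given as equal-length lists of (x, y, z) atoms."""
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must list the same atoms in the same order")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))


def cluster_poses(poses, energies, tolerance=2.0):
    """Group pose indices; the first index in each cluster is its lowest-energy seed."""
    order = sorted(range(len(poses)), key=lambda i: energies[i])
    clusters = []
    for i in order:
        for members in clusters:
            if rmsd(poses[i], poses[members[0]]) <= tolerance:
                members.append(i)
                break
        else:
            clusters.append([i])  # no seed is close enough: start a new cluster
    return clusters


# Toy two-atom poses (hypothetical coordinates and energies).
POSES = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)],
    [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0)],
]
ENERGIES = [-7.0, -6.5, -6.8]
CLUSTERS = cluster_poses(POSES, ENERGIES)  # [[0, 1], [2]]
```

A plain atom-by-atom RMSD treats chemically equivalent atoms as distinct, so a ligand with a flipped carboxylate can land in a separate cluster even when the binding mode is otherwise the same.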
[Figure, panels A and B: source organisms and structures of the siderophores pyoverdin, mycobactin, aerobactin, nocobactin, ornibactin, rhizobactin and acinetoferrin]
Sequence     Organism (siderophore)              Acyl group
LPLPVFLCAL   M. bovis (Mycobactin)               Acyl (R = C17-C20)
LPLPVFLCAL   M. tuberculosis (Mycobactin)        Acyl (R = C17-C20)
MPLPVFLCAL   M. smegmatis (Mycobactin)           Acyl (R = C9-C19)
LSLPVFFCSL   M. avium (Mycobactin)               Acyl (R = C11-C18)
LALPVCHLHT   B. cepacia R1808 (Ornibactin)       Acyl (2-ene)
WRMKCGSYIC   Acinetobacter sp. (Acinetoferrin)   Acyl (2-ene)
WPIRTGYACC   S. meliloti A (Rhizobactin)         Acyl (2-ene)*
LQLRTLHLAA   E. coli (Aerobactin)                Acetyl
FSIGPCALNL   N. farcinica (Nocobactin)           Acetyl
CTLPLQNLSI   S. meliloti B (Rhizobactin)         Acetyl*
INMRSLRIVI   V. fischeri (Aerobactin)            Acetyl
INMRSLLFLL   V. mimicus (Aerobactin)             Acetyl
IALSVAYMQL   P. putida (Pyoverdin)               Formyl
LHWSLGYMLI   B. cepacia R18194 A (Ornibactin)    Acyl (β-OH)/Formyl?
LHWSLGYMLM   B. cepacia R18194 B (Ornibactin)    Acyl (β-OH)/Formyl?
* Predicted specificity based on these two positions.
? Unable to differentiate between two substrates based on these two positions.
BIOSYNTHETIC SCHEME FOR AMPHIPHILIC MYCOBACTIN
[Figure: biosynthetic scheme at low iron concentration. Regulatory labels: IdeR and Fe-boxes. Loci: mbt-1 (genes A-J, including mbtG) and mbt-2 (acp, fadD33, fadE14, Rv1347c). Intermediates and products: mycobactin core, fatty acyl-ACP, didehydroxymycobactin, mycobactin.]
NEW GENETIC LOCUS INVOLVED IN SIDEROPHORE BIOSYNTHESIS
Patterns in Networks: Network Motifs
ACKNOWLEDGEMENTS
COMPUTATIONAL BIOLOGY GROUP
CHEMICAL BIOLOGY GROUP
KNOWLEDGE-BASED COMPUTATIONAL APPROACH
RECONSTRUCTION OF METABOLIC PATHWAYS
Gitanjali Yadav
Md. Zeeshan Ansari
Pankaj Kamra
Rajesh S. Gokhale
Dr. S.K. Basu, Director, NII
BTIS, DBT, India
Substrates of Coumarate CoA Ligases
Cinnamate, Ferulate, Coumarate, Sinapate, Caffeate, 3,4-DMC
Coumarate CoA Ligase, Coenzyme A
Substrates of NRPS
Adenylation domain of NRPS, PCP Domain
Substrates of Fatty Acid CoA Ligases
Fatty acid CoA Ligase, Coenzyme A
Acetic acid
n ~ 4 - 8 : Medium chain fatty acid
n ~ 5 -11: Long chain fatty acid
n > 11 : Very Long chain fatty acid
Enzymic activation and transfer of fatty acids as acyl-adenylates in mycobacteria
Trivedi, O.A., Arora, P., Sridharan, V., Tickoo, R., Mohanty, D. and Gokhale, R.S. 2004 Nature 428:441.