Transcript Document
Tuberculosis Research – An Indian Perspective (TRIP)
AstraZeneca India, 20 Oct 2005

In silico identification of novel biosynthetic pathways in Mycobacteria
D. Mohanty
National Institute of Immunology, New Delhi

[Figure: MYCOBACTERIUM TUBERCULOSIS surrounded by structures of its metabolites: Mycobactin, Phosphatidyl inositol mannoside, Mycolic acids, Sulfolipid, Trehalose, PDIM, Mycoketide]

[Figure: Genes and Metabolites, with chemical structures of example products]

Substrate Specificity of Catalytic Domains
PKS: Acyl Transferase (AT), Keto Synthases (KS), Chalcone Synthases (CHS)
NRPS: Adenylation (A), Condensation (C)
Modifying Domains: Acyl CoA Synthetases (ACS), Glycosyl Transferases (GTr), N-Acyl Transferases (NAT)

Domains Involved in Protein-Protein Interactions
KS – ACP
AT – ACP
A – PCP
C – PCP
PapA5 – ACP

Levels of Functional Annotation
Sequence based methods: Fundamental for functional annotation
Drawback: Cannot predict substrate specificity
[Figure: related enzymes acting on different substrates: Luciferin to Oxyluciferin; Amino acid to Amino acyl PCP; Coumarate to Coumaroyl CoA; Fatty acid to Acyl CoA]

Computational Chemistry
Sequence (e.g. DFGHYKLMVC)
Homology modeling: Model protein based on known structure of a similar protein
Find the substrate which binds to the model protein
Range of possible substrates

Knowledge Based Approach

Applications of comparative modeling. The potential uses of a comparative model depend on its accuracy. This in turn depends significantly on the sequence identity between the target and the template structure on which the model was based.
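The dependence of model accuracy on target-template sequence identity can be made concrete with a small calculation. The following is a minimal sketch, not the speaker's method: it assumes the two sequences are already aligned (gaps written as "-"), and the 30% and 50% cut-offs in `modelling_regime` are a commonly quoted rule of thumb, included here as an assumption rather than taken from the slides. The function names are hypothetical.

```python
def percent_identity(target: str, template: str) -> float:
    """Percent identity over aligned columns where both sequences have a residue."""
    if len(target) != len(template):
        raise ValueError("aligned sequences must have equal length")
    # Skip columns in which either sequence has a gap.
    pairs = [
        (a, b)
        for a, b in zip(target.upper(), template.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def modelling_regime(identity: float) -> str:
    """Rough accuracy class for a comparative model (assumed rule-of-thumb cut-offs)."""
    if identity >= 50.0:
        return "high"
    if identity >= 30.0:
        return "medium"
    return "low"


if __name__ == "__main__":
    ident = percent_identity("ACDE-FG", "ACDQWFG")
    print(f"{ident:.1f}% identity, expected model accuracy: {modelling_regime(ident)}")
```

In practice the alignment itself would come from a dedicated alignment tool; only the identity bookkeeping is shown here.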
Knowledge Based Approach
Sequence/structure information for large number of proteins with known specificity
Sequence-Product correlation
Predictive rules
Predicting substrate specificity of new members of the family using evolutionary information
In silico identification of PKS/NRPS products
Design of novel proteins with altered specificity
Design of novel polyketides/nonribosomal peptides

FUNCTIONAL IMPORTANCE OF PPS CLUSTER
[Figure: map of the pps gene cluster; segment sizes 44.199 kb, 26.751 kb, 6.333 kb]
KOLATTUKUDY et al. MOL. MICROBIOL. (1997) 24, 263-270
COX et al. NATURE (1999) 402, 79-83
CAMACHO et al. J. BIOL. CHEM. (2001) 276, 19845
GENETIC STUDIES: pps CLUSTER REQUIRED FOR PDIMs BIOSYNTHESIS
FadD KNOCK-OUTS DISRUPT PDIM BIOSYNTHESIS
[Figure: structure of Phthiocerol dimycocerosates (PDIMs)]

ACYL TRANSFERASE (AT) DOMAIN
Involved in selection of starter and extender units during biosynthesis of fatty acids and polyketides

POSSIBLE STARTER AND EXTENDER UNITS
[Figure: structures of starter and extender units for PKS and FAS: Acetyl CoA, Malonyl CoA, Methylmalonyl CoA, Propionyl CoA, Butyryl CoA, Isobutyryl CoA, Benzoyl CoA, Acetoacetyl CoA]

PKSDB
Yadav G, Gokhale R.S and Mohanty D. (2003) Nucl. Acids Res. 31:3654-3658

Table 3 (a) Substrate Specificity of AT domains of PKS
Yadav, G., Gokhale, R. S. and Mohanty, D. (2003) J. Mol. Biol. 328, 335-363.

Trivedi et al. Mol Cell 2005, 17: 1-13

RETROBIOSYNTHESIS OF PDIM (16 ORFS)
[Figure: PDIM locus (Rv2929, tesA, fadD26, ppsA, ppsB, ppsC, ppsD, ppsE, drrABC, papA5, mas, fadD28, mmpL7) with the domain organization of PpsA-E and Mas (KS, AT, DH, ER, KR, ACP, PCP, C) and the growing chain on each module. Labels: FadD26 (ATP, AMP, PPi); PpsA-E; phthiocerol; Mas; mycocerosic acids; PapA5]
R1 = -(CH2)3-7-CH3
R2 = -H, -CH3
R3 = -CH2-CH3 or -CH3
R4 = -OCH3 or =O
3 (n-C16-C20 acid) + 3 Malonyl-CoA + 10 Methyl Malonyl-CoA + NADPH + ATP + CoASH → PDIM
CELL-FREE RECONSTITUTION OF 26 CATALYTIC STEPS
TRIVEDI et al., Molecular Cell (2005) 17, 631-643.

PREDICT STRUCTURAL FOLD AND CORRELATE WITH KNOWN CHEMISTRY
[Figure: reaction schemes of CAT fold enzymes (CAT, CRAT, E2p, BAHD, NRPS domains) showing the catalytic His, compared with the PapA5 reaction on ACP-bound substrate]

Threading or Fold Recognition
•Proteins often adopt similar folds despite no significant sequence or functional similarity.
•For many proteins there will be suitable template structures in PDB.
•Unfortunately, lack of sequence similarity will mean that many of these are undetected by the sequence-only comparison done in homology modelling.
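The domain strings shown for the retrobiosynthesis of PDIM (KS, AT, DH, ER, KR, ACP) can be read with the standard rule for PKS reductive domains: KR reduces the β-keto group to a hydroxyl, DH dehydrates it to a double bond, and ER reduces that to a saturated carbon. A minimal sketch of that rule follows. The function name and the example modules are hypothetical illustrations, not an annotation of the Pps or Mas proteins.

```python
from typing import Iterable


def beta_carbon_state(domains: Iterable[str]) -> str:
    """Predict the oxidation state left at the beta-carbon by one PKS module.

    Applies the usual reductive-loop logic: each step needs the previous one,
    so DH without KR (or ER without DH) is treated as having no effect.
    """
    present = {d.upper() for d in domains}
    if "KR" not in present:
        return "keto"        # no reduction at all
    if "DH" not in present:
        return "hydroxyl"    # KR only
    if "ER" not in present:
        return "enoyl"       # KR + DH: double bond
    return "methylene"       # KR + DH + ER: fully reduced


if __name__ == "__main__":
    # Hypothetical modules, written in the same style as the slide's domain strings.
    examples = {
        "module_1": ["KS", "AT", "ACP"],
        "module_2": ["KS", "AT", "KR", "ACP"],
        "module_3": ["KS", "AT", "DH", "KR", "ACP"],
        "module_4": ["KS", "AT", "DH", "ER", "KR", "ACP"],
    }
    for name, doms in examples.items():
        print(name, "-".join(doms), "->", beta_carbon_state(doms))
```

This only covers the reduction state. Which starter or extender unit a module incorporates depends on its AT domain, which is the subject of the knowledge-based specificity predictions above.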
CAT Superfamily
[Figure: CAT superfamily groups: BAHD, CRAT, CAT and NRPS domains (Con, Epi, Cyc); labels D-X, L-X]

Identification of crucial residues involved in protein-protein interaction
PapA5 (Crystal Structure)
MAS ACP (Homology Model)
Protein Docking
Mutational studies of these crucial residues: WT, R234E, R312E
Trivedi et al. Mol. Cell (2005) 17, 631-643

Mbt BIOSYNTHETIC GENE CLUSTER
NON-RIBOSOMAL PEPTIDE SYNTHASES (NRPSs)
POLYKETIDE SYNTHASES (PKSs)
[Figure: MYCOBACTIN CORE structure with substituents R1-R5]
BIOCHEMICAL AND GENETIC STUDIES SUGGESTED INVOLVEMENT OF THIS CLUSTER IN BIOSYNTHESIS OF MYCOBACTIN CORE
Quadri et al., Chem Biol. 1998 Nov;5(11):631-45.
De Voss et al., Proc Natl Acad Sci U S A. 2000 Feb 1;97(3):1252-7.

GENETIC LOCUS INVOLVED IN TAILORING THE MYCOBACTIN PEPTIDIC CORE TO PRODUCE FUNCTIONAL SIDEROPHORE: NOT KNOWN
IRON-DEPENDENT TRANSCRIPTIONAL PROFILING USING MICROARRAYS
Rodriguez & Smith; Mol Microbiol. 2003 Mar;47(6):1485-94.
IDENTIFIED NUMBER OF GENES
acp, fadD33, fadE14 (SIMILARITY TO PROTEINS INVOLVED IN β-OXIDATION)
Rv1347c (HOMOLOGY WITH HISTONE ACETYL TRANSFERASE)

BIOCHEMICAL PATHWAY INVOLVED IN TAILORING MYCOBACTIN CORE
[Figure: MS/MS of 477.31; axes m/z (amu) vs. Rel. Int. (%); labelled peaks 142.09, 187.11, 234.14, 251.17, 266.24, 295.16, 369.26, 433.32, 477.31; structure with C9H19 chain and fragmentation sites 1-3]
[Figure: MS/MS of 475.32; axes m/z (amu) vs. Rel. Int. (%); labelled peaks 144.14, 181.15, 197.17, 251.24, 295.23, 367.33, 431.45, 475.32; structure with C9H19 chain and fragmentation sites 1-3]
Fatty acids are transferred as acyl-S-enzyme intermediates by Rv1347c
Novel acyl-ACP dehydrogenase generates unsaturation in the lipidic chain

Aryl-N-acetyl transferase Rv1347c (1YK3)
Docking of Myristic Acid
Rv1347c + CoA (Transformed)
Myristic Acid (C-14) (Ligand)
Docking (AutoDock)
30 member cluster (COO- Flipped)
29 member cluster

[Figure: structures of the siderophores Mycobactin, Pyoverdin, Aerobactin, Nocobactin, Ornibactin, Rhizobactin and Acinetoferrin, with the producing organisms]

Organism (siderophore)               Sequence     Specificity
M. bovis (Mycobactin)                LPLPVFLCAL   Acyl (R = C17-C20)
M. tuberculosis (Mycobactin)         LPLPVFLCAL   Acyl (R = C17-C20)
M. smegmatis (Mycobactin)            MPLPVFLCAL   Acyl (R = C9-C19)
M. avium (Mycobactin)                LSLPVFFCSL   Acyl (R = C11-C18)
B. cepacia R1808 (Ornibactin)        LALPVCHLHT   Acyl (2-ene)
Acinetobacter sp. (Acinetoferrin)    WRMKCGSYIC   Acyl (2-ene)
S. meliloti A (Rhizobactin)          WPIRTGYACC   Acyl (2-ene)*
E. coli (Aerobactin)                 LQLRTLHLAA   Acetyl
N. farcinica (Nocobactin)            FSIGPCALNL   Acetyl
S. meliloti B (Rhizobactin)          CTLPLQNLSI   Acetyl*
V. fischeri (Aerobactin)             INMRSLRIVI   Acetyl
V. mimicus (Aerobactin)              INMRSLLFLL   Acetyl
P. putida (Pyoverdin)                IALSVAYMQL   Formyl
B. cepacia R18194 A (Ornibactin)     LHWSLGYMLI   Acyl (β-OH)/Formyl?
B. cepacia R18194 B (Ornibactin)     LHWSLGYMLM   Acyl (β-OH)/Formyl?

* Predicted specificity based on these two positions.
? Unable to differentiate between the two substrates based on these two positions.
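The sequences and specificities listed above lend themselves to a simple lookup: a new homologue is assigned the specificity of the listed entry whose sequence it matches most closely. The sketch below illustrates that idea with a plain Hamming distance over all ten residues. This is an assumption for illustration only: the slide states that the prediction rests on two positions without naming them, so the speaker's actual rule is not reproduced here. The "*" and "?" markers are dropped from the labels, the query sequence is hypothetical, and the pairing of organisms, sequences and labels follows the order in which they appear in the transcript.

```python
# (organism, sequence, specificity) as listed in the transcript, markers removed.
SIGNATURES = [
    ("M. bovis", "LPLPVFLCAL", "Acyl (R = C17-C20)"),
    ("M. tuberculosis", "LPLPVFLCAL", "Acyl (R = C17-C20)"),
    ("M. smegmatis", "MPLPVFLCAL", "Acyl (R = C9-C19)"),
    ("M. avium", "LSLPVFFCSL", "Acyl (R = C11-C18)"),
    ("B. cepacia R1808", "LALPVCHLHT", "Acyl (2-ene)"),
    ("Acinetobacter sp.", "WRMKCGSYIC", "Acyl (2-ene)"),
    ("S. meliloti A", "WPIRTGYACC", "Acyl (2-ene)"),
    ("E. coli", "LQLRTLHLAA", "Acetyl"),
    ("N. farcinica", "FSIGPCALNL", "Acetyl"),
    ("S. meliloti B", "CTLPLQNLSI", "Acetyl"),
    ("V. fischeri", "INMRSLRIVI", "Acetyl"),
    ("V. mimicus", "INMRSLLFLL", "Acetyl"),
    ("P. putida", "IALSVAYMQL", "Formyl"),
    ("B. cepacia R18194 A", "LHWSLGYMLI", "Acyl (b-OH)/Formyl"),
    ("B. cepacia R18194 B", "LHWSLGYMLM", "Acyl (b-OH)/Formyl"),
]


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def nearest_signature(query: str):
    """Return (organism, specificity, distance) of the closest listed sequence.

    Ties are resolved in favour of the entry listed first.
    """
    scored = ((org, spec, hamming(query.upper(), sig)) for org, sig, spec in SIGNATURES)
    return min(scored, key=lambda item: item[2])


if __name__ == "__main__":
    # Hypothetical query: one substitution away from the V. fischeri entry.
    print(nearest_signature("INMRSLRIVL"))
```

A distance over all ten residues treats every position as equally informative. A rule restricted to the two discriminating positions would need those positions to be identified first.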
BIOSYNTHETIC SCHEME FOR AMPHIPHILIC MYCOBACTIN
[Figure: at Low Iron Concentration, IdeR and Fe-box sites at two loci: mbt-1 (genes A to J) and mbt-2 (acp, fadD33, fadE14, Rv1347c). Labels: Mycobactin core, Fatty acyl-ACP, Didehydroxymycobactin, mbtG, Mycobactin]
NEW GENETIC LOCUS INVOLVED IN SIDEROPHORE BIOSYNTHESIS

Patterns in Networks: Network Motifs

ACKNOWLEDGEMENTS
COMPUTATIONAL BIOLOGY GROUP (KNOWLEDGE-BASED COMPUTATIONAL APPROACH): Gitanjali Yadav, Md. Zeeshan Ansari, Pankaj Kamra
CHEMICAL BIOLOGY GROUP (RECONSTRUCTION OF METABOLIC PATHWAYS): Rajesh S. Gokhale
Dr. S.K. Basu, Director, NII
BTIS, DBT, India

Substrates of Coumarate CoA Ligases
Cinnamate, Ferulate, Coumarate, Sinapate, Caffeate, 3,4-DMC
[Figure: Coumarate CoA Ligase with Coenzyme A]

Substrates of NRPS
[Figure: Adenylation domain of NRPS with PCP Domain]

Substrates of Fatty Acid CoA Ligases
[Figure: Fatty acid CoA Ligase with Coenzyme A]
Acetic acid
n ~ 4 - 8: Medium chain fatty acid
n ~ 5 - 11: Long chain fatty acid
n > 11: Very long chain fatty acid

Enzymic activation and transfer of fatty acids as acyl-adenylates in mycobacteria
Trivedi, O.A., Arora, P., Sridharan, V., Tickoo, R., Mohanty, D. and Gokhale, R.S. 2004 Nature 428:441.