Pf_c# with Hs_C#


Bioinformatics for Parasitic
diseases: Malaria
Prof. A.S. Kolaskar
Vice Chancellor
University of Pune
Life cycle of Plasmodium falciparum
Mosquito
Human
Countries endemic for malaria
Drug resistance in malaria-endemic countries
Source: National Center for Infectious Diseases, CDC, Atlanta
Information & Resources for Malaria: 1
WHO/TDR: Focus on Malaria
Information & Resources for Malaria: 2
Malaria Focus: Bill & Melinda Gates Foundation
$50 million grant
for malaria
research
Malaria Focus: Wellcome Trust Foundation
Partly funded the
Plasmodium
sequencing project
Information & Resources for Malaria: 3
Information & Resources for Malaria: 4
MR4@ATCC
Deposit OR
Order Culture
• Text search
• Sequence search
Information & Resources for Malaria: 5
National Institute of Allergy & Infectious Diseases
Information & Resources for Malaria: 6
CDC Home
Division of Parasitic Diseases: Information on Malaria
CDC: Division of Vector-borne infectious diseases
Complete details regarding the
life-history of mosquito, the
vector for many infectious
diseases
Genomes: the current status
• Published complete genomes: 169
– Archaeal: 17
– Bacterial: 131
– Eukaryal: 21
• Completed Viral genomes: >1400
• Prokaryotic ongoing genomes: 428
• Eukaryotic ongoing genomes: 360
As of January 13, 2004
Highly voluminous data: Needs to be analyzed for Knowledge Generation
Genome database: Plasmodium
Genome database: Anopheles
Genome Organisation of Homo sapiens, Anopheles gambiae and Plasmodium falciparum
Organism                     Genome size  Number of chromosomes  Number of predicted genes
Homo sapiens (Hs)            3 GB         23                     ~24,000-40,000
Anopheles gambiae (Ag)       0.27 GB      3                      ~12,000
Plasmodium falciparum (Pf)   23 MB        14                     ~5000
Approaches to mine genomes of
host, vector & parasite
1. Chromosome-wise comparison
2. Comparison of pathway-specific genes
3. Stage-specific comparison
Rate limiting factors:
• Extent of annotation of genomic data
• Lack of complete connectivity between
genomic and derived databases
• Need to define appropriate cutoffs to detect
similarities between phylogenetically diverse
organisms
Chromosomes of Homo sapiens
Chromosomes of Plasmodium falciparum
Chromosomes of Anopheles gambiae
Chr   Size (bp)
X     24,902,716
2R    78,412,699
2L    52,393,056
3R    64,548,413
3L    56,406,562
Chromosome-wise Comparisons of
Proteomes: Program & parameters
• Data sources:
– P. falciparum: PlasmoDB
– H. sapiens : RefSeq@NCBI
– A. gambiae : ENSEMBL
• Program : BLASTP
• Sequence identity: >20%
• Alignment length: >50aa
• E value: zero, or a value with a negative power of ten
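The cutoffs above can be applied to BLASTP results with a few lines of Python. The sketch below assumes the hits were saved in BLAST+ tabular format (`-outfmt 6`), which the slides do not state, and reads "zero or a negative power of ten" as E < 1. The function name `significant_hits` and the example rows are hypothetical.

```python
import csv
import io

# Standard column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def significant_hits(handle, min_identity=20.0, min_length=50, max_evalue=1.0):
    """Yield hits with identity > 20%, alignment length > 50 aa and E < 1.

    The E-value bound of 1 is an assumed reading of the slide's cutoff.
    """
    reader = csv.DictReader(handle, fieldnames=FIELDS, delimiter="\t")
    for row in reader:
        if (float(row["pident"]) > min_identity
                and int(row["length"]) > min_length
                and float(row["evalue"]) < max_evalue):
            yield row

# Made-up rows for illustration only (not real BLAST output).
example = io.StringIO(
    "pf_prot1\ths_protA\t58.0\t453\t180\t5\t1\t450\t3\t428\t1e-147\t513\n"
    "pf_prot2\ths_protB\t18.5\t120\t90\t4\t1\t120\t1\t118\t2e-05\t40.1\n"
    "pf_prot3\ths_protC\t44.0\t40\t20\t1\t1\t40\t1\t40\t3e-04\t35.0\n"
)
kept = [hit["qseqid"] for hit in significant_hits(example)]
print(kept)  # ['pf_prot1']
```

The second row fails the identity cutoff and the third fails the length cutoff, so only the first is kept.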
Comparison of Proteomes of H. sapiens & A. gambiae
Ag\Hs    1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22   X
2L     607 518 512 435 406 448 459 408 406 379 480 506 218 282 317 409 473 228 444 332 180 330 391
2R     790 649 559 561 538 550 496 503 491 503 585 580 266 423 444 499 545 245 579 387 200 402 476
3L     532 421 378 343 315 404 352 344 334 307 391 363 169 193 220 334 328 172 372 250 114 239 296
3R     588 517 467 389 401 437 411 370 371 382 457 485 216 286 339 391 414 216 456 333 165 358 372
X      239 179 171 164 168 166 176 144 136 154 181 197  78 112 116 146 201  78 165 119  61 105 155
[Figure: bar chart of the counts above; x-axis: Hs chromosomes 1-22, X, Y; one series per Ag chromosome (2L, 2R, 3L, 3R, X); y-axis: 0-1000]
Comparison of Proteomes of P. falciparum & A. gambiae
Ag\Pf    1   2   3   4   5   6   7   8   9  10  11  12  13  14
2L     235 234 267 283 339 275 223 313 323 371 394 441 420 447
2R     279 344 340 370 445 363 340 386 451 452 523 605 608 616
3L     109 157 184 188 265 204 169 178 199 251 282 333 286 310
3R     172 213 262 240 358 263 228 246 295 353 372 443 414 418
X       49  70  80  86 111  70  70  94 114  97 114 146 139 142
[Figure: bar chart of the counts above; x-axis: Pf chromosomes 1-14; one series per Ag chromosome (2L, 2R, 3L, 3R, X); y-axis: 0-700]
Comparison of Proteomes of Homo sapiens & Plasmodium falciparum
Pf\Hs    1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22   X   Y
1       25  28  18  13  17  25  16  14  16  18  15  20  14  14  21  24  17  13  12  17   7  13  17   2
2       51  40  27  19  31  31  21  26  31  39  36  31  27  17  41  30  22  15  16  24  12  16  26   6
3       56  53  43  41  51  48  42  29  29  39  41  46  40  34  42  36  43  23  40  29  20  24  33  12
4       47  45  29  27  29  34  28  20  22  27  31  28  20  23  36  25  30  19  26  30  11  19  28   6
5       77  65  60  44  41  55  58  36  47  49  52  50  43  46  53  55  54  31  47  45  29  29  50   8
6       58  49  39  36  39  45  37  30  35  50  33  44  32  34  42  38  45  15  37  34  17  25  38   6
7       40  42  33  24  27  38  28  28  22  35  28  28  23  24  35  27  27  17  23  32  14  26  36   5
8       63  62  49  43  35  50  40  29  46  46  49  46  36  38  49  40  42  18  42  32  21  20  39   8
9       72  58  40  41  41  55  54  32  38  41  38  50  36  33  36  50  56  19  44  32  17  26  38   9
10      68  75  51  47  40  59  42  37  52  51  53  62  41  41  51  53  43  21  54  54  25  30  44  11
11     118  96  85  66  70  91  77  60  85  85  85  82  61  62  86  87  72  49  67  81  37  48  71  20
12     101 111  80  67  70  78  72  54  68  87  73  86  65  61  81  84  76  37  61  78  33  43  68  14
13     141 125 104  80  77  94  87  60  76  84 103  97  88  72 103  91  95  43  82  82  42  53  80  25
14     145 128 118  84  87 110 100  77 100  98 101 105  80  85 112 109 122  64  90  99  58  46  94  29
Pf chr1 vs Hs chr2
28 significant matches
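A chromosome-by-chromosome count such as "Pf chr1 vs Hs chr2: 28" can be tallied from a list of significant hits once each protein is mapped to its chromosome. The following is a minimal sketch of that bookkeeping, not the authors' pipeline. The identifiers, the chromosome lookups and the function name `count_by_chromosome` are hypothetical, and it counts hit pairs (the slides do not say whether hits or distinct proteins were counted).

```python
from collections import Counter

def count_by_chromosome(hits, query_chr, subject_chr):
    """Count significant hits for each (query chromosome, subject chromosome) pair."""
    counts = Counter()
    for query_id, subject_id in hits:
        counts[(query_chr[query_id], subject_chr[subject_id])] += 1
    return counts

# Hypothetical identifiers and chromosome assignments, for illustration only.
pf_chr = {"pfA": "1", "pfB": "1", "pfC": "12"}
hs_chr = {"hsX": "2", "hsY": "2", "hsZ": "18"}
hits = [("pfA", "hsX"), ("pfB", "hsY"), ("pfC", "hsZ"), ("pfA", "hsY")]

matrix = count_by_chromosome(hits, pf_chr, hs_chr)
print(matrix[("1", "2")])    # 3 hits between Pf chr 1 and Hs chr 2
print(matrix[("12", "18")])  # 1 hit between Pf chr 12 and Hs chr 18
```

Counting distinct query proteins instead would only require collecting a set of `query_id` values per chromosome pair.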
[Figure: chromosomes in Pf (1-14) and chromosomes in Hs (1-22, X, Y), linked by significant matches]
List of significant matches
• Proteins that are part of the eukaryotic transcriptional and translational machinery
• Heat shock proteins: molecular chaperones
• Histones
• Actin and tubulin: cytoskeletal proteins
• Ornithine aminotransferase: involved in the interconversion of arginine, proline and glutamate and in the synthesis of polyamines. Polyamines are implicated in cell proliferation.
List of significant matches (contd.)
• Polyubiquitin: involved in the ATP-dependent selective degradation of cellular proteins, maintenance of chromatin structure, regulation of gene expression, stress response and ribosome biogenesis
• Proteasome: a large barrel-like complex that contains proteolytic enzymes on its inner surface
• DEAD family RNA helicases
• Histone deacetylase: critical mediator of transcriptional repression
Case study: 1
gi_pf: g23509043
Seq_pf: 459
Acc_hs: XP_029431
Seq_hs: 269
Score: 71
E value: 3e-014
Aln_len: 37/84
%id: 44
Func_hs: bruno-like 4, RNA binding protein
Func_pf: clustered asparagine-rich protein
A search against Pfam database revealed that the CARP has matches both at its N and C termini to RNA binding domains. A protein with a probable house-keeping gene activity is known to be immunogenic….
Pf:
• Clustered asparagine rich protein (CARP)
• Function not clearly known
• Expressed in different stages of the life-cycle of the parasite
• Immunogenic (Kuma et al., 1990)
Hs:
• Bruno like 4 RNA binding protein
• Transcriptional regulator
Pf_c12 with Hs_c18
Case Study
Pf:
• Hypothetical protein
• Blast against nr database shows significant matches towards the N terminus with zinc-finger containing proteins of higher eukaryotes like mammals and fishes.
• No significant matches to other protozoans
Hs:
• Zinc-finger binding domain.
• Transcriptional factor
• Binds both to RNA and DNA.
May have been acquired from the host….
Pf_c10 vs Hs_c18
Case Study
• Expressed in the
intraerythrocytic (up to
0.5% of parasite protein)
and schizont-stage of the
parasite.
• Dual function of protein
folding and signal
transduction.
• Cyclophilin is also present
in other parasites like T.
gondii, Brugia malayi etc.
• Receptor for the immunosuppressive drug cyclosporin A
• Known to be present in higher eukaryotes including plants (involved in handling stress response).
Pf_c12 vs Hs_c21
Pf_c2 with Hs_c6
gi_pf: g16804988
Seq_pf: 457
Acc_hs: XP_041840
Seq_hs: 428
Score: 513
E value: 1e-147
Aln_len: 263/453
%id: 58
Func_hs: HLA-B associated transcript-1
Func_pf: eIF-4A-like DEAD family RNA helicase
• Putative helicase
• Involved in a number of cellular functions including translation, RNA splicing, and ribosome assembly.
• Located within human major histocompatibility complex class III region.
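The %id value for this hit follows directly from the Aln_len entry, read here as identical residues over aligned positions (263/453). A small sketch of that arithmetic, with a hypothetical helper name `percent_identity`:

```python
def percent_identity(aln):
    """Return percent identity from an 'identical/aligned' string such as '263/453'."""
    identical, aligned = (int(part) for part in aln.split("/"))
    return 100.0 * identical / aligned

print(round(percent_identity("263/453")))  # 58
```

The rounded result agrees with the %id of 58 reported for this match.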
Malaria parasite pathways: Hagai Ginsburg
URL: http://sites.huji.ac.il/malaria/
Metabolome of Plasmodium falciparum
• Metabolic pathways of Plasmodium falciparum are known
to be stage-specific.
• Asexual blood-stage parasites depend on glycolysis and
conversion of pyruvate to lactate to derive energy.
• MS-MS studies carried out by Florens et al. (2002) revealed that gametocyte and sporozoite stages of the malarial parasite contain peptides of enzymes known to be involved in the mitochondrial TCA cycle and oxidative phosphorylation.
In Plasmodium falciparum
Chromosomal locations of TCA cycle-enzymes
Enzyme                                           Chromosome
                                                 Hs    Ag    Pf
Citrate synthase                                 12    3L    10
Aconitase                                        22    3R    13
Isocitrate dehydrogenase                         2     2L    13
Alpha-ketoglutarate dehydrogenase (E1)           7     2R    8
Alpha-ketoglutarate dehydrogenase (E2)           14    3L    13
Alpha-ketoglutarate dehydrogenase (E3)           7     3L    12
Succinyl CoA ligase                              13    2L    14
Succinate dehydrogenase (Cyt b560) (SDHA)        1     3L    -
Succinate dehydrogenase (Cyt b small) (SDHB)     11    X     -
Succinate dehydrogenase (flavoprotein) (SDHC)    5     3L    10
Succinate dehydrogenase (iron-sulfur) (SDHD)     1     2L    12
Fumarase                                         1*    2R*   9**
Malate dehydrogenase                             7     3R    6
• * Class II non-iron dependent Fumarase
• ** Class I iron-dependent Fumarase
Comparison of TCA cycle enzymes of
Plasmodium falciparum-Anopheles gambiae-Homo sapiens
[Figure: bar chart "Comparison of TCA cycle"; y-axis: 0-90; x-axis: Enzyme (SCoA, SDH_cytb560, SDH_cytb, SDH_flavo, SDH_iron, Fumarase, MDH, Cit_syn, Aconitase, ICDH, AKGDH1, AKGDH2, AKGDH3); series: Query: Human, Database: Anopheles; Query: Human, Database: Plasmodium; Query: Anopheles, Database: Plasmodium]
Plasmodium contains only two SDH subunits in contrast to 4 SDH subunits in human & anopheles
Fumarase class I is present in Plasmodium whereas Fumarase class II is present in human & anopheles
TCA cycle: Comparison of proteome of
host,vector and parasite revealed…
• TCA cycle-specific enzymes of Homo sapiens and Anopheles gambiae
have high degree of sequence identity.
• Aconitase and Fumarase enzymes of Plasmodium falciparum show very little similarity with their human and mosquito counterparts.
• An iron regulatory protein that has a C terminal domain similar to Aconitase is present in Plasmodium and it likely carries out the function of the Aconitase enzyme.
• Fumarase (Class I), an iron-dependent enzyme, is present in Plasmodium whereas Fumarase (Class II), a non-iron dependent enzyme, is present in human and mosquito.
• Succinate dehydrogenase in Plasmodium contains only two subunits in
contrast to its human & mosquito counterparts, which have four
subunits.
Homology models of Isocitrate dehydrogenase
High sequence identity Structural similarities
Chromosomal location of enzymes involved in Purine
Biosynthesis
Enzyme                                    Chromosome
                                          Hs    Ag    Pf
Adenosine deaminase                       22    2L    10
Adenylate kinase                          1     2L    10
Adenylosuccinate lyase                    1     2R    2
Adenylosuccinate synthetase               1     2R    13
DNA polymerase 1                          2     2L    9,10,14
DNA-directed RNA polymerase II            11    3R    2
GMP synthetase                            3     3R    10
Guanylate kinase                          1     3L    9
Hypoxanthine phosphoribosyltransferase    X     -     10
Inosine-5'-monophosphate dehydrogenase    7     3L    9
Nucleoside diphosphate kinase             17    2L    6
Purine nucleoside phosphorylase           14    add   5
Ribonucleotide reductase                  8     3R    10,14
Thioredoxin reductase                     22    X     9
Sequence similarity of Enzymes involved in Purine
Biosynthesis using Pf as a reference
Enzyme                                    Sequence Identity (%)
                                          Hs    Ag
Adenosine deaminase                       26    30
Adenylate kinase                          52    51
Adenylosuccinate lyase                    22    24
Adenylosuccinate synthetase               46    45
DNA polymerase 1                          26    24
DNA-directed RNA polymerase II            54    52
GMP synthetase                            30    30
Guanylate kinase                          37    40
Hypoxanthine phosphoribosyltransferase    49    -
Inosine-5'-monophosphate dehydrogenase    49    48
Nucleoside diphosphate kinase             61    60
Purine nucleoside phosphorylase           -     -
Ribonucleotide reductase                  60    64
Thioredoxin reductase                     45    44
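One simple way to read the purine identity values listed above is to rank the Pf enzymes by their identity to the human counterpart. The sketch below copies the listed values, treats "-" entries as missing, and uses a hypothetical helper `most_divergent`. Low identity is only a crude indicator of divergence, and nothing here is taken from the authors' own analysis.

```python
# Identity of each Pf enzyme to its Hs and Ag counterpart, copied from the
# purine table; None marks a "-" entry.
identity = {
    "Adenosine deaminase": (26, 30),
    "Adenylate kinase": (52, 51),
    "Adenylosuccinate lyase": (22, 24),
    "Adenylosuccinate synthetase": (46, 45),
    "DNA polymerase 1": (26, 24),
    "DNA-directed RNA polymerase II": (54, 52),
    "GMP synthetase": (30, 30),
    "Guanylate kinase": (37, 40),
    "Hypoxanthine phosphoribosyltransferase": (49, None),
    "Inosine-5'-monophosphate dehydrogenase": (49, 48),
    "Nucleoside diphosphate kinase": (61, 60),
    "Purine nucleoside phosphorylase": (None, None),
    "Ribonucleotide reductase": (60, 64),
    "Thioredoxin reductase": (45, 44),
}

def most_divergent(table, column=0, n=3):
    """Return the n enzymes with the lowest identity in the chosen column (0 = Hs, 1 = Ag)."""
    rows = [(name, vals[column]) for name, vals in table.items()
            if vals[column] is not None]
    return sorted(rows, key=lambda item: item[1])[:n]

for name, value in most_divergent(identity):
    print(name, value)
```

With the default arguments this lists adenylosuccinate lyase (22), adenosine deaminase (26) and DNA polymerase 1 (26) as the least similar to human.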
Chromosomal location of enzymes
involved in Pyrimidine Biosynthesis
Enzyme
Aspartate carbamoyltransferase
Carbamoyl phosphate synthetase
Cytidine triphosphate synthetase
Deoxyuridine 5'-triphosphate nucleotidohydrolase
Thymidylate synthase
Dihydroorotase
Dihydroorotate dehydrogenase
Multi-domain protein in Hs & Ag
DNA polymerase 1
DNA-directed RNA polymerase II
Nucleoside diphosphate kinase
Orotate phosphoribosyltransferase
Orotidine-monophosphate-decarboxylase
Ribonucleotide reductase
Serine hydroxymethyltransferase
Thioredoxin reductase
Thymidylate kinase
2
2
1
15
Chromosome
Hs
Ag
Pf
UNK
13
UNK
13
3R
14
UNK
11
18
2L
4
2
X
14
16
2R
6
3
2L
6
3R
16
2
• Single-domain proteins in Pf
• Located on different chromosomes
17
2L
6
3
2R
5
3
2R
10
10,14
2
2L
12
X
12
22
X
9
2
2L
12
Sequence similarity of Enzymes involved in Pyrimidine
Biosynthesis using Pf as a reference
Enzyme
Sequence Identity
Hs
Ag
37
38
45
47
Aspartate carbamoyltransferase
Carbamoyl phosphate synthetase
44
43
Cytidine triphosphate synthetase
Deoxyuridine 5'-triphosphate
36
35
nucleotidohydrolase
55
42
Thymidylate synthase
Dihydroorotase
36
35
Dihydroorotate dehydrogenase
26
24
DNA polymerase 1
30
52
DNA-directed RNA polymerase II
60
Nucleoside diphosphate kinase
61
27
29
Orotate phosphoribosyltransferase
27
• Present in Hs, Pf & Ag
• No sequence similarity between Pf vs Hs and Pf vs Ag
Orotidine-monophosphate-decarboxylase
60
64
Ribonucleotide reductase
45
47
Serine hydroxymethyltransferase
45
44
Thioredoxin reductase
39
42
Thymidylate kinase
Chromosomal location of enzymes involved in
Hemoglobin degradation pathway
Enzyme                                                 Chromosome
                                                       Hs      Ag      Pf
Aspartyl protease                                      11      UNK     13,14
Aspartic hemoglobinase                                 -       -       14
Leucine aminopeptidase                                 4       2R      14
Methionine aminopeptidase                              12,4    3R,2R   10,13,14
O-sialoglycoprotein endopeptidase                      4       2L      7
Papain family cysteine protease                        9       3L      9
Pepsinogen                                             11      UNK     8
SERA antigen/papain-like proteinase with active Cys    9       3L      2
Serine protease                                        4       UNK     5
Zinc-metallopeptidase                                  17      UNK     13
Sequence similarity of Enzymes involved in
Hemoglobin digestion using Pf as a reference
Enzyme                                                 Sequence Identity (%)
                                                       Hs    Ag
Aspartyl protease                                      34    32
Aspartic hemoglobinase                                 -     -
Leucine aminopeptidase                                 39    30
Methionine aminopeptidase                              58    56
O-sialoglycoprotein endopeptidase                      32    30
Papain family cysteine protease                        26    27
Pepsinogen                                             31    30
SERA antigen/papain-like proteinase with active Cys    23    24
Serine protease                                        22    31
Zinc-metallopeptidase                                  29    24
Comparing pathways: Lessons learnt
Purine Biosynthesis
Human:
• de novo
• Salvage
Anopheles:
• de novo
• Salvage
HGPRT: Missing
Plasmodium:
• Salvage
HGPRT: Present
• well-studied drug target
[HGPRT: Hypoxanthine Guanine PhosphoribosylTransferase]
Stage-specific comparison of P. falciparum proteins with Human proteome
Stage         No. of proteins  No. of matches  % matches  Note
Sporozoite    1025             198             19         Transition stage: Mosquito → Human
Merozoite     828              284             34         Human: Liver specific
Trophozoite   1024             338             33         Human: RBC specific
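The "% matches" figures are simply the number of matches divided by the number of proteins for each stage. A short sketch that reproduces them from the counts in the stage-specific table (the function name `percent_matches` is hypothetical):

```python
# (number of proteins, number of matches) per stage, from the stage-specific table.
stages = {
    "Sporozoite": (1025, 198),
    "Merozoite": (828, 284),
    "Trophozoite": (1024, 338),
}

def percent_matches(n_proteins, n_matches):
    """Percentage of stage proteins with a match in the human proteome, rounded."""
    return round(100 * n_matches / n_proteins)

percentages = {stage: percent_matches(*counts) for stage, counts in stages.items()}
print(percentages)  # {'Sporozoite': 19, 'Merozoite': 34, 'Trophozoite': 33}
```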