Riboswitches: the oldest regulatory system?

Evolution of bacterial
regulatory systems
Mikhail Gelfand
Research and Training Center “Bioinformatics”
Institute for Information Transmission Problems
Moscow, Russia
ASM, Philadelphia, 18.IV.2009
Catalog of events
• Expansion and contraction of regulons
• New regulators (where from?)
• Duplications of regulators with or without
regulated loci
• Loss of regulators with or without regulated
loci
• Re-assortment of regulators and structural
genes
• … especially in complex systems
• Horizontal transfer
Trehalose/maltose catabolism
in alpha-proteobacteria
Duplicated LacI-family regulators: lineage-specific post-duplication loss
The binding motifs are very similar (the blue branch is
somewhat different: to avoid cross-recognition?)
Utilization of an unknown galactoside
in gamma-proteobacteria
Yersinia and Klebsiella: two regulons, GalR and LacI-X
Erwinia: one regulon, GalR
Loss of regulator and merger of regulons: it seems that lacI-X was
present in the common ancestor
(Klebsiella is an outgroup)
Utilization of maltose/maltodextrin
in Firmicutes
Displacement: invasion of a regulator from a
different subfamily (horizontal transfer from a
related species?) – blue sites
Orthologous TFs with
completely different regulons
(alpha-proteobacteria and
Xanthomonadales)
Cryptic sites and loss of regulators
Loss of RbsR in Y. pestis
(ABC-transporter also is lost)
RbsR binding site
Start codon of rbsD
Regulon expansion, or
how FruR has become CRA
• CRA (a.k.a. FruR) in Escherichia coli:
– global regulator
– well-studied in experiment
(many regulated genes known)
• Going back in time: looking for candidate
CRA/FruR sites upstream of (orthologs of)
genes known to be regulated in E. coli
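The search for candidate sites upstream of orthologous genes can be sketched as a position weight matrix (PWM) scan. This is a minimal illustration under stated assumptions, not the procedure used in the talk: the training sites and the upstream region are made up, the background is uniform, and the names `build_pwm`, `score` and `best_hit` are hypothetical.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=0.5, background=0.25):
    """Log-odds matrix built from equal-length aligned sites."""
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(length):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2((column.count(base) + pseudocount) / total / background)
            for base in BASES
        })
    return pwm

def score(pwm, window):
    """Sum of per-position log-odds scores for one window."""
    return sum(column[base] for column, base in zip(pwm, window))

def best_hit(pwm, upstream):
    """Return (score, offset, site) for the best-scoring window."""
    width = len(pwm)
    hits = [
        (score(pwm, upstream[i:i + width]), i, upstream[i:i + width])
        for i in range(len(upstream) - width + 1)
    ]
    return max(hits)

# Made-up training sites and upstream region, for illustration only.
toy_sites = ["GCTGAAAC", "GCTGAATC", "GCTGAAAC", "ACTGAAAC"]
toy_pwm = build_pwm(toy_sites)
toy_upstream = "TTTTTTTTGCTGAAACTTTTTTTT"
top_score, top_offset, top_site = best_hit(toy_pwm, toy_upstream)
```

In practice the same scan would be repeated for the upstream region of each ortholog, with a score threshold chosen from the training sites; both choices are left open here.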
Common ancestor of gamma-proteobacteria
[Figure: metabolic map. Sugars: mannose, glucose, mannitol, fructose. Genes: manXYZ, ptsHI-crr, edd, epd, eda, adhE, aceEF, mtlA, mtlD, gapA, fbp, pykF, fruBA, fruK, pfkA, pgk, gpmA, icdA, ppsA, pckA, aceA, tpiA, aceB. Legend: Gamma-proteobacteria.]
Common ancestor of the Enterobacteriales
[Figure: the same metabolic map of sugars and genes as for the common ancestor of gamma-proteobacteria. Legend: Gamma-proteobacteria; Enterobacteriales.]
Common ancestor of Escherichia and Salmonella
[Figure: the same metabolic map of sugars and genes as for the common ancestor of gamma-proteobacteria. Legend: Gamma-proteobacteria; Enterobacteriales; E. coli and Salmonella spp.]
Regulation of amino acid biosynthesis
in the Firmicutes
• Interplay between regulatory RNA elements and
transcription factors
• Expansion of T-box systems (normally, RNA
structures regulating aminoacyl-tRNA synthetases)
Why T-boxes?
• May be easily identified
• In most cases functional specificity may
be reliably predicted by the analysis of
the specifier codons (anti-anti-codons)
• Sufficiently long to retain phylogenetic
signal
=> T-boxes are a good model of
regulatory evolution
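Predicting T-box specificity from the specifier codon amounts to a lookup in the standard genetic code. A minimal sketch, assuming the codon is given in the mRNA sense as DNA or RNA letters; the function name `specificity` is hypothetical, and the example codons GAC, AAC and CAC are the ones that appear on later slides.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
# (third position varies fastest; "*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def specificity(specifier_codon):
    """Return the one-letter amino acid encoded by a specifier codon."""
    codon = specifier_codon.upper().replace("U", "T")
    if codon not in CODON_TABLE:
        raise ValueError(f"not a valid codon: {specifier_codon!r}")
    return CODON_TABLE[codon]

predicted = {codon: specificity(codon) for codon in ("GAC", "AAC", "CAC")}
```

Here `predicted` maps GAC to D (Asp), AAC to N (Asn) and CAC to H (His), matching the ASP, ASN and HIS labels on the slides.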
Partial alignment of predicted T-boxes
Annotated elements: T-box (TGG), antiterminator, terminator (underlined on the slide)
Aminoacyl-tRNA synthetases
SA   serS  ->  26  CGTTA  51  AAATAGGGTGGCAACGCGTAGAC------------CACGTCCCTTGTAGGGATGTGGTCTTTTTTTA
DHA  tyrZ  ->  47  CGTTA  65  AGGTAAGGTGGTAACACGGGAGCA-------TACTCTCGTCCTTCTGGCAATGAAGGACGGGAGTTTTTTGTTTT
ST   trpS  ->  37  CCTTA  61  AATTGAGGTGGTACCGCGTATTACTT----GTAATAACGCCCTCACGTTTTAATAGCGTGGGGACTTTTTGCTAT
CA   aspS  ->  39  CGTTA  34  ATAAAGGATGGCACCGTGAAAA----------GCCTTCACTCCTTACTGGAGTGGAGGCTTTTTTTATTTTAAATAAA
DF   valS  ->  41  CGTTA  77  AATTAAGGTGGTAACGCGAGC------------TTTTCGTCCTTTTTAAAGAGGATGAAGAGCTCTTTTTTATTTCT
PN   thrS  ->  30  CGTTA  38  AATGAAGGTGGAACCACGTTG-------------CGACGTCCTTTCGAGGATGTCGCATTTTTTTATTAG
MN   ileS  ->  89  CGTTA  68  AATTAAGGTGGTACCACGAGC-------------TTTCGTCCTTTGATGAAAGTTCTTTTTTATTGAT
DF   leuS  ->  28  AGCTA  29  AATTAGGGTGGTACCGCGAAGATT-------TATCCTCGTCCCTAAACGTAAGTTTAGTGACGAGGATTTTTTATTTTCA
HD   argS  ->  41  CGTTA  27  AACGAGAGTGGTACCGCGGGTAA---------AAGCTCGCCTCTTTTTAGAAGAGGCGGGTTTTTTATTTT
DF   proS  ->  33  CGTTA  30  AACTAGAGTGGTACCGCGGAAAT-----TAAACCTTTCGTCTCTATACTTGTATAGAGATGAGAGGTTTTTTATATTTTCAGG
ZC   lysS  ->  46  CGTTA  63  AACTGAGGTGGTACCGCGAAGCTAA-----CAACTCTCGTCCTCAAGATGAATAATCTTGGGGGTGGGAGTTTTTTTGTTGCA
BQ   metS  ->  55  CGTTA  66  AAATAAGGTGGTACCGCGACTGTTTA---TACAGCCCCGCCCTTATCTTTTTTAGATAAGGGCGGGGCTTTTTATATTTAA
MN   pheS  ->  14  AATTA  20  AAAACGGATGGTACCGCGTGTC-------------AACGCTCCGCTTAAGGAGTTTTGGCACTTTTTTTGTTTT
MN   glyQ  ->  14  AGCTA  23  AATTAGGGTGGAACCGCGTTT------------CAAACGCCCCTATGTCAGTTGGCATGGGAGTGATTGAGCGTGGCTCTTTT
ST   alaS  ->  20  AATTA  18  AATAGAGGTGGTACCGCGGTT--------------TTCGCCCTCTGTGAGATGGACTTGTTTTGTATGGAGGACTATTTGAAA
Amino acid biosynthetic genes
SA   trpE  ->  32  AATTA   4  AACTAAGGTGGCACCACGGTA-------------ACGCGTCCTTACAGGTATATGCGTTATGTGGTGTCTTTTT
BS   ilvB  ->  50  CGTTA  47  AACAAGGGTGGTACCGCGGAAAGAAA---AGCCTTTTCGCCCCTTTTAGCTATCGCAGTTACTGCGCGGCTGATTGT
CA   ilvC  ->  40  CGTTA  14  AATTTGGGTGGTACCGCGCGACCAAA-----AATTCTCGCCCCAAGCAGGGAATTTTGGCCGTTTTTTTATATAAATAAAT
BQ   asnA  ->  51  CGTTA  62  AATTTGGGTGGTACCGCGGAACC-----AAAGCCTTTCGTCCCAGTTTTTTGGGAAAGAAGGGCTTTTTTTGTTGGCTT
BS   proB  ->  33  CGTTA  30  AATCAAGGTGGTACCACGGAAAC--------CCATTTCGTCCTTATGAATCAGGATGAAATGGGTTTTTTTATTGTAGA
SA   cysE  ->  33  CATTA  62  ATTCAGAGTGGAACCGTGCGG-------------AAGCGCCTCTAACAATACAATTTGTATGTTAGTGGTGCTTTTTTG
MN   hisC  ->  46  CGTTA  50  AATGAAGGTGGAACCACGTGTGT---------GTCAGCGTCCTTGCAAGTTTTTTGCAAGGGCGCTTTTTTGAATAGT
DHA  pheA  ->  41  CGTTA  50  AAAAAGGGTGGTACCGCGTGAC---------TTAACTCGTCCCTTATTTGGGGGTGAGGTAAGTCTTTTTTTATTTA
HD   serA  ->  42  cgtta  57  AATGAGGGTGGCACCGCGGTATG-------AACCTTCCGCCCCTCACGACAGTCGTCGTGTGGGCAGAAGGTTTTTTTACTAT
BQ   phhA  ->  51  CGTTA  34  AAATAGGGTGGTACCGCGATTC------------TTTCGCCCCTATCGGATTTTCCGATAGGGGCTTTTTCTATTTC
EF   yxjH  ->  40  CGTTA  51  AAAAAAGGTGGTACCGCGATAA-----------TAATCGCCCTTTTACTAGTTACGGCTAGTAAAAGGGCGTTTTTTTATAAA
Amino acid transporters
CA   yckK      ->  38  CGTTA  57  AATTAGAGTGGTACCGTGGAATT-------CAACTTCTGCCTCTAACTATGAGGATAGAAGTTTTTTGTTTTTAT
DF   yqiX      ->  41  CCTTA  30  AAAAAGAGTGGTAACGCGGATAT----------AATTCGTCTCTTAGCTGTAAAGCTAAGGGACTTTTTTGATTTA
HD   BH0807    ->  74  TGTTA  56  AACTGGGGTGGCACCACGACAAG----------TGATCGTCCCCAAGACTTTTATCAGTCTTGGGGACGTTTTTTTGTTCAT
EF   yheL      ->   8  AATTA  33  AATTAAGGTGGTACCGCGGAGA-----------GATTCGTCCTTATTCTTTAAGGATGAATCTCTCTTTTTATGTAGC
BQ   ykbA      ->  46  CGTTA  45  AACAAGGGTGGAACCACGAATAT--------AACACTCGTCCCTTTTTTAGGGAGGAGTGTTTTTTTATT
BQ   sdt2      ->  40  CGTTA  56  AATTGAGGTGGTACCACGGTATTAACATTACATATATCGTCCTCTACATGCATATTTGCGTGTAGGGGACTTTTTTATTTTC
EF   yusC      ->  42  CGTTA  60  AATTAAGGTGGTATCACGAAATGA-----CAAACTTTCGTCCTTTTTGCTGTAATAGCAAAAGGATGGAAGTTTTTTTGTTT
CA   yhaG      ->  48  CGTTA  51  AATTTAGGTGGTACCGCGGAAGT---------ATCTCCGTCCTAATTAATAAGATTAGGGCGGAGTTTTTTATTTGC
BQ   brnQ      ->  44  CGTTA  66  AATTAGGGTGGTATCGCGGGTAAA------TATAACTCGTCCCTTTCTTTAGGGACGAGTTTTTTGTGTTCTT
     REF01723  ->  44  CGTTA  55  AATTGAGGTGGCACCACGAATGC----------GATTCGTCCTCTTGGCTCACAGCCAAGAGGCTTTTTTGTTTTTTTAATA
BS   yvbW      ->  56  CGTTA  32  AACAAGAGTGGTACCGCGGTCAGC--CGAAGGCTCGTCGTCTCTTTATCTATTAGATTAGGTAGGAGACGGCGGGCTTTTTT
… continued (in the 5’ direction)
Annotated elements: specifier hairpin; anti-anti (specifier) codon, SC
Aminoacyl-tRNA synthetases
SA   SERS  SER  ---GTAGGACAAGTA  19  (*)  15  GAA--TCTACCTACTT  ->
DHA  tyrZ  Tyr  ----AAGAACAAGTA  18  (*)  18  GAA--TACCTCTTTGA  ->
ST   trpS  Trp  ---ATTAGAAGAGTA  16  (*)  12  GAAA-TGGACTAATGA  ->
CA   ASPS  ASP  -----GAGAAAAGTA  18  (*)  15  GAAA-GACATCTCGGA  ->
DF   VALS  VAL  -GAAGAAGAGGAGTA  16  (*)  17  GAAT-GTAGCTTTGGA  ->
PN   THRS  THR  ----AGAGACAAGTC  18  (*)  14  GAT--ACTACTCTTGA  ->
MN   ileS  Ile  ----CAAAAACACAA  17  (*)  18  -----ATCATTTTGTT  ->
DF   leuS  Leu  ----CTAGAGCAGTA  19  (*)  10  GAA--CTTACTAGATT  ->
HD   ARGS  ARG  -----TGGGAGAGTA  20  (*)  14  GAAA-CGCACCCATGA  ->
DF   proS  Pro  ---AAAGAAATAGTA  18  (*)  14  GAA--CCTGTCTTTTA  ->
ZC   lysS  Lys  ---AAGAGAAGAGTA  19  (*)  15  GAAAAAAGACTTGGAG  ->
BQ   metS  Met  ---AAAGGAAAAGTA  19  (*)  14  GAACAATGGCCTTTGA  ->
MN   pheS  Phe  ----TGAGATTAGTA  18  (*)  16  GAA--TTCACTCAGAA  ->
MN   glyQ  Gly  ---AGAAAGAGAGTT  15  (*)  14  GACT-GGCACTTTCTC  ->
ST   alaS  Ala  -AGTTAAGAATTGTT  17  (*)  17  -----GCTACTTAACT  ->
(*) Middle segments, in row order, with several rows run together:
AGAGAGCTTGTGGTT---AGTGTGAACAAG--AGAAAGTTGCCGGCT---GATGAGAGGCGCTT
AGAGAGTTAGTGGTT---GGTGCAAGCTAACAGCGAATTGGGAAAT---GGTGTGAGCCCAAAGAGAGGAAAATTCACTGGCTGTAAGATTTTC
AGAGAGTGCGTGGTT---GCTGGAAACGCATAGCGAATAGGTGAT----GGTGTAAGACCTATT
AGAGGAAGTGGAA-----GGTGAGAACTAATATT
AGCGAGTCGGGAT-----GGTGGGAGCCGATAGAGAGAAAACGGT----GGTGAGAGTTTTC-AGAGAGCTCTGGTA----GCTGAGAAAGAGC-AGAGAGCTTCGGTA----GCTGAGAAGAAGC-AGGGAATGCGGGGCGTG-ACTGGAAACCCGCAGCGAACCTGAGAG----AGTGTAAGTCAGGT
AGAAAAGTGACGGTT---GCTGCGAGTCATT-
Amino acid biosynthetic genes
SA   trpE  Trp  TCTAAAGAAATAGTA  22  (*)  14  GAAT-TGGACTTTGGA  ->
BS   ilvB  Leu  ---TGAGGATAAGTA  20  (*)  16  GAA--CTCGCCTCAGA  ->
CA   ilvC  Val  -----AGGAAGAGTA  17  (*)  13  GAAG-GTAGCCTTTGA  ->
BQ   asnA  Asn  --AGGACGAGTAGTA  15  (*)  15  GAAG-AACCTCCTGGA  ->
BS   proB  Pro  -----AGGATTAGTA  18  (*)  15  GAA--CCTGCCTTGGA  ->
SA   cysE  Cys  --CGAAGGATTAGTA  18  (*)  14  GAA--TGCACCTTCGT  ->
MN   hisC  His  -----AGAGAAAAAA  16  (*)  15  -----CACATTCTTGA  ->
DHA  pheA  Phe  -----AAAGAGAGCA  19  (*)  14  GAGA-TTCACTCTGGA  ->
HD   serA  Ser  ----GAAGATGAGGA  17  (*)  18  -----AGCCCTTCTGA  ->
BQ   phhA  Tyr  AGAATCGCAGTAGTA  17  (*)  14  GAAT-TACAATTCTGG  ->
EF   yxjH  Met  -----TAGGAAAGTA  17  (*)  13  GAAAAATGGCCTAGGA  ->
(*) Middle segments, in row order, with several rows run together:
AGAAAGCTAATGGGT---GATGGGAATTAGC-AGAGAACCGGGTTA----GCTGAGAACCGG--AGAGAGTGAGATACT---GGTGGGAACTCAT-AGCGAGTCAGGGGT----GGTGTGAGCCTGA-AGAGAGCAAAATGAACC-GCTGAAACATTTTGC
AGAGAGTGTACGGTT---GCTGTGAGTACA--AGAGAGTATGGGAA----GCTGAAAACATAC-AGGGAACTAAAGTCGGAGACTGAAAGCTTTAGT
AGAGAGCTGGTGGTT---GCTGTGAACCAGCTAGAGAGCTAATGGTC---GGTGGAAATTGGC-AGAGAGACTTTGGTT---GGTGAAAAAAGTT--
Amino acid transporters
CA   yckK      Cys  ----AAGAACCAGTA  17  (*)  15  GAA--TGCATCTTTGA  ->
DF   yqiX      Arg  -----AGAGAAAGTA  16  (*)  14  GAAG-AGAGCTCTGGA  ->
HD   BH0807    Lys  ----AGAGAAGAGTA  19  (*)  14  GAAGCAAGACTCTGAG  ->
EF   yheL      Tyr  -TTATTAGCCCAGTA  19  (*)  13  GAAT-TACACTAATAA  ->
BQ   ykbA      Thr  --GAGGACACGATCA  16  (*)  14  GATT-ACCACCTCTGA  ->
BQ   sdt2      Trp  ---GCAAGAAGAGTA  18  (*)  15  GAA--TGGGCTTGCGA  ->
EF   yusC      Met  ----AAAGAAGAGTA  18  (*)  16  GAAG-ATGGTCTTTGA  ->
CA   yhaG      Trp  ----AAGGAAGAGTA  18  (*)  15  GAA--TGGACCTTTTA  ->
BQ   brnQ      Ile  ----GAGAACGAGTA  19  (*)  15  GAAA-ATCATCTCCGA  ->
     REF01723  His  --TTAGGACATAGTA  18  (*)  17  -----CACACCTAAAA  ->
BS   yvbW      Leu  -----GGGAGCAGTA  18  (*)  13  GAA--CTCGCCCGGGA  ->
(*) Middle segments, in row order, with several rows run together:
AGAGAAAAATCTCCAAG-GCTGAAAGGGATTTT
AGCGAGTTAGGGGTT---GGTGTAAGCCTAGCAGAAAGCCTGTAGTT---GCTGAGAACGGGT-AGAAAGTCGATGGTT---GCTGCGAATCGAT-AGAGAGGGAAGCCTTTG-GCTGTGAGCTTCCTAGAGAGCTGGGGGAA---GGTGTGAGCCCGGTAGAGAGCCCTGTTT----GCTGAGAATGGG--AGAGAGCTGAGGGT----GGTGTGATCTCAGTAGAGAGTTGGCGATTT--GCTGAAAGCCAAC-AGAGACTTTTTCATTG--GCTGAAAGAAAAAGAGAGAGCTGCGGGGT---GGTGCGACGCAGC--
805 T-boxes in 96 bacteria
• Firmicutes
– aa-tRNA synthetases
– enzymes
– transporters
– all amino acids excluding glutamate
• Actinobacteria (regulation of translation – predicted)
– branched chain (ileS)
– aromatic (Atopobium minutum)
• Delta-proteobacteria
– branched chain (leu – enzymes)
• Thermus/Deinococcus group (aa-tRNA synthetases)
– branched chain (ileS, valS)
– glycine
• Chloroflexi, Dictyoglomi
– aromatic (trp – enzymes)
– branched chain (ileS)
– threonine
Recent duplications and bursts:
ARG-T-box in Clostridium difficile
[Figure: tree of ARG-specific T-boxes. Groups: Lactobacillales (argS), Clostridiales (argS), Bacillales (argS). Leaves: LR_ARGS, CPE_ARGS, CAC_ARGS, CB_ARGS, CBE_ARGS, CTC_ARGS, LP_ARGS, LME_ARGS, LJ_ARGS, CDF_YQIXYZ, LGA_ARGS, RDF02391, PPE_ARGS, LSA_ARGS, CDF_ARGC, BC_ARGS2, EF_ARGS, BH_ARGS, CDF_ARGH. Two branches are marked NEW. Legend: ARG-specific T-box regulatory site; aminoacyl-tRNA synthetase; biosynthetic genes; amino acid transporters; others.]
[Figure: T-box-regulated loci in Clostridium difficile: yqiXYZ, RDF02391, argCJBDF, argH, argG. Labels: predicted amino acid transporters; amino acid biosynthetic genes.]
… caused by loss of transcription factor AhrC
Gram+ bacteria:
• AhrC regulatory protein (negative regulation of arginine biosynthesis, positive regulation of arginine catabolism)
• binding to the 5’ UTR region of genes: regulation of gene expression
Clostridium difficile:
• AhrC is lost
• expansion of the T-box regulon: expression of arginine biosynthetic and transport genes is regulated by T-box antitermination
[Figure: loci yqiXYZ, argC, argH and argG in C. difficile and in other Clostridium spp. (CA, CTC, CTH, CPE, CB, CPE). Legend: AhrC binding site; ARG-specific T-box regulatory site.]
Duplications and changes in specificity: ASN/ASP/HIS T-boxes
[Figure: tree of ASN-, ASP- and HIS-specific T-boxes. Specifier codons: ASP = GAC, ASN = AAC. Annotations: rapid mutation of regulatory codons; NEW. Groups: Bacillales, other Gram+, Lactobacillales, Clostridiales; species singled out: P. pentosaceus, L. reuteri, L. johnsonii. Gene labels: hisS-aspS, his operon, hisXYZ, asnS, asnA, aspS, hisS, glnQHMP.
Leaves: CH_HISS, CTH_HISS, DRE_HISS, TTE_HISS, PL_HISS, BE_HISS, BL_HISS, BS_HISS, BC_HISS, LRE_HISXYZ, LSA_HISXYZ, OOE_HISXYZ, SGO_HISC, SMU_HISC, LP_HISXYZ, EF_HISXYZ, OB_HISS, BCL_HISS, BH_HISS, EX_HISS, LME_HISXYZ, CDF_HISZX, EF_HISS, LMO_HISXYZ, LME_HIS(Z\G), LL_HISC, LP_HISZ, CPE_ASNS2, CDF_ASNA, CB_ASNS2, CDF_ASNS2, CTC_ASNA, LCA_HISZ, CB_ASNS3, CAC_ASNS32, BC_ASNS2, BC_ASNA, CBE_ASNS2, CTC_ASNS2, CPE_ASNA, PPE_HISXYZ, PPE_ASNS, EX_ASNA, LCA_HISS, LB_ASNA, LB_ASNS2, LJ_HISS, LP_ASNA, PPE_ASNA, LB_HISS, LRE_ASPS, LP_HISS, PPE_HISS, LRE_HISS, LJ_ASNA, LJ_glnQHMP, LD_ASNA, SG_ASPS2, SMU_ASPS2.]
Blow-up 1
[Figure: enlarged part of the ASN/ASP/HIS T-box tree (Lactobacillales). Leaves: LCA_HISS, LJ_HISS, PPE_HISXYZ, PPE_ASNS2, LB_HISS, LRE_ASPS, LB_ASNA, LP_HISS, PPE_HISS, PPE_ASNA, LP_ASNA, LRE_HISS, LJ_ASNA, LJ_GLNQHMP, LD_ASNA. Specifier codons: ASN = AAC, HIS = CAC, ASP = GAC. Gene labels: hisS-aspS, hisXYZ, asnS, asnA, aspS, hisS, glnQHMP; species singled out: P. pentosaceus, L. reuteri, L. johnsonii. Events: disruption of hisS-aspS operon; mutation of regulatory codon.]
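The specifier codons shown in this blow-up (ASN = AAC, ASP = GAC, HIS = CAC) differ from one another at a single position, which is why one point mutation of the regulatory codon can switch T-box specificity. A small sketch that checks this; the helper names `hamming` and `differing_positions` are hypothetical.

```python
from itertools import combinations

# Specifier codons as labelled on the slide.
SPECIFIER_CODONS = {"ASN": "AAC", "ASP": "GAC", "HIS": "CAC"}

def hamming(a, b):
    """Number of mismatching positions between two equal-length codons."""
    if len(a) != len(b):
        raise ValueError("codons must have the same length")
    return sum(x != y for x, y in zip(a, b))

def differing_positions(a, b):
    """1-based codon positions at which the two codons differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]

pairwise_distance = {
    (first, second): hamming(SPECIFIER_CODONS[first], SPECIFIER_CODONS[second])
    for first, second in combinations(SPECIFIER_CODONS, 2)
}
```

All three pairwise distances equal 1, and in every pair the difference is at the first codon position.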
Blow-up 2. Prediction
Regulators lost in lineages with expanded HIS-T-box regulon??
… and validation
• conserved motifs upstream of HIS biosynthesis genes
[Figure labels: Bacillales (his operon), Clostridiales, Thermoanaerobacteriales, Halanaerobiales, Bacillales]
• candidate transcription factor yerC co-localized with the his genes
• present only in genomes with the motifs upstream of the his genes
• genomes with neither YerC motif nor HIS-T-boxes: attenuators
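The validation argument above is a phyletic-pattern check: the candidate regulator should be present exactly in the genomes that carry the motif. A minimal sketch with made-up presence/absence data; the genome names G1 to G5 are placeholders rather than results from the talk, and the function names are hypothetical.

```python
def cooccurrence(tf_genomes, motif_genomes, all_genomes):
    """Partition genomes by presence of the regulator and of its motif."""
    tf, motif, universe = set(tf_genomes), set(motif_genomes), set(all_genomes)
    return {
        "both": sorted(tf & motif),
        "tf_only": sorted(tf - motif),
        "motif_only": sorted(motif - tf),
        "neither": sorted(universe - tf - motif),
    }

def perfectly_correlated(table):
    """True when no genome has the regulator without the motif, or vice versa."""
    return not table["tf_only"] and not table["motif_only"]

# Placeholder data, for illustration only.
genomes = ["G1", "G2", "G3", "G4", "G5"]
with_regulator = ["G1", "G2", "G3"]
with_motif = ["G1", "G2", "G3"]
table = cooccurrence(with_regulator, with_motif, genomes)
```

Genomes that fall in the `neither` class are the ones where another mechanism (T-boxes or attenuators in the slide) would be expected.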
The evolutionary history of his gene
regulation in the Firmicutes
More duplications:
THR-T-box in C. difficile and B. cereus
[Figure: tree of THR-specific T-boxes. Groups: Bacillales, Lactobacillaceae, Leuconostocaceae, Clostridiales, Streptococcaceae; species singled out: B. cereus, C. difficile. Gene labels: thrS, thrZ, hom, thrCB, brnQ.
Leaves: BE_THRS, BC_THRS, BCE_BRNQ2, BH_THRS, BL_THRZ, BS_THRZ*, BC_HOM, BCL_THRZ*, BC_THRZ*, BC_THRZ, LMO_THRS, BCL_THRS, BL_THRS, BS_THRS, BCL_THRZ, PPE_THRS, LB_THRS, LJ_THRS, LP_THRS, TR_THRZ, EX_THRS, BS_THRZ, CBE_THRZ, CTH_THRZ, CPE_THRS, CDF_THRZ, CAC_THRZ, HMO_YNGI, OOE_THRS, CTE_THRZ, CDF_THRC, CDF_HOM, CDF_HOM*, TTE_THRZ, CB_THRZ, CBE_THRS, MFL_THRS, MMY_THRS, LME_THRS, SA_THRS, CTC_BRNQ1, SPY_THRS, SEQ_THRS, SUB_THRS, SMU_THRS, SAG_THRS, SMI_THRS, SPN_THRS, SG_THRS, STH_THRS, LL_THRS, SUI_THRS.
Legend: THR-specific T-box regulatory site; aminoacyl-tRNA synthetase; biosynthetic genes; amino acid transporters; others.]
T-boxes: Summary / History
Life without Fur
Regulation of iron homeostasis
(the Escherichia coli paradigm)
Iron:
• essential cofactor (limiting in many environments)
• dangerous at large concentrations
FUR (responds to iron):
• synthesis of siderophores
• transport (siderophores, heme, Fe2+, Fe3+)
• storage
• iron-dependent enzymes
• synthesis of heme
• synthesis of Fe-S clusters
Similar in Bacillus subtilis
Regulation of iron homeostasis in α-proteobacteria
[Figure: regulatory scheme under [−Fe] and [+Fe] conditions. Transcription factors: RirA, Irr, Fur, IscR. Signal labels: FeS, heme, Fe, "degraded", FeS status of cell. Regulated subsystems: siderophore uptake; Fe2+/Fe3+ uptake (iron uptake systems); iron storage (ferritins); FeS synthesis; heme synthesis; iron-requiring enzymes [iron cofactor]; transcription factors.]
Experimental studies:
• FUR/MUR: Bradyrhizobium, Rhizobium and Sinorhizobium
• RirA (Rrf2 family): Rhizobium and Sinorhizobium
• Irr (FUR family): Bradyrhizobium, Rhizobium and Brucella
Distribution of transcription factors in genomes
Search for candidate motifs and binding sites using standard comparative genomic techniques
FUR/MUR branch of the FUR family
[Figure: tree; leaves are listed below as identifier and species]
Fur in γ- and β-proteobacteria:
ECOLI  Escherichia coli: sp|P0A9A9
PSEAE  Pseudomonas aeruginosa: sp|Q03456
NEIMA  Neisseria meningitidis: sp|P0A0S7
Fur in ε-proteobacteria:
HELPY  Helicobacter pylori: sp|O25671
Fur in Firmicutes:
BACSU  Bacillus subtilis: sp|P54574
α-proteobacteria:
SM mur  Sinorhizobium meliloti
MBNC03003179  Mesorhizobium sp. BNC1 (I)
BQ fur2  Bartonella quintana
BMEI0375  Brucella melitensis
EE36 12413  Sulfitobacter sp. EE-36
MBNC03003593  Mesorhizobium sp. BNC1 (II)
RB2654 19538  Rhodobacterales bacterium HTCC2654
AGR C 620  Agrobacterium tumefaciens
RHE_CH00378  Rhizobium etli
RL mur  Rhizobium leguminosarum
Nham 0990  Nitrobacter hamburgensis X14
Nwi 0013  Nitrobacter winogradskyi
RPA0450  Rhodopseudomonas palustris
BJ fur  Bradyrhizobium japonicum
ROS217 18337  Roseovarius sp. 217
Jann 1799  Jannaschia sp. CC51
SPO2477  Silicibacter pomeroyi
STM1w01000993  Silicibacter sp. TM1040
MED193 22541  Roseobacter sp. MED193
OB2597 02997  Oceanicola batsensis HTCC2597
SKA53 03101  Loktanella vestfoldensis SKA53
Rsph03000505  Rhodobacter sphaeroides
ISM 15430  Roseovarius nubinhibens ISM
PU1002 04436  Pelagibacter ubique HTCC1002
GOX0771  Gluconobacter oxydans
ZM01411  Zymomonas mobilis
Saro02001148  Novosphingobium aromaticivorans
Sala 1452  Sphingopyxis alaskensis RB2256
ELI1325  Erythrobacter litoralis
OA2633 10204  Oceanicaulis alexandrii HTCC2633
PB2503 04877  Parvularcula bermudensis HTCC2503
CC0057  Caulobacter crescentus
Rrub02001143  Rhodospirillum rubrum
Amb1009  Magnetospirillum magneticum (I)
Amb4460  Magnetospirillum magneticum (II)
Branch labels within α-proteobacteria:
• Mur in α-proteobacteria: regulator of manganese uptake genes (sit, mntH)
• Fur in α-proteobacteria: regulator of iron uptake and metabolism genes
• Irr in α-proteobacteria
FUR and MUR boxes
[Figure: sequence logos]
• Erythrobacter litoralis, Caulobacter crescentus, Zymomonas mobilis, Novosphingobium aromaticivorans, Oceanicaulis alexandrii, Sphingopyxis alaskensis, Gluconobacter oxydans, Rhodospirillum rubrum, Parvularcula bermudensis, Magnetospirillum magneticum
• Mur: identified Mur-binding sites of α-proteobacteria
• Escherichia coli and Bacillus subtilis: sequence logos for the known Fur-binding sites
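Sequence logos of this kind plot, for each position of the aligned sites, the information content of the column. A minimal sketch of that computation, assuming equal-length sites, a uniform background and no small-sample correction; the toy sites are made up and the function names are hypothetical.

```python
import math
from collections import Counter

def column_information(column):
    """Information content of one alignment column in bits: 2 - entropy."""
    counts = Counter(column)
    n = len(column)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return 2.0 - entropy

def logo_information(sites):
    """Per-position information content for a set of aligned sites."""
    if len({len(site) for site in sites}) != 1:
        raise ValueError("all sites must have the same length")
    return [column_information(column) for column in zip(*sites)]

# Made-up aligned sites, for illustration only.
toy_sites = ["GATA", "GATC", "GACG", "GACT"]
info = logo_information(toy_sites)
```

For the toy sites, the two invariant positions carry 2 bits each, the position split evenly between T and C carries 1 bit, and the position with all four bases carries none.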
Irr branch of the FUR family
[Figure: tree; leaves are listed below as identifier and species]
Fur in γ- and β-proteobacteria:
ECOLI  Escherichia coli: sp|P0A9A9
PSEAE  Pseudomonas aeruginosa: sp|Q03456
NEIMA  Neisseria meningitidis: sp|P0A0S7
Fur in ε-proteobacteria:
HELPY  Helicobacter pylori: sp|O25671
Fur in Firmicutes:
BACSU  Bacillus subtilis: sp|P54574
Mur / Fur in α-proteobacteria
Irr in α-proteobacteria: regulator of iron homeostasis
AGR C 249  Agrobacterium tumefaciens
SM irr  Sinorhizobium meliloti
RHE CH00106  Rhizobium etli
RL irr1  Rhizobium leguminosarum (I)
RL irr2  Rhizobium leguminosarum (II)
MLr5570  Mesorhizobium loti
MBNC03003186  Mesorhizobium sp. BNC1
BQ fur1  Bartonella quintana
BMEI1955  Brucella melitensis (I)
BMEI1563  Brucella melitensis (II)
BJ blr1216  Bradyrhizobium japonicum (II)
RB2654 182  Rhodobacterales bacterium HTCC2654
SKA53 01126  Loktanella vestfoldensis SKA53
ROS217 15500  Roseovarius sp. 217
ISM 00785  Roseovarius nubinhibens ISM
OB2597 14726  Oceanicola batsensis HTCC2597
Jann 1652  Jannaschia sp. CC51
Rsph03001693  Rhodobacter sphaeroides
EE36 03493  Sulfitobacter sp. EE-36
STM1w01001534  Silicibacter sp. TM1040
MED193 17849  Roseobacter sp. MED193
SPOA0445  Silicibacter pomeroyi
RC irr  Rhodobacter capsulatus
RPA2339  Rhodopseudomonas palustris (I)
RPA0424*  Rhodopseudomonas palustris (II)
BJ irr*  Bradyrhizobium japonicum (I)
Nwi 0035*  Nitrobacter winogradskyi
Nham 1013*  Nitrobacter hamburgensis X14
PU1002 04361  Pelagibacter ubique HTCC1002
Irr boxes
[Figure: sequence logos for Rhizobiaceae plus Bradyrhizobiaceae; Rhodobacteraceae; Rhodospirillales]
RirA/NsrR family (Rhizobiales)
IscR family
Regulation of genes in functional subsystems
[Figure panels: Rhizobiales; Bradyrhizobiaceae; Rhodobacterales; The Zoo (likely ancestral state)]
Reconstruction of history
[Figure annotations: frequent co-regulation with Irr; strict division of function with Irr; appearance of the iron-Rhodo motif]
All logos and Some Very Tempting Hypotheses:
1. Cross-recognition of FUR and IscR motifs in the ancestor.
2. When FUR had become MUR, and IscR had been lost in Rhizobiales, emerging RirA (from the Rrf2 family, with a rather different general consensus) took over their sites.
3. Iron-Rhodo boxes are recognized by IscR: directly testable.
Summary and open problems
• Regulatory systems are very flexible
– easily lost
– easily expanded (in particular, by duplication)
– may change specificity
– rapid turnover of regulatory sites
• With more stories like these, we can start thinking about
a general theory
– catalog of elementary events; how frequent?
– mechanisms (duplication, birth e.g. from enzymes, horizontal
transfer)
– conserved (regulon cores) and non-conserved (marginal regulon
members) genes in relation to metabolic and functional
subsystems/roles
– (TF family-specific) protein-DNA recognition code
– distribution of TF families in genomes; distribution of regulon
sizes; etc.
People
• Andrei A. Mironov – software, algorithms
• Alexandra Rakhmaninova – SDP, protein-DNA correlations
• Olga Kalinina (on loan to EMBL) – SDP
• Yuri Korostelev – protein-DNA correlations
• Olga Laikova – LacI
• Dmitry Ravcheev – CRA/FruR
• Dmitry Rodionov (on loan to Burnham Institute) – iron etc.
• Alexei Vitreschak – T-boxes and riboswitches
• Andy Johnston (U. of East Anglia) – experimental validation (iron)
• Leonid Mirny (MIT) – protein-DNA, SDP
• Andrei Osterman (Burnham Institute) – experimental validation
• Howard Hughes Medical Institute
• Russian Foundation for Basic Research
• Russian Academy of Sciences, program “Molecular and Cellular Biology”