Riboswitches: the oldest regulatory system?

Evolution of bacterial
regulatory systems
Mikhail Gelfand
Research and Training Center “Bioinformatics”
Institute for Information Transmission Problems
Moscow, Russia
Institute of Protein Research
40th Anniversary Conference
June 2007
Comparative genomics of zinc regulons
Two major roles of zinc in bacteria:
• Structural role in DNA polymerases,
primases, ribosomal proteins, etc.
• Catalytic role in metal proteases and
other enzymes
Toxic at high concentrations
=> the concentration of zinc is
tightly controlled
Regulators (zinc uptake) and motifs
nZUR-
nZUR-
GATATGTTATAACATATC
GAAATGTTATANTATAACATTTC
GTAATGTAATAACATTAC
TTAACYRGTTAA
pZUR
TAAATCGTAATNATTACGATTTA
AdcR
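The consensus motifs above use IUPAC ambiguity codes (Y, R, N), and most of them read the same on both strands. A minimal sketch of how such a consensus could be checked for palindromy and used to scan a sequence follows. The function names and the toy sequence are hypothetical, and an exact consensus match is only a stand-in for the profile-based search a real analysis would use.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Complement of every IUPAC code (R <-> Y, K <-> M, B <-> V, D <-> H).
IUPAC_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(consensus):
    """Reverse complement of a consensus written with IUPAC codes."""
    return consensus.translate(IUPAC_COMPLEMENT)[::-1]


def is_palindromic(consensus):
    """True if the consensus equals its own reverse complement."""
    return reverse_complement(consensus) == consensus


def find_sites(consensus, sequence):
    """Start positions (0-based) of all matches, overlapping ones included."""
    pattern = "".join(IUPAC_REGEX[code] for code in consensus)
    # A lookahead keeps the match zero-width so overlapping sites are found.
    return [m.start() for m in re.finditer("(?=" + pattern + ")", sequence)]


# Consensus strings copied from the slide (regulator labels omitted).
SLIDE_MOTIFS = [
    "GATATGTTATAACATATC",
    "GAAATGTTATANTATAACATTTC",
    "GTAATGTAATAACATTAC",
    "TTAACYRGTTAA",
    "TAAATCGTAATNATTACGATTTA",
]

for motif in SLIDE_MOTIFS:
    print(motif, "palindromic" if is_palindromic(motif) else "not strictly palindromic")

# Hypothetical toy sequence with one site planted at position 3.
print(find_sites("TTAACYRGTTAA", "CCCTTAACTGGTTAAGGG"))
```

Exact consensus matching is deliberately strict; relaxing it (for example by allowing mismatches) would be a separate design decision.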
Predictions
TM
Zn
• Known transporters
– Orthologs of the AdcABC and
YciC transport systems
– Paralogs of the components
of the AdcABC and YciC
transport systems
adcA
zinT
• Candidate transporters with
previously unknown
specificity
– ZinT is a new type of zinc-binding component of a zinc
ABC transporter
S. pneumoniae
S. pyogenes
S. agalactiae
zinc regulation shown in
experiment
lmb phtD
phtA
phtE
phtB
S. equi
lmb phtD
phtY
lmb phtD
• PHT (pneumococcal histidine
triad) proteins of
Streptococcus spp.
– PHT proteins are adhesins
involved in the attachment of
streptococci to epithelial
cells, leading to invasion. This
process is regulated by zinc
concentration.
AdcR
pZUR
nZUR
Zinc and (paralogs of) ribosomal proteins
Genome                   L36   L33   L31   S14
E. coli, S. typhi        –     –     –+    –
K. pneumoniae            –     –     ––    –
Y. pestis, V. cholerae   –     –     –+    –
B. subtilis              –     –+–   –+    –+
S. aureus                –     –––   –     –+
Listeria spp.            –     ––    –     –+
E. faecalis              –     –––   –     –+–
S. pne., S. mutans       –     –––   –     –
S. pyo., L. lactis       –     –––   –     –+
AdcR
pZUR
nZUR
Zn-ribbon motif
(Makarova-Ponomarev-Koonin, 2001)
Genome                   L36   L33      L31    S14
E. coli, S. typhi        (–)   –        (–)+   –
K. pneumoniae            (–)   –        (–)–   –
Y. pestis, V. cholerae   (–)   –        (–)+   –
B. subtilis              (–)   (–)+–    (–)+   (–)+
S. aureus                (–)   (–)––    –      (–)+
Listeria spp.            (–)   (–)–     –      (–)+
E. faecalis              (–)   (–)––    –      (–)+–
S. pne., S. mutans       (–)   (–)––    –      (–)
S. pyo., L. lactis       (–)   (–)––    –      (–)+
Summary of observations:
• Makarova-Ponomarev-Koonin, 2001:
– L36, L33, L31, S14 are the only ribosomal proteins
duplicated in more than one species
– L36, L33, L31, S14 are four out of seven ribosomal
proteins that contain the zinc-ribbon motif (four
cysteines)
– Out of two (or more) copies of the L36, L33, L31, S14
proteins, one usually contains the zinc ribbon, while the
other has lost it
• Among genes encoding paralogs of
ribosomal proteins, there is (almost)
always one gene regulated by a zinc
repressor, and the corresponding protein
never has a zinc ribbon motif
Zinc starvation and its consequences
Zn-rich
conditions:
sufficient Zn for
the ribosomes
and the enzymes
Bad scenario:
all Zn utilized by the
ribosomes,
no Zn for Zn-dependent enzymes
Good scenario:
some ribosomes
without Zn,
some Zn left for the
enzymes
Regulatory mechanism
Sufficient Zn
ribosomes
repressor
R
Zn-dependent
enzymes
Zn starvation
R
Prediction …
(Proc Natl Acad Sci U S A. 2003 Aug 19;100(17):9912-7.)
… and confirmation
(Mol Microbiol. 2004 Apr;52(1):273-83.)
T-boxes: the mechanism
(Grundy & Henkin; Putzer & Grunberg-Manago)
Partial alignment of predicted T-boxes
Terminator(underlined)
===========> <===========
TGG: T-box
Antiterminator
==> ===>
<===<==
Aminoacyl-tRNA synthetases
SA   serS      ->  26  CGTTA  51  AAATAGGGTGGCAACGCGTAGAC------------CACGTCCCTTGTAGGGATGTGGTCTTTTTTTA
DHA  tyrZ      ->  47  CGTTA  65  AGGTAAGGTGGTAACACGGGAGCA-------TACTCTCGTCCTTCTGGCAATGAAGGACGGGAGTTTTTTGTTTT
ST   trpS      ->  37  CCTTA  61  AATTGAGGTGGTACCGCGTATTACTT----GTAATAACGCCCTCACGTTTTAATAGCGTGGGGACTTTTTGCTAT
CA   aspS      ->  39  CGTTA  34  ATAAAGGATGGCACCGTGAAAA----------GCCTTCACTCCTTACTGGAGTGGAGGCTTTTTTTATTTTAAATAAA
DF   valS      ->  41  CGTTA  77  AATTAAGGTGGTAACGCGAGC------------TTTTCGTCCTTTTTAAAGAGGATGAAGAGCTCTTTTTTATTTCT
PN   thrS      ->  30  CGTTA  38  AATGAAGGTGGAACCACGTTG-------------CGACGTCCTTTCGAGGATGTCGCATTTTTTTATTAG
MN   ileS      ->  89  CGTTA  68  AATTAAGGTGGTACCACGAGC-------------TTTCGTCCTTTGATGAAAGTTCTTTTTTATTGAT
DF   leuS      ->  28  AGCTA  29  AATTAGGGTGGTACCGCGAAGATT-------TATCCTCGTCCCTAAACGTAAGTTTAGTGACGAGGATTTTTTATTTTCA
HD   argS      ->  41  CGTTA  27  AACGAGAGTGGTACCGCGGGTAA---------AAGCTCGCCTCTTTTTAGAAGAGGCGGGTTTTTTATTTT
DF   proS      ->  33  CGTTA  30  AACTAGAGTGGTACCGCGGAAAT-----TAAACCTTTCGTCTCTATACTTGTATAGAGATGAGAGGTTTTTTATATTTTCAGG
ZC   lysS      ->  46  CGTTA  63  AACTGAGGTGGTACCGCGAAGCTAA-----CAACTCTCGTCCTCAAGATGAATAATCTTGGGGGTGGGAGTTTTTTTGTTGCA
BQ   metS      ->  55  CGTTA  66  AAATAAGGTGGTACCGCGACTGTTTA---TACAGCCCCGCCCTTATCTTTTTTAGATAAGGGCGGGGCTTTTTATATTTAA
MN   pheS      ->  14  AATTA  20  AAAACGGATGGTACCGCGTGTC-------------AACGCTCCGCTTAAGGAGTTTTGGCACTTTTTTTGTTTT
MN   glyQ      ->  14  AGCTA  23  AATTAGGGTGGAACCGCGTTT------------CAAACGCCCCTATGTCAGTTGGCATGGGAGTGATTGAGCGTGGCTCTTTT
ST   alaS      ->  20  AATTA  18  AATAGAGGTGGTACCGCGGTT--------------TTCGCCCTCTGTGAGATGGACTTGTTTTGTATGGAGGACTATTTGAAA

Amino acid biosynthetic genes
SA   trpE      ->  32  AATTA  4   AACTAAGGTGGCACCACGGTA-------------ACGCGTCCTTACAGGTATATGCGTTATGTGGTGTCTTTTT
BS   ilvB      ->  50  CGTTA  47  AACAAGGGTGGTACCGCGGAAAGAAA---AGCCTTTTCGCCCCTTTTAGCTATCGCAGTTACTGCGCGGCTGATTGT
CA   ilvC      ->  40  CGTTA  14  AATTTGGGTGGTACCGCGCGACCAAA-----AATTCTCGCCCCAAGCAGGGAATTTTGGCCGTTTTTTTATATAAATAAAT
BQ   asnA      ->  51  CGTTA  62  AATTTGGGTGGTACCGCGGAACC-----AAAGCCTTTCGTCCCAGTTTTTTGGGAAAGAAGGGCTTTTTTTGTTGGCTT
BS   proB      ->  33  CGTTA  30  AATCAAGGTGGTACCACGGAAAC--------CCATTTCGTCCTTATGAATCAGGATGAAATGGGTTTTTTTATTGTAGA
SA   cysE      ->  33  CATTA  62  ATTCAGAGTGGAACCGTGCGG-------------AAGCGCCTCTAACAATACAATTTGTATGTTAGTGGTGCTTTTTTG
MN   hisC      ->  46  CGTTA  50  AATGAAGGTGGAACCACGTGTGT---------GTCAGCGTCCTTGCAAGTTTTTTGCAAGGGCGCTTTTTTGAATAGT
DHA  pheA      ->  41  CGTTA  50  AAAAAGGGTGGTACCGCGTGAC---------TTAACTCGTCCCTTATTTGGGGGTGAGGTAAGTCTTTTTTTATTTA
HD   serA      ->  42  cgtta  57  AATGAGGGTGGCACCGCGGTATG-------AACCTTCCGCCCCTCACGACAGTCGTCGTGTGGGCAGAAGGTTTTTTTACTAT
BQ   phhA      ->  51  CGTTA  34  AAATAGGGTGGTACCGCGATTC------------TTTCGCCCCTATCGGATTTTCCGATAGGGGCTTTTTCTATTTC
EF   yxjH      ->  40  CGTTA  51  AAAAAAGGTGGTACCGCGATAA-----------TAATCGCCCTTTTACTAGTTACGGCTAGTAAAAGGGCGTTTTTTTATAAA

Amino acid transporters
CA   yckK      ->  38  CGTTA  57  AATTAGAGTGGTACCGTGGAATT-------CAACTTCTGCCTCTAACTATGAGGATAGAAGTTTTTTGTTTTTAT
DF   yqiX      ->  41  CCTTA  30  AAAAAGAGTGGTAACGCGGATAT----------AATTCGTCTCTTAGCTGTAAAGCTAAGGGACTTTTTTGATTTA
HD   BH0807    ->  74  TGTTA  56  AACTGGGGTGGCACCACGACAAG----------TGATCGTCCCCAAGACTTTTATCAGTCTTGGGGACGTTTTTTTGTTCAT
EF   yheL      ->  8   AATTA  33  AATTAAGGTGGTACCGCGGAGA-----------GATTCGTCCTTATTCTTTAAGGATGAATCTCTCTTTTTATGTAGC
BQ   ykbA      ->  46  CGTTA  45  AACAAGGGTGGAACCACGAATAT--------AACACTCGTCCCTTTTTTAGGGAGGAGTGTTTTTTTATT
BQ   sdt2      ->  40  CGTTA  56  AATTGAGGTGGTACCACGGTATTAACATTACATATATCGTCCTCTACATGCATATTTGCGTGTAGGGGACTTTTTTATTTTC
EF   yusC      ->  42  CGTTA  60  AATTAAGGTGGTATCACGAAATGA-----CAAACTTTCGTCCTTTTTGCTGTAATAGCAAAAGGATGGAAGTTTTTTTGTTT
CA   yhaG      ->  48  CGTTA  51  AATTTAGGTGGTACCGCGGAAGT---------ATCTCCGTCCTAATTAATAAGATTAGGGCGGAGTTTTTTATTTGC
BQ   brnQ      ->  44  CGTTA  66  AATTAGGGTGGTATCGCGGGTAAA------TATAACTCGTCCCTTTCTTTAGGGACGAGTTTTTTGTGTTCTT
     REF01723  ->  44  CGTTA  55  AATTGAGGTGGCACCACGAATGC----------GATTCGTCCTCTTGGCTCACAGCCAAGAGGCTTTTTTGTTTTTTTAATA
BS   yvbW      ->  56  CGTTA  32  AACAAGAGTGGTACCGCGGTCAGC--CGAAGGCTCGTCGTCTCTTTATCTATTAGATTAGGTAGGAGACGGCGGGCTTTTTT
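The conserved T-box core (the TGG marked on the slide) can be read off the alignment column by column. A minimal sketch follows, using the first 16 columns of the five top aminoacyl-tRNA synthetase rows (serS, tyrZ, trpS, aspS, valS). The helper name is hypothetical, and full conservation across five rows is only an illustration, not the criterion used to build the alignment.

```python
def conserved_columns(rows):
    """Indices of columns where every row carries the same non-gap base."""
    width = min(len(row) for row in rows)
    return [
        i for i in range(width)
        if rows[0][i] != "-" and all(row[i] == rows[0][i] for row in rows)
    ]


# First 16 alignment columns of five rows copied from the slide.
T_BOX_ROWS = [
    "AAATAGGGTGGCAACG",  # SA  serS
    "AGGTAAGGTGGTAACA",  # DHA tyrZ
    "AATTGAGGTGGTACCG",  # ST  trpS
    "ATAAAGGATGGCACCG",  # CA  aspS
    "AATTAAGGTGGTAACG",  # DF  valS
]

columns = conserved_columns(T_BOX_ROWS)
print(columns)
# Bases found at the fully conserved columns, in order.
print("".join(T_BOX_ROWS[0][i] for i in columns))
```

With more rows, a per-column frequency threshold would be more informative than strict identity.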
… continued (in the 5’ direction)
specifier hairpin
===>
==>
===>
anti-anti
(specifier)
codon
<=== <==
SC<===
Aminoacyl-tRNA synthetases
SA   SERS      SER  ---GTAGGACAAGTA  19  AGAGAGCTTGTGGTT---AGTGTGAACAAG--      15  GAA--TCTACCTACTT   ->
DHA  tyrZ      Tyr  ----AAGAACAAGTA  18  AGAAAGTTGCCGGCT---GATGAGAGGCGCTT      18  GAA--TACCTCTTTGA   ->
ST   trpS      Trp  ---ATTAGAAGAGTA  16  AGAGAGTTAGTGGTT---GGTGCAAGCTAAC       12  GAAA-TGGACTAATGA   ->
CA   ASPS      ASP  -----GAGAAAAGTA  18  AGCGAATTGGGAAAT---GGTGTGAGCCCAA       15  GAAA-GACATCTCGGA   ->
DF   VALS      VAL  -GAAGAAGAGGAGTA  16  AGAGAGGAAAAT TCACTG GCTGTAAGATTTTC    17  GAAT-GTAGCTTTGGA   ->
PN   THRS      THR  ----AGAGACAAGTC  18  AGAGAGTGCGTGGTT---GCTGGAAACGCAT       14  GAT--ACTACTCTTGA   ->
MN   ileS      Ile  ----CAAAAACACAA  17  AGCGAATAGGTGAT----GGTGTAAGACCTAT T    18  -----ATCATTTTGTT   ->
DF   leuS      Leu  ----CTAGAGCAGTA  19  AGAGGAAGTGGAA-----GGTGAGAACTAAT ATT   10  GAA--CTTACTAGATT   ->
HD   ARGS      ARG  -----TGGGAGAGTA  20  AGCGAGTCGGGAT-----GGTGGGAGCCGAT       14  GAAA-CGCACCCATGA   ->
DF   proS      Pro  ---AAAGAAATAGTA  18  AGAGAGAAAACGGT----GGTGAGAGTTTTC -     14  GAA--CCTGTCTTTTA   ->
ZC   lysS      Lys  ---AAGAGAAGAGTA  19  AGAGAGCTCTGGTA----GCTGAGAAAGAGC-      15  GAAAAAAGACTTGGAG   ->
BQ   metS      Met  ---AAAGGAAAAGTA  19  AGAGAGCTTCGGTA----GCTGAGAAGAAGC-      14  GAACAATGGCCTTTGA   ->
MN   pheS      Phe  ----TGAGATTAGTA  18  AGGGAATGCGGGGCGTG-ACTGGAAACCCGC       16  GAA--TTCACTCAGAA   ->
MN   glyQ      Gly  ---AGAAAGAGAGTT  15  AGCGAACCTGAGAG----AGTGTAAGTCAGGT      14  GACT-GGCACTTTCT C  ->
ST   alaS      Ala  -AGTTAAGAATTGTT  17  AGAAAAGTGACGGTT---GCTGCGAGTCATT -     17  -----GCTACTTAACT   ->

Amino acid biosynthetic genes
SA   trpE      Trp  TCTAAAGAAATAGTA  22  AGAAAGCTAATGGGT---GATGGGAATTAGC -     14  GAAT-TGGACTTTGGA   ->
BS   ilvB      Leu  ---TGAGGATAAGTA  20  AGAGAACCGGGTTA----GCTGAGAACCGG--      16  GAA--CTCGCCTCAGA   ->
CA   ilvC      Val  -----AGGAAGAGTA  17  AGAGAGTGAGATACT---GGTGGGAACTCAT-      13  GAAG-GTAGCCTTTGA   ->
BQ   asnA      Asn  --AGGACGAGTAGTA  15  AGCGAGTCAGGGGT----GGTGTGAGCCTGA-      15  GAAG-AACCTCCTGGA   ->
BS   proB      Pro  -----AGGATTAGTA  18  AGAGAGCAAAATG AACC-GCTGAAACATTTTGC    15  GAA--CCTGCCTTGGA   ->
SA   cysE      Cys  --CGAAGGATTAGTA  18  AGAGAGTGTACGGTT---GCTGTGAGTACA--      14  GAA--TGCACCTTCG T  ->
MN   hisC      His  -----AGAGAAAAAA  16  AGAGAGTATGGGAA----GCTGAAAACATAC-      15  -----CACATTCTTGA   ->
DHA  pheA      Phe  -----AAAGAGAGCA  19  AGGGAACTAAAG TCGGAG ACTGAAAGCTTTAGT   14  GAGA-TTCACTCTGGA   ->
HD   serA      Ser  ----GAAGATGAGGA  17  AGAGAGCTGGTGGTT---GCTGTGAACCAGCT      18  -----AGCCCTTCTGA   ->
BQ   phhA      Tyr  AGAATCGCAGTAGTA  17  AGAGAGCTAATGGTC---GGTGGAAATTGGC -     14  GAAT-TACAATTCTGG   ->
EF   yxjH      Met  -----TAGGAAAGTA  17  AGAGAGACTTTGGTT---GGTGAAAAAAGTT --    13  GAAAAATGGCCTAGGA   ->

Amino acid transporters
CA   yckK      Cys  ----AAGAACCAGTA  17  AGAGAAAAATCTC CAAG-GCTGAAAGGGATTTT    15  GAA--TGCATCTTTGA   ->
DF   yqiX      Arg  -----AGAGAAAGTA  16  AGCGAGTTAGGGGTT---GGTGTAAGCCTAGC      14  GAAG-AGAGCTCTGGA   ->
HD   BH0807    Lys  ----AGAGAAGAGTA  19  AGAAAGCCTGTAGTT---GCTGAGAACGGGT -     14  GAAGCAAGACTCTGAG   ->
EF   yheL      Tyr  -TTATTAGCCCAGTA  19  AGAAAGTCGATGGTT---GCTGCGAATCGAT -     13  GAAT-TACACTAATAA   ->
BQ   ykbA      Thr  --GAGGACACGATCA  16  AGAGAGGGAAGC CTTTG-GCTGTGAGCTTCCT     14  GATT-ACCACCTCTGA   ->
BQ   sdt2      Trp  ---GCAAGAAGAGTA  18  AGAGAGCTGGGGGAA---GGTGTGAGCCCGGT      15  GAA--TGGGCTTGCGA   ->
EF   yusC      Met  ----AAAGAAGAGTA  18  AGAGAGCCCTGTTT----GCTGAGAATGGG--      16  GAAG-ATGGTCTTTGA   ->
CA   yhaG      Trp  ----AAGGAAGAGTA  18  AGAGAGCTGAGGGT----GGTGTGATCTCAGT      15  GAA--TGGACCTTTTA   ->
BQ   brnQ      Ile  ----GAGAACGAGTA  19  AGAGAGTTGGCGATTT--GCTGAAAGCCAAC -     15  GAAA-ATCATCTCCGA   ->
     REF01723  His  --TTAGGACATAGTA  18  AGAGACTTTTTCATTG--GCTGAAAGAAAAAG      17  -----CACACCTAAAA   ->
BS   yvbW      Leu  -----GGGAGCAGTA  18  AGAGAGCTGCGGGGT---GGTGCGACGCAGC --    13  GAA--CTCGCCCGGGA   ->
Why T-boxes?
• May be easily identified
• In most cases functional specificity may be
reliably predicted by the analysis of
specifier codons (anti-anti-codons)
• Sufficiently long to retain phylogenetic
signal
• Thus T-boxes are a good model of
regulatory evolution
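The specificity prediction mentioned above amounts to translating the specifier codon. A minimal sketch follows, using the standard genetic code. The function name is hypothetical, and the codon position (columns 6 to 8 of the segment just upstream of the arrow) is an assumption read off the alignment rows on the earlier slide, not a stated rule.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}


def predict_specificity(codon):
    """Amino acid encoded by a specifier codon (DNA or RNA letters)."""
    return THREE_LETTER[CODON_TABLE[codon.upper().replace("U", "T")]]


# Assumption: the specifier codon occupies these columns of the segment.
SPECIFIER_COLUMNS = slice(5, 8)

# Segments copied from the alignment rows of the named genes.
EXAMPLE_SEGMENTS = {
    "serS": "GAA--TCTACCTACTT",
    "trpS": "GAAA-TGGACTAATGA",
    "ileS": "-----ATCATTTTGTT",
    "lysS": "GAAAAAAGACTTGGAG",
}

for gene, segment in EXAMPLE_SEGMENTS.items():
    codon = segment[SPECIFIER_COLUMNS]
    print(gene, codon, predict_specificity(codon))
```

For an aminoacyl-tRNA synthetase gene the predicted amino acid can be compared with the gene name as a sanity check; for transporters and uncharacterized genes it is the prediction itself.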
~800 T-boxes in ~90 bacteria
• Firmicutes
– aa-tRNA synthetases
– enzymes
– transporters
– all amino acids excluding glutamate (lysine and glutamine are rare)
• Actinobacteria (regulation of translation)
– branched chain (ileS)
– aromatic (Atopobium minutum)
• Delta-proteobacteria
– branched chain (leu – enzymes)
• Thermus/Deinococcus group (aa-tRNA synthetases)
– branched chain (ileS, valS)
– glycine
• Chloroflexi, Dictyoglomi
– aromatic (trp – enzymes)
– branched chain (ileS)
– threonine
Double and one-and-a-half T-boxes
• TRP: trp operon (Bacillales, C. beijerinckii, D. hafniense)
• TYR: pah (B. cereus)
• THR: thrZ (Bacillales); hom (C. difficile)
• ILE: ilv operon (B. cereus)
• LEU: leuA (C. thermocellum)
• ILE-LEU: ilvDBNCB-leuACDBA (Desulfotomaculum reducens)
• TRP: trp operon (T. tengcongensis)
• PHE: arpLA-pheA (D. reducens, S. wolfei)
• PHE: trpXY2 (D. reducens)
• PHE: yngI (D. reducens)
• TYR: yheL (B. cereus)
• SER: serCA (D. hafniense)
• THR: thrZ (S. uberis)
• THR: brnQ-braB1 (C. thermocellum)
• HIS: hisXYZ (Lactobacillales)
• ARG: yqiXYZ (C. difficile)
Predicted regulation of translation:
ileS in many Actinobacteria
• Instead of the terminator, the sequester hairpin
(hides the translation initiation site)
• Same mechanism regulates different processes –
cf. riboswitches
Same enzymes – different regulators
(common part of the aromatic amino acid biosynthesis pathway)
PEP
E4P
aroA
aro:
Regulated by TYR (BC)
Regulated by PHE (SWO, DRE, HMO, CH, MTH, CTH)
Regulated by TRP (DE, DEH)
DAHP
aroB
aroC
aroD
SHIKIMATE
aroI
aroE
aroF
pabA
pabB
CHORISMATE
aroA
trpE
pheB
aroH
trpG
ADC
FOLATE
ANTHRANILATE
tyrA
hisC
aspB
trpDCFBA
kynurenine
pathway
TRP
yhaG
TRP
TYR
PHE
phhA
TRP trpXYZ
TRP\PHE yocR family
TYR yheL
cf. E. coli: aroF, aroG, aroH:
feedback inhibition by TRP, TYR, PHE;
transcriptional regulation by TrpR, TyrR
Recent duplications and bursts:
ARG-T-box in Clostridium difficile
LR_ARGS
CPE_ARGS
CAC_ARGS
CB_ARGS
CBE_ARGS
Lactobacillales
CTC_ARGS
LP_ARGS
LME_ARGS
Clostridiales
argS
argS
LJ_ARGS
CDF_YQIXYZ
LGA_ARGS
RDF02391
PPE_ARGS
LSA_ARGS
CDF_ARGC
BC_ARGS2
EF_ARGS
BH_ARGS
CDF_ARGH
Bacillales
argS
: ARG-specific T-box regulatory site
yqiXYZ
NEW
NEW
aminoacyl-tRNA synthetase
biosynthetic genes
amino acid transporters
Clostridium
difficile
RDF02391
argCJBDF
argH
others
argG
predicted
amino acid
transporters
amino acid
biosynthetic
genes
Gram+ bacteria:
AhrC regulatory protein
(negative regulation of arginine metabolism,
positive regulation of arginine catabolism)
Binds to the 5’ UTR of a gene and regulates its expression
5’ ... AhrC site
Clostridium difficile:
AhrC is lost
Expansion of the T-box regulon:
expression of arginine biosynthetic
and transport genes is regulated by
T-box antitermination
Other clostridia spp.
(CA, CTC, CTH, CPE, CB, CPE)
yqiXYZ
yqiXYZ
argC
argH
argC
argH
argG
: AhrC binding site
: ARG-specific T-box regulatory site
More duplications: THR-T-box in C. difficile
Bacillales
thrS
BE_THRS BC_THRS
BCE_BRNQ2
BH_THRS
BL_THRZ
BS_THRZ*
BC_HOM
thrZ
hom
B. cereus
thrCB
BCL_THRZ*
BC_THRZ*
BC_THRZ
brnQ
LMO_THRS
BCL_THRS
BL_THRS
BS_THRS
BCL_THRZ
PPE_THRS
LB_THRS
LJ_THRS
LP_THRS
TR_THRZ
Lactobacillaceae
Leuconostocaceae
thrS
EX_THRS
BS_THRZ
thrZ
CBE_THRZ CTH_THRZ
CPE_THRS
CDF_THRZ
CAC_THRZ
HMO_YNGI
OOE_THRS
CTE_THRZ
CDF_THRC
CDF_HOM
CDF_HOM*
TTE_THRZ
Clostridiales
thrS
CB_THRZ
CBE_THRS
MFL_THRS
MMY_THRS
LME_THRS
thrZ
SA_THRS
CTC_BRNQ1
SPY_THRS
SEQ_THRS
SUB_THRS
SMU_THRS
Streptococcaceae
thrS
hom
SAG_THRS
thrCB
C. difficile
SMI_THRS
SPN_THRS
brnQ
SG_THRS
STH_THRS
LL_THRS
SUI_THRS
: THR-specific T-box regulatory site
aminoacyl-tRNA synthetase
biosynthetic genes
amino acid transporters
others
CH_HISS
Bacillales
Other Gram+
hisS aspS
CTH_HISS
Lactobacillales
ASP\ASN
his operon
DRE_HISS
HIS
TTE_HISS
ASP
GAC
his XYZ
PL_HISS
Rapid mutation
of regulatory codons
NEW
BE_HISS
ASN
AAC
BL_HISS
BS_HISS
BC_HISS
LRE_HISXYZ
LSA_HISXYZ
OOE_HISXYZ SGO_HISC
SMU_HISC
Z
XY
HIS
_
LP
EF_HISXYZ
OB_HISS
Duplications
and changes in
specificity:
ASN/ASP/HIS
T-boxes
BCL_HISS
HIS
BH_HISS
EX_HISS
LME_HISXYZ
CDF_HISZX
EF_HISS
LMO_HISXYZ
EF_HISXYZ
LME_HIS(Z\G)
LL_HISC
LP_HISZ
Clostridiales
CPE_ASNS2
CDF_ASNA
CB_ASNS2
CDF_ASNS2
CTC_ASNA
asnS
ASN
LCA_HISZ
CB_ASNS3
CAC_ASNS32
asnA
BC_ASNS2
BC_ASNA
ASN
CBE_ASNS2
P. pentosaceus
asnS
CTC_ASNS2
CPE_ASNA
ASP
PPE_HISXYZ
Lactobacillales
hisS aspS
PPE_ASNS
EX_ASNA
LCA_HISS
ASP
hisXYZ
HIS
LB_ASNA
LB_ASNS2
LJ_HISS
LP_ASNA
PPE_ASNA
Lactobacillales
asnS
ASN
LB_HISS
asnA
LRE_ASPS
LP_HISS PPE_HISS
L. reuteri
aspS
ASP
hisS
HIS
LRE_HISS
ASN
LJ_ASNA
L. johnsonii
asnA
LJ_glnQHMP
LD_ASNA
ASN
glnQHMP
ASP
SG_ASPS2 SMU_ASPS2
Blow-up
LCA_HISS
LJ_HISS
PPE_HISXYZ
PPE_ASNS2
LB_HISS
LRE_ASPS
LB_ASNA
LP_HISS PPE_HISS
PPE_ASNA
LP_ASNA
LRE_HISS
ASN
AAC
HIS
CAC
P. pentosaceus
asnS
ASP
LJ_ASNA
hisXYZ
LJ_GLNQHMP
ASP
ASN
AAC
HIS
CAC
GAC
ASP
GAC
Lactobacillales
Lactobacillales
asnA
hisS aspS
ASN
ASP
L. reuteri
L. johnsonii
aspS
hisS
HIS
LD_ASNA
ASP
disruption of hisS-aspS operon
mutation of regulatory codon
asnA
ASN
glnQHMP
ASP
HIS
Duplications and changes in specificity :
branched-chain amino acids
Firmicutes
leuS
LEU
LEU
Bacillales
PL_ILVB
Ilv operon
BH_ILVB
C. thermocellum
LEU
B. cereus
148_0001
.......
YOCR3
LEU
LEU
δ-proteobacteria
Clostridium difficile
Desulfitobacterium
hafniense
BS_ILVB
BE_ILVB
BL_ILVB
Oceanobacillus
iheyensis
CDF_LEUA
OB1271
B. subtilis
B. licheniformis
CPE_LEUS
LEU
CTC_LEUS
GSU_LEUA
BS_LEUS
029_0008
LEU
CBE_LEUS
_L
EU
yvbW
Syntrophomonas
wolfei
A
BCL_ILVB
DH
A
.......
DF_LEUS
TTE_LEUS
DTH_ILVB
leu operon
LEU
CTH_148_0001
CB_LEUS
CA_LEUS
BL_LEUS
LEU
LP_BRNQ1_ile
BCL_LEUS
BH_LEUS
BC_LEUS
BE_LEUS
OB_LEUS
SWO_029_0008
Firmicutes
US
LE
O_
LP_LEUS
LCR_ILES
LL_ILES
LM
SW
DRE_070_0004
O_
CH_LEUS
LE
US BS_YVBW
BL_YVBW
LSA_LEUS
EX_LEUS
Firmicutes
DAC_LEUA
ileS
OB_ILVB
LJ_LEUS
LGA_LEUS
valS
VAL
LB_LEUS
ILE
SPY_ILES
SZ_ILES
SEQ_ILES
EF_LEUS
BC_YOCR3
STH_ILES
PPE_LEUS
OB1271
C. acetobutylicum
OOE_LEUS
SMU_ILES EF_ILES
LP3666
VAL
DG_VALS
SG_ILES
SAG_ILES
ilvC
CA_ILVC
SA_VALS
BE_VALS
CTH_VALS CH_VALS BH_VALS
Ilv operon2
SMI_ILES
SP_ILES
SOB_ILES
ILE
LME_ILES
Ilv operon2
BC_VALS
EX_VALS BCL_VALS
HMO_VALS
E_
VA
LS
LJ_OPP
CPE_ILES
CB_ILES
DF_VALS
CTC_VALS CBE_VALS
LJ_VALS
VAL
CB_VALS
PPE_ILES
CAC_VALS
LS
VA
A_
LS
LL_VALS
LCR_VALS
brnQ
ILE
LMO_ILES
VAL
DF_ILES
EX_ILES
BC_YBGE*
BC_YBGE
LR_VALS
Lactobacillus casei
Lactobacillus plantarum
brnQ
CTC_ILES
LD_VALS
LME_VALS
Lactobacillaceae
Clostridiaceae
Bacillus cereus
TTE_ILES
CP
DHA_VALS
PPE_VALS
EF_VALS
LCA_BRNQ2_ile
LRE_BRNQ_ile
BL_VALS
IlvCB
ILE
LP_BRNQ2_val
LSA_ILES
BS_VALS
TTE_VALS
LP_VALS
OB_ILES
ILE
LRE_3666_1
BC_ILES
CPE_BRNQ
CTC_BRNQ2
LP_ILES
BCE_BRNQ1
CAC_BRNQ
CTH_ILES
LR_LEUS
HMO_ILVB
BS_ILES
BL_ILES
BC_ILES2
ILE
VAL
ATC
CTC
GTC
T-box duplication and mutation
of regulatory codon
BCL_ILES
BH_ILES
CTC_BRNQ1 CDF_ILVC
BC_ILVB
Lactobacillales
lp3666
DHA_ILES
BE
CH_ILES
ILE
_IL
Desulfotomaculum reducens
Ilv operon
ES
OOE_ILES
Lactobacillus johnsonii
opp
LRE_3666_2
DRE_ILES
CH_YBGE
ILE
LEU
HMO_ILES
ILE
DRE_ILVD*_leu
Lactobacillus reuteri
panE
ILE
DRE_ILVD_ile
IlvBN
ILE
.......
LCA_BRNQ1_val
LJ_BRNQ_ile
DRE_VALS
.......
C. difficile
ILE
LB_ILES
OOE_LP3666
LRE_PANE
Heliobacillus mobilis
Ilv operon
ILE
.......
LJ_ILES
LD_ILES SA_ILES
LMO_VALS
Carboxydothermus
hydrogenoformans
B. cereus
SUB_ILES
Recent T-box duplication and mutation
of regulatory codon
ILE
CTC
ATC
LEU
ATC
CTC
Blow-up
transporter:
ATC
GTC
dual
regulation of
common
enzymes:
ATC
CTC
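The specifier codons named on these slides differ from one another by single substitutions, which is why a duplicated T-box can change specificity quickly. A minimal sketch that enumerates the one-substitution pairs is given below. The function names are hypothetical and the codon-to-amino-acid pairings follow the standard genetic code. Pairs that cross the two families (for example ASN and ILE) are plain arithmetic and not a claim made on the slides.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length codons differ."""
    if len(a) != len(b):
        raise ValueError("codons must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Specifier codons that appear on the duplication slides.
SPECIFIER_CODONS = {
    "ILE": "ATC", "LEU": "CTC", "VAL": "GTC",
    "ASN": "AAC", "ASP": "GAC", "HIS": "CAC",
}


def one_step_switches(codons):
    """Pairs of specificities whose codons differ by one substitution."""
    return [
        (a, b)
        for a, b in combinations(sorted(codons), 2)
        if hamming(codons[a], codons[b]) == 1
    ]


for first, second in one_step_switches(SPECIFIER_CODONS):
    print(first, SPECIFIER_CODONS[first], "<->", second, SPECIFIER_CODONS[second])
```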
Summary / History
Regulation of iron homeostasis in α-proteobacteria
[- Fe]
[+Fe]
[ - Fe]
[+Fe]
RirA
RirA
Irr
Irr
FeS
heme
degraded
Siderophore
uptake
Fe2+ / Fe3+
uptake
Iron uptake systems
Fur
[- Fe]
Iron storage
ferritins
FeS
synthesis
Heme
synthesis
Iron-requiring
enzymes
[iron cofactor]
Fur
IscR
Fe
FeS
Transcription
factors
FeS status
of cell
[+Fe]
Experimental studies:
• FUR/MUR: Bradyrhizobium, Rhizobium and Sinorhizobium
• RirA (Rrf2 family): Rhizobium and Sinorhizobium
• Irr (FUR family): Bradyrhizobium, Rhizobium and Brucella
Distribution of
transcription
factors in
genomes
Regulation of genes
in functional
subsystems
Rhizobiales
Bradyrhizobiaceae
Rhodobacterales
The Zoo (likely
ancestral state)
Reconstruction of history
Frequent
co-regulation
with Irr
Strict division
of function
with Irr
Appearance of the
iron-Rhodo motif
• Andrey Mironov (software):
– genome analysis
– conserved RNA patterns
• Ekaterina Panina (now at UCLA, USA)
– zinc and ribosomes
• Alexey Vitreschak
– T-boxes
• Dmitry Rodionov
– iron homeostasis
• Support:
– Howard Hughes Medical Institute
– INTAS
– Russian Fund of Basic Research
– Russian Academy of Sciences (“Molecular and Cellular Biology”)