The biotin biosynthesis


RNA regulatory structures
RNA genes:
• sRNAs: CsrB/RsmB , CsrC, DsrA, GadY,
MicC, OxyS, RyhB, RydC, etc
• Antisense RNAs: CopA, DicF, MicF, RNAI,
QaRNA etc
Cis-acting UTR regulatory RNAs: riboswitches, T-boxes, attenuators, IREs, etc.
sRNAs
DsrA RNA
Regulation of rpoS:
• Overcoming transcriptional silencing
• Promoting translation
E. coli, Salmonella spp., Shigella spp.
sRNAs
CsrB/RsmB RNA family
RNA binds to approximately 18
copies of the CsrA protein
negative effect: glycogen
biosynthesis, gluconeogenesis,
glycogen catabolism
positive effect: glycolysis
conserved motif CAGGXX
enterobacteria
sRNAs
PrrB/RsmZ RNA family
Pseudomonas spp.
5'-AGGA-3' repeats in loops
RNA possibly interacts with
a CsrA-like protein
Involved in regulation of 2,4-diacetylphloroglucinol (Phl) and
hydrogen cyanide (HCN) production
sRNAs
GadY
GadY interacts with the 3'
UTR of the gadX mRNA,
which increases the stability of the
transcript
E. coli, Salmonella spp., Shigella spp.
RydC RNA
RydC is known to bind the protein Hfq
The Hfq/RydC complex causes degradation
of the target mRNA
E. coli, Salmonella spp.
Antisense RNAs
CopA-like RNA
RNAs regulate plasmid copy number
four-way inhibition junction structure
copA-mRNA copT
MicF RNA
regulates ompF expression by inhibiting
translation and inducing degradation
UTR RNA regulatory elements
Mediators of regulation:
Ribosomes (transcription attenuation)
Repressor/Activator proteins (feedback inhibition of gene
translation/splicing, antitermination (bgl), IREs (regulation of
translation/mRNA stability), etc)
Uncharged tRNA (T-boxes)
Small molecules (various riboswitch regulatory elements)
Alternative RNA structures in transcription termination
Antitermination
Termination
(anti-antitermination)
Attenuation of transcription (Yanofsky).
Prediction of attenuators:
Amino acid biosynthesis (branched amino acids (ILE, LEU, VAL),
histidine, threonine, tryptophan, and phenylalanine)
(gamma- and alpha-proteobacteria, in some cases low-GC Gram-positive
bacteria, Thermotogales and Bacteroidetes/Chlorobi)
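One signal of a classical attenuator is a short leader peptide ORF with consecutive codons for the regulating amino acid. The sketch below illustrates only that signal. The helper name `regulatory_codon_runs` and the toy leader sequence are hypothetical, and this is not the prediction procedure used for the results on these slides, which would also have to evaluate the alternative hairpins.

```python
# Codons for a few regulatory amino acids (standard genetic code, DNA alphabet).
CODONS = {
    "Trp": {"TGG"},
    "His": {"CAT", "CAC"},
    "Thr": {"ACT", "ACC", "ACA", "ACG"},
    "Ile": {"ATT", "ATC", "ATA"},
}


def regulatory_codon_runs(orf, amino_acid, min_run=2):
    """Return (codon_index, run_length) for runs of codons encoding amino_acid."""
    # Split the ORF into complete in-frame codons, ignoring a trailing partial codon.
    codons = [orf[i:i + 3] for i in range(0, len(orf) - len(orf) % 3, 3)]
    wanted = CODONS[amino_acid]
    runs, start = [], None
    # The "---" sentinel never matches, so it closes a run that reaches the end.
    for idx, codon in enumerate(codons + ["---"]):
        if codon in wanted:
            if start is None:
                start = idx
        elif start is not None:
            if idx - start >= min_run:
                runs.append((start, idx - start))
            start = None
    return runs


toy_leader = "ATGAAAGGTTGGTGGCGCACCTGA"  # hypothetical leader ORF, not real data
print(regulatory_codon_runs(toy_leader, "Trp"))  # [(3, 2)]
```

The toy ORF has two tandem Trp codons at codon positions 3 and 4, so a single run of length 2 is reported. The lone Thr codon is ignored because it is shorter than `min_run`.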
Three new histidine transporters were predicted:
• ortholog of BS yuiF and yvsH
• from the lysQ/lysP family
• HI0325 (Haemophilus influenzae)
E. coli: three aspartate kinase
isozymes, ThrA, MetL and LysC
thrA: ILE-THR attenuator
metL: MetJ
lysC: LYS-element
Pasteurellales (two aspartate
kinase isozymes):
thrA THR-MET-ILE attenuator
LysC: LYS-element
Detection of 5’ UTR RNA-elements
The RNApattern program:
RNA pattern:
• consensus motifs
• RNA secondary structure:
  - number of helices
  - length of each helix
  - loop lengths
  - parameters of topology and distance between pairs of helices
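A pattern of this kind combines a sequence consensus with constraints on helices and loops. The sketch below shows the general idea for the simplest case: one consensus motif followed, within a bounded distance, by one hairpin of fixed stem length and bounded loop length. It is not the RNApattern program. All function names, parameters and the toy sequence are assumptions made for illustration, and input is expected in the RNA alphabet (A, C, G, U).

```python
import re

# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def is_helix(left, right):
    """True if `left` pairs with `right` read in antiparallel orientation."""
    return len(left) == len(right) and all(
        (a, b) in PAIRS for a, b in zip(left, reversed(right))
    )


def find_hairpins(rna, stem_len, loop_min, loop_max):
    """Yield (start, loop_len) for every hairpin with the given stem and loop sizes."""
    for start in range(len(rna)):
        for loop_len in range(loop_min, loop_max + 1):
            end = start + 2 * stem_len + loop_len
            if end > len(rna):
                break
            left = rna[start:start + stem_len]
            right = rna[end - stem_len:end]
            if is_helix(left, right):
                yield start, loop_len


def search(rna, motif_regex, stem_len, loop_min, loop_max, max_gap):
    """Return (motif_start, hairpin_start, loop_len) for motif + downstream hairpin."""
    hairpins = list(find_hairpins(rna, stem_len, loop_min, loop_max))
    hits = []
    for m in re.finditer(motif_regex, rna):
        for start, loop_len in hairpins:
            # The hairpin must begin at most max_gap nucleotides after the motif.
            if 0 <= start - m.end() <= max_gap:
                hits.append((m.start(), start, loop_len))
    return hits


toy = "AAGGUGGAACCGGGCUUCGGCCCAAA"  # hypothetical sequence, not real data
hits = search(toy, "GGUGG", stem_len=4, loop_min=3, loop_max=6, max_gap=5)
print((2, 11, 4) in hits)
```

In the toy sequence the motif GGUGG starts at position 2, and the stem GGGC/GCCC with loop UUCG starts at position 11, so the triple (2, 11, 4) is among the hits. A real pattern would add several helices, mismatch tolerances and distances between pairs of helices, as the list above describes.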
Partial alignment of predicted T-boxes
specifier hairpin
===>
==>
===>
<=== <==
SC<===
SA
DHA
ST
CA
DF
PN
MN
DF
HD
DF
ZC
BQ
MN
MN
ST
SERS
tyrZ
trpS
ASPS
VALS
THRS
ileS
leuS
ARGS
proS
lysS
metS
pheS
glyQ
alaS
SER
Tyr
Trp
ASP
VAL
THR
Ile
Leu
ARG
Pro
Lys
Met
Phe
Gly
Ala
---GTAGGACAAGTA
----AAGAACAAGTA
---ATTAGAAGAGTA
-----GAGAAAAGTA
-GAAGAAGAGGAGTA
----AGAGACAAGTC
----CAAAAACACAA
----CTAGAGCAGTA
-----TGGGAGAGTA
---AAAGAAATAGTA
---AAGAGAAGAGTA
---AAAGGAAAAGTA
----TGAGATTAGTA
---AGAAAGAGAGTT
-AGTTAAGAATTGTT
19
18
16
18
16
18
17
19
20
18
19
19
18
15
17
AGAGAGCTTGTGGTT---AGTGTGAACAAG--AGAAAGTTGCCGGCT---GATGAGAGGCGCTT
AGAGAGTTAGTGGTT---GGTGCAAGCTAACAGCGAATTGGGAAAT---GGTGTGAGCCCAAAGAGAGGAAAATTCACTGGCTGTAAGATTTTC
AGAGAGTGCGTGGTT---GCTGGAAACGCATAGCGAATAGGTGAT----GGTGTAAGACCTATT
AGAGGAAGTGGAA-----GGTGAGAACTAATATT
AGCGAGTCGGGAT-----GGTGGGAGCCGATAGAGAGAAAACGGT----GGTGAGAGTTTTC-AGAGAGCTCTGGTA----GCTGAGAAAGAGC-AGAGAGCTTCGGTA----GCTGAGAAGAAGC-AGGGAATGCGGGGCGTG-ACTGGAAACCCGCAGCGAACCTGAGAG----AGTGTAAGTCAGGT
AGAAAAGTGACGGTT---GCTGCGAGTCATT-
15
18
12
15
17
14
18
10
14
14
15
14
16
14
17
GAA--TCTACCTACTT
GAA--TACCTCTTTGA
GAAA-TGGACTAATGA
GAAA-GACATCTCGGA
GAAT-GTAGCTTTGGA
GAT--ACTACTCTTGA
-----ATCATTTTGTT
GAA--CTTACTAGATT
GAAA-CGCACCCATGA
GAA--CCTGTCTTTTA
GAAAAAAGACTTGGAG
GAACAATGGCCTTTGA
GAA--TTCACTCAGAA
GACT-GGCACTTTCTC
-----GCTACTTAACT
->
->
->
->
->
->
->
->
->
->
->
->
->
->
->
Amino acid
biosynthetic
genes
SA
BS
CA
BQ
BS
SA
MN
DHA
HD
BQ
EF
trpE
ilvB
ilvC
asnA
proB
cysE
hisC
pheA
serA
phhA
yxjH
Trp
Leu
Val
Asn
Pro
Cys
His
Phe
Ser
Tyr
Met
TCTAAAGAAATAGTA
---TGAGGATAAGTA
-----AGGAAGAGTA
--AGGACGAGTAGTA
-----AGGATTAGTA
--CGAAGGATTAGTA
-----AGAGAAAAAA
-----AAAGAGAGCA
----GAAGATGAGGA
AGAATCGCAGTAGTA
-----TAGGAAAGTA
22
20
17
15
18
18
16
19
17
17
17
AGAAAGCTAATGGGT---GATGGGAATTAGC-AGAGAACCGGGTTA----GCTGAGAACCGG--AGAGAGTGAGATACT---GGTGGGAACTCAT-AGCGAGTCAGGGGT----GGTGTGAGCCTGA-AGAGAGCAAAATGAACC-GCTGAAACATTTTGC
AGAGAGTGTACGGTT---GCTGTGAGTACA--AGAGAGTATGGGAA----GCTGAAAACATAC-AGGGAACTAAAGTCGGAGACTGAAAGCTTTAGT
AGAGAGCTGGTGGTT---GCTGTGAACCAGCTAGAGAGCTAATGGTC---GGTGGAAATTGGC-AGAGAGACTTTGGTT---GGTGAAAAAAGTT--
14
16
13
15
15
14
15
14
18
14
13
GAAT-TGGACTTTGGA
GAA--CTCGCCTCAGA
GAAG-GTAGCCTTTGA
GAAG-AACCTCCTGGA
GAA--CCTGCCTTGGA
GAA--TGCACCTTCGT
-----CACATTCTTGA
GAGA-TTCACTCTGGA
-----AGCCCTTCTGA
GAAT-TACAATTCTGG
GAAAAATGGCCTAGGA
->
->
->
->
->
->
->
->
->
->
->
Amino acid
transporters
CA yckK
DF yqiX
HD BH0807
EF yheL
BQ ykbA
BQ sdt2
EF yusC
CA yhaG
BQ brnQ
REF01723
BS yvbW
Cys
Arg
Lys
Tyr
Thr
Trp
Met
Trp
Ile
His
Leu
----AAGAACCAGTA
-----AGAGAAAGTA
----AGAGAAGAGTA
-TTATTAGCCCAGTA
--GAGGACACGATCA
---GCAAGAAGAGTA
----AAAGAAGAGTA
----AAGGAAGAGTA
----GAGAACGAGTA
--TTAGGACATAGTA
-----GGGAGCAGTA
17
16
19
19
16
18
18
18
19
18
18
AGAGAAAAATCTCCAAG-GCTGAAAGGGATTTT
AGCGAGTTAGGGGTT---GGTGTAAGCCTAGCAGAAAGCCTGTAGTT---GCTGAGAACGGGT-AGAAAGTCGATGGTT---GCTGCGAATCGAT-AGAGAGGGAAGCCTTTG-GCTGTGAGCTTCCTAGAGAGCTGGGGGAA---GGTGTGAGCCCGGTAGAGAGCCCTGTTT----GCTGAGAATGGG--AGAGAGCTGAGGGT----GGTGTGATCTCAGTAGAGAGTTGGCGATTT--GCTGAAAGCCAAC-AGAGACTTTTTCATTG--GCTGAAAGAAAAAGAGAGAGCTGCGGGGT---GGTGCGACGCAGC--
15
14
14
13
14
15
16
15
15
17
13
GAA--TGCATCTTTGA
GAAG-AGAGCTCTGGA
GAAGCAAGACTCTGAG
GAAT-TACACTAATAA
GATT-ACCACCTCTGA
GAA--TGGGCTTGCGA
GAAG-ATGGTCTTTGA
GAA--TGGACCTTTTA
GAAA-ATCATCTCCGA
-----CACACCTAAAA
GAA--CTCGCCCGGGA
->
->
->
->
->
->
->
->
->
->
->
Aminoacyl-tRNA
synthetases
… continued
Aminoacyl-tRNA
synthetases
Amino acid
biosynthetic
genes
Amino acid
transporters
Terminator(underlined)
===========> <===========
Antiterminator
==> ===>
<===<==
SA  serS   -> 26 CGTTA 51 AAATAGGGTGGCAACGCGTAGAC------------CACGTCCCTTGTAGGGATGTGGTCTTTTTTTA
DHA tyrZ   -> 47 CGTTA 65 AGGTAAGGTGGTAACACGGGAGCA-------TACTCTCGTCCTTCTGGCAATGAAGGACGGGAGTTTTTTGTTTT
ST  trpS   -> 37 CCTTA 61 AATTGAGGTGGTACCGCGTATTACTT----GTAATAACGCCCTCACGTTTTAATAGCGTGGGGACTTTTTGCTAT
CA  aspS   -> 39 CGTTA 34 ATAAAGGATGGCACCGTGAAAA----------GCCTTCACTCCTTACTGGAGTGGAGGCTTTTTTTATTTTAAATAAA
DF  valS   -> 41 CGTTA 77 AATTAAGGTGGTAACGCGAGC------------TTTTCGTCCTTTTTAAAGAGGATGAAGAGCTCTTTTTTATTTCT
PN  thrS   -> 30 CGTTA 38 AATGAAGGTGGAACCACGTTG-------------CGACGTCCTTTCGAGGATGTCGCATTTTTTTATTAG
MN  ileS   -> 89 CGTTA 68 AATTAAGGTGGTACCACGAGC-------------TTTCGTCCTTTGATGAAAGTTCTTTTTTATTGAT
DF  leuS   -> 28 AGCTA 29 AATTAGGGTGGTACCGCGAAGATT-------TATCCTCGTCCCTAAACGTAAGTTTAGTGACGAGGATTTTTTATTTTCA
HD  argS   -> 41 CGTTA 27 AACGAGAGTGGTACCGCGGGTAA---------AAGCTCGCCTCTTTTTAGAAGAGGCGGGTTTTTTATTTT
DF  proS   -> 33 CGTTA 30 AACTAGAGTGGTACCGCGGAAAT-----TAAACCTTTCGTCTCTATACTTGTATAGAGATGAGAGGTTTTTTATATTTTCAGG
ZC  lysS   -> 46 CGTTA 63 AACTGAGGTGGTACCGCGAAGCTAA-----CAACTCTCGTCCTCAAGATGAATAATCTTGGGGGTGGGAGTTTTTTTGTTGCA
BQ  metS   -> 55 CGTTA 66 AAATAAGGTGGTACCGCGACTGTTTA---TACAGCCCCGCCCTTATCTTTTTTAGATAAGGGCGGGGCTTTTTATATTTAA
MN  pheS   -> 14 AATTA 20 AAAACGGATGGTACCGCGTGTC-------------AACGCTCCGCTTAAGGAGTTTTGGCACTTTTTTTGTTTT
MN  glyQ   -> 14 AGCTA 23 AATTAGGGTGGAACCGCGTTT------------CAAACGCCCCTATGTCAGTTGGCATGGGAGTGATTGAGCGTGGCTCTTTT
ST  alaS   -> 20 AATTA 18 AATAGAGGTGGTACCGCGGTT--------------TTCGCCCTCTGTGAGATGGACTTGTTTTGTATGGAGGACTATTTGAAA
SA  trpE   -> 32 AATTA  4 AACTAAGGTGGCACCACGGTA-------------ACGCGTCCTTACAGGTATATGCGTTATGTGGTGTCTTTTT
BS  ilvB   -> 50 CGTTA 47 AACAAGGGTGGTACCGCGGAAAGAAA---AGCCTTTTCGCCCCTTTTAGCTATCGCAGTTACTGCGCGGCTGATTGT
CA  ilvC   -> 40 CGTTA 14 AATTTGGGTGGTACCGCGCGACCAAA-----AATTCTCGCCCCAAGCAGGGAATTTTGGCCGTTTTTTTATATAAATAAAT
BQ  asnA   -> 51 CGTTA 62 AATTTGGGTGGTACCGCGGAACC-----AAAGCCTTTCGTCCCAGTTTTTTGGGAAAGAAGGGCTTTTTTTGTTGGCTT
BS  proB   -> 33 CGTTA 30 AATCAAGGTGGTACCACGGAAAC--------CCATTTCGTCCTTATGAATCAGGATGAAATGGGTTTTTTTATTGTAGA
SA  cysE   -> 33 CATTA 62 ATTCAGAGTGGAACCGTGCGG-------------AAGCGCCTCTAACAATACAATTTGTATGTTAGTGGTGCTTTTTTG
MN  hisC   -> 46 CGTTA 50 AATGAAGGTGGAACCACGTGTGT---------GTCAGCGTCCTTGCAAGTTTTTTGCAAGGGCGCTTTTTTGAATAGT
DHA pheA   -> 41 CGTTA 50 AAAAAGGGTGGTACCGCGTGAC---------TTAACTCGTCCCTTATTTGGGGGTGAGGTAAGTCTTTTTTTATTTA
HD  serA   -> 42 cgtta 57 AATGAGGGTGGCACCGCGGTATG-------AACCTTCCGCCCCTCACGACAGTCGTCGTGTGGGCAGAAGGTTTTTTTACTAT
BQ  phhA   -> 51 CGTTA 34 AAATAGGGTGGTACCGCGATTC------------TTTCGCCCCTATCGGATTTTCCGATAGGGGCTTTTTCTATTTC
EF  yxjH   -> 40 CGTTA 51 AAAAAAGGTGGTACCGCGATAA-----------TAATCGCCCTTTTACTAGTTACGGCTAGTAAAAGGGCGTTTTTTTATAAA
CA  yckK   -> 38 CGTTA 57 AATTAGAGTGGTACCGTGGAATT-------CAACTTCTGCCTCTAACTATGAGGATAGAAGTTTTTTGTTTTTAT
DF  yqiX   -> 41 CCTTA 30 AAAAAGAGTGGTAACGCGGATAT----------AATTCGTCTCTTAGCTGTAAAGCTAAGGGACTTTTTTGATTTA
HD  BH0807 -> 74 TGTTA 56 AACTGGGGTGGCACCACGACAAG----------TGATCGTCCCCAAGACTTTTATCAGTCTTGGGGACGTTTTTTTGTTCAT
EF  yheL   ->  8 AATTA 33 AATTAAGGTGGTACCGCGGAGA-----------GATTCGTCCTTATTCTTTAAGGATGAATCTCTCTTTTTATGTAGC
BQ  ykbA   -> 46 CGTTA 45 AACAAGGGTGGAACCACGAATAT--------AACACTCGTCCCTTTTTTAGGGAGGAGTGTTTTTTTATT
BQ  sdt2   -> 40 CGTTA 56 AATTGAGGTGGTACCACGGTATTAACATTACATATATCGTCCTCTACATGCATATTTGCGTGTAGGGGACTTTTTTATTTTC
EF  yusC   -> 42 CGTTA 60 AATTAAGGTGGTATCACGAAATGA-----CAAACTTTCGTCCTTTTTGCTGTAATAGCAAAAGGATGGAAGTTTTTTTGTTT
CA  yhaG   -> 48 CGTTA 51 AATTTAGGTGGTACCGCGGAAGT---------ATCTCCGTCCTAATTAATAAGATTAGGGCGGAGTTTTTTATTTGC
BQ  brnQ   -> 44 CGTTA 66 AATTAGGGTGGTATCGCGGGTAAA------TATAACTCGTCCCTTTCTTTAGGGACGAGTTTTTTGTGTTCTT
REF01723   -> 44 CGTTA 55 AATTGAGGTGGCACCACGAATGC----------GATTCGTCCTCTTGGCTCACAGCCAAGAGGCTTTTTTGTTTTTTTAATA
BS  yvbW   -> 56 CGTTA 32 AACAAGAGTGGTACCGCGGTCAGC--CGAAGGCTCGTCGTCTCTTTATCTATTAGATTAGGTAGGAGACGGCGGGCTTTTTT
Aminoacyl-tRNA synthetases
Aromatic a/a (TRP, PHE, TYR): most Firmicutes, Atopobium minutum
Branched chain a/a (ILE, LEU, VAL): most Firmicutes, Actinobacteria (ileS), Deinococcales/Thermales (ileS, valS), Chloroflexi (ileS), Thermomicrobium roseum (leuS)
methionine: Bacillales, Clostridiales, Thermoanaerobacter tengcongensis
proline: some Bacillales, Clostridiales
cysteine: Bacillales, some Lactobacillales, Clostridiales, Thermoanaerobacteriales
histidine: Bacillales, Lactobacillales (except Streptococcus spp.), some Clostridiales, Thermoanaerobacter tengcongensis
arginine: Bacillales, Lactobacillales (except Streptococcus spp.), Clostridiales
threonine: Bacillales, Lactobacillales, Clostridiales, Dictyoglomi, Thermomicrobium roseum
serine: most Firmicutes
alanine: Bacillales, Lactobacillales, Clostridiales
ASP, ASN: most Firmicutes (except Streptococcus spp., Mycoplasmatales, Entomoplasmatales)
glycine: most Firmicutes, Deinococcales/Thermales
lysine: Bacillus cereus, Clostridium thermocellum
Amino acid biosynthetic genes
Aromatic a/a (TRP, PHE, TYR): most Firmicutes, Chloroflexi and Dictyoglomi (trp operon), some Firmicutes (aro genes, pheA, pah)
Branched chain a/a (ILE, LEU, VAL): Bacillales, Clostridiales, Syntrophomonas wolfei, δ-proteobacteria (leu), Dictyoglomi, Thermomicrobium roseum
methionine: Lactobacillales (except Streptococcus spp.), Desulfotomaculum reducens
proline: Bacillales, Desulfitobacterium hafniense, Desulfotomaculum reducens
cysteine: Bacillales, Enterococcus faecalis, Clostridium acetobutylicum, Dictyoglomi
histidine: some Lactobacillales
arginine: Clostridium difficile
threonine: Bacillus cereus, Clostridium difficile
serine: some Firmicutes
alanine: -
ASP, ASN: some Firmicutes
glutamine: Clostridium perfringens
glycine: -
lysine: -
ycbK
yhaG
yvbW
ykbA
ybgF/aapA
T-box
specificity
TRP
TRP
LEU
THR
?
yheL
TYR
LysX
LYS
Gene name
Predicted function
T-box specifier codon
Bacillus subtilis, Bacillus licheniformis
Clostridiales
Bacillus subtilis, Bacillus licheniformis
Bacillus subtilis
Lactobacillus reuteri
yusCBA
yqiXYZ
MET
ARG
tryptophan-specific permease
tryptophan-specific permease
leucine-specific permease
threonine-specific permease
?
Tyrosine transporter
(Na+/H+ antiporter)
lysine transporter
Branched-chain amino acid transporter
family: ILE-specific
Branched-chain amino acid transporter
family: THR-specific
Branched-chain amino acid transporter
family: VAL-specific
methionine ABC transporter
arginine ABC transporter
hisXYZ
HIS
histidine ABC transporter
yckKJI
CYS
MET
ASP
MET
cysteine ABC transporter
methionine ABC transporter
ASP(ASN) ABC transporter
methionine ABC transporter
TRP-specific sodium dependent
transporter
PHE-specific sodium dependent
transporter
LEU-specific sodium dependent
transporter
sodium dependent transporter
uptake of unknown methionine
precursors, possibly oligopeptides
ILE
brnQ_braB
THR
VAL
aspQHMP
ytmKLM
TRP
yocR(yhdH)
PHE
LEU
TYR?\MET
some Bacillales and Lactobacillales
some Bacillales
some Bacillales, Lactobacillales
and Clostridiales
Bacillus cereus, Clostridium
tetani
some Lactobacillales
Lactobacillales, Enterococcus faecalis
Clostridium difficile
Lactobacillales, Clostridium difficile,
Listeria monocytogenes, Enterococcus
faecalis
Clostridium acetobutylicum
some Lactobacillales
Lactobacillus johnsonii
Leuconostoc mesenteroides
Bacillus cereus
Bacillus cereus
Bacillus cereus
Clostridium tetani
mtsABC
opp
MET
trpXYZ
TRP
tryptophan ABC transporter
RDF02391
ABC-like
transporter
CBX
gltT like
ARG
arginine permease
Peptococcaceae, Streptococcus spp.,
Paenibacillus larvae
Clostridium difficile
?
?
Desulfotomaculum reducens
?
?
?
?
Clostridium botulinum
some Clostridium spp.
some Lactobacillales
New predicted
amino acid
transporters
Conserved RNA secondary structure of the regulatory RFN element
[Figure: consensus secondary structure of the RFN element, drawn from 5' to 3' with numbered helices, an additional stem-loop and a variable stem-loop]
Capitals: invariant (absolutely conserved) positions.
Lower case letters: strongly conserved positions.
Dashes and stars: obligatory and facultative base pairs.
Degenerate positions: R = A or G; Y = C or U; K = G or U; B = not A; V = not U.
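The degenerate codes in this legend map directly onto character classes, which gives a simple way to test whether a sequence fits a consensus. The sketch below assumes a hypothetical helper `consensus_to_regex` and a made-up consensus string. Treating N as "any nucleotide" follows the usual IUPAC convention and is not stated in the legend. The distinction between capitals and lower case (invariant versus strongly conserved) is deliberately ignored here.

```python
import re

# Degenerate codes from the legend, RNA alphabet.
# N = any nucleotide is an assumption (standard IUPAC meaning).
CODES = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "[AG]", "Y": "[CU]", "K": "[GU]",
    "B": "[CGU]",  # not A
    "V": "[ACG]",  # not U
    "N": "[ACGU]",
}


def consensus_to_regex(consensus):
    """Compile a consensus written with degenerate codes into a regular expression."""
    return re.compile("".join(CODES[ch] for ch in consensus.upper()))


pattern = consensus_to_regex("GRYCNK")  # hypothetical consensus, not from the figure
print(bool(pattern.fullmatch("GACCAG")))  # True
print(bool(pattern.fullmatch("GCCCAG")))  # False: C is not a purine (R)
```

A scoring scheme that penalises mismatches at lower-case positions less than at capital positions would be closer to the legend's intent, but that weighting is not specified on the slide.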
5’ UTR regions of riboflavin genes from various bacteria
BS
BQ
BE
HD
Bam
CA
DF
SA
LLX
PN
TM
DR
TQ
AO
DU
CAU
FN
TFU
SX
BU
BPS
REU
RSO
EC
TY
KP
HI
VK
VC
YP
AB
BP
AC
Spu
PP
AU
PU
PY
PA
MLO
SM
BME
BS
BQ
BE
CA
DF
EF
LLX
LO
PN
ST
MN
SA
AMI
DHA
FN
GLU
1
2
2’
3
=========>
==>
<==
===>
TTGTATCTTCGGGG-CAGGGTGGAAATCCCGACCGGCGGT
AGCATCCTTCGGGG-TCGGGTGAAATTCCCAACCGGCGGT
TGCATCCTTCGGGG-CAGGGTGAAATTCCCGACCGGCGGT
TTTATCCTTCGGGG-CTGGGTGGAAATCCCGACCGGCGGT
TGTATCCTTCGGGG-CTGGGTGAAAATCCCGACCGGCGGT
GATGTTCTTCAGGG-ATGGGTGAAATTCCCAATCGGCGGT
CTTAATCTTCGGGG-TAGGGTGAAATTCCCAATCGGCGGT
TAATTCTTTCGGGG-CAGGGTGAAATTCCCAACCGGCAGT
ATAAATCTTCAGGG-CAGGGTGTAATTCCCTACCGGCGGT
AACTATCTTCAGGG-CAGGGTGAAATTCCCTACCGGTGGT
AAACGCTCTCGGGG-CAGGGTGGAATTCCCGACCGGCGGT
GACCTCTTTCGGGG-CGGGGCGAAATTCCCCACCGGCGGT
CACCTCCTTCGGGG-CGGGGTGGAAGTCCCCACCGGCGGT
AATAATCTTCAGGG-CAGGGTGAAATTCCCGATCGGCGGT
TTTAATCTTCAGGG-CAGGGTGAAATTCCCGATCGGTGGT
GAAGACCTTCGGGG-CAAGGTGAAATTCCTGATCGGCGGT
TAAAGTCTTCAGGG-CAGGGTGAAATTCCCGACCGGTGGT
ACGCGTGCTCCGGG-GTCGGTGAAAGTCCGAACCGGCGGT
-AGCGCACTCCGGG-GTCGGTGAAAGTCCGAACCGGCGGT
GTGCGTCTTCAGGG-CGGGGTGAAATTCCCCACCGGCGGT
GTGCGTCTTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT
TTACGTCTTCAGGG-CGGGGTGCAATTCCCCACCGGCGGT
GTACGTCTTCAGGG-CGGGGTGGAATTCCCCACCGGCGGT
GCTTATTCTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT
GCTTATTCTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT
GCTTATTCTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT
TCGCATTCTCAGGG-CAGGGTGAAATTCCCTACCGGTGGT
GCGCATTCTCAGGG-CAGGGTGAAATTCCCTACCGGTGGT
CAATATTCTCAGGG-CGGGGCGAAATTCCCCACCGGTGGT
GCTTATTCTCAGGG-CGGGGTGAAAGTCCCCACCGGCGGT
GCGCATTCTCAGGG-CAGGGTGAAAGTCCCTACCGGTGGT
GTACGTCTTCAGGG-CGGGGTGCAATTCCCCACCGGCGGT
ACATCGCTTCAGGG-CGGGGCGTAATTCCCCACCGGCGGT
AACAATTCTCAGGG-CGGGGTGAAACTCCCCACCGGCGGT
GTCGGTCTTCAGGG-CGGGGTGTAAGTCCCCACCGGCGGT
GGTTGTTCTCAGGG-CGGGGTGCAATTCCCCACCGGCGGT
AAACGTTCTCAGGG-CGGGGTGCAATTCCCCACCGGCGGT
TAACGTTCTCAGGG-CGGGGTGCAACTCCCCACCGGCGGT
TAACGTTCTCAGGG-CGGGGTGAAAGTCCCCACCGGCGGT
TAAAGTTCTCAGGG-CGGGGTGAAAGTCCCCACCGGCGGT
AAGCGTTCTCAGGG-CGGGGTGAAATTCCCCACCGGCGGT
GCTTGTTCTCGGGG-CGGGGTGAAACTCCCCACCGGCGGT
ATCAATCTTCGGGG-CAGGGTGAAATTCCCTACCGGCGGT
GTCTATCTTCGGGG-CAGGGTGAAAATCCCGACCGGCGGT
ATTCATCTTCGGGG-CAGGGTGAAATTCCCGACCGGCGGT
AATGATCTTCAGGG-CAGGGTGAAATTCCCTACCGGCGGT
GAAGATCTTCGGGG-CAGGGTGAAATTCCCTACCGGCGGT
GTTCGTCTTCAGGGGCAGGGTGTAATTCCCGACCGGTGGT
AAATATCTTCAGGG-CACCGTGTAATTCGGGACCGGCGGT
GTTCATCTTCGGGG-CAGGGTGCAATTCCCGACCGGTGGT
AAGAGTCTTCAGGG-CAGGGTGAAATTCCCGACCGGCGGT
AAGTGTCTTCAGGG-CAGGGTGTGATTCCCGACCGGCGGT
AAGTGTCTTCAGGG-CAGGGTGAGATTCCCGACCGGCGGT
ATTCATCTTCGGGG-TCGGGTGTAATTCCCAACCGGCAGT
TCACAGTTTCAGGG-CGGGGTGCAATTCCCCACTGGCGGT
ACGAACCTTCGAGG-TAGGGTGAAATTCCCGACCGGCGGT
AATAATCTTCGGGG-CAGGGTGAAATTCCCGACCGGTGGT
---TGTTCTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT
Add.
3’
-><<===
21 AGCCCGTGAC-19 AGTCCGTGAC-20 AGCCCGCGA--19 AGTCCGTGAC-23 AGCCCGTGAC-2 AGCCCGCAA--2 AGCCCGCG---6 AGCCTGCGAC-2 AGCCCGCGA--2 AGCCCACGA--3 AGCCCGCGAG-15 AGCCCGCGAA-3 AGCCCGCGAA-2 AGTCCGCGA--2 AGTCCGCGA--20 AGCCCGCGA--2 AGTCCACG---3 AGTCCGCGAC-3 AGTCCGCGAC-30 AGCCCGCGAGCG
21 AGCCCGCGAGCG
31 AGCCCGCGAGCG
21 AGCCCGCGAGCG
17 AGCCCGCGAGCG
67 AGCCCGCGAGCG
20 AGCCCGCGAGCG
2 AGCCCACGAGCG
14 AGCCCACGAGCG
13 AGCCCACGAGCG
40 AGCCCGCGAGCG
25 AGCCCACGAGCG
18 AGCCCGCGAGCG
16 AGCCCGCGAGCA
34 AGCCCGCGAGCG
13 AGCCCGCGAGCG
17 AGCCCGCGAGCG
19 AGCCCGCGAGCG
19 AGCCCGCGAGCG
19 AGCCCGCGAGCG
16 AGCCCGCGAGCG
34 AGCCCGCGAGCG
17 AGCCCGCGAGCG
18 AGCCCGCGA--27 AGCCCGCGA—-20 AGCCCGCGA--2 AGCCCGCGAG-2 AGCCCGCG---3 AGTCCACGAC-21 ACTCCGCGAT-3 AGTCCACGAT-125 AGTCCGTG---14 AGTCCGCG---104 AGTCCGCG---6 AGCCTGCGAC-14 AGCCCGCGC--20 AGCCCGCAAC-2 AGTCCACG---28 AGCCCGCGAGCG
Variable
4
4’
5
5’
1’
->
<====>
<====
==>
<==
<=========
8 4 8 -----TGGATTCAGTTTAA-GCTGAAGCCGACAGTGAA-AGTCTGGAT-GGGAGAAGGATGAT
8 5 8 -----TGGATCTAGTGAAACTCTAGGGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGGATATG
3 4 3 -----AGGATCCGGTGCGATTCCGGAGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGGATGCC
10 4 10 ----–TGGACCTGGTGAAAATCCGGGACCGACAGTGAA-AGTCTGGAT-GGGAGAAGGAAACG
8 4 8 ----–TGGATTCAGTGAAAAGCTGAAGCCGACAGTGAA-AGTCTGGAT-GGGAGAAGGATGAG
3 4 3 ------AGATCCGGTTAAACTCCGGGGCCGACAGTTAA-AGTCTGGAT-GAAAGAAGAAATAG
7 6 7 --------ATTTGGTTAAATTCCAAAGCCGACAGT-AA-AGTCTGGAT-GGAAGAAGATATTT
11 3 11 ----–CTGATCTAGTGAGATTCTAGAGCCGACAGTTAA-AGTCTGGAT-GGGAGAAAGAATGT
4 4 4 -----ATGATTCGGTGAAACTCCGAGGCCGACAGT-AT-AGTCTGGAT-GAAAGAAGATAATA
3 4 3 -----ATGATTTGGTGAAATTCCAAAGCCGACAGT-AT-AGTCTGGAT-GAAAGAAGATAAAA
5 4 5 ----–TTGACCCGGTGGAATTCCGGGGCCGACGGTGAA-AGTCCGGAT-GGGAGAGAGCGTGA
8 12 9 ----–CCGATGCCGCGCAACTCGGCAGCCGACGGTCAC-AGTCCGGAC-GAAAGAAGGAGGAG
5 4 5 -----CCGACCCGGTGGAATTCCGGGGCCGACGGTGAA-AGTCCGGAT-GGGAGAAGGAGGGC
7 7 7 -----AGGAACCGGTGAGATTCCGGTACCGACAGT-AT-AGTCTGGAT-GGAAGAAGATGAAA
13 4 12 -----AGGAACTAGTGAAATTCTAGTACCGACAGT-AT-AGTCTGGAT-GGAAGAAGAGCAGA
3 4 3 -----AGGACCCGGTGTGATTCCGGGGCCGACGGT-AT-AGTCCGGAT-GGGAGAAGGTCGGC
5 4 5 -------GATTTGGTGAAATTCCAAAACCGACAGT-AG-AGTCTGGAT-GGGAGAAGAATTAG
8 5 8 -----TGGAACCGGTGAAACTCCGGTACCGACGGTGAA-AGTCCGGAT-GGGAGGTAGTACGTG
8 5 8 -----TTGACCAGGTGAAATTCCTGGACCGACGGTTAA-AGTCCGGAT-GGGAGGCAGTGCGCG
137
GTCAGCAGATCTGGTGAGAAGCCAGAGCCGACGGTTAG-AGTCCGGAT-GGAAGAAGATGTGC
8 4 8 GTCAGCAGATCTGGTCCGATGCCAGAGCCGACGGTCAT-AGTCCGGAT-GAAAGAAGATGTGC
7 5 7 GTCAGCAGATCTGGTGAGAGGCCAGGGCCGACGGTTAA-AGTCCGGAT-GAAAGAAGATGGGC
11 3 11 GTCAGCAGATCCGGTGAGATGCCGGGGCCGACGGTCAG-AGTCCGGAT-GGAAGAAGATGTGC
8 4 8 GACAGCAGATCCGGTGTAATTCCGGGGCCGACGGTTAG-AGTCCGGAT-GGGAGAGAGTAACG
8 3 8 GTCAGCAGATCCGGTGTAATTCCGGGGCCGACGGTTAA-AGTCCGGAT-GGGAGAGGGTAACG
8 4 8 GTCAGCAGATCCGGTGTAATTCCGGGGCCGACGGTTAA-AGTCCGGAT-GGGAGAGAGTAACG
26 9 30 GTCAGCAGATTTGGTGAAATTCCAAAGCCGACAGT-AA-AGTCTGGAT-GAAAGAGAATAAAA
11 9 11 GTCAGCAGATTTGGTGAGAATCCAAAGCCGACAGT-AT-AGTCTGGAT-GAAAGAGAATAAGC
5 4 5 GTCAGCAGATCTGGTGAGAAGCCAGGGCCGACGGTTAC-AGTCCGGAT-GAGAGAGAATGACA
16 6 16 GTCAGCAGACCCGGTGTAATTCCGGGGCCGACGGTTAT-AGTCCGGAT-GGGAGAGAGTAACG
16 4 27 GTCAGCAGATTTGGTGCGAATCCAAAGCCGACAGTGAC-AGTCTGGAT-GAAAGAGAATAAAA
10 4 10 GTCAGCAGACCTGGTGAGATGCCAGGGCCGACGGTCAT-AGTCCGGAT-GAGAGAAGATGTGC
10 3 11 ---CGCAGATCTGGTGTAAATCCAGAGCCGACGGT-AT-AGTCCGGAT-GAAAGAAGACGACG
6 6 6 GTCAGCAGATCTGGTG 52 TCCAGAGCCGACGGT 31 AGTCCGGAT-GGAAGAGAATGTAA
7 3 7 GTCAGCAGATCTGGTGCAACTCCAGAGCCGACGGTCAT-AGTCCGGAT-GAAAGAAGGCGTCA
7 9 7 GTCAGCAGATCCGGTGAGAGGCCGGAGCCGACGGT-AT-AGTCCGGAT-GGAAGAGGACAAGG
19 4 18 GTCAGCAGACCCGGTGTGATTCCGGGGCCGACGGTCAC-AGTCCGGATGAAGAGAGAACGGGA
15 4 16 GTCAGCAGACCCGGTGTGATTCCGGGGCCGACGGTCAT-AGTCCGGATGAAGAGAGAGCGGGA
14 4 13 GTCAGCAGACCCGGTGCGATTCCGGGGCCGACGGTCAT-AGTCCGGATAAAGAGAGAACGGGA
8 5 8 GTCAGCAGATCCGGTGTGATTCCGGAGCCGACGGTTAG-AGTCCGGAT-GAAAGAGGACGAAA
8 3 8 GTCAGCAGATCCGGTCGAATTCCGGAGCCGACGGTTAT-AGTCCGGAT-GGAAGAGAGCAAGC
10 15 10 GTCAGCAGATCCGGTGAGATGCCGGAGCCGACGGTTAA-AGTCCGGAT-GGAAGAGAGCGAAT
5 4 5 -----AGGATTCGGTGAGATTCCGGAGCCGACAGT-AC-AGTCTGGAT-GGGAGAAGATGGAG
3 5 3 -----AGGATTTGGTGTGATTCCAAAGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGATGGAG
3 4 3 -----AGGATCCGGTGCGAGTCCGGAGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGATGAAG
3 4 3 ----TATGATCCGGTTTGATTCCGGAGCCGACAGT-AA-AGTCTGGAT-GAAAGAAGATATAT
6 4 6 -------GATTTGGTGAGATTCCAAAGCCGACAGT-AA-AGTCTGGAT-GAGAGAAGATATTT
5 3 5 ----ATTGAATTGGTGTAATTCCAATACCGACAGT-AT-AGTCTGGAT—-AAAGAAGATAGGG
4 4 4 ----–TTGAAGCAGTGAGAATCTGCTAGCGACAGT-AA-AGTCTGGAT-GGAAGAAGATGAAC
3 10 3 ----TTGACTCTGGTGTAATTCCAGGACCGACAGT-AT-AGTCTGGAT-GGGAGAAGATGTTG
3 4 3 -------GATGTGGTGAGATTCCACAACCGACAGT-AT-AGTCTGGAT-GGGAGAAGACGAAA
3 4 3 -------GATGTGGTGTAACTCCACAACCGACAGT-AT-AGTCTGGAT-GAGAGAAGACCGGG
3 4 3 -------GATGTGGTGAAATTCCACAACCGACAGT-AA-AGTCTGGAT-GGGAGAAGACTGAG
11 3 11 ----–CTGATCTAGTGAGATTCTAGAGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGATGGAG
5 5 5 ------TGATCTGGTGCAAATCCAGAGCCAACGGT-AT-AGTCCGGAT-GGAAGAAACGGAGC
11 4 11 --CGACTGACTTGGTGAGACTCCAAGGCCGACGGT-AT-AGTCCGGAT-GGGAGAAGGTACAA
4 6 4 -------GATTTGGTGAAATTCCAAAACCGACAGT-AG-AGTCTGGAT-GAGAGAAGAAAAGA
10 4 10 GTCAGCAGATCCGGTTAAATTCCGGAGCCGACGGTCAT-AGTCCGGAT-GCAAGAGAACC---
Distribution of RFN-elements in bacterial genomes
RFN regulates riboflavin biosynthetic genes and transporters
Genomes                  Genomes analyzed  Genomes with RFN  RFN elements
α-proteobacteria         8                 4                 4
β-proteobacteria         7                 4                 4
γ-proteobacteria         17                15                15
ε- and δ-proteobacteria  3                 0                 0
Bacillus/Clostridium     12                12                19
Actinomycetes            9                 4                 4
Cyanobacteria            5                 0                 0
Other eubacteria         7                 5                 6
Total                    68                47                52
Some predicted transporters are NEW
Alternative RNA secondary structures
upstream of riboflavin operons with RFN elements
Attenuation of transcription via antitermination mechanism
Antiterminator
The RFN element
Bam
BS
BQ
BE
HD
CA
DF
LLX
PN*
PN*
TM
AO
DU
FN
SA
DHA
FN
CA
DF
BS
BQ
BE
PN
ST
MN
SA
EF
LLX
LO
GACAAAAAAATATTGATTGTATCCTTCGGGGCTGGGTG
GGACAAATGAATAAAGATTGTATCTTCGGGGCAGGGTG
CTATAATTTGAGCAAACAGCATCCTTCGGGGTCGGGTG
ACATAACGATATAGTGATGCATCCTTCGGGGCAGGGTG
AAATTGAATAATTAATTTTTATCCTTCGGGGCTGGGTG
TAATGGTAATTTAATAGGATGTTCTTCAGGGATGGGTG
TAAATATAAATTTAATACTTAATCTTCGGGGTAGGGTG
ACTTTAGCTACAATTGAATAAATCTTCAGGGCAGGGTG
ATCATCTGTAATTGAATAACTATCTTCAGGGCAGGGTG
ATCATCTGTAATTGAATAACTATCTTCAGGGCAGGGTG
AAAACTGAATACAAAAGAAACGCTCTCGGGGCAGGGTG
ATTTGCAACAATTTTTTAATAATCTTCAGGGCAGGGTG
AATTTTTTTAATACTATTTTAATCTTCAGGGCAGGGTG
TAATCGAATATGTAAAATAAAGTCTTCAGGGCAGGGTG
TATAACAATTTCATATATAATTCTTTCGGGGCAGGGTG
ACTCTTTTTAGATGAATACGAACCTTCGAGGTAGGGTG
GAAAAATAAATATTAAAAATAATCTTCGGGGCAGGGTG
AATATAAAAAAATAAAGAATGATCTTCAGGGCAGGGTG
AAAATTAAAAAATCAAAGAAGATCTTCGGGGCAGGGTG
TAATTAAATTTCATATGATCAATCTTCGGGGCAGGGTG
GGGAAAATAGAATATCGGTCTATCTTCGGGGCAGGGTG
ATAAAAATGTATAAGCGATTCATCTTCGGGGCAGGGTG
GTTTTTTGTTATGATAAAAGAGTCTTCAGGGCAGGGTG
TAAATCTGCTATGCTAGAAGTGTCTTCAGGGCAGGGTG
ATTTTTTGATATGCTATAAGTGTCTTCAGGGCAGGGTG
AAATTTAATAATGTAAAATTCATCTTCGGGGTCGGGTG
AAAAAATATAATACAAGGTTCGTCTTCAGGGGCAGGGT
TTTTTGTGCTATAATAAAAATATCTTCAGGGCACCGTG
ATTGTAAGAAAATATTCGTTCATCTTCGGGGCAGGGTG
-----------------------------------------------------------
TCTGGATGGGAGAAGGATGA 59
TCTGGATGGGAGAAGGATGA 59
TCTGGATGGGAGAAGGATAT 250
TCTGGATGGGAGAAGGATGC 155
TCTGGATGGGAGAAGGAAAC 148
TCTGGATGAAAGAAGAAATA 34
TCTGGATGGAAGAAGATATT 63
TCTGGATGAAAGAAGATAAT 127
TCTGGATGAAAGAAGATAAA 81
TCTGGATGAAAGAAGATAAA 19
TCCGGATGGGAGAGAGCGTG 13
TCTGGATGGAAGAAGATGAA 33
TCTGGATGGAAGAAGAAGAG 47
TCTGGATGGGAGAAGAATTA 18
TCTGGATGGGAGAAAGAATG 74
TCCGGATGGGAGAAGGTACA 43
TCTGGATGAGAGAAGAAAAG 40
TCTGGATGAAAGAAGATATA 19
TCTGGATGAGAGAAGATATT 45
TCTGGATGGGAGAAGATGGA 103
TCTGGATGGGAGAAGATGGA 54
TCTGGATGGGAGAAGATGAA 114
TCTGGATGGGAGAAGACGAA 137
TCTGGATGAGAGAAGACCGG 130
TCTGGATGGGAGAAGACTGA 138
TCTGGATGGGAGAAGATGGA 17
GTCTGGATAAAGAAGATAGG 33
TCTGGATGGAAGAAGATGAA 66
TCTGGATGGGAGAAGATGTTG 79
Terminator +RBS sequestor
----------GTAAAGCCCCGAATGTGTAA---ACATTCGGGGCTTTTTGACGCCAAAT
----------CTAAAGCCCCGAATTTTTTA--TAAATTCGGGGCTTTTTTGACGGTAAA
-----------CCAAACCCCAAGGATATTAAA--ATCCTTGGGGTTTTTTGTTTTTTTT
------------TGAGCCCCCGGGGACAT--------CCCGGGGGTTTCATTTTTATTG
-------------ATGCCCCGTGAGAACAAAA-----TCTCTGGGGCTTTTTTGCGCGC
-------------AATCTCCGAAGGATTACC----TTTCTTTGGAGATTTTTTTATTTG
------------TAAACCCTGAGTTAATT--------CTCAGGGTTTTTTGTTTAAAAA
----------AAAAGACCCTGAAATTTT------ATTTTAGGGTCTTATTTTTTATTAG
----------TGTATGCCTTGAGTAGTCCCC---TATTCAAGGTATATTTTTTTGGAGG
------------CGTGCTCTGAAATGATTACTTGTCATTTCAGAGCATTTTTGTTAATC
-----------ATGGGACCCGAGA----------------GGGTCCCTTTTCTTTTACA
--------TTTACAAGCCTTGAGATCGAAAG----ATTTCAAGGCTTTTTTCATCATTA
--------TGCATAAGCCTTGAGATCTTAG----GATTTCAAGGCTTTTTCATTAGTTA
----------ATATTGCTCAGACTTT------------GTTTGAGCATTTTTTTATTAA
------TTTTCTCCTTGCATCTTAATT----------GATGTGAGGATTTTTGTTTATA
-----------GTTTATGCCTCGAGGAACACCATTTCCTCGAGGCATTTTTGTTCTTTC
------------CTTACCCGAATTCTAT------------AATTCGGTTTTTTTATTTT
----------–-TATGCCCTGACGTTTTT---------CGTTGGGGCTTTTTTAATGCT
----------ATAAAAACTCGAAGATAGGG----TCTTCGAGTTTTTTGTTTTTCCTAA
--AAAGAACCTTTCCGTTTTCGAGTAAGATGTGATCGAAAAGGAGAGAATGAAGTGAAA
-------ATTCTCCCTTTGTGTAAA------------ACACAAAGGGTTTTTTCGTTCTAT
--------GGCAGCCTTCTTCTTGTGAGGATGAATCACGAGAAGGGGAGGAGAACAAGCAT
-–AACTTCTTCTGATTTTATAG------------AAAATTGGAGGAACCTGTTATGACA
---GGAACTTCTTTCAATTTGAAA-----------AAATTGGAGGAATTTTTTAATGTC
---–GGCCTTCTTTCGATTTGTAA-----------AAATTGGAGGAATTTTTTTATGAA
--------TCCTCCTATTCTTACG--------AGATGAATGGAAGGAGAAAATTGAATATG
---CTACTCTATTTTTCCCTGCAGA------------AAAATAGGGTTTTTTTGTATGA
-–TCAACTTCCTCGAAATTTGAAGAAT-TATTTTCTCATATTTGGAGGTTTTTTTATGT
---ATGCACAAACTCTCCCTCAACTTTTTTTA--------GTTGAGGTTTTTTATTTGC
Antiterminator
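The terminators in this alignment share the usual shape of a hairpin followed by a run of T residues. The sketch below looks for that shape in a DNA string. It is a deliberately strict illustration (perfect Watson-Crick stem of fixed length, T-run immediately after the stem) with a hypothetical function name and parameters, and it is not the method used to build the alignment. The example sequence is the first terminator row above with the gap characters removed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_terminators(dna, stem_len=6, loop_min=3, loop_max=8, t_run=4):
    """Return (stem_start, loop_len) for perfect hairpins directly followed by a T-run."""
    hits = []
    for start in range(len(dna) - 2 * stem_len - loop_min - t_run + 1):
        left = dna[start:start + stem_len]
        for loop_len in range(loop_min, loop_max + 1):
            right_start = start + stem_len + loop_len
            right_end = right_start + stem_len
            right = dna[right_start:right_end]
            tail = dna[right_end:right_end + t_run]
            # The right arm must be the reverse complement of the left arm,
            # and the next t_run positions must all be T.
            if right == revcomp(left) and tail == "T" * t_run:
                hits.append((start, loop_len))
    return hits


# First terminator row of the alignment, gaps removed.
seq = "GTAAAGCCCCGAATGTGTAAACATTCGGGGCTTTTTGACGCCAAAT"
print(find_terminators(seq, stem_len=11))  # [(4, 6), (5, 4)]
```

Both hits describe the same hairpin shifted by one base pair. A practical search would allow mismatches and G-T pairs, let the stem length vary, and score the hairpin's stability instead of requiring a perfect stem.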
Alternative RNA secondary structures
upstream of riboflavin genes with RFN elements
Attenuation of translation by sequestering of the RBS
Antisequestor
The RFN element
EC
TY
KP
HI
VK
AB
YP
VC
Spu
MLO
AC
BP
BPS
BU
REU
RSO
PP
PY
PU
PA
BME
CAU
TFU
GLU
DR
SM
TQ
AMI
AATCCGCTTATTCTCAGGGCGGGGCG
AACCCGCTTATTCTCAGGGCGGGGCG
ATCTCGCTTATTCTCAGGGCGGGGCG
TTAGCTCGCATTCTCAGGGCAGGGTG
TATTTGCGCATTCTCAGGGCAGGGTG
TAGGCGCGCATTCTCAGGGCAGGGTG
ATGGGGCTTATTCTCAGGGCGGGGTG
CACAACAATATTCTCAGGGCGGGGCG
CTATCAACAATTCTCAGGGCGGGGTG
GACGTTAAAGTTCTCAGGGCGGGGTG
AAGCGACATCGCTTCAGGGCGGGGCG
AAGCAGTACGTCTTCAGGGCGGGGTG
AGTCAGTGCGTCTTCAGGGCGGGGCG
AATCAGTGCGTCTTCAGGGCGGGGTG
CATCGTTACGTCTTCAGGGCGGGGTG
GCTTGGTACGTCTTCAGGGCGGGGTG
GGTCGGTCGGTCTTCAGGGCGGGGTG
GCCGGTAACGTTCTCAGGGCGGGGTG
CGGCGAAACGTTCTCAGGGCGGGGTG
GGCCGTAACGTTCTCAGGGCGGGGTG
CGCGGGCTTGTTCTCGGGGCGGGGTG
AATCCGAAGACCTTCGGGGCAAGGTG
GTACACACGCGTGCTCCGGGGTCGGT
TGAGTTTTGTTCTCAGGGCGGGGCG
GAACCGACCTCTTTCGGGGCGGGGCG
GTCGCAAGCGTTCTCAGGGCGGGGTG
TTCGGCACCTCCTTCGGGGCGGGGTG
CTTACTCACAGTTTCAGGGCGGGGTG
---------------------------------------------------------
RBS-sequestor
TCCGGATGGGAGAGAGTAACG 59 ----------CTGCCCTGATTCTGGTAACCATAATTTTAGTGAGGTTTTT-------TACCATGAATCAGACGCTA
TCCGGATGGGAGAGGGTAACG 61 ----------CTGCCCTGATTCTGGTAACCATAATGTTAATGAGGTTTTTT------TACCATGAATCAGACGCTA
TCCGGATGGGAGAGAGTAACG 61 ----------CTGCCCTGATTCTGGTAACCATAATTTTAATGAGGTTTTTT------TACCATGAATCAGACGCTC
TCTGGATGAAAGAGAATAAAA 41 ----------CAGCCCTGATTCTGGTATTTAATTGAAATCTCAAAT-TAGGAAAT--TACTATGAATCAGTCAATT
TCTGGATGAAAGAGAATAAGC 76 ----------CAGCCCTGATTCTGGTATCTAAATATCTTTATATTTCAAGGAATT--TACTATGAATCAGTCTATT
TCTGGATGAAAGAGAATAAAA 54 ----------CCGCCCTGATTCTGGTATAAATTCATCTTATTAAA—AAGGCATT---TACTATGAATCAGTCATTA
TCCGGATGGGAGAGAGTAACG 194 ----------CCGCCCTGATTCTGGTAATCCATAATTTTTTAATGAGGTTTCT---TTACCATGAATCAGACGCTT
TCCGGATGAGAGAGAATGACA 83 ----------AAGCCCTGATTCTGGTCATTTTTT--------------GGAGTATT--ACCATGAATCAGTCCTCA
TCCGGATGGAAGAGAATGTAA 145 ----------ACGCCCTGATTCTGGATATTCCCATGTCGTATTTTTGAAGGATATTAA-CCATGAATCAGTCTTTA
TCCGGATGAAAGAGGACGAAA 44 -------CGTGCGTCCTGATTCTGGTTCGAAACGGA--------------AGGATGGACCCATGAATCAGCATTCC
TCCGGATGAAAGAAGACGACG 51 ----------CAGTCCTGAAATGTTTAACCGTAATT-------------------TACGAGAGCATTTCATATGTC
TCCGGATGAGAGAAGATGTGC 62 ----------TAGCCCTGAAACGTTTTTCGCCATTTCCTTTTTT------------GCGAGAGCGTTTCAATGTCC
TCCGGATGAAAGAAGATGTGC 86 ----------GAGCCCTGAAACGTTTTTCGCCCATTCATGTTTC-----------GCGAGGAGCGTTTCACATCATG
GCCGGATGGAAGAAGATGTGC 99 ----------ATGCCCTGAAACGTTTTTCGCCCAACTTTT--------------GCGATGAGCGTTTCAACTATGT
TCCGGATGAAAGAAGATGGGC 77 ----------ATCCCCTGAAACGCCCATCCATGGAAATCCACGCAC-------------GGAGCGTTTCAATGCTG
TCCGGATGGAAGAAGATGTGC 80 ---------CGTGCCCTGGAACGTCTTGTCGCCCATTTCA---------------GCGAGGAGCGTTTCCATGTTG
TCCGGATGAAAGAAGGCGTCA 50 ----------TCGCCCCGAGACGTTCATCGATCATTCA------------------CGAGGAGCGTTTCATGTTCA
CCGGATGAAGAGAGAGCGGGA 91 ----------ATGCCCTGTTTTTTCATTAAATT---------------------AAACAGGAGTCAGAACACGTGC
CCGGATGAAGAGAGAACGGGA 68 ----------ACGCCCTGTTTTTCACAC--------------------------AAACAGGAGTCAGAACATGCAA
CCGGATAAAGAGAGAACGGG
53 ---------AAAGCCCTGTTTTTCAC---------------------------GAAACAGGAGTTCGTCATATG-TCCGGATGGAAGAGAGCGAAT 54 ----------GCGCCCTGATTCTAGTTTCGTG--------------------------AGGAACCTATGAACCAAA
TCCGGATGGGAGAAGGTCGGC 116 ------CGCGATGCCCCGAAGGTGTG-----------------------------TTCAGGGGTGTCGCGATGAAC
GGATGGGAGGTAGTACGTGGT 58 -------GCCTTACCCCGGAGCCTGACCT-------------------------GGCTAGGGGGAAGGCTTCTCGCAT
TCCGGATGCAAGAGAACCG
32 ---------AAGGCCCCGAGGATTACATGCTTTTAAATCCTTTGAAAAGGGGACAAGATCATGAATCCTATAACCG
TCCGGACGAAAGAAGGAGGAG
1 GACGCTCAGCTTGCCCCCCA------------------------------------GCAGGCGGCGTCCGCGTATG
TCCGGATGGAAGAGAGCAAGC 45 ATCATTGGAAAAATGCCAACCCTGAAA-------------------GGCTTGAGACCATGACCATACTT
TCCGGATGGGAGAAGGAGGGCCACTTGCGC
TCCGGATGGAAGAAACGGAGCGCCTTATGG
Direct RBS sequestering
The predicted mechanism of the RFN-mediated regulation
of riboflavin genes and operons
• Transcription attenuation
• Translation attenuation
Phylogenetic tree of RFN-elements
New candidate flavin transporters:
1. ImpX, found in Fusobacterium nucleatum and
Desulfitobacterium hafniense:
impX
Has 9 predicted transmembrane segments;
no homology to any known genes.
2. PnuX, found in actinobacteria:
pnuX
Streptomyces coelicolor
Thermomonospora fusca
pnuX
Corynebacterium glutamicum
Has 6 predicted transmembrane segments;
homologous to PnuC (N-ribosylnicotinamide transporter)
Known Thi-box signal in diverse bacterial genomes
(Miranda-Rios et al., 1997)
THIC_EC    TTCGGGATCCGCGGAACCTGA-TCAGGCTAA-TACCTGCG-AAGGGAACAAGAGTTA
THIC_VC    TTCGGGATCCGTTGAACCTGA-TCAGGTTAA-TACCTGCG-AAGGGAACAAGAGAAG
THIC_MLO   GCAGTGACCCGTTGAACCTGA-TCCAGTTCA-TACTGGCG-TAGGGACGGTGCAAGC
THIC_SM    GCAGTGACCCGTTGAACCTGA-TCCAGTTCA-CACTGGCG-TAGGGACGGTGCAGAC
THIC_NM    AGAAATACCCTTTACACCCGA-TCGGGATAA-TACCTGCG-TGGGGAGTTTTCACGG
thiC_BS    TTCTTAACCCTTTGGACCTGA-TCTGGTTCG-TACCAGCG-TGGGGAAGTAGAGGAA
THIC_MT    CCGTCGACCGTACGAACCTGA--CCGGGTAA-TGCCGGCG-TAGGGAGTTGCAAATG
THIT2_TVO  GGATCGACCCTTTGAACCTGA-TCCGGGTAA-TGCCGGCG-GAGGGAAATTATGTCG
thi1_TM    TCCTCGACCCCAAGAACCTGA-TCCGGGTAA-TGCCGGCG-GAGGGATCGGGGAAGG
Notation:
Red: conserved nucleotides;
Green: positions conserved as purine or pyrimidine;
Blue: non-conserved nucleotides
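The three colour classes in the notation can be derived column by column from an alignment. The sketch below assumes that "conserved" means identical in every row and that the gene labels pair with the alignment rows in the order listed. The function name `classify_columns` is hypothetical, and the exact rule used to colour the slide is not given. The input is the segment between the first two gap characters of the THIC_EC, THIC_MLO and thiC_BS rows.

```python
PURINES = set("AG")
PYRIMIDINES = set("CTU")


def classify_columns(rows):
    """Label each column 'conserved', 'RY' (purine/pyrimidine class kept) or 'variable'."""
    labels = []
    for column in zip(*rows):
        bases = set(column)
        if len(bases) == 1 and "-" not in bases:
            labels.append("conserved")
        elif bases <= PURINES or bases <= PYRIMIDINES:
            labels.append("RY")
        else:
            labels.append("variable")
    return labels


segment = [
    "TCAGGCTAA",  # THIC_EC
    "TCCAGTTCA",  # THIC_MLO
    "TCTGGTTCG",  # thiC_BS
]
for position, label in enumerate(classify_columns(segment), start=1):
    print(position, label)
```

With only three rows the labels are more permissive than on the slide, where all nine sequences contribute. Columns containing a gap are never reported as conserved.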
Predicted regulatory THI-elements in bacterial genomes
1
2
3
3'
FACULTATIVE STEM-LOOP
2'
4
5
5'
4'
1'
----====>===> -=====>
<=====
========>
<======= <===
===>
=====>
<=====
<=== <====
BACILLUS/CLOSTRIDIUM GROUP
BS_THIC TAGTTACTGGGGGTGCCCGCT----------------TTCcgGGCTGAGAGAGAAGGCA-------------AGCTTCTTAACCCTTT---GGACCTGA-TCTGGTTCG-TACCAGCG-TGGGGA-AGTAGAGGA
BS_TENA TAACCACTAGGGGTGTCCTTC----------------ATAAGGGCTGAGATAAAAGTGT-------------GACTTTTAGACCCTCA---TAACTTGA-ACAGGTTCA-GACCTGCG-TAGGGA-AGTGGAGCG
BS_YLMB TTCATCCTAGGGGTGCTTTG-------------------CGAAGCTGAGAGAGACTT-----------------TGTCTCAACCCTTT---TGACCTGA-TCTGGATCA-TGCCAGCG-GAGGGA-AGCGGTGAA
BS_YKOF AAAGCACTAGGGGTGCTGT--------------------TTTGGCTGAGATAAAGCGCGGAA-----GAAACGCGCTTTGATCCCTTA---TGACCCGA-TCTGGATAA-TACCAGCG-TGGGGA-AGTGCAGGT
SA_TENA GAACTACTAGGGGAGCCTAAT----------------GATATGGCTGAGATGAATT-------------------GTTCAGACCCTTA---TGACCTGA-TTTGGTTAG-TACCAACG-TAGGAA-AGTAGTTAT
SA_YKOE CACACACTAGGGGTGTTT----------------------TATACTGAGATGAGGCTT---------------GCCCTCAAACCCTTT---GAACCTGA-TCTAGCTTG-AACTAGCG-TAGGAA-AGTGTTACT
LLX_YUAJ TTTGCACAATGGGTCTATTGACAAA---------ACTGTCAGTAGCGAGA----------------------------AATACCATC----TGACCTGA-TCTGGGTAA-TGCCAGCG-TAGGAA-TGTGTTAAG
CA_THIS ATAGTTAACGGGGAGCCTGTA-----------------GACAGGCTGAGAGTGGAATG--------------TGATTCCAGACCCTCA---TAACCTGA-TTTGGATAA-TGCCAACG-TAGGGA-GTTAATGCA
CA_YUAJ TATGTGCTAGGGGTGCCTT---------------------TAGGCTGAGAAACAGTTT--------------GTCACGTTAACCCTT-----AACCTGA-TCTGGATAA-TACCAGCG-TAGGGA-AGCAGTTTG
ST_YUAJ TTTCACAAAGGAGTGCTT-----------------------TGGCTGAGATCGCAA------------------TTGCGAAATCCTGA---GGACCTGA-TCTTGTTAG-TACAAGCG-TAGGGA-TTGTGACCA
DHA_THIC TAATCACTAGGGGGGCCGAATA---------------AGGTCGGCTGAGATAAAGGACCCA---------AGAATCCTTTGACCCTT-----AACCTGA-TCTGGGTAA-TGCCAGCG-TAGGGAAGGTGGATAA
LMO_TENA GAAAAACTAGGGGGGCCGAT-------------------TCTGGCTGAGATAGGAAGGTAAT-----------GCTTTCTGACCCTTT---GAACCTGT-TT--GTTAG-TGCAAGCG-TAGGGA-AGTGAATGT
LMO_YUAJ TTACCACAGGGGGGGCTTC---------------------TTAGCTGAGATTGAGTCCACGTGT-----TTTTGGATTCTGACCCTTT---GAACCTGT-TC--GTTAA-TACGAGCG-TAGGGA-TTGTGGCGA
PROTEOBACTERIA
EC_THIB GTTCTCAACGGGGTGCCACGCGT------------ACGCGTGCGCTGAGAAA---------------------------ATACCCGTCGA---ACCTGA-TCCGGATAA-CGCCGGCG-AAGGGATTTGAGGC
EC_THIM AAACGACTCGGGGTGCCCTTCTGC-------------GTGAAGGCTGAGAAA----------------------------TACCCGTATC---ACCTGA-TCTGGATAA-TGCCAGCG-TAGGGA-AGTCACG
EC_THIC TTTCTTGTCGGAGTGCCTTA-------------------ACTGGCTGAGACCGTTT------------------ATTCGGGATCCGCGGA---ACCTGA-TCAGGCTAA-TACCTGCG-AAGGGA-ACAAGAG
VC_THIC CCACTTGTCGGAGTGCCAT---------------------TGGGCTGAGACCGTTT------------------ATTCGGGATCCGTTGA---ACCTGA-TCAGGTTAA-TACCTGCG-AAGGGA-ACAAGAG
VC_THID CCTGTAGTCGGGGAGCCTGAGAG-- 66 5 71 -AATTAAAGGCTGAGATCGCGT-------------------AGCGAGACCCGTTGA---ACCTGA-TTCAGTTAG-GACTGACG-TAGGGA-ACTATCC
VC_THIB CCCACTCACGGGGGGCCACCCATTCAT-------CCGAATGGCGCTGAGATCAAGCAC---------------TGCTTGGGACCCGCA 21 -ACCTGA-ACCAGATAA-TGCTGGCG-TAGGAATTGAGCTA
XFA_THIC TTTGAAGCGGGGGTACCATAGCCA------------AGCTGCGGTTGAGAC----------------------------ACACCCTTCGA---ACCTGA-TCCGGTTTA-CACCGGCG-TAGGAAAGCTTCGT
MLO_THIC CATTCACCAGGGGAGTCCCGG----------------CAAGGGGCTGAGATACTGCTGGCTTTC------GCGGCGCAGTGACCCGTTGA---ACCTGA-TCCAGTTCA-TACTGGCG-TAGGGACGGTGCAA
MLO_THIB CGCTCTAACGGGGTGCCGGA------ 5 3 5 -----GACCGGCTGAGAGGCAGT------------------CTCGCCAACCCGCTGA---ACCTGA-TCCGGTTTG-TACCGGCG-GAGGGA-TTAGACG
MLO_YK GCCCATCCACAGGGGTGCTCCGTAC-------------GGTCGGGGCTGAGACGGGGGCGG-----------CAAGCCCACAGACCCTAGA----AGCTGA-TCTGGGTAA-TACCAGCG-GAGCGA-GGCGGGCG
NX_CITX CTCCTTGTCGGAGTGCCGCCGC---------------CGGGCGGCTGAGATTGCGA------------------AAGCAGAATCCGTAGA---ACCTGT--CGGGGTAA-TGCCTGCG-TAGGAA-ACAAACC
NX_THIC ATTGAAACAGGGGTGCTGCCTGAT----------GTTTAGGCGGCTGAGAA----------------------------ATACCCTTTAC---ACCCGA-TCGGGATAA-TACCTGCG-TGGGGA-GTTTTCA
ACTINOBACTERIA
MT_THIO CTGTAGACACGGGAGTCCCGGG--------------AGCGGGGTCTGAGAGTGGGCGCGCCT-------------GCCCTTACCGTCAC----ACCTGA-TCCGGATCA-TGCCGGCG-AAGGGAGGTCAAGGATG
MT_THIC GTACCCACGCGGGAGCGCACGC--------------CGAGTGCGCTGAGAGGACGGCTCGGG------------GCCGTCGACCGTACGA---ACCTGA--CCGGGTAA-TGCCGGCG-TAGGGAGTTGCAAATG
CGL_THIC CAGTCCCCACGGGCGCCCGA-----------------GCACGGGCTGAGATCGCGCTGATT---------GCTGCGCGAGCACCGTTTGA---ACCTG--TCCGGTTAG-CACCGGCG-AAGGAAGAGAGGAATGGTGC
CGL_THID ACTAGGCACGGGGTGCCAACCGGATGG---AAAAATTCCGGAGGCTGAGAAA---------------------------ACACCCGTTGA---ACCTGC-TCTAGCTCG-TACTAGCG-AAGGGATGGCCTTAACGTG
CGL_THIE CTTACCCCACGGGTGCCCAAT---------------GCATTGGGCTGAGATTGCGCGCTGT---------TGCTGCGCGGGACCGTTCGA---ACCTG--TCTGGTTAA-CACCAGCG-AAGGAAGCGAGGATTGATTG
CGL_YKOE TCATAGACACGGGTGCTCGGTGA------------AAATCCGGGCTGAGATCTGGCA----------------TAGCCACGACCGTCGA----ACCTG-ATCCGGATAA-TGCCGGCG-ATAGGGAGGAAAAATATG
CGL_OARX TAGTGACACGGGGTGCAAAAGCACTTT----AAAAAAGCTTTCGCTGAGATT---------------------------ACACCCGTCGA---ACCTG-ATCCAGTTAG-TACTGGCG-AAGGGACTGTCGCAT
CYANOBACTERIA
NPU_THIC TCCATGCTAGGGGTGCCTACAT---------------AACCAGGCTGAGATC---------------------------ACACCCTTAAC---ACCTGAGTCTGGGTAA-TACCAGCG-GAGGGAAGCTGTTTATTG
CY_THIC CCATAGCTAGGGGTGTCTAGAA---------------AGCTAGGCTGAGAA----------------------------AAACCCTTAGA---ACCTGAGACTGGGTAA-TACCAGCG-GAGGGAAGCTCACCATTC
AN_THIC TCCATGCTAGGGGTGCTTGCAC---------------TAACAGGCTGAGATT---------------------------ACACCCTTAAC---ACCTGAGACTGGGTAA-TACCAGCG-AAGGGAAGCTGTTTATTG
THERMUS/DEINOCOCCUS, THERMOTOGALES, Fusobacterium, CFB group
DR_THIB CGCGTCACCGGGGGTGCCCTGCTT------------CGGCAGCGGCTGAGAAC---------------------------ACACCCCAGGA---ACCTGA-ACCGGGTCA-TTCCGGCG-GAGGGAGTGTGATGC
DR_THIC ATCGTCAACAGGGGTGCCTCCGCATA--------TGGGCCGGAGGCTGAGAGGGCAACT---------------CGGGCCTAACCCTATGA---ACCTGA-ACTGGTTAG-CACCAGCG-GAGGGA-GTGTGACG
TQ_THIB GGCCGTCACCGGGGGTGCCCCA------------------AAAGGGCTGAGAGC---------------------------ATACCCTTGGA---ACCTGA-TCCGGGTCA-TGCCGGCG-TAGGGAAGGTGACGGCC
TM_THI1 CCTTCCCCAGGGGGAGCTCCTAT---------------TCCGGGGCTGAGAGGAGGACGG-------------AAGTCCTCGACCCCAAGA---ACCTGA-TCCGGGTAA-TGCCGGCG-GAGGGATCGGGGAAGGA
FN_THIC TATATGTACTGGGGAGCTT----------------------TGTGCTGAGATTAGAACCT------------TTTTTCTTAGACCCATAGT---ACCT-GA-TTTGGATAA-TGCCAACG-AAGGGA--GTACCA
FN_THIX ACTAGTTACAAGGGAGTTAATA-----------------AATTGACTGAGAAAAGGATG--------------TGAGCCTTGACCTTTTG----ACCT-GA-TTTGGATAA-TGCCAACG-TAGGAA--GTAAA
PG_THIS AGACCGCTACGGGGGTGCTTGCCG--- 4 3 4 -GATACGGCAGGCTGAGAT---------------------------AATACCCATAG---ACCT-GA-TCCGGATAA-TACCGGCG-GAGGGAT-GTAG
PG_OMR ATTGGGAGAAGGGGTGCTTCCTGTA--- 3 7 3 --GTGGATGGCTGAGAAC---------------------------AAACCCTCATC---ACCT-GA-ACCGGATAA-TACCGGCG-TAGGAAA-CTCTC
BX_THIS TAAAGACAAAGGGGTGCCACC------------------CGGTGGCTGAGATT---------------------------ATACCCTAAGA---ACCT-GA-TGCAGTTAG-TACTGCCG-AAGGGA--TTGTG
ARCHAEA
TAC_T1 GGTGTGGTGGGGGAGCTCCAT-----------------AAGGGGCTGAGAGGATCCGG---------------ATGGATCGATCCCTGGA---ACCTGA-TCCGGGTAA-TACCGGCG-GAGGGAAATTATG
FAC_T1 AGTTATACCGGGGAGCTAA---------------------AATGCTGAGAGGATAA-------------------GGATCGACCCGTGCA---ACCTGA-TCCGGACAA-TACCGGCG-GAGGGAGATGGATA
Conserved RNA secondary structure of the regulatory THI element
[Figure: secondary-structure diagram of the THI-element, showing helices 1 to 5, the facultative stem-loop and the Thi-box.]
Capitals: strongly conserved positions. Dashes and points: obligatory and facultative base pairs, respectively.
Degenerate positions: R = A or G; Y = C or U; K = G or U; M = A or C; N = any nucleotide
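A conserved secondary structure is usually supported by compensatory substitutions: the two arms of a helix differ between genomes but can still pair. The sketch below is a minimal illustration of that check, allowing Watson-Crick and G-U pairs. The arm sequences were picked by eye from the EC_THIC row (`TCAGGCTAA-TACCTGCG`) and the BS_THIC row (`TCTGGTTCG-TACCAGCG`) of the alignment above; which numbered helix they belong to is an assumption and is not asserted here, and the function name is hypothetical.

```python
# Allowed RNA base pairs: Watson-Crick plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(arm5, arm3):
    """True if two equal-length arms (both written 5'->3') form a perfect antiparallel helix."""
    left = arm5.upper().replace("T", "U")
    right = arm3.upper().replace("T", "U")
    if len(left) != len(right):
        return False
    # Antiparallel pairing: first base of the 5' arm pairs with the last base of the 3' arm.
    return all((x, y) in PAIRS for x, y in zip(left, reversed(right)))


if __name__ == "__main__":
    # E. coli thiC arms and B. subtilis thiC arms (taken from the alignment rows).
    print("EC arms pair:", can_pair("CAGG", "CCTG"))
    print("BS arms pair:", can_pair("CTGG", "CCAG"))
    # Mixing arms from the two genomes breaks the helix, as expected for
    # compensatory substitutions.
    print("EC 5' arm with BS 3' arm:", can_pair("CAGG", "CCAG"))
```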
Distribution of THI elements in bacterial genomes
THI-element regulates thiamine biosynthetic genes and transporters.
Genomes                          Genomes analyzed   Genomes with THI   THI elements
Alpha-proteobacteria                     7                  7               15
Beta-proteobacteria                      6                  6               12
Gamma-proteobacteria                    18                 17               38
Epsilon- and delta-proteobacteria        3                  1                1
Bacillus/Clostridium                    18                 18               51
Actinomycetes                            9                  9               25
Cyanobacteria                            5                  5                5
Other eubacteria                        14                 11               11
Archaea (Thermoplasma)                  17                  3                6
Total                                   97                 77              164
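The totals in the table can be checked, and the average number of THI elements per THI-containing genome derived, with a few lines of arithmetic. This is a minimal sketch over the numbers shown above; the variable and function names are assumptions made for the example.

```python
# (group, genomes analyzed, genomes with THI, THI elements), copied from the table above.
ROWS = [
    ("Alpha-proteobacteria", 7, 7, 15),
    ("Beta-proteobacteria", 6, 6, 12),
    ("Gamma-proteobacteria", 18, 17, 38),
    ("Epsilon- and delta-proteobacteria", 3, 1, 1),
    ("Bacillus/Clostridium", 18, 18, 51),
    ("Actinomycetes", 9, 9, 25),
    ("Cyanobacteria", 5, 5, 5),
    ("Other eubacteria", 14, 11, 11),
    ("Archaea (Thermoplasma)", 17, 3, 6),
]


def column_totals(rows):
    """Sum the three numeric columns across all taxonomic groups."""
    return tuple(sum(col) for col in zip(*(row[1:] for row in rows)))


def elements_per_genome(rows):
    """Average number of THI elements per genome that has at least one."""
    return {name: elements / with_thi
            for name, _analyzed, with_thi, elements in rows if with_thi}


if __name__ == "__main__":
    print("Totals (analyzed, with THI, elements):", column_totals(ROWS))
    for name, ratio in elements_per_genome(ROWS).items():
        print(f"{name}: {ratio:.2f} elements per THI-containing genome")
```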
A number of NEW candidate thiamine-related transporters were identified.
The predicted mechanism of the THI-mediated regulation
of thiamin genes
• Transcription attenuation: Bacillus/Clostridium group, Thermotoga, Fusobacterium, Chloroflexus
• Translation attenuation: Thermus/Deinococcus group, CFB group, Proteobacteria
• Direct RBS sequestering: Actinobacteria, Cyanobacteria, Archaea
New functional predictions
Hydroxyethylthiazole transport
(Gram-positive bacteria)
Hydroxymethylpyrimidine transport
(Gram-negative bacteria)
Predicted THI-regulated genes (more enzymes)
• tenA: gene of unknown function, associated with thiD
Found in most firmicutes, some proteobacteria and archaea;
ThiD-TenA gene fusions in some eukaryotes;
Forms clusters with thiD and other THI-element-regulated genes in most bacteria;
A single tenA gene is also regulated by THI-elements in some bacteria;
Not found in genomes without the thiamin pathway;
Always co-occurs with the thiD and thiE genes
• tenI: gene of unknown function, thiE paralog
Found in some unrelated bacteria;
Forms a separate branch in the phylogenetic tree for thiE;
In most bacteria, located in clusters of THI-element-regulated genes.
• ylmB from Bacilli belongs to the ArgE/dapE/ACY1/CPG2/yscS family of metallopeptidases;
regulated by THI-elements in B. subtilis and B. halodurans, not regulated in B. cereus.
• thi-4 from Thermotoga maritima belongs to a family of putative thiamine biosynthetic
enzymes from archaea and eukaryotes. Located in one operon with thiC and thiD.
• oarX from Methylobacillus and Staphylococcus is a single THI-element-regulated gene;
belongs to the short-chain dehydrogenase/reductase (SDR) superfamily
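The co-occurrence argument used for tenA ("always co-occurs with thiD and thiE", "not found in genomes without the thiamin pathway") amounts to a check over gene presence/absence profiles. The sketch below shows one way to express it. The genome names and gene sets are hypothetical toy data invented for this example, not taken from the slides, and the function name is an assumption.

```python
def always_cooccurs(profiles, gene, partners):
    """True if every genome that carries `gene` also carries all `partners`."""
    return all(partners <= genes
               for genes in profiles.values() if gene in genes)


# Hypothetical presence/absence profiles (toy data for illustration only).
TOY_PROFILES = {
    "genome_1": {"thiC", "thiD", "thiE", "tenA"},
    "genome_2": {"thiC", "thiD", "thiE"},
    "genome_3": {"thiD", "thiE", "tenA", "tenI"},
    "genome_4": set(),  # no thiamin pathway, and no tenA either
}

if __name__ == "__main__":
    print("tenA always with thiD and thiE:",
          always_cooccurs(TOY_PROFILES, "tenA", {"thiD", "thiE"}))
    print("thiD always with tenA:",
          always_cooccurs(TOY_PROFILES, "thiD", {"tenA"}))
```

The relation is deliberately asymmetric: tenA requiring thiD does not imply that thiD requires tenA.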
Regulation of cobalamin-related genes:
Experimentally known facts:
Extensive region of the mRNA leader is essential for regulation of the
btuB gene by vitamin B12.
Involvement of highly conserved B12-box rAGYCMGgAgaCCkGCcd
in regulation of the cobalamin biosynthetic genes (E. coli, S.
typhimurium).
Post-transcriptional regulation: RBS-sequestering hairpin is essential
for regulation of the btuB and cbiA genes.
Ado-CBL is an effector molecule involved in the regulation of the CBL
genes.
Identification of other conserved sequence regions and
prediction of a common RNA secondary structure of the
B12-element.
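A degenerate consensus such as the B12-box can be turned into a regular expression and used to scan a leader sequence. The sketch below is a minimal illustration under stated assumptions: lowercase letters in the consensus are treated as strictly as capitals (on the slides they mark weaker conservation, so real B12-boxes that deviate at those positions will be missed), `d` is read as the standard IUPAC code for A, G or T, and the function names are hypothetical. The test sequence is the first row of the B12-box block of the alpha- and beta-proteobacteria alignment.

```python
import re

# IUPAC nucleotide codes used in the consensus, written for DNA (T instead of U).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "D": "[AGT]", "H": "[ACT]", "N": "[ACGT]",
}

B12_BOX = "rAGYCMGgAgaCCkGCcd"  # consensus as given on the slide


def consensus_to_regex(consensus):
    """Compile a degenerate consensus into a regular expression (case ignored)."""
    return re.compile("".join(IUPAC[symbol] for symbol in consensus.upper()))


def find_motif(consensus, sequence):
    """Return 0-based start positions of non-overlapping consensus matches."""
    pattern = consensus_to_regex(consensus)
    dna = sequence.upper().replace("U", "T")
    return [match.start() for match in pattern.finditer(dna)]


if __name__ == "__main__":
    row = "TGACCCGCGAGCCAGGAGACCTGCCACGACGAACAAC"
    for start in find_motif(B12_BOX, row):
        print(start, row[start:start + len(B12_BOX)])
```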
B12-element: a regulator of the cobalamin pathway
[Figure: consensus secondary structure of the B12-element (5' to 3'), showing the main helix, helices 0 to 6, Part I, Part II, the B12-box, additional hairpins I and II, and a facultative hairpin. Legend: Bacillus/Clostridium group; proteobacteria; various taxonomic groups.]
Alignment of B12-elements: alpha- and beta-proteobacteria
Helices: 0, 1, 1', 2, AddI, 2', 3, 3', 4
Proteobacteria
Consensus: hgGtkcy rg aa aGGGAA cgGtg a tCcg RCdG-ycCcCGChaCKGTra
MLO_METE -285 GCGCATGTCGTGGTTCT 22 AGC--TAAGAGGGAA--GCCGGTG 2 ATGCCGGCGCTG-CCCCCGCAACTGTTAGCGGCGAG
MLO_CFRX -237 CCGCTCCAGACGGTCCC 15 GGGGCTAAGAGGGAA--TGCGGTG 16 AATCCGCGGCTG-TCCCCGCAACTGTAAGCGAAGAG
MLO_BTUD -290 GGGTGCGTGATGGTCCC 16 GGGT-GAAAAGGGAA--CACGGTG 16 AGACCGTGGCTG-CCCCCGCAACTGTAAGCGGAGAG
MLO_CBTAB -213 AGTCATGCAGTCGTCGG 13 CC----AAGAGGGAA--TGCGGTG 19 ATGCCGTGGCTG-CCCCCGCAACTGTGTGCGGTAGT
MLO_BLUB -233 CGCCACTGCCTGGTGCC 11 GGA--GAATCGGGAA--CACGGTT 2 ACTCCGTGGCGT--GCCCAACGCTGTAAGGGGGACC
MLO_ARDX -308 ATGTCATCTCAGGTGCC 18 GGA--GAATTGGGAA--GCCGGTC 2 AGTCCGGCGCTG-CCCCCGCAACGGTGGTGGAGTTC
SM_ARDX -310 AGGACACTCAAGGTGCC 16 GGA--GAATTGGGAA--GCCGGTC 2 ATCCCGGCGCTG-CCCCCGCAACGGTGGTGGAGCGA
SM_BTUF -391 CTGGGACCGACGGTTCC 19 GGAT-TAATAGGGAA--CACGGTG 21 AAACCGTGGCTG-CCCCCGCAACTGTAAGCGGATCG
SM_BLUB -251 TGCCGCCGTCAGGTGCC 11 GGG--GAATCGGGAA--GCCGGTG 2 GTTCCGGCACGT-GCCC---AACGCTGTGAAGGGGA
SM_CBTC -255 GATCATGTGATGGTTCC 18 GGAT-GAAAAGGGAA--CACGGTG 21 AAACCGTGGCTG-CCCCCGCAACTGTGAGCGGCGAG
SM_COBU -527 GCAGTATGGATGGTTCT 21 GGAG-TAAATGGGAA--TGCGAAG 23 TTATCGCAGCCG-ACCCCGCGACTGTAGAACGGTCA
PD_COBU -586 AGGTGTTGGATGGTTCC 21 GGAA-TAATTGGGAA--TGTGACG 22 TTATCGCAGCCG-ACCCCGCGACTGTAGAACGGTCA
BME_BTUB -378 TTTCAGGAGACGGTTCC 11 GGAT-GAAAAGGGAA--CACGGTG 14 AAACCGAGACTG-CCCCCGCAACTGTAACCGGAGAG
BME_BTUF -398 ACCGTCATGACGGTTCC 17 GGAT-TAATAGGGAA--CACGGTG 22 AGACCGTGGCTG-CCCCCGCAACTGTAAGCGGATTG
BME_NRDH -558 CTTGTGTTCGAGGTTCT 19 AGCT-AAGACGGGAA--TCCGGTG 23 ATGCCGGAGCTG-CCCCCGCAACTGTAAGCGGCGAG
BME_CBTAB -281 ACCATGTGACAGGTTTT 19 AATACCAAAAGGGAA--TGCGACG 22 TTATCGCAGCCG-ACCCCGCGACTGTAGAGCGGAGA
AU_CFRX -329 AAGGGACTGACGGTCTT 16 AAGC-TAAGAGGGAA--CACGGTT 18 ATTCCGTGGCTG-CCCCCGCAACTGTAAGCGGTAAG
AU_NRDH -257 GTGGTGTTCAAGGTTCT 20 AGCT-AAGACGGGAA--TTCGGTG 23 AGGCCGAAACTG-CCCCCGCAACTGTGAGCGGCGAG
AU_CBTAB -382 ATGTCCGTGATGGTTCC 17 GGT--GAAAAGGGAA--CACGATA 12 CATTCGTGGCTG-CCCCCGCAACTGTGAGCGGAGAG
AU_ACHX -299 TTAGCCATCGTGGT-TC 16 GAGC-TAAGAGGGAA--TTCGGTG 20 AATCCGAAGCTG-CCCCCGCAACTGTAAGCGACGAG
AU_BTUF -386 GAGAAAGCGACGGTTCC 18 GGAT-TAATAGGGAA--CATGGTG 20 ATGCCTTGGCTG-CCCCCGCAACTGTAAGCGGATTG
AU_BLUB -272 TTCTCCGGTCAGGTGCC 9 GGC 4 AATCGGGAA--TCCGGTG 2 AGACCGGAACGT-GCCC-AACGCTGTAAGGCGGATG
BJA_BTUB -321 TGATCGGTGACGGTTCT 9 GAT CAAAAGGGAA--CGTGGTG 30 ACGCCACGGCTG-CCCCCGCAACTGTAAGCGGTGAA
BJA_METE -296 CAAGTCGTCGAGGTTCT 12 GAT 8 AAGAGGGAA--GCCGGTG 3 ATGCCGGCTCTG-CCCCCGCAACTGTGAGCGGCGAG
BJA_CBTC -250 AGGACGGGCATGGTGCT 22 GCA--TAATCGGGAA--TGGGGAT 24 AAACCCCAGCCG-CCCCCGCGACTGTAAGCGGTGAA
BJA_BTUB3 -308 ATGCTCGCGACGGTTTC 11 GAT--GAAAAGGGAA--TGCGGTG 16 ATGCCGCGGCTG-CCCCCGCAACTGTAAGCGGATAA
BJA_CFRX -308 GGCCCGGCGTTGGTTCC 12 GGC--GAAGAGGGAA--TGCGATA 27 AAAATGCAGCCG-CCCCCGCGACCGTGACCGGAGAG
RC_CBTF -327 AAGGCGGGATTGGTTCC 12 GGAT-GAAAAGGGAA--TGCGGTG 12 AACCCGCAGCTG-CCCCCGCAACTGTAAGCGGCGAG
RC_BTUB -313 TGTCCCGTCCAAGTTCC 12 GGAT-TGAAAGGGAA--CACGGAA 14 AGACCGTGGCTG-CCCCCGCAACTGTGAGCGGCGAG
RC_X-CBIP3 -264 GCCCGGGCCTTGGTTCC 14 GGAC-GAAGAGGGAA--GCCGGTG 2 AGTCCGGCGCTG-CCCCCGCAACTGTAAGCGGCAAG
RC_BTUF -361 CCAGCGGCGTCGGTTTC 6 GAAT-TGAAAGGGAA--TCCGTTG 15 GAACCGGAACTG-CCCCCGCAACTGTAGGCGGCGAG
RC_ARDX -246 GAAGGCCTCAGGGTGCC 14 GGA--GAATTGGGAA--GCCGGTG 2 AGACCGGCGCTG-CCCCCGCAACGGTCAGCAATGAG
RC_X-CNOA -200 GGGGCGTCATCGGTCCC 24 GGGGGAAAGAGGGAA--TACGGTG 21 AATCCGTGGCTG-CCCCCGCAACTGTGAGCGGCGAG
RC_X-BTUD -240 TCGCGGCAGATGGTTCC 21 GGT--GAAAAGGGAA--TACGGTG 20 AATCCGTAACTG-CCCCCGCAACTGTAAGCGGCGAG
RC_CFRX -295 GGGCGGGCGCTGGTTTC 13 GC---GAAGAGGGAA-----TGTG 31 CGACCGCAGCCG-CCCCCGCGACCGTGACCGGAGAG
RC_CBIM -282 CAACAGGCGATGGTTCC 10 GGAT-TAATAGGGAA--CACGGTG 21 AATCCGTGGCTG-CCCCCGCAACTGTGAGCGGCGAG
RC_EXBB -322 TGACGTGTTCAAGTTCC 12 GGAT-TGAAAGGGAA--CACGGAA 14 AGACCGTGGCTG-CCCCCGCAACTGTGAGCGGCGAG
RC_CRDX -264 CAGCGGGCCTTGGT-CC 16 GGGG-TAATAGGGAA--GCCGGTG 2 ACTCCGGCGCTG-CCCCCGCAACTGTCAGCGGCAAG
RC_NRDD -272 GTGACGCTCTGGGT-CT 14 AGC--CAAGAGGGAA--GCCGGTG 2 ATTCCGGCGCTG-CCCCCGCAACTGTAAGCGGCGAG
RC04759 -466 CTTGTGGCGATGGTGGC 17 GCCT-GAAAAGGGAA--TGCGGTG 14 AGGCCGCGGCTG-CCCCCGCAACTGTGAGCGACGAG
RS_BLUB -217 GGCAGGGGTCAGGTGCC 10 GGA--GAATCGGGAA--GCCGGTG 2 AATCCGGCGCGG-GCCC-GCCGCTGTGACGGGGATG
RS_BLUE -287 GTGCGGGCGACGGTTCC 14 GGC--GAAGAGGGAA--TGCGGTG 17 AAGCCGCGACTG-CCCCCGCAACTGTAGGCGGCGAG
RS_CFRX -286 TCCGGCGCGCTGGTTCC 14 GGC--GAAGAGGGAA--TGCCCCA 0 --GAGGCAGCCG-CCCCCGCGACCGTGACCGGAGAG
RS_CBTC -267 CGGGCTATGACGGTTCC 19 GGAT-GAAAAGGGAA--CGCGGTG 16 GTTCCGCGACTG-CCCCCGCAACTGTGAGCGGCGAG
RS_BTUB -320 GGAACGGCTTCGGTTCC 12 GGAT-GAAAAGGGAA--CGCGGTG 16 ACTCCGCGGCTG-CCCCCGCAACTGTAGGCGGCGAG
RS_BTUF -365 CAATCCTCGTCGGTTTC 6 GAAT-TGAAAGGGAA--TCCGCCG 15 GAACCGGAACTG-CCCCCGCAACTGTAGGCGGCGAG
SAR_BTUB -400 TTGATCGCGCCGGTGCC 8 GGGCTTAATCGGGAA--TGCGGTG 16 AATCCGCGGCTG-TCTCTGCAACTGTAAGCGGATAG
SAR_COBW -403 ATGATCGCGCCGGTGCC 8 GGCT-TAATCGGGAA--TGCGGTG 16 AATCCGTGGCTG-TCCCTGCAACTGTAAGCGGATAG
SAR_BTUBF -297 CCGACGCCAGAGGTGCC 10 GGCT--AAGAGGGAA--GCCGGTT 2 ATTCCGGCGCTG-CCCCCGCAACTGTAACCGGATAG
CO_METE -339 GCCGTTGTCGTGGT-CT 18 AGC--TAAGAGGGAA--GTCGGTG 16 AATCCGGCGCTG-CCCCCGCAACTGTGAGCGGCGAG
CO_BTUB -318 GCTTCGCGTCAGGTTCC 8 GGAT-GAAAAGGGAA--CGAGGTT 2 AGACCTCGGCTG-CCCCCGCAACTGTAAGCGGCGAG
RPA_HOXN -281 GCGCCCGTTCAGGTGTG 15 CAC------AGGGAA--GCCGGTG 28 AATCCGGCGCTG-CGCCCGCAACTGTGAGCGGTGAG
RPA_BTUB3 -448 TGACCAGCGACGGTTCC 6 GGAT-CAATAGGGAA--CGCGGTG 16 ATTCCGCGGCTG-CCCCCGCAACTGTAAGCGGCGAG
RPA_CFRX5 -383 TTGACGTCTTCGGTGCC 10 GGTG-AAACTGGGAA--TACGGTG 15 AATCCGTAGCTG-CCCCCGCAACTGTAGGCGGATCT
RPA_CRDX -364 TGCCAAGCGATGGTCCT 10 AGGT-GAAAAGGGAA--GCCGGTG 19 ATCCCGGAGCTG-CCCCCGCAACTGTAAGCGACGAG
RPA_METE -297 ATCGCCGTCGAGGTTCT 19 AGCT--AAGAGGGAA--GCCGGTG 2 AGGCCGGCGCTG-CCCCCGCAACTGTTAGCGGTGAG
RPA_COBT2 -412 CCGCTCGCTTCGGTGCC 12 GGTG--AAACGGGAA--TGCGGTG 16 AGTCCGCGGCTG-CCCCCGCAACTGTAAGCGGATCG
RPA_BTUF2 -320 GAGGTTGTACCGGTGCC 13 GGTG--AAACGGGAA--TGCGGTG 15 ATGCCGCAGCTG-CCCTCGCAACTGTGGGCGGATCG
RPA_BTUB -304 ATGGCGGTGACGGTTCC 5 GGGATGAAAAGGGAA--TACGGTG 24 AGGCCGTAGCTG-TTCCCGCAACTGTAAGCGGATCG
RPA_CBIC CGCGCGCCGACGGTGTC 14 GACG--AAGAGGGAA-TATCGGAA 20 GCGCCGAAGCTG-CCCCCGCAACTGTAAACGGTGAG
BPS_HOXN -591 GCTCGCGTTTCGGTGCT 23 AGT--CAAACGGGAA--ACAGGGA 22 CAACCTGTGCTG-CCCCCGCAACGGTAAGCGAAGGC
BPS_BTUB -329 GGCGCCGCCTCGGTGCT 16 GGT--TAAACGGGAA--GCAGGGC 22 CAACCTGCGCTG-CCCCCGCAACGGTAAGCGATCGC
BPS_COBE -391 TGCGCGCGTTCGGTGCC 22 GCC---CAACGGGAA--ACAGGAA 17 CAACCTGTGCTGCCCCCCGCAACGGTAAGCCGCCTG
BPS_COBG -303 GTCCGTCGACCGGCGCC 6 GGC---AAGAGGGAA--CGCAGGG 9 CCGCTGCGGCTG-CCCCCGCAACTGTGAGCAGCGAG
NE_BTUB -343 CCCTTGTTTGAGGTGTC 20 GAT--GAAACGGGAA--GCCGGTG 22 ATGCCGGCACTG-CCCCCGCAACGGTAAATGAGTCA
MFL_BTUB -327 CCAAGTTTTGAGGTGTC 22 GGTG-AAACTGGGAA--ACAGGTG 23 ATGCCTGTGCTG-CCCCCGCAACGGTAAGCAAGCCG
MFL_BTUB2 -411 ACCTCACTTACGGTTTT 19 AAAT--AATAGGGAA--TCCGGTG 16 AATCCGGAACTG-CCCCCGCAACTGTAATCGGTGAG
MFL_NRDA -365 ACACCATCTACGGTGTC 22 GA----AACAGGGAA--TGCGGTC 16 AAGCCGCAGCTG-CCCCCGCAACTGTGACCAGTGAG
REU_BTUB -252 CCCCCGTTCCAGGTGCT 24 AGTT--CAACGGGAA--ACAGGGA 34 CAACCTGTGCTG-CCCCCGCAACGGTAAGCGACCGC
RSO_HOXN -270 CTCACGATGATGGTGCC 7 GGTG--AAACGGGAA--CGCGGTG 2 ATGCCGCGGCTG-CCCCCGCAACTGTAAGCGACGAG
RSO_BTUB -388 CGCCGCGTCCTGGTGCC 16 AGTT--AAACGGGAA--GCAGGGA 22 CAACCTGCGCTG-CCCCCGCAACGGTAAGCGAACGC
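VS (one value per alignment row, in row order): 11, 9, 10, 8, 9, 12, 13, 26, 37, 11, 76, 77, 28, 28, 10, 24, 10, 13, 10, 11, 29, 10, 12, 14, 11, 12, 8, 11, 9, 8, 13, 7, 10, 9, 8, 10, 9, 9, 10, 10, 10, 11, 6, 11, 11, 12, 29, 28, 13, 13, 10, 11, 10, 11, 8, 12, 11, 11, 10, 9, 53, 70, 28, 13, 10, 9, 13, 15, 35, 10, 59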
Helices: 5, 6, AddII, 6', 5', VS
Consensus: gcCACTG, YGGGAAGgc
GGTGTCACTGAGGCGAA-----CGGCCTCGGGAAGACGGG 9
AAAGCCACTGGGACG---------TTCCCGGGAAGGCGGC 11
CATGCCACTGGCCGGC-------AAGGCTGGGAAGGCAGG 9
TATGCCACTGAAGATT------CGTCTTCGGGAAGGTGGG 9
AATGCCACTGTCGA-----------TGACGGGAAGGCACC 9
GAGACCACTGGGCAA--------AAGCCTGGGAAGGTGTC 16
AAGGCCACTGGACACC-------GCGTCCGGGAAGGCGCC 18
CCAGCCACTGCGCGCG-------TTGCGCGGGAAGGCAGA 9
TTTGCCACTGAATATTGA---AGCTATTCGGGAAGGCGGC 8
GATGCCATTGGCCATGA-----ATCGGCTGATAAGGCGGA 8
AAAGCCACTGGCGT--- 69 ---ACGCCGGGAAGGCGAG 76
AAAGCCACTGGCGT-- 119 AAGACGCCGGGAAGGTGAG 64
AAAGCCACTGAAA---- 15 ----AATCGGGAAGGCGGA 10
CATGCCACTGTGCCCA-------CGGCACGGGAAGGCAGA 10
CATGCCACTGGCGA----------AAGCCGGGAAGGCGGG 9
GCAGCCACTGGAAATCAGA-TGGATTTCTGGGAAGGCGCT 10
AAAGCCACTGAACCTTTA-TGATCGGTTCGGGAAGGCGGT 12
TGAGCCACTGGAGCCAA-----AAGCTCCGGGAAGGCTGG 11
AATGCCACTGGCAA--- 29 --AATGCCGGGAAGGTGTT 8
CATGTCACTGAGGCC--------GGCCTCGGGAAGACGGA 9
CATGCCACTGTTTTTTT----CGGAATGCGGGAAGGCAGA 10
CATGCCACTGAAGC----------AATTCGGGAAGGCGAA 9
TATGCCACTGGGAATCT-----CGGTCCTGGGAAGGCGAC 9
GATGTCGCTGAAGCCTGC---ACGGCTTCGGGAAGGCCGG 10
ACCGCCACTGGGCCGCA------AGGTCCGGGAAGGCCGG 10
GAAGCCACTGGGTCCC-------GGTCCCGGGAAGGCGAC 10
GAGGCCACTGATCCCTG----ACGGGATCGGGAAGGCGGG 18
GATGCCACTGGGGAT---------GCCCCGGGAAGGCCGA 9
CATGCCACTGGGAT----------TTCCCGGGAAGGCCGA 8
ACGGCCACTGATCGC--------AGGATCGGGAAGGCGCA 9
AAAGTCACTGTGGCGC------ATGCCATGGGAAGGCCGC 11
AAGGCCACTGGACCCC------GGGGTTCGGGAAGGCGCT 18
GATGCCACTGGGCCTG------CCGGTCCGGGAAGGCCGG 8
GCAACCACTGGCCCCGAC--CGCGGGGCCGGGAAGGTGGG 7
GAGGCCACTGGCAC----------CAGCCGGGAAGGCGGG 24
AAGACCACTGGCCC--- 18 -AACGGCCGGGAAGGTGAC 10
CATGCCACTGGGCC----------CGCCCGGGAAGGCCGA 8
ACACCCACTGGCCC----------CGGCCGGGAAGGGGCA 9
CACGCCACTGGGCCC--------CGGCCCGGGAAGGCCAG 8
AATGCCACTGGGCCC--------GCGCCCGGGAAGGTCCG 10
GAGGCCACCGGTT------------CGCCGGGAAGGCGCC 9
ATGCCCACTGGCCCAGG-----ACAGGCCGGGAAGGGCGG 9
CAAGCCACTGGCCGCA--------AGGCCGGGAAGGCGGG 23
GACGCCACTGGACCGAA-----AGGGCCCGGGAAGGTTCG 10
GGCGCCACTGGGAT----------GTCCCGGGAAGGCCGG 9
AAAGCCACTGTGGCCTC-----AAGCCATGGGAAGGCCGC 10
GCAGCCACTGGGCCGG- 15 --AGGTCTGGGAAGGCGTG 10
GCAGCCACTGGGCAG-- 10 -CAAGTCTGGGAAGGCGTG 10
CATGCCACTGGTGTCGG- 6 CCAGCACCGGGAAGGCGGG 10
CGTGTCACTGACGCG-- 15 GATGCGTCGGGAAGGCCAG 11
CATGCCACTGGGCCCAA-----AAGGCCTGGGAAGGCGAC 12
TCGGCCACTGGGCAGCA-----CTTGCCCGGGAAGGCGAA 9
CACGCCACTGGGCTTT-------CGTCCTGGGAAGGCGGT 9
GTAGCCACTGACGTCCT-----CGGCGTCGGGAAGGCGGT 7
GAGGCCACTGGGAA----------TTCCTGGGAAGGCGGC 9
AAAGCCACTGGGAGC--------GATCCCGGGAAGGTCGA 10
TCCGCCACTGAG----- 18 -----CTCGGGAAGGCGAC 7
CATGCCACTGACCAGA-------TCGGTCGGGAAGGCGGA 8
GATGCCACTGGGAACCT-----CGGTCCTGGGAAGGCGAC 6
TACGCCACTGGATCA---------TATCCGGGAAGGCCGC 8
CCAACCACTGGACGCAT-----CGCGTCCGGGAAGGTGAA 5
GGTGCCACTGCGCTTC-------GCGCGCGGGAAGGCGAG 5
ACGGCCACTGTCCTC--------GCGGATGGGAAGGCGGC 7
CACGCCACTGGCCAC-- 16 ----CGCCGGGAAGGCCCG 10
CACGCCACTGTGCTGT------ATGGCACGGGAAGGCGCA 20
CATGCCACTGTGAAAGA-----CCTTCATGGGAAGGCGGC 9
CTTGCCACTGGACTT---------GATCCGGGAAGGCCGC 11
AAGGTCACTGGGCCTGG- 5 TGAGGCCCGGGAAGACAGG 10
AACGCCACTGAATC--- 17 -----ATCGGGAAGGCGGC 6
CCAGCCACCGCACG----------ATGCCGGGAAGGCGGC 9
CATGCCACTGTTCCG--------CGGAACGGGAAGGCGGC 6
Helices: 4', 0'
Consensus (B12-box): rAGYCMGgAgaCCkGCcd
TGACCCGCGAGCCAGGAGACCTGCCACGACGAACAAC
TGACCCGCGAGCCAGGAGACCTGCCGTCTGCGACAAA
AGACCCGCGAGCCAGGAGACCTGCCATCACTGAGTTG
TGATCCGTGAGCCAGGAGACCTGCCGACGACGGCAAA
TTGATCCCGAGCCAGAAGACCGGCCTGGCAGGCATCG
ACACTCCAGAGCCCGGAAACCAGCCCGAGATTTTTGA
CGGCTCCAGAGCCCGGAAACCAGCCTTGAAGCAGAAA
CTGTCCGTGAGCCAGGAGACCTGCCGTCAAATCGATC
ATGATCCGAAGTCAGAAGACCGGCCTGGCGAGATAGA
CGACCCGCAAGCCAGGAGACCTGCCATCACCTTGGGC
GACGCCGTGAGCCAGGAGACCTGCCATCCGTCAGGGC
GACGCCGTGAGCCAGGAGACCTGCCATCCGGCATGGG
AGACCCGGAAGTCAGGAGACCTGCCGTATCCGGTCAC
TTATCCGCAAGCCAGGAGACCTGCCGTCTTACGTAGT
TGAGCCGTGAGCCAGGAGACCTGCCTTGAGCGTGAAC
AGACCCGCGAGCCAGGAGACCTGCCTGTTGCATGAGG
ATAGCCGCAAGCCAGGAGACCTGCCGTTTCAGGAAAA
TGACCCGCAAGTCAGGAGACCTGCCTTGAGCGCAAAT
TGACCCGTAAGCCAGGAGACCTGCCATCACGGAAATA
TGACCCGCAAGCCAGGAGACCTGCCGCGATAGATAAC
AAATCCGTGAGCCAGGAGACCTGCCGTCAAAATGGAA
-TGAAGCTTAGTCAGAAGACCGGCCTGGCAGGATAGA
CGACCCGCGAGCCAGGAGACCTGCCGTCAGCCGTGGT
TGACCAGCAAGCCAGGAGACCGGCCCCGACAATATAT
TGAACCGCGAGCCAGGAGACCGGCCGTGCATGTTTTG
CGACCCGCGAGCCAGGAGACCTGCCGTCAGCCGTGGT
TGCTCCGCAAGCCGGGAGACCTGCCAGCGCGGACGAT
GGACCCGCAAGCCAGGAGACCTGCCACCCCCCGGGCC
AGACCCGTGAGCCAGGAGACCTGCTTGGACGATCACC
CGAGCCGCAAGCCAGGAGACCTGCCAGGCCGAAACCA
-GACCCGCCAGTCAGGAGACCTGCCGACACGTCGAAA
CGCATTGCAAGCCCGGAAACCAGCCCTGTGACCGCCG
AGACCCGCAAGCCAGGAGACCTGCCTGTGATGCGCCC
CGACCCGCAAGTCAGGAGACCTGCCATCAGCGTCATC
GCATCCGCAAGCCGGGAGACCTGCCAGCGCATGGATT
CGAACCGCAAGTCAGGAGACCTGCCATCGCTCTGGCG
AGACCCGCGAGCCAGGAGACCTGCTTGGACATTCACC
CGAGCCGCAAGCCAGGAGACCTGCCAGGCCAAAGACC
TGACCCGCAAAGCAGGAGACCTGCCCGAGCCTTGATG
-GACCCGCAAGCCAGGAGACCTGCCATCGCAGACGTT
ATGAACCGGAGCCAGAAGACCGGCCTGACGCAGAGGT
AGACCCGCGAGTCAGGAGACCTGCCGTCGACGGACCT
ACATCCGCAAGCCGGGAGACCTGCCAGCGCTGAGACT
CGACCCGCGAGTCAGGAGACCTGCCGTCGAGCGCGCA
CGACCCGCAAGTCAGGAGACCTGCCGGAGCGATCACC
TGACCCGCCAGTCAGGAGACCTGCCGGCGTTCGATCT
TGACCCGTGAGCCAGGAGACCTGCCGGCGTCTGGTCG
TGACCCGCGAGCCAGGAGACCTGCCGGCGCACCGGTC
TGACCCGGAAGCCAGGAAACCTGCCCTTGGTTGTCGT
CGACCCGTGAGCCAGGAGACCTGCCTCGACAGATAAC
TGACCCGTGAGCCAGGAGACCTGCCCGGCGCAGTCGT
CGACCCGTGAGCCAGGAGACCGGCCTGAGTACGTCAT
CGACCCGCGAGCCAGGAGACCTGCCGTCAGTCGTGGT
ATATCCGTGAGCCAGGAGACCGGCCGAAGACGGGAAG
CGACTCGCGAGCCAGGAGACCTGCCATCGCGTATTGT
TGACCCGCGAGCCAGGAGACCTGCCTCGTCGAACGAA
ATGTCCGCGAGCCAGGAGACCGGCCGAAGTCCGCAAC
ATATCCGCGAGCCAGGAGACCGGCCGGTACAAGGTGT
TCAACCGCGAGCCAGGAGACCTGCCGTCATTCGTGGT
CGACCCGTGAGCCAGGAGACCTGCCGTCGCCTGCTAT
GTTTTCGTCAGCCCGGATACCGGCCGAGACACGGGGC
GACGTCGCGAGCCCGGATACCGGCCGAGGCGGGGAGG
CACGCGGCCAGCCCGGATACCGGCCGACGCACGGGGC
CGACCTGCCAGCCAGGAGACCTGCCGGGACGTTTCGT
CCGCTCATAAGTCCGGAGACCGGCCTGAAGCAATATC
AATCTTGCAAGCCCGGAGACCGGCCTGAAAACGATCA
TGACCCGAGAGTCAGGAGACCTGCCGCAAGTGAGCTA
GGACCTGGGAGCCAGGAAACCTGCCGTAGATCATTTT
GATGTCGTCAGCCCGGATACCGGCCTGCAGAACGAGG
TGACGCGCGAGCCAGGAGACCGGCCATCTCCTTCTGT
CGGTTCGCCAGCCCGGATACCGGCCAGGACAGTGGGT
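Rows in these alignments mix sequence segments with bare integers, which appear to give the length of an omitted variable region. The sketch below is a minimal illustration of how such a row could be parsed: it splits a row into label, upstream position and segments, and totals the nucleotides plus omitted lengths. The row format (label, optional negative position, then alternating sequence and integer tokens) is an assumption read off the MLO_METE entry of this alignment, and the function names are hypothetical.

```python
import re


def parse_row(row):
    """Split an alignment row into (label, position, segments).

    Segments are sequence strings (possibly with '-' gaps) or integers
    that stand for the length of an omitted region.
    """
    tokens = row.split()
    label, rest = tokens[0], tokens[1:]
    position = None
    if rest and re.fullmatch(r"-\d+", rest[0]):
        position = int(rest[0])   # position given as a negative offset
        rest = rest[1:]
    segments = [int(tok) if tok.isdigit() else tok for tok in rest]
    return label, position, segments


def row_length(segments):
    """Nucleotides shown in the row plus the lengths of omitted regions."""
    total = 0
    for segment in segments:
        if isinstance(segment, int):
            total += segment
        else:
            total += sum(ch.isalpha() for ch in segment)  # skip '-' gap characters
    return total


# MLO_METE entry (label, position and first alignment block) from the slides.
EXAMPLE_ROW = ("MLO_METE -285 GCGCATGTCGTGGTTCT 22 AGC--TAAGAGGGAA--GCCGGTG 2 "
               "ATGCCGGCGCTG-CCCCCGCAACTGTTAGCGGCGAG")

if __name__ == "__main__":
    label, position, segments = parse_row(EXAMPLE_ROW)
    print(label, position)
    print("omitted lengths:", [s for s in segments if isinstance(s, int)])
    print("total length:", row_length(segments))
```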
Alignment of B12-elements (continued): gamma-proteobacteria and the Bacillus/Clostridium group
EC_BTUB -248
SY_BTUB -252
SY_CBIA -265
KP_CBIA -264
KP_BTUB -245
YP_BTUB -324
YE_BTUB -288
YE_CBIA -282
EO_BTUB -360
VC_BTUB -326
PA_BTUB -297
PA_BTUB2 -297
PA_COBW -305
PA_COBG -244
PA_CBTAB -245
PP_BTUR -334
PP_BTUF -302
PP_BTUB2 -319
PP_COBW -299
PP_CBTAB -309
PU_BTUR -300
PU_COBW -302
PU_CBTAB -335
PY_COBW -331
PY_BTUR -298
PY_CBTAB -303
PY_BTUF -321
SON_BTUD -303
SON_BTUB -332
AV_BTUB -302
XAX_BTUB -327
BS_BTUF -237
ZC_METE -309
HD_ACHX -377
HD_BTUF -401
HD_METE -322
HD_COBT -247
HD_NRDA -345
BE_NRDA -318
BE_BTUF -333
BE_CBIW -346
BI_CBIW -566
LMO_X -318
LMO_CBIA -332
CA_BTUF -332
CA_CBIM -307
CPE_CBIM -480
CPE_CBIK -294
CPE_BTUF -482
CPE_CBLT -537
CB_CBIP -317
DF_CBIM -367
DF_CBIP -287
DF_BTUF -393
THT_BTUR -337
THT_BTUF -352
EF_BTUF -340
HMO_CBIM -396
HMO01408 -271
HMO_CBIQ -382
HMO_CBLS -306
HMO_CBID -294
DHA_BTUF -297
DHA_CBIET -298
DHA_CBLS -282
DHA_CNOA -352
DHA_NRDD -316
DHA05379 -325
Helices: 0, 1, 1', 2, AddI, 2', 3, 3', 4, VS, 5, 6, AddII, 6', 5', VS, 4', 0'
Consensus: hgGtkcy rg aa aGGGAA cgGtg a tCcg RCdG-ycCcCGChaCKGTra VS gcCACTG YGGGAAGgc VS rAGYCMGgAgaCCkGCcd
ATCCACTTGCCGGT-CCTGTGAGTT--AATAGGGAA--TCCAGTG 2 AATCTGGAGCTG--ACGCGCAGCGGTAAGGAAAGGT 16 GCGGACACTGCCAT----------TCGGTGGGAAGTCATC 19 ACCCCTCCAAGCCCGAAGACCTGCCGGCCAACGTCGC
ATCCGTGGGCCGGT-CCTGTGAGTT--AATAGGGAA--TCCAGTG 2 AATCTGGAGCTG--ACGCGCAGCGGTAAGGAAAGGT 15 GCAGACACTGCCTC-----------CGGCGGGAAGTCATC 24 AACCCTCCAAGCCCGAAGACCTGCCGGCTAACGTCGC
GTAAACCAACAGGTTTG 12 T--------AGGGAA--GGGGGTG 2 AATCCCCCGCAG-CCCCCGCTGCTGTGATGCTGACG 8 AAGACCACTGATCGC--------AAGATTGGGAAGGACGG 6 AGGACGCTAAGCCAGAAGACCTGCCTGTCGGTGATAA
ACAAACCGACAGGTTCG 15 C--------AGGGAA--GGGGGTG 2 AATCCCCCGCAG-CCCCCGCTGCTGTGATGCTGACG 8 AAAACCACTGATCGA--------AAGATTGGGAAGGGCGG 6 ACGAGGCTAAGCCAGAAGACCTGCCTGCCGGTAACTG
ATTCGCCTACCGGT-CCTGTGAGTT--AAAAGGGAA--CCCAGTG 2 AATCTGGGGCTG--ACGCGCAGCGGTAAGGAAGGTG 19 GCAGACACTGCGGCT--------AGCCGTGGGAAGTCATT 11 CAGCCTCCAAGCCCGAAGACCTGCCGGAATACGTCGC
CATTGTGGTCCGGC-CT 22 AGAGTTAAAAGGGAA--TCCGGTG 2 AATCCGGAGCTG--ACGCGCAGCGGTAAGGGGAAGT 18 ACAGACACTGTCCGC--------AAGGATGGGAAGTCATC 67 GAGATCCTAAGCCCGAAGACCTGCCGGTATTACGTCG
CATTGCGGTCCGGC-CT 22 AGAGTTAAAAGGGAA--TCCGGTG 2 AATCCGGAGCTG--ACGCGCAGCGGTAAGGGGAAGT 18 CCAGACACTGTCCGT--------AAGGATGGGAAGTCATC 32 GAGATCCCAAGCCCGAAAACCTGCCGGTATACGTCGC
ATACTGAAACAGGTATG 15 T--------TGGGAA--GGGGGTG 2 AATCCCCCGCAG-CCCCCGCTGCTGTGATGCTGACG 8 GAGACCACTGATCCAT-------AGGATTGGGAAGGTAGC 8 GTGACGCTAAGCCAGAAGACCAGCCAAATCAGTAAAG
GATGAGCGTCCGGC-CTT 7 AAGTC-AAAAGGGAA--TCCAGTG 2 AATCTGGAGCTG--ACGCGCAGCGGTAAGGAATGCC 17 GCAGACACTGTTAT--- 80 ----CGATGGGAAGTCATC 45 CGGCATCCAAGCCCGAAGACCTGCCGGAATACGTCGC
AGCGCCAAGCTGGTGCT 26 GGCT-GAAAAGGGAA--TCCGGTG 2 ACTCCGGAACTG--ACGCGCAGCGGTAAGAGAGAAC 9 AACGACACTGCTTTT---------CGAGTGGGAAGTCGAG 14 GTGCTCTCAAGTCCGAAGACCTGCCAGCAACTGAGTT
GCCTTGCGACAGGTGCC 8 GGTG-AAACAGGGAA--GCTGGTG 15 AGGCCAGCGCTG-CCCCCGCAACGGTAGGCGAATCA 12 ATGACCACTGTGCTC--------CGGCATGGGAAGGCGCG 19 TCGCTCGCGAGCCCGGAGACCGGCCTGACGCACCCAC
GGCCCGTTCCAGGTGCC 18 GGTG--AAACGGGAA--GCCGGTG 16 AGTCCGGCGCTG-CCCCCGCAACGGTAAGCGAGCGA 9 CAGGCCACTGTGCTC--------CGGCATGGGAAGGCGAG 9 ACCCTCGCAAGCCCGGAGACCGGCCTGCAACGCCCTG
TGCCGGTTCGAGGTTCC 16 GGC--TAAGAGGGAA--CGCGGTC 1 ATGCCGCGGCTG-CCCCCGCAACTGTGAACGGCGAT 8 AATGCCACTGCGTG-----------ACGCGGGAAGGCGGG 16 CAGACCGTGAGCCAGGAGACCTGCCTCGTCGATCCCG
GCGCGTTCGTCGGTGCC 37 ------AAGAGGGAA--CACGGAG 25 TAGCCGTGGCTG-CCCCCGCAACTGTATGCAGCCTG 11 TTCGCCACTGGAT------------TACCGGGAAGGCGGC 33 CGGGCTGCGAGCCAGGAGACCTGCCGCCGAAACCAGT
GGGTTGTCCCAGGTGTC 17 AGGT-GAAACGGGAA--GCCGGTG 14 AGTCCGGCGCTG-CCCCCGCAACGGTAAGCGCATC---------------------------------------------------GCGCGCGAGCCCGGAGACCGGCCTGGAACCTTTCG
GGCGTGTTTCAGGTGCC 21 GGTG-AAACTGGGAA--GCCGGTG 17 ATTCCGGCGCTG-CCCCCGCAACGGTGGATGAGTAA 10 AGGGCCACTGGATGCC------AGCATCCGGGAAGGCGCG 17 CCACTCACAAGCCCGGAGACCGGCCTGATACTGCCAA
TGCGGGCCGCCGGTTTC 7 GAAC-TAACAGGGAA--TCCCAGG 15 CAATCGGAACTG-CCCCCGCAACTGTAGGTGCCGAG 11 GATGCCACTGGGCCTG-------CCGCCCGGGAAGGCCGG 11 -GACGCACCAGTCAGGAGACCTGCCGGCCTACATTCA
CGCCAGTTTCAGGTGCC 18 GGTG--AAACGGGAA--ACCGGTG 19 AGTCCGGTGCTG-CCCCCGCAACGGTAAGCGAGCGA 9 GATACCACTGTGCTC--------AAGCATGGGAAGGTGAA 9 CCCCTCGCAAGCCCGGAGACCGGCCTGGAGCTTCACT
TGCCACTTCGAGGTTCT 13 AGCT-AAGACGGGAA--CGCGGTA 1 AAGCCGCGGCTG-CCCCCGCAACTGTAAGCACCGAC 11 ACAGCCACTGCGCCA--------ACGCGCGGGAAGGCGTC 27 AACGGTGCAAGCCAGGAGACCTGCCTCGTCACGTTTT
CCTCGCGTTCAGGTGCC 8 GGTG--AAACGGGAA--ACCGGTG 29 ATGCCGGTGCTG-CCCCCGCAACGGTAAGCGAGTGA 6 TGTACCACTGTGCCTCGT-AGTACGGCATGGGAAGGTGAC 20 TTCCTCGCAAGCCCGGAGACCGGCCTGGCGTTCATGA
GGCTTGTTTCAGGTGCT 19 AGTG-AAACAGGGAA--GCCGGTG 30 ATCCCGGCGCTG-CCCCCGCAACGGTAAATGAGTAA 11 GATGCCACTGCTTA----------ACAGCGGGAAGGCGCG 14 CCGCTCATGAGCCCGGAGACCGGCCTGATCCATCCAG
TGCGCTTTCGAGGTTCT 14 AGCT-AAGAAGGGAA--CGCGGTC 1 AAGCCGCGGCTG-CCCCCGCAACTGTGAACGGTGCT 9 CACGCCACTGCCAA--- 12 ---CCAGCGGGAAGGCGCA 22 AACACCGTCAGCCAGGAGACCTGCCTCGTCACAGATT
AACTTGTTACGGGTGCC 8 GGTG--AAACGGGAA--ACCGGTG 29 AGTCCGGTGCTG-CCCCCGCAACGGTAAGCGAGCGA 6 AGATCCACTGTGCCCA------CGGGCATGGGAAGGTGAC 23 CCCCTCGTGAGCCCGGAGACCGGCCCGCAACACACAG
TGCCGGTTCGAGGTTCT 25 AGCT-AAGACGGGAA--TGCGGTA 1 ATGCCGCAGCTG-CCCCCGCAACTGTAAACGGTCAT 9 ACAGCCACTGCTG------------CGGCGGGAAGGCGCG 39 GCTGCCGTGAGCCAGGAGACCTGCCTCGAACCGGGCT
GGCTTGTTTCAGGTGCT 20 GGTG-AAACAGGGAA--GCCGGTG 16 ATCCCGGCGCTG-CCCCCGCAACGGTAAATGAGTCA 11 CGTGCCACTGTGTTT--------CGACACGGGAAGGCGCG 13 CCGCTCATGAGCCCGGAGACCGGCCTGAACCACTCAA
ACCTTGTTTCGGGTGCC 8 GGTG--AAACGGGAA--ACCGGTG 18 AGTCCGGTGCTG-CCCCCGCAACGGTAAGCGAGAGA 5 TGATCCACTGTGCTC--------TGGCATGGGAAGGTGAC 30 CCCCTCGCGAGCCCGGAGACCGGCCCGACATTTTTCC
TGCCGGCCGTCGGTTTC 6 GAAC-TAACAGGGAA--TTCGCCA 17 AAAACGAAACTG-CCCCCGCAACTGTAGGCATCGAG 11 ACTGCCACTGGATTC--------AGATCCGGGAAGGCCGG 11 -GACATGCCAGTCAGGAGACCTGCCGACCCGATTCAA
--------------------------TAATAGGGAA--TCGGGGC 13 CAGCCCGAACTG-TACCCGCAACTGTGAGTAGTTAA----------------------69------------------------TTTTCTACAAGTCAGGAGACCTGCCTATTGCTGTTTT
CAACCTTCTGTGGTGCT 18 AGA--TAATCGGGAA--GCCAGTG 2 ATTCTGGCACTG-CCCCCGCAACGGTAAAAGGTGAG----------------------89------------------------TATAGCCTAAGTCCGGAGACCGGCCCTAAAGGTGTTT
GCCTCGCTTCAGGTGCC 5 GGTG-AAACAGGGAA--GCCGGTG 24 AGGCCGGCGCTG-CCCCCGCAACGGTAGACGAGTCG 10 ATAGCCACTGTGTTGC-----TCGGACACGGGAAGGCGCG 25 TCGCTCGTGAGCCCGGAGACCGGCCTGTGGCGATCCA
CGCGCCCCTGAGGTGAC 16 GTTT--AAACGGGAA--TCCGGTG 24 ATTCCGGAGCTG-CCCCCGCAACGGTGGGCGAGGTC 11 TACGCCACTGTGCAG--------TCGCATGGGAAGGCGCG 19 CCACTCGCAAGCCCGGAGACCGGCCTGAGGGATTGAC
AATGTCAAATAGGTGCC 18 GGCT-TAAAAGGGAA--ACCGGTA 1 AAGCCGGTGCGG-T-CCCGCCACTGTAATTGGCCAA-------------------------------------------------GCGCCAAGAGCCAGGATACCTGCCTGTTTGATCAGC
AAAGGAAAATAGGTACA 16 TGTT-TAAAAGGGAAG-CTTGGTG 2 ACTCCAACACGG-T-CCCGCCACTGTAAATGCTGAG 9 TGGTGCCACTGTGA-----------AAACGGGAAGGTAAA 10 TGAAGCATAAGTCAGGAGACCTGCCTGTTTTAACAAC
CTCAAGCATTAGGTGGT 16 ATCT-GAAAAGGGAA--GCTGGTG 2 AGTCCAGCACGG-T-CGCGCCACTGTAATAAGGAGC 10 GAAACCACTGTCCAA---------AGGATGGGAAGGTACA 9 -TTATCTTAAGTCAGGAGACCTGCCTAATGTATGCAC
TCGCGCTGAAGGGTCGT 11 GCGT-GAAAAGGGAA--GTCGGTG 2 AATCCGACACGG-T-CCCGCCACTGTAAATGGGAGA 8 AGATCCACTGTCTA----------GCGACGGGAAGGGGGC 9 ATGAACATAAGTCAGGAGACCTGCCTTTCAGTTTGAG
GTTTGGGAACAGGTACG 22 TGTT-TAAAAGGGAA--TCCGGTG 2 AATCCGGAGCGG-T-CCCGCCACTGTCATAGCTGAG 10 ATTGTCACTGACCGTTC-----ATTGGTTGGGAAGACTGT 8 TGACGCTAGAGCCAGGAGACCTGCCTGTTCTAACAGC
TAGGCTTCTTAGGTGCC 9 GGA--GAATAGGGAA---GTTCTG 2 A---CGACGCGG-AGCCCGCCACTGTAGTCGAGGAG 7 AATACCACTGGGA------------AACTGGGAAGGTGTA 8 -TGAATCGGAGCCAGGAGACCTGCCTAAGAAGATGCG
GTGGACGGTAAGGTGCC 6 GGCT-TAAAAGGGAA--TCTGGTG 2 AATCCGGAGCTG-TCCCCGCAACTGTGAGTGCTACG 10 TTTGCCACTGTACATC- 14 AAATGTATGGGAAGGCTTC 8 TAAAGCACGAGTCAGGAGACCTGCCTTACTTCCACAA
TGCCAAGCAATGGTGTC 6 GACT-TAATAGGGAA--TCCGGCG 2 AATCCGGAACTG-CCCCCGCAACTGTATGTGCGGAC 8 ATGGCCACTGGCGGCA- 14 -CGCCGCTGGGAAGGCCCC 9 CGATGCACGAGTCAGGAGACCTGCCTTGCTTGGAACG
ATTCGCAGCAAGGTGCC 6 GGCT-TAATAGGGAA--TCCGGTG 2 AATCCGGAGCTG-TCCCCGCAACTGTCAATGCGGAC 8 ATCGCCACTGTACGGAC 18 -TCCGTACGGGAAGGCTTC 9 TGAAGCATGAGCCAGTAGACCTGCCTTGCTTGCCGCA
AGCCTGCTTAAGGCTTGGGT-AG----AAAGGGGAAG-CCCGGTG 3 AATCCGGCACGG-TGCCCGCCACTGTGGTGGGGAGC 10 CAAGTCACTGAAGGA--------TGCTTCGGGAAGACGCC 8 ATGATCCTAAGTCAGGAGACCTGCCTTGTTTGGATCG
GCAACAGTAAAGGTGCC 5 GGCT-TAATAGGGAA--ACTGGTG 2 AGACCAGTACTG-CCCCCGCAACTGTAAGTGTGGAC 8 ATAACCACTGTGAAAA-------AATCACGGGAAGGTTCT 9 TGATACACAAGTCAGGAGACCTGTCTTTATTGTGAAG
GGTCTTATGTTGGTGGA 12 TTCT-GAAAGAGGAA--TTCGGTG 2 ATGCCGAAACTG-CCCCCGCAACTGTAAGGTGGACA 9 ATAACCACTGTACGTTTT---TAGCGTATGGGAAGGTTCG 8 ATGAAGCCAAGTCAGGATACTCGCCAAATAAGACGGA
ACAACTAAATAGGTGAA 4 TTA---ATCCGGGAA--AGAGGTG 2 AATCCTCTACAGGCCCTAGCTACTGTAATACGGACG 11 TATGTCACTGGAAGC--------AATTCCGGGAAGACTGG 8 ATGATGTTAAGTCAGGAGACC-GCTTTTATATTCGAT
ACCATATTTTAGGCACC 8 GGTT-TAATAGGGAA--ATTGGTG 2 AATCCAATGCAA-CCCCCGTTACTGTATACAGTTAC 7 ATGTCCACTGGAGTT--------TTCTCTGGGAAGGATGG 7 TAAACTGTGAGCCAGGAGACCTACCTAAAATATTATG
TAAAATTTGTAGGTTCA 16 TGAT-TAAAAAGGAA--TCAGGTG 2 AAGCCTGAGCGG-T-CCCGCCACTGTAATAAAGGAG 11 TATGTCACTGGGA------------AACTGGGAAGGCGTA 10 -GATTTTTGAGCCAGGATACTTGCCATATTCTAGTAT
ATTTAGAAATAGGTTAA 20 ATAT-TAAAAGGGAAG-TTGGGTT 2 AATCCCACGCGG-T-CCCGCCGCTGTAATAGAGGAG 12 TAAGCCACTGGAATATA-----ATATTTTGGGAAGGCCAC 9 TGATACTTGAGCCAGAAGACCTGCCTATTTTTAAAAC
TTATATTTTTAGGTTTG 4 TAAT-TAAAAGGGAA--AGTGGTT 2 AGTCCACTACAG-CCCCCGCTACTGTGATAGGATAC 10 TTGACCACTGATTATA-------TAAATTGGGAAGGGAGA 8 TAAGCCTTAAGTCAGGATACCTGCCTAAAGATCATGA
AACTAATAATTGGTGTG 5 CGCT-TAATAGGGAA--TGAAGTT 2 AGTCTTCAACTA-CC------TCAGTAACCGTGAAG 15 TATGTCACTGCATTT-------TTTGTGTGGGAAGACGAG 7 AAGAAGCAAAGTCGGGATACCTGCCTTTTATTTAAGT
TAAGAGCATTAGGTGTT 4 AACT-TAATAGGGAA-----AGTT 2 AAACT---GCAG-CCCCCGCTACTGTTGATAAGGAC 8 AAAGCCACTGTGATAA-----ATAGTCATGGAAAGGATTG 9 -GATTTATTAGCCAGGAGACCTGCCTAGTATGCTATT
AAAAAGATTTAGGTGCC 11 GG-T-GAAAAGGGAA--TGTGGTA 2 A-GCCACAGCAG-CCCCCGCTACTGTAATTGAGGAC 10 TAAACCACTCTTTA----------AAAAGGGGAAGGGAAA 8 TGAATCATGAGCCAGGAGACCTGCCTAGATTTTTATT
ATAATATTATAGGTTCT 7 AGAT-TAATAGGGAA--AAAGGTT 2 ATTCCTTTACAG-CCCCCGCTACTGTGATGCAGACG 9 TTAGCCACTATGATG-- 13 ---CTCATGGGAAGGAAAA 8 ATGAAGCTAAGTCAGGAGACCTGCCTAAAATATTAAA
GATTAAAATTAGGTTCT 5 AGA 4 AAAAGGGAA--AAAGGTT 2 ATGCCTTTGCAG-CCCCCGCTACTGTGAAACCAACG 8 AATACCACTGTCAGT---------TTGATGGGAAGGTTAT 9 ATGAAGTTAAGCCAGGATACCTGCCTAATTTAATTTA
AATAATACTAGGGTACT 5 AGTT-TAATAGGGAA--AGTGATG 2 AATTCACTACAG-CCCCCGCTACTGTATACGGATAC 7 AAATCCACTGAAATTTAT--AAAAATTTTGGGAAGGGTGA 7 AAAGCCGTGAGTCAGGAGACCTGCCCAGTATTATATA
TAAAGCCTTATGGTCCC 5 GGGT-TAAAAGGGAAG-ACGGGTG 2 AATCCCGCGCAG-CCCCCGCTACTGTGAGGGAGGAC 10 TAAGCCACTGTCCGG-- 60 --CCGGGTGGGAAGGCAGG 8 TGAGTCCCGAGCCAGGAGACCTGCCATAAGGTTTTAG
AAAAGCCTTATGGTCCC 5 GGGT-TAAAAGGGAAG-ACGGGTG 2 AATCCCGCGCAG-CCCCCGCTACTGTGAGGGAGGAC 10 TAAGCCACTGTCCGG-- 60 --CCGGATGGGAAGGCAGG 8 TGAGTCCCGAGCCAGGAGACCTGCCATAAGGTTTTTA
TGTACTTATGAAGTGTC-------------AGGGAA--AGAGGTG 2 AATCCTCTACAG-ACCTACCTACTGTATGGTGGATG 8 AAGACCACAGATT------------ATTCTGGAAGGATTG 8 AAGAAGCTAAGTCAGGATACCGGCTTGATAAGTCTAA
TTGCTGGAACAGGTCGC 20 GCGTTAGAAAGGGAAG-TTCGGTG 2 AATCCGACGCGG-T-CCCGCCACTGTAAGGGGAATG 10 AATGTCACTGGCGTTT------AAGCGCTGGGAAGACGGA 8 ATGAACCCGAGCCAGGAGACCTGCCTGTTACCACGTC
CTACGGTTACAGGTGCC 6 GGA--GAATAGGGAA--CCGGGTG 2 AATCCTGGGCGG-T-CCCGCCGCTGTATGGTCGAGT 9 TGAGCCACTTCGT------------GTGAGGGAAGGCGCC 7 ATTAAGCCGAGCCAGAAGACCTGCCTGTACACTGTTC
CGATGTCTGCAGGCGCC 5 GGCT-GAAAAGGGAA--TGAGGTG 2 AGACCTCAGCAG-CCCCCGCTACTGTATGGGAAGAC 12 GAATCCACTGGACTG--------CCGTCTGGGAAGGAAAC 9 TGATTCCTGAGCCAGGAGACCTGCCTGTCGCGACAAA
TAACCGTTTCAGGTGCC 8 GGA--GAATAGGGAA--CTGGGTG 2 AATCCCGGACGG-A-CCCACCACTGTAAGAGGAGCT 8 TTGGCCACTGGGA------------TTCTGGGAAGGCGTG 7 ATGATTCGGAGTCAGGAGACCTGCCTGTAACGCTCGG
ATGATGCAAAGGGTGGC 34 GTC 6 ATTAGGGAA--GTCGGTG 2 ATTCCGACGCGG-T-GCCGCCACTGTGAAAGGGGAG 10 CAGGCCACCGGGT------------AACCGGGAAGGCGAA 8 ATGAACCTGAGCCAGGAAACCTGCCTGTCCCCGCACC
AACTATTGACAGGTTTA 6 TAAT-GAAAAGGGAA--TCAGGTG 2 AATCCTGAGCAA-CCCCCGTTACTGTAAGCGCCGTT 17 CATGCCACTGGCGA----------AGACTGGGAAGGCGAT 4 AAAGGCGCGAGCCAGGAGACCTGCCTGTTAATAAAAC
ATAGTATTCAAGGTTCC 8 GGAA-GAAAAGGGAA--GCCGGTG 2 AATCCGGCGCGG-T-CCCGCCACTGTGAACCACGAG 9 AGGCCCACTGGGATG---------AGCCTGGGAAGGGAAG 7 AAGACTGGAAGCCAGGAGACCTGCCTTGAACATTGCG
TAAGGATTTCAGGTGCC 13 GGA--GAATAGGGAA--CCGGGTG 2 ATTCCCGGACGG-A-CCCGCCACTGTAAAGAGGAGT 10 AATGCCACTGGGT------------AACTGGGAAGGCAGC 9 ATGACTCGAAGTCAGGAGACCTGCCTGGATCCGGGGA
ACATAGCTTAAGGTGCC 5 GGA--GAATAGGGAA--ACCGGTA 2 AGTCCGGTGCGG-A-TCCGCCGCTGTAATCGGAGAC 10 AATGTCACTGTCTTTTT-----TAGAGATGGGAAGGCGTG 9 TGACACGAGAGCCAGAAGACCTGCCTTTTAGAAAGCT
GGAATCTCATAGGTGAC 12 GTT--GAAAAGGGAA--GCCGGTT 2 AGGCCGGCACGG-T-CCCGCCGCTGTAAGGGAAATA 11 ATTACCACTGAAAGG---------GTTTCGGGAAGGTAAG 8 ATGATCCTAAGTCAGAAGAC-TGCCTATGTGTATACC
AGGCGGAATAGGGTTGC 6 GCAT-TAATAGGGAAC-TCCGGTG 2 AAGCCGGGACAG-C-CCCGCTACTGTAAGAAGGACG 11 GGATCCACTGGTGA----------AAACCGGGAAGGTAAG 8 ATGAGTTCAAGTCAGGATACCTGCCCCATTCCGGAAA
Alignment of B12-elements (continued) (Actinobacteria, Cyanobacteria, the CFB group, Thermotogales,
the Thermus/Deinococcus group and some others)
DI_BTUC
MT_CBTG
MT_METE
ML_CBTG
ML_METE
RK_CHLID
RK_COBN
RK_CBTE
RK_BTUF
SX_CBIM
SX_METE
SX_PDUX
SX_BTUF
SX_NRDA
SX_BTUC
SX12454
TFU_COBN
TFU_CHLID
TFU_CBTE
PI_CBIB
PI_CBIL
PI_MUTA
PG_BTUB4
PG00461
PG_BTUF
PG_NRDD
PG_X_CBTD
BX_BTUB
BX_PCCC
BX_BTUB4
BX_NRDA
BX_CBTD
BX_METE
BX_NRDD
CL_BTUB2
CL_X_CBIM
CL_X_FRD
CL_X_NRDJ
CL_BTUB
AN_X_CBIJ
AN_CFRX
AN_COBG
TE_X_METE
TE_CBIX
CY_HUPE
SN_HUPE
PMA_HUPE
DR_BTUFC
DR_BTUFR
DR_ACHX
LI_CBIX
LI_BTUB
FN_BTUF
FN_BTUB
TM_BTUF
CAU_BTUR
CAU_BTUF
GME_COBU
TDE_CBTF
TDE_ROCG
TDE_BTUF
-246
-309
-362
-270
-369
-224
-260
-137
-153
-209
-387
-365
-190
-271
-311
-204
-299
-225
-134
-164
-185
-253
-526
-556
-354
-342
-228
-371
-344
-344
-280
-269
-264
-210
-231
-227
-498
-265
-364
-153
-187
-152
-160
-141
-160
-210
-232
-236
-312
-270
-365
-279
-276
-240
-224
-268
-397
-290
-520
-490
-371
Helices: 0, 1, 1', 2, AddI, 2', 3, 3', 4, VS, 5, 6, AddII, 6', 5', VS, 4', 0'
Consensus: hgGtkcy rg aa aGGGAA cgGtg a tCcg RCdG-ycCcCGChaCKGTra gcCACTG YGGGAAGgc rAGYCMGgAgaCCkGCcd
CACATTGATTAGGTGCA 12 TGC-----ATGGGAA--TCTGGTG 2 AATCCAGAGCTG-A-CGCGCAGCGGTGAAGGTGCAA 14 GTAGCCACTGAGAGTATA--AAAACTCTTGGGAAGGTGAG 17 GAGCACCCCAGTCCGAAGACCGGCCTAATCAGAAACA
TCAGGCGATGACGAT--------------GCAGGAA--GCCGGTG 2 AATCCGGCGCGG-T-CCCGCCACTGTCACCGGGGAG 9 TAAGCCACGGCCAC-----------AGGCTGGAAGGCGAG 8 CGATCCGGGAGCCAGGAGACTCGCGTCATCGCGTCCT
ACCACGCAGCTGGTCTG-48-------GAGAGGGAA--CCTGGTG 2 AATCCGGGACTG-T-CCCGCAGCGGTATGCAGGAAC 20 ACAAGCACTGGTCTCA-------ACGACTGGGAAGCGACG 17 GAGCCTGCGAGTCCGAAGACCTGCCAGCCGTGCCGGA
AAAGGCGATGACGATGC--------------AGGAA--GTCGGTG 2 AAGCCGGCGCGG-T-CCCGCCACTGTAATCGGGGAG 9 TAGGCCACGGCCAT-----------TGGCTGGAAGGCGAG 8 TGATCCGAGAGCCAGGAAACTCGCGTCATCGCGTCCT
GCTGGTCTGCTGGTTCC 44 ------GAGAGGGAA--CCCGGTG 2 AATCCGGGACTG-T-CCCGCAGCGGTATGCAGGAAC 11 GGAAGCACTGGTCTTA-- 8 -CGAGACTGGGAAGCGATG 18 GCGCCTGCGAGTCCGAAGACCTGCCGGCTGTGTCGGG
AAGACAATCGAGGTGCC 8 GGA--TAATCGGGAA--GCCGGTG 2 AATCCGGCACAG-G-CCCGCTGCGGTGACCCGGGAG 22 GCAGCCACTGGACCGG------CCGGTCCGGGAAGGCGAT 11 CGACCGGGAAGTCCGAATACCGGCCTCGATTTCAGCT
CCACCTGCCGTGGTGCT-------------CGGGAA--GCCGGTG 2 AGACCGGCGCGG-CCCTCGCCACTGTGAGCGGGTAG 35 GAGACCACTGGACGG--------AAGTCCGGGAAGGTCGG 11 TGATCCGTCAGCCAGGAGACCGGCCACGGCGCGGGAA
CACACGTGCCGAGGTGC-------------AGGCAA--TCCGGTG 2 AGTCCGGAGCGG-T-CGCGCCACTGTGACCGGGCGA-----------------------1------------------------CCGCCCGGGAGTCAGAAAACTGTCTCGGCGCATGGAT
GCTGACGCCCGTGC----------------AGGGAAAGTCCGGTC 2 AGTCCGGCGCTG-A-CCCGCAACGGTAGGCCGTCCA-----------------------1------------------------CCGACGGTGAGCCCGATCGCCTGCACGGGGGTGCGCG
CGCCACGCCTTGGTG------------AACGGGAAA--TCCGGTG 2 ATGCCGGTGCGG-CCCTCGCCACTGTGAATCGGGAA 21 GCAGCCACTGGATCGCT---TGCGGTCCGGGAAGGCGGA 12 GTACCCGTAAGCCAGGAGACCGGCCAAGGCGCGTCGT
CCCGTGCAGCTGGTTCG 21 CGTCGCAAGAGGGAA--CCCGGTG 2 AATCCGGGACTG-C-CCCGCAGCGGTGAGCGGGAAC 10 ATACGCACTGGGCCCG- 6 -CGGGCCCGGGAAGCGACG 29 GGGCCCGCGAGTCCGAAGACCTGCCACCTGCCCGCGC
TGCCCGCAGTTGGTTCG 30 CGACGCAAGAGGGAA--CCCGGTG 2 AATCCGGGACTG-T-CCCGCAGCGGTGAGTGGGAAC 10 AACAGCACTGGGCC-- 13 ---AGCCCGGGAAGCGACG 40 GCGCCCACGAGTCCGAAGACCTGCCACTGCGCCCGTA
TCGCCGCGACGGGAG--------------ACAGGAA--GCCGGTG 2 AATCCGGCACGG-T-CCCGCCACTGTGACCGGGGAG 10 CACGCCACTGCGCGC--------CGCGCGGGAAGGCCAG 10 CGATCCGGGAGTCAGGACACTGGCCTGTCGCGGGCCC
---TCGCTGTCGCCGC-------------AGGGGAA--TCCGGTG 2 AATCCGGAACTG-T-CCCGCAACGGTGTACTTGCGT----------------------38------------------------CGCCTGTCCAGTCCGAGGACCTGCCGACAGTGCGCCC
CGAAGCGCCTCGTGG---------------GGGAA-GTCCGGTC 2 AGTCCGGCGCTG-A-CCCGCAACGGTAGGCGGAGCC-----------------------8------------------------GGTCCCGTGAGCCCGATTACCCGCGGTGGTGAAGCCC
CAGGGCGACGACGGTC-------------CGAGGAA--GCCGGTG 2 AATCCGGCGCGG-T-CCCGCCACTGTGATCGGTGAG 11 GTTGCCACTGCCCCGG------AGGGGCGGGAAGGCCGG 9 TGACCCGGGAGCCAGGAAACTCACGTCGTCGCCTCCT
TGCGCTATGGTGGTCGC 3 GTGGT-GAACGGGAA-GACCGGTG 2 AGACCGGCGCGG-CCCTCGCCACTGTGATCGAGGAG 30 GCGGCCACCGGGCAC-------CAGCCTGGGAAGGTCAG 11 TGACTCGTCAGCCAGGAGACCGGCCACGACGCGTCAT
GGAACCGCCGAGGTGCC 11 GGA--TAATCGGGAA--GCCGGTG 2 AATCCGGCACAG-G-CCCGCTGCGGTGACCTGGGAG 20 GCAGCCACTGGACGG-------CAGTCTGGGAAGGCGAT 10 TGATCAGGAAGTCCGAAGACCGGCCTCGGCATGGCTG
AAGAGCGTCGGGTGC----------------AGGCA-ATCCGGTC 2 AGTCCGGAGCGG-T-CGCGCCACTGTAGACGGGCTC------------------------------------------------AAGCCCGTGAGCCAGAAAACTCACCCGGCGTAGTGGT
CGGCCAGCGCGCGTCCG------------CAGCGAA--GCCGGTG 2 AATCCGGCGCTG-T-CCCGCAACGGTGATGGGGCCC------------------------1---------------------- GCCCCG47CAGCCCCACGAGCTGCCTGCGCGTGCACC
TCCCGGGCACCGGATGA 33 ---------GAGGAAT-GCCGGTG 23 AGTCCGCGACGG-T-CCCGCCACTGTGAGCCGGTGA-------------------------------------------------AGCCGGCGAGTCAGACACTCCGCCGGTGCCGCTGAC
CTAGTAGTGCTGGTTCG 16 CGTCGCAAGAGGGAA--TCCGGTG 3 ATTCCGGAACTG-T-CCCGCAGCGGTCAATGGGAAC 9 TAAGGCACTGGGCGGC------AACGCCTGGGAAGTAGTA 28 ATGCCCATGAGTCCGAAGACCTGCCAGCAGCGACAAC
GAAAAGACTGAAGTAAC 19 GTGC----AAGGGAA--TCCGGTG 2 ATTCCGGAGCTG-AGCCCTCAGCTGTAATGCTTCGA 45 GATGGTCACTGTAGA--- 11 -CCCTATGGGAAGGCCGA 12 AAGAAGCTAAGCCAGAAGACCTGCTTTAGTAGATTTG
GCCGTGTCAATGGTTTT 23 AAT--GAAAAGGGAA--CCCAGTG 2 ATTCTGGGACTG-TACCCTCAGCTGTAAGTTCAGAT 19 AAAGCCACTATACAGA------ATCGTATGGGAAGGCAGC 4 CCTTGAATAAGTCAGAAGACCTGCCATTACAAGCGTT
CGGATATGTGCGGTTCA 39 GAT--TAAAAGAGAA--TTTGGTG 2 AAGCCAAAACTA-TCCCCGTAGCCGTATGGTCGTAC 15 GATGCCACTGCATAT--------CGATGTGGGAAGGCGTA 4 TTTAGGCCGAGTCGGAAGACCTGCCGCACATATCTAA
CCCATCGTAGTGGTCCC 23 GGG 4 AAGAGGGAA--TCGGGTG 2 AATCCCGAGCAG-T-CCCGCTGCTGTAAGCTTTTAC 44 GATGCCACTGTTCATT-- 19 GCTGAATGGGAAGGCGCG 14 GATGAAGTAAGCCAGAATACCTGCCTCTACGAGTTGC
TGTGCGGACTTTGTTCA 33 TGAT-TAAAAGGGAA--TCGGGTG 2 AATCCCGAGCAG-T-CCCGCTGCTGTGAACCTTGTT 15 ATATCCACTGTCCGTTCT---GTGCGGATGGGAAGGAGTC 5 TATGGGGTGAGCCAGAAGACCTGCAAAGTCTTTGTCT
TGCAGTGCATTGGTTTG 22 CAAT-TAAAAGGGAA--TCAGGTG 2 AATCCTGAACAG-T-CCCGCTGCTGTAAGTTTCACA 24 CTTGCCACTGGGAAAC------GTTTCCTGGGAAGGCGCT 5 ACAGAAACGAGTCAGAAGACCTGCCTGTGCATCTTTT
GTCGCCGAATTGGTTCG 18 CGA 5 AAAAGGGAA--TCGGGTG 2 AATCCCGGACAG-T-CCCGCTGCTGTGAAACTCTGT 26 CGTACCACTGACAGAAA-- 7 CTCTGTCGGGAAGGTCCC 7 TGTAGAGTCAGTCAGAAGACCTGCCATTCGTGAATAA
GCAGCCGCTTAGGTGAT 25 AT----AAAAGGGAA--TCGGGTG 2 AATCCCGAACAG-TGCCCGCTACTGTGATCCCCCTG 53 TATACCACTGTCATA--- 10 -CATGACGGGAAGGTAGC 6 AAAAGGGATAGTCAGGAAACCTGCCGAAGCAGACATA
GCTCCCTGATCGGTTCC 20 GGAT-TAAAAGGGAA--TCGGGTG 2 AATCCCGGACAG-T-CCCGCTGCTGTGAAGCTCCGT 20 GTTGCCACTGGGA----- 26 ---CACCGGGAAGGCGTC 5 CAAGGAGTCAGTCAGAAGACCTGCCGCTTATCAAAGG
TGTCCCGAATTGGTTTC 21 GGAT-TAAAAGGGAA--TCGGGTG 2 AATCCCGGACAG-T-CCCGCTGCTGTGAAGCTTCAT 20 TTCGCCACTGACGT---- 15 ----GTCGGGAAGGCTTC 4 TTAGAAGTCAGTCAGAAGACCTGCCGTTCATCAAAGG
GTCGGCAGATTGGTTCG 21 CGAT-TAAAAGGGAA--TCGGGTG 2 ACTCCCGGACAG-T-CCCGCTGCTGTGAAGTTTTAT 25 TTGGCCACTGACTCGT-------GTAGTCGGGAAGGCGTT 5 TGGAAGCTAAGTCAGAAGACCTGCCACTCTCGCTGAT
TAGCAGATTTCAGTACT 12 AGT--CATAAGGGAA--CGCTGTG 2 AATCGGCGACAG-TACCCGCTGCTGTAATTCTCTGA 12 TATGCCACTGCGCCC---------AGCGTGGGAAGGCGTT 5 GGAGAGATAAGTCAGAAGACCTGCTGAAAAAGTAAAC
TTACGGTTTCCGGTGCC 6 GGC 9 AAAAGGGAA--CCCGGTG 2 AATCCGGGACAG-TGCCCGCTGCTGTGATCCTCCCG 37 GAGGCCACTGGTTCGCGC--CCGCGAACCGGGAAGGCCGG 3 CGAGGGGAGAGTCAGAAGACCTGCCGTAATGCAGTAA
TCCGATTATGTGGTGCC 17 GGCT-TAAAAGGGAA--TCCGGTG 2 AGTCCGGAACAG-TACCCGCTGCTGTAATTCCGCGC 32 AATGCCACTGTCCCGTT-----CAGGGATGGGAAGGCCGG 4 ATCCGGGAAAGTCAGAAGACCTGCCTCATATTTTTTG
TCGCCATGACAGGTGCC 12 GGA--GAATAGGGAA--GTACGTG 2 ATTCGTACACTG-TACCCGCAACTGTACAACGGTTA 47 CAGGTCACTGCCGGTT-- 13 -AACTGCGGGAAGGTTTG 11 TGCCGTGAAAGTCAGGAGACCTGCCAGTCATGCATTT
TTCAGCATTACGGTGCC 14 GGA--TAATAGGGAA--GTGCGTG 2 AATCGCACACTG-TGCCCGCAACTGTAAGATGGTAT 50 TGTATCCACTCCGCCA-- 20 --ATGCGGGGGAAGGCTG 29 AGCCATCGAAGTCAGGAGACCTGCCGTAGTGGTTGGC
CATGATTAGCTGGTGCC 12 GGA--GAATAGGGAA--GTACGTG 2 ATTCGTACACTG-TACCCGCAACTGTACAACGGAAA 47 CACGTCACTGCCAG---- 15 ---GGGCGGGAAGGCTGC 8 AAGCCGTAAAGTCAGGAGACCTGCCAGTTACTCTTTG
AATATCAACTCGGTTCT 17 AGAGGTAAGGGGGAAAGTCCGGTG 2 AATCCGGCGCTG-T-CCCGCAACTGTAATGGGGCTT------------------------------------------------ATGCCTCAAAGTCAGAATGCCCGCCGAAAGTACAACA
AATAAATATTCGGTTCT 17 AqAGGTAACGGGGAA26AACGGTG 2 AGTCCGGCGCTG-T-CCCGCAACTGTGAAGGAAAGA-----------------------10-----------------------AACTTTCCCAGTCAGAACGCCCGCCGAAATTGACGAT
TACTAGAACTTGGTGTT--------------GGGAAACTCCGGTG 2 ATTCCGGGGCTG-T-GCCGCAGCTGTGATGAAAAGT-----------------------18-----------------------AACTTTCCGAGTCAGAATGCCAATTCCAAGAGTTAGC
CTTAGTTGCTCGGTTCT 17 AGACGTAAGGGGGAAAGTGCAGTG 2 AATCTGCCGCTG-T-CCCGCAGCTGTGAGGAGAGA-------------------------3-----------------------CACTCTCTAAGTCAGAATGCCCGCCGAGTGGTCAACC
TGTGAGAAGCAGCCTGT-------------AGGGAAAATCCAGTG 2 AGTCTGGTGCTG-T-GCCGCAGCTGTGATGGGAAT--------------------------------------------------CTTCCCTCAGCCAGAATGCCTACTTGCTGTGGTTCA
TAAGTTTAGTTGGTTCC 17 GGAGGTAACGGGGAAAAGCTGGTG 2 AAGCCAATACTG-T-CCCGCAACTGTGATGGGCCC--------------------------------------------------AGGCCCTAAGTCAGGATGCCCGCCAACGATGGCCGA
GGTTTGGGTCTGGTTTC 17 GATGGAAACGGGGAAAGAACGGTG 2 AATCCGTCGCTG-T-CCCGCAGCTGTAAAGCGTCCGGCCC-------------------------------------------CGCCGGCGTCAGTCAGAACGCCCGCCAGGAGCACTACC
ATCCATCAATCGGTTTC 17 GAAGGAAACGGGGAAAGTTCGGCG 2 AATCCGGCACTG-T-CCCGCAGCTGTAAAGCGCAAC-----------------------15-----------------------ACTTGCGCGAGTCAGAATACCCGCCGAATTTCCATCG
TCCTCGCAGCAGGCGC--------------AGGGAAAGTCCGGTT 2 AGTCCGGCACTG-T-CGCGCAACGGTTTT---------------------------------------------------------------CAGTCCGAACACCTCGCCTGCTCGCGCTG
TGAGGCCACCTGAGCC-------------AGGGGAA-GCCCGGTG 34 ATTCCGGCACTG-T-CGCGCAGCGGTGAATCGGCCT------------------------2-----------------------AGGGCCGTCAGTCCGAATGCCTCTCAGGGACGCGAAC
AAGCCTCCCGAGGAAC------------AGAGGGAA-GTCCGGTC 20 AGTCCGGCACAG-T-CGCGCTACGGTTA----------------------------------------------------------------CAGTCCGAACGCTCGCCTCGTGGAGAACG
ATGTTTCACATAGAT----------------AGGAA--GACGGTT 2 AATCCGTCACGG-TATCCGCCGCTGTAAGAAGGACG 10 TAAGCCACTGGGAC----------AACCTGGGAAGGCGTG 9 AAGATTTCAAGTCAGAATACGACCTATGAAAATTCCT
ACGGAAAACTTGTTTAT 7 ATG--AGGAAGGGAA--TCCGGTT 2 AATCCGGAGCTG-AACCCGCAGCTGTAATCGCCGAA 16 CATGCCACTGCGTTAA-------ATACGCGGGAAGGCTGC 3 ATCGGCGAAAGCCAGAAGACCTAACAAGTAAAAAAAC
CATGTCAATTATGTTCC 11 GGC--TAAGAGGGAA--TTTGGTG 2 ATACCAAAACGA-G-CCCGTCGCTGTAATTGAGTTT 10 TATACCACTGGATTT---------TATTTGGGAAGGTAAA 6 TAAATCATAAGTCAGAAGACCTGCATAATTGAATTAC
AGAAACAAATAGGTGCT 4 GGCTTAATAAAGGAA-GTTGGGTG 2 AATCCCACACAG-C-AATGCTACTGTATTGTGGACG 8 ATAGCCACTGGGA------------AACTGGGAAGGTGTA 8 TTGAAACTAAGTCAGGAGACTTACCATTATTTTATAT
CCTCACCGTGCGGTACC 6 GGTT-CAAAGGGGAA--GCCGGTG 2 AATCCGGCGCGG-G-GCCGCCACCGTGACCGGGGAC 11 AACGCCACTGGGGCGA------TCACCCTGGGAAGGCGCG 10 TGATCCGGAAGCCGGGAAACCCGCCCGCGGTGAAGGG
TAGATCGTCGCGGTGAC 28 GTG-----GAAGGAA--GCTGGTG 2 AGTCCAGCACTG-T-GCCGCAACTGTAACCGGCTGT-----------------------2------------------------CAGGCCGGAAGTCAGGACGCCTGCCGCGATGTGTTGT
CATATCGTCGCGGTGAC 25 ------AAGGTGGAA--GCTGGTG 2 AGTCCAGCGCTG-T-GCCGCAACTGTAACCGGTTAG------------------------------------------------AAAGCCGGAAGCCAGGACGCCTGCCGCGATGTGATGA
TTTTACGTTCAGGTGCT 16 AGG--TAAAAGGGAA--AAGGGTG 2 ACTCCCTTGCTG--TCCCGCAACTGTGAACGGTGAT 14 GATGCCACTGATCT---- 18 -----CCGGGAAGGCGCG 10 TGATCCGTGAGCCAGGAAACCTGCCTGACCGTCAGCT
TAGACAAATAAGGTTCT 14 AGAT-TAAAAGGGAA--ACCGGTG 2 AAACCGGCACAGCC-CCCGCTACTGTAATTGAGTTT 34 TAAGCCACTGTTA------------ATATGGGAAGGCGAT 6 TAAATCATAAGTCAGGAGACCTGCCTATTTGTATTAC
GTTTCGGTCTTGGTGCT 14 AGTG--AAAAGGGAA--TCAGGTG 2 AGTCCTGAGCAG-T-CCGGCTGACGTAAGTGAGAGA 14 AATGCCACTGGTTT----------ATTCCGGGAAGGCGAA 9 TGACCTCCGAGCCGTAAGACCTGCCAATGACTATAAG
CAAACCATACAGGTGCC 7 GGTT--AAAAGGGAA-GCACGGTG 2 ATTCCGTCACGG-T-CCCGCCGCTGTAAGAGAATAG 13 TATGTCACTCGGGA----------AATCGGGGAAGGCTTA 10 GAAGCTCGAAGTCAGAATACCTGCCTGTAAAGGACTA
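The consensus row of this alignment uses IUPAC degenerate codes (R, Y, K, M, d, h), so conserved blocks such as gcCACTG and YGGGAAGgc can be located in an aligned row with a regular expression. The following is a minimal sketch under stated assumptions, not the procedure used to build the alignment: the function names are hypothetical, and treating lowercase consensus letters as "any base" is an assumption about how weakly conserved positions should be handled.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "S": "[CG]", "W": "[AT]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus, strict_lowercase=False):
    """Turn a consensus block into a compiled regex (hypothetical helper).

    Assumption: lowercase letters mark weakly conserved positions and are
    matched as any base unless strict_lowercase=True.
    """
    parts = []
    for ch in consensus:
        if ch == "-":          # gap characters carry no base information
            continue
        if ch.islower() and not strict_lowercase:
            parts.append("[ACGT]")
        else:
            parts.append(IUPAC[ch.upper()])
    return re.compile("".join(parts))


def find_motif(consensus, sequence, strict_lowercase=False):
    """Return (start, match) pairs; start is 0-based in the ungapped sequence."""
    ungapped = sequence.replace("-", "").upper()
    pattern = consensus_to_regex(consensus, strict_lowercase)
    return [(m.start(), m.group()) for m in pattern.finditer(ungapped)]


if __name__ == "__main__":
    # Fragment copied from the first alignment row after the consensus line.
    row = "GTAGCCACTGAGAGTATA--AAAACTCTTGGGAAGGTGAG"
    print(find_motif("gcCACTG", row))
    print(find_motif("YGGGAAGgc", row))
```

Reported positions refer to the sequence with gap characters removed, so they do not correspond directly to alignment columns.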
B2
B12-regulon: identification of genes and regulatory elements
 Mesorhizobium loti
Bradyrhizobium japonicum
Pseudomonas denitrificans #
Sinorhizobium meliloti
Brucella melitensis
Agrobacterium tumefaciens
Rhodopseudomonas palustris
Rhodobacter capsulatus #
Rhodobacter sphaeroides #
Sphing. aromaticivorans #
Rickettsia prowazekii
Caulobacter crescentus
 Bordetella pertussis
Burkholderia pseudomallei
Neisseria meningitidis
Nitrosomonas europaea
Methylobacillus flagellatus #
Ralstonia eutropha #
Ralstonia solanacearum
 Escherichia coli
Salmonella typhimurium
Klebsiella pneumoniae #
Yersinia enterocolitica
Y. pestis;E. carotovora
Vibrio cholerae
Pasteurellaceae
Pseudomonas aeruginosa
Pseudomonas putida
Pseudomonas fluorescens #
Pseudomonas syringae #
Shewanella oneidensis
Azotobacter vinelandii #
Xanthomonas axonopodis
Xylella fastidiosa
 H. pylori, C. jejuni
Magnetococcus #
 Geobacter metallireducens #
T/D Deinococcus radiodurans
Thermus thermophilus #
Fusobacterium nucleatum
6 ardX2<>&transp5-6-cobU-btuR><cbiB<>cobD-X; &G1-cobW-cobN-cobG-cbiCLH><cbiJ<>cbi(ET)-cobE-cbiF-cobA-cbiA-cobS-cobT1; cobF; X-cbiP;
cobT2<>gene4-transp8; &G1-btuDFC; &ardX-frdX; &metE; &cbiY
BJA
4 &transp7-cobF-cobT] [cobS-cbiY] [cbiB-cobD><cbiP-btuR<>&G1-cobW-cobN-cobG] [cbiLH><cbiJ] [cbi(ET)-cobE-cbiF-cbiA-cobA]; &btuB; &metE
[btuB3-transp4]
PD
1 cbiP; &cobU-cobW-cobN-btuR-//-cobE-cobA-cbiA-cobD-cbiB; cobF-cobG-cbiCLH><cbiJ<>cbi(ET)-cbiF; cobT<>cobS-gene4-transp8
SM
5 cbiP; &cobU-cobW-cobN-btuR-//-cobE-cobA-cbiA-cobD-cbiB; cobF-cobG-cbiCLH><cbiJ<>cbi(ET)-cbiF; cobT<>cobS-gene4-transp8; &cbiY; &btuFCD; &ardXfrdX; &transp7
BME
4 cbiP-&transp5-6-cobU-cobW-cobN-btuR-//-cobE-cbiF><cbiJ-cbiD<>cobA-cbiA-cobD-cbiB; cbi(ET)<>cobG-cbiCL-cbi(GH); cobT<>cobS-gene4-transp8; &btuFCD2
&btuBFCD; &nrdHIEF; cbiY
AU
6 cobD<>cbiB><cbiP-btuR<>&G1-cobW-cobN-cobG-cbiCLH><cbiJ<>cbi(ET)-cobE-cbiF><cbiD<>cobA-cbiA; cobU; cobT<>cobS-gene4-transp8; &btuFCD;
&nrdHIEF; &transp5-6; &cbiY; yxjH-ATU04068<>&ATU04066-metR
&metE-ZUR~btuB-cobN-gene2-3;
RPA
9 cbiY-cobT1<>&G1-cobU-cobW-cobN-btuR-cbiP1><cbiB-btuF<>cobD><btuDC-hoxN&;
cobF-cobP2-cobS-cobT2&<>ORF663-&cbiCLH>-<cbiJ<>cbi(ET)-cobE-cbiF-cbiA-cobA><btuF2&; &btuB; &btuB3-transp4; cobC-&gene5
RC
12 cobSTU<>cobC-bluE-cbiB-cobD-cbiY><cbiP-btuR<>&G1-cobW-cobN-cbiCLH><cbiJ<>cbi(ET)-cobE-cbiF-cobA-cobF-&cbiMNQO-ORF663-cbiA;
&btuFCD; &btuD; frdX-ardX&<metH<cbiP2-cbiP3&<>&gene6&btuBFCD-cobX; &exbBD-tonB; &gene5; ?&<>&nrdDG; &oppABCD
RS
6 [cobT; &X-bluE-cobD; &cbiY; cbiB; cbiP; btuR-X<>&G1-cobW-cobN-cbiCLH><cbiJ<>cbi(ET)-cobE-cbiF-cobA-cobF; &transp7-cbiA;
&btuBFCD-cobX; &btuFC]
SAR
3 hoxN-&cobW-cobN-cobG-cbiCLH><cbiJ<>cbi(ET)-cobE-cbiF-cbiA-cobF-btuR-cbiY-G1-cobT-cobC-cobS-cobD-cbiB-cbiP-cobU; &btuBFCD; &~btuB
RP
0 no
CO
2 cobT<>gene4-transp8; &btuB-cbiP1; cbiP2; X-btuFCD; &metE
BP
0 metH<>btuBF; btuB3-transp4
BPS
4 cbi(GH)-cbiLC-cobG&<>cbi(ET)-cbiDJF; cbiA-btuR-cobE&<>&hoxN-cobW-cobN--chlID; &btuBCD-cobTSC><btuF-cobD-cbiB-cobU<>cbiP; cbiY
NM
0 no
NE
1 &btuB-transp3 -btuR><gene2-3-cobN; ~btuB-cobN-gene2-3
MFL
3 &btuB-transp3-btuR--cbiA-cobN-gene2-3-cbiY-btuF-cobU; cbiP<>cbiB-cobD><cobC-cobST; &btuB3-transp4; &nrdAB
REU
1 &btuBCD-btuR-cobTS-cobC><btuF-cobD-cbiB-cobU<>cbiP; cbiA; cbiY
RSO
2 &btuBCD-cobTS-cobC><btuF-cobD-cbiB-cobU<>cbiP; &hoxN-cbiW-cobN-chl(ID)--cbiZ-(cbiX-cbiC)-cbiDLFG-cbi(HJ)-cobW-btuR-cbiA-cbiY
EC
1 &btuB; btuCDE; X-btuF-X; X-btuR; cobC; cobUST
SY
2 &btuB; btuCDE; X-btuF-X; X-btuR; cobC<>cobD;
pdu cluster-&cbiABCDETFGHJKLMNQO-cbiP-cobUST
KP
2 &btuB; btuC-//-btuED; X-btuF-X; X-btuR; cobC; cobD; pdu cluster-&cbiABCDETFGHJKLMNQO-cbiP-cobU-cobT2; cobS-cobT1
YE
2 &btuB; btuCED; X-btuF-X; X-btuR; pduX-cobD-cobA<-pdu cluster-&cbiABCDETFGHJKLMNQO-cbiP-cobUS-cobC-cobT; cobA2-pduX2
YP,EO 1 &btuB; btuCDE; X-btuF-X; X-btuR
VC
1 &btuB; btuCD; X-cbiB-btuF-X; btuR; cbiP-X; cobTSU-cobC
HI,VK,AB 0 no
PA
5 &btuB-btuR-cbiA-cbiY-cbiB-cobD-cbiP-cobUTCS;chlD-I//-cobN-cobW&<>&transp(5-6)-cobE-cbiF; cbi(GH)-cbiLC-cobG&<>cbi(ET)-cbiDJ;
&btuB2-btuDFC; ZURbtuB3-cobN; -gene2-3--metE; btuB3-transp4
&cobW-cobN-chlID;
&transp5-6-cobE-cbiF; cbi(GH)-cbiLC-cobG<>cbi(ET)-cbiDJ;
cobF;
PP
5 &btuR-cbiA-cbiY-cbiB-cobD-cbiP-cobUTCS;
&btuB2-X-&btuFCD; btuF><btuB
chlDI-cobN-cobW&<>&transp5-6-cobE-cbiF; cbi(GH)-cbiLC-cobG<>cbi(ET)-cbiDJ; cobC2-cobF;
PU
3 &btuR-cbiA-cbiY-cbiB-cobD-cbiP-cobUTCS;
btuB2-btuDFC; btuB3; btuF><btuB]
chlDI-cobN-cobW&<>&transp5-6 -cobE-cbiF; cbi(GH)-cbiLC-cobG<>cbi(ET)-cbiDJ;
cobF;
PY
4 &btuR-cbiA-cbiY-cbiB-cobD-cbiP-cobUTCS;
&btuFCD; metXY-btuB2-btuDFC
SH
2 &btuB-X-metR<>metE; metH<>cobC-&btuDC--cobTSU-cbiP-btuR; cbiB; X-btuF
AV
1 &(btuB-transp3-btuR)-cbiA-cbiY-cbiB-cobD-cbiP-cobUTCS; btuB; X-btuF
XAX
1 &btuB-transp3-transp3-btuR-cbiB-cobD-cbiP-cobUT-cobC-cobS; btuB3-transp4
XFA
0 no
HP,CJ
0 no
MCO
0 [cbiO-cbiA-cbiF-cbiX-cbiCD-cbi(TE)-cbiLG-cobE-cbiH><cbiP<>cobD; cbiB; X-btuR; cbiY; cobC; cobTS; cobU
GME
1 &cobUTSC-cbiA-X-cbiMNQO -cbiX-cbiCD-X-cbi(ET)-cbiLFGH-cbiPB-cobD
DR
3 XX-cobTS-X; X-cobU; &btuFCD; &btuF-btuR-cbiA-cbiB-cobD-cbiP; X-cobC; &achX-nrdIEF
TQ
3 cbiA-btuR-X-cobST<>cobC-&hoxN-cbiDC-cbi(ET)-cbiLFHG-cbiX-cobA-cbiY-cbiB-XX-cobD--(cbiP-cobU); &btuFCD; &achX-nrdBA
FN
2 cbiP1-X-cbiB-X-cobD-cbiA-X-cbiC-//-cbiDE-X-cbiT-//-cbiL-X-cbiF-//-cbiGHJ; btuR; cobUS-cobC-cobT;cbiK; transp11; &btuFCD; &btuBFCD; btuB<>btuFCD
MLO
B12-regulon: identification of genes and regulatory elements
B/C
Bacillus subtilis
BS
Bacillus cereus
ZC
Bacillus megaterium #
BI
Bacillus halodurans
HD
Bacillus stearothermophilus # BE
Staphylococcus aureus
SA
Listeria monocytogenes
LMO
Clostridium acetobutylicum CA
Clostridium perfringens
CPE
Clostridium botulinum #
CB
Clostridium difficile #
DF
Thermoanaerobacter tengcongensis
TTE
Enterococcus faecalis
EF
Streptococci (ST, PN, MN, LL)
Act
HMO
Heliobacillus mobilis #
Desulfitobacterium hafniense DHA
Corynebacterium glutamicum CGL
Corynebacterium diphtheriae DI
Mycobacterium tuberculosis MT
ML
Mycobacterium leprae
TFU
Thermobifida fusca #
RK
Rhodococcus str. #
Streptomyces coelicolor
SX
PI
Propionibacterium freudenreichii #
(PMA,CY,SN)
Anabaena sp.
AN
T. elongatus
TE
CAU
CL
PG
3
1
3
2
2
4
5
BX
7
Cya
Chloroflexus aurantiacus #
Chlorobium tepidum
CFB
Porphyromonas gingivalis
Bacteroides fragilis #
Thermotoga maritima
Treponema denticola
Leptospira interrogans
1
1
1
5
3
0
2
2
4
1
3
2
1
0
5
6
0
1
2
2
3
4
7
TM
TDE
LEP
AA
Aquifex aeolicus
Arc
Thermoplasma volcanicum TVO
Methanosarcina acetivorans MAC
HSL
Halobacterium sp.
AG
Archaeoglobus fulgidus
AP
Aeropyrum pernix
Methanopyrus kandleri
MK
Methanococcus jannaschii
MJ
Methanobacterium thermoaut. TH
PK
Pyrobaculum aerophilum
PH
Pyrococcus horikoshii
PO
Pyrococcus abyssi
PF
Pyrococcus furiosus
Sulfolobus tokodaii
STO
1
3
2
0
0
0
0
0
0
0
0
0
0
0
0
0
0
&btuFCD-pduO
pduO; &metE
[&cbiW-cbi(H-?)-cbiX-cbiJCD-cbi(ET)-cbiLFGA-cobA-cbiY-btuR]
&btuFCD-cbiB-cobD-cobU-cbiP-cobS-cobC-~cobU-pduO; &cobT; cbiA; cblX; &nrdAB; &metE; &achX
&btuFCD-cbiB-cobD-cobUS-cobC-~cobU-pduO-cblT-cblX; [&cbiW-cbiMNQO-cbi(H-?)-cbiX-cbiJCD-cbi(ET)-cbiLFG-cbiAP--cobT-cobA-btuR; &nrdAB
btuFC
pdu-cblX-cobUSC><pocR<>&pduABCDEGHKJL-eutJ-pduMNOPQFW-cobD-pduX; cblT-&cbiABCDETFGHJ-(cobA-hemD)-cbiKLMNQO-cbiP-pduO; btuF
&cbiMNQO-cobD-cbiG-pduX-cobT-cbiK-cbiP-cbiACDTLFJH-cobUS-cobC; cbiB -cbiK2-cbiE; &btuFCD-gene5
cobTSUC-cbiB-cobD-&btuFCD-gene7-cbiP; &cbiK-cbiCDETLFG--cbiHJ-btuFCD; btuR-X; &cbiMNQO; &cblT-cblX
&cbiPB-cobD-cbiMNQ--hemC-(cobA-hemD)--hemB-cbiAC; cbiD-cbiE-X--cbiT; cbiLFGHJ-cblT-cbiK-pduX-btuR; cobC; cobU; cobS--cblX; btuFCD-1; btuFCD-2
B
cobTSU-cobC-&cbiP-cbiA-cbiB-cobD-pduX-cbiCDETFGHJ-cbi(LK)-hemC-(cobA-hemD)-hemB-cysG ; &cbiMNQO; &btuFCD--nrdEF
&btuR-&btuFCD-cbiA-cbiPB-cblT-cblX-cobD-cobUS
&btuFCD-pduV-pduO-pdu cluster
no
&cbiD-cbi(ET)-cbiLFGHJ-cbiX-cbiC-//-cobU-cbiPBA-btuR-cobTS-pduX-cobD-~cobC; cbiK; &cbiM]; &cbiQO; &cblX]; cblT-cbiO;
&HMO01408
cbiD-&cbi(ET)-cbiLFGHJ-cbiX-cbiC-cbiPB--cbiA-btuR-pduX-cobD]; [cobTUS]; cobC]; &btuFCD; &oppAB]; &cblX; [cbiMN]; [cbiQO]; &DHA05379; &nrdD]
gene8-cobUTS; pduO; btuCFD
gene8-cobUTS; chlID-btuR-cbiA-//-cobA;
cbiP; cobF; cbiB-//-cobD; cobN<>cobG-cbiC-cbi(LH)><cbiJ-cbiF-cbi(ET);
&btuCDF
gene8-cobTS; chl(ID)-btuR-cbiA--cysG; &transp12-cbiP-cobU; cbiB-//-cobD; cobN<>cobG-cbiC-cbi(LH)><metZ-X<>X><cbiJ-cbiF-cbi(ET); &metE
gene8-cobTS;
btuR-cbiA;
&transp12;
&metE
gene8-//-cobUS]; cbiB<>cobD-(cbiY-cobT)--cysG; cobF-cobN&<>cobG-cbiC-cbi(LH)><cbiF-cbi(ET)<>&chl(ID)-btuR-cbiA><cbiJ-cbiP-transp10&
gene8--cobT]; cobU; [cobD]; &transp10-cbiP]; [cobF];[cobN&<>cobG-cbiC-cbi(LH)];[cbiF];[cbi(ET)<>&chl(ID)];[btuR];[cbiA];[cbiJ><cbiB]; &btuFCD
gene8-//-cobU--cobT-//-cobS; (cbiY-cobT)--cysG; cobF; cbiBP-cobN-chl(ID)-btuR-cbiA-cbiL><SX03279<>cbiF-cbi(ET)-cbi(GH)-cbiX-cobD; cbiC><cbiJ; &pduX-XX;
&cbiMNQO; &btuFCD; &btuCDF;
&RSX12454;
&nrdAB; &metE
B
mutBA&<>&cbiLF-cbi(EGH)-(cbiX-cysG )><cbiDCTJ; &cbiBP-btuR]; cbiMNQO-cobA
cbiJ;
cbiF;
cbiC; cbiL; cbi(GH); cbi(ET); cbiD; cbiB; cbiA; cobU; cobS; btuR; cbiP; cobD; cobN; cobW; cbiX; &hupE
&G1-cbiJ; cbiF; &cobG-cbiCL-X-cbi(GH); cbi(ET); cbiD; cbiB; cbiA; cobU; cobS; btuR; cbiP; cobD; cobN; cobW; &G1-btuB-(genW-cbiW)-btuFCD; cbiMNQO
cbiJ;
cbiF;
cbiC; cbiL; cbi(GH); cbi(ET); cbiD; cbiB; cbiA; cobU; cobS; btuR; cbiP; cobD; cobN; cobW; &cbiX; cbiMNQO; &metE
B
&btuF-cbiW-btuCD-genW]; [X-cbiCD-cbi(ET)-cbiLF]; [cbiG]; [(cbiH-cysG )-cobA-cbiA-cobUD-cbiP; &btuR-cobN-cobT-cbiMNQO-cbiB; cobCS; chl(ID)
&btuBF-cbiY-cbiP-cobD-cbiB-btuR-btuCD-X-cobUT-X-cobS;&cbiMNQO-cobA-cbiK-cbiL-cbi(HC)-(ET)-(GF)-(JD); chl(ID)-cobN-&btuB2;cbiA-btuF2; &nrdJ
cobUTSC; cbiA-pduO-cbiP-cobD-cbiB; &transp9; &btuB4-cbiK-btuFCD1;
cbi(HC)-cbi(ET)-cbi(GF)-cbi(JD); cbiL;
&btuFCD2; hmuY-hmuR(~btuB)-cobN-X-gene2-3;
&nrdDG; &PG00461-62-63
cobUTSC><cbiB-cobD-cbiP; cbiA-pduO-//-transp9&<>&btuB4-cbiK-btuB3-transp4-cobN-X-gene2-3-cbi(HC)-cbi(ET)-cbi(GF)-cbi(JD); cbiL<>btuFCD
&btuB; ~btuB-cobN-X-gene2-3-&nrdAB; btuFC; ~btuB-&metE; &nrdDG; &BX01357-58-59
&btuFCD; btuR
&btuFCD; cbiK-cbiLA; cbiG-cbiF; cbiHJ-btuD3; cbiET; X-cbiC; cbiD; cobUSC; btuR; cbiB; X-cbiP; cobD; chlID-cobN--btuFCD2;
&transp11]; &rocG
&btuB; &(cbiX-cbiW)-X-frd-cbiDC-cbi(ET)-cbiLG-cbi(H?)-cbiF-btuR-cbiA-cobX-cobU-cbiPB; cobTSC
no
B
cysG -cobA-cbiCHDTLF-cbi(GE); cbiPB; X-cobT; X-cbiA-XX; cobDS; btuR; cobC; cobX cobY><btuD-X;
btuF; btuC
cbiTLFGHC; cbiDE; cbiMNQO; cbiA1; cbiA2; cbiP; cobD-cbiB--cobZ-cobS-cobY; btuR; cobT; transp2-gene2-3-cobN; opp-cobN-chlID; btuFC-X-btuD
cbiTLFGH-HSL00646-(cbiX-cbiW)-X><chl(ID)<>cobN-cbiC-cbiE><cbiD<>cobT-cbiA-btuR-cbiP-HSL01294-cbiB-cobSYD-cobX;
btuF<>btuCD
cbiT-cbiMNQO<>cbiLFG-cbi(HC)-cbiDE--cbiX-gene7; cbiB-X><cbiP; cbiA; cobY-X-cobS1; cobS2; ??-cobD-X; cobT;
btuCD-XX-btuF
cobT-(cobX-cobZ)-cobY-cobD-cobS-cbiB-X-gene7;
$~btuFCD
btuF<$>mutBA-ygfD
X-cbiC; X-cbiD; cbiT<>cbiL; cbiFGH; cbiE; cbiA1; cbiA2-X; cbiB; cbiP-X; ??-cobT; (cobX-cobZ)-cobS-cobY; ?-cobD; cobN<>cobN2-X-metE
cbiC; cbiD; cbiE; cbiF; cbiG; X-cbiH; cbiJ-X; cbiL-X; cbiT-X; cbiMNQO; cbiP; cbiA; X-cbiB-X; cobS-X; X-cobT; cobD; cobY; cobZ; cobN; btuF<>btuCD
cbiC-X; cbiD; cbiE; cbiF; cbiB-cbiG; cbiH; X-cbiJ; cbiL; cbiT; cbiMNQO; cbiP; cbiA; (cobX-cobZ)-cobS-X; cobD; cobT-X; X-cobY;cobN-transp2-gene2
cbiCHDTLF; cbiGE; X-cobD-cobS-cbiB; cbiA; cobT-(cobX-cobZ)-btuR;
cobD-cbiB<$>gene7-cobZ--cobS-cobY;
$cobT; btuR;
$nrdDG; $mutB*-$ygfD-mmcE; $mutB^; $sucS; $btuF; $btuCD
cobD-cbiB<$>cobZ--cobS-XX-cobY; gene7;
$cobT; btuR;
$nrdDG; $mutB*-$ygfD-mmcE; $mutB^; $sucS; $btuF; $btuCD
cobD-cobZ-gene7-cbiB$<//>cobS-cobY><cbiP-X<$>cobT; btuR; $cobS2; $nrdDG; $mutB*-$ygfD-mmcE; $mutB^-X; $sucS; $btuF; $btuCD
$cbiGECHDTLF; cbiP<>cobS-cbiB; $cobT; X-cobD; cobY; cobC; cbiA; $hoxN;
btuF<$>btuCD
Distribution of B12-elements in bacterial genomes
B12-elements regulate cobalamin biosynthesis genes, vitamin B12 transporters,
cobalt transporters, and a number of other cobalamin-related genes.
Phylogenetic tree of B12-elements
[Figure: phylogenetic tree of B12-elements. Leaves are labeled by genome abbreviation and gene name (for example EC_BTUB, SX_METE, DR_ACHX). Elements without the B2 domain are shown in gray squares.]
The predicted mechanism of the B12-mediated regulation of cobalamin genes
[Figure: two panels, A and B, each showing the conserved B12-element structure (helices P0 to P6 and a pseudoknot) followed by alternative downstream structures. Panel A: with Ado-CBL, a terminator forms; the alternative structure is an antiterminator. Panel B: with Ado-CBL, an RBS-sequestor hairpin forms; the alternative structure is an antisequestor.]
Phylogenetic distribution of gene clusters
regulated by B12-elements
Gene cluster
1. CBL biosynthesis:
cbi and cob
cbt, hoxN, cbiMNQO,
hupE
orf1-cobW-cobN-chlID
bluB
btuR
Function
Taxonomic group
cobalamin
biosynthesis
cobalt transporters
proteobacteria, the Bacillus/Clostridium group
cobalt chelation
-, -proteobacteria, Pseudomonadaceae,
actinobacteria
-proteobacteria
-, -proteobacteria, Pseudomonadaceae
cobalt reduction
CBL
adenosyltransferase
2. Vitamin B12 transport:
btuB
vitamin B12 receptor
btuFCD
vitamin B12
transporter (ABC
components)
all CBL-synthesizing bacteria
proteobacteria
-, -proteobacteria, Pseudomonadaceae, the
Bacillus/Clostridium and CFB groups, Deinococcus
radiodurans, actinobacteria, spirochetes,
Fusobacteriaceae, Thermotogales,Chloroflexaceae
3. B12 -dependent or alternative metabolic pathways:
metE
methionine synthase
various groups
nrd
ribonucleotide reductase
various groups
ardX-frdX
predicted enzymes
-proteobacteria
achX
predicted enzymes
Deinococcus radiodurans and some other species
B12-dependent and B12-independent isozymes
Ribonucleotide reductases: NrdJ (B12-dependent), NrdAB/NrdDG (B12-independent)
Methionine synthases: MetH (B12-dependent), MetE (B12-independent)
+
–
+
–
–
+
–
+
+
+
+
+
B12
B12
B12
B12-independent isozymes of methionine synthase and ribonucleotide reductase
are regulated by B12-elements in genomes that possess both isozymes
(this was not previously known)
Conserved S-box structure
[Figure: consensus secondary structure of the S-box drawn from 5' to 3', with a base stem and helices P1 to P5.]
Distribution of MET regulatory elements
Genome
AB Methionine biosynthetic genes
Bacillales:
Bacillus subtilis
BS
MetK
metB;
&metI-metC; &metF*; &metE; &metK
&cysH-ylnABCDEF
Bacillus cereus
BC &metY-metB-hom; metC-metI&<>&metF*-metH; &metK
&cysH-ylnBCADEF; metX;
metE
Bacillus halodurans
BH metY; metB; &hom; &metI-metC-metF*-metH; metE&metK
Bacillus stearothermophilus # BE [metY; [metB;
[&metI-metC; [&metF*-metH
&metK
Oceanobacillus iheyensis
OB &metY1; metB;
Staphylococcus aureus
Listeria monocytogenes
&b hmT;
&X-metY2
SA
&metX;
$metI-metC-metF*-metE-mdh
LMO &metY-metX; &metE-metI-metC-metF*
Lactobacillales:
Enterococcus faecalis
Lactobacillus plantarum
Lactobacillus gasseri #
Lactobacillus casei #
Lactobacillus delbrueckii #
Lactobacillus brevis #
Oenococcus oeni #
Leuconostoc mesenteroides #
Pediococcus pentosaceus #
EF no
metK
LP $metB-metY-hom; $metI;
$metE- metF* metK
LGA no
?
LCA
$metE-metF
metK
LDB metY; metB-rhc1-yusA2;
$rhc2-metE-metF metK
LB no
?
OOE metY; yxjH2-metI-metC-metB
metK
LME $metB-metI-metC-yxjH1--rhc-$yxjH2-metY; $metF-E metK
PPE no
metK
Streptococcaceae:
Lactococcus lactis
LL
metY;
metB-metI;
Streptococcus agalactiae
SAG
Streptococcus mutans
MN #metY-mdh; ##metB;
metI;
Streptococcus pneumoniae
PN #metY;
#metB; #metI;
Streptococcus pyogenes
ST
metC
Streptococcus suis #
SSU #metY;
metB; [metI;
Streptococcus thermophilus # STH metY;
##metB; #metI;
Streptococcus uberis #
SUB no
#metE-metF
#metE-metF*
##metE-metF*
#metE-metF
#metE-metF
##metE-metF
Clostridia:
Clostridium acetobutylicum
CAC &metY; &metB;
&metI-metC; metF*; &metH
Clostridium perfringens
CPE no
Clostridium botulinum
CB &metY-hom-metB;
&metF-msd-metH^
Clostridium tetani
CTC &metY-metB;
-msd-metH^
Clostridium difficile #
DF &metY-metB; &hom; folD-X-metF; rhc-msd-metH^
Thermoanaerobacter tengcongensis
TTE &hom-metY-metX;
&metF-metH^
Streptomyces coelicolor
Thermobifida fusca #
Chlorobium tepidum
Chloroflexus aurantiacus
Cytophaga hutchinsonii
Thermotogales (TM, PMI)
SX
TFU metY-metX;
CL &metY-metX;
CAU &metY-metX;
CHU &metY; metX;
metY-metB;
metF; metE; metH
metF;
metH
metF;
metH
metF;
metH
metE; metH--metF
metF-msd-metH^
Transporters
Other genes
&yusCBA; yusA2
mtnZYXW&mtnV<>mtnU&mtnKS; &yoaD;
yrrT-mtn-yrhAB; rhc; &yxjH1-&yxjH2
mtnZYXW&mtnV<>mtnU&mtnKS;
yrrT-mtn-yrhAB; rhc; &mdh; &hmrA
&BH0835; mtn; rhc
mtnZYXW&mtnV<>mtnU&mtnKS;
yrrT-mtn-yrhAB; rhc
yrrT-rhc–yrhAB; mtn; &yxjH; &OB1276;
OB3079&<>&OB3078; &OB2779-OB2778
yrhAB-yusCBA2; rhc<>hmrA; mtn
&yxjH; mtn; rhc
&yusCBA1-yusA2; &yusCBA3; &yusACB4;
&metT; &mtnABC; &oppBCDFA; &mtsABC
&yusCBA
&yusCBA; mtnABC
&metK &yusCBA1; &X-yusACB2; &yusCBA3;
&yusCBA4-hmrA
&metK &metT; &yusCBA;
hcp-mtsABC
&metK &yusCBA1; &yusACB2; &oppABC
$yusCBA1;$$yusCBA2; yusCBA3; $opp; mtsABC
$yusCBA1; $yusCBA2
$yusACB;
hcp-mtsABC
$yusA1; $yusA2CB
$yusACB;
hcp-mtsABC
$yusACB
$yusCBA1; $yusA2-yxjH1-hmrB-yusCB; mtsABC
[yusA1-yusA2-hmrB-yusCB; yusA3; $hcp-mtsABC
$yusCBA
metK
metK
metK
metK
metK
metK
metK
metK
yusA1-A2-A3-A4-yusCB-mtsABC
#yusA-hmrB-yusCB-/-mtsABC; #yusA2; yusCBA3
#yusA-hmrB-yusCB-X-#hcp-mtsABC
#yusA-hmrB-yusCB; #hcp-mtsABC
#yusACB
[yusA-hmrB-yusCB]; [hcp-mtsABC
#yusA-hmrB-yusCB-X-hcp-mtsABC;yusA2
#yusA-hmrB-yusCB; #hcp-mtsABC
&metK
&metK
&metK
&metK
&metK
&metK
metK
metK
metK
metK
metK
metK
&yusCBA
&metT;
&metT; &yusCBA1-yusA2; yusCB2
&metT; &yusCBA
&yusCBA1; &yusA2
mtsABC
yusCBA
yusCBA
&yusACB (only in Petrotoga miotherma)
$yxjH;
$yxjH1; yxjH2;
$yxjH-rhc;
$yxjH-rhc;
rhc;
rhc;
$yxjH;
yxjH3;
mdh1; $mdh2
$yxjH;
rhc;
rhc;
yxjH;
#yxjH;
#fhs; #folD;
#yxjH; #mdh;
#yxjH; #mdh;
rhc;
rhc;
rhc;
rhc;
rhc;
rhc;
rhc
rhc;
rhc
Tcub iG-yrhBA><&; rhc;
ubiG-yrhBA-rhc;
ubiG-yrhB-rhc-yrhA;
&SCD95A.26
&SCD95A.26
mtn
mtn
mtn
mtn
mtn
mtn
mtn
mtn
mtn
mtn
mtn
mtn
mtn
mtn
mtn
mtn
mtn
mtn
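The table cells above use a compact operon notation: loci separated by ";", genes within a locus joined by "-", and prefix symbols such as "&", "$", "#" or "[" in front of a locus. A minimal sketch for reading simple cells follows. The function name and output format are assumptions, the meaning of the prefix symbols is not stated in this part of the transcript (the parser only records them), and compound cells containing "<>" or an "&" inside a locus are not handled.

```python
MARKERS = "&$#["  # prefix symbols seen in the table; their meaning is not assumed here


def parse_loci(cell):
    """Split one table cell into loci, each with its prefix markers and gene list."""
    loci = []
    for chunk in cell.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip the empty piece after a trailing ";"
        markers = ""
        while chunk and chunk[0] in MARKERS:
            markers += chunk[0]
            chunk = chunk[1:]
        genes = [gene for gene in chunk.split("-") if gene]
        loci.append({"markers": markers, "genes": genes})
    return loci


for locus in parse_loci("&metI-metC; &metF*; metB;"):
    print(locus)
```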
[Figure: methionine metabolism pathway.
Metabolites: aspartate semialdehyde, homoserine, threonine, O-acetylhomoserine, sulfide, cystathionine, homocysteine, methionine, S-adenosylmethionine (SAM), S-adenosylhomocysteine (SAH), S-ribosylhomocysteine (SRH), MTA, methylthioribose (MTR), CH3, THF, methyl-THF, methylene-THF.
Genes: hom, cysH-..., metX, metB, metI, metY, metC, yrhA, yrhB, yxjH*, metE, metH, metF, metK, mtn, mtnKSUVWXYZ.]
Phylogenetic tree of the NhaC Na+:H+ antiporter superfamily
including predicted methionine-, lysine- and tyrosine-specific
transporters
[Figure: phylogenetic tree.
Labeled clades: Pasteurellaceae, clostridia, Archaea.
Labeled transporter groups: LysT, TyrT, MetT, MleN.
Leaf labels: NMB, SON-1, SON-2, SON-3, BL1111, VC-1, VC-2, BH, OB, CAC0744, CB, FN0352, PPE, LP-nha1, LP-nha2, LGA, LME, LB, EF-nhaC1, EF-nhaC2, BC1434, FN1414, BT1270, NMB0536, SA2117, CJ, OB2874, BC4121, TTE-nhaC, CTC, CPE, DF, FN0978, OB1118, HP, BS-yheL, FN0650, BC1709, CTC00901, FN0624, CTC02520, BS-mleN, BB0637, CPE2317, FN1420, CTC02529, VCA0193, SO1087, FN1422, BC0373, BB0638, FN2077, BH3946, VC2037, SA2292, HI1107, VV21061.]
RFN. Regulated genes: riboflavin biosynthesis and transport. Ligand: FMN (flavin mononucleotide). Distribution: Bacillus/Clostridium group, proteobacteria, actinobacteria, other bacteria.
THI. Regulated genes: biosynthesis and transport of thiamin and related compounds. Ligand: thiamin pyrophosphate. Distribution: Bacillus/Clostridium group, proteobacteria, actinobacteria, cyanobacteria, other bacteria, archaea (thermoplasmas), plants, fungi.
B12. Regulated genes: biosynthesis of cobalamin, transport of cobalt, cobalamin-dependent enzymes. Ligand: adenosylcobalamin. Distribution: Bacillus/Clostridium group, proteobacteria, actinobacteria, cyanobacteria, spirochaetes, other bacteria.
S-box. Regulated genes: metabolism of methionine and cysteine. Ligand: adenosylmethionine. Distribution: Bacillus/Clostridium group and some other bacteria.
LYS. Regulated genes: lysine metabolism. Ligand: lysine. Distribution: Bacillus/Clostridium group, enterobacteria, other bacteria.
G-box. Regulated genes: metabolism of purines. Ligand: guanine, adenine. Distribution: Bacillus/Clostridium group and some other bacteria.
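The summary table above maps each regulatory element to the metabolism it controls and the ligand it senses. As a rough illustration, it can be held in a small lookup structure. The dictionary layout and the function name are assumptions made for this sketch; the entries are taken from the table.

```python
# element -> (regulated metabolism, ligands), as listed in the summary table
RIBOSWITCHES = {
    "RFN": ("riboflavin biosynthesis and transport", ("FMN",)),
    "THI": ("biosynthesis and transport of thiamin and related compounds",
            ("thiamin pyrophosphate",)),
    "B12": ("cobalamin biosynthesis, cobalt transport, cobalamin-dependent enzymes",
            ("adenosylcobalamin",)),
    "S-box": ("methionine and cysteine metabolism", ("adenosylmethionine",)),
    "LYS": ("lysine metabolism", ("lysine",)),
    "G-box": ("purine metabolism", ("guanine", "adenine")),
}


def elements_for_ligand(ligand):
    """Return the names of elements whose listed ligands include `ligand`."""
    wanted = ligand.strip().lower()
    return [
        name
        for name, (_process, ligands) in RIBOSWITCHES.items()
        if wanted in (item.lower() for item in ligands)
    ]


print(elements_for_ligand("adenine"))
```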
Properties of riboswitches
• Direct binding of ligands
• Same structure – different mechanisms
• Distribution in all taxonomic groups (diverse bacteria; archaea: thermoplasmas; eukaryotes: plants and fungi)
• Correlation between the mechanism and taxonomy:
– Attenuation of transcription (anti-antiterminator) predominates in the Bacillus/Clostridium group
– Attenuation of translation (anti-antisequestor of translation initiation) predominates in proteobacteria
Some confirmed predictions of metabolite promoting RNA
riboswitches.
Structure of RFN-element and RNA riboswitch mechanism
(regulation of riboflavin metabolism and transport genes)
Bacillus subtilis: Winkler et al., 2002b; Mironov et al., 2002
Structure of THI-element and RNA riboswitch mechanism
(regulation of thiamin metabolism and transport genes)
Bacillus subtilis: Mironov et al., 2002
Escherichia coli: Winkler et al., 2002a
Structure of B12-element
(regulation of cobalamin metabolism, transport and other cobalamin-related genes)
Escherichia coli: Nahvi et al., 2002
Streptomyces coelicolor: Borovok et al., 2006
• Dmitry Rodionov
• Andrei Mironov
• Mikhail Gelfand