The E. coli Extended Genome Fernando Baquero Dept. Microbiology, Ramón y Cajal University Hospital, and Laboratory for Microbial Evolution, CAB (INTA-CSIC) Madrid, Spain.


The E. coli Extended Genome

Fernando Baquero

Dept. Microbiology, Ramón y Cajal University Hospital, and Laboratory for Microbial Evolution, CAB (INTA-CSIC) Madrid, Spain

The Species E. coli

Roles of the concept of “species”

• Units of taxonomic classification: Units in the general reference system that microbiologists use to order the isolates
• Units of generalization: Kinds of microorganisms over which explanatory-predictive generalizations can be made
• Units of evolution: Bacterial entities that participate in evolutionary processes and undergo evolutionary change

(Modified from T.A.C. Reydon, Ph.D. Dissertation, Leiden University, 2005)

The Species E. coli

[Slide repeats the three roles of the species concept (units of taxonomic classification, units of generalization, units of evolution), annotated with the labels “New way” and “Classic way”.]

Diversity at all hierarchical levels

• Strain: mutation. Some strains are more mutable than others.
• Population: clonalization. Do some populations tend to produce more clones?
• Community: speciation. Do some bacterial groups tend to produce more species?

At any level, the origin of diversity is probably stochastic

Adaptation Complexity

• Mutation: single adaptive event
• Clonalization: multiple adaptive events
• Speciation: very complex adaptive events

Clonalization

Allopatric clonalization Sympatric clonalization

Host Defenses

Clonalization

Allopatric clonalization; ExPEC*; Non-ExPEC; Sympatric clonalization

* From James R. “Linnaeus” Johnson

The elimination of intermediates

The impossibility of being both a businessman and a little mermaid

Species-Environment Concerted Evolution Phylogenetic groups Core genome species evolution Basic reproductive environment environmental evolution

Co-evolution: Trees within Trees

Host Bacteria or bacterial consortium

The clues of E. coli genetic diversity

• Errors in DNA replication and repair
• Horizontal genetic transfer from other organisms
• Creation of mosaic genes from parts of other genes
• Duplication and divergence of pre-existing genes
• De novo invention of genes from DNA that previously had a non-coding sequence

Modified from Wolfe and Li, Nat. Genet. 33, 2003

Not a single strain represents the whole species

• K12-MG1655 (4,289 ORFs)
• K12-W3110 (4,390 ORFs)
• O157:H7 (Sakai) (5,361 ORFs)
• O157:H7-EDL933 (5,349 ORFs)
• E2348/69
• CFT073 (UPEC) (5,379 ORFs)
• O42 (EAEC), HS, E24377A (ETEC), Nissle (PBEC)
• Shigella flexneri SF-301 and 2457T (4,084)

E. coli genomes

1,000 genes of difference!

http://colibase.bham.ac.uk
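The "1,000 genes of difference" can be checked against the ORF counts quoted for the sequenced strains. The short sketch below only uses the counts given on the slide; the dictionary keys and variable names are illustrative choices, not part of the original talk.

```python
# ORF counts exactly as quoted on the slide (strains without a quoted
# E. coli ORF count are left out).
orf_counts = {
    "K12-MG1655": 4289,
    "K12-W3110": 4390,
    "O157:H7 Sakai": 5361,
    "O157:H7-EDL933": 5349,
    "CFT073 (UPEC)": 5379,
}

# Strain with the fewest and with the most annotated ORFs.
smallest = min(orf_counts, key=orf_counts.get)
largest = max(orf_counts, key=orf_counts.get)

# Difference in ORF count between the two extremes.
spread = orf_counts[largest] - orf_counts[smallest]
print(f"{largest} has {spread} more ORFs than {smallest}")
```

The spread between the smallest and largest listed genome is on the order of a thousand ORFs, which is the point of the slide.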


Loops in a common core backbone

[Figure: an A-strain and a B-strain sharing a common core backbone, with strain-specific A-loops (A-islands) and B-loops (B-islands).]

• E. coli Sakai: 296 loops; backbone (BB) 3,730 kb; S-loops 1,393 kb
• E. coli K12: 325 loops; backbone (BB) 3,730 kb; K-loops 537 kb
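As a rough worked example, the backbone and loop sizes quoted above give the share of each chromosome that sits in loops. This is a minimal sketch that assumes the chromosome is simply backbone plus loops (plasmids and any unaligned remainder are ignored); the function name is hypothetical.

```python
def loop_fraction(backbone_kb: float, loops_kb: float) -> float:
    """Fraction of (backbone + loops) that is made of loops."""
    return loops_kb / (backbone_kb + loops_kb)

# Sizes in kb, as quoted on the slide.
BACKBONE_KB = 3730
sakai_fraction = loop_fraction(BACKBONE_KB, 1393)  # S-loops
k12_fraction = loop_fraction(BACKBONE_KB, 537)     # K-loops

print(f"Sakai: {sakai_fraction:.1%} of the sequence is in loops")
print(f"K12:   {k12_fraction:.1%} of the sequence is in loops")
```

Under this assumption, roughly a quarter of the Sakai sequence and about an eighth of the K12 sequence lie outside the shared backbone.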

Loop sizes

Chiapello et al., BMC Bioinformatics, 6:171, 2005

• Large loops arise from horizontal transfer events.
• Small loops may arise from replication errors (small deletions or insertions), or correspond to highly polymorphic regions.

The core backbone is not the minimal genome

• The “core backbone” is not the “minimal E. coli genome”, because of the high level of gene redundancy.
• A high number of genes are members of gene families (2-30 copies), similar enough to be assigned similar functions (paralogs).
• Such redundancy involves 20-40 % of the E. coli coding sequences (more in the largest genomes).
• The “in-silico metabolic phenotype”, including all basic functions, predicts about 700 genes in the minimal genome (Blattner et al., Science 1997; Edwards and Palsson, PNAS 2000).

Gogarten and Townsend, Nature Rev. Microbiol. (2005)

The blue gene, unexpected in the species “C”, might have arisen: i) by horizontal gene transfer; or ii) by an ancient gene duplication followed by differential gene loss.

The loops

• The backbone evolves by vertical transfer.
• Large loops are probably acquired by horizontal gene transfer, but also evolve by vertical transfer: PAIs, islets, phages, plasmids, transposable, repetitive elements...
• Loops tend to have a different codon usage and a higher AT % than the backbone.
• Loops tend to contain operational genes (actions) more frequently than informative genes (complex regulation) (R. Jain, 1999).

[Figure, three panels; elaboration from Jain et al.
1. Random-scale sub-network (loop), drawn as nodes and links, receiving an ALIEN gene: operative genes are more easily accepted.
2. Scale-free network (core), plotted as nodes against number of links (log), receiving an ALIEN gene: informative genes are less easily accepted.
3. Scale-free network (core) receiving an ALIEN sub-network: informative genes are less easily accepted, except by alien replacement of an entire sub-network.]

3,256 E. coli genes are connected by 113,894 links Predicted functional modules in E. coli

(von Mering et al., PNAS 100:15428, 2003)

Loops as R&D E. coli laboratories

[Figure: proteins expressed (bars in red) and positions of K-loops (bars in blue).]

The genes in the loops express proteins in only 10% of the cases.

M. Taoka et al., Mol. & Cell. Proteomics (2004)

Gene flux

[Figure: gene flux through acquisition, duplication, modification, excision and loss.]

(Daubin et al., Genome Biol., 4:R57, 2003; Ochman and Jones, EMBO J., 19:6637, 2000)

• More loss in sequences of recent acquisition*
• Insertions and deletions occur more frequently in loops
• Overall less loss than acquisition?

Constant Random Gene Influx?

[Figure: gene flux through acquisition, duplication, modification, excision and loss.]

As in the case of random mutation, there might be a blind, random uptake and loss of available foreign genetic sequences; environmental selection and random drift determine the fate of these constructions.
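The idea of a constant, blind influx balanced by blind loss can be illustrated with a toy simulation. This is only a sketch under strong assumptions that are not in the talk: one acquisition attempt per generation with probability `p_gain`, independent loss of each acquired gene with probability `p_loss`, and no selection at all (so it shows the neutral part of the process, not the environmental filtering). All names and parameter values are hypothetical.

```python
import random

def simulate_gene_flux(generations=1000, p_gain=0.05, p_loss=0.01, seed=0):
    """Track the number of acquired (loop) genes under random gain and loss."""
    rng = random.Random(seed)   # seeded so that runs are reproducible
    ages = []                   # age, in generations, of each acquired gene
    history = []
    for _ in range(generations):
        # Blind uptake: a new foreign gene arrives with probability p_gain.
        if rng.random() < p_gain:
            ages.append(0)
        # Blind loss: every resident foreign gene is lost with probability p_loss.
        ages = [age + 1 for age in ages if rng.random() >= p_loss]
        history.append(len(ages))
    return history

history = simulate_gene_flux()
print("acquired genes present at the end:", history[-1])
```

In this neutral toy model the number of acquired genes fluctuates around `p_gain / p_loss`; adding a fitness effect per gene would be the natural next step to represent selection.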

E. coli: where do alien genes come from?

• Enterobacteriaceae (56 %) (Klebsiella, Salmonella, Serratia, Yersinia); Aeromonas, Xylella, Ralstonia, Caulobacter, Agrobacterium
• Plasmids (28 %): about 250 plasmids identified in E. coli.
• Phages (10 %) + many ORFan genes (64 MG1655-specific)

(Modified from Dufraigne et al., NAR 33, 2005, and Daubin & Ochman, Genome Research, 2004)

The E. coli “Gene Exchange Community” should be better identified!

E. coli Recipient Barriers for Horizontal Gene Transfer

• Ecological separation from donor
• DNA sequence divergence
• Low numbers
• Inadequate phage receptors
• Inadequate pilus specificity for mating
• Contact-killing or inhibition
• Surface exclusion
• Restriction*; no anti-restriction mechanisms, gene inactivation
• Absence of replication of foreign gene, incompatibility
• Absence of integration of foreign gene in specific sites
• No recombination with host genome (AT/CG), MMR system
• Decrease in fitness of recipient after DNA acquisition
• No more room for new DNA: Headroom (Maximal Genome?)

* 200 enzymes!

Sequence divergence reduces acquisition of foreign DNA. If the acquisition produces neutral events, the tolerance increases.

Modified from Gogarten and Townsend, Nature Rev. Microbiol., 2005

Deleterious events are frequent with high divergence, but eventual beneficial events are rare with low divergence rates

Species-Environment Concerted Evolution Phylogenetic groups Core genome species evolution Basic reproductive environment environmental evolution

Genome Size in E. coli strains: ECOR Phylogenetic Groups

[Figure: genome size by ECOR phylogenetic group (A, B1, B2, D); vertical axis labelled kb with ticks from 4 to 5,4; the K12 level is marked.]

Data: Bergthorsson and Ochman, Mol. Biol. Evol. 15:6-16, 1998

Phylogenetic groups: clinical associations

[Figure: distribution among phylogenetic groups A, B1, B2 and D (vertical axis 0 to 100) for six series: Clinical, Cystitis, Febrile UTI, Rectal (FUTI), Faecal HV-Fr, Faecal HV-Sp.]

Clinical: Johnson et al., EID 11:141, 2005; Cystitis: Johnson et al., AAC 49:26, 2005; FUTI and rectal FUTI: Johnson et al., JCM 43:3895, 2005; Faecal Fr/Cr/Ma: Duriez et al., Microbiology 147:1671, 2001; Faecal HV Spain: Machado et al., AAC 49, 2005

Phylogenetic groups: clinical associations

• But: “epidemic extraintestinal strains”, many SxT-R, in UTI in the US, Israel and France (Johnson et al., EID 11:141, 2005).
• Groups B2 and D are the most frequently found in E. coli bacteremia (Hilali et al., Inf. Imm. 68:3983, 2000; Johnson et al., JID15:2121, 2004; Bingen, yesterday).

Distribution of E. coli isolates from hospitalized patients and from healthy volunteers among the four phylogenetic groups

[Figure: isolates in phylogenetic groups A, B1, B2 and D; vertical axis 0 to 50.]

Machado, Cantón, Baquero et al., AAC 49 (2005)

• ESBLs (red) predominate among strains of group D.
• Pathogenic strains, non-ESBL, predominate in group B2.
• Commensal strains, non-ESBL, predominate in group A.

Antimicrobial-R in phylogenetic groups

[Figure: antimicrobial resistance in phylogenetic groups A, B1, B2 and D (vertical axis 0 to 80); series: SxT-R, ESBLs, Cipro-R(1), Cipro-R(2).]

SxT-R and Cipro-R(1): Johnson et al., AAC 49:26, 2005; ESBL: Machado et al., AAC 49, 2005; Cipro-R(2): Kuntaman et al., EID 11:1363, 2005 (Indonesia).

The phylogenetic group B2, the most pathogenic one, tends to be the least resistant?

Species-Environment Concerted Evolution Ecotypes Core genome species evolution Basic reproductive environment environmental evolution

Models for Multiple Ecotypes

(Gevers et al., Nature MR 3:733, 2005)

Clonalization

Patients with different ESBL clones

Ramón y Cajal Hospital, Madrid (Baquero, Coque & Cantón, Lancet I.D. 2:591, 2002)

[Figure: vertical axis 0 to 30; horizontal axis: Year, 88 to 99.]

Mutation: Intra-Clonal Diversity

[Figure: E. coli mutation frequency classes (Hypo, Normo, Weak, Strong) among Faecal, Urine, Blood and ESBL isolates; vertical axis 0 to 80.]

Baquero et al., AAC 2004 and Nov. 2005

Clonal Ensembles: Metastability through Intermittent Fixation

Different clones peak in frequency at different times, according to the best-fit clone in each epoch* of a changing environment.

* Epochal evolution

[Figure: line of best-fit clones over time; the clones together form a clonal ensemble.]

The maintenance of clonal ensembles is favored by the asymmetry of fitness abilities in different clones in different epochs.

Shared Environments and Maintenance of Diversity

A regional polyclonal community structure

Alternative stable equilibria and the coexistence of variant organisms

On this topic:

Geographic mosaic theory of coevolution, Forde et al, Nature, 2004

Maintenance of diversity

A regional polyclonal community structure

Local Migration; Local Gene Flow

Diversity: Collapse and Resurrection

Kin effects in open systems SELECTION

Maintenance of diversity

A regional polyclonal community structure

Environmental gradients are composed of a multiplicity of patches that may act as discrete selective points for bacterial variants

Maintenance of diversity

A regional polyclonal community structure

Gradients and concentration-dependent selection

(F. Baquero and C. Negri, Bioessays, 1997)

Maintenance of Diversity by Scissors, Rock, Paper Model

B. Kerr et al., Local dispersal promotes biodiversity in a real life game of rock-paper-scissors. Nature 418:171, 2002

Rock, Paper, Scissors Model

1. If the stones reduce their attack against scissors...

2. Scissors increase their power against paper...

3. And less paper means more stones...
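The cyclic logic of the three steps can be tried out in a small lattice simulation. The sketch below is a generic cyclic-dominance model with purely local interactions on a wrap-around grid; it is not the model used by Kerr et al., and the grid size, number of steps and update rule are assumptions chosen only for illustration.

```python
import random
from collections import Counter

# Each key beats its value: rock crushes scissors, scissors cut paper,
# paper wraps rock.
BEATS = {"rock": "scissors", "scissors": "paper", "paper": "rock"}

def simulate_local_rps(size=20, steps=20000, seed=1):
    """Run local rock-paper-scissors competition; return final counts."""
    rng = random.Random(seed)
    kinds = list(BEATS)
    grid = [[rng.choice(kinds) for _ in range(size)] for _ in range(size)]
    for _ in range(steps):
        # Pick a random cell and one of its four neighbours (wrap-around).
        r, c = rng.randrange(size), rng.randrange(size)
        dr, dc = rng.choice([(-1, 0), (1, 0), (0, -1), (0, 1)])
        nr, nc = (r + dr) % size, (c + dc) % size
        # If the neighbour beats the cell, the neighbour's type takes it over.
        if BEATS[grid[nr][nc]] == grid[r][c]:
            grid[r][c] = grid[nr][nc]
    return Counter(cell for row in grid for cell in row)

counts = simulate_local_rps()
print(dict(counts))
```

Changing the neighbour choice to a random cell anywhere on the grid turns the local model into a well-mixed one, which is the comparison the cited paper is about.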


In60-like integrons

Kindly provided by Teresa Coque et al., 2005

[Figure: gene maps of a series of In60-like integrons. Most maps show an Int1 integrase followed by a variable cassette region, a first qacEΔ1-sul1 segment, orf513, a variable region, and a second qacEΔ1-sul1 segment. Gene labels appearing in the maps: Int1 (int1), aacA4, aac(6), aadA1, aadA2, aadB, aar-3, catA2, catB3, dfrA10, dfrA16, dfrA18, oxa1, bla OXA-2, bla oxA30, ampC, ampR, bla CTXM-9 (CTX-M-9), bla CTXM-2 (CTX-M-2), bla DHA, qnr, qacEΔ1, sul1, orf513, orf1, orf3-like, orf5, orf6, orfD, IS3000, IS6100.]

Extensive “Macfarlane Burnet” Model and Evolution of Bacterial Pathogenicity

• Every evolutionary element (clones, chromosomal sequences, plasmids, transposons, islands, recombinases, insertion sequences...) is independently subject to apparently random spontaneous variation.
• Combinations of the variant elements are constantly constructed, apparently at random.
• Eventually a given combination is selected and enriched by an unexpected advantage (colonization-pathogenicity) or fixed by drift.

Pre-pathogens are probably constantly constructed; many of them are eliminated by immunity and the normal microbiota

The opportunity of meeting interesting people: E. coli in the environment

• It has been suggested that one-half of the E. coli population resides in primary habitats (warm-blooded hosts) and one-half in soil or water.
• Tropical waters harbor natural populations of E. coli (Carrillo et al., AEM 50:468, 1985).
• In nutrient-rich soils, particularly with cyclic periods of wet and dry weather, E. coli is a member of the normal microflora (Winfield and Groisman, AEM 69:3687, 2003).

E. coli in the environment

• Land disposal practices of sewage and sewage sludges that result from wastewater treatment.
• More than 3 million gallons of sewage effluent from more than 3,000 land treatment sites and 15 million septic tanks were applied to land every day in 1984 (Keswick, BH. 1984).
• More than 7 million dry tons of sewage sludge are produced annually and 54 % of this is applied to soil (Environmental Protection Agency, http://www.epa.gov./oigearth; 2002; Santamaría & Toranzos, Int. Microbiol. 6:5-9, 2003).

E. coli in the environment

• EPA Class A Biosolids: less than 10^3 thermotolerant coliforms/g; for lawns, home gardens, and as commercial fertilizer.
• EPA Class B Biosolids: less than 10^6 thermotolerant coliforms/g; for land application, forest lands, reclamation sites. For a period after application, access by the public and livestock is restricted.

(Environmental Protection Agency)
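The two thresholds can be written as a simple classification rule. This is a minimal sketch that uses only the limits quoted on the slide (10^3 and 10^6 thermotolerant coliforms per gram); the function name and the "neither class" return value are hypothetical, and the actual regulation contains further criteria that are not represented here.

```python
def biosolids_class(coliforms_per_gram: float):
    """Return 'A', 'B', or None using the thresholds quoted on the slide."""
    if coliforms_per_gram < 0:
        raise ValueError("coliform density cannot be negative")
    if coliforms_per_gram < 1e3:
        return "A"   # lawns, home gardens, commercial fertilizer
    if coliforms_per_gram < 1e6:
        return "B"   # land application, forest lands, reclamation sites
    return None      # meets neither limit

for density in (5e2, 2e4, 3e6):
    print(f"{density:.0e} coliforms/g -> class {biosolids_class(density)}")
```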

Temperature fitness profiles

[Figure: absolute fitness (axis from 5 to -20) against temperature (10 to 50 ºC), one panel for E. coli and one for K. pneumoniae.]

Modified from: Okada and Gordon, Mol. Ecol. 10:2499, 2001

CTX-M-10 linked to Kluyvera and phage sequences

[Figure: genetic map of the CTX-M-10 region. Labelled elements: Tn1000-like transposase (fragment), ORF2, ORF3, ORF4, DNA invertase, CTX-M-10, ORF7, ORF8, transposase IS432, ORF10, ORF11, transposase IS5; EcoRI and BamHI restriction sites; an invertible region; Tn5708; IS4321 fragment; K. cryocrescens homologous region (90%); IS5; phage-related region.]

Oliver, Coque, Alonso, Valverde, Baquero, Cantón. AAC 2005; 1567-1571

• Present in different clones at Ramón y Cajal Hospital
• Variability in the sequence among different clones
• Probably linked to the same plasmid structure

The Extended Genome

A genetic space composed of the sum of:
• The sequences corresponding to the maximal core genome of all clones (orthologs-paralogs), plus
• The sequences of all loops that have been inserted in such a core in the different natural (successful at one time) clones or lineages: ecotypes, geotypes, pathotypes..., plus
• The sequences of all extra-chromosomal elements stably associated with any clone
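Read as a set operation, the definition above is a union over all clones of three kinds of sequence. The sketch below is one possible way to express that, assuming each clone is described by three sets of sequence identifiers; the `Clone` class, the function name and every identifier in the example are hypothetical placeholders, not real gene names.

```python
from dataclasses import dataclass, field

@dataclass
class Clone:
    name: str
    core: set = field(default_factory=set)              # core genome sequences
    loops: set = field(default_factory=set)             # inserted loops (islands)
    extrachromosomal: set = field(default_factory=set)  # stably associated elements

def extended_genome(clones):
    """Union of core, loop and extra-chromosomal sequences over all clones."""
    space = set()
    for clone in clones:
        space |= clone.core | clone.loops | clone.extrachromosomal
    return space

# Hypothetical example with made-up identifiers.
clone_a = Clone("A", core={"core1", "core2"}, loops={"islandA"})
clone_b = Clone("B", core={"core1", "core2"}, loops={"islandB"},
                extrachromosomal={"plasmidX"})
genetic_space = extended_genome([clone_a, clone_b])
print(sorted(genetic_space))
```

No single clone in the example carries every element of the resulting set, which mirrors the earlier point that no single strain represents the whole species.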

Extended Genome: a Genetic Space

Core Loops Peripheral

Extended Genome: Core Gravity

Foreign sequences of different base composition tend to “ameliorate”, coming to resemble the features of the resident genome*

[Figure labels: Core, Loops, Peripheral]

*Ochman and Jones, EMBO J., 19:6637, 2000

Extended Genome: a Genetic Space

Filling the Carrying Capacity of the Environment for the Species

Genetic Space

Complex Genetic Space

The Extended E. coli Genome

• Research to increase our interpretative, predictive and preventive capability about Escherichia coli evolutionary biology.
• Catalog of sequences of all evolutionarily relevant pieces* in E. coli.
• Network of all interactions between pieces.
• Modelling of combinations that might emerge under particular environmental or clinical conditions.

*F. Baquero, From Pieces to Patterns, Nature Reviews 2004

A lot of work, a lot of fun.

Particular thanks to some of my friends in the lab...

• Rafael Cantón
• Teresa Coque
• Juan-Carlos Galán
• José-Luis Martínez (CNB, CSIC)

Gerdes SY et al, JB 2003