
This presentation was originally prepared by
C. William Birky, Jr.
Department of Ecology and Evolutionary Biology
The University of Arizona
It may be used with or without modification for
educational purposes but not commercially or for profit.
The author does not guarantee accuracy and will not
update the lectures, which were written when the course
was given during the Spring 2007 semester.
MOLECULAR EVOLUTION
Focus is on long-term evolution leading to differences between species.
Differences between species are determined by the same factors that determine
differences between individuals within a species: mutation, selection, and drift. But there
are some critical differences:
1. Time period
Population genetics: ≤ 4Ne generations (≤ time since most recent common ancestor of
all copies of a gene in a species)
Evolutionary genetics: > 4Ne generations
2. Selection causes hitchhiking within species; we only discussed this in terms of the
AdhF/S difference.
3. Method of study
Rates of sequence evolution
Molecular evolution is generally studied by the following procedure:
(1) Sequence one copy of a gene from each of a number of different species of
organisms
(2) Align sequences.
(3) Calculate the proportion of sites that differ, for each pair of species.
d = pairwise sequence difference in differences per bp
(4) Correct for multiple hits, especially if d ≥ 0.10
sequence divergence = K = -(3/4) ln[1-(4/3)d] in substitutions per bp
(5) Calculate rate of sequence divergence if desired
E = K/2T in substitutions per bp per year
(T = time in years since the two species diverged; the factor 2 counts both lineages)
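Steps (3) to (5) can be sketched in a few lines of Python. This is a minimal illustration, not the course's software: the function names, the toy sequences, and the 10-million-year divergence time are assumptions made up for the example. The correction in step (4) is the formula given above, commonly called the Jukes-Cantor correction.

```python
import math


def proportion_different(seq_a, seq_b):
    """Step (3): proportion of aligned sites that differ (d).

    Assumes the two sequences are already aligned; columns containing
    a gap character '-' are skipped (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def correct_multiple_hits(d):
    """Step (4): K = -(3/4) ln[1 - (4/3) d], in substitutions per bp."""
    if not 0 <= d < 0.75:
        raise ValueError("d must be in the range [0, 0.75)")
    return -0.75 * math.log(1 - (4.0 / 3.0) * d)


def divergence_rate(k, t_years):
    """Step (5): E = K / 2T, in substitutions per bp per year."""
    return k / (2 * t_years)


# Hypothetical toy alignment: 2 of 10 sites differ, so d = 0.2.
seq1 = "ACGTACGTAC"
seq2 = "ACGTTCGTAA"
d = proportion_different(seq1, seq2)
k = correct_multiple_hits(d)
e = divergence_rate(k, 10_000_000)  # assumed divergence time of 10 million years
print(d, k, e)
```

Note that K is always at least as large as d, and the gap between them grows as d increases, which is why the correction matters most for distant species.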




A species is a population of organisms.
During speciation, one species splits into two populations that evolve
independently of each other.
To study evolution, we usually sequence one gene from each species.
Many mutations occur in the subsequent evolution of both species, but most
are eliminated. The sequences only show differences that were fixed in one
or the other species.
K = number (per site) of base pair substitutions that occurred along both evolutionary
paths; i.e. the number of mutations that occurred AND were fixed in one or the
other species instead of being lost.
We can state this with a remarkably simple equation:
E = MF
rate of molecular Evolution = total Mutation rate X Fixation probability
M = 2Nu
where
u = mutations per site (bp) per gamete
F = mutations fixed/total mutations (substitutions per mutation)
2N = number of gametes in population or number of copies of the gene in
population
E = 2NuF
If we want to express this in units of time, then we have to incorporate
generation time by assigning units:
2N = gametes per year
E = 2N × u × F
  = (gametes / year) × (mutations / (site × gamete)) × (substitutions / mutation)
  = substitutions per site per year
e.g.
u = 5 × 10^-9 mutations per site per gamete
2N = 10^6 gametes
F = 10^-7 substitutions per mutation
E = 5 × 10^-10 substitutions per site per year
Haploids, organelles, asexuals: E = NuF
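The arithmetic in the example above can be checked with a short Python sketch. The function name and its `diploid` flag are hypothetical, introduced only for illustration; the numbers are the ones given in the example.

```python
def substitution_rate(n_copies, u, f, diploid=True):
    """Rate of molecular evolution E = M * F.

    n_copies : N (individuals, or gene copies per unit time if a rate is wanted)
    u        : mutations per site per gamete
    f        : fixation probability (substitutions per mutation)
    diploid  : True gives E = 2NuF; False gives E = NuF
               (haploids, organelles, asexuals)
    """
    total_copies = 2 * n_copies if diploid else n_copies
    return total_copies * u * f


# Values from the worked example: 2N = 10^6, so N = 5 x 10^5.
e_diploid = substitution_rate(5e5, 5e-9, 1e-7)
e_haploid = substitution_rate(5e5, 5e-9, 1e-7, diploid=False)
print(e_diploid)  # about 5e-10 substitutions per site per year
print(e_haploid)  # half the diploid value for the same N
```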
The rate of neutral substitution equals the mutation rate.
Neutral mutations:
Fn = 1/2N for a new mutation
En = 2Nu(1/2N) = u
E = u !!
This remarkably simple result is also remarkably important. It means that
the mutation rate can be estimated from the substitution rate for neutral
mutations.


The mutation rate equals the pseudogene substitution rate because
pseudogene substitutions are neutral.
Synonymous substitution rates in functional protein-coding genes are about
equal to pseudogene rates in eukaryotes, which shows that they are also
neutral or effectively neutral (average Ne|s| << 1).
Directional selection against detrimental mutations reduces F and thus E;
selection for advantageous mutations increases them.
F(new mutation in diploids) = (1 – e^(-2Nes/N)) / (1 – e^(-4Nes))
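A short Python sketch of the fixation probability F for a new mutation in diploids, using the formula above. The function name, the default Ne = N, and the example values (N = 1000, s = ±0.001) are assumptions chosen for illustration. The s = 0 case is handled separately because the formula becomes 0/0 there; its limit is the neutral value 1/2N.

```python
import math


def fixation_probability(s, n, ne=None):
    """F = (1 - e^(-2*Ne*s/N)) / (1 - e^(-4*Ne*s)) for a new mutation in diploids.

    s  : selection coefficient (0 neutral, < 0 detrimental, > 0 advantageous)
    n  : census population size N
    ne : effective population size Ne (assumed equal to N if not given)
    """
    if ne is None:
        ne = n
    if s == 0:
        return 1.0 / (2 * n)  # neutral limit of the formula
    # expm1(x) = e^x - 1; both signs flip, so the ratio is unchanged.
    # Very large values of |4*Ne*s| with s < 0 would overflow; this sketch
    # is intended for moderate values only.
    return math.expm1(-2 * ne * s / n) / math.expm1(-4 * ne * s)


N = 1000
for s in (0.0, -0.001, 0.001):
    f = fixation_probability(s, N)
    # E / u = 2N * F: equal to 1 when neutral, below 1 when detrimental,
    # above 1 when advantageous.
    print(s, f, 2 * N * f)
```

Multiplying F by 2N gives E/u, so the same function shows directly why E = u for neutral mutations, E < u for detrimental ones, and E > u for advantageous ones.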
The three classes of mutations with different levels of polymorphism also have
different rates of substitution:
Class               s        H            F           E
(1) neutral         s = 0    H ≈ 4Neu     F = 1/2N    E = u
(2) detrimental     s < 0    H < 4Neu     F < 1/2N    E < u
(3) advantageous    s > 0    H < 4Neu     F > 1/2N    E > u
Note that we are ignoring those subject to balancing selection, as they are rare.
The great majority of mutations are either neutral or detrimental, so on average,
F < 1/2N and E < u.
Emphasize: this is an average over a large number of sites.
This is actually what we observe:
For a large sample of genes in mammals:
Synonymous rate       3.51 × 10^-9 substitutions per site per year
Nonsynonymous rate    0.74 × 10^-9 substitutions per site per year
e.g. comparisons of globin genes in cow and goat
Region                            K (mean ± std. error)
-globin pseudogenes               9.1 ± 0.9
- and -globin exon synonymous     8.6 ± 2.5
- and -globin intron              8.1 ± 0.7
- and -globin 5'-flanking         5.3 ± 1.2
Note that this is evidence of natural selection which is predominantly
purifying, eliminating detrimental mutations.
MEASURING THE STRENGTH OF PURIFYING SELECTION
Calculate ratio nonsynonymous substitutions/synonymous substitutions =
Kn/Ks or Ka/Ks.
(0.74 × 10^-9) / (3.51 × 10^-9) = 0.21
Mutations fixed                               Kn/Ks
neutral mutations + detrimental mutations     Kn/Ks < 1
neutral mutations only                        Kn/Ks = 1
neutral mutations + advantageous mutations    Kn/Ks > 1
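The ratio and its interpretation can be sketched in Python as follows. The function names and the tolerance used to decide when a ratio counts as "equal to 1" are assumptions of this sketch; in practice that decision needs a statistical test, which is not shown here.

```python
def kn_ks_ratio(kn, ks):
    """Ratio of nonsynonymous to synonymous substitution rates (Kn/Ks)."""
    if ks <= 0:
        raise ValueError("Ks must be positive")
    return kn / ks


def interpret(ratio, tolerance=0.05):
    """Map Kn/Ks onto the three cases listed above.

    The tolerance band around 1 is an arbitrary illustrative choice.
    """
    if ratio < 1 - tolerance:
        return "neutral + detrimental mutations (purifying selection)"
    if ratio > 1 + tolerance:
        return "neutral + advantageous mutations (positive selection)"
    return "neutral mutations only"


# Mammalian rates quoted above, in substitutions per site per year.
ratio = kn_ks_ratio(0.74e-9, 3.51e-9)
print(round(ratio, 2), interpret(ratio))
```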
DETECTING POSITIVE SELECTION (FOR ADVANTAGEOUS MUTATIONS)
Selection for advantageous mutations: Kn/Ks > 1
We can use this to detect positive selection for advantageous mutations. Use
computer to isolate specific sites and calculate Kn/Ks for each site. If
some sites have Kn/Ks > 1, these probably had one or more advantageous
mutations fixed in fairly recent time.
Making Phylogenetic Trees
If we have DNA sequences from the genes of three or more species, we can use
them to recover their evolutionary history by making a phylogenetic tree.
Can do same using AA sequences of protein encoded by gene.
Saw at beginning of course in the tree of life.
Need one gene that is present in all organisms: usually use gene for small
subunit of ribosomal RNA.
Not good for some organisms, so usually use protein-coding genes for
eukaryotes.
Done by computer.
Aligns all sequences.
Calculates pairwise d.
Corrects for multiple hits to get pairwise K.
Uses pairwise K values to make trees.
Give e.g. of three species from our bdelloid rotifers plus an outgroup
(monogonont rotifer).
Makes tree, using any one of a number of different algorithms to make tree that
is simplest or most probable given the data.
Pairwise distances:
             Ari1/2    WPr1/1    FlT2/1    B.quadrid
Ari1/2       -         0.19394   0.17876   0.56295
WPr1/1                 -         0.11301   0.52794
FlT2/1                           -         0.53827
B.quadrid                                  -
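As an illustration of how pairwise values can be turned into a tree, here is a minimal Python sketch of UPGMA, one of the simplest distance-based clustering algorithms. The notes do not say which algorithm was actually used, so UPGMA is a stand-in chosen for brevity. The sketch also assumes that the six values above pair up as an upper-triangular matrix in the order Ari1/2, WPr1/1, FlT2/1, B.quadrid. Branch lengths are not computed.

```python
from itertools import combinations


def upgma(names, dist):
    """Build a tree topology by repeatedly joining the two closest clusters.

    names : list of tip labels
    dist  : dict mapping frozenset({a, b}) -> pairwise distance
    Returns the tree as nested tuples.
    """
    sizes = {name: 1 for name in names}  # cluster -> number of tips it contains
    d = dict(dist)                       # working copy of the distances
    while len(sizes) > 1:
        # Find the pair of current clusters with the smallest distance.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        merged = (a, b)
        # Distance from the new cluster to each remaining cluster is the
        # tip-weighted average of the two old distances.
        for c in sizes:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    return next(iter(sizes))


names = ["Ari1/2", "WPr1/1", "FlT2/1", "B.quadrid"]
dist = {
    frozenset(("Ari1/2", "WPr1/1")): 0.19394,
    frozenset(("Ari1/2", "FlT2/1")): 0.17876,
    frozenset(("Ari1/2", "B.quadrid")): 0.56295,
    frozenset(("WPr1/1", "FlT2/1")): 0.11301,
    frozenset(("WPr1/1", "B.quadrid")): 0.52794,
    frozenset(("FlT2/1", "B.quadrid")): 0.53827,
}
tree = upgma(names, dist)
print(tree)
```

Under these assumptions WPr1/1 and FlT2/1 are joined first, Ari1/2 joins them next, and B.quadrid (the monogonont outgroup) joins last. UPGMA assumes roughly equal rates in all lineages; other algorithms relax that assumption.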
Old way: use morphological differences/similarities.
Sequence data have some advantages:
• Use neutral or nearly neutral sites, avoids problems due to convergent or
parallel evolution.
• Hard to know what morphological traits are informative in many organisms.
• Can make phylogenetic trees of genes even without seeing the organisms!
Uses:
• Reconstruct evolutionary history
e.g. tree of life shown at beginning of course
Potential problems.
If I sequence a gene from a cow, a langur monkey, and a rhesus monkey, which
tree will I get?
[Figure: two alternative trees of cow, langur monkey, and rhesus monkey]
Tree of lysozyme protein sequences:
[Figure: two trees of cow, langur monkey, and rhesus monkey]
Tree is clearly wrong. Reason:
Cows and langur monkeys independently evolved foregut fermentation of plant
cellulose by bacteria. Lysozyme dissolves bacteria. Lysozyme evolved to work at
lower pH, changing amino acids at five sites so they are identical in these species.
Make tree with DNA sequences, get normal tree (right).
• Detecting selection for advantageous mutations: convergent evolution
e.g. convergent evolution of lysozyme in langur monkey rumen
• Study evolutionary mechanisms.
e.g. we are using tree with over 300 sequences to study the evolutionary
consequences of replacing sex with parthenogenesis
• Detect horizontal transfer: e.g. origin of chloroplasts from cyanobacterial
endosymbionts
GENOME EVOLUTION: THE ORIGIN OF NEW GENES WITH NEW
FUNCTIONS
Have focused on changes in bp sequence of genes. This can actually lead to
changes in gene function if the bp sequence changes are nonsynonymous and
result in changes in amino acids.
But in course of evolution, more complex organisms have arisen with increased
number of genes. Genes with old functions have been retained (e.g. genes coding
for ribosomal RNA), while additional genes with new functions have arisen.
Happens as result of gene duplication.
Several ways of making a copy of a gene and inserting it in a new location.
Fates of duplicate genes:
1. Nonfunctionalization → pseudogene → loss by deletion
2. Subfunctionalization: activity of both copies is reduced so still make same
total amount of product.
3. Neofunctionalization: one copy acquires mutations that give it a new function.
e.g. human globins: example of clustered multigene family.
Gene         Symbol    Role
myoglobin              muscle (stores oxygen)
alpha        α         major adult
beta         β         major adult
gamma        γ         fetal
delta        δ         infant, minor adult
epsilon      ε         embryonic
zeta         ζ         embryonic
psi          ψ         pseudogene
chromosome 16: α family [gene map figure]
chromosome 11: β family [gene map figure]
Entire gene family arose from a single ancestral gene by duplication.
Shows nonfunctionalization and neofunctionalization.
Phylogenetic tree:


SEXUAL REPRODUCTION
Sexual reproduction is widespread but most often alternates with asexual
reproduction.
• Mammals, birds: obligate sex
• Other eukaryotes include some obligately sexual; some obligately asexual;
most reproduce asexually most of the time with occasional sex
Asexual reproduction has advantages, e.g.:
• faster
• one individual can colonize
• in plants and animals with separate sexes, a parthenogenetic mutant has a 2-fold advantage because all its offspring are parthenogenetic mutants
What are advantages of sex?
Sex in diploids makes some mutations homozygous so recessive mutations can
be eliminated more easily.
Sex with outcrossing makes natural selection more effective:
2 loci, A and B advantageous alleles, a and b detrimental mutations.
[Diagram: genotypes AB, AB, AB, Ab, aB]
Asexual: selection for B tends to fix a; selection for A tends to fix b.
Sexual: Ab × aB → AB and ab; selection on ab eliminates a and b.
Result: compared to sexual population, asexual population
1. accumulates detrimental mutations (Muller's ratchet) → reduces fitness
of individuals and population → early extinction
2. has trouble fixing advantageous mutations → less able to adapt to
changing conditions → early extinction and less speciation
We are testing this in bdelloid rotifers, which have been parthenogenetic
(asexual) for ≥60 million years.
Ka/Ks ratio is same as for their close sexual relatives.
Why???
Join my lab next year and help us find out.