The distance between sequences


Molecular Phylogeny Analysis,
Part I.
Mehrshid Riahi, Ph.D.
Iranian Biological
Research Center (IBRC),
July 14-15, 2012
Topics
Introduction
 Four steps in Phylogenetic Inference
 Reading Phylogenetic Tree
 Tree interpretation in practice
 Reconstructing Evolutionary Trees

Introduction

Taxonomy = the naming & grouping of creatures by characteristics

Classification = the process of grouping things based on their similarities.
* Biologists use it to organize living things into groups for easier study.
Man’s Early Systems of
Classification:

* Aristotle (Greek, 4th century B.C.)
Three groups (Fly, Swim, Walk)

* Linnaeus (1750s) used a two-part naming system from
Latin (Dog = Canis familiaris).
- Binomial Nomenclature = a two-part name
1. Genus = first part of the name (Capitalized)
(Groups similar, related organisms)
2. Species = second part of the name. (Lowercase)
(“species identifier”)
(Groups similar organisms that can mate and produce
fertile offspring)

Seven (7) Levels of Classification:
** For plants the phylum level is called “Division”.
Kingdom –
Phylum –
Class –
Order –
Family –
Genus –
Species –
“Kingdom” is the biggest and broadest.
Each kingdom contains phyla; each phylum contains
classes, etc. The more levels that two organisms
share, the more characteristics they have in common.
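As a rough illustration of "the more levels that two organisms share", the sketch below counts shared ranks from Kingdom downward. The function name `shared_levels` and the three example classifications are supplied here for illustration and are not part of the original slides.

```python
# The seven ranks listed on the slide, from broadest to narrowest.
RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"]

# Example classifications (added for illustration).
dog = ["Animalia", "Chordata", "Mammalia", "Carnivora", "Canidae", "Canis", "familiaris"]
wolf = ["Animalia", "Chordata", "Mammalia", "Carnivora", "Canidae", "Canis", "lupus"]
cat = ["Animalia", "Chordata", "Mammalia", "Carnivora", "Felidae", "Felis", "catus"]

def shared_levels(a, b):
    """Count how many ranks, starting at Kingdom, two classifications share."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break          # stop at the first rank where they differ
        count += 1
    return count

dog_wolf = shared_levels(dog, wolf)   # share Kingdom through Genus
dog_cat = shared_levels(dog, cat)     # share Kingdom through Order
```

Under these example data the dog shares more levels with the wolf than with the cat, which matches the idea that deeper shared ranks mean more shared characteristics.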
Phylogeny and classification
Hierarchy
All taxonomic classifications are hierarchical – how does phylogeny differ?
[Figure: nested taxonomic hierarchy: a Class divided into Orders, each Order into Families, each Family into Genera, and each Genus into numbered Species]
How can you identify an
organism you find?

Field Guide = book with pictures and
descriptions of organisms and
characteristics

Taxonomic Key = series of paired
statements describing characteristics of
organisms
Modern Phylogenetic Taxonomy
Systematics is the modern method of organizing “creatures” in the
context of evolution.
* Phylogeny = the inferred evolutionary history of a group of organisms
Phylogenetics is the science of the pattern of evolution

* Evolutionary theory now dominates the classification system, and
assumes that similar organisms in a group evolved from a common
ancestor.

Phylogenetic trees are family trees showing the evolutionary
relationships thought to exist among groups of organisms;
a tree shows possible relationships.
Phylogenetic tree of Kingdom
Plantae
Linnaeus “Lawn”

His ‘lawn’ concept hypothesized that:





Each Genesis “kind” was unrelated to others
Each “kind” stayed the same without variation
Today’s species = the original “kinds”
He was right about unrelated “kinds”.
He was wrong about NO variation and species.
Evolutionary “Tree”
This evolutionary ‘tree’ claims that:
- all modern species are descended from a common ancestor
Phylogenetics
The central problem of phylogenetics:
how do we determine the relationships between taxa?
In phylogenetic studies, the most convenient way of presenting
evolutionary relationships among a group of organisms is the
phylogenetic tree.
Phylogenetic Taxonomy
Systematic taxonomists use several lines of evidence to
construct a phylogenetic tree.




Fossil Record – “Billions of dead things buried in rock
layers, laid down by water, all over the earth” (Ken
Ham)
Morphology – similar shape or form (homologous
features) among different animals.
Embryological Patterns of Development – (Ontogeny)
Chromosomes and Macromolecules – Genetic
similarities
What is molecular phylogeny?
phylon = Greek for “tribe” or “race” (often glossed as “stem”)
genesis = Greek for “origin”
molecular phylogeny = studying relationships
among organisms using molecular markers (e.g.
DNA or protein sequences)
dissimilarities among sequences = genetic
divergence caused by mutations during the
course of time
Four steps in Phylogenetic
Inference
1. Character (data) selection (not too fast, not too slow)
2. Alignment of data (hypotheses of primary homology)
3. Analysis selection (choose the best model / method(s))
4. Conduct analysis
Workflow
Aim: group of organisms or gene family
1. Choice of molecular marker(s) and taxon sampling
2. Extraction / amplification / sequencing
3. Alignment
4. Choice of evolutionary model
5. Phylogenetic analyses (with user-defined trees and topology testing)
6. Tree(s)
7. Results
(The flowchart also carries an “Improvement of” feedback label.)
Types of Markers
A marker is a piece of DNA that is associated with
a certain trait of an organism
Morphological: Characters are selected based on appearance
Disadvantage: lack of polymorphism
 Biochemical: Characters are selected based on biochemical
properties
Disadvantage: Age dependent, Influenced by environment
It covers less than 10% of genome
 Chromosomal: Characters are selected based on Structural and
Numerical Variations (Structural: deletions, insertions, etc.;
Numerical: trisomy, monosomy, nullisomy)
Disadvantage: low polymorphism
 Genetic:

Molecular Marker
Revealing variation at a DNA level
Characteristics:
 Co-dominant expression
 Nondestructive assay
 Early onset of phenotypic expression
 High polymorphism
 Random distribution throughout the genome
 Assay can be automated
Methodological Advantages
I. DNA can be isolated from any tissue, e.g. blood, hair, etc.
II. DNA can be isolated at any stage, even during foetal life
III. DNA has a longer shelf-life and is readily exchangeable between labs
IV. Analysis of DNA can be carried out at an early age, even at the embryonic stage
V. Analysis is irrespective of sex
Molecular Markers
Single-locus markers: Microsatellite, RFLP, STS
Multi-locus markers: DNA Fingerprinting, RAPD, AFLP
Selection of characters
Morphologists typically choose:
1. Characters that are not constant
2. Characters that are not too variable
Molecular systematists use the same criteria to select which
gene(s) to sequence
Genes that are virtually constant don’t have enough information
Genes that are hypervariable have too much misinformation
Selection of Molecular characters
Character / discrete data: nucleotide or amino acid sequences
(can be converted to distances)
“fast & slow” genes:
there is variation in the rate of change among regions of the
genome
e.g. rRNA (e.g. 18S) evolves slowly enough to hold information
that is over 250 million years old
- whereas mtDNA (e.g. COII) evolves much faster, and most
information over 30-50 million years of age is probably gone
(it starts to go at 15-20 million years)
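The remark that sequence data "can be converted to distances" can be sketched as follows. This is a minimal illustration, not the method used in the course: it computes the proportion of differing sites (p-distance) and then applies the Jukes-Cantor (JC69) correction, a standard model chosen here as an assumption. The function names and the two short sequences are hypothetical.

```python
import math

def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only columns where neither sequence has a gap.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    diffs = sum(1 for a, b in pairs if a != b)
    return diffs / len(pairs)

def jc69_distance(p):
    """Jukes-Cantor correction: d = -(3/4) * ln(1 - (4/3) * p)."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the JC69 correction")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)

# Hypothetical aligned sequences, invented for illustration.
seq1 = "ACGTACGTAC"
seq2 = "ACGTACGTTC"
p = p_distance(seq1, seq2)   # one difference in ten sites
d = jc69_distance(p)         # corrected distance, slightly larger than p
```

The corrected distance exceeds the raw proportion because the correction allows for multiple substitutions at the same site, which is the "overwritten information" problem that grows with divergence time.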
Selection of Molecular characters
Higher-level phylogenetics: (families & above) use slower,
conserved genes, nuclear genes
- evolve slowly due to functional constraints:
e.g. some proteins “still work” with many
potential amino acids
others won’t, e.g. histones are strongly
conserved
- faster evolving regions, e.g. mtDNA,
-information is overwritten
- back mutations
- yield nonsense phylogenies for deep splits
Selection of Molecular characters
Lower-level phylogenetics: (subfamilies &
below) use faster, less-conserved genes,
mtDNA
- because slower genes would be identical
across your species
- must select genes most appropriate for your
study taxa
Typical structure of a eukaryotic
gene
[Figure: typical eukaryotic gene, 5' to 3': flanking region, TATA box, transcription initiation site, initiation codon, Exon 1, Intron I, Exon 2, Intron II, Exon 3, stop codon, poly(A) addition site (AATAAA signal), flanking region]
Selection of Molecular characters
Three types of genes
tRNA - transfer RNA (short)
rRNA - ribosomal RNA (long, conserved)
mRNA - messenger RNA - protein coding (exon)
Also
introns - non coding sequence sometimes inside a
protein coding gene
Can be nuclear:
typically slower evolving than mitochondrial; better for
deeper (older) divergences
Can be mitochondrial:
better for shallow (recent) divergences
DNA Amplification
PHYLOGENETIC DATA
ANALYSIS: THE FOUR STEPS
A straightforward phylogenetic analysis
consists of four steps:
1. Alignment (both building the data model and extracting a phylogenetic dataset)
2. Determining the substitution model
3. Tree building
4. Tree evaluation
READING PHYLOGENETIC
TREE
Assumptions

Evolution produces dichotomous branching

Evolution is simple – the best explanation assumes the fewest
mutations
If we assume:
1. Our characters are independent
2. Our character states are homologous (& genes orthologous)
3. Evolution has happened
We can infer the evolutionary relationships among organisms

Reading phylogenetic trees: A quick review
(Adapted from evolution.berkeley.edu)

A phylogeny, or evolutionary tree, represents the
evolutionary relationships among a set of organisms or
groups of organisms, called taxa (singular: taxon) that are
believed to have a common ancestor.
Tips, Internal Nodes, Edges




The tips of the phylogenetic tree represent groups of
descendant taxa (often species)
The internal nodes of the tree represent the common
ancestors of those descendants.
The tips are the present and the internal nodes are the
past.
The edge lengths in some trees correspond to time
estimates – evolutionary time.
Parts of a phylogenetic tree






Internal Nodes or Divergence Points (represent hypothetical ancestors of the taxa)
Branch , Lineages : defines the relationship between the taxa in terms of descent and
ancestry
Topology: the branching patterns of the tree
Branch length (scaled trees only): represents the number of changes that have occurred in the
branch
Root: the common ancestor of all taxa
Operational Taxonomic Unit (OTU): taxonomic level of sampling selected by the user to be
used in a study, such as individuals, populations, species, genera, or bacterial strains
[Figure: rooted tree of Species A to E with a branch, a node, the root, and a clade labeled]
Sister Groups and a common ancestor


Two descendants that split from the same node are called
sister groups.
In the trees above, species A & B are sister groups —
they are each other's closest relatives; which means that:


i) they have a lot of evolutionary history in common and very little
evolutionary history that is unique to either one of the two sister
species and
ii) that they have a common ancestor that is unique to them.
Equivalent trees


For any speciation event on a phylogeny, the choice of
which lineage goes to the right and which one goes to the
left is arbitrary.
These three phylogenies are therefore equivalent.
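One way to check that two drawings are equivalent is to compare the sets of clades they contain, since rotating branches around a node does not change which tips group together. The sketch below assumes rooted trees written as nested tuples; the helper names and the three example trees are hypothetical.

```python
def clades(tree):
    """Return the set of clades (frozensets of tip names) in a nested-tuple tree."""
    found = set()

    def tips(node):
        if isinstance(node, str):          # a tip
            return frozenset([node])
        members = frozenset().union(*(tips(child) for child in node))
        found.add(members)                 # every internal node defines a clade
        return members

    tips(tree)
    return found

def same_topology(tree_a, tree_b):
    """Two rooted trees are equivalent if they contain the same clades."""
    return clades(tree_a) == clades(tree_b)

t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = (("E", "D"), ("C", ("B", "A")))   # same tree, branches rotated
t3 = ((("A", "C"), "B"), ("D", "E"))   # different grouping: A with C

equivalent = same_topology(t1, t2)
different = not same_topology(t1, t3)
```

Here `t1` and `t2` differ only in left-right order, so they compare as equal, while `t3` groups A with C and is a genuinely different hypothesis.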
Phylogenetic trees

There are many ways of drawing a tree
[Figure: the same tree of taxa A to E drawn in several equivalent ways (branches rotated around nodes, different layouts); one panel carries the annotation “no meaning”]
Outgroup



Many phylogenies also include an outgroup — a taxon
outside the group of interest.
All the members of the group of interest are more closely
related to each other than they are to the outgroup.
Hence, the outgroup stems from the base of the tree.
An outgroup can give you a sense of where on the bigger
tree of life the main group of organisms falls. It is also
useful when constructing evolutionary trees.
Branches and clades




Evolutionary trees depict clades.
A clade is a group of organisms that are all descended from a common ancestor; thus a clade includes an ancestor and all descendants of that ancestor.
You can think of a clade as a branch on the tree of life.
Some examples of clades and non-clades in a phylogenetic tree are shown here
More on clades. Nested clades




Clades are nested within one another — they form a
nested hierarchy.
A clade may include many thousands of species or just a
few.
Some examples of clades at different levels are marked
on the phylogenies above.
Notice how clades can be nested within larger clades.
Types of trees: unrooted vs rooted


A rooted phylogenetic tree is a tree with a unique root
node corresponding to the (usually imputed) most recent
common ancestor of all the entities at the leaves (aka
tips) of the tree. A fully resolved rooted tree is a binary tree.
Unrooted trees illustrate the relatedness of the leaf
nodes without specifying the position of the common
ancestor. In a fully resolved unrooted tree, every internal
node has three edges and every leaf has one.
Rooting the Tree




In an unrooted tree the direction of evolution
is unknown
The root is the hypothesized ancestor of the
sequences in the tree
The root can either be placed on a branch or
at a node
You should start by viewing an unrooted tree
Positioning Roots in Unrooted
Trees

We can estimate the position of the root
by introducing an outgroup:
[Figure: unrooted tree of Aardvark, Bison, Chimp, Dog, and Elephant, with Falcon added as the outgroup to mark the proposed root]
Rooting Using an Outgroup
1. The outgroup should be a sequence (or set of
sequences) known to be less closely related to the rest
of the sequences than they are to each other
2. It should ideally be as closely related as possible to the
rest of the sequences while still satisfying condition 1
The root must be somewhere between the outgroup and
the rest (either on the node or in a branch)
Dendrogram, cladogram, phylogram



Dendrogram is the ‘generic’ term applied to any type of diagrammatic representation of
phylogenetic trees. All four trees depicted here are dendrograms.
Cladogram (to some biologists) is a tree in which branch lengths DO NOT represent evolutionary
time; clades just represent a hypothesis about actual evolutionary history
TREE1 and TREE2 are cladograms and TREE1 = TREE2
Phylogram (to some biologists) is a tree in which branch lengths DO represent evolutionary time;
clades represent true evolutionary history (amount of character change) TREE3 and TREE4 are
phylograms and TREE3 ≠ TREE4
Phylogenetic trees
Phylogenetic Trees and classification

Phylogenetic trees classify organisms into clades. By contrast, the Linnaean
system of classification assigns every organism a kingdom, phylum, class,
order, family, genus, and species. The phylogenetic tree depicted here
identifies four clades.
To build a phylogenetic tree biologists collect data about the characters of
each organism they are interested in. Characters are heritable traits that can
be compared across organisms, such as physical characteristics
(morphology), genetic sequences, and behavioral traits.
Some molecular biologists (like C. Woese) build phylogenetic trees from
genetic sequences alone.
Phylogenetic trees
[Figure: two trees of taxa A to E, one with a bifurcation and one with a trifurcation; the two are not equivalent]
Bifurcation versus Multifurcation (e.g. Trifurcation)
Multifurcation (also called polytomy): a node in a tree that connects more than
three branches. A multifurcation may represent a lack of resolution because of too
few data available for inferring the phylogeny (in which case it is said to be a soft
multifurcation) or it may represent the hypothesized simultaneous splitting of
several lineages (in which case it is said to be a hard multifurcation)
The goal of phylogeny inference is to resolve the
branching orders of lineages in evolutionary trees:
[Figure: three trees of taxa A to E: a completely unresolved or "star" phylogeny (a polytomy or multifurcation); a partially resolved phylogeny; and a fully resolved, bifurcating phylogeny (each node is a bifurcation)]
Phylogenetic trees

Trees can be scaled or unscaled (with or without branch lengths)
[Figure: the same tree of taxa A to E drawn unscaled and scaled; the scaled drawings include a “unit” scale bar for branch length]
TREE INTERPRETATION IN
PRACTICE
1) By reference to the tree above, which of the following is an accurate statement of
relationships?
a) A green alga is more closely related to a red alga than to a moss
b) A green alga is more closely related to a moss than to a red alga
c) A green alga is equally related to a red alga and a moss
d) A green alga is related to a red alga, but is not related to a moss
2) By reference to the tree above, which of the following is an accurate statement of
relationships?
a) A crocodile is more closely related to a lizard than to a bird
b) A crocodile is more closely related to a bird than to a lizard
c) A crocodile is equally related to a lizard and a bird
d) A crocodile is related to a lizard, but is not related to a bird
3) By reference to the tree above, which of the following is an accurate statement of
relationships?
a) A seal is more closely related to a horse than to a whale
b) A seal is more closely related to a whale than to a horse
c) A seal is equally related to a horse and a whale
d) A seal is related to a whale, but is not related to a horse
4) Which of the five marks in the tree above corresponds to the most recent
common ancestor of a mushroom and a sponge?
5) If you were to add a trout to the phylogeny shown above, where would its
lineage attach to the rest of the tree?
6) Which of the trees below is false given the larger phylogeny above?
7) Which of the four trees above depicts a different pattern of relationships than the
others?
8) Which of the four trees above depicts
a different pattern of relationships than the others?
9) In the above tree, assume that the ancestor had a long tail, ear flaps, external
testes, and fixed claws. Based on the tree and assuming that all evolutionary
changes in these traits are shown, what traits does a sea lion have?
a) long tail, ear flaps, external testes, and fixed claws
b) short tail, no ear flaps, external testes, and fixed claws
c) short tail, no ear flaps, abdominal testes, and fixed claws
d) short tail, ear flaps, abdominal testes, and fixed claws
e) long tail, ear flaps, abdominal testes, and retractable claws
10) In the above tree, assume that the ancestor was a herb (not a tree) without
leaves or seeds.
Based on the tree and assuming that all evolutionary changes in these traits are
shown, which of the tips has a tree habit and lacks true leaves?
a) Lepidodendron
b) Clubmoss
c) Oak
d) Psilotum
e) Fern
Tree structure
I. A tree can also be presented in a text format: (A,(B,(C,D)));
II. The graphic structure can be difficult to interpret (2-dimensional)
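The text format referred to here is commonly called Newick notation, in which sibling branches are separated by commas and the tree ends with a semicolon. The following is a minimal sketch of a reader for the simplest case (tip names only, no branch lengths or internal labels); `parse_newick` is a hypothetical helper, not part of any particular package.

```python
def parse_newick(text):
    """Parse a simple Newick string (tip names only, no branch lengths)
    into nested tuples, e.g. "(A,(B,(C,D)));" -> ("A", ("B", ("C", "D")))."""
    text = text.strip().rstrip(";")
    pos = 0

    def peek():
        # Current character, or "" once the end of the string is reached.
        return text[pos] if pos < len(text) else ""

    def parse_node():
        nonlocal pos
        if peek() == "(":
            pos += 1                      # skip "("
            children = [parse_node()]
            while peek() == ",":
                pos += 1                  # skip ","
                children.append(parse_node())
            if peek() != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1                      # skip ")"
            return tuple(children)
        start = pos
        while peek() not in ("", "(", ")", ","):
            pos += 1                      # consume a tip name
        return text[start:pos]

    tree = parse_node()
    if pos != len(text):
        raise ValueError("unexpected trailing characters")
    return tree

tree = parse_newick("(A,(B,(C,D)));")
```

The nested tuples mirror the nested parentheses, so the structure of the tree is unchanged however it is later drawn.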
Visualising trees
I. Treeview
II. You can change the graphic presentation of a tree (cladogram, rectangular cladogram, radial tree, phylogram), but not change the structure of a tree
RECONSTRUCTING
EVOLUTIONARY TREES
Recall




The phylogeny of a group of taxa (species, etc.) is its
evolutionary history
A phylogenetic tree is a graphical summary of this
history — indicating the sequence in which lineages
appeared and how the lineages are related to one
another
Because we do not have direct knowledge of
evolutionary history, every phylogenetic tree is a
hypothesis about relationships
Of course, some hypotheses are well supported by
data, others are not
Questions

How do we make phylogenetic trees?
- Cladistic methodology
- Similarity (phenetics)
What kinds of data do we use?
- Morphology
- Physiology
- Behavior
- Molecules
How do we decide among competing
alternative trees?
Similarity

The basic idea of phylogenetic reconstruction is simple:


Taxa that are closely related (descended from a relatively recent
common ancestor) should be more similar to each other than taxa
that are more distantly related — so, all we need to do is build trees
that put similar taxa on nearby branches — this is the phenetic
approach to tree building
Consider, as a trivial example, leopards, lions, wolves and coyotes:
all are mammals, all are carnivores, but no one would have any
difficulty recognizing the basic similarity between leopards and
lions, on the one hand, and between wolves and coyotes, on the
other, and producing this tree; which, it would probably be
universally agreed, reflects the true relationships of these 4 taxa
[Figure: tree grouping leopard with lion, and wolf with coyote]
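The phenetic idea of placing similar taxa on nearby branches can be sketched with average-linkage clustering (UPGMA), a standard distance method chosen here as an assumption rather than taken from the slides. The distance values below are invented purely for illustration, and the function only returns the grouping, not branch lengths.

```python
from itertools import combinations

def upgma(names, dist):
    """Cluster taxa by average pairwise distance (UPGMA).
    `dist` maps frozenset({a, b}) -> distance. Returns a nested-tuple tree."""
    # Each cluster is stored as (subtree, list_of_member_tips).
    clusters = [(name, [name]) for name in names]

    def avg(c1, c2):
        # Average distance over all tip pairs between two clusters.
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two clusters that are, on average, most similar.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: avg(*pair))
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(((c1[0], c2[0]), c1[1] + c2[1]))
    return clusters[0][0]

# Hypothetical distances: small within cats and within dogs, large between.
dist = {
    frozenset({"leopard", "lion"}): 0.10,
    frozenset({"wolf", "coyote"}): 0.05,
}
for cat in ("leopard", "lion"):
    for dog in ("wolf", "coyote"):
        dist[frozenset({cat, dog})] = 0.60

tree = upgma(["leopard", "lion", "wolf", "coyote"], dist)
```

With these made-up numbers the result pairs leopard with lion and wolf with coyote. As the following slides point out, similarity alone can mislead when it arises from homoplasy rather than common ancestry.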
Causes of similarity
Things are seldom as simple as in the
preceding example
 We need to consider the concept of
biological similarity, and the way in which
similarity conveys phylogenetic
information, in greater depth:

Homology
 Homoplasy

Homology

A character is similar (or present) in two taxa because their
common ancestor had that character:
[Figure: tree of cat, hawk, and dove; “wings” marked on the lineage leading to hawk and dove]

In this diagram, wings are homologous characters in hawks and
doves because both inherited wings from their common winged
ancestor
Homoplasy

A character is similar (or present) in two taxa because of
independent evolutionary origin (i.e., the similarity does not
derive from common ancestry):
[Figure: tree of hawk, bat, and cat; “wings” marked separately on the hawk and bat lineages]

In this diagram, wings are a homoplasy in hawks and bats
because their common ancestor was an un-winged tetrapod
reptile. Bird wings and bat wings evolved independently.
Types of homoplasy

Convergence


Independent evolution of similar traits in distantly related
taxa — streamlined shape, dorsal fins, etc. in sharks and
dolphins
Parallelism
Independent evolution of similar traits in closely related taxa
— evolution of blindness in different cave populations of the
same fish species
Reversal
 A character in one taxon reverts to an earlier state (not
present in its immediate ancestor)


Reversal

A character is similar (or present) in two taxa because a reversal
to an earlier state occurred in the lineage leading to one of the
taxa:
[Figure: tree of hawk, bat, and cat annotated with the sequences ACCT, ACTT, and ACCT]
In this diagram, hawks and cats share the ancestral nucleotide
sequence ACCT, but this is due to a reversal on the lineage
leading to cats
Cladograms

Within a tree a clade is defined as a
group that includes an ancestral species
and all of its descendants.

Cladistics is the science of how species
may be grouped into clades.
Cladistics



By definition, homology indicates evolutionary
relationship — when we see a shared homologous
character in two species, we know that they share a
common ancestor
Build phylogenetic trees by analyzing shared
homologous characters
Of course, we still have the problem of deciding
which shared similarities are homologies and which
are homoplasies (to which we shall return)
Two kinds of homology – 1

Shared ancestral homology — a trait found in
all members of a group for which we are
making a phylogenetic tree (and which was
present in their common ancestor) —
symplesiomorphy


For example: a backbone is a shared ancestral
homology for dogs, humans, and lizards
Symplesiomorphies DO NOT provide phylogenetic
information about relationships within the group
being studied
Two kinds of homology – 2

Shared derived homology — a trait found in some
members of a group for which we are making a
phylogenetic tree (and which was NOT present in the
common ancestor of the entire group) —
synapomorphy



For example: hair is (potentially) a shared derived homology
in the group [dogs, humans, lizards]
Synapomorphies DO provide phylogenetic information
about relationships within the group being studied
In this particular case, if hair is a synapomorphy in dogs and
humans, then dogs and humans share a common ancestor
that is not shared with lizards, and the common dog-human
ancestor must have lived more recently than the common
ancestor of all three taxa
A tree for [dogs, humans, lizards] – 1
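[Figure: tree of lizard, human, and dog; “hair” marked on the lineage leading to human and dog, “backbone” at the base of the tree]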
• The TWO major assumptions that we are making
when we build this tree are:
1) hair is homologous in humans and dogs
2) hair is a derived trait within tetrapods
A tree for [dogs, humans, lizards] – 2
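[Figure: tree of lizard, human, and dog; “hair” marked on the lineage leading to human and dog, “backbone” at the base of the tree]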
• In the absence of other information, the assumption of homology
of hair in humans and dogs is justified by parsimony (fewest
number of evolutionary steps is most likely = simplest
explanation)
• Also we can check to see that hair is formed in the same way by
the same kinds of cells, etc.
A tree for [dogs, humans, lizards] – 3
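[Figure: two alternative trees, one grouping human with lizard and one grouping dog with lizard; in each, “hair” is marked separately on the dog and human lineages and “backbone” at the base]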
• These trees (in which hair is considered a homoplasy
in dogs and humans) are less parsimonious than the
one on the previous slide, because they require two
independent evolutionary origins of hair
Character Polarity

What’s the basis for our second major
assumption – that hair is a derived trait
within this group (and that absence of
hair is primitive)?
Fossil record
 Outgroup analysis

Outgroups – 1



An outgroup is a taxon that is related to, but not part
of, the set of taxa for which we are constructing the
tree (the “ingroup”)
Selection of an outgroup requires that we already
have a phylogenetic hypothesis
A character state that is present in both the outgroup
and the ingroup is taken to be primitive by the
principle of parsimony (present in the common
ancestor of both the outgroup and the ingroup and,
therefore, homologous)
Outgroups – 2



In the present example, [dog, human, lizard] are all
amniote tetrapods. The anamniote tetrapods
(amphibia) make a reasonable outgroup for this
problem
No amphibia have hair, therefore absence of hair
[amphibia, lizards] is primitive (plesiomorphic) and
presence of hair [dogs, humans] is derived
(apomorphic)
So, presence of hair is a shared derived character
(synapomorphy), and dogs and humans are more
closely related to each other than either is to lizards
A tree for [dogs, humans, lizards] – 4
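[Figure: tree of Amphibia, lizard, human, and dog; “backbone” marked at the base, “amniotic egg” on the lineage leading to lizard, human, and dog, and “hair” on the lineage leading to human and dog]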
• The presence of hair is apomorphic (derived) because
no amphibians have hair
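The parsimony comparison between the alternative trees can be checked by counting the minimum number of changes each tree requires for the hair character once the outgroup is included. The sketch below uses Fitch's algorithm, a standard counting method that the slides do not name; the tree encodings and the 0/1 coding of hair (0 = absent, 1 = present) are assumptions made for illustration.

```python
def fitch_changes(tree, states):
    """Minimum number of state changes for one character on a rooted,
    bifurcating nested-tuple tree (Fitch's algorithm)."""
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):                 # tip: observed state
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:
            return common                         # children agree: no change
        changes += 1                              # disagreement costs one change
        return left | right

    visit(tree)
    return changes

# Hair coded as 0 = absent, 1 = present (coding assumed for illustration).
hair = {"amphibia": 0, "lizard": 0, "human": 1, "dog": 1}

tree_a = ("amphibia", ("lizard", ("human", "dog")))   # human + dog together
tree_b = ("amphibia", ("human", ("lizard", "dog")))   # dog + lizard together
tree_c = ("amphibia", ("dog", ("lizard", "human")))   # human + lizard together

steps = {name: fitch_changes(t, hair)
         for name, t in [("a", tree_a), ("b", tree_b), ("c", tree_c)]}
```

Under these assumptions the tree grouping human with dog needs one change, while each alternative needs two, which is the sense in which the first tree is the most parsimonious.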
Theories of taxonomy
There are two current major theories of
taxonomy:
Traditional Evolutionary Taxonomy
 Phylogenetic Systematics (Cladistics)


Both based on evolutionary principles, but
differ in the application of those principles
to formulate taxonomic groups.
Theories of taxonomy
There are three different ways a taxon
may be related to a phylogenetic tree.
The taxon may be a monophyletic,
paraphyletic or polyphyletic grouping

Monophyletic Group

A monophyletic taxon includes the most
recent common ancestor of a group and
all of its descendants.
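In terms of the tips of a tree, this definition means that a monophyletic group is exactly the set of tips below some node. The sketch below tests that condition on a nested-tuple tree; the function names are hypothetical, and the example reuses the amphibia, lizard, human, and dog tree from the earlier slides.

```python
def tips(node):
    """All tip names below a node of a nested-tuple tree."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(tips(child) for child in node))

def is_monophyletic(tree, group):
    """True if `group` is exactly the set of tips under some node of `tree`."""
    group = frozenset(group)
    if tips(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)

tree = ("amphibia", ("lizard", ("human", "dog")))

mammals_ok = is_monophyletic(tree, {"human", "dog"})             # a clade
amniotes_ok = is_monophyletic(tree, {"lizard", "human", "dog"})  # a clade
odd_group = is_monophyletic(tree, {"lizard", "human"})           # leaves out dog
```

A group that fails this test is either paraphyletic or polyphyletic; telling those two apart needs information about the ancestor and the diagnostic character, which this sketch does not attempt.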
Paraphyletic group

A taxon is paraphyletic if it includes the
most recent common ancestor of a group
and some but not all of its descendants.
Polyphyletic grouping

A taxon is polyphyletic if it does not contain the
most recent common ancestor of all members
of the group.

This situation requires the group to have had
independent evolutionary origin of some
diagnostic feature. E.g. If you grouped birds
and bats into a group you called
“WingedThings” it would be a polyphyletic
group because birds and bats evolved wings
separately.
Theories of taxonomy

Both traditional evolutionary taxonomy
and cladistics reject polyphyletic groups.

They both accept monophyletic groups,
but differ in their treatment of paraphyletic
groupings.
Phylogeny and classification
Monophyly
Each of the colored lineages
in this echinoderm phylogeny
is a good monophyletic group
Asteroidea
Ophiuroidea
Echinoidea
Holothuroidea
Crinoidea
Each group shares a common
ancestor that is not shared by any
members of another group
Paraphyletic groups
Foxes
Paraphyly
“Foxes” are paraphyletic with respect
to dogs, wolves, jackals, coyotes, etc.
This is a trivial example because
“fox” and “dog” are not formal
taxonomic units, but it does show
that a dog or a wolf is just a derived
fox in the phylogenetic sense
Lindblad-Toh et al. (2005) Nature 438: 803-819
Paraphyletic groups
Lizards
Paraphyly
“Lizards” (Sauria) are
paraphyletic with respect
to snakes (Serpentes)
Serpentes is a monophyletic
clade within lizards
Squamata (lizards + snakes)
is a monophyletic clade
sister to Sphenodontida
Snakes are just derived,
limbless lizards
Fry et al. (2006) Nature 439: 584-588
Traditional Evolutionary Taxonomy

TET uses two principles for designating taxa.



Common descent
Amount of adaptive evolutionary change
The second criterion leads to the idea that
groups may be designated as higher level taxa
because they represent a distinct “adaptive
zone” (Simpson) because they have undergone
adaptive change that fits them to a unique role
(e.g. penguins, humans).