Section 7: Population Fragmentation


[Figure: historic distribution vs. current distribution]
When a population is fragmented, different
fragments will have different initial allele
frequencies by chance.
The extent of diversification in allele frequencies
can be measured as the variance in allele frequency:
$\sigma_p^2 = pq / (2N_e)$
Thus, there will be greater variance & larger
differentiation of allele frequencies among small
fragments.
Additionally, on average, fragmented populations
have reduced heterozygosity and increased
variance in heterozygosity across loci within
populations.
The average reduction in heterozygosity due to
sampling from the base population is $1/(2N_e)$.
The initial reduction is minor unless the population
fragments are very small ($N_e < 10$).
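As a rough numerical illustration of the two sampling results above, the following Python sketch evaluates the variance pq/(2Ne) and the expected proportional loss of heterozygosity 1/(2Ne) for a few fragment sizes. The function names, the fragment sizes, and the base frequency p = 0.5 are assumptions chosen for illustration; they are not part of the original slides.

```python
def allele_freq_variance(p, ne):
    """Variance in allele frequency among fragments: pq / (2 * Ne)."""
    return p * (1 - p) / (2 * ne)


def heterozygosity_loss(ne):
    """Expected proportional reduction in heterozygosity from sampling: 1 / (2 * Ne)."""
    return 1 / (2 * ne)


if __name__ == "__main__":
    p = 0.5  # assumed allele frequency in the base population
    for ne in (5, 10, 50, 500):
        print(ne, allele_freq_variance(p, ne), heterozygosity_loss(ne))
```

With Ne = 10 the expected loss is 5%, which is why the initial reduction only matters for very small fragments.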
As heterozygosity is lost within fragments &
allele frequencies drift among fragments, there
is a deficiency of heterozygotes when compared
to HWE for the entire population and this is
known as the “Wahlund Effect”.
Sometimes population substructuring is not obvious, and as
a result, a sample may consist of a group of heterogeneous
subsamples from a population.
For example, subpopulations may be separated by subtle
physical or ecological barriers that limit movement
between groups.
When these subpopulations are lumped together and if
there are differences in allele frequencies among these
subpopulations, there will be a deficiency of heterozygotes
and an excess of homozygotes which is a Wahlund Effect.
[Figure: Wahlund Effect; one subpopulation is all AA and the other is all aa]
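The heterozygote deficiency produced by lumping can be checked with a small calculation. The sketch below assumes equally sized subpopulations that are each in HWE internally; the function name `wahlund` is hypothetical. It compares the mean heterozygosity actually present within subpopulations with the heterozygosity HWE would predict from the pooled allele frequency.

```python
def wahlund(subpop_freqs):
    """Return (mean within-subpopulation heterozygosity, pooled HWE expectation).

    subpop_freqs: frequency of allele A in each equally sized subpopulation.
    """
    k = len(subpop_freqs)
    h_within = sum(2 * p * (1 - p) for p in subpop_freqs) / k
    p_bar = sum(subpop_freqs) / k
    h_pooled = 2 * p_bar * (1 - p_bar)
    return h_within, h_pooled


# Extreme case: one subpopulation fixed for A, the other fixed for a.
observed, expected = wahlund([1.0, 0.0])
```

In this extreme case the pooled sample contains no heterozygotes even though HWE on the pooled frequency (0.5) predicts 50%.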
With increasing fragmentation, population size
within each fragment becomes smaller and
differentiation among fragments increases.
Inbreeding and inbreeding depression will be more
rapid in smaller than larger fragments as will
genetic drift and loss of genetic diversity.
Measuring Population Fragmentation: F-statistics
Differentiation among fragments or sub-populations
is directly related to the inbreeding coefficients
within and among sub-populations or fragments.
Using inbreeding coefficients, Sewall Wright
described the distribution of genetic diversity
within and among population fragments.
He partitioned inbreeding of individuals (I) in the
total (T) population (FIT) into that due to:
Inbreeding of individuals relative to their
sub-population (FIS)
Inbreeding due to differentiation among
sub-populations, relative to the total population
(FST).
More specifically, FIS is the probability that two
alleles in an individual are identical by descent.
This is the inbreeding coefficient (F) averaged
across all individuals from all population fragments.
FST, the fixation index, is the effect of the
population sub-division on inbreeding.
FST = probability that two alleles drawn at
random from a single population fragment (either
from different individuals or the same individual)
are identical by descent.
With high rates of gene flow among fragments,
this probability is low.
With low rates of gene flow among fragments,
populations diverge and become inbred and
FST increases.
F-statistics are used to understand factors
involved in causing a population to deviate from
Hardy-Weinberg expectations.
Deviations from HWE have 2 components:
Deviations due to factors within subpopulations
Deviations due to factors among subpopulations
If the deviations result from excess of
HOMOZYGOTES, there may be selection, inbreeding,
or further population subdivision.
If deviations result from excess of
HETEROZYGOTES, there may be overdominant
selection or outbreeding.
Terms & Equations for F-statistics
F = fixation index and is a measure of how much the
observed heterozygosity deviates from HWE
F = (He - Ho)/He
HI = observed heterozygosity averaged over ALL
subpopulations.
$H_I = \left(\sum H_i\right)/k$, where $H_i$ is the observed H of
the ith subpopulation and k = number of
subpopulations sampled.
HS = Average expected heterozygosity within
each subpopulation.
$H_S = \left(\sum H_{Is}\right)/k$
Where $H_{Is}$ is the expected H within a given
subpopulation and is equal to $1 - \sum x_i^2$, where
$x_i$ is the frequency of each allele in that subpopulation.
HT = Expected heterozygosity within the total
population.
$H_T = 1 - \sum \bar{x}_i^2$
where $\bar{x}_i$ is the frequency of each allele averaged
over ALL subpopulations.
FIT measures the overall deviations from HWE
taking into account factors acting within
subpopulations and population subdivision.
FIT = (HT - HI)/HT and ranges from - 1 to +1
because factors acting within subpopulations
can either increase or decrease Ho relative
to HWE.
Large negative values indicate overdominant
selection or outbreeding (Ho > He).
Large positive values indicate inbreeding or
genetic differentiation among subpopulations
(Ho < He).
FIS measures deviations from HWE within
subpopulations taking into account only those
factors acting within subpopulations
FIS = (HS - HI)/HS and ranges from -1 to +1
Positive FIS values indicate inbreeding or
mating occurring among closely related
individuals more often than expected under
random mating.
Individuals will possess a large proportion of
the same alleles due to common ancestry.
This leads to a reduction in Ho relative to HWE
(Ho < He).
Negative FIS suggests outbreeding, or
mating occurring among individuals having different
genotypes more often than expected under
random mating (Ho > He).
However, while large negative and positive values
of FIS indicate non-random mating, the
interpretation of FIS is the same as FIT.
Thus, there is the possibility of overdominant
selection or further subdivision (Wahlund Effect).
If you knowingly lump different subpopulations
into one, FIS will be positive not because of
inbreeding but because of population
differentiation.
An understanding of the population's natural
history can address the likelihood of this
possibility.
FST measures the degree of differentiation
among subpopulations -- possibly due to population
subdivision.
FST = (HT - HS)/HT and ranges from 0 to 1.
FST estimates this differentiation by comparing
He within subpopulations to He in the total
population.
FST will always be positive because He in
subpopulations can never be greater than He in
the total population.
As a general rule of thumb, for values of FST that are
statistically different from 0:
0.05 to 0.15 = moderate genetic differentiation.
0.15 to 0.25 = high genetic differentiation.
> 0.25 = very high genetic differentiation.
FST = (FIT - FIS)/(1 - FIS)
Where,
FIS = 1 - (HI/HS)
FST = 1 - (HS/HT)
FIT = 1 - (HI/HT)
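The relationship FST = (FIT - FIS)/(1 - FIS) follows directly from the three ratio definitions. A short derivation, using only the symbols defined above, is sketched here:

```latex
\begin{aligned}
(1 - F_{IS})(1 - F_{ST}) &= \frac{H_I}{H_S}\cdot\frac{H_S}{H_T} = \frac{H_I}{H_T} = 1 - F_{IT} \\
\Rightarrow\quad F_{ST} &= 1 - \frac{1 - F_{IT}}{1 - F_{IS}} = \frac{F_{IT} - F_{IS}}{1 - F_{IS}}
\end{aligned}
```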
Scenario #1 -- 5 subpopulations, 1 locus, 2 alleles
Given: Observed heterozygosity (HI) = 0.18
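| Pop | $x_i$ | $x_j$ | $H_{Is} = 1 - \sum x_i^2$ |
|-----|-------|-------|---------------------------|
| 1 | 0.9 | 0.1 | $1 - (0.9^2 + 0.1^2) = 1 - 0.82 = 0.18$ |
| 2 | 0.9 | 0.1 | $1 - (0.9^2 + 0.1^2) = 1 - 0.82 = 0.18$ |
| 3 | 0.9 | 0.1 | $1 - (0.9^2 + 0.1^2) = 1 - 0.82 = 0.18$ |
| 4 | 0.9 | 0.1 | $1 - (0.9^2 + 0.1^2) = 1 - 0.82 = 0.18$ |
| 5 | 0.9 | 0.1 | $1 - (0.9^2 + 0.1^2) = 1 - 0.82 = 0.18$ |
| Ave | 0.9 | 0.1 | |

$H_S = \sum H_{Is}/k = 0.18$
$H_T = 1 - \sum \bar{x}_i^2 = 0.18$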
Scenario #1
FIS = 1 - (HI/HS) = 1 - (0.18/0.18) = 0.00
FIT = 1 - (HI/HT) = 1 - (0.18/0.18) = 0.00
FST = 1 - (HS/HT) = 1 - (0.18/0.18) = 0.00
Conclusions:
No genetic differentiation among
subpopulations
No Inbreeding
Scenario #2 -- 5 subpopulations, 1 locus, 2 alleles
Given: Observed heterozygosity (HI) = 0.089
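| Pop | $x_i$ | $x_j$ | $H_{Is} = 1 - \sum x_i^2$ |
|-----|-------|-------|---------------------------|
| 1 | 0.9 | 0.1 | $1 - (0.9^2 + 0.1^2) = 1 - 0.82 = 0.18$ |
| 2 | 0.9 | 0.1 | $1 - (0.9^2 + 0.1^2) = 1 - 0.82 = 0.18$ |
| 3 | 0.9 | 0.1 | $1 - (0.9^2 + 0.1^2) = 1 - 0.82 = 0.18$ |
| 4 | 0.9 | 0.1 | $1 - (0.9^2 + 0.1^2) = 1 - 0.82 = 0.18$ |
| 5 | 0.9 | 0.1 | $1 - (0.9^2 + 0.1^2) = 1 - 0.82 = 0.18$ |
| Ave | 0.9 | 0.1 | |

$H_S = \sum H_{Is}/k = 0.18$
$H_T = 1 - \sum \bar{x}_i^2 = 0.18$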
Scenario #2
FIS = 1 - (HI/HS) = 1 - (0.089/0.18) = 0.5056
FIT = 1 - (HI/HT) = 1 - (0.089/0.18) = 0.5056
FST = 1 - (HS/HT) = 1 - (0.18/0.18) = 0.00
Conclusions:
No genetic differentiation among
subpopulations
Inbreeding!
Scenario #3 -- 5 subpopulations, 1 locus, 2 alleles
Given: Observed heterozygosity (HI) = 0.34
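| Pop | $x_i$ | $x_j$ | $H_{Is} = 1 - \sum x_i^2$ |
|-----|-------|-------|---------------------------|
| 1 | 0.9 | 0.1 | $1 - (0.9^2 + 0.1^2) = 1 - 0.82 = 0.18$ |
| 2 | 0.7 | 0.3 | $1 - (0.7^2 + 0.3^2) = 1 - 0.58 = 0.42$ |
| 3 | 0.5 | 0.5 | $1 - (0.5^2 + 0.5^2) = 1 - 0.50 = 0.50$ |
| 4 | 0.3 | 0.7 | $1 - (0.3^2 + 0.7^2) = 1 - 0.58 = 0.42$ |
| 5 | 0.1 | 0.9 | $1 - (0.1^2 + 0.9^2) = 1 - 0.82 = 0.18$ |
| Ave | 0.5 | 0.5 | |

$H_S = \sum H_{Is}/k = 0.34$
$H_T = 1 - \sum \bar{x}_i^2 = 0.50$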
Scenario #3
FIS = 1 - (HI/HS) = 1 - (0.34/0.34) = 0.00
FIT = 1 - (HI/HT) = 1 - (0.34/0.50) = 0.32
FST = 1 - (HS/HT) = 1 - (0.34/0.50) = 0.32
Conclusions:
Genetic differentiation among
subpopulations
No Inbreeding
Scenario #4 -- 5 subpopulations, 1 locus, 2 alleles
Given: Observed heterozygosity (HI) = 0.169
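| Pop | $x_i$ | $x_j$ | $H_{Is} = 1 - \sum x_i^2$ |
|-----|-------|-------|---------------------------|
| 1 | 0.9 | 0.1 | $1 - (0.9^2 + 0.1^2) = 1 - 0.82 = 0.18$ |
| 2 | 0.7 | 0.3 | $1 - (0.7^2 + 0.3^2) = 1 - 0.58 = 0.42$ |
| 3 | 0.5 | 0.5 | $1 - (0.5^2 + 0.5^2) = 1 - 0.50 = 0.50$ |
| 4 | 0.3 | 0.7 | $1 - (0.3^2 + 0.7^2) = 1 - 0.58 = 0.42$ |
| 5 | 0.1 | 0.9 | $1 - (0.1^2 + 0.9^2) = 1 - 0.82 = 0.18$ |
| Ave | 0.5 | 0.5 | |

$H_S = \sum H_{Is}/k = 0.34$
$H_T = 1 - \sum \bar{x}_i^2 = 0.50$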
Scenario #4
FIS = 1 - (HI/HS) = 1 - (0.169/0.34) = 0.5029
FIT = 1 - (HI/HT) = 1 - (0.169/0.50) = 0.662
FST = 1 - (HS/HT) = 1 - (0.34/0.50) = 0.32
Conclusions:
Genetic differentiation among
subpopulations
Inbreeding!
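The four scenarios above can be reproduced with a few lines of Python. This is a minimal sketch: the function name `f_statistics` and its argument layout are assumptions made for illustration, and the observed heterozygosity HI is supplied directly, as in the scenarios.

```python
def f_statistics(allele_freqs, h_obs):
    """Wright's F-statistics for k subpopulations at one locus.

    allele_freqs: one list of allele frequencies per subpopulation.
    h_obs: observed heterozygosity averaged over subpopulations (HI).
    """
    k = len(allele_freqs)
    n_alleles = len(allele_freqs[0])
    # HS: mean expected heterozygosity within subpopulations
    h_s = sum(1 - sum(x * x for x in freqs) for freqs in allele_freqs) / k
    # HT: expected heterozygosity from allele frequencies averaged over subpopulations
    mean_freqs = [sum(freqs[a] for freqs in allele_freqs) / k for a in range(n_alleles)]
    h_t = 1 - sum(x * x for x in mean_freqs)
    return {"FIS": 1 - h_obs / h_s, "FIT": 1 - h_obs / h_t, "FST": 1 - h_s / h_t}


differentiated = [[0.9, 0.1], [0.7, 0.3], [0.5, 0.5], [0.3, 0.7], [0.1, 0.9]]
scenario_3 = f_statistics(differentiated, h_obs=0.34)
scenario_4 = f_statistics(differentiated, h_obs=0.169)
```

Only `h_obs` differs between Scenarios #3 and #4, so FST is identical in both while FIS and FIT rise with the heterozygote deficit.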
Examples thus far have dealt with a single
hierarchical level.
However, there may be genetic differentiation
within subpopulations which would create additional
hierarchical levels: localities within subpopulations.
Data can be partitioned into various hierarchical
levels to determine what geographical
factors most likely explain population differentiation.
$H_{Li}$ = observed heterozygosity within locality i
$H_L = \sum H_{Li}/k$
$H_{Is} = 1 - \sum x_i^2$ = expected heterozygosity within
subpopulations
$H_S = \sum H_{Is}/k$
FLS measures the proportion of the total genetic
variation attributable to divergence among
localities within subpopulations.
FLS = (HS - HL)/HT
FLT measures the proportion of the total genetic
variation attributable to localities
FLT = (HT - HL)/HT
FST measures the proportion of the total genetic
variation attributable to subpopulations
FST = (HT - HS)/ HT
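Scenario 1

| Subpop | Locality | $x_i$ | $x_j$ | $H_{Li}$ | Subpop mean $x_i$ | Subpop mean $x_j$ | $H_{Si}$ |
|--------|----------|-------|-------|----------|-------------------|-------------------|----------|
| 1 | 1 | 0.8 | 0.2 | 0.32 | 0.8 | 0.2 | 0.32 |
| 1 | 2 | 0.8 | 0.2 | 0.32 | | | |
| 2 | 3 | 0.2 | 0.8 | 0.32 | 0.27 | 0.73 | 0.3911 |
| 2 | 4 | 0.3 | 0.7 | 0.42 | | | |
| 2 | 5 | 0.3 | 0.7 | 0.42 | | | |
| Ave | | 0.48 | 0.52 | 0.36 = $H_L$ | | | 0.4992 = $H_T$ |

HS = weighted ave of HSi = [(2 × 0.32) + (3 × 0.3911)]/5
HS = 0.3627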
FLS
= (HS - HL)/HT
= (0.3627 - 0.36)/0.4992 = 0.005
FLT = (HT - HL)/HT
(0.4992 - 0.36)/0.4992 = 0.279
FST = (HT - HS)/HT
(0.4992 - 0.3627)/0.4992 = 0.273
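Scenario 2

| Subpop | Locality | $x_i$ | $x_j$ | $H_{Li}$ | Subpop mean $x_i$ | Subpop mean $x_j$ | $H_{Si}$ |
|--------|----------|-------|-------|----------|-------------------|-------------------|----------|
| 1 | 1 | 0.8 | 0.2 | 0.32 | 0.6 | 0.4 | 0.48 |
| 1 | 2 | 0.4 | 0.6 | 0.48 | | | |
| 2 | 3 | 0.7 | 0.3 | 0.42 | 0.47 | 0.53 | 0.498 |
| 2 | 4 | 0.3 | 0.7 | 0.42 | | | |
| 2 | 5 | 0.4 | 0.6 | 0.48 | | | |
| Ave | | 0.52 | 0.48 | 0.424 = $H_L$ | | | 0.4992 = $H_T$ |

HS = weighted ave of HSi = [(2 × 0.48) + (3 × 0.498)]/5
HS = 0.491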
FLS
= (HS - HL)/HT
= (0.491 - 0.424)/0.4992 = 0.134
FLT = (HT - HL)/HT
(0.4992 - 0.424)/0.4992 = 0.151
FST = (HT - HS)/HT
(0.4992 - 0.491)/0.4992 = 0.016
Conclusions:
In both scenarios, total genetic diversity (HT) is
0.4992.
However, in scenario #1, 27.3% of the total genetic
diversity is attributable to differences between
subpopulations and only 0.5% is due to differences
among localities within subpopulations.
Conclusions:
In scenario #2, 13.4% of the total genetic diversity
is attributable to differences among localities
within subpopulations and only about 1.6% is attributable to
differences between the two subpopulations.
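The hierarchical partition in the two scenarios above can be sketched in Python as follows. The function names are hypothetical, a two-allele locus is assumed, and HS is weighted by the number of localities in each subpopulation, as in the worked scenarios. Small differences from the slide values come from rounding of intermediate results.

```python
def heterozygosity(p):
    """Expected heterozygosity for a two-allele locus with allele frequency p."""
    return 1 - (p ** 2 + (1 - p) ** 2)


def hierarchical_f(subpops):
    """subpops: list of subpopulations, each a list of locality allele frequencies."""
    localities = [p for sub in subpops for p in sub]
    n_loc = len(localities)
    # HL: mean heterozygosity within localities
    h_l = sum(heterozygosity(p) for p in localities) / n_loc
    # HS: HSi from each subpopulation's mean frequency, weighted by its number of localities
    h_s = sum(len(sub) * heterozygosity(sum(sub) / len(sub)) for sub in subpops) / n_loc
    # HT: heterozygosity from the mean frequency over all localities
    h_t = heterozygosity(sum(localities) / n_loc)
    return {"FLS": (h_s - h_l) / h_t, "FLT": (h_t - h_l) / h_t, "FST": (h_t - h_s) / h_t}


scenario_1 = hierarchical_f([[0.8, 0.8], [0.2, 0.3, 0.3]])
scenario_2 = hierarchical_f([[0.8, 0.4], [0.7, 0.3, 0.4]])
```

Note that FLS + FST = FLT under these definitions, because all three share the denominator HT.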
Mortamer Snide graduated with a Ph.D. in wildlife management from Texas A&M University in the early 1900s. His
dissertation dealt with the distribution and deer of an unknown and unnamed mountain range in Idaho. Because of the rugged
terrain and remoteness of the area, Mortamer spent 15 years collecting observational data on deer in this area to develop a
management plan. Upon graduating and publishing his dissertation, Mortamer Snide was hired by the Idaho Department of
Wildlife Resources to develop a management plan for deer in this area. Because he spent so much time in these mountains,
the mountain range was officially named the Mortamer Snide Mountains and the newly developed management area was
named the Mortamer Snide Management Area. Based on his work, Mortamer concluded that there were two herds of
deer on the management area, separated from each other by the Snide Mountain Range. Additionally, his demographic
work suggested that although there appeared to be subdivision within these two regions, there were high levels of
movement of individuals within each region such that each region should be treated as a management unit. Therefore,
his management plan was to treat all deer east of the Snide Mountains as one management unit and all deer west of the
Snide Mountains as a second management unit.
In recent years, the deer on the Mortamer Snide Management Area have not been doing well. You are enrolled in the
Ph.D. program at Oklahoma State University and have become very interested in the problem with the deer at the
Mortamer Snide Management Area. For your dissertation, you spent several years performing demographic studies of
the deer on each side of the Snide Mountain Range. Your observational data indicate that there may actually be two
herds within each region. However, to test this, you collect blood samples from several deer from five groups of
deer, two on the west side of the Snide Mountains and three from the east side of the Snide Mountains and perform a
microsatellite analysis. The following is a figure illustrating your interpretation of the demographics of the deer
populations on the Mortamer Snide Management Area.
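Completely analyze these data.

[Figure: map of the Mortamer Snide Management Area showing the Snide Mountains, the Snide River, and the sampled localities grouped into subpopulations A and B]

Genotype counts (Locus 1: AA, Aa, aa; Locus 2: BB, Bb, bb)

| Locality | AA | Aa | aa | BB | Bb | bb |
|----------|----|----|----|----|----|----|
| A-1 | 26 | 27 | 16 | 14 | 19 | 36 |
| A-2 | 10 | 18 | 28 | 38 | 6 | 12 |
| B-1 | 20 | 23 | 26 | 29 | 14 | 26 |
| B-2 | 12 | 8 | 36 | 12 | 6 | 38 |
| B-3 | 19 | 25 | 25 | 30 | 20 | 19 |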
Calculate Observed Heterozygosity (Ho) & Test
each locus in each population for deviations from
H.W.E.
Sample A1, Locus 1
Ho = 27/69 = 0.3913
A = p = (26+26+27)/138 = 0.5725;
q = 1 - 0.5725 = 0.4275
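| Genotype | OBS | EXP | $(O-E)^2/E$ |
|----------|-----|-----|-------------|
| AA | 26 | $(0.5725)^2 \times 69 = 22.62$ | 0.505 |
| Aa | 27 | $2 \times 0.5725 \times 0.4275 \times 69 = 33.77$ | 1.357 |
| aa | 16 | $(0.4275)^2 \times 69 = 12.61$ | 0.911 |

$\chi^2 = 2.773$
Degrees of Freedom = 3 - 1 - 1 = 1
Tabled $\chi^2$ at $\alpha = 0.05$ and 1 d.f. = 3.84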
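The test above can be written as a short Python function. This is a sketch for a two-allele locus only; the function and variable names are assumptions, and the critical value 3.84 is the tabled value quoted above.

```python
def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic (1 d.f.) for deviation from HWE at a two-allele locus."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_AA, n_Aa, n_aa)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


CRITICAL_1DF = 3.84  # tabled chi-square at alpha = 0.05, 1 d.f.
chi2_a1_locus1 = hwe_chi_square(26, 27, 16)
in_hwe = chi2_a1_locus1 < CRITICAL_1DF
```

For sample A1 at locus 1 the statistic falls below the critical value, so HWE is not rejected.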
Sample A1, Locus 2
Ho = 19/69 = 0.275
B = p = (14+14+19)/138 = 0.3406;
q = 1 - 0.3406 = 0.6594
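| Genotype | OBS | EXP | $(O-E)^2/E$ |
|----------|-----|-----|-------------|
| BB | 14 | $(0.3406)^2 \times 69 = 8.00$ | 4.5 |
| Bb | 19 | $2 \times 0.3406 \times 0.6594 \times 69 = 30.99$ | 4.64 |
| bb | 36 | $(0.6594)^2 \times 69 = 30.00$ | 1.2 |

$\chi^2 = 10.339$
Mean Ho over both loci = (0.3913 + 0.275)/2 = 0.3332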
Sample A2, Locus 1
Ho = 18/56 = 0.321
A = p = (10+10+18)/112 = 0.3393;
q = 1 - 0.3393 = 0.6607
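| Genotype | OBS | EXP | $(O-E)^2/E$ |
|----------|-----|-----|-------------|
| AA | 10 | $(0.3393)^2 \times 56 = 6.47$ | 1.93 |
| Aa | 18 | $2 \times 0.3393 \times 0.6607 \times 56 = 25.11$ | 2.01 |
| aa | 28 | $(0.6607)^2 \times 56 = 24.45$ | 0.52 |

$\chi^2 = 4.46$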
Sample A2, Locus 2
Ho = 6/56 = 0.107
B = p = (38+38+6)/112 = 0.7321;
q = 1 - 0.7321 = 0.2679
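| Genotype | OBS | EXP | $(O-E)^2/E$ |
|----------|-----|-----|-------------|
| BB | 38 | $(0.7321)^2 \times 56 = 30.01$ | 2.13 |
| Bb | 6 | $2 \times 0.7321 \times 0.2679 \times 56 = 21.97$ | 11.61 |
| bb | 12 | $(0.2679)^2 \times 56 = 4.02$ | 15.85 |

$\chi^2 = 29.59$
Mean Ho over both loci = (0.321 + 0.107)/2 = 0.214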
Sample B1, Locus 1
Ho = 23/69 = 0.333
A = p = (20+20+23)/138 = 0.4565;
q = 1 - 0.4565 = 0.5435
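| Genotype | OBS | EXP | $(O-E)^2/E$ |
|----------|-----|-----|-------------|
| AA | 20 | $(0.4565)^2 \times 69 = 14.38$ | 2.20 |
| Aa | 23 | $2 \times 0.4565 \times 0.5435 \times 69 = 34.24$ | 3.69 |
| aa | 26 | $(0.5435)^2 \times 69 = 20.38$ | 1.55 |

$\chi^2 = 7.44$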
Sample B1, Locus 2
Ho = 14/69 = 0.203
B = p = (29+29+14)/138 = 0.5217;
q = 1 - 0.5217 = 0.4783
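| Genotype | OBS | EXP | $(O-E)^2/E$ |
|----------|-----|-----|-------------|
| BB | 29 | $(0.5217)^2 \times 69 = 18.78$ | 5.56 |
| Bb | 14 | $2 \times 0.5217 \times 0.4783 \times 69 = 34.44$ | 12.13 |
| bb | 26 | $(0.4783)^2 \times 69 = 15.79$ | 6.61 |

$\chi^2 = 24.3$
Mean Ho over both loci = (0.333 + 0.203)/2 = 0.268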
Sample B2, Locus 1
Ho = 8/56 = 0.143
A = p = (12+12+8)/112 = 0.2857;
q = 1 - 0.2857 = 0.7143
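| Genotype | OBS | EXP | $(O-E)^2/E$ |
|----------|-----|-----|-------------|
| AA | 12 | $(0.2857)^2 \times 56 = 4.57$ | 12.08 |
| Aa | 8 | $2 \times 0.2857 \times 0.7143 \times 56 = 22.86$ | 9.66 |
| aa | 36 | $(0.7143)^2 \times 56 = 28.57$ | 1.93 |

$\chi^2 = 23.67$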
Sample B2, Locus 2
Ho = 6/56 = 0.107
B = p = (12+12+6)/112 = 0.2678;
q = 1 - 0.2678 = 0.7322
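| Genotype | OBS | EXP | $(O-E)^2/E$ |
|----------|-----|-----|-------------|
| BB | 12 | $(0.2678)^2 \times 56 = 4.02$ | 15.87 |
| Bb | 6 | $2 \times 0.2678 \times 0.7322 \times 56 = 21.96$ | 11.60 |
| bb | 38 | $(0.7322)^2 \times 56 = 30.02$ | 2.12 |

$\chi^2 = 29.59$
Mean Ho over both loci = (0.143 + 0.107)/2 = 0.125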
Sample B3, Locus 1
Ho = 25/69 = 0.362
A = p = (19+19+25)/138 = 0.4565;
q = 1 - 0.4565 = 0.5435
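| Genotype | OBS | EXP | $(O-E)^2/E$ |
|----------|-----|-----|-------------|
| AA | 19 | $(0.4565)^2 \times 69 = 14.38$ | 1.49 |
| Aa | 25 | $2 \times 0.4565 \times 0.5435 \times 69 = 34.24$ | 2.49 |
| aa | 25 | $(0.5435)^2 \times 69 = 20.38$ | 1.05 |

$\chi^2 = 5.03$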
Sample B3, Locus 2
Ho = 20/69 = 0.290
B = p = (30+30+20)/138 = 0.5755;
q = 1 - 0.5755 = 0.4245
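| Genotype | OBS | EXP | $(O-E)^2/E$ |
|----------|-----|-----|-------------|
| BB | 30 | $(0.5755)^2 \times 69 = 22.85$ | 2.24 |
| Bb | 20 | $2 \times 0.5755 \times 0.4245 \times 69 = 33.71$ | 5.58 |
| bb | 19 | $(0.4245)^2 \times 69 = 12.43$ | 3.47 |

$\chi^2 = 11.29$
Mean Ho over both loci = (0.362 + 0.290)/2 = 0.326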
Calculate $H_e = 2N(1 - \sum p_i^2)/(2N - 1)$
A-1 locus 1:
138[1-(0.3278+0.1828)]/137=0.4930
A-1 locus 2
138[1-(0.1160+0.4348)]/137=0.4525
He = (0.4930 + 0.4525)/2 = 0.473
A-2 locus 1:
112[1 - 0.5516]/111 = 0.4524
A-2 locus 2:
112[1 - 0.6077]/111 = 0.3958
He = (0.4524 + 0.3958)/2 = 0.4241
B-1 locus 1:
138[1 - 0.5038]/137 = 0.4998
B-1 locus 2:
138[1 - 0.5009]/137 = 0.5027
He = (0.4998 + 0.5027)/2 = 0.5013
B-2 locus 1:
112[1 - 0.5918]/111 = 0.4119
B-2 locus 2:
112[1 - 0.6078]/111 = 0.3957
He = (0.4119 + 0.3957)/2 = 0.4038
B-3 locus 1:
138[1 - 0.5038]/137 = 0.4998
B-3 locus 2:
138[1 - 0.5114]/137 = 0.4922
He = (0.4998 + 0.4922)/2 = 0.4960

Table of Descriptive Statistics

| Locus | Statistic | A1 | A2 | B1 | B2 | B3 |
|-------|-----------|----|----|----|----|----|
| Locus 1 | Ho | 0.391 | 0.321 | 0.333 | 0.143 | 0.362 |
| Locus 1 | He | 0.493 | 0.452 | 0.500 | 0.412 | 0.500 |
| Locus 1 | HWE | Y | N | N | N | N |
| Locus 2 | Ho | 0.275 | 0.107 | 0.203 | 0.107 | 0.290 |
| Locus 2 | He | 0.453 | 0.396 | 0.502 | 0.396 | 0.492 |
| Locus 2 | HWE | N | N | N | N | N |
| Mean | Ho | 0.333 | 0.214 | 0.268 | 0.125 | 0.326 |
| Mean | He | 0.473 | 0.424 | 0.501 | 0.404 | 0.496 |
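The He values above use the small-sample correction 2N/(2N - 1). A minimal Python sketch of that calculation from raw genotype counts follows; the function name is hypothetical and a two-allele locus is assumed.

```python
def unbiased_he(genotype_counts):
    """Unbiased expected heterozygosity, 2N(1 - sum p_i^2)/(2N - 1), two-allele locus.

    genotype_counts: (n_AA, n_Aa, n_aa).
    """
    n_AA, n_Aa, n_aa = genotype_counts
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1 - p
    return 2 * n * (1 - (p * p + q * q)) / (2 * n - 1)


he_a1_locus1 = unbiased_he((26, 27, 16))
he_a1_locus2 = unbiased_he((14, 19, 36))
he_a1_mean = (he_a1_locus1 + he_a1_locus2) / 2
```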
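Locus 1

| Subpop | Locality | $x_i$ | $x_j$ | $H_{Li}$ | Subpop mean $x_i$ | Subpop mean $x_j$ | $H_{Si}$ |
|--------|----------|-------|-------|----------|-------------------|-------------------|----------|
| A | 1 | 0.573 | 0.427 | 0.489 | 0.456 | 0.544 | 0.496 |
| A | 2 | 0.339 | 0.661 | 0.448 | | | |
| B | 1 | 0.457 | 0.543 | 0.496 | 0.394 | 0.606 | 0.478 |
| B | 2 | 0.268 | 0.732 | 0.392 | | | |
| B | 3 | 0.457 | 0.543 | 0.496 | | | |
| Ave | | 0.419 | 0.581 | 0.457 = $H_L$ | | | 0.487 = $H_T$ |

HS = weighted ave of HSi = [(2 × 0.496) + (3 × 0.478)]/5
HS = 0.485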
FLS = (HS - HL)/HT
= (0.485 - 0.457)/0.487
= 0.057
FLT = (HT - HL)/HT
= (0.487 - 0.457)/0.487
= 0.062
FST = (HT - HS)/(HT)
= (0.487 - 0.485)/0.487
= 0.004
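Locus 2

| Subpop | Locality | $x_i$ | $x_j$ | $H_{Li}$ | Subpop mean $x_i$ | Subpop mean $x_j$ | $H_{Si}$ |
|--------|----------|-------|-------|----------|-------------------|-------------------|----------|
| A | 1 | 0.341 | 0.659 | 0.449 | 0.537 | 0.463 | 0.497 |
| A | 2 | 0.732 | 0.268 | 0.392 | | | |
| B | 1 | 0.522 | 0.478 | 0.499 | 0.455 | 0.545 | 0.496 |
| B | 2 | 0.268 | 0.732 | 0.392 | | | |
| B | 3 | 0.576 | 0.424 | 0.488 | | | |
| Ave | | 0.488 | 0.512 | 0.444 = $H_L$ | | | 0.500 = $H_T$ |

HS = weighted ave of HSi = [(2 × 0.497) + (3 × 0.496)]/5
HS = 0.496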
FLS = (HS - HL)/HT
= (0.496 - 0.444)/0.500
= 0.104
FLT = (HT - HL)/HT
= (0.500 - 0.444)/0.500
= 0.112
FST = (HT - HS)/(HT)
= (0.500 - 0.496)/0.500
= 0.008
| | FLS | FLT | FST |
|---------|-------|-------|-------|
| Locus 1 | 0.057 | 0.062 | 0.004 |
| Locus 2 | 0.104 | 0.112 | 0.008 |
| Ave. | 0.081 | 0.087 | 0.006 |
F-statistics Locus 1
HI = (0.391 + 0.321 + 0.333 + 0.143 + 0.362)/5 = 0.310

| Pop | $x_i$ | $x_j$ | $H_{Is} = 1 - \sum x_i^2$ |
|-----|-------|-------|---------------------------|
| A1 | 0.5725 | 0.4275 | 0.4894 |
| A2 | 0.3393 | 0.6607 | 0.4484 |
| B1 | 0.4565 | 0.5435 | 0.4962 |
| B2 | 0.2857 | 0.7143 | 0.4082 |
| B3 | 0.4565 | 0.5435 | 0.4962 |
| Ave | 0.4221 | 0.5779 | |

$H_S = \sum H_{Is}/k = 0.4677$
$H_T = 1 - \sum \bar{x}_i^2 = 0.4879$
HI = 0.310
HS = 0.468
HT = 0.488
FIS = 1 - (HI/HS) = 1 - (0.310/0.468) = 0.338
FIT = 1 - (HI/HT) = 1 - (0.310/0.488) = 0.365
FST = 1 - (HS/HT) = 1 - (0.468/0.488) = 0.041

F-statistics Locus 2
HI = (0.275 + 0.107 + 0.203 + 0.107 + 0.290)/5 = 0.196

| Pop | $x_i$ | $x_j$ | $H_{Is} = 1 - \sum x_i^2$ |
|-----|-------|-------|---------------------------|
| A1 | 0.3406 | 0.6594 | 0.4492 |
| A2 | 0.7321 | 0.2679 | 0.3923 |
| B1 | 0.5217 | 0.4783 | 0.4991 |
| B2 | 0.2678 | 0.7322 | 0.3922 |
| B3 | 0.5755 | 0.4245 | 0.4886 |
| Ave | 0.4875 | 0.5125 | |

$H_S = \sum H_{Is}/k = 0.4443$
$H_T = 1 - \sum \bar{x}_i^2 = 0.4997$
HI = 0.196
HS = 0.4443
HT = 0.4997
FIS = 1 - (HI/HS) = 1 - (0.196/0.444) = 0.559
FIT = 1 - (HI/HT) = 1 - (0.196/0.500) = 0.608
FST = 1 - (HS/HT) = 1 - (0.444/0.500) = 0.112

| | Locus 1 | Locus 2 | Ave |
|-----|---------|---------|-------|
| FIS | 0.338 | 0.559 | 0.448 |
| FIT | 0.365 | 0.608 | 0.486 |
| FST | 0.041 | 0.112 | 0.077 |
Calculation of Pairwise FST -- Locus 1; Population A1 vs. A2

| Subpop | $x_i$ | $x_j$ | $H_{Si}$ |
|--------|--------|--------|----------|
| A1 | 0.5725 | 0.4275 | 0.4894 |
| A2 | 0.3393 | 0.6607 | 0.4484 |
| Ave. | 0.4559 | 0.5441 | |

HS = 0.4689
HT = 0.4961
FST = 1 - (HS/HT) = 1 - (0.4689/0.4961) = 0.055

Locus 2; Population A1 vs. A2

| Subpop | $x_i$ | $x_j$ | $H_{Si}$ |
|--------|--------|--------|----------|
| A1 | 0.3406 | 0.6594 | 0.4492 |
| A2 | 0.7321 | 0.2679 | 0.3923 |
| Ave. | 0.5364 | 0.4637 | |

HS = 0.4208
HT = 0.4975
FST = 1 - (HS/HT) = 1 - (0.4208/0.4975) = 0.154
Ave FST = (0.055 + 0.154)/2 = 0.105
Continue this for ALL pairwise comparisons
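Repeating the pairwise calculation by hand for every pair is tedious, so a short helper is useful. The sketch below assumes a two-allele locus and equal weighting of the two populations, as in the A1 vs. A2 example; the function name is hypothetical.

```python
def pairwise_fst(p1, p2):
    """FST = 1 - HS/HT between two populations at a two-allele locus."""
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population He
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # He from the averaged allele frequency
    return 1 - h_s / h_t


# A1 vs. A2 at locus 1 and locus 2, then the average over loci
a1_vs_a2 = [pairwise_fst(0.5725, 0.3393), pairwise_fst(0.3406, 0.7321)]
mean_fst = sum(a1_vs_a2) / len(a1_vs_a2)
```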
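| | A1 | A2 | B1 | B2 | B3 |
|----|-------|-------|-------|-------|------|
| A1 | ---- | | | | |
| A2 | 0.105 | ---- | | | |
| B1 | 0.024 | 0.031 | ---- | | |
| B2 | 0.046 | 0.110 | 0.048 | ---- | |
| B3 | 0.035 | 0.021 | 0.001 | 0.061 | ---- |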
Unweighted Pair-Group Method Using Arithmetic
Averages (UPGMA) Clustering Method.
UPGMA computes the average similarity or
dissimilarity of a candidate OTU to an extant
cluster, weighting each OTU in that cluster
equally, regardless of its structural subdivision.
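Pairwise FST

| | A1 | A2 | B1 | B2 | B3 |
|----|-------|-------|-------|-------|------|
| A1 | ---- | | | | |
| A2 | 0.105 | ---- | | | |
| B1 | 0.024 | 0.031 | ---- | | |
| B2 | 0.046 | 0.110 | 0.048 | ---- | |
| B3 | 0.035 | 0.021 | 0.001 | 0.061 | ---- |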
Step 1 -- Find the pair(s) of OTUs with the lowest
value. Here that is B1 and B3 (0.001).
Step 2 -- Recalculate distances with [B1,B3] as an
OTU. Differences between OTUs that did not
join any cluster are transcribed unchanged from
the original matrix.
Therefore:
[B1,B3] vs. A1 = 1/2(B1A1 + B3A1) = 1/2(0.024 + 0.035) = 0.030
[B1,B3] vs. A2 = 1/2(B1A2 + B3A2) = 1/2(0.031 + 0.021) = 0.026
[B1,B3] vs. B2 = 1/2(B1B2 + B3B2) = 1/2(0.048 + 0.061) = 0.055
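Now:

| | [B1,B3] | A1 | A2 | B2 |
|---------|---------|-------|-------|------|
| [B1,B3] | ---- | | | |
| A1 | 0.030 | ---- | | |
| A2 | 0.026 | 0.105 | ---- | |
| B2 | 0.055 | 0.046 | 0.110 | ---- |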
Find the smallest difference and repeat the above
steps until all OTUs cluster!
[B1,B3,A2] vs. A1 = 1/3(0.024 + 0.035 + 0.105)
= 0.055
[B1,B3,A2] vs. B2 = 1/3(0.048 + 0.061 + 0.110)
= 0.073
| | [B1,B3,A2] | A1 | B2 |
|------------|------------|-------|------|
| [B1,B3,A2] | ---- | | |
| A1 | 0.055 | ---- | |
| B2 | 0.073 | 0.046 | ---- |

[B1,B3,A2] vs. [A1,B2] =
1/6(0.105 + 0.024 + 0.035 + 0.048 + 0.110 + 0.061) = 0.064
Final Step is to draw an unrooted dendrogram
Summary:
1. B1 clusters with B3 @ 0.001
2. A2 clusters with [B1,B3] @ 0.026
3. A1 clusters with B2 @ 0.046
4. [B1,B3,A2] clusters with [A1,B2] at 0.064
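The clustering steps summarized above can be sketched in plain Python. This is a minimal illustration rather than a production implementation: the function names are hypothetical, and the distance between two clusters is taken as the mean of all original pairwise FST values between their members, which matches the 1/2, 1/3 and 1/6 averages used in the worked steps.

```python
from itertools import combinations

PAIRWISE_FST = {
    ("A1", "A2"): 0.105, ("A1", "B1"): 0.024, ("A1", "B2"): 0.046, ("A1", "B3"): 0.035,
    ("A2", "B1"): 0.031, ("A2", "B2"): 0.110, ("A2", "B3"): 0.021,
    ("B1", "B2"): 0.048, ("B1", "B3"): 0.001, ("B2", "B3"): 0.061,
}


def cluster_distance(c1, c2, pairwise):
    """Mean of all original pairwise distances between members of c1 and c2."""
    total = sum(pairwise[tuple(sorted((a, b)))] for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def upgma(labels, pairwise):
    """Return the list of merges as (cluster_1, cluster_2, distance)."""
    clusters = [(label,) for label in labels]
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of current clusters
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(pair[0], pair[1], pairwise))
        merges.append((c1, c2, cluster_distance(c1, c2, pairwise)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


merges = upgma(["A1", "A2", "B1", "B2", "B3"], PAIRWISE_FST)
```

Each entry of `merges` records which two clusters joined and at what average distance, which is the information needed to draw the dendrogram.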
[Figure: UPGMA dendrogram of the five localities (tips, top to bottom: B1, B3, A2, A1, B2); scale axis from 0 to 0.06]
[Figure: map of the Mortamer Snide Management Area (repeated), showing the Snide Mountains, the Snide River, and the localities and subpopulations]
Migration -- movement of individuals (or their
gametes) between populations and subsequent
gene flow.
The degree of genetic differentiation among
populations (FST) is expected to be greater for:
species with lower vs. higher dispersal rates
subdivided vs. continuous habitat
distant vs. closer fragments
smaller vs. larger population fragments
species with longer vs shorter divergence times
with adaptive differences vs. those without.
Mainland-Island Model of Migration:
[Figure: one-way migration from a mainland (allele frequency $q_m$) to an island (allele frequency $q_t$)]
m = migration coefficient; proportion of individuals
on the island that just came from the mainland.
Change in allele frequency due to migration:
$\Delta q = m(q_m - q_t)$
Frequency of q on the island in the next
generation:
$q_{t+1} = q_t + \Delta q$
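Iterating the recursion above shows how quickly the island converges on the mainland frequency. The following Python sketch uses assumed starting values (island 0.1, mainland 0.8, m = 0.05) purely for illustration; the function name is hypothetical.

```python
def mainland_island(q_island, q_mainland, m, generations):
    """Iterate q_{t+1} = q_t + m * (q_m - q_t); returns the trajectory as a list."""
    trajectory = [q_island]
    for _ in range(generations):
        q_island = q_island + m * (q_mainland - q_island)
        trajectory.append(q_island)
    return trajectory


traj = mainland_island(q_island=0.1, q_mainland=0.8, m=0.05, generations=100)
```

The difference between island and mainland shrinks by a factor of (1 - m) each generation, so after 100 generations at m = 0.05 the island is within about 0.005 of the mainland frequency.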
The Mainland-Island model of gene flow is
realistic for:
the Galapagos Islands & west coast of S. A.
“habitat islands” such as fragmented rainforests
or mountain tops separated by desert.
Island Model:
m is the proportion of
individuals on any one
island that came from
elsewhere.
The islands differ in allele frequency, and the
average allele frequency among all islands is the
mean, $\bar{q}$.
From the point of view of a single island, all other
islands are equivalent to a “mainland”.
Therefore, if $q_t$ is the allele frequency on a
given island, the recursion equation for the allele
frequency on the island is:
$q_{t+1} = m\bar{q} + (1 - m)q_t$
This is the same equation as the mainland-island
model, with $\bar{q}$ in place of $q_m$.
The allele frequency on each island will approach
the mean allele frequency across all islands.
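A small simulation makes this convergence concrete. The sketch below is deterministic (no drift) and assumes equally sized islands; the three starting frequencies, m = 0.1, and the function name are illustrative assumptions.

```python
def island_model(freqs, m, generations):
    """Apply q_{t+1} = m * q_bar + (1 - m) * q_t to every island each generation."""
    freqs = list(freqs)
    for _ in range(generations):
        q_bar = sum(freqs) / len(freqs)  # mean frequency across all islands
        freqs = [m * q_bar + (1 - m) * q for q in freqs]
    return freqs


final = island_model([0.1, 0.5, 0.9], m=0.1, generations=50)
```

The mean frequency stays at 0.5 throughout, while each island's deviation from the mean shrinks by a factor of (1 - m) per generation.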
Stepping-Stone Model: Allows subpopulations to
exchange individuals only with adjacent
subpopulations and can be formulated in one, two,
or more dimensions.
One dimensional stepping-stone model:
Two-Dimensional Stepping-Stone Model:
Migration tends to homogenize populations and
the rate of homogenization depends upon the
migration rate, population structure, and
difference in allele frequencies.
All subpopulations eventually approach the mean
allele frequency of the total population!
Populations diverge from each other as a function
of genetic drift, migration, and local selection.
Drift can have a particularly strong effect on
small populations and is an inverse function of Ne.
$F_{ST} = \dfrac{1}{4N_e m + 1}$
If Ne is small, populations will tend to diverge as a
result of random genetic drift and high rates of
migration (m) are needed to prevent divergence.
Allendorf (1983) has shown that if Nem is > 1,
local populations will not tend to diverge
significantly in terms of alleles present.
For example, a pair of populations with a mean Ne
of 1,000 and m = 0.01 would not significantly
diverge by chance alone since Nem = 10.
However, for a pair of smaller populations with a mean
Ne of 100 and the same rate of gene flow
(m = 0.01), Nem = 1; random drift would be greater and a
higher rate of gene flow would be needed to
prevent divergence.
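The two cases above can be compared by evaluating the drift-migration equilibrium formula directly. This is a minimal sketch; the function name is an assumption.

```python
def expected_fst(ne, m):
    """Equilibrium FST under drift-migration balance: 1 / (4 * Ne * m + 1)."""
    return 1 / (4 * ne * m + 1)


large = expected_fst(1000, 0.01)  # Nem = 10
small = expected_fst(100, 0.01)   # Nem = 1
```

With Nem = 10 the expected FST is about 0.024, whereas with Nem = 1 it rises to 0.20.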
Some populations and species in nature have existed
for long periods of time in complete isolation from
other gene pools and have diverged through genetic
drift and selection.
In such cases, natural movement among
subpopulations was historically rare or nonexistent
and strong divergence occurred.
In such cases, within population heterozygosity is
expected to be low and FST high, virtually all of the
total genetic diversity (HT) in such species could be
due to the divergence component.
The management implication of this scenario is
that the separation of these naturally isolated
populations should be maintained.
Contrasting this, is the more typical case in which
genetic exchange among populations occurs in a
hierarchical fashion.
In this case, local populations may be only
partially isolated from other gene pools, with
some probability of gene flow among them.
Geographically proximate populations would, on
average, experience gene flow more frequently
than would geographically distant populations.
Genetic “connectedness” is therefore a function
of geographic structure and spatial scale.
Most endangered species do not experience the
equilibrium conditions implicit in a hierarchical
model.
By their very nature of being endangered or of
special concern, their genetic structure has
probably been altered, populations have been lost,
and remaining populations are dangerously small
and fragmented.
Habitat destruction, blockage of migration routes,
drying or diversion of waterways, clear-cutting,
urbanization, and other anthropogenic factors
isolate populations that normally would experience
gene flow with other populations.
Such induced fragmentation and isolation will lead
to loss of heterozygosity and divergence from other
populations where gene flow previously occurred.
Leberg (1991) found
that eastern wild
turkeys in AK, KY,
TN, CN were
fragmented and
had gone through
bottlenecks
because of human
activity.
Genetic divergence among these populations
(FST = 10.2%) was among the highest ever
recorded for birds, much higher than for turkey
populations that had not experienced known
bottlenecks.
Leberg attributed this divergence to human
activity, including management manipulations.
Scenarios such as this may call for managers to
simulate natural gene flow by artificial means.
The management challenge in the hierarchical
model is to determine former rates and direction
of gene flow among populations in an attempt to
mimic those rates in the face of human
disturbance.
The age and sex ratio of translocated individuals
should match the natural history of the species
and care should be taken not to introduce
parasites and pathogens in the process.
This management recommendation is in direct
contrast to the first example -- the isolated
island model -- in which case the manager should
NOT induce gene flow but rather, should protect
the normal isolation of populations.
But, where natural gene flow has historically
occurred and has been interrupted by humans,
management should emphasize continuance of gene
flow near historical levels.
Such is now being done for
the Florida panther and
genetic models have been
used to design the program
and evaluate its possible
consequences.
The natural genetic structure of a species and its
normal rates of gene flow may be inferred from:
geography
historical records
knowledge of the biology of the species
genetic information derived from hierarchical
analyses
Assuming drift-migration balance, the effective
migration rate historically experienced can be
estimated from:
$F_{ST} = \dfrac{1}{4N_e m + 1}$
Thus, in a species with FST = 3%:
$0.03 = 1/(4N_e m + 1) \Rightarrow 4N_e m + 1 = 33.3 \Rightarrow N_e m = 8.1$
Whereas, in a species with FST = 35%:
$0.35 = 1/(4N_e m + 1) \Rightarrow 4N_e m + 1 = 2.86 \Rightarrow N_e m = 0.46$
Echelle et al. (1987) studied 4 species of pupfish in the Chihuahuan
desert region of NM and TX and their data are amenable to
calculating historical migration rates.
Estimated historical rates of gene flow among populations

| Species | Distribution | FST | Nem |
|---------|--------------|-----|-----|
| Cyprinodon bovinus | A single, ~8 km stretch of spring-fed stream | 1.4% | 17.6 |
| C. pecosensis | 600 - 700 km of mainstream Pecos River | 7.7% | 3.0 |
| C. elegans | Spring-fed complex of canals and creek with partial isolation | 10.8% | 2.1 |
| C. tularosa | 2 isolated springs & associated creek in extremely arid region | 19.0% | 1.1 |
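The Nem column follows from inverting the drift-migration formula, Nem = (1/FST - 1)/4. A short Python sketch of that inversion, applied to the FST values listed above, is given here; the function and dictionary names are hypothetical.

```python
def nem_from_fst(fst):
    """Invert FST = 1 / (4 * Nem + 1) to get Nem = (1 / FST - 1) / 4."""
    return (1 / fst - 1) / 4


pupfish_fst = {
    "C. bovinus": 0.014,
    "C. pecosensis": 0.077,
    "C. elegans": 0.108,
    "C. tularosa": 0.190,
}
estimates = {species: round(nem_from_fst(fst), 1) for species, fst in pupfish_fst.items()}
```

These estimates hold only under the assumption of drift-migration equilibrium stated above.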
Nuclear vs. Mitochondrial DNA
Nuclear markers are inherited biparentally and
therefore can provide information concerning
both sexes as well as recombination.
Mitochondrial DNA is maternally inherited only and
therefore provides information only about
matrilines, and only a small amount of
information overall.
Cronin (1993) proposed that mtDNA data are of
limited utility because they represent a small
part of the genome and may not reflect overall
phylogenetic relationships of taxa.
Avise provides three arguments for the utility of
mtDNA in conservation and management.
1. In many species, dispersal and gene flow are
highly asymmetric by gender with females
often relatively sedentary.
2. In many species, females and their young are
spatially associated at the time that the
offspring begin independent life.
3. A strong spatial structure of matrilines implies
a considerable degree of demographic autonomy
among populations over ecological time.
Gender biased gene flow is exemplified in many
mammalian species by stronger degree of
faithfulness of females to natal sites or social
groups compared with males.
In principle, such behavior should translate into
distinctive population genetic signatures for
cytoplasm vs. nuclear loci.
In Macaca monkeys, only 9%
of the total intraspecific
diversity in nuclear allozyme
loci (FST) was distributed
among geographic
populations, whereas the comparable
value was more than 91%
for mtDNA.
This mirror image pattern
presumably results from
natal-group fidelity by females that contrast with
extensive intergroup movement by males.
Except for those species that release eggs into
the environment, a strong spatial association
normally exists between females and their
newly produced young.
If female progeny remain reasonably philopatric
to the natal site or social group, either by active
choice or passively because of limited dispersal
capabilities, a species inevitably will become
spatially structured along matrilines.
Because recruitment is contingent upon female
reproductive success, any population that is
compromised or extirpated by human or natural
causes is unlikely to recover or re-establish in the
short-term via recruitment of non-indigenous
females when female dispersal is low.
Green sea turtles nest on
isolated sandy beaches that
may be distant from their
foraging locations.
The remainder of their
life-cycle is spent at sea,
where mating takes place.
mtDNA data have documented a dramatic
matrilineal structuring among rookeries, a finding
indicative of a strong propensity for natal homing
by females.
From an ecological perspective, each rookery
should be considered (and possibly managed) as
an autonomous demographic unit.
Severe decline or loss of a rookery will not likely
be compensated by natural recruitment of
foreign females.
Geographic structure and demographic autonomy expected under different levels of female and male dispersal and gene flow:

| Female dispersal and gene flow | Male dispersal and gene flow | Geographic structure in mtDNA | Geographic structure in autosomal genes | Geographic structure in Y-linked genes | Demographic autonomy |
|------|------|-----|-----|-----|-----|
| Low | Low | yes | yes | yes | yes |
| Low | High | yes | no | no | yes |
| High | Low | no | no | yes | yes |
| High | High | no | no | no | no |