Transcript Lab 9: Test of Neutrality and Evidence for Selection
Lab 11: Test of Neutrality and Evidence for Selection
Goals:
1. Calculate the expected number of different alleles in a population for different markers.
2. Detect departures from neutrality using (1) the Ewens-Watterson test, (2) Tajima’s D test, (3) the HKA test, and (4) the synonymous and nonsynonymous nucleotide substitution test.
Infinite Alleles Model (IAM)
• Each mutation produces a new allele
• At equilibrium, number of alleles and shape of allele frequency distribution remain constant
• Lost alleles replaced by new mutations
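Ewens-Watterson test
Expected number of alleles in a sample of N diploid individuals (2N gene copies):
$$E(k) = \sum_{i=0}^{2N-1} \frac{\theta}{\theta + i} = 1 + \frac{\theta}{\theta + 1} + \frac{\theta}{\theta + 2} + \dots + \frac{\theta}{\theta + 2N - 1}$$
Where $\theta = 4N_e\mu$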
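The expected number of alleles under the IAM, E(k) = sum over i = 0 to 2N-1 of θ/(θ + i) with θ = 4Neμ, can be sketched in a few lines of Python. This is only an illustration: the function name and the input values are hypothetical, and N is taken to be the number of diploid individuals sampled.

```python
def expected_num_alleles(ne, mu, n_individuals):
    """Expected number of alleles E(k) under the infinite alleles model.

    ne            -- effective population size (N_e)
    mu            -- mutation rate per locus per generation
    n_individuals -- number of diploid individuals sampled (N)
    """
    theta = 4.0 * ne * mu
    if theta <= 0:
        raise ValueError("theta = 4*Ne*mu must be positive")
    n_copies = 2 * n_individuals  # each diploid individual carries 2 gene copies
    # Sum theta / (theta + i) for i = 0 .. 2N-1; the i = 0 term is always 1.
    return sum(theta / (theta + i) for i in range(n_copies))


# Hypothetical illustration values (not taken from the lab problems)
print(expected_num_alleles(ne=2500, mu=1e-4, n_individuals=10))
```

The first term of the sum is always 1, so E(k) is at least 1 regardless of θ; larger θ or larger samples add more expected alleles.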
Expected homozygosity under mutation-drift equilibrium and assuming IAM:
$$f_e = \frac{1}{4N_e\mu + 1}$$
Expected homozygosity under HWE:
$$f_{HW} = \sum_i p_i^2$$
P-value < 0.025: Too even -> Balancing selection or recent bottleneck
P-value > 0.975: Too uneven -> Directional selection or population growth
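The two homozygosity quantities compared by the Ewens-Watterson test can be computed as below. This is a minimal sketch with hypothetical function names and example values; it does not produce the P-value, which requires the sampling distribution of homozygosity under neutrality and is not attempted here.

```python
def expected_homozygosity_iam(ne, mu):
    """f_e = 1 / (4*Ne*mu + 1): equilibrium homozygosity under the IAM."""
    return 1.0 / (4.0 * ne * mu + 1.0)


def homozygosity_hwe(allele_freqs):
    """f_HW = sum of squared allele frequencies (Hardy-Weinberg expectation)."""
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return sum(p * p for p in allele_freqs)


# Hypothetical example: the same four alleles, evenly vs. unevenly distributed
even = [0.25, 0.25, 0.25, 0.25]
uneven = [0.85, 0.05, 0.05, 0.05]
print(homozygosity_hwe(even), homozygosity_hwe(uneven))
```

For a fixed number of alleles, an even distribution gives low homozygosity and an uneven one gives high homozygosity, which is the contrast the two P-value thresholds refer to.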
Problem 1.
Estimates of the long-term effective population size of human populations vary widely, ranging from as low as ~3,000 to as high as ~100,000. To estimate allele frequencies for a forensic identification study, you are genotyping individuals selected at random from a population with an estimated $N_e$ = 7,500. You are using one allozyme and one microsatellite marker, with estimated mutation rates $\mu = 0.8 \times 10^{-6}$ and $\mu = 9.2 \times 10^{-2}$, respectively. How many different alleles do you expect to find for each marker in a sample of:
• 7 people?
• 12 people?
• What assumptions were made for these calculations to be valid?
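Tajima’s D
• Test of the coalescent model: assumes neutral alleles and constant population size
• Estimate of θ from the number of segregating sites S in a sample of n sequences of m nucleotide sites:
$$\theta_S = \frac{S}{m \sum_{i=1}^{n-1} \frac{1}{i}}$$
• Under neutrality, we expect the following:
$$d = \pi - \theta_S = 0$$
• Test statistic:
$$D = \frac{d}{SE(d)}$$
• D < 0: Purifying/positive selection or population growth
• D > 0: Balancing selection or recent bottleneck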
(Hamilton 270)
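To make the logic of D = d / SE(d) concrete, the sketch below computes S, π, θ_S, and D from a list of aligned sequences. It is a minimal illustration and not Arlequin's implementation: the function name and toy alignment are hypothetical, the standard error uses the variance constants from Tajima (1989), and all quantities are totals over the alignment rather than per-site values.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Return (S, pi, theta_S, D) for a list of aligned, equal-length sequences."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences (variance is zero for n < 4)")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites
    S = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    if S == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of pairwise differences between sequences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Watterson's estimator and Tajima's (1989) variance constants
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    theta_s = S / a1
    d = pi - theta_s
    D = d / sqrt(e1 * S + e2 * S * (S - 1))
    return S, pi, theta_s, D


# Hypothetical toy alignment: every variant is a singleton (rare alleles only)
toy = ["AAAA", "AAAT", "AATA", "ATAA"]
S, pi, theta_s, D = tajimas_d(toy)
print(S, pi, theta_s, D)
```

In the toy alignment every polymorphism is carried by a single sequence, so π falls below θ_S and D is negative, the pattern associated above with purifying or positive selection or population growth.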
Problem 2.
File aspen_phy.arp (which is already in Arlequin format) contains sequence data from exon 1 of the phytochrome B2 (phyB2) gene of 24 aspen (Populus tremula) trees sampled along a wide latitudinal gradient in Europe. Use Arlequin to:
a. Determine the number of polymorphic sites (S) and calculate the nucleotide diversity (π) based on these sequences.
b. Perform the tests of neutrality developed by Ewens-Watterson and Tajima and interpret the results.
c. Provide a statistical and a biological interpretation of the results from the two neutrality tests.
Hudson-Kreitman-Aguadé (HKA) test
(Hamilton 266)
Locus            Polymorphism within species (S/m)   Divergence between species (D/m)   Ratio (within/between)
Adh              0.101                                0.056                              1.80
Control locus    0.022                                0.052                              0.42

χ² = 6.09, p-value = 0.016
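The ratio column in the Adh example can be checked directly from the polymorphism and divergence values. The snippet below is only a sketch of that arithmetic with hypothetical names; it does not reproduce the HKA χ² statistic, which comes from fitting a neutral model across loci.

```python
def within_between_ratio(polymorphism, divergence):
    """Ratio of within-species polymorphism to between-species divergence."""
    return polymorphism / divergence


# (polymorphism S/m, divergence D/m) as given in the Adh example
loci = {
    "Adh": (0.101, 0.056),
    "Control locus": (0.022, 0.052),
}

ratios = {name: within_between_ratio(p, d) for name, (p, d) in loci.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

Under neutrality the two ratios are expected to be similar, so the large difference between Adh and the control locus is what the χ² test evaluates.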
Problem 3.
Files utr_mays.arp, utr_par.arp, exon_mays.arp, and exon_par.arp contain sequence data from the 5’ untranslated region and from an exon of the teosinte branched1 (tb1) gene of maize (Zea mays ssp. mays) and its most likely wild progenitor Zea mays ssp. parviglumis. For each of these regions of tb1 and for each subspecies:
a. Use Arlequin to determine the number of segregating sites (S) and calculate the nucleotide diversity (π). What can you infer by comparing nucleotide diversity between the two subspecies for each region?
b. Use Arlequin to perform the tests of neutrality developed by Ewens-Watterson and Tajima. Interpret and discuss the results.
c. Interpret and discuss the results from the following 2 HKA tests.
d. GRADUATE STUDENTS ONLY: Download and read the paper describing this study (Wang et al. 1999), which is uploaded on the lab page of the class website, and provide an extended biological interpretation of the results of a) to c).
File             Region of tb1             Subspecies
utr_mays.arp     5’ untranslated region    mays
utr_par.arp      5’ untranslated region    parviglumis
exon_mays.arp    exon                      mays
exon_par.arp     exon                      parviglumis

Test A                        Polymorphism within subspecies   Divergence between subspecies
tb1 5’ untranslated region    0.00093                          0.05255
Average of control loci       0.01996                          0.02242
χ² = 13.58, p-value = 0.001

Test B                        Polymorphism within subspecies   Divergence between subspecies
tb1 translated region         0.00243                          0.01273
Average of control loci       0.01996                          0.02242
χ² = 2.70, p-value = 0.26
Synonymous and Nonsynonymous Nucleotide Substitution test
$d_N / d_S$
dN = Observed # nonsynonymous substitutions/nonsynonymous site
dS = Observed # synonymous substitutions/synonymous site
Example:
5’-ATT GTT CAT CGT ACC CAT CGA-3’
5’-ATT GTT CAT CGC ACC CAA CGA-3’
CGT -> CGC (Arg -> Arg): synonymous mutation at a synonymous site
CAT -> CAA (His -> Gln): nonsynonymous mutation at a nonsynonymous site
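Because dN and dS are expressed per nonsynonymous and per synonymous site, the first step is to count the potential sites of each kind in every codon. The sketch below does this by asking, for each of the three codon positions, what fraction of the three possible base changes leaves the amino acid unchanged. The function name is hypothetical, the standard genetic code is assumed, and changes to a stop codon are counted as nonsynonymous, which is a simplifying assumption.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def potential_sites(codon):
    """Return (s_j, n_j): potential synonymous and nonsynonymous sites of a codon."""
    codon = codon.upper()
    amino_acid = CODON_TABLE[codon]
    synonymous_changes = 0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            # Assumption: a change to a stop codon ('*') counts as nonsynonymous
            if CODON_TABLE[mutant] == amino_acid:
                synonymous_changes += 1
    s_j = synonymous_changes / 3.0  # each position contributes at most 1 site
    return s_j, 3.0 - s_j


for codon in ("ATG", "GTT"):
    print(codon, potential_sites(codon))
```

Summing s_j and n_j over all codons of a sequence gives the denominators for dS and dN; ATG has no synonymous sites because methionine is encoded by a single codon.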
Problem 4.
Calculate the ω = dN/dS ratio based on the following 2 DNA sequences:
5’-ATG GTT CAT TTT ACC GGA CGA AGT CGA TTA-3’
5’-ATG GTT CAC TTG ACC GCA CGA AGT AGA TTA-3’

Seq 1
Codon    No. potential synonymous sites (s_j)    No. potential nonsynonymous sites (n_j)
ATG      0                                       3
GTT
Total

Seq 2
Codon    No. potential synonymous sites (s_j)    No. potential nonsynonymous sites (n_j)
ATG      0                                       3
GTT
Total