Transcript Slide 1

Interval Mapping
[Diagram: markers A and B flank the putative QTL Q. r1 spans A to Q, r2 spans Q to B, and r spans A to B.]
• r = r1 + r2 - 2r1r2
Relative position of the QTL in relation to the interval:
• ρ = r1/r
• 1- ρ = r2/r
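The two relations above can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and it assumes no crossover interference, as the formula r = r1 + r2 - 2r1r2 does. Note that r1/r and r2/r sum exactly to 1 only when the double-crossover term 2r1r2 is negligible.

```python
def interval_recombination(r1, r2):
    """Recombination fraction between markers A and B.

    Assumes no interference: r = r1 + r2 - 2*r1*r2.
    """
    return r1 + r2 - 2.0 * r1 * r2


def relative_position(r1, r2):
    """Relative position of the QTL in the interval, rho = r1 / r."""
    r = interval_recombination(r1, r2)
    return r1 / r


if __name__ == "__main__":
    r = interval_recombination(0.10, 0.10)
    rho = relative_position(0.10, 0.10)
    print(f"r = {r:.4f}, rho = {rho:.4f}")
```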
Interval mapping of QTL using backcross progeny
• Parents: AAQQBB x aaqqbb
• Backcross: AaQqBb x AAQQBB
[Diagram: F1 chromosomes A-Q-B and a-q-b, with r1 between A and Q, r2 between Q and B, and r between A and B.]
• BC Progeny and expected frequency (of each genotype):
AAQQBB, AaQqBb    0.5(1 - r)
AAQqBb, AaQQBB    0.5r1
AAQQBb, AaQqBB    0.5r2
AAQqBB, AaQQBb    0
r: recombination fraction between A and B
r1: recombination fraction between A and Q
r2: recombination fraction between Q and B
Interval mapping of QTL using backcross progeny
Marker genotype | Observed count (ni) | Frequency (pi) | Joint frequency (pij), QQ | Joint frequency (pij), Qq
AABB | n1 | 0.5(1-r) | 0.5(1-r) | 0
AABb | n2 | 0.5r | 0.5r2 | 0.5r1
AaBB | n3 | 0.5r | 0.5r1 | 0.5r2
AaBb | n4 | 0.5(1-r) | 0 | 0.5(1-r)
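As a quick consistency check on the joint frequencies, the sketch below tabulates pij and sums each row to recover pi. It is a hypothetical helper, not part of the slides, and it assumes double crossovers are ignored (so r = r1 + r2), which is the assumption under which the double-recombinant classes have frequency 0 and the table sums to 1.

```python
def joint_frequencies(r1, r2):
    """Joint marker/QTL frequencies p_ij for backcross progeny.

    Assumption: double crossovers are ignored, so r = r1 + r2 here.
    """
    r = r1 + r2
    return {
        "AABB": {"QQ": 0.5 * (1 - r), "Qq": 0.0},
        "AABb": {"QQ": 0.5 * r2, "Qq": 0.5 * r1},
        "AaBB": {"QQ": 0.5 * r1, "Qq": 0.5 * r2},
        "AaBb": {"QQ": 0.0, "Qq": 0.5 * (1 - r)},
    }


def marker_frequencies(joint):
    """Marginal marker-class frequencies p_i (row sums of p_ij)."""
    return {marker: sum(row.values()) for marker, row in joint.items()}


if __name__ == "__main__":
    joint = joint_frequencies(0.10, 0.05)
    for marker, p_i in marker_frequencies(joint).items():
        print(marker, round(p_i, 4))
```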
Interval mapping of QTL using backcross progeny
Conditional frequency: p(Qj/Mi) = pij / pi
Marker genotype | Frequency (pi) | p(QQ/Mi) | p(Qq/Mi) | Expected trait value (gi)
AABB | 0.5(1-r) | 1 | 0 | µ1
AABb | 0.5r | r2/r = 1-ρ | r1/r = ρ | (1-ρ)µ1 + ρµ2
AaBB | 0.5r | r1/r = ρ | r2/r = 1-ρ | ρµ1 + (1-ρ)µ2
AaBb | 0.5(1-r) | 0 | 1 | µ2
Mean | | µ1 | µ2 | 0.5(µ1 + µ2)
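The expected trait value of each marker class is a mixture of the two QTL genotype means, weighted by the conditional frequencies. A minimal sketch, with hypothetical function names and made-up numbers for ρ, µ1 and µ2:

```python
def conditional_qtl_probs(rho):
    """(p(QQ|M_i), p(Qq|M_i)) for the four backcross marker classes."""
    return {
        "AABB": (1.0, 0.0),
        "AABb": (1.0 - rho, rho),
        "AaBB": (rho, 1.0 - rho),
        "AaBb": (0.0, 1.0),
    }


def expected_trait_values(rho, mu1, mu2):
    """Expected trait value g_i = p(QQ|M_i)*mu1 + p(Qq|M_i)*mu2."""
    return {
        marker: p_qq * mu1 + p_het * mu2
        for marker, (p_qq, p_het) in conditional_qtl_probs(rho).items()
    }


if __name__ == "__main__":
    # Illustrative values only: QTL a quarter of the way from A to B.
    for marker, g in expected_trait_values(0.25, 10.0, 6.0).items():
        print(marker, g)
```

The recombinant marker classes (AABb, AaBB) are the informative ones: their means shift between µ1 and µ2 as ρ changes, which is what lets the analysis place the QTL within the interval.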
Interval Mapping analysis
[Diagram: markers A and B flanking the putative QTL Q, with r1, r2 and r as defined above.]
- Likelihood approach (Lander and Botstein, 1989)
- Linear regression approach (Knapp et al., 1990)
- Non-linear regression approach (Knapp et al., 1992)
A likelihood approach using backcross progeny
\[ L = \frac{1}{(2\pi\sigma^2)^{N/2}} \prod_{i=1}^{N} \sum_{j=1}^{2} p(Q_j \mid M_i) \exp\left[ -\frac{(y_i - \mu_j)^2}{2\sigma^2} \right] \]
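A likelihood approach using backcross progeny (cont.)
Logarithm of the likelihood of the theoretical distribution:
\[ \ln L = \sum_{i=1}^{N} \ln\left\{ \sum_{j=1}^{2} p(Q_j \mid M_i) \exp\left[ -\frac{(y_i - \mu_j)^2}{2\sigma^2} \right] \right\} - \frac{N}{2} \ln(2\pi\sigma^2) \]
Under the H0: µ1 - µ2 = 0, the logarithm of the likelihood is:
\[ \ln L(\mu_1 = \mu_2 = \mu) = -\frac{1}{2\sigma^2} \sum_{i=1}^{4} \sum_{j=1}^{n_i} (y_{ij} - \mu)^2 - \frac{N}{2} \ln(2\pi\sigma^2) \]
(in the H0 expression, i indexes the four marker classes and j the ni individuals within class i)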
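The log-likelihood on this slide is a two-component normal mixture, with each individual's mixing weights given by p(Qj/Mi). The sketch below evaluates it for fixed parameter values; it does not maximize it (an EM or numerical optimizer would be needed for that). Function and argument names are hypothetical, and the data in the example are made up.

```python
import math


def log_likelihood(y, probs, mu1, mu2, sigma2):
    """Ln(L) of the mixture model for backcross interval mapping.

    y      : trait values, one per individual
    probs  : (p(QQ|M_i), p(Qq|M_i)) for each individual's marker class
    mu1/2  : means of the QQ and Qq genotypes; sigma2 : residual variance
    """
    total = 0.0
    for yi, (p_qq, p_het) in zip(y, probs):
        mix = (p_qq * math.exp(-((yi - mu1) ** 2) / (2.0 * sigma2))
               + p_het * math.exp(-((yi - mu2) ** 2) / (2.0 * sigma2)))
        total += math.log(mix)  # may fail if mix underflows to 0
    return total - 0.5 * len(y) * math.log(2.0 * math.pi * sigma2)


def null_log_likelihood(y, mu, sigma2):
    """Ln(L) under H0: mu1 = mu2 = mu (single normal distribution)."""
    ss = sum((yi - mu) ** 2 for yi in y)
    return -ss / (2.0 * sigma2) - 0.5 * len(y) * math.log(2.0 * math.pi * sigma2)


if __name__ == "__main__":
    y = [1.0, 2.0, 3.0]
    probs = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]
    print(log_likelihood(y, probs, 1.0, 3.0, 1.0))
    print(null_log_likelihood(y, 2.0, 1.0))
```

When µ1 = µ2 the mixture collapses to a single normal density because the two weights sum to 1, so the two functions agree, which matches the H0 expression above.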
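G-statistics
Likelihood ratio test statistics (LR)
Compares the probability of occurrence of the data under the fitted QTL model with that under the null hypothesis:
\[ G = 2\left[ \ln L(\hat{\mu}_1, \hat{\mu}_2, \hat{\sigma}^2, \hat{r}_1) - \ln L(\mu_{Aa} = \mu_{aa} = \mu) \right] \]
Here µAa = µaa = µ denotes the null hypothesis of equal QTL genotype means (µ1 = µ2 in the notation of the preceding slides).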
Likelihood maps
Plot of likelihood-ratio statistics as a function of the map position of the putative QTL
Likelihood ratio test statistic (LR)
Logarithm of odds (LOD score)
LR = 4.605 LOD
LOD = 0.217 LR
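The two conversion factors follow from LR using natural logarithms and LOD using base-10 logarithms: LR = 2 ln(10) × LOD, and 2 ln(10) ≈ 4.605. A small sketch with hypothetical function names:

```python
import math

TWO_LN_10 = 2.0 * math.log(10.0)  # approximately 4.605


def likelihood_ratio(lnl_alt, lnl_null):
    """LR (G) statistic from the two maximized log-likelihoods."""
    return 2.0 * (lnl_alt - lnl_null)


def lr_to_lod(lr):
    """Convert a likelihood-ratio statistic to a LOD score."""
    return lr / TWO_LN_10


def lod_to_lr(lod):
    """Convert a LOD score to a likelihood-ratio statistic."""
    return lod * TWO_LN_10


if __name__ == "__main__":
    print(round(lod_to_lr(1.0), 3), round(lr_to_lod(1.0), 3))
```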
Interval Mapping (by Maximum Likelihood)
Lander and Botstein (1989)
• Tests positions at and between markers (one-dimensional search); interactions between multiple QTL are not considered
• Accounts for mixtures of QTL genotypes in each marker class
• Finds the position that maximizes LR
Interval Mapping
Limitations
• Assumes a single segregating QTL influencing the trait
• The number of QTLs cannot be resolved (linked QTLs?)
• The locations of the QTLs are sometimes not well resolved
and the exact position cannot be determined (QTL
interactions?)
• The statistical power is still relatively low (limited
information in the model)
Composite Interval Mapping (CIM)
Combination of simple interval mapping and multiple linear regression
Genetic markers are incorporated as co-factors in the framework of simple
interval mapping
Co-factor markers partially absorb genetic variance associated with unlinked
and linked markers and greatly increase the power for detecting QTL and
estimating QTL positions
Zeng (1994), Jansen and Stam (1994)
Composite Interval Mapping (CIM)
Problems with CIM:
• Highly dependent on background markers
• Permutations to calculate the threshold are slow (QTL Cartographer)
Mapping QTL in multiple environments
• QTL positions are assumed to be constant
• QTL effects may vary by environment
• Need to test multiple environments to know this
• This adds complexity to the QTL analysis
• One solution: separate analyses
Mapping QTL in multiple environments
Better: analysis across environments
Software: NQTL
Multiple interval mapping (MIM)
- MIM uses multiple marker intervals simultaneously to fit multiple
putative QTL directly in the model for mapping QTL
- MIM is based on Cockerham’s model for interpreting genetic
parameters
- MIM is based on maximum likelihood for estimating genetic parameters
- Identifies: number of QTL, genomic positions, effects and interactions
of significant QTL and their contribution to genetic variance
Issues in QTL detection
• False positives / false negatives
• Significance thresholds
• Confidence interval
• Population size and marker density
• QTL resolution
• What are QTLs?
• QTL software
• Applications of QTL mapping
False positives / false negatives
• H0: no association between markers and QTL
Type 1 error: false positives
Type 2 error: false negatives
The probability of a false positive (the significance level) is controlled by choosing an appropriate significance threshold
Significance Thresholds by Permutation
Churchill and Doerge, 1994
1. Permute the data (this creates the null hypothesis)
H0: there is no QTL in the tested interval
H1: there is a QTL in the tested interval
2. Perform interval mapping
3. Repeat (1) and (2) many times
4. Choose threshold
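The four steps can be sketched as below. This is only an illustration under stated assumptions: to keep it self-contained, step 2 uses a simple single-marker likelihood-ratio scan (`marker_lr`, a hypothetical stand-in) rather than full interval mapping, markers are coded 0/1, and the threshold is taken as the (1 - alpha) quantile of the genome-wide maximum statistic across permutations.

```python
import math
import random


def marker_lr(y, g):
    """LR statistic for a two-class marker: N * ln(RSS0 / RSS1)."""
    n = len(y)
    mean_all = sum(y) / n
    rss0 = sum((v - mean_all) ** 2 for v in y)
    rss1 = 0.0
    for cls in (0, 1):
        grp = [v for v, gi in zip(y, g) if gi == cls]
        if grp:
            m = sum(grp) / len(grp)
            rss1 += sum((v - m) ** 2 for v in grp)
    return n * math.log(rss0 / rss1)


def permutation_threshold(y, genotypes, n_perm=1000, alpha=0.05, seed=0):
    """Empirical genome-wide threshold from permuted phenotypes.

    genotypes: list of markers, each a list of 0/1 codes per individual.
    """
    rng = random.Random(seed)
    y_perm = list(y)
    maxima = []
    for _ in range(n_perm):                      # step 3: repeat
        rng.shuffle(y_perm)                      # step 1: permute the data
        maxima.append(max(marker_lr(y_perm, g) for g in genotypes))  # step 2
    maxima.sort()
    idx = min(n_perm - 1, int(math.ceil((1.0 - alpha) * n_perm)) - 1)
    return maxima[idx]                           # step 4: choose threshold


if __name__ == "__main__":
    rng = random.Random(1)
    y = [rng.gauss(0.0, 1.0) for _ in range(40)]
    genotypes = [[rng.randint(0, 1) for _ in range(40)] for _ in range(5)]
    print(permutation_threshold(y, genotypes, n_perm=200))
```

Shuffling trait values while leaving the genotypes untouched breaks any marker-trait association but preserves the marker map and the trait distribution, which is why the permuted data represent H0.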
Significance thresholds by empirical calculations
QTL threshold calculations
Piepho, 2001; Van Ooijen, 1999
LOD scores of 2.5 to 3.5 are commonly used
A LOD score of 2 indicates that the model containing the estimated QTL effect is 10^2 = 100 times more likely than the model with no QTL effect.
Confidence interval for QTL positions
Typically, confidence intervals for QTL positions have been based on
a “1-LOD support interval” or a “2-LOD support interval”
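[Figure: example of a “1-LOD support interval”. The LOD profile has a maximum value of 4.2; the interval is bounded by the two positions, marked $ and *, where the profile falls to 3.2. Size of the interval: (cM at *) – (cM at $)]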
As Kearsey and Farquhar (1998) note, most QTL confidence intervals are
quite large, typically on the order of 20 cM.
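A support interval can be read off a LOD profile as sketched below. The function is hypothetical and works on the scan grid only: it returns the outermost grid positions around the peak where the LOD is still within `drop` of the maximum, without interpolating between grid points. The profile in the example is made up, with a peak of 4.2 to mirror the slide.

```python
def lod_support_interval(positions, lods, drop=1.0):
    """(left_cM, right_cM) around the peak where LOD >= max(LOD) - drop."""
    peak = max(range(len(lods)), key=lods.__getitem__)
    cutoff = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[right]


if __name__ == "__main__":
    positions = [0, 5, 10, 15, 20, 25, 30]       # cM, illustrative grid
    lods = [1.0, 2.5, 3.4, 4.2, 3.6, 3.0, 1.2]   # illustrative LOD profile
    left, right = lod_support_interval(positions, lods, drop=1.0)
    print(left, right, right - left)
```

Passing `drop=2.0` gives the 2-LOD support interval in the same way.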
Population size and marker density
- For an initial QTL study, generate a “framework” DNA marker map with loci sampled every 20-30 cM.
- Approaches for reducing the size of a mapping population:
-- Random sampling
-- Selective genotyping
-- Selective phenotyping
- The number of individuals needed for mapping might depend on the
heritability of the trait
- Replication and environments vs. Population size
- Relative costs for genotypic and phenotypic analyses
Approaches for reducing population size
• Random sampling: data sets
• Selective genotyping: genotyping and analysis
[Figure: histogram of number of lines (0 to 35) by disease severity (%, 0 to 100), The Toluca Valley, Mexico (TVM). Marked values: B47 (14.6%), Bar (78.1%).]
• Selective phenotyping: phenotyping and analysis
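One common form of selective genotyping is to genotype only the lines in the two phenotypic tails. The sketch below shows that selection step only; it is an assumption about how the subset might be chosen, not the procedure used for the data set in the figure, and the function name, the `fraction` parameter and the severity values are hypothetical.

```python
def select_extremes(phenotypes, fraction=0.2):
    """Indices of lines in the lower and upper phenotypic tails.

    fraction is the total share of the population to genotype,
    split evenly between the two tails.
    """
    n = len(phenotypes)
    k = max(1, int(round(n * fraction / 2.0)))
    order = sorted(range(n), key=lambda i: phenotypes[i])
    return order[:k], order[-k:]


if __name__ == "__main__":
    severity = [12.0, 80.5, 45.0, 5.5, 92.0, 33.0, 60.0, 18.0, 71.0, 50.0]
    low, high = select_extremes(severity, fraction=0.4)
    print(low, high)
```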
Effect of population size on the estimation of QTL number
[Figure: number of QTL (0 to 10) against number of DH lines (409, 300, 200, 150, 100, 50), The Toluca Valley, Mexico (TVM). Series: whole population, selective genotyping, selective phenotyping, random sampling DS.]
Trait: Barley stripe rust (Puccinia striiformis f. sp. hordei) severity
Vales et al., 2005
Effect of population size on the estimation of percentage of phenotypic variance explained
[Figure: % phenotypic variance (0 to 100) against number of DH lines (409, 300, 200, 150, 100, 50), The Toluca Valley, Mexico (TVM). Series: whole population, selective genotyping, selective phenotyping, random sampling DS.]
Trait: Barley stripe rust (Puccinia striiformis f. sp. hordei) severity
Vales et al., 2005
Fine mapping and QTL cloning
- Near isogenic lines
- Positional cloning
- Development of large mapping populations
- Transposon tagging
- Fine-map of the regions of interest (more markers, larger
population, more recombination)
- Candidate gene
Positional cloning approach. Paterson (1998)
Dissecting QTL using microarrays
Genetical Genomics – Jansen and Nap, 2001
ELP: expression level polymorphism (Doerge 2002)
QTL ‘eQTL’ analysis
Microarray ‘eTraits’
http://www.med.upenn.edu/microarr/array.html
Limitations:
- Changes in gene expression may not be attributable to allelic variation
- Only detect changes in expression level
- Availability of arrays
QTL analysis software
Method (reference) | Program | Significance
Interval mapping (Lander and Botstein 1989) | MAPMAKER/QTL | Localizes QTL into marker intervals
Composite interval mapping (Zeng 1994; Basten et al. 2001) | CIM in QTL CARTOGRAPHER | Employs adjacent markers and/or other QTL as cofactors
Multiple interval mapping (Zeng et al. 1999) | MIM in QTL CARTOGRAPHER | Searches for multiple QTL simultaneously, using a single test of a chromosome.
Bayesian interval mapping (Satagopan et al. 1996) | BMAPQTL, also BIM in QTL CARTOGRAPHER | Can estimate QTL effect and position separately; use of a prior may improve power.
Outbred QTL (Seaton et al. 2002) | QTL EXPRESS | Least squares regression in some outbred crossing designs including sibs and pedigrees.
Interval mapping in autotetraploids (Hackett and Luo, 2003) | TETRAPLOIDMAP | Localizes QTL into marker intervals. Takes into account tetrasomic inheritance and double reduction
For more information, check: http://www.stat.wisc.edu/~yandell/qtl/software/
QTL databases
Gramene QTL Database
http://www.gramene.org/qtl/index.html
QTL for agronomic traits in rice, maize, barley, oat, sorghum, pearl millet, foxtail millet and wild rice.
QTL databases
Graingenes:
http://www.graingenes.org/cgi-bin/ace/browse/graingenes?class=QTL
http://rye.pw.usda.gov/cmap/
MaizeGDB:
http://www.maizegdb.org/qtl.php
Summary of published barley QTL reports:
http://barleyworld.org/northamericanbarley/qtlsummary.php
Use of QTL information in improvement programs
• Genotype building
- Pyramiding
- major genes
- minor genes
- major and minor genes
• Introgression
• Recurrent selection
• Crossbreeding or hybrid production
- Choice of breeds or lines to cross
Marker assisted selection
• Marker-assisted selection
• Marker-aided selection
– Positive selection
– Negative selection
• Selection by design
Marker assisted selection
Pyramiding: BCD populations
[Figure: pedigree diagram of the BCD populations. Labels shown: Cali-sib x Bowman, Shyri x Galena, BSR-45, CI10587 x Galena, D1-72, Harrington, D3-6, Baronesse, Orca, BCD, BCD47, AJO, DB, BCD12, D3-6/B23, BU, OPS, D3-6/B61.]
Marker assisted selection
Introgression, Backcrossing and Pyramiding
• BISON: Baronesse near isogenic lines
– BISON 1H, BISON 4H, BISON 5H, and combinations
• For research purposes
– Evaluation of disease components
• Lesion size, sporulation rate, pustule density
• For breeding purposes
[Figure labels: Baronesse, BISON 1H]
Richardson et al. 2006