Transcript Slide 1
Interval Mapping
[Diagram: marker A, putative QTL Q, and marker B, in that order; r1 spans A to Q, r2 spans Q to B, and r spans A to B]
• $r = r_1 + r_2 - 2 r_1 r_2$
Relative position of the QTL in relation to the interval:
• $\rho = r_1 / r$
• $1 - \rho = r_2 / r$

Interval mapping of QTL using backcross progeny
• Parents: AAQQBB x aaqqbb
• Backcross: AaQqBb x AAQQBB
• BC progeny and expected frequency of each genotype:

BC progeny        Expected frequency
AAQQBB, AaQqBb    0.5(1 - r)
AAQqBb, AaQQBB    0.5r1
AAQQBb, AaQqBB    0.5r2
AAQqBB, AaQQBb    0

Double recombinants (last row) are assumed to be absent, so r is approximately r1 + r2 in the tables that follow.
r: recombination fraction between A and B
r1: recombination fraction between A and Q
r2: recombination fraction between Q and B

Interval mapping of QTL using backcross progeny

Marker     Observed     Frequency    Joint frequencies (pij)
genotype   count (ni)   (pi)         QQ           Qq
AABB       n1           0.5(1-r)     0.5(1-r)     0
AABb       n2           0.5r         0.5r2        0.5r1
AaBB       n3           0.5r         0.5r1        0.5r2
AaBb       n4           0.5(1-r)     0            0.5(1-r)

Interval mapping of QTL using backcross progeny
Conditional frequency: p(Qj|Mi) = pij / pi

Marker     Frequency    Conditional frequency p(Qj|Mi)   Expected trait
genotype   (pi)         QQ             Qq                value (gi)
AABB       0.5(1-r)     1              0                 µ1
AABb       0.5r         r2/r = 1-ρ     r1/r = ρ          (1-ρ)µ1 + ρµ2
AaBB       0.5r         r1/r = ρ       r2/r = 1-ρ        ρµ1 + (1-ρ)µ2
AaBb       0.5(1-r)     0              1                 µ2
Mean                    µ1             µ2                0.5(µ1 + µ2)

Interval Mapping analysis
- Likelihood approach (Lander and Botstein, 1989)
- Linear regression approach (Knapp et al., 1990)
- Non-linear regression approach (Knapp et al., 1992)

A likelihood approach using backcross progeny
$$L = \left(2\pi\sigma^2\right)^{-N/2} \prod_{i=1}^{N} \sum_{j=1}^{2} p(Q_j \mid M_i) \exp\left[-\frac{(y_i - \mu_j)^2}{2\sigma^2}\right]$$
Here y_i is the trait value of individual i, N is the number of individuals, µ_j is the mean of QTL genotype j (1 = QQ, 2 = Qq), and σ² is the residual variance.
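As a rough illustration of the likelihood approach for backcross progeny, the following Python sketch evaluates the conditional probabilities p(Qj|Mi) from the table and the likelihood on the log scale for one candidate QTL position. The function names, the data layout (a list of (marker genotype, trait value) pairs), and the example numbers are assumptions made for illustration; they do not come from the slides or from any QTL package.

```python
import math


def conditional_qtl_probs(r1, r):
    """Return p(QQ | marker class) and p(Qq | marker class) for a backcross.

    r1 is the recombination fraction between marker A and the QTL, and r is
    the recombination fraction between markers A and B. Double recombinants
    are assumed absent, so the two probabilities in each class sum to one.
    """
    rho = r1 / r  # relative position of the QTL within the interval
    return {
        "AABB": (1.0, 0.0),
        "AABb": (1.0 - rho, rho),
        "AaBB": (rho, 1.0 - rho),
        "AaBb": (0.0, 1.0),
    }


def log_likelihood(data, r1, r, mu1, mu2, sigma2):
    """Log-likelihood of the two-component normal mixture.

    data is a list of (marker_genotype, trait_value) pairs (a hypothetical
    layout). mu1 and mu2 are the QQ and Qq means; sigma2 is the residual
    variance. Extreme trait values could underflow the mixture term to zero;
    a production version would need to guard against that.
    """
    probs = conditional_qtl_probs(r1, r)
    total = -0.5 * len(data) * math.log(2.0 * math.pi * sigma2)
    for marker, y in data:
        p_hom, p_het = probs[marker]
        mixture = (p_hom * math.exp(-((y - mu1) ** 2) / (2.0 * sigma2))
                   + p_het * math.exp(-((y - mu2) ** 2) / (2.0 * sigma2)))
        total += math.log(mixture)
    return total


if __name__ == "__main__":
    # Made-up observations, one per marker class, purely for demonstration.
    example = [("AABB", 10.2), ("AABb", 10.9), ("AaBB", 11.4), ("AaBb", 12.1)]
    print(log_likelihood(example, r1=0.05, r=0.2, mu1=10.0, mu2=12.0, sigma2=1.0))
```

In practice the means, the variance, and r1 would be estimated by maximizing this function (for example with an EM algorithm), and the evaluation would be repeated at each candidate position along the interval; the sketch only shows a single evaluation.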
Logarithm of the likelihood of the theoretical distribution:
$$\ln L = \sum_{i=1}^{N} \ln\left\{ \sum_{j=1}^{2} p(Q_j \mid M_i) \exp\left[-\frac{(y_i - \mu_j)^2}{2\sigma^2}\right] \right\} - \frac{N}{2}\ln\left(2\pi\sigma^2\right)$$
Under H0: µ1 - µ2 = 0, the logarithm of the likelihood is:
$$\ln L(\mu_1 = \mu_2) = -\frac{1}{2\sigma^2} \sum_{i=1}^{4} \sum_{j=1}^{n_i} (y_{ij} - \mu)^2 - \frac{N}{2}\ln\left(2\pi\sigma^2\right)$$
where y_ij is the trait value of individual j in marker class i and µ is the common mean.

G-statistics
Likelihood ratio test statistic (LR):
$$G = 2\left[\ln L(\hat{\mu}_1, \hat{\mu}_2, \hat{\sigma}^2, \hat{r}_1) - \ln L(\mu_1 = \mu_2)\right]$$
Here L(µ1 = µ2) is the probability of occurrence of the data under the null hypothesis.

Likelihood maps
Plot of the likelihood-ratio statistic as a function of the map position of the putative QTL.
The likelihood ratio test statistic (LR) and the logarithm of odds (LOD score) are related by a factor of 2 ln(10):
• LR = 4.605 LOD
• LOD = 0.217 LR

Interval Mapping (by Maximum Likelihood)
Lander and Botstein (1989)
• Tests positions at and between markers. The search is one-dimensional: interactions between multiple QTL are not considered
• Accounts for mixtures of QTL genotypes in each marker class
• Finds the position that maximizes LR

Interval Mapping Limitations
• Assumes a single segregating QTL influencing the trait
• The number of QTLs cannot be resolved (linked QTLs?)
• The locations of the QTLs are sometimes not well resolved and the exact position cannot be determined (QTL interactions?)
• The statistical power is still relatively low (limited information in the model)

Composite Interval Mapping (CIM)
Zeng (1994), Jansen and Stam (1994)
• Combination of simple interval mapping and multiple linear regression
• Genetic markers are incorporated as co-factors in the framework of simple interval mapping
• Co-factor markers partially absorb genetic variance associated with unlinked and linked markers and greatly increase the power for detecting QTL and estimating QTL positions

Composite Interval Mapping (CIM)
Problems with CIM:
• Highly dependent on background markers
• Permutations to calculate the threshold are slow (QTL Cartographer)

Mapping QTL in multiple environments
• QTL positions are assumed to be constant
• QTL effects may vary by environment
• Multiple environments need to be tested to know this
• This adds complexity to the QTL analysis
• One solution: separate analyses

Mapping QTL in multiple environments
• Better: analysis across environments
• Software: NQTL

Multiple interval mapping (MIM)
- MIM uses multiple marker intervals simultaneously to fit multiple putative QTL directly in the model for mapping QTL
- MIM is based on Cockerham's model for interpreting genetic parameters
- MIM is based on maximum likelihood for estimating genetic parameters
- Identifies: number of QTL, genomic positions, effects and interactions of significant QTL, and their contribution to genetic variance

Issues in QTL detection
• False positives / false negatives
• Significance thresholds
• Confidence interval
• Population size and marker density
• QTL resolution
• What are QTLs?
• QTL software
• Applications of QTL mapping

False positives / false negatives
• H0: no association between markers and QTL
• Type 1 error: false positives
• Type 2 error: false negatives
• The probability of false positives (the significance level) is controlled by choosing an appropriate significance threshold

Significance Thresholds by Permutation
Churchill and Doerge, 1994
1. Permute the data (create the null hypothesis)
   H0: there is no QTL in the tested interval
   H1: there is a QTL in the tested interval
2. Perform interval mapping
3. Repeat (1) and (2) many times
4. Choose the threshold

Significance thresholds by empirical calculations
• QTL threshold calculations: Piepho, 2001; Van Ooijen, 1999
• LOD scores of 2.5 to 3.5 are commonly used
• A LOD score of 2 indicates that the model containing the estimated QTL effect is 10^2 = 100 times more likely than the model with no QTL effect

Confidence interval for QTL positions
Typically, confidence intervals for QTL positions have been based on a "1-LOD support interval" or a "2-LOD support interval".
[Figure: example of a "1-LOD support interval". The LOD profile has a maximum of 4.2, and the interval is bounded by the two positions, marked $ and *, where the profile falls to 3.2. Size of the interval: (cM at *) - (cM at $)]
As Kearsey and Farquhar (1998) note, most QTL confidence intervals are quite large, typically on the order of 20 cM.

Population size and marker density
- For an initial QTL study, generate a "framework" DNA marker map with loci sampled every 20-30 cM
- Approaches for reducing the size of a mapping population:
  -- Random sampling
  -- Selective genotyping
  -- Selective phenotyping
- The number of individuals needed for mapping might depend on the heritability of the trait
- Replication and environments vs.
Population size
- Relative costs for genotypic and phenotypic analyses

Approaches for reducing population size
[Figure: histogram of the number of lines by disease severity (%) for the Toluca Valley, Mexico (TVM) data set, with B47 (14.6%) and Bar (78.1%) marked. Three approaches are illustrated: random sampling of data sets; selective genotyping (genotyping and analysis); selective phenotyping (phenotyping and analysis)]

Effect of population size on the estimation of QTL number
[Figure: number of QTL (0 to 10) against number of DH lines (409, 300, 200, 150, 100, 50) for the whole population, selective genotyping, selective phenotyping, and random sampling. Data set: Toluca Valley, Mexico (TVM). Trait: barley stripe rust (Puccinia striiformis f. sp. hordei) severity (DS). Vales et al., 2005]

Effect of population size on the estimation of percentage of phenotypic variance explained
[Figure: % phenotypic variance (0 to 100) against number of DH lines (409, 300, 200, 150, 100, 50) for the whole population, selective genotyping, selective phenotyping, and random sampling. Data set: Toluca Valley, Mexico (TVM). Trait: barley stripe rust (Puccinia striiformis f. sp. hordei) severity (DS). Vales et al., 2005]

Fine mapping and QTL cloning
- Near isogenic lines
- Positional cloning
- Development of large mapping populations
- Transposon tagging
- Fine mapping of the regions of interest (more markers, larger population, more recombination)
- Candidate gene
[Figure: positional cloning approach. Paterson (1998)]

Dissecting QTL using microarrays
Genetical genomics (Jansen and Nap, 2001)
ELP: expression level polymorphism (Doerge 2002)
[Diagram labels: microarray, 'eTraits', QTL analysis, 'eQTL'] (http://www.med.upenn.edu/microarr/array.html)
Limitations:
- Changes in gene expression may not be attributable to allelic variation
- Only changes in expression level are detected
- Availability of arrays

QTL analysis software

Method (reference): Interval mapping (Lander and Botstein 1989)
Program: MAPMAKER/QTL
Significance: Localizes QTL into marker intervals

Method (reference): Composite interval mapping (Zeng 1994; Basten et al.
2001)
Program: CIM in QTL CARTOGRAPHER
Significance: Employs adjacent markers and/or other QTL as cofactors

Method (reference): Multiple interval mapping (Zeng et al. 1999)
Program: MIM in QTL CARTOGRAPHER
Significance: Searches for multiple QTL simultaneously, using a single test of a chromosome

Method (reference): Bayesian interval mapping (Satagopan et al. 1996)
Program: BMAPQTL, also BIM in QTL CARTOGRAPHER
Significance: Can estimate QTL effect and position separately; use of a prior may improve power

Method (reference): Outbred QTL (Seaton et al. 2002)
Program: QTL EXPRESS
Significance: Least squares regression in some outbred crossing designs, including sibs and pedigrees

Method (reference): Interval mapping in autotetraploids (Hackett and Luo, 2003)
Program: TETRAPLOIDMAP
Significance: Localizes QTL into marker intervals. Takes into account tetrasomic inheritance and double reduction

For more information, check: http://www.stat.wisc.edu/~yandell/qtl/software/

QTL databases
Gramene QTL Database: http://www.gramene.org/qtl/index.html
QTL for agronomic traits in rice, maize, barley, oat, sorghum, pearl millet, foxtail millet and wild rice.

QTL databases
GrainGenes: http://www.graingenes.org/cgi-bin/ace/browse/graingenes?class=QTL
            http://rye.pw.usda.gov/cmap/
MaizeGDB: http://www.maizegdb.org/qtl.php
Summary of published barley QTL reports: http://barleyworld.org/northamericanbarley/qtlsummary.php

Use of QTL information in improvement programs
• Genotype building
  - Pyramiding
    - major genes
    - minor genes
    - major and minor genes
• Introgression
• Recurrent selection
• Crossbreeding or hybrid production
  - Choice of breeds or lines to cross

Marker assisted selection
• Marker-assisted selection
• Marker-aided selection
  - Positive selection
  - Negative selection
• Selection by design

Marker assisted selection
Pyramiding: BCD populations
[Diagram of the BCD populations. Labels: Cali-sib x Bowman, Shyri x Galena, BSR-45, CI10587 x Galena, D1-72, Harrington, D3-6, Baronesse, Orca, BCD, BCD47, AJO, DB, BCD12, D3-6/B23, BU, OPS, D3-6/B61]

Marker assisted selection
Introgression, Backcrossing and Pyramiding
[Figure labels: Baronesse, 1H]
• BISON: Baronesse near isogenic lines
  - BISON 1H, BISON 4H, BISON 5H, and combinations
• For research purposes:
Evaluation of disease components (lesion size, sporulation rate, pustule density)
• For breeding purposes
[Figure label: BISON 1H]
Richardson et al. 2006