Implementation Challenges in Marker Assisted Selection


Software for Incorporating Marker Data in Genetic Evaluations
Kathy Hanford
U.S. Meat Animal Research Center
Agricultural Research Service
U.S. Department of Agriculture

Outline
• Introduction
• Mixed Models Incorporating Random QTL Effects
• Current/Future Modifications to MTDFREMLQ
• Practical Limitations of MTDFREMLQ
• Applications

Introduction
• Genetic evaluation
  – genetic improvement of quantitative traits through selection
  – currently uses a polygenic model: genes at many loci, each with a small effect
  – measures the cumulative effect
  – analysis with mixed models; software is available
• Add genomic information

Introduction
• Two phases in the application of genomic data to livestock improvement
  1) Statistical analysis of genomic information to determine the potential importance of that information (i.e., use of genetic markers to quantify the effects of QTL on traits of economic importance)
  2) Inclusion of marker information in the genetic evaluation of potential parents to determine which will have the best progeny (Marker Assisted Selection)

Introduction
• QTL identification
  – methods are needed for outbred populations
  – daughter and granddaughter designs
    – many half-sib families, with QTL effects estimated for each half-sib family
  – Fernando and Grossman
    – works with the outbred population as a whole, using both pedigree and marker information; needs complete marker data
  – other methods
    – such as MCMC, which have primarily been used only in simulations

Mixed Model Incorporating Random QTL Effects
$$y = X\beta + Z_u u + Z_v v + e$$
$$\mathrm{Var}(u) = A \otimes G, \quad \mathrm{Var}(v) = M \otimes V, \quad \mathrm{Var}(e) = R\,\sigma_e^2$$
v: 2n × 1 vector of QTL allelic effects
($a_i = v_i^p + v_i^m + u_i$, $v_i = [v_i^p, v_i^m]'$)

BLUP equations for Fernando and Grossman model
$$
\begin{bmatrix}
X'R^{-1}X & X'R^{-1}Z_u & X'R^{-1}Z_v \\
Z_u'R^{-1}X & Z_u'R^{-1}Z_u + A^{-1} \otimes G^{-1} & Z_u'R^{-1}Z_v \\
Z_v'R^{-1}X & Z_v'R^{-1}Z_u & Z_v'R^{-1}Z_v + M^{-1} \otimes V^{-1}
\end{bmatrix}
\begin{bmatrix} \hat{\beta} \\ \hat{u} \\ \hat{v} \end{bmatrix}
=
\begin{bmatrix} X'R^{-1}y \\ Z_u'R^{-1}y \\ Z_v'R^{-1}y \end{bmatrix}
$$

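As a rough illustration of how these equations are assembled and solved, the sketch below builds them for a deliberately tiny case. Everything specific here is an assumption made for the example, not part of MTDFREMLQ: a single trait, an overall mean as the only fixed effect, three unrelated non-inbred animals with one record each (so A and M are identity matrices), and hypothetical records and variance ratios. With a single trait and uncorrelated residuals, multiplying the equations through by the residual variance replaces $G^{-1}$ and $V^{-1}$ with the ratios `lam_u` and `lam_v`.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def build_mme(y, lam_u, lam_v):
    """Mixed model equations for: mean + polygenic effect + two QTL alleles.

    Assumes A = I and M = I (unrelated, non-inbred animals), one record each.
    lam_u = var_e / var_u, lam_v = var_e / var_v (hypothetical ratios).
    """
    n = len(y)
    size = 1 + n + 2 * n            # mean, n polygenic, 2n gametic effects
    design = []
    for i in range(n):
        row = [0.0] * size
        row[0] = 1.0                # overall mean (X)
        row[1 + i] = 1.0            # polygenic effect u_i (Z_u)
        row[1 + n + 2 * i] = 1.0    # paternal QTL allele (Z_v)
        row[1 + n + 2 * i + 1] = 1.0  # maternal QTL allele (Z_v)
        design.append(row)
    lhs = [[sum(design[k][r] * design[k][c] for k in range(n))
            for c in range(size)] for r in range(size)]
    rhs = [sum(design[k][r] * y[k] for k in range(n)) for r in range(size)]
    for i in range(n):
        lhs[1 + i][1 + i] += lam_u              # A^{-1} * lam_u with A = I
    for j in range(2 * n):
        lhs[1 + n + j][1 + n + j] += lam_v      # M^{-1} * lam_v with M = I
    return lhs, rhs


y = [10.0, 12.0, 14.0]                          # hypothetical records
lhs, rhs = build_mme(y, lam_u=2.0, lam_v=8.0)
sol = solve(lhs, rhs)
n = len(y)
mean, u_hat, v_hat = sol[0], sol[1:1 + n], sol[1 + n:]
# Total additive value: a_i = v_i^p + v_i^m + u_i
a_hat = [v_hat[2 * i] + v_hat[2 * i + 1] + u_hat[i] for i in range(n)]
```

In a real evaluation the identity matrices would be replaced by the sparse inverses of A and M, and a dense solver like this one would not scale; it is used here only to keep the example self-contained.
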
Numerator Relationship Matrix (A)
• Based on the probability that alleles are identical by descent (IBD)
• The additive relationship between two half sibs is 0.25
• Need the inverse of A
  – depends on pedigree information
  – computed directly (Henderson, 1976)
  – relatively few nonzero elements (sparse)

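The half-sib value of 0.25 can be checked with the standard tabular method for building A from a pedigree. The following is a minimal sketch, not the MTDFNRM implementation; the function name, the tuple layout, and the five-animal pedigree are hypothetical and chosen only for illustration.

```python
def numerator_relationship(pedigree):
    """Build A by the tabular method.

    pedigree: list of (animal, sire, dam) with parents listed before their
    offspring; None marks an unknown parent.
    Returns the dense matrix A and a name -> row index map.
    """
    index = {animal: k for k, (animal, _, _) in enumerate(pedigree)}
    n = len(pedigree)
    a = [[0.0] * n for _ in range(n)]
    for i, (_, sire, dam) in enumerate(pedigree):
        s = index.get(sire)   # None when the sire is unknown
        d = index.get(dam)    # None when the dam is unknown
        for j in range(i):
            # Relationship with an earlier animal is the average of that
            # animal's relationships with the two parents.
            rel = 0.0
            if s is not None:
                rel += 0.5 * a[j][s]
            if d is not None:
                rel += 0.5 * a[j][d]
            a[i][j] = a[j][i] = rel
        # Diagonal is 1 plus the inbreeding coefficient.
        if s is not None and d is not None:
            a[i][i] = 1.0 + 0.5 * a[s][d]
        else:
            a[i][i] = 1.0
    return a, index


# Hypothetical pedigree: one sire mated to two unrelated dams.
pedigree = [
    ("S", None, None),
    ("D1", None, None),
    ("D2", None, None),
    ("H1", "S", "D1"),
    ("H2", "S", "D2"),
]
a, index = numerator_relationship(pedigree)
half_sib_relationship = a[index["H1"]][index["H2"]]   # 0.25
```
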
Gametic (QTL) Correlation Matrix (M)
• The probability that QTL alleles are IBD
• Need the inverse of M
  – depends on pedigree information
  – depends on the probabilities that QTL alleles are IBD
  – computed directly if marker information is complete (Abdel-Azim and Freeman, 2001)

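To make the dependence on IBD probabilities concrete, here is a small sketch of the recursive (tabular) construction of M itself, in the spirit of the Fernando and Grossman approach. It is not the direct inverse of Abdel-Azim and Freeman and not MTDFNRMQ code. The record layout and the probabilities are hypothetical: for each animal, `p_sire_pat` is the assumed probability that its paternal QTL allele is a copy of the sire's paternal allele, and `p_dam_pat` is the analogous probability for the dam. In practice such values would come from an IBD probability program.

```python
def gametic_relationship(pedigree):
    """Build the gametic relationship matrix M recursively.

    pedigree: list of (animal, sire, dam, p_sire_pat, p_dam_pat), parents
    listed before offspring; None marks an unknown parent.
    Animal k owns gametes 2k (paternal allele) and 2k + 1 (maternal allele).
    """
    index = {rec[0]: k for k, rec in enumerate(pedigree)}
    size = 2 * len(pedigree)
    m = [[0.0] * size for _ in range(size)]
    for i, (_, sire, dam, p_sire_pat, p_dam_pat) in enumerate(pedigree):
        for offset, parent, p_pat in ((0, sire, p_sire_pat),
                                      (1, dam, p_dam_pat)):
            g = 2 * i + offset
            if parent is not None:
                pp = 2 * index[parent]   # parent's paternal allele
                pm = pp + 1              # parent's maternal allele
                for h in range(g):
                    # IBD probability with any earlier gamete is a weighted
                    # average over the two parental alleles.
                    value = p_pat * m[pp][h] + (1.0 - p_pat) * m[pm][h]
                    m[g][h] = m[h][g] = value
            m[g][g] = 1.0                # an allele is IBD with itself
    return m, index


# Hypothetical example: two paternal half sibs with informative markers.
pedigree = [
    ("S", None, None, None, None),
    ("D1", None, None, None, None),
    ("D2", None, None, None, None),
    ("H1", "S", "D1", 0.9, 0.5),
    ("H2", "S", "D2", 0.9, 0.5),
]
m, index = gametic_relationship(pedigree)
h1_pat, h2_pat = 2 * index["H1"], 2 * index["H2"]
ibd_between_paternal_alleles = m[h1_pat][h2_pat]   # 0.9*0.9 + 0.1*0.1
```

With no marker information (both probabilities 0.5) the same entry falls back to 0.5, which is the pedigree-only expectation for paternal alleles of paternal half sibs.
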
Practical Issues in Calculating the QTL Correlation Matrix
• Outbred population
• Sparse marker information
  – individuals with missing or incomplete marker data
  – some of the marker data will be incorrect
• Large complex pedigrees (inbreeding and loops)

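Complex Pedigree
[Figure: pedigree diagram of individuals A, B, C, D and E; three individuals are labeled with marker genotype A1A2 and one with an unknown genotype (??)]
Software
• MCMC
  – LOKI
• DET
  – Pong-Wong, et al.
• Allelic Peeling
  – GenoProb
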
Size Considerations
• Each additional QTL increases the number of equations by 2 times the number of animals in the pedigree
• Sparse matrix storage
  – only nonzero elements are stored
  – polygenic (A⁻¹) grows by 4 times the number of animals
  – gametic (M⁻¹) grows by 15 times the number of animals

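The two ideas on this slide, the equation count and storing only nonzero elements, can be sketched as follows. This is an illustration under stated assumptions rather than the MTDFREMLQ storage scheme: `n_random_equations` assumes the count simply scales with the number of traits, and `a_inverse_sparse` applies Henderson's rules for a non-inbred pedigree, keeping only the lower triangle in a dictionary. All names and the small pedigree are hypothetical.

```python
def n_random_equations(n_animals, n_qtl, n_traits=1):
    """Random-effect equations: one polygenic effect plus two allelic
    effects per QTL for every animal (assumed to repeat for each trait)."""
    return n_traits * n_animals * (1 + 2 * n_qtl)


def a_inverse_sparse(pedigree):
    """Inverse of A by Henderson's rules, assuming no inbreeding.

    pedigree: list of (animal, sire, dam), parents listed before offspring;
    None marks an unknown parent.
    Returns {(row, col): value} with row >= col (lower triangle only).
    """
    index = {animal: k for k, (animal, _, _) in enumerate(pedigree)}
    ainv = {}

    def add(r, c, value):
        key = (max(r, c), min(r, c))
        ainv[key] = ainv.get(key, 0.0) + value

    for i, (_, sire, dam) in enumerate(pedigree):
        parents = [index[p] for p in (sire, dam) if p is not None]
        d = 4.0 / (4.0 - len(parents))   # 1, 4/3 or 2 for 0, 1 or 2 parents
        add(i, i, d)
        for p in parents:
            add(i, p, -d / 2.0)          # animal-by-parent elements
        for p in parents:
            for q in parents:
                if p >= q:
                    add(p, q, d / 4.0)   # parent-by-parent elements
    return ainv


pedigree = [
    ("S", None, None),
    ("D1", None, None),
    ("D2", None, None),
    ("H1", "S", "D1"),
    ("H2", "S", "D2"),
]
ainv = a_inverse_sparse(pedigree)
stored_elements = len(ainv)                       # nonzero lower-triangle entries
equations = n_random_equations(50000, n_qtl=1)    # 150000
```

Each animal adds only a handful of entries regardless of pedigree size, which is why the stored inverse grows roughly linearly with the number of animals instead of quadratically.
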
MTDFREML
• Multiple Trait Derivative-Free Restricted (or Residual) Maximum Likelihood
• A set of programs to obtain estimates of variances and covariances
• USDA/ARS – Dale Van Vleck
  – Keith Boldman, Lisa Kriese, Curt Van Tassell, Steve Kachman, Joerg Dodenhoff

MTDFREML
• MTDFNRM – calculates and outputs the inverse of the numerator relationship matrix
• MTDFPREP – sets up the model for the analysis
• MTDFRUN – runs the analysis using the files produced by MTDFNRM and MTDFPREP to obtain (co)variance estimates and breeding values

Current Modifications to MTDFREML to Incorporate QTL Effects (MTDFREMLQ)
• MTDFNRMQ – modified to calculate the inverse of the QTL correlation matrix (M⁻¹) from an IBD probability file (produced by GenoProb, Loki, etc.)
  – non-inbred pedigree when marker data are incomplete
  – inbred pedigree when marker data are complete
  – genetic groups arising from different populations with different prior selection

Current Modifications to MTDFREMLQ (cont.)
• MTDFPRPQ – modified to include multiple QTL in the model (validated for single QTL)
  – multiple trait
  – gametic imprinting (coded, not validated)

Current Modifications to MTDFREMLQ (cont.)
• MTDFRUNQ – modified to include M⁻¹ and the associated between-trait (co)variances for each QTL (V⁻¹)
  – assumes independence between two QTLs

Further Modifications to MTDFREMLQ
• Include inbred pedigree when marker data are incomplete (approximate M⁻¹)
• Calculate standard errors for the parameters using the delta method
  – currently in MTDFREML
  – in the testing/debugging stage: appears to work for single-trait, single-QTL and two-trait, single-QTL cases
  – still needs to be tested for multiple QTL

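For readers unfamiliar with the delta method, the sketch below shows the general first-order approximation for the standard error of a function of estimated variance components. It is not the MTDFREML routine. The ratio being estimated, the variance estimates, and their sampling covariance matrix are all hypothetical numbers chosen for illustration.

```python
import math


def delta_method_se(gradient, covariance):
    """First-order (delta method) standard error of a function of estimates.

    gradient: partial derivatives of the function at the estimates.
    covariance: sampling (co)variance matrix of the estimates.
    """
    k = len(gradient)
    variance = sum(gradient[i] * covariance[i][j] * gradient[j]
                   for i in range(k) for j in range(k))
    return math.sqrt(variance)


# Hypothetical estimates: additive and residual variance.
var_a, var_e = 40.0, 60.0
# Hypothetical sampling (co)variances of those two estimates.
sampling_cov = [[25.0, -5.0],
                [-5.0, 16.0]]

# Function of interest: ratio = var_a / (var_a + var_e)
total = var_a + var_e
ratio = var_a / total
gradient = [var_e / total ** 2,     # d(ratio) / d(var_a)
            -var_a / total ** 2]    # d(ratio) / d(var_e)

se_ratio = delta_method_se(gradient, sampling_cov)
```

The same function applies to any parameter that is a smooth function of the (co)variance estimates, as long as its gradient and the sampling covariance matrix of the estimates are available.
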
Practical Limitations of MTDFREMLQ
• Memory limitations by number of animals, traits, and QTL

50,000 animals
          1 Trait   2 Traits  3 Traits  4 Traits
1 qtl     <268M     324M      778M
2 qtls    <268M     552M      1.5G
3 qtls    <268M     919M
4 qtls    314M      1.4G
5 qtls    430M      1.6G

20,000 animals
          3 Traits  4 Traits  5 Traits
1 qtl     362M      698M      1.2G
2 qtls    642M      1.3G
3 qtls    1.1G

100,000 animals
          1 Trait   2 Traits
1 qtl     <268M     562M
2 qtls    <268M     1.0G
3 qtls    376M
4 qtls    542M
5 qtls    775M

• Time limitations

Applications
• QTL detection
  – find and utilize QTL in a breed and include that information in national genetic evaluation
• Marker Assisted Selection
  – experimental herds
    – the twinning herd at MARC
      – currently producing about 50% twin calvings, compared to a normal range of 1-3%
      – ~6000 in genetic evaluation, marker data from 1994 on over 3000 animals in regions of 3 QTLs