Genotype x Environment Interactions

PBG 650 Advanced Plant Breeding
Module 13: Breeding for Diverse Environments
– Genotype x Environment Interactions
Genotype by environment interaction
• Genotypes respond differently across a range of environments, i.e., the relative performance of varieties depends on the environment
• GXE, GEI, G by E, GE
P = G + E + GE
$\sigma^2_P = \sigma^2_G + \sigma^2_E + \sigma^2_{GE}$
• Genotype by environment interactions are common for most quantitative traits of economic importance
• Advanced breeding materials must be evaluated in multiple locations for more than one year
– MET = multi-environment trials
Types of GEI
[Figure: response of genotypes A and B plotted against environments, in three panels: (1) no interaction; (2) no rank changes, but interaction (noncrossover); (3) rank changes and interaction (crossover)]
• Interaction may be due to:
– heterogeneity of genotypic variance across environments
– imperfect correlation of genotypic performance across environments
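For two environments, these two sources can be separated algebraically. The identity below is a sketch under stated assumptions: $\sigma_{G1}$ and $\sigma_{G2}$ denote the genotypic standard deviations in environments 1 and 2, and $r_G$ the genetic correlation between them. These symbols are introduced here for illustration and do not appear in the slides.

```latex
\sigma^2_{GE} = \tfrac{1}{2}\left(\sigma_{G1} - \sigma_{G2}\right)^2
              + \sigma_{G1}\,\sigma_{G2}\left(1 - r_G\right)
```

The first term vanishes when genotypic variances are equal across environments; the second vanishes when genotypic performance is perfectly correlated. Only the second term can give rise to rank changes.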
The GEI challenge
• The environmental main effect is usually the largest source of variation, but it is irrelevant to selection (remember the 70-20-10 rule for E : GE : G)
• Many statistical approaches consider all of the phenotypic variation (i.e., means across environments), which may be misleading
• Need analyses that will help you to characterize GEI
• "GE Interaction is not merely a problem, it is also an opportunity" (Simmonds, 1991). Specific adaptations can make the difference between a good variety and a superb variety
The GEI challenge
• Some environmental variation is predictable
– can be attributed to specific, characteristic features of the environment
– e.g., soil type, soil fertility, plant density
• Some variation is unpredictable
– e.g., rainfall, temperature, humidity
Questions a breeder may ask
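• How do I choose suitable environments for testing?
• How do I allocate resources (number of testing sites vs replicates within sites)?
$LSD = t_{\alpha/2} \sqrt{\frac{2 MS_{GE}}{re}} = t_{\alpha/2} \sqrt{2 \left( \frac{\sigma^2_{GE}}{e} + \frac{\sigma^2_e}{re} \right)}$
where r = number of replicates per environment and e = number of environments
• In general, increasing the number of environments does more to reduce the standard error of genotype means than increasing the number of replications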
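The trade-off between sites and replicates can be checked numerically. The sketch below assumes environments are random, so that the variance of a genotype mean is $\sigma^2_{GE}/e + \sigma^2_e/(re)$. The variance components and the function name `se_genotype_mean` are hypothetical, chosen only for illustration.

```python
import math


def se_genotype_mean(var_ge, var_e, n_env, n_rep):
    """Standard error of a genotype mean over n_env environments
    with n_rep replicates in each (environments treated as random)."""
    return math.sqrt(var_ge / n_env + var_e / (n_env * n_rep))


# Hypothetical variance components (not taken from the slides)
var_ge, var_e = 20.0, 40.0

base = se_genotype_mean(var_ge, var_e, n_env=4, n_rep=2)       # 8 plots per genotype
more_reps = se_genotype_mean(var_ge, var_e, n_env=4, n_rep=4)  # 16 plots per genotype
more_envs = se_genotype_mean(var_ge, var_e, n_env=8, n_rep=2)  # 16 plots per genotype

print(round(base, 3), round(more_reps, 3), round(more_envs, 3))
```

With the same 16 plots per genotype, doubling the environments lowers the standard error more than doubling the replicates, because only extra environments reduce the $\sigma^2_{GE}$ term.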
Conventional ANOVA
Source             MS   F-test                 F-test                F-test
                        (E random, G random)   (E random, G fixed)   (E fixed, G fixed)
Environment (E)    M1   Approx.                M1/M2                 M1/M2
  Location (L)
  Year (Y)
  L x Y
Reps/E             M2
Genotype (G)       M3   M3/M4                  M3/M4                 M3/M5
G x E              M4   M4/M5                  M4/M5                 M4/M5
  G x L
  G x Y
  G x L x Y
Error              M5
• Ratios of sums of squares (SS) can be used to estimate the contribution of interactions to the total SS
• Can also estimate variance components due to GXE
$\hat{\sigma}^2_{GE} = \frac{MS_{GE} - MSE}{r}$
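As a worked illustration of the estimator $(MS_{GE} - MSE)/r$, the sketch below computes it for a small balanced data set. It assumes equal replication and, for brevity, pools all within-cell variation into the error term (no Reps/E stratum); the data and the function name `ge_variance_component` are made up for this example.

```python
def ge_variance_component(data):
    """data[i][j] is the list of plot values for genotype i in environment j.
    Returns (MS_GE, MSE, estimated sigma^2_GE) for a balanced layout."""
    g, e, r = len(data), len(data[0]), len(data[0][0])
    cell = [[sum(data[i][j]) / r for j in range(e)] for i in range(g)]
    g_mean = [sum(row) / e for row in cell]
    e_mean = [sum(cell[i][j] for i in range(g)) / g for j in range(e)]
    grand = sum(g_mean) / g

    # Interaction SS from the doubly centred cell means
    ss_ge = r * sum(
        (cell[i][j] - g_mean[i] - e_mean[j] + grand) ** 2
        for i in range(g) for j in range(e)
    )
    # Within-cell (error) SS
    ss_err = sum(
        (y - cell[i][j]) ** 2
        for i in range(g) for j in range(e) for y in data[i][j]
    )
    ms_ge = ss_ge / ((g - 1) * (e - 1))
    ms_err = ss_err / (g * e * (r - 1))
    return ms_ge, ms_err, (ms_ge - ms_err) / r


# Hypothetical trial: 2 genotypes x 2 environments x 2 replicates
data = [
    [[10, 12], [20, 22]],  # genotype 1 in environments 1 and 2
    [[14, 16], [12, 14]],  # genotype 2 in environments 1 and 2
]
ms_ge, ms_err, var_ge = ge_variance_component(data)
print(ms_ge, ms_err, var_ge)  # 72.0 2.0 35.0
```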
Mixed Models for analysis of MET
• For evaluation of GEI among an elite group of cultivars, genotypes are often considered to be fixed effects and environments are random. The GEI component is random. One can obtain BLUP estimates of GEI effects.
• For the purpose of estimating breeding values using BLUP, genotypes are considered to be random and environments are fixed effects.
• Some statisticians believe that genotypes should always be random effects regardless of the stage of selection, provided that the objective is to select the best ones (Smith et al., 2005).
Strategies for coping with GXE
• Broad adaptation - develop a variety that performs consistently well across a range of environments (high mean across environments)
– this is equivalent to selection for multiple traits, which may reduce the rate of progress from selection
– will not necessarily identify the best genotype for a specific environment
• Specific adaptation - subdivide environments into groups so that there is little GEI within each group. Breed varieties that perform consistently well in each environment
– you have to carry out multiple breeding programs, which means you have fewer resources for each, and hence reduced progress from selection
• Evaluate a common set of breeding material across environments, but make specific recommendations for each environment
Stability
• Static – performance of a genotype does not change under different environmental conditions (relevant for disease resistance)
• Dynamic – genotype performance is affected by the environment, but its relative performance is consistent across environments
Ways that a variety can achieve stability
• Genetic homeostasis – variety is heterogeneous and plants are adapted to slightly different environmental conditions
– variety mixtures
– open-pollinated varieties vs hybrids
• Developmental homeostasis – individual genotype is able to adapt to a range of conditions
– phenotypic plasticity
– “reaction norms” – phenotypic variability among individuals with the same genotype
Regression of genotype on environmental index
• Finlay and Wilkinson, 1963
– b=0 is a stable genotype (static or Type I)
• Eberhart and Russell, 1966
– b=1 is stable, because genotypes with b~0 tend to have lower yields (dynamic or Type II)
– stable genotypes have small deviations from regression (Type III)
$Y_{ij} = \mu + g_i + b_i t_j + \delta_{ij} + e_{ij}$
$g_i$ = effect of ith genotype
$b_i$ = regression coefficient of ith genotype
$t_j$ = environmental index
$\delta_{ij}$ = deviation from regression
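A minimal sketch of the joint regression, assuming a complete table of genotype means per environment and taking the environmental index as the environment mean minus the grand mean (a common choice; the slides do not define it explicitly). The yield values and the function name `joint_regression` are hypothetical, and the deviation mean square is not corrected for pooled error.

```python
def joint_regression(means):
    """means[i][j] = mean of genotype i in environment j.
    Returns the environmental index and, per genotype,
    (mean, slope b_i, mean square of deviations from regression)."""
    g, e = len(means), len(means[0])
    grand = sum(sum(row) for row in means) / (g * e)
    # Environmental index: environment mean minus grand mean (sums to zero)
    index = [sum(means[i][j] for i in range(g)) / g - grand for j in range(e)]
    sxx = sum(t * t for t in index)

    results = []
    for row in means:
        g_mean = sum(row) / e
        b = sum(y * t for y, t in zip(row, index)) / sxx
        dev_ss = sum((y - g_mean - b * t) ** 2 for y, t in zip(row, index))
        results.append((g_mean, b, dev_ss / (e - 2)))
    return index, results


# Hypothetical yields: 3 varieties x 4 environments
means = [
    [160, 185, 215, 250],  # responsive variety, expect b > 1
    [175, 195, 212, 232],  # expect b near 1
    [190, 200, 215, 225],  # unresponsive variety, expect b < 1
]
index, results = joint_regression(means)
slopes = [b for _, b, _ in results]
for name, (g_mean, b, dev_ms) in zip("ABC", results):
    print(name, round(g_mean, 1), round(b, 2), round(dev_ms, 2))
```

Because the index is built from the same data, the slopes average exactly 1 across genotypes. This is the sense in which genotype means are not independent of site means.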
Regression of genotype on environmental index
[Figure: yield (160 to 250) plotted against environmental index (160 to 250) for three varieties: Var A, b = 1.44; Var B, b = 1.02; Var C, b = 0.75]
Stability analysis
Source                    df            MS   F ratio
Genotypes (G)             g-1           M1   ~M1/M2
Environments + G x E      g(e-1)
  Environments (linear)   1
  G x E (linear)          g-1           M2   ~M2/M3
  Pooled deviations       g(e-2)        M3   ~M3/M4
Error                     e(r-1)(g-1)   M4
• Test for homogeneous slopes: F ~M2/M3
• Test that deviations from regression = 0: F ~M3/M4
Drawbacks of regression approach
• Genotype means are not independent of site means
• Genotype that has a consistent response to environmental factors, but is atypical of the genotypes in the trial, will have a high deviation from regression
• Does not consider underlying causes for performance at each site
– Sites with deficiencies and excesses of water, nutrients, etc. will be juxtaposed on the regression line, resulting in large deviations from regression
• Heritabilities of $b_i$ and $s_d^2$ are low
– Stability is a statistical, not a genetic concept
Other stability statistics
• Ecovalence (Wricke, 1962)
$W_i = \sum_j (X_{ij} - \bar{X}_{i.} - \bar{X}_{.j} + \bar{X}_{..})^2$
– $W_i$ is the ecovalence for the ith genotype across the j environments
– Measures a genotype’s contribution to GEI
• Superiority measure of cultivars (Lin and Binns, 1988)
$P_i = \sum_j (X_{ij} - M_j)^2 / (2n)$
– $M_j$ is the maximum response in the jth location
– n is the number of locations
– Indicates how often a variety is close to being the best in the trial
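Both statistics can be computed directly from a genotype x environment table of means. The sketch below follows the two formulas as given (ecovalence from doubly centred means; superiority as squared distance from the best entry at each location, divided by 2n). The data and function names are hypothetical.

```python
def ecovalence(means):
    """Wricke's ecovalence W_i for each genotype (rows of `means`)."""
    g, e = len(means), len(means[0])
    g_mean = [sum(row) / e for row in means]
    e_mean = [sum(means[i][j] for i in range(g)) / g for j in range(e)]
    grand = sum(g_mean) / g
    return [
        sum((means[i][j] - g_mean[i] - e_mean[j] + grand) ** 2 for j in range(e))
        for i in range(g)
    ]


def superiority(means):
    """Lin and Binns superiority measure P_i (smaller is better)."""
    g, n = len(means), len(means[0])
    best = [max(means[i][j] for i in range(g)) for j in range(n)]
    return [sum((row[j] - best[j]) ** 2 for j in range(n)) / (2 * n) for row in means]


# Hypothetical means: 3 genotypes x 3 locations
means = [
    [50, 60, 70],
    [46, 62, 66],
    [42, 52, 65],
]
W = ecovalence(means)
P = superiority(means)
print(W)  # [2.0, 14.0, 8.0]
print([round(p, 3) for p in P])
```

The ecovalences sum to the G x E sum of squares of the table of means (24 here), which is why $W_i$ can be read as each genotype's share of the interaction.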
Rank sum index
• Utilizes Shukla’s ‘stability variance’ (Shukla, 1972) to represent the stability concept and the unadjusted mean across sites to represent mean performance
• Genotypes are ranked for both parameters
– the genotype with the highest mean yield receives a rank of one
– the genotype with the smallest value for stability variance is ranked number one
• Ranks are added together, and the genotype with the lowest score is considered to be the best
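The ranking procedure can be sketched as follows. The stability variance is computed here from ecovalence as $[g(g-1)W_i - \sum W] / [(e-1)(g-1)(g-2)]$ for g genotypes and e environments, which is the commonly quoted form of Shukla's statistic and requires at least three genotypes. Data, function names, and the tie-handling rule (tied values share the better rank) are assumptions made for this example.

```python
def shukla_variance(means):
    """Shukla's stability variance per genotype from a table of means (g >= 3)."""
    g, e = len(means), len(means[0])
    g_mean = [sum(row) / e for row in means]
    e_mean = [sum(means[i][j] for i in range(g)) / g for j in range(e)]
    grand = sum(g_mean) / g
    w = [
        sum((means[i][j] - g_mean[i] - e_mean[j] + grand) ** 2 for j in range(e))
        for i in range(g)
    ]
    total = sum(w)
    return [(g * (g - 1) * wi - total) / ((e - 1) * (g - 1) * (g - 2)) for wi in w]


def rank(values, highest_first):
    """Rank 1 = best; tied values share the same (better) rank."""
    ordered = sorted(values, reverse=highest_first)
    return [ordered.index(v) + 1 for v in values]


def rank_sum(means):
    e = len(means[0])
    g_mean = [sum(row) / e for row in means]
    mean_rank = rank(g_mean, highest_first=True)                 # highest yield = 1
    stab_rank = rank(shukla_variance(means), highest_first=False)  # smallest variance = 1
    return [a + b for a, b in zip(mean_rank, stab_rank)]


# Hypothetical means: 3 genotypes x 3 sites
means = [
    [50, 60, 70],
    [46, 62, 66],
    [42, 52, 65],
]
stability = shukla_variance(means)
scores = rank_sum(means)
print(stability)  # [-3.0, 15.0, 6.0]
print(scores)     # [2, 5, 5]
```

Individual stability variances can come out negative, as for the first genotype here; such estimates are usually treated as zero. The ranking is unaffected.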
Cluster analyses
• Stratify environments into homogeneous groups so that there is no GEI within groups; maximize between-group variation
• Possible benefits
– May suggest genetic relationships
– May lead to better understanding of environmental factors influencing adaptation
– Permits systematic approach for choosing testing sites
• Disadvantage – location groupings are seldom consistent from one year to the next
• Many choices of criteria exist for calculating distance between sites (Euclidean distance, dissimilarity index) and assigning groups (e.g., average linkage or UPGMA)
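One possible combination of these choices is sketched below: Euclidean distance between sites computed on environment-centred genotype means, followed by average-linkage (UPGMA) grouping. This is only one of many reasonable configurations, and the data and function names are hypothetical.

```python
import math


def site_distances(means):
    """Euclidean distance between sites, using genotype means centred
    within each site (environment main effect removed)."""
    g, e = len(means), len(means[0])
    cols = []
    for j in range(e):
        col = [means[i][j] for i in range(g)]
        m = sum(col) / g
        cols.append([y - m for y in col])
    return [[math.dist(cols[a], cols[b]) for b in range(e)] for a in range(e)]


def upgma(dist, n_groups):
    """Merge the two closest clusters (average linkage) until n_groups remain."""
    clusters = [[j] for j in range(len(dist))]
    while len(clusters) > n_groups:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = sum(dist[p][q] for p in clusters[a] for q in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical means: 3 genotypes x 4 sites; site 3 reverses the genotype ranking
means = [
    [50, 52, 70, 40],
    [45, 47, 60, 50],
    [40, 42, 50, 60],
]
groups = upgma(site_distances(means), n_groups=2)
print(groups)  # [[0, 1, 2], [3]]
```

Sites 0 to 2 rank the genotypes identically and are grouped together even though site 2 spreads them more widely; the site with reversed ranks is set apart.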
Crossover interactions
• GEI is only problematic for plant breeders when there are rank changes in performance of varieties in different environments (“crossover interactions”)
• Tests for crossover interactions were presented by Baker (1988)
• Shifted multiplicative model (Cornelius et al., 1992)
– search for "separability" of crop cultivars based on GEI
– identify clusters of genotypes in which crossover interactions are negligible
– cluster environments so that the genotypes within each group show no crossover interactions
– techniques have been further developed in recent years by Cornelius and Crossa
Principal component analysis (PCA)
• With conventional multiple regression, coefficients can be misleading when the variables in the model are correlated. Values vary depending on the order that the variables appear in the model.
• PCA is a multivariate approach that provides an alternative to conventional multiple regression analysis. PCA transforms the data into linear combinations of the original variables that are uncorrelated.
• Useful for determining which agroclimatic or biotic factors can be used to discriminate among environments.
• Drawback for PCA – interpretation is not as intuitive as with multiple regression or correlation analyses. Results do not bear any obvious relationship to environmental variables.
Additive Main Effects and Multiplicative Interaction Model (AMMI)
• Method for analyzing GEI to identify patterns of interaction and reduce
background noise
• Combines conventional ANOVA with principal component analysis
• May provide more reliable estimates of genotype performance than the
mean across sites
• Biplots help to visualize relationships among genotypes and
environments; show both main and interaction effects
• Enables you to identify target breeding environments and to choose
representative testing sites in those environments
• Enables you to select varieties with good adaptation to target breeding
environments
• Can be used to identify key agroclimatic factors, disease and insect pests,
and physiological traits that determine adaptation to environments
• A type of fixed effect, Linear-Bilinear Model
Zobel et al., 1988; Gauch and Zobel, 1996
AMMI Model
$Y_{ijl} = \mu + G_i + E_j + \sum_k \lambda_k \alpha_{ik} \gamma_{jk} + d_{ij} + e_{ijl}$
$\lambda_k$ = singular value for the kth principal component axis (its square is the kth eigenvalue)
$\alpha_{ik}$ = principal component score for the ith genotype for the kth principal component axis
$\gamma_{jk}$ = principal component score for the jth environment for the kth principal component axis
$d_{ij}$ = residual GXE not explained by model
AMMI estimates for performance of genotype i in environment j
$E(Y_{ijl}) = \mu + G_i + E_j + \sum_k \lambda_k \alpha_{ik} \gamma_{jk}$
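A compact sketch of the AMMI fit, assuming a complete genotype x environment table of cell means and using NumPy's singular value decomposition on the doubly centred residuals. The function name `ammi`, its return layout, and the data are hypothetical; dedicated AMMI software also provides significance tests for each axis, which are omitted here.

```python
import numpy as np


def ammi(cell_means, n_axes=2):
    """Fit additive main effects, then keep n_axes multiplicative
    interaction terms from the SVD of the interaction residuals.
    Returns (fitted values, singular values, genotype vectors, environment vectors)."""
    y = np.asarray(cell_means, dtype=float)
    mu = y.mean()
    g_eff = y.mean(axis=1) - mu          # genotype main effects G_i
    e_eff = y.mean(axis=0) - mu          # environment main effects E_j
    additive = mu + g_eff[:, None] + e_eff[None, :]

    # Interaction residuals and their SVD: residual = U diag(s) V'
    u, s, vt = np.linalg.svd(y - additive, full_matrices=False)
    k = n_axes
    interaction = (u[:, :k] * s[:k]) @ vt[:k, :]
    return additive + interaction, s, u[:, :k], vt[:k, :].T


# Hypothetical means: 3 genotypes x 3 environments
y = [
    [50, 60, 70],
    [46, 62, 66],
    [42, 52, 65],
]
fitted, s, g_vec, e_vec = ammi(y, n_axes=2)
print(np.round(s, 3))
print(np.round(fitted, 2))
```

With only three genotypes and three environments the interaction matrix has rank two, so the two-axis fit reproduces the table exactly. In larger trials the discarded axes are what AMMI treats as noise.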
Benefits of AMMI
• Based on a two-way matrix of genotypes x environments
• Partitions treatment SS into model and residual
– traditional approaches for controlling error partition the error SS into pure error and blocks.
– both can be done
• Gains in precision due to modeling using AMMI are often several times as large as those due to blocking
• Can first run AMMI to get rid of noise and get clean estimates of variety means, and then apply classification procedures or other analyses that are not able to discriminate between patterns and noise
Biplots
[Figure: biplot example (Shafi et al., 1998)]
Biplots
[Figure: biplot example; source: http://www.uidaho.edu/ag/statprog/ammi/index.html]
AMMI
• General interpretation
– genotypes that occur close to particular environments on the IPCA2 vs IPCA1 biplot show specific adaptation to those environments
– a genotype that falls near the center of the biplot (small IPCA1 and IPCA2 values) may have broader adaptation
• How many IPCAs (interaction principal component axes) are needed to adequately explain patterns in the data?
– Rule of thumb: discard higher order IPCAs until total SS due to discarded IPCAs ~ SSE.
– Usually need only the first 2 PC axes to adequately explain the data (IPCA1 and IPCA2). This model is referred to as AMMI2.
• Approach is most useful when G x location effects are more important than G x year effects
GGE or SREG (Sites Regression) Model
• Another fixed effect, linear-bilinear model that is similar
to AMMI
• Only the environmental effects are removed before PCA
$Y_{ijl} = \mu + E_j + \sum_k \lambda_k \alpha_{ik} \gamma_{jk} + d_{ij} + e_{ijl}$
• The bilinear term includes both the main effects of
genotype and GXE effects
• Several recent papers compare AMMI and GGE (e.g.
Gauch et al., 2008)
• May be used to evaluate test environments (Yan and
Holland, 2010)
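The difference from AMMI is only in what is removed before the decomposition. A minimal sketch, assuming a complete table of cell means and NumPy's SVD; the function name `gge` and the data are hypothetical, and no scaling or heritability adjustment of environments is applied.

```python
import numpy as np


def gge(cell_means):
    """Remove environment main effects only, then decompose what is left
    (genotype main effects plus GXE) by singular value decomposition."""
    y = np.asarray(cell_means, dtype=float)
    centred = y - y.mean(axis=0, keepdims=True)   # subtract each environment mean
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    return centred, s, u, vt.T


# Hypothetical means: 3 genotypes x 3 environments
y = [
    [50, 60, 70],
    [46, 62, 66],
    [42, 52, 65],
]
centred, s, g_vec, e_vec = gge(y)
g_eff = np.mean(y, axis=1) - np.mean(y)

print(np.round(centred.mean(axis=1), 3))  # row means equal the genotype main effects
print(np.round(s ** 2, 3))                # SS captured by each axis (G + GE)
```

The row means of the centred table are the genotype main effects, which confirms that the bilinear term carries G as well as GXE.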
Pattern Analysis
Steps involved:
• recommended pretreatment (transformation) – scale the data by removing environment main effects and adjust scale by dividing by the phenotypic standard deviation at each site.
• use a classification procedure to identify environments which show similar discrimination among the genotypes.
• use an ordination procedure (singular value decomposition) – similar to AMMI except that it uses transformed data
• use biplots to show relationships between genotypes and environments
Cooper and DeLacy, 1994
Partial Least Squares Regression (PLS)
• AMMI model only considers one response variable
• PLS is a type of bilinear model that can utilize
information about environmental factors (covariables)
– rainfall, temperature, and soil type
• PLS can accommodate additional genotypic data
– disease reaction
– molecular marker scores
• Analysis indicates which environmental factors or
genotypic traits can be used to predict GEI for grain yield
Factorial Regression (FR)
• A fixed effect, linear model
• Can incorporate additional genotypic and environmental covariables into the model
• Similar to stepwise multiple regression, where additional variables are added to the model in sequence until sufficient variability due to GEI can be explained
• FR is easier to interpret than PLS, but may give misleading results when there are correlations among the explanatory variables in the model
Linear-Bilinear Mixed Models
• Have become widely accepted for analysis of GEI
• Have desirable statistical properties
• Lead to Factor Analytic form of the genetic variance-covariance for environments
• When genotypes are random, coancestries can be accommodated in the model
Burgueño, J.; Crossa, J.; Cornelius, P.L.; Yang, R.-C. 2008. Using factor
analytic models for joining environments and genotypes without crossover
genotype × environment interaction. Crop Sci. 48:1291-1305.
Nonlinear models for GEI analysis
• Assumptions for linear models
– homoscedasticity (errors homogeneous = common variance)
– normal distribution of residuals
– errors are independent (e.g. no relationship between mean and variance)
• Generalized linear models can be used when assumptions are not met
– SAS PROC GENMOD, PROC NLMIXED, PROC GLIMMIX
• Nonparametric approaches
– Smoothing spline genotype analysis
Agroecological zones
• The Geographic Information System (GIS) can be used to define regions with similar ecologies for crop production
• Gauch and Zobel (1997) described methodologies for defining mega-environments and regions of cultivar adaptation using actual yield data from multilocational trials
• Further work is needed to integrate information from GIS with actual performance data to help define target breeding environments
Agroecological Zones of West and Central Africa
[Figure: map of zones: Humid Forest, Derived Savanna, Southern Guinea Savanna, Northern Guinea Savanna, Mid-altitude Savanna, Sudan Savanna]
Crop models
• Crop simulation models calculate or predict crop yield as a function of:
– Weather conditions
– Soil conditions
– Crop management scenarios
– Genetic coefficients
• Potential production determined by
– Solar radiation and temperature as input
– Simulate growth and development
– Plant carbon balance (photosynthesis, respiration, partitioning)
• More sophisticated models may also consider yield reductions due to:
– limited water
– limited nitrogen and other nutrients
– insects, diseases, and weeds
From a presentation by Gerrit Hoogenboom, 2002
Peanut Varieties in Thailand
• Development Coefficients (Photothermal days):
– Time between emergence and flowering (16.4 - 25.0 d)
– Seed filling duration (22.0 - 44.0 d)
– Time required for final pod load (13.0 - 30.0 d)
• Growth Coefficients:
– Maximum leaf photosynthetic rate (1.04 - 1.40 mg CO2 m-2 s-1)
– Maximum partitioning to seed + shell (0.58 - 0.95)
– Maximum weight per seed (0.39 - 1.18 g)
From a presentation by Gerrit Hoogenboom, 2002
Peanut Varieties in Thailand
Simulated and observed above ground biomass for entry no. 9
GEI - Conclusions
• Many techniques available
• An active area of research
• Need to synthesize information
– performance data and stability analyses
– understanding of crop physiology, crop models
– disease and pest incidence
– molecular genetics
– agroclimatology, GIS
References
Crossa, J. 2012. From genotype × environment interaction to gene × environment interaction.
Current Genomics 13: 225-244.
Gauch, H.G., H.-P. Piepho, and P. Annicchiarico. 2008. Statistical analysis of yield trials by
AMMI and GGE: further considerations. Crop Sci. 48: 866-889.
Hussein, M.A., A. Bjornstad, and A.H. Aastveit. 2000. SASG X ESTAB: A SAS program for
computing genotype x environment stability statistics. Agron. J. 92: 454-459.
Kang, M.S., M.G. Balzarini, and J.L.L. Guerra. 2004. Genotype-by-environment interaction. In
(A.M. Saxton ed.) Genetic Analysis of Complex Traits Using SAS. Cary, NC: SAS Institute
Inc.
Piepho, H.-P., and J. Möhring. 2005. Best linear unbiased prediction of cultivar effects for
subdivided target regions. Crop Sci. 45: 1151-1159.
Smith, A.B., B.R. Cullis, and R. Thompson. 2005. The analysis of crop cultivar breeding and
evaluation trials: an overview of current mixed model approaches. Journal of Agricultural
Science 143: 449-462.
Yan, W., and J.B. Holland. 2010. A heritability-adjusted GGE biplot for test environment
evaluation. Euphytica 171: 355-369.
Yang, R.-C., J. Crossa, P.L. Cornelius, and J. Burgueño. 2009. Biplot analysis of genotype x
environment interaction: proceed with caution. Crop Sci. 49: 1564-1576.