Recent Advances in Statistical
Ecology using Computationally
Intensive Methods
Ruth King
Overview

Introduction to wildlife populations and identification of questions of interest.
Motivating example.
Issues to be addressed: missing data; model discrimination.
Summary.
Future research.
Wildlife Populations




In recent years there has been increasing
interest in wildlife populations.
Often we may be interested in population
changes over time, e.g. if there is a declining
population. (Steve Buckland)
Alternatively, we may be interested in the
underlying dynamics of the system, in order to
obtain a better understanding of the population.
We shall concentrate on this latter problem, with
particular focus on identifying factors that affect
demographic rates.
Data Collection




Data are often collected via some form of
capture-recapture study.
Observers go out into the field and record all
animals that are seen at a series of capture
events.
Animals may be recorded simply via resightings or recaptures (of live animals) and recoveries (of dead animals).
At each capture event all unmarked animals are
uniquely marked; all observed animals are
recorded and subsequently released back into the
population.
Data




Each animal is uniquely identifiable so our data
consist of the capture histories for each individual
observed in the study.
A typical capture history may look like:
0110012
0/1 corresponds to the individual being
unobserved/observed at that capture time; and
2 denotes an individual is recovered dead.
We can then explicitly write down the corresponding likelihood as a function of survival (φ), recapture (p) and recovery (λ) rates.
Likelihood



The likelihood is the product over all individuals
of the probability of their corresponding capture
history, conditional on their initial capture.
For example, for an individual with history:
0110012
the contribution to the likelihood is:
2 p3 3 (1-p4) 4 (1-p5) 5 p6 (1-6) 7.
Then, we can use this likelihood to estimate the
parameter values (either MLE’s or posterior
distributions).
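To make this construction concrete, here is a minimal Python sketch (not from the talk; the encoding, rate values and handling of the final occasion are illustrative assumptions) that evaluates the likelihood contribution of a single 0/1/2 capture history, conditional on first capture, given assumed survival (phi), recapture (p) and recovery (lam) probabilities.

```python
import numpy as np

def history_contribution(h, phi, p, lam):
    """Likelihood contribution of one capture history, conditional on first capture.
    h[t] is 0 (not seen), 1 (seen alive) or 2 (recovered dead) at occasion t
    (0-indexed); phi[t] is survival from occasion t+1 to t+2 in the slides'
    1-indexed notation, p[t] is recapture and lam[t] recovery probability at t.
    For simplicity this sketch assumes the fate at the final occasion is resolved;
    the full likelihood would also sum over unknown times of death for histories
    ending in a run of zeros."""
    first = next(t for t, x in enumerate(h) if x != 0)   # initial capture occasion
    contrib = 1.0
    for t in range(first + 1, len(h)):
        if h[t] == 2:                        # died in the interval, recovered at t
            contrib *= (1 - phi[t - 1]) * lam[t]
            break
        contrib *= phi[t - 1]                # survived the interval
        contrib *= p[t] if h[t] == 1 else (1 - p[t])
    return contrib

# History 0110012 from the slides (occasions 1..7 -> indices 0..6).
h = [0, 1, 1, 0, 0, 1, 2]
phi = np.full(7, 0.8)    # illustrative constant rates
p   = np.full(7, 0.5)
lam = np.full(7, 0.3)
print(history_contribution(h, phi, p, lam))
```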
Covariates





Covariates are often used to explain temporal
heterogeneity within the parameters.
Typically these are “environmental” factors, such
as resource availability, weather conditions or
human intervention.
Alternatively, heterogeneity in a population can
often be explained via different (individual)
covariates.
For example, sex, condition or breeding status.
We shall consider the survival rates to be possibly
dependent on the different covariates.
Case Study: Soay sheep




We consider mark-recapture-recovery (MRR) data
relating to Soay sheep on the island of Hirta.
The sheep are free from human activity and
external competition/predation.
Thus, this population is ideal for investigating the
impact of different environmental and/or
individual factors on the sheep.
We consider annual data from 1986-2000,
collected on female sheep (1079 individuals).
This is joint work with Steve Brooks and Tim Coulson.
Covariate Information

(i) Individual covariates:
Coat type (1 = dark, 2 = light);
Horn type (1 = polled, 2 = scurred, 3 = classical);
Birth weight (real, normalised);
Age (in years);
Weight (real, normalised);
Number of lambs born to the sheep in the spring prior to the summer census (0, 1, 2);
and in the spring following the census (0, 1, 2).
(ii) Environmental covariates:
NAO index; population size; March rainfall; autumn rainfall and March temperature.
Survival Rates – φ

Let the set of environmental covariate values at time t be denoted by xt.
The survival rate for animal i, of age a, at time t, is given by
logit φi,a,t = αa + βaT xt + γaT yi + δaT zi,t + εa,t.
Here, yi denotes the set of time-independent individual covariates and zi,t the time-varying individual covariates; εa,t ~ N(0, σa2) denotes a random effect.
Two issues arise here: missing covariate values and model choice.
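Purely as an illustration of the linear predictor above (the dimensions, parameter values and function names are my own assumptions, not the authors' code), a Python sketch might look as follows.

```python
import numpy as np

rng = np.random.default_rng(1)

def survival_prob(a, x_t, y_i, z_it, alpha, beta, gamma, delta, sigma):
    """logit phi_{i,a,t} = alpha_a + beta_a'x_t + gamma_a'y_i
                           + delta_a'z_it + eps_{a,t},  eps ~ N(0, sigma_a^2)."""
    eps = rng.normal(0.0, sigma[a])                  # random effect (shared by age group a at time t in the full model)
    eta = (alpha[a] + beta[a] @ x_t + gamma[a] @ y_i
           + delta[a] @ z_it + eps)                  # linear predictor
    return 1.0 / (1.0 + np.exp(-eta))                # inverse logit

# Illustrative dimensions: 3 age groups, 5 environmental covariates,
# 3 time-invariant and 2 time-varying individual covariates.
A, n_env, n_ti, n_tv = 3, 5, 3, 2
alpha = np.zeros(A)
beta  = rng.normal(size=(A, n_env)) * 0.1
gamma = rng.normal(size=(A, n_ti)) * 0.1
delta = rng.normal(size=(A, n_tv)) * 0.1
sigma = np.full(A, 0.2)

phi = survival_prob(1, rng.normal(size=n_env),
                    rng.normal(size=n_ti), rng.normal(size=n_tv),
                    alpha, beta, gamma, delta, sigma)
print(round(phi, 3))
```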
Issue 1: Missing Data

The capture histories and time-invariant covariate values are presented in the form shown below.
[Slide figure: each individual's capture history alongside its covariate values, with the missing covariate values indicated.]
Note that there are also many missing values for the time-dependent weight covariate.
Problem


Given the set of covariate values, the
corresponding survival rate can be obtained, and
hence the likelihood calculated.
However, a complication arises when there are unknown (i.e. missing) covariate values, since there is then no longer a simple, explicit expression for the likelihood.
Missing Data: Classical Approaches

Typical classical approaches to missing data
problems include:





Ignoring individuals with missing covariate values;
EM algorithm (can be difficult to implement and
computationally expensive);
Imputation of missing values using some underlying
model (e.g. Gompertz curve).
Conditional approach for time-varying covariates
(Catchpole, Morgan and Tavecchia, 2006 – in submission)
We consider a Bayesian approach, where we
assume an underlying model for the missing data
which allows us to account for the corresponding
uncertainty of the missing values.
Bayesian Approach




Suppose that we wish to make inference on the parameters θ, and that the observed data correspond to the capture histories, c, and covariate values, vobs.
Then, Bayes' Theorem states:
π(θ | c, vobs) ∝ L(c, vobs | θ) p(θ).
The posterior distribution is very complex and so
we use Markov chain Monte Carlo (MCMC) to
obtain estimates of the posterior statistics of
interest.
However, in our case the likelihood is analytically
intractable, due to the missing covariate values.
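The talk gives no code, but a generic random-walk Metropolis-Hastings sketch in Python (with a toy log-posterior standing in for log L(c, vobs | θ) + log p(θ)) illustrates the kind of MCMC machinery being referred to.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_post(theta):
    """Toy log-posterior (stand-in for log L(c, v_obs | theta) + log p(theta))."""
    return -0.5 * np.sum(theta ** 2)

def metropolis_hastings(log_post, theta0, n_iter=5000, step=0.5):
    theta = np.asarray(theta0, dtype=float)
    samples = []
    for _ in range(n_iter):
        prop = theta + rng.normal(0.0, step, size=theta.shape)  # symmetric proposal
        if np.log(rng.uniform()) < log_post(prop) - log_post(theta):
            theta = prop                                        # accept
        samples.append(theta.copy())
    return np.array(samples)

draws = metropolis_hastings(log_post, theta0=np.zeros(3))
print(draws.mean(axis=0), draws.std(axis=0))   # posterior summaries
```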
Auxiliary Variables

We treat the missing covariate values (vmis) as parameters or auxiliary variables (AVs).
We then form the joint posterior distribution over the parameters, θ, and AVs, given the capture histories c and observed covariate values vobs:
π(θ, vmis | c, vobs) ∝ L(c | θ, vmis, vobs) f(vmis, vobs | θ) p(θ).
Here L(c | θ, vmis, vobs) is the likelihood of the capture histories given all the covariate values (observed and imputed missing values); f(vmis, vobs | θ) is the underlying model for the covariate values; and p(θ) is the prior on the parameters.
We can now sample from the joint posterior distribution π(θ, vmis | c, vobs).
We can integrate out over the missing covariate values, vmis, within the MCMC algorithm to obtain a sample from π(θ | c, vobs).
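A schematic sketch (my own pseudo-implementation, not the authors' algorithm) of this data-augmentation idea: each MCMC sweep updates the parameters given the completed covariates and then the missing covariates given the parameters; discarding vmis from the output marginalises over the missing values. The toy conditional updates below stand in for the real full-conditional kernels.

```python
import numpy as np

rng = np.random.default_rng(2)

def mcmc_with_missing_covariates(update_theta, update_vmis,
                                 theta0, vmis0, n_iter=1000):
    """Alternate updates of parameters theta and missing covariates v_mis.
    update_theta(theta, vmis) and update_vmis(theta, vmis) each return a draw
    from (an MCMC kernel targeting) the corresponding full conditional."""
    theta, vmis = theta0, vmis0
    theta_draws = []
    for _ in range(n_iter):
        theta = update_theta(theta, vmis)   # theta | c, v_obs, v_mis
        vmis = update_vmis(theta, vmis)     # v_mis | c, v_obs, theta
        theta_draws.append(theta)
    return np.array(theta_draws)            # marginal draws from pi(theta | c, v_obs)

# Toy stand-ins: a bivariate normal "posterior" where vmis plays the role
# of a single latent coordinate (illustration only).
RHO = 0.6
def update_theta(theta, vmis):
    return rng.normal(RHO * vmis, np.sqrt(1 - RHO ** 2))
def update_vmis(theta, vmis):
    return rng.normal(RHO * theta, np.sqrt(1 - RHO ** 2))

draws = mcmc_with_missing_covariates(update_theta, update_vmis, 0.0, 0.0)
print(draws.mean(), draws.std())
```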

f(vmis, vobs | θ) – Categorical Data

For categorical data (coat type and horn type), we assume the following model.
Let y1,i denote the horn type of individual i. Then, y1,i ∈ {1, 2, 3}, and we assume that
y1,i ~ Multinomial(1, q),
where q = {q1, q2, q3}.
Thus, we have additional parameters, q, which can be regarded as the underlying probabilities of each horn type.
The q's are updated within the MCMC algorithm, as are the unknown y1,i's.
We assume the analogous model for coat type.
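As a hedged sketch of how such a categorical covariate model might be updated within the MCMC algorithm, assuming a conjugate Dirichlet prior on q (the prior is not specified in the talk): q is drawn given the observed horn types, and the missing horn types are then imputed from q. For simplicity the capture-history likelihood weighting of each imputed value is omitted, as noted in the comments.

```python
import numpy as np

rng = np.random.default_rng(3)

def update_categorical_covariate(y, missing, alpha=(1.0, 1.0, 1.0)):
    """One Gibbs sweep for a 3-level categorical covariate with missing values,
    under y_i ~ Multinomial(1, q) and an assumed Dirichlet(alpha) prior on q.
    The capture-history likelihood term is ignored here; in the full algorithm
    the conditional for each missing y_i would also be weighted by the
    likelihood of that individual's capture history."""
    counts = np.bincount(y[~missing], minlength=4)[1:]          # levels coded 1..3
    q = rng.dirichlet(np.asarray(alpha) + counts)               # q | observed y
    y = y.copy()
    y[missing] = rng.choice([1, 2, 3], size=missing.sum(), p=q) # impute missing y
    return y, q

# Toy data: 10 individuals, levels 1-3, with two missing values (coded 0).
y = np.array([1, 2, 3, 1, 1, 2, 0, 3, 0, 2])
missing = (y == 0)
y_new, q = update_categorical_covariate(y, missing)
print(q, y_new)
```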
f(vmis, vobs | θ) – Continuous Data

Let y3,i denote the birth weight of individual i. Then, a possible model is to assume that
y3,i ~ N(μ, σ2),
where μ and σ2 are to be estimated.
For the weight of individual i, aged a at time t, denoted by z1,i,t, we set
z1,i,t ~ N(z1,i,t-1 + ηa + τt, σw2),
where the parameters ηa, τt and σw2 are to be estimated.
In general, the modelling assumptions will depend on the system under study.
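A minimal sketch of imputing a missing weight under the growth model above (the symbols eta_a and tau_t follow my reconstruction of the slide's age and year effects, and the numerical values are purely illustrative).

```python
import numpy as np

rng = np.random.default_rng(4)

def impute_weight(prev_weight, eta_a, tau_t, sigma_w):
    """Draw a missing weight z_{1,i,t} given the previous weight and the
    age/year effects, under the normal growth model from the slides.
    (In the full conditional this draw would also be weighted by the
    capture-history likelihood when weight enters the survival rate.)"""
    return rng.normal(prev_weight + eta_a + tau_t, sigma_w)

# Illustrative values on the normalised weight scale.
print(impute_weight(prev_weight=0.2, eta_a=0.1, tau_t=-0.05, sigma_w=0.3))
```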
Practical Implications




Within each step of the MCMC algorithm, we not only need to update the parameter values θ, but also impute the missing covariate values and random effects (if present).
This can be computationally expensive for large amounts of missing data.
The posterior results may depend on the underlying model for the covariates – a sensitivity analysis can be performed using different underlying models.
(State-space modelling can also be implemented
using similar ideas – see Steve’s talk).
Issue 2: Model Selection




For the sheep data set we can now deal with the
issue of missing covariate values.
But….. how do we decide which covariates to use
and/or the age structure? – often there may be a
large number of possible covariates and/or age
structures.
Discriminating between different models can often
be of particular interest, since they represent
competing biological hypotheses.
Model choice can also be important for future
predictions of the system.
Possible Models

We only want to consider biologically plausible models.
We have uncertainty on the age structure of the survival rates, and consider models of the form:
φ1:a; φa+1:b; …; φj+   (k age groups).
Both k and the covariate dependence for each age group are unknown a priori.
We consider similar age-type models for p and λ, with possible arbitrary time dependence.
E.g. φ1(N,BW), φ2:7(W,L+), φ8+(N,P) / p(t) / λ1, λ2+(t).
The number of possible models is immense!!
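A back-of-the-envelope count (my own illustrative calculation, not a figure from the talk) shows why: with 11 covariates available to each survival age group and any subset allowed, each group contributes 2^11 covariate combinations, which is multiplied over age groups and summed over possible age structures, before the choices for p and λ are even considered.

```python
from math import comb

n_cov = 11          # 6 individual + 5 environmental covariates (from the slides)
max_age = 10        # illustrative maximum age used to form age groups

total = 0
for k in range(1, max_age + 1):                    # k age groups for survival
    n_age_structures = comb(max_age - 1, k - 1)    # ways to cut ages 1..max_age into k contiguous groups
    total += n_age_structures * (2 ** n_cov) ** k  # covariate subsets per group
print(f"survival models alone: {total:,}")         # before even choosing p and lambda models
```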
Bayesian Approach

We treat the model itself as an unknown parameter to be estimated.
Then, applying Bayes' Theorem we obtain the posterior distribution over both parameter and model space:
π(θm, m | data) ∝ L(data | θm, m) p(θm | m) p(m).
Here θm denotes the parameters in model m; L is the likelihood; p(θm | m) is the prior on the parameters in model m; and p(m) is the prior on model m.
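A toy illustration (entirely my own, unrelated to the sheep data) of what this formula delivers when θm can be integrated out directly: two competing models for a small normal sample, with posterior model probabilities proportional to p(m) times the marginal likelihood.

```python
import numpy as np

rng = np.random.default_rng(5)
data = rng.normal(0.3, 1.0, size=20)        # toy data, unit observation variance

def log_norm_pdf(x, mean, sd):
    return -0.5 * np.log(2 * np.pi * sd**2) - (x - mean) ** 2 / (2 * sd**2)

# Model 1: mean fixed at 0 (no free parameters) -> marginal likelihood directly.
log_marg1 = log_norm_pdf(data, 0.0, 1.0).sum()

# Model 2: unknown mean theta with N(0, 1) prior -> integrate theta out on a grid.
grid = np.linspace(-10, 10, 4001)
log_lik = np.array([log_norm_pdf(data, t, 1.0).sum() for t in grid])
marg2 = np.sum(np.exp(log_lik + log_norm_pdf(grid, 0.0, 1.0))) * (grid[1] - grid[0])

prior_m = np.array([0.5, 0.5])              # p(m)
post_m = prior_m * np.array([np.exp(log_marg1), marg2])
post_m /= post_m.sum()
print(post_m)                               # posterior model probabilities
```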
Reversible Jump MCMC




The reversible jump (RJ)MCMC algorithm allows us to construct a Markov chain with stationary distribution equal to the posterior distribution.
It is simply an extension of the Metropolis-Hastings algorithm that allows moves between different dimensions.
This algorithm is needed because the number of parameters, θm, in model m, may differ between models.
Note that this algorithm needs only one Markov chain, regardless of the number of models.
Posterior Model Probabilities

Posterior model probabilities allow us to compare different models quantitatively.
The posterior probability of model m is defined as
π(m | data) = ∫ π(θm, m | data) dθm.
These are simply estimated within the RJMCMC algorithm as the proportion of time that the chain spends in the given model.
We are also able to obtain model-averaged estimates of parameters, which take into account both parameter and model uncertainty.
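Given output from a (hypothetical) reversible jump sampler that records the visited model and a quantity of interest at each iteration, these estimates reduce to simple tabulations, as in the sketch below (the recorded values are made up).

```python
import numpy as np
from collections import Counter

# Hypothetical RJMCMC output: model label and a shared quantity of interest
# (e.g. a survival rate for a given age and year) recorded at each iteration.
models = np.array(["m1", "m2", "m1", "m1", "m2", "m3", "m1", "m2"])
phi_draws = np.array([0.81, 0.78, 0.83, 0.80, 0.77, 0.74, 0.82, 0.79])

# Posterior model probabilities: proportion of iterations spent in each model.
counts = Counter(models)
post_prob = {m: c / len(models) for m, c in counts.items()}
print(post_prob)

# Model-averaged estimate: averaging over all iterations automatically
# weights each model by its posterior model probability.
print(phi_draws.mean())
```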
General Comments





The RJMCMC algorithm is the most widely used
algorithm to explore and summarise a posterior
distribution defined jointly over parameter and
model space.
The posterior model probabilities can be sensitive to the priors specified on the parameters, p(θ).
The acceptance probabilities for reversible jump moves are typically lower than for Metropolis-Hastings updates.
Longer simulations are generally needed to
explore the posterior distribution.
Only a single Markov chain is necessary,
irrespective of the number of possible models!!
Problem 1



Constructing efficient RJ moves can be difficult.
This is particularly the case when updating the
number of age groups for the survival rates.
This step involves:




Proposing a new age structure (typically add/remove a
change-point);
Proposing a covariate dependence for the new age
structure;
Proposing new parameter values for this new model.
It can be very difficult to construct the Markov
chain with (reasonably) high acceptance rates.
Example: Reversibility Problem





One obvious (and tried!) method for adding a
change-point would be as follows.
Suppose that we propose to split age group a:c.
We propose new age groups – a:b and b+1:c.
Consider a small perturbation of each (non-zero) regression parameter, e.g.
β'a:b = βa:c + ε and β'b+1:c = βa:c - ε,
where ε ~ N(0, σ2).
However, to satisfy the reversibility constraints, a
change-point can only be removed when the
covariate dependence structure is the same for
two consecutive age groups…
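A minimal sketch of this split proposal (my own version, using a generic coefficient beta; the likelihood and prior terms of the full RJ acceptance probability are omitted), including the Jacobian of the dimension-matching transformation.

```python
import numpy as np

rng = np.random.default_rng(6)

def propose_split(beta, sigma=0.1):
    """Split one age group's coefficient beta into two via a small perturbation
    epsilon ~ N(0, sigma^2):
        beta1' = beta + eps,  beta2' = beta - eps.
    Returns the proposed pair, the log proposal density of eps, and the log
    Jacobian of the (beta, eps) -> (beta1', beta2') transformation.  The full
    RJ acceptance probability would combine these with the likelihood and
    prior ratios (omitted here)."""
    eps = rng.normal(0.0, sigma)
    beta1, beta2 = beta + eps, beta - eps
    log_q_eps = -0.5 * np.log(2 * np.pi * sigma**2) - eps**2 / (2 * sigma**2)
    log_jacobian = np.log(2.0)          # |d(beta1', beta2')/d(beta, eps)| = 2
    return (beta1, beta2), log_q_eps, log_jacobian

print(propose_split(0.4))
```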
Example: High Posterior Mass

An alternative proposal would be to set β'a:b = βa:c (i.e. keep all parameters the same for a:b).
Propose a model (in terms of covariates) for the new age group b+1:c, either:
choosing each possible model with equal probability (so the reverse move is always possible); or
choosing a model that differs by at most one individual and one environmental covariate.
The problem now lies in proposing "sensible" parameter values for the new model.
One approach is to use posterior estimates of the parameters from a "saturated" model (full covariate dependence for some age structure) as the mean of the proposal distribution.
Problem 2

Consider the missing covariates vmis.
Then, if the covariate is present in the model, we have
π(vmis | vobs, θ, data) ∝ L(data | θ, vobs, vmis) f(vmis, vobs | θ).
However, if the covariate is not present in the model, we have
π(vmis | vobs, θ, data) ∝ f(vmis, vobs | θ).
Thus, adding (or removing) the covariate may be difficult, due to the (potentially) quite different posterior conditional distributions.
One way to avoid this is to update the missing covariate values simultaneously within the model move.
Soay Sheep Analysis


We now use these techniques for analysing the
(complex) Soay sheep data.
Discussion with experts identified several points
of particular interest:




the age-dependence of the parameters;
identification of the covariates influencing the survival
rates;
whether random effects are present.
We focus on these issues in the results presented.
Prior Specification




We place vague priors on the parameters present
in each model.
Priors also need to be specified on the models.
Placing an equal prior on each model places a
high prior mass on models with a large number of
age groups, since the number of models increases
with the number of age groups.
Thus, we specify an equal prior probability on each marginal age structure, and a flat prior over the covariate dependence (or time dependence) given the age structure.
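A small numerical sketch (my own construction, with an illustrative maximum age and covariate count) of the difference between the two prior choices: a flat prior over all models piles prior mass on age structures with many groups, whereas an equal prior per age structure does not.

```python
from math import comb

n_cov = 11
max_age = 10

# Number of covariate-dependence models for age structures with k groups.
models_per_k = {k: comb(max_age - 1, k - 1) * (2 ** n_cov) ** k
                for k in range(1, max_age + 1)}
total_models = sum(models_per_k.values())

# Flat prior over all models: induced prior mass on "k groups" grows with k.
flat_prior_on_k = {k: n / total_models for k, n in models_per_k.items()}

# Prior used here: equal mass on each age structure, spread evenly over the
# covariate-dependence models within that structure.
n_structures = sum(comb(max_age - 1, k - 1) for k in range(1, max_age + 1))
structured_prior_on_k = {k: comb(max_age - 1, k - 1) / n_structures
                         for k in range(1, max_age + 1)}

print(flat_prior_on_k[1], flat_prior_on_k[10])          # flat prior piles mass on large k
print(structured_prior_on_k[1], structured_prior_on_k[10])
```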
Results: Survival Rates

The marginal models for the age structure with the largest posterior support are:

Age structure        Posterior probability
1; 2:7; 8+           0.943
1; 2:7; 8:9; 10+     0.052

Note that with probability 1, lambs have a distinct survival rate.
Often the models with most posterior support are close neighbours of each other.
Covariate Dependence

[Figure: posterior probability of each covariate's dependence in the survival rate, plotted against age (2-10), with reference lines corresponding to Bayes factors (BF) of 3 and 20. One panel shows the environmental covariates (NAO index, population size, March rainfall, March temperature, autumn rainfall) and the random effects; the other shows the individual covariates (coat type, horn type, birth weight, weight, lamb before census, lamb following census).]
Influence of Covariates

[Figure: estimated survival rate (0-1) plotted against weight (5-35) for age 1 animals, and against population size (0-600) for animals aged 10+.]
Marginalisations





Presenting only marginal results can hide some of
the more intricate details.
This is most clearly seen from another MRR data
analysis relating to the UK lapwing population.
There are two covariates – time and “fdays” – a
measure of the harshness of the winter.
Without going into too many details, we had MRR
data and survey data (estimates of total
population size).
An integrated analysis was performed, jointly
analysing both data sets.
Marginal Models

The marginal models with most posterior support for the demographic parameters are:

(a) φ1 – first-year survival
Model            Post. prob.
φ1(fdays)        0.636
φ1               0.331

(b) φa – adult survival
Model            Post. prob.
φa(fdays, t)     0.522
φa(fdays)        0.407

(c) ρ – productivity
Model            Post. prob.
ρ                0.497
ρ(t)             0.393

(d) λ – recovery rate
Model            Post. prob.
λ(t)             0.730
λ(fdays, t)      0.270

This is joint work with Steve Brooks, Chiara Mazzetta and Steve Freeman.
Warning!

The previous (marginal) posterior estimates hide some of the intricate details.
The marginal models of the adult survival rate and the productivity rate are highly correlated.
In particular, the joint posterior probabilities for these parameters are:

Model                  Post. prob.
φa(fdays, t), ρ        0.45
φa(fdays), ρ(t)        0.36

Thus, there is strong evidence that either the adult survival rate OR the productivity rate is time dependent.
Summary





Bayesian techniques are widely used, and allow
complex data analyses.
Covariates can be used to explain both temporal
and individual heterogeneity.
However, missing values can add another layer of
complexity and the need to make additional
assumptions.
The RJMCMC algorithm can explore the possible
models and discriminate between competing
biological hypotheses.
The analysis of the Soay sheep has stimulated new discussion with biologists about the factors that affect their survival rates.
Further Research




The development of efficient and generic RJMCMC
algorithms.
Assessing the sensitivity of the posterior to the model specification for the covariates.
Investigation of the missing at random
assumption for the covariates in both classical
and Bayesian frameworks.
Prediction in the presence of time-varying
covariate information.