Transcript Document

Quantitative Environmental
Reconstructions in Palaeolimnology:
Progress, current status, & future needs
John Birks
University of Bergen,
University College London, and
University of Oxford
12th International Paleolimnology
Symposium (IPS 2012)
Glasgow 21-24 August 2012
INTRODUCTION
Early attempts at quantitative environmental
reconstructions used presence of one or more ‘indicator
species’ (e.g. Andersson, Samuelsson, Iversen, Grichuk,
Coope) or species groups (e.g. Hustedt, Nygaard). Major
development in Quaternary science occurred in 1971
with publication of the classic paper by Imbrie & Kipp.
Paper laid the foundation
of calibration functions
(transfer functions) as a
tool for the quantitative
reconstruction of past
environments using the
whole fossil assemblage,
not just a few indicator
species. Paradigm shift.
Quickly followed by Webb & Bryson (1972) using
pollen data in the Midwest, USA, to reconstruct
climate. Used linear-based canonical correlation
analysis.
Basic Approach to Quantitative
Environmental Reconstruction –
Calibration-in-Space
Fossil data (e.g. diatoms), ‘proxy data’: Yf, t samples × m taxa (1, ..., m)
Environmental variable (e.g. pH): Xf, t samples × 1 variable
Xf is unknown: to be estimated or reconstructed
To solve for Xf, need modern data about species
and pH from n samples
Modern biology (e.g. diatoms): Ym, n samples × m taxa (1, ..., m)
Modern environment (e.g. pH): Xm, n samples × 1 variable
Model Ym in relation to Xm to derive modern
calibration function Ûm
Apply Ûm to Yf to estimate past environment Xf
Imbrie & Kipp provided the basic theory and
assumptions, a robust method, and modern and
fossil data
[Figure: calibration-in-space. Modern data Xm and Ym give the transfer function Ûm, which is applied to Yf to estimate Xf. Based on an unpublished diagram by Steve Juggins]
Alternative Approach –
Calibration-in-Time
Fossil data (e.g. diatoms), 1, ..., m taxa:
Y0, p samples
Yf, t samples
Environmental variable (e.g. pH), 1 variable:
X0, p observations, known from historical data
Xf, t samples, unknown, to be reconstructed
To solve for Xf, model Y0 in relation to X0, derive and
apply calibration function F̂0 to Yf to estimate Xf
All done at one site
Potential problems
1. Temporal autocorrelation in Y0 and X0. How
many independent samples are there? What is
n?
2. Chronological – sample correlation between
Y0 and X0
3. Applicability – can the model be applied to
sites other than the site where the
calibration is made? Similar problem of
applicability with intra-lake approach (Ym and
Xm to derive Ûm from one lake applied to
other non-training set lakes).
Only consider Calibration-in-Space
In palaeolimnology, after Nygaard’s (1956) α, ω,
and ε indices and Meriläinen’s (1967) calibration,
first major step towards robust environmental
reconstructions was made in 1982 by Renberg &
Hellberg with their Index B
ind = indifferent species (either side of pH7)
acp = acidophilous (pH<7)
acb = acidobiontic (pH<7, optimum 5.5 or less)
alk = alkaliphilous (pH7 or more)
alb = alkalibiontic (pH>7)
Renberg & Hellberg (1982)
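The Index B formula itself did not survive in this transcript. A minimal sketch of how such an index turns group percentages into pH is given below. The weights (5, 40, 3.5, 108) and the regression pH = 6.40 - 0.85 log10(Index B) are quoted from how Renberg & Hellberg (1982) is commonly cited, not from this slide, so they are assumptions to check against the original paper. Function and argument names are illustrative only.

```python
import math

def index_b(ind, acp, acb, alk, alb):
    """Index B from the percentages of the five pH groups.

    Assumed weights (not taken from the slide):
    numerator   = ind + 5*acp + 40*acb
    denominator = ind + 3.5*alk + 108*alb
    """
    numerator = ind + 5.0 * acp + 40.0 * acb
    denominator = ind + 3.5 * alk + 108.0 * alb
    if denominator <= 0:
        raise ValueError("need some ind, alk or alb taxa in the sample")
    return numerator / denominator

def ph_from_index_b(b):
    """Assumed calibration: pH = 6.40 - 0.85 * log10(Index B)."""
    return 6.40 - 0.85 * math.log10(b)

# Hypothetical assemblage dominated by acidophilous taxa (percentages)
b = index_b(ind=20.0, acp=60.0, acb=15.0, alk=5.0, alb=0.0)
inferred_ph = ph_from_index_b(b)
```

The index needs a priori assignment of every taxon to a pH group, which is the main practical limitation noted later for this approach.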
Represented a great breakthrough, only 30 years ago
Other approaches were developed by Don Charles,
Ron Davis, Roger Flower, and others, involving
inverse multiple linear regression using Hustedt
pH groups or individual taxa as ‘predictors’ of pH,
e.g.
pH = 10.04acb – 0.35acp – 0.01ind – 0.01alk + 7.99
(r2 = 0.91)
(Charles 1982)
State of the subject in palaeolimnology prior to 1989
1986
Major breakthrough occurred in 1989 as result of
work of Cajo ter Braak with his 1987 doctoral thesis
Advances in Ecological Research
1988
Several important papers that have been very
influential on quantitative palaeolimnology
Through his work at the Research Institute for
Nature Management at Leersum, ter Braak advised
ecologists about data analysis and developed many
new techniques to help answer particular ecological
questions.
One such ecologist was the diatomist Herman van
Dam who was working on the impact of acidification
on diatoms and water chemistry of Dutch moorland
ponds (this work led ter Braak to publish his first
paper on multivariate data analysis (principal
component biplots) in 1982).
This collaboration led to ter Braak & van Dam (1989)
Changed the approaches to quantitative
environmental reconstruction in palaeolimnology
(and in much of palaeoecology)
Fortunately coincided with Surface Water Acidification
Project’s (SWAP) Palaeolimnology Programme led by
Rick Battarbee and Ingemar Renberg 1987-1990.
ter Braak & van Dam (1989)
99 training-set diatom-pH samples; 61 independent test-set
diatom samples

Method                                   Test-set RMSEP
Maximum-likelihood
  Gaussian logit                         0.63
  Multinomial logit (equal tolerances)   0.70
  Multinomial logit                      0.67
Weighted averaging
  No tolerance downweighting             0.71
  Tolerance downweighting                0.74
  pH groups                              0.75
Multiple regression
  26 taxa                                2.24
  7 step-wise selected taxa              2.74
  pH groups                              0.71
Index B                                  0.83
Correspondence analysis regression       0.71
Set the scene for weighted-averaging based methods –
computationally simple, heuristic equivalents to the theoretically
more rigorous maximum-likelihood methods.
Biological Proxy-Data Properties
• Contain many taxa (200-300)
• Contain many zero values (absences)
• Commonly expressed as proportions or
percentages - "closed" compositional data
• Multicollinearity between variables
• Quantitative data are highly variable, invariably
show a skewed distribution. Few common taxa,
many rare taxa
• Can show spatial autocorrelation e.g. forams,
dinocysts, pollen
• Taxa generally have non-linear relationship with
their environment, and the relationship is often a
unimodal function of the environmental variables
Species Response Models
LINEAR
A straight line displays the
linear relation between the
abundance value (y) of a
species and an environmental
variable (x). Modelled by
linear regression.
UNIMODAL
A unimodal relation
between the abundance
value (y) of a species and
an environmental variable
(x). (u=optimum or
mode; t=tolerance;
c=maximum). Modelled
by Gaussian logit
regression (GLR)
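The two response models can be written down in a few lines. This is a sketch only: the Gaussian curve below uses the parameters named on the slide (u, t, c), while the function names and the example values are assumptions chosen for illustration.

```python
import numpy as np

def linear_response(x, a, b):
    """Linear model: expected abundance changes at a constant rate b."""
    return a + b * np.asarray(x, dtype=float)

def gaussian_response(x, u, t, c):
    """Unimodal (Gaussian) model: maximum c at the optimum u,
    with the width of the curve set by the tolerance t."""
    x = np.asarray(x, dtype=float)
    return c * np.exp(-0.5 * ((x - u) / t) ** 2)

# Hypothetical diatom taxon with a pH optimum of 6.0 and tolerance of 0.5
ph = np.linspace(4.0, 8.0, 9)
abundance = gaussian_response(ph, u=6.0, t=0.5, c=40.0)
```

At one tolerance away from the optimum the expected abundance has fallen to about 61% of the maximum, which is why a short gradient can make a unimodal taxon look linear.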
Environmental Data Properties
• Generally few variables, often show a skewed
distribution
• Strong multicollinearity (e.g. July mean temperature,
growing season duration, annual mean temperature)
• Often difficult to obtain (few modern climate stations,
corrections for altitude of sampling sites, etc.)
• Strong spatial autocorrelation (tendency of values at
sites close to each other to resemble one another more
than randomly selected sites). Values at one site can be
partially predicted from its values at neighbouring sites.
• Problem of nearly all data in real world. Recognised by
Francis Galton in 1889. First methods to eliminate
spurious correlation due to spatial position developed by
‘Student’ in 1914.
PROGRESS
Since 1971, calibration functions widely used in
palaeoceanography, terrestrial palaeoecology, and
palaeolimnology
Used with wide range of biological proxies
• foraminifera, radiolaria, marine diatoms,
coccolithophores
• pollen, testate amoebae, mollusca, bryophytes,
plant macrofossils
• diatoms, chrysophytes, chironomids, ostracods,
cladocerans
Now many different numerical reconstruction
methods – at least 26 methods published, many
minor variants of established methods
Reconstruction methods can be divided into three
main types (Birks et al. 2010)
1.Indicator-species approach – one or many
taxa considered as presence/absence
2.Similarity-based assemblage methods
involving a quantitative comparison between
past assemblages Yf and modern assemblages
Ym (e.g. MAT, smooth response surfaces)
3.Multivariate calibration methods involving a
quantitative calibration function Ûm estimated
from Xm and Ym, modern calibration or training
data-set (e.g. weighted averaging regression and
calibration)
Concentrate on calibration-function approach
Approaches to Estimating Calibration
Functions
1. Basic Numerical Models
• Classical Approach
Y = f(X) + error
(Y = biology, X = environment)
Estimate f by some mathematical procedure
and 'invert' estimated f to find unknown past
environment Xf from fossil data Yf
Xf ≈ f⁻¹(Yf)
Can be difficult computationally
• Inverse Approach
In practice, for various mathematical reasons, do an
inverse regression or calibration
X = g(Y) + error
Xf = g(Yf)
Obtain 'plug-in' estimate of past environment Xf from
fossil data Yf
f or g are calibration functions
Easier to compute g and nearly always performs
as well as classical approach
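The difference between the classical and inverse approaches can be shown with a single biological variable and a straight-line model. This is a deliberately simplified sketch with simulated data (all numbers and names are assumptions); real applications involve many taxa and usually unimodal responses.

```python
import numpy as np

rng = np.random.default_rng(0)
x_modern = np.linspace(4.5, 8.0, 40)                        # modern environment, e.g. pH
y_modern = 2.0 + 3.0 * x_modern + rng.normal(0.0, 0.5, 40)  # one biological variable

# Classical approach: fit Y = f(X) + error, then invert f
slope_f, intercept_f = np.polyfit(x_modern, y_modern, 1)

def classical_estimate(y_fossil):
    """Xf = f^-1(Yf) for the fitted straight line."""
    return (y_fossil - intercept_f) / slope_f

# Inverse approach: fit X = g(Y) + error and plug in the fossil value
slope_g, intercept_g = np.polyfit(y_modern, x_modern, 1)

def inverse_estimate(y_fossil):
    """Xf = g(Yf), the 'plug-in' estimate."""
    return intercept_g + slope_g * y_fossil

y_fossil = 20.0   # corresponds to x = 6.0 in the noise-free relationship
x_classical = classical_estimate(y_fossil)
x_inverse = inverse_estimate(y_fossil)
```

With one linear variable the inversion is trivial. With hundreds of taxa and non-linear f it is not, which is the computational argument for the inverse approach.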
2. Assumed Species Response Model
• Linear or unimodal
• No response model assumed (linear or non-linear)
3. Dimensionality of Model
• Full (all species considered)
• Reduced (selected components of species used)
4. Estimation Procedure for Model
• Global (estimate parametric functions,
extrapolation possible)
• Local (estimate non-parametric functions,
extrapolation not possible)
Birks et al. (2010)
Commonly Used Methods
Method                                   Approach  Response  Dimensionality  Estimation  Type
Principal components regression (PCR)    I         L (U)     R               G           CF
Index B                                  I         L         R               G           CF
Inverse multiple regression              I         L         R               G           CF
Partial least squares (PLS)              I         L         R               G           CF
Gaussian logit regression (GLR)          C         U         F               G           CF
Two-way weighted averaging (WA)          I         U         F               G           CF
WA-PLS                                   I         U         R               G           CF
Artificial neural networks (ANN)         I         NA        F               Ln          CF
Modern analogue technique (MAT)          I         NA        F               Ln          S
Smooth response surfaces                 C         NA        F               Ln          S
I = inverse; C = classical
L = linear; U = unimodal; NA = not assumed;
R = reduced dimensionality; F = full dimensionality;
G = global parametric estimation; Ln = local non-parametric estimation
CF = calibration-function based; S = similarity-based
Good reasons for preferring methods with assumed biological
response model, full dimensionality, and global parametric
estimation (ter Braak (1995), ter Braak et al. (1993), etc.)
1. Can test statistically if taxon A has a statistically
significant relation to particular environmental variables
2. Can develop ‘artificial’ simulated data with realistic
assumptions for numerical ‘experiments’
3. Such methods have clear and testable assumptions
– less of a ‘black box’ than e.g. artificial neural networks
4. Can develop model evaluation or diagnostic
procedures analogous to regression diagnostics in
statistical modelling
5. Having a statistical basis, can adopt well-established
principles of statistical model selection and testing.
Minimises ‘ad hoc’ aspects of MAT
“To make sense of an observation, everyone needs a model
… whether he or she knows it or not” Marc Kéry (2010)
Basic Requirements in Quantitative
Palaeoenvironmental Reconstructions
1. Need biological system with abundant fossils that
is responsive and sensitive to environmental
variables of interest.
2. Need a large, high-quality training set of modern
samples. Should be representative of the likely
range of variables, be of consistent taxonomy and
nomenclature, be of highest possible taxonomic
detail, be of comparable quality (methodology,
count size, etc.), and be from the same sedimentary
environment.
3. Need fossil set of comparable taxonomy,
nomenclature, quality, and sedimentary
environment.
4. Need robust statistical methods for regression
and calibration that can adequately model taxa and
their environment with the lowest possible error of
prediction and the lowest bias possible and sound
methods for model selection.
5. Need means of establishing if reconstruction is
statistically significant.
6. Need statistical estimation of standard errors of
prediction for each reconstructed value.
7. Need statistical and ecological evaluation and
validation of the reconstruction and of each
reconstructed value.
Birks et al. (1990)
Early Methods Used
Principal components regression (PCR)
= Imbrie & Kipp (1971) approach
[Diagram: Ym is summarised as principal components PC1, PC2, PC3, which are then related to Xm]
Multiple linear regression or
quadratic regression of Xm on
PC1, PC2, PC3, etc, to derive
Ûm. Express Yf as principal
components and apply Ûm to
estimate Xf
Principal components maximise
variance within Ym only
Selection of PCA components done visually until recently. Now
cross-validation is used to select model with fewest components,
lowest root mean squared error of prediction (RMSEP), & lowest
maximum bias. ‘Minimal adequate model’ in statistical modelling
Inverse, linear, reduced dimensionality, global estimation. Linear
response model is assumed, although non-linear responses are possible.
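The PCR steps above can be sketched as follows. This is a plain PCA followed by multiple linear regression on simulated data, not the exact Imbrie & Kipp (1971) procedure; the function names, the simulated loadings, and the choice of two components are assumptions for illustration.

```python
import numpy as np

def pcr_fit(Y_m, x_m, n_components):
    """Regress x_m on the first principal components of Y_m."""
    y_mean = Y_m.mean(axis=0)
    Yc = Y_m - y_mean
    # Rows of Vt are the principal axes of the centred modern data
    _, _, Vt = np.linalg.svd(Yc, full_matrices=False)
    axes = Vt[:n_components].T                      # taxa x components
    scores = Yc @ axes                              # sample scores on PC1, PC2, ...
    design = np.column_stack([np.ones(len(x_m)), scores])
    coef, *_ = np.linalg.lstsq(design, x_m, rcond=None)
    return y_mean, axes, coef

def pcr_predict(model, Y_f):
    """Express fossil samples on the modern axes and apply the regression."""
    y_mean, axes, coef = model
    scores = (Y_f - y_mean) @ axes
    return coef[0] + scores @ coef[1:]

# Simulated modern data: 50 samples, 8 taxa responding linearly to x
rng = np.random.default_rng(1)
x_m = rng.uniform(0.0, 10.0, 50)
loadings = rng.normal(size=8)
Y_m = np.outer(x_m, loadings) + rng.normal(0.0, 0.1, (50, 8))

model = pcr_fit(Y_m, x_m, n_components=2)
fitted = pcr_predict(model, Y_m)
apparent_rmse = np.sqrt(np.mean((fitted - x_m) ** 2))
```

The components are chosen to maximise variance in Ym alone, so nothing guarantees that the leading components are the ones most related to Xm. In practice the number of components would be chosen by cross-validation rather than fixed in advance as here.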
Index B approach
[Diagram: Ym (taxa grouped as Ind, Acp, Acb, Alk, Alb) + Xm → Index B (Ûm); Index B + Yf (fossil data) → Xf (pH reconstruction)]
Inverse, linear, reduced dimensionality, global parametric estimation.
Needs a priori taxon groupings
Related inverse multiple linear regression approach
(Davis & Berge 1980, Charles 1982, Davis et al. 1983, Davis &
Anderson 1984, Flower 1986)
[Diagram: Ym (taxa grouped as Ind, Acp, Acb, Alk, Alb) + Xm → Ûm; Ûm + Yf (fossil data) → Xf (pH reconstruction)]
Inverse, linear, reduced dimensionality, global parametric estimation.
Linear model is assumed, although non-linear responses are possible.
Can be done with a priori species groups or individual taxa (forward
selection).
Major Methods Used
Gaussian logit regression (GLR)
and maximum likelihood (ML) calibration
ter Braak & van Dam (1989)
[Diagram: Ym + Xm (modern data) → taxon GLR regression coefficients b0, b1, b2 for all taxa (Ûm); Yf (fossil data) + Ûm → ML calibration → Xf (environmental reconstruction)]
Classical, unimodal, full dimensionality, global estimation. Robust to
spatial autocorrelation. Can be computationally difficult. ML calibration
finds the value of Xf that maximises the likelihood function given Yf
and Ûm
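A sketch of the two GLR steps. The conversion from the coefficients b0, b1, b2 of logit(p) = b0 + b1 x + b2 x² to optimum, tolerance, and maximum uses the standard Gaussian-logit identities. The calibration step below is a simple grid search with a Bernoulli-type log-likelihood on proportions, which is an assumption made to keep the example short and is not the estimation routine used by ter Braak & van Dam (1989). All taxa and values are hypothetical.

```python
import numpy as np

def glr_to_optimum(b0, b1, b2):
    """Optimum u, tolerance t and maximum c from Gaussian logit coefficients."""
    if b2 >= 0:
        raise ValueError("b2 must be negative for a unimodal response")
    u = -b1 / (2.0 * b2)
    t = 1.0 / np.sqrt(-2.0 * b2)
    c = 1.0 / (1.0 + np.exp(-(b0 + b1 * u + b2 * u ** 2)))
    return u, t, c

def ml_calibrate(y_f, coefs, grid):
    """Grid value of x that maximises the log-likelihood of one fossil sample.

    y_f: proportions (0-1) of m taxa; coefs: m x 3 array of (b0, b1, b2).
    """
    design = np.column_stack([np.ones_like(grid), grid, grid ** 2])
    p = 1.0 / (1.0 + np.exp(-(design @ coefs.T)))      # grid points x taxa
    p = np.clip(p, 1e-9, 1.0 - 1e-9)
    loglik = (y_f * np.log(p) + (1.0 - y_f) * np.log(1.0 - p)).sum(axis=1)
    return grid[np.argmax(loglik)]

# Three hypothetical taxa: optima 5, 6, 7; tolerance 0.5; maximum 0.5
optima, tol = np.array([5.0, 6.0, 7.0]), 0.5
coefs = np.column_stack([-optima ** 2 / (2 * tol ** 2),
                         optima / tol ** 2,
                         np.full(3, -1.0 / (2 * tol ** 2))])

true_x = 6.2
y_f = 1.0 / (1.0 + np.exp(-(coefs @ np.array([1.0, true_x, true_x ** 2]))))
x_hat = ml_calibrate(y_f, coefs, np.linspace(4.0, 8.0, 401))
```

The grid search makes the cost visible: every candidate value of Xf requires evaluating the response curves of all taxa, and this has to be repeated for every fossil sample.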
Two-way weighted averaging regression and
calibration (WA)
ter Braak & van Dam (1989); Birks et al. (1990)
[Diagram: Ym + Xm (modern data) → WA regression → taxa WA optima U1, U2, ..., Ut (‘calibration function’ Ûm); Yf (fossil data) + Ûm → WA calibration → Xf (environmental reconstruction)]
Inverse, unimodal, full dimensionality, global parametric estimation.
Robust to spatial autocorrelation. First used in Quaternary science by
Lynts and Judd (1971) Science 171: 1143-1144
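Two-way WA is short enough to write out in full. The sketch below uses inverse deshrinking and simulated Gaussian responses; function names and data are assumptions, and it presumes every taxon occurs in at least one modern sample and no sample is empty.

```python
import numpy as np

def wa_fit(Y_m, x_m):
    """WA regression: taxon optima are abundance-weighted means of x_m."""
    optima = (Y_m * x_m[:, None]).sum(axis=0) / Y_m.sum(axis=0)
    # Initial estimates for the modern samples (second averaging step)
    initial = (Y_m @ optima) / Y_m.sum(axis=1)
    # Inverse deshrinking: regress observed values on the initial estimates
    slope, intercept = np.polyfit(initial, x_m, 1)
    return optima, intercept, slope

def wa_predict(model, Y_f):
    """WA calibration: weighted average of optima, then deshrink."""
    optima, intercept, slope = model
    initial = (Y_f @ optima) / Y_f.sum(axis=1)
    return intercept + slope * initial

# Simulated training set: 60 lakes, 30 taxa with Gaussian responses to pH
rng = np.random.default_rng(42)
x_m = rng.uniform(4.5, 8.0, 60)
true_optima = np.linspace(4.0, 8.5, 30)
Y_m = 100.0 * np.exp(-0.5 * ((x_m[:, None] - true_optima[None, :]) / 0.6) ** 2)

model = wa_fit(Y_m, x_m)
fitted = wa_predict(model, Y_m)
apparent_rmse = np.sqrt(np.mean((fitted - x_m) ** 2))
```

Averages are taken twice, which shrinks the range of the initial estimates; the deshrinking regression restores it. The error computed here is apparent (the same samples are used for fitting and prediction), so it understates the RMSEP that cross-validation would give.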
1. Ecologically plausible – based on unimodal species
response model.
2. Mathematically simple but has a rigorous mathematical
theory. Properties fairly well known now.
3. Empirically powerful:
a.does not assume linear responses
b.not hindered by too many taxa, in fact helped by many
taxa! Full dimensionality
c. relatively insensitive to outliers
4. Tests with simulated and real data – at its best with noisy,
taxon-rich compositional percentage data with many zero
values over long environmental gradients.
5. Because of its computational simplicity, can derive error
estimates for predicted inferred values by bootstrapping.
6. Does well in ‘non-analogue’ situations as it is not based on
the assemblage as a whole but on INDIVIDUAL taxa optima
and/or tolerances. Robust to spatial autocorrelation. Global
parametric estimation.
7. Ignores absences of taxa.
Weaknesses
1. Sensitive to distribution of environmental variable in
training set, leading to ‘edge effects’ where responses
are truncated.
[Figure: WA and GLR compared along a pH gradient. J. Oksanen (2002)]
2. Disregards residual correlations in biological data.
Can extend WA to WA-partial least squares to include
residual correlations in biological data in an attempt to
improve estimates of taxon optima
Weighted averaging partial least squares regression
and calibration (WA-PLS)
ter Braak & Juggins (1993) and ter Braak et al. (1993)
[Diagram: Ym → components PLS1, PLS2, PLS3 → WA-PLS regression with Xm → coefficients βm (Ûm); Yf + Ûm → WA-PLS calibration → Xf]
Components selected to maximise
covariance between taxon weighted
averages and environmental variable X
Selection of number of PLS components to include based on cross-validation. Model selected should have fewest components possible
and low RMSEP and maximum bias – minimal adequate model.
Inverse, unimodal, reduced dimensionality, global parametric
estimation. Can be sensitive to spatial autocorrelation.
Comparison of different methods
Imbrie & Kipp (1971) data
Model performance statistic is root mean squared error
of prediction (RMSEP) based on leave-one-out cross-validation

                                               RMSEP
Response  Method                               Summer SST  Winter SST
Linear    PC regression                        2.55°C      2.57°C
Linear    PC regression with quadratic terms   2.15°C      1.54°C
Unimodal  CA regression                        1.72°C      1.37°C
Unimodal  GLR (ML)                             1.63°C      1.20°C
Unimodal  WA                                   2.02°C      1.07°C
Unimodal  WA-PLS                               1.53°C      1.17°C
Shows importance of using a unimodal-based method
(ter Braak et al. (1993))
Other Areas of Progress
Besides the development of new methods for
deriving calibration functions and of modern
calibration data-sets, there have been major
developments in model evaluation and
selection and in reconstruction assessment,
namely statistics of calibration functions and in
understanding the strengths and weaknesses of
different methods and in their underlying theory
See Juggins (2012 QSR submitted)
1. Model evaluation and selection
Tendency to use several different methods and to
select so-called ‘best’ method. Resulted in a shift
away from an obsession with the model with lowest
RMSEP or, even worse, the highest r2.
More concern with model performance statistics
including estimates of bias and number of
components fitted (e.g. in WA-PLS).
Model performance usually based on some form of
internal cross-validation (leave-one-out, n-fold
cross-validation, or bootstrapping) or external
cross-validation with independent test-set.
Juggins & Birks (2012)
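The performance statistics mentioned above are simple to compute once cross-validated predictions are available. A sketch follows; the definition of maximum bias used here (largest absolute mean residual over ten equal intervals of the observed gradient) is the one I understand to be conventional, and is an assumption rather than something stated on the slide. Names are illustrative.

```python
import numpy as np

def performance(observed, predicted, n_intervals=10):
    """RMSEP, r2, mean bias and maximum bias from paired values.

    The result is a prediction error only if `predicted` comes from
    cross-validation or an independent test-set.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residuals = predicted - observed
    rmsep = np.sqrt(np.mean(residuals ** 2))
    r2 = np.corrcoef(observed, predicted)[0, 1] ** 2
    mean_bias = residuals.mean()
    # Split the observed gradient into equal intervals and find the
    # interval whose mean residual is largest in absolute value
    edges = np.linspace(observed.min(), observed.max(), n_intervals + 1)
    interval = np.digitize(observed, edges[1:-1])
    interval_means = [residuals[interval == k].mean()
                      for k in range(n_intervals) if np.any(interval == k)]
    max_bias = max(abs(m) for m in interval_means)
    return {"rmsep": rmsep, "r2": r2, "mean_bias": mean_bias, "max_bias": max_bias}

# Hypothetical predictions that are uniformly one unit too high
obs = np.arange(10.0)
stats = performance(obs, obs + 1.0)
```

The example shows why r2 alone is a poor guide: the predictions are perfectly correlated with the observations (r2 = 1) yet every one of them is wrong by a full unit.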
Birks & Simpson (in press) revisited the classical
SWAP 167-sample diatom-pH calibration-set using
modern methods (WA, WAPLS, GLR, MAT, etc.)
1. Internal cross-validation, done 50 times
167 samples → 110 training-set samples
+ 20 optimisation-samples (no. of WAPLS components etc.)
+ 37 test-samples
2. External cross-validation, done 50 times
167 samples → 167 training-set samples
+ 23 external optimisation-samples
+ 50 external test-samples
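The internal design (110 + 20 + 37, repeated 50 times) amounts to repeated random partitioning of the 167 samples. A sketch of the partitioning step only, with hypothetical names and an arbitrary seed; the model fitting and optimisation inside each split are not shown.

```python
import numpy as np

def random_splits(n_samples=167, n_train=110, n_opt=20, n_test=37,
                  n_repeats=50, seed=0):
    """Yield (train, optimisation, test) index arrays for each repeat."""
    if n_train + n_opt + n_test != n_samples:
        raise ValueError("the three subsets must use every sample exactly once")
    rng = np.random.default_rng(seed)
    for _ in range(n_repeats):
        order = rng.permutation(n_samples)
        yield (order[:n_train],
               order[n_train:n_train + n_opt],
               order[n_train + n_opt:])

splits = list(random_splits())
train, optimise, test = splits[0]
```

Keeping the optimisation samples separate from the test samples matters: choices such as the number of WA-PLS components are tuned on the former, so the latter remain untouched when the final RMSEP is computed.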
[Figure: internal cross-validation, 37 test-samples, 50 randomisations. Birks & Simpson (in press)]
[Figure: external cross-validation, 50 test-samples, 50 randomisations. Birks & Simpson (in press)]
Internal cross-validation RMSEP values
(I = inverse; C = classical; M = monotonic; T = Tolerance downweighting)
WAI = WAC = WAM = WTM
< WATI = WATC = MAT < WAPLS < GLR
External cross-validation
GLR < WAM = WTM < WAI = WAPLS
< WAC < WATI < MAT < WATC
Which to use as a guide to model selection?
External cross-validation involving independent
test-set samples is ‘the appropriate benchmark to
compare methods’ because all sources of error are
considered (ter Braak & van Dam 1989)
van der Voet (1994) randomisation test of models
helps find ‘minimal adequate model’ (MAM).
Model with good performance statistics and fewest
number of fitted parameters. May be more than
one MAM.
More work needed on model selection using
criteria like Akaike Information Criterion (AIC)
where unnecessary parameters are penalised.
Active research area in ecology and evolutionary
biology today.
Of course, performance of modern model is being
assessed with other modern data, not with fossil
data! Major problem. External cross-validation
provides as rigorous a test as possible of
performance.
2. Effects of spatial autocorrelation
Estimating model performance in terms of RMSEP,
r2, maximum bias, etc, assumes that the test-set
is statistically independent of the training-set.
Cross-validation in presence of spatial
autocorrelation violates this assumption as test
samples are not spatially and statistically
independent.
Spatial autocorrelation property of almost all
environmental data and much ecological and
biological data.
Telford & Birks 2005 Quat. Sci. Rev. 24: 2173-2179
Telford 2006 Quat. Sci. Rev. 25: 1375-1382
Telford & Birks 2009 Quat. Sci. Rev. 28: 1309-1316
Telford & Birks 2011 Quat. Sci. Rev. 30: 3210-3213
Results show the apparent performance of some models is
enhanced as a result of spatial autocorrelation in oceans and
on land
Method     Effect of spatial autocorrelation   Estimation
MAT, ANN   High                                Local, non-parametric estimation
WA-PLS     Some                                Global, parametric + potentially some local estimation
GLR, WA    Low                                 Global parametric estimation
Problems in finding spatially independent test-sets to test
inference models
Telford & Birks (2009) have developed methods for cross-validating a calibration function in presence of spatial
autocorrelation, h-block cross-validation
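The idea of h-block cross-validation is leave-one-out with a buffer: when a sample is predicted, all training samples within distance h of it are left out as well. A generic sketch follows. It takes planar coordinates and any fit/predict pair; the stand-in linear model, the simulated data, and all names are assumptions, and choosing h is a separate problem not addressed here.

```python
import numpy as np

def h_block_cv(Y, x, coords, h, fit, predict):
    """Leave-one-out CV that also omits samples within distance h of the test sample."""
    n = len(x)
    preds = np.full(n, np.nan)
    n_train = np.zeros(n, dtype=int)
    for i in range(n):
        dist = np.hypot(coords[:, 0] - coords[i, 0], coords[:, 1] - coords[i, 1])
        keep = dist > h
        keep[i] = False                       # never train on the test sample itself
        n_train[i] = keep.sum()
        if n_train[i] < Y.shape[1] + 2:       # too few samples left to fit a model
            continue
        preds[i] = predict(fit(Y[keep], x[keep]), Y[i:i + 1])[0]
    ok = ~np.isnan(preds)
    rmsep = np.sqrt(np.mean((preds[ok] - x[ok]) ** 2))
    return rmsep, preds, n_train

# Stand-in inverse linear model so the example is self-contained
def lin_fit(Y, x):
    design = np.column_stack([np.ones(len(x)), Y])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    return coef

def lin_predict(coef, Y):
    return coef[0] + Y @ coef[1:]

rng = np.random.default_rng(3)
coords = rng.uniform(0.0, 100.0, size=(60, 2))
x = 0.05 * coords[:, 0] + rng.normal(0.0, 0.3, 60)        # spatially structured variable
Y = np.column_stack([x, -x]) + rng.normal(0.0, 0.5, (60, 2))

rmsep_loo, _, n_loo = h_block_cv(Y, x, coords, 0.0, lin_fit, lin_predict)
rmsep_h, _, n_h = h_block_cv(Y, x, coords, 20.0, lin_fit, lin_predict)
```

With h = 0 the procedure reduces to ordinary leave-one-out. Comparing RMSEP across increasing h indicates how much of the apparent performance depends on having near neighbours in the training set.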
Spatial autocorrelation does not appear to be a problem in
many palaeolimnological calibration-sets. May be a problem
in within-lake calibration-sets developed for water-level
reconstructions (Velle et al. 2012 JoPL in press)
3. Partitioning Root Mean Squared Error of Prediction
Model uncertainty commonly expressed as RMSEP
s1 Error due to variability in estimates of taxon
parameters in training-set (model error or lack
of fit): 20-25%
s2 Error due to variation in taxon abundances at a
given environmental value: 75-80%
1. Within-lake variability
(Heiri et al. 2002): c. 15-20%
2. Variability in modern environmental data
(Nilsson et al. 1996): 25-40% (up to 60%)
Models cannot, at present, take account of
variation in environmental data
3. Variability in assemblages at a given
environmental value due to unknown historical,
ecological, stochastic, taphonomic, etc,
processes. Unexplained variation: 10-35%
Can only hope to reduce RMSEP by 20-25%
4. Testing the statistical significance of a
quantitative palaeoenvironmental reconstruction
All calibration-function programs will produce
output or ‘reconstruction’
Does the resulting reconstruction explain more of
the variance in the fossil data than most (say 95%)
reconstructions derived from calibration functions
trained on random environmental data?
If it does, then it is statistically significant.
Global test of significance
Telford & Birks 2011 Quat. Sci. Rev. 30: 1272-1278
H.H. Birks et al. 2012 Quat. Sci. Rev. 33: 100-120
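The logic of the test can be sketched as follows. The variance in the fossil data explained by a reconstruction is computed as in a redundancy analysis with that reconstruction as the only constraint, and the same quantity is computed for calibration functions trained on random environmental values. The sketch uses two-way WA, uniform random values over the observed range, and simulated data; these choices and all names are assumptions, and the published procedure should be consulted for the details.

```python
import numpy as np

def wa_fit(Y, x):
    optima = (Y * x[:, None]).sum(axis=0) / Y.sum(axis=0)
    initial = (Y @ optima) / Y.sum(axis=1)
    slope, intercept = np.polyfit(initial, x, 1)
    return optima, intercept, slope

def wa_predict(model, Y):
    optima, intercept, slope = model
    return intercept + slope * (Y @ optima) / Y.sum(axis=1)

def variance_explained(Y_f, recon):
    """Proportion of variance in Y_f explained by recon as a single constraint."""
    Yc = Y_f - Y_f.mean(axis=0)
    r = recon - recon.mean()
    if r @ r == 0:
        return 0.0
    return ((r @ Yc) ** 2).sum() / (r @ r) / (Yc ** 2).sum()

def reconstruction_p_value(Y_m, x_m, Y_f, n_random=199, seed=0):
    rng = np.random.default_rng(seed)
    observed = variance_explained(Y_f, wa_predict(wa_fit(Y_m, x_m), Y_f))
    null = np.empty(n_random)
    for k in range(n_random):
        x_rand = rng.uniform(x_m.min(), x_m.max(), size=len(x_m))
        null[k] = variance_explained(Y_f, wa_predict(wa_fit(Y_m, x_rand), Y_f))
    p = (1 + np.sum(null >= observed)) / (n_random + 1)
    return observed, p

# Simulated training set and fossil sequence (Gaussian responses to one variable)
rng = np.random.default_rng(11)
taxon_optima = np.linspace(3.5, 9.0, 25)
respond = lambda x: 100.0 * np.exp(-0.5 * ((x[:, None] - taxon_optima) / 0.7) ** 2)
x_m = rng.uniform(4.5, 8.0, 80)
Y_m, Y_f = respond(x_m), respond(np.linspace(5.0, 7.5, 40))

observed, p = reconstruction_p_value(Y_m, x_m, Y_f)
```

The smallest attainable p-value is 1 / (n_random + 1), so the number of random calibration functions sets the resolution of the test.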
5. Evaluation of individual reconstructed estimates
Assuming overall reconstruction is statistically
significant, some individual estimates may be less
reliable than others (poor preservation, unusual
composition or peak, etc). Need to evaluate
individual reconstructed values. Local evaluation
• Goodness-of-fit measures for each individual fossil
sample, as in regression modelling (Birks et al. 1990)
• Analogue statistics (Birks et al. 1990; Simpson 2007)
• Proportions of taxa in fossil assemblage absent or
rare in modern training data with no or poorly
estimated taxon parameters (Birks 1998)
• Sample-specific errors for reconstructed values
estimated by bootstrapping, bagging (aggregated
bootstrapping) or Monte Carlo simulation (Birks et al. 1990)
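Sample-specific errors by bootstrapping can be sketched as below, following the general scheme of combining two components: s1, the spread of bootstrap estimates for each fossil sample, and s2, the prediction error for training samples when they are left out of a bootstrap cycle. The sketch uses two-way WA on simulated count data; names, the number of cycles, and the exact way s2 is averaged are assumptions.

```python
import numpy as np

def wa_fit(Y, x):
    totals = Y.sum(axis=0)
    present = totals > 0                       # taxa missing from a resample get no optimum
    Yp = Y[:, present]
    optima = (Yp * x[:, None]).sum(axis=0) / totals[present]
    initial = (Yp @ optima) / Yp.sum(axis=1)
    slope, intercept = np.polyfit(initial, x, 1)
    return present, optima, intercept, slope

def wa_predict(model, Y):
    present, optima, intercept, slope = model
    Yp = Y[:, present]
    return intercept + slope * (Yp @ optima) / Yp.sum(axis=1)

def bootstrap_errors(Y_m, x_m, Y_f, n_boot=200, seed=0):
    rng = np.random.default_rng(seed)
    n = len(x_m)
    fossil = np.empty((n_boot, len(Y_f)))
    sq_err, times_out = np.zeros(n), np.zeros(n)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)                      # resample with replacement
        out = np.setdiff1d(np.arange(n), idx)            # samples not drawn this cycle
        model = wa_fit(Y_m[idx], x_m[idx])
        fossil[b] = wa_predict(model, Y_f)
        sq_err[out] += (wa_predict(model, Y_m[out]) - x_m[out]) ** 2
        times_out[out] += 1
    s1 = fossil.std(axis=0, ddof=1)                      # per fossil sample
    seen = times_out > 0
    s2 = np.sqrt(np.mean(sq_err[seen] / times_out[seen]))  # one value for the model
    return np.sqrt(s1 ** 2 + s2 ** 2), s1, s2

# Simulated counts: 80 modern and 30 fossil samples, 25 taxa
rng = np.random.default_rng(5)
taxon_optima = np.linspace(3.5, 9.0, 25)
expected = lambda x: 100.0 * np.exp(-0.5 * ((x[:, None] - taxon_optima) / 0.7) ** 2)
x_m = rng.uniform(4.5, 8.0, 80)
Y_m = rng.poisson(expected(x_m)).astype(float)
Y_f = rng.poisson(expected(np.linspace(5.0, 7.5, 30))).astype(float)

rmsep, s1, s2 = bootstrap_errors(Y_m, x_m, Y_f)
```

Because s2 is shared by all fossil samples and is usually the larger component, sample-specific errors often differ only slightly from one another.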
What to do with sample-specific errors?
[Figure: reconstruction with sample-specific errors. Birks & Peglar (unpub.)]
Has a statistically significant (p=0.009) reconstruction but
there is also a continuous overlap in RMSEP.
Problems of temporal autocorrelation in assessing RMSEP
for samples.
Unresolved
6. Highlighting ‘signal’ from ‘noise’ in reconstructions
Use of LOESS smoother a great help
[Figures: reconstructions shown with sample-specific errors or with a LOESS smoother. Seppä & Birks (2002); Brooks & Birks (2001)]
7. Ecological validation
Compare reconstructed values with historical data.
Rarely possible as few historical data exist.
Renberg & Hultberg (1992)
But when done, sometimes the model that gives the closest
correspondence is not the model with lowest RMSEP or
maximum bias!
Conflict between model performance and selection
based on cross-validation of modern data and validation
results using independent historical test-sets
8. Palaeoecological validation by multi-proxy data
Birks & Ammann (2000)
Similar trends, different absolute values. Not surprising,
given different biology of different groups of organisms
CURRENT STATUS AND PROBLEMS
1. The biggest set of problems is that the calibration-function approach, like any other quantitative
procedure, makes assumptions, as originally
stated by Imbrie & Kipp (1971), Imbrie & Webb
(1981), and Birks et al. (1990).
These assumptions are being increasingly
violated, especially in the last 5-10 years.
What are these assumptions?
1. Assumptions in quantitative palaeoenvironmental
reconstructions
1. Taxa in training set (Ym) are systematically related to the physical
environment (Xm) in which they live
2. Environmental variable (Xf , e.g. summer temperature) to be
reconstructed is, or is linearly related to, an ecologically important
variable in the system
3. Taxa in the training set (Ym) are the same as in the fossil data (Yf ) and
their ecological responses (Ûm) have not changed significantly over the
timespan represented by the fossil assemblage
4. Mathematical methods used in regression and calibration adequately
model the biological responses (Ûm) to the environmental variable (Xm)
5. Other environmental variables than, say, summer temperature have
negligible influence, or their joint distribution with summer temperature
in the fossil set is the same as in the training set
6. In model evaluation by cross-validation, the test-data are
independent of the training data
Imbrie & Kipp (1971), Imbrie & Webb (1981), Birks et al. (1990),
Telford & Birks (2005), Juggins & Birks (2012)
2. Multiple-variable reconstructions – what variables
can be reconstructed?
Increasing tendency to reconstruct 2 or 3, even 7-8,
environmental variables that on the basis of current ecological
knowledge of, e.g., vegetation, chironomids, or diatoms,
cannot all be ‘ecologically important’ (assumption 2)
e.g. mean January, mean July, mean annual temperature,
growing degree days above 0°C and above 5°C, annual
precipitation, and evaporation : potential evaporation.
Ecological data are not usually influenced by 8 independent
‘ecologically important’ variables. Usually only 1-3 significant
ordination axes.
All variables may be statistically significant in a RDA or CCA
when considered individually (‘marginal’ effects) but almost
certainly not significant when considered together
(‘conditional’ effects, high multicollinearity, variance inflation
factors). Many reconstructions of, for example, ‘distance to
littoral vegetation’ suspect.
Basic statistical error
(Juggins 2012)
Other potentially powerful approach is hierarchical
partitioning (HP)
HP is designed to overcome multicollinearity problems by
using a mathematical theorem by which the explanatory
capacities of a set of predictor environmental variables can
be estimated. Uses goodness-of-fit measures for each of the
2^k possible models for k independent variables. In HP, the
variances are partitioned so that the independent
contribution (I) of a given environmental variable is
estimated. Furthermore, the variation shared with another
environmental variable (conjoint contribution J) can be
computed.
HP allows differentiation between those environmental
variables whose independent, as distinct from partial,
correlation with the response variable may be important
from those variables that have little or no independent
effect on the responses (hier.part in R).
Used by Steve Juggins with diatom data and encouraging
results (2012)
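A sketch of the partitioning for a single response variable, using R² as the goodness-of-fit measure. The independent contribution of a variable is the increase in fit when it is added to a model, averaged within each hierarchy level (model size) and then across levels; the joint contribution is what remains of its single-variable fit. This is a minimal illustration in Python with simulated data and hypothetical variable names, not a reimplementation of the hier.part package.

```python
import numpy as np
from itertools import combinations

def r_squared(X, y, cols):
    """R² of the linear model of y on the chosen columns of X (0 for no predictors)."""
    if not cols:
        return 0.0
    design = np.column_stack([np.ones(len(y)), X[:, list(cols)]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    centred = y - y.mean()
    return 1.0 - (resid @ resid) / (centred @ centred)

def hierarchical_partitioning(X, y):
    k = X.shape[1]
    # Goodness of fit for all 2^k models
    fit = {s: r_squared(X, y, s)
           for size in range(k + 1) for s in combinations(range(k), size)}
    independent = np.zeros(k)
    for j in range(k):
        others = [v for v in range(k) if v != j]
        for size in range(k):                  # hierarchy level = size of model before adding j
            gains = [fit[tuple(sorted(s + (j,)))] - fit[s]
                     for s in combinations(others, size)]
            independent[j] += np.mean(gains) / k
    joint = np.array([fit[(j,)] for j in range(k)]) - independent
    return independent, joint

# Simulated example: two strongly collinear temperature variables and lake depth
rng = np.random.default_rng(7)
n = 200
july = rng.normal(size=n)
annual = 0.9 * july + 0.3 * rng.normal(size=n)
depth = rng.normal(size=n)
X = np.column_stack([july, annual, depth])
y = 2.0 * july + 0.5 * depth + rng.normal(size=n)

independent, joint = hierarchical_partitioning(X, y)
```

The independent contributions sum to the R² of the full model, so they give an additive account of the fit even when predictors are collinear. The number of models grows as 2^k, which limits the approach to a modest number of candidate variables.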
3. Confounding effects of correlated environmental
variables (assumptions 2 and 5)
Present in all studies, starting with Imbrie & Kipp
(1971) with reconstructions of summer and winter
sea-surface temperature and salinity.
Covarying environmental
variables e.g. temperature
and lake trophic status
(e.g. total N or P) or
temperature and lake
depth and chironomids. Is
the fossil chironomid signal
temperature or trophic
status?
Brodersen & Anderson (2002)
In almost all ecological systems, assemblages are a
complex function of multiple climatic, edaphic,
land-use, biotic, and historical factors.
First part of assumption 5 (environmental variables
other than the variable being reconstructed have
negligible influence) is therefore almost never met.
Need very careful design of modern training-set and
rigorous statistical analysis to establish what can
reliably and significantly be reconstructed.
Second part of assumption 5 (the joint distribution
of additional variables with the variable of interest
does not change with time) is also violated in
many cases.
Climate model and glaciological results suggest that
the joint distribution between summer temperature
and winter accumulation has not been the same in
the past 11,000 years.
Good evidence to suggest that lake-water pH has
decreased naturally (soil deterioration) whilst summer
temperature rose and then fell in the last 11,000
years. pH-climate relationship changed with time.
In Norway today, lake-water pH is negatively
correlated with summer temperature because lakes of
pH 6-7.5 are on basic rock and this happens in
Norway to occur mainly at high altitudes and hence at
low temperatures. In the past after deglaciation,
almost all lakes had a higher pH than today, so the
pH-temperature relationship in the past was different
than today.
4. Assumption 3 “Taxa in the training-set are the
same as in the fossil data and their ecological
responses have not changed significantly over the
timespan represented by the fossil assemblage”
Assumption not unique to calibration functions. Basic
assumption of all Quaternary palaeoecology, namely
uniformitarianism.
Considerable interest in niche-conservatism amongst
biogeographers and conservation and evolutionary
biologists. Increasing evidence for conservatism of
ecological niche characteristics in the timespan of
last 20,000 years.
Problems of ‘cryptic’ species and of taxa like
Saxifraga oppositifolia-type in environmental
reconstructions currently unresolved.
5. Use of different proxies can give different
reconstructions
[Figure: mean July temperature reconstructions, Bjørnfjell. One reconstruction p = 0.001; the other p = 0.183 (ns)]
Validate using another proxy – e.g. macrofossils of tree birch
Validate using second proxy – e.g. chironomids
Importance of independent validation and establishing what
is statistically significant
FUTURE NEEDS
Quantitative palaeoenvironmental reconstructions
in the context of Quaternary palaeoecology are not
really an end in themselves (in contrast to
Quaternary palaeoclimatology) but they are a
means to an end.
Use the reconstructions based on one proxy (e.g.
chironomids) to provide an environmental
history against which observed biological changes
in another, independent proxy (e.g. pollen) can be
viewed and interpreted as biological responses
to environmental change.
[Figure: Minden Bog, Michigan. Booth & Jackson (2003). Panels: environmental predictor and biotic responses. Black portions = wet periods, grey = dry periods]
Major change 1000 years ago towards drier
conditions, decline in Fagus and rise in Pinus in
charcoal
Climate → vegetation → fire frequency
These approaches involving environmental
reconstructions independent of the main fossil record
can be used as a long-term ecological observatory or
laboratory to study long-term ecological dynamics
under a range of environmental conditions, not all of
which exist on Earth today (e.g. lowered CO2
concentrations, low human impact).
Can begin to study the Ecology of
the Past.
Exciting prospect, many potentialities
in future research, as outlined by
Flessa and Jackson (2005) and
discussed by Birks et al.
(2010 Open Ecol J 3: 68-110)
Other important future needs
1. Increased rigour in model evaluation and selection
with greater use of external cross-validation,
development of ‘minimal adequate model’
2. Testing significance of reconstructions
3. Greater rigour in deciding what environmental
variables can be reconstructed (critical use of
RDA/CCA, hierarchical partitioning, and ecological
knowledge!)
4. Consider the likelihood of confounding and
‘surrogate’ environmental variables
5. There are increasing numbers of calibration data-sets (e.g. Norwegian, Swiss, Norwegian + Swiss,
N Sweden, Finland 1 & 2 chironomid data-sets).
How to select the ‘appropriate’ one?
RMSEP (C)
0.85
0.75
0.91
0.85
Max bias (C)
0.98
1.09
1.56
1.14
Salonen et al. (in press)
Same July T
range, different
continentality
(3), one with
lower July T
range. Similar
but not identical
RMSEP and
maximum bias,
all two-way WA
Salonen et al. (in press)
All reconstructions statistically significant (p<0.05). Likely
explanation is that WA optima are different in areas of different
continentality. Higher in areas of high continentality (e.g. Ulmus,
Tilia, Quercus)
Basic problem in palaeoecology – really interested in the
fundamental niche but can only study the realised niche as a
result of confounding environmental variables. Realised niche may
be different in different areas. Conflicts with assumptions 3 and 5.
6. Do not ignore inconsistent results – Velle et al. (2012)
7. Try to understand why results are seemingly
inconsistent
8. Remember what the six basic assumptions of
calibration functions are and try not to violate them
or, even better, try to test them (e.g. niche
conservatism)
CONCLUSIONS
Effective use of calibration functions needs
• good understanding of underlying ecology,
mathematics, and principles of statistical
modelling and cross-validation
• good quality modern and fossil data
Bayesian framework is an important future
research direction but it presents very difficult and
time-consuming computational problems. No
available software (cf. DECORANA, CANOCO, WACALIB,
CALIB, etc. philosophy)
Importance of continued research collaboration
between palaeoecologists and applied statisticians
To paraphrase the statistician G.E.P. Box
“All reconstructions are wrong, but some
reconstructions may be useful”
The challenge is to identify the useful and reliable
ones
It is a difficult task and one that has received
surprisingly little attention until recently. Major
challenge for the future.
Simple two-way weighted-averaging appears hard to beat
Takes account of % data, ignores zero values, assumes
unimodal responses, can handle several hundred species, and
gives calibration functions of high precision (0.8°C), low bias,
and high robustness.
Xm = g(Y1, Y2, Y3, ... ... ..., Yp)
Modern data WA regression
Xf = g(Yf1, Yf2, Yf3, ... ... ..., Yfp)
Fossil data WA calibration
g is our calibration function for Xm and Ym
Simple, ecologically realistic, and robust
WA is robust to spatial autocorrelation, as are Gaussian
logit regression and ML calibration. WA (with monotonic
deshrinking) and GLR are, to me, the preferred methods
Lynts and Judd 1971 Science 171: 1143-1144
Late Pleistocene Paleotemperatures at Tongue of the Ocean, Bahamas
It too is 41 years old! Has 20 citations (cf. 652)
Major problems in all reconstructions are the effects
of secondary variables, confounding variables,
and non-causal environmental variables on
resulting reconstructions.
Only recently beginning to receive attention –
Juggins & Birks (2012) and Juggins (2012 in
press).
We must all give greater attention to what can and
cannot be reconstructed and explicitly address the
dangers of reconstructing surrogate variables (e.g.
water depth) and confounding variables (e.g.
climate and nutrients)
Juggins (2012)
Key Figures in Calibration-Function
Research
John Imbrie
Cajo ter Braak
Tom Webb
Svante Wold
Steve Juggins
Richard Telford
One cannot do calibration-function research without high
quality data and these need skilled palaeoecologists. Many
colleagues have contributed to the development of calibration
functions by creating superb modern-environmental data sets
Nilva Kipp
Heikki Seppä
Andy Lotter
Sylvia Peglar
Steve Brooks
Viv Jones
Oliver Heiri
Ulrike Herzschuh