Spatial Data Analysis: Surfaces

Download Report

Transcript Spatial Data Analysis: Surfaces

Spatial Data Analysis:
Surfaces
Model-Driven Approaches

Model of discrete spatial variation




Each subregion is described by is a statistical distribution Zi
e.g., homicides numbers are Poisson (, ).
The main objective of the analysis is to estimate the joint
distribution of random variables Z = {Z1,…,Zn}
Model of continuous spatial variation


All of the area is a continuous surface
The main objective is to estimate the distribution Z(x), x  A
Models of Discrete Spatial Variation
Zi  Random
variable in
area i
Yi
• n° of ill people
T a xa s d e L e ish ma n io se V isc e ra l (1 9 9 7 / 1 9 9 8 ) .
c a s o s p o r 1 0 0 m il h a b it a n t e s .
2 0 0a 2 5 0 (1 )
1 5 0a 2 0 0 (2 )
1 0 0a 1 5 0 (1 )
5 0 a 1 0 0 (4 )
1 0 a 5 0 (2 9 )
5 a 1 0 (1 6 )
1 a 5 (4 3 )
< 1
(1 9 )
• n° of newborn babies
• per capita income
Models of Continuous Spatial Variation
Temperature, Water ph, soil acidity...
Sampling stations in locations marked by
Location to predict value: shown as
From Areas to Surfaces
Polygon data
Sample generation
X,Y,Z
X,Y,Z
X,Y,Z
X,Y,Z
Samples
geoestatistics
superfície contínua / grade
X,Y,Z
From Areas to Surfaces
Space as a planar
subdivision
From Areas to Surfaces
Space as a planar
subdivision
Space as a continuos
surface
From Areas to Surfaces
Geostatistics


Applicable to spatial distributions (fields)
Typical situation

interpolation from field samples
Water Availabilty
Index
Estimated
Surface
Estimated
Uncertainty
What is Geostatistics?

Analysis and inference of continuously-distributed variables

Pollution, Zync concentration, infant mortality rate
Study area

•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
• • •
• • • •
• • • •
• • •
• • •
• •
• • •
• • •
• •
Field Samples
Inferences
Analysis


•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
Describing the spatial variability of the phenomenon under study
estudar ou descrever
Inference

Estimating the unknown values
Why Geostatistics ?

Techniques appropriate to statistical estimation of spatial phenomena
Deterministic Procedures
Study area
Geoestatistics
Field samples
Thinking spatially
Z1~ N(1, 1)
1 = 2
1 = 2
corr(Z1, Z2) = f(h)
Z2 ~ N (2, 2)
How are they distributed?
How are they related to each other?
How can I infer a distribution from one
sample?
Steps of the Geostatistical Process
DATA
Exploratory
Analysis
Structural
Analysis
Inference and
Interpolation
RESULTS
Concept of a Regionalized Variable
+
Área Poluída
Zona A
Escala de poluição
Zona B

Regionalized Variable = structure + randomness

Structure



Global distribution of natural phenomena
Average value of a phenomena in a given area is constant
Random


Local variation within a given area
Values fluctuate around a mean
Regionalized Variable
Z(x) = m(x) + (x) + 
m(x): structural component (constant mean value)
(x): random component, spatially variant around m(x)
: uncorrelated random noise
Zona B
m(x)
”(x)
’
Zona A
Geostatistics

Each position on the field is a random variable



Each measurement is a realization of a random variable



E : extent of the field
 u  E, Z(u) is a random variable
Let z(u1), ...z(un) be the set of measures
Then, z(u) is a realization of Z(u),  = 1,..,n
Problem

How can we estimate the joint distribution?
Uncertainty: the Statistical Approach

Basic hypothesis:
Var{ Z(u+h) – Z(u)} = 2(h)
γ h  C 0  C h

Difference in values are similar for similar distances

We call this a “stationary” spatial process

We can find the “structure” of a stationary spatial
process using a very simple technique

The variogram
EXPERIMENTAL SEMIVARIOGRAM
 z (u )  z (u  h ) 




1
N (h ) is the number of pairs
of samples separated by h
^(h)
(C)
1
2 N (h )
2
Sill
ˆ (h ) 
N (h)
Nugget effect (Co)
Range (a)
h
Building the Experimental
Semivariogram

Step 1 (optional): Transforming area maps in samples
Building the Experimental
Semivariogram


Step 2 : Measuring spatial
variation
For each pair Z(x) and
Z(x+h), sepated by a
distance h, we measure
the square of the
difference between them
•
•
h
•
•
•
•
•
h
•
•
•
h
Vetor distância h

•
•
h
VARIOGRAMAS DO I.D.H.
Spatial Model Fitting for Variograms

After building an experimental variogram, we need to fit
a theoretical function in order to model the spatial
variation

The adjustment procedure is interactive, where the user
selects the theoretical model that best fits his data.

Some useful models:

Gaussian, Exponential, Spherical models
Fitting the Semivariogram
 (h)
Experimental
Theoretical
Sill
Nugget
Effect
Range
h
Plotting the variogram
Analysing the variogram

Later we will look at fitting a model to the variogram; but
even without a model we can notice some features,
which we define here only qualitatively:




Sill: maximum semi-variance; represents variability in the
absence of spatial dependence
Range: separation between point-pairs at which the sill is
reached; distance at which there is no evidence of spatial
dependence
Nugget: semi-variance as the separation approaches zero;
represents variability at a point that can’t be explained by spatial
structure.
In the previous slide, we can estimate the sill  1.9, the
range  1200 m, and the nugget  0.5 i.e. 25% of the
sill.
Using the experimental variogram to model the
random process

Notice that the semivariance of the separation vector (h)
is now given as the estimate of covariance in the spatial
field.

So it models the spatially-correlated component of the
regionalized variable

We must go from the experimental variogram to a
variogram model in order to be able to model the random
process at any separation.
Modelling the variogram

From the empirical variogram we now derive a
variogram model which expresses semivariance as a
function of separation vector. It allows us to:

Infer the characteristics of the underlying process from
the functional form and its parameters;

Compute the semi-variance between any point-pair,
separated by any vector;

Interpolate between sample points using an optimal
interpolator (“kriging”)
“Authorized” Models


Any variogram function must be able to model the
following:
1. Monotonically increasing






possibly with a fluctuation (hole)
2. Constant or asymptotic maximum (sill)
3. Non-negative intercept (nugget)
4. Anisotropy
Variograms must obey mathematical constraints so that
the resulting kriging equations are solvable (e.g., positive
definite between-sample covariance matrices).
The permitted functions are called authorized models.
Spherical Model


, |h|= 0
0


3

 3 |h| 1 |h| 
γ(h)  Co  C1         Co  C1[ Sph (|h|)] , 0 |h| a
a  2 a  

2 




, |h| a
C o  C1
Sill
C = Co+ C1
C1
Co
a
h
Exponential Model

0


 |h|
(h)  

 

+ [Exp
Co + C1 1  exp 
 Co C1
a





,|h| 0
(|h|)] ,|h| 0
C1
Co
a
h
Gaussian Model

0


(h)  

Co + C1 1  exp




,|h| 0
2
 |h| 

 
+ [Gau (|h|)]

  Co C1
a

 

,|h| 0
C1
Co
a
h
What sample size to fit a variogram
model?

Can’t use non-spatial formulas for sample size, because spatial
samples are correlated, and each sample is used multiple times in
the variogram estimate





No way to estimate the true error, since we have only one realisation
Stochastic simulation from an assumed true variogram suggests:
< 50 points: not at all reliable
100 to 150 points: more or less acceptable
> 250 points: almost certaintly reliable

More points are needed to estimate an anisotropic variogram.

This is very worrying for many environmental datasets (soil cores,
vegetation plots, . . . ) especially from short-term fieldwork, where
sample sizes of 40 – 60 are typical. Should variograms even be
attempted on such small samples?
Cross Validation

Re-estimate the samples to find errors in the model
Variogram Model
1
2
?
3
?
5
4
?
?
OK?
Yes
NO
?
–
–
–
–
Error Statistics
Error Histogram
Erro Spatial diagram
observed x estimated value
Cross Validation
Approaches to spatial prediction


This is the prediction of the value of some variable at an
unsampled point, based on the values at the sampled
points.
This is often called interpolation, but strictly speaking
that is only for points that are geographically inside the
sample set (otherwise it is extrapolation.
Approaches to prediction: Local
predictors

Value of the variable is predicted from “nearby”
samples


Example: concentrations of soil constituents (e.g. salts,
pollutants)
Example: vegetation density
Local Predictors

Each interpolator has its own assumptions, i.e. theory of
spatial variability

Nearest neighbour
Average within a radius
Average of n nearest neighbours
Distance-weighted average within a radius
Distance-weighted average of n nearest neighbours

Optimal” weighting -> Kriging




Ordinary Kriging

The theory of regionalised variables leads to an “optimal”
interpolation method, in the sense that the prediction
variance is minimized.

This is based on the theory of random functions, and
requires certain assumptions.
Optimal local interpolation: motivation

Problems with average-in-circle methods:


Problems with inverse-distance methods:



1. No objective way to select radius of circle or number of points
1. How to choose power (inverse, inverse squared . . . )?
2. How to choose limiting radius?
In both cases:


1. Uneven distribution of samples could over– or under–
emphasize some parts of the field
2. prediction error must be estimated from a separate validation
dataset
An “optimal” local predictor would have these
features:







Prediction is made as a linear combination of known
data values (a weighted average).
Prediction is unbiased and exact at known points
Points closer to the point to be predicted have larger
weights
Clusters of points “reduce to” single equivalent points,
i.e., over-sampling in a small area can’t bias result
Closer sample points “mask” further ones in the same
direction
Error estimate is based only on the sample configuration,
not the data values
Prediction error should be as small as possible.
Kriging




A “Best Linear Unbiased Predictor” (BLUP) that
satisfies certain criteria for optimality.
It is only “optimal” with respect to the chosen model!
Based on the theory of random processes, with
covariances depending only on separation (i.e. a
variogram model)
Theory developed several times (Kolmogorov 1930’s,
Wiener 1949) but current practise dates back to
Matheron (1963), formalizing the practical work of the
mining engineer D G Krige (RSA).
How do we use Kriging?




1. Sample, preferably at different resolutions
2. Calculate the experimental variogram
3. Model the variogram with one or more authorized
functions
4. Apply the kriging system, with the variogram model of
spatial dependence, at each point to be predicted




Predictions are often at each point on a regular grid (e.g. a
raster map)
These ‘points’ are actually blocks the size of the sampling
support
Can also predict in blocks larger than the original support
5. Calculate the error of each prediction; this is based
only on the sample point locations, not their data
values.
Prediction with Ordinary Kriging (OK)

In OK, we model the value of variable z at location si as
the sum of a regional mean m and a spatiallycorrelated random component e(si):

Z(si) = m+e(si)

The regional mean m is estimated from the sample, but
not as the simple average, because there is spatial
dependence. It is implicit in the OK system.
Prediction with Ordinary Kriging (OK)


Predict at points, with unknown mean (which must also
be estimated) and no trend
Each point x is predicted as the weighted average of
the values at all samples
n
Z*x   λi Z xi 
0
i1
λ ?
i



The weights assigned to each sample point sum to 1
Therefore, the prediction is unbiased
“Ordinary”: no trend or strata; regional mean must be
estimated from sample
Simple and Ordinary Kriging
Linear combination of nearest neighbours
•
x1
x2
x0
x3
x4
Local Means
n
*
Zx   λi Z xi 
0
i1
λ 1
i n
Inverse Distance Weights
Kriging
n
n
Z*x   λi Z xi 
i1
λ ?
Z*x   λi Z xi 
0
λ  12
i d
0
i1
i
Ordinary Kriging
1
x1
Variogram analysis
x2
x0
2
x3
Variogram adjustment
x4
3
Modelo de ajuste do semivariograma
 
3 
 3  h 1  h  
γ h  C0  C1        C0  C1 Sph
2 a 2  a  
  
 

 
  h 

4
Kriging estimator
Ordinary Kriging
l1
l2
:
ln

=
C11
C21
:
C n1
1
C12 .........C1n 1
C22 .........C2n 1
:
: :
C n2 .........C nn 1
1 ......... 1 0
1
C 10
C 20
:
C n0
1
• Covariance matrix elements
Cij  C(0)  γ (h)  C0  C1  γ (h)
•Substituting the values we find the weights
n
•Kriging estimator: Z*x   λi Z xi 
0
i1
• Variance
2
 C0  C1  λ T k
σko
Kriging example
Estimator:
50
50
x2
x1
x0
x4
x3
 λ1 
 
λ 2 
λ 
 
λ 
 
 
=
C11 C12 C13 C14 1 


C 21 C 22 C 23 C 24 1


C
C
C
C
1
 31 32 33 34 
C C C C 1 
 41 42 43 44 
 1 1 1 1 0


1
•Matrix elements: Cij = C0 + C1 -  (h) Modelo Teórico
C12 = C21 = C04 = C0 + C1 -  (50 2)

 50 2
(50 2 )3  

 0,5
= (2+20) - 2  20 1,5
3   = 9,84
200
(200)  


C 01 
 
C 02 
 
C 03 
C 
 04 
 1
 
Kriging example
C13 = C31 = (C0 + C1) -   V (150) + (50) ] = 1,23
2
C14 = C41 = C02 = (C0 + C1) -   V (100) + (50) ] = 4,98
2
50
50
x2
2
C23 = C32 = (C0 + C1) -   V (100) + (100) ] = 2,33
2
x1
x0
x4
2
x3
2
C24 = C42 = (C0 + C1) -   V (100) + (150) ] = 0,29
2
2
C34 = C43 = (C0 + C1 ) -   V (200) + (50) ] = 0
2
2
C01 = (C0 + C1 ) -  (50) = 12,66
C03 = (C0 + C1 ) -  (150) = 1,72
C11 = C22 = C33 = C44 = (C0 + C1 ) -  (0) = 22
Kriging example
Substituting the values Cij, we find the following weights:
l1 = 0,518
l2 = 0,022
l3 = 0,089
l4 = 0,371
The estimator is
Z**x o  0,518 z(x1) + 0,022 z(x2) + 0,089 z(x3) + 0,371 z(x4)
50
50
x2
x1
x0
x4
x3
Sampling configurations





There is no agreement on a “universally” optimal sampling
configuration for geostatistical research (i.e., variogram modelling,
followed by spatial prediction), but: for spatial prediction, regular
(lattice, or triangular) sampling is optimal (in case of isotropy;
otherwise stretched lattices);
for variogram modelling, all distances should be present, including
sufficient information about short distances (which are not present
when sampling regularly)
cross validation on a regular sampling grid will not reveal
deficiencies in modelled short distance behaviour of the variogram;
interpolated maps will be dominated by this short distance variogram
behaviour.
compromise: most effort put to regular spread, sufficient effort to
short distance replicates.
related questions: adding sampling points to an existing design, or
reducing (“optimizing”) an existing monitoring network.
Questions about kriging






what do sill, nugget, range, and anisotropy tell about
spatial variability of an observed variable?
what happens if we predict a value at an observation
location?
what does the prediction variance measure?
why is the interpolator discontinuous at observation
locations when the nugget is positive?
why is the prediction variance pattern independent on
data, but only dependent on data configuration?
what are the causes for positive nugget effect?
Spatial Indices
H.D.I. – human development index (UN)
H.D.I.= longevity + education + income
3
(0 < HDI < 1)
HDI – From Areas to Surfaces
HDI Variograms
Human Development Index in São Paulo
HDI= 1
IDH = 0
Trend Surfaces for Homicide Rates in São
Paulo
1996
Estimate of homicide rates using ordinary kriging
1999
Trend Surfaces for Homicide Rates : Binomial
Kriging
1996
1999
Binomial x Ordinary Kriging - 1996
Krigeagem
Ordinária
Krigeagem
Binomial
Binomial x Ordinary Kriging - 1999
Krigeagem
Ordinária
Krigeagem
Binomial
Practical Example

Analise of Apgar values in newborn by buroughs, Rio de Janeiro,
1994.

Apgar index


Vitality of newborn baby in first and fifth minute after birth
Respiration, heartbeat, response to stimula

Sample of 152 georeferenced samples.

Thematic classification





High:
Medium High:
Average:
Medium Low:
Low:
77,4
74,4
69,5
63,4
44,1
a
a
a
a
a
83,3
77,4
74,4
69,5
63,4
Practical Example
Bairros do Municipio do Rio de Janeiro
Bairros Excluídos
Exploratory Data Analysis
Spatial Correlation Analysis
Semivariogramas
Modelo
de Ajuste
Tipo: Omnidirecional
Gaussiano
o
Efeito45Pepita
= 16
90o
Contribuição
= 128
o
135=
Alcance
32000
Kriging results
+
Kriging variance
-
+
Spatial Variability of the APGAR index
-
Comparison
77,4 a 83,3
74,4 a 77,4
69,5 a 74,4
66,4 a 69,5
44,1 a 63,4
Excluded
Areal data grouped
By quintiles
+
Kriging results