Transcript Chapter 2
Chapter 5
Part B: Spatial Autocorrelation and
regression modelling
www.spatialanalysisonline.com
Autocorrelation
Time series correlation model
{x_{t,1}}, t = 1, 2, 3, …, n-1 and {x_{t,2}}, t = 2, 3, 4, …, n
3rd edition
Spatial Autocorrelation
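Correlation coefficient, r, for {x_i}, i = 1, 2, 3, …, n and {y_i}, i = 1, 2, 3, …, n:
$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}$
Time series correlation model
{x_{t,1}}, t = 1, 2, 3, …, n-1 and {x_{t,2}}, t = 2, 3, 4, …, n
Mean values:
$\bar{x}_{.1} = \frac{1}{n-1} \sum_{t=1}^{n-1} x_t$, $\bar{x}_{.2} = \frac{1}{n-1} \sum_{t=2}^{n} x_t$
For large n: $\bar{x} = \frac{1}{n} \sum_{t=1}^{n} x_t$
Lag 1 autocorrelation:
$r_1 = \frac{\sum_{t=1}^{n-1} (x_t - \bar{x})(x_{t+1} - \bar{x})}{\sum_{t=1}^{n} (x_t - \bar{x})^2}$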
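The lag 1 autocorrelation can be sketched in a few lines of Python. This is a minimal illustration of the large-n form (one overall mean for both the lagged and unlagged series); the function name `lag1_autocorrelation` and the sample series are hypothetical.

```python
def lag1_autocorrelation(x):
    """Lag 1 autocorrelation r1, using the overall mean (large-n form)."""
    n = len(x)
    mean = sum(x) / n
    # Numerator: products of deviations for each value and its successor.
    num = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1))
    # Denominator: sum of squared deviations over the whole series.
    den = sum((v - mean) ** 2 for v in x)
    return num / den


trend = [1.0, 2.0, 3.0, 4.0, 5.0]        # steadily rising series
alternating = [1.0, -1.0, 1.0, -1.0]     # values flip sign at each step

r1_trend = lag1_autocorrelation(trend)
r1_alternating = lag1_autocorrelation(alternating)
print(r1_trend, r1_alternating)
```

The rising series gives a positive coefficient and the alternating series a negative one, which mirrors the positive and negative spatial patterns discussed later.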
Spatial Autocorrelation
Classical statistical model assumptions
Independence vs dependence in time and
space
Tobler’s first law:
“All things are related, but nearby things are more
related than distant things”
Spatial dependence and autocorrelation
Correlation and Correlograms
Spatial Autocorrelation
Covariance and autocovariance
Lags – fixed or variable interval
Correlograms and range
Stationary and non-stationary patterns
Outliers
Extending concept to spatial domain
Transects
Neighbourhoods and distance-based models
Spatial Autocorrelation
Global spatial autocorrelation
Dataset issues: regular grids; irregular lattice
(zonal) datasets; point samples
Simple binary coded regular grids – use of Joins
counts
Irregular grids and lattices – extension to x,y,z data
representation
Use of x,y,z model for point datasets
Local spatial autocorrelation
Disaggregating global models
Spatial Autocorrelation
Joins counts (50% 1’s)
A. Completely separated pattern (+ve)
B. Evenly spaced pattern (-ve)
C. Random pattern
Spatial Autocorrelation
Joins count
Binary coding
Edge effects
Double counting
Free vs non-free sampling
Expected values (free sampling)
1-1 = 15/60, 0-0 = 15/60, 0-1 or 1-0 = 30/60
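The expected values above can be reproduced with a short sketch. It assumes a 6 x 6 binary grid (which has 60 rook's-move joins, matching the denominators above), counts each join once to avoid double counting, and uses the free-sampling expectations J·p², J·q² and 2·J·p·q. The function names and the two test patterns are illustrative only.

```python
def join_counts(grid):
    """Count 1-1, 0-0 and 0-1 joins with rook's-move adjacency, each join once."""
    rows, cols = len(grid), len(grid[0])
    counts = {"11": 0, "00": 0, "01": 0}
    for r in range(rows):
        for c in range(cols):
            # Look only right and down so that no join is double counted.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    a, b = grid[r][c], grid[rr][cc]
                    key = "11" if a == b == 1 else "00" if a == b == 0 else "01"
                    counts[key] += 1
    return counts


def expected_free_sampling(total_joins, p):
    """Expected join counts when each cell is coded 1 independently with probability p."""
    q = 1.0 - p
    return {
        "11": total_joins * p * p,
        "00": total_joins * q * q,
        "01": 2 * total_joins * p * q,
    }


# Two assumed 6 x 6 patterns, each with 50% 1's.
separated = [[1] * 6 for _ in range(3)] + [[0] * 6 for _ in range(3)]
checker = [[(r + c) % 2 for c in range(6)] for r in range(6)]

print(join_counts(separated))
print(join_counts(checker))
print(expected_free_sampling(60, 0.5))
```

Comparing the observed counts with the expected 15, 15 and 30 shows the direction of the departure: the separated pattern has far more like-to-like joins than expected, the evenly spaced pattern far fewer.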
Spatial Autocorrelation
Joins counts
A. Completely separated (+ve)
B. Evenly spaced (-ve)
C. Random
Spatial Autocorrelation
Joins count – some issues
Multiple z-scores
Binary or k-class data
Rook’s move vs other moves
First order lag vs higher orders
Equal vs unequal weights
Regular grids vs other datasets
Global vs local statistics
Sensitivity to model components
Spatial Autocorrelation
Irregular lattice – (x,y,z) and adjacency tables
Cell data ("." marks a position outside the lattice):
          col 1    col 2    col 3
  row 1     .      +4.55    +5.54
  row 2   +2.24    -5.15    +9.02
  row 3   +3.10    -4.39    -2.09
  row 4     .      +0.46    -3.06

Cell coordinates (row/col):
  1,1   1,2   1,3
  2,1   2,2   2,3
  3,1   3,2   3,3
  4,1   4,2   4,3

x,y,z view:
  x   y     z
  1   2    4.55
  1   3    5.54
  2   1    2.24
  2   2   -5.15
  2   3    9.02
  3   1    3.10
  3   2   -4.39
  3   3   -2.09
  4   2    0.46
  4   3   -3.06

Cell numbering:
  .   3    7
  1   4    8
  2   5    9
  .   6   10

Adjacency matrix, total 1’s = 26
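The adjacency table can be rebuilt from the cell coordinates. The sketch below assumes rook's-move contiguity (cells sharing an edge), which reproduces the total of 26 for the 10 occupied cells; the function name `rook_adjacency` is hypothetical, and the cells are listed in the column-wise cell-numbering order.

```python
# (row, col) of cells 1..10, in the column-wise numbering order of the lattice.
cells = [(2, 1), (3, 1), (1, 2), (2, 2), (3, 2), (4, 2),
         (1, 3), (2, 3), (3, 3), (4, 3)]


def rook_adjacency(cells):
    """Binary adjacency matrix: w[i][j] = 1 when cells i and j share an edge."""
    n = len(cells)
    w = [[0] * n for _ in range(n)]
    for i, (ri, ci) in enumerate(cells):
        for j, (rj, cj) in enumerate(cells):
            # Edge-sharing cells differ by exactly one step in row or column.
            if abs(ri - rj) + abs(ci - cj) == 1:
                w[i][j] = 1
    return w


W = rook_adjacency(cells)
total_ones = sum(map(sum, W))
print(total_ones)
```

Each of the 13 shared edges appears twice in the symmetric matrix, which is where the total of 26 comes from.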
Spatial Autocorrelation
“Spatial” (auto)correlation coefficient
Coordinate (x,y,z) data representation for cells
Spatial weights matrix (binary or other), W={wij}
From last slide: Σ wij=26
Coefficient formulation – desirable properties
Reflects co-variation patterns
Reflects adjacency patterns via weights matrix
Normalised for absolute cell values
Normalised for data variation
Adjusts for number of included cells in totals
Spatial Autocorrelation
Moran’s I
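$I = \frac{1}{p} \frac{\sum_i \sum_j w_{ij} (z_i - \bar{z})(z_j - \bar{z})}{\sum_i (z_i - \bar{z})^2}$, where $p = \sum_i \sum_j w_{ij} / n$,
hence p = 26/10 for our 10 cell example
TSA model:
$r_1 = \frac{\sum_t (x_t - \bar{x})(x_{t+1} - \bar{x})}{\sum_t (x_t - \bar{x})^2}$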
Spatial Autocorrelation
Moran I = 10*16.19/(26*196.68) = 0.0317 ≈ 0
A. Computation of variance/covariance-like quantities, matrix C
B. C*W: Adjustment by multiplication of the weighting matrix, W
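The arithmetic for the 10 cell example can be checked with a short sketch. It assumes the cell values from the irregular lattice slide and rook's-move adjacency (the choice that yields 26 ones in W); `morans_i` and the variable names are illustrative.

```python
# Cell values keyed by (row, col), taken from the 10 cell lattice example.
z_by_cell = {
    (1, 2): 4.55, (1, 3): 5.54,
    (2, 1): 2.24, (2, 2): -5.15, (2, 3): 9.02,
    (3, 1): 3.10, (3, 2): -4.39, (3, 3): -2.09,
    (4, 2): 0.46, (4, 3): -3.06,
}
cells = sorted(z_by_cell)
z = [z_by_cell[c] for c in cells]

# Binary weights: 1 when two cells share an edge (rook's move), else 0.
w = [[1 if abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1 else 0 for b in cells]
     for a in cells]


def morans_i(z, w):
    """Moran's I = (n / sum of weights) * weighted cross-products / sum of squares."""
    n = len(z)
    mean = sum(z) / n
    d = [v - mean for v in z]
    s0 = sum(map(sum, w))
    cross = sum(w[i][j] * d[i] * d[j] for i in range(n) for j in range(n))
    ss = sum(v * v for v in d)
    return (n / s0) * cross / ss


moran = morans_i(z, w)
print(round(moran, 4))
```

The weighted cross-product total corresponds to the 16.19 term and the sum of squared deviations to the 196.68 term in the expression above.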
Spatial Autocorrelation
Moran’s I: $I = \frac{1}{p} \frac{\sum_i \sum_j w_{ij} (z_i - \bar{z})(z_j - \bar{z})}{\sum_i (z_i - \bar{z})^2}$, where $p = \sum_i \sum_j w_{ij} / n$
Modification for point data
Replace weights matrix with distance bands, width h
Pre-normalise z values by subtracting means
Count number of other points in each band, N(h)
$I(h) = N(h) \frac{\sum_i \sum_j z_i z_j}{\sum_i z_i^2}$
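A distance-band version can be sketched as follows. This is an illustration under stated assumptions, not the slide's exact scaling: it treats each band as a binary weights matrix (1 for point pairs whose separation falls in the band) and applies the lattice form of Moran's I, so the factor is n divided by the number of ordered pairs in the band. The function name `moran_by_band`, the half-open bands [k·h, (k+1)·h) and the sample points are all hypothetical.

```python
import math


def moran_by_band(points, h, n_bands):
    """points: list of (x, y, z). Returns one Moran I value (or None) per band."""
    n = len(points)
    mean = sum(p[2] for p in points) / n
    d = [p[2] - mean for p in points]      # pre-normalise z by subtracting the mean
    ss = sum(v * v for v in d)
    cross = [0.0] * n_bands                # sum of z_i * z_j within each band
    pairs = [0] * n_bands                  # ordered point pairs within each band
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist = math.hypot(points[i][0] - points[j][0],
                              points[i][1] - points[j][1])
            band = int(dist // h)          # band k covers [k*h, (k+1)*h)
            if band < n_bands:
                cross[band] += d[i] * d[j]
                pairs[band] += 1
    return [(n / pairs[b]) * cross[b] / ss if pairs[b] else None
            for b in range(n_bands)]


# Five points on a line at unit spacing with a steadily rising value.
sample = [(float(k), 0.0, float(k + 1)) for k in range(5)]
correlogram = moran_by_band(sample, h=1.5, n_bands=2)
print(correlogram)
```

Plotting the returned values against band distance gives a correlogram; here the first band is positive and the second slightly negative.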
Spatial Autocorrelation
Moran I Correlogram
[Figure: source data points, and the Moran I correlogram plotted against lag distance bands, h]
Spatial Autocorrelation
Geary C
Co-variation model uses squared differences
rather than products
$C = \frac{1}{p} \frac{\sum_i \sum_j w_{ij} (z_i - z_j)^2}{\sum_i (z_i - \bar{z})^2}$, where $p = 2 \sum_i \sum_j w_{ij} / (n - 1)$
Similar approach is used in geostatistics
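A minimal sketch of the Geary C calculation, using the standard form of the coefficient, (n - 1) / (2 Σw) times the weighted squared differences over the sum of squared deviations. The four-cell chain and the function name `geary_c` are assumptions made for illustration.

```python
def geary_c(z, w):
    """Geary's C: weighted squared differences scaled by the data variation."""
    n = len(z)
    mean = sum(z) / n
    s0 = sum(map(sum, w))
    p = 2.0 * s0 / (n - 1)
    sq_diff = sum(w[i][j] * (z[i] - z[j]) ** 2
                  for i in range(n) for j in range(n))
    ss = sum((v - mean) ** 2 for v in z)
    return sq_diff / (p * ss)


# Four cells in a row, each adjacent to its immediate neighbours.
chain = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]

c_smooth = geary_c([1.0, 2.0, 3.0, 4.0], chain)       # neighbours are similar
c_alternating = geary_c([1.0, -1.0, 1.0, -1.0], chain)  # neighbours differ sharply
print(c_smooth, c_alternating)
```

Values below 1 indicate that neighbours are more alike than expected (positive autocorrelation) and values above 1 the reverse, so the scale runs in the opposite direction to Moran's I.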
Spatial Autocorrelation
Extending SA concepts
Distance formula weights vs bands
Lattice models with more complex
neighbourhoods and lag models (see GeoDa)
Disaggregation of SA index computations (rowwise) with/without row standardisation (LISA)
Significance testing
Normal model
Randomisation models
Bonferroni/other corrections
Regression modelling
Simple regression – a statistical
perspective
One (or more) dependent (response) variables
One or more independent (predictor) variables
Linear regression is linear in coefficients:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots$, or
$y = x\beta$
Vector/matrix form often used
Over-determined equations & least squares
Regression modelling
Ordinary Least Squares (OLS) model
$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \dots + \varepsilon_i$, or
$y = X\beta + \varepsilon$
Minimise sum of squared errors (or residuals)
Solved for coefficients by matrix expression:
$\hat{\beta} = (X^T X)^{-1} X^T y$
$\mathrm{var}(\hat{\beta}) = \sigma^2 (X^T X)^{-1}$
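The matrix solution can be illustrated with a small pure-Python sketch that forms the normal equations, (XᵀX)β = Xᵀy, and solves them by elimination. The helper names `solve` and `ols` and the four-point dataset are hypothetical; in practice a numerical library would normally be used.

```python
def solve(a, b):
    """Solve a small square system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Least squares coefficients from the normal equations (X^T X) beta = X^T y."""
    k = len(X[0])
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(k)]
    return solve(xtx, xty)


# Design matrix with a constant column and one predictor.
X = [[1.0, x] for x in (0.0, 1.0, 2.0, 3.0)]
y = [1.0, 3.1, 4.9, 7.0]
beta = ols(X, y)
print(beta)
```

With four observations and two coefficients the system is over-determined, so the fitted line minimises the sum of squared residuals rather than passing through every point.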
Regression modelling
OLS – models and assumptions
Model – simplicity and parsimony
Model – over-determination, multi-collinearity
and variance inflation
Typical assumptions
Data are independent random samples from an
underlying population
Model is valid and meaningful (in form and statistically)
Errors are iid
• Independent; No heteroskedasticity; common distribution
Errors are distributed $N(0, \sigma^2)$
Regression modelling
Spatial modelling and OLS
Positive spatial autocorrelation is the norm,
hence dependence between samples exists
Datasets often non-Normal >> transformations
may be required (Log, Box-Cox, Logistic)
Samples are often clustered >> spatial
declustering may be required
Heteroskedasticity is common
Spatial coordinates (x,y) may form part of the
modelling process
Regression modelling
OLS vs GLS
OLS assumes no co-variation
Solution:
$\hat{\beta} = (X^T X)^{-1} X^T y$
GLS models co-variation:
$y \sim N(\mu, C)$ where C is a positive definite covariance matrix
$y = X\beta + u$ where u is a vector of random variables (errors)
with mean 0 and variance-covariance matrix C
Solution:
$\hat{\beta} = (X^T C^{-1} X)^{-1} X^T C^{-1} y$
$\mathrm{var}(\hat{\beta}) = (X^T C^{-1} X)^{-1}$
Regression modelling
GLS and spatial modelling
$y \sim N(\mu, C)$ where C is a positive definite covariance
matrix (C must be invertible)
C may be modelled by inverse distance weighting,
contiguity (zone) based weighting, explicit covariance
modelling…
Other models
Binary data – Logistic models
Count data – Poisson models
Regression modelling
Choosing between models
Information content perspective and AIC
$\mathrm{AIC} = -2 \ln(L) + 2k$
$\mathrm{AICc} = -2 \ln(L) + 2k \left( \frac{n}{n - k - 1} \right)$
where n is the sample size, k is the number of
parameters used in the model, and L is the likelihood
function
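A short sketch of both measures, taking the maximised log-likelihood ln(L) as input. The function names and the example figures (ln(L) = -100, k = 3, n = 30) are made up for illustration.

```python
def aic(log_likelihood, k):
    """Akaike Information Criterion: -2 ln(L) + 2k."""
    return -2.0 * log_likelihood + 2.0 * k


def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC: -2 ln(L) + 2k * n / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return -2.0 * log_likelihood + 2.0 * k * n / (n - k - 1)


# Hypothetical fitted model: log-likelihood -100, 3 parameters, 30 observations.
aic_value = aic(-100.0, 3)
aicc_value = aicc(-100.0, 3, 30)
print(aic_value, round(aicc_value, 4))
```

The correction term grows as k approaches n, so AICc penalises extra parameters more heavily in small samples and converges to AIC as n becomes large. When comparing candidate models, the one with the lower value is preferred.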
Regression modelling
Some ‘regression’ terminology
Simple linear
Multiple
Multivariate
SAR
CAR
Logistic
Poisson
Ecological
Hedonic
Analysis of variance
Analysis of covariance
Regression modelling
Spatial regression – trend surfaces and
residuals (a form of ESDA)
General model:
$y = f(x_1, x_2, w)$
y - observations, f( , , ) - some function, (x1,x2) - plane
coordinates, w - attribute vector
Linear trend surface plot
Residuals plot
2nd and 3rd order polynomial regression
Goodness of fit measures – coefficient of
determination
Regression modelling
Regression & spatial autocorrelation (SA)
Analyse the data for SA
If SA ‘significant’ then
Proceed and ignore SA, or
Permit the coefficient, β, to vary spatially (GWR), or
Modify the regression model to incorporate the SA
Regression modelling
Geographically Weighted Regression (GWR)
Coefficients, β, allowed to vary spatially, β(t)
Model: $y = X\beta(t) + \varepsilon$
Coefficients determined by examining neighbourhoods
of points, t, using distance decay functions (fixed or
adaptive bandwidths)
Weighting matrix, W(t), defined for each point
Solution: $\hat{\beta}(t) = (X^T W(t) X)^{-1} X^T W(t) y$
GLS: $\hat{\beta} = (X^T C^{-1} X)^{-1} X^T C^{-1} y$
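The local fitting idea can be sketched for a single predictor. This is an illustration only: it assumes a Gaussian distance decay function with a fixed bandwidth, and solves the two-parameter weighted least squares problem in closed form. The function name `gwr_local_fit` and the synthetic data (slope 2 in the west, slope 4 in the east) are hypothetical.

```python
import math


def gwr_local_fit(points, target, bandwidth):
    """Weighted least squares fit at one location.

    points: list of (u, v, x, y) giving coordinates, predictor and response.
    Returns (intercept, slope) estimated for the target location (u, v).
    """
    sw = swx = swy = swxx = swxy = 0.0
    for u, v, x, y in points:
        d = math.hypot(u - target[0], v - target[1])
        # Gaussian distance decay (assumed kernel): nearby points weigh more.
        w = math.exp(-0.5 * (d / bandwidth) ** 2)
        sw += w
        swx += w * x
        swy += w * y
        swxx += w * x * x
        swxy += w * x * y
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx * swx)
    intercept = (swy - slope * swx) / sw
    return intercept, slope


# Synthetic transect: the relationship between x and y changes half way along.
xs = [1.0, 3.0, 2.0, 5.0, 4.0, 2.0, 4.0, 1.0, 5.0, 3.0]
points = []
for u, x in enumerate(xs):
    local_slope = 2.0 if u < 5 else 4.0
    points.append((float(u), 0.0, x, 1.0 + local_slope * x))

west = gwr_local_fit(points, (0.0, 0.0), bandwidth=1.0)
east = gwr_local_fit(points, (9.0, 0.0), bandwidth=1.0)
print(west, east)
```

Repeating the fit at every location produces a surface of coefficient estimates that can be mapped. Widening the bandwidth pulls the local estimates towards the single global OLS fit, which is one reason results are sensitive to bandwidth choice.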
Regression modelling
Geographically Weighted Regression
Sensitivity – model, decay function, bandwidth,
point/centroid selection
ESDA – mapping of surface, residuals,
parameters and SEs
Significance testing
Increased apparent explanation of variance
Effective number of parameters
AICc computations
Regression modelling
Geographically Weighted Regression
Count data – GWPR
use of offsets
Fitting by ILSR methods
Presence/Absence data – GWLR
True binary data
Computed binary data - use of re-coding, e.g.
thresholding
Fitting by ILSR methods
Regression modelling
Regression & spatial autocorrelation (SA)
Analyse the data for SA
If SA ‘significant’ then
Proceed and ignore SA, or
Permit the coefficient, β, to vary spatially (GWR)
or
Modify the regression model to incorporate the
SA
Regression modelling
Regression & spatial autocorrelation (SA)
Modify the regression model to incorporate the
SA, i.e. produce a Spatial Autoregressive model
(SAR)
Many approaches – including:
SAR – e.g. pure spatial lag model, mixed model,
spatial error model etc.
CAR – a range of models that assume the expected
value of the dependent variable is conditional on the
(distance weighted) values of neighbouring points
Spatial filtering – e.g. OLS on spatially filtered data
Regression modelling
SAR models
Pure spatial lag: $y = \rho W y + \varepsilon$
(W: spatial weights matrix; ρ: autoregression parameter)
Re-arranging: $y = (I - \rho W)^{-1} \varepsilon$
MRSA model: $y = X\beta + \rho W y + \varepsilon$
(linear regression added)
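The pure spatial lag model can be explored numerically. Rather than inverting (I - ρW) directly, the sketch below uses fixed-point iteration, repeatedly applying y ← ρWy + ε; this is a substitute technique that reaches the same solution when W is row-standardised and |ρ| < 1. The function name `spatial_lag_solve`, the four-cell chain, ρ = 0.5 and the error values are all assumed for illustration.

```python
def spatial_lag_solve(W, eps, rho, iterations=200):
    """Solve y = rho * W y + eps by fixed-point iteration."""
    y = eps[:]
    for _ in range(iterations):
        wy = [sum(wij * yj for wij, yj in zip(row, y)) for row in W]
        y = [rho * lag + e for lag, e in zip(wy, eps)]
    return y


# Row-standardised weights for four cells in a row (each row sums to 1).
W = [[0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]
eps = [1.0, -1.0, 0.5, 2.0]
rho = 0.5

y = spatial_lag_solve(W, eps, rho)

# Check: the solution should satisfy y - rho * W y = eps.
residual = [yi - rho * sum(wij * yj for wij, yj in zip(row, y)) - e
            for row, yi, e in zip(W, y, eps)]
print(y, max(abs(r) for r in residual))
```

Each pass spreads a further share of every cell's error term to its neighbours, which shows how a disturbance at one location propagates across the whole lattice under this model.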
Regression modelling
SAR models
Spatial error model: linear regression + spatial error
$y = X\beta + \varepsilon$, where $\varepsilon = \lambda W \varepsilon + u$
(λWε: spatially weighted error vector; u: iid error vector)
Substituting and re-arranging:
$y = X\beta + \lambda W (y - X\beta) + u$, or
$y = X\beta + \lambda W y - \lambda W X\beta + u$
(Xβ: linear regression (global); λWy: SAR lag; λWXβ: local trend; u: iid error vector)
Regression modelling
CAR models
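Standard CAR model:
$E(y_i \mid \text{all } y_{j \ne i}) = \mu_i + \rho \sum_{j \ne i} w_{ij} (y_j - \mu_j)$
(μ_i: expected value at i; ρ: autoregression parameter; the sum is a weighted mean for the neighbourhood of i)
Local weights matrix, W = {w_ij}: distance or contiguity
Variance: $\mathrm{var}(y) = (I - \rho W)^{-1} M$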
Different models for W and M provide a range of
CAR models
Regression modelling
Spatial filtering
Apply a spatial filter to the data to remove SA
effects
Model the filtered data
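Example: fit $y = X\beta + \varepsilon$ to the filtered data, i.e.
$y - \rho W y = X\beta - \rho W X\beta + \varepsilon$, or
$(I - \rho W) y = (I - \rho W) X\beta + \varepsilon$, hence
$y = X\beta + (I - \rho W)^{-1} \varepsilon$
(spatial filter: $I - \rho W$)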