Regression / Calibration
MLR, RR, PCR, PLS
Paul Geladi
Head of Research NIRCE
Unit of Biomass Technology and Chemistry
Swedish University of Agricultural Sciences
Umeå
Technobothnia
Vasa
[email protected] [email protected]
Univariate regression
[Figure: straight-line fit of y against x, showing slope b and offset a]

y = a + bx + e

a : offset
b : slope
e : residual
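As a concrete illustration, a minimal numpy sketch of fitting this univariate model by least squares (the data values are invented for the example):

```python
import numpy as np

# Toy data: y depends roughly linearly on x (illustrative values only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Least-squares fit of y = a + bx; polyfit returns [slope, offset].
b, a = np.polyfit(x, y, deg=1)
e = y - (a + b * x)                      # residuals
print(f"offset a = {a:.3f}, slope b = {b:.3f}")
```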
[Figure: four fits of y against x: linear fit, underfit, overfit, quadratic fit]
Multivariate linear regression
y = f(x) works sometimes, but only for a few variables: with measurement noise there are ∞ possible functions.

[Figure: data matrix X (I × K) and response vector y (I × 1)]

Simplified by a linear approximation:

y = b0 + b1x1 + b2x2 + ... + bKxK + f
Nomenclature
y = b0 + b1x1 + b2x2 + ... + bKxK + f
y : response
xk : predictors
bk : regression coefficients
b0 : offset, constant
f : residual
[Figure: X (I × K) and y (I × 1)]
With X and y mean-centered, b0 drops out:
y = b1x1 + b2x2 + ... + bKxK + f
The same equation holds for each of the I samples; stacking all I equations gives the matrix form (y is I × 1, X is I × K, b is K × 1):
y = Xb + f
X and y are known (measurable); b and f are unknown. Without a constraint on f there is no unique solution, so f must be constrained.
The MLR solution
Multiple Linear Regression
Ordinary Least Squares (OLS)
b = (X'X)^-1 X'y

This is the least-squares solution.
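A minimal numpy sketch of the OLS formula above (function name and data are illustrative, not from the slides; solving the normal equations is numerically safer than forming the inverse explicitly):

```python
import numpy as np

def mlr_ols(X, y):
    """MLR/OLS solution b = (X'X)^-1 X'y for mean-centered X (I x K), y (I,)."""
    # Solve the normal equations rather than forming the inverse explicitly.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Illustrative data: I = 20 samples, K = 3 predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
X -= X.mean(axis=0)                          # mean-center X
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=20)
y -= y.mean()                                # mean-center y
b = mlr_ols(X, y)
```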
Problems?
One solution (two equations, two unknowns):
3b1 + 4b2 = 1
4b1 + 5b2 = 0

No solution (three equations, two unknowns):
3b1 + 4b2 = 1
4b1 + 5b2 = 0
b1 + b2 = 4

∞ solutions (two equations, three unknowns):
3b1 + 4b2 + b3 = 1
4b1 + 5b2 + b3 = 0
b = (X'X)^-1 X'y

- K > I: ∞ solutions
- I > K: no exact solution
- error in X
- error in y
- inverse may not exist
- inverse may be unstable
Adding a residual term makes the overdetermined system solvable:

3b1 + 4b2 + e1 = 1
4b1 + 5b2 + e2 = 0
b1 + b2 + e3 = 4

Wanted solution:
- I ≥ K
- no unstable inverse
- no noise in X
Diagnostics
y = Xb + f
SStot = SSmod + SSres
R2 = SSmod / SStot = 1 - SSres / SStot
Coefficient of determination
Diagnostics
y = Xb + f
SSres = f'f
RMSEC = [ SSres / (I - A) ]^1/2
Root Mean Squared Error of Calibration (A = number of model components)
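A sketch of these calibration diagnostics in numpy (assuming mean-centered y; A is the number of fitted components, and the function name is mine):

```python
import numpy as np

def calibration_diagnostics(y, yhat, A):
    """R2 and RMSEC for a calibration fit; y assumed mean-centered."""
    f = y - yhat                         # residual
    ss_res = f @ f                       # SSres = f'f
    ss_tot = y @ y                       # SStot (y is mean-centered)
    r2 = 1.0 - ss_res / ss_tot           # R2 = 1 - SSres/SStot
    rmsec = np.sqrt(ss_res / (len(y) - A))
    return r2, rmsec
```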
Alternatives to MLR/OLS
Ridge Regression (RR)
In b = (X'X)^-1 X'y, the identity matrix I is the easiest matrix to invert, so a small multiple of it is added:

b = (X'X + kI)^-1 X'y

with the ridge constant k as small as possible.
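A sketch of the ridge formula under the same conventions (k must be supplied; choosing it is the problem noted below):

```python
import numpy as np

def ridge(X, y, k):
    """Ridge regression b = (X'X + kI)^-1 X'y for mean-centered X, y."""
    K = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(K), X.T @ y)
```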
Problems
- Choice of ridge constant
- No diagnostics
Principal Component Regression (PCR)
- I ≥ K
- Easy inversion
Principal Component Regression (PCR)

[Figure: PCA compresses X (I × K) into the score matrix T (I × A)]

- A ≤ I
- T orthogonal
- Noise in X removed
Principal Component Regression (PCR)

y = Td + f
d = (T'T)^-1 T'y   (T is orthogonal, so T'T is diagonal and easy to invert)
Problem: how many components to use?

Advantages:
- PCA is done on the data itself (reveals outliers and classes)
- Noise in X removed
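A sketch of PCR via the SVD of mean-centered X (equivalent to PCA; names are mine, and A ≤ rank(X) is assumed). Since T'T is diagonal, the inverse in d = (T'T)^-1 T'y is an elementwise division:

```python
import numpy as np

def pcr(X, y, A):
    """PCR: PCA of mean-centered X via SVD, then regress y on the scores T."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    T = U[:, :A] * s[:A]              # scores T = U S (I x A)
    P = Vt[:A].T                      # loadings (K x A)
    # d = (T'T)^-1 T'y: T'T is diagonal, so just divide by s^2.
    d = (T.T @ y) / s[:A] ** 2
    b = P @ d                         # regression vector in the original variables
    return b, T, P
```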
Partial Least Squares Regression (PLS)

[Figure: PLS decomposes X into scores t, weights w' and loadings p', and Y into scores u and loadings q' (A components each). The outer relationships model X and Y separately; the inner relationship links the Y-scores u to the X-scores t.]
Advantages
- X decomposed
- Y decomposed
- Noise in X left out
- Noise in Y left out
PCR and PLS are one-component-at-a-time methods: after each component a residual is calculated, and the next component is calculated on that residual.
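A sketch of this one-component-at-a-time idea for PLS1 (a single y), in the NIPALS style: each round extracts a component, then deflates X and y so the next component is fitted to the residuals. Variable names follow the t, w, p, q notation of the slides; the details are a common textbook variant, not necessarily the exact algorithm of the lecture:

```python
import numpy as np

def pls1_nipals(X, y, n_components):
    """PLS1 regression, one component at a time (X, y mean-centered)."""
    X = X.copy().astype(float)
    y = y.copy().astype(float)
    W, P, Q = [], [], []
    for _ in range(n_components):
        w = X.T @ y
        w /= np.linalg.norm(w)           # X-weights w
        t = X @ w                        # X-scores t
        tt = t @ t
        p = X.T @ t / tt                 # X-loadings p
        q = (y @ t) / tt                 # y-loading q (inner relation)
        X -= np.outer(t, p)              # deflate: next component fits the residual
        y -= t * q
        W.append(w); P.append(p); Q.append(q)
    W = np.array(W).T; P = np.array(P).T; Q = np.array(Q)
    # Regression vector for the original (undeflated) X: b = W (P'W)^-1 q
    return W @ np.linalg.solve(P.T @ W, Q)
```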
Another view
y = Xb + f
y = XbRR + fRR
y = XbPCR + fPCR
y = XbPLS + fPLS
[Figure: regression vectors in (b1, b2, b3) space: the OLS vector, a subspace of useful regression vectors, shrunk and rotated vectors (RR, PCR, PLS), and a regression vector with too much shrinkage]
Prediction
[Figure: calibration set Xcal (I × K) with ycal; test set Xtest (J × K) with known ytest and predicted yhat]
Prediction diagnostics
yhat = Xtest b
ftest = ytest - yhat
PRESS = ftest'ftest
RMSEP = [ PRESS / J ]^1/2
Root Mean Squared Error of Prediction
Prediction diagnostics
yhat = Xtest b
ftest = ytest - yhat
R2test = Q2 = 1 - ftest'ftest / ytest'ytest
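A sketch of these test-set diagnostics, including the prediction bias defined a few slides below (names are mine):

```python
import numpy as np

def prediction_diagnostics(Xtest, ytest, b):
    """RMSEP, Q2 and bias on an independent test set of J samples.

    Assumes Xtest and ytest were centered with the calibration means."""
    yhat = Xtest @ b
    ftest = ytest - yhat
    press = ftest @ ftest                  # PRESS = ftest'ftest
    rmsep = np.sqrt(press / len(ytest))    # RMSEP = [PRESS / J]^1/2
    q2 = 1.0 - press / (ytest @ ytest)     # Q2 = 1 - ftest'ftest / ytest'ytest
    bias = ftest.mean()                    # bias = (1/J) * sum(ftest)
    return rmsep, q2, bias
```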
Some rules of thumb
- R2 > 0.65 (with 5 PLS components)
- R2test > 0.5
- R2 - R2test < 0.2
Bias
In calibration, f = y - Xb always has zero bias.
In prediction:
ftest = ytest - yhat
bias = (1/J) Σ ftest
Leverage - influence
b = (X'X)^-1 X'y
yhat = Xb = X(X'X)^-1 X'y = Hy

H is the hat matrix; its diagonal elements are the leverages.
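A sketch of computing leverages from the hat matrix (this direct version mirrors the formula; for large I one would avoid forming H in full):

```python
import numpy as np

def leverages(X):
    """Diagonal of the hat matrix H = X (X'X)^-1 X'."""
    H = X @ np.linalg.solve(X.T @ X, X.T)
    return np.diag(H)
```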
Residual plot

[Figure: test residuals ftest against predicted response ypred, with the zero line marked; annotations show an outlier, small variance, large variance, biased and unbiased behaviour, and heteroscedasticity]
- Check the histogram of f
- Check E variablewise
- Check E objectwise
[Figure: predicted vs measured response, panels A-G. A: linear, low noise, homoscedastic, non-biased. B: nonlinear. C: noisy. D: biased. E: heteroscedastic. F: outlier by extrapolation. G: bad outlier.]
Plotting: line plots
- Scree plot: RMSEC, RMSECV, RMSEP
- Loading plot against wavelength
- Score plot against time
- Residual against sample
- Residual against yhat
- T2 against sample
- H against sample
Plotting: scatter plots (2D, 3D)
- Score plot
- Loading plot
- Biplot
- H against residual
- Inner relation: t against u
- Weight plot: w and q
Nonlinearities

Remedies for nonlinearities: making nonlinear data fit a linear model, or making the model nonlinear.
- Fundamental theory (e.g. going from transmittance to absorbance)
- Use extra latent variables in PCR or PLSR
- Use transformations of latent variables
- Remove disturbing variables
- Find subsets that behave linearly
- Use intrinsically nonlinear methods
- Locally transform variables X, y, or both nonlinearly (powers, logarithms, adding powers)
- Transformation in a neighbourhood (window methods)
- Use global transformations (Fourier, wavelet)
- GIFI-type discretization