Document 7126219


The rank of a product of two matrices X and Y is at most the smaller of the ranks of X and Y:
rank(XY) ≤ min(rank(X), rank(Y))
[Figure: an absorbance data matrix A written as the product of a concentration matrix C and a spectra matrix S; one panel shows pure-component spectra (absorbance vs. wavelength, 400-600 nm), the other concentration profiles (concentration vs. time, 0-12 s).]
Eigenvectors and Eigenvalues
For a symmetric, real matrix R, an eigenvector v is obtained from:
Rv = λv
λ is an unknown scalar, the eigenvalue.
Rv - λv = 0   so   (R - λI)v = 0
The vector v is orthogonal to all of the row vectors of the matrix (R - λI).
A = [0.1 0.2; 0.2 0.4; 0.3 0.6]
R = ATA = [0.14 0.28; 0.28 0.56]
Rv = λv   (R - λI)v = 0
[0.14-λ 0.28; 0.28 0.56-λ] [v1; v2] = [0; 0]
det [0.14-λ 0.28; 0.28 0.56-λ] = 0
(0.14 - λ)(0.56 - λ) - (0.28)(0.28) = 0
λ2 - 0.7λ = 0
λ(λ - 0.7) = 0
λ1 = 0.7  and  λ2 = 0
For λ1 = 0.7:
[0.14-0.7 0.28; 0.28 0.56-0.7] [v11; v21] = [0; 0]
-0.56 v11 + 0.28 v21 = 0
0.28 v11 - 0.14 v21 = 0
v21 = 2 v11
If v11 = 1, then v21 = 2
Normalized vector v1 = [0.4472; 0.8944]
For λ2 = 0:
[0.14 0.28; 0.28 0.56] [v12; v22] = [0; 0]
0.14 v12 + 0.28 v22 = 0
0.28 v12 + 0.56 v22 = 0
v12 = -2 v22
If v22 = 1, then v12 = -2
Normalized vector v2 = [-0.8944; 0.4472]
A = [0.1 0.2; 0.2 0.4; 0.3 0.6]     R = ATA = [0.14 0.28; 0.28 0.56]
Rv = λv; collecting both eigenvectors: RV = VL
V = [0.4472 -0.8944; 0.8944 0.4472]     L = [0.7 0; 0 0]
v1Tv2 = 0
tr(R) = ∑ λi = 0.7 + 0.0 = 0.7
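As a check, the same numbers can be reproduced with MATLAB's eig function. The snippet below is a minimal sketch (it is not one of the course m-files) and the variable names are illustrative.

```matlab
% Minimal check of the worked example (not one of the course m-files).
A = [0.1 0.2; 0.2 0.4; 0.3 0.6];   % three samples, two variables
R = A' * A;                        % [0.14 0.28; 0.28 0.56]
[V, L] = eig(R);                   % columns of V: eigenvectors; diag(L): eigenvalues
disp(diag(L)')                     % expected 0 and 0.7 (order may differ)
disp(V)                            % expected [-0.8944 0.4472; 0.4472 0.8944], up to sign
disp(trace(R) - sum(diag(L)))      % trace of R equals the sum of the eigenvalues (~0)
```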
More generally, if R (p x p) is symmetric with rank r ≤ p, then R possesses r positive eigenvalues and (p - r) zero eigenvalues.
Example
Consider 15 samples, each containing 3 absorbing components.
Show that in the presence of random noise the number of non-zero eigenvalues is larger than the number of components.
?
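A possible sketch for this exercise is shown below. The Gaussian pure spectra, the uniform random concentrations and the 0.005 noise level are assumptions chosen only for illustration.

```matlab
% Sketch for the exercise; pure spectra, concentrations and noise level are
% illustrative assumptions.
nw = 50;  w = linspace(400, 600, nw);         % wavelength axis
E  = [exp(-((w-450)/20).^2); ...              % pure spectra of the 3 components
      exp(-((w-500)/20).^2); ...
      exp(-((w-550)/20).^2)];
C  = rand(15, 3);                             % 15 samples x 3 concentrations
A  = C * E;                                   % ideal absorbance data (15 x nw)
An = A + 0.005 * randn(size(A));              % add normally distributed noise
disp(rank(A))                                 % noise-free: 3 non-zero eigenvalues
ev = sort(eig(An' * An), 'descend');
disp(ev(1:6)')                                % with noise, more than 3 are non-zero
```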
Variance-Covariance Matrix
X = [x11 - mx1   x12 - mx2   …   x1p - mxp
     x21 - mx1   x22 - mx2   …   x2p - mxp
     …
     xn1 - mx1   xn2 - mx2   …   xnp - mxp]     (column mean centered matrix)

XTX = [var(x1)       covar(x1x2)   …   covar(x1xp)
       covar(x2x1)   var(x2)       …   covar(x2xp)
       …
       covar(xpx1)   covar(xpx2)   …   var(xp)]
mmcn.m file for mean centering a matrix
Use the anal.m and mmcn.m files and verify that each eigenvalue of an absorbance data matrix is correlated with the variance of the data.
?
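A minimal sketch of column mean centering and of the eigenvalue/variance link is given below; it plays the role described for mmcn.m but is not the original file, and the random data matrix is only a stand-in for real absorbance data.

```matlab
% Sketch of column mean centering and the eigenvalue/variance link.
X  = rand(15, 5);                              % e.g. an absorbance data matrix
Xc = X - repmat(mean(X, 1), size(X, 1), 1);    % subtract the mean of each column
covX = (Xc' * Xc) / (size(X, 1) - 1);          % variance-covariance matrix
disp(sum(eig(covX)) - sum(var(X)))             % sum of eigenvalues = total variance (~0)
```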
Singular Value Decomposition
SVD of a rectangular matrix X is a method which yields, at the same time, a diagonal matrix of singular values S and the two matrices of singular vectors U and V such that:
X = U S VT
UTU = VTV = Ir
The singular vectors in U and V are identical to the eigenvectors of XXT and XTX, respectively, and the singular values are equal to the positive square roots of the corresponding eigenvalues.
X = U S VT
XT = V S UT
X XT = U S VT V S UT = U S2 UT
(X XT) U = US2
[Diagram: block dimensions of the SVD, X (n x m) = U S VT.]
If the rank of matrix X is r, then:
X = U S VT = s1u1v1T + … + srurvrT
[Diagram: X (n x m) = U (n x r) S (r x r) VT (r x m).]
Singular value decomposition
with MATLAB
Consider 15 samples containing 2 components with strong spectral overlap, construct their absorbance data matrix, and add random noise to it.
nd - A = R1     (noised data minus ideal data: residual R1)
rd - A = R2     (reconstructed data minus ideal data: residual R2)
It can be shown that the reconstructed data matrix is closer to the ideal data matrix.
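A sketch of this comparison is given below, assuming a simulated two-component data matrix (Gaussian pure spectra, random concentrations, 0.005 noise) and r = 2 retained factors.

```matlab
% Sketch of the residual comparison; data simulation details are assumptions.
w  = linspace(400, 600, 50);
E  = [exp(-((w-480)/30).^2); exp(-((w-520)/30).^2)];  % strongly overlapping spectra
A  = rand(15, 2) * E;                                 % ideal data matrix (15 samples)
nd = A + 0.005 * randn(size(A));                      % noised data matrix
[U, S, V] = svd(nd);
r  = 2;                                               % number of factors retained
rd = U(:, 1:r) * S(1:r, 1:r) * V(:, 1:r)';            % reconstructed data matrix
disp([norm(nd - A, 'fro'), norm(rd - A, 'fro')])      % ||R1|| vs ||R2||: R2 is smaller
```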
Anal.m file for constructing the
data matrix
Spectral overlapping of two
absorbing species
Ideal data matrix A
Noised data matrix, nd, with 0.005 normally distributed random noise
nf.m file for investigating the noise-filtering property of the SVD-reconstructed data
Plot the % relative standard error as a function of the number of eigenvectors.
?
Principal Component Analysis (PCA)
PCA transforms the original data matrix into a matrix of new variables (scores):
[x11 x12 … x1,14; x21 x22 … x2,14]   →  PCA  →   [u11 u12 … u1,14; u21 u22 … u2,14]
Principal Components in two Dimensions
          λ1     λ2
s1        0.1    0.2
s2        0.2    0.4
s3        0.3    0.6

u1 = a x1 + b x2
u2 = c x1 + d x2
In the principal components model, new variables are found which give a clear picture of the variability of the data. This is best achieved by giving the first new variable maximum variance; the second new variable is then selected so as to be uncorrelated with the first one, and so on.
The new variables can be uncorrelated if:
ac + bd =0 Orthogonality constraint
With a = 1, b = 2, c = -1, d = 0.5:
x1 = [0.1; 0.2; 0.3]     x2 = [0.2; 0.4; 0.6]
u1 = [0.5; 1.0; 1.5]     var(u1) = 0.25
With a = 2, b = 4, c = -2, d = 1:
u1 = [1.0; 2.0; 3.0]     var(u1) = 1.0
Normalizing constraint
a2 + b2 = 1     c2 + d2 = 1
Both sets of coefficients (a = 1, b = 2, c = -1, d = 0.5) and (a = 2, b = 4, c = -2, d = 1) normalize to the same values:
a = 0.4472, b = 0.8944, c = -0.8944, d = 0.4472
Maximum variance constraint
u1 = a x1 + b x2
s2u1 = a2 s2x1 + b2 s2x2 + 2ab sx1x2
s2u1 = [a  b] [s2x1  sx1x2; sx1x2  s2x2] [a; b]
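The quadratic-form expression can be checked against the earlier two-variable example; the snippet below is a small verification sketch using the normalized coefficients a = 0.4472 and b = 0.8944.

```matlab
% Verification of the variance expression with the earlier two-variable data.
x1 = [0.1; 0.2; 0.3];  x2 = [0.2; 0.4; 0.6];
a  = 0.4472;  b = 0.8944;
u1 = a*x1 + b*x2;
C  = cov([x1 x2]);                % 2x2 variance-covariance matrix of x1 and x2
disp(var(u1))                     % direct variance of u1 (~0.05)
disp([a b] * C * [a; b])          % the quadratic form gives the same value
```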
Principal Components in m Dimensions
X = [x11 x12 … x1m
     x21 x22 … x2m
     …
     xn1 xn2 … xnm]

u1 = v11 x1 + v12 x2 + … + v1m xm
u2 = v21 x1 + v22 x2 + … + v2m xm
…
um = vm1 x1 + vm2 x2 + … + vmm xm
The variance-covariance matrix C (m x m) of the data:
C = [var(x1)       covar(x1x2)   …   covar(x1xm)
     covar(x2x1)   var(x2)       …   covar(x2xm)
     …
     covar(xmx1)   covar(xmx2)   …   var(xm)]
Its eigenvectors are the columns of V; for the first eigenvector v1 = [v11; v21; … ; vm1]:
C v1 = var(u1) v1
In matrix form, the new variables (scores) are obtained by projecting X onto V:
X V = U
X (n x m), V (m x m, the loading vectors), U (n x m, the score vectors)
X V VT = U VT
Since V VT = I:   X = U VT = S LT
Comparing with the SVD, X = U S VT, the loadings are LT = VT and the score matrix S equals U S, i.e. the left singular vectors scaled by the singular values.
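A short sketch of this scores/loadings relation is given below; the column mean centering and the random test matrix are illustrative assumptions, not a reproduction of the course files.

```matlab
% Sketch of the scores/loadings relation.
X  = rand(10, 4);
Xc = X - repmat(mean(X, 1), size(X, 1), 1);
[U, S, V] = svd(Xc, 'econ');
scores   = U * S;                               % score matrix ("S = U S" above)
loadings = V;                                   % loading vectors
disp(norm(Xc - scores * loadings', 'fro'))      % X = S LT  (~0)
disp(norm(Xc * V - scores, 'fro'))              % X V = U   (~0)
```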
More generally, when one analyzes a data matrix consisting of n objects for which m variables have been determined, m principal components can be extracted (as long as m < n).
PC1 represents the direction in the data
containing the largest variation. PC2 is
orthogonal to PC1 and represents the direction
of the largest residual variation around PC1.
PC3 is orthogonal to the first two and represents
the direction of the highest residual variation
around the plane formed by PC1 and PC2.
PCA.m file
anal.m file
10 mixtures of two
components
Perform PCA on a data matrix obtained from an evolutionary process, such as kinetic data (kin.m file), and interpret the score vectors.
?
Classification with PCA
The most informative view of a data set, in
terms of variance at least, will be given by
consideration of the first two PCs. Since the
scores matrix contains a value for each sample
corresponding to each PC, it is possible to plot
these values against one another to produce a
low dimensional picture of a high-dimensional
data set.
Suppose there are 20 samples from two different classes.
[Figure: absorbance spectra (400-600 nm) of Class I and Class II samples, shown separately and overlaid; a plot of Abs. (λ2) vs. Abs. (λ1); and a PC2 vs. PC1 score plot of the samples.]
Multiple Linear Regression (MLR)
y = X b
[y1; y2; … ; yn] = [x11 x12 … x1p; x21 x22 … x2p; … ; xn1 xn2 … xnp] [b1; b2; … ; bp]
y (n x 1), X (n x p), b (p x 1)
y = b1 x1 + b2 x2 + … + bp xp
If p>n
a1 = e1 c11 + e2 c12 + e3 c13
a2 = e1 c21 + e2 c22 + e3 c23
There is an infinite number of solutions for e, all of which fit the equations.
If p=n
a1 = e1 c11 + e2 c12 + e3 c13
a2 = e1 c21 + e2 c22 + e3 c23
a3 = e1 c31 + e2 c32 + e3 c33
This gives a unique solution for e, provided that the coefficient matrix C has full rank.
If p<n
a1 = e1 c11 + e2 c12 + e3 c13
a2 = e1 c21 + e2 c22 + e3 c23
a3 = e1 c31 + e2 c32 + e3 c33
a4= e1 c41 + e2 c42 + e3 c43
This does not allow an exact solution for e, but one can get a solution by minimizing the length of the residual vector.
The least squares solution is
e = (CTC)-1 CT a
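A minimal least-squares sketch for the overdetermined case (p < n) is shown below; the matrix C and vector a are simulated purely for illustration.

```matlab
% Minimal least-squares sketch for p < n; C and a are simulated for illustration.
C = rand(4, 3);                        % 4 equations, 3 unknowns
e_true = [1; 2; 3];
a = C * e_true + 0.01 * randn(4, 1);   % measurements with a little noise
e = (C' * C) \ (C' * a);               % normal equations: e = (C'C)^(-1) C'a
disp(e')                               % close to [1 2 3]
disp((C \ a)')                         % MATLAB's backslash gives the same solution
```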
Least Squares in Matrix Equations
y = X b
y (n x 1), X (n x p), b (p x 1)
Written column-wise, y (n x 1) is a weighted sum of the n x 1 columns of X:
y = b1 x1 + b2 x2 + … + bp xp
For solving this system, the residual Xb - y must be perpendicular to the column space of X.
Suppose the vector Xc is a linear combination of the columns of X:
(Xc)T [Xb - y] = 0
cT [XTXb - XTy] = 0
XTXb = XTy
b = (XTX)-1 XT y
The projection of y onto the column space of X
is therefore
p=Xb = (X (XTX)-1 XT )y
Least Squares Solution
[Diagram: projection of the y vector into the column space of X; the error vector is perpendicular to all columns of the X matrix.]
MLR with more than one dependent variable
Y = X B
Y (n x m) = X (n x p) B (p x m)
The columns of Y are the individual dependent variables (y1, y2, y3 in the diagram) and the columns of B the corresponding coefficient vectors (b1, b2, b3).
B = (XTX)-1 XT Y
Classical Least Squares (CLS)
A = C K
A (n x m) = C (n x p) K (p x m)
Calibration step
K = (CTC)-1 CT A
The number of calibration standards should at
least be as large as the number of analytes
The rank of C must be equal to p
Prediction step
aunT = cunT K
cun = (KKT)-1 K aun
The number of wavelengths must be equal to or larger than the number of components.
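A sketch of the CLS calibration and prediction steps on simulated data is given below; the pure spectra, the number of standards and the noise level are assumptions, and this is not one of the course m-files.

```matlab
% CLS calibration and prediction on simulated data (assumed setup).
m = 50;  w = linspace(400, 600, m);
Ktrue = [exp(-((w-470)/25).^2); exp(-((w-530)/25).^2)];  % pure spectra (p x m)
C     = rand(6, 2);                                      % calibration concentrations
A     = C * Ktrue + 0.002 * randn(6, m);                 % calibration spectra
K     = (C' * C) \ (C' * A);            % calibration step: K = (C'C)^(-1) C'A
c_un  = [0.3 0.7];                      % "unknown" sample, assumed for the demo
a_un  = c_un * Ktrue + 0.002 * randn(1, m);
c_pred = a_un * K' / (K * K');          % prediction: c_un = a_un K' (K K')^(-1)
disp(c_pred)                            % close to [0.3 0.7]
```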
Advantages of CLS
Full spectral domain is used for estimating each
constituent. Using redundant information has an
effect equivalent to replicated measurement and
signal averaging, hence it improves the precision
of the concentration estimates.
Disadvantages of CLS
The concentrations of all the constituents in the calibration set have to be known.
Simultaneous
determination of two
components with CLS
Random design of concentration
matrix
Pure component spectra
Absorbance data matrix
Data matrices for mlr.m file
mlr.m file for multiple linear
regression
Predicted concentrations
Real concentrations
Use the CLS method for determination of one component in binary mixture samples.
?
Inverse Least Squares (ILS)
c = A b
c1 (n x 1) = A (n x p) b (p x 1)
Calibration step
b = (ATA)-1 AT c1
The number of calibration samples should at
least be as large as the number of wavelengths
The rank of A must be equal to p
Prediction step
cun = aunT b
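A corresponding ILS sketch on simulated data is shown below; the three constituents, the noise level and the wavelength count are illustrative assumptions, not the course ILS.m file.

```matlab
% ILS sketch: more calibration samples (n) than wavelengths (p); assumed data.
p  = 6;  n = 15;
E  = rand(3, p);                       % spectra of 3 constituents at p wavelengths
Cc = rand(n, 3);                       % calibration concentrations (all constituents)
A  = Cc * E + 0.002 * randn(n, p);     % calibration absorbance matrix (n x p)
c1 = Cc(:, 1);                         % known concentrations of the analyte only
b  = (A' * A) \ (A' * c1);             % calibration step: b = (A'A)^(-1) A'c1
c_un_true = rand(1, 3);                % "unknown" sample
a_un = c_un_true * E + 0.002 * randn(1, p);
disp([c_un_true(1), a_un * b])         % real vs. predicted analyte concentration
```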
Advantages of ILS
It is not necessary to know all the information on possible constituents, analyte of interest and interferents.
The method can work in principle when unknown chemical interferents are present. It is important that such interferents are present in the calibration samples.
Disadvantages of ILS
The number of calibration samples should at
least be as large as the number of wavelengths
Determination of x in the
presence of y by ILS
method
15 x 9 absorbance data matrix
ILS.m file
ILS calibration
Real concentrations
Predicted concentrations
In the ILS method, does the accuracy of the final results depend on the number of wavelengths?
?
Principal Component Regression (PCR)
PCR is simply PCA followed by a regression step
A = C E = S L
A = C E = (S R)(R-1 L)
C = S R
c1 = S r
A data matrix can be represented by its score matrix.
A regression of the score matrix against one or several dependent variables is possible, provided that scores corresponding to small eigenvalues are omitted.
This regression presents no matrix-inversion problem.
PCR has the full-spectrum advantages of the CLS method.
PCR has the ILS advantage of being able to perform the analysis one chemical component at a time while avoiding the ILS wavelength-selection problem.
Validation
How many meaningful principal components
should be retained?
*Percentage of explained variance
If all possible PCs are used in the model 100% of
the variance is explained
sd2 = ( ∑i=1..d λi / ∑i=1..p λi ) x 100
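A small sketch of this criterion is given below; it plays the role described for pev.m but is not the original file, and the two-component data matrix nd is simulated.

```matlab
% Sketch of the percentage-of-explained-variance criterion (assumed data).
nd = rand(20, 2) * rand(2, 40) + 0.005 * randn(20, 40);  % 2 components + noise
lambda = sort(eig(nd' * nd), 'descend');                 % eigenvalues
pev = 100 * cumsum(lambda) / sum(lambda);                % cumulative % of variance
disp(pev(1:5)')                                          % close to 100% from d = 2 on
```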
Percentage of explained variance for
determination of number of PCs
Spectra of 20 samples with various amounts of 2 components
Pev.m file for
percentage of explained variance method
Performing pev.m file on nd absorbance
data matrix
Show that the validity of the results of the percentage of explained variance method depends on the spectral overlap of the individual components.
?
[Diagram: A (n x p) is decomposed by PCA into a score matrix and a loading matrix; how many factors (n or r?) should be retained?]
Cross-Validation
[Diagram: one sample (its 1 x p spectrum a and its concentration c) is left out of A (n x p); PCA is performed on the remaining (n-1) x p matrix A' and the left-out sample is then predicted.]
Creating absorbance data for
performing cross-validation
method
Spectra of 15 samples with various amounts of 3 components
cross.m file for
PCR cross-validation
PCR cross-validation
PRESS
cross-validation plot
[Plot: PRESS vs. number of factors (0 to 15).]
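A leave-one-out PCR cross-validation sketch is shown below; it follows the logic described for cross.m, but the data simulation and the PCR details (SVD-based scores, 8 factors tried) are assumptions.

```matlab
% Leave-one-out PCR cross-validation sketch (assumed data and PCR details).
E = rand(3, 30);  C = rand(15, 3);              % 3 components, 15 samples
A = C * E + 0.005 * randn(15, 30);
c = C(:, 1);                                    % analyte of interest
PRESS = zeros(1, 8);
for f = 1:8                                     % number of factors tried
    for i = 1:15                                % leave sample i out
        cal = setdiff(1:15, i);
        [U, S, V] = svd(A(cal, :), 'econ');
        Scal = U(:, 1:f) * S(1:f, 1:f);         % calibration scores
        b    = Scal \ c(cal);                   % regression of c on the scores
        si   = A(i, :) * V(:, 1:f);             % scores of the left-out sample
        PRESS(f) = PRESS(f) + (si * b - c(i))^2;
    end
end
plot(1:8, PRESS, '-o'), xlabel('no of factors'), ylabel('PRESS')
```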
Calibration and Prediction Steps in PCR
Calibration Step
c = S b
c1 (n x 1) = S (n x r) b (r x 1)
b = (STS)-1 ST c
Prediction Step
Sx = Ax L
Sx (m x r) = Ax (m x p) L (p x r)
cx = Sx b
Pcr.m file for calibration and
prediction by PCR method
Spectra of 20 samples with various amounts of 3 components
Input data for pcr.m file
pcr.m function
Predicted and real values for
first component
Compare the CLS, ILS and PCR methods for prediction in a two-component system with strong spectral overlap.
?