Recitation: SVD and dimensionality reduction
Zhenzhen Kou
Thursday, April 21, 2005
SVD
• Intuition: find the axis that shows the greatest variation, and project all points onto this axis.
[Figure: data points in the (f1, f2) plane, with principal axes e1 and e2.]
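As a concrete illustration of this intuition (a sketch, not part of the original slides; the point values are made up), a minimal NumPy snippet that finds the direction of greatest variation for a small 2-D point cloud and projects the points onto it:

    import numpy as np

    # A small 2-D point cloud (rows are points), spread mostly along one direction.
    pts = np.array([[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9], [5.0, 5.1]])

    centered = pts - pts.mean(axis=0)     # remove the mean
    _, _, Vt = np.linalg.svd(centered)    # rows of Vt are the principal axes
    e1 = Vt[0]                            # axis of greatest variation
    coords = centered @ e1                # 1-D coordinates of the points on e1

    print("axis e1:", e1)
    print("projected coordinates:", coords)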
SVD: Mathematical Background

  X  ~  X_k  =  U_k  x  S_k  x  V_k'

  X: m x n,   U_k: m x k (U: m x r),   S_k: k x k (S: r x r),   V_k': k x n (V': r x n)
The reconstructed matrix X_k = U_k x S_k x V_k' is the closest
rank-k matrix to the original matrix X.
SVD: The mathematical formulation
• Let X be the M x N matrix of M N-dimensional points
• SVD decomposition
  – X = U x S x V^T
  – U (M x M)
    • U is orthogonal: U^T U = I
    • columns of U are the orthonormal eigenvectors of X X^T
    • called the left singular vectors of X
  – V (N x N)
    • V is orthogonal: V^T V = I
    • columns of V are the orthonormal eigenvectors of X^T X
    • called the right singular vectors of X
  – S (M x N)
    • diagonal matrix with r non-zero entries, in descending order
    • each is the square root of an eigenvalue of X X^T (or X^T X)
    • r is the rank of X (and of X X^T and X^T X)
    • the diagonal entries are called the singular values
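A short NumPy check of these properties (a sketch on random data, not from the slides): the columns of U and V are orthonormal, and the squared singular values equal the eigenvalues of X^T X.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((6, 4))            # M x N data matrix

    U, s, Vt = np.linalg.svd(X)                # X = U @ diag(s) @ Vt (full SVD)

    print(np.allclose(U.T @ U, np.eye(6)))     # U is orthogonal
    print(np.allclose(Vt @ Vt.T, np.eye(4)))   # V is orthogonal
    # squared singular values = eigenvalues of X^T X (sorted descending)
    eigvals = np.sort(np.linalg.eigvalsh(X.T @ X))[::-1]
    print(np.allclose(s**2, eigvals))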
SVD - Interpretation
• X = U S V^T - example:

  X (7 x 5):
    1 1 1 0 0
    2 2 2 0 0
    1 1 1 0 0
    5 5 5 0 0
    0 0 0 2 2
    0 0 0 3 3
    0 0 0 1 1

  =  U x S x V^T, with

  U (7 x 2):
    0.18 0
    0.36 0
    0.18 0
    0.90 0
    0    0.53
    0    0.80
    0    0.27

  S (2 x 2):
    9.64 0
    0    5.29

  V^T (2 x 5):   (rows are v1' and v2')
    0.58 0.58 0.58 0    0
    0    0    0    0.71 0.71
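For reference (a sketch, not part of the original slides), the same decomposition can be reproduced numerically; the signs of the singular vectors may differ, since they are only defined up to sign:

    import numpy as np

    X = np.array([[1, 1, 1, 0, 0],
                  [2, 2, 2, 0, 0],
                  [1, 1, 1, 0, 0],
                  [5, 5, 5, 0, 0],
                  [0, 0, 0, 2, 2],
                  [0, 0, 0, 3, 3],
                  [0, 0, 0, 1, 1]], dtype=float)

    U, s, Vt = np.linalg.svd(X, full_matrices=False)

    print(np.round(s, 2))          # approximately [9.64, 5.29, 0, 0, 0]
    print(np.round(U[:, :2], 2))   # first two left singular vectors
    print(np.round(Vt[:2], 2))     # first two right singular vectors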
SVD - Interpretation
• X = U S V^T - same example as above:
  – the singular value 9.64 gives the variance ('spread') of the data along the v1 axis
SVD - Interpretation
• X = U S V^T - same example as above:
  – U x S gives the coordinates of the points along the projection axes v1, v2
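A quick numerical check (a sketch, not from the slides) that U x S and X x V give the same point coordinates along the projection axes:

    import numpy as np

    X = np.array([[1, 1, 1, 0, 0],
                  [2, 2, 2, 0, 0],
                  [1, 1, 1, 0, 0],
                  [5, 5, 5, 0, 0],
                  [0, 0, 0, 2, 2],
                  [0, 0, 0, 3, 3],
                  [0, 0, 0, 1, 1]], dtype=float)

    U, s, Vt = np.linalg.svd(X, full_matrices=False)

    coords_US = U * s            # U @ diag(s): coordinates of each row of X
    coords_XV = X @ Vt.T         # projecting X onto the right singular vectors
    print(np.allclose(coords_US, coords_XV))   # True
    print(np.round(coords_US[:, :2], 2))       # coordinates along v1 and v2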
Dimensionality reduction
• set the smallest singular values to zero
  (starting from the full decomposition X = U S V^T shown above)
Dimensionality reduction
  X  ~  U x S' x V^T, with the smaller singular value set to zero:

  S' (2 x 2):
    9.64 0
    0    0
Dimensionality reduction
  The second column of U and the second row of V^T are now multiplied by zero, so they contribute nothing and can be dropped.
Dimensionality reduction
  X  ~  u1 x 9.64 x v1'

  u1 (7 x 1):
    0.18
    0.36
    0.18
    0.90
    0
    0
    0

  v1' (1 x 5):
    0.58 0.58 0.58 0 0
Dimensionality reduction
  The original X (left) and its rank-1 reconstruction (right):

    1 1 1 0 0        1 1 1 0 0
    2 2 2 0 0        2 2 2 0 0
    1 1 1 0 0        1 1 1 0 0
    5 5 5 0 0   ~    5 5 5 0 0
    0 0 0 2 2        0 0 0 0 0
    0 0 0 3 3        0 0 0 0 0
    0 0 0 1 1        0 0 0 0 0
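A sketch of this truncation in NumPy (not part of the original slides): zero out all but the largest singular value and rebuild the matrix.

    import numpy as np

    X = np.array([[1, 1, 1, 0, 0],
                  [2, 2, 2, 0, 0],
                  [1, 1, 1, 0, 0],
                  [5, 5, 5, 0, 0],
                  [0, 0, 0, 2, 2],
                  [0, 0, 0, 3, 3],
                  [0, 0, 0, 1, 1]], dtype=float)

    U, s, Vt = np.linalg.svd(X, full_matrices=False)

    k = 1                                        # keep only the largest singular value
    X_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]     # rank-k reconstruction
    print(np.round(X_k, 2))                      # the last three rows become all zeros
    print(np.linalg.norm(X - X_k))               # Frobenius error ~ 5.29, the dropped singular value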
Dimensionality reduction
Equivalent: 'spectral decomposition' of the matrix
(the same X = U S V^T decomposition as shown above):
Dimensionality reduction
Equivalent: 'spectral decomposition' of the matrix:

  X  =  [ u1  u2 ]  x  diag(l1, l2)  x  [ v1' ; v2' ]

  (u1, u2: columns of U; l1 = 9.64, l2 = 5.29; v1', v2': rows of V^T)
Dimensionality reduction
'spectral decomposition' of the matrix:

  X (m x n)  =  l1 u1 v1^T  +  l2 u2 v2^T  +  ...        (r terms)

  where each ui is m x 1 and each vi^T is 1 x n
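A sketch of this spectral-decomposition view (not from the slides): X rebuilt as a sum of rank-1 outer products l_i u_i v_i^T.

    import numpy as np

    X = np.array([[1, 1, 1, 0, 0],
                  [2, 2, 2, 0, 0],
                  [1, 1, 1, 0, 0],
                  [5, 5, 5, 0, 0],
                  [0, 0, 0, 2, 2],
                  [0, 0, 0, 3, 3],
                  [0, 0, 0, 1, 1]], dtype=float)

    U, s, Vt = np.linalg.svd(X, full_matrices=False)

    # Sum of rank-1 terms l_i * u_i * v_i^T reproduces X exactly.
    X_sum = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(len(s)))
    print(np.allclose(X, X_sum))    # True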
Dimensionality reduction
approximation / dim. reduction:
keep only the first few terms  (Q: how many?)

  X (m x n)  ~  l1 u1 v1^T  +  l2 u2 v2^T  +  ...

  assume: l1 >= l2 >= ...
Dimensionality reduction
A heuristic: keep enough terms to retain 80-90% of the 'energy'
(= sum of squares of the li's)

  X (m x n)  ~  l1 u1 v1^T  +  l2 u2 v2^T  +  ...

  assume: l1 >= l2 >= ...
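A sketch of this heuristic (the helper name choose_k is made up for illustration): pick the smallest k whose leading squared singular values retain the desired fraction of the total energy.

    import numpy as np

    def choose_k(singular_values, energy=0.9):
        """Smallest k keeping at least `energy` of the sum of squared singular values."""
        sq = np.asarray(singular_values) ** 2
        cumulative = np.cumsum(sq) / sq.sum()
        return int(np.searchsorted(cumulative, energy) + 1)

    s = np.array([9.64, 5.29])          # singular values from the running example
    print(choose_k(s, energy=0.9))      # -> 2 (9.64^2 alone is only ~77% of the energy)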
Another example - Eigenfaces
• The PCA problem in HW5
• Face data X
• Eigenvectors associated with the largest few eigenvalues of X X^T look like face images ('eigenfaces')
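A minimal sketch of this idea (purely illustrative: the random data, image size, and the assumption that each column of X is a flattened face are stand-ins, not the HW5 setup):

    import numpy as np

    h, w = 32, 32                               # hypothetical image size, not from the slides
    rng = np.random.default_rng(0)
    # Stand-in face data: each COLUMN of X is one flattened h*w face image.
    X = rng.random((h * w, 100))
    X = X - X.mean(axis=1, keepdims=True)       # center each pixel across faces

    # The left singular vectors of X are the eigenvectors of X X^T;
    # those with the largest eigenvalues are the 'eigenfaces'.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    eigenfaces = U[:, :5].T.reshape(5, h, w)    # first 5 eigenvectors viewed as images
    print(eigenfaces.shape)                     # (5, 32, 32)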
Dimensionality reduction
• Matrix V in the SVD decomposition (X = U S V^T) is used to transform the data.
• XV (= US) defines the transformed dataset.
• For a new data element x, xV defines the transformed data.
• Keeping only the first k (k < n) dimensions amounts to keeping only the first k columns of V.
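A sketch of this transformation on the running example (the new data element x_new is made up for illustration):

    import numpy as np

    X = np.array([[1, 1, 1, 0, 0],
                  [2, 2, 2, 0, 0],
                  [1, 1, 1, 0, 0],
                  [5, 5, 5, 0, 0],
                  [0, 0, 0, 2, 2],
                  [0, 0, 0, 3, 3],
                  [0, 0, 0, 1, 1]], dtype=float)

    U, s, Vt = np.linalg.svd(X, full_matrices=False)

    k = 2
    V_k = Vt[:k].T                    # first k columns of V (5 x k)
    X_transformed = X @ V_k           # same as U[:, :k] * s[:k]
    x_new = np.array([1, 1, 0, 0, 0], dtype=float)   # a new data element (made up)
    print(np.round(X_transformed, 2))
    print(np.round(x_new @ V_k, 2))   # the new point in the k-dimensional space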
Principal Components Analysis (PCA)
• Center the dataset by subtracting the means; let matrix X be the result.
• Compute the matrix X^T X.
  – This is the covariance matrix, up to a constant factor.
• Project the dataset onto a subset of the eigenvectors of X^T X.
• Matrix V in the SVD decomposition (X = U S V^T) contains the eigenvectors of X^T X.
• Also known as the K-L (Karhunen-Loeve) transform.
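A sketch of these PCA steps via the SVD of the centered data matrix (the function name pca and the random test data are illustrative, not from the slides):

    import numpy as np

    def pca(X, k):
        """Project the rows of X onto the top-k principal components."""
        X_centered = X - X.mean(axis=0)              # subtract the per-dimension means
        # The rows of Vt are the eigenvectors of X_centered^T X_centered.
        U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)
        components = Vt[:k]                          # top-k principal directions
        return X_centered @ components.T, components

    rng = np.random.default_rng(0)
    data = rng.standard_normal((50, 4)) @ np.diag([3.0, 2.0, 0.5, 0.1])
    projected, components = pca(data, k=2)
    print(projected.shape, components.shape)         # (50, 2) (2, 4)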