Recitation: SVD and dimensionality reduction
Zhenzhen Kou
Thursday, April 21, 2005

SVD
• Intuition: find the axis that shows the greatest variation, and project all points onto this axis.
[Figure: data points in the (f1, f2) plane, with e1 the axis of greatest variation and e2 the orthogonal axis.]

SVD: Mathematical Background
• X (m x n) factors as U (m x r) x S (r x r) x V' (r x n), where r is the rank of X.
• Keeping only the first k singular values gives U_k (m x k), S_k (k x k), and V_k' (k x n).
• The reconstructed matrix X_k = U_k . S_k . V_k' is the closest rank-k matrix to the original matrix X.

SVD: The mathematical formulation
• Let X be the M x N matrix of M N-dimensional points
• SVD decomposition: X = U x S x V^T
– U (M x M)
  • U is orthogonal: U^T U = I
  • columns of U are the orthonormal eigenvectors of XX^T
  • called the left singular vectors of X
– V (N x N)
  • V is orthogonal: V^T V = I
  • columns of V are the orthonormal eigenvectors of X^T X
  • called the right singular vectors of X
– S (M x N)
  • diagonal matrix with the r non-zero values on the diagonal, in descending order
  • these are the square roots of the eigenvalues of XX^T (or X^T X), and are called the singular values
  • r is the rank of X (and of the symmetric matrices XX^T and X^T X)

SVD - Interpretation
• X = U S V^T - example:

    1 1 1 0 0       0.18  0
    2 2 2 0 0       0.36  0
    1 1 1 0 0       0.18  0
    5 5 5 0 0   =   0.90  0      x   9.64  0      x   0.58  0.58  0.58  0     0
    0 0 0 2 2       0     0.53       0     5.29       0     0     0     0.71  0.71
    0 0 0 3 3       0     0.80
    0 0 0 1 1       0     0.27

• The first singular value (9.64) measures the variance ('spread') of the data along the first axis v1 = (0.58, 0.58, 0.58, 0, 0).
• U S gives the coordinates of the points along the projection axes v1, v2.

Dimensionality reduction
• Set the smallest singular values to zero, i.e. keep only the first column of U, the first singular value, and the first row of V^T:

    1 1 1 0 0       0.18
    2 2 2 0 0       0.36
    1 1 1 0 0       0.18
    5 5 5 0 0   ~   0.90   x   9.64   x   0.58  0.58  0.58  0  0
    0 0 0 2 2       0
    0 0 0 3 3       0
    0 0 0 1 1       0

• The resulting rank-1 reconstruction stays close to the original matrix:

    1 1 1 0 0       1 1 1 0 0
    2 2 2 0 0       2 2 2 0 0
    1 1 1 0 0       1 1 1 0 0
    5 5 5 0 0   ~   5 5 5 0 0
    0 0 0 2 2       0 0 0 0 0
    0 0 0 3 3       0 0 0 0 0
    0 0 0 1 1       0 0 0 0 0

Dimensionality reduction
• Equivalent: 'spectral decomposition' of the matrix:
  X = [u1 u2 ...] x diag(l1, l2, ...) x [v1 v2 ...]^T
• Written out as a sum of r rank-1 terms:
  X = l1 u1 v1^T + l2 u2 v2^T + ...   (each ui is m x 1, each vi^T is 1 x n)
• Approximation / dimensionality reduction: keep only the first few terms (Q: how many?), assuming l1 >= l2 >= ... (a NumPy sketch of this example follows below).
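The worked example above can be checked in a few lines; the sketch below is not from the original slides, and the use of NumPy's np.linalg.svd plus the rounding are just one convenient way to illustrate it. It recomputes the decomposition, verifies the properties listed in the mathematical formulation, and forms the rank-1 truncation.

```python
import numpy as np

# The 7 x 5 example matrix from the slides.
X = np.array([[1., 1., 1., 0., 0.],
              [2., 2., 2., 0., 0.],
              [1., 1., 1., 0., 0.],
              [5., 5., 5., 0., 0.],
              [0., 0., 0., 2., 2.],
              [0., 0., 0., 3., 3.],
              [0., 0., 0., 1., 1.]])

# X = U @ diag(s) @ Vt; full_matrices=False gives the reduced form (U is 7 x 5 here),
# with the singular values s in descending order.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
print(np.round(s, 2))          # ~ [9.64 5.29 0. 0. 0.]
print(np.round(U[:, :2], 2))   # ~ (0.18, 0.36, 0.18, 0.90, 0, 0, 0) and (0, ..., 0.53, 0.80, 0.27), up to sign
print(np.round(Vt[:2], 2))     # ~ (0.58, 0.58, 0.58, 0, 0) and (0, 0, 0, 0.71, 0.71), up to sign

# Properties from the mathematical formulation:
print(np.allclose(U.T @ U, np.eye(5)))      # columns of U are orthonormal
print(np.allclose(Vt @ Vt.T, np.eye(5)))    # columns of V are orthonormal
eigvals = np.sort(np.linalg.eigvalsh(X.T @ X))[::-1]
print(np.allclose(s ** 2, eigvals))         # singular values = square roots of eigenvalues of X^T X

# Dimensionality reduction: zero the smaller singular values (keep k = 1).
k = 1
X1 = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]     # the closest rank-1 matrix to X
X1_sum = s[0] * np.outer(U[:, 0], Vt[0])    # same thing, written as the first spectral term l1 u1 v1^T
print(np.allclose(X1, X1_sum))
print(np.round(X1, 1))   # rows 1-4 stay close to (1,1,1,0,0) ... (5,5,5,0,0); rows 5-7 collapse to zero
```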
Dimensionality reduction
• A heuristic: keep enough terms to retain 80-90% of the 'energy' (= the sum of squares of the li's).

Another example - Eigenfaces
• The PCA problem in HW5.
• Face data X.
• The eigenvectors associated with the first few large eigenvalues of XX^T have face-like images ('eigenfaces').

Dimensionality reduction
• The matrix V in the SVD decomposition (X = U S V^T) is used to transform the data.
• XV (= US) defines the transformed dataset.
• For a new data element x, xV defines the transformed data.
• Keeping the first k (k < n) dimensions amounts to keeping only the first k columns of V.

Principal Components Analysis (PCA)
• Shift the dataset to the center by subtracting the means; let matrix X be the result.
• Compute the matrix X^T X.
  – Up to a constant factor, this is the covariance matrix of the data.
• Project the dataset along a subset of the eigenvectors of X^T X.
• The matrix V in the SVD decomposition (X = U S V^T) contains exactly these eigenvectors.
• Also known as the Karhunen-Loève (K-L) transform.
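The energy heuristic and the "transform with V" slide above can be combined into a short sketch. The 90% threshold, the helper name choose_k, and the made-up new point x_new are illustrative assumptions, not part of the slides.

```python
import numpy as np

def choose_k(singular_values, energy=0.90):
    """Smallest k whose leading singular values keep `energy` of the total
    'energy' (the sum of squares of the singular values)."""
    sq = singular_values ** 2
    ratio = np.cumsum(sq) / np.sum(sq)
    return int(np.searchsorted(ratio, energy) + 1)

# The example matrix from the slides.
X = np.array([[1., 1., 1., 0., 0.],
              [2., 2., 2., 0., 0.],
              [1., 1., 1., 0., 0.],
              [5., 5., 5., 0., 0.],
              [0., 0., 0., 2., 2.],
              [0., 0., 0., 3., 3.],
              [0., 0., 0., 1., 1.]])
U, s, Vt = np.linalg.svd(X, full_matrices=False)

k = choose_k(s)               # 9.64^2 covers only ~77% of the energy here, so k = 2
V = Vt.T
X_reduced = X @ V[:, :k]      # the transformed dataset XV ...
print(np.allclose(X_reduced, U[:, :k] * s[:k]))   # ... which equals US restricted to the first k columns

x_new = np.array([3., 3., 3., 1., 1.])   # a new data element (made up for illustration)
print(x_new @ V[:, :k])                  # its coordinates in the reduced space
```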
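The PCA recipe above can likewise be written directly in terms of the SVD; in this sketch the random toy data, the seed, and the choice of two components are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 5))    # 100 points in 5 dimensions (toy data)

# 1. Shift the dataset to the center by subtracting the column means.
X = data - data.mean(axis=0)

# 2./3. X^T X is the covariance matrix up to a constant factor; its eigenvectors
#       are the columns of V in the SVD X = U S V^T.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
eigvals, eigvecs = np.linalg.eigh(X.T @ X)   # eigenvalues in ascending order
print(np.allclose(eigvals[::-1], s ** 2))    # eigenvalues = squared singular values
# Columns of V match the eigenvectors (up to sign) for this well-separated toy data:
print(np.allclose(np.abs(Vt.T), np.abs(eigvecs[:, ::-1])))

# 4. Project the dataset along the first k eigenvectors / right singular vectors.
k = 2
projected = X @ Vt.T[:, :k]
print(projected.shape)                       # (100, 2)
```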