Transcript: Principal Component Analysis
Consider a collection of points
Suppose you want to fit a line
Project onto the Line
Consider the variance of the distribution of projected points on the line
Different line . . .
different variance
Maximum Variance
Minimum Variance
The maximum- and minimum-variance directions are given by the eigenvectors of the covariance matrix of the original points' coordinates
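Different lines give different variances of the projected points. A minimal pure-Python sketch of projecting onto a line and measuring that variance (the point set and the helper name `variance_along` are made up for illustration):

```python
# Hypothetical 2-D point cloud, spread mostly along the x-axis.
points = [(-3.0, 0.5), (-1.0, -0.4), (0.0, 0.1), (1.0, -0.3), (3.0, 0.2)]

def variance_along(points, direction):
    """Sample variance of the points' projections onto a unit direction."""
    dx, dy = direction
    projs = [px * dx + py * dy for px, py in points]
    mean = sum(projs) / len(projs)
    return sum((p - mean) ** 2 for p in projs) / (len(projs) - 1)

v_max = variance_along(points, (1.0, 0.0))  # line along the main spread
v_min = variance_along(points, (0.0, 1.0))  # orthogonal line
print(v_max, v_min)  # the first is much larger
```

For this cloud the x-axis and y-axis happen to be the maximum- and minimum-variance directions; in general they are the eigenvectors of the covariance matrix.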
PCA notes…
• Input data set
• Subtract the mean to get a data set with zero mean
• Compute the covariance matrix
• Compute the eigenvalues and eigenvectors of the covariance matrix
• Choose components and form a feature vector; order by eigenvalue, highest to lowest
PCA
• To compress, ignore the components of lesser significance
• The feature vector F is the matrix of ordered eigenvectors
• Derive the data set in the new coordinates: new_data = F^T old_data
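The steps above can be sketched in pure Python for the 2-D case (the closed-form eigendecomposition below works only for a symmetric 2×2 covariance matrix, and the name `pca_2d` is illustrative, not from the slides):

```python
import math

def pca_2d(points):
    n = len(points)
    # 1. subtract the mean to get a zero-mean data set
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    centered = [(x - mx, y - my) for x, y in points]
    # 2. covariance matrix entries (unbiased: divide by n - 1)
    cxx = sum(x * x for x, _ in centered) / (n - 1)
    cyy = sum(y * y for _, y in centered) / (n - 1)
    cxy = sum(x * y for x, y in centered) / (n - 1)
    # 3. eigenvectors of the symmetric 2x2 matrix [[cxx, cxy], [cxy, cyy]]
    theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)  # principal-axis angle
    e1 = (math.cos(theta), math.sin(theta))
    e2 = (-math.sin(theta), math.cos(theta))
    # eigenvalues via the quadratic form w^T C w (w is an eigenvector)
    l1 = cxx * e1[0] ** 2 + 2 * cxy * e1[0] * e1[1] + cyy * e1[1] ** 2
    l2 = cxx * e2[0] ** 2 + 2 * cxy * e2[0] * e2[1] + cyy * e2[1] ** 2
    # 4. order components by eigenvalue, highest to lowest, to form F
    F = [e1, e2] if l1 >= l2 else [e2, e1]
    # 5. new_data = F^T old_data (rows of F are the ordered eigenvectors)
    new_data = [(x * F[0][0] + y * F[0][1], x * F[1][0] + y * F[1][1])
                for x, y in centered]
    return F, new_data
```

For points lying on a line, the second coordinate of `new_data` is (numerically) zero, so dropping that lesser component loses nothing.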
Covariance
• The covariance matrix C of the point coordinates (x, y, z):

$$C = \begin{pmatrix} \operatorname{cov}(x,x) & \operatorname{cov}(x,y) & \operatorname{cov}(x,z) \\ \operatorname{cov}(y,x) & \operatorname{cov}(y,y) & \operatorname{cov}(y,z) \\ \operatorname{cov}(z,x) & \operatorname{cov}(z,y) & \operatorname{cov}(z,z) \end{pmatrix}$$

where the covariance of two random variables X and Y is

$$\operatorname{cov}(X, Y) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$
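A direct transcription of the sample-covariance formula, with made-up data for illustration:

```python
def cov(xs, ys):
    """Sample covariance of two equally long sequences of observations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

print(cov([1, 2, 3], [1, 2, 3]))  # cov(x, x) is just the variance: 1.0
print(cov([1, 2, 3], [3, 2, 1]))  # perfectly anti-correlated: -1.0
```

Note the diagonal entries of C, such as cov(x, x), reduce to the ordinary variances of the individual coordinates.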
Example
OOBB
Choose bounding box oriented this way
OOBB: Fitting
Covariance matrix of point coordinates describes statistical spread of cloud.
OBB is aligned with directions of greatest and least spread (which are guaranteed to be orthogonal).
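A hedged sketch of that fitting idea in 2-D: align the box axes with the eigenvectors of the covariance matrix, then take the extents from the projections onto those axes (the closed-form eigendecomposition is specific to the symmetric 2×2 case, and `fit_obb_2d` is an illustrative name, not from the slides):

```python
import math

def fit_obb_2d(points):
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # covariance matrix of the point coordinates
    cxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    cyy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    cxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # box axes = eigenvectors of [[cxx, cxy], [cxy, cyy]] (orthogonal)
    theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)
    axes = [(math.cos(theta), math.sin(theta)),
            (-math.sin(theta), math.cos(theta))]
    # box extents = min/max of the centered projections onto each axis
    extents = []
    for ax, ay in axes:
        projs = [(x - mx) * ax + (y - my) * ay for x, y in points]
        extents.append((min(projs), max(projs)))
    return axes, extents
```

The extents together with the mean and axes fully describe the oriented box; degenerate (flat) clouds show up as a near-zero extent on one axis.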
Good Box
Add points: worse box
More points: terrible box (the covariance is dominated by where the samples cluster, not by the shape's extremal geometry, so adding points can rotate the fitted axes away from a good fit)