Principal Component Analysis



Consider a collection of points

Suppose you want to fit a line

Project the points onto the line

Consider the variance of the distribution of the projected points along the line

A different line gives a different variance

Maximum Variance

Minimum Variance

Both are given by the eigenvectors of the covariance matrix of the coordinates of the original points
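One way to make that link explicit (a short sketch; the symbols w, C, and λ are introduced here and are not in the original slides): for a unit direction w and the covariance matrix C of the centered points, the variance of the projected points is

\[
\operatorname{Var}(w^{\top}x) = w^{\top} C\, w , \qquad \lVert w \rVert = 1 ,
\]

and extremizing this subject to \(\lVert w \rVert = 1\) with a Lagrange multiplier gives \(C w = \lambda w\): the maximum-variance line is the eigenvector with the largest eigenvalue, and the minimum-variance line is the eigenvector with the smallest.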

PCA notes…
• Input data set
• Subtract the mean to get a data set with zero mean
• Compute the covariance matrix
• Compute the eigenvalues and eigenvectors of the covariance matrix
• Choose components and form a feature vector; order by eigenvalues – highest to lowest (a sketch of these steps follows)
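A minimal sketch of those steps in Python with NumPy (the small data array is made up for illustration and the variable names are my own):

```python
import numpy as np

# Input data set: one row per point, one column per coordinate (assumed layout)
data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
                 [1.9, 2.2], [3.1, 3.0], [2.3, 2.7]])

# Subtract the mean to get a data set with zero mean
mean = data.mean(axis=0)
centered = data - mean

# Compute the covariance matrix (rowvar=False: columns are the variables)
cov = np.cov(centered, rowvar=False)

# Compute the eigenvalues and eigenvectors of the (symmetric) covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)

# Order the components by eigenvalue, highest to lowest
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]   # columns are the ordered eigenvectors
```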

PCA
• To compress, ignore components of lesser significance
• The feature vector F is the matrix of ordered eigenvectors
• Derive the data set in the new coordinates: new_data = F^T old_data
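Continuing that sketch, compression keeps only the most significant components (k = 1 below is an arbitrary choice) and projects the centered data onto them; it reuses eigvecs, centered, and mean from the previous block:

```python
# Keep only the k most significant components
k = 1
F = eigvecs[:, :k]        # feature vector: matrix of the top-k ordered eigenvectors

# Derive the data set in the new coordinates: new_data = F^T old_data.
# With row-vector data this is equivalent to centered @ F.
new_data = centered @ F   # shape (n_points, k)

# Approximate reconstruction back in the original coordinates
reconstructed = new_data @ F.T + mean
```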

Covariance

• The covariance matrix C of three random variables x, y, and z is

\[
C =
\begin{pmatrix}
\operatorname{cov}(x,x) & \operatorname{cov}(x,y) & \operatorname{cov}(x,z) \\
\operatorname{cov}(y,x) & \operatorname{cov}(y,y) & \operatorname{cov}(y,z) \\
\operatorname{cov}(z,x) & \operatorname{cov}(z,y) & \operatorname{cov}(z,z)
\end{pmatrix}
\]

where the covariance of two random variables X and Y is

\[
\operatorname{cov}(X,Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n-1}
\]

Example

OOBB

Choose bounding box oriented this way

OOBB: Fitting

The covariance matrix of the point coordinates describes the statistical spread of the cloud.

The OOBB is aligned with the directions of greatest and least spread (which are guaranteed to be orthogonal).
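A minimal sketch of that fitting procedure, assuming the point cloud is available as an (n, 2) NumPy array (the function name fit_oobb is mine, not from the slides):

```python
import numpy as np

def fit_oobb(points):
    """Fit an oriented bounding box to 2-D points via the covariance eigenvectors."""
    points = np.asarray(points, dtype=float)
    mean = points.mean(axis=0)
    centered = points - mean

    # Eigenvectors of the covariance matrix give the (orthogonal) box axes
    cov = np.cov(centered, rowvar=False)
    _, axes = np.linalg.eigh(cov)        # columns are the box axes

    # Extent of the points along each axis, in local box coordinates
    local = centered @ axes
    lo, hi = local.min(axis=0), local.max(axis=0)

    center = mean + axes @ ((lo + hi) / 2.0)
    half_sizes = (hi - lo) / 2.0
    return center, axes, half_sizes
```

The box depends only on the covariance of the sampled points, which is exactly the sensitivity the following slides illustrate.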

Good Box

OOBB

Add points: worse box

OOBB

More points: terrible box. Adding vertices in one region pulls the covariance (and hence the box axes) toward the dense area, so the fitted box degrades even though the underlying shape is unchanged.
