
Multivariate Analysis and PCA
1
Principal Components Analysis (PCA)
• Is a Factor Analytic method
• Can be used to:
– Reduce number of dimensions in data
– Find patterns in high-dimensional data
– Visualize data of high dimensionality
• Example applications:
– Face recognition
– Image compression
– Gene expression analysis
2
Curse of Dimensionality.
• A major problem is the curse of dimensionality.
• If the data x lies in high dimensional space, then
an enormous amount of data is required to learn
distributions or decision rules.
• Example: 50 dimensions, each with 20 levels. This gives a total of 20^50 cells, but the number of data samples will be far less. There will not be enough data samples to learn.
3
What is PCA (Principal Component Analysis)?
• Basic idea: given data in M-space, PCA reduces the dimensionality of a data set that contains a large number of interrelated variables, while retaining as much as possible of the variation present in the data set. This is achieved by transforming to a new set of variables, the principal components, which are uncorrelated and ordered so that the first few retain most of the variation present in all of the original variables.
4
More about PCA
5
Benefits of PCA
• Reduction of computation and storage
overhead
• Reduction of noise
• Useful for visualizing the data
6
Multivariate Analysis:
Multiple Regression
Ordination
7
Ordination
Goal: to discover and summarize the
main patterns of variation in a set of
variables measured over a number
of sample locations.
8
Ordination
Ordination techniques may generate
useful simplifications of patterns in
complex multivariate data sets.

Ordination combines common variation
into new variables (multivariate axes)
along which samples are ordered.

9
Dimension Reduction
• One way to avoid the curse of
dimensionality is by projecting the data
onto a lower-dimensional space.
• Techniques for dimension reduction:
• Principal Component Analysis (PCA)
• Fisher’s Linear Discriminant
• Multi-dimensional Scaling.
• Independent Component Analysis.
10
A Numerical Example
• Original data values & mean centered:
[Scatter plots: original data values (axes 0 to 20) and mean-centered data (axes -6 to 6)]
11
A Numerical Example
• Transformed data space:
[Scatter plot: transformed data space (axes -6 to 6)]
12
Ordination
“A procedure for adapting a
multidimensional swarm of data
points in such a way that when it
is projected onto a reduced
number of dimensions any
intrinsic pattern will become
apparent”
13
Ordination
Data reduction technique:
• To select low-dimensional projections of the data for graphing.
• To search for "structure" in the data.
14
A Numerical Example
• Compare original vs. transformed data space:
[Scatter plots: original data (axes 0 to 20) shown alongside the transformed data space (axes -6 to 6)]
15
Ordination methods:
• Principal Component Analysis (PCA)
• Correspondence Analysis (CA)
• Principal Coordinate Analysis (PCoA)
• Discriminant Function Analysis (DFA)
16
PCA:
Principal components analysis (PCA) is
perhaps the most common technique
used to summarize patterns among
variables in multivariate datasets.
17
Principal Component Analysis (PCA):
A geometric interpretation

PCA constructs a new coordinate system (new variables) whose axes are linear combinations of the original axes, defined to align the samples along their major dimensions, or axes of variation.
PCA finds the coordinate system that best represents the internal variability in the data, essentially re-expressing the same data.
18
Principal Component Analysis (PCA):
A geometric interpretation
The technique also compresses the internal variability in the
data into a smaller number of important axes, by capturing
associations among variables (species, environmental
variables).
19
Basics & Background
• Objective: conceptualize the underlying pattern or structure of observed variables y_i1, ..., y_ip on p attributes at each of n sites s_i.
• PCA can be viewed as a rotation between the data spaces of y_i1, ..., y_ip and u_i1, ..., u_ip,
• where u_1 is measured along the direction of maximum separation (i.e., variance), u_2 along the second in line, and so forth.
20
Principal components
• First principal component (PC1)
– the direction along which there is greatest variation
• Second principal component (PC2)
– the direction with the maximum variation left in the data, orthogonal to PC1
In general, principal components are
– linear combinations of the original variables
– uncorrelated with each other
21
Principal components
22
Basics & Background
• Eigenvalue and Eigenvector:
– Eigen originates in the German language and
can be loosely translated as “of itself”
– Thus an Eigenvalue of a matrix could be
conceptualized as a “value of itself”
– Eigenvalues and Eigenvectors are utilized in a
wide range of applications (PCA, calculating a
power of a matrix, finding solutions for a
system of differential equations, and growth models)
23
Background - variance
• Standard deviation:
– Average distance from mean to a point
• Variance:
– Standard deviation squared
– One-dimensional measure
s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}
24
Principal Component Analysis
• PCA is the most commonly used
dimension reduction technique.
• (Also called the Karhunen-Loeve
transform).
• PCA: given data samples x_1, ..., x_n,
• compute the mean \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
25
Principal components - Variance
[Bar chart: variance (%) explained by PC1 through PC10, y-axis 0 to 25]
26
Background - covariance
• How two dimensions vary from the mean
with respect to each other
\mathrm{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}
• cov(X,Y) > 0: dimensions increase together
• cov(X,Y) < 0: one increases, the other decreases
• cov(X,Y) = 0: dimensions are uncorrelated
27
Background – covariance
matrix
• Contains covariance values between all
possible dimensions:
C^{n \times n} = (c_{ij} \mid c_{ij} = \mathrm{cov}(\mathrm{Dim}_i, \mathrm{Dim}_j))
• Example for three dimensions (x, y, z):
C = \begin{pmatrix} \mathrm{cov}(x,x) & \mathrm{cov}(x,y) & \mathrm{cov}(x,z) \\ \mathrm{cov}(y,x) & \mathrm{cov}(y,y) & \mathrm{cov}(y,z) \\ \mathrm{cov}(z,x) & \mathrm{cov}(z,y) & \mathrm{cov}(z,z) \end{pmatrix}
28
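As a minimal sketch in MATLAB (the language of the code later in this deck), cov computes this matrix directly; the random Data matrix here is purely illustrative:

Data = randn(100, 3);   % 100 observations in 3 dimensions (illustrative data)
C = cov(Data);          % 3x3 covariance matrix with C(i,j) = cov(Dim_i, Dim_j)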
Associations among variables in PCA are measured by:
• Correlation matrix (variables have different scales, e.g., environmental variables).
• Covariance matrix (variables have the same scales, e.g., morphological variables; it preserves allometric relationships, i.e., parts of the same organism grow at different rates).
29
Project on the axis
30
Why choose only two axes?
• The eigenvalues for the 3 axes are 1.8907, 0.9951, and 0.1142.
• Eigenvalues are typically expressed as a percentage of the total (see the sketch below):
PCA Axis 1: 63%
PCA Axis 2: 33%
PCA Axis 3: 4%
31
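A quick MATLAB check of this arithmetic, using the three eigenvalues listed above:

lambda = [1.8907 0.9951 0.1142];   % eigenvalues of the 3 axes
pct = 100 * lambda / sum(lambda)   % approximately 63, 33 and 4 percent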
Describing Video via PCA
• Strategy: condense local spatial information using PCA, and preserve the temporal information by keeping all such reduced spatial information for all frames.
32
The mathematics of Principal Component Analysis
(PCA):
Eigenanalysis is a mathematical operation on a
square, symmetric matrix (e.g., pairwise
correlation matrix).
A square matrix has # rows = # cols.
A symmetric matrix is transpose invariant.
33
The mathematics of Principal Component Analysis
(PCA):
The result of an eigenanalysis consists of a
series of eigenvalues and eigenvectors. Each
eigenvalue has an eigenvector, and there are as
many eigenvectors and eigenvalues as there are
rows in the initial correlation or covariance matrix.
Eigenvalues are usually ranked from the greatest
to the least.
34
Principal component analysis presents three
important structures:
1 - Eigenvalues: represent the amount of
variation summarized by each principal
component. The first principal component
(PC-1) presents the largest amount, PC-2
presents the second largest, and so on.
35
Step 1: Extracting features
• Features used in video analysis: color, texture, shape, motion vectors, ...
• Criterion for choosing features: they should have similar statistical behavior across time
• Color histogram: simple and robust
• Motion vectors: invariant to color and light
36
Principal component analysis presents three
important structures:
2 - Eigenvectors: Each principal component is
a linear function with coefficients for each
variable.
Eigenvectors contain these
coefficients. High values, positive or negative,
represent a high association with the
component.
37
Principal component analysis presents three
important structures:
3 - Scores: Since each component is a linear
function of the variables, when multiplying the
standardized variables (in the case of
correlation matrices) by the eigenvector
structure, a matrix containing the position of
each observation in each principal component
is produced. The plot of these scores in the
first few dimensions represents the main
patterns of variation among the original
observations.
38
Original data → correlation or covariance matrix → eigenvalues and eigenvectors → scores
39
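A minimal MATLAB sketch of this flow, assuming Data holds a sites-by-variables matrix such as the 87 x 12 example that follows:

Z = (Data - mean(Data)) ./ std(Data);        % standardize variables (R2016b+ expansion)
R = corrcoef(Data);                          % correlation matrix
[V, D] = eig(R);                             % eigenvectors V, eigenvalues on diag(D)
[lambda, order] = sort(diag(D), 'descend');  % rank eigenvalues from greatest to least
V = V(:, order);                             % reorder eigenvectors to match
scores = Z * V;                              % site scores on the PC axes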
Principal component analysis: an example
[Maps: River Macacu (53 sites) and River Macae (28 sites), each with a 0-4 km scale bar, draining to the Atlantic Ocean]
40
Are the two streams different in their environments?
• Local environment:
– Depth
– Depth variation
– Current velocity
– Current variation
– Substrate composition: boulder, rubble, gravel and sand
– Substrate variation
– Width variation (irregularity)
– Area
– Altitude
41
Original data: 87 sites by 12 variables
Correlation matrix: 12 x 12 (12 variables)
Eigenvalues: 12
Eigenvectors: 12 x 12
Scores: 87 sites by 12 PC axes
42
Correlation matrix
[12 x 12 correlation matrix of the environmental variables]
43
eigenvalues:
1: 4.348
2: 2.288
3: 1.429
4: 1.018
5: 0.957
6: 0.620
7: 0.478
8: 0.300
9: 0.265
10: 0.231
11: 0.067
12: 0.000
Screeplot
[Scree plot: eigenvalue (0 to 5) versus PC number (1 to 12)]
44
Eigenvector structure:

variable             PC-1    PC-2    PC-3    PC-4
depth                0.00   -0.20    0.60    0.27
depth variation      0.46   -0.08   -0.02   -0.02
current velocity    -0.08   -0.23    0.26    0.69
current variation    0.43    0.01   -0.05    0.08
boulder              0.22    0.45    0.40   -0.02
rubble               0.06    0.37   -0.42    0.36
gravel               0.03   -0.22   -0.35    0.40
sand                -0.24   -0.50   -0.01   -0.32
altitude            -0.18    0.43    0.25   -0.08
area                 0.42   -0.06    0.02    0.02
width variation      0.41   -0.09   -0.08   -0.10
substrate
45
Eigenvector plot
[Eigenvector plot: variables (boulder, current, depth, area, altitude, width variation, current variability, rubble, depth variability, gravel, sediment variability, sand) positioned by their PC-1 and PC-2 coefficients, axes from -0.6 to 0.6]
46
Bi-plot (scores + eigenvectors)
[Bi-plot: Macae and Macacu site scores overlaid on the eigenvectors in the PC-1 x PC-2 plane]
47
Ordination bi-plots
This summary is often a useful end in itself:
the analysis discovers the latent structure of
the data and how the variables contribute to
this structure.
48
Background – eigenvalues &
eigenvectors
• Vectors x having the same direction as Ax are called eigenvectors of A (A is an n by n matrix).
• Ax = λx: λ is called an eigenvalue of A.
• Ax = λx ⇔ (A - λI)x = 0
49
How to calculate x and λ:
– Calculate det(A - λI); this yields a polynomial of degree n
– Determine the roots of det(A - λI) = 0; the roots are the eigenvalues λ
– Solve (A - λI)x = 0 for each λ to obtain the eigenvectors x
50
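MATLAB's eig function carries out this calculation; a small sketch with an arbitrary symmetric matrix:

A = [2 1; 1 2];                % arbitrary 2x2 symmetric matrix
[X, L] = eig(A);               % columns of X: eigenvectors; diag(L): eigenvalues
A * X(:,1) - L(1,1) * X(:,1)   % approximately [0; 0], confirming A*x = lambda*x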
PCA – step 1
• Get some data!
– The data is organized as a matrix Data
– Rows are observations
– Columns are dimensions (variables)
– Observations can also act as dimensions, and vice versa, by transposing the matrix
51
Color Example
• Let pixel p1 be (r1, g1, b1)
• A = {P1, P2, P3}
• D1 = {r1, r2, r3}
• D2 = {g1, g2, g3}
• D3 = {b1, b2, b3}
• Why use RGB? If you want skin tones, why not use YIQ?
52
PCA – step 2
• Subtract the mean from each dimension
\mathrm{DataAdjust}_{ij} = \mathrm{Data}_{ij} - \overline{\{\mathrm{Data}_{xj} \mid x = 1, \dots, m\}}
where i is the observation, j is the dimension, and m is the total number of observations
53
PCA – steps 3&4
• Calculate covariance matrix for DataAdjust
• Calculate eigenvalues λ and eigenvectors x for the covariance matrix:
– Eigenvalues λ_j are used to calculate the percentage of total variance V_j for each component j:
V_j = 100 \cdot \frac{\lambda_j}{\sum_{x=1}^{n} \lambda_x}
54
PCA – steps 5&6
• Choose components – form feature vector
– Eigenvalues λ and eigenvectors x are sorted in descending order (by λ)
– The component with the highest λ is the principal component
– FeatureVector = (x_1, ..., x_n), where x_i is a column-oriented eigenvector; it contains the chosen components
55
Derive new dataset
– Transpose FeatureVector and DataAdjust
– FinalData = RowFeatureVector × RowDataAdjust
– This expresses the original data in terms of the chosen components
– FinalData has the eigenvectors as coordinate axes
56
PCA – step 7
• Retrieving the old data (e.g., in data compression)
– RetrievedRowData = (RowFeatureVector^T × FinalData) + OriginalMean
– Yields the original data using the chosen components
57
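Pulling steps 2 through 7 together, a compact MATLAB sketch; the variable names follow the slides, Data is an observations-by-dimensions matrix, and k (the number of kept components) is a free choice:

DataAdjust = Data - mean(Data);              % step 2: subtract the mean of each dimension
[V, D] = eig(cov(DataAdjust));               % steps 3 & 4: eigenanalysis of the covariance
[lambda, order] = sort(diag(D), 'descend');  % sort eigenvalues in descending order
V = V(:, order);
Vj = 100 * lambda / sum(lambda);             % percent of total variance per component
k = 2;                                       % steps 5 & 6: keep the top k components
FeatureVector = V(:, 1:k);
FinalData = (FeatureVector' * DataAdjust')';             % data in terms of chosen components
Retrieved = (FeatureVector * FinalData')' + mean(Data);  % step 7: approximate original data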
Applications - Computer
vision
• Representation
– N x N pixel image → X = (x_1, ..., x_{N^2})
– x_i is the intensity value of pixel i
58
PCA for Pattern identification
– Perform PCA on a matrix of M images
– Given a new image: which original image is it most similar to?
– Traditionally: difference the original image and the new image
– With PCA: difference the PCA data and the new image
– Advantage: the PCA data reflects the similarities and differences in the image data
– Even with omitted dimensions → still good performance
59
Applications – Computer
vision
• PCA for image compression
– M images, each containing N^2 pixels
– Dataset of M dimensions and N^2 observations
– Corresponding pixels form vectors of intensities
– PCA produces M eigenvectors and eigenvalues
– Compress: choose limited number of
components
– Information loss when recreating original data
60
Motion Detection
using PCA
61
Agenda
• Motion Detection
– Input Video
– Algorithm Steps (2-D and 3-D blocks)
– Results
• Sample Videos and Results
– Video with 8x8 Detection Blocks
– Video with 32x32 Detection Blocks
62
Input Video
• MPEG video converted to 2688 JPEG image frames
• Full RGB color
63
Algorithm Steps
1. Reshape image to 8x8 blocks
2. Collect blocks from every frame, normalize and reshape array from 3-D 8x8 blocks
3. Compute PCA projection matrix per block
4. Compute PCA score by projecting blocks from each frame onto that block's 3-PCA projection
5. Compute EV values with W=3 for each block
6. Generate global threshold based on all blocks and frames
7. Generate local dynamic threshold for each block/frame with W=3
8. Generate motion matrix based on local and global dynamic threshold for all blocks-frames
64
Step 1 - Details
• Read the color image
• Resize the image by a scale factor of 0.5
• Convert the image to gray scale
• Reshape the image into 8x8 distinct blocks
• Transpose and save the data
• Note: save per-frame block data
65
Step 1 - Reshape Image
[Diagram: im2col(Image) reshapes the image into 1728 blocks of 64 pixels each]
66
Step 1 - Code
fileName = ...;                            % path to one JPEG frame
imN = imread(fileName);                    % read the color image
imN = imresize(imN, 0.5);                  % resize by a scale factor of 0.5
imN = rgb2gray(imN);                       % convert to gray scale
blocks = im2col(imN, [8 8], 'distinct')';  % reshape into distinct 8x8 blocks, transpose
67
Step 1 - 8x8 Block Size
[Diagram: block size relative to image size; block 26x25 shown; image size: 36x48 blocks]
68
Step 2 - Collect Blocks
• Collect the same block from all the frames
• Create a single matrix for each block location
• Reshape vector from 3-D 8x8 blocks
• There are 1728 matrices holding pixel values
• Each matrix is 2688 x 64 (frames x pixels/block)
69
Step 2-Normalize Blocks
Normalize each block by its mean value
 Each block has its mean subtracted from
each of the 64 pixel values
 Store the normalized block data to be
used in Step 3 and Step 4

70
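A sketch of this step, assuming X is the 2688 x 64 matrix for one block location (the variable name is illustrative):

X = double(X);        % frames x 64 pixel values for one block location
X = X - mean(X, 2);   % subtract each block's own mean from its 64 pixels (R2016b+)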
Step 2 - Block Matrix
• Each block X of the 1728 total blocks has a matrix representation of size 2688 x 64
• Each block is normalized by its mean value
• N = 2688
71
Step 2 – 3-D 8x8 Blocks
• Take 3 rows of the block matrix from the previous slide (3 x 64)
• Reshape into a 1 x 192 vector
• 3-D blocks are overlapping
• The new 3-D block matrix is used in computing PCA scores and projection matrices
72
Step 3 - Compute PCA
• Load the normalized block matrix from Step 2 and compute the PCA projection matrix for this block sequence
73
Step 3 - PCA Projection
• The principal components projection matrix contains 64 rows, representing each pixel location in the block, and 64 columns, representing 64 principal components
• Only the first three components are used in projection (first 3 columns); a sketch follows
[Diagram: projection matrix, 64 pixels x 64 PCAs]
74
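One way to form this matrix in MATLAB, assuming X is the normalized 2688 x 64 block matrix from Step 2:

[V, D] = eig(cov(X));                   % eigenanalysis of the 64x64 block covariance
[~, order] = sort(diag(D), 'descend');  % rank components by eigenvalue
P = V(:, order);                        % 64 x 64 projection matrix (64 PCs)
P3 = P(:, 1:3);                         % keep only the first three components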
Step 4 - Compute Score
• Load the normalized block matrix from Step 2 and project it onto the PCA projection matrix computed in Step 3 (see the sketch below)
• Only the first 3 PCA projections are used
[Diagram: block X (2688 frames x 64 pixels/block) times the first 3 PCAs (64 x 3) gives 3 scores per frame (2688 x 3)]
75
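The projection itself is a single matrix product, assuming X (2688 x 64) and P3 (64 x 3) from the previous steps:

S = X * P3;   % PCA scores: 2688 frames x 3 components for this block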
Step 5 - Compute EV
• For each block sequence, load the PCA score matrix computed in Step 4
• Compute a covariance matrix using a moving window of size 3
• Compute eigenvalues (EVs)
• Sort to get the largest EV value
• Store the data in one EV matrix, representing all blocks and all frames (see the sketch below)
76
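A sketch of the EV computation for one block, assuming S is its 2688 x 3 score matrix and W = 3:

W = 3;                         % moving-window size
nFrames = size(S, 1);
EV = zeros(nFrames, 1);
for t = W:nFrames
    Cw = cov(S(t-W+1:t, :));   % covariance over a moving window of 3 score rows
    EV(t) = max(eig(Cw));      % largest eigenvalue for this block-frame
end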
Step 5 - EV Matrix
• The EV matrix contains a single EV value for each block-frame spatiotemporal location
[Diagram: EV matrix, 1728 blocks x 2688 frames]
77
Step 6 - Global Threshold
• Load the EV matrix from Step 5
• Compute its mean and standard deviation
• Find all entries in the EV matrix that are below mean + 2*std
• Update the EV matrix (see the sketch below)
78
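A sketch of the global threshold, assuming EV is the 1728 x 2688 matrix from Step 5; zeroing sub-threshold entries is one plausible reading of "update":

m = mean(EV(:));        % mean over all blocks and frames
s = std(EV(:));         % standard deviation over all blocks and frames
EV(EV < m + 2*s) = 0;   % suppress entries below the global threshold (assumption)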
Step 7 - Local Threshold
• Use the updated EV matrix from Step 6
• Compute a local dynamic threshold using a window
• Generate a Motion matrix of the same size as the EV matrix with simple 0/1 values (1 = motion)
79
Step 7 - Assumptions
• Assume that the first 100 frames have no detectable motion
• Compute the mean and std of the first 100 frames for each block
• Compute a local threshold for each block using a moving window (W=3), as sketched below
• Adjust the local threshold when no moving object is detected
80
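A sketch of the local test for one block, assuming ev is its 1 x 2688 EV row; the 2*std margin mirrors the global rule and is an assumption here:

base_m = mean(ev(1:100));                    % baseline: first 100 frames assumed motion-free
base_s = std(ev(1:100));
W = 3;
motion = zeros(size(ev));
for t = W:numel(ev)
    local = mean(ev(t-W+1:t));               % moving-window (W=3) local statistic
    motion(t) = local > base_m + 2*base_s;   % 1 = motion detected (threshold is assumed)
end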
Step 8 - Motion Matrix
• The Motion matrix is of size 1728 x 2688, the same size as the EV matrix
• It contains values 0 or 1, where 1 = motion detected
• Use the Motion matrix to create sample videos showing blocks where motion was detected
81
Detected Motion
[Video frames: no motion vs. detected motion (red blocks)]
82
Conclusion
• The method of motion detection using principal component analysis combined with dynamic thresholding yields very good results in detecting motion
• Future projects will include processing images with varying block sizes
83