My Previous Research on Object Segmentation & Recognition


Element Rearrangement for Tensor-based Subspace Learning

Dong XU
School of Computer Engineering
Nanyang Technological University
What is a Tensor?
Tensors are arrays of numbers which transform in
certain ways under coordinate transformations.
Vector: $x \in \mathbb{R}^{m_1}$
Matrix: $X \in \mathbb{R}^{m_1 \times m_2}$
3rd-order Tensor: $\mathcal{X} \in \mathbb{R}^{m_1 \times m_2 \times m_3}$
Definition of Mode-k Product

Notation: $\mathcal{Y} = \mathcal{X} \times_k U$

For two matrices the product is $Y = XU$ with $Y_{ij} = \sum_k X_{ik} U_{kj}$; the mode-k product applies the same contraction along mode k of a tensor.

Projection (high-dimensional space -> low-dimensional space): the original tensor of size $m_1(100) \times m_2(100) \times m_3(40)$, multiplied along mode 2 by a $100 \times 10$ projection matrix, becomes a new tensor of size $100 \times 10 \times 40$, i.e. $m_2(100) \to m_2'(10)$.

Reconstruction (low-dimensional space -> high-dimensional space): the new tensor of size $100 \times 10 \times 40$, multiplied along mode 2 by the transpose of the projection matrix ($m_2'(10) \times m_2(100)$), is mapped back to a $100 \times 100 \times 40$ tensor.
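As a concrete illustration, here is a minimal numpy sketch of the mode-k product as defined above; the function name and the convention that U of size $m_k \times m_k'$ shrinks mode k are my own choices for this example, not code from the talk.

```python
import numpy as np

def mode_k_product(X, U, k):
    """Mode-k product Y = X x_k U with the convention above:
    Y[..., j, ...] = sum_{i_k} X[..., i_k, ...] * U[i_k, j],
    so a projection matrix U of size m_k x m_k' shrinks mode k."""
    Y = np.tensordot(X, U, axes=([k], [0]))  # contract mode k with the rows of U
    return np.moveaxis(Y, -1, k)             # put the new mode back at position k

# Example matching the slide: 100 x 100 x 40 tensor, mode-2 projection to 10.
X = np.random.randn(100, 100, 40)
U = np.random.randn(100, 10)
Y = mode_k_product(X, U, 1)        # projection: (100, 10, 40)
X_rec = mode_k_product(Y, U.T, 1)  # reconstruction: (100, 100, 40)
print(Y.shape, X_rec.shape)
```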
Definition of Mode-k Flattening
A tensor of size $m_1(100) \times m_2(100) \times m_3(40)$ is flattened along mode 1 into a matrix of size $m_1(100) \times m_2 m_3(100 \cdot 40)$.
Potential Assumption in Previous Tensor-based Subspace Learning:
Intra-tensor correlations: Correlations along column vectors of mode-k
flattened matrices.
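A minimal numpy sketch of mode-k flattening; the exact ordering of the columns is a convention, and the one below is a common choice rather than necessarily the one used in the talk.

```python
import numpy as np

def mode_k_flatten(X, k):
    """Mode-k flattening: mode k becomes the rows, all remaining modes are
    unfolded into the columns, giving an m_k x (product of the rest) matrix."""
    return np.moveaxis(X, k, 0).reshape(X.shape[k], -1)

X = np.random.randn(100, 100, 40)
print(mode_k_flatten(X, 0).shape)  # (100, 4000), as in the slide
```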
Data Representation in Dimensionality Reduction

Data representations: Vector, Matrix, 3rd-order Tensor (e.g. a gray-level image, a filtered image, a video sequence), each mapped from high dimension to low dimension.
Examples: PCA, LDA; Rank-1 Decomposition (A. Shashua and A. Levin, 2001); Tensorface (M. Vasilescu and D. Terzopoulos, 2002); our work (Xu et al., 2005; Yan et al., 2005).
Why Represent Objects as Tensors instead of Vectors?

- Natural representation: gray-level images (2D structure), videos (3D structure), Gabor-filtered images (3D structure).
- Enhanced learnability in real applications: it avoids the curse of dimensionality (a 100*100*40 Gabor-filtered image unfolds into a 400,000-dimensional vector) and mitigates the small sample size problem (common face databases contain fewer than 5,000 images).
- Reduced computation cost.
Concurrent Subspace Analysis as an Example
(Criterion: Optimal Reconstruction)

Dimensionality reduction: an input sample (e.g. a $100 \times 100 \times 40$ tensor) is projected by the projection matrices $U_1$, $U_2$, $U_3$ to a sample in the low-dimensional space; reconstruction maps it back to the reconstructed sample.

Objective function:
$(U_k^* \,|_{k=1}^{3}) = \arg\min_{U_k |_{k=1}^{3}} \sum_i \| X_i \times_1 U_1 U_1^T \cdots \times_3 U_3 U_3^T - X_i \|^2$

D. Xu, S. Yan, H. Zhang, et al., CVPR 2005
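As a rough illustration of how such a reconstruction criterion is typically optimized, the sketch below alternates over the modes and updates each projection matrix from the leading eigenvectors of a mode-wise scatter matrix; the function names and the specific update rule are my own simplification, not the authors' implementation.

```python
import numpy as np

def mode_k_product(X, U, k):
    Y = np.tensordot(X, U, axes=([k], [0]))
    return np.moveaxis(Y, -1, k)

def csa(samples, ranks, n_iter=10):
    """Alternating optimization for the reconstruction criterion above:
    for each mode k, project every sample in all other modes, mode-k
    flatten it, and take the leading eigenvectors of the resulting scatter."""
    shape = samples[0].shape
    n_modes = len(shape)
    Us = [np.eye(shape[k], ranks[k]) for k in range(n_modes)]  # initial guesses
    for _ in range(n_iter):
        for k in range(n_modes):
            scatter = np.zeros((shape[k], shape[k]))
            for X in samples:
                Y = X
                for j in range(n_modes):
                    if j != k:
                        Y = mode_k_product(Y, Us[j], j)          # project other modes
                Yk = np.moveaxis(Y, k, 0).reshape(shape[k], -1)  # mode-k flatten
                scatter += Yk @ Yk.T
            w, V = np.linalg.eigh(scatter)
            Us[k] = V[:, np.argsort(w)[::-1][: ranks[k]]]        # leading eigenvectors
    return Us

# Toy usage: 20 random 20 x 20 x 8 samples reduced to 5 x 5 x 4.
samples = [np.random.randn(20, 20, 8) for _ in range(20)]
U1, U2, U3 = csa(samples, ranks=(5, 5, 4), n_iter=3)
print(U1.shape, U2.shape, U3.shape)  # (20, 5) (20, 5) (8, 4)
```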
Tensorization - New Research Direction

- Our extensions:
1) Supervised learning with Rank-(R1, R2, ..., Rn) Decomposition (DATER): CVPR 2005 and T-IP 2007
2) Supervised learning with Rank-1 Decomposition and Adaptive Margin (RPAM): CVPR 2006 and T-SMC-B (to appear)
3) Application in human gait recognition (CSA-2 + DATER-2): T-CSVT 2006

- D. Tao, S. Maybank, et al.'s extensions:
1) Incremental learning with tensor representation: ACM SIGKDD 2006
2) Tensorized SVM and Minimax Probability Machines: ICDM 2005

- G. Dai and D. Yeung's extensions:
Tensorized NPE (Neighborhood Preserving Embedding), LPP (Locality Preserving Projections) and LDE (Local Discriminant Embedding): AAAI 2006
Graph Embedding Framework

Direct Graph Embedding: $\min_{y^T B y = 1} y^T L y$
Examples: original PCA & LDA, ISOMAP, LLE, Laplacian Eigenmap

Type | Formulation | Example
Linearization | $y = X^T w$ | PCA, LDA, LPP
Kernelization | $w = \sum_i \alpha_i \phi(x_i)$ | KPCA, KDA
Tensorization | $y_i = X_i \times_1 w_1 \times_2 w_2 \cdots \times_n w_n$ | CSA, DATER

S. Yan, D. Xu, H. Zhang, et al., CVPR 2005 and T-PAMI 2007
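A minimal sketch of direct graph embedding under these definitions: with $L = D - S$, the constrained minimization reduces to a generalized eigenvalue problem, solvable for instance with scipy. It assumes B is symmetric positive definite; the function name and the toy data are illustrative only.

```python
import numpy as np
from scipy.linalg import eigh

def direct_graph_embedding(S, B, dim):
    """Solve min_y y^T L y  s.t. y^T B y = 1, with L = D - S.
    The minimizers are the generalized eigenvectors of L y = lambda B y
    with the smallest eigenvalues; each eigenvector gives one coordinate
    of the low-dimensional embedding (in practice the trivial constant
    eigenvector is usually discarded)."""
    D = np.diag(S.sum(axis=1))
    L = D - S
    w, V = eigh(L, B)          # symmetric generalized eigenproblem
    return V[:, :dim]          # N x dim embedding (one row per sample)

# Toy usage with a random symmetric similarity matrix and B = I.
N = 50
A = np.random.rand(N, N)
S = (A + A.T) / 2
Y = direct_graph_embedding(S, np.eye(N), dim=2)
print(Y.shape)  # (50, 2)
```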
Graph Embedding Framework - Continued

Data in the high-dimensional space and in the low-dimensional space (assumed to be 1D here):
$X = [x_1, x_2, ..., x_N]$, $y = [y_1, y_2, ..., y_N]^T$

Intrinsic graph: $G = \{X, S\}$; Penalty graph: $G^P = \{X, S^P\}$
S, S^P: similarity matrices (graph edge weights), measuring similarity in the high-dimensional space
L, B: Laplacian matrices from S, S^P; $L = D - S$, $D_{ii} = \sum_{j \neq i} S_{ij}$

Tensorization: the low-dimensional representation is obtained as
$y_i = X_i \times_1 w_1 \times_2 w_2 \cdots \times_n w_n$

Objective function in Tensorization:
$(w_1, ..., w_n)^* = \arg\min_{f(w_1,...,w_n)=1} \sum_{i \neq j} \| X_i \times_1 w_1 \times_2 w_2 \cdots \times_n w_n - X_j \times_1 w_1 \times_2 w_2 \cdots \times_n w_n \|^2 S_{ij}$

where
$f(w_1,...,w_n) = \sum_{i=1}^{N} \| X_i \times_1 w_1 \times_2 w_2 \cdots \times_n w_n \|^2 B_{ii}$
or
$f(w_1,...,w_n) = \sum_{i \neq j} \| X_i \times_1 w_1 \times_2 w_2 \cdots \times_n w_n - X_j \times_1 w_1 \times_2 w_2 \cdots \times_n w_n \|^2 S^P_{ij}$
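For the second-order case (each $X_i$ a matrix, $w_1$ and $w_2$ vectors), the embedding is the scalar $y_i = w_1^T X_i w_2$, and the objective and constraint above can be evaluated directly. The sketch below is illustrative only; names and toy data are my own.

```python
import numpy as np

def tensor_embedding_objective(samples, S, B, w1, w2):
    """For 2nd-order tensors, y_i = w1^T X_i w2 is the 1-D embedding of X_i.
    Returns the graph-preserving objective sum_{i != j} (y_i - y_j)^2 S_ij
    and the constraint value f = sum_i y_i^2 B_ii (set to 1 in the criterion)."""
    y = np.array([w1 @ X @ w2 for X in samples])
    objective = np.sum(((y[:, None] - y[None, :]) ** 2) * S)
    constraint = np.sum(y ** 2 * np.diag(B))
    return objective, constraint

# Toy usage.
samples = [np.random.randn(8, 6) for _ in range(30)]
S = np.random.rand(30, 30); S = (S + S.T) / 2
w1, w2 = np.random.randn(8), np.random.randn(6)
print(tensor_embedding_objective(samples, S, np.eye(30), w1, w2))
```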
A General Framework for Dimensionality Reduction

Algorithm | S & B Definition | Embedding Type
PCA / KPCA / CSA | $S_{ij} = 1/N,\ i \neq j$; $B = I$ | L / K / T
LDA / KDA / DATER | $S_{ij} = \delta_{l_i, l_j} / n_{l_i}$; $B = I - \frac{1}{N} e e^T$ | L / K / T
ISOMAP | $S_{ij} = -\tau(D_G)_{ij},\ i \neq j$; $B = I$ | D
LLE | $S = M + M^T - M^T M$; $B = I$ | D
LE / LPP | $S_{ij} = \exp\{-\|x_i - x_j\|^2 / t\}$ if $\|x_i - x_j\| < \epsilon$; $B = D$ | D / L

D: Direct Graph Embedding; L: Linearization; K: Kernelization; T: Tensorization
New Dimensionality Reduction Algorithm:
Marginal Fisher Analysis

Important information for face recognition:
1) Label information
2) Local manifold structure (neighborhood or margin)

$S_{ij} = 1$ if $x_i$ is among the $k_1$-nearest neighbors of $x_j$ in the same class; 0 otherwise.
$S^P_{ij} = 1$ if the pair $(i, j)$ is among the $k_2$ shortest pairs between samples from different classes; 0 otherwise.
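A rough sketch of how these two graphs could be built; it follows the definitions above (k1 same-class nearest neighbors for S, the k2 shortest between-class pairs for S^P), but the function name and details are a simplification rather than the authors' code.

```python
import numpy as np

def mfa_graphs(X, labels, k1, k2):
    """Build the MFA intrinsic graph S (k1 same-class nearest neighbors)
    and penalty graph S^P (k2 shortest between-class pairs). X is N x d."""
    N = X.shape[0]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    S = np.zeros((N, N))
    for j in range(N):
        same = np.where(labels == labels[j])[0]
        same = same[same != j]
        nn = same[np.argsort(dist[same, j])[:k1]]
        S[nn, j] = 1                      # x_i among the k1 same-class NNs of x_j
    S = np.maximum(S, S.T)                # symmetrize
    between = np.where(labels[:, None] != labels[None, :], dist, np.inf)
    order = np.argsort(between, axis=None)
    Sp = np.zeros((N, N))
    for f in order[: 2 * k2]:             # each unordered pair appears twice
        i, j = divmod(int(f), N)
        Sp[i, j] = Sp[j, i] = 1
    return S, Sp

# Toy usage: 40 samples in 4 classes.
X = np.random.randn(40, 5)
labels = np.repeat(np.arange(4), 10)
S, Sp = mfa_graphs(X, labels, k1=3, k2=20)
print(S.sum(), Sp.sum())
```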
Motivations

Contradiction:
- The success of tensor-based subspace learning relies on the redundancy among the unfolded vectors.
- In practice, this kind of correlation/redundancy is often not strong in real data.

S. Yan, D. Xu, S. Lin, et al., CVPR 2007
Motivations - Continued

Pixel rearrangement: before rearrangement, the sets of highly correlated pixels are scattered across the image, so the correlation along columns is low; after rearrangement, highly correlated pixels are grouped into columns, so the correlation is high.
Problem Definition

The task of enhancing the correlation/redundancy within 2nd-order tensors is to search for a pixel rearrangement operator R such that

$R^* = \arg\min_R \{ \min_{U,V} \sum_{i=1}^{N} \| X_i^R - U U^T X_i^R V V^T \|^2 \}$

1. $X_i^R$ is the rearranged matrix from sample $X_i$.
2. The numbers of columns of U and V are predefined.

After the pixel rearrangement, we can use the rearranged tensors as input for the tensorization of graph embedding.
Solution to Pixel Rearrangement Problem

Initialize $U_0$, $V_0$; then for $n = 1, 2, ...$:

1. Compute the reconstructed matrices: $X_{i,n}^{Rec} = U_{n-1} U_{n-1}^T X_i^{R_{n-1}} V_{n-1} V_{n-1}^T$
2. Optimize the operator R: $R_n = \arg\min_R \sum_{i=1}^{N} \| X_i^R - X_{i,n}^{Rec} \|^2$
3. Optimize U and V: $(U_n, V_n) = \arg\min_{U,V} \sum_{i=1}^{N} \| X_i^{R_n} - U U^T X_i^{R_n} V V^T \|^2$

Note: $\sum_{i=1}^{N} \| X_i^{R_n} - X_{i,n}^{Rec} \|^2 \geq \sum_{i=1}^{N} \| X_i^{R_n} - U_{n-1} U_{n-1}^T X_i^{R_n} V_{n-1} V_{n-1}^T \|^2$, which guarantees that the objective is non-increasing over the iterations.
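The sketch below illustrates two of the three steps, the reconstruction and the U, V update, for the second-order case; the alternating eigendecomposition used for U and V is a standard way to optimize this reconstruction criterion, but the function names and details are my own simplification, not the authors' implementation.

```python
import numpy as np

def optimize_projections(Xs, r1, r2, n_iter=5):
    """'Optimize U and V' step: given the currently rearranged samples Xs
    (a list of m1 x m2 matrices), find U (m1 x r1) and V (m2 x r2) minimizing
    sum_i ||X_i - U U^T X_i V V^T||^2 by alternating eigendecompositions."""
    m1, m2 = Xs[0].shape
    V = np.eye(m2, r2)
    for _ in range(n_iter):
        SU = sum((X @ V) @ (X @ V).T for X in Xs)      # mode-1 scatter with V fixed
        w, E = np.linalg.eigh(SU)
        U = E[:, np.argsort(w)[::-1][:r1]]
        SV = sum((X.T @ U) @ (X.T @ U).T for X in Xs)  # mode-2 scatter with U fixed
        w, E = np.linalg.eigh(SV)
        V = E[:, np.argsort(w)[::-1][:r2]]
    return U, V

def reconstruct(Xs, U, V):
    """'Compute reconstructed matrices' step: X_i^Rec = U U^T X_i V V^T."""
    return [U @ (U.T @ X @ V) @ V.T for X in Xs]

# Toy usage on small matrices.
Xs = [np.random.randn(20, 20) for _ in range(10)]
U, V = optimize_projections(Xs, r1=5, r2=5)
Xrecs = reconstruct(Xs, U, V)
print(U.shape, V.shape, Xrecs[0].shape)  # (20, 5) (20, 5) (20, 20)
```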
Step for Optimizing R

It is an Earth Mover's Distance problem:

$R^* = \arg\min_R \sum_{i=1}^{N} \| X_i^R - X_{i,n}^{Rec} \|^2$

Viewing the original matrix as the sender and the reconstructed matrix as the receiver, this becomes the transportation problem

$\min_R \sum_{p,q} c_{pq} R_{pq}$  s.t.  1) $0 \leq R_{pq} \leq 1$;  2) $\sum_p R_{pq} = 1$;  3) $\sum_q R_{pq} = 1$

where $c_{pq} = \sum_{i=1}^{N} | X_i(p) - X_{i,n}^{Rec}(q) |^2$

1. This linear programming problem has an integer solution.
2. We constrain the rearrangement within a local neighborhood for speedup.
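Because the linear program has an integer (permutation) solution, the R step can also be solved as a minimum-cost assignment; the sketch below uses scipy's linear_sum_assignment as a stand-in for the EMD solver described above and omits the local-neighborhood restriction mentioned for speedup. Names and toy data are illustrative.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def optimize_rearrangement(Xs, Xrecs):
    """'Optimize R' step as a minimum-cost assignment of original pixel
    positions p to target positions q, with c_pq = sum_i |X_i(p) - X_i^Rec(q)|^2.
    Returns the rearranged samples X_i^R."""
    P = np.stack([X.ravel() for X in Xs])       # N x n_pixels, column p = pixel p
    Q = np.stack([X.ravel() for X in Xrecs])    # N x n_pixels
    # c[p, q] expanded as ||P[:, p]||^2 + ||Q[:, q]||^2 - 2 P[:, p] . Q[:, q]
    cost = (P ** 2).sum(0)[:, None] + (Q ** 2).sum(0)[None, :] - 2 * P.T @ Q
    rows, cols = linear_sum_assignment(cost)    # optimal permutation (integer LP)
    perm = np.empty_like(cols)
    perm[cols] = rows                           # position q receives original pixel perm[q]
    shape = Xs[0].shape
    return [X.ravel()[perm].reshape(shape) for X in Xs]

# Toy usage on small images.
Xs = [np.random.randn(8, 8) for _ in range(10)]
Xrecs = [np.random.randn(8, 8) for _ in range(10)]
Xr = optimize_rearrangement(Xs, Xrecs)
print(Xr[0].shape)  # (8, 8)
```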
Convergence Speed

Rearrangement Results

Reconstruction Visualization

Classification Accuracy

Thank You very much!
www.ntu.edu.sg/home/dongxu