Entropic graphs: Applications

Alfred O. Hero
Dept. EECS, Dept BME, Dept. Statistics
University of Michigan - Ann Arbor
[email protected]
http://www.eecs.umich.edu/~hero
1. Dimension reduction and pattern matching
2. Entropic graphs for manifold learning
3. Simulation studies
4. Applications to face and digit databases
1. Dimension Reduction and Pattern Matching
• 128x128 images of faces
• Different poses, illuminations, facial expressions
• The set of all face images lies on a lower-dimensional manifold embedded in R^(16384)
Face Manifold
Classification on Face Manifold
Manifold Learning:
What is it good for?
• Interpreting high dimensional data
• Discovery and exploitation of lower dimensional
structure
• Deducing non-linear dependencies between
populations
• Improving detection and classification
performance
• Improving image compression performance
Background on Manifold Learning
1. Manifold intrinsic dimension estimation
   1. Local KLE, Fukunaga, Olsen (1971)
   2. Nearest neighbor algorithm, Pettis, Bailey, Jain, Dubes (1971)
   3. Fractal measures, Camastra and Vinciarelli (2002)
   4. Packing numbers, Kegl (2002)
2. Manifold reconstruction
   1. Isomap-MDS, Tenenbaum, de Silva, Langford (2000)
   2. Locally Linear Embeddings (LLE), Roweis, Saul (2000)
   3. Laplacian eigenmaps (LE), Belkin, Niyogi (2002)
   4. Hessian eigenmaps (HE), Grimes, Donoho (2003)
3. Characterization of sampling distributions on manifolds
   1. Statistics of directional data, Watson (1956), Mardia (1972)
   2. Data compression on 3D surfaces, Kolarov, Lynch (1997)
   3. Statistics of shape, Kendall (1984), Kent, Mardia (2001)
Sampling on a Domain Manifold
[Figure: a 2-dim domain manifold, its embedding, the sampling distribution, domain sampling, and the observed statistical sample]
Learning 3D Manifolds
• Swiss Roll, N=400 (Ref: Tenenbaum et al. (2000))
• S-Curve, N=800 (Ref: Roweis et al. (2000))
• Sampling density f = Uniform on manifold
Sampled S-curve
What is the shortest path between points A and B along the manifold? The geodesic from A to B is the shortest path; the Euclidean path is a poor approximation.
Geodesic Graph Path Approximation
Dijkstra's shortest path on the k-NNG skeleton (k=4) approximates the geodesic from A to B.
ISOMAP (PCA) Reconstruction
• Compute the k-NN skeleton on the observed sample
• Run Dijkstra’s shortest-path algorithm between all pairs of vertices of the k-NN graph
• Generate the approximate geodesic pairwise distance matrix
• Perform MDS on this distance matrix
• Reconstruct the sample in the manifold domain
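The steps above can be sketched in Python. This is an illustrative implementation, not the authors' code: the function name and parameters are hypothetical, and the MDS step is classical MDS via eigendecomposition of the doubly centered squared-distance matrix.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path

def isomap(X, k=6, d=2):
    """Embed the n rows of X (n x D) into d dimensions via geodesic distances."""
    n = X.shape[0]
    D = squareform(pdist(X))                  # Euclidean pairwise distances
    # k-NN skeleton: keep only each point's k nearest neighbors (skip self)
    W = np.zeros((n, n))
    idx = np.argsort(D, axis=1)[:, 1:k + 1]
    for i in range(n):
        W[i, idx[i]] = D[i, idx[i]]
    # Symmetrize and run all-pairs Dijkstra to approximate geodesic distances
    G = shortest_path(csr_matrix(np.maximum(W, W.T)), method="D", directed=False)
    # Classical MDS: double-center the squared distance matrix, take top-d eigenpairs
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:d]
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))
```

For a well-sampled manifold the k-NN graph is connected; if it is not, the geodesic matrix contains infinities and the MDS step fails, so k must be chosen large enough.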
ISOMAP Convergence
• When the domain mapping is an isometry, the domain is open and convex, and the true domain dimension d is known, ISOMAP is a consistent reconstruction (de Silva et al., 2001)
• How to estimate d?
• How to estimate attributes of the sampling density?
How to Estimate d?
[Figure: Landmark-ISOMAP residual-variance curve, residual variance vs. Isomap dimensionality (0 to 20), for Data Set 1, the Abilene Netflow data set]
2. Entropic Graphs
• Sample X_n = {X_1, ..., X_n} in D-dimensional Euclidean space
• Euclidean MST with edge power weighting gamma:
  L_gamma(X_n) = min_T sum_{e in T} |e|^gamma,
  where T ranges over spanning trees built from the pairwise distance matrix of X_n
• Euclidean k-NNG with edge power weighting gamma:
  L_gamma(X_n) = sum_i sum_{x in N_k(X_i)} |x - X_i|^gamma
• When Euclidean edge lengths are replaced by geodesic distances on the manifold, obtain the Geodesic MST (GMST)
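Assuming the power-weighted length functional above (sum of MST edge lengths raised to gamma), a minimal sketch: since the MST is invariant under the monotone map |e| -> |e|^gamma for gamma > 0, the tree can be built on plain Euclidean distances and the powers applied afterwards.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

def mst_length(X, gamma=1.0):
    """Power-weighted Euclidean MST length: sum of |e|^gamma over MST edges.

    The MST is invariant under the monotone map |e| -> |e|^gamma (gamma > 0),
    so it is computed on plain distances and powered afterwards.
    """
    D = squareform(pdist(X))              # dense pairwise distance matrix
    T = minimum_spanning_tree(D)          # sparse matrix holding MST edge lengths
    return float(np.sum(T.data ** gamma))
```

For three collinear points at 0, 1, 3 on a line, the MST edges have lengths 1 and 2, so the length is 3.0 for gamma=1 and 5.0 for gamma=2.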
Example: Uniform Planar Sample
Example: MST on Planar Sample
Example: k-NNG on Planar Sample
Convergence of Euclidean MST
Beardwood, Halton, Hammersley Theorem: for an i.i.d. sample X_n with density f on [0,1]^d and 0 < gamma < d,
  L_gamma(X_n) / n^((d-gamma)/d) -> beta_{d,gamma} * Integral f^((d-gamma)/d)(x) dx  (a.s.)
GMST Convergence Theorem: the analogous limit holds with d the intrinsic dimension of the manifold and f the sampling density on M (Ref: Costa&Hero:TSP2003)
k-NNG Convergence Theorem: the same growth rate holds for the power-weighted k-NNG length
Shrinkwrap Interpretation
[Figure: resampled manifold at n=400 and n=800]
Dimension = “shrinkage rate” of the total graph length as the number of resampled points on M varies
Joint Estimation Algorithm
• The convergence theorem suggests the log-linear model log E[L_n] = a log n + b, with slope a = (d - gamma)/d and intercept b determined by the entropy
• Use bootstrap resampling to estimate the mean graph length and apply LS to jointly estimate the slope and intercept from the sequence
• Extract d and H from the slope and intercept
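The steps above can be sketched as follows, under the assumed model log E[L_n] = a log n + b with a = (d - gamma)/d. The helper names are illustrative, not the authors' implementation, and only the dimension extraction is shown; the entropy comes from the intercept in the same way.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

def mst_length(X, gamma=1.0):
    # Total power-weighted edge length of the Euclidean MST of the rows of X
    T = minimum_spanning_tree(squareform(pdist(X)))
    return float(np.sum(T.data ** gamma))

def estimate_dim(X, sizes, m=10, gamma=1.0, rng=None):
    # Bootstrap: average m MST lengths at each subsample size n, then
    # LS-fit log E[L_n] = a log n + b over the sequence of sizes
    rng = np.random.default_rng(rng)
    mean_lengths = [
        np.mean([mst_length(X[rng.choice(len(X), n, replace=False)], gamma)
                 for _ in range(m)])
        for n in sizes]
    a, b = np.polyfit(np.log(sizes), np.log(mean_lengths), 1)
    d_hat = int(round(gamma / (1.0 - a)))   # invert slope a = (d - gamma)/d
    return d_hat, a, b
```

On a sample drawn uniformly from a flat 2-D square embedded in 3-D, the fitted slope is close to 0.5 and the estimate rounds to d = 2.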
3. Simulation Studies: Swiss Roll
• n=400, f = Uniform on manifold
• GMST and kNN graph (k=4)
Estimates of GMST Length
[Figure: segment n=786:799 of the mean MST length sequence E[L_n] (gamma=1, m=10) for the uniformly sampled Swiss Roll, with bootstrap SE bars (83% CI)]
log-log Linear Fit to GMST Length
[Figure: segment of the log MST length sequence (gamma=1, m=10) for the uniformly sampled Swiss Roll; LS fit of log(E[L_n]) vs log(n): y = 0.53*x + 3.2]
GMST Dimension and Entropy Estimates
• From the LS fit (slope 0.53, intercept 3.2) find:
  – Intrinsic dimension estimate: d_hat = round(1/(1 - 0.53)) = 2
  – Alpha-entropy estimate from the intercept
• Ground truth: the Swiss roll is a d = 2 manifold
MST/kNN Comparisons
[Figure: MST and kNN graphs on the Swiss roll for n=400 and n=800]
Entropic Graphs on S2 Sphere in 3D
• n=500, f = Uniform on manifold
• GMST and kNN graphs
k-NNG on Sphere S4 in 5D
• k=7 for all algorithms
• kNN resampled 5 times
• Length regressed on 10 or 20
samples at end of mean length
sequence
• 30 experiments performed
• ISOMAP always estimates d=5
[Figure: histogram of resampled d-estimates of the k-NNG, for N=1000 points uniformly distributed on the sphere S4 in 5D]
Table of relative frequencies of correct d estimate
kNN/GMST Comparisons
[Table: relative frequencies of correct d estimate; true entropy vs. estimated entropy (n = 600)]
kNN/GMST Comparisons for Uniform Hyperplane
[Figure: GMST and 4-NN graphs on the uniform hyperplane]
Improve Performance by Bootstrap Resampling
• Main idea: Averaging of weak learners
– Using fewer (N) samples per MST estimate, generate large
number (M) of weak estimates of d and H
– Reduce bias by averaging these estimates (M>>1,N=1)
– Better than optimizing estimate of MST length (M=1,N>>1)
Illustration of bootstrap resampling method: A,B: N=1 vs C: M=1
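The averaging idea above can be sketched as follows, assuming a real-valued weak estimator built from one LS fit per small subsample (illustrative names, not the authors' implementation): many cheap estimates are computed and rounding happens only once, after averaging.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

def mst_length(X, gamma=1.0):
    # Total power-weighted edge length of the Euclidean MST
    T = minimum_spanning_tree(squareform(pdist(X)))
    return float(np.sum(T.data ** gamma))

def weak_dim_estimate(X, sizes, gamma, rng):
    # One weak, real-valued d-estimate: a single MST length per subsample
    # size, then one LS fit of log L_n against log n
    lengths = [mst_length(X[rng.choice(len(X), n, replace=False)], gamma)
               for n in sizes]
    a, _ = np.polyfit(np.log(sizes), np.log(lengths), 1)
    return gamma / (1.0 - a)            # invert slope a = (d - gamma)/d

def averaged_dim(X, sizes, M=25, gamma=1.0, seed=0):
    # Average M weak estimates; round only once, at the very end
    rng = np.random.default_rng(seed)
    return int(round(np.mean([weak_dim_estimate(X, sizes, gamma, rng)
                              for _ in range(M)])))
```

Each individual fit is noisy, but the average of many of them is a much more stable estimate than a single fit on one long length sequence.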
kNN/GMST Comparisons for
Uniform Hyperplane
Table of relative frequencies of correct d estimate using the GMST,
with (N = 1) and without (M = 1) bias correction.
4. Application: ISOMAP Face Database
• http://isomap.stanford.edu/datasets.html
• Synthesized 3D face surface; computer-generated images representing 700 different angles and illuminations
• Subsampled to 64 x 64 resolution (D=4096)
• Disagreement over intrinsic dimensionality: d=3 (Tenenbaum) vs. d=4 (Kegl)
• Entropic-graph estimates: d=3 with H=21.1 bits, and d=4 with H=21.8 bits
[Figures: mean GMST length function, mean kNNG (k=7) length vs. resampling, histogram of d-hat]
Application: Yale Face Database
• Description of Yale face database 2
– Photographic folios of many people’s faces
– Each face folio contains images at 585
different illumination/pose conditions
– Subsampled to 64 by 64 pixels (4096 extrinsic
dimensions)
• Objective: determine intrinsic dimension
and entropy of a typical face folio
Samples from Face database B
GMST for 3 Face Folios
Dimension Estimator Histograms
for Face database B
Real valued intrinsic dimension estimates
using 3-NN graph for face 1.
Real valued intrinsic dimension estimates
using 3-NN graph for face 2.
Remarks on Yale Facebase B
• GMST LS estimation parameters
– Local Geodesic approximation used to generate
pairwise distance matrix
– Estimates based on 25 resamplings over 18 largest
folio sizes
• To represent any folio we might hope to attain
– factor > 600 reduction in degrees of freedom (dim)
– only 1/10 bit per pixel for compression
– a practical parameterization/encoder?
Application: MNIST Digit Database
Sample: MNIST Handwritten Digits
MNIST Digit Database
Histogram of intrinsic dimension estimates: GMST (left) and 5-NN (right) (M = 1, N = 10, Q = 15); x-axis: estimated intrinsic dimension
MNIST Digit Database
ISOMAP (k = 6) residual variance plot.
The digits database contains nonlinear transformations, such as width
distortions of each digit, that are not adequately modeled by ISOMAP!
Conclusions
• Entropic graphs give accurate global and consistent
estimators of dimension and entropy
• Manifold learning and model reduction
– LLE, LE, HE estimate d by finding local linear representation of
manifold
– Entropic graph estimates d from global resampling
– Initialization of ISOMAP… with entropic graph estimator
• Computational considerations
– GMST, kNN with pairwise distance matrix: O(E log E)
– GMST with greedy neighborhood search: O(d n log n)
– kNN with kd-tree partitioning: O(d n log n)
References
• A. O. Hero, B. Ma, O. Michel and J. D. Gorman, “Applications of entropic spanning graphs,” IEEE Signal Processing Magazine, Sept 2002.
• H. Neemuchwala, A.O. Hero and P. Carson, “Entropic
graphs for image registration,” to appear in European
Journal of Signal Processing, 2003.
• J. Costa and A. O. Hero, “Manifold learning with
geodesic minimal spanning trees,” to appear in IEEE TSP (Special Issue on Machine Learning), 2004.
• A. O. Hero, J. Costa and B. Ma, "Convergence rates of
minimal graphs with random vertices," submitted to IEEE
T-IT, March 2001.
• J. Costa, A. O. Hero and C. Vignat, “On solutions to multivariate maximum alpha-entropy problems,” in Energy Minimization Methods in Computer Vision and Pattern Recognition (EMMCVPR), Eds. M. Figueiredo, A. Rangarajan, J. Zerubia, Springer-Verlag, 2003.