Model-based Compressive Sensing
Richard Baraniuk
Rice University
Chinmay Hegde
Volkan Cevher
Marco Duarte
Compressive Sensing
• Sensing via randomized dimensionality reduction
  [figure: y = Φx, with Φ an M x N random measurement matrix and x a length-N sparse signal with K nonzero entries]
• Recovery by solving a sparse approximation problem (see the sketch below)
  – ex: basis pursuit (linear programming)
  – ex: greedy algorithms
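A minimal numpy sketch of this pipeline, assuming an iid Gaussian measurement matrix and plain orthogonal matching pursuit as the greedy recovery; the sizes N, M, K below are illustrative choices, not the settings used in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, K = 512, 128, 10                       # ambient dim, measurements, sparsity

# K-sparse signal: K random locations, random amplitudes
x = np.zeros(N)
support = rng.choice(N, K, replace=False)
x[support] = rng.standard_normal(K)

# random Gaussian measurement matrix and compressive measurements y = Phi @ x
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
y = Phi @ x

# greedy recovery: orthogonal matching pursuit (OMP)
residual, selected = y.copy(), []
for _ in range(K):
    j = int(np.argmax(np.abs(Phi.T @ residual)))   # column most correlated with residual
    selected.append(j)
    coef, *_ = np.linalg.lstsq(Phi[:, selected], y, rcond=None)
    residual = y - Phi[:, selected] @ coef

x_hat = np.zeros(N)
x_hat[selected] = coef
print("relative recovery error:", np.linalg.norm(x_hat - x) / np.linalg.norm(x))
```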
Concise Signal Structure
• Sparse signal: only K out of N coordinates nonzero
  – model: union of K-dimensional subspaces aligned with the coordinate axes
• Compressible signal: sorted coordinates decay rapidly to zero
  – well-approximated by a K-sparse signal (simply by thresholding; see the sketch below)
  – model: ℓp ball (power-law decay of the sorted coefficients)
  [figure: coefficient magnitude vs. sorted index]
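A small numpy illustration of the thresholding point above: for a signal whose sorted coefficient magnitudes follow a power law, keeping only the K largest entries already leaves a small approximation error. The decay exponent and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 1024, 50

# compressible signal: sorted magnitudes decay like a power law (here ~ 1/i)
x = rng.permutation(np.arange(1, N + 1) ** -1.0) * rng.choice([-1.0, 1.0], N)

# best K-term (K-sparse) approximation: keep the K largest magnitudes
idx = np.argsort(np.abs(x))[-K:]
x_K = np.zeros(N)
x_K[idx] = x[idx]

rel_err = np.linalg.norm(x - x_K) / np.linalg.norm(x)
print(f"relative error of the K={K}-term approximation: {rel_err:.3f}")
```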
Restricted Isometry Property (RIP)
• Preserve the structure of sparse/compressible signals
• RIP of order 2K implies: for all K-sparse x1 and x2,
  (1 - δ) ||x1 - x2||2^2 ≤ ||Φ(x1 - x2)||2^2 ≤ (1 + δ) ||x1 - x2||2^2
• Random (iid Gaussian, Bernoulli) matrix has the RIP whp if
  M = O(K log(N/K))
  [figure: the union of K-planes is stably embedded by Φ]
Sparse Recovery Algorithms
• Goal: given y = Φx, recover x
• ℓ1 and convex optimizations
  – basis pursuit, Dantzig selector, Lasso, …
• Greedy algorithms
  – matching pursuit (MP), orthogonal matching pursuit (OMP)
  – iterative hard thresholding (IHT) [Blumensath, Davies]
  – CoSaMP [Needell, Tropp]
  – at their core: iterated sparse approximation (see the sketch below)
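A minimal sketch of that "iterated sparse approximation" core, written as plain iterative hard thresholding (IHT) in numpy; the step-size rule and iteration count are illustrative choices, not tuned settings.

```python
import numpy as np

def iht(y, Phi, K, iters=300, step=None):
    """Iterative hard thresholding: gradient step on ||y - Phi x||^2,
    then projection onto the set of K-sparse vectors."""
    if step is None:
        step = 1.0 / np.linalg.norm(Phi, ord=2) ** 2      # conservative step size
    x = np.zeros(Phi.shape[1])
    for _ in range(iters):
        x = x + step * (Phi.T @ (y - Phi @ x))            # gradient step
        keep = np.argsort(np.abs(x))[-K:]                 # K largest magnitudes
        pruned = np.zeros_like(x)
        pruned[keep] = x[keep]
        x = pruned                                        # sparse approximation step
    return x

# tiny usage example
rng = np.random.default_rng(1)
N, M, K = 256, 100, 8
x0 = np.zeros(N)
x0[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
x_hat = iht(Phi @ x0, Phi, K)
print("relative error:", np.linalg.norm(x_hat - x0) / np.linalg.norm(x0))
```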
RIP and Recovery
• Using ℓ1 methods, CoSaMP, IHT
• Sparse signals
  – noise-free measurements: exact recovery
  – noisy measurements: stable recovery
• Compressible signals
  – recovery as good as K-sparse approximation
  – ||x - x_hat||2 ≤ C1 ||x - x_K||1 / sqrt(K) + C2 ||noise||2
    (CS recovery error ≤ signal K-term approximation error + noise)
Example with CS Camera
[figure: target image with N = 65536 pixels; recoveries from M = 11000 measurements (16%) and from M = 1300 measurements (2%)]
From Sparsity to Structured Sparsity
Beyond Sparse Models
• Sparse/compressible signal model captures simplistic primary structure
  – wavelets: natural images
  – Gabor atoms: chirps/tones
  – pixels: background-subtracted images
• Modern compression/processing algorithms capture richer secondary coefficient structure
Sparse Signals
• Defn: A K-sparse signal lives on the collection of K-dim subspaces aligned with the coordinate axes
Model-Sparse Signals
• Defn: A K-model-sparse signal lives on a particular (reduced) collection of K-dim canonical subspaces [Blumensath and Davies]
• Fewer subspaces ⇒ relaxed RIP ⇒ stable recovery using fewer measurements M
Ex: Tree-Sparse
• Model: K-sparse coefficients
  + significant coefficients lie on a rooted subtree
• Typical of wavelet transforms of natural signals and images (piecewise smooth)
• Tree-sparse approx: find the best rooted subtree of coefficients
  – CSSA [Baraniuk]
  – dynamic programming [Donoho]
  – (a simplified greedy sketch follows below)
• Recovery: inject the tree-sparse approx into CoSaMP/IHT
• If Φ has the Tree-RIP, then M = O(K) measurements suffice for stable recovery [B&D]
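A hedged sketch of a tree-structured approximation step. This is not the CSSA or dynamic-programming algorithm cited above, just a simple greedy rule that grows a rooted subtree, assuming heap-ordered coefficients (node i has children 2i+1 and 2i+2); real wavelet trees have one tree per coarse-scale coefficient, which is ignored here.

```python
import numpy as np

def greedy_tree_approx(w, K):
    """Keep K coefficients of w that form a rooted, connected subtree,
    chosen greedily by magnitude (a crude stand-in for CSSA/DP)."""
    N = len(w)
    selected = {0}                                   # the root is always kept
    frontier = {c for c in (1, 2) if c < N}          # children of selected nodes
    while len(selected) < K and frontier:
        j = max(frontier, key=lambda i: abs(w[i]))   # best node touching the subtree
        frontier.remove(j)
        selected.add(j)
        frontier.update(c for c in (2 * j + 1, 2 * j + 2) if c < N)
    w_tree = np.zeros_like(w)
    keep = sorted(selected)
    w_tree[keep] = w[keep]
    return w_tree

# inside a model-based CoSaMP/IHT iteration, this projection would simply
# replace the usual "keep the K largest coefficients" pruning step
```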
Tree-Sparse Signal Recovery
[figure: target signal, N=1024, M=80; L1-minimization (RMSE=0.751); CoSaMP (RMSE=1.12); tree-sparse CoSaMP (RMSE=0.037)]
Compressible Signals
• Real-world signals are compressible, not sparse
• Recall: compressible ⇔ approximable by sparse
  – compressible signals lie close to a union of subspaces
  – ie: the K-term approximation error decays rapidly as K grows
  – nested in that the support of the best K-term approximation is contained in the support of the best K'-term approximation for K < K'
• If Φ has the RIP, then both sparse and compressible signals are stably recoverable
  [figure: coefficient magnitude vs. sorted index]
Model-Compressible Signals
• Model-compressible ⇔ approximable by model-sparse
  – model-compressible signals lie close to a reduced union of subspaces
  – ie: the model-based approximation error decays rapidly as K grows
• Nested approximation property (NAP): the model-approximations are nested in that the best K-term model approximation is contained in the best K'-term model approximation for K < K'
• New: while the model-RIP enables stable model-sparse recovery, the model-RIP is not sufficient for stable model-compressible recovery!
Stable Recovery
• RIP: controls the amount of nonisometry of Φ on the approximation subspaces
  – the optimal K-term and 2K-term model approximations live in the model, so their error is controlled by the model-RIP
• RAmP (Restricted Amplification Property): controls the amount of nonisometry of Φ on the residual subspaces
  – the remainder of a model-compressible signal lies in residual subspaces outside the model, so its error is not controlled by the model-RIP
Ex: Tree-RIP, Tree-RAmP
Theorem: An M x N iid subgaussian random matrix has the Tree(K)-RIP if M = O(K)
Theorem: An M x N iid subgaussian random matrix has the Tree(K)-RAmP if M = O(K)
Simulation
• Recovery performance (MSE) vs. number of measurements M
• Piecewise cubic signals + wavelets
• Models/algorithms:
  – sparse (CoSaMP)
  – tree-sparse
Other Interesting Models (1)
• Deterministic models
  – the theory developed above is general
  – seek models with
     a NAP
     an efficient approximation algorithm
  – plus a sensing matrix with
     model-RIP, model-RAmP
  – Ex: block sparsity / signal ensembles
    [Tropp, Gilbert, Strauss], [Stojnic, Parvaresh, Hassibi], [Eldar, Mishali], [Baron, Duarte et al]
Other Interesting Models (2)
• Bayesian probabilistic models
  – sparse model ⇔ iid Gaussian mixture
  – compressible model ⇔ generalized Gaussian (Laplacian)
• Encode correlations between coefficients
   one approach: graphical models
   wavelets: Hidden Markov Tree (HMT) [Crouse et al], Gaussian scale mixture [Portilla et al], …
   space domain: Markov Random Field
• Recovery via MAP estimation [see also Carin et al]
Ex: Background Subtraction
• Model the clustering of significant pixels in the space domain using an Ising MRF
• MAP iterative estimation
  – the Ising model approximation is performed efficiently using graph cuts
[figure: target; Ising-model recovery; CoSaMP recovery; LP (FPC) recovery]
References
• R. G. Baraniuk, V. Cevher, M. F. Duarte, and C. Hegde, "Model-based Compressive Sensing," submitted to IEEE Transactions on Information Theory, 2008.
• V. Cevher, M. F. Duarte, C. Hegde, and R. G. Baraniuk, "Sparse Signal Recovery using Markov Random Fields," Neural Information Processing Systems (NIPS), Vancouver, Canada, December 2008.
From Sparsity to Manifold Models
Stable Embedding
• A random projection is not full rank, but it stably embeds signals with concise geometrical structure with high probability, provided M is large enough
  – sparse and model-sparse signals
  – compressible and model-compressible signals
  – point clouds
• Q: What about other concise signal models?
• Result: smooth K-dimensional manifolds in R^N
Stable Manifold Embedding
Theorem: Let M in R^N be a compact K-dimensional manifold with
– condition number 1/τ (curvature, self-avoiding)
– volume V
Let Φ be a random M x N orthoprojector with
  M = O( (K log(N V / τ) log(1/ρ)) / ε^2 )
Then with probability at least 1-ρ, the following statement holds: for every pair x1, x2 in M,
  (1-ε) sqrt(M/N) ≤ ||Φx1 - Φx2||2 / ||x1 - x2||2 ≤ (1+ε) sqrt(M/N)
[Baraniuk and Wakin, FOCM, 2007]
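A small numerical illustration of this embedding (not a proof): points on a simple K=1 manifold of translating pulses are projected by a random matrix with roughly orthonormal rows, and the pairwise distances, rescaled by sqrt(N/M), stay close to the originals. The manifold and sizes are toy choices.

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
N, M, Q = 1024, 40, 200                     # ambient dim, projected dim, sample points
t = np.arange(N)

# K=1 manifold: a Gaussian pulse translating across the signal
X = np.stack([np.exp(-0.5 * ((t - c) / 10.0) ** 2)
              for c in np.linspace(100, 900, Q)])

# random projection; rows of Phi are approximately orthonormal
Phi = rng.standard_normal((M, N)) / np.sqrt(N)
Y = X @ Phi.T

d_full = pdist(X)                           # pairwise distances on the manifold
d_proj = pdist(Y) * np.sqrt(N / M)          # rescaled distances after projection
ratio = d_proj / d_full
print(f"distance ratios stay within [{ratio.min():.2f}, {ratio.max():.2f}]")
```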
Application: Compressive Detection/Classification via Smashed Filtering
Information Scalability
• Many applications involve signal inference and not reconstruction
  detection < classification < estimation < reconstruction
• Good news: CS supports efficient learning, inference, and processing directly on compressive measurements
• Random projections ~ sufficient statistics for signals with concise geometrical structure
• Leverages the stable embedding of smooth manifolds
Matched Filter
• Detection/classification with K unknown articulation parameters
  – Ex: position and pose of a vehicle in an image
  – Ex: time delay of a radar signal return
• Matched filter: joint parameter estimation and detection/classification
  – compute a sufficient statistic for each potential target and articulation
  – compare the "best" statistics to detect/classify
Matched Filter Geometry
• Detection/classification with K unknown articulation parameters
• Images are points in R^N
• Classify by finding the closest target template to the data for each class (AWGN)
  – distance or inner product
  – target templates come from a generative model or from training data (points)
• As the template articulation parameters change, the points map out a K-dim nonlinear manifold
• MLE matched filter classification = closest manifold search
[figure: data point and per-class target-template manifolds, indexed by the articulation parameter space]
Recall: CS for Manifolds
• Recall the Theorem: random measurements preserve manifold structure
• Enables parameter estimation and MF detection/classification directly on compressive measurements
  – K very small in many applications
  – no data sparsity required!
Example: Matched Filter
• Detection/classification with K=3 unknown articulation parameters
  1. horizontal translation
  2. vertical translation
  3. rotation
Smashed Filter
• Detection/classification with K=3 unknown articulation parameters (manifold structure)
• Dimensionally reduced matched filter directly on compressive measurements
Smashed Filter
• Random shift and rotation (K=3 dim. manifold)
• Noise added to measurements
• Goal: identify the most likely position for each image class, and the most likely class using a nearest-neighbor test (see the sketch below)
[figure: avg. shift estimate error and classification rate (%) vs. number of measurements M, for increasing noise levels]
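A toy smashed-filter sketch in numpy: two hypothetical "classes" of 1-D pulse templates, each articulated by an unknown shift, are compared to noisy compressive measurements with a nearest-neighbor test in the measurement domain. The K=1 articulation, signal model, and noise level are stand-ins for the image shift/rotation experiment in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 512, 30
t = np.arange(N)

def template(shift, width):                 # hypothetical signal model for one class
    return np.exp(-0.5 * ((t - shift) / width) ** 2)

shifts = np.linspace(50.0, 450.0, 200)      # sampled articulation (K=1: shift only)
classes = {"narrow": 8.0, "wide": 20.0}
Phi = rng.standard_normal((M, N)) / np.sqrt(M)

# project every template of every class into the measurement domain
proj = {c: np.stack([Phi @ template(s, w) for s in shifts])
        for c, w in classes.items()}

# noisy compressive measurements of an unknown "wide" target at shift 300
y = Phi @ template(300.0, 20.0) + 0.05 * rng.standard_normal(M)

# smashed filter: nearest template (per class) directly in the measurement domain
dists = {c: np.linalg.norm(P - y, axis=1) for c, P in proj.items()}
label = min(dists, key=lambda c: dists[c].min())
shift_hat = shifts[np.argmin(dists[label])]
print(f"class = {label}, estimated shift = {shift_hat:.1f}")
```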
Application: Scalable Data Fusion
Multisensor Inference
• Example: network of J cameras observing an articulating object
• Each camera's images lie on a K-dim manifold in R^N
• How to efficiently fuse imagery from J cameras to solve an inference problem while minimizing network communication?
Multisensor Fusion
• Fusion: stack the corresponding image vectors taken at the same time
• Fused images still lie on a K-dim manifold in R^{JN}: the "joint manifold"
Multisensor Fusion via JM+CS
• Can take random CS measurements of the stacked images and process or make inferences
  [figure: performance compared with unfused CS and with unfused, non-CS processing]
• Can compute the CS measurements in-place
  – ex: as we transmit to the collection/processing point (see the sketch below)
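A sketch of that in-place computation, assuming the joint measurement matrix is split into per-camera column blocks: each camera adds its own contribution Phi_j x_j to the running length-M sum as the packet travels toward the fusion center, so only M numbers are ever transmitted. Image sizes are downscaled from the talk's 320x240 to keep the example light.

```python
import numpy as np

rng = np.random.default_rng(0)
J, N, M = 3, 64 * 48, 200                     # cameras, pixels per camera (downscaled), measurements

images = [rng.random(N) for _ in range(J)]    # stand-ins for the J camera images

# per-camera column blocks of the joint matrix Phi = [Phi_1 Phi_2 ... Phi_J]
Phi_blocks = [rng.standard_normal((M, N)) / np.sqrt(M) for _ in range(J)]

# in-place accumulation: each camera adds its contribution and forwards the sum
y = np.zeros(M)
for Phi_j, x_j in zip(Phi_blocks, images):
    y = y + Phi_j @ x_j                       # only this length-M vector moves through the network

# identical to measuring the stacked ("joint manifold") vector directly
y_joint = np.hstack(Phi_blocks) @ np.concatenate(images)
print("in-place sum matches direct joint measurement:", np.allclose(y, y_joint))
```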
Simulation Results
• J=3 CS cameras, each N=320x240 resolution
• M=200 random measurements per camera
• Two classes
  1. truck w/ cargo
  2. truck w/ no cargo
• Goal: classify a test image
• Smashed filtering
  – independent
  – majority vote
  – joint manifold
[figure: example images from class 1 and class 2; classification performance of the three approaches]
References
• M. Davenport, C. Hegde, M. Duarte, and R. G. Baraniuk, "Joint Manifolds for Data Ensembles," preprint, 2008.
• R. G. Baraniuk and M. B. Wakin, "Random Projections of Smooth Manifolds," Foundations of Computational Mathematics, December 2007.
• C. Hegde, M. Wakin, and R. G. Baraniuk, "Random Projections for Manifold Learning," Neural Information Processing Systems (NIPS), Vancouver, December 2007.
• M. A. Davenport, M. F. Duarte, M. B. Wakin, J. A. Laska, D. Takhar, K. F. Kelly, and R. G. Baraniuk, "The Smashed Filter for Compressive Classification and Target Recognition," IS&T/SPIE Computational Imaging IV, San Jose, January 2007.
Conclusions
• Why CS works: stable embedding for signals with concise geometric structure
• Sparse signals >> model-sparse signals
• Compressible signals >> model-compressible signals
  – upshot: fewer measurements, more stable recovery
  – new concept: RAmP
• Smooth manifolds
  – smashed filter
   many fewer measurements may be required to detect/classify/estimate than to reconstruct
   leverages the inference problem's articulation structure, not sparsity
dsp.rice.edu/cs
Connexions (cnx.org)
• non-profit open publishing project
• goal: make high-quality educational content available to anyone, anywhere, anytime, for free on the web and at very low cost in print
• open-licensed repository of Lego-block modules for authors, instructors, and learners to create, rip, mix, and burn
• global reach: >1M users monthly from 200 countries
• collaborators: IEEE (IEEEcnx.org), Govt. of Vietnam, TI, NI, …
Application: Compressive Manifold Learning
Manifold Learning
• Given training points in R^N, learn the mapping to the underlying K-dimensional articulation manifold
• ISOMAP, LLE, HLLE, …
• Ex: images of a rotating teapot; articulation space = circle
Compressive Manifold Learning
• The ISOMAP algorithm is based on geodesic distances between points
• Random measurements preserve these distances
• Theorem [Hegde et al, NIPS '08]: if M is large enough for a stable manifold embedding, then the ISOMAP residual variance in the projected domain is bounded by a small additive error factor (see the sketch below)
[figure: ISOMAP embeddings of the translating disk manifold (K=2) from the full data (N=4096) and from M = 100, 50, and 25 random measurements]
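A hedged numerical sketch of this, assuming scikit-learn and scipy are available: ISOMAP is run on the full data and on random projections of decreasing dimension M, and a residual variance (1 minus the squared correlation between graph geodesic distances and embedded distances) is reported. The translating-pulse manifold below is a 1-D stand-in for the talk's translating disk images.

```python
import numpy as np
from scipy.spatial.distance import pdist
from sklearn.manifold import Isomap          # assumes scikit-learn is installed

rng = np.random.default_rng(0)
N, Q = 4096, 300
t = np.arange(N)
X = np.stack([np.exp(-0.5 * ((t - c) / 40.0) ** 2)
              for c in np.linspace(200, 3800, Q)])     # K=1 toy manifold

def isomap_residual(data, n_components=1, n_neighbors=10):
    iso = Isomap(n_neighbors=n_neighbors, n_components=n_components)
    emb = iso.fit_transform(data)
    geo = iso.dist_matrix_[np.triu_indices(len(data), k=1)]   # graph geodesic distances
    return 1.0 - np.corrcoef(geo, pdist(emb))[0, 1] ** 2      # residual variance

print("full data:", isomap_residual(X))
for M in (100, 50, 25):
    Phi = rng.standard_normal((M, N)) / np.sqrt(M)
    print(f"M = {M}:", isomap_residual(X @ Phi.T))
```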
Compressive Manifold Learning
• Number of measurements required for stable manifold learning
  – N = ambient dimension of data points
  – M = dimension of randomly projected data points
  – Q = number of data points
  – K = dimension of manifold
• Our approach: M scales with the manifold dimension K (and only logarithmically with N), independent of the number of data points Q
• Johnson-Lindenstrauss approach: M = O(log Q), growing with the number of data points
[figure: translating disk manifold (K=2); embeddings from the full data (N=4096) and from M = 100, 50, and 25 measurements]
Joint Manifolds
• Given J submanifolds of R^N
  – each K-dimensional
  – homeomorphic (we can continuously map between any pair)
• Define the joint manifold as the concatenation of the component manifolds
• Example: joint articulation
Joint Manifolds: Properties
• The joint manifold inherits properties from its component manifolds
  – compactness
  – smoothness
  – volume
  – condition number (1/τ)
• These translate into algorithm performance gains
• Bounds are often loose in practice
Manifold Learning via Joint Manifolds
• Goal: learn the embedding of a 2D translating ellipse (with noise)
  – N = 45x45 = 2025 pixels
  – J = 20 views at different angles
[figure: embeddings learned separately for each view vs. the embedding learned jointly]
Manifold Learning via JM+CS
• Goal: learn the embedding via random compressive measurements
  – N = 45x45 = 2025 pixels
  – J = 20 views
  – M = 100 measurements per view
[figure: embeddings learned separately for each view vs. the embedding learned jointly from the compressive measurements]
Stable Manifold Embedding
Sketch of proof:
– construct a sampling of points
   on the manifold at fine resolution
   from local tangent spaces
– apply the JL lemma to these points (concentration of measure)
– extend to the entire manifold
Implication: nonadaptive (even random) linear projections can efficiently capture and preserve the structure of a manifold
See also: Indyk and Naor; Agarwal et al.; Dasgupta and Freund
Recovery of Compressible Signals
• If Φ has the RIP, then both sparse and compressible signals are stably recoverable
[figure: coefficient magnitude vs. sorted index]