Dictionary-Learning for the Analysis Sparse Model *

Michael Elad
The Computer Science Department
The Technion – Israel Institute of Technology
Haifa 32000, Israel

* This work was supported by the European Commission FP7-FET program SMALL.

Joint work with Boaz Ophir, Mark Plumbley, and Nancy Bertin

Special Session on Processing and Recovery Using Analysis and Synthesis Sparse Models – September 1st, 2011
Part I – Background
The Synthesis and Analysis Models
The Synthesis Model – Basics

[Figure: x = Dα – the signal x (length d) equals the dictionary D (d×m) times the sparse vector α (length m).]

• The synthesis representation is expected to be sparse: $\|\alpha\|_0 = k \ll d$
• Adopting a Bayesian point of view (see the sketch below):
  • Draw the support T at random
  • Choose the non-zero coefficients randomly (e.g. iid Gaussians)
  • Multiply by D to get the synthesis signal
• Such synthesis signals belong to a Union-of-Subspaces (UoS):
  $x \in \bigcup_{|T| = k} \mathrm{span}\{D_T\}$, where $D_T \alpha_T = x$
• This union contains $\binom{m}{k}$ subspaces, each of dimension k.
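To make the generative view concrete, here is a minimal NumPy sketch of synthesis-signal generation (illustrative only; the particular dimensions d, m, k and the unit-norm atoms are assumptions consistent with the slide):

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, k = 10, 20, 3                        # signal dim, number of atoms, sparsity

D = rng.standard_normal((d, m))            # synthesis dictionary
D /= np.linalg.norm(D, axis=0)             # normalize the atoms

T = rng.choice(m, size=k, replace=False)   # draw the support at random
alpha_T = rng.standard_normal(k)           # iid Gaussian non-zero coefficients
x = D[:, T] @ alpha_T                      # x = D_T alpha_T, a point in span{D_T}
```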
The Analysis Model – Basics

[Figure: z = Ωx – the analysis dictionary Ω (p×d) applied to the signal x (length d) gives the representation z (length p).]

• The analysis representation z is expected to be sparse: $\|\Omega x\|_0 = \|z\|_0 = p - \ell$
• Co-sparsity: $\ell$ – the number of zeros in z.
• Co-support: $\Lambda$ – the rows that are orthogonal to x: $\Omega_\Lambda x = 0$
• If Ω is in general position*, then $0 \le \ell \le d$, and thus we cannot expect to get a truly sparse analysis representation – Is this a problem? No!

1. S. Nam, M.E. Davies, M. Elad, and R. Gribonval, "Co-sparse Analysis Modeling – Uniqueness and Algorithms", ICASSP, May 2011.
2. S. Nam, M.E. Davies, M. Elad, and R. Gribonval, "The Co-sparse Analysis Model and Algorithms", submitted to ACHA, June 2011.

* $\mathrm{spark}(\Omega^T) = d + 1$
A Bayesian View of These Models

• Analysis signals, just like synthesis ones, can be generated in a systematic way (see the sketch below):

              Synthesis Signals                  Analysis Signals
  Support:    Choose the support T (|T| = k)     Choose the co-support Λ (|Λ| = ℓ)
              at random                          at random
  Coef.:      Choose $\alpha_T$ at random        Choose a random vector v
  Generate:   Synthesize by: $D_T \alpha_T = x$  Orthogonalize v w.r.t. $\Omega_\Lambda$:
                                                 $x = (I - \Omega_\Lambda^\dagger \Omega_\Lambda) v$

• Bottom line: an analysis signal x satisfies: $\exists \Lambda$ s.t. $\Omega_\Lambda x = 0$
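A matching NumPy sketch of the analysis-signal generation step (again illustrative; pinv realizes the pseudo-inverse $\Omega_\Lambda^\dagger$, so x is the projection of v onto the null space of $\Omega_\Lambda$):

```python
import numpy as np

rng = np.random.default_rng(1)
d, p, ell = 10, 20, 8                          # signal dim, rows of Omega, co-sparsity

Omega = rng.standard_normal((p, d))            # analysis dictionary (rows are analyzers)
Omega /= np.linalg.norm(Omega, axis=1, keepdims=True)

Lam = rng.choice(p, size=ell, replace=False)   # draw the co-support at random
v = rng.standard_normal(d)                     # random vector
O_L = Omega[Lam]                               # the ell rows indexed by the co-support
x = v - np.linalg.pinv(O_L) @ (O_L @ v)        # x = (I - O_L^+ O_L) v, so O_L x = 0
```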
Union-of-Subspaces

• Analysis signals, just like synthesis ones, belong to a union of subspaces:

                                 Synthesis Signals        Analysis Signals
  Subspace dimension:            k                        d − ℓ
  How many subspaces:            $\binom{m}{k}$           $\binom{p}{\ell}$
  Who are those subspaces:       $\mathrm{span}\{D_T\}$   the null space of $\Omega_\Lambda$

• Example: p = m = 2d:
  • Synthesis: k = 1 (one atom) – there are 2d subspaces of dimensionality 1.
  • Analysis: ℓ = d − 1 leads to $\binom{2d}{d-1} \gg O(2^d)$ subspaces of dimensionality 1 (see the check below).
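A quick numeric check of this counting claim (illustrative, taking d = 10):

```python
from math import comb

d = 10
print(comb(2 * d, 1))      # synthesis, k = 1:       20 subspaces
print(comb(2 * d, d - 1))  # analysis, ell = d - 1:  167960 subspaces
print(2 ** d)              # for comparison, 2^d:    1024
```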
The Analysis Model – Summary

• The analysis and the synthesis models are similar, and yet very different.
• The two align for p = m = d: the non-redundant case.
• Just as with synthesis, we should work on:
  • Pursuit algorithms (of all kinds) – design
  • Pursuit algorithms (of all kinds) – theoretical study
  • Dictionary learning from example-signals
  • Applications …
• Our experience on the analysis model:
  • Theoretical study is harder.
  • Different applications should be considered.
Part II – Dictionaries
Analysis Dictionary-Learning by Sequential Minimal Eigenvalues

1. B. Ophir, M. Elad, N. Bertin and M.D. Plumbley, "Sequential Minimal Eigenvalues – An Approach to Analysis Dictionary Learning", EUSIPCO, August 2011.
2. R. Rubinstein and M. Elad, "A K-SVD Dictionary Learning Algorithm for the Co-sparse Analysis Model", to be submitted (very) soon to IEEE-TSP.
Analysis Dictionary Learning – The Signals

[Figure: ΩX = A – applying the analysis dictionary to the matrix of training signals yields their sparse representations.]

We are given a set of N contaminated (noisy) analysis signals, and our goal is to recover their analysis dictionary Ω:

$\{y_j = x_j + v_j\}_{j=1}^N, \quad \forall j \; \exists \Lambda_j \text{ s.t. } \Omega_{\Lambda_j} x_j = 0, \quad v_j \sim \mathcal{N}(0, \sigma^2 I)$
Let's Find a Single Row w from Ω

Observations:
1. Any given row is supposed to be orthogonal to ~ℓN/p signals.
2. If we knew S, the true set of examples that are orthogonal to a row w, then we could approximate w as the solver of
   $\min_{\|w\|_2 = 1} \sum_{j \in S} (w^T y_j)^2$
3. We shall seek w by iteratively solving the above for w and S alternately:
   • Set w to a random normalized vector.
   • Compute the inner products $|w^T Y|$.
   • Set S to be the set of cℓN/p examples with the smallest inner products.
   • Update w to be the eigenvector corresponding to the smallest eigenvalue of $Y_S Y_S^T$, and repeat.
Let's Find a Single Row w from Ω (cont.)

Note:
1. Having found a candidate row w, it is not necessarily a row from Ω.
2. For a vector to represent a feasible solution, we should require
   $\frac{1}{|S|} \sum_{j \in S} \left(w^T y_j\right)^2 \le \frac{\sigma^2}{d}$
3. Thus, if, after the convergence of this algorithm, this condition is not met, we discard this estimate (see the sketch below).
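A compact NumPy sketch of this alternating search for a single row (illustrative only; the constant c, the iteration count, and the exact form of the feasibility threshold are assumptions rather than values prescribed by the slides):

```python
import numpy as np

def find_row(Y, ell, p, c=1.0, n_iter=50, sigma=0.1):
    """Seek one candidate row w of Omega from the noisy examples Y (d x N)."""
    d, N = Y.shape
    s = int(c * ell * N / p)            # assumed size of the "orthogonal" set S
    w = np.random.randn(d)
    w /= np.linalg.norm(w)              # random normalized initialization
    for _ in range(n_iter):
        proj = np.abs(w @ Y)            # inner products |w^T y_j|
        S = np.argsort(proj)[:s]        # the s examples with the smallest products
        YS = Y[:, S]
        # the unit vector minimizing sum_{j in S} (w^T y_j)^2 is the eigenvector
        # of Y_S Y_S^T with the smallest eigenvalue (eigh sorts ascending)
        w = np.linalg.eigh(YS @ YS.T)[1][:, 0]
    # assumed feasibility test (note 2 above); discard the candidate if it fails
    feasible = np.mean((w @ Y[:, S]) ** 2) <= sigma**2 / d
    return w, S, feasible
```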
Finding All Rows in Ω

Observations:
1. The previous algorithm can produce feasible rows from Ω.
2. Repeating the same process may result in different rows, due to the different random initialization.
3. We can increase the chances of getting different rows by running the above procedure on a different (randomly selected) subset of the examples.
4. When a row is found several times, this is easily detected, and those repetitions can be pruned (see the sketch below).
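An illustrative outer loop built on the find_row sketch above (the subset fraction, the candidate budget, and the pruning tolerance are all assumptions; note that w and −w describe the same analyzer row):

```python
import numpy as np

def find_all_rows(Y, ell, p, n_candidates=300, subset_frac=0.5, tol=1e-2):
    """Collect distinct feasible rows via repeated randomly-initialized searches."""
    d, N = Y.shape
    rows = []
    for _ in range(n_candidates):
        idx = np.random.choice(N, int(subset_frac * N), replace=False)
        w, _, feasible = find_row(Y[:, idx], ell, p)   # from the previous sketch
        if not feasible:
            continue                                   # discard infeasible candidates
        # prune repetitions, treating w and -w as the same row
        if all(min(np.linalg.norm(w - r), np.linalg.norm(w + r)) > tol for r in rows):
            rows.append(w)
        if len(rows) == p:                             # all p rows recovered
            break
    return np.array(rows)
```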
Results – Synthetic Experiment

Experiment Details:
1. We generate a random analysis dictionary Ω of size p=20, d=10.
2. We generate 10,000 analysis signal examples of dimension d=10 with co-sparsity ℓ=8.
3. We consider learning Ω from noiseless and noisy (σ=0.1) signals.

Experiment Results:
1. In the noiseless case, all rows (p=20) were detected to within 1e-8 error (per coefficient). This required ~100 row-estimates.
2. In the noisy case, all rows were detected with an error of 1e-4, this time requiring ~300 row-estimates.
Synthetic Image Data

Experiment Details:
1. We generate a synthetic piecewise-constant image.
2. We consider 8×8 patches from this image as examples to train on. Thus, d=64.
3. We set p=256, and seek an analysis dictionary for these examples.
4. In our training, we stop when the error is σ=10 gray-values.
Synthetic Image Data (cont.)

Experiment Results:
1. We have to choose the co-sparsity ℓ to work with, and it can go beyond d.
2. We see that the results are "random" patterns that are expected to be orthogonal to piecewise-constant patches. They become more localized when the co-sparsity is increased.
True Image Data

Experiment Details:
1. We take the image "Peppers" to work on.
2. We consider 8×8 patches from this image as examples to train on. Thus, d=64.
3. We set p=256, and seek an analysis dictionary for these examples.
4. In our training, we stop when the error is σ=10 gray-values.
True Image Data (cont.)

Experiment Results:
1. A few of the found rows are shown here, and they are similar to the previous ones (co-sparsity ℓ=3d).
2. We also show the sorted projected values |ΩY| for the obtained dictionary (top) and a random one (bottom). White stands for zero – as can be seen, the trained dictionary leads to better sparsity.
Recall the Synthesis K-SVD – Aharon, Elad, & Bruckstein ('04)

$\min_{D, A} \sum_{j=1}^N \|D \alpha_j - y_j\|_2^2 \quad \text{s.t.} \quad \forall j = 1, 2, \ldots, N: \; \|\alpha_j\|_0 \le k$

• Initialize D, e.g. choose a subset of the examples.
• Sparse Coding: use OMP.
• Dictionary Update: column-by-column by SVD computation.
• Iterate between the two stages (see the sketch below).
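For reference, a minimal NumPy sketch of that alternation (illustrative only: it borrows scikit-learn's orthogonal_mp for the sparse-coding stage and omits standard refinements such as replacing unused atoms):

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

def ksvd(Y, m, k, n_iter=20):
    """Synthesis K-SVD sketch: learn D (d x m) so that Y ~ D A, columns of A k-sparse."""
    d, N = Y.shape
    D = Y[:, np.random.choice(N, m, replace=False)].astype(float)  # init from examples
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_iter):
        A = orthogonal_mp(D, Y, n_nonzero_coefs=k)    # sparse coding via OMP
        for i in range(m):                            # dictionary update, column by column
            used = np.flatnonzero(A[i])
            if used.size == 0:
                continue
            # residual without atom i, restricted to the examples that use it
            E = Y[:, used] - D @ A[:, used] + np.outer(D[:, i], A[i, used])
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, i] = U[:, 0]                         # best rank-1 fit (SVD)
            A[i, used] = s[0] * Vt[0]
    return D, A
```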
Analysis K-SVD – Rubinstein and Elad ('11)

Synthesis:
$\min_{D, A} \sum_{j=1}^N \|D \alpha_j - y_j\|_2^2 \quad \text{s.t.} \quad \forall j = 1, 2, \ldots, N: \; \|\alpha_j\|_0 \le k$

Analysis:
$\min_{\Omega, X} \sum_{j=1}^N \|x_j - y_j\|_2^2 \quad \text{s.t.} \quad \forall j = 1, 2, \ldots, N: \; \|\Omega x_j\|_0 \le p - \ell$

We adopt a similar approach to the K-SVD for approximating the minimization of the analysis goal, by iterating between the search for x_j and an update of the rows of Ω (see the sketch below).
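A rough sketch of that alternation, with loud caveats: this is not the authors' exact algorithm. The co-support is estimated here by simply thresholding |Ω y_j| (the paper uses a backward-greedy pursuit), and each row update mirrors the smallest-eigenvector step from Part II:

```python
import numpy as np

def analysis_ksvd(Y, p, ell, n_iter=20):
    """Simplified analysis-K-SVD-style alternation on noisy examples Y (d x N)."""
    d, N = Y.shape
    Omega = np.random.randn(p, d)
    Omega /= np.linalg.norm(Omega, axis=1, keepdims=True)
    for _ in range(n_iter):
        # --- pursuit stage: estimate each co-support, then project y_j onto it ---
        X = np.empty_like(Y)
        cosupports = []
        for j in range(N):
            Lam = np.argsort(np.abs(Omega @ Y[:, j]))[:ell]  # thresholded co-support
            O_L = Omega[Lam]
            X[:, j] = Y[:, j] - np.linalg.pinv(O_L) @ (O_L @ Y[:, j])
            cosupports.append(set(Lam))
        # --- dictionary update: refit each row on the signals it should annihilate ---
        for i in range(p):
            S = [j for j in range(N) if i in cosupports[j]]
            if len(S) < d:
                continue                                     # too few examples; skip
            G = X[:, S] @ X[:, S].T
            Omega[i] = np.linalg.eigh(G)[1][:, 0]            # smallest eigenvector
    return Omega
```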
Analysis Dictionary Learning – Results

Experiment: Piecewise-Constant Image
• We take 10,000 patches (+noise σ=5) to train on.
• Here is what we got:
[Figure: the original image, the patches used for training, the initial Ω, and the trained Ω (100 iterations).]
Analysis Dictionary Learning – Results

Experiment: The Image "House"
• We take 10,000 patches (+noise σ=10) to train on.
• Here is what we got:
[Figure: the original image, the patches used for training, the initial Ω, and the trained Ω (100 iterations).]
Part III – We Are Done
Summary and Conclusions
Today We Have Seen that …

Sparsity and redundancy are practiced mostly in the context of the synthesis model.

Is there any other way? Yes, the analysis model is a very appealing (and different) alternative, worth looking at. In the past few years there is a growing interest in better defining this model, suggesting pursuit methods, analyzing them, etc.

So, what to do? What about dictionary learning? We propose new algorithms for this task. The next step is applications that will benefit from this.

More on these (including the slides and the relevant papers) can be found at
http://www.cs.technion.ac.il/~elad