ΗΥ590.71 Topics in Discrete Optimization


Discrete Optimization in Computer Vision
Nikos Komodakis
Ecole des Ponts ParisTech, LIGM
Traitement de l'information et vision artificielle
Message-passing algorithms for energy minimization
Message-passing algorithms

- Central concept: messages
- These methods work by propagating messages across the MRF graph
- Widely used algorithms in many areas
Message-passing algorithms

- But how do messages relate to optimizing the energy?
- Let's look at a simple example first: we will examine the case where the MRF graph is a chain
Message-passing on chains
(figure: MRF chain graph and the corresponding lattice, or trellis)
Message-passing on chains

- Global minimum in linear time
- Optimization proceeds in two passes:
  - Forward pass (dynamic programming)
  - Backward pass

Message-passing on chains
(example on board; algebraic derivation of messages)
Message-passing on chains
(figure: chain graph with nodes p, q, r, s)

Forward pass (dynamic programming)

Message from node p to node q, for each label j, using the potentials θ_p(x_p) and θ_pq(x_p, x_q):

M_pq(j) = min_i [ θ_p(i) + θ_pq(i, j) ]
Forward pass (dynamic programming)
 2 .5 
  1
M
 p , xq )
pqpq( x
 0.1
 
1.5 
 p (xp ) 
p
q
r
j
M pq ( j )  min  p (i )   pq (i, j ) 
i
s
Forward pass (dynamic programming)

The next message, from q to r, reuses M_pq:

M_qr = (0.5, 2, 1.2, 2.0)ᵀ

M_qr(k) = min_j [ θ_q(j) + M_pq(j) + θ_qr(j, k) ]
Forward pass (dynamic programming)

The final message, from r to s:

M_rs = (1.0, 4.0, 2.0, 1.0)ᵀ

Min-marginal for node s and label j:

min_{x : x_s = j} E(x) = θ_s(j) + M_rs(j)
Backward pass

Starting from s, backtrack the optimal labels x_p, x_q, x_r, x_s using the stored messages:

x_s = argmin_j [ θ_s(j) + M_rs(j) ]
x_r = argmin_j [ θ_r(j) + M_qr(j) + θ_rs(j, x_s) ]
x_q = argmin_j [ θ_q(j) + M_pq(j) + θ_qr(j, x_r) ]
x_p = argmin_i [ θ_p(i) + θ_pq(i, x_q) ]
Message-passing on chains

- How can I compute min-marginals for any node in the chain?
- How to compute min-marginals for all nodes efficiently?
- What is the running time of message-passing on chains?
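The two passes above can be made concrete with a small sketch. The following is a minimal min-sum implementation on a chain; the function name `chain_min_sum`, the NumPy array layout, and the tiny example numbers are illustrative assumptions, not from the slides:

```python
import numpy as np

def chain_min_sum(unary, pairwise):
    """Two-pass min-sum on a chain (a sketch; names and layout assumed).
    unary[p] is the cost vector of node p; pairwise[p] is the cost matrix
    of edge (p, p+1), indexed [label of p, label of p+1]."""
    n, L = len(unary), len(unary[0])
    # Forward pass (dynamic programming): M[p][j] = cheapest way to reach
    # label j at node p from the left, excluding node p's own unary cost.
    M = [np.zeros(L) for _ in range(n)]
    for p in range(1, n):
        M[p] = (unary[p-1][:, None] + M[p-1][:, None] + pairwise[p-1]).min(axis=0)
    # Min-marginals of the last node: unary cost + incoming message.
    last = unary[-1] + M[-1]
    x = [0] * n
    x[-1] = int(np.argmin(last))
    # Backward pass: backtrack the minimizing label at each node.
    for p in range(n - 2, -1, -1):
        x[p] = int(np.argmin(unary[p] + M[p] + pairwise[p][:, x[p+1]]))
    return x, float(last.min())

# A tiny 3-node, 2-label example (numbers chosen arbitrarily):
unary = [np.array([0., 2.]), np.array([1., 0.]), np.array([2., 0.])]
potts = np.array([[0., 1.], [1., 0.]])
labels, energy = chain_min_sum(unary, [potts, potts])
```

Each forward step touches L² label pairs per edge, which is where the linear-in-nodes running time of the chain algorithm comes from.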
Message-passing on trees

- We can apply the same idea to tree-structured graphs
- Slight generalization from chains
- Resulting algorithm is called: belief propagation (also known under many other names, e.g., max-product, min-sum, etc.; for chains, it is also often called the Viterbi algorithm)
Belief propagation (BP)

BP on a tree [Pearl '88]
(figure: tree with leaves p and q, and root r)

- Dynamic programming: global minimum in linear time
- BP:
  - Inward pass (dynamic programming)
  - Outward pass
- Gives min-marginals
Inward pass (dynamic programming)

M_pq(j) = min_i [ θ_p(i) + θ_pq(i, j) ]

(figure: messages sent from the leaves toward the root r)
Inward pass (dynamic programming)

Example message from a leaf p to q:

M_pq = (2.5, 1, 0.1, 1.5)ᵀ

M_pq(j) = min_i [ θ_p(i) + θ_pq(i, j) ]
Inward pass (dynamic programming)

M_pq = (0.5, 2, 1.2, 2.0)ᵀ

M_qr(k) = min_j [ θ_q(j) + M_pq(j) + θ_qr(j, k) ]
Inward pass (dynamic programming)
(figure: remaining messages propagated inward, toward the root r)
Outward pass
(figure: messages propagated back outward, from the root r toward the leaves)
BP on a tree: min-marginals

Min-marginal for node q and label j:

min_{x : x_q = j} E(x) = θ_q(j) + M_pq(j) + M_rq(j)
Belief propagation: message-passing on trees

- min-marginals = sum of all incoming messages + unary potential
- What is the running time of message-passing for trees?
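As a sketch of the "sum of all incoming messages + unary potential" rule on the simplest tree, a chain: running messages in both directions gives the min-marginals of every node at once. The helper name `chain_min_marginals` and the example data are hypothetical:

```python
import numpy as np

def chain_min_marginals(unary, pairwise):
    """Min-marginals mu_p(j) = min over all x with x_p = j of E(x), for
    every node of a chain, via one forward and one backward sweep (a sketch)."""
    n, L = len(unary), len(unary[0])
    F = [np.zeros(L) for _ in range(n)]   # messages arriving from the left
    B = [np.zeros(L) for _ in range(n)]   # messages arriving from the right
    for p in range(1, n):
        F[p] = (unary[p-1][:, None] + F[p-1][:, None] + pairwise[p-1]).min(axis=0)
    for p in range(n - 2, -1, -1):
        B[p] = (unary[p+1][None, :] + B[p+1][None, :] + pairwise[p]).min(axis=1)
    # min-marginal = unary potential + sum of all incoming messages
    return [unary[p] + F[p] + B[p] for p in range(n)]

unary = [np.array([0., 2.]), np.array([1., 0.]), np.array([2., 0.])]
potts = np.array([[0., 1.], [1., 0.]])
mus = chain_min_marginals(unary, [potts, potts])
```

Every min-marginal vector attains the same minimum value, the global minimum of the energy, which is one way to sanity-check an implementation.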
Message-passing on chains

- Essentially, message-passing on chains is dynamic programming
- Dynamic programming means reuse of computations
Generalizing belief propagation

- Key property: min(a + b, a + c) = a + min(b, c)
- BP can be generalized to any pair of operators satisfying the above property
- E.g., instead of (min, +), we could have:
  - (max, ×): the resulting algorithm is called max-product. What does it compute?
  - (+, ×): the resulting algorithm is called sum-product. What does it compute?
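One way to see the generalization is to leave the two operators as parameters of the same chain recursion. In the hypothetical sketch below, passing (np.min, np.add) gives min-sum, while (np.sum, np.multiply) gives sum-product, which, run on exp(−θ) potentials, accumulates the partition function:

```python
import numpy as np

def chain_messages(unary, pairwise, reduce_, combine):
    """Generic chain recursion: combine the running message with the pairwise
    term, then reduce over the previous node's labels. Valid for any pair
    obeying reduce(combine(a, b), combine(a, c)) = combine(a, reduce(b, c)),
    e.g. (min, +), (max, *), (+, *). Hypothetical helper, not from the slides."""
    M = unary[0]
    for p in range(1, len(unary)):
        M = combine(reduce_(combine(M[:, None], pairwise[p-1]), axis=0), unary[p])
    return M

unary = [np.array([0., 2.]), np.array([1., 0.]), np.array([2., 0.])]
potts = np.array([[0., 1.], [1., 0.]])

# (min, +): reducing the final message gives the minimum energy.
best = chain_messages(unary, [potts, potts], np.min, np.add).min()
# (+, *) on exp(-theta) potentials: reducing gives the partition function.
Z = chain_messages([np.exp(-u) for u in unary], [np.exp(-potts)] * 2,
                   np.sum, np.multiply).sum()
```

Swapping only the two operators, not the message schedule, is exactly what the distributive property buys.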
Belief propagation as a distributive algorithm

- BP works distributively (as a result, it can be parallelized)
- Essentially, BP is a decentralized algorithm
- Global results through local exchange of information
- Simple example to illustrate this: counting soldiers
Counting soldiers in a line
(from David MacKay's book "Information Theory, Inference, and Learning Algorithms")

- Can you think of a distributive algorithm for the commander to count their soldiers?
Counting soldiers in a line
Counting soldiers in a tree

- Can we do the same for this case?
Counting soldiers in a tree
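A sketch of the counting protocol on a tree: every soldier's answer to the neighbor who asked is one (themselves) plus the answers gathered from all other neighbors. The adjacency-dict representation and the function name `count_soldiers` are illustrative choices:

```python
def count_soldiers(neighbors, me, came_from=None):
    # Each soldier reports 1 (themselves) plus the counts received from
    # every neighbor except the one who asked.
    return 1 + sum(count_soldiers(neighbors, nb, me)
                   for nb in neighbors[me] if nb != came_from)

# A small tree of 5 soldiers, as an adjacency dict (illustrative data):
tree = {0: [1, 2], 1: [0, 3, 4], 2: [0], 3: [1], 4: [1]}
total = count_soldiers(tree, 0)   # any soldier can start the count
```

No soldier ever sees the whole tree; the global count emerges from purely local exchanges, which is the distributive point of the example.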
Counting soldiers

- Simple example to illustrate BP
- The same idea can be used in cases which are seemingly more complex:
  - counting paths through a point in a grid
  - probability of passing through a node in the grid
- In general, we have used the same idea for minimizing MRFs (a much more general problem)
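For the grid-path variant mentioned above, the same two-pass idea applies: the number of paths through a cell factors into (paths from the start to it) × (paths from it to the goal), each computable by a local recurrence. A minimal sketch, assuming monotone right/down paths on a square grid (the setup and names are illustrative):

```python
def grid_paths(rows, cols):
    # P[r][c] = number of monotone (right/down) paths from (0, 0) to (r, c):
    # each count is the sum of the counts arriving from above and from the left.
    P = [[1] * cols for _ in range(rows)]
    for r in range(1, rows):
        for c in range(1, cols):
            P[r][c] = P[r-1][c] + P[r][c-1]
    return P

rows, cols = 4, 4
fwd = grid_paths(rows, cols)
# Paths through cell (r, c) = paths reaching it * paths leaving it; by
# symmetry the second factor is the same table read from the opposite corner.
through = fwd[1][2] * fwd[rows-1-1][cols-1-2]   # paths through cell (1, 2)
```

Dividing `through` by the total count `fwd[rows-1][cols-1]` gives the fraction of paths passing through that cell, matching the "probability of passing through a node" item above.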
Graphs with loops

- How about counting these soldiers?
- Hmmm… overcounting?