Efficient Inference for Fully-Connected CRFs with Stationarity
Yimeng Zhang, Tsuhan Chen
CVPR 2012
Summary
• Explore object-class segmentation with fully-connected CRF models
• Only restriction on the pairwise terms is 'spatial stationarity' (i.e. they depend only on relative pixel locations)
• Show how efficient inference can be achieved by
– Using a QP formulation
– Using FFT to calculate gradients in O(N log N) time
Fully-connected CRF model
• General pairwise CRF model:
  P(X \mid I) \;=\; \frac{1}{Z(I)} \exp\!\Big( -\sum_{i \in V} \psi_u(x_i) \;-\; \sum_{i \in V} \sum_{j \in N_i} \psi_p(x_i, x_j) \Big)
• Image I; class labeling X = \{x_i\}, with each x_i taking values in the label set L
• V = set of pixels, N_i = neighbourhood of pixel i
• Z(I) = partition function, \psi = potential functions
Fully-connected CRF model
• General pairwise CRF model:
• In a fully-connected CRF, N_i = V for all i
Unary Potential
• Unary potential generates a score for each
object class per pixel (TextonBoost)
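– A common choice (assumed here; the slide does not show the exact score): take the negative log of the TextonBoost per-pixel class distribution,
  \psi_u(x_i) \;=\; -\log P_{\text{TextonBoost}}(x_i \mid I)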
Pairwise Potential
• Pairwise potential measures compatibility of
the labels at each pair of pixels
• Combines spatial and colour contrast factors
Pairwise Potential
• Colour contrast term
• Spatial (stationary) term (a plausible form for both is sketched below)
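– A plausible form consistent with these two factors (the parameter \beta is illustrative, not taken from the paper):
  \psi_p(x_i, x_j) \;=\; \underbrace{g(I_i, I_j)}_{\text{colour contrast}} \; \underbrace{f_{x_i x_j}(p_i - p_j)}_{\text{spatial, stationary}},
  \qquad g(I_i, I_j) \;=\; \exp\!\big(-\beta\,\lVert I_i - I_j\rVert^2\big)
  where p_i is the position of pixel i, so the spatial factor depends only on the relative location p_i - p_j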
Pairwise Potential
• Learning the spatial term
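– One natural estimator consistent with stationarity (stated as an assumption, not necessarily the paper's exact recipe): count how often each label pair occurs at each relative offset d in the training data,
  \hat f_{l l'}(d) \;\propto\; \sum_{i,j \,:\, p_i - p_j = d} \mathbf{1}\big[x_i = l,\; x_j = l'\big]
  which yields one spatial filter per label pair, i.e. the K^2 filters referred to under "Update complexity"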
MAP inference using QP relaxation
• Introduce a binary indicator variable y_i(l) for each pixel i and label l
• MAP inference is expressed as a quadratic integer program and relaxed to give the QP (see below)
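– In the standard Ravikumar–Lafferty form (the indicator notation y_i(l) is illustrative):
  \min_{y}\; \sum_{i \in V}\sum_{l \in L} \psi_u(x_i{=}l)\, y_i(l) \;+\; \sum_{i,j \in V}\;\sum_{l,l' \in L} \psi_p(x_i{=}l,\, x_j{=}l')\, y_i(l)\, y_j(l')
  \quad\text{s.t.}\quad \sum_{l} y_i(l) = 1,\;\; y_i(l) \ge 0
  with y_i(l) \in \{0,1\} in the integer program and y_i(l) \in [0,1] after relaxation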
MAP inference using QP relaxation
• QP relaxation has been proved to be tight in all cases (Ravikumar & Lafferty, ICML 2006 [24])
• Moreover, it is convex whenever the matrix of edge weights is negative definite
• Additive bound for the non-convex case
• QP requires O(KN) variables; LP requires O(K^2 E) (E = number of edges)
MAP inference using QP relaxation
• Gradient of the QP objective (see below)
• Derive a fixed-point update by forming the Lagrangian and setting its derivative to 0
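– For the relaxed objective f(y) above, the gradient is
  \frac{\partial f}{\partial y_i(l)} \;=\; \psi_u(x_i{=}l) \;+\; \sum_{j \in V}\sum_{l' \in L} \big(\psi_p(l, l') + \psi_p(l', l)\big)\, y_j(l')
– Forming the Lagrangian with a multiplier \lambda_i for each constraint \sum_l y_i(l) = 1 and setting its derivative to zero forces \partial f / \partial y_i(l) to be constant over the labels carrying mass at pixel i; a multiplicative update of the form
  y_i(l) \;\leftarrow\; \frac{y_i(l)\, \partial f / \partial y_i(l)}{\sum_{l'} y_i(l')\, \partial f / \partial y_i(l')}
  is one standard fixed point of this kind (shown as a sketch: it keeps y_i on the simplex, but the paper's exact update may differ)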
Illustration of QP updates
Efficiently evaluating the gradient
• Required summation: the pairwise part of the gradient above
• Would be a convolution without the colour term
• With the colour term it requires 5D filtering
• Can be approximated by clustering into C colour clusters, => C convolutions across the image
Efficiently evaluating the gradient
• Hence, for the case x_i = x_j, we need to evaluate a colour-weighted sum at every pixel
• Instead, evaluate it for C colour clusters (C = 10 to 15), as sketched below
• Finally, interpolate the per-cluster results back to each pixel
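– Under the product form sketched earlier, the pairwise part of the gradient at pixel i is
  G_i(l) \;=\; \sum_{l'} \sum_{j} g(I_i, I_j)\, f_{l l'}(p_i - p_j)\, y_j(l')
  Replacing I_i by the representative colour \mu_c of its cluster turns this, per cluster c, into a pure spatial convolution,
  G_i^{(c)}(l) \;\approx\; \sum_{l'} \big(f_{l l'} * w_c^{l'}\big)(p_i), \qquad w_c^{l'}(j) = g(\mu_c, I_j)\, y_j(l')
  and the per-pixel value is recovered by interpolating the G^{(c)} of the clusters closest in colour to I_i (this construction is an assumption consistent with the slide, not necessarily the paper's exact definition)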
Update complexity
• FFTs of the spatial filters can be calculated in advance (K^2 filters)
• At each update, we require C FFTs, O(CN log N)
• K^2 convolutions are needed, each a pointwise multiplication in the Fourier domain, O(K^2 CN)
• Terms can be added in the Fourier domain, => only KC inverse FFTs needed, O(KCN log N) (this pipeline is sketched in code below)
• Run-time per iteration < 0.1 s for 213×320 pixels (with downsampling by a factor of 5)
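– A minimal code sketch of this FFT pipeline (not the authors' code: the names, array shapes, and the use of numpy's FFT routines are illustrative assumptions), computing the per-cluster pairwise gradients by precomputing filter FFTs, multiplying in the Fourier domain, and summing over l' before the inverse transforms:

  import numpy as np

  def pairwise_gradient(y, spatial_filters, colour_weights):
      """Approximate the pairwise part of the QP gradient with FFT convolutions.

      y               : (K, H, W)    current relaxed indicators y_j(l')
      spatial_filters : (K, K, H, W) stationary filters f_{l,l'}, zero-padded/centred
      colour_weights  : (C, H, W)    g(mu_c, I_j) for each colour cluster c
      returns         : (C, K, H, W) per-cluster gradients G^{(c)}_i(l); the caller
                        interpolates over clusters according to each pixel's colour.
      """
      K, H, W = y.shape
      # Precomputable once: FFTs of the K^2 spatial filters.
      F = np.fft.rfft2(spatial_filters, s=(H, W))           # (K, K, H, W//2 + 1)

      # Per update: C*K colour-weighted label maps w_c^{l'} and their FFTs.
      w = colour_weights[:, None] * y[None, :]              # (C, K, H, W)
      Wf = np.fft.rfft2(w, s=(H, W))                        # (C, K, H, W//2 + 1)

      # Pointwise multiply in the Fourier domain and sum over l' (index k),
      # so only C*K inverse FFTs are needed afterwards.
      Gf = np.einsum('lkhw,ckhw->clhw', F, Wf)              # (C, K, H, W//2 + 1)
      return np.fft.irfft2(Gf, s=(H, W))                    # (C, K, H, W)

– The sketch uses circular (FFT) convolution; the K^2 filter FFTs are reusable across updates, the Fourier-domain products cost O(K^2 CN), and only KC inverse FFTs are needed, in line with the counts above, while the forward FFTs of the weighted label maps add O(CKN log N) in this particular sketch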
MSRC synthetic experiment
• Unary terms randomized
• Spatial distributions set to ground-truth
MSRC synthetic experiment
• Running times
Sowerby synthetic experiment
MSRC full experiment
• Use TextonBoost unary potentials
• Compare with several other CRFs with same
unaries
– Grid only
– Grid + P^N (Kohli, CVPR 2008)
– Grid + P^N + Co-occurrence (Ladický, ECCV 2010)
– Fully-connected + Gaussian spatial (Krähenbühl,
NIPS 2011)
MSRC full experiment
• Qualitative comparison
MSRC full experiment
• Quantitative comparison
– Overall
– Per-class
– Timing: 2-8s per image