PowerPoint Presentation - National Tsing Hua University

Feature-Based Stereo Matching
Using Graph Cuts
Gorkem Saygili, Laurens van der Maaten,
Emile A. Hendriks
ASCI Conference 2011
Overview
• Introduction
• Implementation
– Color Segmentation and Key Point Extraction
– Local Pixel Matching
– Plane Parameters Decision
– Plane Assignment
• Experimental
Introduction (1/2)
• Stereo matching is one of the key topics in 3D
computer vision. The main goal is to find an
estimate of the depth information.
• Stereo algorithms can be divided into two classes:
local and global approaches.
– Local: based on aggregation windows that take
intensity differences into account.
– Global: make explicit smoothness assumptions on
the image and minimize the total energy of the
disparity map, based on data and smoothness costs.
Introduction (2/2)
• Popular global methods use graph cuts [9,12] or
belief propagation [8,10-11] to minimize the
energy function.
• Most of the reliable and robust stereo matching
algorithms rely on over-segmentation of the
image [7-10,15].
Disparity Estimation
Color Segmentation and
SURF Key Point Extraction (1/2)
• Disparities at key points are hard to estimate.
• Recent state-of-the-art disparity estimation
algorithms do not incorporate key point
disparities into the disparity estimation.
• Such non-trivial disparities are easy to
estimate by matching key points between
the stereo pair.
Color Segmentation and
SURF Key Point Extraction (2/2)
• We use a restricted search space to avoid
overestimating disparities, which reduces
noise.
– $S$ : the set of salient points in the images.
– $y_s^L$, $y_s^R$ : the vertical positions of those points in the left and right images.
• Mean-shift color segmentation is applied to over-segment the image into homogeneous
regions in which we do not expect
disparity discontinuities.
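The restricted search for key point matches described above can be sketched as a simple filter. This is a minimal sketch, not the authors' implementation: it assumes rectified images, so that matching points lie on nearly the same scanline, and it keeps a match only if the vertical offset between $y_s^L$ and $y_s^R$ is within a tolerance and the horizontal offset falls inside an allowed disparity range. The function name, the tuple layout, and the `y_tol` parameter are illustrative assumptions.

```python
from typing import List, Tuple

# Each match is ((x_left, y_left), (x_right, y_right)); this layout is an assumption.
Match = Tuple[Tuple[float, float], Tuple[float, float]]


def filter_matches(matches: List[Match], d_min: float, d_max: float,
                   y_tol: float = 1.0) -> List[Tuple[float, float, float]]:
    """Keep key point matches that are plausible for a rectified stereo pair.

    Returns (x_left, y_left, disparity) for every surviving match.
    """
    kept = []
    for (xl, yl), (xr, yr) in matches:
        disparity = xl - xr                      # left-minus-right convention
        if abs(yl - yr) > y_tol:                 # not on the same scanline
            continue
        if not (d_min <= disparity <= d_max):    # outside the allowed range
            continue
        kept.append((xl, yl, disparity))
    return kept
```

Matches with a large vertical offset or an implausible disparity are discarded before they can influence the per-segment search.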
Local Pixel Matching (1/4)
• Then, we can efficiently obtain a rough estimate of the
disparity of each segment with a bounded
disparity search:
– $T$ : the set of segments
– $d(x, y)$ : the disparity of the pixel at $(x, y)$
– $d_{max}$, $d_{min}$ : maximum and minimum possible
disparities for the image pair
– $d_{t,low}$, $d_{t,high}$ : lower and upper boundaries of the
disparity range to search for segment $t$
– $\alpha$ : scaling coefficient between 0 and 1
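The exact formula for the bounds is not reproduced in this transcript, so the following is only one plausible reading, offered as a sketch. It assumes the range for a segment is the span of the key point disparities found inside it, widened by $\alpha$ times the global range and clipped to $[d_{min}, d_{max}]$. The function name and the fallback for segments without key points are assumptions.

```python
def segment_disparity_range(keypoint_disparities, d_min, d_max, alpha):
    """Return (d_low, d_high) for one segment.

    Assumed rule (not taken from the slides): widen the span of the key point
    disparities by alpha * (d_max - d_min) and clip to the global range.
    A segment without key points falls back to the full range.
    """
    if not keypoint_disparities:
        return d_min, d_max
    margin = alpha * (d_max - d_min)
    d_low = max(d_min, min(keypoint_disparities) - margin)
    d_high = min(d_max, max(keypoint_disparities) + margin)
    return d_low, d_high
```

Under this reading, a smaller $\alpha$ gives a tighter and faster search, at the risk of excluding the true disparity.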
Local Pixel Matching (2/4)
• The most common choices for the cost function are
the sum of squared differences (SSD) and the sum of
absolute differences (SAD).
• Here, SAD is chosen.
– $d$ : the disparity values subject to Eq. 2.
– $R_{xy}$ : the aggregation window constructed around the
pixel at $(x, y)$.
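A SAD cost over a square aggregation window, followed by a search over a bounded disparity range, can be sketched as below. This is a minimal illustration, assuming grayscale images stored as 2-D NumPy arrays and the convention that pixel $(x, y)$ in the left image matches $(x - d, y)$ in the right image. The function names and the `radius` parameter are illustrative.

```python
import numpy as np


def sad_cost(left, right, x, y, d, radius=1):
    """Sum of absolute differences between the window around (x, y) in the
    left image and the window around (x - d, y) in the right image."""
    h, w = left.shape
    y0, y1 = y - radius, y + radius + 1
    xl0, xl1 = x - radius, x + radius + 1
    xr0, xr1 = xl0 - d, xl1 - d
    # Windows that leave either image are treated as invalid.
    if y0 < 0 or y1 > h or xl0 < 0 or xl1 > w or xr0 < 0 or xr1 > w:
        return np.inf
    win_l = left[y0:y1, xl0:xl1].astype(np.float64)
    win_r = right[y0:y1, xr0:xr1].astype(np.float64)
    return float(np.abs(win_l - win_r).sum())


def best_disparity(left, right, x, y, d_low, d_high, radius=1):
    """Return the integer disparity in [d_low, d_high] with the lowest SAD."""
    costs = {d: sad_cost(left, right, x, y, d, radius)
             for d in range(d_low, d_high + 1)}
    return min(costs, key=costs.get)
```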
Local Pixel Matching (3/4)
• Because the aggregation window
should not contain disparity discontinuities,
we use the following adaptive box-matching
approach:
– Take a box around the pixel of interest.
– Aggregate the matching cost only over the
pixels inside the box that lie in the same region as the pixel of
interest, as represented in Eq. 7.
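The two steps above can be sketched as a SAD that skips pixels belonging to a different segment. This is an illustrative sketch rather than the slide's Eq. 7: it assumes a label map `segments` for the left image (for example, from mean-shift segmentation) and grayscale images as 2-D NumPy arrays.

```python
import numpy as np


def segment_aware_sad(left, right, segments, x, y, d, radius=2):
    """SAD over a box around (x, y), counting only the pixels that carry the
    same segment label as (x, y) in the left image."""
    h, w = left.shape
    label = segments[y, x]
    cost = 0.0
    for yy in range(max(0, y - radius), min(h, y + radius + 1)):
        for xx in range(max(0, x - radius), min(w, x + radius + 1)):
            if segments[yy, xx] != label:
                continue              # pixel belongs to another segment
            xr = xx - d
            if xr < 0 or xr >= w:
                continue              # no counterpart in the right image
            cost += abs(float(left[yy, xx]) - float(right[yy, xr]))
    return cost
```

Near a segment boundary, this keeps pixels from the neighboring segment from contributing to the cost.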
Local Pixel Matching (4/4)
• Matching is done for both the left and the
right image, and the disparity is assigned
by winner-takes-all optimization.
• To find non-occluded matches, a cross-check is performed.
• To alleviate the foreground fattening effect
described in [6], the minimum of the left
and right disparities is used as the initial
estimate:
[6] M. Gerrits and P. Bekaert. Local Stereo Matching with Segmentation-Based Outlier
Rejection. 3rd Canadian Conference on Computer and Robot Vision, June 2006.
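The cross-check and the minimum rule can be sketched as follows. This is a sketch under stated assumptions, not the slide's equation: disparity maps hold integers, a left pixel $(x, y)$ with disparity $d$ corresponds to the right pixel $(x - d, y)$, and a pixel counts as occluded when the two disparities differ by more than `tol`. The `tol` parameter is hypothetical.

```python
import numpy as np


def cross_check(disp_left, disp_right, tol=1):
    """Return (initial_estimate, occluded_mask) for the left view.

    initial_estimate is the per-pixel minimum of the left disparity and the
    disparity of the corresponding right pixel.
    """
    h, w = disp_left.shape
    xs = np.tile(np.arange(w), (h, 1))
    ys = np.tile(np.arange(h)[:, None], (1, w))
    xr = xs - disp_left.astype(np.intp)        # matching column in the right image
    inside = (xr >= 0) & (xr < w)
    xr_safe = np.clip(xr, 0, w - 1)
    d_right = disp_right[ys, xr_safe]
    consistent = inside & (np.abs(disp_left - d_right) <= tol)
    initial = np.minimum(disp_left, d_right)
    return initial, ~consistent
```

Pixels flagged in the returned mask are the ones excluded from the later plane fitting.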
Determining Plane Parameters
• There are three main approaches to estimate
the disparity plane parameters:
– (1) a RANSAC solution
– (2) a histogram-based solution
– (3) a Least-squares solution
• Here, we choose the RANSAC algorithm that
only considers non-occluded disparities as
input.
RANdom SAmple Consensus (RANSAC) (1/2)
• A random subset of the data is selected as hypothetical inliers, and a model is fitted to them.
• All other data are then tested against the fitted
model; a point that fits the estimated
model well is also considered a hypothetical inlier.
• The estimated model is reasonably good if
sufficiently many points have been classified as
hypothetical inliers.
• The model is re-estimated from all hypothetical
inliers, because it has only been estimated from
the initial set of hypothetical inliers.
• Finally, the model is evaluated by estimating the
error of the inliers relative to the model.
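The steps above can be sketched for disparity plane fitting. This is a minimal sketch, not the authors' code: it assumes the common plane model $d = ax + by + c$ and takes the non-occluded pixel coordinates and disparities of one segment as input. The parameter names and default values (`n_iters`, `inlier_tol`, `min_inliers`) are illustrative.

```python
import numpy as np


def plane_residuals(params, x, y, d):
    """Absolute error of each point against the plane d = a*x + b*y + c."""
    return np.abs(params[0] * x + params[1] * y + params[2] - d)


def fit_plane_lstsq(x, y, d):
    """Least-squares plane through the given points."""
    A = np.column_stack([x, y, np.ones_like(x, dtype=float)])
    params, *_ = np.linalg.lstsq(A, d, rcond=None)
    return params


def ransac_plane(x, y, d, n_iters=200, inlier_tol=1.0, min_inliers=10, seed=0):
    """Return plane parameters (a, b, c), or None if no model had enough inliers."""
    rng = np.random.default_rng(seed)
    x, y, d = (np.asarray(v, dtype=float) for v in (x, y, d))
    best_params, best_error = None, np.inf
    for _ in range(n_iters):
        sample = rng.choice(len(d), size=3, replace=False)  # hypothetical inliers
        A = np.column_stack([x[sample], y[sample], np.ones(3)])
        if abs(np.linalg.det(A)) < 1e-9:
            continue                                        # collinear sample
        params = np.linalg.solve(A, d[sample])
        inliers = plane_residuals(params, x, y, d) <= inlier_tol
        if inliers.sum() < min_inliers:
            continue                                        # too little support
        # Re-estimate from all hypothetical inliers, then score by their error.
        params = fit_plane_lstsq(x[inliers], y[inliers], d[inliers])
        error = plane_residuals(params, x[inliers], y[inliers], d[inliers]).mean()
        if error < best_error:
            best_params, best_error = params, error
    return best_params
```

Many practical variants rank models by inlier count rather than by inlier error. The version here follows the evaluation step as worded above.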
RANdom SAmple Consensus (RANSAC) (2/2)
• RANSAC is very robust to outliers: the
algorithm can work effectively even when only
50 percent of the data are inliers [10].
• Wang and Zheng have shown that RANSAC provides
even better solutions than the histogram-based approach;
however, the result depends mostly on the initial set of
the algorithm [7].
• The third approach (least squares) is sensitive to outliers.
• Hong and Chen used a least-squares solution with only
the non-occluded pixel disparities inside the segment [9].
[7] Z. Wang and Z. Zheng. A Region Based Stereo Matching Algorithm Using Cooperative
Optimization. CVPR, 2008.
[9] L. Hong and G. Chen. Segment-Based Stereo Matching Using Graph-Cuts. Proc. CVPR, vol. 1,
pp. 74-81, 2004.
[10] Q. Yang, L. Wang, R. Yang, H. Stewenius and D. Nister. Stereo Matching with Color-Weighted Correlation, Hierarchical Belief Propagation, and Occlusion Handling. IEEE Trans. on
Pattern Analysis and Machine Intelligence, vol. 3, pp. 492-504, 2009.
Disparity Plane Assignment Using
Graph Cuts (1/3)
• Finally, we assign a disparity plane to each
image segment by minimizing an energy
function.
• The energy minimization problem is solved
using a graph cut approach in which each node
corresponds to a segment.
• Our aim is to find a labelling f that assigns
each segment t ∈ T to its plane label p ∈ P by
minimizing the following energy function:
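The equation itself did not survive in this transcript. In the usual segment-based graph cut formulation, which the data and smoothness terms on the next two slides suggest, the energy is a sum of those two terms. The names $E_{\text{data}}$ and $E_{\text{smooth}}$ below are assumed labels, not notation confirmed by the slides:

```latex
E(f) = E_{\text{data}}(f) + E_{\text{smooth}}(f)
```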
Disparity Plane Assignment Using
Graph Cuts (2/3)
Data term: the cost of assigning plane labels to the segments.
• We propose the following modified data cost
(MDC):
– $O_t$ : the set of occluded pixels in $t$
– $n$ : the number of non-occluded pixels that have the same
initial disparity as the disparity after plane fitting
– $m$ : the number of non-occluded pixels inside the segment
– $\lambda$ : the scaling coefficient
– $d_{f(t)}$ : the disparity of the pixel $(x, y)$ after fitting a plane
with label $f(t)$
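The MDC formula is not reproduced in this transcript, so the sketch below only shows how its ingredients could be computed. `support_ratio` counts $n$ and $m$ as defined above, treating disparities as "the same" when they differ by at most `tol` (an assumption, since plane-fitted disparities are real-valued). `illustrative_data_cost` then combines them with an exponential down-weighting chosen purely for illustration; it should not be read as the authors' MDC.

```python
import numpy as np


def support_ratio(initial_disp, plane_disp, segment_mask, occluded, tol=0.5):
    """Return (n, m) for one segment.

    m: non-occluded pixels inside the segment.
    n: those pixels whose initial disparity agrees with the plane-fitted one.
    """
    valid = segment_mask & ~occluded
    agrees = np.abs(initial_disp - plane_disp) <= tol
    return int((valid & agrees).sum()), int(valid.sum())


def illustrative_data_cost(matching_cost, initial_disp, plane_disp,
                           segment_mask, occluded, lam=1.0):
    """Hypothetical combination, NOT the slide's MDC: the summed matching cost
    of the non-occluded pixels, scaled by exp(-lam * n / m)."""
    n, m = support_ratio(initial_disp, plane_disp, segment_mask, occluded)
    if m == 0:
        return 0.0
    valid = segment_mask & ~occluded
    return float(matching_cost[valid].sum() * np.exp(-lam * n / m))
```

With this weighting, a plane supported by more of the initial disparities (larger $n/m$) gets a lower cost.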
Disparity Plane Assignment Using
Graph Cuts (3/3)
Smoothness term: penalizes discontinuities in the plane labels
of neighboring segments.
– $N(t)$ : the set of neighbors of $t$.
– $w$, $\sigma$ : scaling parameters ($w = 25$, $\sigma = 150$)
– $\beta$, $\tau$ : the boundary length and the mean colour
difference between $t$ and $q$.
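The smoothness formula is also missing from this transcript. The sketch below shows one plausible shape for such a term, under loudly stated assumptions: neighboring segments with different plane labels pay a penalty that grows with their shared boundary length $\beta$ and decays with their mean colour difference $\tau$. The weighting `w * beta * exp(-tau / sigma)` is an assumption, not the slide's equation. Only the values $w = 25$ and $\sigma = 150$ come from the slide.

```python
import math


def smoothness_energy(labels, neighbor_pairs, boundary_length, color_diff,
                      w=25.0, sigma=150.0):
    """Sum a penalty over neighboring segment pairs with different labels.

    labels: dict mapping segment id -> plane label.
    neighbor_pairs: iterable of (t, q), each unordered pair listed once.
    boundary_length, color_diff: dicts keyed by the same (t, q) pairs.
    """
    energy = 0.0
    for t, q in neighbor_pairs:
        if labels[t] == labels[q]:
            continue                                  # same plane: no penalty
        beta = boundary_length[(t, q)]
        tau = color_diff[(t, q)]
        energy += w * beta * math.exp(-tau / sigma)   # assumed weighting
    return energy
```

Under this assumed form, label changes are cheap across strong colour edges and expensive between similarly coloured segments that share a long boundary.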
Experimental (1/3)
• We performed experiments on the image datasets
provided by [2].
• A disparity value is defined to be erroneous if the
absolute difference from ground truth is larger
than 1.
• As is common practice in the evaluation of stereo
algorithms, we report results for:
– (1) non-occluded pixels only (nonocc)
– (2) all pixels (all)
– (3) pixels in image regions that are close to a disparity
discontinuity (disc).
[2] D. Scharstein and R. Szeliski. Middlebury Stereo Vision Page.
http://vision.edu/stereo/eval.
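The error criterion above can be sketched as a bad-pixel percentage. This is a minimal illustration, assuming disparity maps as NumPy arrays and an optional boolean `mask` that selects the evaluated region (nonocc, all, or disc). The function name is illustrative.

```python
import numpy as np


def bad_pixel_rate(disp, ground_truth, mask=None, threshold=1.0):
    """Percentage of evaluated pixels whose absolute disparity error
    is larger than the threshold."""
    disp = np.asarray(disp, dtype=float)
    ground_truth = np.asarray(ground_truth, dtype=float)
    if mask is None:
        mask = np.ones(disp.shape, dtype=bool)
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        return float("nan")              # nothing to evaluate
    bad = np.abs(disp - ground_truth) > threshold
    return float(100.0 * bad[mask].mean())
```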
Experimental (2/3)
Experimental (3/3)