Summary: A Taxonomy and Evaluation of Dense Two


Summary:
A Taxonomy and Evaluation of
Dense Two-Frame Stereo
Correspondence Algorithms
Matthew Wilhelm
CS5331 Mobile Robotics
Goal / Motivation
• Provide a means of quantitatively gauging progress in the field of stereo correspondence, as well as of judging the value of new approaches
• Novel publications will have to improve in some way on the performance of existing algorithms
• Provide an update on the state of the art of the field

Background / Theory
• All vision algorithms make assumptions about the physical world and the camera
• Stereo algorithms commonly make the following assumptions:

◦ Lambertian surfaces – appearance does not vary with viewpoint
◦ Piecewise-smooth surfaces
◦ Known camera calibration and epipolar geometry
Disparity
• Disparity is the difference in location between matching pixels in the two images
• Disparity ≈ inverse depth
• Various computation methods
• Displayed as a disparity map: close items are brighter and far-away items are darker
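The "disparity ≈ inverse depth" relation can be made concrete with a small sketch. It assumes a rectified camera pair, for which depth is Z = f · B / d (focal length f in pixels, baseline B in metres, disparity d in pixels). The function names and the gray-level mapping are illustrative assumptions, not taken from the paper.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in metres for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px


def disparity_to_gray(disparity_px, max_disparity_px):
    """Map a disparity to an 8-bit gray level (assumed display convention):
    larger disparity, i.e. a closer object, gives a brighter pixel."""
    clipped = max(0.0, min(float(disparity_px), float(max_disparity_px)))
    return round(255 * clipped / max_disparity_px)


# Doubling the disparity halves the depth.
near = depth_from_disparity(20, focal_length_px=500, baseline_m=0.1)
far = depth_from_disparity(10, focal_length_px=500, baseline_m=0.1)
```

Here `near` comes out at half of `far`, and `disparity_to_gray` assigns the nearer point the brighter value.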

Taxonomy


• A classification system for items based on their relationship to one another
• Allows dissection and comparison of individual algorithm components and design decisions:

◦ Matching Cost Computation
◦ Cost Aggregation
◦ Disparity Computation / Optimization
◦ Disparity Refinement

• Existing algorithms are built from various implementations of the four steps above
Matching Cost Computation
• Forms the initial disparity space image (DSI)
• Many methods, including:

◦ Squared Intensity Differences
◦ Absolute Intensity Differences
◦ Truncated Quadratics
◦ Contaminated Gaussians
◦ Normalized Cross-Correlation
◦ Binary Features
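As a rough illustration of the first few costs listed above, the sketch below builds one scanline of a DSI from two rows of intensities. The layout `dsi[d][x]`, the convention that pixel x in the left row matches pixel x - d in the right row, and all function names are assumptions made for this example.

```python
def squared_difference(left, right):
    """Squared intensity difference between two pixel values."""
    return (left - right) ** 2


def absolute_difference(left, right):
    """Absolute intensity difference between two pixel values."""
    return abs(left - right)


def truncated(cost, limit):
    """Clamp a cost so that large mismatches (e.g. outliers) are bounded."""
    return min(cost, limit)


def build_dsi_row(left_row, right_row, max_disparity, cost=absolute_difference):
    """Return dsi[d][x]: cost of matching left_row[x] with right_row[x - d].
    Positions that fall outside the right row get an infinite cost."""
    width = len(left_row)
    dsi = []
    for d in range(max_disparity + 1):
        row = []
        for x in range(width):
            if x - d < 0:
                row.append(float("inf"))
            else:
                row.append(cost(left_row[x], right_row[x - d]))
        dsi.append(row)
    return dsi


# The right row is the left row shifted by one pixel, so d = 1 matches exactly.
left_row = [10, 10, 50, 90, 10]
right_row = [10, 50, 90, 10, 10]
dsi = build_dsi_row(left_row, right_row, max_disparity=2)
```

In this toy case every valid entry of `dsi[1]` is zero, while `dsi[0]` contains large costs around the intensity edges.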
Cost Aggregation
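• Aggregate (sum or average) matching costs over a support region in the disparity space image, so that neighboring pixels likely to lie on the same surface support each other
• Again, many different methods, including:

◦ Square Windows
◦ Gaussian Convolution
◦ Shiftable Windows
◦ Adaptive-Size Windows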
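A minimal sketch of square-window aggregation, assuming it is applied to each fixed-disparity slice of the DSI separately. The window is averaged in two 1D passes (rows, then columns); the border handling (shrinking the window at the image edge) is an assumption of this example.

```python
def box_filter_1d(values, radius):
    """Moving average over up to 2*radius + 1 samples; the window shrinks at borders."""
    n = len(values)
    out = []
    for i in range(n):
        window = values[max(0, i - radius):min(n, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def box_filter_2d(image, radius):
    """Separable square-window average: filter every row, then every column."""
    rows = [box_filter_1d(row, radius) for row in image]
    cols = [box_filter_1d(list(col), radius) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


# One costly pixel is spread over its 3x3 neighborhood.
cost_slice = [[0, 0, 0],
              [0, 9, 0],
              [0, 0, 0]]
aggregated = box_filter_2d(cost_slice, radius=1)
```

The centre value becomes 9 / 9 = 1.0, and a corner, whose window covers only four pixels, becomes 9 / 4 = 2.25.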
Disparity Computation / Optimization

• Local Methods – the majority of the work is done in the previous two steps
◦ For optimization, simply choose at each pixel the disparity with the minimum cost value (winner-take-all)
◦ Uniqueness of matches is enforced for only one image (the reference image)

• Global Methods – the majority of the work is done in this stage
◦ Energy Minimization – continuation, simulated annealing, highest confidence first, and mean-field annealing
◦ Max-flow and graph-cut methods for a special class of energy functions

• Dynamic Programming – finds the minimum-cost path through the matrix of pairwise matching costs between two corresponding scanlines; its parameters (such as the occlusion cost) need tuning
• Cooperative Algorithms – model human stereo vision
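The local (winner-take-all) rule described above can be sketched in a few lines. The DSI layout `dsi[d][y][x]` and the tie-breaking rule (the smallest disparity wins) are assumptions of this example.

```python
def winner_take_all(dsi):
    """For every pixel, pick the disparity with the minimum (aggregated) cost.
    dsi[d][y][x] is the cost of disparity d at pixel (x, y)."""
    num_disparities = len(dsi)
    height = len(dsi[0])
    width = len(dsi[0][0])
    disparity_map = []
    for y in range(height):
        row = []
        for x in range(width):
            # min() returns the first minimum, so ties go to the smallest d.
            best = min(range(num_disparities), key=lambda d: dsi[d][y][x])
            row.append(best)
        disparity_map.append(row)
    return disparity_map


# Three disparity levels, one scanline of two pixels.
dsi = [[[5, 1]],   # costs at d = 0
       [[2, 3]],   # costs at d = 1
       [[4, 0]]]   # costs at d = 2
disparity_map = winner_take_all(dsi)
```

For the first pixel the costs are 5, 2, 4, so d = 1 wins; for the second they are 1, 3, 0, so d = 2 wins.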

Disparity Refinement
• Sub-pixel disparity estimates are used when rendering images, for more appealing view-synthesis results
• Mismatches are cleaned up via various methods
• Usually skipped in fast implementations, such as robot navigation or tracking
Implementation
• Closely tied to the taxonomy given above
• The authors developed a modular and portable C++ implementation of several stereo algorithms
• Post-processing steps that improve results are not implemented, so that methods can be compared directly
• Easily extendable to include other algorithms

Implementation Details

• Matching Cost Computation
◦ Squared or absolute difference in color
◦ Sub-pixel interpolation

• Aggregation
◦ Box filter: separable moving-average filter
◦ Binomial filter: separable finite impulse response (FIR) filter

• Optimization
◦ Winner-take-all, dynamic programming, scanline optimization, simulated annealing, and graph cut

• Refinement
◦ Three aggregated matching cost values around the winning disparity are examined to compute the sub-pixel disparity estimate
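One common way to turn three cost values into a sub-pixel estimate is to fit a parabola through them and take its minimum. The sketch below assumes that approach; the slides do not say which fit the authors' code uses, and the function name and fallback rules are illustrative.

```python
def subpixel_disparity(costs, d):
    """Refine integer disparity d using costs[d - 1], costs[d], costs[d + 1].
    A parabola through the three points has its minimum at
    d + (c_minus - c_plus) / (2 * (c_minus - 2 * c_zero + c_plus))."""
    if d <= 0 or d >= len(costs) - 1:
        return float(d)  # no neighbor on one side: keep the integer value
    c_minus, c_zero, c_plus = costs[d - 1], costs[d], costs[d + 1]
    curvature = c_minus - 2 * c_zero + c_plus
    if curvature <= 0:
        return float(d)  # flat or concave fit: no well-defined minimum
    return d + (c_minus - c_plus) / (2 * curvature)


# Costs sampled from (x - 2.25)**2 at x = 0..4; the true minimum is at 2.25.
costs = [5.0625, 1.5625, 0.0625, 0.5625, 3.0625]
refined = subpixel_disparity(costs, d=2)
```

Because the sampled costs are exactly quadratic here, the fit recovers 2.25; on real data the result is only an estimate.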
Evaluation
• Allows for quantitative evaluation of stereo algorithms
• Provides a test bed for new and existing algorithms, along with test data and results on the Web at http://vision.middlebury.edu/stereo/
• Allows for testing of individual components as divided in the taxonomy

Quality Metrics
• RMS error – root-mean-squared difference between the computed disparity map and the ground truth map
• Percentage of bad matching pixels – the share of pixels whose disparity error exceeds a given tolerance
• Both are computed over the whole image as well as over three kinds of region that usually cause problems:

◦ Textureless regions – average intensity gradient is too low
◦ Occluded regions – mapped disparity lands at a location covered by a closer object
◦ Depth discontinuity regions – neighboring disparities differ by too much
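Both metrics are simple to compute. The sketch below works on flattened lists of disparities; the optional `mask` argument (used here to restrict the statistic to one region type) and the default tolerance of 1.0 are assumptions of this example.

```python
import math


def _select(computed, truth, mask):
    """Pair up computed and ground-truth disparities, keeping masked-in pixels only."""
    if mask is None:
        mask = [True] * len(computed)
    return [(c, t) for c, t, keep in zip(computed, truth, mask) if keep]


def rms_error(computed, truth, mask=None):
    """Root-mean-squared disparity error over the selected pixels."""
    pairs = _select(computed, truth, mask)
    return math.sqrt(sum((c - t) ** 2 for c, t in pairs) / len(pairs))


def bad_pixel_percentage(computed, truth, tolerance=1.0, mask=None):
    """Percentage of selected pixels whose absolute error exceeds the tolerance."""
    pairs = _select(computed, truth, mask)
    bad = sum(1 for c, t in pairs if abs(c - t) > tolerance)
    return 100.0 * bad / len(pairs)


computed = [1, 2, 3, 7]
truth = [1, 2, 5, 4]
whole_image_bad = bad_pixel_percentage(computed, truth)
# Hypothetical region mask, e.g. "textureless" pixels only.
region_rms = rms_error(computed, truth, mask=[True, True, True, False])
```

Two of the four pixels are off by more than one disparity level, so `whole_image_bad` is 50.0; passing a mask gives the per-region variants.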
Experiments
• The authors perform several experiments to compare various algorithm components, again as divided in the taxonomy
• Focus on common problem areas for stereo algorithms

Experiments / Results

• Matching Costs
◦ Experiment 1: ran many tests with different matching cost truncation values; found that values of 5-20 give good results
◦ Experiment 2: ran the same tests as above, but used a 9x9 min filter before truncation; found that no truncation performed best
◦ Experiment 3: tested the effects of matching cost and truncation on global algorithms; found that some truncation helped, and suggested the use of SNR-based parameter setting
Experiments / Results

• Aggregation
◦ Experiment 4: analyze the effects of various aggregation techniques on local methods
    • Large amounts of aggregation are necessary in textureless regions
    • Shiftable windows perform best
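One way to see why shiftable windows help near depth edges: a shiftable window can be implemented as a box filter followed by a min filter of the same size, so each pixel takes the cheapest of all windows that contain it. The 1D sketch below assumes that construction; the names and border handling are illustrative.

```python
def box_filter_1d(values, radius):
    """Moving average over up to 2*radius + 1 samples; the window shrinks at borders."""
    n = len(values)
    return [
        sum(values[max(0, i - radius):min(n, i + radius + 1)])
        / len(values[max(0, i - radius):min(n, i + radius + 1)])
        for i in range(n)
    ]


def min_filter_1d(values, radius):
    """Sliding minimum over the same neighborhood."""
    n = len(values)
    return [min(values[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]


def shiftable_window_1d(costs, radius):
    """Best (lowest-cost) window among all windows that contain each pixel."""
    return min_filter_1d(box_filter_1d(costs, radius), radius)


# Costs at the correct disparity of the left surface; an edge lies after index 2.
costs = [0, 0, 0, 9, 9, 9]
centered = box_filter_1d(costs, radius=1)
shifted = shiftable_window_1d(costs, radius=1)
```

With a centered window, the pixel at index 2 picks up cost from across the edge (3.0); with the shiftable window it keeps a cost of 0.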
Experiments / Results

• Disparity Computation / Optimization
◦ Experiment 5: analyze the smoothness parameter
    • Found that the optimal smoothness parameter varies greatly from one image pair to another
    • Future work includes techniques for computing this parameter
◦ Experiment 6: focus on graph-cut optimization
    • Birchfield-Tomasi’s method and a gradient-based smoothness cost improve the performance of graph-cut algorithms
    • However, choosing the right parameters for the threshold and penalty is difficult and image-specific
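To see why the smoothness parameter matters, recall that global methods minimize an energy of the form E(d) = E_data(d) + λ · E_smooth(d). The sketch below evaluates this energy for one scanline; the absolute-difference smoothness term, the `dsi_row[d][x]` layout, and the name `scanline_energy` are assumptions of this example.

```python
def scanline_energy(dsi_row, disparities, smoothness_weight):
    """E(d) = sum_x C(x, d_x) + lambda * sum_x |d_x - d_(x+1)| for one scanline.
    dsi_row[d][x] is the matching cost of disparity d at pixel x."""
    data = sum(dsi_row[d][x] for x, d in enumerate(disparities))
    smooth = sum(abs(a - b) for a, b in zip(disparities, disparities[1:]))
    return data + smoothness_weight * smooth


dsi_row = [[0, 5, 5],   # costs at d = 0
           [5, 0, 0]]   # costs at d = 1

stepped = [0, 1, 1]     # follows the data exactly, with one disparity jump
flat = [0, 0, 0]        # perfectly smooth, but pays data cost at two pixels

low_lambda = (scanline_energy(dsi_row, stepped, 2), scanline_energy(dsi_row, flat, 2))
high_lambda = (scanline_energy(dsi_row, stepped, 20), scanline_energy(dsi_row, flat, 20))
```

With a weight of 2 the stepped solution has the lower energy (2 versus 10); with a weight of 20 the flat one wins (20 versus 10). The preferred solution therefore depends on λ.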
Experiments / Results

• Sub-Pixel Estimation
◦ Experiment 7: refine disparity maps via sub-pixel interpolation
    • As expected, an unrefined disparity map contains staircase errors, whereas the refined map is considerably better
    • Again, this step is often skipped in fast implementations
Conclusion



• The authors provide a comparison of 20 stereo algorithms, all of which are available in detail on the website
• Found that most algorithms perform about the same in so-called easy areas; the differences arise in known problematic areas
• One evaluation that I thought would have been helpful is a runtime comparison; however, the authors were not concerned with this
Questions?

• Can you clarify what is being referenced in Figure 1 (f) regarding the disparity levels as a slice?
◦ A slice simply means that the DSI is 3D and they are keeping one of the three variables constant to produce a 2D image
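A small sketch of what "slicing" a 3D DSI means, assuming the volume is stored as nested lists indexed `dsi[d][y][x]` (an illustrative layout, not necessarily the one in the paper's code):

```python
def slice_at_disparity(dsi, d):
    """Hold d constant: a 2D (y, x) image of costs at one disparity level."""
    return [row[:] for row in dsi[d]]


def slice_at_scanline(dsi, y):
    """Hold y constant: a 2D (d, x) image of costs along one scanline."""
    return [dsi[d][y][:] for d in range(len(dsi))]


# Two disparity levels, two rows, two columns.
dsi = [[[1, 2], [3, 4]],    # d = 0
       [[5, 6], [7, 8]]]    # d = 1
xy_slice = slice_at_disparity(dsi, 1)
xd_slice = slice_at_scanline(dsi, 0)
```

Either call returns an ordinary 2D image that can be displayed like any other.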

• Can you find references to using illumination alongside the stereo depth analysis to further define the depth of objects?
◦ I have searched some and did not see any papers; however, this does not mean that it is impossible.
◦ It would probably be very helpful to have an illumination estimate prior to stereo evaluation

• And of course, can you simplify the differences between each algorithm?
◦ I think I have done a brief simplification; to go into more detail I would have to read each of the 132 referenced papers
Questions??

• How were the stereo algorithms chosen in the paper?
◦ The paper focuses on Dense Two-Frame Stereo Correspondence Algorithms
◦ Common algorithms which needed to be compared were chosen to be implemented; however, the framework allows novel algorithms to be implemented as well

• What is a stereo algorithm?
◦ A stereo algorithm uses images from two cameras at different viewpoints, similar to human vision (two eyes), to estimate depth
Questions???

• On page 2, section 2.2, it was indicated that an unvalued disparity map is produced as output. What is “unvalued disparity map”?
◦ It says univalued
◦ I think this means that there is a single value for the disparity at each pixel

• On page 11 under the evaluation section they discuss that they use three different regions to check the algorithm over (textureless, occluded, and depth discontinuity). How did they come up with these?
◦ These are common problem areas for several different stereo algorithms
Questions????

• Why did they downsample the images for testing (page 13)?
◦ To normalize the motion of background objects to a few pixels per frame, to allow better results when matching and to truly compare the quality of the various algorithms

• Why do they only evaluate bad_pixels_nonocc, bad_pixels_textureless, and bad_pixels_discont?
◦ They also evaluate the whole image, but they evaluate these areas separately as well to get an idea of how different algorithms perform in these known problem areas
Additional Resources




• References throughout the paper provide resources for various algorithms
• Hartley and Zisserman: Multiple View Geometry in Computer Vision.
• Middlebury website, an excellent source of papers and code related to stereo vision algorithms.
• Sebastian Thrun, Wolfram Burgard and Dieter Fox: Probabilistic Robotics, MIT Press, 2005.