Summary:
A Taxonomy and Evaluation of
Dense Two-Frame Stereo
Correspondence Algorithms
Matthew Wilhelm
CS5331 Mobile Robotics
Goal / Motivation
Provide a means to quantitatively gauge
progress in the field of stereo
correspondence and to judge the
value of new approaches
Novel publications will have to improve in
some way on the performance of existing
algorithms
Provide an update on the state of the art
of the field
Background / Theory
All vision algorithms make assumptions
about physical world and camera
Stereo Algorithms commonly make the
following assumptions
◦ Lambertian surfaces – appearance does not
vary with viewpoint
◦ Piecewise-smooth surfaces
◦ Camera Calibration and geometry
Disparity
~ difference between location of matching
pixels
≈ inverse depth
Various computation methods
Displayed as a grayscale disparity map: close
items are brighter and far away items
are darker
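The "inverse depth" relationship can be made concrete for a rectified camera pair, where depth is focal length times baseline divided by disparity. The sketch below is only illustrative: the function name and the rig values (700-pixel focal length, 12 cm baseline) are assumptions, not taken from the paper.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in metres for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical rig: 700-pixel focal length, 12 cm baseline.
near = depth_from_disparity(20.0, 700.0, 0.12)  # large disparity, close object
far = depth_from_disparity(5.0, 700.0, 0.12)    # small disparity, distant object
```

Halving the disparity doubles the depth, which is why distant objects are hard to separate in a disparity map.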
Taxonomy
A classification system for items based on
their relationship to one another
Allows dissection and comparison of
individual algorithm components and design
decisions
◦ Matching Cost Computation
◦ Cost Aggregation
◦ Disparity Computation / Optimization
◦ Disparity Refinement
Existing algorithms are built from various
implementations of the four components above
Matching Cost Computation
Form initial Disparity Space Image
Many Methods including:
◦ Squared Intensity Differences
◦ Absolute Intensity Differences
◦ Truncated Quadratics
◦ Contaminated Gaussians
◦ Normalized Cross-Correlation
◦ Binary Features
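As a rough illustration of the first few cost functions listed above, the sketch below builds an initial disparity space image from absolute or squared intensity differences, with optional truncation. The function name, the `dsi[d][y][x]` layout, and the convention that left-image pixel (x, y) matches right-image pixel (x - d, y) are assumptions made for this example, not the authors' code.

```python
def matching_cost(left, right, max_disp, squared=False, truncate=None):
    """Build a cost volume dsi[d][y][x] for two grayscale images.

    left, right: lists of rows of intensities; left is the reference image.
    Pixel (x, y) in the left image is compared with (x - d, y) in the right.
    """
    height, width = len(left), len(left[0])
    # Cost used where the shifted pixel falls outside the right image.
    worst = truncate if truncate is not None else float("inf")
    dsi = []
    for d in range(max_disp + 1):
        plane = []
        for y in range(height):
            row = []
            for x in range(width):
                if x - d < 0:
                    row.append(worst)
                    continue
                diff = abs(left[y][x] - right[y][x - d])
                cost = diff * diff if squared else diff
                if truncate is not None:
                    cost = min(cost, truncate)  # limits the influence of outliers
                row.append(cost)
            plane.append(row)
        dsi.append(plane)
    return dsi


# One-row toy pair in which the left image is the right image shifted by 1 pixel.
toy_right = [[10, 20, 30, 40]]
toy_left = [[99, 10, 20, 30]]
toy_dsi = matching_cost(toy_left, toy_right, max_disp=2)
```

In the toy pair the cost at disparity 1 is zero wherever the shifted pixel exists, which is what the later optimization stage looks for.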
Cost Aggregation
Sum or average the matching costs over a
support region in the disparity space image
Again Many Different Methods Including:
◦ Square Windows
◦ Gaussian Convolution
◦ Shiftable Windows
◦ Adaptable Size Windows
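A minimal sketch of two of these aggregation schemes follows, assuming the cost volume is stored as `dsi[d][y][x]` (a layout chosen for this example). Square windows are written as a brute-force box average, and shiftable windows are approximated as a box filter followed by a min filter of the same size. The helper names are hypothetical, and a practical implementation would use separable filters rather than this brute-force loop.

```python
def _window_reduce(plane, radius, reduce_fn):
    """Apply reduce_fn to every (2*radius+1)^2 window, clipped at the borders."""
    height, width = len(plane), len(plane[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [plane[yy][xx]
                      for yy in range(max(0, y - radius), min(height, y + radius + 1))
                      for xx in range(max(0, x - radius), min(width, x + radius + 1))]
            row.append(reduce_fn(values))
        out.append(row)
    return out


def box_aggregate(dsi, radius):
    """Square-window aggregation: average the costs within each disparity plane."""
    return [_window_reduce(plane, radius, lambda v: sum(v) / len(v)) for plane in dsi]


def shiftable_aggregate(dsi, radius):
    """Shiftable windows, sketched as a box filter followed by a min filter."""
    boxed = box_aggregate(dsi, radius)
    return [_window_reduce(plane, radius, min) for plane in boxed]


toy_costs = [[[0, 3, 6]]]  # one disparity plane, one row
boxed = box_aggregate(toy_costs, radius=1)
shifted = shiftable_aggregate(toy_costs, radius=1)
```

The min filter lets each pixel borrow the best-scoring window among its neighbours, which is the idea behind letting the window shift off-centre near depth discontinuities.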
Disparity Computation / Optimization
Local Methods – majority of work done in previous
two steps
For optimization simply choose at each pixel the disparity with
the minimum cost value.
Uniqueness is only enforced on one image.
Global Methods – majority of work done in this stage
Energy Minimization – continuation, simulated annealing, highest
confidence first, and mean field annealing
Max-Flow and Graph-Cut for special cases
Dynamic Programming – finds the minimum-cost path through
the matrix of pairwise matching costs between two corresponding scanlines
Cooperative Algorithms – inspired by computational models of human stereo
vision
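The local-method optimization described above (choose the minimum-cost disparity at each pixel) is short enough to sketch directly. The `dsi[d][y][x]` layout and the function name are assumptions made for this example.

```python
def winner_take_all(dsi):
    """Pick, per pixel, the disparity index with the lowest aggregated cost."""
    height, width = len(dsi[0]), len(dsi[0][0])
    disparity = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            costs = [plane[y][x] for plane in dsi]
            # list.index returns the first minimum, so ties go to the smaller disparity.
            disparity[y][x] = costs.index(min(costs))
    return disparity


# Two disparity planes over a single row of two pixels.
toy_volume = [
    [[5, 1]],  # costs at disparity 0
    [[2, 4]],  # costs at disparity 1
]
toy_disparity = winner_take_all(toy_volume)
```

Each pixel of the reference image receives exactly one disparity, but nothing stops two reference pixels from selecting the same pixel in the other image, which is the one-sided uniqueness noted above.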
Disparity Refinement
Sub-pixel disparity estimates are used when
rendering images, to produce more visually
appealing results
Clean up mismatches via various methods
Not usually done in fast implementations
such as robot navigation or tracking
Implementation
Closely tied to Taxonomy given above
Authors developed a modular and portable
C++ implementation of several stereo
algorithms
Post processing steps to improve results
not implemented, in order to compare
methods directly.
Easily extendable to include other
algorithms
Implementation Details
Matching Cost Computation
Squared or absolute difference in color
Sub-pixel interpolation
Aggregation
Box Filter: separable moving average filter
Binomial Filter: separable finite impulse response filters
Optimization
Winner-take-all, dynamic programming, scanline
optimization, simulated annealing, and graph cut
Refinement
Three aggregated matching cost values around the
winning disparity are examined to compute the subpixel disparity estimate
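One common way to use three cost values around the winner is to fit a parabola through them and take its minimum. The slide does not say which curve is fitted, so the parabola here is an assumption, as are the function and parameter names.

```python
def subpixel_disparity(cost_prev, cost_best, cost_next, best_disparity):
    """Refine an integer disparity by fitting a parabola through three costs.

    cost_prev, cost_best, cost_next are the aggregated costs at disparities
    best_disparity - 1, best_disparity, and best_disparity + 1.
    """
    curvature = cost_prev - 2.0 * cost_best + cost_next
    if curvature <= 0:
        # Flat or non-convex neighbourhood: keep the integer estimate.
        return float(best_disparity)
    offset = 0.5 * (cost_prev - cost_next) / curvature
    return best_disparity + offset


# Costs sampled from (x - 0.25)^2 at x = -1, 0, 1 around integer disparity 5.
refined = subpixel_disparity(1.5625, 0.0625, 0.5625, 5)
```

When the winning cost is a true local minimum the offset stays within half a disparity level, so the refinement never jumps to a different integer disparity.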
Evaluation
Allows for quantitative evaluation of
stereo algorithms
Provides test bed for new and existing
algorithms along with test data and
results on the Web at
http://vision.middlebury.edu/stereo/
Allows for testing of individual
components as divided in taxonomy
Quality Metrics
RMS error – root-mean-squared value of
difference between the computed disparity map
and the ground truth map
Percentage of bad matching pixels – pixels whose disparity
error exceeds a given tolerance
Computed over whole image as well as three
areas which usually cause problems:
◦ Textureless regions – average intensity gradient is too
low
◦ Occluded regions – mapped disparity lands at a
location covered by a closer object
◦ Depth discontinuity regions – neighboring disparities
differ by too much
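Both quality metrics can be sketched in a few lines. The optional `mask` argument stands in for the region maps (textureless, occluded, depth discontinuity) and restricts the statistic to the pixels it marks. The function names, the mask representation, and the default tolerance of 1.0 are assumptions made for this example.

```python
import math


def _pairs(computed, truth, mask):
    """Yield (computed, ground-truth) disparities for the selected pixels."""
    for y, row in enumerate(computed):
        for x, d_computed in enumerate(row):
            if mask is None or mask[y][x]:
                yield d_computed, truth[y][x]


def rms_error(computed, truth, mask=None):
    """Root-mean-squared disparity error over the selected pixels."""
    squared = [(c - t) ** 2 for c, t in _pairs(computed, truth, mask)]
    return math.sqrt(sum(squared) / len(squared))


def bad_pixel_percentage(computed, truth, tolerance=1.0, mask=None):
    """Percentage of selected pixels whose disparity error exceeds the tolerance."""
    flags = [abs(c - t) > tolerance for c, t in _pairs(computed, truth, mask)]
    return 100.0 * sum(flags) / len(flags)


toy_computed = [[1, 2], [3, 8]]
toy_truth = [[1, 2], [3, 4]]
toy_rms = rms_error(toy_computed, toy_truth)
toy_bad = bad_pixel_percentage(toy_computed, toy_truth)
```

Both functions assume the mask selects at least one pixel; an empty region would need to be handled by the caller.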
Experiments
Authors perform several experiments to
compare various algorithm components,
again as divided in the taxonomy
Focus on common problem areas for
stereo algorithms
Experiments / Results
Matching Costs
◦ Experiment 1: ran many tests with different
matching cost truncation values and found that
values of 5-20 give good results
◦ Experiment 2: ran same test as above, but
used a 9x9 min filter before truncation and
found that no truncation performed best
◦ Experiment 3: tested effects of matching cost
and truncation on global algorithms, found
that some truncation helped, and suggested
use of SNR based parameter setting
Experiments / Results
Aggregation
◦ Experiment 4: Analyze effects of various
aggregation techniques on local methods
Large amounts of aggregation are necessary in
textureless regions
Shiftable windows perform best
Experiments / Results
Disparity Computation / Optimization
◦ Experiment 5: analyze smoothness parameter
Found that the optimal smoothness parameter varies
greatly for each image pair
Future work includes parameter calculation techniques
◦ Experiment 6: Focus on graph-cut optimization
Birchfield-Tomasi’s method and a gradient-based
smoothness cost improve the performance of graph-cut
algorithms, but
choosing the right parameters for threshold and penalty
is difficult and image specific
Experiments / Results
Sub-Pixel Estimations
◦ Experiment 7: refine disparity maps via subpixel interpolation
As expected, an unrefined disparity map contains staircase
error, whereas the refined map is considerably better
Again, this step is often skipped in fast
implementations
Conclusion
The authors provide a comparison of 20
stereo algorithms, all of which are available in
detail on the website
Found that most algorithms perform about
the same in so-called easy areas and the
differences arise in known problematic areas
One evaluation of the algorithms that I thought
would have been helpful was a runtime
comparison; however, the authors were not
concerned with this
Questions?
Can you clarify what is being referenced in Figure 1 (f)
regarding the disparity levels as a slice?
◦ A slice simply means that the DSI is 3D and they are keeping one
of the three variables constant to produce a 2D image
Can you find references to using illumination alongside the
stereo depth analysis to further define the depth of objects?
◦ I searched and did not find any papers; however, this
does not mean that it is impossible.
◦ It would probably be very helpful to have an illumination estimate
prior to stereo evaluation
And of course, can you simplify the differences between each
algorithm?
◦ I think I have done a brief simplification; to go into more detail I
would have to read each of the 132 referenced papers
Questions??
How were the stereo algorithms chosen
in the paper?
◦ The paper focuses on Dense Two-Frame
Stereo Correspondence Algorithms
◦ Common algorithms which needed to be
compared were chosen to be implemented
however the framework allows novel
algorithms to be implemented as well
What is a stereo algorithm?
◦ A stereo algorithm utilizes images from two
cameras, similar to human vision (two eyes)
Questions???
In page 2, section 2.2, it was indicated that an
unvalued disparity map is produced as output.
What is “unvalued disparity map”?
◦ It says univalued
◦ I think this means that there is a single value for the
disparity at each pixel
On page 11, under the evaluation section, they
discuss using three different regions to
check the algorithm over (textureless, occluded,
and depth discontinuity). How did they come up with these?
◦ These are common problem areas for several
different stereo algorithms
Questions????
Why did they downsample the images for
testing (page 13)?
◦ To normalize the motion of background objects
to a few pixels per frame, which allows better results
when matching and a fair comparison of the various
algorithms' quality
Why do they only evaluate
bad_pixels_nonocc, bad_pixels_textureless,
and bad_pixels_discont?
◦ They also evaluate the whole image, but they evaluate
these areas separately as well to get an idea of how
different algorithms perform in these known
problem areas
Additional Resources
References throughout the paper provide
resources for various algorithms
Hartley and Zisserman: Multiple View
Geometry in Computer Vision.
Middlebury website, an excellent source of
papers and code related to stereo vision
algorithms.
Sebastian Thrun, Wolfram Burgard and
Dieter Fox: Probabilistic Robotics, MIT
Press, 2005.