PowerPoint Presentation - National Tsing Hua University

Stereo Video
1. Temporally Consistent Disparity Maps from Uncalibrated Stereo Videos
2. Real-time Spatiotemporal Stereo Matching Using the Dual-Cross-Bilateral Grid
3. Temporally Consistent Disparity and Optical Flow via Efficient Spatio-temporal Filtering
4. Efficient Spatio-temporal Local Stereo Matching Using Information Permeability Filtering
A. Temporally Consistent Disparity Maps from Uncalibrated Stereo Videos
Michael Bleyer and Margrit Gelautz
International Symposium on Image and Signal Processing and Analysis (ISPA) 2009
B. Real-time Spatiotemporal Stereo Matching Using the Dual-Cross-Bilateral Grid
Christian Richardt, Douglas Orr, Ian Davies, Antonio Criminisi, and Neil A. Dodgson
The European Conference on Computer Vision (ECCV) 2010
C. Temporally Consistent Disparity and Optical Flow via Efficient Spatio-temporal Filtering
Asmaa Hosni, Christoph Rhemann, Michael Bleyer, and Margrit Gelautz
The Pacific-Rim Symposium on Image and Video Technology (PSIVT) 2011
D. Efficient Spatio-temporal Local Stereo Matching Using Information Permeability Filtering
Cuong Cao Pham, Vinh Dinh Nguyen, and Jae Wook Jeon
International Conference on Image Processing (ICIP) 2012
Outline
• Introduction
• Related Works
• Methods and Results
• A. Median Filter
• B. Temporal DCB Grid
• C. Spatial-temporal Weighted Smoothing
• D. Three-pass Aggregation
• Comparison
• Conclusion
INTRODUCTION
Introduction
• Stereo matching research has mostly focused on static image pairs.
• Conventional methods estimate disparities using only the spatial and color information of a single frame.
• The key problem when extending stereo matching to video is flickering between consecutive disparity maps.
• Solution:
• Build on local methods (for real-time performance)
• Enforce temporal consistency (to suppress flickering)
RELATED WORKS
Related Works
• About local methods
• The key to a local method lies in the cost aggregation step.
• The matching cost is aggregated from neighboring pixels within a finite-size window (a minimal sketch follows this list).
• The best-known local methods are based on edge-preserving filters:
• Adaptive support weight
• Geodesic diffusion
• Bilateral filter
• Guided filter
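As a point of reference, here is a minimal sketch of plain window-based cost aggregation with an unweighted box window on grayscale images; the function and parameter names are illustrative and not taken from any of the four papers.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def box_stereo(left, right, max_disp, win=9):
    """Naive local stereo: absolute-difference cost, box-window aggregation, WTA.

    left, right: (H, W) grayscale images; win: aggregation window size.
    """
    left = left.astype(np.float32)
    right = right.astype(np.float32)
    h, w = left.shape
    cost = np.full((max_disp, h, w), 255.0, dtype=np.float32)
    for d in range(max_disp):
        if d == 0:
            cost[0] = np.abs(left - right)
        else:
            # left pixel x corresponds to right pixel x - d
            cost[d, :, d:] = np.abs(left[:, d:] - right[:, :-d])
        # aggregate the per-pixel cost over a win x win neighborhood
        cost[d] = uniform_filter(cost[d], size=win)
    return np.argmin(cost, axis=0)  # winner-take-all disparity per pixel
```

The edge-preserving filters listed above replace the uniform box window with image-guided weights, so that costs are not aggregated across depth discontinuities.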
Related Works
• Single-frame stereo matching
Related Works
• Spatio-temporal stereo matching
• The disparity difference between two successive frames is minimized to enforce temporal consistency (a generic formulation is given below).
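One generic way to write this idea down (an illustrative formulation, not the exact energy used in any of the four papers; C is the per-pixel matching cost and λ weights the temporal term):

```latex
E(d_t) \;=\; \sum_{p} C\bigl(p,\, d_t(p)\bigr) \;+\; \lambda \sum_{p} \bigl|\, d_t(p) - d_{t-1}(p) \,\bigr|
```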
METHODS AND RESULTS
A. Median filter
• Computing one disparity map takes about 1 second.
• But video runs at roughly 30~60 frames per second.
• => Can NOT achieve real-time performance.
• No quantitative data or comparison is reported.
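Going only by the later comparison slide (optical flow plus a median filter over three frames), a rough sketch of the idea could look like the following; the motion-compensation step and all names here are assumptions, not the authors' code.

```python
import numpy as np

def temporal_median_disparity(disp_prev_warped, disp_cur, disp_next_warped):
    """Per-pixel median over three temporally aligned disparity maps.

    disp_prev_warped / disp_next_warped are assumed to be the neighboring
    frames' disparities already warped into the current frame via optical flow.
    """
    stack = np.stack([disp_prev_warped, disp_cur, disp_next_warped], axis=0)
    return np.median(stack, axis=0)
```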
B. Temporal DCB Grid
• Bilateral grid (a rough sketch follows this list)
• It runs faster and uses less memory as σ increases.
• Dual-cross-bilateral grid
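For context, a rough sketch of the standard bilateral grid (splat, blur, slice) on a grayscale image; this illustrates the generic data structure, not the authors' GPU dual-cross-bilateral implementation, and it uses nearest-neighbor splatting/slicing for brevity.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def bilateral_grid_filter(img, sigma_s=16, sigma_r=0.1):
    """Approximate bilateral filtering of a grayscale image with values in [0, 1].

    The image is accumulated into a coarse 3D grid over (y, x, intensity),
    the grid is blurred, and each pixel reads back its normalized cell.
    Larger sigmas give a smaller grid, hence less memory and more speed.
    """
    h, w = img.shape
    Y = np.repeat((np.arange(h) // sigma_s)[:, None], w, axis=1)
    X = np.repeat((np.arange(w) // sigma_s)[None, :], h, axis=0)
    Z = (img / sigma_r).astype(int)

    shape = (Y.max() + 1, X.max() + 1, Z.max() + 1)
    data = np.zeros(shape)    # accumulated intensities
    weight = np.zeros(shape)  # accumulated pixel counts (homogeneous coordinate)

    # splat: add every pixel into its grid cell
    np.add.at(data, (Y, X, Z), img)
    np.add.at(weight, (Y, X, Z), 1.0)

    # blur the small grid jointly in space and range
    data = gaussian_filter(data, sigma=1.0)
    weight = gaussian_filter(weight, sigma=1.0)

    # slice: read each pixel's cell back and normalize
    return data[Y, X, Z] / np.maximum(weight[Y, X, Z], 1e-8)
```

Because the grid shrinks as sigma_s and sigma_r grow, the blur touches fewer cells, which is why larger σ means faster filtering and lower memory use.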
B. Temporal DCB Grid
• Dichromatic DCB grid
• Comparison (fps): figure annotated with "200x"
B. Temporal DCB Grid
• Temporal DCB grid: a weighted sum over the last n = 5 frames, each frame weighted by w_i (sketch below)
• i = 0 : current frame
• i = 1 : previous frame, and so on
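A minimal sketch of the temporal blending step; the exponentially decaying weights used here are an assumption for illustration, not the authors' exact w_i.

```python
import numpy as np

def temporal_weighted_sum(cost_grids, decay=0.5):
    """Blend per-frame aggregated costs with weights that decay into the past.

    cost_grids: list of per-frame cost arrays, index 0 = current frame,
                index 1 = previous frame, ..., up to n = 5 frames.
    decay:      per-frame decay factor (assumed value).
    """
    w = np.array([decay ** i for i in range(len(cost_grids))], dtype=np.float32)
    w /= w.sum()  # normalize so the blended cost keeps its scale
    return sum(wi * g for wi, g in zip(w, cost_grids))
```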
B. Temporal DCB Grid
• Results figure annotated with 16 fps and 14 fps
B. Temporal DCB Grid
• Source data (figure)
B. Temporal DCB Grid
• Uses only intensity information
• Only near real-time
C. Spatial-temporal Weighted Smoothing
• Cost initialization
• Construct a spatio-temporal cost volume for each disparity d.
• Cost aggregation
• Smooth the cost volume with a spatio-temporal filter (guided filter [1]).
• Disparity computation
• Select the disparity with the lowest cost (WTA).
• Refinement
• Weighted median filter
[1] Rhemann, C., Hosni, A., Bleyer, M., Rother, C., Gelautz, M.: Fast Cost-Volume Filtering for Visual Correspondence and Beyond. CVPR (2011) and PAMI (2013)
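A minimal sketch of these four steps on grayscale stereo video; `smooth` stands in for the spatio-temporal filtering step (one possible implementation is sketched after the 3D box-filter slide below), and the weighted-median refinement is omitted here.

```python
import numpy as np

def disparity_video(left, right, max_disp, smooth):
    """left, right: (T, H, W) grayscale videos; smooth(cost, guide): spatio-temporal filter.

    1) cost initialization: absolute difference per disparity hypothesis
    2) cost aggregation:    smooth each disparity slice of the cost volume
    3) disparity selection: winner-take-all over the smoothed costs
    (4) refinement with a weighted median filter is left out of this sketch
    """
    left = left.astype(np.float32)
    right = right.astype(np.float32)
    T, H, W = left.shape
    cost = np.full((max_disp, T, H, W), 255.0, dtype=np.float32)
    for d in range(max_disp):
        if d == 0:
            cost[0] = np.abs(left - right)
        else:
            cost[d, :, :, d:] = np.abs(left[:, :, d:] - right[:, :, :-d])
        cost[d] = smooth(cost[d], left)   # edge-preserving, spatio-temporal
    return np.argmin(cost, axis=0)        # (T, H, W) disparity maps
```

With the 3D guided filter sketched below, `smooth` could be `lambda c, g: guided_filter_3d(g, c)`.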
C. Spatial-temporal Weighted Smoothing
• Cost initialization
• Cost aggregation
• w_k : w_x × w_y × w_t (the spatio-temporal filter window)
• ε : smoothness parameter of the guided filter
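For reference, this is the standard grayscale guided-filter kernel of He et al., which the paper extends to a spatio-temporal window and color guidance; it is shown here only as context, not as the paper's exact formula.

```latex
C'_i(d) \;=\; \sum_{j} W_{ij}(I)\, C_j(d),
\qquad
W_{ij}(I) \;=\; \frac{1}{|\omega|^{2}} \sum_{k:\,(i,j)\in\omega_k}
\left( 1 + \frac{(I_i - \mu_k)(I_j - \mu_k)}{\sigma_k^{2} + \epsilon} \right)
```

Here μ_k and σ_k² are the mean and variance of the guidance image I inside window ω_k, and ε is the smoothness parameter.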
C. Spatial-temporal Weighted Smoothing
• The guided filter weights can be implemented by a sequence of linear operations (sketch below).
• All summations are 3D box filters and can be computed in O(N) time.
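A compact sketch of this, assuming a single-channel (grayscale) guidance video for brevity; the paper uses color guidance, but the structure (means, covariances, and two smoothing passes, all implemented with 3D box filters) is the same.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter_3d(guide, cost, radius=(1, 9, 9), eps=1e-3):
    """Spatio-temporal guided filter on a (T, H, W) cost slice.

    guide:  (T, H, W) grayscale video used as the guidance signal
    cost:   (T, H, W) cost slice to be smoothed
    radius: box-filter radii in (t, y, x); the values here are assumptions
    Every summation is a 3D box filter, so the runtime is O(N) per slice.
    """
    guide = guide.astype(np.float32)
    cost = cost.astype(np.float32)
    size = tuple(2 * r + 1 for r in radius)
    box = lambda a: uniform_filter(a, size=size)

    mean_I = box(guide)
    mean_p = box(cost)
    cov_Ip = box(guide * cost) - mean_I * mean_p
    var_I = box(guide * guide) - mean_I ** 2

    a = cov_Ip / (var_I + eps)        # per-voxel linear coefficients
    b = mean_p - a * mean_I
    return box(a) * guide + box(b)    # smoothed cost slice
```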
C. Spatial-temporal Weighted Smoothing
• Disparity computation : winner-take-all (WTA)
• Refinement : weighted median filter
• => The refinement only adjusts individual frames to reduce single-frame errors.
C. Spatial-temporal Weighted Smoothing
• Temporal vs. frame-by-frame processing (figure)
• 2nd row: disparity maps computed by a frame-by-frame implementation show flickering artifacts.
• 3rd row: the proposed method exploits temporal information and thus removes most of these artifacts.
D. Three-pass cost aggregation
• A three-pass cost aggregation technique based on information permeability, in the spirit of the adaptive support-weight approach [2].
[2] Yoon, K.J., Kweon, I.S.: Locally Adaptive Support-Weight Approach for Visual Correspondence Search. In: CVPR (2005)
D. Three-pass cost aggregation
• (figure: aggregation across frames i-1, i, and i+1)
D. Three-pass cost aggregation
• Matching cost initialization
• Using temporal information in addition to spatial information improves effectiveness.
• v = (x, y, t) denotes the spatial and temporal position of a voxel.
• Similarity (weight) function (a sketch follows)
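A sketch of one such weight between neighboring voxels, in the spirit of information permeability and adaptive support weights; the exponential form and the value of σ are assumptions for illustration.

```python
import numpy as np

def permeability_weights(I, axis, sigma=12.0):
    """Weights between adjacent samples along one axis of an intensity volume.

    The weight decays exponentially with the absolute intensity difference,
    so costs flow freely inside smooth regions but are blocked at edges.
    """
    diff = np.abs(np.diff(I.astype(np.float32), axis=axis))
    return np.exp(-diff / sigma)
```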
D. Three-pass cost aggregation
• Spatial aggregation : horizontal pass, then vertical pass (sketch below)
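A sketch of the horizontal pass under the same assumptions (exponential permeability weights, assumed σ); the exact way the two scan directions are combined may differ from the paper.

```python
import numpy as np

def horizontal_pass(cost, guide, sigma=12.0):
    """Scan-line aggregation for one disparity hypothesis, left-to-right and back.

    cost:  (H, W) matching costs
    guide: (H, W) grayscale frame used to compute permeability weights
    Each pixel passes a fraction of its accumulated cost to its neighbor,
    so only a couple of multiply-adds per pixel and direction are needed.
    """
    cost = cost.astype(np.float32)
    H, W = cost.shape
    # permeability between horizontally adjacent pixels
    mu = np.exp(-np.abs(np.diff(guide.astype(np.float32), axis=1)) / sigma)

    fwd = cost.copy()
    for x in range(1, W):              # left -> right
        fwd[:, x] += mu[:, x - 1] * fwd[:, x - 1]
    bwd = cost.copy()
    for x in range(W - 2, -1, -1):     # right -> left
        bwd[:, x] += mu[:, x] * bwd[:, x + 1]

    return fwd + bwd - cost            # count the center cost once
```

The vertical pass repeats the same recursion along columns on the output of the horizontal pass, and the temporal pass on the next slide does the same along the time axis.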
D. Three-pass cost aggregation
• Temporal aggregation : forward and backward passes along the time axis
• Disparity computation : WTA
• Refinement
• consistency check (a generic sketch follows)
• 3 × 3 median filter
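A generic left-right consistency check of the kind referred to here; the threshold and rounding are assumptions, and the papers may differ in details.

```python
import numpy as np

def left_right_consistency(disp_left, disp_right, thresh=1.0):
    """Boolean mask of pixels whose left and right disparities agree.

    A left pixel (y, x) with disparity d should map to right pixel (y, x - d)
    carrying roughly the same disparity; otherwise the pixel is marked invalid.
    """
    h, w = disp_left.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xr = np.clip(xs - np.round(disp_left).astype(int), 0, w - 1)
    return np.abs(disp_left - disp_right[ys, xr]) <= thresh
```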
D. Three-pass cost aggregation
• Computational complexity
• Only six multiplications and nine additions per voxel
• Still more efficient than the adaptive support-weight approach
• No motion estimation is required
COMPARISON
Comparison

Method | Approach                           | Drawback       | Reference frame number
A.     | Optical flow + median filter       | Too slow       | 3 frames (-1 ~ +1)
B.     | Weighted sum of the last 5 frames  | Over-smoothing | 5 frames (-4 ~ 0)
C.     | Guided filter extended temporally  |                | 5 frames (-2 ~ +2)
D.     | Three-pass aggregation             |                | 3 frames (-1 ~ +1)
Comparison
• Results without post-processing (figure)
• Results including post-processing : consistency check and 3 × 3 median filter (figure)
CONCLUSION
Conclusion
• All four methods are based on edge-preserving filtering.
• They extend this concept to the time dimension.
• These methods only handle scenes with slow motion.
• They do not perform well on dynamic scenes that contain large object motions.