Transcript Slides
Yunhai Wang 1   Minglun Gong 1,2   Tianhua Wang 1,3   Hao (Richard) Zhang 4   Daniel Cohen-Or 5   Baoquan Chen 1,6
1 Shenzhen Institutes of Advanced Technology   2 Memorial University of Newfoundland   3 Jilin University
4 Simon Fraser University   5 Tel-Aviv University   6 Shandong University
One of the most fundamental tasks in shape analysis
Low-level cues (minima rule, convexity) alone are insufficient
2/40
Learning segmentation [Kalogerakis et al. 2010]
Unsupervised co-analysis [Sidi et al. 2011]
Joint segmentation [Huang et al. 2011]
Active co-analysis [Wang et al. 2012]
Keys to success: amount & quality of labeled or unlabeled 3D data
3/40
How many 3D models of strollers, golf carts, gazebos, …?
Not enough 3D models = insufficient knowledge
Labeling 3D shapes is also a non-trivial task
380 labeled meshes over 19 object categories
4/40
About 14 million images across almost 22,000 object categories
Labeling images is quite a bit easier than labeling 3D shapes
5/40
Self-intersecting; non-manifold
Incomplete
Real-world 3D models (e.g., those from the Trimble 3D Warehouse) are often imperfect
6/40
Treat a 3D shape as a set of projected binary images
Label these images by learning from a vast amount of image data
Then propagate the image labels to the 3D shape
This alleviates various data artifacts in 3D, e.g., self-intersections
7/40
Joint image-shape analysis via projective analysis for semantic 3D segmentation
Utilizes the vast amount of available image data
Allows us to analyze imperfect 3D shapes
8/40
Bi-class Symmetric Hausdorff distance = BiSH
Designed for matching 1D binary images
More sensitive to topology changes (holes)
Caters to our needs: part-aware label transfer
9/40
Many works on 2D-3D fusion, e.g., for reconstruction [Li et al. 2011]
Image-guided 3D modeling [Xu et al. 2011]
10/40
Image-space simplification error [Lindstrom and Turk 2000]
Light field descriptor for 3D shape retrieval [Chen et al. 2003]
We deal with the higher-level and more delicate task of semantic 3D segmentation
11/40
PSA for 3D shape segmentation
Region-based binary shape matching
Results and conclusion
12/40
Labeling involves GrabCut and some user assistance
13/40
Assume all objects are upright oriented; they mostly are!
Project an input 3D shape from multiple pre-set viewpoints (sketched below)
14/40
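A minimal sketch of this projection step, assuming an upright triangle mesh given as a vertex array V and a face-index array F (both names hypothetical); it renders filled orthographic silhouettes from evenly spaced azimuth angles with PIL, standing in for whichever renderer was actually used.

```python
# Minimal sketch (not the paper's code): orthographic binary silhouettes of an
# upright mesh from evenly spaced azimuth angles around the up (z) axis.
import numpy as np
from PIL import Image, ImageDraw

def binary_projections(V, F, num_views=8, res=512):
    """V: (n, 3) vertices, F: (m, 3) triangle indices; returns binary images."""
    images = []
    for k in range(num_views):
        a = 2.0 * np.pi * k / num_views
        R = np.array([[np.cos(a), -np.sin(a), 0.0],
                      [np.sin(a),  np.cos(a), 0.0],
                      [0.0,        0.0,       1.0]])
        P = V @ R.T                    # rotate about the upright axis
        xy = P[:, [0, 2]]              # orthographic: drop the depth (y) axis
        xy = xy - xy.min(axis=0)
        xy = xy / (xy.max() + 1e-9) * (res - 1)
        img = Image.new("1", (res, res), 0)
        draw = ImageDraw.Draw(img)
        for f in F:                    # fill every triangle's 2D footprint
            pts = [(float(xy[i, 0]), float(res - 1 - xy[i, 1])) for i in f]
            draw.polygon(pts, fill=1)
        images.append(np.array(img, dtype=np.uint8))
    return images
```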
For each projection of the input 3D shape, retrieve top matches from the set of labeled images
15/40
Select top (non-adjacent) projections with the smallest average matching costs for label transfer (one possible selection is sketched below)
16/40
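One plausible way to implement this selection, sketched below under the assumption that the pre-set viewpoints form a ring of azimuths and that avg_costs holds each projection's average matching cost; the greedy rule and the names are illustrative, not taken from the paper.

```python
# Illustrative sketch: greedily keep the cheapest projections while skipping
# viewpoints directly adjacent (on the ring of azimuths) to ones already kept.
def select_projections(avg_costs, num_select):
    num_views = len(avg_costs)
    ring_gap = lambda a, b: min(abs(a - b), num_views - abs(a - b))
    order = sorted(range(num_views), key=lambda v: avg_costs[v])
    chosen = []
    for v in order:
        if all(ring_gap(v, c) > 1 for c in chosen):   # enforce non-adjacency
            chosen.append(v)
        if len(chosen) == num_select:
            break
    return chosen
```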
Label transfer is done per pair of corresponding horizontal slabs (more on slabs later …)
Pixel correspondence within matched slabs is straightforward
17/40
Label transfer is weighted by a confidence value per pixel
Three terms based on image-level, slab-level, and pixel-level similarity: more similar = higher confidence (a toy combination is sketched below)
18/40
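The slide does not spell out how the three terms are combined; the tiny sketch below just makes the stated monotonicity concrete with a product, which is an assumption.

```python
# Hypothetical combination of the three similarity terms into a per-pixel
# confidence; the slide only states that higher similarity at the image,
# slab and pixel level should yield higher confidence.
def pixel_confidence(image_sim, slab_sim, pixel_sim):
    return image_sim * slab_sim * pixel_sim   # monotone in every term
```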
Probabilistic map over the input 3D shape: computed by integrating per-pixel confidence values over each shape primitive
One primitive projects to multiple pixels in multiple images
Per-pixel confidence is gathered over multiple retrieved images (sketched below)
19/40
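A sketch of this aggregation, assuming we already know, for every labeled pixel of every retrieved image, which primitive it came from, the transferred label, and the pixel confidence; the triple format and function name are hypothetical.

```python
# Illustrative sketch: gather weighted label votes per shape primitive from all
# labeled pixels it projects to, across all retrieved images, and normalize
# into a per-primitive label distribution.
import numpy as np
from collections import defaultdict

def probabilistic_map(transfers, num_labels):
    """transfers: iterable of (prim_id, label, confidence) triples."""
    votes = defaultdict(lambda: np.zeros(num_labels))
    for prim_id, label, conf in transfers:
        votes[prim_id][label] += conf
    prob = {}
    for prim_id, v in votes.items():
        s = v.sum()
        prob[prim_id] = v / s if s > 0 else np.full(num_labels, 1.0 / num_labels)
    return prob
```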
Final labeling of the 3D shape: multi-label alpha-expansion graph cuts based on the probabilistic map (energy sketched below)
20/40
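Read as an energy, this is a standard multi-label MRF minimized by alpha-expansion; the formulation below is a common one consistent with the slide, with the smoothness weight \lambda and pairwise weights w_{pq} left generic (both are assumptions here):

```latex
E(L) \;=\; \sum_{p \in \text{primitives}} -\log P_p(L_p)
      \;+\; \lambda \sum_{(p,q) \in \mathcal{N}} w_{pq}\,[\,L_p \neq L_q\,]
```

Here P_p is the per-primitive label distribution from the probabilistic map and \mathcal{N} is the set of adjacent primitive pairs.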
PSA for 3D shape segmentation
Region-based binary shape matching
Results and conclusion
21/40
Projections of input 3D shape …
Database of (labeled) images …
Goal: find shapes most suitable for label transfer, and FAST!
Not a global visual-similarity-based retrieval
Want part-aware label transfer, but cannot reliably segment
Characteristics of the data to be matched: possibly complex topology (lots of holes), just a contour
Classical descriptors, e.g., shape context, interior distance shape context (IDSC), GIST, Zernike moments, Fourier descriptors, etc., do not quite fulfill our needs
22/40
All upright oriented: to be exploited
Takes advantage of upright orientation
23/40
Cluster scan-lines into a smaller number of slabs -- efficiency!
Hierarchical clustering based on a distance between adjacent slabs (a rough sketch follows below)
Classical choice for the distance: symmetric Hausdorff (SH)
But SH is not sensitive to topology changes; not part-aware
24/40
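A rough sketch of the agglomerative clustering idea, with dist standing in for the chosen distance between adjacent slabs (SH or BiSH); starting from one slab per scan-line and measuring only the border scan-lines are simplifications made here.

```python
# Rough sketch: agglomeratively merge adjacent slabs (initially one slab per
# scan-line) whose distance is smallest, until the target slab count is reached.
def cluster_scanlines(scanlines, target_slabs, dist):
    slabs = [[s] for s in scanlines]
    while len(slabs) > target_slabs:
        # distance between adjacent slabs; here simply between their border scan-lines
        costs = [dist(slabs[i][-1], slabs[i + 1][0]) for i in range(len(slabs) - 1)]
        k = min(range(len(costs)), key=costs.__getitem__)
        slabs[k:k + 2] = [slabs[k] + slabs[k + 1]]     # merge the closest pair
    return slabs
```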
[Figure: example binary shapes A, B, and C]
SH(C, B) = 2, SH(C^c, B^c) = 2
SH(A, B) = 2, SH(A^c, B^c) = 10
SH for only one class may not be topology-sensitive
A bi-class SH distance is!
25/40
[Figure: the same example shapes A, B, and C]
SH(C, B) = 2, SH(C^c, B^c) = 2, so BiSH(C, B) = 2
SH(A, B) = 2, SH(A^c, B^c) = 10, so BiSH(A, B) = 10
26/40
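A small sketch of the bi-class idea on 1D binary scan-lines, consistent with the numbers on the previous two slides: compute the symmetric Hausdorff distance on the foreground and on the background (the complements) and keep the larger of the two; the exact way the two classes are combined in the paper is assumed here.

```python
# Sketch (assumed formulation): BiSH on 1D binary scan-lines as the larger of
# the symmetric Hausdorff distances of the foreground and of the background.
import numpy as np

def sym_hausdorff(a_idx, b_idx):
    """Symmetric Hausdorff distance between two 1D index sets."""
    if len(a_idx) == 0 or len(b_idx) == 0:
        return float("inf")
    d_ab = max(min(abs(a - b) for b in b_idx) for a in a_idx)
    d_ba = max(min(abs(b - a) for a in a_idx) for b in b_idx)
    return max(d_ab, d_ba)

def bish(a, b):
    """a, b: equal-length binary scan-lines (1 = inside the shape)."""
    a, b = np.asarray(a), np.asarray(b)
    fg = sym_hausdorff(np.flatnonzero(a), np.flatnonzero(b))
    bg = sym_hausdorff(np.flatnonzero(1 - a), np.flatnonzero(1 - b))
    return max(fg, bg)   # a hole changes the background set, so BiSH notices it
```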
BiSH is more part-aware: new slabs near part boundaries
[Figure: slab decompositions produced with BiSH vs. SH]
27/40
Slabs are scaled/warped vertically for better alignment
Another measure to encourage part-aware label transfer
Warp: slabs of the labeled image are warped to better align with slabs in the projected image
Recolor: slabs are recolored; many-to-one slab matching is possible
28/40
Dissimilarity between slabs: BiSH scaled by slab height
Slab matching allows a linear warp, optimized by a dynamic time warping (DTW) algorithm (sketched below)
Dissimilarity between images: sum of slab dissimilarities after warped slab matching
29/40
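An illustrative DTW recurrence over the two slab sequences, where slab_cost would be the BiSH-based slab dissimilarity scaled by slab height; the specific linear-warp constraints mentioned on the slide are not reproduced in this sketch.

```python
# Illustrative DTW over slab sequences; slab_cost(a, b) is the slab
# dissimilarity (e.g., BiSH scaled by slab height). Many-to-one slab
# matches are allowed through the vertical/horizontal moves.
import numpy as np

def dtw_image_dissimilarity(slabs_a, slabs_b, slab_cost):
    n, m = len(slabs_a), len(slabs_b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = slab_cost(slabs_a[i - 1], slabs_b[j - 1])
            D[i, j] = c + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    return float(D[n, m])   # total dissimilarity along the optimal warp
```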
PSA for 3D shape segmentation
Region-based binary shape matching
Results and conclusion
30/40
Same inputs, training data (we project), and experimental setting
Models in [Kalogerakis et al. 2010]: manifold, complete, no self-intersections
PSA allows us to handle any category and imperfect shapes
31/40
11 object categories; about 2,600 labeled images
All input 3D shapes tested have self-intersections as well as other data artifacts
32/40
Pavilion (465 pieces)
Bicycle (704 pieces)
33/40
34/40
Matching two images (512 × 512) takes 0.06 seconds
Label transfer (2D-to-2D, then to 3D): about 1 minute for a 20K-triangle mesh
Number of selected projections: 5 – 10
Number of retrieved images per projection: 2
35/40
Projective shape analysis (PSA): semantic 3D segmentation by learning from labeled 2D images
Demonstrated potential in labeling 3D models: imperfect, complex topology, over any category
36/40
Utilize the rich availability and ease of processing of photos for 3D shape analysis
No strong requirements on the quality of the 3D model
37/40
Inherent limitation of 2D projections: they do not fully capture 3D info
Inherent to data-driven methods: the knowledge has to be in the data
Relying on spatial rather than feature-space analysis
Assuming upright orientation; not designed for articulated shapes
38/40
Labeling 2D images is still tedious: unsupervised projective analysis
Additional cues from images and projections, e.g., color, depth, etc.
Apply PSA to other knowledge-driven analyses
39/40
More results and data can be found at http://web.siat.ac.cn/~yunhai/psa.html
40/40