Texton Boost Slides


Texture
We would like to thank Amnon Drory for this deck
Clarification: the binding material is what is taught in class, not what does or does not appear in the slides.
Syllabus
• Textons
• TextonBoost
Textons
• Run filter bank on images
• Build Texton dictionary using K-means
• Map texture image to histogram
• Histogram similarity using Chi-square
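A minimal Python sketch of the last two steps of this pipeline, assuming texton maps (per-pixel cluster indices from the filter-bank + K-means stage) are already available; the function names are illustrative, not from the deck:

```python
import numpy as np

def texton_histogram(texton_map, n_textons):
    """Normalized histogram of texton indices for one image (or region)."""
    hist = np.bincount(texton_map.ravel(), minlength=n_textons).astype(float)
    return hist / hist.sum()

def chi_square_distance(h1, h2, eps=1e-10):
    """Chi-square distance between two normalized histograms."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

# Compare two textures by their texton histograms (400-word dictionary).
# The texton maps here are random placeholders for real filter-bank output.
map_a = np.random.randint(0, 400, size=(100, 100))
map_b = np.random.randint(0, 400, size=(100, 100))
d = chi_square_distance(texton_histogram(map_a, 400), texton_histogram(map_b, 400))
print(f"chi-square distance: {d:.4f}")
```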
TextonBoost
• Build Texton dictionary
• Texture Layout (pixel, rectangle, Texton)
• Count number of textons in rectangle
• Use Integral Image
• Generate multiple Texture Layouts (features)
• For each class, do a 1-vs-all classification:
– For each pixel in the class
• Train a GentleBoost classifier
• Map the strong classifier output to a probability
• Take the maximum value across classes
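The integral-image step above can be sketched as follows: one cumulative-sum image per texton makes the count of any texton inside any rectangle a four-lookup, O(1) query. This is an illustrative NumPy sketch (names and sizes are made up for the example):

```python
import numpy as np

def texton_integral_images(texton_map, n_textons):
    """One integral image per texton: cumulative per-pixel counts of that texton."""
    h, w = texton_map.shape
    integrals = np.zeros((n_textons, h + 1, w + 1))
    for t in range(n_textons):
        mask = (texton_map == t).astype(float)
        integrals[t, 1:, 1:] = mask.cumsum(axis=0).cumsum(axis=1)
    return integrals

def count_in_rect(integrals, t, r0, c0, r1, c1):
    """Pixels with texton t inside rows [r0, r1) and cols [c0, c1): four lookups."""
    ii = integrals[t]
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]

# Toy texton map with a 64-word dictionary (TextonBoost itself uses 400 textons).
texton_map = np.random.randint(0, 64, size=(200, 200))
ii = texton_integral_images(texton_map, 64)
print(count_in_rect(ii, t=7, r0=10, c0=10, r1=60, c1=90))
```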
CRF/MRF
• How to ensure Spatial Consistency?
ML (Maximum Likelihood):
$\hat{X}_{\mathrm{ML}} = \arg\max_X \Pr(Y \mid X)$
$\Pr(Y \mid X)$ is the likelihood.

Bayes:
$\Pr(X \mid Y) = \dfrac{\Pr(Y \mid X)\,\Pr(X)}{\Pr(Y)}$
$\Pr(X \mid Y)$ is the posterior; $\Pr(X)$ is the prior.

MAP (Maximum A Posteriori):
$\hat{X}_{\mathrm{MAP}} = \arg\max_X \Pr(Y \mid X)\,\Pr(X) = \arg\min_X \left[\, \lVert Y - HX \rVert^2 + A(X) \,\right]$
with the prior $\Pr(X) = \mathrm{Const} \cdot \exp\bigl(-A(X)\bigr)$.
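To make the MAP formula concrete: choosing a Gaussian prior, A(X) = λ‖X‖², turns the minimization into ridge-regularized least squares with a closed-form solution. A toy NumPy sketch (H, Y and λ are synthetic, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
H = rng.standard_normal((50, 20))                 # toy measurement operator
x_true = rng.standard_normal(20)
Y = H @ x_true + 0.1 * rng.standard_normal(50)    # noisy observations Y = HX + n

# ML: minimize ||Y - HX||^2  (ordinary least squares)
x_ml = np.linalg.lstsq(H, Y, rcond=None)[0]

# MAP with Gaussian prior A(X) = lam * ||X||^2  (ridge-regularized solution)
lam = 1.0
x_map = np.linalg.solve(H.T @ H + lam * np.eye(20), H.T @ Y)

print(np.linalg.norm(x_ml - x_true), np.linalg.norm(x_map - x_true))
```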
Semantic Texton Forest
• Decision Trees
• Forest and Averaging
• Split decision to minimize Entropy
• Two level STF to add spatial regularization
• Works well when there is ample data, but does not generalize well otherwise
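A minimal sketch of the entropy-minimizing split used to grow such decision trees; the helper names and toy data are illustrative only:

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (bits) of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(labels, feature, threshold):
    """Entropy reduction from splitting the samples at `threshold`."""
    left, right = labels[feature <= threshold], labels[feature > threshold]
    if len(left) == 0 or len(right) == 0:
        return 0.0
    w = len(left) / len(labels)
    return entropy(labels) - (w * entropy(left) + (1 - w) * entropy(right))

labels = np.array([0, 0, 0, 1, 1, 1, 2, 2])
feature = np.array([0.1, 0.2, 0.3, 0.6, 0.7, 0.8, 1.5, 1.6])
# Pick the candidate threshold that minimizes the children's entropy.
best_gain, best_t = max((information_gain(labels, feature, t), t) for t in feature)
print(best_gain, best_t)
```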
(I) Textons: B. Julesz; Leung, Malik; M. Varma, A. Zisserman
(II) TextonBoost: J. Shotton, J. Winn, C. Rother, A. Criminisi
(III) Semantic Texton Forests: J. Shotton, M. Johnson, R. Cipolla
(IV) Pose Recognition from Depth Images: J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman, A. Blake
Textures
Filter Bank
K-means
Texton Histogram
Classification
Results
TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-Class Object Recognition and Segmentation
J. Shotton*, J. Winn†, C. Rother†, and A. Criminisi†
* University of Cambridge
† Microsoft Research Ltd, Cambridge, UK
TextonBoost
• Simultaneous recognition and segmentation
• Explain every pixel
TextonBoost
Input:
1. Training: images with pixel-level ground-truth classification (MSRC 21 database)
2. Testing: images
Output:
A classification of each pixel in the test images to an object class
Conditional Random Field
• Unary term
• Binary term
Textons
• Shape filters use texton maps ([Varma & Zisserman IJCV 05],
[Leung & Malik IJCV 01])
• Convolve with 17D filter bank (Gaussians, Derivatives of Gaussians, DoGs,
LoGs) – Can use Gabor instead
• Use k-means to create 400 clusters
[Figure: Input image → Filter Bank → Clustering → Texton map (colors = texton indices)]
Filter Bank
• Compact and efficient characterisation of local texture
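A simplified sketch of the texton-map construction described above: a small grayscale filter bank (Gaussians, derivatives of Gaussians, LoGs) followed by k-means on the per-pixel responses. The real pipeline uses the 17D colour filter bank and 400 clusters; the sizes and helper names below are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, gaussian_laplace
from scipy.cluster.vq import kmeans2

def filter_bank_responses(gray):
    """Per-pixel stack of Gaussian, derivative-of-Gaussian and LoG responses
    (a grayscale stand-in for the 17D colour filter bank)."""
    responses = []
    for sigma in (1, 2, 4):                                    # Gaussians
        responses.append(gaussian_filter(gray, sigma))
    for sigma in (2, 4):                                       # derivatives of Gaussians
        responses.append(gaussian_filter(gray, sigma, order=(0, 1)))   # d/dx
        responses.append(gaussian_filter(gray, sigma, order=(1, 0)))   # d/dy
    for sigma in (1, 2, 4, 8):                                 # LoGs
        responses.append(gaussian_laplace(gray, sigma))
    return np.stack(responses, axis=-1)                        # H x W x 11

def texton_map(gray, n_textons):
    """K-means on the per-pixel responses; every pixel gets a texton index."""
    feats = filter_bank_responses(gray).reshape(-1, 11)
    _, labels = kmeans2(feats, n_textons, minit='++')
    return labels.reshape(gray.shape)

gray = np.random.rand(64, 64)                  # stand-in for a training image
print(np.unique(texton_map(gray, n_textons=32)).size)
```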
CRF: Unary Term
Weak learner (decision stump on a texture-layout feature response):
$h(\boldsymbol{x}) = \begin{cases} a\,\delta\bigl(v(i,r,t) > \theta\bigr) + b & y \text{ in class} \\ \text{const.} & \text{otherwise} \end{cases}$
Strong classifier (sum of $M$ weak learners):
$H(\boldsymbol{x}) = \sum_{m=1}^{M} h_m(\boldsymbol{x})$
Probability of class $c_i$ given feature vector $\boldsymbol{x}$:
$P_i(c_i \mid \boldsymbol{x}) = \dfrac{\exp H(c_i)}{\sum_{c'_i} \exp H(c'_i)}$
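A toy sketch of how this unary term is assembled: decision-stump weak learners on a texture-layout response are summed into per-class scores H(c), and a softmax turns the scores into class probabilities. All numbers and names are illustrative:

```python
import numpy as np

def weak_learner(v, theta, a, b):
    """One boosting round: a * delta(v > theta) + b, a decision stump on a
    texture-layout feature response v(i, r, t)."""
    return a * float(v > theta) + b

def class_probabilities(H):
    """Softmax over the per-class strong-classifier scores H(c_i)."""
    e = np.exp(H - np.max(H))        # subtract the max for numerical stability
    return e / e.sum()

v = 0.35                              # one feature response for this pixel
H = np.array([                        # per-class scores H(c) = sum_m h_m(x)
    weak_learner(v, 0.2, 1.0, 0.1) + weak_learner(v, 0.5, 0.8, -0.2),
    weak_learner(v, 0.3, 0.6, 0.0),
    weak_learner(v, 0.6, 1.2, -0.1),
])
print(class_probabilities(H))         # class probabilities, summing to 1
```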
Texture-Layout Filters
• Pair: (rectangle r, texton t); rectangle offsets up to 200 pixels
• Feature responses v(i, r, t): the count of pixels with texton t inside rectangle r placed relative to pixel i
• Large bounding boxes enable long-range interactions
[Figure: example responses, e.g. v(i1, r, t) = a and v(i3, r, t) = a/2]
Texture Layout (Toy Example)
CRF: Binary Term
• Potts model: encourages neighbouring pixels to have the same label
• Contrast sensitivity: encourages the segmentation to follow image edges
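A sketch of a contrast-sensitive Potts penalty for one pair of neighbouring pixels: zero when the labels agree, otherwise a cost that is large in smooth regions and shrinks across strong colour edges, so label boundaries are pushed onto image edges. The weights theta, gamma, beta are illustrative, not the paper's learned parameters:

```python
import numpy as np

def pairwise_potential(label_i, label_j, color_i, color_j,
                       theta=1.0, gamma=2.0, beta=0.05):
    """Contrast-sensitive Potts cost for two neighbouring pixels: zero when the
    labels agree, otherwise a penalty that is large in smooth regions and
    small across strong colour edges."""
    if label_i == label_j:
        return 0.0
    contrast = np.sum((np.asarray(color_i, float) - np.asarray(color_j, float)) ** 2)
    return theta + gamma * np.exp(-beta * contrast)

print(pairwise_potential(1, 1, [200, 30, 30], [40, 40, 200]))    # 0.0, same label
print(pairwise_potential(1, 2, [200, 30, 30], [40, 40, 200]))    # cheap: strong edge
print(pairwise_potential(1, 2, [100, 100, 100], [101, 99, 100])) # expensive: smooth region
```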
Accurate Segmentation?
• Boosted classifier alone
– effectively recognises objects
– but not sufficient for pixel-perfect segmentation
• Conditional Random Field (CRF)
– jointly classifies all pixels whilst respecting image edges
[Figure: comparison of unary term only vs. full CRF]
The TextonBoost CRF
• Unary terms: texture-layout, color, location
• Binary term: edge
Location Term
• Captures a prior on absolute image location
[Figure: learned location priors for classes such as tree, sky, road]
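One way to realise such a location prior: count, over the training masks, how often each class occurs in each cell of a coarse normalized grid. A toy NumPy sketch (grid size, smoothing and names are assumptions for illustration):

```python
import numpy as np

def location_priors(label_maps, n_classes, grid=(20, 20), eps=1.0):
    """Per-class prior over coarse, normalized image coordinates, estimated by
    counting how often each class appears in each grid cell across the
    training masks (a simple stand-in for the location term)."""
    counts = np.full((n_classes,) + grid, eps)               # Laplace smoothing
    for labels in label_maps:
        h, w = labels.shape
        row_idx = (np.arange(h) * grid[0] // h)[:, None].repeat(w, axis=1)
        col_idx = (np.arange(w) * grid[1] // w)[None, :].repeat(h, axis=0)
        for c in range(n_classes):
            mask = labels == c
            np.add.at(counts[c], (row_idx[mask], col_idx[mask]), 1)
    return counts / counts.sum(axis=0, keepdims=True)        # P(class | cell)

# Toy training masks: class 2 ("sky") fills the top half, class 1 ("road") the bottom.
masks = [np.vstack([np.full((10, 16), 2), np.full((10, 16), 1)]) for _ in range(5)]
priors = location_priors(masks, n_classes=3, grid=(4, 4))
print(priors[2])   # high probability in the top grid rows, near zero at the bottom
```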
Color Term
Texture-Layout Term
TextonBoost - Summary
Performs per-pixel classification using:
1. Statistics learned from Training Set:
- Absolute location statistics
- Configuration of textured areas around the pixel of interest.
2. Cues from the Test Image:
- Edges
- Object Colors
3. Priors.
Results on 21-Class Database
[Figure: example segmentation results, e.g. building]
Effect of Model Components
Pixel-wise segmentation accuracies:
• Shape-texture potentials only: 69.6%
• + edge potentials: 70.3%
• + color potentials: 72.0%
• + location potentials: 72.2%
[Figure: example segmentations for shape-texture, + edge, and + color & location]