Transcript slides

Polyhedral Classifier for Target Detection
A Case Study: Colorectal Cancer
Murat Dundar, Matthias Wolf, Sarang Lakare,
Marcos Salganicoff, Vikas C. Raykar
Siemens Medical Solutions, Inc. USA
Malvern, PA 19355
Copyright © Siemens Medical Solutions, USA, Inc.; 2008. All rights reserved.
Computer Aided Diagnosis (CAD) for Colon Cancer
• Identify suspicious regions (candidates)
• Extract features for each candidate
• Classify candidates as a polyp or non-polyp
July-08
Dundar et al.
CAD & Knowledge Solutions / Malvern, USA / IKM
Multi-mode nature of CAD data
• The only ground truth available is the location of the polyp.
• All other candidates that are not pointing to a known polyp are pooled into the negative class.
• Variation among the different negatives is large.
A CAD Example: Colorectal Cancer
Polyps vs. common false positives
[Figure: example images. Polyps: sessile polyp, pedunculated polyp. Common false positives: stool, rectal tube, noise, fold.]
State-of-the-Art – Finite Mixture Models
• Model class distribution by a mixture model, one mode for each subclass, then design a maximum a posteriori or maximum likelihood classifier
• Too few positives, too many features with redundancy! Robust estimation of model parameters for the positive class is very difficult, if not impractical
State-of-the-Art – Discriminative Techniques
• Pool all negative candidates into a single class and learn a binary classifier, i.e. polyps vs. negatives
• A kernel-based discriminative technique (SVM, RVM, KFD) can yield nonlinear decision boundaries suitable for classifying multi-mode data.
• Too few positive candidates, too many features with redundancy! Data can be easily overfit by a nonlinear classifier
State-of-the-Art – One-Class Classifiers
• Omit the negative class and learn a model from positive samples only.
• Kernel-based and neural network implementations yield nonlinear decision boundaries suitable for classifying multi-mode data.
• Like other nonlinear classifiers, they are susceptible to overfitting
State-of-the-art in a Nutshell
• Linear classifiers
  • less prone to overfitting
  • not enough capacity to deal with multi-mode data
• Finite mixture models
  • Parameter estimation is an issue!
• Discriminative & One-class Classifiers
  • good capacity
  • more prone to overfitting
A Viable solution
• A series of linear classifiers, one for each subclass of the negatives
• More capacity than a linear classifier, yet less prone to overfitting than a nonlinear classifier
• An unseen sample is classified as positive only if all the classifiers classify it as positive
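The decision rule in the last bullet can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `polyhedral_predict`, the explicit offsets `b`, and the toy weights are all assumptions made for the example.

```python
import numpy as np

def polyhedral_predict(X, A, b):
    """Return +1 where every linear classifier votes positive, else -1.

    X: (n, d) samples; A: (K, d) weight vectors; b: (K,) offsets.
    All names are hypothetical and chosen for this sketch.
    """
    scores = X @ A.T + b  # (n, K): one score per sample and classifier
    # AND rule: a sample is positive only if all K scores are positive
    return np.where(np.all(scores > 0, axis=1), 1, -1)

# Toy 2-D polyhedron: x0 > 0 AND x1 > 0
A = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.zeros(2)
X = np.array([[1.0, 2.0], [1.0, -1.0], [-3.0, 0.5]])
labels = polyhedral_predict(X, A, b)
```

Only the first toy sample lies inside the intersection of the two half-spaces, so it is the only one labeled positive.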
Training Multiple Linear Classifiers
• Train each classifier independently: negative subclass k vs. positives, for k=1,…,K.
• Inefficient! A misclassified positive sample can be penalized by several classifiers at once, which is potentially excessive
Proposed Approach
• Optimize classifiers jointly
  • One classifier for each subclass of negative data
  • Objective function is penalized only once for a misclassified positive sample
• Yields a polyhedral decision surface
A Toy Example
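Hyperplane Classifiers with Hinge Loss
Hyperplane: $\alpha^T x = 0$
Hinge loss: $\xi_i = \max\{0,\ 1 - y_i \alpha^T x_i\}$
[Figure: hyperplane with the slack ξ marked for the points labeled TP+ and FP-]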
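The hinge loss on this slide can be computed directly. The sketch below assumes labels in {+1, -1} and no separate offset term (any offset is assumed to be absorbed into a constant feature); the function name and toy values are illustrative only.

```python
import numpy as np

def hinge_loss(alpha, X, y):
    """xi_i = max(0, 1 - y_i * alpha^T x_i) for each row x_i of X."""
    return np.maximum(0.0, 1.0 - y * (X @ alpha))

alpha = np.array([1.0, -1.0])
X = np.array([[3.0, 0.0],   # far on the positive side: no loss
              [0.5, 0.0],   # positive side but inside the margin
              [0.0, 2.0]])  # wrong side of the hyperplane
y = np.array([1.0, 1.0, 1.0])
xi = hinge_loss(alpha, X, y)
```

Note that the second sample is on the correct side of the hyperplane but still incurs a nonzero loss because it falls inside the margin.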
Polyhedral Classifier with AND Framework
If the hinge loss = 0, the example is correctly classified.
If the hinge loss > 0, the example is treated as misclassified (it violates the margin).
Let $\xi_{ik}$ be the hinge loss of the i-th example induced by classifier k.
i-th positive example: $\max(0, \xi_{i1}, \xi_{i2}, \ldots, \xi_{iK})$ ("AND")
i-th negative example: $\max(0, \xi_{ik})$, where k is the subclass of the example
Objective Function with the AND Framework
$$J(\alpha_1, \alpha_2, \ldots, \alpha_K) = \nu_1 \sum_{k=1}^{K} \sum_{i \in C_k^-} \max(0, \xi_{ik}) + \nu_2 \sum_{i \in C^+} \max(0, \xi_{i1}, \xi_{i2}, \ldots, \xi_{iK}) + \sum_{k=1}^{K} P(\alpha_k)$$
First term: error on negative examples
Second term: error on positive examples
Third term: regularization to control complexity
Convex Problem!
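To make the three terms concrete, here is a minimal sketch that evaluates the AND objective for given classifiers. It assumes an absolute-value (l1) penalty for P, labels +1 for positives and -1 for negatives, and no separate offset; the names `and_objective`, `nu1`, `nu2`, and `neg_subclass` are hypothetical.

```python
import numpy as np

def and_objective(A, X_pos, X_neg, neg_subclass, nu1=1.0, nu2=1.0):
    """Evaluate J for classifiers A (K, d); neg_subclass[i] is in 0..K-1."""
    # Positives (y = +1): every classifier must accept the sample, so the
    # worst hinge loss over the K classifiers is charged exactly once.
    xi_pos = np.maximum(0.0, 1.0 - X_pos @ A.T)   # (n_pos, K)
    pos_err = xi_pos.max(axis=1).sum()
    # Negatives (y = -1): only the classifier of the sample's own subclass
    # is charged.
    xi_neg = np.maximum(0.0, 1.0 + X_neg @ A.T)   # (n_neg, K)
    neg_err = xi_neg[np.arange(len(X_neg)), neg_subclass].sum()
    reg = np.abs(A).sum()                         # assumed l1 penalty
    return nu1 * neg_err + nu2 * pos_err + reg

A = np.array([[1.0, 0.0], [0.0, 1.0]])
X_pos = np.array([[2.0, 2.0], [2.0, 0.0]])
X_neg = np.array([[-2.0, 5.0], [5.0, 0.0]])
neg_subclass = np.array([0, 1])
J = and_objective(A, X_pos, X_neg, neg_subclass)
```

In this toy case one positive and one negative each contribute a loss of 1, and the penalty contributes 2.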
Incomplete Ground Truth for Subclasses
• The AND algorithm assumes the subclass membership is known for all samples. Not realistic!
• Annotate a small portion of the negatives
  • identify potential subclasses
  • pool training samples for each subgroup.
• Three different types of samples in the training data
  • Positives
  • Negatives with known subclass membership
  • Negatives with unknown subclass membership
Objective Function with the AND-OR Framework
$$J(\alpha_1, \alpha_2, \ldots, \alpha_K) = \nu_1 \sum_{k=1}^{K} \sum_{i \in C_k^-} \max(0, \xi_{ik}) + \nu_1 \sum_{i \in \hat{C}^-} \prod_{k=1}^{K} \max(0, \xi_{ik}) + \nu_2 \sum_{i \in C^+} \max(0, \xi_{i1}, \xi_{i2}, \ldots, \xi_{iK}) + \sum_{k=1}^{K} P(\alpha_k)$$
First term: error on negative examples with known subclasses
Second term: error on negative examples with unknown subclasses (OR operation)
Third term: error on positive examples (AND operation)
Fourth term: regularization to control complexity
Not Convex!
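The OR term for negatives with unknown subclass can be illustrated on its own. The sketch assumes the term is a product of per-classifier hinge losses, so it vanishes as soon as any one classifier rejects the sample with margin; the function name `or_error` and the toy data are assumptions.

```python
import numpy as np

def or_error(A, X_unknown):
    """Per-sample OR loss: product over k of max(0, 1 + alpha_k^T x_i)."""
    xi = np.maximum(0.0, 1.0 + X_unknown @ A.T)  # hinge losses for y = -1
    return xi.prod(axis=1)

A = np.array([[1.0, 0.0], [0.0, 1.0]])
X_unknown = np.array([[-2.0, 5.0],   # rejected by classifier 0: loss 0
                      [3.0, 4.0]])   # accepted by both: loss 4 * 5
losses = or_error(A, X_unknown)
```

The product couples the classifiers, which is what makes the joint problem non-convex; with all classifiers but one held fixed, the term reduces to a weighted hinge loss in the remaining classifier.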
Alternating Optimization Iterative Algorithm
Each iteration contains K steps, and each step optimizes a single classifier.
At the k-th step:
• Fix all classifiers (α's) except classifier k
• Minimize J(α1, …, αk, …, αK) with respect to αk
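The loop structure above can be sketched as follows. The slides do not say how each single-classifier subproblem is solved, so plain subgradient descent is swapped in here purely as a stand-in; a step is kept only when it lowers J. The l1 penalty, the constant bias feature, the step size, and all function and variable names are assumptions for this example.

```python
import numpy as np

def hinge(A, X, y):
    """xi[i, k] = max(0, 1 - y * alpha_k^T x_i) for every sample and classifier."""
    return np.maximum(0.0, 1.0 - y * (X @ A.T))

def objective(A, Xp, Xk, sub, Xu, nu1, nu2):
    known = hinge(A, Xk, -1.0)[np.arange(len(Xk)), sub].sum()
    unknown = hinge(A, Xu, -1.0).prod(axis=1).sum()   # OR: product over k
    pos = hinge(A, Xp, 1.0).max(axis=1).sum()         # AND: max over k
    return nu1 * (known + unknown) + nu2 * pos + np.abs(A).sum()

def subgradient_k(A, k, Xp, Xk, sub, Xu, nu1, nu2):
    """Subgradient of J with respect to alpha_k, all other rows held fixed."""
    g = np.sign(A[k])                                 # assumed l1 penalty
    mine = Xk[sub == k]                               # known negatives of subclass k
    g = g + nu1 * mine[1.0 + mine @ A[k] > 0].sum(axis=0)
    xi_u = hinge(A, Xu, -1.0)
    w = np.delete(xi_u, k, axis=1).prod(axis=1)       # fixed product over j != k
    active = xi_u[:, k] > 0
    g = g + nu1 * (w[active][:, None] * Xu[active]).sum(axis=0)
    xi_p = hinge(A, Xp, 1.0)
    worst = (xi_p[:, k] > 0) & (xi_p[:, k] >= xi_p.max(axis=1))
    return g - nu2 * Xp[worst].sum(axis=0)

def alternating_optimization(A0, Xp, Xk, sub, Xu, nu1=1.0, nu2=1.0,
                             iters=10, steps=100, lr=0.01):
    A = A0.copy()
    best = objective(A, Xp, Xk, sub, Xu, nu1, nu2)
    for _ in range(iters):
        for k in range(A.shape[0]):                   # K steps, one classifier each
            trial = A.copy()
            for _ in range(steps):
                trial[k] -= lr * subgradient_k(trial, k, Xp, Xk, sub, Xu, nu1, nu2)
                val = objective(trial, Xp, Xk, sub, Xu, nu1, nu2)
                if val < best:                        # keep only improving steps
                    A[k], best = trial[k].copy(), val
    return A, best

def with_bias(X):
    return np.hstack([X, np.ones((len(X), 1))])       # constant feature as offset

rng = np.random.default_rng(0)
Xp = with_bias(rng.normal([0, 0], 0.3, (20, 2)))      # positives near the origin
Xk = with_bias(np.vstack([rng.normal([3, 0], 0.3, (10, 2)),
                          rng.normal([0, 3], 0.3, (10, 2))]))
sub = np.repeat([0, 1], 10)                           # known subclass labels
Xu = with_bias(rng.normal([3, 3], 0.3, (10, 2)))      # negatives, subclass unknown
A0 = np.zeros((2, 3))
J0 = objective(A0, Xp, Xk, sub, Xu, 1.0, 1.0)
A, J = alternating_optimization(A0, Xp, Xk, sub, Xu)
```

Because only improving steps are accepted, J never increases from one step to the next, mirroring the behavior one would expect if each subproblem were solved exactly.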
Cascaded Design
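[Figure: cascade of classifiers 1, 2, …, K. Candidates enter as T1; classifier k receives Tk and rejects Fk; the output of classifier K is TP, and F1, F2, …, FK are the rejected candidates.]
Training Sets: T1, T2, …, TK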
Cascade Design with Sparse Linear Classifiers
• Setting $P(\alpha_k) = |\alpha_k|$ yields K sparse classifiers, each with a varying number of non-zero coefficients
• Run-time order does not change the outcome
• Start with the classifier that has the fewest non-zero coefficients
• Classify the sample: if negative, reject it; if positive, pass it to the next classifier that requires computing the fewest additional features. Continue until all K classifiers are run
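The ordering heuristic described above can be sketched as a greedy loop over the classifiers' supports. This is an illustration under assumptions: ties are broken by classifier index, offsets are assumed to be absorbed into the features, and the names `cascade_order` and `cascade_predict` are hypothetical.

```python
import numpy as np

def cascade_order(A):
    """Greedy run-time order: always pick the classifier that needs the
    fewest features not yet computed (ties broken by index)."""
    support = [set(np.flatnonzero(a)) for a in A]
    remaining = set(range(len(A)))
    computed, order = set(), []
    while remaining:
        k = min(remaining, key=lambda j: (len(support[j] - computed), j))
        order.append(k)
        computed |= support[k]
        remaining.remove(k)
    return order

def cascade_predict(x, A, order):
    """Reject at the first classifier that votes negative, else accept."""
    for k in order:
        if A[k] @ x <= 0:
            return -1
    return 1

A = np.array([[1.0, 1.0, 1.0, 0.0],   # needs features 0, 1, 2
              [0.0, 0.0, 1.0, 0.0],   # needs feature 2 only
              [0.0, 1.0, 1.0, 0.0]])  # needs features 1, 2
order = cascade_order(A)
accepted = cascade_predict(np.array([1.0, 1.0, 1.0, 0.0]), A, order)
rejected = cascade_predict(np.array([5.0, 5.0, -1.0, 0.0]), A, order)
same_outcome = cascade_predict(np.array([5.0, 5.0, -1.0, 0.0]), A, [0, 1, 2])
```

The second sample is rejected by the cheapest classifier before features 0 and 1 are ever needed, and running the classifiers in a different order gives the same label, only at a higher feature cost.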
Experiments – Automatic Polyp Detection
Data      Volumes  Polyps  Negative candidates
Training  316      226     1,249
Test      385      245     1,920
98 numerical image features are computed; out of the 1,249 training negatives, 177 are annotated and 9 subclasses are identified
ROC plots
Run-time Performance
Classifier   Sens (at 3 fp/vol)  Time (t)
Polyhedral   84                  452
SVDD         80                  595
RBF-SVM      60                  595
Linear-SVM   45                  437
25% gain in execution time over SVDD and RBF-SVM
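A quick arithmetic check of the stated gain, assuming it is measured relative to the time reported for SVDD and RBF-SVM (595) against the Polyhedral time (452):

```latex
\[
\frac{595 - 452}{595} = \frac{143}{595} \approx 0.24
\]
```

That is about 24%, consistent with the rounded 25% figure.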
Conclusions
• Polyhedral classifier for multi-mode data
  • AND framework when subclass information is fully available
  • AND-OR framework when subclass information is partially available
  • Cascade design as a by-product to speed up online execution
Thank you! Questions and Comments