Feature-Based Classification & Principal Component Analysis


Another Approach to Feature-Based Classification
• Offline:
– Collect examples of each class
– Determine features of each example
– Store the resulting features as points in “feature space”
• Online:
– Get a new instance
– Determine its features
– Choose the class that’s closest in “feature space”
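As a rough illustration of this offline/online split (not from the slides), here is a minimal Python sketch that stores labeled feature vectors and classifies a new instance by Euclidean distance. The feature vectors anticipate the letter example on the following slides; the function names and the query point are made up for illustration.

```python
import math

# Offline: store labeled feature vectors as points in "feature space".
# Feature order: (holes, ends, straight, curve), as in the letter example below.
training_examples = [
    ((1, 2, 3, 0), "A"),
    ((2, 0, 1, 2), "B"),
    ((0, 2, 0, 1), "C"),
]

def euclidean(p, q):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def classify(new_features):
    """Online: determine features, then choose the class of the closest stored point."""
    _, best_label = min(
        training_examples,
        key=lambda example: euclidean(example[0], new_features),
    )
    return best_label

print(classify((1, 2, 2, 1)))   # closest stored point is "A"
```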

Advantages / Disadvantages
+ Simple to compute
+ Avoids partitioning feature space (classes can overlap)
– Leaves the possibility of large “unmapped” areas in feature space
– Heavily dependent on a good sample set
– Highly dependent on a good feature set

Example: Letter Classification
• Classifying capital letters with features:
– Holes: A = 1, B = 2, C = 0
– Ends (does not count angles like the top of A): A = 2, B = 0, C = 2, F = 3
– Straight: A = 3, B = 1, C = 0, D = 1
– Curve: A = 0, B = 2, C = 1, D = 1

Feature Classification Example

Classifying a New Letter
• New letter is ø
– Holes =
– Ends =
– Straight =
– Curve =
• Distance to “A” (1, 2, 3, 0) is:
• Distance to “D” (1, 0, 1, 1) is:
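The new letter’s feature values are left blank on the slide as an exercise. Purely to show the arithmetic, the sketch below plugs in a hypothetical feature vector (1, 0, 2, 1) and computes the two requested Euclidean distances to the “A” and “D” prototypes.

```python
import math

A = (1, 2, 3, 0)   # (holes, ends, straight, curve) for "A", from the slide
D = (1, 0, 1, 1)   # the same features for "D"

# Hypothetical features for the new letter -- the slide leaves these blank.
new_letter = (1, 0, 2, 1)

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

print("distance to A:", euclidean(new_letter, A))   # sqrt(0 + 4 + 1 + 1) ≈ 2.45
print("distance to D:", euclidean(new_letter, D))   # sqrt(0 + 0 + 1 + 0) = 1.0
```

Under that assumption the new letter lies closer to “D” and would be classified accordingly.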

Continuing the Example

Evaluating the Features
• Does a slight modification of a letter still classify to the same letter? (Generalization)
• Are all different letters distinguishable? (Representation)
• Are all features independent and useful?
• How can we modify this feature set to improve the representation?

Multiple Examples per Class
• Improves robustness (why?)
• Increases space / time requirements (why?)
• How can we gain benefits without too much cost?
– K nearest neighbors
– Clustering
– Partitioning the space (as we saw before)
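One common way to use several stored examples per class is k-nearest neighbors: find the k closest stored points and take a majority vote over their labels. A minimal sketch, assuming Euclidean distance and small in-memory data; the example points are illustrative.

```python
import math
from collections import Counter

def knn_classify(examples, new_features, k=3):
    """examples: list of (feature_vector, label) pairs collected offline."""
    # Sort stored examples by distance to the new instance, keep the k closest.
    neighbors = sorted(
        examples,
        key=lambda ex: math.dist(ex[0], new_features),
    )[:k]
    # Majority vote among the k closest labels.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

examples = [
    ((1, 2, 3, 0), "A"), ((1, 2, 3, 1), "A"), ((1, 3, 3, 0), "A"),
    ((1, 0, 1, 1), "D"), ((1, 0, 2, 1), "D"), ((1, 1, 1, 1), "D"),
]
print(knn_classify(examples, (1, 1, 3, 0), k=3))   # -> "A"
```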

Recognizing Sign Language Letters
• [Images of hand signs for “A” and “E”]; also “I”, “O”, and “U”

Input
• 30 x 32 image of a hand signing the letter (grayscale)
• 960 pixels, values 0-255
• We have a data set of 30 images per letter for the 5 vowels

Features?

Most Easily Available Features
• Pixels from the image
– 960 features per image!
• These are very easy to compute, but we hope we don’t really need all of them
• Maybe linear combinations (e.g. total of upper half) would be useful
• How can we find out which linear combinations of pixels are the most useful features?
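A short sketch of these two ideas on placeholder data: the raw 960 pixel values as a feature vector, plus one hand-crafted linear combination (total intensity of the upper half of the 30 x 32 image). The image here is random; only the shapes match the input described above.

```python
import numpy as np

# Placeholder for one 30 x 32 grayscale hand image (values 0-255).
image = np.random.randint(0, 256, size=(30, 32))

# Simplest features: the 960 raw pixel values, flattened into one vector.
pixel_features = image.flatten()          # shape (960,)

# A hand-crafted linear combination: total intensity of the upper half.
upper_half_total = image[:15, :].sum()

print(pixel_features.shape, upper_half_total)
```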

How can we decide how many (few) features we need?
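Principal Component Analysis (the topic of the readings below) addresses both questions: its principal components are the linear combinations of pixels that capture the most variance in the data set, and the cumulative fraction of variance explained suggests how many components are worth keeping. A minimal NumPy sketch on random placeholder data, assuming the 150 flattened images (30 per vowel, 960 pixels each) described earlier; the 95% threshold is an arbitrary choice for illustration.

```python
import numpy as np

# Placeholder data set: 150 images (30 per vowel) x 960 pixel features.
X = np.random.rand(150, 960)

# 1. Center the data (subtract the mean image).
X_centered = X - X.mean(axis=0)

# 2. Eigen-decompose the covariance matrix of the features.
cov = np.cov(X_centered, rowvar=False)            # 960 x 960
eigenvalues, eigenvectors = np.linalg.eigh(cov)   # ascending order
eigenvalues = eigenvalues[::-1]                   # largest variance first
eigenvectors = eigenvectors[:, ::-1]

# 3. Decide how many components we need: keep enough to explain,
#    say, 95% of the total variance.
explained = np.cumsum(eigenvalues) / eigenvalues.sum()
k = int(np.searchsorted(explained, 0.95)) + 1
print("components needed for 95% of variance:", k)

# 4. Each kept eigenvector is a linear combination of the 960 pixels;
#    project the images onto them to get the reduced feature vectors.
features = X_centered @ eigenvectors[:, :k]       # shape (150, k)
```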

To Be Continued…
• Smith, L.I., A Tutorial on Principal Component Analysis. February 2002. http://www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf

• Shlens, J., A Tutorial on Principal Component Analysis. December 2005. http://www.snl.salk.edu/~shlens/pub/notes/pca.pdf

• Kirby, M., and L. Sirovich. "Application of the Karhunen-Loève Procedure for the Characterization of Human Faces." IEEE Trans. Patt. Anal. Mach. Intell. 12.1 (1990): 103-108.