Transcript Slide 1
Kernels in Pattern Recognition
A Langur – Baboon Binary Problem
• http://www.tribuneindia.com/2006/20060712/himplus4.jpg
• … HA HA HA …
• http://www.sickworld.net/db4/00381/sickworld.net/_uimages/baboons.jpg
Representation of Binary Data
Concept of Kernels
• Idea proposed by Aizerman in 1964.
• Transform the features into a space of higher dimensionality
such that
• the dot product exists (i.e. is not infinite) in the higher dimension, and
• the data is linearly separable.
Dot Product
• The scalar value signifies the amount of projection of a in the direction of b.
• The scalar value also signifies the degree of similarity between a and b.
• Adopted from http://www.netcomuk.co.uk/~jenolive/vect6.html
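The two readings of the dot product on this slide — projection and similarity — can be checked numerically. The vectors below are hypothetical values chosen for illustration, not from the slides.

```python
import numpy as np

# Two example vectors (hypothetical values, for illustration only).
a = np.array([3.0, 4.0])
b = np.array([1.0, 0.0])

# The dot product is a single scalar.
dot = np.dot(a, b)                       # 3*1 + 4*0 = 3.0

# Projection of a in the direction of b: dot(a, b) / ||b||.
projection = dot / np.linalg.norm(b)     # 3.0

# Similarity: the cosine of the angle between a and b.
cosine = dot / (np.linalg.norm(a) * np.linalg.norm(b))  # 3/5 = 0.6
print(dot, projection, cosine)
```

When the vectors point the same way the cosine approaches 1; when they are perpendicular the dot product (and hence the similarity) is 0.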
A Geometrical Interpretation: Mapping
• Mapping data from a low dimension to a high dimension.
• Data is linearly separable in the higher dimension.
• The separating hyperplane is defined by a normal or weight vector.
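A minimal sketch of this mapping idea, with a hypothetical 1-D dataset and the feature map phi(x) = (x, x²) chosen for illustration: the classes are interleaved on the line, but become separable by a horizontal line in the mapped 2-D space.

```python
import numpy as np

# 1-D data: the two classes are interleaved on the line,
# so no single threshold on x separates them.
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([ 1,   -1,  -1,   1 ])   # +1 = outer points, -1 = inner points

# Hypothetical mapping to 2-D: phi(x) = (x, x^2).
phi = np.column_stack([x, x ** 2])

# In the mapped space the classes ARE linearly separable:
# the horizontal line x2 = 2.5 puts every +1 above it and every -1 below it.
separable = bool(np.all((phi[:, 1] > 2.5) == (y == 1)))
print(separable)  # True
```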
Cross Product
• Normal vector, i.e. perpendicular to the hyperplane, or weight vector.
• http://www.netcomuk.co.uk/~jenolive/vect8.html
• The area swept while moving from a to b in the counterclockwise direction moves the vector upwards … like the tightening of a screw.
• This vector is perpendicular to the plane in which a and b lie.
Importance of dot product & kernel == dot product
• Classification requires computing the dot product between the normal of the hyperplane and the test point.
• Often, the normal is expressed as a linear combination of points in the higher dimension.
• The dot product signifies on which side of the hyperplane the test point lies – the act of classification.
• Dot product computation is expensive and the transformation is not easy to find, so propose a kernel function whose scalar value is equivalent to the dot product in the higher-dimensional space.
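The "kernel == dot product" claim can be demonstrated with the classic degree-2 polynomial kernel. The feature map phi below is one standard choice (not from the slides): for it, the dot product after mapping equals (x · z)² computed entirely in the low dimension.

```python
import numpy as np

def phi(v):
    """Explicit degree-2 feature map for a 2-D point (one common choice)."""
    x1, x2 = v
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

def poly_kernel(u, v):
    """Degree-2 polynomial kernel: a scalar computed in the LOW dimension."""
    return np.dot(u, v) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])

lhs = np.dot(phi(x), phi(z))   # dot product after explicit mapping
rhs = poly_kernel(x, z)        # kernel evaluated directly: (x . z)^2

print(lhs, rhs)   # both 121.0
```

The kernel never constructs the higher-dimensional vectors, which is exactly why it sidesteps the expensive transformation the slide mentions.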
Geometrical Interpretation of Importance of dot product & kernel == dot product
How does a kernel look like?
A Planar View from the Top
How does a kernel look like?
An Isometric View from Different Side Angles
The End
Vapnik proposes Support Vector Machines
An Apple – Orange Binary Problem
• http://en.wikipedia.org/wiki/Image:Apples.jpg
• http://en.wikipedia.org/wiki/Image:Ambersweet_oranges.jpg
Representation of Binary Data
Separable Case
The Lagrangian
• Optimize • Subject to • Differentiate w.r.t.
• w the weight vector • b the constant • alpha the Lagrange multiplier
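The equations on this slide did not survive transcription. As a reconstruction (standard hard-margin SVM notation, not necessarily the slide's exact symbols), the optimization, constraint, and derivatives being referred to are presumably:

```latex
% Hard-margin primal (reconstruction):
\min_{\mathbf{w},\,b}\ \tfrac{1}{2}\|\mathbf{w}\|^{2}
\quad\text{subject to}\quad
y_i(\mathbf{w}^{\top}\mathbf{x}_i + b) \ge 1,\qquad i = 1,\dots,n

% Lagrangian, with multipliers \alpha_i \ge 0:
L(\mathbf{w}, b, \boldsymbol{\alpha})
  = \tfrac{1}{2}\|\mathbf{w}\|^{2}
  - \sum_{i=1}^{n} \alpha_i \bigl[ y_i(\mathbf{w}^{\top}\mathbf{x}_i + b) - 1 \bigr]

% Setting the derivatives to zero:
\frac{\partial L}{\partial \mathbf{w}} = 0
  \;\Rightarrow\; \mathbf{w} = \sum_{i=1}^{n} \alpha_i y_i \mathbf{x}_i,
\qquad
\frac{\partial L}{\partial b} = 0
  \;\Rightarrow\; \sum_{i=1}^{n} \alpha_i y_i = 0
```

The first condition is the "normal expressed as a linear combination of points" mentioned earlier in the deck.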
Non-Separable Case
• Optimize • Subject to
The Lagrangian
• Differentiate w.r.t.
• w the weight vector • b the constant • alpha the Lagrange multiplier • xi the slack variable
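Again the equations were lost in transcription; a reconstruction of the standard soft-margin formulation (the slide's exact notation may differ) is:

```latex
% Soft-margin primal with slack variables \xi_i and penalty C (reconstruction):
\min_{\mathbf{w},\,b,\,\boldsymbol{\xi}}\
  \tfrac{1}{2}\|\mathbf{w}\|^{2} + C \sum_{i=1}^{n} \xi_i
\quad\text{subject to}\quad
y_i(\mathbf{w}^{\top}\mathbf{x}_i + b) \ge 1 - \xi_i,\;\; \xi_i \ge 0

% Lagrangian, with multipliers \alpha_i \ge 0 and \mu_i \ge 0:
L = \tfrac{1}{2}\|\mathbf{w}\|^{2} + C \sum_i \xi_i
  - \sum_i \alpha_i \bigl[ y_i(\mathbf{w}^{\top}\mathbf{x}_i + b) - 1 + \xi_i \bigr]
  - \sum_i \mu_i \xi_i

% The w and b conditions match the separable case; the new condition is:
\frac{\partial L}{\partial \xi_i} = 0
  \;\Rightarrow\; C - \alpha_i - \mu_i = 0
  \;\Rightarrow\; 0 \le \alpha_i \le C
```

The only change from the separable case is that the multipliers are now boxed between 0 and C.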
Finally … after some mental mathematical harassment we get: • Optimized values of the weight vector w and the constant b.
• And Then • Use it to classify new test examples …
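A minimal sketch of that final classification step, using the dual form sign(Σᵢ αᵢ yᵢ K(xᵢ, x) + b). The support vectors, multipliers, and bias below are hypothetical toy values, not a trained model.

```python
import numpy as np

def svm_predict(x_test, support_vectors, alphas, labels, b, kernel):
    """Classify a test point via the SVM dual form:
    sign( sum_i alpha_i * y_i * K(x_i, x_test) + b )."""
    s = sum(a * y * kernel(sv, x_test)
            for a, y, sv in zip(alphas, labels, support_vectors))
    return 1 if s + b >= 0 else -1

linear = lambda u, v: float(np.dot(u, v))

# Toy "trained" model (hypothetical): two support vectors
# straddling the separating line x1 = 0.
svs    = [np.array([1.0, 0.0]), np.array([-1.0, 0.0])]
alphas = [0.5, 0.5]
labels = [1, -1]
b      = 0.0

p_plus  = svm_predict(np.array([2.0, 3.0]),  svs, alphas, labels, b, linear)
p_minus = svm_predict(np.array([-2.0, 1.0]), svs, alphas, labels, b, linear)
print(p_plus, p_minus)   # 1 -1
```

Swapping `linear` for any kernel function (e.g. polynomial or RBF) classifies in the corresponding higher-dimensional space without ever computing the mapping explicitly.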
In The End
If SVMs can’t help classify… then DITCH them and classify apples and oranges by eating them yourself ...