
Kernels in Pattern Recognition

A Langur - Baboon Binary Problem

• http://www.tribuneindia.com/2006/20060712/himplus4.jpg

• … HA HA HA …

• http://www.sickworld.net/db4/00381/sickworld.net/_uimages/baboons.jpg

Representation of Binary Data

Concept of Kernels

• Idea proposed by Aizerman in 1964.

• Feature … space … dimensionality … transformation such that

• The dot product exists (i.e. is not infinite) in the higher dimension, and

• The data is linearly separable.
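A minimal sketch of the idea, assuming a hypothetical 1-D toy data set and an assumed feature map x -> (x, x^2), neither of which is from the slides. No single threshold on x separates the two classes, but after the mapping a straight line does.

```python
# Hypothetical 1-D toy data: class +1 lies far from the origin, class -1 near it.
points = [-3.0, -2.0, -0.5, 0.0, 0.5, 2.0, 3.0]
labels = [+1, +1, -1, -1, -1, +1, +1]


def phi(x):
    """Assumed feature map from 1-D to 2-D: x -> (x, x^2)."""
    return (x, x * x)


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


# In the 2-D feature space the line x^2 = 2 separates the classes.
# Written as a hyperplane: w = (0, 1), b = -2 (values chosen by hand).
w, b = (0.0, 1.0), -2.0


def classify(x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if dot(w, phi(x)) + b > 0 else -1


predictions = [classify(x) for x in points]
print(predictions == labels)  # True
```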

Dot Product

• The scalar value signifies the amount of projection of a in the direction of b.

• The scalar value also signifies the degree of similarity between a and b.

• Adopted from http://www.netcomuk.co.uk/~jenolive/vect6.html
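Both readings of the dot product can be checked numerically. The sketch below uses made-up vectors a and b; the helper names are illustrative only. Note that the raw dot product equals the projection only when b has unit length, and it equals the cosine similarity only when both vectors have unit length.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))


def scalar_projection(a, b):
    """Length of the component of a in the direction of b."""
    return dot(a, b) / norm(b)


def cosine_similarity(a, b):
    """Dot product normalised by both lengths; 1 means same direction."""
    return dot(a, b) / (norm(a) * norm(b))


a = (3.0, 4.0)
b = (1.0, 0.0)  # unit vector along the first axis

print(dot(a, b))                # 3.0
print(scalar_projection(a, b))  # 3.0
print(cosine_similarity(a, b))  # 0.6
```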

A Geometrical Interpretation: Mapping

• Mapping data from a low dimension to a high dimension.

• Data is linearly separable in the higher dimension.

• Separating hyperplane defined by a normal or weight vector.

Cross Product

• Normal vector, i.e. perpendicular to the hyperplane, or weight vector.

• The area swept while moving a to b in the counterclockwise direction moves the vector upwards ... like turning a screw (the right-hand rule).

• This vector is perpendicular to the plane in which a and b lie.

• http://www.netcomuk.co.uk/~jenolive/vect8.html
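A small sketch of the perpendicularity claim, using two made-up 3-D vectors. The cross product of a and b should have a zero dot product with both, and swapping the order should flip its direction.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def cross(a, b):
    """Cross product of two 3-D vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


a = (1.0, 0.0, 0.0)
b = (0.0, 1.0, 0.0)

n = cross(a, b)        # points "upwards", along the third axis
n_flipped = cross(b, a)  # opposite order gives the opposite direction

print(n)
print(dot(n, a), dot(n, b))  # both zero: n is normal to the plane of a and b
```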

Importance of dot product & kernel == dot product

• Classification requires computing the dot product between the normal of the hyperplane and the test point.

• Often, the normal is expressed as a linear combination of points in the higher dimension.

• The sign of the dot product signifies on which side of the hyperplane the test point lies: the act of classification.

• Computing the dot product is expensive and the transformation is not easy to find, so we propose a kernel function whose scalar value is equivalent to the dot product in the higher-dimensional space.
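As an illustration of "kernel == dot product", the sketch below assumes a degree-2 polynomial kernel K(x, z) = (x . z)^2 on 2-D inputs, whose explicit feature map is known. The slides do not say which kernel they use, and the support points, coefficients, and offset here are made-up values. It also shows the normal written as a linear combination of mapped points, so the decision value can be computed from kernel evaluations alone.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def phi(x):
    """Explicit feature map for the assumed kernel: 2-D -> 3-D."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def kernel(x, z):
    """Homogeneous polynomial kernel of degree 2, computed in 2-D."""
    return dot(x, z) ** 2


x = (1.0, 2.0)
z = (3.0, -1.0)
explicit = dot(phi(x), phi(z))  # dot product in the higher dimension
implicit = kernel(x, z)         # same value without ever building phi

# Hypothetical (point, label, coefficient) triples and offset.
support = [((1.0, 2.0), +1, 0.5), ((3.0, -1.0), -1, 0.25)]
b = 0.1


def decision_explicit(t):
    """Build the normal w in the higher dimension, then take w . phi(t) + b."""
    w = [0.0, 0.0, 0.0]
    for xi, yi, ai in support:
        for k, component in enumerate(phi(xi)):
            w[k] += ai * yi * component
    return dot(w, phi(t)) + b


def decision_kernel(t):
    """Same decision value using only kernel evaluations."""
    return sum(ai * yi * kernel(xi, t) for xi, yi, ai in support) + b


test_point = (0.5, 1.5)
print(math.isclose(explicit, implicit))
print(math.isclose(decision_explicit(test_point), decision_kernel(test_point)))
```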

Geometrical Interpretation of Importance of dot product & kernel == dot product

What does a kernel look like?

A Plan View from the Top

What does a kernel look like?

An Isometric View from Different Side Angles

The End

Vapnik proposes Support Vector Machines

An Apple – Orange Binary Problem

• http://en.wikipedia.org/wiki/Image:Apples.jpg

• http://en.wikipedia.org/wiki/Image:Ambersweet_oranges.jpg

Representation of Binary Data

Separable Case

The Lagrangian

• Optimize

• Subject to

• Differentiate w.r.t.

• w: weight vector

• b: the constant

• alpha: Lagrangian parameter
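The formulas from this slide are not in the transcript. For orientation, the standard textbook hard-margin formulation is sketched below; it is assumed, not confirmed, to match what the slide showed. The symbols x_i, y_i, and n (training points, labels in {+1, -1}, and their count) are introduced here for illustration.

```latex
% Optimize, subject to
\min_{w,\,b}\ \tfrac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \,(w \cdot x_i + b) \ge 1, \qquad i = 1, \dots, n

% The Lagrangian, with multipliers \alpha_i \ge 0
L(w, b, \alpha) = \tfrac{1}{2}\lVert w \rVert^{2}
  - \sum_{i=1}^{n} \alpha_i \bigl[\, y_i (w \cdot x_i + b) - 1 \,\bigr]

% Setting the derivatives w.r.t. w and b to zero
\frac{\partial L}{\partial w} = 0 \ \Rightarrow\ w = \sum_{i=1}^{n} \alpha_i y_i x_i,
\qquad
\frac{\partial L}{\partial b} = 0 \ \Rightarrow\ \sum_{i=1}^{n} \alpha_i y_i = 0
```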

Non-Separable Case

• Optimize

• Subject to

The Lagrangian

• Differentiate w.r.t.

• w: weight vector

• b: the constant

• alpha: Lagrangian parameter

• xi: slack variable
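As with the separable case, the slide's formulas are missing from the transcript. The standard textbook soft-margin formulation is sketched below as an assumption about what was shown. The penalty constant C and the second set of multipliers mu_i are not named in the transcript and are introduced here for illustration.

```latex
% Optimize, subject to
\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
y_i \,(w \cdot x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0

% The Lagrangian, with multipliers \alpha_i \ge 0 and \mu_i \ge 0
L(w, b, \xi, \alpha, \mu) = \tfrac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
  - \sum_{i=1}^{n} \alpha_i \bigl[\, y_i (w \cdot x_i + b) - 1 + \xi_i \,\bigr]
  - \sum_{i=1}^{n} \mu_i \xi_i

% Setting the derivatives w.r.t. w, b and \xi_i to zero
w = \sum_{i=1}^{n} \alpha_i y_i x_i,
\qquad
\sum_{i=1}^{n} \alpha_i y_i = 0,
\qquad
C - \alpha_i - \mu_i = 0 \ \Rightarrow\ 0 \le \alpha_i \le C
```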

Finally … after some mental mathematical harassment we get:

• Optimized values of the weight vector and b.

• And then use them to classify new test examples …
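A toy sketch of that last step. With only one training point per class, the maximum-margin w and b have a closed form, so no optimizer is needed; this shortcut is used here purely for illustration and is not the general method. The two-number feature vectors for the apple and the orange are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def two_point_svm(x_pos, x_neg):
    """Maximum-margin hyperplane for exactly one point per class.

    The normal points from the negative to the positive example and is
    scaled so that both points sit exactly on the margins (+1 and -1).
    """
    d = tuple(p - n for p, n in zip(x_pos, x_neg))
    scale = 2.0 / dot(d, d)
    w = tuple(scale * di for di in d)
    midpoint = tuple((p + n) / 2.0 for p, n in zip(x_pos, x_neg))
    b = -dot(w, midpoint)
    return w, b


def classify(w, b, x):
    """Label a test point by the sign of w . x + b."""
    return "apple" if dot(w, x) + b >= 0 else "orange"


apple = (2.0, 2.0)   # hypothetical features of the +1 example
orange = (0.0, 0.0)  # hypothetical features of the -1 example

w, b = two_point_svm(apple, orange)
print(w, b)                          # (0.5, 0.5) -1.0
print(classify(w, b, (1.8, 1.5)))    # apple
print(classify(w, b, (0.2, 0.4)))    # orange
```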

In The End

If SVMs can’t help classify … then DITCH them and classify apples and oranges by eating them yourself ...