
Spectral Clustering

Jianping Fan

Dept. of Computer Science, UNC Charlotte

Lecture Outline

 Motivation
 Graph overview and construction
 Spectral Clustering
 Cool implementations

Semantic interpretations of clusters

Spectral Clustering Example – 2 Spirals

[Figure: two interleaved spirals, shown in the original 2-D space and in the embedded space spanned by the two leading eigenvectors.]

The dataset exhibits complex cluster shapes:

 K-means performs very poorly in this space due to its bias toward dense, spherical clusters.
 In the embedded space given by the two leading eigenvectors, the clusters are trivial to separate.

Spectral Clustering Example

[Figure: original points vs. K-means (2 clusters) on two non-convex datasets.]

Why does k-means fail for these two examples?

Lecture Outline

 Motivation
 Graph overview and construction
 Spectral Clustering
 Cool implementations

Graph-based Representation of Data Similarity

Graph-based Representation of Data Relationships

Manifold

Data Graph Construction
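The slides build a graph whose nodes are the data points and whose edge weights encode pairwise similarity. As a concrete illustration, here is a minimal Python sketch of one common construction, a Gaussian-weighted k-nearest-neighbor graph; the function name and the default values of k and sigma are illustrative assumptions, not values from the lecture.

```python
import numpy as np
from scipy.spatial.distance import cdist

def knn_similarity_graph(points, k=10, sigma=1.0):
    """Illustrative sketch: symmetric k-NN graph with Gaussian edge weights.

    points : (n, d) array of data points
    Returns an (n, n) weighted adjacency (affinity) matrix W.
    """
    dists = cdist(points, points)  # pairwise Euclidean distances
    n = len(points)
    W = np.zeros((n, n))
    for i in range(n):
        # k nearest neighbors of point i, excluding the point itself
        nbrs = np.argsort(dists[i])[1:k + 1]
        W[i, nbrs] = np.exp(-dists[i, nbrs] ** 2 / (2 * sigma ** 2))
    return np.maximum(W, W.T)  # symmetrize: keep an edge if either end chose it
```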

Graph-based Representation of Data Relationships

Graph Cut


Lecture Outline

 Motivation
 Graph overview and construction
 Spectral Clustering
 Cool implementations

Graph-based Representation of Data Relationships

Graph Cut

Eigenvectors & Eigenvalues

Normalized Cut

A graph G(V, E) can be partitioned into two disjoint sets A and B. The cut is defined as:

$$\mathrm{cut}(A, B) = \sum_{u \in A,\; v \in B} w(u, v)$$

The optimal partition of the graph G is achieved by minimizing the cut:

$$\min_{A, B} \; \mathrm{cut}(A, B)$$
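Given an affinity matrix W and a bipartition of the vertices, the cut value is simply the total weight of the crossing edges. A minimal sketch; the helper name and the boolean-mask encoding of A are assumptions for illustration.

```python
def cut_value(W, in_A):
    """cut(A, B): total edge weight crossing the bipartition.

    W    : (n, n) symmetric NumPy affinity matrix
    in_A : boolean NumPy array, True for vertices in A
    """
    # rows restricted to A, columns restricted to B = complement of A
    return W[in_A][:, ~in_A].sum()
```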

Normalized Cut

The normalized cut weighs the cut against the association between each partition set and the whole graph:

$$\mathrm{Ncut}(A, B) = \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(A, V)} + \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(B, V)}, \qquad \mathrm{assoc}(A, V) = \sum_{u \in A,\; t \in V} w(u, t)$$


With an indicator vector y over the vertices, the normalized cut can be solved by the generalized eigenvalue equation:

$$(D - W)\, y = \lambda D y$$

where W is the affinity matrix and D is the diagonal degree matrix.
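In code, this relaxation is a symmetric generalized eigenproblem; the second-smallest eigenvector (the Fiedler vector) is thresholded to recover a bipartition. A hedged sketch using SciPy; the sign-based split at zero is one simple choice, and the function name is assumed.

```python
import numpy as np
from scipy.linalg import eigh

def ncut_bipartition(W):
    """Relaxed normalized cut: solve (D - W) y = lambda * D y and split on
    the sign of the second-smallest generalized eigenvector."""
    d = W.sum(axis=1)        # vertex degrees (assumed all positive)
    D = np.diag(d)
    # eigh solves the symmetric generalized problem; eigenvalues ascend
    _, eigvecs = eigh(D - W, D)
    fiedler = eigvecs[:, 1]  # second-smallest eigenvector
    return fiedler > 0       # boolean cluster membership
```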

K-way Min-Max Cut

The decision function for spectral clustering minimizes inter-cluster similarity while maximizing intra-cluster similarity:

$$\min_{A_1, \dots, A_K} \sum_{k=1}^{K} \frac{\mathrm{cut}(A_k, V \setminus A_k)}{\mathrm{assoc}(A_k, A_k)}$$
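The optimization itself goes through eigenvectors, but the objective is easy to evaluate for any given labeling. A sketch under the assumption that every cluster has nonzero intra-cluster weight; the function name is illustrative.

```python
import numpy as np

def min_max_cut_objective(W, labels):
    """Sum over clusters of inter-cluster weight / intra-cluster weight."""
    total = 0.0
    for c in np.unique(labels):
        mask = labels == c
        intra = W[mask][:, mask].sum()   # similarity inside cluster c
        inter = W[mask][:, ~mask].sum()  # similarity leaving cluster c
        total += inter / intra           # assumes intra > 0
    return total
```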

Mathematical Description of Spectral Clustering

Refined decision function for spectral clustering; we can further define:

[Definition lost in transcription.]

Refined decision function for spectral clustering: this decision function can be solved as an eigenvalue problem.

Spectral Clustering Algorithm (Ng, Jordan, and Weiss)

 Motivation: given a set of points $S = \{s_1, \dots, s_n\} \subset \mathbb{R}^l$
 We would like to cluster them into k subsets

Algorithm

 Form the affinity matrix $W \in \mathbb{R}^{n \times n}$ with

$$W_{ij} = e^{-\|s_i - s_j\|^2 / 2\sigma^2} \;\text{ if } i \neq j, \qquad W_{ii} = 0$$

where the scaling parameter $\sigma$ is chosen by the user.
 Define D as the diagonal matrix whose (i, i) element is the sum of W's row i (a NumPy sketch of this step follows below)
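A direct NumPy translation of this step might look like the sketch below; the function name is assumed, and sigma is the user-chosen scaling parameter.

```python
import numpy as np
from scipy.spatial.distance import cdist

def affinity_and_degree(S, sigma):
    """W_ij = exp(-||s_i - s_j||^2 / (2 sigma^2)) for i != j, W_ii = 0;
    D is diagonal with the row sums of W."""
    sq_dists = cdist(S, S, 'sqeuclidean')  # ||s_i - s_j||^2
    W = np.exp(-sq_dists / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)               # W_ii = 0 as required
    D = np.diag(W.sum(axis=1))
    return W, D
```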

Algorithm

 Form the matrix $L = D^{-1/2} W D^{-1/2}$
 Find $x_1, x_2, \dots, x_k$, the k largest eigenvectors of L
 These form the columns of the new matrix $X \in \mathbb{R}^{n \times k}$
 Note: we have reduced the dimension from n×n to n×k (see the sketch below)
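This step is a symmetric eigendecomposition. A sketch using scipy.linalg.eigh, which returns eigenvalues in ascending order, so the last k columns are the k largest eigenvectors; the function name is assumed.

```python
import numpy as np
from scipy.linalg import eigh

def top_k_eigenvectors(W, D, k):
    """Form L = D^{-1/2} W D^{-1/2} and return its k largest eigenvectors
    as the columns of an (n, k) matrix X."""
    d_inv_sqrt = 1.0 / np.sqrt(np.diag(D))  # assumes positive degrees
    L = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = eigh(L)                    # eigenvalues ascend
    return eigvecs[:, -k:]                  # columns for the k largest
```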

Algorithm

 Form the matrix $Y \in \mathbb{R}^{n \times k}$ by renormalizing each of X's rows to unit length:

$$Y_{ij} = X_{ij} \Big/ \Big(\textstyle\sum_j X_{ij}^2\Big)^{1/2}$$

 Treat each row of Y as a point in $\mathbb{R}^k$
 Cluster them into k clusters via K-means (sketched below)
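A sketch of the final embedding-and-clustering step, assuming no row of X is all zeros; the function name and K-means settings are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def embed_and_cluster(X, k):
    """Renormalize X's rows to unit length and run K-means on the rows of Y,
    which are points in R^k."""
    Y = X / np.linalg.norm(X, axis=1, keepdims=True)  # unit-length rows
    return KMeans(n_clusters=k, n_init=10).fit_predict(Y)
```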

Algorithm

 Final cluster assignment: assign the original point $s_i$ to cluster j if and only if row i of Y was assigned to cluster j

Why?

 If we eventually use K-means, why not just apply K-means to the original data?

 This method allows us to cluster non-convex regions, as the end-to-end sketch below illustrates
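To make this concrete, here is a hedged end-to-end run on a toy non-convex dataset, using scikit-learn's make_moons as a stand-in for the spiral data and the hypothetical helpers sketched above; the parameter values are illustrative.

```python
from sklearn.datasets import make_moons

# Two interleaved half-moons: non-convex clusters where plain K-means fails.
points, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

W, D = affinity_and_degree(points, sigma=0.1)  # helpers sketched earlier
X = top_k_eigenvectors(W, D, k=2)
labels = embed_and_cluster(X, k=2)             # one label per original point
```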

 Some Examples

[Figures: spectral clustering results on several example datasets.]

User's Prerogative

 Affinity matrix construction
 Choice of the scaling factor σ that gives the tightest clusters
 Choice of k, the number of clusters
 Choice of clustering method

How to select k?

Eigengap: the difference between two consecutive eigenvalues.

The most stable clustering is generally given by the value of k that maximizes the eigengap:

$$\Delta_k = |\lambda_k - \lambda_{k-1}|$$

Example: for the largest eigenvalues of the Cisi/Medline data, $\max_k \Delta_k = |\lambda_2 - \lambda_1|$, so choose k = 2.

[Figure: the 20 largest eigenvalues of the Cisi/Medline data; λ1 stands far above λ2 and the remaining spectrum.]
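In code the heuristic is a few lines over the sorted spectrum. A sketch matching the slide's convention that Δk compares λk with λ(k−1); the function name and the k_max cap are assumptions.

```python
import numpy as np

def choose_k_by_eigengap(eigvals, k_max=20):
    """Pick k maximizing delta_k = |lambda_k - lambda_{k-1}| over the
    eigenvalues sorted in decreasing order."""
    lam = np.sort(eigvals)[::-1][:k_max]  # lambda_1 >= lambda_2 >= ...
    gaps = np.abs(np.diff(lam))           # gaps[i] = delta_{i+2}
    return int(np.argmax(gaps)) + 2      # gap between lambda_{k-1}, lambda_k
```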

Recap – The bottom line


Summary

 Spectral clustering can help us with hard clustering problems
 The technique is simple to understand
 The solution comes from solving a simple linear-algebra problem and is not hard to implement
 Great care should be taken in choosing the “starting conditions” (the affinity scale σ and the number of clusters k)
