Classification
Adriano Joaquim de O Cruz ©2002
NCE/UFRJ
[email protected]
Classification

Technique that associates samples to previously known classes.
May be crisp or fuzzy.
Supervised: MLP, trained.
Unsupervised: K-NN and fuzzy K-NN, not trained.
*@2001 Adriano Cruz
*NCE e IM - UFRJ
Classification ‹#›
K-NN and Fuzzy K-NN
Classification

Classes are identified by patterns.
Classifies by the k nearest neighbours.
Requires previous knowledge about the problem classes.
Not restricted to a specific distribution of the samples.
Classification
Crisp K-NN
Crisp K-NN
Supervised clustering method (a classification method).
Classes are defined beforehand.
Classes are characterized by sets of elements.
The number of elements may differ among classes.
The main idea is to assign the sample to the class containing the most neighbours.
Crisp K-NN
[Figure: labelled patterns w1-w14 grouped into Class 1 through Class 5, plus an unclassified sample s.]
With 3 nearest neighbours, sample s is closest to pattern w6 in class 5.
Crisp K-NN
Consider W={w1, w2, ..., wt} a set of t labelled data.
Each object wi is defined by l characteristics: wi=(wi1, wi2, ..., wil).
y is an unclassified input element.
k is the number of closest neighbours of y.
E is the set of k nearest neighbours (NN).
Crisp K-NN
Let t be the number of elements that identify the classes.
Let c be the number of classes.
Let W be the set that contains the t elements.
Each cluster is represented by a subset of elements from W.
Crisp K-NN algorithm
set k
{Calculating the NN}
for i = 1 to t
    calculate the distance from y to xi
    if i <= k
        then add xi to E
        else if xi is closer to y than any previous NN
            then delete the farthest neighbour and include xi in the set E
Crisp K-NN algorithm cont.
Determine the majority class represented in the set E and include y in this class.
if there is a draw,
    then calculate the sum of the distances from y to all neighbours in each class involved in the draw
    if the sums are different
        then add y to the class with the smallest sum
        else add y to the class where the last minimum was found
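The procedure above, including the distance-sum tie-break, can be sketched in Python (a minimal illustration with hypothetical toy data, not the original implementation):

```python
import math

def crisp_knn(y, labelled, k):
    """Classify sample y by majority vote among its k nearest labelled
    neighbours; break ties by the smaller sum of distances."""
    # distance from y to every labelled object
    dist = [(math.dist(y, x), label) for x, label in labelled]
    nearest = sorted(dist)[:k]          # the set E of k nearest neighbours
    votes, sums = {}, {}                # votes and distance sums per class
    for d, label in nearest:
        votes[label] = votes.get(label, 0) + 1
        sums[label] = sums.get(label, 0.0) + d
    top = max(votes.values())
    drawn = [c for c, v in votes.items() if v == top]
    # majority class, or the drawn class with the smallest distance sum
    return min(drawn, key=lambda c: sums[c])

labelled = [((0.0, 0.0), "A"), ((0.1, 0.2), "A"),
            ((5.0, 5.0), "B"), ((5.1, 4.9), "B")]
print(crisp_knn((0.2, 0.1), labelled, k=3))  # "A": two of the 3 NN are in A
```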
Classification
Fuzzy K-NN
Fuzzy K-NN
The basis of the algorithm is to assign membership as a function of the object's distance from its k nearest neighbours and of the memberships in the possible classes.
J. Keller, M. Gray, J. Givens. A Fuzzy K-Nearest Neighbor Algorithm. IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-15, no. 4, July/August 1985.
Fuzzy K-NN
[Figure: the five classes (Class 1 through Class 5) of labelled patterns w1-w14.]
Fuzzy K-NN
Consider W={w1, w2, ..., wt} a set of t labelled data.
Each object wi is defined by l characteristics: wi=(wi1, wi2, ..., wil).
y is an unclassified input element.
k is the number of closest neighbours of y.
E is the set of k nearest neighbours (NN).
μi(y) is the membership of y in class i.
μij is the membership in the ith class of the jth vector of the labelled set (labelled wj in class i).
Fuzzy K-NN
Let t be the number of elements that identify the classes.
Let c be the number of classes.
Let W be the set that contains the t elements.
Each cluster is represented by a subset of elements from W.
Fuzzy K-NN algorithm
set k
{Calculating the NN}
for i = 1 to t
    calculate the distance from y to xi
    if i <= k
        then add xi to E
        else if xi is closer to y than any previous NN
            then delete the farthest neighbour and include xi in the set E
Fuzzy K-NN algorithm
Calculate μi(y), for each class i = 1 to c, using

$$\mu_i(y) = \frac{\sum_{j=1}^{k} \mu_{ij} \left( 1 / \|y - x_j\|^{2/(m-1)} \right)}{\sum_{j=1}^{k} \left( 1 / \|y - x_j\|^{2/(m-1)} \right)}$$
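The membership formula can be sketched as follows (a minimal Python illustration; the neighbour set E with its per-class memberships μij and the toy data are assumptions):

```python
import math

def fuzzy_knn_memberships(y, neighbours, m=2.0):
    """neighbours: list of (x_j, {class: mu_ij}) for the k nearest neighbours.
    Returns mu_i(y) for every class i, per the weighted formula above."""
    # inverse-distance weights 1 / ||y - x_j||^(2/(m-1))
    weights = [1.0 / math.dist(y, x) ** (2.0 / (m - 1.0)) for x, _ in neighbours]
    total = sum(weights)
    classes = {c for _, mus in neighbours for c in mus}
    return {i: sum(w * mus.get(i, 0.0)
                   for w, (_, mus) in zip(weights, neighbours)) / total
            for i in classes}

nbrs = [((0.0, 0.0), {"A": 1.0, "B": 0.0}),
        ((1.0, 0.0), {"A": 0.0, "B": 1.0}),
        ((0.2, 0.1), {"A": 1.0, "B": 0.0})]
mu = fuzzy_knn_memberships((0.1, 0.0), nbrs, m=2.0)
print(mu)  # memberships sum to 1; class "A" dominates
```

Since each neighbour's memberships sum to 1, the resulting μi(y) also sum to 1 over the classes.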
Computing μij

The memberships μij of the labelled samples can be assigned in several ways:
They can be given complete membership in their known class and non-membership in all others.
Membership can be assigned based on the distance from their class mean.
Membership can be assigned based on the distance from labelled samples of their own class and those from other classes.
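The first and third options can be sketched as follows (a hedged Python illustration; the 0.51/0.49 split in the third option is the common variant from Keller et al., and the helper names are ours):

```python
def crisp_init(label, classes):
    """Option 1: full membership in the known class, none elsewhere."""
    return {c: 1.0 if c == label else 0.0 for c in classes}

def keller_init(label, neighbour_labels, classes):
    """Option 3 (Keller-style variant): soften the memberships of a
    labelled sample using the labels of its own K nearest labelled
    neighbours, keeping the majority of the weight in its known class."""
    K = len(neighbour_labels)
    mu = {}
    for c in classes:
        frac = neighbour_labels.count(c) / K   # share of neighbours in class c
        mu[c] = 0.51 + 0.49 * frac if c == label else 0.49 * frac
    return mu

print(crisp_init("A", ["A", "B"]))
print(keller_init("A", ["A", "A", "B"], ["A", "B"]))
```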
Classification
ICC-KNN System
ICC-KNN System

Non-parametric statistical pattern recognition system.
Combines FCM, fuzzy K-NN and ICC.
Evaluates data arranged in classes of several shapes.
ICC-KNN System
Divided in two modules.
First module (training):
    chooses the best patterns to use with K-NN
    chooses the best fuzzy constant (m) and the best number of neighbours (k)
Second module (classification):
    uses fuzzy K-NN to classify
ICC-KNN First Module
Classification (training) module
Finds structure in the data sample.
Divided into two phases.
First phase of training:
    Finds the best patterns for fuzzy K-NN
    • FCM – applied to each class using several numbers of categories
    • ICC – finds the best number of categories to represent each class
ICC-KNN First Phase
Results of applying FCM and ICC:
Patterns for K-NN, which are the centres of the chosen run of FCM.
Number of centres, which is the total of the categories chosen by applying ICC to all FCM runs.
ICC-KNN Second Phase
Second phase of training
Evaluates the best fuzzy constant and the best number of neighbours so as to achieve the best K-NN performance:
    tests several m and k values
    finds the m and k with the maximum rate of crisp hits
ICC-KNN
Pattern recognition module
Assigns each data point to its class.
Uses the chosen patterns, m and k, to classify the data.
ICC-KNN block diagram
[Block diagram: in the classification module, each class 1..s goes through FCM (memberships U for c = cmin..cmax) and ICC, yielding the patterns w1..ws; fuzzy K-NN on W and Uw selects m and k. In the pattern recognition module, fuzzy K-NN with W, m and k labels the unclassified data.]
ICC-KNN
Let R={r1, r2, ..., rn} be the set of samples.
Each sample ri belongs to one of s known classes.
Let Uic be the membership matrix for class i with c categories.
Let Vic be the centre matrix for class i with c categories.
Let wi be the best Vic of each class.
Let W be the set of sets of centres wi.
ICC-KNN algorithm
Classification module – first phase of training
Step 1. Set m.
Step 2. Set cmin and cmax.
Step 3. For each of the s known classes:
    Generate the set Rs with the points from R belonging to class s
    For each number of categories c in the interval [cmin, cmax]:
        Run FCM for c and the set Rs, generating Usc and Vsc
        Calculate ICC for Rs and Usc
    End
    Define the patterns ws of class s as the matrix Vsc that maximizes ICC
Step 4. Generate the set W = {w1, ..., ws}.
ICC-KNN algorithm
Second phase of training
Step 5. Set mmin and mmax.
Step 6. Set kmin and kmax.
    For each m in [mmin, mmax]:
        For each k in [kmin, kmax]:
            Run fuzzy K-NN for the patterns from the set W, generating Umk
            Calculate the number of crisp hits for Umk
Step 7. Choose the m and k that yield the best crisp-hit figures.
Step 8. If there is a draw:
    if the k's are different, choose the smaller k
    else choose the smaller m
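Steps 5 to 8 amount to a grid search over m and k; a minimal Python sketch (the `hit_rate` callback, standing in for an actual fuzzy K-NN run over the training set, is hypothetical):

```python
def best_m_k(hit_rate, m_values, k_values):
    """Pick the (m, k) maximizing the crisp-hit rate; on a draw prefer
    the smaller k, then the smaller m, as in steps 7 and 8 above."""
    best = None
    for m in m_values:
        for k in k_values:
            # lexicographic score: more hits first, then smaller k, then smaller m
            score = (hit_rate(m, k), -k, -m)
            if best is None or score > best[0]:
                best = (score, (m, k))
    return best[1]

# hypothetical hit-rate table standing in for actual fuzzy K-NN runs
table = {(1.1, 3): 0.79, (1.5, 3): 0.9625, (1.5, 5): 0.9625}
print(best_m_k(lambda m, k: table.get((m, k), 0.0), [1.1, 1.5], [3, 5]))
```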
ICC-KNN algorithm
Pattern Recognition Module
Step 9. Apply fuzzy K-NN, using the patterns from the set W and the chosen parameters m and k, to the data to be classified.
ICC-KNN results
ICC-KNN results

2000 samples, 4 classes, 500 samples in each class.
Classes 1 and 4 – concave classes.
Classes 2 and 3 – convex classes, elliptic shape.
ICC-KNN results
First phase of training
FCM applied to each class:
    training data: 80%, 400 samples from each class
    c = 3..7 and m = 1.25
ICC applied to the results:
    classes 1 and 4: 4 categories
    classes 2 and 3: 3 categories
ICC-KNN results
Second phase of training
Running fuzzy K-NN:
    patterns from the first phase
    random patterns (for comparison)
    k = 3 to 7 neighbours
    m = {1.1, 1.25, 1.5, 2}
ICC-KNN results
Conclusion: K-NN is more stable with respect to the value of m for the patterns from the first training phase (PFT).
ICC-KNN results
Training data (rows: classes; columns: classification)

              PFT patterns            Random patterns
Classes     1     2     3     4      1     2     3     4
   1      388    10     0     2    213    66     0   121
   2       14   379     0     7     19   380     0     1
   3        0     0   376    24      3     0   324    73
   4        0     1     2   397      4    46     1   349

m = 1.5 and k = 3: 96.25% (PFT patterns)
m = 1.1 and k = 3: 79.13% (random patterns)
ICC-KNN results
Test data (rows: classes; columns: classification)

              PFT patterns            Random patterns
Classes     1     2     3     4      1     2     3     4
   1       97     2     0     1     53    27     0    20
   2        4    93     0     3      4    96     0     0
   3        0     0    90    10      0     0    82    18
   4        0     0     1    99      0    15     0    85

PFT patterns: 94.75%    Random patterns: 79%
ICC-KNN x Others
Comparison with FCM, FKCN, GG and GK.
Training phase (FTr):
    training data
    c = 4 and m = {1.1, 1.25, 1.5, 2}
    Associate the categories to the classes
    • Criterion: sum of the membership degrees
        o Compute the sum of the membership degrees of the points of each class in each category
        o A class may be represented by more than one category
ICC-KNN x Others
Test phase:
    test data
    methods initialized with the centres from the training phase (FTr)
    compute the membership degree of the points in each category
    for a class represented by more than one category:
        membership degree = sum of the membership degrees of the points in the categories that represent the class
ICC-KNN x Others
     ICC-KNN   KNN A.    FCM      FKCN     GG       GK
R    94.75%    79%       66%      66%      69%      84%
N    95.75%    83%       70.75%   70.75%   69%      89.5%
T    36.5s     23.11s    2.91s    2.59s    22.66s   18.14s

GK: 84% for m = 2
FCM and FKCN: 66% for m = 1.1 and m = 1.25
GG-FCM: 69% for m = 1.1 and 1.25
GG random: 57.75% for m = 1.1 and 25% for m = 1.5
GK
GK (rows: classes; columns: classification)

Classes     1     2     3     4
   1       77     6     0    17
   2        6    94     0     0
   3        0     0    97     3
   4        0     0    32    68
Classification
KNN+Fuzzy C-Means System
KNN+Fuzzy C-Means algorithm
The idea is a two-layer clustering algorithm.
First, an unsupervised tracking of cluster centres is made using K-NN rules.
The second layer involves one iteration of the fuzzy c-means to compute the membership degrees and the new fuzzy centres.
Ref.: N. Zahid et al., Fuzzy Sets and Systems 120 (2001) 239-247.
First Layer (K-NN)
Let X={x1,…,xn} be a set of n unlabelled objects.
c is the number of clusters.
The first layer consists of partitioning X into c cells using the first part of K-NN.
Each cell Ei (1<=i<=c) is represented as (yi, the K-NN of yi, Gi).
Gi is the centre of cell Ei, defined as

$$G_i = \frac{1}{k+1} \sum_{x_k \in E_i} x_k$$
KNN-1FCMA settings
Let X={x1,…,xn} be a set of n unlabelled objects.
Fix c, the number of clusters.
Choose m>1 (the nebulisation factor).
Set k = Integer(n/c − 1).
Let I={1,2,…,n} be the set of all indexes of X.
KNN-1FCMA algorithm step 1
Calculate G0
for i = 1 to c
    Search in I for the index of the object yi farthest from Gi-1
    for j = 1 to n
        Calculate the distance from yi to xj
        if j <= k
            then add xj to Ei
            else if xj is closer to yi than any previous NN
                then delete the farthest neighbour and include xj in the set Ei
KNN-1FCMA algorithm cont.
Include yi in the set Ei.
Calculate Gi.
Delete the index of yi and the K-NN indexes of yi from I.
if I ≠ ∅
    then for each remaining object x
        determine the minimum distance to any centre Gi of Ei
        classify x to the nearest centre
        update all centres
KNN-1FCMA algorithm step2
Compute the matrix U according to

$$\mu_{ik} = \frac{1}{\sum_{l=1}^{c} \left( d_{ik} / d_{lk} \right)^{2/(m-1)}}$$

Calculate all the fuzzy centres using

$$v_{ij} = \frac{\sum_{e=1}^{n} (\mu_{ie})^{m} x_{ej}}{\sum_{e=1}^{n} (\mu_{ie})^{m}}$$
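Step 2 can be sketched in plain Python (a minimal illustration of one fuzzy c-means iteration per the two formulas above; the toy data are ours):

```python
import math

def fcm_step(X, centres, m=2.0):
    """One fuzzy c-means iteration: memberships mu_ik from the current
    centres, then the new fuzzy centres v_ij."""
    c, n, dims = len(centres), len(X), len(X[0])
    # distances d_ik from centre i to object k (assumed nonzero here)
    d = [[math.dist(centres[i], X[k]) for k in range(n)] for i in range(c)]
    # membership matrix U
    U = [[1.0 / sum((d[i][k] / d[l][k]) ** (2.0 / (m - 1.0)) for l in range(c))
          for k in range(n)] for i in range(c)]
    # new fuzzy centres V, weighted by the memberships raised to m
    V = [[sum(U[i][k] ** m * X[k][j] for k in range(n)) /
          sum(U[i][k] ** m for k in range(n)) for j in range(dims)]
         for i in range(c)]
    return U, V

X = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.2, 5.0)]
U, V = fcm_step(X, centres=[(0.1, 0.0), (5.1, 5.0)])
# each column of U sums to 1; the centres stay near the two groups
```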
Results KNN-1FCMA
Data      Elem    c    Misclassification rate    Iterations (avg)
                       FCMA    KNN-1FCMA         FCMA
S1          20    2      0         0               11
S2          60    3      1         0                8
S3          80    4      2         0               10
S4         120    6     13        13               19
IRIS23     150    2     14        13               10
IRIS       100    3     16        17               12