Transcript Competitive Nets
IE 585
Competitive Network – I Hamming Net & Self-Organizing Map
Competitive Nets
Unsupervised
• MAXNET
• Hamming Net
• Mexican Hat Net
• Self-Organizing Map (SOM)
• Adaptive Resonance Theory (ART)
Supervised
• Learning Vector Quantization (LVQ)
• Counterpropagation
Clustering Net
• The number of input neurons equals the dimension of the input vectors
• Each output neuron represents a cluster, so the number of output neurons limits the number of clusters that can be formed
• The weight vector of an output neuron serves as a representative for the input patterns that the net has placed in that cluster
• The weight vector of the winning neuron is adjusted
Winner-Take-All
• The squared Euclidean distance is used to determine the closest weight vector to a pattern vector
• Only the neuron with the smallest Euclidean distance from the input vector is allowed to update
MAXNET
• Developed by Lippmann, 1987
• Can be used as a subnet to pick the node whose input is the largest
• Completely interconnected (including self-connections)
• Symmetric weights
• No training
• Weights are fixed
Architecture of MAXNET
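[Figure: MAXNET architecture, fully interconnected nodes with self-connections.]

Weights:
$$w_{ij} = \begin{cases} 1 & \text{if } i = j \\ -\varepsilon & \text{if } i \neq j \end{cases}$$

Transfer Function:
$$f(x) = \begin{cases} x & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases}$$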
Procedure of MAXNET
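• Initialize activations and weights
• Update the activation of each node:
$$x_j^{\text{new}} = f\Big(\sum_i w_{ij}\, x_i^{\text{old}}\Big)$$
• Save the activations for the next iteration:
$$x_j^{\text{old}} = x_j^{\text{new}}$$
• If more than one node has a nonzero activation, continue; otherwise, stop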
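The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own implementation: the function name `maxnet`, the default value of `epsilon`, the iteration cap, and the sample activations are all assumptions made for the example. The mutual inhibition weight is written as `-epsilon`, and the self-connection weight is 1.

```python
def maxnet(activations, epsilon=None, max_iters=1000):
    """Iterate MAXNET until at most one node has a nonzero activation."""
    m = len(activations)
    if epsilon is None:
        # Assumed default: a small inhibition strength below 1/m.
        epsilon = 1.0 / (2 * m)
    # Apply the transfer function f to the initial activations.
    x = [max(0.0, float(a)) for a in activations]
    for _ in range(max_iters):
        if sum(1 for v in x if v > 0) <= 1:
            break
        total = sum(x)
        # Net input to node j: 1 * x_j (self-connection)
        # minus epsilon times the sum of all other activations.
        x = [max(0.0, xj - epsilon * (total - xj)) for xj in x]
    return x


# Hypothetical starting activations for four nodes.
result = maxnet([0.2, 0.4, 0.6, 0.8], epsilon=0.2)
winner = max(range(len(result)), key=lambda j: result[j])
print(winner, result)
```

The iteration cap is there because two nodes with exactly equal, largest activations would inhibit each other symmetrically and never drop to zero.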
Hamming Net
• Developed by Lippmann, 1987
• A maximum likelihood classifier
• Used to determine which of several exemplar vectors is most similar to an input vector
• Exemplar vectors determine the weights of the net
• The measure of similarity between the input vector and a stored exemplar vector is n - HD, where n is the number of components and HD is the Hamming distance between the two vectors
Weights and Transfer Function of Hamming Net
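Here $x_i$ is component $i$ of the exemplar vector stored in output unit $j$, and $n$ is the number of components.

bipolar (-1, 1):
$$w_{ij} = \frac{x_i}{2}, \qquad b = \frac{n}{2}$$

binary (0, 1):
$$w_{ij} = \begin{cases} 1 & \text{if } x_i = 1 \\ -1 & \text{if } x_i = 0 \end{cases}, \qquad b = \text{number of 0's in the vector}$$

Transfer Function (Identity Function):
$$f(x) = \begin{cases} x & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases}$$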
Architecture of Hamming Net
[Figure: Hamming net with input units x1 to x4, bias units B, output units y1 and y2, and a MAXNET stage on top.]
Procedure of the Hamming Net
• Initialize the weights to store the m exemplar vectors
• For each input vector x, compute
$$\text{net}_{Y_j} = b_j + \sum_i w_{ij}\, x_i$$
• Initialize the activations for MAXNET:
$$y_j(0) = \text{net}_{Y_j}$$
• MAXNET iterates to find the best-match exemplar
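As a rough illustration of the first two steps, the sketch below computes the net input of each output unit for bipolar vectors, using weights equal to half the exemplar components and a bias of n/2. The function name `hamming_net_scores` and the sample exemplars and input are assumptions made for this example, and a plain `max` stands in for the MAXNET stage.

```python
def hamming_net_scores(exemplars, x):
    """Return net_j = b_j + sum_i w_ij * x_i for bipolar exemplars and input."""
    n = len(x)
    scores = []
    for e in exemplars:
        weights = [ei / 2.0 for ei in e]  # w_ij = (exemplar component) / 2
        bias = n / 2.0                    # b_j = n / 2
        scores.append(bias + sum(w * xi for w, xi in zip(weights, x)))
    return scores


# Hypothetical bipolar exemplars and input vector.
exemplars = [[1, -1, -1, -1], [-1, -1, -1, 1]]
x = [1, 1, -1, -1]
scores = hamming_net_scores(exemplars, x)

# Stand-in for MAXNET: pick the unit with the largest net input.
best = max(range(len(scores)), key=lambda j: scores[j])
print(scores, best)
```

Each score equals n minus the Hamming distance to the corresponding exemplar, so the first exemplar (distance 1) scores 3 and the second (distance 3) scores 1.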
Hamming Net Example
Mexican Hat Net
• Developed by Kohonen, 1989
• Positive weights with “cooperative neighboring” neurons
• Negative weights with “competitive neighboring” neurons
• No connections with far-away neurons
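One way to picture this connection pattern is as a lateral weight matrix for neurons arranged on a line. The sketch below is only an illustration under assumed parameters: the function name `mexican_hat_weights`, the two radii, and the weight values 0.6 and -0.4 are hypothetical and do not come from the slides.

```python
def mexican_hat_weights(size, r_coop, r_comp, c_pos=0.6, c_neg=-0.4):
    """Build a size x size lateral weight matrix for neurons on a line."""
    weights = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            d = abs(i - j)
            if d <= r_coop:
                weights[i][j] = c_pos   # cooperative neighbors: positive weight
            elif d <= r_comp:
                weights[i][j] = c_neg   # competitive neighbors: negative weight
            # farther neurons keep weight 0.0 (no connection)
    return weights


w = mexican_hat_weights(size=7, r_coop=1, r_comp=2)
print(w[3])
```

Row 3 of the matrix shows the "hat" profile seen from the middle neuron: positive in the center, negative just outside, and zero at the edges.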
Teuvo Kohonen
• http://www.cis.hut.fi/teuvo/ (his own home page)
• Published his work starting in 1984
• LVQ - learning vector quantization
• SOM - self-organizing map
• Professor at Helsinki Univ., Finland
SOM
• Also called Topology-Preserving Maps or Self-Organizing Feature Maps (SOFM)
• “Winner Take All” learning (also called competitive learning)
• Winner has the minimum Euclidean distance
• Learning only takes place for the winner
• Final weights are at the centroids of each cluster
• Continuous inputs; continuous or 0/1 (winner take all) outputs
• No bias, fully connected
• Used for data mining and exploration
• A supervised version exists
Architecture of SOM Net
[Figure: input layer (inputs, a's) fully connected through weights W to n output neurons (outputs, y's).]
Kohonen Learning Rule Derivation
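The weight vector of the winner is chosen to minimize the squared distance to the input a:
$$\min E = \lVert a - w_{\text{winner}} \rVert^2 = a^2 - 2wa + w^2$$
$$\frac{\partial E}{\partial w} = \frac{\partial\,(a^2 - 2wa + w^2)}{\partial w} = -(2a - 2w)$$
$$\Delta w = \alpha\,(a - w)$$
where $0.1 \le \alpha \le 0.7$, and $\alpha$ usually decreases during training.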
Kohonen Learning
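$$w_{\cdot j}^{\text{new}} = w_{\cdot j}^{\text{old}} + \Delta w = w_{\cdot j}^{\text{old}} + \alpha\,[a - w_{\cdot j}^{\text{old}}] = \alpha a + (1 - \alpha)\, w_{\cdot j}^{\text{old}}$$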
Procedure of SOM
• Initialize weights uniformly and normalize to unit length
• Normalize inputs to unit length
• Present an input vector a
• Calculate the Euclidean distance between a and all Kohonen neurons
• Select the winning output neuron j (with the smallest distance)
• Update the winning neuron:
$$w^{\text{new}} = w^{\text{old}} + \alpha\,[a - w^{\text{old}}]$$
• Re-normalize the weights of neuron j (sometimes skipped)
• Present the next training vector
Method
1. Normalize input vectors, a, by:
$$a_{ki} = \frac{a_{ki}}{\sqrt{\sum_i a_{ki}^2}}$$
2. Normalize weight vectors, w, by:
$$w_{ji} = \frac{w_{ji}}{\sqrt{\sum_i w_{ji}^2}}$$
3. Calculate the distance from a to each w by:
$$d_j = \sum_i (a_{ki} - w_{ji})^2$$
4. Min d wins (this is the winning neuron)
5. Update w of the min d neuron by:
$$w_{ji}^{\text{new}} = w_{ji}^{\text{old}} + \alpha\,(a_{ki} - w_{ji}^{\text{old}})$$
6. Return to 2 and repeat for all input vectors a
7. Reduce $\alpha$ if applicable
8. Repeat until weights converge (stop changing)
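The steps of this method can be sketched as a short training loop. This is a minimal sketch under stated assumptions rather than a reference implementation: the names `normalize`, `squared_distance`, and `train_som`, the learning-rate decay factor, the convergence tolerance, and the sample data are all hypothetical choices for illustration.

```python
import math


def normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v] if norm > 0 else list(v)


def squared_distance(a, w):
    return sum((ai - wi) ** 2 for ai, wi in zip(a, w))


def train_som(inputs, weights, alpha=0.5, decay=0.9, epochs=50, tol=1e-6):
    inputs = [normalize(a) for a in inputs]      # normalize inputs
    weights = [normalize(w) for w in weights]    # normalize weights
    for _ in range(epochs):
        largest_change = 0.0
        for a in inputs:
            dists = [squared_distance(a, w) for w in weights]
            j = dists.index(min(dists))          # min d wins
            old_w = weights[j]
            new_w = [wi + alpha * (ai - wi) for ai, wi in zip(a, old_w)]
            new_w = normalize(new_w)             # re-normalize the winner
            change = max(abs(n - o) for n, o in zip(new_w, old_w))
            largest_change = max(largest_change, change)
            weights[j] = new_w
        alpha *= decay                           # reduce the learning rate
        if largest_change < tol:                 # weights stopped changing
            break
    return weights


# Hypothetical data: two loose clusters near the two axes.
data = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
trained = train_som(data, [[1.0, 0.0], [0.0, 1.0]])
print(trained)
```

With these starting weights each neuron stays with the cluster nearest to it, so the first weight vector ends up pointing mostly along the first axis and the second along the second axis.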
SOM Example - 4 patterns
α = 0.25

Patterns: p1 = (1, 1), p2 = (1, 0), p3 = (0, 1), p4 = (0, 0)

Each row lists the weights before the pattern is presented and the squared distances from that pattern to each neuron. The last row gives the final weights.

        neuron1             neuron2             neuron3             neuron4
pattern w1        w2        w1        w2        w1        w2        w1        w2        dist1     dist2     dist3     dist4
p1      0.2       0.3       0.5       0.4       0.1       0.7       0.5       0.6       1.13      0.61      0.9       0.41
p2      0.2       0.3       0.5       0.4       0.1       0.7       0.625     0.7       0.73      0.41      1.3       0.630625
p3      0.2       0.3       0.625     0.3       0.1       0.7       0.625     0.7       0.53      0.880625  0.1       0.480625
p4      0.2       0.3       0.625     0.3       0.075     0.775     0.625     0.7       0.13      0.480625  0.60625   0.880625
p1      0.15      0.225     0.625     0.3       0.075     0.775     0.625     0.7       1.323125  0.630625  0.90625   0.230625
p2      0.15      0.225     0.625     0.3       0.075     0.775     0.71875   0.775     0.773125  0.230625  1.45625   0.679727
p3      0.15      0.225     0.71875   0.225     0.075     0.775     0.71875   0.775     0.623125  1.117227  0.05625   0.567227
p4      0.15      0.225     0.71875   0.225     0.05625   0.83125   0.71875   0.775     0.073125  0.567227  0.694141  1.117227
p1      0.1125    0.16875   0.71875   0.225     0.05625   0.83125   0.71875   0.775     1.478633  0.679727  0.919141  0.129727
p2      0.1125    0.16875   0.71875   0.225     0.05625   0.83125   0.789063  0.83125   0.816133  0.129727  1.581641  0.735471
p3      0.1125    0.16875   0.789063  0.16875   0.05625   0.83125   0.789063  0.83125   0.703633  1.313596  0.031641  0.651096
p4      0.1125    0.16875   0.789063  0.16875   0.042188  0.873438  0.789063  0.83125   0.041133  0.651096  0.764673  1.313596
final   0.084375  0.126563  0.789063  0.16875   0.042188  0.873438  0.789063  0.83125
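The example can be replayed with a short script, which is a convenient way to check the numbers. The sketch below takes the four patterns, the initial weights of the four neurons, and α = 0.25 from the example; the variable names and the choice of three passes over the patterns are assumptions made to match the number of rows. Note that this example does not normalize inputs or weights.

```python
def squared_distance(a, w):
    return sum((ai - wi) ** 2 for ai, wi in zip(a, w))


patterns = [(1, 1), (1, 0), (0, 1), (0, 0)]                  # p1..p4
weights = [[0.2, 0.3], [0.5, 0.4], [0.1, 0.7], [0.5, 0.6]]   # neurons 1..4
alpha = 0.25

winners = []
for epoch in range(3):
    for a in patterns:
        dists = [squared_distance(a, w) for w in weights]
        j = dists.index(min(dists))                          # winning neuron
        winners.append(j + 1)
        # Only the winner moves toward the presented pattern.
        weights[j] = [wi + alpha * (ai - wi) for ai, wi in zip(a, weights[j])]

print(winners)
print(weights)
```

In every pass the same neuron wins the same pattern (neuron 4 for p1, neuron 2 for p2, neuron 3 for p3, neuron 1 for p4), so each weight vector drifts steadily toward one corner of the unit square.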
Movement of 4 weight clusters
[Figure: plot of the weight vectors of Neuron 1, Neuron 2, Neuron 3, and Neuron 4 moving during training; both axes run from 0 to 1.]
Adding a “conscience”
• Prevents neurons from winning too many training vectors by using a bias (b) factor
• The winner has min(d - b), where
$$b_j = 10\left(\frac{1}{n} - f_j\right) \qquad (n = \text{number of output neurons})$$
$$f_j^{\text{new}} = f_j^{\text{old}} + 0.0001\,(y_j - f_j^{\text{old}}), \qquad f^{\text{initial}} = \frac{1}{n}$$
• For neurons that win, b becomes negative, and for neurons that don’t win, b becomes positive
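A small sketch may help show how the bias changes the outcome of the competition. It uses the constants 10 and 0.0001 from the formulas above; the function name `conscience_winner`, the sample distances, and the sample win frequencies are hypothetical, and the code assumes that the output y is 1 for the winning neuron and 0 for all others.

```python
def conscience_winner(dists, freqs, gain=10.0, rate=0.0001):
    """Pick the winner by min(d - b) and update the win frequencies in place."""
    n = len(dists)
    biases = [gain * (1.0 / n - f) for f in freqs]   # b_j = 10 (1/n - f_j)
    adjusted = [d - b for d, b in zip(dists, biases)]
    j = adjusted.index(min(adjusted))
    for k in range(n):
        y = 1.0 if k == j else 0.0                   # assumed: y = 1 only for the winner
        freqs[k] += rate * (y - freqs[k])            # f_new = f_old + 0.0001 (y - f_old)
    return j, biases


# Hypothetical state: neuron 0 has been winning far more than its share of 1/4.
freqs = [0.7, 0.1, 0.1, 0.1]
winner, biases = conscience_winner([0.30, 0.35, 0.90, 0.80], freqs)
print(winner, biases, freqs)
```

Neuron 0 is closest by raw distance, but its negative bias hands the win to neuron 1, which is the intended effect of the conscience.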
Supervised Version
• Same, except if the winning neuron is “correct”, use the same weight update: $w^{\text{new}} = w^{\text{old}} + \alpha\,(a - w^{\text{old}})$
• If the winning neuron is “incorrect”, use: $w^{\text{new}} = w^{\text{old}} - \alpha\,(a - w^{\text{old}})$
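The two cases can be written as one update with a sign that depends on whether the winner's class is correct. This sketch assumes that the "incorrect" case moves the weight vector away from the input (a minus sign in front of the learning-rate term); the function name `supervised_update`, the learning rate, and the sample vectors are hypothetical.

```python
def supervised_update(w, a, correct, alpha=0.25):
    """Move the winner toward the input if correct, away from it otherwise."""
    sign = 1.0 if correct else -1.0
    return [wi + sign * alpha * (ai - wi) for ai, wi in zip(a, w)]


# Hypothetical winner weights and input vector.
toward = supervised_update([0.5, 0.6], [1, 1], correct=True)
away = supervised_update([0.5, 0.6], [1, 1], correct=False)
print(toward, away)
```

With these numbers the correct case moves the weights from (0.5, 0.6) to (0.625, 0.7), and the incorrect case moves them to (0.375, 0.5).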