Transcript: An Introduction to Pattern Recognition (Part Two)
Slide 1
Pattern Recognition
Ku-Yaw Chang
[email protected]
Assistant Professor, Department of
Computer Science and Information Engineering
Da-Yeh University
Slide 2
Outline
Introduction
Features and Classes
Supervised vs. Unsupervised
Statistical vs. Structural (Syntactic)
Statistical Decision Theory
Slide 3
Supervised vs. Unsupervised
Supervised learning
Using a training set of patterns of known class
to classify additional similar samples
Unsupervised learning
Dividing samples into groups or clusters
based on measures of similarity without any
prior knowledge of class membership
Slide 4
Supervised vs. Unsupervised
Example: dividing a class of people into two groups
Supervised learning
Male features
Female features
Unsupervised learning
Male vs. Female
Tall vs. Short
With vs. Without glasses
…
Slide 5
Statistical vs. Structural
Statistical PR
Obtains features by manipulating the measurements as purely numerical (or Boolean) variables
Structural (Syntactic) PR
Designs features in some intuitive way, corresponding to human perception of the objects
Slide 6
Statistical vs. Structural
Example: Optical Character Recognition (OCR), which can be approached with either statistical PR or structural PR
Slide 7
Statistical Decision Theory
Building an automated classification system requires:
Classified (labeled) data sets
Selected features
Slide 8
Statistical Decision Theory
Hypothetical Basketball Association (HBA)
apg: average number of points per game
Goal: to predict the winner of a game, based on the difference between the home team's apg and the visiting team's apg for previous games
Training set
Scores of previously played games
Home team classified as a winner or a loser
Slide 9
Statistical Decision Theory
Given a game to be played, predict whether the home team will be a winner or a loser, using the feature:
dapg = Home Team apg – Visiting Team apg
Slide 10
Statistical Decision Theory
Game   dapg   Home Team      Game   dapg   Home Team
  1     1.3   Won              16   -3.1   Won
  2    -2.7   Lost             17    1.7   Won
  3    -0.5   Won              18    2.8   Won
  4    -3.2   Lost             19    4.6   Won
  5     2.3   Won              20    3.0   Won
  6     5.1   Won              21    0.7   Lost
  7    -5.4   Lost             22   10.1   Won
  8     8.2   Won              23    2.5   Won
  9   -10.8   Lost             24    0.8   Lost
 10    -0.4   Won              25   -5.0   Lost
 11    10.5   Won              26    8.1   Won
 12    -1.1   Lost             27   -7.1   Lost
 13     2.5   Won              28    2.7   Won
 14    -4.2   Won              29  -10.0   Lost
 15    -3.4   Lost             30   -6.5   Won
Slide 11
Statistical Decision Theory
A histogram of dapg
[Figure: histogram of dapg over the 30 training games, with separate counts for games the home team won and lost; x-axis: dapg, y-axis: number of games.]
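As a rough illustration of how these per-class counts can be tallied, here is a minimal Python sketch; the dapg values and outcomes are the training data from Slide 10, while the bin width of 2 is an assumption made for this example rather than the binning used in the original figure.

import numpy as np

# Training data from Slide 10: dapg and home-team outcome for games 1-30.
dapg = np.array([1.3, -2.7, -0.5, -3.2, 2.3, 5.1, -5.4, 8.2, -10.8, -0.4,
                 10.5, -1.1, 2.5, -4.2, -3.4, -3.1, 1.7, 2.8, 4.6, 3.0,
                 0.7, 10.1, 2.5, 0.8, -5.0, 8.1, -7.1, 2.7, -10.0, -6.5])
won = np.array([1, 0, 1, 0, 1, 1, 0, 1, 0, 1,
                1, 0, 1, 1, 0, 1, 1, 1, 1, 1,
                0, 1, 1, 0, 0, 1, 0, 1, 0, 1], dtype=bool)

# Tally each class into the same bins (width 2, spanning the observed range).
bins = np.arange(-12, 13, 2)
won_counts, _ = np.histogram(dapg[won], bins)
lost_counts, _ = np.histogram(dapg[~won], bins)

# Print a crude text histogram, one row per bin.
for left, right, w, l in zip(bins[:-1], bins[1:], won_counts, lost_counts):
    print(f"[{left:4d}, {right:4d})  won: {'#' * w:<6}  lost: {'#' * l}")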
Slide 12
Statistical Decision Theory
The classification cannot be performed perfectly using the single feature dapg.
Assign each sample to the class with the highest probability of membership, i.e. the smallest expected penalty
Decision boundary or threshold
The value T for the Home Team prediction:
Won: dapg is greater than T
Lost: dapg is less than or equal to T
Slide 13
Statistical Decision Theory
Example with T = -1:
Home team's apg = 103.4
Visiting team's apg = 102.1
dapg = 103.4 – 102.1 = 1.3, and 1.3 > T
The home team is predicted to win the game.
Candidate thresholds: T = 0.8 or T = -6.5
T = 0.8 achieves the minimum error rate on the training set
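To see where such a threshold comes from, one can score every observed dapg value as a candidate T under the rule "predict Won when dapg > T" and count misclassifications. A minimal sketch, continuing with the dapg and won arrays from the histogram example above:

# dapg, won: the NumPy arrays defined in the previous sketch.
def errors(t):
    """Misclassifications under the rule: predict Won iff dapg > t."""
    return int(np.sum((dapg > t) != won))

candidates = np.unique(dapg)            # each observed value is a candidate T
best = min(errors(t) for t in candidates)
minimizers = [float(t) for t in candidates if errors(t) == best]
print(best, minimizers)                 # 5 errors; T = 0.8 is among the minimizers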
Slide 14
Statistical Decision Theory
Adding a second feature to increase the accuracy of classification:
dwp = Home Team wp – Visiting Team wp
wp denotes the winning percentage
Slide 15
Statistical Decision Theory
Game   dapg    dwp   Home Team      Game   dapg    dwp   Home Team
  1     1.3   25.0   Won              16   -3.1    9.4   Won
  2    -2.7  -16.9   Lost             17    1.7    6.8   Won
  3    -0.5    5.3   Won              18    2.8   17.0   Won
  4    -3.2  -27.5   Lost             19    4.6   13.3   Won
  5     2.3  -18.0   Won              20    3.0  -24.0   Won
  6     5.1   31.2   Won              21    0.7  -17.8   Lost
  7    -5.4    5.8   Lost             22   10.1   44.6   Won
  8     8.2   34.3   Won              23    2.5  -22.4   Won
  9   -10.8  -56.3   Lost             24    0.8   12.3   Lost
 10    -0.4   13.3   Won              25   -5.0   -3.8   Lost
 11    10.5   16.3   Won              26    8.1   36.0   Won
 12    -1.1  -17.6   Lost             27   -7.1  -20.6   Lost
 13     2.5    5.7   Won              28    2.7   23.2   Won
 14    -4.2   16.0   Won              29  -10.0  -46.9   Lost
 15    -3.4   -3.4   Lost             30   -6.5   19.7   Won
Slide 16
Statistical Decision Theory
Feature vector (dapg, dwp)
Presented as a scatterplot
[Figure: scatterplot of the feature vectors (dapg, dwp) for the 30 training games; W marks games the home team won, L marks losses; x-axis: dapg from -10 to 10, y-axis: dwp from -60 to 40.]
Slide 17
Statistical Decision Theory
The feature space can be divided into two decision regions by a straight line
Linear decision boundary
If a feature space cannot be perfectly separated by a straight line, a more complex boundary might be used.
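As one way to fit such a linear boundary, the sketch below uses an ordinary least-squares fit against +1/-1 targets on the Slide 15 data; this choice of method is an assumption made for illustration, not necessarily the technique used in the original course.

import numpy as np

# (dapg, dwp) feature vectors and home-team outcomes from Slide 15.
X = np.array([
    [1.3, 25.0], [-2.7, -16.9], [-0.5, 5.3], [-3.2, -27.5], [2.3, -18.0],
    [5.1, 31.2], [-5.4, 5.8], [8.2, 34.3], [-10.8, -56.3], [-0.4, 13.3],
    [10.5, 16.3], [-1.1, -17.6], [2.5, 5.7], [-4.2, 16.0], [-3.4, -3.4],
    [-3.1, 9.4], [1.7, 6.8], [2.8, 17.0], [4.6, 13.3], [3.0, -24.0],
    [0.7, -17.8], [10.1, 44.6], [2.5, -22.4], [0.8, 12.3], [-5.0, -3.8],
    [8.1, 36.0], [-7.1, -20.6], [2.7, 23.2], [-10.0, -46.9], [-6.5, 19.7]])
won = np.array([1, 0, 1, 0, 1, 1, 0, 1, 0, 1,
                1, 0, 1, 1, 0, 1, 1, 1, 1, 1,
                0, 1, 1, 0, 0, 1, 0, 1, 0, 1], dtype=bool)
y = np.where(won, 1.0, -1.0)            # +1 for Won, -1 for Lost

# Least-squares fit of w1*dapg + w2*dwp + b to the targets;
# the decision boundary is the line where the fitted score is zero.
A = np.hstack([X, np.ones((len(X), 1))])
(w1, w2, b), *_ = np.linalg.lstsq(A, y, rcond=None)
predicted_won = A @ np.array([w1, w2, b]) > 0
print(f"boundary: {w1:.3f}*dapg + {w2:.3f}*dwp + {b:.3f} = 0")
print("training misclassifications:", int(np.sum(predicted_won != won)))

The misclassification count printed at the end shows directly whether this particular line separates the training data perfectly, which is exactly the question the slide raises.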
Slide 18
Exercise One
The values of a feature x for nine samples
from class A are 1, 2, 3, 3, 4, 4, 6, 6, 8.
Nine samples from class B have x values of
4, 6, 7, 7, 8, 9, 9, 10, 12. Make a
histogram (with an interval width of 1) for
each class and find a decision boundary
(threshold) that minimizes the total number
of misclassifications for this training data
set.
Slide 19
Exercise Two
Can the feature vectors (x,y) = (2,3), (3,5),
(4,2), (2,7) from class A be separated from
four samples from class B located at (6,2),
(5,4), (5,6), (3,7) by a linear decision
boundary? If so, give the equation of one
such boundary and plot it. If not, find a
boundary that separates them as well as
possible.