Transcript Document

Exercises
Prof. Gheorghe Tecuci
Learning Agents Laboratory
Computer Science Department
George Mason University
1
Overview
Your exercises
Some general questions and exercises
Sample questions on version space learning
Sample questions on decision tree learning
Sample questions on other learning strategies
2
Version Spaces
Select the correct answers and justify your solution:
The version space for a set of examples given incrementally
(for which there is a concept covering the positive examples
and not covering the negative examples) will decrease (i.e., will
contain strictly fewer concepts) when:
1. Always when a negative example is given
2. Always when a positive example is given
3. Always when a positive example is not covered by any
concept from the lower bound
4. Always when a negative example is covered by all the
concepts from the upper bound
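To build intuition for when the version space actually shrinks, the following sketch enumerates a toy two-attribute conjunctive hypothesis space (the space and names are assumptions for illustration, not part of the exercise): the space shrinks exactly when some still-consistent concept misclassifies the new example.

```python
from itertools import product

# Toy conjunctive hypothesis space over two attributes; each slot is a
# specific value or the wildcard "any" (assumed here for illustration).
COLORS = ["red", "blue", "any"]
SIZES = ["small", "large", "any"]
HYPOTHESES = list(product(COLORS, SIZES))  # 9 concepts

def covers(h, x):
    return all(hv in (xv, "any") for hv, xv in zip(h, x))

def version_space(examples):
    """All hypotheses consistent with a list of (instance, label) pairs."""
    return [h for h in HYPOTHESES
            if all(covers(h, x) == label for x, label in examples)]

print(len(version_space([])))               # 9
pos = (("red", "small"), True)
neg = (("blue", "large"), False)
print(len(version_space([pos])))            # 4
print(len(version_space([pos, neg])))       # 3
# A repeated (or already-excluded) negative example removes nothing:
print(len(version_space([pos, neg, neg])))  # 3
```

Counting the remaining hypotheses after each example makes it easy to test the "always" claims above against concrete cases.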
Mihai Boicu
3
Explanation-based learning
Given:
• The axioms of plane Euclidean geometry
• Several problem solving examples consisting of
geometry problems with their axiomatic solutions
Questions:
1. What will Explanation-Based Learning generate from
one of the examples?
2. Are the learned theorems useful and generally
applicable?
3. How could one learn useful theorems from these
examples?
Cristina Boicu
4
Decision-tree learning
1) Give an example of a training set on which ID3 does not
generate the smallest possible decision tree.
Show the result of applying ID3 and also show a smaller
tree.
Hint: The information gain of an attribute is 0 if the ratio
pi/(pi+ni) is the same for all i; otherwise the information
gain is strictly positive.
2) How would you extend the ID3 algorithm to learn from
examples belonging to more than two classes?
What is the formula for computing the information gain
of an attribute?
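One natural answer to question 2 is sketched below (function names are illustrative): replace the two-class entropy with the general k-class formula, Entropy(S) = -Σ p_i log2(p_i), and compute the gain exactly as before.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Multi-class entropy: -sum_i p_i * log2(p_i)."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(examples, attr):
    """examples: list of (attribute_dict, class_label) pairs."""
    labels = [y for _, y in examples]
    n = len(examples)
    remainder = 0.0
    for value in {x[attr] for x, _ in examples}:
        subset = [y for x, y in examples if x[attr] == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

print(entropy(["a", "b"]))            # 1.0 (two balanced classes)
print(entropy(["a", "b", "c", "d"]))  # 2.0 (four balanced classes)
```

With two classes this reduces to the familiar -p log2(p) - n log2(n) formula, so the extension changes nothing else in ID3.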
Bogdan Stanescu
5
Decision-tree learning
Give a counter example to the heuristic used by the ID3
algorithm for picking the attributes.
Gabriel Balan
6
Decision-tree learning
[Figure: the fourteen training examples for the target concept
PlayTennis, and a decision tree for the concept PlayTennis.]
Yan Sun
(continues)
7
Answer the following questions true or false, and explain the answer:
1. Is it possible to get ID3 to further elaborate the tree below the
rightmost leaf (and make no other changes to the tree), by adding a
single new correct training example to the original fourteen
examples?
2. Is it possible to get ID3 to learn an incorrect tree (i.e., a tree that is
not equivalent to the target concept) by adding new correct training
examples to the original fourteen ones?
3. Is it possible to produce some set of correct training examples that
will get ID3 to include the attribute Temperature in the learned tree,
even though the true target concept is independent of Temperature?
8
Suppose we want to classify whether a given balloon is inflated based on four attributes: color, size, the “act” of the
person holding the balloon, and the age of the person holding the balloon. Show the decision tree that ID3 would
build to learn this classification. Display the information gain for each candidate attribute at the root of the tree.
Color   Size   Act      Age    Inflated?
Yellow  Small  Stretch  Adult  F
Yellow  Small  Stretch  Child  T
Yellow  Small  Dip      Adult  T
Yellow  Small  Dip      Child  T
Yellow  Small  Dip      Child  F
Yellow  Large  Stretch  Adult  T
Yellow  Large  Stretch  Child  T
Yellow  Large  Dip      Adult  T
Yellow  Large  Dip      Child  F
Yellow  Large  Dip      Child  F
Purple  Small  Stretch  Adult  T
Purple  Small  Stretch  Child  T
Purple  Small  Dip      Adult  T
Purple  Small  Dip      Child  F
Purple  Small  Dip      Child  F
Purple  Large  Stretch  Adult  T
Purple  Large  Stretch  Child  T
Purple  Large  Dip      Adult  T
Purple  Large  Dip      Child  F
Purple  Large  Dip      Child  F
Discussion: In this problem there are situations where the
information gain is the same for every attribute, so we cannot
decide which attribute to choose. Are there any methods for
such situations?
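The tie the discussion asks about can be checked mechanically. The sketch below (using the table above; the fixed-order tie-break at the end is one common convention, not part of ID3 itself) computes the root information gains:

```python
from collections import Counter
from math import log2

DATA = """Yellow Small Stretch Adult F
Yellow Small Stretch Child T
Yellow Small Dip Adult T
Yellow Small Dip Child T
Yellow Small Dip Child F
Yellow Large Stretch Adult T
Yellow Large Stretch Child T
Yellow Large Dip Adult T
Yellow Large Dip Child F
Yellow Large Dip Child F
Purple Small Stretch Adult T
Purple Small Stretch Child T
Purple Small Dip Adult T
Purple Small Dip Child F
Purple Small Dip Child F
Purple Large Stretch Adult T
Purple Large Stretch Child T
Purple Large Dip Adult T
Purple Large Dip Child F
Purple Large Dip Child F"""
ATTRS = ["Color", "Size", "Act", "Age"]
rows = [line.split() for line in DATA.splitlines()]

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def gain(rows, i):
    labels = [r[-1] for r in rows]
    rem = 0.0
    for v in {r[i] for r in rows}:
        sub = [r[-1] for r in rows if r[i] == v]
        rem += len(sub) / len(rows) * entropy(sub)
    return entropy(labels) - rem

gains = {a: gain(rows, i) for i, a in enumerate(ATTRS)}
# Color and Size have zero gain; Act and Age tie with the same positive
# gain, so ID3 needs an extra convention, e.g. first in a fixed order:
best = min(gains, key=lambda a: (-gains[a], ATTRS.index(a)))
print(gains, best)
```

On this data the gains of Act and Age are identical, so any deterministic convention (fixed attribute order, random choice, or a secondary criterion such as gain ratio) must break the tie.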
Xianjun Hao
9
Imagine the following attributes related to weather:
a. "wind degree" - windy, calm
b. "sun degree" - sunny, cloudy
c. "rain degree" - raining, not-raining
There are 2^3 = 8 possible "weathers" described by these attributes.
Assign + or - to each of the combinations in such a way that in
every decision tree the depth of each branch equals the
number of attributes (3). How many such trees exist?
How many such trees exist for n attributes? Why?
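One way to attack this exercise is brute force over all 2^8 labelings (a sketch, under the observation that a branch can stop before depth 3 exactly when two instances differing in a single attribute share a label):

```python
from itertools import product

instances = list(product([0, 1], repeat=3))

def forces_full_depth(labeling):
    # A branch can stop early iff some instance shares its label with a
    # neighbor differing in a single attribute (testing that attribute
    # would then be unnecessary on that branch). Full depth everywhere
    # means every Hamming-distance-1 pair gets opposite labels.
    for x in instances:
        for i in range(3):
            y = list(x)
            y[i] ^= 1
            if labeling[x] == labeling[tuple(y)]:
                return False
    return True

count = sum(forces_full_depth(dict(zip(instances, bits)))
            for bits in product([0, 1], repeat=8))
print(count)  # 2: the parity labeling and its complement
```

The brute force confirms that only the two parity-style labelings of the cube force every tree to full depth, which suggests how to argue the general n-attribute case.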
Zbigniew Skolicki
10
Consider the following data:
Height (inches)   Hair Color   Eye Color   Class
61                Brown        Brown       1
63                Brown        Brown       1
69                Brown        Brown       1
74                Brown        Brown       1
67                Brown        Blue        0
63                Blonde       Blue        0
71                Blonde       Brown       0
73                Blonde       Blue        0
Just looking at the table, what concept do you think defines class 1?
Use the ID3 algorithm taught in class to build a decision tree.
(Helpful hints: The entropy of a set whose members all have the same value for the
attribute in question is 0. The entropy of a set which has exactly equal numbers of each
value for the attribute in question is 1.)
(continues)
Charles Day
11
Write out the concept represented by this tree.
Does this rule match your intuitive sense of the concept
represented by the data?
Are you happy with the concept learned using the decision tree?
Why? Do you think this decision tree would do well in classifying
other instances of the concept represented by the data?
What can you say about attributes with a lot of values?
Another method for choosing the attribute on which to split a node
uses the gain ratio, defined as:
GainRatio(S, A) = Gain(S, A) / SplitInformation(S, A)
where the term Split Information is defined as
SplitInformation(S, A) = - Σ(i=1..c) |Si|/|S| * log2(|Si|/|S|)
with S1, ..., Sc the subsets of S produced by the c values of
attribute A.
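The split-information computation can be sketched as follows (function names are illustrative); it shows why gain ratio penalizes many-valued attributes, such as one that splits 8 examples into 8 singleton subsets:

```python
from math import log2

def split_information(sizes):
    """-sum_i |S_i|/|S| * log2(|S_i|/|S|) over the subsets a split creates."""
    n = sum(sizes)
    return -sum(s / n * log2(s / n) for s in sizes if s)

def gain_ratio(gain, sizes):
    return gain / split_information(sizes)

print(split_information([4, 4]))   # 1.0: a balanced binary split
print(split_information([1] * 8))  # 3.0: splitting 8 examples 8 ways
# The same gain is worth three times less under the 8-way split:
print(gain_ratio(0.9, [4, 4]), gain_ratio(0.9, [1] * 8))
```

Because split information grows with the number of subsets, an attribute like Height above, with nearly one value per example, no longer wins automatically.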
12
In ID3, when an attribute has continuous values, one approach for
handling the attribute is to categorize its values into a discrete set
of bins. Sometimes, however, an attribute may have a large finite set
of discrete values that does not lend itself to binning; for example,
an attribute like retail store name, where each example may have a
different value for the attribute. How should a decision tree
algorithm deal with such a situation?
Decision trees have often been applied in data mining applications. A
marketing company may use consumer data to target a specific
group of people earning a certain amount of income or higher. Below
is a set of attributes and associated possible values. What attributes
should be used to create a decision tree that will predict whether a
person’s salary is above $50K? Remember that some attributes
contain continuous values and some contain a large set of
nominal values.
(continues)
Simon Liu
13
age:            continuous.
workclass:      Private, Self-emp-not-inc, Self-emp-inc, Federal-gov,
                Local-gov, State-gov, Without-pay, Never-worked.
fnlwgt:         continuous.
education:      Bachelors, Some-college, 11th, HS-grad, Prof-school,
                Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters,
                1st-4th, 10th, Doctorate, 5th-6th, Preschool.
education-num:  continuous.
marital-status: Married-civ-spouse, Divorced, Never-married, Separated,
                Widowed, Married-spouse-absent, Married-AF-spouse.
occupation:     Tech-support, Craft-repair, Other-service, Sales,
                Exec-managerial, Prof-specialty, Handlers-cleaners,
                Machine-op-inspct, Adm-clerical, Farming-fishing,
                Transport-moving, Priv-house-serv, Protective-serv,
                Armed-Forces.
relationship:   Wife, Own-child, Husband, Not-in-family, Other-relative,
                Unmarried.
race:           White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other,
                Black.
sex:            Female, Male.
capital-gain:   continuous.
capital-loss:   continuous.
hours-per-week: continuous.
native-country: United-States, Cambodia, England, Puerto-Rico, Canada,
                Germany, Outlying-US(Guam-USVI-etc), India, Japan,
                Greece, South, China, Cuba, Iran, Honduras, Philippines,
                Italy, Poland, Jamaica, Vietnam, Mexico, Portugal,
                Ireland, France, Dominican-Republic, Laos, Ecuador,
                Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua,
                Scotland, Thailand, Yugoslavia, El-Salvador,
                Trinadad&Tobago, Peru, Hong, Holand-Netherlands.
class:          >50K, <=50K.
14
Some general questions and exercises
15
Questions
What is an instance?
What is a concept?
What is a positive example of a concept?
What is a negative example of a concept?
Give an intuitive definition of generalization.
What does it mean for concept A to be more general than concept B?
Indicate a simple way to prove that a concept is not more general than
another concept.
Given two concepts C1 and C2, from a generalization point of view,
what are all the different possible relations between them?
16
What is a generalization rule?
What is a specialization rule?
What is a reformulation rule?
Name all the generalization rules you know.
Briefly describe and illustrate with an example the “turning
constants into variables” generalization rule.
Define and illustrate the dropping conditions generalization rule.
17
Questions
Indicate various generalizations of the following sentence:
“A student who has lived in Fairfax for 3 years.”
What could be said about the predictions of a cautious
learner?
What could be said about the predictions of an
aggressive learner?
How could one synergistically integrate a cautious
learner with an aggressive learner to take advantage of
their qualities to compensate for each other’s
weaknesses?
18
Questions
What is the learning bias?
What are the different types of bias?
19
Exercise
Consider the background knowledge represented by the following
generalization hierarchies and theorem:
Generalization hierarchies:
  any-color: warm-color (red, orange, yellow);
             cold-color (blue, green, black)
  any-shape: polygon (triangle, rectangle (square));
             round (circle, ellipse)
Theorem: ∀x ∀y [(ON x y) => (NEAR x y)]
Show that E1 is more general than E2:
E1 = (COLOR x warm-color) & (SHAPE x round) & (COLOR y red) &
(SHAPE y polygon) & (NEAR x y)
E2 = (COLOR u yellow) & (SHAPE u circle) & (COLOR v red) &
(SHAPE v triangle) & (ON u v) & (ISA u toy) & (ISA v toy)
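Part of the proof, checking that a value in E2 is an instance of the corresponding value in E1 under the substitution x <- u, y <- v, amounts to climbing the hierarchy; a minimal sketch (the PARENT table transcribes the hierarchies above, the function name is illustrative):

```python
# Parent links transcribing the two generalization hierarchies above.
PARENT = {
    "red": "warm-color", "orange": "warm-color", "yellow": "warm-color",
    "blue": "cold-color", "green": "cold-color", "black": "cold-color",
    "warm-color": "any-color", "cold-color": "any-color",
    "triangle": "polygon", "rectangle": "polygon", "square": "rectangle",
    "circle": "round", "ellipse": "round",
    "polygon": "any-shape", "round": "any-shape",
}

def is_a(value, concept):
    """True if `concept` equals `value` or is an ancestor of it."""
    while value != concept:
        value = PARENT.get(value)
        if value is None:
            return False
    return True

# Each E2 value climbs to the corresponding E1 value; the theorem then
# turns (ON u v) into (NEAR u v), and dropping the ISA literals finishes
# the argument.
print(is_a("yellow", "warm-color"),
      is_a("circle", "round"),
      is_a("triangle", "polygon"))  # True True True
```

The remaining steps of the exercise (applying the theorem and dropping conditions) are generalization rules rather than hierarchy lookups, so they stay in the written proof.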
20
Consider the background knowledge represented by the following generalization
hierarchies and theorem:
Generalization hierarchies:
  any-color: warm-color (red, orange, yellow);
             cold-color (blue, green, black)
  any-shape: polygon (triangle, rectangle (square));
             round (circle, ellipse)
Theorem: ∀x ∀y [(ON x y) => (NEAR x y)]
Consider also the following concept:
E = (COLOR u yellow) & (SHAPE u circle) & (COLOR v red) &
(SHAPE v triangle) & (ON u v) & (ISA u toy) & (ISA v toy) & (HEIGHT u 5)
Indicate six different generalization rules. For each such rule determine an
expression Eg which is more general than E according to that rule.
21
Consider the following two concepts:
C1: ?X   IS SCREW,  HEAD HEXAGONAL,  COST 5
C2: ?X   IS NUT,  COST 6
Indicate different generalizations of them.
22
Define the following:
•a generalization of two concepts
•a minimally general generalization of two concepts
•the least general generalization of two concepts
•the maximally general specialization of two concepts.
23
Consider the following concepts (drawn as semantic networks in the
original slide):
G1: ?X  IS LOUDSPEAKER-COMPONENT,  MADE-OF ?M
    ?M  IS MATERIAL
    ?Z  IS ADHESIVE,  GLUES ?M
G2: ?X  IS LOUDSPEAKER-COMPONENT,  MADE-OF ?M
    ?M  IS INFLAMMABLE-OBJECT
    ?Z  GLUES ?M
and the following generalization hierarchies (nodes recoverable from
the original figure):
  LOUDSPEAKER-COMPONENT: MEMBRANE, CHASSIS-ASSEMBLY, BOLT
  MATERIAL: CAOUTCHOUC, PAPER, METAL
  ADHESIVE: SCOTCH-TAPE, SUPER-GLUE, MOWICOLL, CONTACT-ADHESIVE
  TOXIC-SUBSTANCE, INFLAMMABLE-OBJECT
Indicate four specializations of G1 and G2 (including two maximally
general specializations).
24
Sample questions on version space learning
25
Version Space questions
What happens if there are not enough examples for S
and G to become identical?
Could we still learn something useful?
How could we classify a new instance?
When could we be sure that the classification is the
same as the one made if the concept were completely
learned?
Could we be sure that the classification is correct?
26
Version Space questions
Could the examples contain errors?
What kind of errors could be found in an example?
What will be the result of the learning algorithm if
there are errors in examples?
What could we do if we know that there is at most one
example wrong?
27
Sample questions on decision tree learning
28
Questions
What induction hypothesis is made in decision tree learning?
What are some reasons for transforming a decision tree into a set of
rules?
How to change the ID3 algorithm to deal with noise in the examples?
What is overfitting and how could it be avoided?
Compare tree pruning with rule post pruning.
How could one use continuous attributes with decision tree learning?
How to deal with missing attribute values?
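For the continuous-attribute question, one standard answer is to turn a numeric attribute into binary tests "value <= t", trying thresholds at midpoints between consecutive sorted values whose class labels differ. A minimal sketch (names and the sample values are illustrative):

```python
def candidate_thresholds(values, labels):
    """Midpoints between consecutive sorted values whose labels differ --
    the standard places to try a binary split of the form value <= t."""
    pairs = sorted(zip(values, labels))
    return [(pairs[i][0] + pairs[i + 1][0]) / 2
            for i in range(len(pairs) - 1)
            if pairs[i][1] != pairs[i + 1][1]
            and pairs[i][0] != pairs[i + 1][0]]

print(candidate_thresholds([48, 60, 72, 80, 90],
                           ["No", "No", "Yes", "Yes", "No"]))
# → [66.0, 85.0]
```

Each candidate threshold is then scored with information gain like any other binary attribute, and the best one competes with the nominal attributes at that node.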
29
Questions
Compare the candidate elimination algorithm with the decision tree
algorithm, from the point of view of the generalization language, the
bias, the search strategy and the use of the examples.
What problems are appropriate for decision tree learning?
What are the main features of decision tree learning?
30
Sample questions on other learning strategies
31
Questions
Questions are in the lecture notes corresponding to each learning
strategy.
32