Advanced Artificial Intelligence: 6. Introduction to Machine Learning and the Version Space Method


Artificial Intelligence
6. Machine Learning, Version Space Method
Japan Advanced Institute of Science and Technology (JAIST)
Yoshimasa Tsuruoka
Outline
• Introduction to machine learning
– What is machine learning?
– Applications of machine learning
• Version space method
– Representing hypotheses, version space
– Find-S algorithm
– Candidate-Elimination algorithm
• http://www.jaist.ac.jp/~tsuruoka/lectures/
Recognizing handwritten digits
Hastie, Tibshirani and Friedman (2008). The Elements of Statistical Learning
(2nd edition). Springer-Verlag.
Natural language processing
• GENIA tagger
– Tokenization
– Part-of-speech tagging
– Shallow parsing
– Named entity recognition
http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/tagger/
Applications of machine learning
• Image/speech recognition
• Part-of-speech tagging, syntactic parsing, word
sense disambiguation
• Detection of spam emails
• Intrusion detection
• Credit card fraud detection
• Automatic driving
• AI players in computer games
• etc.
Types of machine learning
• Supervised learning
– “correct” output is given for each instance
• Unsupervised learning
– No output is given
– Analyses relations between instances
• Reinforcement learning
– Supervision is given via “rewards”
Application of unsupervised learning
• Search engine + clustering
http://clusty.com
Reinforcement learning
• Autonomous helicopters and robots
Andrew Y. Ng, Adam Coates, Mark Diel, Varun Ganapathi, Jamie Schulte, Ben Tse, Eric Berger and Eric Liang. Inverted autonomous helicopter flight via reinforcement learning. In International Symposium on Experimental Robotics, 2004.
Honglak Lee, Yirong Shen, Chih-Han Yu, Gurjeet Singh, and Andrew Y. Ng. Quadruped robot obstacle negotiation via reinforcement learning. In Proceedings of the IEEE International Conference on Robotics and Automation, 2006.
What is machine learning?
• What does a machine learn?
• What “machine learning” can do:
– Classification, regression, structured prediction
– Clustering
• Machine learning involves
– Numerical optimization, probabilistic modeling,
graphs, search, logic, etc.
Why machine learning?
• Why not write rules manually?
– Detecting spam emails
• If the mail contains the word “Nigeria” then it is spam
• If the mail comes from IP X.X.X.X then it is spam
• If the mail contains a large image then it is spam
• …
• Too many rules
• Hard to keep consistency
• Each rule may not be completely correct
Version space method
Chapter 2 of Mitchell, T., Machine Learning (1997)
• Concept learning
• Training examples
• Representing hypotheses
• Find-S algorithm
• Version space
• Candidate-Elimination algorithm
Learning a concept with examples
• Training examples (attributes: Sky, AirTemp, Humidity, Wind, Water, Forecast; target concept: EnjoySport)

Ex. | Sky   | AirTemp | Humidity | Wind   | Water | Forecast | EnjoySport
1   | Sunny | Warm    | Normal   | Strong | Warm  | Same     | Yes
2   | Sunny | Warm    | High     | Strong | Warm  | Same     | Yes
3   | Rainy | Cold    | High     | Strong | Warm  | Change   | No
4   | Sunny | Warm    | High     | Strong | Cool  | Change   | Yes
• The concept we want to learn
– Days on which my friend Aldo enjoys his favorite
water sports
Hypotheses
• Representing hypotheses
h1 = <Sunny, ?, ?, Strong, ?, ?>
Sky = Sunny, Wind = Strong
(the other attributes can take any values)
h2 = <Sunny, ?, ?, ?, ?, ?>
Sky = Sunny
• General and Specific
h1 is more specific than h2
(h2 is more general than h1)
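The general/specific relation can be written as a small Python predicate. A minimal sketch; the helper names `matches` and `more_general_or_equal` are mine, not from the lecture:

```python
# A hypothesis is a tuple of attribute constraints; '?' matches any value.

def matches(h, x):
    """True if instance x satisfies every constraint in hypothesis h."""
    return all(c == '?' or c == v for c, v in zip(h, x))

def more_general_or_equal(g, s):
    """True if g covers every instance that s covers."""
    return all(cg == '?' or cg == cs for cs, cg in zip(s, g))

h1 = ('Sunny', '?', '?', 'Strong', '?', '?')   # more specific
h2 = ('Sunny', '?', '?', '?', '?', '?')        # more general

print(more_general_or_equal(h2, h1))  # → True
print(more_general_or_equal(h1, h2))  # → False
```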
Find-S Algorithm
1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x
– For each attribute constraint ai in h
• If the constraint ai is satisfied by x
• Then do nothing
• Else replace ai in h by the next more general constraint
that is satisfied by x
3. Output hypothesis h
Example
h0 = <0, 0, 0, 0, 0, 0>
x1 = <Sunny, Warm, Normal, Strong, Warm, Same>, yes
h1 = <Sunny, Warm, Normal, Strong, Warm, Same>
x2 = <Sunny, Warm, High, Strong, Warm, Same>, yes
h2 = <Sunny, Warm, ?, Strong, Warm, Same>
x3 = <Rainy, Cold, High, Strong, Warm, Change>, no
h3 = <Sunny, Warm, ?, Strong, Warm, Same>
x4 = <Sunny, Warm, High, Strong, Cool, Change>, yes
h4 = <Sunny, Warm, ?, Strong, ?, ?>
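The trace above can be reproduced with a short Python sketch of Find-S for this conjunctive hypothesis language (the function name and data layout are my own choices):

```python
def find_s(examples):
    """Find-S: generalize the most specific hypothesis over positives only."""
    h = None  # represents the initial hypothesis <0, 0, 0, 0, 0, 0>
    for x, label in examples:
        if label != 'Yes':
            continue  # Find-S ignores negative examples
        if h is None:
            h = list(x)  # first positive: most specific consistent hypothesis
        else:
            # replace each violated constraint by the next more general one: '?'
            h = [hi if hi == xi else '?' for hi, xi in zip(h, x)]
    return tuple(h)

data = [
    (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'), 'Yes'),
    (('Sunny', 'Warm', 'High', 'Strong', 'Warm', 'Same'), 'Yes'),
    (('Rainy', 'Cold', 'High', 'Strong', 'Warm', 'Change'), 'No'),
    (('Sunny', 'Warm', 'High', 'Strong', 'Cool', 'Change'), 'Yes'),
]
print(find_s(data))  # → ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```

The output matches h4 in the trace above.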
Problems with the Find-S algorithm
• It is not clear whether the output hypothesis is
the “correct” hypothesis
– There can be other hypotheses that are consistent
with the training examples.
– Why prefer the most specific hypothesis?
• Cannot detect when the training data is
inconsistent
Version Space
• Definition
– Hypothesis space H
– Training examples D
– Version space:
VS_{H,D} = { h ∈ H | Consistent(h, D) }
– The subset of hypotheses from H consistent with
the training examples in D
LIST-THEN-ELIMINATE algorithm
1. VersionSpace ← initialized with a list
containing every hypothesis in H
2. For each training example, <x, c(x)>
– Remove from VersionSpace any hypothesis h for
which h(x) ≠ c(x)
3. Output the list of hypotheses in VersionSpace
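The steps above can be sketched in Python. As a simplification (my assumption, not from the slide), H is enumerated over the attribute values observed in the data plus '?'; the empty hypothesis is omitted, since it can never be consistent with a positive example:

```python
from itertools import product

def matches(h, x):
    return all(c == '?' or c == v for c, v in zip(h, x))

def list_then_eliminate(examples):
    """Enumerate H, then remove hypotheses inconsistent with any example."""
    n = len(examples[0][0])
    # H: every combination of observed attribute values plus '?'
    domains = [sorted({x[i] for x, _ in examples}) for i in range(n)]
    version_space = list(product(*[d + ['?'] for d in domains]))
    for x, label in examples:
        version_space = [h for h in version_space
                         if matches(h, x) == (label == 'Yes')]
    return version_space

data = [
    (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'), 'Yes'),
    (('Sunny', 'Warm', 'High', 'Strong', 'Warm', 'Same'), 'Yes'),
    (('Rainy', 'Cold', 'High', 'Strong', 'Warm', 'Change'), 'No'),
    (('Sunny', 'Warm', 'High', 'Strong', 'Cool', 'Change'), 'Yes'),
]
vs = list_then_eliminate(data)
print(len(vs))  # → 6, the six hypotheses of the final version space
```

This is clearly wasteful (it lists every hypothesis in H), which is exactly the motivation for representing the version space by its S and G boundaries instead.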
Version Space
• Specific boundary and General boundary
S: { <Sunny, Warm, ?, Strong, ?, ?> }
<Sunny, ?, ?, Strong, ?, ?>
<Sunny, Warm, ?, ?, ?, ?> <?, Warm, ?, Strong, ?, ?>
G: { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?> }
• The version space can be represented with S and G.
You don’t have to list all the hypotheses.
Candidate-Elimination algorithm
• Initialization
– G: the set of maximally general hypotheses in H
– S: the set of maximally specific hypotheses in H
• For each training example d, do
– If d is a positive example
• Remove from G any hypothesis inconsistent with d
• For each hypothesis s in S that is not consistent with d
– Remove s from S
– Add to S all minimal generalizations h of s such that
» h is consistent with d, and some member of G is more general than h
• Remove from S any hypothesis that is more general than another
hypothesis in S
– If d is a negative example
• …
Example
1st training example
<Sunny, Warm, Normal, Strong, Warm, Same>, yes
S0: { <0, 0, 0, 0, 0, 0> }
S1: { <Sunny, Warm, Normal, Strong, Warm, Same> }
G0, G1: { <?, ?, ?, ?, ?, ?> }
Example
2nd training example
<Sunny, Warm, High, Strong, Warm, Same>, yes
S1: { <Sunny, Warm, Normal, Strong, Warm, Same> }
S2: { <Sunny, Warm, ?, Strong, Warm, Same> }
G0, G1, G2: { <?, ?, ?, ?, ?, ?> }
Example
3rd training example
<Rainy, Cold, High, Strong, Warm, Change>, no
S2, S3: { <Sunny, Warm, ?, Strong, Warm, Same> }
G3: { <Sunny, ?, ?, ?, ?, ?> <?, Warm, ?, ?, ?, ?> <?, ?, ?, ?, ?, Same> }
G2 : { <?, ?, ?, ?, ?, ?> }
Example
4th training example
<Sunny, Warm, High, Strong, Cool, Change>, yes
S3: { <Sunny, Warm, ?, Strong, Warm, Same> }
S4: { <Sunny, Warm, ?, Strong, ?, ?> }
G4: { <Sunny, ?, ?, ?, ?, ?> <?, Warm, ?, ?, ?, ?> }
G3: { <Sunny, ?, ?, ?, ?, ?> <?, Warm, ?, ?, ?, ?> <?, ?, ?, ?, ?, Same> }
The final version space
S4: { <Sunny, Warm, ?, Strong, ?, ?> }
<Sunny, ?, ?, Strong, ?, ?> <Sunny, Warm, ?, ?, ?, ?> <?, Warm, ?, Strong, ?, ?>
G4: { <Sunny, ?, ?, ?, ?, ?> <?, Warm, ?, ?, ?, ?> }
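The full run above can be reproduced in code. Below is a sketch of Candidate-Elimination for this conjunctive hypothesis language; the negative-example branch, elided as “…” on the earlier slide, mirrors the positive case following Mitchell (1997), Chapter 2. Helper names and the data layout are my own:

```python
def matches(h, x):
    return all(c == '?' or c == v for c, v in zip(h, x))

def more_general_or_equal(g, s):
    return all(cg == '?' or cg == cs for cs, cg in zip(s, g))

def candidate_elimination(examples, domains):
    n = len(domains)
    S = [tuple(['0'] * n)]  # most specific: <0, ..., 0>, matches nothing
    G = [tuple(['?'] * n)]  # most general: <?, ..., ?>, matches everything
    for x, label in examples:
        if label == 'Yes':
            # remove from G any hypothesis inconsistent with x
            G = [g for g in G if matches(g, x)]
            new_S = []
            for s in S:
                if matches(s, x):
                    new_S.append(s)
                    continue
                # minimal generalization of a conjunction is unique:
                # keep matching constraints, relax the rest to '?'
                h = tuple(xi if si in ('0', xi) else '?'
                          for si, xi in zip(s, x))
                if any(more_general_or_equal(g, h) for g in G):
                    new_S.append(h)
            # remove any member of S more general than another member
            S = [s for s in new_S
                 if not any(t != s and more_general_or_equal(s, t)
                            for t in new_S)]
        else:
            # remove from S any hypothesis inconsistent with x
            S = [s for s in S if not matches(s, x)]
            new_G = []
            for g in G:
                if not matches(g, x):
                    new_G.append(g)
                    continue
                # minimal specializations: replace one '?' by any
                # attribute value that x does not have
                for i, c in enumerate(g):
                    if c != '?':
                        continue
                    for v in domains[i]:
                        if v != x[i]:
                            h = g[:i] + (v,) + g[i + 1:]
                            if any(more_general_or_equal(h, s) for s in S):
                                new_G.append(h)
            # remove any member of G more specific than another member
            G = [g for g in new_G
                 if not any(t != g and more_general_or_equal(t, g)
                            for t in new_G)]
    return S, G

data = [
    (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'), 'Yes'),
    (('Sunny', 'Warm', 'High', 'Strong', 'Warm', 'Same'), 'Yes'),
    (('Rainy', 'Cold', 'High', 'Strong', 'Warm', 'Change'), 'No'),
    (('Sunny', 'Warm', 'High', 'Strong', 'Cool', 'Change'), 'Yes'),
]
# attribute domains taken from the observed values (an assumption)
domains = [sorted({x[i] for x, _ in data}) for i in range(6)]
S, G = candidate_elimination(data, domains)
print(S)          # → [('Sunny', 'Warm', '?', 'Strong', '?', '?')]
print(sorted(G))  # → [('?', 'Warm', '?', '?', '?', '?'),
                  #    ('Sunny', '?', '?', '?', '?', '?')]
```

Running this reproduces S4 and G4 from the slides above.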