Transcript Slide 1
Machine Learning
in Practice
Carolyn Penstein Rosé
Language Technologies Institute/
Human-Computer Interaction
Institute
Machine Learning?
Why should we care?
Overwhelmed with data…
www.powerfulinformation.org
What do we do with all of it?
Machine learning is about
automatically finding meaningful
patterns in data
Example for credit history data:
Rule predicts who is more likely to have problems
paying off credit.
But what can
machine learning
do for me
personally?
How I got interested…
Would you believe I started dating
the man who became my husband
because of a common interest in
machine learning?
Apprentices of Wonder: Inside the Neural Network Revolution, by William F. Allman
Your TA
Kai-min Kevin Chang
Language Technologies Institute Ph.D.
[email protected]
http://www.cs.cmu.edu/~kkchang/
Office hour: 2:00pm – 3:00pm @ NSH 2507
Machine Learning in Practice
– Mind Reading
Predicting Human Brain Activity Associated with the
Meanings of Nouns (Mitchell et al., 2008)
In an object-contemplation task, participants were presented with
60 objects and were instructed to think of the same properties of
the stimulus object consistently while being scanned by fMRI
machines.
Given the evoked neural activity signatures, we can
correctly guess what the participants were thinking 70% of the
time!
Help
Machine learning in my work….
Processing conversational data
[Figure: conversation shown along a Time axis]
Student1: I don’t understand what to do next.
Student2: Let me do it.
Support Agent: Student2, it looks like your partner could use some help.
Triggering feedback for
collaborative idea generation…
Speaker: Text
Student 1: People stole sand and stones to use for construction.
Feedback Agent: Yes, stealing sand and stones may destroy the balance and
thus make mountain areas unstable. Thinking about
development of mountain areas, can you think of a kind of
development that may cause a problem?
Student 2: Development of mountain areas often causes problems.
Student 1: It is okay to develop, but there must be some constraints.
[Figure: panels "Process Analysis" and "Unique Ideas"; axes "#Unique Ideas" (0 to 12) and "Time Stamp" (0 to 30); conditions: Individuals+Feedback, Individuals+NoFeedback, Pairs+Feedback, Pairs+NoFeedback; legend labels: Nom+N, Nom+F, Real+N, Real+F]
Why do we care?
If we can understand how our design is affecting behavior
differently over time, we can get more insight into what is the
most fruitful direction for a redesign.
Conclusion
Don’t offer feedback during the first 5 minutes.
Negative effect of Pairs vs Individuals:
F(1,24)=12.22, p<.005, 1 st. dev.
Negative effect of Feedback:
F(1,24)=7.23, p<.05, -1.03 st. dev.
Negative effect of Pairs vs Individuals:
F(1,24)=4.61, p<.05, .61 st. dev.
Positive effect of feedback:
F(1,24)=16.43, p<.0005, 1.37 st. dev.
How does machine learning work?
The simplest rule learner will learn to predict whatever is
the most frequent result class. This is called the majority
class.
What will the rule be in this case?
It will … always predict yes.
A slightly more sophisticated rule learner will find the
feature that gives the most information about the result class.
What do you think that would be in this case?
<Feature Name>:
<value> -> <prediction>
<value> -> <prediction>
Outlook:
Sunny -> No
Overcast -> Yes
Rainy -> Yes
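The two rule learners described on this slide can be sketched in a few lines of Python. This is a minimal sketch, not Weka's implementation. It assumes the 14-row nominal weather data set (the rows below are the commonly cited version and should be treated as illustrative), it measures "most information" as information gain, and the helper names (majority_class, best_single_feature_rule, and so on) are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2

# Assumed data: the nominal "weather" table; the last column is the class (play).
FEATURES = ["outlook", "temperature", "humidity", "windy"]
ROWS = [
    ("sunny", "hot", "high", "false", "no"),
    ("sunny", "hot", "high", "true", "no"),
    ("overcast", "hot", "high", "false", "yes"),
    ("rainy", "mild", "high", "false", "yes"),
    ("rainy", "cool", "normal", "false", "yes"),
    ("rainy", "cool", "normal", "true", "no"),
    ("overcast", "cool", "normal", "true", "yes"),
    ("sunny", "mild", "high", "false", "no"),
    ("sunny", "cool", "normal", "false", "yes"),
    ("rainy", "mild", "normal", "false", "yes"),
    ("sunny", "mild", "normal", "true", "yes"),
    ("overcast", "mild", "high", "true", "yes"),
    ("overcast", "hot", "normal", "true", "yes"),
    ("rainy", "mild", "high", "true", "no"),
]


def majority_class(rows):
    """Simplest learner: always predict the most frequent class."""
    return Counter(row[-1] for row in rows).most_common(1)[0][0]


def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(n / total * log2(n / total) for n in Counter(labels).values())


def labels_by_value(rows, col):
    """Group the class labels by the value each row has in column `col`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[col]].append(row[-1])
    return groups


def information_gain(rows, col):
    """Reduction in class entropy obtained by splitting on column `col`."""
    groups = labels_by_value(rows, col)
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - remainder


def best_single_feature_rule(rows):
    """Pick the most informative feature; predict the majority class per value."""
    best = max(range(len(FEATURES)), key=lambda col: information_gain(rows, col))
    rule = {value: Counter(labels).most_common(1)[0][0]
            for value, labels in labels_by_value(rows, best).items()}
    return FEATURES[best], rule


if __name__ == "__main__":
    print("Majority class:", majority_class(ROWS))
    name, rule = best_single_feature_rule(ROWS)
    print(name + ":")
    for value, prediction in rule.items():
        print(f"  {value} -> {prediction}")
```

On this assumed data the majority class is yes, and the single-feature rule comes out as the Outlook rule from the slide.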
What is machine learning?
Automatically or semi-automatically
Inducing
concepts (i.e., rules) from data
Finding patterns in data
Explaining data
Making predictions
Data -> Learning Algorithm -> Model
New Data + Model -> Classification Engine -> Prediction
What will be the prediction?
Model
Outlook:
Sunny -> No
Overcast -> Yes
Rainy-> Yes
New Data
Yes
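The flow from data to model to prediction can be sketched as two small functions. This is a minimal sketch under stated assumptions: the training rows, the new instance, and the function names learn and classify are all hypothetical, and the "model" is just a lookup table for one feature.

```python
from collections import Counter, defaultdict


def learn(training_data, feature, target):
    """Learning algorithm: turn labeled data into a model (a lookup table)."""
    votes = defaultdict(Counter)
    for instance in training_data:
        votes[instance[feature]][instance[target]] += 1
    rules = {value: counts.most_common(1)[0][0] for value, counts in votes.items()}
    # Fall back to the overall majority class for values never seen in training.
    default = Counter(i[target] for i in training_data).most_common(1)[0][0]
    return {"feature": feature, "rules": rules, "default": default}


def classify(model, instance):
    """Classification engine: apply the model to one new, unlabeled instance."""
    value = instance.get(model["feature"])
    return model["rules"].get(value, model["default"])


# Hypothetical labeled data (only the outlook feature and the class are kept).
training_data = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "yes"},
    {"outlook": "overcast", "play": "yes"},
    {"outlook": "overcast", "play": "yes"},
    {"outlook": "rainy", "play": "yes"},
    {"outlook": "rainy", "play": "yes"},
    {"outlook": "rainy", "play": "no"},
]

model = learn(training_data, feature="outlook", target="play")
new_data = {"outlook": "overcast"}  # hypothetical new instance
prediction = classify(model, new_data)
print(prediction)
```

Keeping learning and classification as separate steps mirrors the diagram: the model is built once from labeled data and then reused on every new instance.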
Terminology
Concept: the rule you
want to learn
Instance: one data
point from your training
or testing data (row in
table)
Attribute: one of the
features that an
instance is composed
of (column in table)
* Compute the predicted value.
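To make the terminology concrete, here is a small sketch of a data table in Python. The three rows and the helper names instance and attribute are hypothetical, chosen only to illustrate that an instance is a row and an attribute is a column.

```python
# Hypothetical three-row excerpt of a weather-style table.
ATTRIBUTES = ("outlook", "temperature", "humidity", "windy", "play")
TABLE = [
    ("sunny", "hot", "high", "false", "no"),
    ("overcast", "hot", "high", "false", "yes"),
    ("rainy", "mild", "high", "false", "yes"),
]


def instance(table, row_index):
    """An instance is one row: a value for every attribute."""
    return dict(zip(ATTRIBUTES, table[row_index]))


def attribute(table, name):
    """An attribute is one column: the same feature across all instances."""
    col = ATTRIBUTES.index(name)
    return [row[col] for row in table]


print(instance(TABLE, 0))
print(attribute(TABLE, "outlook"))
```

The concept to learn would then be a rule that maps the other attributes of an instance to its play value.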
What do concepts look like?
Clarification: Concepts as Lines
[Figure: diagram with labels R, S, T, B, C and points marked X]
Styles of Learning
Classification – learn rules from labeled
instances that allow you to assign new
instances to a class
Association – look for relationships between
features, not just rules that predict a class
from an instance (more general)
Clustering – look for instances that are
similar (involves comparisons of multiple
features)
Numeric Prediction (regression models)
6 Data sets that come with Weka
The weather problem: tiny fictitious data set
Supposedly helps you predict whether you should go
outside to play based on features of the weather
Contact lenses: still fake but slightly more realistic
Data for telling you what type of contact lenses a
person should have based on information about the
patient
Irises: numeric predictors, nominal target attribute
Famous data set from the 1930s
50 examples each of 3 types of irises
Learn rules for determining which type of iris you have
6 Data sets that come with Weka
CPU Performance: both predictors and target are numeric
Predict CPU performance based on computer configuration
information
Labor negotiations: predict whether the outcome of
negotiations was good or not (nominal target)
Real data from labor negotiations in Canada
Both nominal and numeric predictors
Some missing and noisy data
Soybean classification: classic machine learning problem
Rules for diagnosing soybean diseases
Data taken from questionnaires about soybean diseases
Why is this course different from
typical machine learning courses?
Machine learning researchers focus on general
purpose algorithms
Data simply provides a level playing field
Data “cleaned up” for evaluations
Data selected for showing off the algorithm’s strengths
Focus on relative results on standard data sets
Strive for generalizability across applications
Why is this course different from
typical machine learning courses?
Applied machine learning researchers focus on
doing something practical using machine learning
Focus on the data representation/feature construction
Focus on understanding the data
Focus on usable results
Data used “as is”
Data selected based on task
Dual Focus: Machine Learning and
Language
Language interactions are important for
many types of applications
Computer mediated communication
Computer supported education
Speech based applications
My expertise is in computational linguistics
Computational linguistics is a good example of
an applied machine learning field
Rich space of possible data representations
Course Objectives
Gain an appreciation for what machine learning is
and is not
Gain competence in applying machine learning
technology in a purposeful manner to your
research
Basic data manipulation skills
Data structure design skills
Means-ends analysis
Solid corpus based experimentation methodology
Evaluating and reporting your results
Course Objectives
Learn problem solving skills for moving forward in
the face of difficulties
Data interpretation
Error analysis skills
Hypothesis formation skills
Gain access to the primary literature
Readings
Witten, I. H. & Frank, E. (2005). Data Mining:
Practical Machine Learning Tools and
Techniques, second edition, Elsevier: San
Francisco
Jackson, P. & Moulinier, I. (2002). Natural
Language Processing for Online Applications:
Text Retrieval, Extraction, and Categorization,
John Benjamins Publishing Company:
Philadelphia
Selected readings
Will be posted to Blackboard in Course Documents
Folder
Software Tools
Weka
Open source machine learning toolkit
Includes Java API
We will do most of our work in this package
TagHelper
Weka add-on for text processing
Developed at CMU
We will introduce this tool in Lecture 10
Data manipulation tools
Whatever you are comfortable with
Scripting language like Perl
Excel
Course Projects
Problem of interest in the real world
Lots of data in a machine readable format
Well defined question with an unknown
answer
Should be room for improvement over
baseline
Deliverables: Proposal, Final Report
(Substantial implementation not required!)
Some previous projects…
Predicting which neighborhoods would become more or less
ethnically diverse
Identifying people by their handwriting
Predicting political affiliation from demographic information
Predicting music genre from lyrics
Predicting level of grammatical competence from writing
samples
Assessing level of cultural sensitivity from a newsgroup post
Predicting whether someone will be late to their next meeting
Predicting whether a student is contributing productively to
their working group in a project class
Predicting whether a newsgroup post will get a response
Predicting whether a newsgroup post was written by a male
or female
Browsable Summaries
Predicting a student’s tendency to behave like a “slacker”
based on behavior in a message board environment.
Predict likelihood of reply
For messages posted to online discussion
groups. Based on features of the message and
target group, such as:
message length
on-topicness
self-disclosing language
requests
usual group traffic
usual group response rate
By juxtaposing raw data with model predictions, the application would allow you to:
1. View the model’s predictions of destination
2. Compare those predictions with raw sensor data
3. Interactively label a ground truth, for accuracy estimates
Grading
Quizzes (10%)
Weekly assignments (20%)
To give you practice on the skills of the week
You will get credit for doing these
2 Midterms (10% each)
These will be practical exercises meant to test your competence,
not focused on memorization
Course project (50%)
Evaluated based on demonstration of competence, not accuracy of
the technology
Project proposals: includes a write up of preliminary work (defining
the problem, building baseline approach, evaluating baseline
performance, error analysis of baseline approach)
Final project report will discuss your experimentation process and
final results in comparison with baseline approach
Questions?