Transcript Slide 1
Machine Learning in Practice
Carolyn Penstein Rosé
Language Technologies Institute / Human-Computer Interaction Institute

Machine Learning? Why should we care?
Overwhelmed with data…
www.powerfulinformation.org
What do we do with all of it?

Machine learning is about automatically finding meaningful patterns in data
Example for credit history data: Rule predicts who is more likely to have problems paying off credit.

But what can machine learning do for me personally?

How I got interested…
Would you believe I started dating the man who became my husband because of a common interest in machine learning?
Apprentices of Wonder: Inside the Neural Network Revolution, by William F. Allman

Your TA
Kai-min Kevin Chang
Language Technologies Institute Ph.D.
[email protected]
http://www.cs.cmu.edu/~kkchang/
Office hour: 2:00pm – 3:00pm @ NSH 2507

Machine Learning in Practice – Mind Reading
Predicting Human Brain Activity Associated with the Meanings of Nouns (Mitchell et al., 2008)
In an object-contemplation task, participants were presented with 60 objects and were instructed to think of the same properties of the stimulus object consistently while being scanned by fMRI machines.
Given the evoked neural activity signatures, we can correctly guess what the participants were thinking 70% of the time!

Machine learning in my work…
Processing conversational data
  Student1: I don’t understand what to do next.
  Student2: Let me do it.
  Support Agent: Student2, it looks like your partner could use some help.

Triggering feedback for collaborative idea generation…
Speaker: Text
  Student 1: People stole sand and stones to use for construction.
  FEEDBACK AGENT: Yes, stealing sand and stones may destroy the balance and thus make mountain areas unstable. Thinking about development of mountain areas, can you think of a kind of development that may cause a problem?
  Student 2: Development of mountain areas often causes problems.
  Student 1: It is okay to develop, but there must be some constraints.

Process Analysis
[Figure: #Unique Ideas plotted against Time Stamp for four conditions: Individuals+Feedback, Individuals+NoFeedback, Pairs+Feedback, Pairs+NoFeedback (legend abbreviations: Nom+N, Nom+F, Real+N, Real+F)]
Negative effect of Pairs vs Individuals: F(1,24)=12.22, p<.005, 1 st. dev.
Negative effect of Feedback: F(1,24)=7.23, p<.05, -1.03 st. dev.

Negative effect of Pairs vs Individuals: F(1,24)=4.61, p<.05, .61 st. dev.
Positive effect of feedback: F(1,24)=16.43, p<.0005, 1.37 st. dev.

Why do we care?
If we can understand how our design is affecting behavior differently over time, we can get more insight into what is the most fruitful direction for a redesign.

Conclusion
Don’t offer feedback during the first 5 minutes.

How does machine learning work?
The simplest rule learner will learn to predict whatever is the most frequent result class. This is called the majority class.
What will the rule be in this case?
It will always predict yes.

A slightly more sophisticated rule learner
It will find the feature that gives the most information about the result class.
What do you think that would be in this case?
<Feature Name>:
  <value> -> <prediction>
  <value> -> <prediction>
  …
Outlook:
  Sunny -> No
  Overcast -> Yes
  Rainy -> Yes

What is machine learning?
Automatically or semi-automatically
  Inducing concepts (i.e., rules) from data
  Finding patterns in data
  Explaining data
  Making predictions
Data -> Learning Algorithm -> Model
New Data -> Classification Engine -> Prediction

What will be the prediction?
Model
  Outlook:
    Sunny -> No
    Overcast -> Yes
    Rainy -> Yes
New Data
Prediction: Yes

Terminology
Concept: the rule you want to learn
Instance: one data point from your training or testing data (row in table)
Attribute: one of the features that an instance is composed of (column in table)
* Compute the predicted value.

What do concepts look like?
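As a rough illustration of the two rule learners described in the "How does machine learning work?" slides, here is a minimal Python sketch. It is not the Weka implementation. The table of (outlook, play) pairs is an assumption, modeled on the small fictitious weather data set, and the function names majority_class, one_feature_rule, and predict are hypothetical. The second learner here builds a rule for one given feature only; choosing which feature is most informative is left out.

```python
from collections import Counter, defaultdict

# Assumed illustrative data: (outlook, play) pairs modeled on the
# small fictitious weather data set. Values are for demonstration only.
DATA = [
    ("sunny", "no"), ("sunny", "no"), ("overcast", "yes"), ("rainy", "yes"),
    ("rainy", "yes"), ("rainy", "no"), ("overcast", "yes"), ("sunny", "no"),
    ("sunny", "yes"), ("rainy", "yes"), ("sunny", "yes"), ("overcast", "yes"),
    ("overcast", "yes"), ("rainy", "no"),
]


def majority_class(rows):
    """Simplest learner: always predict the most frequent result class."""
    return Counter(label for _, label in rows).most_common(1)[0][0]


def one_feature_rule(rows):
    """For one feature, map each value to its most frequent result class."""
    counts_by_value = defaultdict(Counter)
    for value, label in rows:
        counts_by_value[value][label] += 1
    return {
        value: counts.most_common(1)[0][0]
        for value, counts in counts_by_value.items()
    }


def predict(rule, value, default):
    """Apply the rule; fall back to the default for unseen values."""
    return rule.get(value, default)


if __name__ == "__main__":
    baseline = majority_class(DATA)
    rule = one_feature_rule(DATA)
    print("Majority class:", baseline)
    for value, label in rule.items():
        print(f"Outlook: {value} -> {label}")
    print("Prediction for overcast:", predict(rule, "overcast", baseline))
```

On this assumed table the majority-class learner always predicts yes, and the single-feature rule matches the Outlook rule on the slide. Falling back to the majority class for unseen values is a design choice of this sketch, not something the slides specify.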
Clarification: Concepts as Lines
[Figure: diagram with labels R, S, T, B, C and instances marked X]

Styles of Learning
Classification – learn rules from labeled instances that allow you to assign new instances to a class
Association – look for relationships between features, not just rules that predict a class from an instance (more general)
Clustering – look for instances that are similar (involves comparisons of multiple features)
Numeric Prediction (regression models)

6 Data sets that come with Weka
The weather problem: tiny fictitious data set
  Supposedly helps you predict whether you should go outside to play based on features of the weather
Contact lenses: still fake but slightly more realistic
  Data for telling you what type of contact lenses a person should have based on information about the patient
Irises: numeric predictors, nominal target attribute
  Famous data set from the 1930s
  50 examples each of 3 types of irises
  Learn rules for determining which type of iris you have

6 Data sets that come with Weka
CPU Performance: both predictors and target are numeric
  Predict CPU performance based on computer configuration information
Labor negotiations: predict whether the outcome of negotiations was good or not (nominal predictor)
  Real data from labor negotiations in Canada
  Both nominal and numeric predictors
  Some missing and noisy data
Soybean classification: classic machine learning problem
  Rules for diagnosing soybean diseases
  Data taken from questionnaires about soybean diseases

Why is this course different from typical machine learning courses?
Machine learning researchers focus on general purpose algorithms
  Data simply provides a level playing field
  Data “cleaned up” for evaluations
  Data selected for showing off the algorithm’s strengths
  Focus on relative results on standard data sets
  Strive for generalizability across applications

Why is this course different from typical machine learning courses?
Applied machine learning researchers focus on doing something practical using machine learning
  Focus on the data representation / feature construction
  Focus on understanding the data
  Focus on usable results
  Data used “as is”
  Data selected based on task

Dual Focus: Machine Learning and Language
Language interactions are important for many types of applications
  Computer mediated communication
  Computer supported education
  Speech based applications
My expertise is in computational linguistics
Computational linguistics is a good example of an applied machine learning field
  Rich space of possible data representations

Course Objectives
Gain an appreciation for what machine learning is and is not
Gain competence in applying machine learning technology in a purposeful manner to your research
  Basic data manipulation skills
  Data structure design skills
  Means-ends analysis
  Solid corpus based experimentation methodology
  Evaluating and reporting your results

Course Objectives
Learn problem solving skills for moving forward in the face of difficulties
  Data interpretation
  Error analysis skills
  Hypothesis formation skills
Gain access to the primary literature

Readings
Witten, I. H. & Frank, E. (2005). Data Mining: Practical Machine Learning Tools and Techniques, second edition, Elsevier: San Francisco
Jackson, P. & Moulinier, I. (2002).
Natural Language Processing for Online Applications: Text Retrieval, Extraction, and Categorization, John Benjamins Publishing Company: Philadelphia
Selected readings
  Will be posted to blackboard in Course Documents Folder

Software Tools
Weka
  Open source machine learning toolkit
  Includes Java API
  We will do most of our work in this package
TagHelper
  Weka add-on for text processing
  Developed at CMU
  We will introduce this tool in Lecture 10
Data manipulation tools
  Whatever you are comfortable with
  Scripting language like Perl
  Excel

Course Projects
Problem of interest in the real world
Lots of data in a machine readable format
Well defined question with an unknown answer
Should be room for improvement over baseline
Deliverables: Proposal, Final Report
(Substantial implementation not required!)

Some previous projects…
Predicting which neighborhoods would become more or less ethnically diverse
Identifying people by their handwriting
Predicting political affiliation from demographic information
Predicting music genre from lyrics
Predicting level of grammatical competence from writing samples
Assessing level of cultural sensitivity from a newsgroup post
Predicting whether someone will be late to their next meeting
Predicting whether a student is contributing productively to their working group in a project class
Predicting whether a newsgroup post will get a response
Predicting whether a newsgroup post was written by a male or female
Browsable Summaries
Predicting student’s tendency to behave like a “slacker” based on behavior in a message board environment.

Predict likelihood of reply
For messages posted to online discussion groups.
Based on features of the message and target group, such as:
  message length
  on-topicness
  self-disclosing language
  requests
  usual group traffic
  usual group response rate

Juxtaposing raw data with model predictions, the application would allow you to:
  1 View the model’s predictions of destination
  2 Compare those predictions with raw sensor data
  3 Interactively label a ground truth, for accuracy estimates

Grading
Quizzes (10%)
Weekly assignments (20%)
  To give you practice on the skills of the week
  You will get credit for doing these
2 Midterms (10% each)
  These will be practical exercises meant to test your competence, not focused on memorization
Course project (50%)
  Evaluated based on demonstration of competence, not accuracy of the technology
  Project proposals: includes a write up of preliminary work (defining the problem, building baseline approach, evaluating baseline performance, error analysis of baseline approach)
  Final project report will discuss your experimentation process and final results in comparison with baseline approach

Questions?