Multiclass Sentiment Analysis with Restaurant Reviews
Moontae Lee and Patrick Grafe

OpenTable.com Data Set
• Overall Rating (1 to 5 stars)
• Food Rating (1 to 5 stars)
• Ambiance Rating (1 to 5 stars)
• Service Rating (1 to 5 stars)
• Noise Rating (1 to 3)

Data Set Statistics
• Heavily biased toward 5-star ratings

Strategies
• Spell Correction
• POS Tagging
• Unigram/Bigram/Trigram
• Stop Words
• Pruning

Spell Correction
Common Spelling Mistakes:
• Restaurant: resturant, restuarant, restaurante
• Waiter: waitor
• Service: sevice, serivce
Distance Metrics:
• Edit Distance
• Levenshtein Distance
• Keyboard Distance
• Sound Distance

Parsing
Problem Sentences:
• The atmosphere is pretty bad and food is quite good
• The food, service, and atmosphere were fantastic!

Results
Features                                                              Train Acc.  Train MSE  Test Acc.  Test MSE
Unigram                                                               84.62%      0.3398     57.36%     0.8231
Unigram with spell check                                              84.53%      0.3256     57.12%     0.8297
Unigram/Bigram/Trigram                                                95.65%      0.1058     57.18%     0.9181
Unigram/Bigram/Trigram with pruning                                   98.66%      0.0321     56.71%     0.9052
Unigram/Bigram/Trigram with spell checking                            95.53%      0.1088     57.27%     0.8984
Unigram/Bigram/Trigram with no stop words and spell check             95.38%      0.1110     57.42%     0.8936
Unigram/Bigram/Trigram with no stop words, pruning, and spell check   98.77%      0.0304     56.56%     0.8970

Conclusions
• Inherently Difficult Data Set
• More Advanced Techniques Necessary
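
Sketch: spell correction by edit distance
The spell-correction strategy lists edit (Levenshtein) distance as one of its metrics. A minimal sketch of how a misspelling such as "resturant" could be mapped back to a known word is shown here. The vocabulary, the `max_distance` threshold, and the function names are assumptions made for illustration; the slides do not say how the authors implemented this step.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical vocabulary; a real system would use a full dictionary.
VOCABULARY = ["restaurant", "waiter", "service"]


def correct(word: str, vocabulary=VOCABULARY, max_distance: int = 2) -> str:
    """Return the closest vocabulary word, or the word itself
    if nothing lies within max_distance edits (assumed threshold)."""
    best = min(vocabulary, key=lambda v: edit_distance(word, v))
    return best if edit_distance(word, best) <= max_distance else word


for misspelled in ["resturant", "restuarant", "waitor", "serivce", "food"]:
    print(misspelled, "->", correct(misspelled))
```

Transposed letters ("restuarant", "serivce") cost two edits under plain Levenshtein distance, which is why the threshold here is 2 rather than 1. A keyboard or sound distance, also listed on the slide, would weight these errors differently.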
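
Sketch: unigram/bigram/trigram features with stop-word removal
The strategies above combine n-gram features with stop-word filtering. The following is a rough sketch of that feature extraction, not the authors' pipeline: the tokenizer regex, the tiny stop-word set, and the function names are all assumptions.

```python
import re


def ngrams(tokens, n):
    """All contiguous n-token sequences, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical stop-word list, far smaller than a realistic one.
STOP_WORDS = frozenset({"the", "is", "and", "were"})


def extract_features(text, max_n=3, stop_words=frozenset()):
    """Lowercase, tokenize, drop stop words, then emit 1..max_n grams."""
    tokens = [t for t in re.findall(r"[a-z']+", text.lower())
              if t not in stop_words]
    features = []
    for n in range(1, max_n + 1):
        features.extend(ngrams(tokens, n))
    return features


print(extract_features("The food is quite good", max_n=2,
                       stop_words=STOP_WORDS))
```

Removing stop words before building n-grams makes non-adjacent words adjacent ("food quite"), which is one possible design; filtering after n-gram construction would behave differently.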
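
Sketch: accuracy and MSE for star ratings
The results report both accuracy and MSE. Assuming MSE here means the mean squared difference between predicted and true star ratings (the slides do not define it), the two metrics can be computed as follows. The sample ratings are made up purely to exercise the functions.

```python
def accuracy(y_true, y_pred):
    """Fraction of reviews whose predicted rating matches exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared gap between predicted and true ratings."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative ratings only, not taken from the OpenTable data.
y_true = [5, 4, 5, 3, 1]
y_pred = [5, 5, 4, 3, 3]

print("accuracy:", accuracy(y_true, y_pred))
print("MSE:", mean_squared_error(y_true, y_pred))
```

Unlike accuracy, MSE penalizes a prediction that is two stars off four times as much as one that is one star off, so the two columns can move in different directions.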