13 Twitter Sentiment Analysis


DATA SCIENCE
MIS0855 | Spring 2016
Twitter Sentiment Analysis
SungYong Um
Example: ProgrammableWeb
Method I: Crawling
Example: Twitter
• Dataset: Timelines (Hillary Clinton, Donald Trump, Marco Rubio, and Ted Cruz)
• Period: December 1 – January 31
• Sample size: total number of documents = 104,726; total number of terms = 5,799
Frequency Figure (axes: Terms vs. count, with count ranging from 0 to 20,000)
Terms shown: win, vote, trump, support, stop, rubio, right, president, poll, peopl, obama, makeamericagreatagain, lie, hillary, help, gop, donald, debat, carson, candidate, book, american, america
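Counts like those in the frequency figure can be produced by tokenizing each tweet and tallying the terms. The following is a minimal sketch, not the method used for the slides: the sample tweets, the `tokenize` and `term_counts` helpers, and the tiny stop-word list are all assumptions for illustration, and no stemming is applied.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "to", "of", "in", "and", "is", "for", "on", "rt"}

def tokenize(text):
    """Lowercase a tweet, drop URLs, and return its alphabetic tokens."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    return [t for t in re.findall(r"[a-z]+", text) if t not in STOP_WORDS]

def term_counts(tweets):
    """Tally how often each term occurs across all tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tokenize(tweet))
    return counts

# Made-up example tweets, not taken from the dataset.
sample_tweets = [
    "Vote for America! https://example.com/x",
    "RT The debate is on: vote, vote, vote",
    "America needs a president for the people",
]

counts = term_counts(sample_tweets)
for term, n in counts.most_common(3):
    print(term, n)
```

Sorting the resulting counts and plotting the top terms would give a chart of the same kind as the figure.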
Topic Modeling
Find 5 topics and first 10 terms of each topic:
Topic 1: trump, support, makeamericagreatagain, donald, iowa, interview, vote, nation, meet, crowd
Topic 2: america, gopdebat, american, famili, demdeb, live, work, join, fight, sign
Topic 3: presid, poll, isis, obama, right, lead, report, gop, job, gun
Topic 4: debat, win, mani, marco, word, nice, total, rubio, stori, tri
Topic 5: hillari, republican, countri, plan, candid, state, team, clinton, care, leader
The probabilities of each topic
“After the events in Paris and with thousands of gun deaths in the US each year,
hard to fathom this from the GOP. https://t.co/CQp8sCKx7G -H”
(0.186, 0.169, 0.305, 0.169, 0.169)
“RT @TheBriefing2016: There was only one candidate on stage at the #demdebate last
weekend who wouldn't raise middle class taxes. https://t.…”
(0.200, 0.200, 0.182, 0.218, 0.200)
“RT @TheBriefing2016: Hey folks! Laura Rosenberger here, Hillary’s foreign policy
advisor (@rosenbergerlm). I’m taking over the briefing fo…”
(0.169, 0.169, 0.169, 0.169, 0.322)
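Each vector above gives one tweet's probability of belonging to Topics 1 through 5, so the largest entry marks that tweet's dominant topic. A small sketch of that reading, using the three vectors from the slide; the `dominant_topic` helper is a hypothetical name introduced here.

```python
# Topic probabilities for the three example tweets, copied from the slide.
tweet_topic_probs = [
    (0.186, 0.169, 0.305, 0.169, 0.169),
    (0.2, 0.2, 0.182, 0.218, 0.2),
    (0.169, 0.169, 0.169, 0.169, 0.322),
]

def dominant_topic(probs):
    """Return the 1-based index of the most probable topic and its probability."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best + 1, probs[best]

for probs in tweet_topic_probs:
    topic, p = dominant_topic(probs)
    # The probabilities should sum to about 1; small gaps come from rounding.
    print(f"dominant topic: {topic} (p={p}), sum={sum(probs):.3f}")
```

Under this reading the three tweets fall under Topics 3, 4, and 5 respectively, although the second vector is close to uniform, so its assignment is weak.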
Try with your own
• Retrieve Tweets using a spreadsheet add-in in Google Drive
• Copy the tweets from Google Drive to a special Excel workbook
• Excel classifies the Tweets as positive or negative
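The positive/negative classification in the last step can be illustrated with a simple word-list approach. This is only a sketch of the general idea, not the method the Excel workbook uses: the word lists, the `classify` function, the extra "neutral" label, and the example tweets are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny sentiment word lists.
POSITIVE = {"great", "good", "win", "love", "support", "nice"}
NEGATIVE = {"bad", "lie", "lose", "hate", "fail", "wrong"}

def classify(tweet):
    """Label a tweet by comparing its counts of positive and negative words."""
    words = re.findall(r"[a-z']+", tweet.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # no sentiment words, or an even split

# Made-up example tweets.
for tweet in ["Great crowd tonight, love the support!",
              "Another bad plan, they lie and fail.",
              "Debate starts at nine."]:
    print(classify(tweet), "|", tweet)
```

A word-list classifier like this ignores negation and sarcasm, so its labels are best treated as a rough first pass.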