Twitter Mood Predicts the Stock Market
Authors: Johan Bollen, Huina Mao, Xiao-Jun Zeng
Presented By:
Krishna Aswani
Computing ID: ka5am
Is it possible to predict stock markets?
Early research: Stock market prices follow the Efficient
Market Hypothesis (they are driven by new information, i.e. news,
rather than by present and past prices) and random walk theory
Recent research: News may be unpredictable, but early
indicators can be extracted from online social media
(blogs, Twitter feeds, etc.) to predict changes in various
economic and commercial indicators
Method: Phase 1
[Flow diagram. Labels: Twitter Feed, Text Analysis, Mood Indicators (Daily); DJIA, Stock Markets (Daily); Normalization; Granger Causality (F-statistics, p-value); SOFNN (inputs t-1, t-2, t-3; predicted value at t=0; MAPE, Direction %).]
Phase 1: Creating sentiment time series
Step 1 – Collecting public tweets (February 28 to December 19th, 2008; 9,853,498 tweets posted by approximately 2.7M users), removing stopwords, normalizing them, etc.
Step 2 – Pass them through Opinion Finder and Google Profile of Mood States (GPOMS) to create time series.
Opinion Finder is a software package that classifies tweets into Positive and Negative. For each day, the ratio of the total number of positive tweets to the total number of negative tweets is calculated.
Google Profile of Mood States classifies tweets into 6 types: Calm, Alert, Sure, Vital, Kind & Happy.
Step 3 – To compare the time series from Opinion Finder and Google Profile of Mood States, a z-score is used to normalize each.
Step 4 – Cross-validating against large socio-cultural events.
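The daily ratio from Step 2 and the z-score normalization from Step 3 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. The function names, the (day, label) input format, the window half-width k, and the use of the population standard deviation over a local window of k days on each side are all assumptions made for the example.

```python
from collections import Counter
from statistics import mean, pstdev


def daily_sentiment_ratio(labeled_tweets):
    """Return {day: positive_count / negative_count}.

    labeled_tweets is an iterable of (day, label) pairs, where label is
    'positive' or 'negative' (hypothetical format). Days without any
    negative tweet are skipped to avoid dividing by zero.
    """
    pos, neg = Counter(), Counter()
    for day, label in labeled_tweets:
        if label == "positive":
            pos[day] += 1
        elif label == "negative":
            neg[day] += 1
    return {day: pos[day] / neg[day] for day in sorted(neg)}


def local_zscore(series, k):
    """Z-score each value against the mean and standard deviation of the
    window from t-k to t+k (truncated at the ends of the series)."""
    scores = []
    for t, value in enumerate(series):
        window = series[max(0, t - k): t + k + 1]
        sd = pstdev(window)
        # A flat window has no spread; report 0.0 rather than divide by zero.
        scores.append((value - mean(window)) / sd if sd > 0 else 0.0)
    return scores
```

Normalizing both series this way puts the Opinion Finder ratio and each GPOMS mood dimension on a common scale, so that they can be plotted and compared side by side.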
Method: Phase 2
[Same flow diagram as above.]
Phase 2 – Correlation between mood time series and DJIA
Step 1 – Collect DJIA data for the same time duration, normalize it and plot a time series.
Step 2 – Use Granger causality analysis on models 1 & 2.
Granger causality analysis rests on the assumption that if a variable X causes Y, then changes in X will systematically occur before changes in Y.
Correlation does not mean causation.
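The equations for models 1 and 2 did not survive in this transcript. A common formulation of the Granger test, assumed here, compares a restricted regression that predicts today's DJIA value from its own lagged values with an unrestricted regression that also includes lagged mood values, and computes an F statistic from the two residual sums of squares. The sketch below uses only the standard library. All names and the synthetic data are hypothetical, and converting the F statistic to a p-value would additionally need an F-distribution function, which this sketch omits.

```python
import random


def ols_rss(X, y):
    """Fit y = X b by least squares and return the residual sum of squares.

    Solves the normal equations with Gauss-Jordan elimination; assumes the
    columns of X are not collinear.
    """
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(X, y))


def granger_f(d, x, lags):
    """F statistic for the hypothesis that x Granger-causes d."""
    restricted, unrestricted, target = [], [], []
    for t in range(lags, len(d)):
        own = [d[t - i] for i in range(1, lags + 1)]
        other = [x[t - i] for i in range(1, lags + 1)]
        restricted.append([1.0] + own)            # lags of d only
        unrestricted.append([1.0] + own + other)  # lags of d and of x
        target.append(d[t])
    rss_r = ols_rss(restricted, target)
    rss_u = ols_rss(unrestricted, target)
    n, k_u = len(target), 2 * lags + 1
    return ((rss_r - rss_u) / lags) / (rss_u / (n - k_u))


# Synthetic demo: "djia" follows "mood" with a one-day delay plus noise.
random.seed(0)
mood = [random.gauss(0, 1) for _ in range(200)]
djia = [0.0] + [0.8 * mood[t - 1] + 0.1 * random.gauss(0, 1)
                for t in range(1, 200)]
f_mood_to_djia = granger_f(djia, mood, lags=2)
f_djia_to_mood = granger_f(mood, djia, lags=2)
```

On this synthetic data the F statistic for mood leading DJIA is far larger than the one for the reverse direction, which is the asymmetry the test looks for. As the slide notes, a large F statistic shows that one series helps predict the other, not that it causes it.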
Method: Phase 3
[Same flow diagram as above.]
Phase 3 – Non-linear models for accurate stock prediction
Because the relationship between the DJIA and the mood time series does not look linear, a Self-Organizing Fuzzy Neural Network (SOFNN) is used to predict with better accuracy.
Different combinations of input variables (mood time series) are tried.
Results:
Calm
Calm and Happy
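The method diagram names three lagged inputs (t-1, t-2, t-3) and two evaluation measures, MAPE and direction accuracy. A minimal sketch of how these could be computed is shown below. The function names are hypothetical, the SOFNN itself is not implemented, and the definition of direction accuracy (whether the predicted value lies on the same side of the previous day's actual value as the true value does) is an assumption.

```python
def lagged_inputs(series, lags=3):
    """Return (inputs, targets), where each input row holds the values at
    t-1, t-2, ..., t-lags and the target is the value at t."""
    inputs, targets = [], []
    for t in range(lags, len(series)):
        inputs.append([series[t - i] for i in range(1, lags + 1)])
        targets.append(series[t])
    return inputs, targets


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def direction_accuracy(actual, predicted):
    """Percentage of days on which the predicted up/down move, measured
    against the previous day's actual value, matches the actual move."""
    hits = 0
    for t in range(1, len(actual)):
        actual_up = actual[t] > actual[t - 1]
        predicted_up = predicted[t] > actual[t - 1]
        hits += actual_up == predicted_up
    return 100.0 * hits / (len(actual) - 1)
```

Any regression model can be plugged in between `lagged_inputs` and the two metrics, which makes it straightforward to compare different combinations of mood series on the same footing.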
Factors not considered
Geographic location of tweets: the approach worked because Twitter's
user base is predominantly located in the US.
These results are strongly indicative of a predictive correlation
between measurements of public mood states from Twitter feeds and
DJIA values, but offer no information on the causative mechanisms that
may connect online public mood states with DJIA values.
The approach is highly vulnerable to Twitter-bombing campaigns, which
can easily go viral.
Applications:
Companies like Tower Research Capital (computational
investment trading)
Dataminr (social analytics company)