A hybrid SOFM-SVR with a filter


A hybrid SOFM-SVR with a filter-based feature selection for stock market forecasting
Huang, C. L. & Tsai, C. Y.
Expert Systems with Applications 2008
Introduction
• Stock market price index prediction is regarded as a challenging task in finance.
• Support vector regression (SVR) has successfully solved prediction problems in many domains, including the stock market.
Introduction
• Filter-based feature selection to choose important input attributes
• SOFM algorithm to cluster the training samples
• SVR to predict the stock market price index
• Using a real futures dataset, Taiwan index futures (FITX), to predict the next day's price index
Introduction
• SOFM+SVR: improves the prediction accuracy of the traditional SVR method and reduces its long training time.
• SOFM+SVR+filter-based feature selection: achieves improvements in training time and prediction accuracy, and is able to select a better feature subset.
SVR
• Unlike pattern recognition problems, where the desired outputs are discrete values (e.g., Boolean), support vector regression (SVR) deals with real-valued functions.
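To make the real-valued setting concrete, here is a minimal sketch of the ε-insensitive loss that the slides name later as the SVR loss function. The function name and the index values are illustrative assumptions, not taken from the paper.

```python
def epsilon_insensitive_loss(actual, predicted, epsilon):
    """Return 0 inside the +/- epsilon tube, otherwise the excess absolute error."""
    return max(0.0, abs(actual - predicted) - epsilon)

# Hypothetical index values, for illustration only.
inside = epsilon_insensitive_loss(7000.0, 7004.0, epsilon=5.0)   # error within the tube
outside = epsilon_insensitive_loss(7000.0, 7012.0, epsilon=5.0)  # error beyond the tube
print(inside, outside)
```

Errors smaller than ε cost nothing, so the fitted function only has to stay within a tube around the real-valued targets.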
Self-organizing Feature Maps (SOFM)
SOFM
Training the SOFM-SVR model
1. Scaling the training set
2. Clustering the training dataset
3. Training the individual SVR models for each cluster
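The first two training steps can be sketched in plain Python as follows. This is a simplified illustration, not the authors' implementation: the min-max scaling to [0, 1], the one-dimensional map, the linear decay schedules, and all function names and data values are assumptions. Step 3 would then fit one SVR on the rows of each cluster.

```python
import math
import random

def fit_min_max(rows):
    """Record each attribute's (min, max) over the training rows."""
    return [(min(col), max(col)) for col in zip(*rows)]

def scale(row, ranges):
    """Map each attribute to [0, 1] using the training ranges (assumed scaling)."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, (lo, hi) in zip(row, ranges)]

def nearest_unit(weights, x):
    """Index of the map unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda j: math.dist(weights[j], x))

def train_sofm(data, n_units, epochs=50, lr0=0.5, seed=0):
    """Train a one-dimensional SOFM; returns one weight vector per unit."""
    rng = random.Random(seed)
    weights = [list(rng.choice(data)) for _ in range(n_units)]
    sigma0 = max(n_units / 2.0, 1.0)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.01   # neighbourhood radius shrinks
        for x in rng.sample(data, len(data)):
            bmu = nearest_unit(weights, x)     # best-matching unit
            for j, w in enumerate(weights):
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
    return weights

# Hypothetical training rows (two attributes each), for illustration only.
train_X = [[7000.0, 0.2], [7010.0, 0.3], [7020.0, 0.1],
           [8000.0, 0.9], [8010.0, 0.8], [8020.0, 0.7]]
ranges = fit_min_max(train_X)                        # step 1: scaling
scaled = [scale(row, ranges) for row in train_X]
units = train_sofm(scaled, n_units=2)                # step 2: clustering
clusters = {}
for i, x in enumerate(scaled):
    clusters.setdefault(nearest_unit(units, x), []).append(i)
print(clusters)  # row indices per cluster; step 3 fits one SVR per entry
```

Because each SVR only sees the rows of its own cluster, the individual training problems are smaller than a single SVR trained on the whole set.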
Training the SOFM-SVR model
Parameter Optimization
• Proper setting of the SVR parameters can improve the SVR prediction accuracy.
• Using the RBF kernel and the ε-insensitive loss function, three parameters, C, r, and ε, must be determined in the SVR model.
• The grid search approach is a common method to search for the C, r, and ε values.
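A grid search over the three parameters can be sketched as below. The candidate ranges are assumptions (exponentially spaced values are a common choice), and `toy_validation_error` is a hypothetical stand-in for the validation error of an SVR trained with the given C, r (the RBF kernel parameter, often written γ), and ε.

```python
import itertools

def grid_search(evaluate, c_values, r_values, eps_values):
    """Try every (C, r, epsilon) triple and return the one with the lowest error."""
    best_params, best_error = None, float("inf")
    for c, r, eps in itertools.product(c_values, r_values, eps_values):
        error = evaluate(c, r, eps)
        if error < best_error:
            best_params, best_error = (c, r, eps), error
    return best_params, best_error

# Assumed candidate grids; the slides do not state the ranges that were used.
c_grid = [2.0 ** k for k in range(-2, 9, 2)]
r_grid = [2.0 ** k for k in range(-8, 1, 2)]
eps_grid = [0.01, 0.05, 0.1]

def toy_validation_error(c, r, eps):
    """Stand-in objective; NOT a real SVR evaluation."""
    return abs(c - 16.0) + abs(r - 0.25) + abs(eps - 0.05)

best, err = grid_search(toy_validation_error, c_grid, r_grid, eps_grid)
print(best, err)
```

In practice `evaluate` would train an SVR on part of the training data and return its error on a held-out part, so the cost of the search grows with the product of the three grid sizes.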
Grid Search Approach
Evaluating the SOFM-SVR model with the test set
• Scale the test set based on the scaling equation, according to the attribute range of the training set.
• Find the cluster to which each test sample in the test set belongs.
• Calculate the predicted value for each sample in the test set.
• Calculate the prediction accuracy for the test set.
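The first three evaluation steps can be sketched as follows. The [0, 1] scaling, the nearest-center rule for cluster assignment, and every name and value here are assumptions for illustration; the `models` list holds constant placeholders where the trained per-cluster SVRs would go.

```python
import math

def scale_with_training_ranges(row, ranges):
    """Scale a test row with the (min, max) pairs learned from the training set."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, (lo, hi) in zip(row, ranges)]

def assign_cluster(x, centers):
    """Return the index of the cluster center nearest to x."""
    return min(range(len(centers)), key=lambda j: math.dist(centers[j], x))

def predict_routed(test_rows, ranges, centers, models):
    """Scale each test row, find its cluster, and apply that cluster's model."""
    predictions = []
    for row in test_rows:
        x = scale_with_training_ranges(row, ranges)
        predictions.append(models[assign_cluster(x, centers)](x))
    return predictions

# Hypothetical values, for illustration only.
ranges = [(7000.0, 8000.0), (0.0, 1.0)]          # attribute ranges from the training set
centers = [[0.1, 0.2], [0.9, 0.8]]               # SOFM unit weights after training
models = [lambda x: 7050.0, lambda x: 7950.0]    # placeholders for trained SVRs
preds = predict_routed([[7100.0, 0.3], [7900.0, 0.7]], ranges, centers, models)
print(preds)
```

Reusing the training ranges means a test attribute can fall slightly outside [0, 1]; that is expected, since the test set must not influence the scaling.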
SOFM-SVR model
SOFM-SVR combined with filter-based feature selection
• X is a certain input variable (i.e., feature)
• Y is the response variable (i.e., label)
• n is the number of training samples
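The scoring formula itself is not preserved in this transcript. As a hedged sketch of how a filter of this kind can work, the code below assumes each feature X is scored by its squared Pearson correlation with the response Y over the n training samples; this is an assumption, not necessarily the paper's exact criterion, and the feature names and data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between one feature column x and the response y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)

def rank_features(columns, y):
    """Sort feature names by squared correlation with y, strongest first."""
    scores = {name: pearson_r(col, y) ** 2 for name, col in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy data, for illustration only.
columns = {
    "feat_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "feat_b": [2.0, 1.0, 4.0, 3.0, 5.0],
    "feat_c": [1.0, -1.0, 0.0, -1.0, 1.0],
}
y = [10.0, 20.0, 30.0, 40.0, 50.0]
ranking = rank_features(columns, y)
for name, score in ranking:
    print(name, round(score, 3))
```

A filter scores each feature independently of the predictor, so it is cheap to run before clustering and SVR training; the top-ranked features are then kept as model inputs.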
SOFM-SVR filter-based feature selection
Performance measures
• Ai is the actual value of sample i
• Fi is the predicted value of sample i
• n is the number of samples
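The formulas for the measures are not preserved in this transcript. Since MAPE is named in the later comparison slides, a minimal sketch of that measure in the notation above (Ai, Fi, n) is given here, using the standard definition; the sample values are hypothetical.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent: (100 / n) * sum(|(Ai - Fi) / Ai|)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    return 100.0 / n * sum(abs((a - f) / a) for a, f in zip(actual, forecast))

# Hypothetical values, for illustration only.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
print(mape(actual, forecast))
```

MAPE is undefined when any actual value is zero, which is not a concern for a price index but matters if the measure is applied to returns.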

Experimental data set
SOFM-SVR with various numbers of clusters in dataset #1
Accuracy measures with various numbers of clusters
Wilcoxon signed-rank test
Wilcoxon signed-rank test on the prediction errors for the SOFM-SVR with various numbers of clusters
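For readers unfamiliar with the test, the sketch below computes the Wilcoxon signed-rank statistic for two paired lists of prediction errors. It is a simplified version under stated assumptions: zero differences are dropped, tied absolute differences share an average rank, and the p-value uses a plain normal approximation with no tie or continuity correction, which is rough for small samples. The error values are hypothetical.

```python
import math

def wilcoxon_signed_rank(errors_a, errors_b):
    """Return (W, z, two-sided p) for paired samples, via the normal approximation."""
    diffs = [a - b for a, b in zip(errors_a, errors_b) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0   # tied values share the average rank
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided tail probability
    return w, z, p

# Hypothetical absolute errors of two model configurations on the same samples.
errors_a = [1.0, 2.0, 3.0, 4.0, 5.0]
errors_b = [1.5, 1.0, 4.5, 2.0, 2.5]
w, z, p = wilcoxon_signed_rank(errors_a, errors_b)
print(w, z, p)
```

The test is paired and non-parametric, so it compares two models on the same test samples without assuming normally distributed errors.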
Results of SOFM-SVR using three clusters
Results of SOFM-SVR with selected features
Original Feature VS. Original Feature
• Original Feature
• Original Feature
Wilcoxon signed-rank test
Important Features
• MA10: 10-day moving average.
• MACD9: 9-day moving average convergence/divergence.
• +DI10: directional indicator up.
• -DI10: directional indicator down.
• K10: 10-day stochastic index K.
• PSY10: 10-day psychological line.
• D9: 9-day stochastic index D.
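Two of the listed indicators can be sketched as below, using common textbook definitions: a simple moving average of closing prices, and a psychological line defined as the percentage of rising days in the window. The paper's exact formulas are not in this transcript, so these definitions, the function names, and the prices are assumptions.

```python
def moving_average(closes, window=10):
    """Simple moving average of the last `window` closes, one value per eligible day."""
    return [sum(closes[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(closes))]

def psychological_line(closes, window=10):
    """Percentage of the last `window` days on which the close rose."""
    ups = [1 if closes[i] > closes[i - 1] else 0 for i in range(1, len(closes))]
    return [100.0 * sum(ups[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(ups))]

# Hypothetical closing prices; a short window keeps the example readable.
closes = [1.0, 2.0, 3.0, 2.0, 3.0, 4.0]
ma3 = moving_average(closes, window=3)
psy3 = psychological_line(closes, window=3)
print(ma3)
print(psy3)
```

With the default `window=10` these correspond to MA10 and PSY10; the first value of each series is only available once a full window of history exists.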
Relative importance of the selected features
Wilcoxon signed-rank test: SOFM-SVR vs. single SVR
MAPE comparison: SOFM-SVR vs. single SVRs
Training time comparisons: SOFM-SVR vs. single SVRs
Conclusion
• The hybrid SOFM-SVR with filter-based feature selection improves the prediction accuracy and reduces the training time for daily financial stock index prediction.
• Further research directions include using optimization algorithms (e.g., genetic algorithms) to optimize the SVR parameters, and performing feature selection using a wrapper-based approach that combines SVR with other optimization tools.

Thank You