A hybrid SOFM-SVR with a filter-based feature selection for stock market forecasting
Huang, C. L. & Tsai, C. Y.
Expert Systems with Applications 2008
Introduction
Stock market price index prediction is regarded as a challenging task in finance.
Support vector regression (SVR) has successfully solved prediction problems in many domains, including the stock market.
Introduction
Filter-based feature selection to choose important input attributes
SOFM algorithm to cluster the training samples
SVR to predict the stock market price index
A real futures dataset, Taiwan index futures (FITX), is used to predict the next day’s price index
Introduction
SOFM+SVR: to improve the prediction accuracy of the traditional SVR method and to reduce its long training time
SOFM+SVR+filter-based feature selection: improvements in training time, prediction accuracy, and the ability to select a better feature subset are achieved
SVR
Unlike pattern recognition problems, where the desired outputs are discrete values (e.g., Boolean), support vector regression (SVR) deals with real-valued functions.
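For reference, the two ingredients the slides name for the SVR model, the ε-insensitive loss and the RBF kernel, are commonly written as below. This is the standard textbook form, not a formula taken from the slides; the symbol γ for the kernel parameter is an assumption about notation.

```latex
% epsilon-insensitive loss: errors smaller than epsilon are ignored
L_{\varepsilon}\bigl(y, f(x)\bigr) = \max\bigl(0,\; |y - f(x)| - \varepsilon\bigr)

% RBF kernel with width parameter gamma
K(x_i, x_j) = \exp\bigl(-\gamma \,\lVert x_i - x_j \rVert^{2}\bigr)
```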
Self-organizing feature maps (SOFM)
[Figure: SOFM, parts numbered 1 to 4]
Training the SOFM-SVR model
1. Scaling the training set
2. Clustering the training dataset
3. Training the individual SVR models for each cluster
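The three training steps can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes min-max scaling to [0, 1], a small one-dimensional SOFM written from scratch, and made-up toy data. All function names are hypothetical. Step 3 is only prepared here (samples are grouped by cluster); the per-cluster SVR fit itself would come from an SVR library.

```python
import math
import random

def fit_min_max(rows):
    """Per-attribute (min, max), computed on the training rows only."""
    return [(min(col), max(col)) for col in zip(*rows)]

def scale(row, ranges):
    """Min-max scale one row with the stored training ranges (assumed scaling)."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, (lo, hi) in zip(row, ranges)]

def nearest_node(weights, x):
    """Index of the SOFM node whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda k: math.dist(weights[k], x))

def train_sofm(data, n_nodes, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D SOFM; learning rate and neighbourhood shrink over time."""
    rng = random.Random(seed)
    weights = [list(rng.choice(data)) for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        radius = max(n_nodes / 2.0 * frac, 0.5)
        for x in rng.sample(data, len(data)):
            win = nearest_node(weights, x)
            for k in range(n_nodes):
                # Gaussian neighbourhood around the winning node
                h = math.exp(-((k - win) ** 2) / (2 * radius ** 2))
                weights[k] = [w + lr * h * (xi - w)
                              for w, xi in zip(weights[k], x)]
    return weights

def split_by_cluster(weights, X, y):
    """Group (x, y) pairs by winning node; one SVR would be fit per group."""
    clusters = {}
    for xi, yi in zip(X, y):
        clusters.setdefault(nearest_node(weights, xi), []).append((xi, yi))
    return clusters

# Toy data, made up for illustration: 6 samples, 2 attributes.
X_raw = [[1.0, 10.0], [1.2, 11.0], [0.9, 9.5],
         [5.0, 50.0], [5.2, 52.0], [4.8, 49.0]]
y = [1.1, 1.3, 1.0, 5.1, 5.4, 4.9]

ranges = fit_min_max(X_raw)                  # step 1: scaling
X = [scale(r, ranges) for r in X_raw]
weights = train_sofm(X, n_nodes=3)           # step 2: clustering
clusters = split_by_cluster(weights, X, y)   # input to step 3
```

Keeping `ranges` and `weights` around matters, because the test set has to be scaled and assigned to clusters with exactly the same quantities.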
Training the SOFM-SVR model
Parameters optimization
Appropriate setting of the SVR parameters can improve the SVR prediction accuracy
Using the RBF kernel and the ε-insensitive loss function, three parameters, C, γ, and ε, must be determined in the SVR model
The grid search approach is a common method to search for the C, γ, and ε values.
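A grid search over the three parameters might look like the sketch below. The exponentially spaced grids and the `evaluate` callback are assumptions for illustration; in practice `evaluate` would train an SVR with the given C, RBF kernel parameter (commonly written γ), and ε, and return a validation error. Here a synthetic error surface stands in for that step so the block runs on its own.

```python
import itertools
import math

def exp_grid(base, start, stop, step):
    """Values base**start, base**(start+step), ..., up to base**stop."""
    return [base ** e for e in range(start, stop + 1, step)]

def grid_search(evaluate, c_values, gamma_values, eps_values):
    """Try every (C, gamma, epsilon) triple; keep the one with lowest error."""
    best = None
    for c, g, e in itertools.product(c_values, gamma_values, eps_values):
        err = evaluate(c, g, e)
        if best is None or err < best[0]:
            best = (err, {"C": c, "gamma": g, "epsilon": e})
    return best

def toy_error(c, gamma, eps):
    """Synthetic stand-in for a validation error (NOT a real SVR fit)."""
    return ((math.log2(c) - 3) ** 2
            + (math.log2(gamma) + 5) ** 2
            + (eps - 0.1) ** 2)

c_grid = exp_grid(2.0, -1, 7, 2)        # 2^-1, 2^1, ..., 2^7
gamma_grid = exp_grid(2.0, -9, -1, 2)   # 2^-9, 2^-7, ..., 2^-1
eps_grid = [0.01, 0.1, 0.5]

best_err, best_params = grid_search(toy_error, c_grid, gamma_grid, eps_grid)
```

The cost grows with the product of the three grid sizes, which is one reason shorter per-cluster training sets help.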
Grid Search Approach
Evaluating the SOFM-SVR model with the test set
Scale the test set based on the scaling equation according to the attribute range of the training set
Find the cluster to which each sample in the test set belongs
Calculate the predicted value for each sample in the test set
Calculate the prediction accuracy for the test set
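The evaluation steps above can be sketched as follows. The sketch assumes min-max scaling and nearest-center cluster assignment; the cluster centers, attribute ranges, and per-cluster "models" are hypothetical placeholders (plain functions standing in for trained SVRs), and the numbers are made up.

```python
import math

def scale_with_training_range(row, ranges):
    """Scale a test row with the (min, max) ranges stored from training.
    Values outside the training range will fall outside [0, 1]."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, (lo, hi) in zip(row, ranges)]

def assign_cluster(centers, x):
    """Index of the cluster center (SOFM weight vector) closest to x."""
    return min(range(len(centers)), key=lambda k: math.dist(centers[k], x))

def predict_test_set(test_rows, ranges, centers, models):
    """Scale each test sample, route it to its cluster, apply that model."""
    predictions = []
    for row in test_rows:
        x = scale_with_training_range(row, ranges)
        k = assign_cluster(centers, x)
        predictions.append((k, models[k](x)))
    return predictions

# Hypothetical quantities that training would have produced.
ranges = [(0.0, 10.0), (100.0, 200.0)]
centers = [[0.2, 0.2], [0.8, 0.8]]
models = {0: lambda x: x[0] + x[1],        # stand-in for the SVR of cluster 0
          1: lambda x: 2.0 * (x[0] + x[1])}  # stand-in for the SVR of cluster 1

test_rows = [[1.0, 120.0], [9.0, 190.0]]
results = predict_test_set(test_rows, ranges, centers, models)
```

The predicted values would then be compared with the actual values using the performance measures to obtain the test-set accuracy.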
SOFM-SVR model
SOFM-SVR combined with filter-based feature selection
X is a certain input variable (i.e., feature)
Y is the response variable (i.e., label)
n is the number of training samples
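The scoring formula itself did not survive in this transcript. As an illustration only, the sketch below assumes the filter scores each feature X by its squared Pearson correlation with the response Y over the n training samples, then ranks features by that score. The choice of score and all names are assumptions, not the authors' confirmed method.

```python
def r_squared(x, y):
    """Squared Pearson correlation between feature x and response y
    (assumed filter score; 0.0 if either variable is constant)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    num = (n * sxy - sx * sy) ** 2
    den = (n * sxx - sx * sx) * (n * syy - sy * sy)
    return num / den if den else 0.0

def rank_features(columns, y):
    """Sort feature names by score, highest first."""
    scores = {name: r_squared(col, y) for name, col in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy data, made up for illustration.
y = [1, 2, 3, 4, 5]
columns = {
    "linear": [2, 4, 6, 8, 10],     # perfectly correlated with y
    "zigzag": [1, -1, 1, -1, 1],    # uncorrelated with y
}
ranking = rank_features(columns, y)
```

A filter of this kind scores each feature independently of the SVR, which is what makes it cheap compared with a wrapper-based approach.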
SOFM-SVR filter-based feature selection
Performance measures
A_i is the actual value of sample i
F_i is the predicted value of sample i
n is the number of samples.
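The measure formulas are missing from the transcript. MAPE is named later in the slides, so the sketch below shows it in its usual form, (100/n) Σ |A_i − F_i| / |A_i|, with A the actual values and F the predicted values. The exact form used by the authors is an assumption, and the sample numbers are made up.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.
    Assumes no actual value is zero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, predicted))
    return 100.0 * total / len(actual)

# Toy values, made up for illustration.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
error_pct = mape(actual, predicted)
```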
Experimental data set
SOFM-SVR with various numbers of clusters in dataset #1
Accuracy measures with various numbers of clusters
Wilcoxon signed-rank test
Wilcoxon signed-rank test on the prediction errors for the SOFM-SVR with various numbers of clusters
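To make the test concrete, here is a minimal sketch of the rank-sum computation behind it for two paired lists of prediction errors. It uses the standard procedure (drop zero differences, rank absolute differences with average ranks for ties) and stops at the two rank sums; turning them into a p-value is left to a statistics library such as SciPy's `scipy.stats.wilcoxon`. The error values are made up.

```python
def signed_rank_sums(errors_a, errors_b):
    """Return (W_plus, W_minus): rank sums of positive and negative
    paired differences, with zero differences discarded."""
    diffs = [a - b for a, b in zip(errors_a, errors_b) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Extend j over a run of tied absolute differences
        j = i
        while (j + 1 < len(order)
               and abs(diffs[order[j + 1]]) == abs(diffs[order[i]])):
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0   # average of 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus

# Toy paired errors for two model settings, made up for illustration.
errors_three_clusters = [5, 3, 8, 6, 7]
errors_five_clusters = [4, 5, 5, 6, 3]
w_plus, w_minus = signed_rank_sums(errors_three_clusters, errors_five_clusters)
```

A large imbalance between the two rank sums suggests that one setting produces systematically smaller errors than the other.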
Results of SOFM-SVR using three clusters
Results of SOFM-SVR with selected features
Original Feature VS. Original Feature
Wilcoxon signed-rank test
Important features
MA10: 10-day moving average.
MACD9: 9-day moving average convergence/divergence.
+DI10: directional indicator up.
-DI10: directional indicator down.
K10: 10-day stochastic index K.
PSY10: 10-day psychological line.
D9: 9-day stochastic index D.
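As an illustration of how two of these indicators are typically computed from daily closing prices, the sketch below uses common textbook definitions: MA10 as the mean of the last 10 closes, and PSY10 as the percentage of the last 10 days on which the close rose. The exact formulas used in the study are not in the transcript, so both definitions are assumptions, and the price series is made up.

```python
def moving_average(closes, n=10):
    """n-day simple moving average; one value per day from day n onward."""
    return [sum(closes[i - n + 1:i + 1]) / n
            for i in range(n - 1, len(closes))]

def psychological_line(closes, n=10):
    """Percentage of the last n day-to-day changes that were rises
    (assumed definition of the psychological line)."""
    ups = [1 if closes[i] > closes[i - 1] else 0
           for i in range(1, len(closes))]
    return [100.0 * sum(ups[i - n + 1:i + 1]) / n
            for i in range(n - 1, len(ups))]

# Toy closing prices, made up for illustration: 12 steadily rising days.
closes = [float(p) for p in range(1, 13)]
ma10 = moving_average(closes)
psy10 = psychological_line(closes)
```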
Relative importance of the selected features
Wilcoxon signed-rank test: SOFM-SVR vs. single SVR
MAPE comparison: SOFM-SVR vs. single SVRs
Training time comparisons: SOFM-SVR vs. single SVRs
Conclusion
A hybrid SOFM-SVR with filter-based feature selection improves the prediction accuracy and reduces the training time for daily financial stock index prediction.
Further research directions include using optimization algorithms (e.g., genetic algorithms) to optimize the SVR parameters, and performing feature selection using a wrapper-based approach that combines SVR with other optimization tools.
Thank You