Transcript: Forecasting the BET-C Stock Index with Artificial Neural Networks
DOCTORAL SCHOOL OF FINANCE AND BANKING (DOFIN), ACADEMY OF ECONOMIC STUDIES
July 2006
MSc Student: Stoica Ioan-Andrei
Supervisor: Professor Moisa Altar
Stock Markets and Prediction
- Predicting stock prices: the goal of every investor trying to make a profit on the stock market
- Predictability of the market: an issue discussed by many researchers and academics
- Efficient Market Hypothesis (Eugene Fama), three forms:
  - Weak: future stock prices cannot be predicted from past stock prices
  - Semi-strong: even published information cannot be used to predict future prices
  - Strong: the market cannot be predicted, no matter what information is available
Stock Markets and Prediction
- Technical Analysis ('castles in the air'): anticipating investors' behavior and reacting according to these anticipations
- Fundamental Analysis ('firm foundations'): stocks have an intrinsic value determined by the present condition and future prospects of the company
- Traditional Time Series Analysis: uses historic data, attempting to approximate future values of a time series as a linear combination of past values
- Machine Learning: Artificial Neural Networks
The Artificial Neural Network
- A computational technique that borrows from mechanisms similar to those employed in the human brain
- 1943: W.S. McCulloch and W. Pitts attempted to mimic the ability of the human brain to process data and information and to comprehend patterns and dependencies
- The human brain: a complex, nonlinear and parallel computer
- Neurons: elementary information-processing units, the building blocks of a neural network
The Artificial Neural Network
A semi-parametric approximation method.
Advantages:
- ability to detect nonlinear dependencies
- parsimonious compared to polynomial expansions
- generalization ability and robustness
- no assumptions about the model have to be made
- flexibility
Disadvantages:
- the 'black box' property
- training requires an experienced user
- training takes a lot of time; a fast computer is needed
- overtraining / undertraining
- overfitting / underfitting
The Artificial Neural Network
[Figure: examples of target functions y = f(x), such as y = sin x and y = ln x]
The Artificial Neural Network
[Figure: y = sin x]
The Artificial Neural Network
Overtraining/Overfitting
The Artificial Neural Network
Undertraining/Underfitting
Architecture of the Neural Network
Types of layers:
- input layer: number of neurons = number of inputs
- output layer: number of neurons = number of outputs
- hidden layer(s): number of neurons chosen by trial and error
Connections between neurons:
- fully connected
- partially connected
The activation function:
- threshold function
- piecewise linear function
- sigmoid functions
The Feed-Forward Network

n_{k,t} = \omega_{k,0} + \sum_{i=1}^{n} \omega_{k,i} x_{i,t}

N_{k,t} = L(n_{k,t}) = \frac{1}{1 + e^{-n_{k,t}}}

\hat{y}_t = \gamma_0 + \sum_{k=1}^{m} \gamma_k N_{k,t}

m = number of hidden-layer neurons, n = number of inputs
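A minimal NumPy sketch of this forward pass (the variable names, sizes and random weights below are illustrative assumptions, not the author's code):

```python
import numpy as np

def logistic(n):
    """Logistic activation L(n) = 1 / (1 + e^(-n))."""
    return 1.0 / (1.0 + np.exp(-n))

def feedforward(x, w0, W, g0, g):
    """Forward pass: x has length n; W is (m, n); g has length m."""
    n_kt = w0 + W @ x        # pre-activations n_{k,t}
    N_kt = logistic(n_kt)    # hidden-layer outputs N_{k,t}
    return g0 + g @ N_kt     # network output y_hat_t

# Hypothetical example: 5 inputs, 3 hidden neurons
rng = np.random.default_rng(0)
x = rng.normal(size=5)                       # e.g. five lagged returns
y_hat = feedforward(x, rng.normal(size=3),
                    rng.normal(size=(3, 5)), 0.0, rng.normal(size=3))
```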
The Feed-Forward Network with Jump Connections

n_{k,t} = \omega_{k,0} + \sum_{i=1}^{n} \omega_{k,i} x_{i,t}

N_{k,t} = L(n_{k,t}) = \frac{1}{1 + e^{-n_{k,t}}}

\hat{y}_t = \gamma_0 + \sum_{k=1}^{m} \gamma_k N_{k,t} + \sum_{i=1}^{n} \beta_i x_{i,t}
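The jump connections add a direct linear link from the inputs to the output. A sketch under the same illustrative assumptions as before:

```python
import numpy as np

def logistic(n):
    return 1.0 / (1.0 + np.exp(-n))

def feedforward_jump(x, w0, W, g0, g, b):
    """Feed-forward net where the inputs also reach the output
    directly through weights b (minimal sketch, assumed names)."""
    N_kt = logistic(w0 + W @ x)
    return g0 + g @ N_kt + b @ x   # extra jump term: sum_i b_i x_{i,t}

rng = np.random.default_rng(1)
x = rng.normal(size=5)
out = feedforward_jump(x, rng.normal(size=3), rng.normal(size=(3, 5)),
                       0.0, rng.normal(size=3), rng.normal(size=5))
```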
The Recurrent Neural Network - Elman

n_{k,t} = \omega_{k,0} + \sum_{i=1}^{n} \omega_{k,i} x_{i,t} + \sum_{j=1}^{m} \varphi_j n_{j,t-1}

N_{k,t} = L(n_{k,t}) = \frac{1}{1 + e^{-n_{k,t}}}

\hat{y}_t = \gamma_0 + \sum_{k=1}^{m} \gamma_k N_{k,t}

The lagged term allows the neurons to depend on their own past values, building 'memory' into their evolution.
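One Elman step can be sketched as follows; the lagged pre-activations are carried between calls as the network's memory (names and parameter values are illustrative assumptions):

```python
import numpy as np

def logistic(n):
    return 1.0 / (1.0 + np.exp(-n))

def elman_step(x, n_prev, w0, W, phi, g0, g):
    """One time step: the hidden pre-activations also receive
    their own lagged values n_prev (minimal sketch)."""
    n_t = w0 + W @ x + phi @ n_prev   # recurrent term sum_j phi_j n_{j,t-1}
    N_t = logistic(n_t)
    y_hat = g0 + g @ N_t
    return y_hat, n_t                 # n_t is carried forward as 'memory'

rng = np.random.default_rng(2)
w0, g = rng.normal(size=3), rng.normal(size=3)
W, phi = rng.normal(size=(3, 5)), rng.normal(size=3) * 0.1
n_prev = np.zeros(3)
for x in rng.normal(size=(4, 5)):     # four consecutive time steps
    y_hat, n_prev = elman_step(x, n_prev, w0, W, phi, 0.0, g)
```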
Training the Neural Network
Objective: minimize the discrepancy between the real data and the output of the network:

\min_{\Omega} \Psi(\Omega) = \min_{\Omega} \sum_{t=1}^{T} (y_t - \hat{y}_t)^2, where \hat{y}_t = f(x_t; \Omega)

Ω = the set of parameters, Ψ = the loss function. Since Ψ is nonlinear, this is a nonlinear optimization problem, solved by:
- backpropagation
- a genetic algorithm
The Backpropagation Algorithm
An alternative to quasi-Newton methods: gradient descent.
- Ω_0 is randomly generated
- \Omega_1 = \Omega_0 - \rho \nabla\Psi(\Omega_0), with ρ the learning parameter, chosen in [0.05, 0.5]
- after n iterations, a momentum term with μ = 0.9 is added:

\Omega_n = \Omega_{n-1} - \rho \nabla\Psi(\Omega_{n-1}) + \mu (\Omega_{n-1} - \Omega_{n-2})

Problem: the algorithm can get stuck in local minimum points.
The Genetic Algorithm
Based on Darwinian laws:
- Population Creation: N random vectors of weights
- Selection: (Ω_i, Ω_j) parent vectors
- Crossover & Mutation: C_1, C_2 children vectors
- Election Tournament: the fittest 2 vectors are passed to the next generation
- Convergence: after G* generations, with G* large enough that there are no significant changes in the fitness of the best individual for several generations
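The steps above can be sketched as a minimal genetic algorithm; the toy fitness function (the real one would be the negative SSE of the network on the training set), population size, and mutation scale are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

def fitness(w):
    """Toy fitness: negative sum of squares, maximized at w = 0."""
    return -np.sum(w ** 2)

def genetic_algorithm(dim=4, pop_size=30, generations=100, mut_scale=0.1):
    """Minimal GA sketch: selection, crossover, mutation, tournament."""
    pop = [rng.normal(size=dim) for _ in range(pop_size)]
    for _ in range(generations):
        new_pop = []
        order = rng.permutation(pop_size)      # Selection: random pairing
        for a in range(0, pop_size, 2):
            p1, p2 = pop[order[a]], pop[order[a + 1]]
            cut = rng.integers(1, dim)         # Crossover at a random point
            c1 = np.concatenate([p1[:cut], p2[cut:]])
            c2 = np.concatenate([p2[:cut], p1[cut:]])
            c1 = c1 + rng.normal(scale=mut_scale, size=dim)   # Mutation
            c2 = c2 + rng.normal(scale=mut_scale, size=dim)
            # Election tournament: the fittest 2 of the family survive
            family = sorted([p1, p2, c1, c2], key=fitness, reverse=True)
            new_pop.extend(family[:2])
        pop = new_pop
    return max(pop, key=fitness)

best = genetic_algorithm()
```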
Experiments and Results
Data
BET-C stock index: daily closing prices, 16 April 1998 until 18 May 2006.

Daily returns:

R_t = \ln P_t - \ln P_{t-1} = \ln \frac{P_t}{P_{t-1}}

Conditional volatility, a rolling 20-day standard deviation:

V_t = \sqrt{\frac{\sum_{i=1}^{20} (R_{t-i} - \bar{R}_t)^2}{19}}

BDS test for nonlinear dependencies (H_0: i.i.d. data; BDS_{m,ε} ~ N(0,1)):

Series   m=2, ε=1   m=2, ε=1.5   m=3, ε=1   m=3, ε=1.5   m=4, ε=1   m=4, ε=1.5
OD       16.6526    17.6970      18.5436    18.7202      19.7849    19.0588
ARF      16.2626    17.2148      18.3803    18.4839      19.7618    18.9595
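The return and rolling-volatility definitions can be sketched directly in NumPy (a minimal illustration with hypothetical prices; `ddof=1` reproduces the 1/19 divisor for a 20-day window):

```python
import numpy as np

def log_returns(prices):
    """R_t = ln P_t - ln P_{t-1}."""
    p = np.asarray(prices, dtype=float)
    return np.diff(np.log(p))

def rolling_volatility(returns, window=20):
    """Rolling standard deviation over the previous `window` returns;
    ddof=1 gives the 1/(window - 1) divisor from the slide."""
    r = np.asarray(returns, dtype=float)
    return np.array([r[t - window:t].std(ddof=1)
                     for t in range(window, len(r) + 1)])

prices = [100.0, 101.0, 99.5, 100.2, 101.1]  # hypothetical closing prices
r = log_returns(prices)                       # four daily returns
v = rolling_volatility(r, window=2)           # short window for the example
```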
Experiments and Results
Three types of ANNs:
- feed-forward network
- feed-forward network with jump connections
- recurrent network
Inputs: [R_{t-1}, R_{t-2}, R_{t-3}, R_{t-4}, R_{t-5}], plus V_t
Output: the next-day return R_t
Training: genetic algorithm & backpropagation
Data divided into: training set (90%), test set (10%)
One-day-ahead forecasts (static forecasting)
Each network is trained 100 times; the best 10 are kept by SSE, and the best 1 is selected by RMSE.
Experiments and Results
Evaluation Criteria

In-sample criterion:

R^2 = 1 - \frac{\sum_{t=1}^{T} (y_t - \hat{y}_t)^2}{\sum_{t=1}^{T} (y_t - \bar{y})^2}

Out-of-sample criteria:

RMSE = \sqrt{\frac{1}{T} \sum_{t=1}^{T} (y_t - \hat{y}_t)^2}

MAE = \frac{1}{T} \sum_{t=1}^{T} |y_t - \hat{y}_t|

SSE = \sum_{t=1}^{T} (y_t - \hat{y}_t)^2

HR = \frac{1}{T} \sum_{t=1}^{T} I_t, where I_t = 1 if y_t \hat{y}_t > 0 and I_t = 0 otherwise

ROI = \sum_{t=1}^{T} y_t \, \mathrm{sign}(\hat{y}_t)

RP = \frac{\sum_{t=1}^{T} y_t \, \mathrm{sign}(\hat{y}_t)}{\sum_{t=1}^{T} |y_t|}

Pesaran-Timmermann test for directional accuracy (H_0: the signs of the forecast and those of the real data are independent; DA ~ N(0,1)).
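The out-of-sample criteria above can be sketched in a few lines (the sample arrays are hypothetical; ROI and RP assume a unit long/short position taken from the sign of the forecast):

```python
import numpy as np

def out_of_sample_criteria(y, y_hat):
    """RMSE, MAE, hit rate, ROI and realized potential (RP)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    rmse = np.sqrt(np.mean((y - y_hat) ** 2))
    mae = np.mean(np.abs(y - y_hat))
    hr = np.mean(y * y_hat > 0)          # fraction of correctly signed days
    roi = np.sum(y * np.sign(y_hat))     # return of the sign-based strategy
    rp = roi / np.sum(np.abs(y))         # share of the maximum attainable return
    return rmse, mae, hr, roi, rp

y     = np.array([0.01, -0.02, 0.005, -0.01])   # hypothetical real returns
y_hat = np.array([0.02, -0.01, -0.004, -0.02])  # hypothetical forecasts
rmse, mae, hr, roi, rp = out_of_sample_criteria(y, y_hat)
```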
Experiments and Results
ROI: trading strategy based on the sign forecasts (+ sign = buy, - sign = sell).

Finite differences (sensitivity of the output with respect to each input):

\frac{\partial y}{\partial x_i} \approx \frac{f(x_1, \ldots, x_i + h_i, \ldots, x_n) - f(x_1, \ldots, x_i, \ldots, x_n)}{h_i}, with h_i = 10^{-6}

Benchmarks:
- naïve model: R_{t+1} = R_t
- buy-and-hold strategy
- AR(1) model (least squares), compared on RMSE and MAE to check for overfitting
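The forward finite-difference sensitivity can be sketched generically; the quadratic test function below is a stand-in for the trained network, purely for illustration:

```python
import numpy as np

def sensitivity(f, x, i, h=1e-6):
    """Forward finite difference dy/dx_i ~ (f(x + h*e_i) - f(x)) / h,
    with h = 10^-6 as on the slide."""
    x = np.asarray(x, dtype=float)
    x_h = x.copy()
    x_h[i] += h                      # perturb only the i-th input
    return (f(x_h) - f(x)) / h

# Hypothetical network output: f(x) = sum of squared inputs
f = lambda x: np.sum(x ** 2)
d0 = sensitivity(f, np.array([1.0, 2.0]), 0)   # analytic value: 2*x_0 = 2
```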
Experiments and Results

Out-of-sample results (columns reported on the slide: R^2, SSE, RMSE, MAE, HR, ROI, RP, PT-test, B&H; the buy-and-hold return is 0.2753):

Model          RMSE       MAE        HR
AR(1)          0.015100   0.011948   55.77% (111)
FFN (no vol)   0.011344   0.008932   56.78% (113)
FFN            0.011325   0.008929   57.79% (115)
FFN-jump       0.011304   0.008873   59.79% (119)
Naïve          0.011332   0.008867   59.79% (119)
RN             0.011319   0.008892   59.79% (119)

Volatility sensitivity (finite differences): FFN -0.1123, FFN-jump -0.1358, RN -0.1841
ROI / RP / PT-test entries (layout only partially recoverable): 0.265271 (15.02%), 0.255605 (14.47%), 0.318374 (18.02%, PT 14.79), 0.351890, 0.331464 (18.77%, PT 15.01), 0.412183 (23.34%, PT 14.49)
Experiments and Results
[Figure: actual vs. fitted values (training sample)]
Experiments and Results
[Figure: actual vs. fitted values (test sample)]
Conclusions
- RMSE and MAE lower than for the AR(1) model; no signs of overfitting
- R^2 < 0.1: forecasting the magnitude of returns is a failure
- Sign forecasting: ~60% success
- Volatility improves the sign forecast; its finite differences show a negative correlation, so it is perceived as a measure of risk
- Trading strategy: outperforms the naïve model and buy-and-hold
- The quality of the sign forecast is confirmed by the Pesaran-Timmermann test
Further development
- Volatility: other estimates
- A neural classifier specialized in sign forecasting
- Using data from outside the Bucharest Stock Exchange: T-Bond yields, exchange rates, indexes from foreign capital markets