Forecasting Crude Oil Price (Revisited)

Imad Haidar* and Rodney C. Wolff
* PhD student
The University of Queensland, Brisbane, QLD
4072, Australia
E-mail: [email protected],
[email protected]
This paper attempts to answer the following questions:
• What type of dynamics governs crude oil prices and returns?
– Specifically, we investigate whether there are any non-linear deterministic dynamics (chaos) that could be misspecified as a random walk.
• From a statistical point of view, have the dynamics of crude oil returns changed significantly during the past twenty years?
• Do we have strong empirical evidence that crude oil spot returns are predictable in the short term?
• Can we forecast the direction of crude oil returns multiple steps ahead?
Data
• Daily crude oil spot prices and returns for West Texas Intermediate (WTI).
• Official closing prices run from 2 January 1986 to 2 March 2010 (6194 daily observations).
• The data were retrieved on 11 March 2010 from the Energy Information Administration (EIA).
Diagnostic Tests
• Autocorrelation (ACF) is much more evident in the squared log-returns, especially for return II;
• The ACF was significantly above the upper confidence limit. This could be evidence of heteroskedasticity.
• The Augmented Dickey-Fuller and Phillips–Perron tests for crude oil price and return at the 1% significance level give the following results:
– Crude oil price for the whole series, from January 1986 to March 2010, is integrated of the first order, or I(1).
– Crude oil price for the first subsection, from January 1986 to January 1998, is I(0).
– Crude oil price for the second subsection, from the end of January 1998 to March 2010, is I(1).
– The returns for all subsections are I(0).
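As a rough illustration of the ACF diagnostic above, the sketch below computes log-returns and the lag-1 sample autocorrelation of the returns and of the squared returns. The price list is made up for the example (it is not the WTI series), and the helper names `log_returns` and `acf` are hypothetical.

```python
import math


def log_returns(prices):
    """Log-returns r_t = ln(p_t / p_{t-1})."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def acf(x, lag):
    """Sample autocorrelation of x at the given lag (lag >= 1)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var


# Hypothetical toy prices, NOT the WTI data
prices = [20.0, 20.5, 19.8, 21.0, 20.2, 20.9, 22.5, 21.1, 21.3, 21.2]
r = log_returns(prices)
r_squared = [v * v for v in r]

# Usual large-sample 95% band for a white-noise ACF
band = 1.96 / math.sqrt(len(r))
print(acf(r, 1), acf(r_squared, 1), band)
```

An autocorrelation of the squared returns that sits well outside the band, while the autocorrelation of the raw returns stays inside it, is the pattern usually read as a sign of heteroskedasticity.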
Testing for non-linearity
Three tests:
• The Brock, Dechert and Scheinkman (BDS) test (Brock et al.
1996);
• The Fuzzy Classification System (FCS), by Kaboudan (1999);
• The Time Domain Test for Non-linearity, by Barnett and Wolff
(2005);
The BDS test
$$W(m, \varepsilon, n) = \frac{B(m, \varepsilon, n)}{\sigma_m}$$
where
$$B(m, \varepsilon, n) = \sqrt{n}\,\left[ C(m, \varepsilon, n) - C(1, \varepsilon, n)^m \right]$$
• The correlation integral C(m, ε, n) measures how often a temporal pattern appears in the data.
• The null hypothesis is that the data are independent and identically distributed (iid).
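A minimal sketch of the correlation integral and of the numerator sqrt(n) [C(m, ε, n) − C(1, ε, n)^m] may help make the statistic concrete. It is a simplification: it uses the sup-norm distance, takes n as the number of m-histories, computes C(1, ε, n) on all points, and leaves out the standard deviation in the denominator of W. The function names are hypothetical.

```python
import math
from itertools import combinations


def correlation_integral(x, m, eps):
    """Share of pairs of m-histories whose sup-norm distance is below eps."""
    histories = [x[i:i + m] for i in range(len(x) - m + 1)]
    total = 0
    close = 0
    for u, v in combinations(histories, 2):
        total += 1
        if max(abs(a - b) for a, b in zip(u, v)) < eps:
            close += 1
    return close / total


def bds_numerator(x, m, eps):
    """sqrt(n) * [C(m, eps, n) - C(1, eps, n)^m], n = number of m-histories."""
    n = len(x) - m + 1
    c_m = correlation_integral(x, m, eps)
    c_1 = correlation_integral(x, 1, eps)
    return math.sqrt(n) * (c_m - c_1 ** m)


# Toy alternating series: strongly patterned, so C(m) exceeds C(1)^m
x = [0, 1, 0, 1, 0, 1]
print(correlation_integral(x, 2, 0.5), bds_numerator(x, 2, 0.5))
```

Under the iid null, C(m, ε, n) should be close to C(1, ε, n)^m, so the numerator should be close to zero; a patterned series such as the toy example pushes it away from zero.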
The BDS test
Return I
m    ε = 0.5     ε = 1       ε = 1.5    ε = 2
2    63.28235    32.41485    28.294     24.21126
5    277.9648    63.43841    41.301     34.10912
8    1569.915    113.0694    51.747     38.16579
Return II
m    ε = 0.5     ε = 1       ε = 1.5    ε = 2
2    69.901      30.065      21.751     19.484
5    370.17      80.356      37.184     30.265
8    2574.9      194.74      57.465     36.404
Entries are the W statistic; SIG = 0 for every entry.
The BDS test
Return III
m    W           SIG
2    16.08305    0
5    30.51802    0
8    47.82609    0
The FCS test
ARIMA → R^2; (R^2, θ) → Class: SL, NL, HN
$$\theta = (M - 1)^{-1} \sum_{m=2}^{M} \frac{\ln C_y(m, e_1) - \ln C_y(m, e_2)}{\ln C_s(m, e_1) - \ln C_s(m, e_2)}$$
Create fuzzy membership rules
Degree of membership by membership class:
R^2:
  0                        if R^2 < 0.70
  1 - (0.90 - R^2)/0.2     if 0.70 ≤ R^2 < 0.90
  1                        if R^2 ≥ 0.90
θ: piecewise over the breakpoints 0.75, 0.85, 0.95 and 1, built from the pieces 0, 1, (θ - 0.95)/0.05 and 1 - (0.85 - θ)/0.1.
Source: Kaboudan (1999)
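The R^2 membership can be sketched as a small piecewise-linear function, assuming it is 0 below 0.70, rises linearly as 1 − (0.90 − R^2)/0.2 between 0.70 and 0.90, and is 1 from 0.90 upwards. The function name `r2_membership` is hypothetical, and the assignment of the boundary points to the pieces is an assumption.

```python
def r2_membership(r2):
    """Degree of membership for R^2 (assumed piecewise-linear form)."""
    if r2 < 0.70:
        return 0.0
    if r2 < 0.90:
        # Linear ramp: 0 at R^2 = 0.70, 1 at R^2 = 0.90
        return 1.0 - (0.90 - r2) / 0.2
    return 1.0


for value in (0.50, 0.80, 0.99):
    print(value, r2_membership(value))
```

Because the ramp meets 0 at 0.70 and 1 at 0.90, the function is continuous, so the treatment of the boundary points does not change the result.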
[Table: fuzzy rules no. 1 to 39]
The FCS results
Data set               Fitted ARIMA      R2       θ       Decision
Oil price (all)        Simple            0.99     0.96    SL-NL-HN
Oil price I            ARIMA(3,1,3)      0.90     0.94    SL-NL-HN
Oil price II           Simple            0.90     0.87    SL-NL-HN
Oil return (all)       ARIMA(0,0,5)      0.005    0.96    NL-NL-HN
Oil return I           ARIMA(3,0,3)      0.01     0.93    NL-NL-HN
Oil return II          ARIMA(2,0,0)      0.02     0.99    NL-WN
3-MA return II*        ARIMA(2,0,6)      0.77     0.00    SL-NL
Wavelet return II**    ARIMA(3,0,3)      0.018    0.29    WL-NL
SL: strongly linear; NL: non-linear; HN: high noise; WN: white noise.
* Return II smoothed with a simple three-day moving average.
** Return II filtered with a wavelet filter.
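The three-day smoothing mentioned in the first footnote can be sketched as a simple moving average. The sketch assumes a trailing window (each value is the mean of the current and the two previous observations); the slides do not say whether the window is trailing or centred, and the function name is hypothetical.

```python
def moving_average(x, window=3):
    """Trailing simple moving average; the first window - 1 points are dropped."""
    return [
        sum(x[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(x))
    ]


print(moving_average([1, 2, 3, 4, 5]))
```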
The FCS over time
Testing for Chaos
• Chaos is characterized by the sensitivity of a time series to changes in the initial conditions.
• The Lyapunov exponent (LE) is a quantitative measure of the existence of chaos.
– Let $x_{t+1} = f(X_t) + e_t$ be a dynamical system, where $X_t = (x_t, x_{t-1}, \ldots, x_{t-d+1})$, $d \ge 1$, are the data, $e_t$ is a random error and $f$ is a non-linear function. Also, let $J_t = Df(X_t)$ be the Jacobian matrix of $f$, and
$$T_m = J_m J_{m-1} \cdots J_1 = Df^m, \quad m = 1, 2, \ldots$$
– The Lyapunov exponents λ are estimated as
$$\lambda_i(X) = \lim_{m \to \infty} \frac{1}{m} \ln |a_i(m, X)|, \quad i = 1, 2, \ldots, d$$
– where $a_i(m, X)$ is the i-th largest eigenvalue, in modulus, of the product matrix $T_m$.
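To make the definition concrete, here is a minimal sketch for the one-dimensional case (d = 1), where the Jacobian reduces to the scalar derivative f'(x_t) and the exponent becomes the average of ln|f'(x_t)| along the orbit. It uses the logistic map with parameter 4 as a stand-in for f, since its exponent is known to be ln 2; the estimation of f from data, as needed for the oil returns, is not shown, and all names are hypothetical.

```python
import math


def logistic(x, r=4.0):
    """Logistic map, a textbook chaotic system (stand-in for f)."""
    return r * x * (1.0 - x)


def logistic_derivative(x, r=4.0):
    """Scalar Jacobian of the logistic map."""
    return r * (1.0 - 2.0 * x)


def lyapunov_1d(x0, m, burn_in=100):
    """(1/m) * sum of ln|f'(x_t)|, i.e. (1/m) ln|T_m| for d = 1."""
    x = x0
    for _ in range(burn_in):
        x = logistic(x)
    log_sum = 0.0
    for _ in range(m):
        log_sum += math.log(abs(logistic_derivative(x)))
        x = logistic(x)
    return log_sum / m


lam = lyapunov_1d(0.3, 20000)
print(lam, math.log(2))
```

A positive estimate indicates that nearby trajectories diverge exponentially, which is the sensitivity to initial conditions described above.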
Lyapunov exponents
Return
                  99% confidence level, 1000 bootstrap replications
m     λ           Highest      Lowest
d1    0.0924      1.45E-18     0
d2    0.1951      0.063865     -2.20E-20
d3    3.48E-19    0.078794     -1.10E-19
d4    0.0204      0.176158     0.021725
d5    0.0318      1.875162     -1.30E-18
d6    9.74E-20    0.648368     -0.0502
3MA return
                  99% confidence level, 1000 bootstrap replications
m     λ1          Highest      Lowest
d1    0.07        1.32E-18     0
d2    0.3405      0.062429     -3.10E-20
d3    0.1101      0.046871     -8.40E-20
d4    0.0766      0.127102     -1.40E-19
d5    0.1353      2.08148      -9.30E-19
d6    2.82E-19    0.594717     -0.03211
Lyapunov exponents
• We cautiously conclude that the dynamics of the crude oil return series are non-linear deterministic, possibly chaotic.
• This conclusion contradicts the findings of Moshiri and Foroutan (2006), who found no evidence of chaos in crude oil futures prices.
• It is important to note that Moshiri and Foroutan (2006) tested the LE using raw prices of crude oil futures contracts, not spot returns.
FORECASTING
We use three types of models:
• ARIMA
• EGARCH
• ANN
ANN
• ANNs were designed in an attempt to imitate the functionality of the human brain;
• the fundamental idea of an ANN is to learn the desired behaviour from the data with no a priori assumptions.
• From an econometric point of view, ANNs fall in the non-linear, non-parametric, multivariate group of models (Grothmann 2002).
• This makes them a suitable approach for modelling non-linear relationships in high-dimensional space.
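ANN (cont.)
$$u_i = b + \sum_j x_j w_{ji}$$
$$y_i = f(u_i)$$
[Figure: a single neuron and a feed-forward network. The inputs x1, x2, x3, ..., xn are multiplied by the weights ws1, ws2, ws3, ..., wsn and added to the bias b at the summing junction Σ; the activation function then gives the output y. The network has an input layer, a hidden layer and an output layer.]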
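The neuron equations can be sketched in a few lines: the summing junction forms u = b + Σ x_j w_j and the activation function maps u to the output y. The choice of tanh as the default activation is an assumption for illustration (the slides do not name the activation used), and `neuron_output` is a hypothetical name.

```python
import math


def neuron_output(x, w, b, f=math.tanh):
    """Single neuron: weighted sum of inputs plus bias, then activation."""
    u = b + sum(xj * wj for xj, wj in zip(x, w))  # summing junction
    return f(u)                                   # activation function


# Two inputs whose weighted sum cancels, so u = 0 and tanh(0) = 0
print(neuron_output([1.0, 2.0], [0.5, -0.25], 0.0))
```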
Training a neural network in the backpropagation paradigm involves continuously changing the values of the network parameters in the direction that reduces the error between the output and the target:
$$w_{t+1} = w_t + \Delta w_t$$
$$w_T = w_0 + \sum_{t=0}^{T-1} \Delta w_t$$
$$\Delta w = -\lambda \delta = -\lambda \frac{\partial E}{\partial w},$$
where w denotes the network weights, t is the current step, T is the number of iterations, λ is the learning rate (step size), δ is the gradient of the error surface, and E is the global error.
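The update rule Δw = −λ ∂E/∂w can be illustrated with plain gradient descent on a single linear neuron. This is a deliberately small sketch, not the network used in the study: the error is assumed to be half the mean squared error, the data are synthetic (y = 2x + 1), and the names `train`, `lr` and `steps` are hypothetical.

```python
def train(xs, ys, lr=0.1, steps=1000):
    """Fit y = w * x + b by repeated steps w <- w - lr * dE/dw."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of E = 0.5 * mean((w * x + b - y)^2)
        dw = sum((w * x + b - y) * x for x, y in zip(xs, ys)) / n
        db = sum((w * x + b - y) for x, y in zip(xs, ys)) / n
        w += -lr * dw  # delta w = -learning rate * gradient
        b += -lr * db
    return w, b


# Synthetic data generated from y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = train(xs, ys)
print(w, b)
```

Backpropagation applies the same step to every weight of a multi-layer network, using the chain rule to obtain each partial derivative.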
ANN results
Out-of-sample    Benchmark    Squared     Wav1      Wav2      3 MA
Hit rate (%)     48.82        55.12605    61.99     74.97     79.67
RMSE             0.0356       0.032827    0.0337    0.0196    0.0131
R2               0.0182       0.004447    0.0293    0.0837    0.6219
IC               0.7529       0.965628    0.7554    0.9693    0.806
MSE              0.0013       0.001164    0.0011    0.0004    0.0002
MAE              0.0253       0.02222     0.0233    0.0114    0.0093
SSE              0.3756       0.27694     0.3208    0.1127    0.0829
DA               -0.4304      1.51874     4.011     8.4024    13.45
P-value          0.6665       0.091106    0.0008    0         0
ANN results (cont.)
             3 days MA               5 days MA
Metric       ANN       RW            ANN       RW
Hit rate     79.67     72.40265      83.52     80.97166
RMSE         0.0131    0.012578      0.0066    0.011199
R2           0.6219    0.408819      0.7302    0.611197
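For readers who want to reproduce metrics of this kind, the sketch below computes a hit rate and an RMSE. It assumes the hit rate is the percentage of forecasts whose sign matches the sign of the realised return, which is a common definition but is not spelled out in the slides; the numbers are invented and the function names are hypothetical.

```python
import math


def hit_rate(actual, predicted):
    """Percentage of forecasts with the same sign as the realised value."""
    hits = sum(1 for a, p in zip(actual, predicted) if a * p > 0)
    return 100.0 * hits / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Invented returns and forecasts, for illustration only
actual = [0.01, -0.02, 0.03, -0.01]
predicted = [0.02, -0.01, -0.01, -0.03]
print(hit_rate(actual, predicted), rmse(actual, predicted))
```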
Multi-step Forecast
[Figure: neural networks, each with an input layer, a hidden layer and an output layer, mapping lagged inputs y(t), ..., y(t - q) to the forecasts ŷ(t + 1), ŷ(t + 2), ..., ŷ(t + h)]
where q is the number of lags and h is the forecast horizon
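One way to prepare data for a multi-step forecast is to pair each window of lagged values with the value a fixed number of steps ahead. The sketch below assumes this "direct" strategy, with q lagged inputs and a horizon of h steps; the slides do not state whether the forecasts were produced directly or by iterating one-step forecasts, and `make_samples` is a hypothetical helper.

```python
def make_samples(y, q, h):
    """Pair the q most recent values at time t with the value at t + h."""
    samples = []
    for t in range(q - 1, len(y) - h):
        inputs = y[t - q + 1:t + 1]  # y(t-q+1), ..., y(t)
        target = y[t + h]            # value h steps ahead
        samples.append((inputs, target))
    return samples


series = list(range(10))  # toy series 0, 1, ..., 9
samples = make_samples(series, q=3, h=2)
print(samples[0], samples[-1], len(samples))
```

Each (inputs, target) pair can then be fed to a network with q input nodes and one output node, with a separate network trained for each horizon.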
Multi-step Forecast
Forecast horizon    Hit rate range    Confidence limit    Mean hit rate for 1000 tests
19 days             52-60%            95%                 56%
20 days             52-60%            95%                 56%
21 days             52-59%            95%                 56%
22 days             52-58%            95%                 55%
23 days             51-57%            95%                 54%
24 days             52-58%            95%                 54%
25 days             52-58%            95%                 55%
Conclusion
• The BDS statistic indicates the existence of non-linear behaviour in all crude oil price and return subseries.
• The FCS test also suggests that the dynamics of the crude oil series are non-linear stochastic.
• Finally, the Lyapunov exponents for crude oil returns (and smoothed returns) highlight the possibility of low-dimensional deterministic dynamics, i.e., chaos.
• The Lyapunov exponent results could explain the random-walk-like behaviour of the crude oil return.
Conclusion (cont.)
• We applied several data transformations and smoothing procedures in the hope of reducing the noise and highlighting certain dynamics, such as mean reversion.
• Our empirical results showed that some of these measures are effective in improving forecast accuracy.
• We show that, for smoothed data, multi-step forecasting is possible (19 to 25 steps ahead) with reasonable accuracy.
Thank you