STATE SPACE MODELS AND STOCHASTIC VOLATILITY


Chapter 2. Unobserved Component Models
Esther Ruiz
PhD Program in Business Administration and Quantitative Analysis
Financial Econometrics, 2006-2007
2.1 Description and properties
Unobserved component models assume that the variables of interest are made up of components that have a direct interpretation but cannot be observed directly.
Applications in finance:

The "fads" model of Poterba and Summers (1988): there are two types of traders, informed (μt) and uninformed (εt), and the observed price yt is given by

$$y_t = \mu_t + \varepsilon_t$$
$$\mu_t = \mu_{t-1} + \eta_t$$
The model for ex ante interest differentials proposed by Cavaglia (1992): we observe the ex post interest differential, which equals the ex ante interest differential plus the cross-country differential in inflation,

$$y_t = y_t^* + \varepsilon_t$$
$$y_t^* = \phi(L)\, y_{t-1}^* + \eta_t$$
Factor models simplify the computation of the covariance matrix in mean-variance portfolio allocation and are central to two asset pricing theories, the CAPM and the APT:

$$\begin{pmatrix} r_{1t} \\ r_{2t} \\ \vdots \\ r_{Nt} \end{pmatrix} = \begin{pmatrix} v_{1t} \\ v_{2t} \\ \vdots \\ v_{Nt} \end{pmatrix} + \begin{pmatrix} \lambda_{11t} & \cdots & \lambda_{1kt} \\ \lambda_{21t} & \cdots & \lambda_{2kt} \\ \vdots & & \vdots \\ \lambda_{N1t} & \cdots & \lambda_{Nkt} \end{pmatrix} \begin{pmatrix} f_{1t} \\ \vdots \\ f_{kt} \end{pmatrix} + \begin{pmatrix} \varepsilon_{1t} \\ \vdots \\ \varepsilon_{Nt} \end{pmatrix}$$

$$E(f_t f_t' \mid r_{t-1}) = \Sigma_t, \qquad E(\varepsilon_t \varepsilon_t' \mid r_{t-1}) = \Gamma_t$$

with k < N common factors.
The term structure of interest rates model proposed by Rossi (2004): the observed yields are given by the theoretical rates implied by a no-arbitrage condition plus a stochastic disturbance,

$$\begin{pmatrix} y_t(\tau_1) \\ y_t(\tau_2) \\ \vdots \\ y_t(\tau_n) \end{pmatrix} = A_t + B_t \tilde r_t + C_t u_t + \varepsilon_t$$

where the state, formed by the short rate $\tilde r_t$ and its stochastic mean $u_t$, evolves as

$$\begin{pmatrix} \tilde r_{t+1} \\ u_{t+1} \end{pmatrix} = \begin{pmatrix} e^{-a} & 0 \\ 0 & e^{-b} \end{pmatrix} \begin{pmatrix} \tilde r_t \\ u_t \end{pmatrix} + c_t + \eta_t$$
Modelling volatility

There are two main types of models to represent the dynamic evolution of volatilities:

i) GARCH models, which assume that the volatility is a nonlinear function of past returns:

$$y_t = \sigma_t \varepsilon_t$$
$$\sigma_t^2 = \omega + \alpha y_{t-1}^2 + \beta \sigma_{t-1}^2$$
$$E(y_t \mid Y_{t-1}) = E(\sigma_t \varepsilon_t \mid Y_{t-1}) = \sigma_t E(\varepsilon_t \mid Y_{t-1}) = 0$$
$$\mathrm{Var}(y_t \mid Y_{t-1}) = E(\sigma_t^2 \varepsilon_t^2 \mid Y_{t-1}) = \sigma_t^2 E(\varepsilon_t^2 \mid Y_{t-1}) = \sigma_t^2$$

σt² is the one-step ahead (conditional) variance and, therefore, can be observed given observations up to time t-1. As a result, classical inference procedures can be implemented.
Example: Consider the following GARCH(1,1) model for the IBEX35 returns:

$$\sigma_t^2 = 0.019 + 0.103\, y_{t-1}^2 + 0.889\, \sigma_{t-1}^2$$

The returns corresponding to the first two days in the sample are 0.21 and -0.38. Initializing σ₁² at the unconditional variance, 0.019/(1 - 0.103 - 0.889) ≈ 2.38,

$$E(y_2 \mid y_1 = 0.21) = 0$$
$$\mathrm{Var}(y_2 \mid y_1 = 0.21) = 0.019 + 0.103 \times 0.21^2 + 0.889 \times 2.38 = 2.14$$
$$E(y_3 \mid y_1 = 0.21,\, y_2 = -0.38) = 0$$
$$\mathrm{Var}(y_3 \mid y_1 = 0.21,\, y_2 = -0.38) = 0.019 + 0.103 \times 0.38^2 + 0.889 \times 2.14 = 1.94$$
In this case there are no unobserved components, but consider the model for fundamental prices with GARCH errors:

$$y_t = \mu_t + \varepsilon_t$$
$$\mu_t = \mu_{t-1} + \eta_t$$
$$\varepsilon_t = \varepsilon_t^* \sqrt{h_t}, \qquad h_t = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2$$
$$\eta_t = \eta_t^* \sqrt{q_t}, \qquad q_t = \gamma_0 + \gamma_1 \eta_{t-1}^2$$

In this case, the variances of the noises cannot be observed with information available at time t-1.
ii) Stochastic volatility models, which assume that the volatility represents the arrival of new information into the market and, consequently, is unobserved:

$$y_t = \sigma^* \sigma_t \varepsilon_t$$
$$\log(\sigma_t^2) = \phi \log(\sigma_{t-1}^2) + \eta_t$$

Both models are able to represent:

a) excess kurtosis;

b) autocorrelations of squares that are small and persistent.

Although the properties of SV models are more attractive and closer to the empirical properties observed in real financial returns, their estimation is more complicated because the volatility, σt, cannot be observed one step ahead.
2.2 State space models
The Kalman filter allows the estimation of the underlying unobserved components. To implement it, we write the unobserved component model of interest in a general form known as the "state space model":

$$y_t = Z_t \alpha_t + \varepsilon_t, \qquad \varepsilon_t \sim N(0, H_t)$$
$$\alpha_t = T_t \alpha_{t-1} + \eta_t, \qquad \eta_t \sim N(0, Q_t)$$

where the matrices Zt, Ht, Tt and Qt can evolve over time as long as they are known at time t-1.
Consider, for example, the random walk plus noise model proposed to represent fundamental prices in the market. In this case, the measurement equation is given by

$$y_t = Z_t \alpha_t + \varepsilon_t \;\;\Longrightarrow\;\; y_t = \mu_t + \varepsilon_t$$

so that Zt = 1, the state αt is the underlying level μt, and Ht = σε². The transition equation is given by

$$\alpha_t = T_t \alpha_{t-1} + \eta_t \;\;\Longrightarrow\;\; \mu_t = \mu_{t-1} + \eta_t$$

with Tt = 1 and Qt = ση².
Unobserved component models depend on several disturbances. Provided the model is linear, the components can be combined to give a model with a single disturbance: the reduced form. The reduced form is an ARIMA model with restrictions on the parameters.
Consider the random walk plus noise model

$$y_t = \mu_t + \varepsilon_t$$
$$\mu_t = \mu_{t-1} + \eta_t$$

In this case, the first difference of the series is

$$\Delta y_t = \eta_t + \Delta\varepsilon_t = \eta_t + \varepsilon_t - \varepsilon_{t-1}$$

The mean and variance of Δyt are given by

$$E(\Delta y_t) = 0$$
$$\mathrm{Var}(\Delta y_t) = \sigma_\eta^2 + 2\sigma_\varepsilon^2$$

The autocorrelation function is given by

$$\rho(h) = \begin{cases} \dfrac{-\sigma_\varepsilon^2}{\sigma_\eta^2 + 2\sigma_\varepsilon^2} = \dfrac{-1}{q+2}, & h = 1 \\[2ex] 0, & h \ge 2 \end{cases}$$

where $q = \sigma_\eta^2 / \sigma_\varepsilon^2$ is the signal-to-noise ratio.

The reduced form is an IMA(1,1) model, $\Delta y_t = a_t + \theta a_{t-1}$, with negative parameter

$$\theta = \frac{\sqrt{q^2 + 4q} - 2 - q}{2}$$

When q = 0, Δyt reduces to a non-invertible MA(1) model, i.e. yt is a white noise process. On the other hand, as q increases, the autocorrelation of order one and, consequently, |θ| decrease. In the limit, if σε² = 0, Δyt is a white noise and yt is a random walk.
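The mapping from q to the MA parameter θ is easy to verify numerically. The short sketch below (our own illustration) checks that the implied MA(1) first-order autocorrelation θ/(1+θ²) coincides with -1/(q+2) from the unobserved components form:

```python
import numpy as np

def theta_from_q(q):
    """MA parameter of the reduced-form IMA(1,1) of a random walk plus noise."""
    return (np.sqrt(q**2 + 4 * q) - 2 - q) / 2

for q in [0.0, 0.5, 1.0, 5.0, 100.0]:
    th = theta_from_q(q)
    rho1_ma = th / (1 + th**2)   # first autocorrelation of an MA(1)
    rho1_uc = -1 / (q + 2)       # from the unobserved components form
    print(f"q={q:6.1f}  theta={th:8.4f}  rho1(MA)={rho1_ma:8.4f}  rho1(UC)={rho1_uc:8.4f}")
# theta = -1 at q = 0 (non-invertible MA(1)) and theta -> 0 as q grows
```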
Although we are focusing on univariate series, the results are valid for multivariate systems. The Kalman filter is made up of two sets of equations:

i) Prediction equations: one-step ahead predictions of the states and their corresponding variances,

$$a_{t/t-1} = E(\alpha_t \mid Y_{t-1}) = T_t a_{t-1}$$
$$P_{t/t-1} = E\big[(\alpha_t - a_{t/t-1})(\alpha_t - a_{t/t-1})' \mid Y_{t-1}\big] = T_t P_{t-1} T_t' + Q_t$$

For example, in the random walk plus noise model:

$$m_{t/t-1} = m_{t-1}$$
$$P_{t/t-1} = P_{t-1} + \sigma_\eta^2$$

ii) Updating equations: each new observation changes (updates) our estimates of the states obtained using past information,

$$a_t = E(\alpha_t \mid Y_t) = a_{t/t-1} + P_{t/t-1} Z_t' F_t^{-1} \nu_t$$
$$P_t = E\big[(\alpha_t - a_t)(\alpha_t - a_t)' \mid Y_t\big] = P_{t/t-1} - P_{t/t-1} Z_t' F_t^{-1} Z_t P_{t/t-1}$$

where

$$\nu_t = y_t - \hat y_{t/t-1} = y_t - Z_t a_{t/t-1}$$
$$F_t = E(\nu_t^2) = Z_t P_{t/t-1} Z_t' + H_t$$
The updating equations can be derived using the properties of the multivariate normal distribution. Consider the distribution of αt and yt conditional on past information up to and including time t-1. The conditional means are

$$a_{t/t-1} = E(\alpha_t \mid Y_{t-1}) = T_t a_{t-1} + c_t$$
$$\hat y_{t/t-1} = E(y_t \mid Y_{t-1}) = Z_t a_{t/t-1} + d_t$$

The conditional covariance can be easily derived by writing

$$y_t = Z_t a_{t/t-1} + d_t + Z_t(\alpha_t - a_{t/t-1}) + \varepsilon_t$$

so that

$$\mathrm{Cov}(\alpha_t, y_t \mid Y_{t-1}) = E_{t-1}\big[(\alpha_t - a_{t/t-1})(y_t - Z_t a_{t/t-1} - d_t)'\big] = E_{t-1}\big[(\alpha_t - a_{t/t-1})\big((\alpha_t - a_{t/t-1})' Z_t' + \varepsilon_t'\big)\big] = P_{t/t-1} Z_t'$$

Therefore

$$\begin{pmatrix} \alpha_t \\ y_t \end{pmatrix} \Big|\, Y_{t-1} \sim N\left( \begin{pmatrix} a_{t/t-1} \\ Z_t a_{t/t-1} + d_t \end{pmatrix}, \begin{pmatrix} P_{t/t-1} & P_{t/t-1} Z_t' \\ Z_t P_{t/t-1} & F_t \end{pmatrix} \right)$$

and the updating equations follow from the conditional mean and variance of αt given yt.
Consider, once more, the random walk plus noise model. In this case,

$$m_t = m_{t/t-1} + \frac{P_{t/t-1}}{f_t}\,(y_t - m_{t/t-1})$$
$$P_t = P_{t/t-1}\left(1 - \frac{P_{t/t-1}}{f_t}\right)$$
$$f_t = P_{t/t-1} + \sigma_\varepsilon^2$$
The Kalman filter needs some initial conditions for the state and its covariance matrix at time t=0. There are several alternatives. One of the simplest consists of assuming that the state at time zero is equal to its marginal mean and P0 is the marginal variance. However, when the state is not stationary this solution is not feasible. In that case, one can initialize the filter by assuming what is known as a diffuse prior (we do not have any information about what happens at time zero): m0 = 0 and P0 = ∞.
Now we are in a position to run the Kalman filter. Consider, for example, the random walk plus noise model, and suppose we want to obtain estimates of the underlying level of the series. In this case, the equations are given by

$$m_{1/0} = m_0 = 0$$
$$P_{1/0} = P_0 + \sigma_\eta^2$$
$$F_1 = P_{1/0} + \sigma_\varepsilon^2 = P_0 + \sigma_\eta^2 + \sigma_\varepsilon^2$$
$$m_1 = m_{1/0} + \frac{P_{1/0}}{F_1}\,(y_1 - m_{1/0}) = \frac{P_0 + \sigma_\eta^2}{P_0 + \sigma_\eta^2 + \sigma_\varepsilon^2}\, y_1 \;\longrightarrow\; y_1 \quad \text{as } P_0 \to \infty$$
$$P_1 = P_{1/0} - P_{1/0} F_1^{-1} P_{1/0} = (P_0 + \sigma_\eta^2)\left(1 - \frac{P_0 + \sigma_\eta^2}{P_0 + \sigma_\eta^2 + \sigma_\varepsilon^2}\right) \;\longrightarrow\; \sigma_\varepsilon^2 \quad \text{as } P_0 \to \infty$$

and for the second observation,

$$m_{2/1} = m_1$$
$$P_{2/1} = P_1 + \sigma_\eta^2$$
$$F_2 = P_{2/1} + \sigma_\varepsilon^2$$
$$m_2 = m_{2/1} + \frac{P_{2/1}}{F_2}\,(y_2 - m_{2/1})$$
$$P_2 = P_{2/1} - P_{2/1} F_2^{-1} P_{2/1} = (P_1 + \sigma_\eta^2)\left(1 - \frac{P_1 + \sigma_\eta^2}{P_1 + \sigma_\eta^2 + \sigma_\varepsilon^2}\right)$$
Consider, for example, that we have observations of a time series generated by a random walk plus noise with σε² = 1 and ση² = 4: 1.14, 0.59, 1.58, …

$$m_1 = 1.14, \qquad P_1 = 1$$
$$m_{2/1} = 1.14, \qquad P_{2/1} = 1 + 4 = 5$$
$$f_2 = 5 + 1 = 6$$
$$m_2 = 1.14 + \frac{5}{6}\,(0.59 - 1.14) = 0.68$$
$$P_2 = 5\left(1 - \frac{5}{6}\right) = 0.83$$
[Figure: simulated series Y with its one-step ahead PREDICTIONS, t = 1, …, 1000; and PREDICTIONS versus FILTERED estimates of the level.]
The Kalman filter gives:

i) one-step ahead and updated estimates of the unobserved states and their associated mean squared errors: a_{t/t-1}, P_{t/t-1}, a_t and P_t;

ii) one-step ahead estimates of yt: $\hat y_{t/t-1} = E(y_t \mid Y_{t-1}) = Z_t a_{t/t-1}$;

iii) one-step ahead errors (innovations) and their variances: νt and Ft.
Smoothing algorithms

There are also other algorithms, known as smoothing algorithms, that generate estimates of the unobserved states based on the whole sample:

$$a_{t/T} = E(\alpha_t \mid Y_T) = a_t + P_t^*\,(a_{t+1/T} - T_{t+1} a_t)$$
$$P_{t/T} = P_t + P_t^*\,(P_{t+1/T} - P_{t+1/t})\,P_t^{*\prime}$$
$$P_t^* = P_t\, T_{t+1}'\, P_{t+1/t}^{-1}$$

The smoothers are very useful because they generate estimates of the disturbances associated with each of the components of the model: the auxiliary residuals. For example, in the random walk plus noise model:

$$a_{T-1/T} = a_{T-1} + \frac{P_{T-1}}{P_{T/T-1}}\,(a_{T/T} - a_{T-1})$$
$$a_{T-2/T} = a_{T-2} + \frac{P_{T-2}}{P_{T-1/T-2}}\,(a_{T-1/T} - a_{T-2})$$
[Figure: FILTERED versus SMOOTHED estimates of the level, t = 1, …, 1000; and the series Y together with the SMOOTHED level.]
The auxiliary residuals are useful to:

i) identify outliers in the different components; Harvey and Koopman (1992);

ii) test whether the components are heteroscedastic; Broto and Ruiz (2005a,b). This test is based on looking at the differences between the autocorrelations of the squares and the squared autocorrelations of each of the auxiliary residuals; Maravall (1983).
Prediction

One of the main objectives of time series analysis is the prediction of future values of the series of interest. This can easily be done in the context of state space models by running the prediction equations without the updating equations. For a time-invariant system,

$$a_{T+k/T} = E(\alpha_{T+k} \mid Y_T) = T^k a_T$$
$$P_{T+k/T} = E\big[(\alpha_{T+k} - a_{T+k/T})(\alpha_{T+k} - a_{T+k/T})' \mid Y_T\big] = T^k P_T (T^k)' + \sum_{j=0}^{k-1} T^j Q (T^j)'$$

and the corresponding predictions of the observations are $\hat y_{T+k/T} = Z\, a_{T+k/T}$ with variance $Z P_{T+k/T} Z' + H$. In the context of the random walk plus noise model:

$$m_{T+k/T} = m_T$$
$$P_{T+k/T} = P_T + k \sigma_\eta^2$$
Estimation of the parameters

Up to now we have assumed that the parameters of the state space model are known. However, in practice, we need to estimate them. In Gaussian state space models, the estimation can be done by Maximum Likelihood. In this case, we can write

$$y_t = Z_t a_{t/t-1} + Z_t(\alpha_t - a_{t/t-1}) + \varepsilon_t$$
$$y_t \mid Y_{t-1} \sim N(Z_t a_{t/t-1},\, F_t)$$

The log-likelihood is then given by the prediction error decomposition,

$$\log L = -\frac{T}{2}\log(2\pi) - \frac{1}{2}\sum_{t=1}^T \log|F_t| - \frac{1}{2}\sum_{t=1}^T \frac{\nu_t^2}{F_t}$$
The asymptotic properties of the ML estimator are the usual ones as long as the parameters lie in the interior of the parameter space. However, in many models of interest the parameters are variances, and it is of interest to know whether they are zero (i.e. whether we have deterministic components). In some cases the asymptotic distribution is still related to the Normal, but it is modified to take the boundary into account; see Harvey (1989).
If the model is not conditionally Gaussian, then by maximizing the Gaussian log-likelihood we obtain what is known as the Quasi-Maximum Likelihood (QML) estimator. In this case the estimator loses its efficiency. Furthermore, dropping the Normality assumption tends to affect the asymptotic distribution of all the parameters, which is then given by

$$\sqrt{T}(\hat\theta - \theta) \rightarrow N(0,\, J^{-1} I J^{-1})$$

where

$$J = -E\left[\frac{\partial^2 \log L}{\partial\theta\, \partial\theta'}\right], \qquad I = E\left[\frac{\partial \log L}{\partial\theta}\, \frac{\partial \log L}{\partial\theta'}\right]$$
Unobserved component models for financial time series

We consider two particular applications of unobserved component models with financial data:

i) dealing with stochastic volatility;

ii) heteroscedastic components.
Stochastic volatility models

Understanding and modelling stock volatility is necessary for traders hedging against risk, in option pricing theory, and as a simple risk measure in many asset pricing models.

The simplest stochastic volatility model is given by

$$y_t = \sigma^* \sigma_t \varepsilon_t$$
$$\log(\sigma_t^2) = \phi \log(\sigma_{t-1}^2) + \eta_t$$

Taking logarithms of the squared returns, we obtain a linear although non-Gaussian state space model:

$$\log(y_t^2) = \log(\sigma^{*2}) + \log(\sigma_t^2) + \log(\varepsilon_t^2)$$
$$\log(y_t^2) = \mu + h_t + \xi_t$$
$$h_t = \phi h_{t-1} + \eta_t$$

Because log(yt²) is not truly Gaussian, the Kalman filter yields minimum mean square linear estimators (MMSLE) of ht and of future observations, rather than minimum mean square estimators (MMSE).

If y1, y2, …, yT is the returns series, we transform the data as

$$y_t^* = \log\big((y_t - \bar y)/\hat s_y\big)^2$$
The Kalman filter is given by

$$h_0 = 0, \qquad P_0 = \frac{\sigma_\eta^2}{1-\phi^2}$$
$$h_{1/0} = \phi h_0, \qquad P_{1/0} = \phi^2 P_0 + \sigma_\eta^2$$
$$f_1 = P_{1/0} + \sigma_\xi^2$$
$$h_1 = h_{1/0} + \frac{P_{1/0}}{f_1}\,(y_1^* - h_{1/0}), \qquad P_1 = P_{1/0}\left(1 - \frac{P_{1/0}}{f_1}\right)$$

and so on for t = 2, …, T.
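A QML routine for the SV model simply runs this filter over the transformed series yt* and maximizes the resulting Gaussian log-likelihood. Below is a minimal sketch (our own code; note that for Gaussian εt, log εt² has variance π²/2 ≈ 4.93, which motivates the starting value for σξ²):

```python
import numpy as np
from scipy.optimize import minimize

def sv_qml_negloglik(params, ystar):
    """Minus the quasi log-likelihood of log(y_t*^2) = mu + h_t + xi_t,
    h_t = phi h_{t-1} + eta_t, treating xi_t as if it were Gaussian."""
    phi = np.tanh(params[0])                   # keeps |phi| < 1
    sig2_eta, sig2_xi = np.exp(params[1:3])
    mu = params[3]
    h, P = 0.0, sig2_eta / (1 - phi**2)        # marginal moments of h_t
    ll = 0.0
    for ys in ystar:
        h_pred, P_pred = phi * h, phi**2 * P + sig2_eta   # prediction
        F = P_pred + sig2_xi
        v = ys - mu - h_pred                   # innovation
        ll += -0.5 * (np.log(2 * np.pi) + np.log(F) + v**2 / F)
        h = h_pred + (P_pred / F) * v          # update
        P = P_pred * (1 - P_pred / F)
    return -ll

# Simulated example: generate SV returns, transform, estimate by QML
rng = np.random.default_rng(1)
n = 2000
h = np.zeros(n)
for t in range(1, n):
    h[t] = 0.98 * h[t-1] + rng.normal(0.0, 0.15)          # phi = 0.98
y = np.exp(h / 2) * rng.normal(size=n)
ystar = np.log(((y - y.mean()) / y.std()) ** 2)
res = minimize(sv_qml_negloglik, args=(ystar,), method="Nelder-Mead",
               x0=[2.0, np.log(0.02), np.log(4.93), ystar.mean()])
print(np.tanh(res.x[0]), np.exp(res.x[1]))                # phi, sig2_eta
```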
S&P 500

[Figure: SP500 daily returns, t = 1, …, 10000.]

QML estimates (asymptotic standard errors in parentheses):

$$\hat\phi = \underset{(0.001)}{0.9943}, \qquad \hat\sigma_\eta^2 = \underset{(0.001)}{0.0085}, \qquad \hat\sigma_\xi^2 = \underset{(0.120)}{5.1719}, \qquad \hat\sigma^{*2} = 0.572$$

[Figure: VOLSP, estimated volatility of the SP500 returns, t = 1, …, 10000.]
Canadian Dollar/US Dollar

[Figure: CAN, Canadian Dollar/US Dollar returns, t = 1, …, 8000.]

QML estimates (asymptotic standard errors in parentheses):

$$\hat\phi = \underset{(0.004)}{0.988}, \qquad \hat\sigma_\eta^2 = \underset{(0.003)}{0.023}, \qquad \hat\sigma_\xi^2 = \underset{(0.140)}{5.260}, \qquad \hat\sigma^{*2} = 0.049$$

[Figure: VOLCAN, estimated volatility of the Canadian Dollar/US Dollar returns, t = 1, …, 6000.]
Unobserved heteroscedastic components

Unobserved component models with heteroscedastic disturbances have been extensively used in the analysis of financial time series; for example, multivariate models with common factors, where both the idiosyncratic and the common factors may be heteroscedastic, as in Harvey, Ruiz and Sentana (1992), or univariate models where the components may be heteroscedastic, as in Broto and Ruiz (2005b).
To simplify the exposition, we focus on the random walk plus noise model with ARCH(1) disturbances, given by

$$y_t = \mu_t + \varepsilon_t$$
$$\mu_t = \mu_{t-1} + \eta_t$$

where

$$\varepsilon_t = \varepsilon_t^* \sqrt{h_t}, \qquad h_t = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2$$
$$\eta_t = \eta_t^* \sqrt{q_t}, \qquad q_t = \gamma_0 + \gamma_1 \eta_{t-1}^2$$
The model can be written in state space form as follows:

$$y_t = \begin{pmatrix} 1 & 0 \end{pmatrix} \begin{pmatrix} \mu_t \\ \eta_t \end{pmatrix} + \varepsilon_t$$
$$\begin{pmatrix} \mu_t \\ \eta_t \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \mu_{t-1} \\ \eta_{t-1} \end{pmatrix} + \begin{pmatrix} 1 \\ 1 \end{pmatrix} \eta_t$$

where the state vector is augmented with ηt so that its conditional variance can be tracked, and

$$H_t = E_{t-1}(\varepsilon_t^2) = \alpha_0 + \alpha_1 E_{t-1}(\varepsilon_{t-1}^2) = \alpha_0 + \alpha_1 E_{t-1}\big[(y_{t-1} - \mu_{t-1})^2\big] = \alpha_0 + \alpha_1\big(\hat\varepsilon_{t-1}^2 + P_{t-1}\big)$$
$$Q_t = E_{t-1}(\eta_t^2) = \gamma_0 + \gamma_1 E_{t-1}(\eta_{t-1}^2) = \gamma_0 + \gamma_1\big(\hat\eta_{t-1}^2 + P_{t-1}^\eta\big)$$

where $\hat\varepsilon_{t-1}$ and $\hat\eta_{t-1}$ are the filtered estimates of the disturbances and $P_{t-1}$ and $P_{t-1}^\eta$ their mean squared errors.
The model is not conditionally Gaussian, since knowledge of past observations does not, in general, imply knowledge of past disturbances. Nevertheless, we can proceed on the basis that the model can be treated as though it were conditionally Gaussian, and we refer to the resulting Kalman filter as quasi-optimal.
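A quasi-optimal filter therefore recomputes Ht and Qt at every step from the latest estimates of the disturbances before applying the usual prediction and updating equations. The sketch below is our own implementation of this idea for the random walk plus noise model with ARCH(1) disturbances, following the Ht and Qt expressions above:

```python
import numpy as np

def arch_ucm_filter(y, a0, a1, g0, g1):
    """Quasi-optimal Kalman filter for y_t = mu_t + eps_t, mu_t = mu_{t-1} + eta_t
    with ARCH(1) disturbances.  State: alpha_t = (mu_t, eta_t)'."""
    T = np.array([[1.0, 0.0], [0.0, 0.0]])
    R = np.array([1.0, 1.0])                 # eta_t enters both state elements
    Z = np.array([1.0, 0.0])
    a = np.zeros(2)
    P = np.diag([1e7, g0])                   # large P[0,0] approximates a diffuse prior
    eps_hat, P_eps = 0.0, a0 / (1.0 - a1)    # rough unconditional ARCH initialization
    level = np.empty(len(y))
    for t, yt in enumerate(y):
        # Time-varying variances from the ARCH recursions
        H = a0 + a1 * (eps_hat**2 + P_eps)
        q = g0 + g1 * (a[1]**2 + P[1, 1])
        # Prediction equations
        a = T @ a
        P = T @ P @ T.T + q * np.outer(R, R)
        # Updating equations
        F = Z @ P @ Z + H
        K = P @ Z / F
        a = a + K * (yt - Z @ a)
        P = P - np.outer(K, Z @ P)
        # Filtered disturbance estimates feeding next period's variances
        eps_hat, P_eps = yt - a[0], P[0, 0]
        level[t] = a[0]
    return level
```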
Nikkei

[Figure: NIKKEI index, t = 1, …, 1750.]
QML estimates of the random walk plus noise parameters:

$$\hat\sigma_\varepsilon^2 = 0.111, \qquad \hat\sigma_\eta^2 = 1.482$$

Diagnostics of the innovations:

$$Q(10) = 15.892, \qquad Q_2(10) = 145.78^*$$

[Figure: sample autocorrelations of the auxiliary residuals DIFIRRENIK and DIFFLEVELNIK and of SDNIKKEI, lags 5 to 35.]

Diagnostics of the auxiliary residuals:

$$Q_\varepsilon(10) = 408.67^*, \qquad Q_{\varepsilon^2}(10) = 401.70^*$$
$$Q_\eta(10) = 23.805^*, \qquad Q_{\eta^2}(10) = 148.81^*$$

Estimated variance equations for the disturbances (t-statistics in parentheses):

$$h_t = \underset{(2.396)}{0.079}$$
$$q_t = \underset{(3.468)}{0.035} + \underset{(5.082)}{0.082}\,\eta_{t-1}^2 + \underset{(51.752)}{0.900}\,q_{t-1} + \underset{(4.949)}{0.108}\,\eta_{t-1}$$
Hewlett Packard

[Figure: HP stock price, t = 1, …, 2000.]

QML estimates of the random walk plus noise parameters:

$$\hat\sigma_\varepsilon^2 = 0.215, \qquad \hat\sigma_\eta^2 = 7.352$$

Diagnostics of the innovations:

$$Q(10) = 20.898, \qquad Q_2(10) = 193.54^*$$

[Figure: sample autocorrelations of the auxiliary residuals DIFIRREHP and DIFLEVELHP and of SDHP, lags 5 to 35.]

Diagnostics of the auxiliary residuals:

$$Q_\varepsilon(10) = 515.39^*, \qquad Q_{\varepsilon^2}(10) = 431.70^*$$
$$Q_\eta(10) = 22.872^*, \qquad Q_{\eta^2}(10) = 196.73^*$$

Estimated variance equations for the disturbances (t-statistics in parentheses):

$$h_t = \underset{(3.186)}{0.399}$$
$$q_t = \underset{(2.896)}{0.033} + \underset{(4.886)}{0.021}\,\eta_{t-1}^2 + \underset{(193.257)}{0.973}\,q_{t-1} + \underset{(2.274)}{0.053}\,\eta_{t-1}$$