Stationarity Issues in Time Series Modeling
David A. Dickey
North Carolina State University
“Stationarity”-what is it?
Example: Stocks of Silver in the
NY Commodities Exchange
Two forecasts:
Nonstationary in yellow
– No mean reversion, unbounded error bands
Stationary in green
– Reverts to mean, bounded error bands
Silver Series
“Stationarity”-what is it?
Constant mean $\mu$
Covariance between $Y_t$ and $Y_{t+h}$ is a function of $h$ only: $\gamma(h)$
[Autocorrelation $\rho(h) = \gamma(h)/\gamma(0)$]
One Lag Model
$Y_t - \mu = \rho(Y_{t-1} - \mu) + e_t$
– “shocks” $e_t \sim N(0, \sigma^2)$
Stationary: $|\rho| < 1$
– $Y_t = \mu(1-\rho) + \rho Y_{t-1} + e_t$
– Regress $Y_t$ on 1, $Y_{t-1}$
» Estimators approximately normally distributed in large samples
» Use t test for $H_0: \rho = 0$
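A minimal Python sketch of the stationary case, added for illustration only (the talk itself uses SAS). The parameter values, the seed, and the helper names `simulate_ar1` and `regress_on_lag` are assumptions, not part of the original slides. It simulates the one lag model and regresses $Y_t$ on 1 and $Y_{t-1}$; the intercept should land near $\mu(1-\rho)$ and the slope near $\rho$.

```python
import random

def simulate_ar1(n, mu, rho, sigma, seed=0):
    """Simulate Y_t - mu = rho*(Y_{t-1} - mu) + e_t, starting at the mean."""
    rng = random.Random(seed)
    y = [mu]
    for _ in range(n - 1):
        y.append(mu + rho * (y[-1] - mu) + rng.gauss(0.0, sigma))
    return y

def regress_on_lag(y):
    """OLS of Y_t on (1, Y_{t-1}); returns (intercept, slope)."""
    x, z = y[:-1], y[1:]
    m = len(x)
    xbar, zbar = sum(x) / m, sum(z) / m
    sxx = sum((a - xbar) ** 2 for a in x)
    sxz = sum((a - xbar) * (b - zbar) for a, b in zip(x, z))
    slope = sxz / sxx
    return zbar - slope * xbar, slope

# Illustrative (assumed) values: mu = 100, rho = 0.8, sigma = 1.
y = simulate_ar1(n=5000, mu=100.0, rho=0.8, sigma=1.0)
intercept, slope = regress_on_lag(y)
print(intercept, slope)  # intercept estimates mu*(1 - rho); slope estimates rho
```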
One Lag Model with $\rho = 1$
$Y_t - \mu = 1(Y_{t-1} - \mu) + e_t$
– “shocks” $e_t \sim N(0, \sigma^2)$
$Y_t = Y_{t-1} + e_t$
Best forecast of $Y_t$ is $Y_{t-1}$
Nonstationary: $\rho = 1$
– Regress $Y_t$ on 1, $Y_{t-1}$
– Estimators NOT normally distributed even in large samples
– CANNOT use t tables to test for $H_0: \rho = 0$
– t test statistic does NOT have t distribution!!!
Hypothesis Test
Model: $Y_t - \mu = \rho(Y_{t-1} - \mu) + e_t$
Test
– $H_0: \rho = 1$ “Nonstationary, Unit Root”
– $H_1: |\rho| < 1$ “Stationary (mean reverting)”
Compare t calculated to new distribution
Two Tests
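Model: $Y_t - \mu = \rho(Y_{t-1} - \mu) + e_t$
– $Y_t - \mu - (Y_{t-1} - \mu) = (\rho - 1)(Y_{t-1} - \mu) + e_t$
– $Y_t - Y_{t-1} = \mu(1-\rho) + (\rho - 1)Y_{t-1} + e_t$
– Regress $Y_t - Y_{t-1}$ on 1, $Y_{t-1}$
– Tests:
   – n(coefficient of $Y_{t-1}$): “Rho”
   – calculated t test: “Tau”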
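The two statistics can be sketched in a few lines of Python. This is an illustration under assumptions, not the SAS computation: the function names, the seed, the series length of 100 and the 2000 replications are all hypothetical choices, and the number of regression rows is used for n. The loop simulates series with a true unit root and counts how often Tau falls below -1.645, the one-sided 5% normal cutoff; the share comes out far above 5%, which is why a new distribution is needed.

```python
import math
import random

def dickey_fuller_stats(y):
    """Regress Y_t - Y_{t-1} on (1, Y_{t-1}); return (rho_stat, tau_stat)."""
    x = y[:-1]
    d = [b - a for a, b in zip(y[:-1], y[1:])]
    m = len(x)                                   # rows in the regression
    xbar, dbar = sum(x) / m, sum(d) / m
    sxx = sum((a - xbar) ** 2 for a in x)
    slope = sum((a - xbar) * (b - dbar) for a, b in zip(x, d)) / sxx
    intercept = dbar - slope * xbar
    sse = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, d))
    se = math.sqrt(sse / (m - 2) / sxx)          # standard error of the slope
    return m * slope, slope / se

def random_walk(n, rng):
    """Y_t = Y_{t-1} + e_t with standard normal shocks, starting at 0."""
    y = [0.0]
    for _ in range(n - 1):
        y.append(y[-1] + rng.gauss(0.0, 1.0))
    return y

rng = random.Random(1)
taus = [dickey_fuller_stats(random_walk(100, rng))[1] for _ in range(2000)]
# Share of true unit-root series that a normal-table user would reject at 5%.
share = sum(t < -1.645 for t in taus) / len(taus)
print(share)
```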
Some math
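$Y_t = e_t + e_{t-1} + \cdots + e_1$
Products $e_i e_j$ for $n = 4$ (all entries sum to $Y_4^2$):
$$\begin{pmatrix} e_1^2 & e_1e_2 & e_1e_3 & e_1e_4 \\ e_2e_1 & e_2^2 & e_2e_3 & e_2e_4 \\ e_3e_1 & e_3e_2 & e_3^2 & e_3e_4 \\ e_4e_1 & e_4e_2 & e_4e_3 & e_4^2 \end{pmatrix}$$
Above diagonal -> column sums are $Y_1e_2$, $Y_2e_3$, $Y_3e_4$
$$\sum_{t=2}^{n} Y_{t-1}e_t = \frac{1}{2}\Big(Y_n^2 - \sum_{t=1}^{n} e_t^2\Big)$$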
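The identity on this slide, that the sum of $Y_{t-1}e_t$ over t = 2..n equals half of $Y_n^2$ minus the sum of squared shocks, can be checked numerically. The sketch below is an added illustration; the seed and the length n = 50 are arbitrary assumptions.

```python
import random

rng = random.Random(2)
n = 50
e = [rng.gauss(0.0, 1.0) for _ in range(n)]   # e_1, ..., e_n
y = []
total = 0.0
for shock in e:                               # Y_t = e_1 + ... + e_t
    total += shock
    y.append(total)

# Left side: sum over t = 2..n of Y_{t-1} * e_t (the lists are 0-based).
lhs = sum(y[t - 1] * e[t] for t in range(1, n))
# Right side: (Y_n^2 - sum of e_t^2) / 2.
rhs = 0.5 * (y[-1] ** 2 - sum(s * s for s in e))
print(lhs, rhs)  # the two values agree up to rounding error
```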
More math
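$$n(\hat\rho - 1) = \frac{n^{-1}\sum_{t=2}^{n} Y_{t-1}e_t}{n^{-2}\sum_{t=2}^{n} Y_{t-1}^2} = \frac{\frac{1}{2}\big(Y_n^2 - \sum_{t=1}^{n} e_t^2\big)/(n\sigma^2)}{\sum_{t=2}^{n} Y_{t-1}^2/(n^2\sigma^2)} \to \frac{\frac{1}{2}\big(W^2(1) - 1\big)}{\int_0^1 W^2(t)\,dt}$$
$W(t)$ is Wiener Process on [0,1]
$$(t\ \text{test}) \to \frac{\frac{1}{2}\big(W^2(1) - 1\big)}{\sqrt{\int_0^1 W^2(t)\,dt}}$$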
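One consequence of the limit is easy to check by simulation: the sign of $n(\hat\rho - 1)$ is the sign of $W^2(1) - 1$, and $W(1)$ is standard normal, so the statistic is negative with limiting probability $P(\chi^2_1 < 1)$, about 0.68. The sketch below is an added illustration for the regression without an intercept; the function name, seed, n = 200 and 2000 replications are assumptions.

```python
import random

def normalized_bias(n, rng):
    """n*(rho_hat - 1) from regressing Y_t on Y_{t-1} (no intercept), rho = 1."""
    y_prev, num, den = 0.0, 0.0, 0.0
    for _ in range(n):
        shock = rng.gauss(0.0, 1.0)
        num += y_prev * shock      # accumulates the sum of Y_{t-1} * e_t
        den += y_prev * y_prev     # accumulates the sum of Y_{t-1}^2
        y_prev += shock            # Y_t = Y_{t-1} + e_t
    return n * num / den

rng = random.Random(3)
draws = [normalized_bias(200, rng) for _ in range(2000)]
share_negative = sum(d < 0 for d in draws) / len(draws)
print(share_negative)  # compare with the limiting value of about 0.68
```

The skew toward negative values is one visible way the limit differs from a centered normal distribution.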
Two Series
SAS software: PROC ARIMA
proc gplot; plot (Y Z)*t / overlay; run;
proc arima;
   i var=Y nlag=10 stationarity=(adf);
   i var=Z nlag=10 stationarity=(adf);
quit;
Symptoms of Nonstationarity
ACF dies down slowly
– ACF is Corr($Y_t$, $Y_{t-j}$) plotted vs. j
Nonconstant level when plotted
Saw plot, ACFs coming up
Y series ACF
The ARIMA Procedure
Name of Variable = Y
Mean of Working Series    110.9728
Standard Deviation        5.286108
Number of Observations         250

Autocorrelations
Lag  Correlation  Std Error   Plot
  0      1.00000  0             |********************|
  1      0.97219  0.063246    . |*******************
  2      0.94506  0.107523    . |*******************
  3      0.91741  0.136771    . |******************
  4      0.89025  0.159498    . |******************
  5      0.86479  0.178269    . |*****************
  6      0.84145  0.194326    . |*****************
  7      0.81771  0.208391    . |****************
  8      0.79836  0.220853    . |****************
  9      0.77912  0.232110    . |****************
 10      0.75671  0.242346    . |***************
Z series ACF
The ARIMA Procedure
Name of Variable = Z
Mean of Working Series    100.5022
Standard Deviation        2.402392
Number of Observations         250

Autocorrelations
Lag  Correlation   Plot
  0      1.00000     |********************|
  1      0.90796   . |******************
  2      0.81755   . |****************
  3      0.72228   . |**************
  4      0.63703   . |*************
  5      0.56707   . |***********
  6      0.51964   . |**********
  7      0.47865   . |**********
  8      0.46026   . |*********
  9      0.44466   . |*********
 10      0.42313   . |********
"." marks two standard errors
Tests on Y
The ARIMA Procedure
Augmented Dickey-Fuller Unit Root Tests
Type         Lags       Rho  Pr < Rho    Tau  Pr < Tau     F  Pr > F
Zero Mean       0    0.1014    0.7059   0.71    0.8675
                1    0.0880    0.7027   0.59    0.8422
                2    0.0719    0.6989   0.45    0.8101
Single Mean     0   -6.8507    0.2817  -2.30    0.1724  2.99  0.3095
                1   -6.8539    0.2815  -2.16    0.2211  2.57  0.4147
                2   -7.1478    0.2624  -2.07    0.2564  2.29  0.4861
Trend           0   -7.3468    0.6313  -2.46    0.3502  3.64  0.4500
                1   -7.3273    0.6328  -2.30    0.4295  3.07  0.5636
                2   -7.5909    0.6114  -2.19    0.4905  2.65  0.6489
Tests on Z
The ARIMA Procedure
Augmented Dickey-Fuller Unit Root Tests
Type         Lags       Rho  Pr < Rho    Tau  Pr < Tau     F  Pr > F
Zero Mean       0   -0.0087    0.6803  -0.05    0.6647
                1   -0.0237    0.6769  -0.15    0.632
                2   -0.0393    0.6733  -0.24    0.5997
Single Mean     0  -22.8511    0.0051  -3.45    0.0104  5.96  0.0136
                1  -24.5443    0.0034  -3.48    0.0095  6.06  0.0114
                2  -28.8542    0.0015  -3.69    0.0050  6.80  0.0010
Trend           0  -24.6119    0.0236  -3.61    0.0312  6.53  0.0449
                1  -26.2971    0.0161  -3.60    0.0319  6.48  0.0461
                2  -30.7682    0.0057  -3.77    0.0196  7.13  0.0283
Higher Order Processes
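$Y_t - \mu = \alpha_1(Y_{t-1} - \mu) + \alpha_2(Y_{t-2} - \mu) + \alpha_3(Y_{t-3} - \mu) + e_t$
$\Delta Y_t = Y_t - Y_{t-1} = -(1 - \alpha_1 - \alpha_2 - \alpha_3)(Y_{t-1} - \mu) - (\alpha_2 + \alpha_3)\Delta Y_{t-1} - \alpha_3 \Delta Y_{t-2} + e_t$
[ $-(1 - \alpha_1 - \alpha_2 - \alpha_3)$ is the “coefficient”; the $\Delta Y_{t-1}$ and $\Delta Y_{t-2}$ terms are the “augmenting lags” ]
ADF stands for Augmented Dickey-Fuller
Testing for no mean reversion:
$H_0: (1 - \alpha_1 - \alpha_2 - \alpha_3) = 0$
Regress $Y_t - Y_{t-1}$ on 1, $Y_{t-1}$, $Y_{t-1} - Y_{t-2}$, $Y_{t-2} - Y_{t-3}$
Distributions: coefficient on $Y_{t-1}$ is nonstandard | coefficients on the lagged differences are N(__, __)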
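The rewrite of the three-lag model into the differenced (ADF) form is pure algebra, so it can be confirmed on any simulated path. The sketch below is an added illustration; the coefficient values, the mean, the seed and the helper name `adf_form` are hypothetical.

```python
import random

a1, a2, a3, mu = 0.5, 0.3, 0.1, 10.0   # hypothetical illustrative values
rng = random.Random(4)
y = [mu, mu, mu]                        # three start-up values at the mean
shocks = [0.0, 0.0, 0.0]
for _ in range(200):
    e = rng.gauss(0.0, 1.0)
    shocks.append(e)
    y.append(mu + a1 * (y[-1] - mu) + a2 * (y[-2] - mu) + a3 * (y[-3] - mu) + e)

def adf_form(t):
    """Right side of the differenced equation at time t."""
    d1 = y[t - 1] - y[t - 2]            # lagged difference, lag 1
    d2 = y[t - 2] - y[t - 3]            # lagged difference, lag 2
    return (-(1 - a1 - a2 - a3) * (y[t - 1] - mu)
            - (a2 + a3) * d1 - a3 * d2 + shocks[t])

# Largest gap between Y_t - Y_{t-1} and the ADF-form right side.
max_gap = max(abs((y[t] - y[t - 1]) - adf_form(t)) for t in range(3, len(y)))
print(max_gap)  # zero up to rounding error
```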
Higher Order Processes
Q1: How many lags???
Regress $\Delta Y_t$ on 1, $Y_{t-1}$, $\Delta Y_{t-1}$, $\Delta Y_{t-2}$, . . .
Coefficients on the lagged differences: | N(__, __) |
so . . .
Just use usual t tests and p-values!!!
Q2: Why “Unit Root” Tests ??
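$B(Y_t) = Y_{t-1}$
$(1 - \alpha_1 B - \alpha_2 B^2 - \alpha_3 B^3)(Y_t - \mu) = e_t$
root of $1 - \alpha_1 B - \alpha_2 B^2 - \alpha_3 B^3$ at $B = 1$ means
$1 - \alpha_1(1) - \alpha_2(1)^2 - \alpha_3(1)^3 = 0$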
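A small Python illustration of the unit root condition: evaluate the backshift polynomial at B = 1 and see whether it vanishes. The function name and both coefficient sets are made-up examples, not values from the talk.

```python
def ar_polynomial(coeffs, b):
    """Evaluate 1 - a1*b - a2*b**2 - ... at the backshift value b."""
    return 1.0 - sum(a * b ** (k + 1) for k, a in enumerate(coeffs))

unit_root_case = [1.3, -0.2, -0.1]    # hypothetical: coefficients sum to 1
stationary_case = [0.5, 0.3, 0.1]     # hypothetical: coefficients sum to 0.9

print(ar_polynomial(unit_root_case, 1.0))    # zero up to rounding: B = 1 is a root
print(ar_polynomial(stationary_case, 1.0))   # 1 - 0.9, so B = 1 is not a root
```

The value at B = 1 is exactly the quantity $1 - \alpha_1 - \alpha_2 - \alpha_3$ that the ADF regression tests.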
Check Silver Series for
Augmenting Lags
PROC REG;
   MODEL DEL = LSILVER DEL1 DEL2 DEL3 DEL4;
   TEST DEL2=0, DEL3=0, DEL4=0;
RUN;

Source         DF   Mean Square   F Value   Pr > F
Numerator       3    4589.63459      1.31   0.2753
Denominator   133    3515.48242
Unit Root test in PROC REG
PROC REG;
   MODEL DEL = LSILVER DEL1;
RUN;

Variable    DF   Parameter Estimate   t Value   Pr > |t|
Intercept    1             75.58073      2.76     0.0082
LSILVER      1             -0.11703     -2.78     0.0079
DEL1         1              0.67115      6.21     <.0001
Unit Root test in PROC ARIMA
PROC ARIMA DATA=SILVER;
   I VAR=SILVER STATIONARITY=(ADF=(1));

Augmented Dickey-Fuller Unit Root Tests
Type         Lags    Tau   Pr < Tau
Zero Mean       1  -0.28     0.5800
Single Mean     1  -2.78     0.0689
Trend           1  -2.63     0.2697
And now. . .the rest of the story
Type         Lags    Tau   Pr < Tau
Zero Mean    ????? (A)
Single Mean     1  -2.78     0.0689
Trend        ????? (B)
(A) Assumes mean is 0 (or known and subtracted off)
    Has different (pair of) distributions !!
(B) Allows for TREND under H1
    Has third (pair of) distributions !!!!
Silver - Need 2nd Difference?
$D_t = \Delta Y_t = Y_t - Y_{t-1}$
Q: Does D (also) have a unit root ?
Regress $\Delta D_t$ on $D_{t-1}$ using /NOINT (why?)
No augmenting lags (why?)
I VAR=Y(1) STATIONARITY = . . .
Type         Lags    Tau   Pr < Tau
Zero Mean       0  -3.42     0.0010
Single Mean     0  -3.39     0.0158
Trend           0  -3.62     0.0383
Autocorrelations
Lag  Covariance  Correlation   Plot
  0     7612550      1.00000     |********************|
  1     7604217      0.99891   . |********************|
  2     7595529      0.99776   . |********************|
  3     7586855      0.99662   . |********************|
  4     7578152      0.99548   . |********************|
  5     7569481      0.99434   . |********************|
  6     7560553      0.99317   . |********************|
  7     7551925      0.99204   . |********************|
  8     7543869      0.99098   . |********************|
  9     7535957      0.98994   . |********************|
 10     7528240      0.98892   . |********************|
 11     7519890      0.98783   . |********************|
 12     7511672      0.98675   . |********************|
"." marks two standard errors
Output from SAS PROC ARIMA
Augmented Dickey-Fuller Unit Root Tests
Type         Lags       Rho  Pr < Rho
Zero Mean       0    1.3567    0.9565
                1    1.3481    0.9557
Single Mean     0    0.4065    0.9744
                1    0.3500    0.9725
Trend           0   -6.3073    0.7203
                1   -6.5833    0.6981
Differences
Autocorrelations
Lag  Covariance  Correlation   Plot
  0    4003.285      1.00000     |********************|
  1     102.471      0.02560    .|*
  2    -117.368      -.02932    *|.
  3    -235.578      -.05885    *|.
  4  -26.946567      -.00673    .|.
  5  -46.750761      -.01168    .|.
  6  -77.100469      -.01926    .|.
  7    -224.055      -.05597    *|.
  8  -27.874814      -.00696    .|.
  9     132.415      0.03308    .|*
 10     316.534      0.07907    .|**
 11    -254.117      -.06348    *|.
 12     200.979      0.05020    .|*
"." marks two standard errors
Inverse Autocorrelation
Ming Chang thesis
Dual model
$(1 - \alpha B) Y_t = e_t$   AR(1)
dual is
$Y_t = (1 - \alpha B) e_t$   MA(1)
Chang shows IACF dies off slowly if you
overdifference.
Differenced DJIA IACF
Inverse Autocorrelations
Lag  Correlation   Plot
  1     -0.51119   **********|.
  2      0.01380            .|.
  3     -0.00533            .|.
  4      0.01061            .|.
  5     -0.02324            .|.
  6      0.00722            .|.
  7      0.02122            .|.
  8     -0.01617            .|.
  9      0.02831            .|*
 10     -0.04860            *|.
 11      0.02759            .|*
 12     -0.00422            .|.
2nd Differenced DJIA IACF
Just for illustration, here is the inverse autocorrelation you
would get if you differenced these differences once more, that
is, if you took the second difference of the original series.
Note the roughly triangular appearance, suggesting that you
should have stopped after the first difference.

Inverse Autocorrelations
Lag  Correlation   Plot
  1      0.89720   .|******************
  2      0.80302   .|****************
  3      0.70785   .|**************
  4      0.60466   .|************
  5      0.50498   .|**********
  6      0.41173   .|********
  7      0.32523   .|*******
  8      0.23836   .|*****
  9      0.15871   .|***
 10      0.09447   .|**
 11      0.05758   .|*
 12      0.01735   .|.
Rho and F
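$Y_t - \mu = \alpha_1(Y_{t-1} - \mu) + \alpha_2(Y_{t-2} - \mu) + e_t$
Factor:
$(1 - \alpha_1 B - \alpha_2 B^2) = (1 - \rho B)(1 - \gamma B)$
$\Delta Y_t = -(1-\rho)(1-\gamma)(Y_{t-1} - \mu) + \rho\gamma\,\Delta Y_{t-1} + e_t$
Rho
(1) Estimate $\rho\gamma = -\alpha_2$ ($= \gamma$ under $H_0$) by regression
(2) Divide n[$(1-\rho)(\gamma-1)$ estimate] by ($1 - \gamma$ estimate), which gives $n(\hat\rho - 1)$
F
Regress $\Delta Y_t$ on 1, t, $Y_{t-1}$, $\Delta Y_{t-1}$
Test underlined items with F (3 numerator df)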
Trend is not Unit Root
$Y_t = \alpha + \beta t + Z_t$ with $Z_t$ stationary
$Y_{t-1} = \alpha + \beta(t-1) + Z_{t-1}$
$\Delta Y_t = \beta + \Delta Z_t$ with $\Delta Z_t$ an overdifferenced series !!
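A simulated illustration of this point, under the simplifying assumption that $Z_t$ is white noise (the slide only requires stationarity). The intercept, slope, seed and series length below are hypothetical. Differencing removes the trend and leaves a mean near the slope, but the differenced noise $e_t - e_{t-1}$ has lag 1 autocorrelation -0.5, the signature of an overdifferenced series.

```python
import random

def lag1_autocorrelation(x):
    """Sample lag 1 autocorrelation of the list x."""
    xbar = sum(x) / len(x)
    c0 = sum((v - xbar) ** 2 for v in x)
    c1 = sum((x[i] - xbar) * (x[i + 1] - xbar) for i in range(len(x) - 1))
    return c1 / c0

rng = random.Random(5)
a, b = 2.0, 0.05                       # hypothetical intercept and slope
y = [a + b * t + rng.gauss(0.0, 1.0) for t in range(5000)]
dy = [y[t] - y[t - 1] for t in range(1, len(y))]

mean_dy = sum(dy) / len(dy)            # estimates the slope b
r1 = lag1_autocorrelation(dy)          # theoretical value is -0.5
print(mean_dy, r1)
```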
Example:
Amazon.com Example (volume)
PROC REG;
   MODEL DV = DATE LAGV DV1-DV4;
   TEST DV3=0, DV4=0;
RUN;

Variable    DF   Parameter Estimate   t Value   Pr > |t|   Type I SS
Intercept    1            -17.49220     -5.26     <.0001     0.00848
date         1              0.00147      5.41     <.0001     0.01395
LAGV         1             -0.21914     -5.80     <.0001    26.67803
DV1          1             -0.15446     -3.08     0.0022     0.94211
DV2          1             -0.18447     -3.72     0.0002     3.52898
DV3          1             -0.04433     -0.94     0.3477     0.07997
DV4          1             -0.05774     -1.31     0.1923     0.48763

Test 1 Results for Dependent Variable DV
Source         DF   Mean Square   F Value   Pr > F
Numerator       2       0.28380      0.99   0.3715
Denominator   497       0.28602
ACF Levels:
Lag  Covariance  Correlation   Plot
  0    2.503910      1.00000     |********************|
  1    2.327538      0.92956   . |*******************
  2    2.225324      0.88874   . |******************
  3    2.193509      0.87603   . |******************
  4    2.155492      0.86085   . |*****************
  5    2.127643      0.84973   . |*****************
  6    2.099292      0.83841   . |*****************
  7    2.069929      0.82668   . |*****************
  8    2.062194      0.82359   . |****************
  9    2.051450      0.81930   . |****************
 10    2.011864      0.80349   . |****************
 11    2.006564      0.80137   . |****************
 12    1.996735      0.79745   . |****************
 13    1.960231      0.78287   . |****************
 14    1.951272      0.77929   . |****************
 15    1.940939      0.77516   . |****************
 16    1.919167      0.76647   . |***************
 17    1.906896      0.76157   . |***************
 18    1.905406      0.76097   . |***************
 19    1.892168      0.75569   . |***************
 20    1.857199      0.74172   . |***************
 21    1.846038      0.73726   . |***************
 22    1.826167      0.72933   . |***************
 23    1.816151      0.72533   . |***************
 24    1.821228      0.72735   . |***************
"." marks two standard errors
IACF - Differences
Lag  Correlation   Plot
  1      0.48216   . |**********
  2      0.44816   . |*********
  3      0.34266   . |*******
  4      0.30682   . |******
  5      0.25213   . |*****
  6      0.24854   . |*****
  7      0.23624   . |*****
  8      0.18675   . |****
  9      0.14088   . |***
 10      0.20330   . |****
 11      0.13295   . |***
 12      0.11437   . |**
 13      0.15524   . |***
 14      0.11829   . |**
 15      0.09978   . |**
 16      0.10919   . |**
 17      0.09049   . |**
 18      0.06653   . |*.
 19      0.02886   . |*.
 20      0.09515   . |**
 21      0.05504   . |*.
 22      0.07104   . |*.
 23      0.06065   . |*.
 24      0.02284   . | .
Do the test:
The ARIMA Procedure
Augmented Dickey-Fuller Unit Root Tests
Type         Lags       Rho  Pr < Rho    Tau  Pr < Tau      F  Pr > F
Zero Mean       2    0.0144    0.6861   0.02    0.6909
Single Mean     2  -14.2100    0.0474  -2.60    0.0944   3.42  0.1920
Trend           2  -85.7758    0.0007  -6.35    <.0001  20.18  0.0010
Fit AR(3) plus trend.
Diagnostics:
Autocorrelation Check of Residuals
To Lag   Chi-Square   DF   Pr > ChiSq
     6         1.59    3       0.6615
    12        10.89    9       0.2835
    18        12.43   15       0.6460
    24        18.97   21       0.5872
    30        23.75   27       0.6439
    36        30.32   33       0.6014
    42        37.56   39       0.5358
    48        39.37   45       0.7087
-----Autocorrelations-----
 0.015  . . .  -0.000
-0.025  . . .   0.072
-0.036  . . .   0.031
Extensions
S. E. Said shows that models with lagged $e_t$ terms can
still be tested by ADF tests.
Nobel Prize “cointegration” idea:
Two or more unit root processes have a
stationary linear combination.
Compute, e.g., $Y_t = \ln(S_t/L_t)$ and test for
stationarity.
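A toy Python sketch of the cointegration idea, under loud assumptions: the two log series below are simulated to share one random-walk component, and the names `log_s` and `log_l`, the noise scales and the seed are all hypothetical (the slide does not say what $S_t$ and $L_t$ are). Each log series has a unit root, but their difference, $\ln(S_t/L_t)$, is stationary, so its Tau statistic is strongly negative.

```python
import math
import random

def tau_single_mean(y):
    """t statistic on Y_{t-1} when regressing Y_t - Y_{t-1} on (1, Y_{t-1})."""
    x = y[:-1]
    d = [b - a for a, b in zip(y[:-1], y[1:])]
    m = len(x)
    xbar, dbar = sum(x) / m, sum(d) / m
    sxx = sum((a - xbar) ** 2 for a in x)
    slope = sum((a - xbar) * (b - dbar) for a, b in zip(x, d)) / sxx
    intercept = dbar - slope * xbar
    sse = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, d))
    return slope / math.sqrt(sse / (m - 2) / sxx)

rng = random.Random(6)
common = 0.0
log_s, log_l = [], []
for _ in range(500):
    common += rng.gauss(0.0, 0.02)               # shared unit-root component
    log_s.append(common + rng.gauss(0.0, 0.01))  # ln(S_t), assumed form
    log_l.append(common + rng.gauss(0.0, 0.01))  # ln(L_t), assumed form

log_ratio = [s - l for s, l in zip(log_s, log_l)]   # Y_t = ln(S_t / L_t)
tau_level = tau_single_mean(log_s)
tau_ratio = tau_single_mean(log_ratio)
print(tau_level, tau_ratio)
```

As on the earlier slides, Tau must be compared with Dickey-Fuller percentiles rather than with t tables.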
http://www4.stat.ncsu.edu/~dickey
Click: SAS Code from Presentations
Thanks !
Questions ?