Serial Correlation - James Madison University


Autocorrelation
Outline
1) What is it?
2) What are the consequences for our Least Squares estimator when we
have an autocorrelated error?
3) How do we test for an autocorrelated error?
4) How do we correct a model that has an autocorrelated error?
12.1
What is Autocorrelation?
12.2
Review the assumptions of the Gauss-Markov theorem:
1. Linear regression model: $y = \beta_1 + \beta_2 x + e$
2. Error term has a mean of zero: $E(e) = 0 \Rightarrow E(y) = \beta_1 + \beta_2 x$
3. Error term has constant variance: $\mathrm{Var}(e) = E(e^2) = \sigma^2$
4. Error term is not correlated with itself (no serial correlation): $\mathrm{Cov}(e_i, e_j) = E(e_i e_j) = 0$ for $i \neq j$
5. Data on X are not random and thus are uncorrelated with the error term: $\mathrm{Cov}(X, e) = E(Xe) = 0$
Assumption 4 is the assumption of a serially uncorrelated error. The error is assumed to be independent of its past; it has no memory of its past values. It is like flipping a coin.
A serially correlated error (a.k.a. autocorrelated error) is one that has a memory of its past values: it is correlated with itself.
Autocorrelation is more commonly a problem for time-series data.
An example of an autocorrelated error:
$e_t = \rho e_{t-1} + v_t$
$e_t = 0.8 e_{t-1} + v_t$
Here we have $\rho = 0.8$. It means that 80% of the error in period t-1 is still felt in period t. The error in period t is comprised of 80% of last period's error plus an error $v_t$ that is unique to period t. This is sometimes called an AR(1) model, for "autoregressive of the first order".
The autocorrelation coefficient must lie between -1 and 1:
$-1 < \rho < 1$
Anything outside this range is unstable and very unlikely for economic models.
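To make the AR(1) error concrete, here is a minimal SAS sketch (not part of the original lecture program; the dataset name ar1sim and the seed are made up) that simulates 100 periods of $e_t = 0.8 e_{t-1} + v_t$ and plots the resulting tracking pattern:

* Simulate an AR(1) error e(t) = 0.8*e(t-1) + v(t);
data ar1sim;
  e = 0;                     * start the process at zero;
  do t = 1 to 100;
    v = rannor(12345);       * v(t) ~ N(0,1) with an arbitrary seed;
    e = 0.8*e + v;           * 80% of last period's error carries over;
    output;
  end;
run;

proc gplot data=ar1sim;
  plot e*t / vref=0;         * positive autocorrelation shows up as tracking;
run;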
12.3
• Autocorrelation can be positive or negative:
if $\rho > 0$ ⇒ we say that the error has positive autocorrelation. A graph of the errors shows a tracking pattern.
12.4
if $\rho < 0$ ⇒ we say that the error has negative autocorrelation. A graph of the errors shows an oscillating pattern.
• In general, $\rho$ measures the strength of the correlation between the errors at time t and their values lagged one period.
• There can be higher orders, such as a second-order AR(2) model:
$e_t = \theta_1 e_{t-1} + \theta_2 e_{t-2} + v_t$
The mean, variance and covariance for an AR(1) error:
$e_t = \rho e_{t-1} + v_t$
$E(e_t) = 0$
$\mathrm{Var}(e_t) = \sigma_e^2 = \dfrac{\sigma_v^2}{1 - \rho^2}$
$\mathrm{Cov}(e_t, e_{t-k}) = \sigma_e^2 \rho^k$
$\mathrm{corr}(e_t, e_{t-k}) = \dfrac{\mathrm{Cov}(e_t, e_{t-k})}{\sigma_e \sigma_e} = \dfrac{\sigma_e^2 \rho^k}{\sigma_e^2} = \rho^k$
12.5
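As a quick worked example of these formulas, take $\rho = 0.8$ from the earlier illustration and assume $\sigma_v^2 = 1$ (an assumed value, not one from the text):

$\sigma_e^2 = \dfrac{\sigma_v^2}{1 - \rho^2} = \dfrac{1}{1 - 0.8^2} \approx 2.78, \qquad \mathrm{corr}(e_t, e_{t-3}) = \rho^3 = 0.8^3 = 0.512$

The variance of the error is inflated well above $\sigma_v^2$, and the correlation between errors dies out geometrically as the lag k grows.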
What are the Implications for Least Squares?
We have to ask "where did we use the assumption?" or "why was the assumption needed in the first place?"
We used the assumption in the derivation of the variance formulas for the least squares estimators, b1 and b2. For b2 this was:
$b_2 = \beta_2 + \sum w_t e_t$
$\mathrm{Var}(b_2) = \mathrm{Var}\left(\beta_2 + \sum w_t e_t\right) = \sum w_t^2 \,\mathrm{Var}(e_t) = \sigma^2 \sum w_t^2 = \dfrac{\sigma^2}{\sum_t (x_t - \bar{x})^2}$
The assumption of a serially uncorrelated error is made when we say that the variance of a sum is equal to the sum of the variances. This is true only if the random variables are uncorrelated. See Chapter 2, pg. 31.
12.6
• The proof that the least squares estimator is unbiased did not use the assumption of serially uncorrelated errors; therefore, this property of least squares continues to hold even in the presence of an autocorrelated error.
• The "B" in BLUE of the Gauss-Markov Theorem no longer holds.
• The variance formulas for the least squares estimators are incorrect ⇒ this invalidates hypothesis tests and confidence intervals.
12.7
The "correct" variance formula:
$\mathrm{Var}(b_2) = \mathrm{Var}\left(\beta_2 + \sum w_t e_t\right) = \dfrac{\sigma^2}{\sum_t (x_t - \bar{x})^2}\left[1 + \dfrac{2}{\sum_t (x_t - \bar{x})^2} \sum_{i<j} \rho^{\,j-i} (x_i - \bar{x})(x_j - \bar{x})\right]$
The large term in brackets shows how the Var(b2) formula changes to allow for an autocorrelated error.
• If $\rho > 0$, which is typically the case for economic models, it can be shown that the "incorrect" Var(b2) < "correct" Var(b2): when $\rho > 0$ and the x's themselves are positively correlated over time, the double sum in the brackets is positive, so the bracketed term exceeds 1.
• If we ignore the problem and use the "incorrect" Var(b2), we will overstate the reliability of the estimates, because we will report a standard error that is too small. The t-statistics will be "falsely" large, leading to a false sense of precision.
12.8
How to Test for Autocorrelation
12.9
We test for autocorrelation in much the same way we test for a heteroskedastic error: estimate the model using least squares and examine the residuals for a pattern.
1) Visual Inspection: Plot residuals against time. Do they have a
systematic pattern that indicates a tracking pattern (for positive
autocorrelation) or an oscillating pattern (for negative autocorrelation)?
Example: a model of Job Vacancies and the Unemployment Rate (Page 278, Exercise 12.3)
$\ln(JV)_t = \beta_1 + \beta_2 \ln(U)_t + e_t$
where JV is job vacancies and U is the unemployment rate.
Source            DF    Sum of Squares    Mean Square    F Value    Pr > F
Model              1           8.72001        8.72001     107.36    <.0001
Error             22           1.78687        0.08122
Corrected Total   23          10.50688

Root MSE          0.28499    R-Square    0.8299
Dependent Mean    0.63427    Adj R-Sq    0.8222
Coeff Var        44.93266

                     Parameter Estimates
              Parameter    Standard
Variable  DF   Estimate       Error    t Value    Pr > |t|
Intercept  1    3.50270     0.28288      12.38      <.0001
lu         1   -1.61159     0.15554     -10.36      <.0001

The estimated model:
$\widehat{\ln(JV)}_t = 3.503 - 1.612 \ln(U)_t$
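For the visual inspection described above, one way to get the residual plot is the sketch below, which assumes the dataset and variable names (one, ljv, lu, time, ehat) from the full SAS program at the end of this section:

proc reg data=one;
  model ljv = lu;
  output out=stuff residual=ehat;   * save the residuals for plotting;
run;

proc gplot data=stuff;
  plot ehat*time / vref=0;          * residuals vs. time with a zero line;
run;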
12.10
2) Formal Test: Durbin-Watson Test
This test is based on the residuals from the least squares regression.
(remember that our test for heteroskedasticity was also based on the
residuals from a least squares regression)
If the error term has first-order serial correlation, $e_t = \rho e_{t-1} + v_t$,
⇒ the residuals at t and t-1 ought to be correlated.
Ho: $\rho = 0$
H1: $\rho > 0$ (positive autocorrelation is more likely in economics)
The Durbin-Watson test statistic is used to test this hypothesis. It is
constructed using the least squares residuals. Specifically:
$d = \dfrac{\sum_{t=2}^{T} (\hat{e}_t - \hat{e}_{t-1})^2}{\sum_{t=1}^{T} \hat{e}_t^2}$
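As an aside, the d-statistic can also be computed by hand from saved residuals. A minimal sketch, assuming the residual dataset stuff (variable ehat) created by the program at the end of this section:

data dwcalc;
  set stuff end=last;
  ehat1 = lag(ehat);                          * residual from the previous period;
  if ehat1 ne . then num + (ehat - ehat1)**2; * numerator sum, t = 2..T;
  den + ehat**2;                              * denominator sum, t = 1..T;
  if last then do;
    d = num / den;
    put 'Durbin-Watson d = ' d;
  end;
run;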
12.11
The d statistic can be simplified into an expression involving the sample correlation $\hat{\rho}$ between the residuals at t and t-1:
$d = \dfrac{\sum_{t=2}^{T} \hat{e}_t^2}{\sum_{t=1}^{T} \hat{e}_t^2} + \dfrac{\sum_{t=2}^{T} \hat{e}_{t-1}^2}{\sum_{t=1}^{T} \hat{e}_t^2} - \dfrac{2 \sum_{t=2}^{T} \hat{e}_t \hat{e}_{t-1}}{\sum_{t=1}^{T} \hat{e}_t^2} \approx 1 + 1 - 2\hat{\rho} = 2 - 2\hat{\rho}$
Note that if there is no autocorrelation, then $\rho = 0$, so that $\hat{\rho}$ should also be around 0, implying a d-statistic of 2.
If $\rho = 1$ ⇒ d ≈ 0
If $\rho = -1$ ⇒ d ≈ 4
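As a quick worked check, using the job-vacancy results reported later in this section (d = 1.090 and $\hat{\rho}$ = 0.432):

$d \approx 2 - 2\hat{\rho} = 2 - 2(0.432) = 1.136$

which is close to the exact d = 1.090; the approximation differs only because of the end terms in the sums.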
The question then becomes: “How far below 2 must the d-statistic be to
say that there is positive autocorrelation?” and “How far above 2 must
the d-statistic be to say that there is negative autocorrelation?”
12.12
Typically we want to compare our test statistic to a critical value
to determine whether or not the data reject the null hypothesis.
The probability distribution for the d-statistic does not have a convenient, well-known form such as the t or the F. Instead, its distribution depends on the values of the explanatory variables. For this reason, the best we can do is tie down a lower and upper bound for the critical d values.
See Table 5, pg 393-396.
12.13
Suppose T = 24 observations are used to estimate a model with one independent variable and an intercept, so k = 2.
1) The test:
Ho: $\rho = 0$
H1: $\rho > 0$
2) Calculate the d-statistic according to the formula on slide 12.11.
3) Conduct the test:
If d < 1.273 ⇒ reject Ho
If d > 1.446 ⇒ fail to reject Ho
If 1.273 < d < 1.446 ⇒ inconclusive
[Number line: 0 … dL = 1.273 … dU = 1.446 … 2 … 4]
Example: Test the model of job vacancies.
For this model T = 24 and k = 2 ⇒ we can use the dL and dU critical values from slide 12.13.
To calculate the Durbin-Watson d-statistic, we have SAS do so by adding the dw option to the model statement:

proc reg;
  model ljv = lu / dw;
run;
The REG Procedure
Model: MODEL1
Dependent Variable: ljv

Durbin-Watson D               1.090
Number of Observations           24
1st Order Autocorrelation     0.432
Conclusion: reject Ho because d = 1.09 < 1.273
12.14
How to Correct for Autocorrelation
12.15
1) It is quite possible that the error in a regression equation appears to be autocorrelated due to an omitted variable. Recall that omitted variables "end up" in the error term. If the omitted variable is correlated over time (which is true of many economic time-series), then the residuals will appear to track ⇒ correct the problem by reformulating the model (include the omitted variable).
2) Generalized Least Squares
As with the problem of a heteroskedastic error, we will take our model that has an autocorrelated error and transform it into a model that has a well-behaved (serially uncorrelated) error.
The original model:
$y_t = \beta_1 + \beta_2 x_t + e_t$  where  $e_t = \rho e_{t-1} + v_t$
Algebraic manipulations: lag the model one period, multiply it by $\rho$, and subtract it from the original model. The autocorrelated error drops out, leaving:
$y_t - \rho y_{t-1} = \beta_1 (1 - \rho) + \beta_2 (x_t - \rho x_{t-1}) + v_t$
where $v_t$ is a "well-behaved" error that is serially uncorrelated.
12.16
Construct new variables:
$y_t^* = y_t - \rho y_{t-1}$
$x_{1t}^* = (1 - \rho)$
$x_{2t}^* = x_t - \rho x_{t-1}$
These variables are sometimes called "generalized differences".
We will then estimate this model using the new variables:
$y_t^* = \beta_1 x_{1t}^* + \beta_2 x_{2t}^* + v_t$
Note that $x_1^*$ is really a constant, not a variable. The intercept $\beta_1$ has always been multiplied by 1, and now it is multiplied by $(1 - \rho)$.
12.17
The problem is what to use for $\rho$, since it is unknown. There are many different ways of estimating $\rho$.
12.18
1) All methods begin with the residuals from least squares, the same residuals used to construct the Durbin-Watson test statistic:
$\hat{\rho} = \dfrac{\sum_{t=2}^{T} \hat{e}_t \hat{e}_{t-1}}{\sum_{t=1}^{T} \hat{e}_t^2}$
2) Use this estimate of $\rho$ to construct the generalized differences according to the formulas on the previous slide for $y^*$, $x_1^*$ and $x_2^*$.
3) Run least squares using these generalized differences.
4) (Cochrane-Orcutt's Iterative Procedure) [a.k.a. Yule-Walker Method] From step 3), take the residuals from this regression and repeat steps 1) - 3). Each time you get new estimates of $\beta_1$ and $\beta_2$. Continue to iterate until the values of the estimates converge. A sketch of one pass of steps 2) and 3) follows.
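A minimal SAS sketch of one pass of steps 2) and 3), assuming the residual dataset stuff created by the program at the end of this section and plugging in $\hat{\rho}$ = 0.432 from the AUTOREG output below (in practice $\hat{\rho}$ would be computed, not hard-coded):

* Step 2 - construct the generalized differences with rho-hat = 0.432;
data gls;
  set stuff;
  ystar  = ljv - 0.432*lag(ljv);   * y(t) - rho*y(t-1);
  x1star = 1 - 0.432;              * the transformed intercept term;
  x2star = lu - 0.432*lag(lu);     * x(t) - rho*x(t-1);
run;

* Step 3 - least squares on the transformed model, with no separate
  intercept because x1star plays that role;
proc reg data=gls;
  model ystar = x1star x2star / noint;
run;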
The AUTOREG Procedure
Dependent Variable: ljv

Ordinary Least Squares Estimates
SSE                1.78686627    DFE                    22
MSE                   0.08122    Root MSE          0.28499
SBC                12.1229868    AIC            9.76687918
Regress R-Square       0.8299    Total R-Square     0.8299
Durbin-Watson          1.0896

                        Standard              Approx
Variable   DF  Estimate    Error    t Value  Pr > |t|
Intercept   1    3.5027   0.2829      12.38    <.0001
lu          1   -1.6116   0.1555     -10.36    <.0001

Estimates of Autocorrelations
Lag    Covariance    Correlation
  0        0.0745       1.000000
  1        0.0322       0.431840

Preliminary MSE    0.0606

Estimates of Autoregressive Parameters
                        Standard
Lag    Coefficient         Error    t Value
  1      -0.431840      0.196822      -2.19

Yule-Walker Estimates
SSE                 1.4379184    DFE                    21
MSE                   0.06847    Root MSE          0.26167
SBC                10.2930273    AIC            6.75886582
Regress R-Square       0.8853    Total R-Square     0.8631
Durbin-Watson          2.0166

                        Standard              Approx
Variable   DF  Estimate    Error    t Value  Pr > |t|
Intercept   1    3.5138   0.2437      14.42    <.0001
lu          1   -1.6162   0.1269     -12.73    <.0001
12.19
These results will
be discussed in class.
12.20
options ls=78;
options formdlim='*';
goptions reset=all;

data one;
  infile 'c:\my documents\classes\UE\datafiles\vacan.dat' firstobs=2;
  input jv u;
  time = _n_;
  ljv = log(jv);
  lu = log(u);
run;

symbol1 i=none c=red v=dot h=.5;
symbol2 c=black i=join l=1;

proc gplot;
  plot ljv * lu = 1;
run;

* Use PROC AUTOREG with the dwprob option to get p-values for the DW statistic;
proc autoreg;
  model ljv = lu / dwprob;
  output out=stuff residual=ehat predicted=ljv_hat;
run;

proc gplot data=stuff;
  plot ljv*lu=1 ljv_hat*lu=2 / overlay legend;  * 'mortg' in the original is presumably a typo for lu;
  plot ehat*time=1 / vref=0;
run;

* Using PROC AUTOREG with the / nlag=1 option in the model statement
  will estimate the model and correct for first-order autocorrelation
  in the errors;
proc autoreg;
  model ljv = lu / nlag=1;
run;