Some Useful Econometric Techniques
Selcuk Caner
7/21/2015
Outline

• Descriptive Statistics
• Ordinary Least Squares
• Regression Tests and Statistics
• Violation of Assumptions in OLS Estimation
  – Multicollinearity
  – Heteroscedasticity
  – Autocorrelation
• Specification Errors
• Forecasting
• Unit Roots, Spurious Regressions, and Cointegration
Descriptive Statistics

• Useful estimators summarizing the probability distribution of a variable:

  – Mean:
$$\bar{X} = \frac{1}{T}\sum_{i=1}^{T} X_i$$

  – Standard deviation:
$$\sigma = \sqrt{\frac{1}{T}\sum_{i=1}^{T}\left(X_i - \bar{X}\right)^2}$$
Descriptive Statistics (Cont.)

• Skewness (symmetry):
$$S = \frac{1}{T\,\sigma^3}\sum_{i=1}^{T}\left(X_i - \bar{X}\right)^3$$

• Kurtosis (thickness):
$$K = \frac{1}{T\,\sigma^4}\sum_{i=1}^{T}\left(X_i - \bar{X}\right)^4$$

(A short code sketch of these four estimators follows below.)
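As an illustration (not part of the original slides), here is a minimal Python sketch of the four estimators above using NumPy; the simulated sample x is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)      # illustrative sample of T = 200 observations

T = x.size
mean = x.sum() / T                                  # mean
sigma = np.sqrt(((x - mean) ** 2).sum() / T)        # standard deviation (1/T version, as on the slide)
skew = ((x - mean) ** 3).sum() / (T * sigma**3)     # skewness
kurt = ((x - mean) ** 4).sum() / (T * sigma**4)     # kurtosis (about 3 for a normal sample)

print(mean, sigma, skew, kurt)
```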
Ordinary Least Squares (OLS)

• Estimation
  – Model:
$$Y_t = \beta_0 + \beta_1 X_{1t} + e_t$$
  – OLS requires:
    • a linear relationship between Y and X,
    • X nonstochastic,
    • E(e_t) = 0, Var(e_t) = σ², and Cov(e_t, e_s) = 0 for t ≠ s.
Ordinary Least Squares (OLS) (Cont.)

• The OLS estimators of β_0 and β_1 are found by minimizing the sum of squared errors (SSE):
$$SSE = \sum_{t=1}^{T} e_t^2 = \sum_{t=1}^{T}\left(Y_t - \hat{Y}_t\right)^2 = \sum_{t=1}^{T}\left(Y_t - \hat{\beta}_0 - \hat{\beta}_1 X_t\right)^2$$
Ordinary Least Squares (OLS) (Cont.)

[Figure: scatter plot of the observations (X_t, Y_t) with the fitted OLS regression line Ŷ_t.]
Ordinary Least Squares (OLS) (Cont.)

• Minimizing the SSE is equivalent to solving the first-order conditions:
$$\frac{\partial \sum_{t=1}^{T} e_t^2}{\partial \beta_0} = 0, \qquad \frac{\partial \sum_{t=1}^{T} e_t^2}{\partial \beta_1} = 0$$

• The estimators are:
$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}, \qquad \hat{\beta}_1 = \frac{Cov(X, Y)}{Var(X)} = \frac{\sum_{t=1}^{T}\left(X_t - \bar{X}\right)\left(Y_t - \bar{Y}\right)}{\sum_{t=1}^{T}\left(X_t - \bar{X}\right)^2}$$

(A code sketch of these two estimators follows below.)
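A minimal NumPy sketch of the two estimators above (not from the original slides; x and y stand for any two equal-length data arrays):

```python
import numpy as np

def ols_simple(x: np.ndarray, y: np.ndarray) -> tuple[float, float]:
    """Return (beta0_hat, beta1_hat) for the regression y = beta0 + beta1 * x + e."""
    x_bar, y_bar = x.mean(), y.mean()
    beta1_hat = ((x - x_bar) * (y - y_bar)).sum() / ((x - x_bar) ** 2).sum()
    beta0_hat = y_bar - beta1_hat * x_bar
    return beta0_hat, beta1_hat
```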
Ordinary Least Squares (OLS) (Cont.)

• Properties of the OLS estimators:
  – β̂_0 and β̂_1 are unbiased estimators: E(β̂_0) = β_0 and E(β̂_1) = β_1.
  – They are normally distributed:
$$\hat{\beta}_0 \sim N\!\left(\beta_0, \sigma^2_{\hat{\beta}_0}\right), \qquad \hat{\beta}_1 \sim N\!\left(\beta_1, \sigma^2_{\hat{\beta}_1}\right)$$
  – They are minimum-variance unbiased estimators.
Ordinary Least Squares (OLS) (Cont.)
• Multiple regression, in matrix form:
$$Y = X\beta + e$$
  – Y = T×1 vector of the dependent variable
  – X = T×k matrix of independent variables (the first column is all ones)
  – β = k×1 vector of unknown parameters
  – e = T×1 vector of error terms
Ordinary Least Squares (OLS) (Cont.)
• Estimator of the multiple regression model:
$$\hat{\beta} = \left(X'X\right)^{-1} X'Y$$
  – X'X summarizes the variances and covariances of the components of X.
  – X'Y summarizes the covariances between X and Y.
  – β̂ is an unbiased estimator and is normally distributed:
$$\hat{\beta} \sim N\!\left(\beta, \sigma^2\left(X'X\right)^{-1}\right)$$

(A short code sketch of this matrix formula follows below.)
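A minimal NumPy sketch of the matrix formula above (not from the original slides; the simulated data and dimensions are illustrative — in practice a library such as statsmodels also reports standard errors and test statistics):

```python
import numpy as np

rng = np.random.default_rng(1)
T, k = 100, 3
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])  # first column is all ones
beta_true = np.array([1.0, 0.5, -2.0])
Y = X @ beta_true + rng.normal(size=T)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)         # (X'X)^{-1} X'Y via a linear solve
resid = Y - X @ beta_hat
sigma2_hat = resid @ resid / (T - k)                 # estimate of sigma^2
cov_beta_hat = sigma2_hat * np.linalg.inv(X.T @ X)   # estimated covariance matrix of beta_hat
print(beta_hat)
```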
Example: Private Investment

• FIR_t = b_0 + b_1 RINT_{t-1} + b_2 INFL_{t-1} + b_3 RGDP_{t-1} + b_4 NKFLOW_{t-1} + e_t
• One can run this regression to estimate private fixed investment (FIR) as:
  – a negative function of real interest rates (RINT),
  – a negative function of inflation (INFL),
  – a positive function of real GDP (RGDP),
  – a positive function of net capital flows (NKFLOW).
Regression Statistics and Tests

• R² is the measure of goodness of fit:
$$R^2 = \frac{SSR}{TSS} = \frac{\sum_{i=1}^{T}\left(\hat{Y}_i - \bar{Y}\right)^2}{\sum_{i=1}^{T}\left(Y_i - \bar{Y}\right)^2} = 1 - \frac{SSE}{TSS} = 1 - \frac{\sum_{i=1}^{T}\left(Y_i - \hat{Y}_i\right)^2}{\sum_{i=1}^{T}\left(Y_i - \bar{Y}\right)^2}$$
• Limitations:
  – R² depends on the assumption that the model is correctly specified.
  – R² is sensitive to the number of independent variables.
  – If the intercept is constrained to be equal to zero, then R² may be negative.
Meaning of R²

[Figure: scatter plot of (X_t, Y_t) with the fitted line Ŷ_t = β̂_0 + β̂_1 X_{1t} and a horizontal line at Ȳ, illustrating the explained and unexplained parts of the variation in Y.]
Regression Statistics and Tests
• The adjusted R² overcomes some limitations of R²:
$$\bar{R}^2 = 1 - \frac{SSE/(T-k)}{TSS/(T-1)}$$
• Is β_i statistically different from zero?
  – When e_t is normally distributed, use the t-statistic to test the null hypothesis β_i = 0:
$$t(T-k) = \frac{\hat{\beta}_i - \beta_i}{s_{\hat{\beta}_i}}$$
  – A simple rule: if |t(T-k)| > 2, then β̂_i is significant.
Regression Statistics and Tests

• Testing the model:
  – F-test: an F-statistic with k-1 and T-k degrees of freedom is used to test the null hypothesis β_1 = β_2 = … = β_k = 0.
  – The F-statistic is:
$$F_{(k-1,\,T-k)} = \frac{R^2\,(T-k)}{(k-1)\left(1-R^2\right)}$$
  – The F-test may allow the null hypothesis β_1 = β_2 = … = β_k = 0 to be rejected even when none of the coefficients is statistically significant by individual t-tests. (A small code sketch of this statistic follows below.)
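A minimal Python sketch of the F-statistic above and its p-value using SciPy's F distribution (not from the original slides; the numbers passed in are purely illustrative):

```python
from scipy import stats

def overall_f_test(r2: float, T: int, k: int) -> tuple[float, float]:
    """F-statistic for H0: all slope coefficients are zero, and its p-value."""
    F = (r2 * (T - k)) / ((k - 1) * (1 - r2))
    p_value = stats.f.sf(F, k - 1, T - k)   # upper-tail probability of F(k-1, T-k)
    return F, p_value

print(overall_f_test(r2=0.6, T=50, k=4))    # illustrative values
```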
Violations of OLS Assumptions

• Multicollinearity
  – Arises when two or more explanatory variables are correlated with each other (in the multivariable case), e.g.,
$$Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + e_t$$
  – Result: high standard errors for the parameters and statistically insignificant coefficients.
  – Indications:
    • Relatively high correlations between one or more pairs of explanatory variables.
    • High R² with few significant t-statistics. Why?
Violations of OLS Assumptions (Cont.)

• Why? As the explanatory variables become more highly correlated, X'X approaches singularity, so the coefficient variances
$$\sigma^2\left(X'X\right)^{-1} \rightarrow \infty$$
  and the t-ratios
$$\frac{\hat{\beta}_i}{\hat{\sigma}_{\hat{\beta}_i}} \rightarrow 0$$
Violations of OLS Assumptions (Cont.)

• Heteroscedasticity: the error terms do not have constant variance σ².
  – Consequences for the OLS estimators:
    • They are unbiased [E(β̂) = β] but not efficient: their variances are not the minimum variances.
  – Test: White's heteroscedasticity test (sketched in code below).
  – If there are ARCH effects, use GARCH models to account for volatility-clustering effects.
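A minimal sketch of White's test using statsmodels (not from the original slides; the simulated heteroscedastic data are illustrative):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

rng = np.random.default_rng(2)
x = rng.normal(size=200)
e = rng.normal(size=200) * (1 + np.abs(x))   # error variance grows with |x|: heteroscedastic
y = 1.0 + 0.5 * x + e

X = sm.add_constant(x)
fit = sm.OLS(y, X).fit()
lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(fit.resid, X)
print(lm_pvalue)   # a small p-value rejects the null of homoscedasticity
```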
Violations of OLS Assumptions (Cont.)

• Autocorrelation: the error terms from different time periods are correlated [e_t = f(e_{t-1}, e_{t-2}, …)].
  – Consequences for the OLS estimators:
    • They are unbiased [E(β̂) = β] but not efficient.
  – Test for serial correlation: the Durbin-Watson test for first-order serial correlation:
$$DW = \frac{\sum_{t=2}^{T}\left(\hat{e}_t - \hat{e}_{t-1}\right)^2}{\sum_{t=1}^{T}\hat{e}_t^2}$$
Violations of OLS Assumptions (Cont.)

• Autocorrelation (cont.):
  – Test for serial correlation (cont.):
    • Durbin-Watson statistic (cont.): for an AR(1) error process e_t = r_1 e_{t-1} + u_t, the DW statistic is approximately
$$DW \approx 2\left(1 - r_1\right) = 2\left(1 - \frac{Cov\left(\hat{e}_t, \hat{e}_{t-1}\right)}{Var\left(\hat{e}_t\right)}\right)$$
    • Note: if r_1 = 1 then DW = 0; if r_1 = -1 then DW = 4; for r_1 = 0, DW = 2.
  – Use the Ljung-Box Q test statistic for higher-order serial correlation. (Both tests are sketched in code below.)
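A minimal sketch of both tests with statsmodels (not from the original slides; the AR(1) errors and the regression are simulated for illustration):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(3)
T = 200
u = rng.normal(size=T)
e = np.zeros(T)
for t in range(1, T):              # AR(1) errors: e_t = 0.7 e_{t-1} + u_t
    e[t] = 0.7 * e[t - 1] + u[t]
x = rng.normal(size=T)
y = 1.0 + 0.5 * x + e

fit = sm.OLS(y, sm.add_constant(x)).fit()
print(durbin_watson(fit.resid))              # well below 2: positive first-order autocorrelation
print(acorr_ljungbox(fit.resid, lags=[4]))   # Ljung-Box Q statistic up to lag 4
```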
Specification Errors

• Omitted variables:
  – True model:
$$Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + e_t$$
  – Regression model:
$$Y_t = \beta_0 + \beta_1 X_{1t} + e_t$$
  – Then the estimator for β_1 is biased:
$$E\!\left(\hat{\beta}_1^{*}\right) = \beta_1 + \beta_2\,\frac{Cov\left(X_1, X_2\right)}{Var\left(X_1\right)}$$
Specification Errors (Cont.)

• Irrelevant variables:
  – True model:
$$Y_t = \beta_0 + \beta_1 X_{1t} + e_t$$
  – Regression model:
$$Y_t = \beta_0 + \beta_1^{*} X_{1t} + \beta_2^{*} X_{2t} + e_t$$
  – Then the estimator for β_1 is still unbiased. Only efficiency declines, since the variance of β̂_1* will be larger than the variance of β̂_1.
A Naïve Estimation

• Estimate the aggregate demand elasticity e using historical consumption data.
• Estimate the regression equation:
  – ln(QD_t) = a + b·ln(P_t)
  – b is an estimate of e
• Forecast the change in the consumption price, ΔP.
• Estimate the change in demand as:
  – (ΔQD/QD)^F = e · (ΔP/P)^F
A Regression Result
Dependent variable: Log QD

Variable     Coefficient    Std. Error    t-Statistic
C            -8.35          0.431         -19.4
Log CPI       1.295         0.031          42.3

R-squared         0.9895      AIC            -1.49
Log likelihood    17.68       Schwarz C.     -1.4
DW                0.726       F-statistic    1790.0
Stationarity (ADF Test)
Log CPI

Variable         Coefficient    Std. Error    t-Statistic    Prob.
Ln CPI(-1)       -0.006         0.010         -0.579         0.570
D(ln CPI(-1))     0.670         0.183          3.659         0.002
C                 0.128         0.151          0.847         0.410

Log QD

Variable         Coefficient    Std. Error    t-Statistic    Prob.
Ln QD(-1)        -0.024         0.025         -0.924         0.369
D(ln QD(-1))      0.162         0.241          0.674         0.510
C                 0.380         0.260          1.460         0.164

ADF critical values: 1% = -3.830, 5% = -3.029, 10% = -2.655

In both cases the ADF t-statistic on the lagged level is well above the critical values, so the null hypothesis of a unit root is not rejected: both series appear to be non-stationary in levels.
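A minimal sketch of how such an ADF test can be run with statsmodels (not a reproduction of the table above; the random-walk series is simulated purely for illustration):

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(4)
y = np.cumsum(rng.normal(size=200))     # illustrative random walk, so a unit root is expected

adf_stat, pvalue, usedlag, nobs, crit, icbest = adfuller(y, regression="c", autolag="AIC")
print(adf_stat, pvalue, crit)           # compare the statistic with the 1%/5%/10% critical values
```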
Error Correction Model (ECM) for Non-Stationarity

• One can try a regression in first differences.
• However, first differences do not use the information contained in the levels.
• A first-difference regression mixes the long-term relationship with the short-term changes.
• An error correction model (ECM) can separate the long-term and short-term relationships.
Results of ECM
Dependent variable: D(lnPIT)

Variable     Coefficient    Std. Error    t-Statistic
C            -5.327         1.426         -3.735
D(ln CPI)    -0.348         0.490         -0.709
lnQD(-1)     -0.697         0.175         -3.985
lnCPI(-1)     0.883         0.225          3.923

R-squared         0.551       AIC            -2.307
Log likelihood    27.063      Schwarz C.     -2.107
DW                1.946       F-statistic     6.538
Interpretation of the Estimated Regression

$$\ln QD_t - \ln QD_{t-1} = -5.327 - 0.348\left(\ln CPI_t - \ln CPI_{t-1}\right) - 0.697\left(\ln QD_{t-1} - 1.267\,\ln CPI_{t-1}\right)$$

  – Short-run effect: -0.348, the coefficient on the change in ln CPI.
  – Error-correction coefficient: -0.697, the speed of adjustment toward the long-run relationship.
  – Long-run effect: 1.267 (= 0.883 / 0.697), the long-run elasticity implied by the levels terms.

(A code sketch of estimating an ECM of this form follows below.)
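A minimal sketch of how an ECM of this form could be estimated with statsmodels (not from the original slides; lnQD and lnCPI are placeholder pandas Series standing in for the user's own log-level data):

```python
import pandas as pd
import statsmodels.api as sm

def estimate_ecm(lnQD: pd.Series, lnCPI: pd.Series):
    """Regress D(lnQD) on D(lnCPI) and the lagged levels; return the fit and the implied long-run elasticity."""
    X = pd.concat({"d_lnCPI": lnCPI.diff(),        # short-run change in ln CPI
                   "lnQD_lag": lnQD.shift(1),      # lagged level of ln QD
                   "lnCPI_lag": lnCPI.shift(1)},   # lagged level of ln CPI
                  axis=1)
    X = sm.add_constant(X).dropna()
    y = lnQD.diff().loc[X.index]                   # dependent variable: D(lnQD)
    fit = sm.OLS(y, X).fit()
    long_run = -fit.params["lnCPI_lag"] / fit.params["lnQD_lag"]   # e.g. 0.883 / 0.697 ≈ 1.267 above
    return fit, long_run
```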
Forecasting

• A forecast is a quantitative estimate about the likelihood of future events, developed on the basis of current and past information.
• Some useful definitions:
  – Point forecast: predicts a single number for Y in each forecast period.
  – Interval forecast: indicates an interval in which the realized value of Y will lie with a given probability.
Unconditional Forecasting

• First estimate the econometric model:
$$Y_t = \beta_0 + \beta_1 X_{1t} + e_t, \qquad e_t \sim N\!\left(0, \sigma^2\right)$$
• Then compute
$$\hat{Y}_{T+1} = \hat{\beta}_0 + \hat{\beta}_1 X_{1,T+1}$$
  assuming X_{T+1} is known. This is the point forecast.
Unconditional Forecasting (Cont.)

• The forecast error is:
$$\hat{e}_{T+1} = \hat{Y}_{T+1} - Y_{T+1} = \left(\hat{\beta}_0 - \beta_0\right) + \left(\hat{\beta}_1 - \beta_1\right) X_{T+1} - e_{T+1}$$
• The 95% confidence interval for Y_{T+1} is:
$$\hat{Y}_{T+1} - t_{0.025}\, s_f \le Y_{T+1} \le \hat{Y}_{T+1} + t_{0.025}\, s_f$$
  where
$$s_f^2 = \hat{\sigma}^2\left(1 + \frac{1}{T} + \frac{\left(X_{T+1} - \bar{X}\right)^2}{\sum_{t=1}^{T}\left(X_t - \bar{X}\right)^2}\right)$$
  which provides a measure of the precision of the forecast. (A code sketch follows below.)
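A minimal sketch of a point forecast and its 95% interval using statsmodels (not from the original slides; the data and the assumed known value X_{T+1} = 0.8 are illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
x = rng.normal(size=100)
y = 2.0 + 1.5 * x + rng.normal(size=100)

fit = sm.OLS(y, sm.add_constant(x)).fit()

x_new = np.array([[1.0, 0.8]])              # [constant, X_{T+1}] with X_{T+1} assumed known
pred = fit.get_prediction(x_new)
print(pred.predicted_mean)                  # point forecast for Y_{T+1}
print(pred.conf_int(obs=True, alpha=0.05))  # 95% interval for the realized Y_{T+1}
```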
Conditional Forecasting

• If X_{T+1} is not known, it also needs to be forecasted.
  – The stochastic nature of the predicted values for the Xs leads to forecasts that are less reliable.
  – The forecasted value of Y at time T+1 is
$$\hat{Y}_{T+1} = \hat{\beta}_0 + \hat{\beta}_1 X_{1,T+1}$$
    where X_{1,T+1} is itself a forecast.
Unit Roots, Spurious Regressions, and Cointegration

• Simulate the processes
$$Y_t = 10 + Y_{t-1} + e_t, \qquad e_t \sim N(0, 4)$$
  and
$$X_t = 20 + X_{t-1} + u_t, \qquad u_t \sim N(0, 9)$$

(A simulation sketch in code follows below.)
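A minimal NumPy/statsmodels sketch of this simulation (not from the original slides); regressing the two unrelated random walks on each other previews the spurious-regression symptom discussed next:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(6)
T = 200
e = rng.normal(scale=2.0, size=T)   # e_t ~ N(0, 4): standard deviation 2
u = rng.normal(scale=3.0, size=T)   # u_t ~ N(0, 9): standard deviation 3

Y = np.zeros(T)
X = np.zeros(T)
for t in range(1, T):
    Y[t] = 10 + Y[t - 1] + e[t]     # random walk with drift 10
    X[t] = 20 + X[t - 1] + u[t]     # independent random walk with drift 20

fit = sm.OLS(Y, sm.add_constant(X)).fit()
print(fit.rsquared, durbin_watson(fit.resid))   # typically R^2 > DW: the regression is spurious
```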
Unit Roots, Spurious Regressions, and Cointegration (Cont.)

• Spurious regressions:
  – Granger and Newbold (1974) demonstrated that macroeconomic variables are typically trended upwards, and that in regressions involving the levels of such data the standard significance tests are misleading: the conventional t and F tests reject the hypothesis of no relationship even when in fact there is none.
  – Symptom: R² > DW is a good rule of thumb for suspecting that the estimated regression is spurious.
Unit Roots, Spurious Regressions, and Cointegration (Cont.)

• Unit roots:
  – If a variable behaves like
$$Y_t = \alpha + Y_{t-1} + e_t$$
  – then its variance grows without bound, since by repeated substitution (with Y_0 = 0)
$$Y_t = \alpha t + \sum_{s=1}^{t} e_s, \qquad Var(Y_t) = t\,\sigma^2$$
  – This is a non-stationary variable. E.g.,
$$Y_t = 10 + Y_{t-1} + e_t$$
    where e_t ~ N(0, 4). This results in an ever-increasing series.
Unit Roots, Spurious Regressions, and Cointegration (Cont.)

• The series can be made stationary by taking the first difference of Y_t:
$$Y_t - Y_{t-1} = \alpha + e_t$$
• The differenced series has a finite variance and is a stationary variable. The original series Y_t is said to be integrated of order one, I(1).
Unit Roots, Spurious Regressions, and Cointegration (Cont.)

• A trend-stationary variable,
$$Y_t = \alpha + \beta t + e_t$$
  also has a finite variance.
• The process
$$Y_t = Y_{t-1} + e_t$$
  is non-stationary and does not have a finite variance.
Unit Roots, Spurious Regressions, and Cointegration (Cont.)

• But the variable
$$Y_t = r\,Y_{t-1} + e_t$$
  is stationary and has a finite variance if |r| < 1. E.g.,
$$Y_t = 0.8\,Y_{t-1} + e_t$$
  where e_t ~ N(0, 4).
Unit Roots, Spurious Regressions, and Cointegration (Cont.)

• Tests for unit roots: the Dickey-Fuller test
  – Case of I(1):
    • Null hypothesis: Y_t = α + Y_{t-1} + e_t
    • Alternative hypothesis: Y_t = α + βt + e_t
  – Run the regression
$$\Delta Y_t = \alpha + \beta t + (r - 1)\,Y_{t-1} + e_t$$
  – and test (r - 1) = 0 by comparing the t-statistic with the MacKinnon critical values for rejection of the hypothesis of a unit root.
Unit Roots, Spurious Regressions, and Cointegration (Cont.)

• Case of a random walk (RW):
  – Null hypothesis: Y_t = Y_{t-1} + e_t
  – Alternative hypothesis: Y_t = r Y_{t-1} + e_t with |r| < 1
  – Run the regression
$$\Delta Y_t = (r - 1)\,Y_{t-1} + e_t$$
  – and test (r - 1) = 0 by comparing the t-statistic with the MacKinnon critical values for rejection of the hypothesis of a unit root. (Both Dickey-Fuller regressions are sketched in code below.)
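A minimal sketch of the Dickey-Fuller regression itself (not from the original slides); the resulting t-statistic on the lagged level must be compared with Dickey-Fuller/MacKinnon critical values rather than standard t tables, which statsmodels' adfuller (used earlier) handles automatically:

```python
import numpy as np
import statsmodels.api as sm

def df_regression(y: np.ndarray, include_trend: bool = True) -> float:
    """Regress dY_t on Y_{t-1} (plus constant and trend if requested); return the t-stat on Y_{t-1}."""
    dy = np.diff(y)
    y_lag = y[:-1]
    if include_trend:
        trend = np.arange(1, len(y))
        X = sm.add_constant(np.column_stack([trend, y_lag]))   # constant, trend, Y_{t-1}
    else:
        X = y_lag.reshape(-1, 1)                               # pure random-walk case: Y_{t-1} only
    fit = sm.OLS(dy, X).fit()
    return fit.tvalues[-1]        # t-statistic on (r - 1)
```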
Unit Roots, Spurious Regressions, and Cointegration (Cont.)

• DF tests on macroeconomic variables:
  – Most macroeconomic flows and stocks related to the population size, such as output, consumption, or employment, are I(1), while price levels are I(2). E.g., GDP is I(1) while the price level (e.g., the CPI) is I(2).
Unit Roots, Spurious Regressions, and Cointegration (Cont.)

• Cointegration
  – If two series are both I(1), there may be a β_1 such that
$$e_t = Y_t - \beta_1 X_t$$
  – is I(0). The implication is that the two series are drifting upward together at roughly the same rate.
  – Two series satisfying this requirement are said to be cointegrated, and the vector [1, -β_1]' is a cointegrating vector. (A code sketch of a cointegration test follows below.)
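A minimal sketch of an Engle-Granger style cointegration check with statsmodels (not from the original slides; the simulated series are illustrative — coint regresses Y on X and applies an ADF-type test to the residuals):

```python
import numpy as np
from statsmodels.tsa.stattools import coint

rng = np.random.default_rng(7)
T = 300
x = np.cumsum(rng.normal(size=T))        # X_t is I(1)
y = 2.0 * x + rng.normal(size=T)         # Y_t shares the same stochastic trend: cointegrated

t_stat, pvalue, crit_values = coint(y, x)
print(t_stat, pvalue)                    # a small p-value supports cointegration
```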