Research Method
Lecture 11-1 (Ch15)
Instrumental Variables Estimation and Two Stage Least Squares
1
Motivation
One explanatory variable case
Consider the following regression.
$$\log(wage) = \beta_0 + \beta_1\,educ + \underbrace{(\beta_2\,abil + e)}_{u}$$
Since ability is not observed, we can only
run the following regression.
$$\log(wage) = \beta_0 + \beta_1\,educ + u$$
Since ability is correlated with educ, educ
is endogenous (i.e., correlated with u).
Thus, the OLS estimator $\hat{\beta}_1$ will be biased.
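The size of this bias can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the coefficients, the distributions, and the helper name `ols_slope` are all made up for the example and do not come from the lecture data. With these assumed values, the omitted-variable formula predicts an OLS slope near 0.08 + 0.10 × 2/5 = 0.12 rather than the true 0.08.

```python
import random

def ols_slope(y, x):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx

# All numbers below are assumed illustration values, not estimates from real data.
rng = random.Random(0)
n = 20000
beta0, beta1, beta2 = 0.5, 0.08, 0.10

abil = [rng.gauss(0, 1) for _ in range(n)]            # unobserved ability
educ = [12 + 2 * a + rng.gauss(0, 1) for a in abil]   # educ is correlated with abil
lwage = [beta0 + beta1 * s + beta2 * a + rng.gauss(0, 0.3)
         for s, a in zip(educ, abil)]

# Regress log(wage) on educ only, leaving abil in the error term u.
slope_short = ols_slope(lwage, educ)
print(f"true beta1 = {beta1}, OLS slope without abil = {slope_short:.3f}")
```

Because ability raises both education and wages in this assumed setup, the short regression overstates the return to education.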
2
We learned two methods to eliminate the
bias.
(1) Plug in a proxy variable for ability,
such as IQ.
(2) Use a panel data method (either the fixed
effects or the first-differenced model).
The instrumental variable (IV) method is another
way to eliminate the bias.
3
Instrumental variable method:
One explanatory variable case.
Consider the following model.
$$y = \beta_0 + \beta_1 x + u$$
Suppose that x is endogenous, that is, Cov(x,u)≠0.
Further, suppose that you have another
variable, z, which satisfies the following
conditions.
Cov(z,u)=0 (instrument exogeneity) ... (1)
Cov(z,x)≠0 (instrument relevance) ... (2)
If the above conditions are satisfied, we
call z an instrumental variable.
4
There are two ways to intuitively
understand these conditions.
1. An instrumental variable is a variable that is
not correlated with the omitted variable,
but is correlated with the endogenous
explanatory variable.
2. An instrumental variable is a variable that
affects y only through x.
5
The condition Cov(z,u)=0 involves the
unobserved u. Therefore, we cannot test
this condition. (When you have extra
instrumental variables, you can test it.
This will be discussed later.)
The condition Cov(z,x)≠0 is easy to test.
Just run the following OLS regression,
x = π0 + π1 z + v
then test
H0: π1 = 0
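The relevance check can be sketched in a few lines of code. This is a minimal sketch, assuming simulated data and the usual homoskedastic OLS standard error; the function name `first_stage` and the simulated coefficients are hypothetical and chosen only for illustration.

```python
import math
import random

def first_stage(x, z):
    """Regress x on z; return (pi0, pi1, se(pi1), t statistic for H0: pi1 = 0)."""
    n = len(x)
    x_bar = sum(x) / n
    z_bar = sum(z) / n
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    s_zx = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))
    pi1 = s_zx / s_zz
    pi0 = x_bar - pi1 * z_bar
    ssr = sum((xi - pi0 - pi1 * zi) ** 2 for xi, zi in zip(x, z))
    se_pi1 = math.sqrt(ssr / (n - 2) / s_zz)  # homoskedastic standard error
    return pi0, pi1, se_pi1, pi1 / se_pi1

# Assumed data-generating process: x = 1 + 0.5 z + v (illustration only).
rng = random.Random(0)
z = [rng.gauss(0, 1) for _ in range(2000)]
x = [1 + 0.5 * zi + rng.gauss(0, 1) for zi in z]

pi0_hat, pi1_hat, se_pi1, t_stat = first_stage(x, z)
print(f"pi1 = {pi1_hat:.3f}, se = {se_pi1:.3f}, t = {t_stat:.1f}")
```

A large absolute t statistic leads to rejecting H0: π1 = 0, which supports instrument relevance. It says nothing about instrument exogeneity.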
6
Instrumental variable estimation: One
explanatory variable-one instrument case
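Now, consider
$$y = \beta_0 + \beta_1 x + u$$
Then we have
$$\mathrm{Cov}(z,y) = \mathrm{Cov}(z,\ \beta_0 + \beta_1 x + u)$$
Since β0 is a constant, we have
$$\mathrm{Cov}(z,y) = \beta_1 \mathrm{Cov}(z,x) + \mathrm{Cov}(z,u)$$
Since Cov(z,u)=0 and Cov(z,x)≠0, we have
$$\beta_1 = \frac{\mathrm{Cov}(z,y)}{\mathrm{Cov}(z,x)} \qquad (3)$$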
7
By replacing Cov(z,y) and Cov(z,x) with
their sample covariances, we have the
instrumental variable estimator of β1,
which is given by
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (z_i - \bar{z})(y_i - \bar{y})}{\sum_{i=1}^{n} (z_i - \bar{z})(x_i - \bar{x})} \qquad (4)$$
You can easily show that $\hat{\beta}_1$ is a consistent
estimator of β1.
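Estimator (4) is short enough to compute directly. The sketch below is an illustration only: the function name `iv_slope` and the simulated model (true slope 0.5, with an omitted factor `a` that enters both x and u) are assumptions made for the example. Passing x as its own instrument reproduces the OLS slope, which gives a convenient comparison.

```python
import random

def iv_slope(y, x, z):
    """IV estimator (4): ratio of the sample covariances of (z, y) and (z, x)."""
    n = len(z)
    x_bar, y_bar, z_bar = sum(x) / n, sum(y) / n, sum(z) / n
    num = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    den = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))
    return num / den

# Assumed model: y = 1 + 0.5 x + u with u = a + e, where a also shifts x.
rng = random.Random(1)
n = 20000
z = [rng.gauss(0, 1) for _ in range(n)]   # instrument, independent of a and e
a = [rng.gauss(0, 1) for _ in range(n)]   # omitted factor
x = [zi + ai + rng.gauss(0, 1) for zi, ai in zip(z, a)]
y = [1 + 0.5 * xi + ai + rng.gauss(0, 0.5) for xi, ai in zip(x, a)]

b_iv = iv_slope(y, x, z)    # close to 0.5 in large samples
b_ols = iv_slope(y, x, x)   # x as its own instrument gives the OLS slope
print(f"IV: {b_iv:.3f}   OLS: {b_ols:.3f}")
```

In this assumed setup the OLS slope converges to 0.5 + 1/3, while the IV slope converges to the true value 0.5.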
8
Statistical inference with IV:
Homoskedasticity case
The homoskedasticity assumption in the case
of IV regression is stated in terms of z.
$$E(u^2 \mid z) = \sigma^2$$
It can be shown that the asymptotic
variance of $\hat{\beta}_1$ is given by:
$$\mathrm{var}(\hat{\beta}_1) = \frac{\sigma^2}{n\,\sigma_x^2\,\rho_{x,z}^2} \qquad (5)$$
where $\sigma_x^2$ is the variance of x, and
$\rho_{x,z}^2$ is the square of the
correlation between x and z.
9
Now, the estimator of $\mathrm{var}(\hat{\beta}_1)$ is obtained
by replacing $\sigma^2$, $\sigma_x^2$, and $\rho_{x,z}^2$ with their
sample estimates.
The sample estimator of $\sigma^2$ is obtained in the
following way. First, obtain the IV
estimates for β0 and β1, then compute
$$\hat{u}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \qquad (6)$$
The estimator for $\sigma^2$ is then computed as
$$\hat{\sigma}^2 = \frac{1}{n-2} \sum_{i=1}^{n} \hat{u}_i^2 \qquad (7)$$
10
The sample estimator for $\sigma_x^2$ is given as:
$$\hat{\sigma}_x^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{SST_x}{n} \qquad (8)$$
Finally, the sample estimator for $\rho_{x,z}^2$ can be
most easily obtained in the following way.
First, regress x on z. Then the R-squared
from this regression equals the square of
the sample correlation. Call this $R^2_{x,z}$.
(Of course, you can also compute the sample correlation
and square it. You will get the same result.)
11
Then, the estimator for the variance of $\hat{\beta}_1$
is given by:
$$\widehat{\mathrm{var}}(\hat{\beta}_1) = \frac{\hat{\sigma}^2}{SST_x \cdot R^2_{x,z}} \qquad (9)$$
You can show that this is a consistent
estimator of the asymptotic variance
given by (5).
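Steps (6), (7), and (9) can be put together as follows. This is a minimal sketch under the homoskedasticity assumption; the function name `iv_fit` and the simulated data are hypothetical. The divisor n − 2 follows equation (7); software may use a different degrees-of-freedom convention, so small differences from packaged output are possible.

```python
import math
import random

def iv_fit(y, x, z):
    """IV estimates of (beta0, beta1) and the standard error of beta1 from (9)."""
    n = len(y)
    x_bar, y_bar, z_bar = sum(x) / n, sum(y) / n, sum(z) / n
    s_zy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    s_zx = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))
    sst_x = sum((xi - x_bar) ** 2 for xi in x)
    sst_z = sum((zi - z_bar) ** 2 for zi in z)

    b1 = s_zy / s_zx
    b0 = y_bar - b1 * x_bar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for yi, xi in zip(y, x))  # IV residuals, (6)
    sigma2_hat = ssr / (n - 2)                                   # equation (7)
    r2_xz = s_zx ** 2 / (sst_x * sst_z)       # R-squared from regressing x on z
    se_b1 = math.sqrt(sigma2_hat / (sst_x * r2_xz))              # equation (9)
    return b0, b1, se_b1

# Assumed simulated data (illustration only): true slope is 0.5.
rng = random.Random(1)
n = 20000
z = [rng.gauss(0, 1) for _ in range(n)]
a = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + ai + rng.gauss(0, 1) for zi, ai in zip(z, a)]
y = [1 + 0.5 * xi + ai + rng.gauss(0, 0.5) for xi, ai in zip(x, a)]

b0_hat, b1_hat, se_hat = iv_fit(y, x, z)
print(f"beta1 = {b1_hat:.3f} (se {se_hat:.4f})")
```

Note that the residuals use the actual x, not fitted values from a first-stage regression.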
12
Note: R-squared in IV regression
The R-squared for IV regression is
computed as
$$R^2 = 1 - \frac{SSR}{SST}$$
where SSR is the sum of the squared IV
residuals. (The IV residual is given by (6).)
Unlike in the case of OLS, SSR can be greater
than SST. Thus, $R^2$ can be negative. In IV
regression, $R^2$ does not have a natural
interpretation.
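A tiny numerical example shows how a negative R-squared can arise. The four data points below are made up purely for illustration, and `iv_r_squared` is a hypothetical helper name. The instrument is only weakly related to x here, so the IV line fits the sample far worse than a horizontal line at the mean of y.

```python
def iv_r_squared(y, x, z):
    """R-squared = 1 - SSR/SST, where SSR uses the IV residuals."""
    n = len(y)
    x_bar, y_bar, z_bar = sum(x) / n, sum(y) / n, sum(z) / n
    s_zy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    s_zx = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))
    b1 = s_zy / s_zx
    b0 = y_bar - b1 * x_bar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for yi, xi in zip(y, x))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ssr / sst

# Made-up toy data: IV slope is -1 even though x and y move together.
y = [0, 1, 3, 3]
x = [0, 1, 2, 4]
z = [1, 0, 0, 1]

r2_iv = iv_r_squared(y, x, z)
print(f"IV R-squared = {r2_iv:.3f}")  # negative for this toy data
```

For these numbers SSR = 29 and SST = 6.75, so the formula gives a value well below zero.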
13
Finding the instrumental variable
The most difficult part of instrumental
variable estimation is finding suitable
instrumental variables.
Consider the following regression
$$\log(wage) = \beta_0 + \beta_1\,educ + \underbrace{(\beta_2\,abil + e)}_{u}$$
Then, you have to find a z that is correlated
with educ, but not correlated with abil.
What can z be?
14
Consider the father’s education. Perhaps a
person whose father is highly educated tends to
obtain more education as well. So the father’s
education is likely correlated with educ.
But, for father’s education to be an instrument,
it should not be correlated with the
unobserved ability. A highly educated father
may nurture his child better, so father’s
education may be correlated with the
unobserved ability. If this is the case, father’s
education is not a good instrument.
Nonetheless, many studies have used father’s
and mother’s education as instruments.
15
Exercises
1. Run the following regression using OLS,
using MROZ.dat.
$$\log(wage) = \beta_0 + \beta_1\,educ + u$$
2. Using the father’s education as an
instrument for educ, estimate the same
model using IV regression. Also check if
father’s education is correlated with educ.
16
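OLS

. reg lwage educ

      Source |       SS       df       MS              Number of obs =     428
-------------+------------------------------           F(  1,   426) =   56.93
       Model |  26.3264193     1  26.3264193           Prob > F      =  0.0000
    Residual |  197.001022   426  .462443713           R-squared     =  0.1179
-------------+------------------------------           Adj R-squared =  0.1158
       Total |  223.327441   427  .523015084           Root MSE      =  .68003

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .1086487   .0143998     7.55   0.000     .0803451    .1369523
       _cons |  -.1851968   .1852259    -1.00   0.318    -.5492673    .1788736
------------------------------------------------------------------------------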
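IV regression

. ivregress 2sls lwage (educ= fatheduc)

Instrumental variables (2SLS) regression               Number of obs =     428
                                                       Wald chi2(1)  =    2.85
                                                       Prob > chi2   =  0.0914
                                                       R-squared     =  0.0934
                                                       Root MSE      =  .68778

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0591735   .0350596     1.69   0.091     -.009542     .127889
       _cons |   .4411034   .4450583     0.99   0.322    -.4311947    1.313402
------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   fatheduc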
17
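Check if father’s education is correlated with educ.

                                                       Number of obs =     428
                                                       F(  1,   426) =   87.12
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1726
                                                       Adj R-squared =  0.1706
                                                       Root MSE      =  2.0813

------------------------------------------------------------------------------
             |               Robust
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    fatheduc |   .2694416   .0288675     9.33   0.000     .2127013     .326182
       _cons |   10.23705   .2718861    37.65   0.000     9.702646    10.77146
------------------------------------------------------------------------------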
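The regression of educ on fatheduc is also the first stage of two stage least squares. In the one-instrument case, regressing y on the first-stage fitted values gives the same slope as the covariance-ratio IV estimator. The sketch below checks this identity on made-up numbers; the data and the helper names `simple_ols`, `two_stage_slope`, and `iv_slope` are hypothetical and are not taken from MROZ.dat.

```python
def simple_ols(y, x):
    """Return (intercept, slope) from an OLS regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope

def two_stage_slope(y, x, z):
    """Stage 1: regress x on z. Stage 2: regress y on the fitted x."""
    pi0, pi1 = simple_ols(x, z)
    x_hat = [pi0 + pi1 * zi for zi in z]
    _, b1 = simple_ols(y, x_hat)
    return b1

def iv_slope(y, x, z):
    """Covariance-ratio IV estimator."""
    n = len(z)
    x_bar, y_bar, z_bar = sum(x) / n, sum(y) / n, sum(z) / n
    return (sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
            / sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x)))

# Made-up toy values for illustration only.
fatheduc = [8, 12, 10, 16, 12, 14]
educ = [10, 12, 12, 16, 13, 14]
lwage = [1.8, 2.1, 2.0, 2.9, 2.2, 2.6]

b_2sls = two_stage_slope(lwage, educ, fatheduc)
b_iv = iv_slope(lwage, educ, fatheduc)
print(b_2sls, b_iv)  # the two slopes agree up to rounding error
```

Only the point estimate carries over from the manual second stage. Its reported standard errors are not valid, because its residuals are computed with the fitted values rather than the actual x.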
18
An application
Angrist and Krueger (1991), “Does
Compulsory School Attendance Affect
Schooling and Earnings?”
They used quarter-of-birth dummies
as instruments for education to estimate
the effect of education on wages.
19
In the US, the compulsory schooling law
requires students to remain in school until
their 16th birthday.
At the same time, schools usually require
children to be 6 years old on January 1st to
be admitted to school. Therefore, children
who were born in the first quarter were
older than children who were born in the
last quarter when they were first admitted
to school (6.45 vs. 6.07 years).
20
This also means that children who were
born in the first quarter of the year have
had less schooling when they reach the
legal dropout age. So, children who were
born in the first quarter can legally drop
out of school with less education than
children who were born in other quarters.
If some people want to take as little
education as possible but are constrained
by the compulsory schooling law, the
quarter of birth should affect
educational attainment.
21
At the same time, the quarter of birth is
unlikely to be correlated with the
unobserved ability.
Therefore, the dummy variable indicating
if a person was born in the first quarter of
the year is a good instrument for
education.
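When the instrument is a 0/1 dummy such as "born in the first quarter", the covariance-ratio IV estimator reduces to a ratio of differences in group means (often called the Wald estimator): the difference in mean log wage between the two groups divided by the difference in mean education. The sketch below checks this on four made-up observations; the data and function names are hypothetical and are not the Angrist and Krueger data.

```python
def mean(values):
    return sum(values) / len(values)

def wald_estimate(y, x, z):
    """Difference in mean y over difference in mean x, between z = 1 and z = 0."""
    y1 = [yi for yi, zi in zip(y, z) if zi == 1]
    y0 = [yi for yi, zi in zip(y, z) if zi == 0]
    x1 = [xi for xi, zi in zip(x, z) if zi == 1]
    x0 = [xi for xi, zi in zip(x, z) if zi == 0]
    return (mean(y1) - mean(y0)) / (mean(x1) - mean(x0))

def iv_slope(y, x, z):
    """Covariance-ratio IV estimator."""
    x_bar, y_bar, z_bar = mean(x), mean(y), mean(z)
    return (sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
            / sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x)))

# Made-up toy data: first_quarter = 1 if born in the first quarter of the year.
first_quarter = [1, 1, 0, 0]
educ = [11, 12, 12, 13]
lwage = [2.0, 2.2, 2.3, 2.5]

b_wald = wald_estimate(lwage, educ, first_quarter)
b_iv = iv_slope(lwage, educ, first_quarter)
print(b_wald, b_iv)  # both are about 0.3 for these toy numbers
```

In the toy data the first-quarter group has one year less education and 0.3 lower mean log wage, so both formulas return about 0.3.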
22
Those born in the first quarter of the year
tend to have lower education attainment
23
This is the IV
regression
24