The Why?, What?, and How? of Instrumental Variables
Download
Report
Transcript The Why?, What?, and How? of Instrumental Variables
Applied Econometrics
Instrumental Variable Approach
Nguyen Ngoc Anh
Nguyen Ha Trang
DEPOCEN
Topics That
Will Be Covered in this Workshop
Why use IV?
–
–
What is an IV?
–
–
Discussion of endogeneity bias
Statistical motivation for IV
Identification issues
Statistical properties of IV estimators
How is an IV model estimated?
–
–
Software and data examples
Diagnostics: IV relevance, IV exogeneity, Hausman
Review of the Linear Model (in
metrix algebra)
Population model: Y = α + βX + ε
–
Assume that the true slope is positive, so β > 0
Sample model: Y = a + bX + e
–
Least squares (LS) estimator of β:
bLS = (X′X)–1X′Y = Cov(X,Y) / Var(X)
Under what conditions can we speak of bLS
as a causal estimate of the effect of X on Y?
Review of the Linear Model
Key assumption of the linear model:
–
–
–
E(|x) = E( ) = 0 Cov(x, ) = E(x ) = 0
E(X′e) = Cov(X,e) = E(e | X) = 0
Exogeneity assumption = X is uncorrelated with the
unobserved determinants of Y
Important statistical property of the LS estimator
under exogeneity:
x x
x
x
y
ˆ
i
i
ˆ1
2
x x
x
x
i
i
1
i
1
2
i
So, E ˆ1 1
E(bLS) = β + Cov(X,e) / Var(X)
plim(bLS) = β + Cov(X,e) / Var(X)
Second terms 0,
so bLS unbiased
and consistent
Review of the Linear Model
When you regress Y on X, Y = β0 + β1X + ε and
the OLS estimate of β1 can be described as
bOLS
Cov Y , X
Cov X , X
Cov β0 β1X ε, X
Cov X , X
β1 Cov X , X Cov ε, X
Cov X , X
β1
Cov ε, X
Cov X , X
But since X and ε are correlated, bOLS does not
estimate β1 but some other quantity that depends on
the correlation of X and ε
Endogeneity and the Evaluation Problem
When is the exogeneity assumption violated?
–
–
–
Selection bias is the problem in observational
research that undermines causal inference
–
Measurement error → Attenuation bias
Instantaneous causation → Simultaneity bias
Omitted variables → Selection bias
Measurement error and instantaneous causation
can be posed as problems of omitted variables
Potential outcome approach!!!!
When Is the Exogeneity Assumption Violated?
Omitted variable (W) that is correlated with both
X and Y
–
Classic problem of omitted variables bias
Coefficient on X will absorb the indirect path through W,
whose sign depends on Cov(X,W) and Cov(W,Y)
X
W
Y
Things more complicated in applied
settings because there are bound
to be many W’s, not to mention that
the “smearing” problem applies in
this context also
Example #1: Police Hiring
Measurement error
–
Instantaneous causation
–
Mobilization of sworn officers (M.E. in X) as well
as differential victim reporting or crime recording
(M.E. in Y) may be correlated with police size
More police might be hired during a crime wave
Omitted variables
–
Large departments may differ in fundamental
ways difficult to measure (e.g., urban,
heterogeneous)
Example #2: Delinquent Peers
Measurement error
–
Instantaneous causation
–
Highly delinquent youth probably overestimate the
delinquency of their peers (M.E. in X), and likely
underestimate their own delinquency (M.E. in Y)
If there is influence/imitation, then it is bidirectional
Omitted variables
–
High-risk youth probably select themselves into
delinquent peer groups (“birds of a feather”)
Regression Estimation
Ignoring Omitted Variables
Suppose we estimate treatment effect model:
–
Y = α + βX + ε
Let’s assume without loss of generality that X is a
binary “treatment” (= 1 if treated; = 0 if untreated)
Least squares estimator:
–
bLS = Cov(X,Y) / Var(X) = E(Y | X = 1) – E(Y | X = 0)
Simply the difference in means between “treated”
units (X = 1) and “untreated” units (X = 0)
Estimating Treatment Effects
Consider treatment assignment (dummy variable) X and outcome
Y
Regress Y on X
Yi = β0 + β1Xi + εi
The estimate of β1 is just the difference between the mean Y for X
= 1 (the treatment group) and the mean Y for X = 0 (the control
group)
Y1 β0 β1 ε1
Y0 β0 ε0
Thus the OLS estimate is
Y1 Y0
= β1 +
1 0
Estimating Treatment Effects
(With Random Assignment)
If the treatment is randomly assigned, then X is uncorrelated with ε (X
is exogenous)
If X is uncorrelated with ε if and only if 1 0
But if 1 0 , then the mean difference is
Y1 Y0 = β1 + 1 0 = β1
This implies that standard methods (OLS) give an unbiased estimate of
β1, which is the average treatment effect
That is, the treatment-control mean difference is an unbiased estimate
of β1,
What goes wrong without randomization?
If we do not have randomization, there is no guarantee that X is
uncorrelated with ε (X may be endogenous)
Thus the OLS estimate is still
Y1 Y0
= β1 +
1 0
If X is correlated with ε, then
1 0
HenceY1 Y0 does not estimate β1, but some other quantity that
depends on the correlation of X and ε
If X is correlated with ε, then standard methods give a biased
estimate of β1
Omitted Variables
in applied research
What variables of interest to us are surely
endogenous?
–
–
Micro = Employment, education, marriage, military
service, fertility, conviction, family structure,....
Macro = Poverty, unemployment rate, collective
efficacy, immigrant concentration,....
Basically, EVERYTHING!
–
(I’m sorry ....... But it suck)
Potential outcome framework
Traditional Strategies
to Deal with Omitted Variables
Randomization (physical control)
Covariate adjustment (statistical control)
–
–
Control for potential W’s in a regression model
But...we have no idea how many W’s there are, so
model misspecification is still a real problem here
Quasi-Experimental Strategies
to Deal with Omitted Variables
Difference in differences (fixed-effects model)
–
Requires panel data
Propensity score matching
–
Requires a lot of measured background variables
Similar to covariate adjustment, but only the treated and
untreated cases which are “on support” are utilized
Instrumental variables estimation
–
Requires an exclusion restriction
Instrumental Variables
Estimation Is a Viable Approach
An “instrumental variable” for X is one solution
to the problem of omitted variables bias
Z
Requirements for Z to be a
valid instrument for X
–
X
W
Y
e
–
Relevant = Correlated with X
Exogenous = Not correlated
with Y but through its
correlation with X
Important Point about
Instrumental Variables Models
I often hear...“A good instrument should not
be correlated with the dependent variable”
–
Z has to be correlated with Y, otherwise it is
useless as an instrument
–
–
WRONG!!!
It can only be correlated with Y through X
(trong X có 2 phần, 1 phần dính với e một phần với
Y, muốn tận dụng phần dính với Y)
A good instrument must not be correlated
with the unobserved determinants of Y
Important Point about Instrumental
Variables Models
Not all of the available variation in X is used
–
Only that portion of X which is “explained” by
Z is used to explain Y
X
Y
Z
X = Endogenous variable
Y = Response variable
Z = Instrumental variable
Important Point about
Instrumental Variables Models
X
Y
Best-case scenario: A lot of
X is explained by Z, and
most of the overlap between
X and Y is accounted for
Y
Realistic scenario: Very
little of X is explained by Z,
or what is explained does
not overlap much with Y
Z
X
Z
Important Point about
Instrumental Variables Models
The IV estimator is BIASED
–
–
In other words, E(bIV) ≠ β (finite-sample bias)
The appeal of IV derives from its consistency
–
“Consistency” is a way of saying that E(b) → β as N → ∞
So…IV studies often have very large samples
But with endogeneity, E(bLS) ≠ β and plim(bLS) ≠ β
anyway
Asymptotic behavior of IV
–
plim(bIV) = β + Cov(Z,e) / Cov(Z,X)
If Z is truly exogenous, then Cov(Z,e) = 0
Instrumental
Variables Terminology
Three different models to be familiar with
–
–
–
First stage: X = α0 + α1Z + ω
Structural model: Y = β0 + β1X + ε
Reduced form: Y = δ0 + δ1Z + ξ
More on the Method of
Two-Stage Least Squares (2SLS)
Step 1: X = a0 + a1Z1 + a2Z2 + + akZk + u
–
Obtain fitted values (X̃) from the first-stage model
Step 2: Y = b0 + b1X̃ + e
–
–
Substitute the fitted X̃ in place of the original X
Note: If done manually in two stages, the standard
errors are based on the wrong residual
e = Y – b0 – b1X̃ when it should be e = Y – b0 – b1X
Best to just let the software do it for you
Some examples
Some examples
Including Control
Variables in an IV/2SLS Model
Control variables (W’s) should be entered into
the model at both stages
–
–
First stage: X = a0 + a1Z + a2W + u
Second stage: Y = b0 + b1X̃ + b2W + e
Control variables are considered “instruments,”
they are just not “excluded instruments”
–
They serve as their own instrument
Functional
Form Considerations with IV/2SLS
Binary endogenous regressor (X)
–
Binary response variable (Y)
–
Consistency of second-stage estimates do not
hinge on getting first-stage functional form correct
IV probit (or logit) is feasible but is technically
unnecessary
In both cases, linear model is tractable, easily
interpreted, and consistent
–
Although variance adjustment is well advised
Technical Conditions
Required for Model Identification
Order condition = At least the same # of IV’s
as endogenous X’s
–
–
Just-identified model: # IV’s = # X’s
Overidentified model: # IV’s > # X’s
Rank condition = At least one IV must be
significant in the first-stage model
–
Number of linearly independent columns in a matrix
E(X | Z,W) cannot be perfectly correlated with E(X | W)
Instrumental Variables
and Randomized Experiments
Imperfect compliance in randomized trials
–
Some individuals assigned to treatment group will
not receive Tx, and some assigned to control group
will receive Tx
–
Assignment error; subject refusal; investigator discretion
Some individuals who receive Tx will not change
their behavior, and some who do not receive Tx will
change their behavior
A problem in randomized job training studies and other
social experiments (e.g., housing vouchers)
Durbin-Wu-Hausman (DWH) Test
Balances the consistency of IV against the
efficiency of LS
–
–
H0: IV and LS both consistent, but LS is efficient
H1: Only IV is consistent
DWH test for a single endogenous regressor:
–
DWH = (bIV – bLS) / √(s2bIV – s2bLS) ~ N(0,1)
If |DWH| > 1.96, then X is endogenous and IV is
the preferred estimator despite its inefficiency
Durbin-Wu-Hausman (DWH) Test
A roughly equivalent procedure for DWH:
1. Estimate the first-stage model
2. Include the first-stage residual in the structural
model along with the endogenous X
3. Test for significance of the coefficient on residual
Note: Coefficient on endogenous X in this
model is bIV (standard error is smaller, though)
–
First-stage residual is a “generated regressor”
Software Considerations
Basic model specification in Stata
ivreg y (x = z) w [weight = wtvar], options
y = dependent variable
x = endogenous variable
z = instrumental variable
w = control variable(s)