Estimating Causal Effects: Non


Non-Experimental Data:
Natural Experiments
and more on IV
Non-Experimental Data
• Refers to all data that has not been collected as
part of experiment
• Quality of analysis depends on how well one can
deal with problems of:
– Omitted variables
– Reverse causality
– Measurement error
– Selection
• Or… how close one can get to experimental
conditions
Natural/ ‘Quasi’ Experiments
• Used to refer to situation that is not
experimental but is ‘as if’ it was
• Not a precise definition – saying your data
is a ‘natural experiment’ makes it sound
better
• Refers to case where variation in X is
‘good variation’ (directly or indirectly via
instrument)
• A Famous Example: London, 1854
The Case of the
Broad Street Pump
• Regular cholera epidemics in 19th century
London
• Widely believed to be caused by ‘bad air’
• John Snow thought ‘bad water’ was cause
• Experimental design would be to randomly
give some people good water and some
bad water
• Ethical Problems with this
Soho Outbreak
August/September 1854
• People closest to Broad Street Pump most
likely to die
• But breathe same air so does not resolve
air vs. water hypothesis
• Nearby workhouse had own well and few
deaths
• Nearby brewery had own well and no
deaths (workers all drank beer)
Why is this a Natural experiment?
• Variation in water supply ‘as if’ it had been
randomly assigned – other factors (‘air’) held
constant
• Can then estimate treatment effect using
difference in means
• Or run regression of death on water source,
distance to pump, other factors
• Strongly suggests water the cause
• Woman died in Hampstead, niece in Islington
What’s that got to do with it?
• Aunt liked taste of water from Broad Street
pump
• Had it delivered every day
• Niece had visited her
• Investigation of well found contamination
by sewer
• This is non-experimental data but
analysed in a way that makes a very
powerful case – no theory either
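The difference-in-means estimate mentioned for the Snow example can be sketched as follows. This is a minimal illustration on synthetic data: the sample size and death risks are made-up assumptions, not Snow's figures, and `difference_in_means` is a hypothetical helper name.

```python
import numpy as np

def difference_in_means(outcome, treated):
    """Treatment effect estimate: mean outcome of treated minus mean of untreated."""
    outcome = np.asarray(outcome, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    return outcome[treated].mean() - outcome[~treated].mean()

# Synthetic illustration only: these are NOT Snow's data.
rng = np.random.default_rng(0)
n = 10_000
used_pump = rng.random(n) < 0.5                # water source 'as if' randomly assigned
death_prob = np.where(used_pump, 0.10, 0.02)   # assumed death risks by water source
died = rng.random(n) < death_prob

effect = difference_in_means(died, used_pump)
print(f"Estimated effect of pump water on death risk: {effect:.3f}")
```

With 'as if' random assignment the simple difference in means estimates the treatment effect; adding controls such as distance to the pump would turn this into the regression version.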
Methods for Analysing Data from
Natural Experiments
• If data is ‘as if’ it were experimental then can use
all techniques described for experimental data
– OLS (perhaps Snow case)
– IV to get appropriate units of measurement
• Will say more about IV than OLS
– IV perhaps more common
– If can use OLS, not much more to say
– With IV there is more to say – weak instruments
Conditions for Instrument Validity
• To be valid instrument:
– Must be correlated with X - testable
– Must be uncorrelated with ‘error’ – untestable
– have to argue case for this assumption
• These conditions guaranteed with
instrument for experimental data
• But more problematic for data from quasi-experiments
Bombs, Bones and Breakpoints:
The Geography of Economic Activity
Davis and Weinstein, AER, 2002
• Existence of agglomerations (e.g. cities) a
puzzle
• Land and labour costs higher so why don’t firms
relocate to increase profits
• Must be some compensatory productivity effect
• Different hypotheses about this:
– Locational fundamentals
– Increasing returns (Krugman) – path-dependence
Testing these Hypotheses
• Consider a temporary shock to city
population
• Locational fundamentals theory would
predict no permanent effect
• Increasing returns would suggest
permanent effect
• Would like to do experiment of randomly
assigning shocks to city size
• This is not going to happen
The Davis-Weinstein idea
• Use US bombing of Japanese cities in WW2
• This is a ‘natural experiment’ not a true
experiment because:
– WW2 not caused by desire to test theories of
economic geography
– Pattern of US bombing not random
• Sample is 303 Japanese cities, data is:
– Population before and after bombing
– Measures of destruction
Basic Equation
$$\Delta s_{i,60-47} = \beta_0 + \beta_1 \Delta s_{i,47-40} + \beta_2 x_i + \varepsilon_i$$
• Δsi,47-40 is change in population just before and
after war
• Δsi,60-47 is change in population at later period
• How to test hypotheses:
– Locational fundamentals predicts β1=-1
– Increasing returns predicts β1=0
The IV approach
• Δsi,47-40 might be influenced by both
permanent and temporary factors
• Only want part that is transitory shock
caused by war damage
• Instrument Δsi,47-40 by measures of death
and destruction
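The IV idea described here can be sketched with two-stage least squares on synthetic data. Everything below is an illustrative assumption (variable names, coefficients, sample size); it is not the Davis-Weinstein data or their exact specification. An unobserved permanent factor moves both the wartime and the post-war change, so OLS is biased, while the instrument isolates the transitory shock.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(y, x_endog, Z):
    """2SLS with one endogenous regressor; Z holds the instruments (no constant)."""
    n = len(y)
    const = np.ones((n, 1))
    Z_full = np.hstack([const, Z])
    # First stage: fitted values of the endogenous regressor
    x_hat = Z_full @ ols(Z_full, x_endog)
    # Second stage: regress y on a constant and the fitted values
    X_hat = np.hstack([const, x_hat[:, None]])
    return ols(X_hat, y)

# Synthetic data (assumed design, for illustration only)
rng = np.random.default_rng(1)
n = 20_000
destruction = rng.normal(size=n)      # instrument Z: hypothetical damage measure
permanent = rng.normal(size=n)        # unobserved permanent factor
wartime_change = -1.0 * destruction + permanent + rng.normal(size=n)
true_beta1 = -1.0                     # shock fully reversed
postwar_change = true_beta1 * wartime_change + 0.8 * permanent + rng.normal(size=n)

X_ols = np.column_stack([np.ones(n), wartime_change])
beta_ols = ols(X_ols, postwar_change)
beta_iv = two_stage_least_squares(postwar_change, wartime_change, destruction[:, None])
print(f"OLS slope: {beta_ols[1]:.3f}, IV slope: {beta_iv[1]:.3f}, truth: {true_beta1}")
```

The second-stage coefficients are the 2SLS estimates, but standard errors taken from a naive second-stage regression would be wrong; a dedicated IV routine should be used for inference.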
The First-Stage:
Correlation of Δsi,47-40 with Z
Why Do We Need First-Stage?
• Establishes instrument relevance –
correlation of X and Z
• Gives an idea of how strong this
correlation is – ‘weak instrument’ problem
• In this case the reported first-stage is not
obviously the one implicit in what follows
– Reporting a different first-stage would be bad practice
The IV Estimates
Why Are these other variables
included?
• Potential criticisms of instrument exogeneity
– Government post-war reconstruction expenses
correlated with destruction and had an effect on
population growth
– US bombing heavier on cities of strategic importance
(perhaps they had higher growth rates)
• Inclusion of the extra variables designed to head
off these criticisms
• Assumption is that of exogeneity conditional on
the inclusion of these variables
• Conclusion favours locational fundamentals view
An additional piece of supporting
evidence….
• Always trying to build a strong evidence base – many
potential ways to do this, not just estimating equations
The Problem of Weak Instruments
• Say that instruments are ‘weak’ if
correlation between X and Z low (after
inclusion of other exogenous variables)
• Rule of thumb - If F-statistic on
instruments in first-stage less than 10 then
may be problem (will explain this a bit
later)
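The first-stage F-statistic behind this rule of thumb can be computed as a standard test of the joint significance of the instruments. The sketch below assumes homoskedastic errors and uses synthetic data; `first_stage_f` and the coefficient values are hypothetical choices for illustration.

```python
import numpy as np

def first_stage_f(x, Z, W=None):
    """F-statistic for joint significance of instruments Z in a regression of x
    on a constant, optional exogenous controls W, and Z (homoskedastic formula)."""
    n = len(x)
    const = np.ones((n, 1))
    restricted = const if W is None else np.hstack([const, W])
    unrestricted = np.hstack([restricted, Z])

    def rss(X):
        resid = x - X @ np.linalg.lstsq(X, x, rcond=None)[0]
        return resid @ resid

    q = Z.shape[1]                       # number of instruments being tested
    df = n - unrestricted.shape[1]       # residual degrees of freedom
    return ((rss(restricted) - rss(unrestricted)) / q) / (rss(unrestricted) / df)

# Synthetic illustration: one strong and one weak first stage
rng = np.random.default_rng(2)
n = 1_000
z = rng.normal(size=(n, 1))
v = rng.normal(size=n)
x_strong = 0.5 * z[:, 0] + v     # assumed strong first-stage coefficient
x_weak = 0.02 * z[:, 0] + v      # assumed weak first-stage coefficient

f_strong = first_stage_f(x_strong, z)
f_weak = first_stage_f(x_weak, z)
print(f"F (strong): {f_strong:.1f}, F (weak): {f_weak:.1f}")
```

Passing the other exogenous variables as `W` gives the F-statistic "after inclusion of other exogenous variables", which is the version the rule of thumb refers to.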
Why Do Weak Instruments Matter?
• A whole range of problems tend to arise if
instruments are weak
• Asymptotic problems:
– High asymptotic variance
– Small departures from instrument exogeneity lead to
big inconsistencies
• Finite-Sample Problems:
– Small-sample distribution may be very different from
asymptotic one
• May be large bias
• Computed variance may be wrong
• Distribution may be very different from normal
Asymptotic Problems I:
Low precision
• asymptotic variance of IV estimator is
larger the weaker the instruments
• Intuition – variance in any estimator tends
to be lower the bigger the variation in X –
think of σ²(X’X)⁻¹
• IV only uses variation in X that is
associated with Z
• As instruments get weaker using less and
less variation in X
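For the simplest case this intuition can be written out explicitly. The expression below is a standard result for one regressor, one instrument and homoskedastic errors; the symbols $\sigma_X^2$ (variance of X) and $\rho_{XZ}$ (correlation of X and Z) are notation introduced here, not taken from the slides.

```latex
\operatorname{Avar}\left(\hat{\beta}_{OLS}\right) = \frac{\sigma^2}{N \sigma_X^2},
\qquad
\operatorname{Avar}\left(\hat{\beta}_{IV}\right) = \frac{\sigma^2}{N \sigma_X^2 \, \rho_{XZ}^2}
```

The IV variance is the OLS variance divided by $\rho_{XZ}^2$, so it grows without bound as the correlation between X and Z goes to zero.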
Asymptotic Problems II:
Small Departures from Instrument Exogeneity
Lead to Big Inconsistencies
• Suppose true causal model is
y=Xβ+Zγ+ε
So possibly direct effect of Z on y.
• Instrument exogeneity is γ=0.
• Obviously want this to be zero but might
hope that no big problem if ‘close to zero’
– a small deviation from exogeneity
But this will not be the case if
instruments weak… consider just-identified case
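$$\hat{\beta}_{IV} = (Z'X)^{-1} Z'y$$
$$\hat{\beta}_{IV} = \beta + (Z'X)^{-1} Z'Z\,\gamma + (Z'X)^{-1} Z'\varepsilon$$
$$\operatorname{plim} \hat{\beta}_{IV} = \beta + \operatorname{plim}\left(\frac{Z'X}{N}\right)^{-1} \operatorname{plim}\left(\frac{Z'Z}{N}\right)\gamma = \beta + \Sigma_{ZX}^{-1}\Sigma_{ZZ}\,\gamma$$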
• If instruments weak then Σ_ZX small so Σ_ZX⁻¹
large so γ multiplied by a large number
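A small simulation can illustrate how the same direct effect γ produces a much larger inconsistency when the instrument is weak. This is a sketch under assumed values (γ = 0.02, first-stage coefficients 1 and 0.1, unit-variance Z); with a scalar instrument the inconsistency is γ divided by the first-stage coefficient. The function names are hypothetical.

```python
import numpy as np

def iv_just_identified(y, x, z):
    """Just-identified IV estimator (Z'X)^{-1} Z'y for one regressor, data demeaned."""
    x = x - x.mean()
    y = y - y.mean()
    z = z - z.mean()
    return (z @ y) / (z @ x)

def simulate_iv(pi, gamma, beta=1.0, n=400_000, seed=0):
    """IV estimate when Z has a direct effect gamma on y and first-stage slope pi."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)
    v = rng.normal(size=n)
    eps = rng.normal(size=n)
    x = pi * z + v                      # first stage
    y = beta * x + gamma * z + eps      # Z enters y directly when gamma != 0
    return iv_just_identified(y, x, z)

strong = simulate_iv(pi=1.0, gamma=0.02)   # plim = 1 + 0.02 / 1.0
weak = simulate_iv(pi=0.1, gamma=0.02)     # plim = 1 + 0.02 / 0.1
print(f"strong instrument: {strong:.3f}, weak instrument: {weak:.3f}, truth: 1.0")
```

X is generated independently of ε here, so any departure of the estimates from the true value of 1 comes from the direct effect of Z alone (plus sampling noise).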
An Example:
The Return to Education
• Economists long-interested in whether
investment in human capital a ‘good’ investment
• Some theory shows that coefficient on s in
regression:
y=β0+β1s+β2x+ε
Is measure of rate of return to education
• OLS estimates around 8% - suggests very good
investment
• Might be liquidity constraints
• Might be bias
Potential Sources of Bias
• Most commonly mentioned is ‘ability bias’
• Ability correlated with earnings
independent of education
• Ability correlated with education
• If ability omitted from ‘x’ variables then
usual formula for omitted variables bias
suggests upward bias in OLS estimate
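The omitted-variables formula referred to here can be written out for the case without other controls. The symbols $a$ (ability) and $\delta$ (its effect on earnings) are notation assumed for this sketch.

```latex
y = \beta_0 + \beta_1 s + \delta a + \varepsilon
\quad\Longrightarrow\quad
\operatorname{plim} \hat{\beta}_{1}^{OLS} = \beta_1 + \delta \, \frac{\operatorname{Cov}(s, a)}{\operatorname{Var}(s)}
```

If ability raises earnings ($\delta > 0$) and is positively correlated with education ($\operatorname{Cov}(s,a) > 0$), the second term is positive, which is the upward bias.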
Potential Solution
• Find an instrument correlated with education but
uncorrelated with ‘ability’ (or other excluded
variables)
• Angrist-Krueger “Does Compulsory Schooling
Attendance Affect Schooling and Earnings” ,
QJE 1991, suggest using quarter of birth
• Argue correlated with education because of
school start age policies and school leaving laws
(instrument relevance)
• Don’t have to accept this – can test it
A graphical version of first-stage
(correlation between education and Z)
In this case…
• Their instrument is binary so IV estimator can be
written in Wald form
• And this leads to following expression for
potential inconsistency:
$$\operatorname{plim} \hat{\beta}_{IV} = \frac{E(y \mid Z=1) - E(y \mid Z=0)}{E(X \mid Z=1) - E(X \mid Z=0)} = \beta + \frac{\gamma}{E(X \mid Z=1) - E(X \mid Z=0)}$$
• Note denominator is difference in schooling for
those born in first- and other quarters
• Instrument will be ‘weak’ if this difference is
small
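The Wald form for a binary instrument can be sketched directly as a ratio of two differences in means. The data below are synthetic and the coefficients are assumptions chosen for illustration (only the 0.1-year schooling gap and the 0.08 return echo figures quoted in these slides); this is not the Angrist-Krueger sample.

```python
import numpy as np

def wald_estimator(y, x, z):
    """IV estimate with a binary instrument: ratio of differences in group means."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=bool)
    numerator = y[z].mean() - y[~z].mean()      # difference in outcome by instrument
    denominator = x[z].mean() - x[~z].mean()    # difference in treatment by instrument
    return numerator / denominator

# Synthetic illustration only
rng = np.random.default_rng(3)
n = 100_000
first_quarter = rng.random(n) < 0.25
ability = rng.normal(size=n)                    # unobserved
schooling = 12 - 0.1 * first_quarter + ability + rng.normal(size=n)
log_earnings = 0.08 * schooling + 0.1 * ability + 0.5 * rng.normal(size=n)

wald = wald_estimator(log_earnings, schooling, first_quarter)

# The same number as the usual IV ratio cov(Z, y) / cov(Z, X)
zf = first_quarter.astype(float)
cov_ratio = np.cov(zf, log_earnings)[0, 1] / np.cov(zf, schooling)[0, 1]
print(f"Wald estimate: {wald:.4f}, covariance-ratio IV: {cov_ratio:.4f}")
```

Because the denominator is only about 0.1 years, the estimate is noisy even with 100,000 observations, which is the weak-instrument concern in practice.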
Their Results
Interpretation
(and Potential Criticism)
• IV estimates not much below OLS
estimates (higher in one case)
• Suggests ‘ability bias’ no big deal
• But instrument is weak
• Being born in 1st quarter reduces
education by 0.1 years
• Means ‘γ’ will be multiplied by 10
But why should we have γ≠0
• Remember this would imply a direct effect of
quarter of birth on earnings, not just one that
works through the effect on education
• Bound, Jaeger and Baker argued that there is evidence
that quarter of birth is correlated with:
– Mental and physical health
– Socioeconomic status of parents
• Unlikely that any effects are large but don’t have
to be when instruments are weak
An example: UK data
[Figure: Variation in Socioeconomic Status of Parents by Birth Month. Horizontal axis: Month of Birth of Child (1 to 12); vertical axis ticks from .32 to .34]
Effect is small but significantly different from zero
A Back-of-the-Envelope Calculation
• Being born in first quarter means 0.01 less likely to have
a managerial/professional parent
• Being a manager/professional raises log earnings by
0.64
• Correlation between earnings of children and parents 0.4
• Effect on earnings through this route
0.01*0.64*0.4=0.00256 i.e. ¼ of 1 per cent
• Small but weak instrument causes effect on
inconsistency of IV estimate to be multiplied by 10 –
0.0256
• Now large relative to OLS estimate of 0.08
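The arithmetic of this back-of-the-envelope calculation can be reproduced in a few lines. All numbers come from the slides; the variable names are only labels chosen here.

```python
# Figures quoted in the back-of-the-envelope calculation
parent_status_gap = 0.01       # first-quarter births: 0.01 less likely to have a managerial/professional parent
status_log_earnings = 0.64     # managerial/professional status raises log earnings by 0.64
intergenerational_corr = 0.4   # correlation between earnings of children and parents
first_stage_gap = 0.1          # first-quarter birth reduces education by 0.1 years
ols_return = 0.08              # OLS estimate of the return to education

# Direct effect of quarter of birth on earnings through parental status
direct_effect = parent_status_gap * status_log_earnings * intergenerational_corr

# Weak instrument: divide by the first-stage gap, i.e. multiply by 10
inconsistency = direct_effect / first_stage_gap

share_of_ols = inconsistency / ols_return
print(f"direct effect: {direct_effect:.5f}")
print(f"implied inconsistency of IV: {inconsistency:.4f}")
print(f"as a share of the OLS estimate: {share_of_ols:.2f}")
```

The implied inconsistency is roughly a third of the OLS estimate, which is why a direct effect that looks negligible cannot be ignored here.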
Summary
• Small deviations from instrument
exogeneity lead to big inconsistencies in
IV estimate if instruments are weak
• Suspect this is often of great practical
importance
• Quite common to use ‘odd’ instrument –
argue that ‘no reason to believe’ it is
correlated with ε but show correlation with
X
Finite Sample Problems
• This is a very complicated topic
• Exact results for special cases, approximations
for more general cases
• Hard to say anything that is definitely true but
can give useful guidance
• Problems in 3 areas
– Bias
– Incorrect measurement of variance
– Non-normal distribution
• But really all different symptoms of same thing
Review and Reminder
• If ask STATA to estimate equation by IV
• Coefficients computed using formula given
• Standard errors computed using formula
for asymptotic variance
• T-statistics, confidence intervals and p-values computed using assumption that
estimator is unbiased with variance as
computed and normally distributed
• All are asymptotic results
Difference between asymptotic and
finite-sample distributions
• This is normal case
• Only in special cases e.g. linear
regression model with normally distributed
errors are small-sample and asymptotic
distributions the same.
• Difference likely to be bigger
– The smaller the sample size
– The weaker the instruments
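A Monte Carlo experiment is one way to see the gap between finite-sample and asymptotic behaviour. The sketch below is an assumed design, not taken from the slides: 100 observations, ten instruments, an endogenous regressor, and a true coefficient of 1. Several instruments are used because the pull of 2SLS toward OLS is easiest to see in that setting.

```python
import numpy as np

def tsls_slope(y, x, Z):
    """2SLS slope for one endogenous regressor (variables are mean-zero by design)."""
    x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]   # first-stage fitted values
    return (x_hat @ y) / (x_hat @ x)

def monte_carlo(pi, n=100, k=10, beta=1.0, rho=0.8, reps=2000, seed=0):
    """Sampling distribution of 2SLS when every first-stage coefficient equals pi."""
    rng = np.random.default_rng(seed)
    estimates = np.empty(reps)
    for r in range(reps):
        Z = rng.normal(size=(n, k))
        v = rng.normal(size=n)
        # Structural error correlated with the first-stage error: x is endogenous
        u = rho * v + np.sqrt(1 - rho**2) * rng.normal(size=n)
        x = Z @ np.full(k, pi) + v
        y = beta * x + u
        estimates[r] = tsls_slope(y, x, Z)
    return estimates

weak_estimates = monte_carlo(pi=0.05)    # weak instruments
strong_estimates = monte_carlo(pi=1.0)   # strong instruments
print(f"median 2SLS estimate, weak instruments:   {np.median(weak_estimates):.3f}")
print(f"median 2SLS estimate, strong instruments: {np.median(strong_estimates):.3f}")
```

With strong instruments the estimates centre on the true value; with weak instruments they centre well away from it, even though the instruments are exogenous by construction.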
Rule of Thumb for Weak
Instruments
• F-test for instruments in first-stage >10
• Stricter than significant e.g. if one
instrument F=10 equivalent to t≈3.2 (F=t², √10≈3.16)
Conclusion
• Natural experiments useful source of knowledge
• Often requires use of IV
• Instrument exogeneity and relevance need
justification
• Weak instruments potentially serious
• Good practice to present first-stage regression
• Finding more robust alternative to IV an active
research area