Research in infrastructure: Quasi
Download
Report
Transcript Research in infrastructure: Quasi
When Randomization is not
possible:
Quasi-experimental methods
Development Impact Evaluation Initiative
innovations & solutions
in infrastructure, agriculture & environment
naivasha, april 23-27, 2011
in collaboration with Africa region, SD network, GAFSP and AGRA
Alternatives to
Randomization
Sometimes, randomization is really not
possible
Large infrastructure projects
Politically sensitive projects
In these cases, we can use “quasiexperimental” methods to try to mimic the
benefits of randomized assignment
Defining the control group
The point of quasi-experimental methods is
to obtain a control group that is almost as
good as what would have been obtained by
randomization
We still have some form of treatment and
control, and generally use the difference-indifference estimator
We just use different methods to select a
“good” control group
This session
Three quasi-experimental methods for evaluation
Regression discontinuity design
Propensity score matching
Instrumental variables methods
In each: Pros and cons, practical and research-wise
Illustrative examples
Regression Discontinuity
Designs
RDD is based on the selection process
When in presence of an official/bureaucratic, clear and
reasonably enforced eligibility rule
A simple, quantifiable score
Assignment to treatment is based on this
score
A threshold is established
▪ Ex: target firms with sales above a certain amount
▪ Those above receive, those below do not
▪ compare firms just above the threshold to firms just
below the threshold
5
RDD Logic
Assignment to the treatment depends, either completely or
partly, on a continuous “score”, ranking (age in the previous case):
potential beneficiaries are ordered by looking at the score
there is a cut-off point for “eligibility” – clearly defined criterion
determined ex-ante
cut-off determines the assignment to the treatment or no-treatment
groups
These de facto assignments often result from administrative
decisions
resource constraints limit coverage
very targeted intervention with expected heterogeneous impact
transparent rules rather than discretion used
6
Possible dicontinuities
Income/Land eligibility for government
programs:
People with land below 2ha get subsidized loans,
those above do not get it
Age Eligibility Criteria
Children below 5yrs get access to new schools,
those above 5yrs go to old schools
Geography
People on one side of a border only get program,
those on other side do now
RDD Example: Drinking
Age
A country is considering implementing a
minimum drinking age in their country.
Will this cause a:
Decrease in drinking?
Decrease in deaths?
Use US data to explore this question
RDD in Practice
Policy: US drinking age, minimum legal age is 21
under 21, alcohol consumption is illegal
Outcomes: alcohol consumption and mortality rate
Observation:
The policy implies that
individuals aged 20 years, 11 months and 29 days cannot drink
individuals ages 21 years, 0 month and 1 day can drink
however, do we think that these individuals are inherently different?
wisdom, preferences for alcohol and driving, party-going behavior, etc
People born “few days apart” are treated differently, because of the arbitrary age
cut off established by the law
a few days or a month age difference could is unlikely to yield variations in behavior and
attitude towards alcohol
The legal status is the only difference between the treatment group (just above 21) and
comparison group (just below 21)
9
RDD in Practice
In practice, making alcohol consumption illegal lowers
consumption and, therefore, the incidence of drunk-driving
Idea: use the following groups to measure the impact of a
minimum drinking age on mortality rate of young adults
Treatment group: individuals 20 years and 11 months to 21 years
old
Comparison group: individuals 21 years to 21 years and a month old
Around the threshold, we can safely assume that individuals
are randomly assigned to the treatment
We can then measure the causal impact of the policy on
mortality rates around the threshold
10
RDD Example
MLDA (Treatment) reduces
alcohol consumption
11
RDD Example
Total number of Deaths
Higher alcohol
consumption increases
death rate around age 21
Total number of accidental
deaths related to alcohol
and drug consumption
Total number of
other deaths
12
Conclusion: Causal
Since the jump at exactly 21 years could not
be caused by other factors, we conclude it is
caused by the drinking age policy.
RDD: Caveats
Caveats
Requires program with clear, well-defined
eligibility rules
Requires data from many people just above and
below cutoff (meaning there need to be many
people right around the cutoff!)
Program should be only source of discontinuity
(meaning in general, administrative borders are
not great for RDD)
Method 2: Matching Method
Match participants with non-participants on the basis of
observable characteristics
Counterfactual:
Matched comparison group
Each program participant is paired with one or more
similar non-participant(s) based on observable
characteristics
>> On average, matched participants and nonparticipants
share the same observable characteristics (by
construction)
Estimate the effect of our intervention by using
difference-in-differences
15
How do we do it?
Design a control group by establishing close
matches in terms of observable characteristics
Carefully select variables along which to match
participants to their control group
So that we only retain
▪ Treatment Group: Participants that could find a match
▪ Comparison Group: Non-participants similar enough to
the participants
>> We trim out a portion of our treatment group!
Implications
In most cases, we cannot match everyone
Need to understand who is left out
Example
Matched
Individuals
Portion of treatment
group trimmed out
Nonparticipants
Participants
Score
Wealth
Matching: Caveats
Caveats:
Needs lots of data to create good matches
Even with good data, results are less robust than
other methods
Need to start with very large sample to assure
there are enough people with matches
Must be really convinced that the control villages
were not excluded for important reasons
Method 3: Instrumental
Variables Methods
Idea: if only part of the allocation of project to places is
random, use only that part to get at the causal impact
An instrumental variable is a variable that helps you
isolate just that part of the variation in project placement
(a “lever” to manipulate good variation in project
placement)
Method 3: Instrumental
Variables Methods
Example: Dinkelman 2011: Rural household
electrification and employment in South Africa
End of apartheid (1994), Eskom promises to make
500,000 new household connections each year, fully
subsidized
Will electricity improve employment propects?
Project selection criteria
Political reasons: part of the “not-easily-identifiable but
good reasons for selecting particular target groups”
Cost reasons: high household density, short distances to
existing grid, flatter land gradient
Method 3: Instrumental
Variables Methods
Comparing electrified and unelectrified areas likely biased,
because project areas not selected randomly
Instead: use variation in cost that affects placement: Land
Gradient
Steeper areas are more costly to electrify so are less likely to get
electricity
Assumption: Land gradient is relatively random and should not
affect employment in other ways besides the cost to electrify
Data and Context
Census communities (1996, 2001) for rural KZN
•
•
Unit of
analysis
Districts (d) ~ 30,000-150,000 hh’s
Community/village (j) ~ 220 hh’s, n=1,816
Geography
•
•
1996 grid infrastructure, proximity to roads, towns
Community land gradient
Electricity
•
Administrative data on whether community had an Eskom
project 1996-2001 (20% did)
Sample area
and project
assignments
Received Elec.
No Elec.
Towns
Substations
Power lines
Sample area
and gradient
Flatter gradient = light yellow
Steeper gradient = brown
Towns
Substations
Power lines
Method 4: Instrumental
Variables Methods
Assumptions and conditions
1. IV must predict project allocation
▪ “Strong first stage”
▪ This is testable!!
2.
IV must by unrelated to unobservable factors that
affect project allocation and outcomes
▪ This is not testable
▪ Need good contextual knowledge to defend this
IV Methods: Main results
A 10% increase in land gradient reduces the probability of
electrification by 7.7 p.p.
Electrification raises female employment by 9.5 p.p, no
significant impacts on men
Electrification raises the fraction of households using
electric lighting by 63 p.p., cooking with electricity by 23
p.p., reduces cooking with wood by 27.5 p.p.
Instrumental Variables:
Caveats
Caveats:
To use IV, you need a good instrument, and this is
not always possible!
Generally, it is very difficult to find a convincing
instrument, so this method only works in certain
cases
Recap of Methods: Which is
Best
Randomization
“Gold Standard”- Produces most rigorous results, but may
not be technically/politically feasible in all cases
RDD
Produces strong results, but requires a clear, measurable
allocation rule
Instrument Variables
Produces strong results, but requires good instrument,
which may not exist
Matching
Produces results that may be less rigorous, but may be
easier to implement than other methods