Lorraine Dearden
Director of ADMIN Node
Institute of Education
Email: [email protected]
Introduction
Give you a whirlwind tour of the economic approach to
evaluation
Can’t go into too much technical detail
Excellent new review article by Blundell and Costa Dias
(forthcoming Journal of Human Resources) – borrow heavily
from their exposition
But hope I get the essential ideas across so that you can judge
which (if any) of the approaches may be useful
Along the way give some of my initial thoughts on how
different approaches may be used
The Evaluation Problem
The question we want to answer is:
What is the effect of some treatment (D_i = 1) on some outcome of interest (Y_1i) compared to the outcome (Y_0i) if the treatment had not taken place (D_i = 0)?
Don’t observe the counterfactual
Fine if treatment is randomly assigned, but in a lot of
economic and epidemiological settings this is not the
case
The economic approach to evaluation involves methods that try to get around this selection problem
Selection Problem
Selection bias is caused by characteristics (observed (Z) and unobserved (v)) that affect both the decision to participate in the programme and its outcomes
If participants are systematically different from
non-participants with respect to such
characteristics, then the outcome observed for
non-participants does not represent a good
approximation to the counterfactual for
participants
Economic Evaluation Methods
Constructing the counterfactual in a convincing way is
the key requirement
Six distinct but related approaches attempt to deal with potential selection bias:
Social experiment methods
Natural experiments
Matching methods
Instrumental variable methods (not going to discuss)
Discontinuity design methods
Control function methods (not going to discuss)
Assignment to treatment
Selection into treatment at time k is assumed to be
made on the basis of an index function D*
D*_ik = Z_ik c + v_ik
where c is the vector of coefficients and v_ik is the unobservable term
The treatment status is then defined as
D_it = 1 if D*_ik > 0 and t > k
D_it = 0 otherwise
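A minimal simulation sketch of this latent-index assignment rule (my own illustration, not from the talk; the coefficient value and variable names are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Observed characteristic Z and unobservable v that drive selection
Z = rng.normal(size=n)
v = rng.normal(size=n)
c = 1.0                       # assumed coefficient on Z in the index function

# Latent index D* = Z c + v; treated once the index is positive
D_star = Z * c + v
D = (D_star > 0).astype(int)

print("Share treated:", D.mean())
```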
What are we trying to measure?
Express average parameters at time t > k at a particular value of Z_ik = z as:
Average treatment effect (ATE) for the population (if individuals were assigned at random to treatment)
a_ATE(z) = E(a_i | Z_ik = z)
Average treatment effect on the treated (ATT)
a_ATT(z) = E(a_i | Z_ik = z, D_it = 1) = E(a_i | Z_ik = z, v_ik > -zc)
Average treatment effect on the non-treated (ATNT)
a_ATNT(z) = E(a_i | Z_ik = z, D_it = 0) = E(a_i | Z_ik = z, v_ik < -zc)
All these parameters are identical if treatment effects are homogeneous
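A toy illustration (my own addition, not from the talk) of how these differ when effects are heterogeneous: suppose a_i = 2 for half the population and a_i = 0 for the other half, and only the high-gain half selects into treatment. Then ATT = 2 and ATNT = 0, while ATE = 1.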
Outcome equation
The potential outcome for individual i at time t is given by (ignoring other covariates (X) that impact on Y):
Y_1it = b + a_i + u_it   if D_it = 1
Y_0it = b + u_it   if D_it = 0
Hence we can write:
Y_it = b + a_i D_it + u_it
Collecting unobserved heterogeneity terms together:
Y_it = b + a D_it + [u_it + D_it (a_i - a)] = b + a D_it + e_it
where a is the ATE. Non-random selection occurs if e is correlated with D.
What does this mean?
This implies e is correlated with the regressors determining assignment (Z) and/or with the unobservable component in the selection equation (v)
Consequently there are two forms of selection:
Selection on the observables
Selection on the unobservables
With a homogeneous treatment effect, selection bias only occurs if D is correlated with u; with a heterogeneous treatment effect it can also arise if D is correlated with the idiosyncratic gain from treatment (a_i - a)
Different estimators use different assumptions about
assignment to identify the impact of the treatment
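One way to see where the bias enters (my own summary, using the notation above): the naive difference in mean outcomes decomposes into the ATT plus a selection-bias term.

```latex
\mathbb{E}[Y_{it}\mid D_{it}=1]-\mathbb{E}[Y_{it}\mid D_{it}=0]
  = \underbrace{\mathbb{E}[a_i\mid D_{it}=1]}_{\text{ATT}}
  + \underbrace{\mathbb{E}[u_{it}\mid D_{it}=1]-\mathbb{E}[u_{it}\mid D_{it}=0]}_{\text{selection bias}}
```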
Social Experiment
Closest to the theory free method of a clinical trial
Relies on the availability of a random assignment
rule
The assumptions required are:
R1: E[u_i | D_i = 1] = E[u_i | D_i = 0] = E[u_i]
R2: E[a_i | D_i = 1] = E[a_i | D_i = 0] = E[a_i]
If conditions hold, can identify the average effect
in the experimental population using OLS (ATE)
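A minimal sketch (my own illustration; the data-generating values are assumptions) of recovering the ATE by OLS when D is randomly assigned:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 5_000

# Random assignment, so R1 and R2 hold by construction
D = rng.integers(0, 2, size=n)
a_i = 2.0 + rng.normal(size=n)   # heterogeneous gains with ATE = 2
u = rng.normal(size=n)
Y = 1.0 + a_i * D + u

# OLS of Y on a constant and D: the D coefficient estimates the ATE
ols = sm.OLS(Y, sm.add_constant(D)).fit()
print(ols.params)                # D coefficient should be close to 2
```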
Natural Experiments: Difference in
Difference (DID) Estimator
DID approach uses a natural experiment to mimic the
randomisation of a social experiment
Natural experiment – some naturally occurring event which creates a policy shift for one group and not another
It may be a change in health policy in one jurisdiction but not
another
Or may refer to the eligibility of a certain group to a change in
health policy for which a similar group is ineligible
The difference in outcomes between the two groups
before and after the policy change gives the estimate of
the policy impact
Require longitudinal data or repeated cross section data
(where samples are drawn from the same population)
before and after the intervention
DID Estimator
Rewrite the outcome equation as:
Y_it = b + a_i D_it + u_it = b + a_i D_it + η_i + m_t + ε_it
i.e. u is decomposed into three terms: an unobserved fixed effect (η_i), an aggregate macro (time) shock (m_t) and an idiosyncratic transitory shock (ε_it)
The main assumption underlying DID is that selection into treatment is independent of the transitory shock:
DID: E(u_it | D_it) = E(η_i | D_it) + m_t
that is, R1 holds in first differences
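A minimal sketch (my own illustration with simulated data; the group structure and effect size are assumptions) of the basic two-group, two-period DID calculation:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 4_000

group = rng.integers(0, 2, size=n)    # 1 = group affected by the policy
post = rng.integers(0, 2, size=n)     # 1 = observed after the policy change
effect = 1.5                          # assumed true policy impact
Y = 0.5 + 2.0 * group + 0.8 * post + effect * group * post + rng.normal(size=n)

df = pd.DataFrame({"Y": Y, "group": group, "post": post})
means = df.groupby(["group", "post"])["Y"].mean()

# DID: (treated after - treated before) - (control after - control before)
did = (means.loc[(1, 1)] - means.loc[(1, 0)]) - (means.loc[(0, 1)] - means.loc[(0, 0)])
print("DID estimate:", did)
```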
DID estimator
Measures ATT
Doesn’t rule out selection on unobservables as long as these are fixed over time
The DID estimator is just the first-difference estimator commonly used with panel data in the presence of fixed effects
Problems arise if there is selection on the idiosyncratic temporary shock, if macro effects are not common across groups, or if there are compositional changes over time (repeated cross-sections)
But may have applications, for instance a postcode lottery in health services, or the abolition/introduction of a health programme or service affecting a subgroup of the population
Matching Methods
Assumes all selection is based on observable characteristics/matching variables (X) that you have in your data
OLS is a form of matching and will give you the
ATT=ATE=ATNT if the X
(i) are unaffected by the treatment
(ii) contain all the variables that influence both the
participation decision and the outcome of interest
(iii) there is common support (all values of X are
observed amongst treated and non-treated)
Can use more flexible regression methods, so if the effect of the X's is heterogeneous (testable) then ATT ≠ ATNT ≠ ATE
Propensity Score Matching
Regression approaches are a form of matching
approach
Propensity score matching is another matching
approach
Shares a number of assumptions with regression based
approaches
A lot more flexible but also much more
computationally expensive
Assumptions
Matching is based on the following assumption
M1: Conditional Independence Assumption (CIA) – conditional on the set of observables X, the non-treated outcomes are independent of participation status, i.e.
Y_0i ⊥ D_i | X_i
Assumption M1 implies a conditional version of R1
E[u_i | D_i, X_i] = E[u_i | X_i]
Slightly stronger assumption needed to get ATE
Don’t need an equivalent of R2 to identify ATT as selection on the
unobservable gains is accommodated by matching but do need one
more assumption – that each treated observation can be reproduced
amongst the non-treated
Common Support
M2: All treated individuals have a counterpart in the non-treated population and anyone constitutes a possible participant
So S, the common support for X, is the part of the distribution of X represented in both groups
All individuals in the treatment group for whom there
is not common support are excluded from the
matching estimate
Matching
Involves selecting from the non-treated pool a control
group in which the distribution of observed variables is as
similar as possible to the distribution in the treated group
(by coming up with a set of weights for the control group to
make it look like the treatment group)
There are a number of ways of doing this but they almost always involve calculating the propensity score p_i(x) = Pr{D = 1 | X = x}
Drop any individuals in treatment group who have
propensity score greater than maximum in control group
(to ensure common support)
The propensity score
The propensity score is the probability of being in the
treatment group given you have characteristics X=x
How do you do this?
Use parametric methods (e.g. logit or probit) and
estimate the probability of a person being in the
treatment group for all individuals in the treatment
and non-treatment groups
Rather than matching on the basis of ALL the X’s, can match on the basis of this propensity score (Rosenbaum and Rubin (1983))
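A minimal sketch (my own illustration; variable names and coefficients are assumptions) of estimating the propensity score with a logit:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 3_000

# Observed matching variables X and treatment indicator D
X = rng.normal(size=(n, 2))
D = (0.8 * X[:, 0] - 0.5 * X[:, 1] + rng.logistic(size=n) > 0).astype(int)

# Logit of D on X gives each person's estimated probability of treatment
logit = sm.Logit(D, sm.add_constant(X)).fit(disp=0)
pscore = logit.predict(sm.add_constant(X))
print(pscore[:5])
```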
How do we match?
All matching methods come up with a way of
reweighting the control group
ATT is the difference in the mean outcome in the two
groups (appropriately weighted)
Nearest neighbour matching
for each person in the treatment group, choose the individual(s) in the control group with the closest propensity score
can do this with (most common) or without replacement
not very efficient as it discards a lot of information about the control group (see the sketch after this slide)
Kernel based matching
each person in the treatment group is matched to a
weighted sum of individuals who have similar
propensity scores with greatest weight being given to
people with closer scores
Some kernel-based matching methods use ALL people in the non-treated group (e.g. Gaussian kernel) whereas others only use people within a certain user-specified probability bandwidth (e.g. Epanechnikov)
Choice of bandwidth involves a trade-off of bias with
precision
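A minimal nearest-neighbour matching sketch on the propensity score (my own illustration with simulated inputs; it matches with replacement and assumes the propensity score has already been estimated):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3_000

# Assumed inputs: estimated propensity scores, treatment status and outcomes
pscore = rng.uniform(0.1, 0.9, size=n)
D = (rng.uniform(size=n) < pscore).astype(int)
Y = 1.0 + 2.0 * D + 3.0 * pscore + rng.normal(size=n)   # true ATT = 2

treated = np.where(D == 1)[0]
controls = np.where(D == 0)[0]

# For each treated unit, find the control with the closest propensity score
gaps = np.abs(pscore[treated][:, None] - pscore[controls][None, :])
matches = controls[gaps.argmin(axis=1)]

# ATT: mean difference between treated outcomes and matched control outcomes
att = np.mean(Y[treated] - Y[matches])
print("ATT estimate:", att)
```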
Other methods
Radius matching
Caliper matching
Mahalanobis matching
Local linear regression matching
Spline matching…..
Matching an option?
Need very good data – otherwise highly likely selection on
unobservables
Common support – if some of treated cannot be matched
then definition of estimated parameter becomes unclear
Can also combine matching and DID methods - common
support more problematic if using repeated cross-section
Applications in epidemiology? If you have a well-designed pilot study with well-chosen control groups and rich survey data then it is usually a good approach (EMA evaluation in the UK)
Whether appropriate in other cases depends on questions
and data availability
Regression Discontinuity Design
Some deterministic rule means that some individuals below a threshold receive a treatment whereas those above do not
Look at differences in outcomes for those just
below and just above the threshold to look at
impact of treatment
Like a randomised controlled trial but only for a very specific group of individuals (UNLESS the effect is constant across all participants – untestable)
Example of RDD
Medical treatment given on basis of diagnostic test:
compare impact of treatment for those just above and
just below threshold
Date of birth and when you start school – children
born on 31 August start school one year earlier than
children born on 1 September – can look at whether
better to start school at age 4 or 5 in neighbourhood of
discontinuity
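A minimal sketch (my own illustration with simulated data; cutoff, effect size and bandwidth are assumptions) of the basic RDD comparison of outcomes just either side of the threshold:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 20_000

# Running variable z (e.g. a diagnostic score); treated if z falls below the cutoff
z = rng.uniform(-1, 1, size=n)
cutoff = 0.0
D = (z < cutoff).astype(int)

# Outcome depends smoothly on z, plus a jump of 1.0 for the treated
Y = 0.5 + 1.0 * D + 2.0 * z + rng.normal(scale=0.5, size=n)

# Compare means just below and just above the cutoff within a narrow bandwidth h
h = 0.05
below = Y[(z >= cutoff - h) & (z < cutoff)].mean()
above = Y[(z >= cutoff) & (z < cutoff + h)].mean()
print("RDD estimate:", below - above)   # close to 1.0; bandwidth trades bias for precision
```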
[Figure: proportion of pupils achieving the expected level at KS1, KS2 and KS3, plotted against day of birth (Nov 1990 – Aug 1992), shown separately for males and females]
Idea
The RDD uses the discontinuous dependence of D on z
at z*.
The variable z is an observable variable which can also have an independent effect on the outcome of interest, not just through its effect on D (unlike with the IV approach)
The RDD approach relies on continuity assumptions
namely:
DD1: E(b_i | z) as a function of z is continuous at z = z*
DD2: E(a_i | z) as a function of z is continuous at z = z*
DD3: The participation decision, D, is independent of the participation gain a_i in the neighbourhood of z*
What potential for RDD?
Major drawback of discontinuity design is its
dependence on discontinuous changes in odds of
participation dictated by the design of the policy
Means can only look at impact of policy at a certain
margin dictated by the discontinuity – generalisability
much more difficult without strong assumptions....
If the rule can be manipulated and/or if it changes behaviour then findings might be spurious – new diagnostic tests question a lot of early RDD findings
See the Lee and Lemieux NBER methodological paper
IV and Control Function
Not going to discuss
Control Function approach accounts for selection on
unobservables by treating the endogeneity of D as an
omitted variable problem
Requires exclusion restrictions and distributional
assumptions
The IV approach, like RDD, requires finding a policy accident/exogenous event that means some people get a treatment whilst others don’t. It assumes that the accident/exogenous event only impacts on the outcome through its effect on D
Untestable assumption
Conclusions
There are a number of options when evaluating whether something is effective, and I think the economic approach to evaluation could be used in epidemiology
Depends on nature of intervention, available data,
question you want to answer
Each method has advantages and disadvantages and involves assumptions that may or may not be credible, and all these factors have to be carefully assessed