Lorraine Dearden
Director of ADMIN Node
Institute of Education
Email: [email protected]
Introduction
Give you a whirlwind tour of the economic approach to
evaluation
Can’t go into too much technical detail
Excellent new review article by Blundell and Costa Dias
(forthcoming Journal of Human Resources) – borrow heavily
from their exposition
But hope I get the essential ideas across so that you can judge
which (if any) of the approaches may be useful
Along the way give some of my initial thoughts on how
different approaches may be used
The Evaluation Problem
The question we want to answer is:
What is the effect of some treatment (D_i = 1) on some outcome of interest (Y_1i) compared to the outcome (Y_0i) if the treatment had not taken place (D_i = 0)?
Don’t observe the counterfactual
Fine if treatment is randomly assigned, but in a lot of
economic and epidemiological settings this is not the
case
The economic approach to evaluation involves methods that try to get around this selection problem
Selection Problem
Selection bias is caused by characteristics (observed (Z) and unobserved (v)) that affect both the decision to participate in the programme and its outcomes
If participants are systematically different from
non-participants with respect to such
characteristics, then the outcome observed for
non-participants does not represent a good
approximation to the counterfactual for
participants
Economic Evaluation Methods
Constructing the counterfactual in a convincing way is
the key requirement
Six distinct but related approaches attempt to deal with potential selection bias:
Social experiment methods
Natural experiments
Matching methods
Instrumental variable methods (not going to discuss)
Discontinuity design methods
Control function methods (not going to discuss)
Assignment to treatment
Selection into treatment at time k is assumed to be
made on the basis of an index function D*
D*_ik = Z_ik c + v_ik
where c is the vector of coefficients and v_ik is the unobservable term
The treatment status is then defined as
D_it = 1 if D*_ik > 0 and t > k
D_it = 0 otherwise
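A minimal simulation sketch of this latent-index assignment rule (my own illustration, not from the talk; the coefficient value and variable names are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Observed characteristic Z and unobservable v that drive selection
Z = rng.normal(size=n)
v = rng.normal(size=n)
c = 1.0                       # assumed coefficient on Z in the index function

# Latent index D* = Z c + v; treated once the index is positive
D_star = Z * c + v
D = (D_star > 0).astype(int)

print("Share treated:", D.mean())
```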
What are we trying to measure?
Express average parameters at time t > k at a particular value of Z_ik = z as:
Average treatment effect (ATE) for the population (if individuals were assigned at random to treatment)
a_ATE(z) = E(a_i | Z_ik = z)
Average treatment effect on the treated (ATT)
a_ATT(z) = E(a_i | Z_ik = z, D_it = 1) = E(a_i | Z_ik = z, v_ik > -zc)
Average treatment effect on the non-treated (ATNT)
a_ATNT(z) = E(a_i | Z_ik = z, D_it = 0) = E(a_i | Z_ik = z, v_ik < -zc)
All these parameters are identical if treatment effects are homogeneous
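A toy illustration (my own addition, not from the talk) of how these differ when effects are heterogeneous: suppose a_i = 2 for half the population and a_i = 0 for the other half, and only the high-gain half selects into treatment. Then ATT = 2 and ATNT = 0, while ATE = 1.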
Outcome equation
The potential outcome for individual i at time t is given by (ignoring other covariates (X) that impact on Y):
Y_1it = b + a_i + u_it   if D_it = 1
Y_0it = b + u_it   if D_it = 0
Hence we can write:
Y_it = b + a_i D_it + u_it
Collecting unobserved heterogeneity terms together:
Y_it = b + a D_it + [u_it + D_it (a_i - a)] = b + a D_it + e_it
where a is the ATE. Non-random selection occurs if e is correlated with D.
What does this mean?
This implies e is correlated with the regressors determining assignment (Z) and/or with the unobservable component in the selection equation (v)
Consequently there are two forms of selection:
Selection on the observables
Selection on the unobservables
With a homogeneous treatment effect, selection bias only occurs if D is correlated with u; with a heterogeneous treatment effect it can also arise if D is correlated with the idiosyncratic gain from treatment (a_i - a)
Different estimators use different assumptions about
assignment to identify the impact of the treatment
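One way to see where the bias enters (my own summary, using the notation above): the naive difference in mean outcomes decomposes into the ATT plus a selection-bias term.

```latex
\mathbb{E}[Y_{it}\mid D_{it}=1]-\mathbb{E}[Y_{it}\mid D_{it}=0]
  = \underbrace{\mathbb{E}[a_i\mid D_{it}=1]}_{\text{ATT}}
  + \underbrace{\mathbb{E}[u_{it}\mid D_{it}=1]-\mathbb{E}[u_{it}\mid D_{it}=0]}_{\text{selection bias}}
```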
Social Experiment
Closest to the theory free method of a clinical trial
Relies on the availability of a random assignment
rule
The assumptions required are:
R1: E[u_i | D_i = 1] = E[u_i | D_i = 0] = E[u_i]
R2: E[a_i | D_i = 1] = E[a_i | D_i = 0] = E[a_i]
If conditions hold, can identify the average effect
in the experimental population using OLS (ATE)
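A minimal sketch (my own illustration; the data-generating values are assumptions) of recovering the ATE by OLS when D is randomly assigned:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 5_000

# Random assignment, so R1 and R2 hold by construction
D = rng.integers(0, 2, size=n)
a_i = 2.0 + rng.normal(size=n)   # heterogeneous gains with ATE = 2
u = rng.normal(size=n)
Y = 1.0 + a_i * D + u

# OLS of Y on a constant and D: the D coefficient estimates the ATE
ols = sm.OLS(Y, sm.add_constant(D)).fit()
print(ols.params)                # D coefficient should be close to 2
```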
Natural Experiments: Difference in
Difference (DID) Estimator
DID approach uses a natural experiment to mimic the
randomisation of a social experiment
Natural experiment – some naturally occurring event which creates a policy shift for one group and not another
It may be a change in health policy in one jurisdiction but not
another
Or may refer to the eligibility of a certain group to a change in
health policy for which a similar group is ineligible
The difference in outcomes between the two groups
before and after the policy change gives the estimate of
the policy impact
Require longitudinal data or repeated cross section data
(where samples are drawn from the same population)
before and after the intervention
DID Estimator
Rewrite the outcome equation as:
Y_it = b + a_i D_it + u_it = b + a_i D_it + η_i + m_t + ε_it
i.e. u is decomposed into three terms: an unobserved fixed effect (η_i), an aggregate macro (time) shock (m_t) and an idiosyncratic transitory shock (ε_it)
The main assumption underlying DID is that selection into treatment is independent of the transitory shock:
DID: E(u_it | D_it) = E(η_i | D_it) + m_t
that is, R1 holds in first differences
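A minimal sketch (my own illustration with simulated data; the group structure and effect size are assumptions) of the basic two-group, two-period DID calculation:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 4_000

group = rng.integers(0, 2, size=n)    # 1 = group affected by the policy
post = rng.integers(0, 2, size=n)     # 1 = observed after the policy change
effect = 1.5                          # assumed true policy impact
Y = 0.5 + 2.0 * group + 0.8 * post + effect * group * post + rng.normal(size=n)

df = pd.DataFrame({"Y": Y, "group": group, "post": post})
means = df.groupby(["group", "post"])["Y"].mean()

# DID: (treated after - treated before) - (control after - control before)
did = (means.loc[(1, 1)] - means.loc[(1, 0)]) - (means.loc[(0, 1)] - means.loc[(0, 0)])
print("DID estimate:", did)
```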
DID estimator
Measures ATT
Doesn’t rule out selection on unobservables as long as these are fixed over time
The DID estimator is just the first-difference estimator commonly used with panel data in the presence of fixed effects
Problems arise if there is selection on the idiosyncratic temporary shock, if macro effects are not common across groups, or if there are compositional changes over time (repeated cross-sections)
But may have applications, for instance a postcode lottery in health services, or the abolition/introduction of a health programme or service affecting a subgroup of the population
Matching Methods
Assumes all selection is based on observable characteristics/matching variables (X) that you have in your data
OLS is a form of matching and will give you the
ATT=ATE=ATNT if the X
(i) are unaffected by the treatment
(ii) contain all the variables that influence both the
participation decision and the outcome of interest
(iii) there is common support (all values of X are
observed amongst treated and non-treated)
Can use more flexible regression methods, so if the effect of the X's is heterogeneous (testable) then ATT ≠ ATNT ≠ ATE
Propensity Score Matching
Regression approaches are a form of matching
approach
Propensity score matching is another matching
approach
Shares a number of assumptions with regression based
approaches
A lot more flexible but also much more
computationally expensive
Assumptions
Matching is based on the following assumption
M1: Conditional Independence Assumption (CIA) – conditional on the set of observables X, the non-treated outcomes are independent of participation status, i.e.
Y_0i ⊥ D_i | X_i
Assumption M1 implies a conditional version of R1
E[u_i | D_i, X_i] = E[u_i | X_i]
Slightly stronger assumption needed to get ATE
Don’t need an equivalent of R2 to identify ATT as selection on the
unobservable gains is accommodated by matching but do need one
more assumption – that each treated observation can be reproduced
amongst the non-treated
Common Support
M2: All treated individuals have a counterpart in the non-treated population and anyone constitutes a possible participant
So S, the common support for X, is the part of the distribution of X represented in both groups
All individuals in the treatment group for whom there
is not common support are excluded from the
matching estimate
Matching
Involves selecting from the non-treated pool a control
group in which the distribution of observed variables is as
similar as possible to the distribution in the treated group
(by coming up with a set of weights for the control group to
make it look like the treatment group)
There are a number of ways of doing this but they almost always involve calculating the propensity score p_i(x) = Pr{D = 1 | X = x}
Drop any individuals in treatment group who have
propensity score greater than maximum in control group
(to ensure common support)
The propensity score
The propensity score is the probability of being in the
treatment group given you have characteristics X=x
How do you do this?
Use parametric methods (e.g. logit or probit) and
estimate the probability of a person being in the
treatment group for all individuals in the treatment
and non-treatment groups
Rather than matching on the basis of ALL the X’s, can match on the basis of this propensity score (Rosenbaum and Rubin (1983))
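A minimal sketch (my own illustration; variable names and coefficients are assumptions) of estimating the propensity score with a logit:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 3_000

# Observed matching variables X and treatment indicator D
X = rng.normal(size=(n, 2))
D = (0.8 * X[:, 0] - 0.5 * X[:, 1] + rng.logistic(size=n) > 0).astype(int)

# Logit of D on X gives each person's estimated probability of treatment
logit = sm.Logit(D, sm.add_constant(X)).fit(disp=0)
pscore = logit.predict(sm.add_constant(X))
print(pscore[:5])
```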
How do we match?
All matching methods come up with a way of
reweighting the control group
ATT is the difference in the mean outcome in the two
groups (appropriately weighted)
Nearest neighbour matching
for each person in the treatment group, choose the individual(s) in the control group with the closest propensity score
can do this with (most common) or without replacement
not very efficient as it discards a lot of information about the control group (see the sketch after this slide)
Kernel based matching
each person in the treatment group is matched to a
weighted sum of individuals who have similar
propensity scores with greatest weight being given to
people with closer scores
Some kernel-based matching methods use ALL people in the non-treated group (e.g. Gaussian kernel) whereas others only use people within a certain user-specified probability bandwidth (e.g. Epanechnikov)
Choice of bandwidth involves a trade-off of bias with
precision
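A minimal nearest-neighbour matching sketch on the propensity score (my own illustration with simulated inputs; it matches with replacement and assumes the propensity score has already been estimated):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3_000

# Assumed inputs: estimated propensity scores, treatment status and outcomes
pscore = rng.uniform(0.1, 0.9, size=n)
D = (rng.uniform(size=n) < pscore).astype(int)
Y = 1.0 + 2.0 * D + 3.0 * pscore + rng.normal(size=n)   # true ATT = 2

treated = np.where(D == 1)[0]
controls = np.where(D == 0)[0]

# For each treated unit, find the control with the closest propensity score
gaps = np.abs(pscore[treated][:, None] - pscore[controls][None, :])
matches = controls[gaps.argmin(axis=1)]

# ATT: mean difference between treated outcomes and matched control outcomes
att = np.mean(Y[treated] - Y[matches])
print("ATT estimate:", att)
```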
Other methods
Radius matching
Caliper matching
Mahalanobis matching
Local linear regression matching
Spline matching…..
Matching an option?
Need very good data – otherwise highly likely selection on
unobservables
Common support – if some of treated cannot be matched
then definition of estimated parameter becomes unclear
Can also combine matching and DID methods - common
support more problematic if using repeated cross-section
Applications in epidemiology? If you have a well-designed pilot study with well-chosen control groups and rich survey data then it is usually a good approach (EMA evaluation in the UK)
Whether appropriate in other cases depends on questions
and data availability
Regression Discontinuity Design
Some deterministic rule means that some individuals below a threshold receive a treatment whereas those above do not
Look at differences in outcomes for those just
below and just above the threshold to look at
impact of treatment
Like a randomised controlled trial but only for a very specific group of individuals (UNLESS the effect is constant across all participants – untestable)
Example of RDD
Medical treatment given on basis of diagnostic test:
compare impact of treatment for those just above and
just below threshold
Date of birth and when you start school – children
born on 31 August start school one year earlier than
children born on 1 September – can look at whether
better to start school at age 4 or 5 in neighbourhood of
discontinuity
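A minimal sketch (my own illustration with simulated data; cutoff, effect size and bandwidth are assumptions) of the basic RDD comparison of outcomes just either side of the threshold:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 20_000

# Running variable z (e.g. a diagnostic score); treated if z falls below the cutoff
z = rng.uniform(-1, 1, size=n)
cutoff = 0.0
D = (z < cutoff).astype(int)

# Outcome depends smoothly on z, plus a jump of 1.0 for the treated
Y = 0.5 + 1.0 * D + 2.0 * z + rng.normal(scale=0.5, size=n)

# Compare means just below and just above the cutoff within a narrow bandwidth h
h = 0.05
below = Y[(z >= cutoff - h) & (z < cutoff)].mean()
above = Y[(z >= cutoff) & (z < cutoff + h)].mean()
print("RDD estimate:", below - above)   # close to 1.0; bandwidth trades bias for precision
```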
[Figure: proportion of pupils achieving the expected level at KS1, KS2 and KS3, plotted against day of birth (Nov 1990 – Aug 1992), shown separately for males and females]
Idea
The RDD uses the discontinuous dependence of D on z
at z*.
The variable z is an observable variable which can also have an independent effect on the outcome of interest, not just through its effect on D (unlike with the IV approach)
The RDD approach relies on continuity assumptions
namely:
DD1: E(b_i | z) as a function of z is continuous at z = z*
DD2: E(a_i | z) as a function of z is continuous at z = z*
DD3: The participation decision, D, is independent of the participation gain a_i in the neighbourhood of z*
What potential for RDD?
Major drawback of discontinuity design is its
dependence on discontinuous changes in odds of
participation dictated by the design of the policy
Means can only look at impact of policy at a certain
margin dictated by the discontinuity – generalisability
much more difficult without strong assumptions....
If the rule can be manipulated and/or if it changes behaviour then findings might be spurious – new diagnostic tests question a lot of early RDD findings
See the Lee and Lemieux NBER methodological paper
IV and Control Function
Not going to discuss
Control Function approach accounts for selection on
unobservables by treating the endogeneity of D as an
omitted variable problem
Requires exclusion restrictions and distributional
assumptions
The IV approach, like RDD, requires finding a policy accident/exogenous event that means some people get a treatment whilst others don’t. It assumes that the accident/exogenous event only impacts on the outcome through its effect on D
Untestable assumption
Conclusions
There are a number of options when evaluating whether something is effective, and I think the economic approach to evaluation could be used in epidemiology
Depends on nature of intervention, available data,
question you want to answer
Each method has advantages and disadvantages and involves assumptions that may or may not be credible, and all these factors have to be carefully assessed