Transcript: An Introduction to Impact Evaluation
Markus Goldstein, Poverty Reduction Group, The World Bank
My question is: Are we making an impact?
2 parts
• Monitoring, evaluation and impact evaluation
• The impact evaluation problem
• Introduce fertilizer example
What is M&E?

There is a difference between M and E!

Monitoring: The gathering of evidence to show what progress has been made in the implementation of programs. Focuses on inputs and outputs, but will often include outcomes as well.

Evaluation: Measuring changes in outcomes and evaluating the impact of specific interventions on those outcomes.
Monitoring
• Regular collection and reporting of information to track whether actual results are being achieved as planned
• Periodically collect data on the indicators and compare actual results with targets
• To identify bottlenecks and red flags (time-lags, fund flows)
• Point to what should be further investigated

[Chart: an indicator (%) tracked over years 1-5, comparing actual results against targets]
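The compare-actuals-to-targets step above can be sketched as a simple check. This is an illustrative sketch only; the indicator names, values and the 90% tolerance are invented for the example, not taken from any real project:

```python
from typing import Dict, List

# Hypothetical monitoring check: flag indicators falling short of target.
# Indicator names, values and the 90% tolerance are invented for illustration.
def red_flags(actuals: Dict[str, float], targets: Dict[str, float],
              tolerance: float = 0.9) -> List[str]:
    """Return indicators whose actual value is below tolerance * target."""
    return [name for name, actual in actuals.items()
            if actual < tolerance * targets[name]]

actuals = {"children_vaccinated_pct": 30, "fund_disbursement_pct": 85}
targets = {"children_vaccinated_pct": 50, "fund_disbursement_pct": 90}

print(red_flags(actuals, targets))  # only the vaccination indicator lags badly
```

In practice the tolerance would differ by indicator, and a flagged item points to what should be further investigated, not to an automatic conclusion.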
Evaluation
• Analyses why intended results were or were not achieved
• Explores unintended results
• Provides lessons learned and recommendations for improvement
• Analytical efforts to answer specific questions about the performance of program activities
• Oriented to answering WHY? and HOW?

[Chart: an indicator (%) tracked over years 1-5, with actual results falling short of targets]
Complementary roles for M&E

Monitoring
• Routine collection of information
• Tracking implementation progress
• Measuring efficiency
• “Is the project doing things right?”

Evaluation
• Ex-post assessment of effectiveness and impact
• Confirming (or not) project expectations
• Measuring impacts
• “Is the project doing the right things?”
monitoring
Understanding the different levels of indicators

IMPACT: Effect on living standards
 - infant and child mortality
 - prevalence of specific disease
OUTCOMES: Access, usage and satisfaction of users
 - number of children vaccinated
 - percentage within 5 km of health center
OUTPUTS: Goods and services generated
 - number of nurses
 - availability of medicine
INPUTS: Financial and physical resources
 - spending in primary health care
Selecting Indicators

The “CREAM” of Good Performance

A good performance indicator must be:
• Clear (precise and unambiguous)
• Relevant (appropriate to the subject at hand)
• Economic (available at reasonable cost)
• Adequate (must provide a sufficient basis to assess performance)
• Monitorable (must be amenable to independent validation)

Salvatore Schiavo-Campo 2000
Compare with SMART Indicators …
• Specific
• Measurable
• Attributable
• Realistic and relevant
• Time-bound
And some other thoughts on monitoring
• Information must be available in time for it to be put to use
• Think about the use you will put the information to when deciding what to collect
• Monitoring is not about the quantity of indicators, it is about their quality
evaluation
Thinking about types of evaluation
• “e” lies in between M and IE (impact evaluation)
• Analyzing existing information (baseline data, monitoring data)
• Drawing intermediate lessons
• Serves as a feedback loop into project design
• Useful for analyzing and understanding processes, not for establishing causality
Examples of non-impact evaluation approaches
• Non-comparative designs – no counterfactual required:
 – Case study
 – Rate of return analysis / present discounted value (e.g. by subprojects in CDD portfolio)
 – Process analysis (e.g. understanding how inputs translate into outputs)
 – Lot quality assurance
Examples of how “e” helps
Timely information to:
• Revise targeting: A watershed project found that 30% of the livelihood component (meant exclusively for marginal and landless farmers) was benefiting small and big farmers. This information was used to make mid-course corrections.
• Monitor progress: In a CDD project, several goats purchased with project funding died. This led to the introduction of livestock insurance as a prerequisite.
• Monitor the implementing agency: An NGO only built pit greenhouses (supply driven or demand driven?)
impact evaluation
Monitoring and IE

IMPACT: Effect on living standards (infant and child mortality, prevalence of specific disease). At this level, program impacts are confounded by local, national and global effects.
OUTCOMES: Access, usage and satisfaction of users (number of children vaccinated, percentage within 5 km of health center). This is where users meet service delivery.
OUTPUTS: Goods and services generated (number of nurses, availability of medicine). The gov’t/program production function.
INPUTS: Financial and physical resources (spending in primary health care).
Monitoring and IE

Moving up the results chain, from INPUTS to OUTPUTS to OUTCOMES to IMPACTS, the difficulty of showing causality increases.
Impact evaluation
• Impact evaluation goes by many names (e.g. Rossi et al. call it impact assessment), so know the concept, not just the label
• Impact is the difference between outcomes with the program and without it
• The goal of impact evaluation is to measure this difference in a way that attributes the difference to the program, and only the program
Why it matters
• We want to know if the program had an impact, and the average size of that impact
 – Understand if policies work
 – Justification for program (big $$)
 – Scale up or not: did it work?
• Compare different policy options within a program
• Meta-analyses: learning from others
• (With cost data) understand the net benefits of the program
• Understand the distribution of gains and losses
What we need
• The difference in outcomes with the program versus without the program, for the same unit of analysis (e.g. individual)
• Problem: individuals only have one existence
• Hence, we have a problem of a missing counterfactual, a problem of missing data
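A small simulation (with made-up numbers) makes the missing-data problem concrete: every unit has an outcome with the program and one without, but the data only ever reveal one of the two, so individual impacts are unobservable.

```python
import random
from statistics import mean

random.seed(0)

# Potential-outcomes sketch with invented numbers. Each farmer has a yield
# without the program (y0) and with it (y1); the impact for farmer i is
# y1[i] - y0[i], but real data only ever reveal one of the two.
n = 1000
true_effect = 5.0
y0 = [random.gauss(100, 10) for _ in range(n)]   # yield without the program
y1 = [y + true_effect for y in y0]               # yield with the program

treated = [random.random() < 0.5 for _ in range(n)]
observed = [y1[i] if treated[i] else y0[i] for i in range(n)]  # what we see

# The true average impact is knowable only because this is a simulation:
ate = mean(y1[i] - y0[i] for i in range(n))
print(round(ate, 6))  # 5.0 by construction; real data never show both outcomes
```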
Thinking about the counterfactual
• Why not compare individuals before and after (the reflexive)?
 – The rest of the world moves on, and you are not sure what was caused by the program and what by the rest of the world
• We need a control/comparison group that will allow us to attribute any change in the “treatment” group to the program (causality)
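The failure of the reflexive comparison can be sketched with assumed numbers: if a common shock hits everyone between the two survey rounds, the before/after difference mixes the program effect with the shock.

```python
import random
from statistics import mean

random.seed(1)

# Invented numbers: every farmer participates, the program adds +5 to yields,
# and a drought subtracts 20 from everyone between the two survey rounds.
n = 1000
program_effect, drought = 5.0, -20.0
before = [random.gauss(100, 10) for _ in range(n)]
after = [y + program_effect + drought for y in before]

# The reflexive (before/after) comparison attributes the whole change,
# program effect plus drought, to the program:
reflexive = mean(after) - mean(before)
print(round(reflexive, 1))  # -15.0, even though the program itself added +5
```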
comparison group issues
• Two central problems:
 – Programs are targeted: program areas will differ in observable and unobservable ways precisely because the program intended this
 – Individual participation is (usually) voluntary: participants will differ from non-participants in observable and unobservable ways
• Hence, a comparison of participants and an arbitrary group of non-participants can lead to heavily biased results
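The self-selection problem can be sketched with assumed numbers: suppose an unobserved trait raises outcomes directly and also makes participation more likely. Comparing participants with an arbitrary group of non-participants then overstates the impact.

```python
import random
from statistics import mean

random.seed(2)

# Invented selection-bias simulation: unobserved "ability" raises yields
# directly and also makes enrollment more likely, so participants differ
# from non-participants in a way the data do not record.
n = 10_000
true_effect = 5.0
ability = [random.gauss(0, 1) for _ in range(n)]
enrolled = [random.random() < (0.8 if a > 0 else 0.2) for a in ability]
yields = [100 + 10 * a + (true_effect if e else 0) + random.gauss(0, 5)
          for a, e in zip(ability, enrolled)]

# Naive comparison: participants vs an arbitrary group of non-participants.
naive = (mean(y for y, e in zip(yields, enrolled) if e)
         - mean(y for y, e in zip(yields, enrolled) if not e))
print(round(naive, 1))  # well above the true effect of 5: ability inflates the gap
```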
Example: providing fertilizer to farmers
• The intervention: provide fertilizer to farmers in a poor region of a country (call it region A)
 – Program targets poor areas
 – Farmers have to enroll at the local extension office to receive the fertilizer
 – Starts in 2002, ends in 2004; we have data on yields for farmers in the poor region and another region (region B) for both years
• We observe that the farmers we provide fertilizer to have a decrease in yields from 2002 to 2004
Did the program not work?
• Further study reveals there was a national drought, and everyone’s yields went down (failure of the reflexive comparison)
• So we compare the farmers in the program region to those in another region. We find that our “treatment” farmers have a larger decline than those in region B.
Did the program have a negative impact?
• Not necessarily (program placement):
 – Farmers in region B have better quality soil (unobservable)
 – Farmers in region B have more irrigation, which is key in this drought year (observable)
OK, so let’s compare the farmers in region A
• We compare “treatment” farmers with their neighbors. We think the soil is roughly the same. • Let’s say we observe that treatment farmers’ yields decline by less than comparison farmers.
Did the program work?
• Not necessarily. Farmers who went to register with the program may have more ability, and thus could manage the drought better than their neighbors, even if the fertilizer was irrelevant (individual unobservables).
• Now let’s say we observe no difference between the two groups.
Did the program not work?
– Not necessarily. What little rain there was caused the fertilizer to run off onto the neighbors’ fields. (spillover/contamination)
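The spillover story can be sketched with invented numbers: if the comparison group also benefits, the measured gap between the two groups shrinks toward zero even though the program worked.

```python
import random
from statistics import mean

random.seed(5)

# Invented spillover sketch: fertilizer runs off onto neighbors' fields, so
# the "comparison" farmers benefit too and the measured gap shrinks to zero
# even though the program raised everyone's yields.
n = 1000
direct_effect = spillover_effect = 5.0     # neighbors benefit just as much
base = [random.gauss(80, 10) for _ in range(n)]
treated_yields = [y + direct_effect for y in base[: n // 2]]
neighbor_yields = [y + spillover_effect for y in base[n // 2:]]

gap = mean(treated_yields) - mean(neighbor_yields)
print(round(gap, 1))  # near 0, despite a real +5 effect on both groups
```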
The comparison group
• In the end, with these naïve comparisons, we cannot tell if the program had an impact
• We need a comparison group that is as identical as possible, in observable and unobservable dimensions, to those receiving the program, and one that will not receive spillover benefits
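As an idealized sketch of what such a comparison group buys you: if program status is unrelated to everything else (here assigned by a coin flip, purely for illustration), the two groups are identical on average in observables and unobservables, and a simple difference in means recovers the impact.

```python
import random
from statistics import mean

random.seed(3)

# Same invented ability-driven world as the fertilizer story, but program
# status is now assigned by a coin flip, so ability is balanced across the
# two groups on average and the difference in means recovers the impact.
n = 10_000
true_effect = 5.0
ability = [random.gauss(0, 1) for _ in range(n)]
treated = [random.random() < 0.5 for _ in range(n)]
yields = [100 + 10 * a + (true_effect if t else 0) + random.gauss(0, 5)
          for a, t in zip(ability, treated)]

estimate = (mean(y for y, t in zip(yields, treated) if t)
            - mean(y for y, t in zip(yields, treated) if not t))
print(round(estimate, 1))  # close to the true effect of 5
```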
What difference do unobservables make? Microfinance in Thailand
• 2 NGOs in north-east Thailand
• Village banks with loans of 1,500 to 7,500 baht (roughly up to US$300)
• Borrowers (women) form peer groups, which guarantee individual borrowing
• What would we expect impacts to be?
Comparison group issues in this case:
• Program placement: villages which are selected for the program are different in observable and unobservable ways
• Individual self-selection: households which choose to participate in the program are different in observable and unobservable ways (e.g. entrepreneurship)
• Design solution: 2 groups of villages; in comparison villages, allow membership but no loans at first
Results from Coleman (JDE 1999); standard errors in parentheses; *, **, *** denote statistical significance:

Outcome                   Super naive       Naïve model    Non-FE model   FE model
Women’s land value        6916*** (1974)    121** (54.6)   87.5 (65.3)    42.5 (93.3)
Women’s self-emp sales    545* (295)        542* (296)     174 (364)      -10.7 (504)
Women’s ag sales          113* (59.9)       101* (59.5)    162 (73.9)     76.5 (101)

Controls (observed village characteristics, member observed and unobserved characteristics, member land 5 years ago, and, in the FE model, unobserved village characteristics) are added progressively moving from the super naive model to the FE model.
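The pattern in the table above, large naive estimates shrinking once village and member characteristics are accounted for, can be sketched in a simulation. The numbers and selection rules below are invented, not Coleman's data:

```python
import random
from statistics import mean

random.seed(4)

# Invented simulation of the pattern in the table: an unobserved village
# characteristic drives both program placement and outcomes, and unobserved
# "entrepreneurship" drives both membership and outcomes.
n_villages, n_hh = 200, 20
true_effect = 5.0
rows = []  # (program_village, member, outcome)
for v in range(n_villages):
    program_village = v < n_villages // 2            # placement is selective:
    village_quality = random.gauss(5 if program_village else -5, 5)
    for _ in range(n_hh):
        entrepreneurship = random.gauss(0, 5)
        member = random.random() < (0.7 if entrepreneurship > 0 else 0.3)
        y = (100 + village_quality + entrepreneurship
             + (true_effect if program_village and member else 0)
             + random.gauss(0, 5))
        rows.append((program_village, member, y))

# Naive: members in program villages vs all households in comparison villages.
naive = (mean(y for p, m, y in rows if p and m)
         - mean(y for p, m, y in rows if not p))

# Coleman-style design: members in comparison villages signed up before any
# loans, so the member vs non-member gap there is pure self-selection;
# netting it out removes both village and member unobservables.
def member_gap(program: bool) -> float:
    return (mean(y for p, m, y in rows if p == program and m)
            - mean(y for p, m, y in rows if p == program and not m))

estimate = member_gap(True) - member_gap(False)
print(round(naive, 1), round(estimate, 1))  # naive badly inflated; estimate near 5
```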
summing up
Step back and put it into context…
So … where do you begin?
• Clear objectives of the project (what is the problem?)
• Clear idea of how you will achieve the objectives (causal chain or storyline)
• Outcome focused:
 – Answer the question: what visible changes in behavior can be expected among end users as a result of the project, thus validating the causal chain?
Ideally at preparation stage
Design the “M, e and IE” Plan
• What? – Type of information and data to be consolidated
• How? – Procedures and approaches, including methods for data collection and analysis
• Why? – How the collected data will support monitoring and project management
• When? – Frequency of data collection and reporting
• Who? – Focal points, resource persons and responsibilities
Choose your tools and what they will cover
• Monitoring, a must – for key indicators
• Evaluation – to understand processes, analyze correlations
• Impact evaluation – where you want to establish causal effects