
An introduction to Impact Evaluation

Markus Goldstein Poverty Reduction Group The World Bank

My question is: Are we making an impact?

2 parts

• Monitoring, evaluation and impact evaluation
• The impact evaluation problem
• Introduce fertilizer example

What is M&E?

There is a difference between M and E!

Monitoring: The gathering of evidence to show what progress has been made in the implementation of programs. Focuses on inputs and outputs, but will often include outcomes as well.

Evaluation: Measuring changes in outcomes and evaluating the impact of specific interventions on those outcomes.

Monitoring

• Regular collection and reporting of information to track whether actual results are being achieved as planned
• Periodically collect data on the indicators and compare actual results with targets
• Identify bottlenecks and red flags (time lags, fund flows)
• Point to what should be further investigated

(Chart: an indicator tracked over years 1-5, with an actual result of 20% against targets of 30-50%.)
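The compare-actuals-with-targets loop above is mechanical enough to sketch in code. A minimal illustration, where the indicator names, values and targets are all invented for the example:

```python
# Minimal sketch of the monitoring loop: compare actual indicator values
# with their targets and flag shortfalls for further investigation.
# Indicator names and all numbers are invented for illustration.

def red_flags(indicators):
    """Return names of indicators whose actual value falls short of its target."""
    return [name for name, (actual, target) in indicators.items()
            if actual < target]

indicators = {
    "children vaccinated (%)": (20, 30),   # (actual, target)
    "nurses per health center": (4, 4),
    "funds disbursed (%)": (55, 80),
}

flags = red_flags(indicators)
# flags -> ["children vaccinated (%)", "funds disbursed (%)"]
```

The flagged indicators are exactly the "point to what should be further investigated" output of the slide: the list says where to look, not why the shortfall happened.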

Evaluation

• Analyzes why intended results were or were not achieved
• Explores unintended results
• Provides lessons learned and recommendations for improvement
• Analytical efforts to answer specific questions about the performance of program activities
• Oriented to answering WHY? and HOW?

(Chart: the same indicator over years 1-5, with an actual result of 15% against targets of 30-50%.)

Complementary roles for M&E

Monitoring

• Routine collection of information • Tracking implementation progress • Measuring efficiency

Evaluation

• Ex-post assessment of effectiveness and impact • Confirming (or not) project expectations • Measuring impacts

Monitoring asks: "Is the project doing things right?" Evaluation asks: "Is the project doing the right things?"

monitoring

Understanding the different levels of indicators

IMPACT: effect on living standards
  - infant and child mortality
  - prevalence of specific diseases

OUTCOMES: access, usage and satisfaction of users
  - number of children vaccinated
  - percentage within 5 km of a health center

OUTPUTS: goods and services generated
  - number of nurses
  - availability of medicine

INPUTS: financial and physical resources
  - spending on primary health care

Selecting Indicators

The CREAM of Good Performance

A good performance indicator must be:

• Clear (precise and unambiguous)
• Relevant (appropriate to the subject at hand)
• Economic (available at reasonable cost)
• Adequate (must provide a sufficient basis to assess performance)
• Monitorable (must be amenable to independent validation)

(Salvatore Schiavo-Campo, 2000)

Compare with SMART Indicators …

• Specific
• Measurable
• Attributable
• Realistic and relevant
• Time-bound

And some other thoughts on monitoring

• Information must be available in time for it to be put to use
• Think about the use you will put the information to when deciding what to collect
• Monitoring is not about the quantity of indicators; it is about their quality

evaluation

Thinking about types of evaluation

• "e" lies in between M and IE (impact evaluation)
• Analyzing existing information (baseline data, monitoring data)
• Drawing intermediate lessons
• Serves as a feedback loop into project design
• Useful for analyzing and understanding processes, not for establishing causality

Examples of non-impact evaluation approaches

• Non-comparative designs (no counterfactual required):
  – Case study
  – Rate of return analysis / present discounted value (e.g. by subprojects in a CDD portfolio)
  – Process analysis (e.g. understanding how inputs translate into outputs)
  – Lot quality assurance
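The present-discounted-value approach in the list above reduces to one formula: discount each year's net benefit back to today and sum. A hypothetical sketch, with cash flows and discount rate invented for the example:

```python
# Sketch of rate-of-return / present-discounted-value analysis: discount a
# stream of subproject net benefits back to today. All figures are invented.

def present_value(cash_flows, rate):
    """PDV of cash_flows[t] received t years from now, at a yearly discount rate."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

# A subproject costing 100 today that yields 40 per year for four years,
# discounted at 10%: a positive NPV, so it passes this (non-impact) test.
npv = present_value([-100, 40, 40, 40, 40], 0.10)
# npv -> about 26.8
```

Note that this is exactly the sense in which such designs need no counterfactual: the calculation takes the benefit stream as given rather than asking what would have happened without the subproject.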

Examples of how “e” helps

Timely information to:

• Revise targeting: a watershed project found that 30% of the livelihood component (meant exclusively for marginal and landless farmers) was benefiting small-to-big farmers. This information was used to make some mid-course corrections.
• Monitor progress: in a CDD project, several goats purchased with project funding died. This led to the introduction of livestock insurance as a prerequisite.
• Monitor the implementing agency: an NGO only built pit greenhouses (supply driven or demand driven?)

impact evaluation

Monitoring and IE

IMPACT: effect on living standards (infant and child mortality, prevalence of specific diseases). Program impacts are confounded by local, national and global effects.

OUTCOMES: access, usage and satisfaction of users (number of children vaccinated, percentage within 5 km of a health center). This is where users meet service delivery.

OUTPUTS: goods and services generated (number of nurses, availability of medicine). Inputs to outputs is the gov't/program production function.

INPUTS: financial and physical resources (spending on primary health care).

Monitoring and IE

INPUTS -> OUTPUTS -> OUTCOMES -> IMPACTS: the difficulty of showing causality increases as you move up this chain.

Impact evaluation

• Impact evaluation goes by many names (e.g. Rossi et al. call this "impact assessment"), so you need to know the concept, not just the label.

• Impact is the difference between outcomes with the program and without it
• The goal of impact evaluation is to measure this difference in a way that attributes the difference to the program, and only the program

Why it matters

• We want to know if the program had an impact and the average size of that impact
  – Understand if policies work
    • Justification for program (big $$)
    • Scale up or not: did it work?
  – Compare different policy options within a program
  – Meta-analyses: learning from others
  – (With cost data) understand the net benefits of the program
  – Understand the distribution of gains and losses

What we need

• The difference in outcomes with the program versus without the program, for the same unit of analysis (e.g. an individual)
• Problem: individuals only have one existence
• Hence, we have a missing counterfactual: a problem of missing data

Thinking about the counterfactual

• Why not compare individuals before and after (the reflexive comparison)?
  – The rest of the world moves on, and you cannot be sure what was caused by the program and what by the rest of the world
• We need a control/comparison group that allows us to attribute any change in the "treatment" group to the program (causality)
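One standard way to use such a comparison group, not named on the slide but implicit in the fertilizer example that follows, is a difference-in-differences calculation: subtract the comparison group's before/after change from the treatment group's, so shocks common to both (like a drought) cancel. A sketch with invented yields:

```python
# Difference-in-differences sketch: the comparison group's change stands in
# for what would have happened to the treated group without the program.
# All yield figures (tons/ha) are invented for illustration.

treat_before, treat_after = 2.0, 1.8   # treated farmers' mean yields
comp_before, comp_after = 2.1, 1.6     # comparison farmers' mean yields

# Reflexive (before/after only) comparison: looks like a negative impact.
reflexive = treat_after - treat_before           # -0.2

# Netting out the common shock via the comparison group's change:
did = (treat_after - treat_before) - (comp_after - comp_before)
# did -> about +0.3: positive once the drought is netted out
```

The arithmetic only identifies the impact if the comparison group would have changed like the treated group absent the program, which is exactly the assumption the next slides show can fail.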

comparison group issues

• Two central problems:
  – Programs are targeted: program areas will differ in observable and unobservable ways precisely because the program intended this
  – Individual participation is (usually) voluntary: participants will differ from non-participants in observable and unobservable ways
• Hence, a comparison of participants and an arbitrary group of non-participants can lead to heavily biased results

Example: providing fertilizer to farmers

• The intervention: provide fertilizer to farmers in a poor region of a country (call it region A)
  – Program targets poor areas
  – Farmers have to enroll at the local extension office to receive the fertilizer
  – Starts in 2002, ends in 2004; we have data on yields for farmers in the poor region and another region (region B) for both years
• We observe that the farmers we provided fertilizer to have a decrease in yields from 2002 to 2004

Did the program not work?

• Further study reveals there was a national drought, and everyone's yields went down (failure of the reflexive comparison)
• We compare the farmers in the program region to those in region B. We find that our "treatment" farmers have a larger decline than those in region B.

Did the program have a negative impact?

– Not necessarily (program placement)
  • Farmers in region B have better quality soil (unobservable)
  • Farmers in region B have more irrigation, which is key in this drought year (observable)

OK, so let’s compare the farmers in region A

• We compare "treatment" farmers with their neighbors. We think the soil is roughly the same.
• Let's say we observe that treatment farmers' yields decline by less than comparison farmers'.

Did the program work?

– Not necessarily. Farmers who went to register with the program may have more ability, and thus could manage the drought better than their neighbors, while the fertilizer was irrelevant (individual unobservables).
• Now let's say we observe no difference between the two groups.

Did the program not work?

– Not necessarily. What little rain there was caused the fertilizer to run off onto the neighbors’ fields. (spillover/contamination)

The comparison group

• In the end, with these naïve comparisons, we cannot tell whether the program had an impact
• We need a comparison group that is as identical as possible, in observable and unobservable dimensions, to those receiving the program, and that will not receive spillover benefits
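The self-selection bias described in these slides is easy to demonstrate by simulation. In this invented setup, higher-ability farmers both enroll more and farm better, so a naive participant vs. non-participant comparison badly overstates a true effect of 1.0:

```python
# Simulation of selection bias: ability drives both enrollment and yields,
# so comparing participants to non-participants confounds the program
# effect with the ability gap. All parameters are invented for illustration.
import random

random.seed(0)
TRUE_EFFECT = 1.0

enrolled_yields, other_yields = [], []
for _ in range(10_000):
    ability = random.gauss(0, 1)
    enrolls = ability > 0                     # self-selection on ability
    y = 5 + 2 * ability + (TRUE_EFFECT if enrolls else 0)
    (enrolled_yields if enrolls else other_yields).append(y)

naive = (sum(enrolled_yields) / len(enrolled_yields)
         - sum(other_yields) / len(other_yields))
# naive is roughly 4.2: TRUE_EFFECT plus 2 * the ability gap between
# enrollees and non-enrollees, i.e. several times the true effect of 1.0
```

Had enrollment been assigned at random rather than driven by ability, the same comparison would have recovered approximately 1.0, which is the logic behind the comparison-group designs discussed next.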

What difference do unobservables make? Microfinance in Thailand

• 2 NGOs in north-east Thailand
• Village banks with loans of 1,500-7,500 baht (up to about US$300)
• Borrowers (women) form peer groups, which guarantee individual borrowing
• What would we expect impacts to be?

Comparison group issues in this case:

• Program placement: villages which are selected for the program are different in observable and unobservable ways
• Individual self-selection: households which choose to participate in the program are different in observable and unobservable ways (e.g. entrepreneurship)
• Design solution: 2 groups of villages; in comparison villages, allow membership but no loans at first

Results from Coleman (JDE 1999): impacts of village bank membership on women's land value, self-employment sales and agricultural sales, estimated under four specifications (super naïve, naïve, non-FE and FE) that progressively add controls for observed village characteristics, member observed and unobserved characteristics, member land 5 years ago, and unobserved village characteristics. Estimates that appear large and statistically significant in the naïve specifications (e.g. 6916*** (1974)) shrink to small and insignificant in the fixed-effects model: 42.5 (93.3), -10.7 (504) and 76.5 (101), standard errors in parentheses.

summing up

Step back and put it into context…

So … where do you begin?

• Clear objectives of the project (what is the problem?)
• Clear idea of how you will achieve the objectives (causal chain or storyline)
• Outcome focused:
  – Answer the question: what visible changes in behavior can be expected among end users as a result of the project, thus validating the causal chain?

Ideally at preparation stage

Design the “M, e and IE” Plan

• What? Type of information and data to be consolidated
• How? Procedures and approaches, including methods for data collection and analysis
• Why? How the collected data will support monitoring and project management
• When? Frequency of data collection and reporting
• Who? Focal points, resource persons and responsibilities

Choose your tools and what they will cover

• Monitoring: a must, for key indicators
• Evaluation: to understand processes and analyze correlations
• Impact evaluation: where you want to establish causal effects

Thank you