Can we obtain the required rigour
without randomisation?
Oxfam GB’s non-experimental Global
Performance Framework
Karl Hughes
Global Programme Effectiveness Advisor, Oxfam GB
3ie, Development Initiative and BOND Impact Evaluation
Workshop
24-25 May 2011
Presentation Outline
1. Sidestepping the Global Outcome Indicator
Bandwagon (for a second time)
2. A Workable Compromise: OGB’s Global
Performance Framework
3. Two Reputable Approaches to Causal
Inference
4. Experiences from the Pilots
5. Future Plans and Possibilities
1. Sidestepping the Global Outcome
Indicator Bandwagon (for a second time)
How to demonstrate effectiveness
ineffectively and at great cost:
1. Work with staff, partners, “beneficiaries,” and all other
relevant stakeholders to identify a comprehensive suite of
outcome/impact indicators that can be applied globally in a
transparent and participatory manner.
2. Develop data collection instruments to capture quality data
on the status of each indicator and have all relevant
programmes apply them on an annual basis.
3. Annually aggregate such data so that the status of each
indicator can be tracked over time.
4. Collaborate with your organisation’s media unit to produce
powerful and convincing impact reports, and seek to publish
key findings in the Journal of Development Effectiveness.
The Great Indicator Experiment
2006–2008
• Oxfam GB jumps on the global outcome indicator bandwagon.
• 34 indicators proposed, spread over the organisation's five major strategic aims.
• 10 projects identified to field test the indicators.
• After Year 1, only half of these projects ended up collecting data.
• The Indicator Feasibility Study abandoned, given a senior management steer to direct evaluation efforts to other organisational priorities.
“These indicators were
meant to measure the
outcomes and impact of
the majority of our
programmes in a wide
range of contexts, and
could be synthesized and
further analyzed to
obtain a more aggregate
picture of Oxfam’s
impact as an
organisation.” (Shroff
and Stevenson 2008)
But It Ain’t Just about Indicators!
2009-10
• Only more intense pressure to demonstrate effectiveness.
• But how with over 250 programmes and 1,200 associated
projects operating in 60 countries across 5 thematic areas?
2010a
• Surprise! The global indicator strategy rears its ugly head yet again.
• However, this time things would be different – get those in
power to develop the indicators and limit their number.
2010b
• The space opened up to advise on the indicator development process was fully exploited.
• “But It Ain’t Just about Indicators” internal advocacy campaign
achieves considerable success.
Thematic Area / Outcome Indicator
• Humanitarian Support: % of people who received humanitarian support from responses meeting established standards for excellence, disaggregated by sex
• Disaster Risk Reduction/Climate Change Adaptation: % of targeted households indicating positive ability to minimise risk from shocks and adapt to emerging trends & uncertainty
• Livelihoods Support: % of targeted households living on more than £1.00 per day per capita
• Women's Empowerment: % of supported women meaningfully involved in household decision-making and influencing affairs at community and enterprise levels
• Popular Mobilisation (Citizen's Voice): % of targeted state institutions and other actors that have modified their practices in response to engagement with supported citizens, CBOs/CSOs
• Policy Influencing: % of policy objectives/outcomes successfully achieved, disaggregated by thematic area
2. A Workable Compromise: OGB’s
Global Performance Framework
What to Do?
Competing demands:
• Need for rigorous methods
• Rigorously evaluating all projects not feasible
• Need for aggregation
• Avoiding biased selection ("cherry picking") – hence "Effectiveness Auditing"
The Global Performance Framework (GPF):
• Global Output Reporting (GOR) – demonstrating the scale and much of the diversity of what we do (all relevant projects)
• Effectiveness Audits – demonstrating our effectiveness (randomly selected "mature" projects)
What will we be able to say?
• Example:
2011 saw the successful delivery of 51 OGB-supported livelihood and value-chain enhancement initiatives.
These provided vital income generation support to 242,454 people (over 156,000 of whom were women) from 36 low-income countries.
Six of these projects were randomly selected
and then rigorously evaluated, and – overall –
they were found to have improved household
income by over 25%.
3. Two Reputable Approaches to
Causal Inference
• Service delivery and innovative interventions (large n interventions)
• Capacity building and policy influencing interventions (small n interventions)
Same challenge in both cases.
The Two Approaches:
Causal inference (CI) via the counterfactual/potential outcomes framework:
• Dominant approach to CI for large n programmes; widely used in many fields.
• Use controls/comparators to estimate what would have happened with no intervention.
• RCT often the best means, but other options can "mimic" what it does.
CI via evidencing mechanisms:
• Many qual. scholars argue that CI is not limited to the counterfactual framework.
• One strong approach: evidencing how the cause generated the effect – e.g. process tracing.
• Inductive qual. work to generate hypotheses; deductive qual. work to test which is supported by the data.
• Theory-based approaches of large n programmes can fit here.
The two approaches are complementary – CI is stronger with a good estimate of the counterfactual and evidence of how the purported cause generated the outcome.
4. Experiences from the Pilots
Pilot 1: Somaliland Cross Border Programme
• Seeks to increase resilience to drought for agro-pastoralists living in 6 villages near the border with Ethiopia (approximately 36,000 people) by:
1. Increasing availability of and access to
water resources.
2. Increasing availability of and access to
pasture.
3. Improving livestock health.
4. Enhancing community drought
preparedness capacity.
Approach for Large n Programmes:
Single Difference Ex-post with Baseline Recall using PSM
1. Purposively identify suitable comparison
population, e.g. HHs in close but not too
close villages.
2. Adapt generic questionnaires, e.g.
formative qual. to identify locally relevant
wealth assets and integrate “exposure”
questions.
3. Administer questionnaires in intervention and comparison sites to random samples of units – 50% more in comparison villages.
4. Use propensity score matching so that data from comparison sites approximate the counterfactual for intervention units – to estimate the ATT (a rough sketch of this step follows below).
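To make step 4 concrete, here is a minimal sketch in Python of the kind of propensity score matching estimate described above. It is an illustration under stated assumptions, not the study's actual code: the column names ("intervention", "hh_characteristic_score", the covariate list) are hypothetical, and the study also used kernel matching and matching without replacement, not only nearest-neighbour matching.

```python
# Minimal sketch of step 4: one-to-one nearest-neighbour matching on the
# propensity score to estimate the ATT. Column names are hypothetical and
# chosen for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression


def att_psm(df, treat_col, outcome_col, covariates, caliper=0.05):
    """Estimate the average treatment effect on the treated (ATT)."""
    X = df[covariates].values
    t = df[treat_col].values.astype(bool)

    # 1. Model the probability of being in an intervention village.
    pscore = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]
    treated = df[t].assign(pscore=pscore[t])
    control = df[~t].assign(pscore=pscore[~t])

    # 2. Match each intervention household to the nearest comparison household
    #    on the propensity score, discarding matches outside the caliper.
    treated_y, matched_y = [], []
    for _, row in treated.iterrows():
        gaps = (control["pscore"] - row["pscore"]).abs()
        if gaps.min() <= caliper:
            treated_y.append(row[outcome_col])
            matched_y.append(control.loc[gaps.idxmin(), outcome_col])

    # 3. ATT = mean outcome of matched treated units minus their matched comparisons.
    return np.mean(treated_y) - np.mean(matched_y)


# Hypothetical usage:
# att = att_psm(survey, "intervention", "hh_characteristic_score",
#               ["hh_size", "dependency_ratio", "head_primary_education"])
```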
Descriptive Statistics: Intervention and Comparison Villages – Somaliland DRR Project

Variable | Intervention | Comparison | Difference | t-stat.
Female headed household | 0.19 | 0.15 | 0.0448 | 1.07
Elderly headed household | 0.01 | 0.01 | 0.00498 | 0.41
Single adult household | 0.04 | 0.03 | 0.00746 | 0.37
Household head widow | 0.10 | 0.10 | 0.00498 | 0.15
Household head widow and female | 0.08 | 0.09 | -0.00746 | -0.24
Mean HH size | 6.41 | 6.39 | 0.0174 | 0.07
Mean number of productive adults | 2.71 | 2.75 | -0.0373 | -0.26
Mean dependency ratio | 1.59 | 1.60 | -0.0158 | -0.13
Mean age of household head | 43.93 | 44.36 | -0.433 | -0.31
Household head has at least prim. education | 0.01 | 0.06 | -0.0498* | -2.17
Mean number of adults with sec. education | 0.01 | 0.04 | -0.0373 | -1.79
Primary crop production (baseline) | 0.89 | 0.87 | 0.0174 | 0.48
Value added crop production (baseline) | 0.01 | 0.00 | 0.00746 | 1.23
Livestock rearing (baseline) | 0.96 | 0.96 | 0 | 0.00
Livestock product production (baseline) | 0.44 | 0.48 | -0.0423 | -0.76
Hunting and gathering (baseline) | 0.00 | 0.00 | 0 | .
Off-farm business (baseline) | 0.07 | 0.07 | 0 | 0.00
Casual labour (baseline) | 0.04 | 0.01 | 0.0224 | 1.31
Seasonal wage labour (baseline) | 0.00 | 0.00 | -0.00498 | -0.82
Full time wage labour (baseline) | 0.01 | 0.03 | -0.0224 | -1.40
HH receives aid (baseline) | 0.00 | 0.00 | 0 | .
HH receives remittances (baseline) | 0.01 | 0.00 | 0.00746 | 1.23
HH supported by community (baseline) | 0.01 | 0.00 | 0.00249 | 0.29
HH does other livelihood activities (baseline) | 0.00 | 0.00 | 0 | .
HH < 50 km from trading centre | 0.19 | 0.75 | -0.552*** | -11.76
HH < 40 km from Hargeisa | 0.00 | 0.44 | -0.438*** | -10.18
HH in sparsely populated rural area | 1.00 | 0.79 | 0.209*** | 5.93
HH Asset Index Score | -0.09 | 0.06 | -0.153 | -0.72
Overall HH characteristic score (baseline) | 33.06 | 33.06 | -0.00498 | -0.01
HH livelihood viability score (baseline) | 16.91 | 16.61 | 0.299 | 1.08
HH contingency resource/support score (baseline) | 5.20 | 5.28 | -0.0771 | -0.74
HH natural resource access/mgt. score (baseline) | 10.95 | 11.17 | -0.226 | -1.17
Observations | 134 | 201 | 335 (total) |
Problem Encountered
• Location-related differences were considerable – HHs in the intervention villages were systematically more remote.
• Matching on these variables would have resulted in a considerable loss of data.
• Fortunately, the variables relating to distance from trading centres and from Hargeisa were not correlated with the outcome variables examined (a check of this kind is sketched below).
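A check of this kind can be done very simply; the sketch below is illustrative only, with hypothetical column names standing in for the actual distance and outcome variables in the survey data.

```python
# Illustrative check that the location variables excluded from the matching
# model are not associated with the outcomes. Column names are hypothetical.
from scipy import stats


def check_outcome_correlations(df, location_cols, outcome_cols):
    for loc in location_cols:
        for out in outcome_cols:
            r, p = stats.pearsonr(df[loc], df[out])
            print(f"{loc} vs {out}: r = {r:.3f}, p = {p:.3f}")


# check_outcome_correlations(survey,
#     ["within_50km_trading_centre", "within_40km_hargeisa"],
#     ["hh_characteristic_score_2011", "score_change_since_baseline"])
```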
Post-Matching Covariate Balancing: Somaliland DRR Project

Variable | Sample | Mean treated | Mean control | % bias | % reduction in bias | t-stat. | p-value
Head has at least primary education | Unmatched | 0.01504 | 0.07595 | -29.5 | . | -2.44 | 0.015
Head has at least primary education | Matched | 0.01504 | 0.0058 | 4.5 | 84.8 | 0.74 | 0.461
HH involved in casual labour (baseline) | Unmatched | 0.03759 | 0.00633 | 21.4 | . | 1.87 | 0.062
HH involved in casual labour (baseline) | Matched | 0.03759 | 0.03008 | 5.1 | 76 | 0.34 | 0.736
HH in livestock product production (baseline) | Unmatched | 0.44361 | 0.51899 | -15.1 | . | -1.28 | 0.201
HH in livestock product production (baseline) | Matched | 0.44361 | 0.43796 | 1.1 | 92.5 | 0.09 | 0.927
Adults in household with secondary education | Unmatched | 0.00752 | 0.05696 | -25.6 | . | -2.11 | 0.036
Adults in household with secondary education | Matched | 0.00752 | 0.00924 | -0.9 | 96.5 | -0.14 | 0.889
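For readers who want to reproduce balance statistics like those above, the sketch below shows how the standardised percentage bias and its reduction after matching are commonly computed. It is a generic illustration, not the study's own code, and the variable names are hypothetical.

```python
# Generic sketch of the balance statistics reported above: standardised % bias
# before/after matching and the % reduction in |bias|. Illustrative only.
import numpy as np


def standardised_bias(x_treated, x_control):
    """100 * mean difference / sqrt of the average of the two group variances."""
    pooled_sd = np.sqrt((np.var(x_treated, ddof=1) + np.var(x_control, ddof=1)) / 2)
    return 100 * (np.mean(x_treated) - np.mean(x_control)) / pooled_sd


def bias_reduction(x_treated, x_control_unmatched, x_control_matched):
    before = standardised_bias(x_treated, x_control_unmatched)
    after = standardised_bias(x_treated, x_control_matched)
    return before, after, 100 * (1 - abs(after) / abs(before))


# Hypothetical example for the "head has primary education" dummy:
# before, after, reduct = bias_reduction(head_educ_treated,
#                                        head_educ_control_all,
#                                        head_educ_control_matched)
```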
Overall HH Characteristic Scores: Somaliland DRR Project

Measure | 2011 raw score | Dif.-in-dif. of raw scores | 2011 binary score | Dif.-in-dif. of binary scores
Pre-matching:
Intervention Mean | 34.80 | 1.74 | 0.54 | 0.25
Comparison Mean | 33.78 | 0.72 | 0.34 | 0.07
Difference | 1.017** (2.70) | 1.022*** (3.64) | 0.194*** (3.58) | 0.172*** (3.34)
Matching – kernel:
Intervention Mean | 34.78 | 1.71 | 0.53 | 0.24
Comparison Mean | 33.71 | 0.69 | 0.33 | 0.06
Difference | 1.073** (3.16) | 1.027** (3.27) | 0.204*** (3.94) | 0.180** (3.04)
Matching – no replacement:
Intervention Mean | 34.78 | 1.71 | 0.53 | 0.24
Comparison Mean | 33.54 | 0.77 | 0.37 | 0.06
Difference | 1.242** (2.61) | 0.947** (3.02) | 0.159** (2.73) | 0.180** (3.04)

Sample size: 291 (133 intervention; 158 comparison)
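The "dif.-in-dif." columns compare the change in scores (the 2011 survey value minus the recalled baseline value) across the matched groups. A minimal sketch of that calculation follows; the column names and the kernel-weight column are hypothetical stand-ins for whatever the actual dataset uses.

```python
# Minimal sketch of the difference-in-differences comparison above: change in
# score (2011 minus recalled baseline) for intervention households versus the
# weighted change for matched comparison households. Names are hypothetical.
import numpy as np


def did_att(df, weight_col="kernel_weight"):
    change = df["score_2011"] - df["score_baseline_recall"]
    treated = df["intervention"] == 1

    treated_change = change[treated].mean()
    # Comparison households enter with the weights produced by the matching step.
    comparison_change = np.average(change[~treated], weights=df.loc[~treated, weight_col])
    return treated_change - comparison_change
```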
Sensitivity Analysis
• Sensitivity analysis was also carried out to assess how robust the effect estimates would be to an unobserved difference (or differences) between the intervention and comparison villages.
• Results indicated that such an unobserved confounder would need to be over 50 percent more prevalent among the intervention households in order to "explain away" the effect.
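The presentation does not say which sensitivity method was used; one common option is Rosenbaum bounds. The sketch below shows the idea for a binary outcome in matched pairs: it asks how large the hidden-bias factor gamma would have to be before the worst-case p-value stops being significant. It is a generic illustration with made-up numbers, not the study's analysis.

```python
# Illustrative Rosenbaum-bounds style check for a binary outcome in matched
# pairs. gamma is the factor by which hidden bias could multiply the odds of
# treatment within a pair; the function returns the worst-case p-value.
from scipy import stats


def rosenbaum_upper_p(n_treated_better, n_discordant, gamma):
    p_max = gamma / (1.0 + gamma)  # worst-case chance the treated unit does better
    # P(X >= n_treated_better) under Binomial(n_discordant, p_max)
    return stats.binom.sf(n_treated_better - 1, n_discordant, p_max)


# Made-up example: the treated household did better in 42 of 60 discordant pairs.
# for gamma in (1.0, 1.25, 1.5, 1.75, 2.0):
#     print(gamma, round(rosenbaum_upper_p(42, 60, gamma), 4))
```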
Programme Improvement Potential
• While the overall score difference is positive, disaggregation by dimension and characteristic reveals where the project seems to be doing well and not so well.
• Intervention exposure statistics also reveal a core issue – HHs are not being exposed intensively and in large numbers.
[Chart: Percentage of HHs Reporting Exposure to Interventions Like Those Undertaken by Havoyoco/Oxfam GB – intervention vs. comparison villages, 0–70% scale]
Revealing Example from Exposure Data:
Reported Use of Community Veterinary Services – Intervention Villages
Never 54%; Rarely 31%; Sometimes 12%; Often 2%; Always 1%
Pilot 2: Fair Play for Africa Campaign
• 9 focal countries across the African continent
with over 200 participating CSOs.
• World Cup in South Africa used as a focal
point for campaigning on health issues.
• Aim: To work with Africans to amplify
community voices to demand their right to
universal access to health and HIV services.
• Main tactics: road shows; music events; rallies and marches; signing pledges/petitions; ambassadors and champions; lobbying key policy- and decision-makers; and community sporting events.
Approach for Small N Interventions:
Process Tracing
1. Specify the most recent intermediate and final outcomes the intervention is expected (or seeking) to achieve.
2. Systematically assess and document what was done to achieve the targeted outcomes.
3. Identify and evidence which targeted intervention outcomes have actually materialised, as well as any unintended outcomes.
4. Undertake "process induction" to identify all plausible causal explanations for each evidenced outcome.
5. Use "process verification" to assess the extent to which each explanation is supported or not by the available evidence – signatures, footprints.
Process Tracing Applied to Fair Play
Targeted outcomes:
1. African Union Member States meet and aim to exceed the Abuja Commitment to allocate 15 per cent of their national budgets to health (Regional Level).
2. Country governments take accelerated action by investing in health per the campaign 'asks' (National Level).
3. Strongly linked civil society organisations work to improve health for all Africans (Civil Society Level).
4. Africans have a strong collective voice which they use to demand their right to health (Community Level).
What was done:
Fair Play activities aimed, on the one hand, to reach out to African citizens through community mobilisation, such as road shows and by using the media, and, on the other, to influence governments by lobbying policy- and decision-makers, through the identification and cultivation of champions and ambassadors, and through policy advocacy.
Findings:
1. Evidence that Fair Play was instrumental in influencing AU Finance Ministers to back-track on their intentions to remove health budgetary commitments in official documents.
2. Little available evidence that it caused any significant change in the direction or pace of investment in health towards the Abuja Declaration target.
3. Due to a lack of evidence, it remains inconclusive as to whether actions by Fair Play had a causal relationship with recent commitments seen in AU documents published since March 2011.
4. There is good evidence to suggest that Fair Play was successful in linking CSOs under the campaign's aim of 'health for all', and some evidence that the campaign has supported joint CSO action.
Quantifying Qualitative Data

Targeted Outcome | Extent observed (high, medium, low, none) | Extent of project/campaign contribution (high, medium, low, none) | Specific contribution score* (/5)
1. | | |
2. | | |
3. | | |
4. | | |
Unforeseen Outcome | n/a | |
Unforeseen Outcome | n/a | |
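The presentation does not spell out how the two ratings are converted into the "specific contribution score" out of 5; the sketch below is purely a hypothetical illustration of one way such a rubric could be quantified.

```python
# Hypothetical illustration of turning rubric ratings into a score out of 5.
# The actual scoring rule behind the table above is not given in the presentation.
RATING_POINTS = {"high": 3, "medium": 2, "low": 1, "none": 0}


def contribution_score(extent_observed, extent_contribution):
    """Combine the two ratings into an illustrative score on a 0-5 scale."""
    if extent_observed == "n/a":  # e.g. unforeseen-outcome rows
        extent_observed = extent_contribution
    raw = RATING_POINTS[extent_observed] + RATING_POINTS[extent_contribution]
    return round(5 * raw / 6, 1)  # rescale the 0-6 point range onto 0-5


# contribution_score("medium", "high")  -> 4.2
```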
“I enjoyed conducting the evaluation and
although struggled at times with how
best to write-up the results, I didn’t find
any substantive issues with using the
process tracing approach. If anything it
forced me to be more challenging when
assessing evidence, particularly from key
informant interviews.”
Gavin Stedman-Bryce, Pamoja Consulting
5. Future Plans and Possibilities
• 7 effectiveness audits per year per indicator area, so potential to do internal systematic reviews.
• Using data to look at cost-effectiveness.
• Further drilling down on reasons why particular
interventions do and do not work through
complementary qualitative research.
• Building a body of evidence of what seems to work and not – for sharing internally and externally.
• More rigorous evaluations of interventions that
show high promise.