EEF Evaluators' Conference – Transcript
25th June 2015

Session 1: Interpretation / impact
Rethinking the EEF Padlocks
Calum Davey
Education Endowment Foundation
25th June 2015
Overview
→ Background
→ Problems
→ Attrition
→ Power/chance
→ Testing
→ Proposal
→ Discussion
Background
• Summary of the security of evaluation findings
• 'Padlocks' developed in consultation with evaluators

Group | Number of pupils | Effect size | Estimated months' progress | Evidence strength
Literacy intervention | 550 | 0.10 (0.03, 0.18) | +2 |
• Five categories – combined to create overall rating:

Rating | 1. Design | 2. Power (MDES) | 3. Attrition | 4. Balance | 5. Threats to validity
5 | Fair and clear experimental design (RCT) | < 0.2 | < 10% | Well-balanced on observables | No threats to validity
4 | Fair and clear experimental design (RCT, RDD) | < 0.3 | < 20% | |
3 | Well-matched comparison (quasi-experiment) | < 0.4 | < 30% | |
2 | Matched comparison (quasi-experiment) | < 0.5 | < 40% | |
1 | Comparison group with poor or no matching | < 0.6 | < 50% | |
0 | No comparator | > 0.6 | > 50% | Imbalanced on observables | Significant threats
Background
[Figure: number of padlocks (0–5) awarded across the n=37 completed evaluations. Note: count does not include pilots, which often don't get a security rating.]
Oxford Improving Numeracy and Literacy
[The padlock criteria table above was shown again, applied to this example.]

Act, Sing, Play
[The padlock criteria table above was shown again, applied to this example.]

Team Alphie
[The padlock criteria table above was shown again, applied to this example.]
Problems: power
• MDES is calculated at baseline (a sketch of the calculation follows below)
• MDES changes over the course of the trial
• Confusion with p-values and CIs:
  – Effect bigger than the MDES!
    • E.g. Calderdale: ES = 0.74, MDES < 0.5
  – P-value < 0.05!
    • E.g. Butterfly Phonics: ES = 0.43, p < 0.05, MDES > 0.5

Rating | 2. Power (MDES)
5 | < 0.2
4 | < 0.3
3 | < 0.4
2 | < 0.5
1 | < 0.6
0 | > 0.6
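A minimal sketch, under standard assumptions, of how an MDES of the kind shown in the table can be calculated at baseline for a simple two-arm, individually randomised trial. The 80%-power formula and the illustrative 50/50 split are assumptions here, not the EEF's exact procedure; clustered designs would also need a design-effect adjustment.

```python
from scipy.stats import norm

def mdes_two_arm(n_treat, n_control, alpha=0.05, power=0.80):
    """Minimum detectable effect size (in SD units) for a simple two-arm
    comparison of means, ignoring clustering and covariates."""
    multiplier = norm.ppf(1 - alpha / 2) + norm.ppf(power)  # ~2.8 for 80% power, 5% two-sided alpha
    se_effect = (1 / n_treat + 1 / n_control) ** 0.5        # SE of a standardised mean difference
    return multiplier * se_effect

# Illustrative only: 550 pupils split evenly gives an MDES of roughly 0.24,
# which would fall in the '< 0.3' (4-padlock) band of the table above.
print(round(mdes_two_arm(275, 275), 2))
```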
Problems: attrition
• Calculated overall at the level of randomisation
• 10% of pupils are off school each day
• Disadvantages individually-randomised trials (illustrated in the sketch below):
  – Act, Sing, Play (pupil-randomised): 0% attrition at school or class level, 10% at pupil level
  – Oxford Science (school-randomised): 3% attrition at school level, 16% at pupil level
• Are the levels right?

Rating | 3. Attrition
5 | < 10%
4 | < 20%
3 | < 30%
2 | < 40%
1 | < 50%
0 | > 50%
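A small sketch of the point being made here: the same trial can show very different attrition depending on the level at which it is counted. The numbers below are illustrative, not figures from the trials named above.

```python
def attrition_rate(randomised, analysed):
    """Proportion of randomised units lost between randomisation and analysis."""
    return 1 - analysed / randomised

# Hypothetical school-randomised trial: almost every school stays in,
# but some pupils within the retained schools are missing at follow-up.
schools_randomised, schools_analysed = 50, 48
pupils_randomised, pupils_analysed = 5000, 4200

print(f"School-level attrition: {attrition_rate(schools_randomised, schools_analysed):.0%}")  # 4%
print(f"Pupil-level attrition:  {attrition_rate(pupils_randomised, pupils_analysed):.0%}")    # 16%
```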
Problems: testing
• Lots of testing administered by teachers
• Teachers rarely blinded to intervention status
• What is the threat to validity when effect sizes are small?

Rating | 5. Threats to validity
5 | No threats to validity
4 |
3 |
2 |
1 |
0 | Significant threats
Potential solution?
• Assess 'chance' as well as MDES in the padlock?
• Assess attrition at pupil level for all trials?
• Randomise invigilation of testing to assess bias?
• Number of pupils (number with intervention)
• Confidence interval for months' progress? (see the sketch below)
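As a rough illustration of the last item, the sketch below computes a 95% confidence interval for an effect size from its standard error. The standard error is an assumed value chosen to loosely resemble the worked example earlier, and the conversion to months' progress is left as a hypothetical placeholder rather than the EEF's published conversion.

```python
from scipy.stats import norm

def effect_size_ci(es, se, level=0.95):
    """Normal-approximation confidence interval for an effect size."""
    z = norm.ppf(1 - (1 - level) / 2)
    return es - z * se, es + z * se

es, se = 0.10, 0.038              # assumed values, loosely matching the earlier example
lo, hi = effect_size_ci(es, se)
print(f"Effect size {es:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")

# A confidence interval for months' progress would apply whatever
# effect-size-to-months conversion is in use to both ends of the interval:
# months_lo, months_hi = to_months(lo), to_months(hi)   # to_months() is hypothetical
```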
Discussion
• Could p-values, confidence intervals, power, sample size, etc. be combined into a measure of 'chance'?
• What are the advantages and disadvantages of reporting confidence intervals alongside the security rating?
• Is it right to include all attrition in the security rating? What potential disadvantages are there?
• What is the most appropriate way to ensure unbiasedness in testing? Would it be possible to conduct a trial across evaluations?
Session 2: Implementation
25th June 2015
Implementation and process evaluation in intervention development and research
Neil Humphrey and Ann Lendrum
Manchester Institute of Education
[email protected]
0161 275 3404
Implementation and process evaluation
• The (mistaken) assumption of effective implementation (Berman & McLaughlin, 1978)
  – "When faced with the realities of human services, implementation outcomes should not be assumed any more than intervention outcomes are assumed" (Fixsen et al, 2005, p.6)
• Parallel development of 'implementation' and 'process evaluation' literature in different disciplines
  – Implementation – psychology and education
  – Process – health
• "Implementation is defined as a specified set of activities designed to put into practice an activity or program of known dimensions" (Fixsen et al, 2005, p.5)
  – Term also used more broadly in reference to supporting the uptake of 'evidence-based' interventions
• "Process evaluation involves gathering data to assess the delivery of programs" (Domitrovich, 2009, p.195)
• Put simply – looking inside the 'black box' (Saunders et al, 2005)
Implementation science
• "The goals of implementation science have been to understand barriers to and facilitators of implementation, to develop new approaches to improving implementation, and to examine relationships between an intervention and its impact. Implementation science has investigated a number of issues, including: influences on the professional behavior of practitioners; influences on the functioning of health and mental health care practice organizations; the process of change; strategies for improving implementation, including how organizations can support the implementation efforts of staff members; appropriate adaptation of interventions according to population and setting; identification of approaches to scaling-up effective interventions; implementation measurement methods; and implementation research design" (Forman et al, 2013, p.83)
Implementation science
[Diagram: implementation and process evaluation (IPE) linked to understanding, evaluating, and supporting implementation processes]
What is an intervention?
• Interventions are "purposively implemented change strategies" (Fraser & Galinsky, 2010, p.459)
• Key elements:
  – Purposive
  – Implementation
  – Change
  – Strategic
The intervention development and research cycle
[Diagram: define and understand the problem → design and describe the proposed solution → articulate an intervention theory → pilot and refine → establish intervention efficacy → establish intervention effectiveness → scale-up intervention; the cycle is iterative and cyclical, not linear]
Design and describe the proposed solution
• Design of intervention – build on the accrued knowledge base
  – Review by stakeholders and experts in the field
  – Use of Type 1 translational research (applying basic science to inform intervention development)
• Basic intervention features can include:
  – Form (e.g. universal, selective, indicated)
  – Function (e.g. environmental, developmental, informational)
  – Level and location (e.g. individual, group, family, school, community, societal)
  – Complexity and structure (e.g. single component, multi-component)
  – Prescriptiveness and specificity (e.g. manualised, flexible)
  – Components (e.g. curriculum, environment/ethos, parents/wider community)
  – Intervention agents (e.g. teachers, external staff)
  – Recipients (e.g. teachers, pupils)
  – Procedures and materials (e.g. what is done, how often)
Design and describe the proposed solution
• "The quality of description of interventions in publications… is remarkably poor" (Hoffman et al, 2014, p.1)
• Without a complete description of an intervention:
  – The person/s responsible for delivery cannot reliably implement it
  – The recipient/s do not know exactly what they are 'signing up for'
  – Researchers cannot properly replicate or build upon existing findings
  – Researchers cannot adequately evaluate the implementation of the intervention
  – It is difficult, if not impossible, to understand how and why it works
• Fewer than 40% of non-pharmacological interventions were found to be described adequately in papers, appendices or websites (Hoffman et al, 2013)
• Hence, the 'Template for Intervention Description and Replication' (TIDieR) (Hoffman et al, 2014) offers a useful tool that can improve the quality of how interventions are described and subsequently understood
• TIDieR (adapted version) = 1. Brief name, 2. Why? (theory/rationale), 3. Who (recipients), 4. What (materials), 5. What (procedures), 6. Who (provider), 7. How (format), 8. Where (location), 9. When and how much (dosage), 10. Tailoring (e.g. adaptation)
Design and describe the proposed solution
• Think about a recent or current pilot or trial of an intervention in which you are involved
• Can you provide a full description of the intervention?
• Reminder of the TIDieR (adapted version) items: 1. Brief name, 2. Why? (theory/rationale), 3. Who (recipients), 4. What (materials), 5. What (procedures), 6. Who (provider), 7. How (format), 8. Where (location), 9. When and how much (dosage), 10. Tailoring (e.g. adaptation)
• 'Just a Minute' format – responses to the 10 items in the adapted TIDieR framework
• Questions for reflection
  – Why are these 10 items important? Were some harder to populate with information than others? Which ones? Why?
  – Are there any fundamental ways of describing an intervention that TIDieR misses? What are these?
  – Is TIDieR better suited to describing certain kinds of interventions than others? If so, what kinds of interventions and why?
  – Would TIDieR be useful as a standardised reporting framework for EEF projects?
Articulate an intervention theory
• Without understanding intervention theory, we are effectively left with a 'black box' view (e.g. we think about interventions in terms of effects without paying attention to how and why those effects are produced)
  – "Seasoned travellers would not set out on a cross country motor trip without having a destination in mind, at least some idea of how to get there, and, preferably, a detailed map to provide direction and guide progress along the way" (Stinchcomb, 2001, p.48)
• A logic model "describes the sequence of events for bringing about change by synthesizing the main program elements into a picture of how the program is supposed to work" (CDCP, 1999, p.9)
  – Often articulated in terms of inputs, processes/mechanisms, and outcomes
  – Sometimes factors affecting inputs and processes are also added
  – Typically displayed in diagrammatic form
1. What is done in the intervention? (inputs)
2. What are you trying to achieve? (outcomes)
3. What are the mechanisms/processes that link 1 and 2 above? (change mechanisms)
4. What factors could impact on the above? (moderators)
Articulate an intervention theory
• Try to create a basic logic model for the intervention you described in the previous activity using the worksheet provided
• Questions for reflection
  – Which component(s) of the logic model were the most difficult to complete? Why?
  – How might you go about empirically testing the assumptions of your intervention logic model in a pilot or trial context?
  – Is logic modeling better suited to theorising certain kinds of interventions than others? If so, what kinds of interventions and why?
  – What are the limitations of logic modeling, and what alternative methods might be used in order to articulate intervention theory?
Pilot and refine
• What are we trying to achieve when we pilot an intervention?
• One possible organising framework for a pilot study is that of social validity
  – The value and social importance attributed to a given innovation by those who are direct or indirect consumers of it (Hurley, 2012; Luiselli & Reid, 2011)
• Adapting Wolf's (1978) classic taxonomy:
  – Acceptability – are the intended outcomes of the intervention wanted, needed and/or socially significant?
  – Feasibility – is the intervention considered to be 'doable'?
  – Utility – are the outcomes of the intervention satisfactory, and worth the effort required to achieve them?
• Consideration of phase of implementation (Fixsen et al, 2005):
  – Exploration
  – Installation
  – Initial implementation
  – Full implementation
Pilot and refine
• Design, data generation and analysis
  – Small scale
  – Mixed methods – the assumption that 'pilot' = 'qualitative' is not helpful
  – Review of materials by stakeholders and experts
  – Key implementation-related questions may include: can implementers deliver the intervention in the time allotted? Does the sequencing of content and other aspects of intervention design make sense to implementers and recipients? Are suggested activities congruent with the context of delivery (e.g. target population, setting)? Are recipients engaged? (Fraser & Galinsky, 2010)
  – Key outcome-related questions may include: are there indications of impact on intended outcomes? Of what kind of magnitude? For whom?
• What kinds of refinements are made?
  – Intervention theory
  – Intervention design
  – Context – required contextual characteristics, foundations for change, implementers
  – Methodological considerations for evaluation
Establish intervention efficacy and effectiveness
• Can the intervention produce intended outcomes under optimal conditions?
  – How do we define 'optimal'? Likely to be informed by intervention theory
• There is an implicit assumption that implementation will be uniformly high quality in efficacy trials (see for example Flay et al, 2005) – this is rarely the case in school-based interventions
  – We know that implementation variability predicts outcome variability
  – Interventions do not happen in a vacuum – understanding context and social processes is crucial
• IPE is therefore essential in randomised trials
  – Studying how the intervention is implemented (including how and why this varies)
  – Distinguishing between different intervention components and identifying those that are critical ('active ingredients') through analysis of natural variation or experimental manipulation (e.g. multi-arm or factorial trials) (Bonell et al, 2012)
  – Planned sub-group analyses to identify differential responsiveness/gains (Petticrew et al, 2012)
  – Investigating contextual factors that may influence the achievement of expected outcomes
  – Empirical validation of intervention theory (Bonell et al, 2012)
  – Interpretation of outcomes, regardless of their valence
• Intervention theory, implementation, evaluation
• Requires a move toward 'realist' RCTs (Bonell et al, 2012) with expectation of some natural variation
Establish intervention efficacy and effectiveness
• IPE in a trial context should consider:
  – Aspects of implementation (e.g. fidelity/adherence, dosage, quality, participant responsiveness, programme differentiation, reach, adaptation, monitoring of comparison conditions)
    • Important to avoid Type III error
  – Factors affecting implementation (e.g. preplanning and foundations, implementation support system, implementation environment, implementer factors, intervention characteristics) (Durlak & DuPre, 2008; Greenberg et al, 2005; Forman et al, 2009)
    • Implementation quality model (Domitrovich et al, 2008)
• Not a quant/qual division!
  – 62% quantitative, 21% qualitative, 17% both in health promotion research (Oakley et al, 2006)
  – Use of a range of methods and informants
• There is no one set way to do things – IPE in a trial has to be pragmatic!
Establish intervention efficacy and effectiveness
• Promoting Alternative Thinking Strategies (PATHS) (Humphrey et al, 2015)
  – PATHS is a social-emotional learning curriculum that aims to help children manage their behaviour, understand their emotions and work well with others
  – Cluster RCT; 23 PATHS vs 22 control (N=4,516)
  – Training provided by developers; teachers supported by trained coaches (in turn supervised by developers); all materials provided free of charge
  – Assessment of outcomes: social-emotional competence, mental health, attainment, health-related quality of life
  – Assessment of implementation: surveys of usual practice, structured observations, teacher implementation surveys, teacher factors affecting implementation surveys, interviews with teachers and school staff, focus groups with pupils, interviews with parents
• Outcome analysis showed no impact of PATHS on children's attainment in English/reading or maths
• Structured observational data indicated that fidelity, quality, reach, and participant responsiveness were generally high; however, in terms of dosage, teachers were on average 20 lessons (10 weeks) behind schedule at the point of observation
Establish intervention efficacy and effectiveness
• Quantitative analysis of usual practice surveys and structured observational data indicated that increased provision of targeted interventions, higher levels of implementation quality, and optimal intervention reach were associated with improved academic outcomes
  – Most consistent finding was for reach; largest effect sizes were for quality
• Qualitative interview data was extremely helpful in illuminating the processes underpinning the above findings
  – Philosophical fit
  – Meeting perceived needs
  – Practical fit
  – Pedagogical fit
  – Barriers and facilitators to effective implementation
  – Technical support and assistance
  – School leadership
Establish intervention efficacy and effectiveness
• Will the intervention produce intended outcomes in 'real world' conditions?
  – Emphasis on natural settings – increased external validity, decreased internal validity
  – Intervention developer much less likely to be involved
  – Success is heavily dependent upon the relationship between the research team and the host institutions (Flay et al, 2005)
  – Paradox of researcher involvement
• Implementation is likely to be even more variable in an effectiveness trial, so it is vital that the aforementioned aspects and factors are documented and analysed
• Increased implementation variability is one possible reason why effects observed in efficacy trials are not always replicated in effectiveness trials
  – The so-called 'voltage drop' (Chambers, Glasgow & Stange, 2013)
• IPE in effectiveness trials should therefore include a particular focus on how the intervention is 'interpreted' in real world conditions
  – What form(s) does this interpretation take? e.g. dilution and drift?
  – What real world constraints and processes influence this?
[Figure: academic achievement effect sizes for social-emotional learning interventions in efficacy and effectiveness trials (Wigelsworth, Lendrum & Oldfield, in press)]
Scale-up intervention
• How can we take an intervention "from science to service" (August, Gewirtz & Realmuto, 2010, p.72)?
  – "There is a broad consensus that schools do not have a good record in accessing the available knowledge base on empirically validated interventions… Developers and advocates of effective practices have a shared responsibility with educators to create the awareness, conditions, incentives and context(s) that will allow achievement of this important goal" (Walker, 2004, p.399)
  – The evidence-to-routine-practice 'lag' can be 20 years
• Two related issues – scaling up (bringing the intervention to a wider audience) and sustainability (maintaining effective use and impact of the intervention) (Forman, 2015)
• "The implementation stage begins after the adoption decision is made and culminates when the innovation 'disappears' either because it has become so thoroughly integrated into everyday practices that it is no longer visible as an innovation or because it has been discontinued" (Bosworth et al, 1999, p.1)
• A body of work on Type 2 translational research "examines factors associated with the adoption, maintenance, and sustainability of science-based interventions at the practice level" (Greenberg, 2010, p.37)
  – This kind of research is by no means confined to the scale-up phase – indeed, it is also critical in effectiveness trials
  – 'Implementation is the outcome'
Scale-up intervention
• High quality IPE is needed here more than ever! We need to understand the factors that influence:
  – …intervention engagement and reach (e.g. who takes it on and why?)
  – …implementation quality (e.g. when it is delivered well, what supports this?)
  – …sustainability over time (e.g. what is sustained? How?) (Greenberg, 2010)
• Important to document how and why the intervention evolves as it goes to scale
• Building capacity and partnerships for scale-up and sustainability: example of PROSPER (PROmoting School-community-university Partnerships to Enhance Resilience) (Spoth, Greenberg, Bierman & Redmond, 2004)
Scale-up intervention
• Imagine that the intervention you focused upon in the previous activities has passed successfully through development, piloting, efficacy and effectiveness stages and is ready to be 'taken to scale' and disseminated more broadly
• What factors do you think are most likely to influence the engagement, reach, implementation quality and sustainability of the intervention when it is scaled-up?
• How might you go about researching the above?
• How could the knowledge generated be used to improve the scaling-up process?
RCTs and instrumental variables
Anna Vignoles
University of Cambridge
Why do you need an IV in an RCT?
• RCTs randomise the allocation of the treatment
• But not everyone complies
• People used to analyse the data "as treated"
  – Treatment on the treated, ignoring the fact that some people who were randomised into the treatment did not participate
• This is generally a bad solution because those who choose to participate are not the same as those who don't!
Why do you need an IV in an RCT?
• Nowadays the preferred analytical solution is Intention to Treat (ITT)
  – Difference in outcomes between those who are randomised into the treatment and those who are not
• But ITT tells you the impact of offering the programme
• We would still like to know the effect of the treatment on the treated
• But the treated are not a random subset….
What about an IV solution?
• IV is often used post hoc to evaluate a programme
  – Maimonides' rule in Israel (Victor Lavy)
• It can also be used ex ante
  – Design an IV into an evaluation
  – Design an IV into an RCT
• In the medical literature, the use of an IV in a trial is called contamination-adjusted intention to treat
Why use an IV in an RCT?
• Computing the ITT
  – Straight difference in average outcomes between the group to whom you offered treatment and the group to whom you did not offer treatment
• Computing the Effect of Treatment on the Treated (TOT)
  – Use whether or not the person was randomised into the intervention (Z) to predict whether or not the individual actually participated in the intervention (D) – see the sketch below
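A minimal sketch of the two calculations just described, using simulated data (the variable names and numbers are illustrative). The treatment-on-the-treated estimate here is the simple Wald/IV ratio: the ITT effect divided by the difference in take-up between the two randomised arms.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

z = rng.integers(0, 2, n)                  # randomised offer of the programme
takeup = rng.random(n) < 0.7               # only 70% of those offered actually participate
d = z * takeup                             # actual participation
y = 0.2 * d + rng.normal(size=n)           # outcome with a true treatment effect of 0.2

itt = y[z == 1].mean() - y[z == 0].mean()          # effect of being offered the programme
first_stage = d[z == 1].mean() - d[z == 0].mean()  # take-up induced by the offer
tot = itt / first_stage                            # Wald / IV estimate of treatment on the treated

print(f"ITT: {itt:.3f}, take-up difference: {first_stage:.3f}, TOT: {tot:.3f}")
# Expect ITT of roughly 0.2 * 0.7 = 0.14 and TOT of roughly 0.2
```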
Instrumental Variables: a refresher
• Y1i is the value of the outcome if the treatment is received by individual i
• Y0i is the value of the outcome if the treatment is not received by individual i
• Di = 1 if treatment is received by individual i
• Di = 0 if treatment is not received by individual i
• Xi denotes the set of observed characteristics before the intervention/treatment for individual i
Instrumental Variables: a refresher
• Di is composed of two parts: one that is correlated with the error term u (the endogenous part) and one that is independent of the error term (the exogenous part)
• IV uses an additional variable (or variables) Z, called an instrumental variable, to isolate the part of D that is not correlated with the error term
• In this case Z is the randomisation process
Instrumental Variables: a refresher
• For a valid instrument the following must be true:
  – corr(Zi, Di) > 0 – the instrument is relevant
  – E(ui | Zi, Xi) = E(ui | Xi) = 0 – the instrument affects D, but not Y directly (only through its impact on D)
• The instrument must predict D
• The instrument must also only affect Y through its impact on D (an untestable assumption)
• IV is estimated by Two-Stage Least Squares (2SLS) – see the sketch below
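A minimal sketch of the two stages, written out with plain numpy rather than a dedicated IV package (the data-generating step and all names are illustrative): stage one regresses participation D on the randomised offer Z, and stage two regresses the outcome on the fitted values of D.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

z = rng.integers(0, 2, n).astype(float)       # instrument: the randomised offer
ability = rng.normal(size=n)                  # unobserved confounder
d = ((1.5 * z + 0.5 * ability + rng.normal(size=n)) > 0.5).astype(float)  # endogenous take-up
y = 0.2 * d + 0.7 * ability + rng.normal(size=n)                          # outcome; true effect is 0.2

def ols(X, y):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)

# Stage 1: regress D on Z (plus a constant) and form fitted values
gamma = ols(np.column_stack([ones, z]), d)
d_hat = gamma[0] + gamma[1] * z

# Stage 2: regress Y on the fitted values of D
beta = ols(np.column_stack([ones, d_hat]), y)

naive = ols(np.column_stack([ones, d]), y)
print(f"Naive OLS estimate of the effect of D: {naive[1]:.3f}")  # pushed upwards by the confounder
print(f"2SLS estimate:                         {beta[1]:.3f}")   # close to the true effect of 0.2
```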
Problems with IV
• If instruments are weakly correlated with the endogenous variable, the instruments are said to be weak
• When using weak instruments the IV 2SLS estimator is biased even in large samples
• In small samples IV estimates are biased anyway
• Finite sample bias will lessen as sample size increases
• In this case there is a clear, strong instrument….
Advantages and disadvantages
• Essentially adjusts the estimate for the degree of non-compliance
• Information on non-compliance can be revealing in itself for understanding the impact of the intervention
• Non-compliance may be difficult to measure in practice – incomplete or partial compliance
• Assumes that if the non-compliers had received the treatment, the effect for them would have been the same as for the compliers
• Assumption behind ITT – the effect of the treatment is averaged over those who actually receive it and those who do not
Some examples
• Vitamin A supplementation in malnourished children reduced mortality by 41% using ITT
• Supplementation was found to reduce mortality by two thirds (72%) using CA-ITT
  – Sommer and Zeger (1991)
References
• Angrist, J. D. and Krueger, A. (2001). "Instrumental Variables and the Search for Identification: From Supply and Demand to Natural Experiments", Journal of Economic Perspectives, 15(4).
• Heckman, J. J. (1995). "Randomization as an Instrumental Variable".
• Imbens, G. W. and Angrist, J. D. (1994). "Identification and Estimation of Local Average Treatment Effects", Econometrica, 62(2).
• Sommer, A. and Zeger, S. L. (1991). "On Estimating Efficacy from Clinical Trials", Statistics in Medicine, 10: 45–52.
• Sussman, J. B. and Hayward, R. A. (2010). "An IV for the RCT: Using Instrumental Variables to Adjust for Treatment Contamination in Randomised Controlled Trials", BMJ, 340: c2073.
Session 3: Costs
25th June 2015