
Evidence Based Policy, Evidence Grading Schemes and Entities, and Ethics in Complex Research Systems
Robert Boruch, University of Pennsylvania
September 14-15, 2008
5th European Conference on Complex Systems
Jerusalem
1
Summary of Themes
- People want to know “what works” so as to inform their decisions.
- The scientific quality of evidence on “what works” is variable and often poor.
- People’s access to information on what works through the internet is substantial.
- Organizations and data bases have been created to (a) develop evidence grading schemes, (b) apply the schemes in systematic reviews of evidence from multiple studies, and (c) disseminate results through the internet.
2
The “What Works” Theme
“What works” refers here to estimating the effects of social, educational, or criminological interventions:
- In a statistically/scientifically unbiased way
- And so as to generate a statistical statement of one’s confidence in the results
3
Evidence Based Policy/Law: A Driver of Interest in What Works
- US, Canada, UK, Israel (e.g. National Academy)
- Sweden, Norway, Denmark
- Australia, Malaysia, China
- Mexico, others in Central America
- Multinationals: OECD, World Bank
- Others
4
Information Glut as Driver: Naïve Web Searches
- A Google search on “evidence based” yields 9,660,000 links.
- A Google search on “what works” yields 6,350,000 links (0.21 seconds).
- A Google search on “evidence based practice” yields 2,000,000 links (0.42 seconds).
- A Google search on “evidence based policy” yields 132,000 links (0.35 seconds).
- What are we to make of this?
5
Publication Rates in Education
- 20,000 articles on education are published each year in English language journals.
- 2 to 5 per 1,000 per year report on controlled trials of programs, policies, or practices to estimate effectiveness (roughly 40 to 100 articles).
- For every curriculum package that has been tested in a controlled trial, there are 50-80 that are claimed to be effective based on no defensible scientific evidence.
6
Relevant Organizations Nested
- National, State/Provincial, Municipal: policy or law
- Agencies within nations, e.g. national science foundations, the Institute of Education Sciences (US), university research
- Programs and projects within agencies
- Data bases and reports within projects
- Users of information at each level, e.g. scientists, policy people, the public
7
International Organizations: NGOs
- Cochrane Collaboration in health care: http://cochrane.org
- Campbell Collaboration in education, welfare, crime and justice: http://campbellcollaboration.org
8
Two Examples Here
- International Campbell Collaboration in education, welfare, crime and justice
- What Works Clearinghouse in education (Institute of Education Sciences, US)
9
Data Bases in this Context
- Evidence grading schemes currently focus on reports of statistical analyses of impact, not micro-records of individuals as yet.
- Example: 5-10 statistical reports (ingredients of part of a data base) on evaluating the impact of conditional income transfer programs in developing regions
- Example: the Cochrane Collaboration data base on randomized trials contains nearly 0.5 million such reports
- “Meta-analysis” of results of multiple studies
10
C2 SPECTR
- C2 Social, Psychological, Educational, and Criminological Trials Register
- 13,000+ entries on randomized and possibly randomized trials
- Feeding into C2 systematic reviews
- Feeding into the IES What Works Clearinghouse (USDE)
11
The Campbell Collaboration
- Mission: since 2000, prepare, maintain, and make accessible C2 systematic reviews of evidence on the effects of interventions (“what works”) to inform decision makers and stakeholders.
- International and multidisciplinary: education, social welfare/services, crime and justice
- http://campbellcollaboration.org
- Precedent: Cochrane Collaboration in health (1993)
12
Nine Key Principles of C2: A Scientific Ethic
1. Collaborating across Nations and Disciplines
2. Building on Enthusiasm
3. Avoiding Duplication
4. Minimizing Bias
5. Keeping Current
6. Striving for Relevance
7. Promoting Access
8. Ensuring Quality
9. Maintaining Continuity
13
What are Evidence Grading Schemes (EGSs)?
These are inventories (guidance, checklists, scales) or processes that facilitate making transparent and uniform scientific judgments about the quality of evidence on effects of programs or practices or policies.
14
C2’s and Others’ Major Evidence Grading Distinction on What Works
- Randomized controlled trials yield the least biased and least equivocal evidence on “what works,” i.e. the effect of a new intervention (program, practice, etc.).
- Alternative methods to estimate the effect of interventions yield more equivocal and more biased estimates of effect, e.g. “before-after” evaluations and other nonrandomized trials.
- Both randomized trials and nonrandomized trials are important, but they must be separated in evidence grading schemes.
15
Example: Randomized Controlled Trial
- Individuals or entities such as villages or organizations are randomly allocated to one of two or more interventions.
- The random allocation assures a fair comparison of the effects of the interventions.
- And the random allocation assures a statistically credible statement about confidence in the result, e.g. confidence intervals and statistical tests.
16
More Specific Example
- A new commercial curriculum package for math education is the intervention under investigation.
- The new curriculum is RANDOMLY allocated to half of a sample of 100 schools, with the remaining half of schools serving as a control group, so as to form two equivalent groups of schools (fair comparison).
- The outcomes, such as achievement test scores, from the intervention group and the control group are compared (see the sketch below).
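A minimal sketch of the design just described: 100 schools are randomly allocated, half to the new curriculum and half to a control group, and the difference in mean outcomes is reported with an approximate 95% confidence interval. The achievement scores (and the 5-point effect built into them) are simulated purely for illustration.

```python
import math
import random
import statistics

random.seed(42)

# Randomly allocate 100 schools: half receive the new curriculum, half are controls.
schools = list(range(100))
random.shuffle(schools)
intervention, control = set(schools[:50]), set(schools[50:])

# Hypothetical achievement scores; a real trial would use observed test results.
scores = {s: random.gauss(250, 20) + (5 if s in intervention else 0) for s in schools}
treated_scores = [scores[s] for s in intervention]
control_scores = [scores[s] for s in control]

# Estimated effect: difference in mean outcomes, with an approximate 95% CI.
diff = statistics.mean(treated_scores) - statistics.mean(control_scores)
se = math.sqrt(statistics.variance(treated_scores) / len(treated_scores) +
               statistics.variance(control_scores) / len(control_scores))
print(f"Estimated effect: {diff:.1f} points "
      f"(95% CI {diff - 1.96 * se:.1f} to {diff + 1.96 * se:.1f})")
```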
17
Entities and Evidence Grading Schemes for What Works
- Cochrane Collaboration: systematic reviews in health
- Campbell Collaboration: crime, education, welfare
- Society for Prevention Research (Prevention Science, 2006)
- What Works Clearinghouse, Institute of Education Sciences (WWC, IES): http://whatworks.ed.gov
- Food and Drug Administration, other regulatory agencies
- National Register of Evidence-based Programs and Practices
- Others: California, etc.
18
What are the Ingredients of EGSs?
- Pre-specification of primary outcomes
- Pre-specification of all analyses
- Pre-specification of all measures
- Control for assignment/selection bias
- Appropriate comparison condition
- Comparison condition fidelity
- Control for subject awareness of assigned intervention
- Control for provider awareness of assigned intervention
- Control for data collector awareness of assigned intervention
- Assurances to participants to elicit disclosure
- Intervention fidelity/measurement of exposure
- Reliability and validity of exposure measures
- Control for contamination and co-intervention
- Reliability of outcome measures
- Validity of outcome measures
- Adherence to standards for data collection
- Adjustment for differential attrition
- Adjustment for overall loss to follow-up
- Adjustment for missing data
- Analysis meets statistical assumptions
- Analysis consistent with study theory
- Adjustment for multiple measures
- Absence of or explanation for anomalous findings
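The sketch below illustrates, with a hypothetical subset of the items above, how an evidence grading scheme can turn item-by-item judgments about a study into a transparent, uniform rating. The six-item checklist, the yes/no judgments, and the three-level rating rule are all invented for illustration; the first two rating labels echo the What Works Clearinghouse categories shown later, and the third is simply an illustrative catch-all.

```python
# Hypothetical checklist drawn from the ingredients listed above.
CHECKLIST = [
    "Pre-specification of primary outcomes",
    "Control for assignment/selection bias",
    "Appropriate comparison condition",
    "Adjustment for differential attrition",
    "Reliability of outcome measures",
    "Validity of outcome measures",
]

def grade_study(judgments: dict[str, bool]) -> str:
    """Map yes/no judgments on each checklist item to a coarse, uniform rating.

    The thresholds below are invented for illustration, not an actual scheme.
    """
    met = sum(judgments.get(item, False) for item in CHECKLIST)
    if met == len(CHECKLIST):
        return "Meets evidence standards"
    if met >= len(CHECKLIST) - 2:
        return "Meets evidence standards with reservations"
    return "Does not meet evidence standards"

# Example: a trial judged to satisfy every item except attrition adjustment.
example = {item: True for item in CHECKLIST}
example["Adjustment for differential attrition"] = False
print(grade_study(example))  # -> Meets evidence standards with reservations
```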
19
WWC Aims
- To be a trusted source of scientific evidence on what works, what does not, and on where evidence is absent
- Not to endorse products
- http://www.whatworks.ed.gov
20
What Works Clearinghouse Illustration
21
Beginning Reading Review Protocol

The Beginning Reading What Works Clearinghouse (WWC) review focuses on reading interventions for students in grades K-3 (or ages 5-8) that are intended to increase skills in alphabetics (phonemic awareness, phonological awareness, letter recognition, print awareness, and phonics), reading fluency, comprehension (vocabulary and reading comprehension), or general reading achievement. Interventions for this review are defined as programs, products, practices, or policies that are intended to increase skills in the areas named above. For the first set of Beginning Reading intervention reports, the WWC focused on “branded” programs and products.

[Table: effectiveness ratings for Beginning Reading programs in four domains (Alphabetics, Comprehension, Fluency, General reading achievement) for the interventions DaisyQuest and Reading Recovery®; the ratings themselves appear as icons in the original and are not reproduced here.]

WWC Intervention Reports provide all findings that "Meet Evidence Standards" or "Meet Evidence Standards with Reservations" for studies on a particular intervention. Intervention reports are created for those interventions that have at least one study that "Meets Evidence Standards" or "Meets Evidence Standards with Reservations." Intervention reports are one component of the decision-making process, but should not be the sole source of information when making educational decisions.

Key:
- Positive effects: strong evidence of a positive effect with no overriding contrary evidence
- Potentially positive effects: evidence of a positive effect with no overriding contrary evidence
- Mixed effects: evidence of inconsistent effects
- No discernible effects: no affirmative evidence of effects
- Potentially negative effects: evidence of a negative effect with no overriding contrary evidence
- Negative effects: strong evidence of a negative effect with no overriding contrary evidence
22
Example: C2 Parental Involvement Trials
- 500 possibly relevant studies of impact
- 45 possible randomized controlled trials (RCTs)
- 20 RCTs met study inclusion criteria
- 18 RCTs included in the meta-analysis
- Nye, Turner, Schwartz: http://campbellcollaboration.org
23
[Figure: forest plot, “Efficacy of Parent Involvement on Student Achievement,” showing Hedges’s g with 95% confidence intervals for each of the 18 randomized trials (Ryan 1964 through Powell-Smith 2000), on an axis from -2.00 (favors control) to +2.00 (favors treatment). The pooled fixed-effect estimate is g = 0.430 (95% CI 0.299 to 0.561); the random-effects estimate is g = 0.453 (95% CI 0.248 to 0.659).]
Heterogeneity statistics for a fixed effects model: Q = 35.6, df = 17, p = 0.005, I-squared = 52.3%.
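A minimal sketch of the fixed-effect pooling and heterogeneity statistics behind a figure like this one. Only five of the 18 trials are used, and their standard errors are backed out from the 95% limits shown in the plot, so the results are approximate and will not reproduce the full-figure estimates.

```python
import math

# (Hedges's g, lower 95% limit, upper 95% limit) for a subset of the trials above.
studies = {
    "Ryan (1964)":    (0.347,  0.088, 0.605),
    "Aronson (1966)": (1.109,  0.421, 1.798),
    "Tizard (1982)":  (0.879,  0.369, 1.390),
    "Kosten (1997)":  (0.075, -0.573, 0.723),
    "Hewison (1988)": (0.646,  0.089, 1.203),
}

# Back out standard errors from the 95% limits, then form inverse-variance weights.
weights, effects = [], []
for g, lower, upper in studies.values():
    se = (upper - lower) / (2 * 1.96)
    weights.append(1 / se ** 2)
    effects.append(g)

# Fixed-effect pooled estimate and its confidence interval.
pooled = sum(w * g for w, g in zip(weights, effects)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

# Heterogeneity: Cochran's Q and I-squared.
q = sum(w * (g - pooled) ** 2 for w, g in zip(weights, effects))
df = len(studies) - 1
i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

print(f"Fixed-effect g = {pooled:.3f} "
      f"(95% CI {pooled - 1.96 * pooled_se:.3f} to {pooled + 1.96 * pooled_se:.3f})")
print(f"Q = {q:.1f}, df = {df}, I-squared = {i_squared:.1f}%")
```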
24
Example: Petrosino et al. on Scared Straight Trials
- Over 600 articles are possibly relevant to the impact of Scared Straight.
- Only 15 reach a “reasonable” level of scientific standard.
- Only 7 reached the standard of being a randomized controlled trial.
25
Figure 1. The effects of Scared Straight and other juvenile awareness programs on juvenile delinquency: random effects model, “first effect” reported in the study (Petrosino, Turpin-Petrosino, and Buehler, 2002).
- n = number of failures
- N = number of participants
- CI = confidence intervals
- Random = random effects model assumed
26
C2 Product: Scared Straight
Pro Humanitate Award

Observational Studies:
- Ashcroft: -50% crime
- Buckner: 0%
- Berry: -5%
- Mitchell: -53%
- Several dozen others

Randomized Trials:
- Mich: +26% crime
- Gtr Egypt: +5%
- Yarb: +1%
- Orchow: +2%
- Vreeland: +11%
- Finckenauer: +30%
- Lewis: +14%
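The short sketch below simply contrasts the two lists: an unweighted average of the reported percent changes in crime for the observational studies versus the randomized trials. A real synthesis would weight each study by its precision, so this is illustration only, but it makes the reversal in direction plain.

```python
from statistics import mean

# Percent changes in crime reported on the slide (negative = reduction claimed).
observational = {"Ashcroft": -50, "Buckner": 0, "Berry": -5, "Mitchell": -53}
randomized = {"Mich": 26, "Gtr Egypt": 5, "Yarb": 1, "Orchow": 2,
              "Vreeland": 11, "Finckenauer": 30, "Lewis": 14}

print(f"Observational studies, mean reported change: {mean(observational.values()):+.1f}%")
print(f"Randomized trials, mean reported change:     {mean(randomized.values()):+.1f}%")
# The observational designs suggest large reductions in crime, while the
# randomized trials suggest the programs increase offending.
```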
27
Scientific Ethic
- Providing access to scientific reports of evaluations of the effects of interventions, e.g. journal publications and limited-circulation reports from governments or private organizations
- Providing information beyond reports to assure understanding
- In principle, but not always in practice, providing access to micro-records from impact evaluations
28
Ethics of Research on Humans
- Evidence grading schemes and organizations need not worry about individual privacy because they do not, as yet, have access to individuals’ records in identifiable form.
- They rely only on statistical/scientific reports that are published in peer-reviewed journals and other reports, which include no individual records.
29
Ethics and Law: US
- Individual rights to privacy are routinely assured on account of professional ethics statements and laws in the US.
- The relevant codes of professional ethics in the US include those of AERA, ASA, AAPOR, APA, and others.
- The relevant laws in the US include the Family Educational Rights and Privacy Act (FERPA), the Privacy Act, and HIPAA.
30
Ethics and Randomized Controlled Trials
- Relevant codes and law concern individual privacy and the confidentiality of individuals’ identifiable micro-records.
- Relevant regulations and codes include attention to informed consent (45 CFR 46).
- Access to anonymous micro-records for secondary analysis is problematic and possibly unnecessary in this context.
31
Appendices
32
Robert Boruch: Bio
Boruch is the University Trustee Chair Professor in the Graduate School of Education and the Statistics Department of the Wharton School at the University of Pennsylvania, Philadelphia, Pennsylvania.
Boruch is a Fellow of the American Statistical Association, the Academy of Experimental Criminology, the American Academy of Arts and Sciences, and the American Educational Research Association.
Email: [email protected]
33
Provision to Advance Rigorous Evaluations in Legislation

The program shall allocate X% of program funds [or $Y million] to evaluate the effectiveness of funded projects using a methodology that –
- Includes, to the maximum extent feasible, random assignment of program participants (or entities working with such persons) to intervention and control groups; and
- Generates evidence on which program approaches and strategies are most effective.
The program shall require program grantees, as a condition of grant award, to participate in such evaluations if asked, including the random assignment.
34
Provision to Advance Replication of Research-Proven Interventions

Agency shall establish a competitive grant program focused on scaling up research-proven models.
Grant applicants shall –
- Identify the research-proven model they will implement, including supporting evidence (well-designed RCTs showing sizeable, sustained effects on important outcomes);
- Provide a plan to adhere closely to key elements of the model; and
- Obtain sizeable matching funds from other sources, especially large formula grant programs.
35
A Focus on Data Bases that Concern “What Works”
- Here, the focus is on projects that generate evidence about “what works,” and what does not work, using good scientific standards.
- This is different from a focus on projects or programs that generate information on the nature of a problem, monitoring of program compliance with law, etc.
36
What are the Campbell Collaboration (C2) Assumptions?
- Public interest in evidence based policy and practice will increase.
- Scientific and government interest in cumulation and synthesis of evidence on “what works” will increase.
- Access to information and evidence of dubious quality, and the need to screen for quality of evidence, will increase.
- The use of randomized controlled trials to generate trustworthy evidence on what works will increase.
37
What are the Products?
1. Registries of C2 Systematic Reviews of the effects of interventions (C2-RIPE)
2. Registries of reports of randomized trials and non-randomized trials (C2-SPECTR) and future reports of randomized trials (C2-PROT)
3. Standards of evidence for conducting C2 systematic reviews
4. Annual Campbell Colloquia
5. Training for producing reviews
6. New technologies and methodologies
7. Web site: http://www.campbellcollaboration.org
38
What are Other C2 Products?
- C2 Trials Register (C2 SPECTR): 13,000 entries
- Annals of the American Academy of Political and Social Science: special issues
- C2 Prospective Trials Register
- C2 Policy Briefs
- Annual and intermediate meetings: London, Philadelphia, Stockholm, Lisbon, Paris, Oslo, Copenhagen, Helsinki, Los Angeles
39
Hand Search vs Machine Based Search
Journal of Educational Psychology (2003-2006)
- Hand search: RCTs = 66
- Full-text electronic search, N = 99: 59% accurate, 41% false positives, 24% false negatives
- Abstract-only electronic search, N = 11: 91% accurate, 9% false positives, 85% false negatives
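A minimal sketch, with hypothetical report identifiers, of how rates like those above are computed by treating the hand search as the gold standard and comparing an electronic search result against it. The counts are chosen only to mimic the full-text figures for accuracy and false positives; the false-negative share depends entirely on the overlap one assumes, so it does not reproduce the 24% on the slide.

```python
# Hypothetical report IDs; the hand search is treated as the gold standard.
hand_search = set(range(66))                         # 66 RCTs found by hand
electronic = set(range(58)) | set(range(100, 141))   # 99 electronic hits: 58 real, 41 spurious

true_positives = electronic & hand_search
false_positives = electronic - hand_search
false_negatives = hand_search - electronic

print(f"Accurate (electronic hits that are real RCTs): "
      f"{len(true_positives) / len(electronic):.0%}")
print(f"False positives (share of electronic hits):    "
      f"{len(false_positives) / len(electronic):.0%}")
print(f"False negatives (hand-search RCTs missed):     "
      f"{len(false_negatives) / len(hand_search):.0%}")
```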

40
What Is the Value Added?
- Building a cumulative knowledge base
- Developing exhaustive searches
- Producing transparent and uniform standards of evidence
- International scope
- Periodic updating
- Making reviews accessible
41
C2 Futures/Tensions
- C2 production: AIR and others
- C2 publications vs. journals
- C2 and governments, and C2 apart from governments
- C2 and sustainability: C2 as a voluntary organization versus C2 spin-off organizations and products
42
What are Other Illustrative Reviews?
- “Scared Straight” Programs (Done, Award)
- Multi-systemic Therapy (Done)
- Parental Involvement (Done)
- After School Programs (Due 12/05)
- Peer Assisted Learning
- Counter-Terrorism Strategies (Under revision)
- Reducing Illegal Firearms Possession
43