Transcript Slide 1

8.0 Assessing the Quality of the Evidence

Assessing study quality or critical appraisal
► Minimize bias
► Weight for quality
► Assess the relationship between effect size and quality
Coding for study quality
 Pre-established criteria (internal validity) applied to each study to inform the synthesis (meta-analysis)
 Only use findings from studies judged to be of high quality, or qualify the findings
 Look for homogeneity/heterogeneity
 Examine differences in findings according to quality (sensitivity analysis) – as in the sketch below
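In code, such a sensitivity analysis might look like the following minimal Python sketch; the effect sizes, variances and quality flags are invented for illustration, and the pooling is a simple fixed-effect inverse-variance estimate:

```python
import numpy as np

def pooled_effect(effects, variances):
    """Fixed-effect inverse-variance pooled estimate."""
    w = 1.0 / np.asarray(variances)   # inverse-variance weights
    return np.sum(w * np.asarray(effects)) / np.sum(w)

# Hypothetical coded studies: (effect size d, variance, high-quality flag)
studies = [(0.45, 0.02, True), (0.30, 0.03, True),
           (0.80, 0.05, False), (0.70, 0.04, False)]
d, v, hq = map(np.array, zip(*studies))

print(f"All studies:       d = {pooled_effect(d, v):.2f}")
print(f"High quality only: d = {pooled_effect(d[hq], v[hq]):.2f}")
# A drop when low-quality studies are excluded suggests quality-related bias.
```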
“A careful look at randomized experiments will make clear that they are not the gold standard. But then, nothing is. And the alternatives are usually worse.”
Berk RA. (2005) Journal of Experimental Criminology 1, 417-433.
Code for Study Design Characteristics
1. Design Type
 RCT, quasi-experiment, or other
2. Fidelity to Random Allocation
 Is the method of assignment unclear? Look for confusion between non-random and random assignment – the former can lead to bias. (A sketch of genuine vs. mislabelled random assignment follows this list.)
3. Allocation Concealment
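As an illustration of what fidelity means here, a minimal sketch (hypothetical pupil IDs) contrasting genuine computer-generated random assignment with alternation, a systematic scheme that is often mislabelled as random:

```python
import random

def random_allocation(ids, seed=None):
    """Computer-generated random assignment: shuffle, then split in half."""
    rng = random.Random(seed)
    ids = list(ids)
    rng.shuffle(ids)                  # every ordering equally likely
    half = len(ids) // 2
    return {"intervention": ids[:half], "control": ids[half:]}

def alternation(ids):
    """NOT random: deterministic, and easy for a recruiter to predict."""
    return {"intervention": list(ids)[0::2], "control": list(ids)[1::2]}

pupils = [f"pupil_{i:03d}" for i in range(200)]
groups = random_allocation(pupils, seed=42)
print(len(groups["intervention"]), len(groups["control"]))  # 100 100
```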
Which studies are RCTs?
1. “We took two groups of schools – one group had high ICA use and the other low ICA use – we then took a random sample of pupils from each school and tested them.”
2. “We put the students into two groups, we then randomly allocated one group to the intervention whilst the other formed the control.”
3. “We formed the two groups so that they were approximately balanced on gender and pre-test scores.”
4. “We identified 200 children with a low reading age and then randomly selected 50 to whom we gave the intervention. They were then compared to the remaining 150.”
5. “Of the eight [schools] two randomly chosen schools served as a control group.”
EXAMPLES
Is it randomized?
“The groups were balanced for gender and, as far as possible, for school. Otherwise, allocation was randomized.”
Thomson et al. Br J Educ Psychology 1998;68:475-91.
Is it randomized?
“The students were assigned to one of three groups, depending on how revisions were made: exclusively with computer word processing, exclusively with paper and pencil, or a combination of the two techniques.”
Grejda and Hannafin, J Educ Res 1992; 85:144.
Mixed allocation
“Students were randomly assigned to either Teen Outreach participation or the control condition either at the student level (i.e., sites had more students sign up than could be accommodated and participants and controls were selected by picking names out of a hat or choosing every other name on an alphabetized list) or less frequently at the classroom level.”
Allen et al, Child Development 1997;68:729-42.
Non-random assignment confused with random allocation
“Before mailing, recipients were randomized by rearranging them in alphabetical order according to the first name of each person. The first 250 received one scratch ticket for a lottery conducted by the Norwegian Society for the Blind, the second 250 received two such scratch tickets, and the third 250 were promised two scratch tickets if they replied within one week.”
Finsen V, Storeheier AH. (2006) Scratch lottery tickets are a poor incentive to respond to mailed questionnaires. BMC Medical Research Methodology 6, 19. doi:10.1186/1471-2288-6-19.
Misallocation issues
“23 offenders from the treatment group could not attend the CBT course and they were then placed in the control group.”
Moving non-attenders across arms after randomization destroys the comparability that random allocation created.
Concealed allocation – Why is it important?
► Inflated effect sizes
► Selection bias and exaggeration of group differences
Allocation concealment: a meta-analysis
► 250 randomized trials in the field of pregnancy and childbirth.
► The trials were divided into 3 concealment groups:
 Good concealment (difficult to subvert);
 Unknown (not enough detail in the paper);
 Poor (e.g., randomisation list on a public notice board).
► Results: inflated effect sizes for poorly concealed compared with well-concealed randomisation.
Comparison of adequate, unclear and inadequate concealment

Allocation concealment    Effect size (OR)
Adequate                  1.0
Unclear                   0.67
Inadequate                0.59

P < 0.01
Schulz et al. JAMA 1995; 273:408.
Examples of good allocation concealment
► “Randomisation by centre was conducted by personnel who were not otherwise involved in the research project.” [1]
► Distant assignment was used to “protect [against] overrides of group assignment by the staff, who might have a concern that some cases receive home visits regardless of the outcome of the assignment process.” [2]
[1] Cohen et al. (2005) J of Speech Language and Hearing Res. 48, 715-729.
[2] Davis RG, Taylor BG. (1997) Criminology 35, 307-333.
Assignment Discrepancy
► “Pairs of students in each classroom were matched on a salient pretest variable, Rapid Letter Naming, and randomly assigned to treatment and comparison groups.”
► “The original sample – those students were tested at the beginning of Grade 1 – included 64 assigned to the SMART program and 63 assigned to the comparison group.”
► Pair-matched assignment should yield equal group sizes, so 64 versus 63 is an unexplained discrepancy worth coding.
Baker S, Gersten R, Keating T. (2000) When less may be more: A 2-year longitudinal evaluation of a volunteer tutoring program requiring minimal training. Reading Research Quarterly 35, 494-519.
Change in concealed allocation
[Bar chart: percentage of trials using concealed allocation, <1997 vs. >1996, for drug and no-drug trials; drug P = 0.04, no drug P = 0.70.]
NB: No education trial used concealed allocation.
Example of unbalanced trial affecting results
► Trowman and colleagues undertook a systematic review to see if calcium supplements were useful for helping weight loss among overweight people.
► The meta-analysis of final weights showed a statistically significant benefit of calcium supplements. HOWEVER, a meta-analysis of baseline weights showed that most of the trials had ‘randomized’ lower-weight people into the intervention group. When this was taken into account, there was no longer any difference.
Meta-analysis of baseline body weight
[Forest plot not reproduced in the transcript.]
Trowman R et al. (2006) A systematic review of the effects of calcium supplementation on body weight. British Journal of Nutrition 95, 1033-38.
Summary of assignment and concealment
 Code for design type
 Code for fidelity of allocation
 Code for assignment discrepancies
A sketch of such a coding record follows.
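One way to hold these three items per study is a simple structured record; the field names and categories below are illustrative, not a published coding scheme:

```python
from dataclasses import dataclass, asdict

@dataclass
class DesignCoding:
    """Per-study record for the three items above."""
    study_id: str
    design_type: str              # "RCT", "quasi-experiment", "other"
    allocation_fidelity: str      # "random", "unclear", "non-random"
    assignment_discrepancy: bool  # e.g. reported group sizes contradict the design

record = DesignCoding("Baker2000", "RCT", "random", True)
print(asdict(record))
```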
Other design issues
► Attrition (drop-out)
► Unblinded ascertainment (outcome measurement)
► Small samples can lead to Type II error (concluding there is no difference when there is a difference)
► Multiple statistical tests can give Type I errors (concluding there is a difference when it is due to chance) – see the arithmetic after this list
► Poor reporting of uncertainty (e.g., lack of confidence intervals)
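The multiple-testing point is easy to quantify: with k independent tests of true null hypotheses at a significance level of 0.05, the chance of at least one false positive is 1 − 0.95^k:

```python
# Chance of at least one spurious 'significant' result across k independent
# tests of true null hypotheses, each at alpha = 0.05.
alpha = 0.05
for k in (1, 5, 10, 20):
    print(f"{k:2d} tests: P(at least one false positive) = {1 - (1 - alpha) ** k:.2f}")
```

At 20 tests the probability is about 0.64, so some ‘significant’ results are expected by chance alone.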
Blinding of Participants and Investigators
► Participants can be blinded to:
 Research hypotheses
 Nature of the control or experimental condition
 Whether or not they are taking part in a trial
► Investigators should be blinded (if possible) to follow-up tests, as this eliminates ‘ascertainment’ bias.
Blinded outcome assessment
[Bar chart: percentage of trials with blinded outcome assessment, <1997 vs. >1996, for health and education trials; P = 0.13 and P = 0.03.]
Torgerson CJ, Torgerson DJ, Birks YF, Porthouse J. (2005) A comparison of randomized controlled trials in health and education. British Educational Research Journal, 31:761-785.
Statistical power
► Few effective educational interventions produce large effect sizes, especially when the comparator group is an ‘active’ intervention.
► In a tightly controlled setting, 0.5 of a standard deviation difference at post-test is good. Smaller effect sizes are to be expected in field trials (e.g., 0.25). To detect an effect size of 0.5 with 80% power (sig = 0.05), we need 128 participants for an individually randomized experiment.
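The 128-participant figure can be reproduced with any standard power calculator; for example, a sketch using statsmodels:

```python
from statsmodels.stats.power import TTestIndPower

# Two-arm, individually randomized trial: d = 0.5, alpha = 0.05, power = 0.80.
n_per_arm = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(round(n_per_arm))   # ~64 per arm, i.e. 128 participants in total

# A field-trial effect of d = 0.25 roughly quadruples the requirement.
print(round(TTestIndPower().solve_power(effect_size=0.25, alpha=0.05, power=0.8)))
```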
Percentage of trials underpowered (n < 128)
[Bar chart: percentage of underpowered trials, <1997 vs. >1996, for health and education trials; P = 0.22 and P = 0.76.]
Torgerson CJ, Torgerson DJ, Birks YF, Porthouse J. (2005) A comparison of randomized controlled trials in health and education. British Educational Research Journal, 31:761-785.
Code for analysis issues
► Code for whether, once randomized, all participants are included within their allocated groups for analysis (i.e., was intention to treat analysis used?).
► Code for whether a single analysis is pre-specified before data analysis.
Attrition
► Attrition can lead to bias; a high-quality trial will have maximal follow-up after allocation.
► A good trial reports low attrition with no between-group differences.
► Rule of thumb: 0-5%, not likely to be a problem; 6-20%, worrying; over 20%, selection bias likely (see the helper sketched below).
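The rule of thumb translates directly into a small coding helper (thresholds exactly as stated above):

```python
def attrition_flag(randomized: int, analysed: int) -> str:
    """Classify attrition using the rule of thumb above."""
    loss = 1 - analysed / randomized
    if loss <= 0.05:
        return "not likely to be a problem"
    if loss <= 0.20:
        return "worrying"
    return "possible selection bias"

print(attrition_flag(160, 152))  # 5% loss  -> not likely to be a problem
print(attrition_flag(160, 120))  # 25% loss -> possible selection bias
```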
Poorly reported attrition
► In an RCT of foster carers, extra training was given.
 “Some carers withdrew from the study once the dates and/or location were confirmed; others withdrew once they realized that they had been allocated to the control group.” “117 participants comprised the final sample.”
► No split between groups is given except in one table, which shows 67 in the intervention group and 50 in the control group. The control group is about 25% smaller – unequal attrition is a hallmark of potential selection bias. But we cannot be sure.
Macdonald & Turner, Brit J Social Work (2005) 35, 1265.
What is the problem here?
Random allocation: 160 children in 20 schools (8 per school), 80 in each group.
In one school, 8 children withdrew; 17 children were replaced following discussion with teachers.
Final allocation: 76 children in the control group, 76 in the intervention group.
Intention to Treat (ITT) vs. Treatment Only (TO)
► ITT: the analysis of the outcome measure for all participants initially assigned to a condition, regardless of whether or not they completed or received that intervention.
► TO: the analysis of ONLY those participants who were initially assigned to a condition AND who completed the intervention.
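A toy example (invented data) of how the two analyses diverge when drop-out is related to prognosis:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2000
treated = np.repeat([True, False], n // 2)   # allocation at baseline
prognosis = rng.normal(0, 1, n)
outcome = prognosis                          # a null intervention

# Lower-prognosis participants tend not to complete the intervention.
completed = np.where(treated, prognosis > -1.0, True)

itt = outcome[treated].mean() - outcome[~treated].mean()
to = outcome[treated & completed].mean() - outcome[~treated].mean()
print(f"ITT: {itt:+.2f}   Treatment-only: {to:+.2f}")
# ITT stays near zero (the truth); treatment-only shows a spurious benefit
# because it silently discards the hardest-to-treat participants.
```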
Survey of trial quality

Characteristic            Drug   Health   Education
Cluster randomised          1%     36%       18%
Sample size justified      59%     28%        0%
Concealed randomisation    40%      8%        0%
Blinded follow-up          53%     30%       14%
Use of CIs                 68%     41%        1%
Low statistical power      45%     41%       85%

Torgerson CJ, Torgerson DJ, Birks YF, Porthouse J. (2005) A comparison of randomized controlled trials in health and education. British Educational Research Journal, 31:761-785. (Based on n = 168 trials.)
CONSORT
► Because the majority of health care trials were badly reported, a group of health care trial methodologists developed the CONSORT statement, which indicates key methodological items that must be reported in a trial report.
► This has now been adopted by all major medical journals and some psychology journals.
The CONSORT guidelines, adapted for trials in educational research
► Was the target sample size adequately determined?
► Was intention to treat analysis used? (i.e., were all children who were randomized included in the follow-up and analysis?)
► Were the participants allocated using random number tables, coin flip, or computer generation?
► Was the randomisation process concealed from the investigators? (i.e., were the researchers who were recruiting children to the trial blind to the child’s allocation until after that child had been included in the trial?)
► Were follow-up measures administered blind? (i.e., were the researchers who administered the outcome measures blind to treatment allocation?)
► Was the precision of the effect size estimated (confidence intervals)?
► Were summary data presented in sufficient detail to permit alternative analyses or replication?
► Was the discussion of the study findings consistent with the data?
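For systematic appraisal, the adapted checklist can be held as structured data; the yes/unclear/no rating scale below is an illustrative convention, not part of CONSORT:

```python
CONSORT_EDU_ITEMS = [
    "sample size adequately determined",
    "intention to treat analysis used",
    "random sequence adequately generated",
    "allocation concealed from recruiters",
    "follow-up measures administered blind",
    "precision reported (confidence intervals)",
    "summary data sufficient for re-analysis",
    "discussion consistent with the data",
]

def appraise(ratings):
    """Summarise a study's ratings ('yes' / 'unclear' / 'no') against the list."""
    flagged = [item for item in CONSORT_EDU_ITEMS if ratings.get(item) != "yes"]
    return "no concerns" if not flagged else "check: " + "; ".join(flagged)

print(appraise({item: "yes" for item in CONSORT_EDU_ITEMS}))  # no concerns
```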