FDA GUIDELINE ON DMCs: CONTROVERSIES


STATISTICAL
CONSIDERATIONS IN
CLINICAL TRIALS
Susan S. Ellenberg, Ph.D.
University of Pennsylvania School of
Medicine
ASENT Clinical Trials Course
Arlington, VA
March 6, 2008
CCEB
TOPICS
• Randomization
• Sample size determination
• Dropouts and noncompliance
• Multiplicity
• Interim monitoring
WHY WE NEED CONTROLS
• Changes from baseline could be due to
factors other than intervention
– Natural variation in disease course
– Patient expectations/psychological effects
– Regression to the mean
• Cannot assume investigational treatment
is cause of observed changes
RANDOMIZATION
[Diagram: patients with good prognosis (GP) and poor prognosis (PP) are randomized to Treatment A or Treatment B; each arm receives both GP and PP patients.]
STRATIFIED
RANDOMIZATION
• Randomization in principle should
produce groups that are prognostically
equivalent
• In practice, not uncommon to observe
imbalances in treatment assignments
• Performing randomization within strata
defined by prognostic factors reduces
risk of such imbalances
BLOCKED RANDOMIZATION
• “Blocking” refers to a constraint on
randomization that forces the numbers
assigned to treatment and control to be
equal after every X assignments, where X
is the specified block size
• X must be a multiple of the number of
treatment arms
• In a 2-arm trial, X typically is 2, 4 or 6
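The blocking constraint described above can be sketched in a few lines of Python. This is a minimal illustration, not a production randomization system: the function name `blocked_assignments`, the arm labels, and the seed are all hypothetical choices made for the example.

```python
import random

def blocked_assignments(n_patients, arms=("A", "B"), block_size=4, seed=None):
    """Return a treatment list built from randomly permuted blocks.

    Hypothetical helper for illustration only.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block size must be a multiple of the number of arms")
    rng = random.Random(seed)
    per_arm = block_size // len(arms)
    assignments = []
    while len(assignments) < n_patients:
        block = list(arms) * per_arm   # equal numbers of each arm in the block
        rng.shuffle(block)             # random order within the block
        assignments.extend(block)
    return assignments[:n_patients]    # the last block may be cut short

schedule = blocked_assignments(12, block_size=4, seed=2008)
print(schedule)
# After every 4 assignments the two arms are balanced.
for start in range(0, 12, 4):
    block = schedule[start:start + 4]
    print(block, "A:", block.count("A"), "B:", block.count("B"))
```

For stratified randomization, the same function could be called once per stratum so that each stratum has its own list.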
CENTRAL VS LOCAL
RANDOMIZATION
• Local randomization allows more
flexibility at the site regarding the
time of randomization
• More quality assurance concerns when
each site has its own randomization
lists, especially in an open-label study
– Inadvertent bias
– Deliberate subversion of randomization
– Use of envelopes particularly
problematic
SAMPLE SIZE DEPENDS ON
• Size of effect (“difference”) to be
detected (or ruled out, for
noninferiority trials)
• Desired limits on error rates (α, β)
• Variability of outcome
CONTROL OF ERROR
• Significance level = Type I error = α
Probability of concluding there is a
treatment difference when there is really
no difference (false positive)
• Type II error = β
Probability of concluding there is no
treatment difference when there truly is a
difference of given size (false negative)
• Power = 1 - Type II error = 1 - β
Probability of detecting a difference of
the given size when it truly exists
WHEN WE ARE INVESTIGATING
PROPORTIONS
• The smaller the desired error rates...
• The smaller the difference to be
detected...
• The closer the expected event rates to
0.5…
THE LARGER THE SAMPLE SIZE WILL BE
SAMPLE SIZE BY ERROR RATE AND
DIFFERENCE TO BE DETECTED
Event/success rates   α=0.05, pwr=0.80   α=0.05, pwr=0.90   α=0.01, pwr=0.90
0.10 vs 0.20          438                572                794
0.20 vs 0.30          626                824                1152
0.20 vs 0.40          182                236                328
0.40 vs 0.60          214                278                386
From Fleiss, Levin and Paik, Statistical Methods for Rates and
Proportions, John Wiley
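The dependence of sample size on error rates and effect size can be made concrete with a short calculation. The sketch below assumes a two-sided test of two proportions with equal allocation and a continuity correction, and it reports the total for both arms combined. Those assumptions are not stated on the slide; they are the reading under which the calculation agrees with the tabulated values. The function name `total_sample_size` is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

def total_sample_size(p1, p2, alpha=0.05, power=0.80):
    """Total patients (both arms) to compare two proportions.

    Assumes a two-sided test, equal allocation, and a continuity
    correction; these are assumptions of this sketch.
    """
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for alpha
    z_b = NormalDist().inv_cdf(power)           # critical value for power
    p_bar = (p1 + p2) / 2
    delta = abs(p1 - p2)
    # Uncorrected per-arm sample size
    n_unc = (z_a * sqrt(2 * p_bar * (1 - p_bar))
             + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / delta ** 2
    # Continuity-corrected per-arm sample size
    n_cc = n_unc / 4 * (1 + sqrt(1 + 4 / (n_unc * delta))) ** 2
    return 2 * ceil(n_cc)

for rates in [(0.10, 0.20), (0.20, 0.30), (0.20, 0.40), (0.40, 0.60)]:
    print(rates, total_sample_size(*rates, alpha=0.05, power=0.80))
```

Under these assumptions the first column of the table is reproduced (438, 626, 182, 214). The 0.20 vs 0.30 comparison needs more patients than 0.10 vs 0.20 even though the difference is the same, because the rates are closer to 0.5.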
SURVIVAL (TIME-TO-EVENT)
DATA
• Sample size depends on the number
of events that are expected
• The expected number of events
increases if
– sample size increases
– patients are followed longer
– higher risk patients are entered
– more outcomes are counted as events
(e.g., recurrence and death instead of
just death)
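A rough sketch of why events, not patients, drive the calculation. The first function uses a widely cited approximation for the number of events needed by a log-rank comparison with 1:1 allocation (Schoenfeld's formula); the second assumes a constant hazard (exponential survival). Neither the formula nor the exponential model appears on the slide, and both function names are hypothetical.

```python
from math import ceil, exp, log
from statistics import NormalDist

def required_events(hazard_ratio, alpha=0.05, power=0.80):
    """Approximate total events needed (two-sided test, 1:1 allocation)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    return ceil(4 * (z_a + z_b) ** 2 / log(hazard_ratio) ** 2)

def expected_events(n_patients, hazard_per_year, years_followed):
    """Expected events assuming a constant hazard and no dropout."""
    return n_patients * (1 - exp(-hazard_per_year * years_followed))

print("events needed for HR 0.75:", required_events(0.75))
# More patients, longer follow-up, or higher-risk patients all raise
# the expected number of events.
print(expected_events(500, 0.10, 2))    # baseline
print(expected_events(1000, 0.10, 2))   # larger sample
print(expected_events(500, 0.10, 4))    # longer follow-up
print(expected_events(500, 0.20, 2))    # higher-risk patients
```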
INTENT-TO-TREAT (ITT)
PRINCIPLE
All randomized patients should be
included in the (primary) analysis,
in their assigned treatment
groups, regardless of compliance
with the assigned treatment.
IMPORTANT IMPLICATION
OF ITT PRINCIPLE
All patients entered into the
study should be followed for the
study outcome, regardless of
compliance with the assigned
treatment or other aspects of
the protocol.
WHY DO INVESTIGATORS
WANT TO EXCLUDE
SUBJECTS FROM ANALYSIS?
• They refused the assigned treatment
• They didn’t return for evaluations
• They were found to be ineligible after
randomization
• They started taking other treatments
after randomization that violated
protocol
EFFECT OF
RANDOMIZATION
[Diagram: patients with good prognosis (GP) and poor prognosis (PP) are randomized to Treatment A or Treatment B; each arm receives both GP and PP patients.]
EXAMPLE: BIAS DUE TO
EXCLUSION FROM ANALYSIS
• Randomized trial of cancer therapy
following surgery to remove tumor
– Arm 1: chemotherapy after recovery from
surgery
– Arm 2: no further therapy after surgery
• Not blinded—side effects of
chemotherapy would reveal treatment
• Protocol called for treatment to commence
no later than 6 weeks post-surgery
EXAMPLE (cont.)
• What if treatment did not start until more
than 6 weeks post-surgery?
– Rationale for therapy is that it will kill any
remaining cancer not removed at surgery
– If therapy not started shortly after surgery,
won’t work—including such patients will dilute
treatment effect
• What’s wrong with this?
EXAMPLE (cont.)
• Only those assigned to post-surgical
treatment are at risk of being excluded
• What if those who start therapy late are
those who had the most extensive surgery
and thus required longer recovery period?
• What if those with most extensive surgery
are most likely to have remaining unseen
cancer?
CORONARY DRUG PROJECT
Five-year mortality by treatment group
Treatment Group   N      % mortality
clofibrate        1065   18.2
placebo           2695   19.4
Coronary Drug Project Research Group, JAMA, 1975
CORONARY DRUG PROJECT
Five-year mortality by adherence to
clofibrate
Adherence   N     % mortality
<80%        357   24.6
≥80%        708   15.0
Coronary Drug Project Research Group, NEJM, 1980
CORONARY DRUG PROJECT
Five-year mortality by adherence to
clofibrate and placebo
            Clofibrate           Placebo
Adherence   N     % mortality    N      % mortality
<80%        357   24.6           882    28.2
≥80%        708   15.0           1813   15.1
Coronary Drug Project Research Group, NEJM, 1980
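The arithmetic behind these three slides can be checked directly. The sketch below uses only the numbers shown above; the dictionary layout and the function name `overall_mortality` are hypothetical.

```python
# (N, % five-year mortality) by adherence stratum, as shown on the slides.
# "high" is the stratum with adherence of 80% or more.
cdp = {
    "clofibrate": {"low": (357, 24.6), "high": (708, 15.0)},
    "placebo":    {"low": (882, 28.2), "high": (1813, 15.1)},
}

def overall_mortality(strata):
    """Pool the adherence strata back into one treatment group."""
    total_n = sum(n for n, _ in strata.values())
    deaths = sum(n * pct / 100 for n, pct in strata.values())
    return total_n, 100 * deaths / total_n

for arm, strata in cdp.items():
    n, pct = overall_mortality(strata)
    gap = strata["low"][1] - strata["high"][1]
    print(f"{arm}: N={n}, overall mortality={pct:.1f}%, "
          f"poor minus good adherers={gap:.1f} points")
```

Pooling the strata recovers the intent-to-treat figures (18.2% and 19.4%). The mortality gap between poor and good adherers is at least as large on placebo as on clofibrate, so the adherence comparison reflects who adheres, not what the drug does.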
MULTIPLICITY
• Suppose we do a study in which we
compare placebo A with placebo B
• We do not expect the results to be
identical, but we do expect them to
be similar
• If we look at the data in enough ways,
however, we may well find an
occasional “statistically significant”
difference: a FALSE POSITIVE
PRE-SPECIFYING OBJECTIVES
• In designing a trial, need to decide
how treatment will be evaluated
• Often not straightforward—may be
many ways of measuring treatment
effect
• Problem: if we don’t determine
primary measure of effect in
advance, the multiplicity issue arises
RCT EXAMPLE: “LIQUID
STITCHES”
• New material developed that surgeon can
apply to close wound, stop bleeding
• Need to study how quickly and
effectively bleeding is stopped
• Possible outcomes of interest:
– time to cessation of bleeding
– whether bleeding stopped within X sec
– total amount of blood loss
– whether further effort was needed to
stop bleeding
– whether blood loss greater than Y ml
MORE MULTIPLICITY
• Which statistical test?
• How to handle missing data?
• What baseline factors (e.g.,
size of wound) should be taken
into account?
OTHER MULTIPLICITY
CONCERNS
• Subset effects: no overall treatment effect,
but effect seen in a subset, e.g.,
– women
– those over age of 50
– those with early stage disease
– those treated in specialty clinics
• Time effect
– no overall treatment effect at prespecified time
point, but effect seen at earlier time point
COMMON PHRASES RELATED
TO THE MULTIPLICITY
PROBLEM
• Testing to a foregone conclusion
• Data dredging
• Torturing the data until they confess
Subset analyses are important in
developing information about optimal
treatment strategies
BUT
Subset analyses may be unreliable
since multiple analyses frequently
produce spuriously positive (or
negative) results
PROBABILITY OF POSITIVE SUBSET
WHEN NO TRUE DIFFERENCES
No. subsets*   Prob. ≥ 1 subset with p<0.05
2              0.10
5              0.23
10             0.40
20             0.64
*non-overlapping
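These probabilities follow from a one-line calculation. With k non-overlapping subsets and no true difference, each test independently has a 0.05 chance of a false positive, so the chance of at least one is 1 - 0.95^k. A minimal sketch (the function name is hypothetical):

```python
def prob_at_least_one_false_positive(n_subsets, alpha=0.05):
    """Chance that at least one of n independent tests has p < alpha."""
    return 1 - (1 - alpha) ** n_subsets

for k in (2, 5, 10, 20):
    print(k, round(prob_at_least_one_false_positive(k), 2))
```

Independence holds here only because the subsets do not overlap; overlapping subsets would give correlated tests and a different probability.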
ALSO WORKS OTHER WAY
• Suppose a clinical trial is positive
• What will you find when you examine
results in subsets?
• The more subsets examined, the
greater the chance you will find a
subset in which the result is in the
opposite direction from overall result
MULTIPLE TESTING AND
EARLY STOPPING
• Repeated testing of data to evaluate
any emerging differences between arms
offers multiple chances to observe a
nominally significant (i.e., p<0.05) result
• Conducting repeated tests at the nominal
level will inflate the false positive rate
MULTIPLE LOOKS AND TYPE I
ERROR
Probability of nominally significant result (%), by number of repeated tests
Nominal significance level   1     2     3      4      5      10     25     50     200
.01                          1     1.8   2.4    2.9    3.3    4.7    7.0    8.8    12.6
.05                          5     8.3   10.7   12.6   14.2   19.3   26.6   32.0   42.4
From McPherson K, New England Journal of Medicine; 290:501-2, 1974
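The inflation shown in this table can be approximated by simulation. The sketch below assumes equally spaced looks at a normally distributed test statistic with known variance and no true treatment effect. These are simplifying assumptions of the example, and the function name `false_positive_rate` is hypothetical. Simulated rates will differ slightly from the tabulated values because of Monte Carlo error.

```python
import random
from math import sqrt

def false_positive_rate(n_looks, critical_z=1.96, n_trials=20000, seed=0):
    """Share of null trials that cross |z| > critical_z at any look."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        total = 0.0
        for look in range(1, n_looks + 1):
            total += rng.gauss(0.0, 1.0)   # data accrued since last look
            if abs(total / sqrt(look)) > critical_z:
                hits += 1                  # nominally significant: stop
                break
    return hits / n_trials

for k in (1, 2, 5, 10):
    print(k, "looks:", round(100 * false_positive_rate(k), 1), "%")
```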
DEVELOPMENT OF NEW
MONITORING APPROACHES
• Recognition that simple monitoring at
nominal significance level was inadequate
led to new statistical methods for interim
monitoring
• Most common approach currently: group
sequential designs
– Pre-specified number of interim analyses
– Overall significance level controlled at desired
level (e.g., 0.05)
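As an illustration of how a group sequential design controls the overall level, the sketch below simulates five equally spaced looks under no treatment effect using two well-known boundary shapes. The slide names no specific method; the Pocock constant (2.413) and the O'Brien-Fleming constant (2.040) for five looks at an overall two-sided level of 0.05 are standard tabulated values brought in for the example. The function name is hypothetical.

```python
import random
from math import sqrt

def overall_type_i_error(boundaries, n_trials=20000, seed=1):
    """Share of null trials crossing the boundary at any planned look.

    boundaries[k-1] is the critical |z| value used at look k.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        total = 0.0
        for look, critical_z in enumerate(boundaries, start=1):
            total += rng.gauss(0.0, 1.0)   # data accrued since last look
            if abs(total / sqrt(look)) > critical_z:
                hits += 1
                break
    return hits / n_trials

looks = 5
nominal = [1.96] * looks                                  # no adjustment
pocock = [2.413] * looks                                  # same value each look
obrien_fleming = [2.040 * sqrt(looks / k) for k in range(1, looks + 1)]

for name, bounds in [("nominal", nominal), ("Pocock", pocock),
                     ("O'Brien-Fleming", obrien_fleming)]:
    print(name, round(overall_type_i_error(bounds), 3))
```

Both adjusted boundaries should keep the simulated overall error near 0.05, while the unadjusted boundary does not. The O'Brien-Fleming shape is very strict at early looks and close to the nominal value at the final one.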
FUTILITY TESTING
• Evaluates whether the possibility of an
eventual positive result can be ruled out
• No repeated testing issues--can evaluate
on any schedule, with any frequency
• Termination unlikely on this basis until
most of the trial is completed
REFERENCES
• This is just a brief overview of some
important issues in designing, conducting and
analyzing clinical trials
• Good starting references
– Friedman, Furberg, DeMets, Fundamentals of
Clinical Trials (Springer)
– International Conference on Harmonization
Guidance: Statistical Principles for Clinical Trials
www.fda.gov/cder/guidance/ICH_E9-fnl.PDF