Management of Missing Data in
Clinical Trials from a Regulatory
Perspective
H.M. James Hung
Div. of Biometrics I, OB/OPaSS/CDER/FDA
Presented at the FDA/Industry Workshop,
Bethesda, Maryland, September 23, 2004
James Hung, 2004 FDA/Industry
Collaborators
Charles Anello, Yeh-Fong Chen,
Kun Jin, Fanhui Kong,
Kooros Mahjoob, Robert O’Neill,
Ohid Siddiqui
Office of Biostatistics, OPaSS, CDER
Food and Drug Administration
Disclaimer
The views expressed in this presentation
are not necessarily those of the U.S. Food and
Drug Administration.
Acknowledgment
O’Neill (2003, 2004)
Temple (1994-2004)
Outline
• Informative dropout
• Statistical analysis methods
• Methodology consideration
• Summary
Clinical trial focuses on intent-to-treat
population (including completers and
dropouts)
Response variables often measured
over time (e.g., at multiple clinic or
hospital visits)
Often the main clinical hypothesis
concerns the effect δ_K of a test drug relative to
a control at some time K (e.g., end of
study).
Statistical null hypothesis
H0: δ_K = 0
i.e., allow nonzero δ at other time
points? (Does this make sense?)
Unclear why testing only at the last time
point is most relevant (for simplicity? avoid
statistical adjustment for testing multiple times?)
Drug effects over time are important
information.
e.g., inconceivable to market a drug that
is effective only at Week 6, say.
For drug effect over time (or some
period of time, e.g., at steady state), the
relevant null hypothesis is
H0: δ_1 = ∙∙∙ = δ_K = 0
or H0: slope difference = 0 (if
response follows a straight-line model)
or others for relevant time period.
Informative Dropout
In many disease areas, the dropout rate is
high and the results of any analysis of the
ITT population are not interpretable
because of the large amount of missing
data, particularly when dropouts are
‘informative’.
Dropout problems are multi-dimensional
e.g., dropping out due to multiple
reasons: side effects of the drug, health
state is worsening, unperceived benefit
Little knowledge of real causes of
missing data, whether missing
mechanism related to study outcome or
treatment
Informative dropout has many different
definitions, e.g.,
- dependent on observed data,
dependent on missing data,
treatment-related dropout, …
- tied in with missing mechanism
MCAR, MAR, MNAR, NIM, …
O’Neill (2003, 2004)
For regulatory consideration, any
treatment-related dropout is suspected of
being informative, and the
missing mechanism probably needs to
be considered informative (i.e., may
severely bias estimates and tests) unless
proven otherwise.
In a clinical trial, each cohort of dropout
by reason or by dropout time can be very
small.
Difficult or impossible to assess whether
missing values are informative.
[Figure: placebo patients' responses. Outcome (40 to 160) by visit, Week 0 to Week 4.]
[Figure: drug patients' responses. Outcome (40 to 160) by visit, Week 0 to Week 4.]
Based on visual inspection,
the drug seems to perform better than
placebo.
[Figures: Outcome Measure (40 to 160) by visit, Week 0 to Week 4, placebo versus drug, one panel per dropout reason: lack of effect, withdraw consent, insufficient response, adverse events.]
Difficult to tell whether missing
mechanism is ‘ignorable’ or not…
e.g., in a linear response profile, MAR
may be NIM.
[Figure: completers. Outcome Measure (40 to 160) by visit, Week 0 to Week 4, placebo versus drug.]
[Figure: placebo group's mean by dropout reason. Outcome Measure (40 to 160) by visit, Week 0 to Week 4. Legend: lack of effect, withdraw consent, insufficient response, adverse events, protocol violation, completers.]
[Figure: drug group's mean by dropout reason. Outcome Measure (40 to 160) by visit, Week 0 to Week 4. Legend: lack of effect, withdraw consent, insufficient response, adverse events, non-compliance, completers.]
These plots show the difficulty of
classifying dropouts (informative or
not) in individual trials where each
dropout cohort is small (though the total
dropout rate could be high).
These types of analysis should be done
with external historical trials, at least
for classification purposes.
Statistical Analysis Methods
Literature guidance
1) No satisfactory statistical analysis
method for handling non-ignorable
missing data
2) Likelihood-based methods require
assumptions about missing data
mechanism (unverifiable from current
trial data)
Facts
1) Validity of any analysis method is
very much in question.
2) Better alternative method is unclear.
Use of current trial data to seek
imputation method is futile.
3) Dropouts and missing data are
unavoidable.
Glimpse of the analysis problem
Sample mean of size Ni: Yi ~ N(µi, σ²/Ni), i = 1, 2
δ = µ1 − µ2 at last time point
ni = # of completers in group i
fi = ni/Ni
If there is no missing value, we have
D = Y1 − Y2 (unbiased for δ)
V(D) = estimated variance of D
Z = D/[V(D)]^(1/2)
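As a minimal sketch of the complete-data statistic, the following Python computes D, V(D) and Z for two groups observed at the last time point. The function name `z_statistic`, the pooled estimate of the common variance and the sample values are illustrative assumptions, not part of the presentation.

```python
import math


def z_statistic(y1, y2):
    """Return (D, V(D), Z) for two fully observed groups.

    Assumes a common variance across groups, estimated by the pooled
    sample variance (an assumption made for this sketch).
    """
    n1, n2 = len(y1), len(y2)
    mean1 = sum(y1) / n1
    mean2 = sum(y2) / n2
    # Pooled estimate of the common variance sigma^2
    ss = sum((y - mean1) ** 2 for y in y1) + sum((y - mean2) ** 2 for y in y2)
    s2 = ss / (n1 + n2 - 2)
    d = mean1 - mean2                 # D = Y1 - Y2
    v = s2 * (1 / n1 + 1 / n2)        # estimated variance of D
    return d, v, d / math.sqrt(v)     # Z = D / sqrt(V(D))


# Hypothetical end-of-study outcomes with no missing values
d, v, z = z_statistic([98, 105, 110, 92], [112, 120, 108, 115])
```

Once any value is missing, none of the three quantities can be computed this way, which is the problem the next slides take up.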
Missing values ⇒ {D, V(D), Z} not
obtainable.
Can try to get
E(D | data) and V(D | data),
and construct
Z* = E(D | data) / [V(D | data)]^(1/2)
or Z+ = E(D | data) / [V(D)]^(1/2)
For group i, observed data: (Yoi, Ri)
Yoi = sample mean of completers
Ri = vector of indicators for completion or dropout
Ymi = unobservable sample mean of dropouts
E(D | Yo1, Yo2, R1, R2)
= f1 Yo1 − f2 Yo2
+ (1 − f1) E(Ym1 | Yo1, R1)
− (1 − f2) E(Ym2 | Yo2, R2)
Immediately, when f1 ≠ f2, this statistic
is hard to interpret, unless
Ri and Ymi are independent (MI).
Under MI,
E(Ymi | Yoi, Ri) = E(Ymi).
And if E(Ymi) = µi, then completer
analysis might offer a reasonable
estimate of δ.
When f1 = f2 = f,
E(D | Yo1, Yo2, R1, R2)
= f (Yo1 − Yo2) + (1 − f) {E(Ym1 | Yo1, R1) − E(Ym2 | Yo2, R2)},
a linear combination of the observed sample mean
difference of completers and the difference in
conditional means of dropouts (the latter
requires models).
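The decomposition can be illustrated numerically. This is only a sketch: the function name `expected_difference` and all input values are hypothetical, and the conditional means of dropouts (`ym1`, `ym2`) are simply supplied by hand here, whereas in practice they would have to come from an unverifiable model.

```python
def expected_difference(f1, yo1, ym1, f2, yo2, ym2):
    """E(D | data) as a completer/dropout mixture in each group.

    f1, f2   : completion fractions ni/Ni
    yo1, yo2 : observed sample means of completers
    ym1, ym2 : assumed conditional means of dropouts (model-based guesses)
    """
    mean1 = f1 * yo1 + (1 - f1) * ym1
    mean2 = f2 * yo2 + (1 - f2) * ym2
    return mean1 - mean2


# Unequal completion fractions: the dropout terms are weighted differently
# in each group, so the result mixes completer and dropout differences.
unequal = expected_difference(0.8, 10.0, 6.0, 0.6, 8.0, 8.0)

# If dropouts are assumed to behave like completers in their own group,
# the expression collapses to the completer difference yo1 - yo2.
like_completers = expected_difference(0.8, 10.0, 10.0, 0.6, 8.0, 8.0)
```

Varying `ym1` and `ym2` over a plausible range is one simple way to see how strongly the estimate depends on what is assumed about dropouts.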
What about Var (D | data)?
Another formidable task !
For non-likelihood-based methods, it is
difficult to provide useful solutions
unless some kind of ad hoc conservative
imputation is feasible.
LOCF (last observation carried forward)
LOCF tests H0: δ_K = 0.
LOCF can be biased either in favor of
test drug (e.g., when its effect decays
over time*) or against test drug, even
in case of MCAR.
*Siddiqui and Hung (2003)
For assessing drug effect over time,
LOCF can seriously underestimate
variability of measurement and is
unrealistic (i.e., it imputes a constant value
for every visit after the patient dropped
out).
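A minimal sketch of LOCF for one patient's visit sequence makes the constant imputation concrete. The function name `locf`, the use of `None` for a missing visit and the example values are assumptions for illustration.

```python
def locf(visits):
    """Carry the last observed value forward over missing (None) visits.

    Visits before the first observation stay None because there is
    nothing to carry forward.
    """
    filled = []
    last = None
    for value in visits:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Hypothetical patient who drops out after Week 1 (Weeks 0 to 4)
observed = [120, 110, None, None, None]
imputed = locf(observed)
```

Every imputed visit after dropout equals the Week 1 value, so the imputed trajectory has no visit-to-visit variability at all after dropout.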
LAO (last available observation)
Operationally identical to LOCF, this
tests some global drug effect over time,
H0: Σ_h w1h µ1h = Σ_h w2h µ2h
wih = E(dropout rate of drug group i at time h)
µih = expected response of patients dropping out
after time h in drug group i
Is this null hypothesis relevant?
Shao and Zhong (2003)
[Figure: LOCF versus LAO (in red). Y plotted against visits v0 to v3.]
The global mean µi = Σ_h wih µih can be
unbiasedly estimated by the sample
mean. But the usual MSE from ANOVA
may not estimate the right target (Shao and
Zhong, 2003).
LAO results can be difficult to interpret
if dropout reasons or dropout rates are
different in treatment groups.
If drug effect over time is at issue,
why not use all the pertinent data?
(Longitudinal data analysis should be
more efficient than LAO.)
- need medical colleagues’ buy-in
Ex. Analysis of cuff BP over time may be
more powerful (value of test statistic is
much larger) than LAO
Hung, Lawrence, Stockbridge, Lipicky (2000)
MMRM* (mixed-effect model repeated
measure with saturated model)
Response = µ + treatment + time +
treatment*time + baseline +
subject (treatment) + error
subject (treatment) and error are random
effects
treatment and time are class variables
*Mallinckrodt et al (2001)
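Written out in symbols, the model on this slide might look as follows. This is a sketch under assumptions: the index and parameter names are introduced here for illustration, and the normal distributions for the random terms are a conventional choice that the slide does not state.

```latex
% i = treatment group, j = subject within group, k = visit (k = 1, ..., K)
% x_{ij} = baseline value of subject j in group i
y_{ijk} = \mu + \tau_i + \gamma_k + (\tau\gamma)_{ik} + \beta x_{ij}
          + s_{j(i)} + \varepsilon_{ijk},
\qquad s_{j(i)} \sim N(0, \sigma_s^2), \quad
       \varepsilon_{ijk} \sim N(0, \sigma_e^2)

% Treatment effect at the last visit K, at a common baseline value
\delta_K = (\tau_1 - \tau_2) + \{ (\tau\gamma)_{1K} - (\tau\gamma)_{2K} \}
```

Under this notation the test at the last visit is a contrast of the treatment main effect plus the treatment-by-time interaction at visit K.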
MMRM* analysis is used to test
H0: δ_K = 0.
- statistically valid under MAR
- seems more stable in terms of type I
error rate than LOCF under MCAR or
MAR*# (LOCF can be very bad,
depending on δ at other visits)
*Mallinckrodt et al (2001)
#Siddiqui and Hung (2003)
LOCF, LAO, MMRM can be very
problematic in case of informative
missing.
Don’t know how to do ‘conservative’
imputation with these methods.
Worst rank/score analysis
Test drug effect at time K in the
presence of events (e.g., death) that
cause informatively missing values of
the primary study outcome at time K.
Example: In congestive heart failure
trials, exercise time is missing after
death from heart failure.
Lachin (1999)
Assign a worst score to any informatively
missing values (due to occurrence of an
absorbing event related to progression of
disease) and perform a nonparametric
rank analysis.
Valid and efficient for testing H0:
no treatment difference in distributions of
both event time and main study outcome
Lachin (1999)
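The idea can be sketched with a simplified tied-score variant: every informatively missing value gets the same score, worse than any observed value, and the groups are then compared by a rank sum. The helper names, the choice of 0 as the worst score for exercise time and the data are hypothetical, and this sketch does not claim to reproduce the exact procedure in Lachin (1999).

```python
def midranks(values):
    """Return 1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the block of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def worst_score_rank_sum(drug, control, worst):
    """Score missing (None) values as `worst`, then rank both groups together.

    Returns the rank sum W of the drug group and the Mann-Whitney U
    statistic derived from it.
    """
    d = [worst if y is None else y for y in drug]
    c = [worst if y is None else y for y in control]
    ranks = midranks(d + c)
    w = sum(ranks[:len(d)])
    u = w - len(d) * (len(d) + 1) / 2
    return w, u


# Hypothetical exercise times (seconds); None = died before the final visit
w, u = worst_score_rank_sum([300, 420, None, 380], [250, None, None, 310], worst=0)
```

Because deaths enter the ranking, a difference in W reflects survival and exercise time together, which is why the test addresses both distributions jointly rather than exercise time alone.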
For a drug having little effect on a non-mortal
outcome (e.g., exercise time), this
analysis, when used to test the non-mortal
effect, can be anti-conservative if the drug
improves survival.
Unclear how to perform a reasonable test
for the non-mortal effect alone
(e.g., labeling issue)
Time to treatment failure analysis
In time to event analysis, if test drug has
severe side effects that cause more
dropouts, then time to treatment failure
(event or dropping out due to side
effects) analysis may provide a
conservative analysis.
Like the worst score/rank analysis, it is
unclear how to perform a reasonable test
for time to the event of interest alone
- censoring at dropout due to failure?
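The difference between the two analyses comes down to how each patient's record is coded. The sketch below uses hypothetical reason labels and function names; it only shows the recoding step, not the survival analysis itself.

```python
# Each record is (time on study, reason the follow-up ended).
# The reason labels are hypothetical and chosen for this sketch.
FAILURE_REASONS = {"event", "side_effect_dropout"}


def to_treatment_failure(records):
    """Code (time, failed): the event or a side-effect dropout counts as failure."""
    return [(time, reason in FAILURE_REASONS) for time, reason in records]


def to_event_only(records):
    """Code (time, failed): only the event counts; all dropouts are censored."""
    return [(time, reason == "event") for time, reason in records]


records = [
    (12, "event"),
    (5, "side_effect_dropout"),
    (20, "completed"),
    (9, "other_dropout"),
]
composite = to_treatment_failure(records)
event_alone = to_event_only(records)
```

In the event-only coding, the side-effect dropout at time 5 is treated as censored, which is exactly the step whose validity is in question when dropout is informative.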
WLP opposite/pooled imputation
For a binary outcome, opposite imputation
imputes the sample event rate of
completers in one arm for the unobserved
event rate of noncompleters in the
opposite arm.
Wittes, Lakatos, Probstfield (1989)
Proschan et al (2001)
Pooled imputation imputes sample event
rate of completers from both arms for
unobserved event rate of noncompleters
in each arm.
Treat imputed rate as ordinary rate.
Compute Z statistic in the ordinary
manner using a combination of the
observed and the imputed rates.
Wittes et al (1989), Proschan et al (2001)
WLP is less conservative than the worst
case analysis (assign ‘event’ to dropouts
in the test drug group and ‘nonevent’ to
dropouts in the control group).
Proschan et al (2001)
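The three imputation rules can be compared side by side in a short sketch. Several things here are assumptions: arm 1 is taken to be the test drug, an event is taken to be unfavorable, the "ordinary" Z is taken to be the pooled-variance two-proportion statistic, and the function names and counts are hypothetical.

```python
import math


def imputed_rates(x1, n1, N1, x2, n2, N2, method):
    """Combine observed completer events with an imputed rate for noncompleters.

    x = events among completers, n = completers, N = randomized patients.
    Arm 1 is assumed to be the test drug and an event is assumed unfavorable.
    """
    p1, p2 = x1 / n1, x2 / n2
    if method == "opposite":
        fill1, fill2 = p2, p1              # completer rate of the other arm
    elif method == "pooled":
        fill1 = fill2 = (x1 + x2) / (n1 + n2)
    elif method == "worst_case":
        fill1, fill2 = 1.0, 0.0            # events for drug, none for control
    else:
        raise ValueError(f"unknown method: {method}")
    r1 = (x1 + (N1 - n1) * fill1) / N1
    r2 = (x2 + (N2 - n2) * fill2) / N2
    return r1, r2


def two_proportion_z(r1, N1, r2, N2):
    """Pooled-variance Z, treating the combined rates as ordinary rates."""
    pooled = (r1 * N1 + r2 * N2) / (N1 + N2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / N1 + 1 / N2))
    return (r1 - r2) / se


# Hypothetical trial: 100 randomized per arm, 80 and 90 completers
for method in ("opposite", "pooled", "worst_case"):
    r1, r2 = imputed_rates(16, 80, 100, 27, 90, 100, method)
    print(method, round(r1, 4), round(r2, 4),
          round(two_proportion_z(r1, 100, r2, 100), 3))
```

With these made-up counts the worst case analysis reverses the direction of the observed difference, while opposite and pooled imputation only shrink it.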
Partial list of other well-known methods
Likelihood-based method
Pattern-mixture model
selection model
Non-likelihood based method
GEE
Ad hoc imputation method
Methodology Consideration
O’Neill (2003, 2004)
- better to assume NIM at the planning stage;
the missing data process is not directly
verifiable
- choice of approach as the primary
strategy for handling missing data ?
- choice of approaches for sensitivity
analysis, robustness analysis ?
Unnebrink and Windeler (2001)
• adequacy of ad hoc strategy (e.g.,
LOCF, ranking, imputation of mean of other
group, etc) for handling missing value
depends on whether the courses of
disease are similar in the study groups
• For large dropout rates or different
courses of disease, no adequate
recommendations can be given
In planning strategies for handling
missing values, we need to consider:
1) The null hypothesis should be carefully
defined in anticipation of missing
data.
It should not be altered by the
presence of missing data after trial is
done, regardless of their pattern.
2) For design, every attempt needs to be
made to minimize dropouts.
Alternative designs (e.g., enrichment
design*, randomized withdrawal*) may
be used to narrow the study population
(recognize problem of generalizability),
if ITT population cannot be properly
studied.
*Temple (2004)
3) For analysis, the method needs to
facilitate ‘conservative’ imputation to:
- adjust the effect estimate toward null
- inflate variability
(double discounting for possible
exaggeration from imputation of missing
data), e.g., some type of worst score or
rank.
4) Seek a missing mechanism model to
help imputation.
This needs to use knowledge of the disease
process (how? Need to get practical
experience).
The model needs to be flexible for
sensitivity/robustness analysis.
Note: such a model is not verifiable
5) Conduct better pilot trials or analyze
historical data to explore response
profiles of dropouts by reasons to see if
missing mechanism may be related to
outcome, and propose a reasonably
conservative imputation method
Key to ‘reasonable’ imputation
Sample mean of size Ni: Yi ~ N(µi, σ²/Ni), i = 1, 2
δ = µ1 − µ2 at last time point
ni = # of completers in group i
If there is no missing value, we have
D = Y1 − Y2 (unbiased for δ)
V(D) = estimated variance of D
Z = D/[V(D)]^(1/2)
Missing values ⇒ {D, V(D), Z} not
obtainable.
Can try to get
E(D | data) and V(D | data).
And thus we construct
Z* = E(D | data) / [V(D | data)]^(1/2)
or Z+ = E(D | data) / [V(D)]^(1/2)
All need models.
Proschan et al (2001)
The goal is to use a model such that
|Z*| ≤ |Z| or |Z+| ≤ |Z|.
Since functional forms of E(D | data)
and V(D | data) are unavailable, use of
linear model to remove 1st-order effect of
data is the first step. Then, what is the
impact of imposing such model on
estimation of V(D | data) or V(D)?
SUMMARY
Intent-to-treat is the goal. If the dropout
rate is high, interpretable intent-to-treat
analysis may not be achievable.
Alternative designs (e.g., enrichment
design) that narrow study population may
need to be considered (caveat:
generalizability of interpretation).
Intuitively, use of all data seems
more promising than use of endpoint
data alone as a guide to how to
reasonably impute missing values. Yet
this advantage comes at a price:
it depends on unverifiable statistical
models.
Thus, every method needs to facilitate
‘conservative’ imputation approach.
For regulatory applications, every
attempt needs to be made to:
- minimize dropout
- explore response pattern of dropout in
order to be able to propose a reasonably
conservative imputation method
- propose conservative strategies for
primary analysis and sensitivity
analyses
Selected References
Lachin (1999, Controlled Clinical Trials)
Unnebrink, Windeler (2001, Statistics in Medicine)
Shao, Zhong (2003, Statistics in Medicine)
Proschan, McMahon, et al (2001, Journal of
Statistical Planning and Inference)
Wittes, Lakatos, Probstfield (1989, Statistics in
Medicine)
Mallinckrodt et al (2003, ASA JSM)
Siddiqui, Hung (2003, ASA JSM)
Hung, Lawrence, Stockbridge, Lipicky (2000,
unpublished manuscript)
O’Neill (2003, ASA JSM; 2004, DIA EuroMeeting)
Temple (1994-2004, Lecture notes on Clinical
Trial Designs)
Temple (2004, Society of Clinical Trials talk)