
Multi-regional Clinical Trials
Why be concerned?
A Regulatory Perspective on Issues
Robert T. O’Neill, Ph.D.
Director, Office of Biostatistics
Office of Translational Sciences, CDER
To be presented at the 17th Annual Harvard Schering-Plough Workshop:
Global Trials, Challenges and Opportunities; May 28 and 29, 2009
Outline

Why now - Increasing use of this study design - some FDA
experience

What guidance, if any, on the multi-regional study - its
design, analysis, reporting, and interpretation

What are some concerns: quality, training, data collection and
management (bio)

Trial is only as good as the investigators: who are they - training
of investigators regarding protocol and its compliance

Implementation - time to increase our attention to planning and
analysis

Proposal for way forward
Why now

Increasing use of this design in most medical areas

A Summary of two FDA studies on the use and regulatory impact

Increasing experience raises many questions



Validity

Quality

Design

Monitoring

Single large study - relevance
Sparsity of written literature or FDA guidance on the issues that are raising
concern

It is time to address them - solutions will not be easy or simple

ICH E5 actually expresses a position on the design
Increasing emphasis on quality of clinical trials, cost and streamlining - Critical
Path
This is not just a regulatory or
drug development concern
Modernize
the clinical
trial process
Defining Quality
• Acceptable control of variation
• Sources of variation
  • Trial conduct problems
  • Poor record keeping
  • Flawed procedures
• Quality is ? - metrics of acceptable variation
Sources of measurement error variability that
can contribute to variability in estimates of
treatment effect / response
Auditing strategy vs. quality assurance strategy
Coordinator
Region 1
Site 1
Site 2
Site 3
Region 2
Site 4
Site 5
Site 6
Site 7
Region 3
Site 8
Site 9
Site 10
Site 11
Site 12
Investigator
Site 13
What does this have to do with the
training of clinical investigators?
• No requirement to be trained in clinical trials
• No requirement to be trained or certified in ‘Good Clinical Practices’
• There is a requirement to have a license to practice medicine
• Investigators often are the measurement instrument of treatment response and they implement the protocol
• Impact they have on the conduct of the study
• Whose responsibility - sponsor, monitor, investigator?
The regulatory review process often
serves as the end product audit
• FDA evaluates the study report and the conduct and key metrics of quality
• FDA evaluates statistical displays of key sources of variation, bias and uncertainty
• Regional and site outcomes evaluated: dropouts, differences in response rates, outcomes, covariates, exposures, follow-up
• Individual patient profiles nested within sites - which sites and which patient records to audit
Some Experience with
statistical reviews of NDAs and
Clinical Studies
• Summary of review of 7 years of clinical studies involving foreign clinical data in NDAs
• 21 NDA submissions whose decisions depended upon analysis and interpretation of treatment effects in multi-regional trials
• John Lawrence evaluation of large cardiovascular outcome studies
Of 1,926 clinical trials analyzed by OB during FY01-FY07:
41% were domestic; 50% foreign-domestic; and 9% foreign.
Of all subjects enrolled in these trials:
30% were U.S.; 63% domestic-foreign; and 7% foreign.
Of 1,926 trials analyzed by OB statisticians during FY01-FY07:
Trend toward increasing numbers of non-U.S. centers and subjects participating in trials.
Regulatory consequences
• Non approvals
• 4 of 22 not approved because of regional heterogeneity
• 9 of 22 approvable but more information needed - regional heterogeneity
• Need another study
• Labeling limitations or information - MERIT-HF
Study Undertaken by FDA
statisticians to evaluate possibility
of systematic regional differences
• Major cardiovascular outcome studies evaluated over the last 10 years
• Overall study result statistically positive, i.e., demonstrated overall effect
• Region never pre-specified as a factor to be evaluated statistically
• 16 independent studies
Estimates and confidence intervals for difference
between US and Non-US treatment effects for each study
In 13 of 16, US log hazard above 0

Study   % US
1       31
2       45
3       27
4       9
5       4
6       19
7       43
8       38
9       4
10      74
11      74
12      9
13      3
14      29
15      17
16      90

[Figure: forest plot by study; horizontal axis: difference of log-hazard ratios, from -1.5 to 1.5. Source: J. Lawrence]
An Example: Toprol-XL
Taken from the current drug label, “Clinical Trials” section
MERIT-HF was a double-blind, placebo-controlled study of Toprol-XL
conducted in 14 countries including the US. It randomized 3991
patients (1990 to Toprol-XL) with ejection fraction ≤ 0.40 and NYHA
Class II-IV heart failure attributable to ischemia, hypertension, or
cardiomyopathy. The protocol excluded patients with contraindications
to beta-blocker use, those expected to undergo heart surgery, and those
within 28 days of myocardial infarction or unstable angina. The
primary endpoints of the trial were (1) all-cause mortality plus all-cause hospitalization (time to first event), and (2) all-cause mortality.
The trial was terminated early for a statistically significant reduction in
all-cause mortality (34%, nominal p=0.00009). The risk of all-cause
mortality plus all-cause hospitalization was reduced by 19% (p=0.00012).
The trial also showed improvements in heart failure-related mortality and
heart failure-related hospitalizations, and NYHA functional class.
The table below shows the principal results for the overall study
population. The figure below illustrates principal results for a wide
variety of subgroup comparisons, including US vs. non-US populations
(the latter of which was not pre-specified). The combined endpoints of all-cause mortality plus all-cause hospitalization and of mortality plus heart
failure hospitalization showed consistent effects in the overall study
population and the subgroups, including women and the US population.
However, in the US subgroup and women, overall mortality and
cardiovascular mortality appeared less affected. Analyses of female and
US patients were carried out because they each represented about 25% of
the overall population. Nonetheless, subgroup analyses can be difficult to
interpret and it is not known whether these represent true differences or
chance effects.
[Figure from the label: principal results by subgroup]
Interpretation - Extrapolation
• Impact on composite endpoint
• Impact on components of composite endpoints by region / subgroup
• Which factor(s) are most important for evaluating the relationship of treatment effects
• Site/center/clinic, Country, Region
Wedel, DeMets, Deedwania, Fagerberg, et al. Challenges of subgroup
analyses in multinational clinical trials: Experiences from
the MERIT-HF trial. Amer. Heart J 2001; 142: 502-11
Antiepileptic Drugs and Suicidality:
Statistical Review
Mark Levenson, Ph.D.
Statistical Safety Reviewer
Quantitative Safety and Pharmacoepidemiology Group
Division of Biometrics 6/CDER/FDA
Joint Meeting of Peripheral and Central Nervous System Drugs
Advisory Committee and Psychopharmacologic Drugs Advisory
Committee
July 10, 2008
Version: 3 July 2008
Suicidal Behavior or Ideation
Odds Ratio Estimates by Location

Location             OR (95% CI)          Treatment events/n   Placebo events/n
North American       1.38 (0.90, 2.13)    68/16841             33/9941
Non-North American   4.53 (1.86, 13.18)   36/11022             5/6088
Overall              1.80 (1.24, 2.66)    104/27863            38/16029

[Figure: forest plot of the odds ratios on a log scale from 0.1 to 10, annotated "Risk lower in NA"]
Rates are higher in North America sites
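The statement about rates can be checked with simple arithmetic on the counts shown on the slide. The sketch below computes crude pooled event rates only; it does not reproduce the odds ratios on the slide, which presumably come from a trial-level analysis, and the variable and function names are illustrative.

```python
# Counts read from the slide:
# (treatment events, treatment n, placebo events, placebo n)
counts = {
    "North American": (68, 16841, 33, 9941),
    "Non-North American": (36, 11022, 5, 6088),
}

def crude_rates(t_events, t_n, p_events, p_n):
    """Return crude event rates per 1,000 patients (treatment, placebo)."""
    return 1000 * t_events / t_n, 1000 * p_events / p_n

rates = {loc: crude_rates(*c) for loc, c in counts.items()}
for loc, (t_rate, p_rate) in rates.items():
    print(f"{loc}: treatment {t_rate:.2f}, placebo {p_rate:.2f} per 1,000")
```

On these crude figures the placebo rate in particular is several times higher in the North American sites, which is one way a lower odds ratio can coexist with higher event rates.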
Guidance on the topics
• ICH E3 - Multicenter studies - reporting
• ICH E9 - Multicenter studies - planning and analysis
• ICH E5 - Multiregional clinical trials, global drug development - bridging
• Literature -
Modernizing the statistical planning of
a multi-regional study with more
realistic objectives
• Some ideas and work of Dr. Hung
• Modern planning should rely more on simulations of a variety of assumptions for known or expected sources of variability and heterogeneity - scenario planning
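Scenario planning of this kind can be sketched with a small Monte Carlo simulation. The sketch below is illustrative only. It assumes the regional model attributed to Hung (2007) later in this talk: regional estimates y_h with variance sigma^2/n_h around region effects delta_h, which have mean delta and standard deviation sigma_delta. It further assumes (not stated on the slides) that the delta_h are independent and normal, and that the analysis is a one-sided z-test ignoring between-region variance. The function name, the total N = 42, the effect size 0.5, and the replication count are hypothetical.

```python
import random
from statistics import NormalDist

def simulated_power(N, delta, sigma, sigma_delta, r,
                    alpha=0.025, reps=10000, seed=2009):
    """Monte Carlo power of a one-sided z-test on Y = sum(r_h * y_h).

    Assumptions (not from the slides): independent normal region effects,
    and a test statistic that uses sigma / sqrt(N) as the standard error.
    """
    rng = random.Random(seed)
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    rejections = 0
    for _ in range(reps):
        y = 0.0
        for r_h in r:
            n_h = r_h * N                                  # regional sample size
            delta_h = rng.gauss(delta, sigma_delta)        # true effect in region h
            y_h = rng.gauss(delta_h, sigma / n_h ** 0.5)   # observed regional estimate
            y += r_h * y_h                                 # weighted overall estimate
        if y * N ** 0.5 / sigma > z_alpha:
            rejections += 1
    return rejections / reps

r = (0.2, 0.1, 0.4, 0.1, 0.2)   # regional fractions shown on the talk's N/N0 chart
scenarios = (0.0, 0.2, 0.5)     # hypothetical values of sigma_delta / sigma
powers = {sd: simulated_power(42, 0.5, 1.0, sd, r) for sd in scenarios}
for sd, p in powers.items():
    print(f"sigma_delta = {sd}: simulated power = {p:.3f}")
```

Other sources of variability named in the talk (dropouts, compliance, event ascertainment) could be added as further scenario inputs; they are left out here to keep the sketch short.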
[Diagram labels: Bridging; Region I ... Region k; Global; Multi-regional trial]
Global Trial Consideration
K geographical regions
$n_h$: sample size of region $h$
$N = \sum_{h=1}^{K} n_h$
$y_h \mid \delta_h \sim N(\delta_h, \sigma^2 / n_h)$
$\delta_h \sim (\delta, \sigma_\delta^2)$
Hung, 2007
Effect sizes vary but
are all positive
Question:
Is $\delta$ meaningful, i.e., interpretable
for all regions? (Interaction?)
If not, only $\delta_h$ is applicable to region $h$.
Then, the study will require a sufficient
sample size for each region.
Hung, 2007
If $\delta$ is interpretable for each region,
estimate $\delta$ by $Y = \sum_h r_h y_h$, where $r_h = n_h / N$
$E(Y) = \delta$
$\mathrm{Var}(Y) = \sigma^2 \left( \frac{1}{N} \right) + \sigma_\delta^2 \sum_h r_h^2$
Hung, 2007
Should plan $N$ to detect $\delta = \Delta > 0$ at
level $\alpha$ and power $1 - \beta$,
assuming $\sigma_\delta \neq 0$:
$N = \sigma^2 \left[ \frac{\Delta^2}{(z_\alpha + z_\beta)^2} - \sigma_\delta^2 \sum_h r_h^2 \right]^{-1}$
Hung, 2007
If, instead, $\sigma_\delta = 0$ is assumed for planning
sample size, then the resulting sample
size $N_0$ may be too low. How low?
$N_0 = \sigma^2 (z_\alpha + z_\beta)^2 / \Delta^2$
$\Pr(P \le \alpha \mid \delta = \Delta) = \Phi\left( \left[ -z_\alpha + (z_\alpha + z_\beta) \right] \left[ 1 + N_0 \frac{\sigma_\delta^2}{\sigma^2} \sum_h r_h^2 \right]^{-0.5} \right)$
Hung, 2007
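To get a feel for how low the attained power can be, the sketch below evaluates Phi(z_beta / sqrt(1 + N0 * (sigma_delta/sigma)^2 * sum(r_h^2))), with N0 = sigma^2 (z_alpha + z_beta)^2 / Delta^2. This is one reading of the power expression on the slide, and it assumes the analysis uses sigma/sqrt(N0) as the standard error of Y while the true variance also includes the between-region term. The function name and the example values are hypothetical.

```python
from statistics import NormalDist

def attained_power(delta, sigma, sigma_delta, r, alpha=0.025, beta=0.1):
    """Power actually achieved when N0 was planned assuming sigma_delta = 0."""
    nd = NormalDist()
    z_a, z_b = nd.inv_cdf(1 - alpha), nd.inv_cdf(1 - beta)
    n0 = (sigma * (z_a + z_b) / delta) ** 2          # sample size planned under sigma_delta = 0
    inflation = 1 + n0 * (sigma_delta / sigma) ** 2 * sum(r_h ** 2 for r_h in r)
    return nd.cdf((-z_a + (z_a + z_b)) / inflation ** 0.5)

r = (0.2, 0.1, 0.4, 0.1, 0.2)                  # regional fractions shown on the talk's chart
p_hom = attained_power(0.5, 1.0, 0.0, r)       # no heterogeneity: nominal power 1 - beta
p_het = attained_power(0.5, 1.0, 0.2, r)       # sigma_delta = 0.2: power falls below nominal
print(round(p_hom, 3), round(p_het, 3))
```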
Studies will be underpowered for effect sizes
and could fail because of it - increase size
[Figure: Sample Size Ratio N/N0 (vertical axis, 0 to 8) versus delta (horizontal axis, 0.3 to 1.5), with curves for sigma_delta/sigma = 0.2 and sigma_delta/sigma = 0.5; alpha = 0.025, beta = 0.1, K = 5, (r1, r2, r3, r4, r5) = (.2, .1, .4, .1, .2)]
Hung, 2007
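The sample size ratio can be tabulated for the parameters listed with the chart. The sketch below uses N/N0 = 1 / (1 - (z_alpha + z_beta)^2 * (sigma_delta/sigma)^2 * sum(r_h^2) / (Delta/sigma)^2), which follows from the two sample size expressions if the chart's "delta" axis is read as the standardized effect Delta/sigma. That reading is an assumption, and the function name is hypothetical.

```python
from statistics import NormalDist

def size_ratio(delta_std, sd_ratio, r, alpha=0.025, beta=0.1):
    """N / N0 for delta_std = Delta/sigma and sd_ratio = sigma_delta/sigma.

    Returns infinity when no finite N can reach the target power.
    """
    nd = NormalDist()
    z_sum = nd.inv_cdf(1 - alpha) + nd.inv_cdf(1 - beta)
    shrink = 1 - (z_sum * sd_ratio / delta_std) ** 2 * sum(r_h ** 2 for r_h in r)
    return 1 / shrink if shrink > 0 else float("inf")

r = (0.2, 0.1, 0.4, 0.1, 0.2)                    # regional fractions from the chart
deltas = (0.3, 0.5, 0.7, 0.9, 1.1, 1.3, 1.5)     # tick values on the chart's horizontal axis
table = {ratio: [size_ratio(d, ratio, r) for d in deltas] for ratio in (0.2, 0.5)}
for ratio, row in table.items():
    print(ratio, [round(x, 2) for x in row])
```

Small effects combined with large between-region variability give an infinite ratio under this reading, which suggests why the curves would rise steeply toward the left of the chart.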
Clinical endpoints that may be
impacted by regional differences

Difficulty with diagnosis, with ascertaining progression or
resolution of condition

Anti-bacterial drugs for hospital-acquired pneumonia
(HAP) and ventilator-associated pneumonia (VAP)

Creation of composite endpoints whose components are
evaluated differently

Antibiotic resistance, clinical practice

Patient reported outcomes, symptomatic conditions

Safety outcomes (how ascertained or defined - suicides)
Non-inferiority trials vs.
Superiority trials
Impact of heterogeneity
• Timing of outcome measurement
• Investigator training
• Differential sensitivity and specificity
Reasons for concern when
extrapolating
• Regional differences in observed treatment effects within the same study (not always clear what is responsible; chance?)
• Differences in results of separate independent studies, each done in different regions
• What (bridging data) can explain the differences?
  • information gained prior to the studies
  • information gained after studies completed
  • A new study
The Way Forward
Some Recommendations to consider

For every multi-regional study, create a common template for
planning for homogeneity/heterogeneity of regional differences and
exploring sample sizing according to assumptions of dropouts,
follow-up, compliance, event ascertainment by investigator, degree of
internal consistency

Enhance all study reports with section that discusses process of
quality assurance, data management, quality of data collected,
monitoring strategies, important descriptors and outcomes by
region/country

Improve the statistical analysis plan to specifically address strategies
and interpretation of heterogeneity, power, internal consistency of
by-region results

Address the training / certification of investigators and quality checks

Auditing strategies, metrics of quality

Update the study report for an MRCT to include new issues