Clinical Evaluation of Vaccines
“The Long Haul”
Steve Self
Biostat 578 3/2/06
Outline
• Introduction
• Phase I/II Trials: HIV vaccines
• The End of Phase II
• Efficacy Evaluation
– Test of Concept (“Phase IIB”) Designs: HPV vaccines
– Pivotal Trials: HPV vaccines
– Multiple Test of Concept Trials: HIV vaccines
• Post-Marketing Surveillance: Rotavirus vaccines
• Conclusions
Introduction
• Development of an efficacious vaccine can
easily take 20 years and cost ½ billion dollars
• Clinical evaluation alone can involve a dozen
interlocking trials of different designs
conducted over a decade
• Goals of this presentation
– Overview of clinical evaluation programs
– Particular emphasis on crux move of these
programs… the move to efficacy evaluation
– HIV vaccine development as context
Iterative Nature of Vaccine Development
• Vaccines as molecular machines
– Historical vaccines
– Modern vaccines
• Chicken and egg problem
– What immune response is protective?
– How to induce protective responses?
• Idiosyncratic nature of pathogens and
vaccines
HIV-1 Vaccines in Clinical Trials in 2005
• 22 products
– 7 DNA vaccines: naked, multiclade, adjuvanted, etc.
– 8 viral vectors: Adeno, AAV, VEE, MVA, Fowlpox,
Canarypox, Vaccinia, NYVAC
– 3 subunits or peptides: V1-V2 deleted envelopes,
lipopeptides, adjuvanted protein
– 4 prime-boost combinations:
DNA + Viral Vectors (MVA, Fowlpox, Adeno)
Viral Vector + heterologous viral vector
Viral vector + lipopeptides
DNA + protein
Canarypox + rgp120
Phase IA Design
• First in humans: safety is question one
• Dose escalation
– From “dishwater” to either
“maximum tolerable” or
“feasibly manufacturable”
– 10 vaccinees (+2 placebos) per dose
• Safety evaluation after second immunization
– Systemic and local reactions
– Safety outcomes specific to vaccine
• Very little information about immunogenicity
Immunogen vs vaccine (regimen)
• Route of administration
– Tissue specificity: intramuscular, intradermal,
subcutaneous, mucosal
– Site specificity: deltoid, gluteus, nasal, oral,
intrarectal, intravaginal
• Multiple administration of immunogen
– Schedule (eg, 0,1,6 mo)
• Heterologous immunogens
– Schedule/route for each immunogen
– Co-administration: timing and/or route
Phase IA Design
• Heterologous combination regimen
– Each component immunogen assessed via dose escalation design before combination is evaluated
– E.g., 2 components (V1, V2), each escalating over 3 dose levels
[Figure: dose-escalation schematic for components V1 and V2 at dose levels D1 to D3 (V1/D1, V1/D2, V1/D3; V2/D1, V2/D2, V2/D3), with labels P1, P2 and numbered steps 1 to 6]
Optimization of vaccine regimen
• Large “parameter” space
• Multidimensional outcome space (immune
responses)
• Uncertainty of outcomes
– Statistical uncertainty
– Biological uncertainty
• Potential (hope) for interactions
Ranking and Selection Trial Designs
• Direct comparison of multiple regimens
• Goal is to select best regimen to move forward for
expanded evaluation
• Assumptions:
– Indifference in case of tie
– Unambiguous empirical ranking based on primary
outcome
• Efficient relative to standard superiority designs
• However
– Assumptions are rarely met precisely
– Some questions don’t fit paradigm at all (e.g., dose de-escalation)
Ranking and Selection Trial Designs
• Multiple group randomized design (Phase IB)
– 30-50/arm to reliably pick winner w/ binary outcome if
response rates differ by ~15%
– No control arm required (unless concern about endpoint
assay validity)
– Biased estimate of immune response to “best” vaccine
• Logistically difficult if > 5 arms
– How to select a few regimens over which to optimize?
– Results may suggest other regimens worthy of testing
– Multiple generations of trials to adequately explore
potential
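The "30-50/arm to reliably pick winner" figure can be explored with a small exact calculation. The sketch below is illustrative only: it assumes independent binomial outcomes, one best arm, all other arms sharing a common lower response rate, and ties broken at random. The response rates (0.50 vs 0.35) and the function names are hypothetical, not taken from the slides.

```python
from math import comb


def binom_pmf(x, n, p):
    """Binomial probability of exactly x responders among n."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def prob_correct_selection(n, p_best, p_other, k):
    """Probability that the truly best of k arms has the top observed
    response count, with ties among top arms broken at random.
    Assumes n subjects per arm and k-1 competitors at rate p_other."""
    total = 0.0
    cdf_below = 0.0  # P(a competitor arm has fewer than x responders)
    for x in range(n + 1):
        f = binom_pmf(x, n, p_other)  # P(a competitor ties at x)
        # j competitors tie with the best arm at x, the rest fall below x;
        # the best arm then wins the random tie-break with chance 1/(j+1)
        win = sum(
            comb(k - 1, j) * f**j * cdf_below ** (k - 1 - j) / (j + 1)
            for j in range(k)
        )
        total += binom_pmf(x, n, p_best) * win
        cdf_below += f
    return total


if __name__ == "__main__":
    for n in (30, 40, 50):
        print(n, round(prob_correct_selection(n, 0.50, 0.35, k=3), 3))
```

Under these assumptions the selection probability rises with arm size and falls as more arms compete, which is one way to see why designs with many arms become demanding.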
Phase II Designs
• Goals:
– Characterization of immunogenicity
  ▪ Is efficacy plausible in target population?
  ▪ Comparison of regimens not amenable to ranking/selection approaches
– Expand safety evaluation
  ▪ Reduce upper bound of rate for SAEs
• Design
– Randomized, placebo controlled
– May be comparative trial
– Hundreds of vaccinees per arm
– Study population reflects target population in efficacy evaluation
– Decision guidelines for go/no-go based on minimum immune response tied to efficacy trial goals
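How a larger safety database "reduces the upper bound" on the SAE rate can be illustrated for the simplest case, where no events of a given type are observed. This is a sketch under stated assumptions: independent vaccinees, zero observed events, and a one-sided exact binomial bound. The sample sizes are hypothetical.

```python
def upper_bound_zero_events(n, conf=0.95):
    """One-sided exact upper confidence bound on an event rate when
    0 events are seen among n subjects: solves (1 - p)**n = 1 - conf."""
    return 1 - (1 - conf) ** (1 / n)


if __name__ == "__main__":
    # Hypothetical safety database sizes, from Phase I scale upward
    for n in (12, 50, 300, 1000):
        print(n, round(upper_bound_zero_events(n), 4), "rule of three:", round(3 / n, 4))
```

For moderate n the bound is close to the familiar "rule of three" (3/n), so going from tens to hundreds of vaccinees shrinks the rate that can be ruled out by roughly an order of magnitude.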
End of Phase II
• Formal meeting with US FDA
• Integrated analyses of all relevant clinical
data
– Tiered approach for immunogenicity data
– Combined safety database
• Plan for efficacy evaluation
– Efficacy trial design
– Criteria for “success”
– Other aspects of evaluation program
Efficacy Evaluation
• Pivotal trial
– Goal is to provide “robust and compelling
evidence” for net clinical benefit
– Does Phase I/II trial experience provide enough
information to reliably design and conduct such a
vaccine efficacy trial?
– Is there an intermediate step… a trial that will
test the “concept of efficacy” at much reduced
time/cost?
TOC and Pivotal Trials
• Similarities
– Hypothesis-driven RCT
– Provide direct evaluation of vaccine efficacy
• Differences: Goals
– Pivotal Trials: Provide “compelling and robust” evidence of efficacy, define balance of clinical benefits and risks
– TOC: Initial evaluation to provide sufficient information for making a go/no-go decision for pivotal evaluation
  ▪ If go: inform design (scientific, operational)
  ▪ If no-go: inform direction of further development (if any)
• Must be conceptually coherent with plan for pivotal evaluation
Statistical Design Parameters
Design Parameter | Test of Concept Design | Pivotal Trial Design
H0 | VE = 0% (minimum for continuing evaluation) | VE = 30% (minimum for clinical significance)
Type I Error (α) | 0.025 or greater | 0.025 or less
H1 (efficacy to distinguish from H0 with 90% power) | VE ~ 50% (Ex: STEP) | VE ~ 60% (Ex: VaxGen 003, 004)
Required # Endpoints for 90% Power
VE1 | Phase III: VE0 = 30%, α = 0.025 | TOC: VE0 = 0%, α = 0.025 | TOC: VE0 = 0%, α = 0.05 | TOC: VE0 = 0%, α = 0.10 | TOC: VE0 = 0%, α = 0.20
30% | - | 350 | 292 | 227 | 158
40% | 1901 | 178 | 143 | 113 | 81
50% | 419 | 99 | 85 | 66 | 45
60% | 160 | 61 | 49 | 37 | 28
70% | 78 | 37 | 30 | 26 | 17
STEP: 100 endpoints, VaxGen Phase III Trials: ~225/360 endpoints
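The way the required number of endpoints depends on VE0, VE1 and α can be sketched with a textbook normal approximation. The slides do not say how the tabulated values were computed, so this is not a reconstruction of that method and its output need not match the table. It assumes 1:1 randomization, equal follow-up, a one-sided test, and inference conditional on the total number of endpoints, so that each endpoint falls in the vaccine arm with probability (1 - VE)/(2 - VE).

```python
from math import ceil, sqrt
from statistics import NormalDist


def endpoints_needed(ve0, ve1, alpha, power=0.90):
    """Approximate total endpoints to test H0: VE = ve0 against
    H1: VE = ve1 (one-sided alpha), by a normal approximation to the
    binomial split of endpoints between arms (assumed 1:1 allocation)."""
    z_a = NormalDist().inv_cdf(1 - alpha)
    z_b = NormalDist().inv_cdf(power)
    p0 = (1 - ve0) / (2 - ve0)  # share of endpoints in vaccine arm under H0
    p1 = (1 - ve1) / (2 - ve1)  # share of endpoints in vaccine arm under H1
    n = ((z_a * sqrt(p0 * (1 - p0)) + z_b * sqrt(p1 * (1 - p1))) / (p0 - p1)) ** 2
    return ceil(n)


if __name__ == "__main__":
    for ve1 in (0.4, 0.5, 0.6, 0.7):
        print(ve1,
              "pivotal-style (VE0=30%):", endpoints_needed(0.30, ve1, 0.025),
              "TOC-style (VE0=0%):", endpoints_needed(0.0, ve1, 0.025))
```

Even with this rough approximation the qualitative message of the table is visible: raising the null from 0% to 30%, or lowering the alternative VE, inflates the endpoint requirement sharply.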
Other Design Parameters that May Differ
Design Parameter | Test of Concept Design | Pivotal Trial Design
Vaccine | Prototype | Product
Population | Narrow (optimize for sensitivity, operational efficiency) | Representative (target for licensure)
Primary Endpoint | Biomarker | Clinical Outcome (Distal)
Example: HPV Vaccine Evaluation
• Two vaccine development programs
– Merck
– GSK
• Both use TOC designs for early efficacy evaluation
• Both follow TOC trial with large pivotal evaluation
• HPV
– Sexually transmitted virus
– Chronic infection
– Multiple viral strains
– Strain-specific cause of cervical cancer, genital warts
Merck HPV Vaccine
Test of Concept Trial #1
• Monovalent (prototype) vaccine
– HPV16 L1 VLP vaccine with alum adjuvant
– 3 doses IM
• Placebo controlled trial of 2392 women (age 16-23)
• Primary endpoint: persistent HPV 16 infection
• Mean duration of follow-up: 17.4 months
• Target number of endpoints = 41
Koutsky et al., New Eng J Med 347:1645, 2002
Merck HPV Vaccine
Test of Concept Trial #1: Results
• Analyzed 1533 women (ATP):
– fully vaccinated
– HPV negative throughout vaccination period.
• Primary result: 41 endpoints with 0:41 split (V:C)
• Total (pers+trans) incident infection: 74 cases (6:68)
Koutsky et al., New Eng J Med 347:1645, 2002
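The vaccine:control (V:C) splits above translate into rough efficacy estimates. The sketch below is a back-of-envelope reading, not the analysis in the cited paper: it assumes 1:1 randomization with equal follow-up in both arms, so that VE is approximately 1 minus the ratio of case counts, and it uses a one-sided exact binomial tail for the split under VE = 0. Function names are illustrative.

```python
from math import comb


def ve_from_split(cases_vaccine, cases_control):
    """Crude vaccine efficacy estimate, 1 - (vaccine cases / control cases),
    assuming equal person-time at risk in the two arms."""
    return 1 - cases_vaccine / cases_control


def split_p_value(cases_vaccine, cases_control):
    """One-sided exact p-value for H0: VE = 0, i.e. each case is equally
    likely to come from either arm: P(X <= cases_vaccine), X ~ Bin(total, 0.5)."""
    total = cases_vaccine + cases_control
    return sum(comb(total, x) for x in range(cases_vaccine + 1)) / 2**total


if __name__ == "__main__":
    for v, c in ((0, 41), (6, 68)):  # splits quoted on the slide
        print(f"{v}:{c}", "VE ~", round(ve_from_split(v, c), 3),
              "p =", split_p_value(v, c))
```

A 0:41 split gives a point estimate of 100% and a p-value of 0.5^41, which is why a trial with only 41 endpoints could still be decisive as a test of concept.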
Merck HPV Vaccine:
Test of Concept Trial #2
• Quadrivalent vaccine (Gardasil)
– HPV (16, 18, 11, 6) L1 VLP vaccine with alum adjuvant
– 3 doses IM
• Placebo controlled trial of 552 women (age 16-23)
• Mean duration of follow-up: ~2.5 years
• Primary endpoint: persistent HPV infection (vaccine types)
• Target number of endpoints = 40
• Result: 40 endpoints observed with 4:36 split (V:C)
Villa et al., Lancet Oncology, 2005
Merck HPV Vaccine:
Gardasil Pivotal Trial
• Randomized, placebo controlled trial
• Study population
– ~25,000 women (age 16-23)
– 33 countries, ~150 study sites
• 3.5 years follow-up (post-vaccination)
• Primary efficacy endpoints
– HPV-associated CIN2-3
– Genital warts
• Results presented to US FDA VRBPAC 12/05
Merck HPV Vaccine
Test of Concept Trial #1: Redux
• Long-term follow-up
– 48 months post-vaccination
– Blinding of treatment assignment maintained
• Endpoints:
– Persistent HPV 16 infection
– HPV16-assoc CIN2-3
• Results:
– Persistent HPV16 infection: 118 cases with 7:111 split (V:C)
– HPV16-assoc CIN2-3: 12 cases with 0:12 split (V:C)
Mao et al., Obstet & Gyn 107(1): 18-27, 2006
Other supportive studies
• Adolescent immunogenicity and tolerability
– >4500 boys and girls
• Mid-adult women’s efficacy and tolerability
– Women age 24-45
• Nordic study
– Durability of protection
– Long-term safety
– > 50,000 men and women
GSK HPV Vaccine
Test of Concept Trial
• Bivalent vaccine (Cervarix)
– HPV16/18 L1 VLP vaccine with AS04 adjuvant
– 3 doses (IM)
• Placebo controlled trial of 1113 young women (age 15-25)
• Mean duration of follow-up: 18 months.
• Primary endpoint: persistent HPV16/18 infection
Harper et al., The Lancet, 2004
GSK HPV Vaccine
Test of Concept Trial: ITT Results
[Figure: bar chart of % efficacy (0 to 100) against incident infections and persistent infections, shown for HPV16, HPV18 and HPV16/18]
* 100% efficacy in ATP analysis
Harper et al., The Lancet, 2004
GSK HPV Vaccine
Pivotal Trials
• GSK Cervarix trial
– Randomized controlled trial
– ~18,000 young women (age 18-25)
– Efficacy endpoints: HPV-assoc CIN2-3
– N. America, Latin America, Asia Pacific, Europe
– Expected EU filing in ’06
• NCI Cervarix trial
– Randomized controlled trial
– ~12,000 young women (age 18-25)
– Costa Rica (Guanacaste, Puntarenas)
Summary
• TOC designs are integral components of a larger
program for vaccine evaluation… planned or not!
– Consistent, coherent goals
– Sequence/timing for data and decisions
• TOC designs are used to achieve multiple goals
– Initial testing of prototype vaccine
– Screening evaluation of vaccine “product”
– Basis for initial data on durability of effects
• TOC designs are not used as a substitute for pivotal
trial designs
HIV Vaccines
• Nature of vaccine effect highly uncertain
• Uncertain that any efficacy in humans will obtain
• If there is efficacy, it is uncertain how it will manifest
• Stronger rationale for effect on VL than acquisition
endpoints for vaccines inducing primarily CMI responses
• However need appropriate due diligence in assessment of
impact on acquisition
• Pivotal trial designs are large/long/expensive
• Ideal setting to consider TOC design for initial
efficacy evaluation
STEP:
An HIV Vaccine TOC Trial
• MRK Ad5 Trivalent HIV-1 gag/pol/nef (0,1,6)
• Study population:
– 3000 men and women (18-45 yo) at risk for HIV infection
– Sites with predominantly subtype B virus throughout the Americas, Caribbean and Asia
• Co-primary endpoints:
– HIV infection
– Viral load (during early HIV infection)
• α = 0.025 (overall)
• N_E = 100
• Power of 90% to distinguish
– VE_S = 0% vs 53%
– Δ logVL = 0 vs 0.6-0.7 logs (depending on VE_S)
Immune Correlates of Protection
– Identification of immune correlates of protection is
an important secondary trial objective
– Test for difference between high and low responders to vaccine
  ▪ Infection endpoint: relative risk for infection
  ▪ VL endpoint: difference in mean log-VL
– Power of tests depends on
  ▪ Number of infection endpoints among vaccinees
  ▪ Prevalence of high/low responders to vaccine
  ▪ Magnitude of difference between high/low responders
Minimum Detectable Effect Sizes (with 90% power)
Total # Infections | Infection: VE0 = 30%, α = 0.025 | Infection: VE0 = 0%, α = 0.025 | Infection: VE0 = 0%, α = 0.10 | Infection: RR*(L,H) | logVL: Δ0 = 0, α = 0.025 | logVL: Δ**(H,L)
50 | 76% | 64% | 56% | 20.0 | 0.85 | 0.30
100 | 66% | 49% | 43% | 5.9 | 0.68 | 0.24
150 | 61% | 42% | 36% | 3.8 | 0.55 | 0.21
200 | 57% | 38% | 32% | 3.1 | 0.47 | 0.19
250 | 55% | 34% | 29% | 2.6 | 0.42 | 0.18
* Relative risk for infection among low immune responders to vaccine relative to high responders
** Δ mean logVL: Low immune responders – High immune responders
The Problem of Heterogeneity
– Important theme involves human and viral
variation
• At risk populations span large geographic regions
with different viral and human factors that plausibly
can affect vaccine efficacy
• Impact of human and viral variation on vaccine
efficacy uncertain
– How to design an HIV vaccine evaluation
program that rationally assesses efficacy across
this heterogeneity?
What pivotal trial design?
• If first TOC demonstrates efficacy in MSM with subtype matched virus, is efficacy plausible for
– Heterosexual men?
– Heterosexual women?
– Injection drug users?
– Subtype mismatched viral populations?
• A global vaccine would require evaluation across this
heterogeneity yet it is a large leap from efficacy results in a
single narrow TOC design to such an extensive evaluation
• Remember primary goals of a TOC trial are to inform
– a data-driven go/no-go decision and
– how to proceed with next step in evaluation
Two TOC Trials before Pivotal Trials?
• Because of heterogeneity, there are two basic concepts to
test in earliest stage of evaluation
– Is there any efficacy?
– Is there any robustness of efficacy?
• With a positive test of each of these concepts then ready to
design and conduct pivotal trial(s).
• Example: STEP + HVTN 503
– First TOC assesses efficacy in optimized (viral subtype matched)
setting
– Second TOC
  ▪ assesses robustness to different viral challenge,
  ▪ strengthens inference in women, hetero men
– What pivotal trials would follow if both TOCs are positive?
– Would efficacy in an IDU population be evaluated in a third TOC trial?
Continue Series of TOC Trials?
• Harmonized TOC designs (same vaccine regimen, same control, same endpoints)
• Trial settings to cover specified set of “major” human/viral heterogeneities
• Joint assessment of impact on acquisition and VL endpoints as in STEP/503 designs
• Allow enough flexibility to consider conducting trials both in parallel and in series
– Science and art of bridging
– Equipoise, ethics and perceptions
– Logistics and operational capacity
Series of TOC Trials?
– Trial-specific analyses
  ▪ Powerful inferences about vaccine effect on VL endpoint
  ▪ Modest power to assess vaccine effect on acquisition endpoint
– Secondary analyses of pooled data across trials
  ▪ Power to assess overall impact on acquisition endpoint
  ▪ Power to assess pre-specified subgroup effects on VL (e.g., gender)
  ▪ Power to assess immune correlates of protection
– What are the risks with this strategy with respect to
licensure?
Basis for licensure?
• Evidence for clinical benefit must be “compelling
and robust”
– Two-trial rule often referred to as standard
  ▪ Two independent trials
  ▪ Each trial delivers p-value < 0.025 for primary test of efficacy
– “Compelling” evidence
  ▪ Overall false positive rate is small (0.000625 = 0.025^2)
– “Robust” evidence
  ▪ Replicated results
  ▪ Evidence for efficacy consistent across two trial settings (i.e., each trial delivers p-value < 0.025)
Spirit not the letter
• Other ways to develop evidence for efficacy
that is considered “compelling and robust”
– A single trial instead of two?
  Compelling evidence:
  ▪ Use size of the single primary test for efficacy of 0.000625?
  ▪ Or negotiate to use size of test of 0.004 (= 0.025^1.5), say
  ▪ Larger trial size required to maintain power with smaller size of test
  Robust evidence:
  ▪ Representative study population
  ▪ Homogeneity of study population
  ▪ Uniformity of efficacy result over key study strata
Spirit not the letter
• Three positive TOC trials as basis for licensure?
– How to balance strength of overall evidence required with strength of evidence required from each trial?
  ▪ Fix maximum size of overall p-value at standard 0.000625
  ▪ Then each of 3 trials would be required to deliver a p-value no greater than 0.085 (= 0.000625^(1/3))
– Comparable strategy to a single large pivotal trial
  ▪ Study population includes three “strata” of pre-specified size
  ▪ Primary analysis plan includes overall analysis as well as prespecified stratum-specific analyses (with appropriate adjustment for multiplicity)
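The threshold arithmetic on these slides is easy to check. The short sketch below reproduces the quoted numbers, assuming independent trials so that the overall false positive rate is the product of the per-trial test sizes. The function name is illustrative.

```python
def per_trial_threshold(overall_alpha, n_trials):
    """Per-trial p-value threshold such that n independent trials, all
    significant at that threshold, give the stated overall false positive rate."""
    return overall_alpha ** (1 / n_trials)


if __name__ == "__main__":
    overall = 0.025**2            # two-trial rule: 0.000625
    print(overall)
    print(0.025**1.5)             # "negotiated" single-trial size, about 0.004
    for k in (1, 2, 3):
        print(k, round(per_trial_threshold(overall, k), 4))
```

With the overall rate fixed at 0.000625, the per-trial requirement relaxes from 0.000625 (one trial) to 0.025 (two trials) to about 0.085 (three trials), which is the trade-off the slide raises between overall and per-trial strength of evidence.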
PAVE 100: Going Global?
• NIH VRC Multivalent Vaccine
– Subtypes A, B, C env
– Subtype B gag/pol/nef
• Want to test two concepts
– Any efficacy
– Robustness of efficacy across 3 viral populations
• Strategy under discussion
– Three simultaneous TOC trials (one stratified Ph III trial?)
– Balance of overall vs study specific analyses?
– Implications for licensure if uniformly positive?
– Larger evaluation plan… e.g., non-matched virus, IDU?
Post-Marketing Surveillance
• Even largest efficacy trials not large enough to
define adverse events caused by vaccine that occur
in low but important frequency
• VAERS: system for passive surveillance of adverse
events but lacks ability to estimate rates of events
• Very large (post-marketing) epidemiologic studies of
AEs associated with vaccine
– Statistical issues of design, analysis, interpretation
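Why even large efficacy trials cannot characterize rare adverse events can be seen from a simple binomial calculation. In this sketch the event rate (1 in 10,000) and the sample sizes are hypothetical, and vaccinees are assumed independent. It also only asks whether events would be observed at all, which is a much weaker goal than estimating an excess risk against a background rate.

```python
from math import comb


def prob_at_least(k, n, rate):
    """P(at least k events among n vaccinees) for a binomial event rate."""
    below = sum(comb(n, j) * rate**j * (1 - rate) ** (n - j) for j in range(k))
    return 1 - below


if __name__ == "__main__":
    rate = 1e-4  # hypothetical adverse event rate: 1 per 10,000 vaccinees
    for n in (3_000, 30_000, 70_000):
        print(n,
              "P(>=1 event):", round(prob_at_least(1, n, rate), 3),
              "P(>=5 events):", round(prob_at_least(5, n, rate), 3))
```

Under these assumptions a trial with a few thousand vaccinees would most likely observe no events at all, while tens of thousands are needed before several events can be expected.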
Rotavirus vaccine
• Wyeth vaccine licensed in late ’90s
– Highly efficacious in preventing severe gastroenteritis and death, especially in developing world
– Small but real risk of intussusception identified in Phase IV studies
– Wyeth pulled vaccine from market
• Merck recently received license for their rotavirus vaccine
– Highly efficacious
– Theoretical reasons to believe risk of intussusception lower than that for Wyeth vaccine
– Data from efficacy trial showed somewhat lower rate and different temporal pattern of intussusception cases
– Very large (60,000-80,000 person) Phase IV studies planned to define risk
Conclusions
• Clinical development and evaluation of
vaccines is a long haul
• Statistical reasoning is involved at every step
along the way
– Measurement technologies
– Study design
– Data analysis
• Statistical reasoning is also involved at a
programmatic level