Transcript Lecture10
Outline
• General endpoint considerations
• Surrogate endpoints
• Composite endpoints and recurrent events
• Safety outcomes (adverse events)
Composite Event (def.)
“An event that is considered to have occurred if any one of several different events or outcomes are observed.” Meinert CL. Clinical Trials Dictionary, 1996.
Combined Endpoint = Composite Event
Examples of Combined Endpoints
Study: Endpoint
• Multiple Risk Factor Intervention Trial (MRFIT): CHD death (MI, sudden death)
• Systolic Hypertension in the Elderly Trial (SHEP): Fatal or non-fatal stroke
• Physician’s Health Study: Fatal/non-fatal myocardial infarction; fatal/non-fatal stroke
• START HIV Study: Serious AIDS, serious non-AIDS or death
• GISSI-2*: Death; late congestive heart failure; EF < 35%; 45% or more injured myocardial segments; QRS score < 10
* Non-fatal events treated hierarchically
Survey of Cardiovascular Trials
• Composite outcomes in CVD trials are frequent (37% of 1,231 published trials)
• Typically comprise 3-4 individual components
• More components were used in the composite outcome in smaller trials
• The components vary in their clinical significance; death was the most common component included
Ann Intern Med 2008;149:612-617
Composite Examples for Heart Failure (HF) Studies: Time to Event Analysis
• Time to the 1st occurrence of any of the outcomes that are part of the combined endpoint
• Examples:
– Time to death or hospitalization
– Time to death or CVD hospitalization
– Time to CVD death or CVD hospitalization
– Time to CVD death or hospitalization for HF (more sensitive to treatment differences, particularly among patients with less severe heart failure)
Composite Example: CVD Death or HF Hospitalization
[Figure: follow-up timelines from time 0 to t for Patients 1-4, showing HF hospitalizations, CVD deaths, and a non-CVD death.]
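A minimal sketch of how the composite "CVD death or HF hospitalization" could be derived from individual event histories. The patient histories, event labels, and the 24-month follow-up are hypothetical and are not read off the figure; non-CVD death is treated here as a censoring event, which is one possible convention.

```python
# Hypothetical event histories: (time in months, event type).
COMPONENTS = {"hf_hosp", "cvd_death"}   # events that count toward the composite
CENSORING_EVENTS = {"non_cvd_death"}    # events that end follow-up without counting

def time_to_first_composite(events, end_of_follow_up):
    """Return (time, indicator); indicator is 1 if a component event occurred, else 0."""
    for time, kind in sorted(events):
        if kind in COMPONENTS:
            return time, 1              # first component event defines the endpoint
        if kind in CENSORING_EVENTS:
            return time, 0              # censored at a death that is not a component
    return end_of_follow_up, 0          # event-free through end of follow-up

patients = {
    1: [(6, "hf_hosp"), (14, "cvd_death")],
    2: [(9, "non_cvd_death")],
    3: [(4, "hf_hosp"), (11, "hf_hosp"), (18, "hf_hosp")],
    4: [(20, "cvd_death")],
}
results = {pid: time_to_first_composite(ev, end_of_follow_up=24)
           for pid, ev in patients.items()}
print(results)
```

In this sketch only the first qualifying event per patient enters the analysis; later hospitalizations and a death that follows a hospitalization are ignored by the time-to-first-event composite.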
Progression to AIDS Endpoint (A Composite with Many Components)
• Pneumocystis carinii, pulmonary or extrapulmonary
• Cryptosporidiosis
• Isosporiasis
• Toxoplasmosis
• Mycobacterium avium, other non-tuberculous mycobacterial infections
• Mycobacterium tuberculosis, extrapulmonary or pulmonary
• Cryptococcosis
• Histoplasmosis
• Cytomegalovirus disease
• Lymphoma
• Kaposi’s sarcoma (visceral)
• HIV encephalopathy or AIDS dementia complex, Stage 2 or higher
• Progressive multifocal leukoencephalopathy
• HIV wasting syndrome
• Non-typhoidal Salmonella septicemia
• Candidiasis, esophageal or pulmonary
• Herpes simplex bronchitis, pneumonitis, esophagitis
• Herpes zoster, disseminated
Clinical Relevance?
[Figure: follow-up timelines from time 0 to t for Patients 1-3. Events marked include candidiasis (twice), PCP, death, and MAI; two timelines are labeled "End of Study".]
Composites or Combined Endpoints Rationale
• More events = greater power (or smaller sample size or shorter trial duration) (maybe)
• Inclusion of some components may reduce/eliminate bias due to informative censoring (but may result in a loss of power)
• A solution to handling disagreement over which outcome should be primary (not always the best solution)
Freemantle N et al, JAMA 2003.
Composite Endpoint Cautions
Loss of power if:
• Treatment has little or no effect on some components
• Early events are less likely to represent “treatment failures” compared to later events (Yusuf and Negassa referred to this as “masking” of events)
Unclear interpretation if:
• Components show a different pattern for treatments
• Less serious or more subjectively assessed events are accounting for treatment difference
“Mixing apples and oranges”
Neaton JD et al, Stat Med 1994 and Yusuf S and Negassa A, Amer Heart J 2002.
Adding a Component to a Composite Does Not Always Have a Favorable Effect on Sample Size
• 10% versus 5% event rate: 1,170 patients total
• Add a new component:
– 30% versus 15% event rate: 330 patients
– 30% versus 22.5% event rate: 1,450 patients
Alpha = 0.05 (2-sided) and power = 0.90
Neaton J et al, J Cardiac Failure 2005
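The sample sizes on this slide can be approximated with a standard two-proportion formula. The sketch below assumes a normal-approximation formula with a pooled variance under the null and equal allocation; the slide does not say which formula was used, so the function name and the formula choice are assumptions. The results land close to the slide's rounded totals.

```python
import math
from statistics import NormalDist

def total_sample_size(p_control, p_treated, alpha=0.05, power=0.90):
    """Approximate total N (two equal groups) for comparing two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_treated) / 2             # pooled rate under the null
    null_sd = math.sqrt(2 * p_bar * (1 - p_bar))
    alt_sd = math.sqrt(p_control * (1 - p_control) + p_treated * (1 - p_treated))
    per_group = (z_alpha * null_sd + z_beta * alt_sd) ** 2 / (p_control - p_treated) ** 2
    return 2 * math.ceil(per_group)

n_original = total_sample_size(0.10, 0.05)        # original endpoint
n_same_effect = total_sample_size(0.30, 0.15)     # added component, same relative effect
n_diluted = total_sample_size(0.30, 0.225)        # added component, weaker overall effect
print(n_original, n_same_effect, n_diluted)
```

The third scenario shows the point of the slide: tripling the event rate does not help if the added component dilutes the relative treatment effect from 50% to 25%.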
Informative Censoring - 1
[Figure: follow-up timelines from time 0 to t for Patients 1-4, showing HF hospitalizations, CVD deaths, and a non-CVD death.]
Informative Censoring - 2
• If a patient dying from a non-CVD cause would have had a different risk of HF hospitalization (had they survived) than survivors, the censoring is “informative”.
• Bias could result if risk of non-CVD death varied by treatment group.
PICO HF Trial: Ranked Clinical Outcome at 24 Weeks
Assigned treatment:                    Pimobendan (N=209)   Placebo (N=108)
Test same/higher than baseline         132 (63%)            64 (59%)
Test lower duration than baseline      48 (23%)             34 (31%)
Too sick to undergo exercise test      5 (2%)               4 (4%)
Died before 24 weeks                   24 (12%)             6 (6%)
P=0.5 for 63% versus 59%; P < 0.05 for difference in exercise duration.
Women’s Angiographic Vitamin and Estrogen Trial (WAVE)
• Objective: to determine whether HRT or antioxidant vitamin supplements influenced the progression of coronary artery disease as measured by serial angiograms (2x2 factorial study).
• Target population: women with 15-75% coronary stenosis at entry.
• Primary endpoint: change in lumen diameter; deaths and MIs assigned worst rank.
JAMA 2002; 288: 2432-2440.
Freemantle Guidelines for Reporting
1. Components of composite outcomes should always be defined as secondary outcomes and reported alongside the results of the primary analysis, preferably in a table.
2. Ensure that the reporting of composite outcomes is clear and avoids the suggestion that individual components of the composite have been demonstrated to be effective.
3. Systematic overviews and quantitative meta-analysis should be used to identify the effects of treatments on rare but important endpoints that may be included as part of composite outcomes in individual trials.
Freemantle N, et al. JAMA 2003.
Guide to Interpreting Composite End Points
1. Are the component end points of similar importance to patients?
2. Did the more and less important end points occur with similar frequency?
3. Is the underlying biology of the component end points similar?
4. Are the point estimates of the relative risk reduction similar and the confidence intervals sufficiently narrow?
Montori VM et al, BMJ 2005.
Recommendations on Reporting of Composite Outcomes
• How often did each component contribute to composite outcome (descriptive)?
• What is the relative hazard for each component of the composite, and the separate number of events and rate for each component (“Consumer Reports approach”)?
Multiple Outcomes are a Necessity, So No Matter What You Do…
• Collect data on all components of the combined endpoint for trial duration
• Report not only the combined endpoint, but also:
– how often each component contributed to it
– the separate number of events and rate for each component (“Consumer Reports approach”)
• See NuCOMBO (N Engl J Med 1996) and EPHESUS (N Engl J Med 2003) trials for good examples of composite outcome reporting.
Example: NuCOMBO AIDS Trial
How often did each component occur as 1st event?
Component                  AZT+ddI (N=363)   AZT (N=372)
Death                      75                54
PCP                        32                51
Esophageal Candidiasis     30                22
MAC                        23                30
CMV                        20                28
Other AIDS infections      27                29
Malignancies               9                 13
Other conditions           10                17
AIDS/Death                 226               244
Hazard ratio for AIDS/Death: 0.86 (0.71 to 1.03)
Example: NuCOMBO AIDS Trial
What is the separate incidence of each component of the combined endpoint “Consumer Reports approach”?
Component                  AZT+ddI (N=363)   AZT (N=372)   Hazard Ratio
Death                      176               191           0.88
PCP                        42                60            0.65
Esophageal Candidiasis     43                42            0.97
MAC                        42                58            0.66
CMV                        49                49            0.96
Other AIDS infections      37                38            0.94
Malignancies               19                27            0.64
Other conditions           17                26            0.60
AIDS/Death                 226               244           0.86
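A small sketch of the descriptive side of the "Consumer Reports approach", using the AZT+ddI counts from the two NuCOMBO slides. It tabulates how much each component contributed to the composite as a first event next to its count at any time during follow-up. The variable names are illustrative only.

```python
components = ["Death", "PCP", "Esophageal candidiasis", "MAC", "CMV",
              "Other AIDS infections", "Malignancies", "Other conditions"]
# Counts copied from the two NuCOMBO slides (AZT+ddI arm, N=363).
first_event = [75, 32, 30, 23, 20, 27, 9, 10]    # component occurring as 1st event
any_time = [176, 42, 43, 42, 49, 37, 19, 17]     # component occurring at any time

composite_total = sum(first_event)               # patients reaching AIDS/Death
for name, first, total in zip(components, first_event, any_time):
    share = 100 * first / composite_total
    print(f"{name:<24} first event: {first:>3} ({share:4.1f}% of composite); any time: {total}")
```

The first-event counts add up to the composite total, while the any-time counts do not, because one patient can experience several components.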
Composite Endpoint Pitfalls
• Components of a composite usually vary in severity and in impact on quality of life
• Time-to-event analyses usually focus on the 1st event and ignore multiple events of the same or different types.
Weighting the Components of Composite Outcomes
• Risk of death associated with different components
• Rank-ordering of outcomes in terms of severity and quality of life by clinicians and patients
• Rating the entire event profile
Some Approaches for Accounting for Severity of Events and Event Histories
• Ranking of entire event histories (Follmann et al., Stat Med 1992)
• Marginal models with ranking of events according to risk of death or subjective ranking by clinicians and/or patients (Neaton et al., Stat Med 1994)
• Rule-based ranking (Bjorling and Hodges, Stat Med 1997): severity, timing, number
• Weights determined by clinical investigators for trials of thrombolytic therapy (Armstrong P et al, Am Heart J 2011) [death 1.0, shock 0.5, CHF 0.3, recurrent MI 0.2]
• Matched pairs (Win Ratio) for heart failure trials (Pocock S et al, Euro Heart J 2012)
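A rough sketch of the win ratio idea, under simplifying assumptions that are not from the slide: it uses the unmatched variant (every treated patient compared with every control patient) rather than matched pairs, assumes a common follow-up time for everyone so censoring can be ignored, and uses hypothetical patients. Each pair is compared on death first, and only if tied on death, on HF hospitalization.

```python
def compare(a, b):
    """Return 1 if patient a 'wins', -1 if b wins, 0 if tied.

    Each patient is a dict with 'death' and 'hosp' event times;
    None means no such event during the (assumed common) follow-up.
    """
    for key in ("death", "hosp"):            # hierarchy: death first, then hospitalization
        ta = a[key] if a[key] is not None else float("inf")
        tb = b[key] if b[key] is not None else float("inf")
        if ta > tb:
            return 1                         # later (or no) event is better
        if ta < tb:
            return -1
    return 0

def win_ratio(treated, control):
    wins = losses = 0
    for a in treated:
        for b in control:
            result = compare(a, b)
            wins += result == 1
            losses += result == -1
    ratio = wins / losses if losses else float("inf")
    return wins, losses, ratio

# Hypothetical data (times in months).
treated = [{"death": None, "hosp": None}, {"death": None, "hosp": 10}, {"death": 20, "hosp": 5}]
control = [{"death": None, "hosp": 8}, {"death": 15, "hosp": 3}, {"death": 12, "hosp": None}]
wins, losses, ratio = win_ratio(treated, control)
print(wins, losses, ratio)
```

The hierarchy is the design choice that matters: a hospitalization can only decide a pair when death does not, so the more serious component always takes priority.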
Considerations in Analysis of All Events
• Events are not independent: SEs have to be adjusted
• 2nd, 3rd, … events may not add much to the signal from the 1st event
• A loss of power could result with an analysis of all events if treatment was modified after the 1st event
Recurrent Events of the Same Type
• HF hospitalizations (Euro J Heart Fail 2014; 16:33-40)
• COPD exacerbations (N Engl J Med 2011; 365:689-698)
• Bacteriuria and pyuria at repeated visits in elderly women (JAMA 1994; 271:751-754)
Other examples:
• Fungal infections
• Transient ischemic attacks
• Seizures in epileptic patients
Statistical methods: Poisson and negative binomial regression; generalized linear mixed models.
Example: COPD Exacerbations (N Engl J Med 2011)
• Fixed follow-up of 12 months
• 741 exacerbations among 558 participants given azithromycin (317 had at least one event)
• 900 exacerbations among 559 participants given placebo (380 had at least one event)
• HR (1st event) = 0.73 (95% CI: 0.63-0.84; p<0.001)
• RR (negative binomial regression) = 0.83 (95% CI: 0.72-0.95; p=0.01); p<0.001 by Poisson regression.
Example: Heart Failure Hospitalizations (Euro J Heart Fail 2014)
• Variable follow-up (median = 36.6 months)
• 392 HF hospitalizations among 1,514 participants given candesartan (230 had at least one event)
• 547 HF hospitalizations among 1,509 participants given placebo (278 had at least one event)
• HR (1st event) = 0.82 (95% CI: 0.70-0.97; p=0.018)
• RR (Poisson regression) = 0.71 (95% CI: 0.62-0.81; p<0.001); RR (negative binomial regression) = 0.68 (95% CI: 0.54-0.85) (lower point estimate but wider CI)
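A back-of-the-envelope check on the recurrent-event counts in the COPD and heart failure examples. This sketch computes a crude rate ratio (events per participant) with a Poisson-type Wald interval on the log scale. It assumes equal follow-up per participant and no overdispersion, so it is not the regression model used in either paper; the function name is illustrative.

```python
import math

def crude_rate_ratio(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Crude ratio of events per participant with a Poisson-based 95% CI."""
    ratio = (events_trt / n_trt) / (events_ctl / n_ctl)
    se_log = math.sqrt(1 / events_trt + 1 / events_ctl)   # SE of log rate ratio
    return ratio, ratio * math.exp(-z * se_log), ratio * math.exp(z * se_log)

copd = crude_rate_ratio(741, 558, 900, 559)     # azithromycin vs placebo
hf = crude_rate_ratio(392, 1514, 547, 1509)     # candesartan vs placebo
print("COPD: %.2f (%.2f-%.2f)" % copd)
print("HF:   %.2f (%.2f-%.2f)" % hf)
```

For the heart failure trial the crude figures come out close to the reported Poisson result. For COPD the crude ratio is close to the negative binomial point estimate, but the Poisson-type interval is narrower than the reported negative binomial interval, which is what one would expect when events cluster within patients.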
Alternatives to Composite or Combined Endpoints
• Single outcome (e.g., all-cause mortality)
• Co-primary endpoints (requires an adjustment to Type I error if success is defined as “significant” on any)
• Global index (may not be easily interpretable)
• Hierarchical scoring/ranking of multiple outcomes
• Primary + supportive outcome (SMART)
Multiple Primary Endpoints
• Different than a single combined endpoint
• Type I error adjustment may be required (usually is)
• Strategy for controlling type I error depends on research question
Early HIV (High CD4+) Treatment Trial: Co-Primary Endpoints or Single Composite?
• Serious AIDS
– Any fatal AIDS event
– Non-fatal AIDS events except herpes simplex, esophageal candidiasis and pulmonary tuberculosis
• Serious non-AIDS
– Non-AIDS deaths
– CV disease
– Liver disease
– Renal disease
– Non-AIDS malignancies (excluding skin cancer)
What is the question? Four possible alternative hypotheses?
• $H_A$: Treatment effect in at least one of K endpoints
• $H_A$: Treatment effect in all K endpoints (no type I error adjustment needed)
• $H_A$: Treatment effect in M of K endpoints
• $H_A$: Treatment effect in weighted average of K endpoints
Capizzi T, Zhang J. Drug Info J, 30:949-956, 1996.
Strategies for $\alpha$ (Type I Error) Adjustment for 1st Hypothesis: Treatment effect in at least 1 of K endpoints
Bonferroni adjustment most common; conservative.
Suppose there are 2 co-primary endpoints.
Prob[no type I error for trial (T)] = $1 - \alpha_T = (1 - \alpha_1)(1 - \alpha_2)$
$\alpha_T = 1 - (1 - \alpha_1)(1 - \alpha_2)$, where $\alpha_T$ is the $\alpha$ level for the trial.
For the case of $\alpha_1 = \alpha_2 = 0.05$, $\alpha_T = 0.098$ (unacceptably high).
For $\alpha_T = 0.05$, each $\alpha = 1 - (1 - \alpha_T)^{1/2} = 0.0253$, or more generally $1 - (1 - \alpha_T)^{1/n}$.
This is approximately equal to $\alpha_T / n$, or 0.05/2 = 0.025 for this case.
Example: EPHESUS heart failure study of eplerenone (Cardio Drugs and Therapy, 15:79-87, 2001): 2 primary endpoints, total mortality (0.04) and CV mortality or morbidity (0.01); overall study type 1 error of 0.05.
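The arithmetic on this slide can be checked with a few lines of code. The sketch below assumes the tests on the endpoints are independent, which is what the product formula implies; the function names are illustrative.

```python
import math

def familywise_error(alphas):
    """Trial-level type I error when each endpoint is tested at its own alpha."""
    return 1 - math.prod(1 - a for a in alphas)

def per_test_alpha_exact(alpha_trial, n_tests):
    """Equal per-endpoint alpha that gives exactly alpha_trial overall."""
    return 1 - (1 - alpha_trial) ** (1 / n_tests)

def per_test_alpha_bonferroni(alpha_trial, n_tests):
    """Bonferroni approximation: split alpha equally."""
    return alpha_trial / n_tests

unadjusted = familywise_error([0.05, 0.05])        # two endpoints, each at 0.05
exact = per_test_alpha_exact(0.05, 2)
bonferroni = per_test_alpha_bonferroni(0.05, 2)
ephesus = familywise_error([0.04, 0.01])           # unequal split as in EPHESUS
print(round(unadjusted, 4), round(exact, 4), bonferroni, round(ephesus, 4))
```

The EPHESUS split shows that the per-endpoint levels do not have to be equal: any allocation that sums to 0.05 keeps the overall error at or below 0.05 under this calculation.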
Other Strategies
• Global tests, e.g., MANOVA and Hotelling’s $T^2$ (good approach if endpoints are not correlated) or O’Brien’s rank test (best when all outcomes are expected to go in the same direction). Problem: not specific enough.
• Sequential testing procedures, e.g., Holm’s step down procedure or Hochberg’s step-up procedure (both less conservative than Bonferroni) – marginal testing with control of overall error rate
Example
• 4 endpoints (ordered by p-values): p=0.081; p=0.024; p=0.020; p=0.005
• Bonferroni: judge each against 0.05/4 = 0.0125; only the 4th endpoint is significant
• Holm step-down: reject $H_0$ for the 4th endpoint since p = 0.005 < 0.0125; p-value for the 3rd endpoint = 0.020 > 0.05/3 = 0.017, therefore stop and accept $H_0$ for the other 3 endpoints
• Hochberg step-up: accept $H_0$ for the 1st endpoint since 0.081 > 0.05; reject $H_0$ for the 2nd endpoint and all remaining endpoints since 0.024 < 0.05/2 = 0.025.
Sankoh et al, Stat Med 16:2529-2542, 1997
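The three procedures in this example can be sketched in plain Python. The functions below are a minimal illustration written for this example, not a library API; each returns one reject/accept flag per endpoint in the original order, using the four p-values from the slide.

```python
def bonferroni(pvals, alpha=0.05):
    m = len(pvals)
    return [p <= alpha / m for p in pvals]

def holm_step_down(pvals, alpha=0.05):
    """Start with the smallest p-value; stop at the first non-rejection."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if pvals[i] <= alpha / (m - rank):      # thresholds: alpha/m, alpha/(m-1), ...
            reject[i] = True
        else:
            break
    return reject

def hochberg_step_up(pvals, alpha=0.05):
    """Start with the largest p-value; at the first rejection, reject all smaller ones."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    reject = [False] * m
    for rank, i in enumerate(order):
        if pvals[i] <= alpha / (rank + 1):      # thresholds: alpha, alpha/2, alpha/3, ...
            for j in order[rank:]:
                reject[j] = True
            break
    return reject

pvals = [0.081, 0.024, 0.020, 0.005]            # endpoints 1-4 from the slide
print(bonferroni(pvals))
print(holm_step_down(pvals))
print(hochberg_step_up(pvals))
```

With these p-values Bonferroni and Holm agree, while Hochberg rejects three of the four hypotheses, which is why the choice of procedure should be fixed in the protocol.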
O’Brien’s Rank Sum Procedure
• Rank the responses of patients for each of the K endpoints (e.g., as in Wilcoxon’s rank sum test)
• Sum the ranks for each patient
• Carry out an analysis of variance (ANOVA) on the sum of the ranks
O’Brien P. Biometrics 40:1079-1087, 1984. See TOMHS report in JAMA for application.
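The three steps of the rank sum procedure can be sketched with the standard library only. The data are hypothetical (two endpoints, three patients per group, higher values assumed better), and the helper names are illustrative; only the F statistic is computed, not its p-value.

```python
from statistics import mean

def midranks(values):
    """Ranks 1..n, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for k in range(pos, end + 1):
            ranks[order[k]] = (pos + end) / 2 + 1
        pos = end + 1
    return ranks

def rank_sums(data):
    """Steps 1-2: rank each endpoint across all patients, then sum per patient."""
    n_endpoints = len(data[0])
    per_endpoint = [midranks([row[j] for row in data]) for j in range(n_endpoints)]
    return [sum(r[i] for r in per_endpoint) for i in range(len(data))]

def one_way_f(groups):
    """Step 3: one-way ANOVA F statistic on the rank sums."""
    everything = [v for g in groups for v in g]
    grand = mean(everything)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (len(groups) - 1)) / (ss_within / (len(everything) - len(groups)))

# Hypothetical data: each row is one patient's values on K=2 endpoints.
treated = [[5, 12], [7, 15], [6, 11]]
control = [[3, 9], [4, 10], [2, 13]]
sums = rank_sums(treated + control)
f_stat = one_way_f([sums[:3], sums[3:]])
print(sums, round(f_stat, 2))
```

Ranking is done across both groups combined, so endpoints measured on different scales contribute equally to each patient's score.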
Advantages and Disadvantages of Different Approaches to Defining Primary Endpoint
• Single outcome. Advantage: simple. Disadvantage: sample size; multiple endpoints are a reality.
• Composite. Advantage: sample size. Disadvantage: interpretation not easy if components show different patterns.
• Co-primary outcomes. Advantage: eggs not all in one basket. Disadvantage: sample size and power.
• Global index. Advantage: power. Disadvantage: not easily interpretable.
• Hierarchical scoring. Advantage: power; clinical relevance. Disadvantage: clinical relevance.
Summary
• In study planning, focus on methods for defining, ascertaining, and measuring major endpoints.
• Composite outcomes can be difficult to interpret if the components do not go in the same direction – choose components carefully.
• If not primary, define secondary endpoints using all events during follow-up.
• A “Consumer Reports” analysis should be kept in mind for reporting: full disclosure of all relevant outcomes.