Statistical challenges in the validation of surrogate endpoints

Marc Buyse
International Drug Development Institute (IDDI), Brussels
Limburgs Universitair Centrum, Diepenbeek, Belgium
[email protected]

FDA Industry Workshop, September 22-23, 2004

Outline
Need for surrogates
Definitions
Validation criteria
  – Single trial
  – Several trials (meta-analysis)
Case studies
  – PSA and survival (advanced prostatic cancer)
  – 3-year DFS and 5-year OS (early colorectal cancer)

Why do we need surrogates?
Practicality of studies:
  – Shorter duration
  – Smaller sample size (?)
Availability of biomarkers:
  – Tissue, cellular, hormonal factors, etc.
  – Imaging techniques
  – Genomics, proteomics, other-ics
Ref: Schatzkin and Gail, Nature Reviews (Cancer) 2001, 3.

Validity of a surrogate endpoint
Evidence that biomarkers predict clinical effects
  – Epidemiological
  – Pathophysiological
  – Biological
  – Statistical
What are the conditions required to show this?
Ref: Biomarkers Definitions Working Group, Clin Pharmacol Ther 2001;69:89.

Definitions
Clinical endpoint: a characteristic or variable that reflects how a patient feels, functions, or survives
Biomarker: a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention
Surrogate endpoint: a biomarker that is intended to substitute for a clinical endpoint. A surrogate endpoint is expected to predict clinical benefit (or harm or lack of benefit or harm)
Ref: Temple, JAMA 1999;282:790.

Single trial
Parameters of interest
  – effect of treatment on surrogate endpoint (α)
  – effect of treatment on true endpoint (β)
  – effect of surrogate on true endpoint (γ)
  – adjusted effect of treatment on true endpoint (β_S)
  – adjusted effect of surrogate on true endpoint (γ_Z)
Ref: Buyse and Molenberghs, Biometrics 1998;54:1014.
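As an illustration of the single-trial parameters listed above, here is a minimal sketch that estimates them by ordinary least squares on simulated data. It assumes normally distributed endpoints and the notation of Buyse and Molenberghs (alpha, beta, gamma, beta_S, gamma_Z); the helper names `ols` and `single_trial_parameters`, the simulated effect sizes, and the sample size are all invented for this example and are not taken from the slides.

```python
import numpy as np


def ols(y, *columns):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # coef[0] is the intercept


def single_trial_parameters(z, s, t):
    """Estimate the five single-trial parameters (hypothetical helper).

    z: treatment indicator, s: surrogate endpoint, t: true endpoint.
    """
    alpha = ols(s, z)[1]                # effect of treatment on surrogate
    beta = ols(t, z)[1]                 # effect of treatment on true endpoint
    gamma = ols(t, s)[1]                # effect of surrogate on true endpoint
    _, beta_s, gamma_z = ols(t, z, s)   # effects adjusted for each other
    return {"alpha": alpha, "beta": beta, "gamma": gamma,
            "beta_s": beta_s, "gamma_z": gamma_z}


# Simulated trial (assumed values, for illustration only).
rng = np.random.default_rng(0)
n = 2000
z = rng.integers(0, 2, size=n).astype(float)   # randomized treatment
s = 1.0 * z + rng.normal(size=n)               # surrogate endpoint
t = 0.5 * z + 0.8 * s + rng.normal(size=n)     # true endpoint

params = single_trial_parameters(z, s, t)
for name, value in params.items():
    print(f"{name}: {value:.3f}")
```

With least squares the fitted coefficients satisfy beta = beta_S + gamma_Z * alpha exactly in the sample, which is a convenient sanity check on the estimates.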
[Diagram: Treatment, Surrogate endpoint, True endpoint]

Correlation of endpoints is not enough
Key point: “A correlate does not a surrogate make”
γ ≠ 0 is not a sufficient condition for validity
Ref: Fleming and DeMets, Ann Intern Med 1996;125:605.

A first formal definition and criteria
Prentice’s definition
  H0S: α = 0 ⇔ H0T: β = 0
Prentice’s criteria
An endpoint can be used as a surrogate if
  – it predicts the final endpoint (γ ≠ 0)
  – it fully captures the effect of treatment upon the final endpoint (β ≠ 0 and β_S = 0)
Ref: Prentice, Statist in Med 1989;8:431.

A first formal definition and criteria
Problems with Prentice’s approach
  – rooted in hypothesis testing
  – requires significant treatment effects
  – overly stringent
  – criteria not equivalent to definition (except for binary endpoints)
  – one can never prove the null (β_S = 0)
Ref: Buyse and Molenberghs, Biometrics 1998;54:1014.

The proportion explained
Freedman’s “proportion explained” is defined as
  PE = 1 − β_S / β
  – if β_S = β, PE = 0 and the surrogate explains nothing
  – if β_S = 0, PE = 1 and the surrogate explains the entire effect of treatment on the true endpoint
Ref: Freedman et al, Statist in Med 1992;11:167.

The proportion explained
Problems with the proportion explained
  – PE is not a proportion (can be <0 or >1)
  – PE confuses two sources of variability, one at the individual level, the other at the trial level: PE = γ_Z α / β
  – PE can be anywhere on the real line, depending on precision of S and T…
Ref: Molenberghs et al, Controlled Clin Trials 2002;23:607.

Statistical validation of surrogate endpoints
“The effect of treatment on a surrogate endpoint must be reasonably likely to predict clinical benefit”
Ref: Biomarkers Definitions Working Group, Clin Pharmacol Ther 2001;69:89.
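To make the behaviour of the proportion explained concrete, the following small sketch evaluates PE = 1 − beta_S / beta for a few coefficient values. The function names and all numerical values are invented for illustration; the second function assumes the linear-model identity beta = beta_S + gamma_Z * alpha, under which PE can also be written as gamma_Z * alpha / beta.

```python
def proportion_explained(beta, beta_s):
    """PE = 1 - beta_S / beta (undefined when beta is zero)."""
    if beta == 0:
        raise ValueError("PE is undefined when the treatment effect beta is zero")
    return 1.0 - beta_s / beta


def proportion_explained_from_components(alpha, beta, gamma_z):
    """Equivalent form PE = gamma_Z * alpha / beta, assuming
    beta = beta_S + gamma_Z * alpha."""
    if beta == 0:
        raise ValueError("PE is undefined when the treatment effect beta is zero")
    return gamma_z * alpha / beta


# The two limiting cases described on the slide.
print(proportion_explained(beta=1.0, beta_s=1.0))    # 0.0: surrogate explains nothing
print(proportion_explained(beta=1.0, beta_s=0.0))    # 1.0: surrogate explains everything

# PE is not confined to [0, 1] (hypothetical coefficient values).
print(proportion_explained(beta=0.5, beta_s=-0.25))  # 1.5
print(proportion_explained(beta=0.5, beta_s=0.75))   # -0.5
```

Because beta sits in the denominator, a small or imprecisely estimated treatment effect on the true endpoint makes PE unstable, which is one way to read the criticism that PE can fall anywhere on the real line.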
The relative effect
Interest now focuses on the two components of PE:
  – the surrogate must predict the true endpoint (γ_Z ≠ 0)
  – the relative effect, defined as RE = β / α, allows prediction of the effect of treatment on the true endpoint (β) based on the effect of treatment on the surrogate (α)
Ref: Buyse and Molenberghs, Biometrics 1998;54:1014.

Prediction of true endpoint from surrogate endpoint
Endpoints observed on individual patients
[Figure: True endpoint versus surrogate endpoint, with regression line. The slope is the effect of the surrogate on the true endpoint; R² indicates quality of regression.]

Prediction of treatment effect: one trial
Treatment effect observed in the trial
[Figure: Treatment effect on true endpoint (β) versus treatment effect on surrogate endpoint (α). Slope = β / α. Regression through origin; only one point!]

Several trials
For a marker to be used as a surrogate, we need “repeated demonstrations of a strong correlation between the marker and the clinical outcome”
Ref: Holland, 9th EUFEPS Conference on “Optimising Drug Development: Use of Biomarkers”, Basel, 2001.

Prediction of treatment effect: several trials
Treatment effects observed in all trials
[Figure: Treatment effect on true endpoint (β) versus treatment effect on surrogate endpoint (α), one point per trial, with regression line. R² indicates quality of regression.]

Validation criteria using several trials
Parameters of interest
  – effect of treatment on surrogate endpoint (α)
  – effect of treatment on true endpoint (β)
  – effect of surrogate on true endpoint (γ)
  – measure of association between surrogate endpoint and true endpoint (R²individual)
  – measure of association between effects of treatment on surrogate endpoint and on true endpoint (R²trial)
Ref: Buyse et al, Biostatistics 2000;1:49; Gail et al, Biostatistics 2000;1:231.

Technical difficulties: the endpoints are not normally distributed
In practice, endpoints are often of the following types: response, survival, longitudinal.
Such endpoints are not normally distributed, and therefore complex modelling is required to characterize the association between endpoints (“individual level association”). At the trial level, however, simple linear models are still adequate to characterize the association between treatment effects on the endpoints (“trial level association”).
Refs: Molenberghs et al, Stat Med 20:3023, 2001; Burzykowski et al, J Royal Stat Soc A 50: 405, 2001; Renard et al, J Applied Statist 30:235, 2002.

A case study in advanced prostatic cancer: the trials
Two multicentric trials for patients in relapse after first-line endocrine therapy (596 patients)
Unit of analysis for treatment effects: country (19 units)
Patients randomized between two treatments:
  – Experimental (retinoic acid metabolism-blocking agent)
  – Control (anti-androgen)
Ref: Buyse et al, in: Biomarkers in Clinical Drug Development (Bloom JC, ed.): Springer-Verlag, 2003.

A case study in advanced prostatic cancer: the endpoints
Potential surrogate endpoints:
  – Longitudinal PSA measurements taken at pre-defined time points
  – PSA response (decrease of at least 50%)
  – Time to PSA progression (TPP)
True endpoint:
  – Overall survival

A case study in advanced prostatic cancer
[Figure: Surrogate endpoint. Log(PSA) versus time (years), experimental and control arms.]
[Figure: True endpoint. Estimated hazard rate versus time (years), experimental and control arms.]

PSA response as surrogate for survival
Very weak association between treatment effects: R² = 0.05
[Figure: Treatment effect on survival time versus treatment effect on PSA response.]

TPP as surrogate for survival
Weak association between treatment effects: R² = 0.22
[Figure: Treatment effect on survival time versus treatment effect on time to PSA progression.]

Longitudinal PSA as surrogate for survival
Moderate association between treatment effects: R²trial = 0.45
[Figure: Treatment effect on survival time versus treatment effect on longitudinal PSA.]
Individual-level and trial-level measures of association

Surrogate                 Individual-level association between        Trial-level association between treatment
                          PSA and survival [95% C.I.]                 effects on PSA and survival [S.E.]
PSA response              Survival odds ratio = 5.5 [2.7 – 8.2]       R²trial = 0.05 [0.13]
Time to PSA progression   Survival odds ratio = 6.3 [4.4 – 8.2]       R²trial = 0.22 [0.18]
Longitudinal PSA          Coefficient of determination                R²trial = 0.45 [0.18]
                          R²(t) > 0.84 at all times t

A case study in early colorectal cancer: the trials
Fifteen collaborative group trials for patients after resection of colorectal tumor (12,915 patients)
Unit of analysis for treatment effects: 18 comparisons between 33 treatment arms
Patients randomized between various 5-FU regimens and/or control

A case study in early colorectal cancer: the endpoints
Potential surrogate endpoint: 3-year disease-free survival
True endpoint: 5-year overall survival
Ref: Sargent et al, Proceedings ASCO (Abstract # 3502), 2004.
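The trial-level association used in both case studies can be sketched as a simple linear regression of the estimated treatment effects on the true endpoint against those on the surrogate, one point per trial (or per country, or per comparison). The version below is a deliberately simplified, unweighted illustration that ignores the estimation error in each effect; it is not the modelling used in the cited papers. The function names and the effect estimates are invented for this example.

```python
import numpy as np


def trial_level_regression(effects_on_surrogate, effects_on_true):
    """Unweighted least-squares line through per-trial treatment effects.

    Returns (slope, intercept, r_squared). Hypothetical helper that treats
    the estimated effects as if they were known without error.
    """
    a = np.asarray(effects_on_surrogate, dtype=float)
    b = np.asarray(effects_on_true, dtype=float)
    slope, intercept = np.polyfit(a, b, 1)
    fitted = intercept + slope * a
    ss_res = np.sum((b - fitted) ** 2)
    ss_tot = np.sum((b - b.mean()) ** 2)
    return slope, intercept, 1.0 - ss_res / ss_tot


def predict_effect_on_true(slope, intercept, effect_on_surrogate):
    """Predicted treatment effect on the true endpoint for a new trial."""
    return intercept + slope * effect_on_surrogate


# Invented per-trial effect estimates (e.g. log hazard ratios).
a = [-0.40, -0.25, -0.10, 0.00, 0.15, 0.30]
b = [-0.30, -0.22, -0.05, 0.02, 0.08, 0.25]

slope, intercept, r2 = trial_level_regression(a, b)
print(f"slope={slope:.3f} intercept={intercept:.3f} R2_trial={r2:.3f}")
print(f"predicted effect: {predict_effect_on_true(slope, intercept, -0.20):.3f}")
```

A value of R²trial close to 1 means the effect on the surrogate in a new trial would predict the effect on the true endpoint precisely; values such as 0.05 or 0.22 leave the prediction too uncertain to be useful.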
Acknowledgement: the following slides are based on Dr Daniel Sargent’s presentations to ODAC on May 5 and at ASCO on June 6

Most recurrences occur before 3 years
[Figure: Recurrence rate (%) versus years after randomization, from 0 to 8 years.]

Strong association between endpoints
R² = 0.86
[Figure: Overall survival versus disease-free survival.]

Strong association between treatment effects
R² = 0.87
[Figure: Overall survival hazard ratio versus disease-free survival hazard ratio.]

Predicted versus actual OS hazard ratios
[Figure: Predicted and actual overall survival hazard ratios, by trial.]

Overview of validation approaches
Single trial
  – full capture (Prentice)
  – proportion explained (Freedman et al)
  – relative effect (Buyse & Molenberghs)
  – likelihood reduction factor (Alonso et al)
Several trials (meta-analysis)
  – concordance (Begg & Leung)
  – correlation of effects (Daniels & Hughes)
  – trial-level measures of association (Gail et al)
  – individual- and trial-level measures of association (Buyse et al)
  – predicted treatment effect (Baker)
  – surrogate threshold effect (Burzykowski & Buyse)

Conclusions on surrogate validation
Ideally, statistical validation requires the following:
  – data from randomized trials
  – replication at the trial or center level
  – at least some observations of T
  – large numbers of observations
  – range of therapeutic questions (Z1, Z2, …)
Hence:
  – individual patient data meta-analyses are needed
  – access to such data is a problem when they are proprietary
Ref: Burzykowski, Molenberghs and Buyse (eds.), “The Evaluation of Surrogate Endpoints”, Springer-Verlag (in press).