Design of Clinical Research Protocols

Study Design and Hypothesis Testing in Clinical Research

Jonathan J. Shuster, Ph.D., Research Professor of Biostatistics, Univ. of Florida, College of Medicine

Take-home Messages

• Rely on Evidence-Based Medicine. Conventional wisdom can easily lead us astray.

• The objective of Statistics is to make informed inferences about a population, based on a sample. It is imperative to quantify the uncertainty.

• The P-value is a quantity that allows us to infer something about whether a scientific hypothesis is false.

• Non-significant results are inconclusive.

• Randomization and intent-to-treat are vital components in sound clinical research.

Topics

1. Motivating Evidence-Based Clinical Studies

2. Objective of Statistics

3. Hypothesis testing and P-values

4. Real Examples and their lessons

1. Motivating Evidence-Based Medicine

• A coin is “loaded”, with a 70% chance of landing heads. One player picks a three outcome sequence (e.g. HTH), then the other picks a different sequence. Whoever’s sequence comes up first is the winner.

• Do you want to choose first, and if so, what sequence do you select?

Evidence-Based Medicine

• So you decided to go first and pick HHH, right?

• OK, I pick THH.

• HHH can only occur before THH if it is on the first three flips. (If the first time HHH occurs is on flips 6, 7, 8, then flip 5 is T, so flips 5, 6, 7 are THH and I win.) I make your first 2 my last 2, so I tend to stay ahead.

• Your chance of winning = $0.7^3 = 0.343$ (34.3%).
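The 34.3% figure can be checked by simulation. The sketch below is illustrative only: the function names, the seed, and the number of trials are assumptions, and it handles three-flip sequences only.

```python
import random

def first_to_appear(seq_a, seq_b, p_heads, rng):
    """Flip a biased coin until seq_a or seq_b appears; return the winning sequence."""
    recent = ""
    while True:
        recent += "H" if rng.random() < p_heads else "T"
        recent = recent[-3:]  # only the last three flips matter
        if recent == seq_a:
            return seq_a
        if recent == seq_b:
            return seq_b

def estimate_win_rate(seq_a, seq_b, p_heads=0.7, trials=100_000, seed=2006):
    """Fraction of simulated games in which seq_a appears before seq_b."""
    rng = random.Random(seed)
    wins = sum(
        first_to_appear(seq_a, seq_b, p_heads, rng) == seq_a
        for _ in range(trials)
    )
    return wins / trials

exact = 0.7 ** 3  # HHH wins only if the first three flips are all heads
simulated = estimate_win_rate("HHH", "THH")
print(f"exact = {exact:.3f}, simulated = {simulated:.3f}")
```

The simulated rate should land close to the exact value, which is the point of the example: picking first with the "obvious" sequence loses about two games in three.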

Evidence-Based Medicine

• Lesson from this example.

• Things are not always what they seem. You need to be a healthy skeptic.

• Reference: Shuster, J. A two-player coin game paradox in the classroom. American Statistician, 2006 (Feb), vol. 60, pp. 68-70.

2. Objective of Statistics

• To make an inference about a defined target population from a representative sample.

• That is, for us, to start from a medical hypothesis about a medical condition, help design a study that can collect data to test the question, and draw conclusions. Quantifying the uncertainty about the inference is a key part.

2. Comment on This

• Should we compare treatment groups statistically in a randomized study with respect to baseline parameters (e.g. age, gender, ethnicity, blood pressure)?

2. Provenzano: Clin J Am Soc Nephrol 4, 386-93, 2009 (slide 12)

• “Baseline characteristics were similar except for more men in the oral iron group compared with the ferumoxytol group (62.9% versus 50.0%, P = 0.04). Mean baseline laboratory measures were similar between the two treatment groups.”

2. Comment on This

• For hypothesis-driven research, should we test for normality before using a t-test and, if normality is rejected, try to transform the data?

Nissen Article

• JAMA. 2008;299(13):1561-1573. Comparison of Pioglitazone vs Glimepiride on Progression of Coronary Atherosclerosis in Patients With Type 2 Diabetes.

• ‘For continuous variables with a normal distribution, the mean and 95% confidence intervals (CIs) are reported. For variables not normally distributed, median and interquartile ranges are reported and 95% CIs around median changes were computed using bootstrap resampling.’ (N=273 vs 270 in groups)

2. Testing Assumptions

[Diagram: Diagnostic Test, with outcomes Passes and Fails]

3. Testing a Hypothesis (P-Value)

• Put a statement on trial: the “Null Hypothesis”.

• ISIS #2 (Second International Study of Infarct Survival): The five-week mortality rates for Streptokinase and Placebo are equivalent in patients with recent MIs.

• Results: Strep (791/8592 = 9.2%) vs. Plac (1029/8595 = 12.0%).

3. P-Value

• $P = 3.8 \times 10^{-9}$

• If you replicated the experiment in a population where the null hypothesis was true, there is a 3.8 in a billion chance of seeing a difference at least as extreme in either direction (2-sided).

3. ISIS #2 Reference

• ISIS #2 Collaborative Group. (1988) Randomised trial of intravenous streptokinase, oral aspirin, both, or neither among 17,187 cases of acute myocardial infarction: ISIS 2. Lancet 2: 349-360.

3. P-Value and Proof by Contradiction

• What is the probability that, if you replicated your experiment in a target population where your null hypothesis is true, you would see differences at least as extreme as what you actually observed? If this value (the p-value) is small, it is evidence against the null hypothesis.

• The analogy is “beyond a reasonable doubt”. Science arbitrarily uses 5% as “reasonable” doubt in most cases.

3. Was this overkill in terms of sample size?

• Suppose the results were 79/859 vs. 103/860 (the same percentages of 9.2% vs. 12.0%, but with one tenth the sample size).

• Now P=0.071 (7.1%), and the result would not be statistically significant. Would we be using this clot buster today? It was the biostatistician Sir Richard Peto who determined this sample size.
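The effect of sample size on the P-value can be sketched with a two-sided pooled two-proportion z-test. This particular test is an assumption: the slides do not say which test produced their figures. Under this test the tenth-size trial gives roughly P = 0.06 rather than exactly 0.071, which would be consistent with the slide having used a different method (for example an exact or continuity-corrected test). The function name is illustrative.

```python
import math

def two_proportion_z_test(deaths_a, n_a, deaths_b, n_b):
    """Two-sided pooled z-test for equality of two proportions."""
    p_a = deaths_a / n_a
    p_b = deaths_b / n_b
    pooled = (deaths_a + deaths_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # erfc(|z| / sqrt(2)) is the two-sided tail area of the standard normal
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Full trial counts from the slides: streptokinase vs. placebo
z_full, p_full = two_proportion_z_test(791, 8592, 1029, 8595)
# Same percentages at one tenth the sample size
z_tenth, p_tenth = two_proportion_z_test(79, 859, 103, 860)

print(f"full trial:  z = {z_full:.2f}, P = {p_full:.2e}")
print(f"tenth size:  z = {z_tenth:.2f}, P = {p_tenth:.3f}")
```

The observed death rates are identical in both calls; only the standard error changes, and that alone moves the result from overwhelming evidence to a non-significant (inconclusive) finding.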

3. ISIS #2:

• Any other questions about the study?

3. ISIS #2 Issues

• Who was watching the store? Accrual took 3.5 years, and the outcome was known for each patient within five weeks.

• Always report a sample size justification in your papers (Provenzano, slide 12, did not).
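A sample size justification usually rests on a calculation like the one sketched below, which uses the standard normal-approximation formula for comparing two proportions. Everything specific here is an assumption for illustration: the planning rates (12.0% vs. 9.2%, borrowed from the observed ISIS #2 results), the two-sided 5% significance level, and the 90% power. The slides do not describe how the ISIS #2 sample size was actually derived.

```python
import math
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.90):
    """Approximate patients per arm to detect p1 vs. p2 with a two-sided test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    term_null = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))       # spread under H0
    term_alt = z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))   # spread under H1
    return math.ceil((term_null + term_alt) ** 2 / (p1 - p2) ** 2)

# Hypothetical planning values, not the trial's actual design assumptions
n = n_per_group(0.120, 0.092)
print(f"patients per group for 90% power: {n}")
```

Smaller anticipated differences or higher power push the required number up quickly, which is why the planning assumptions belong in the paper alongside the result.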

4. Real Example

• Coronary Drug Project

The Coronary Drug Project Research Group (1980)

• Influence of adherence to treatment and response of cholesterol on mortality in the Coronary Drug Project. NEJM 303: 1038-1041.

• Double blind randomized study of Clofibrate vs. Placebo in men who had prior MI.

Compliers vs. Not on Drug

[Figure: Coronary Drug Project bar chart; vertical axis 0 to 20, bars labeled C_Drug and NC_Drug]

Compliers vs. Not

Drug vs. Placebo

Coronary Drug Project Take home Message

What can this study teach us about Clinical Studies?

Intent-to-Treat

• The gold standard for analyzing randomized clinical trials is Intent-to-treat. Patients are analyzed in the groups they were assigned to, irrespective of what they actually received.
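Why comparing by what patients actually took can mislead is easy to see in a simulation. The sketch below models a placebo arm in which an unobserved "frailty" trait makes patients both less likely to take their pills and more likely to die; the pills themselves do nothing. All probabilities, names, and the seed are hypothetical and chosen only for illustration.

```python
import random

def simulate_placebo_arm(n=100_000, seed=1980):
    """Return mortality by adherence group when the pill has no effect at all."""
    rng = random.Random(seed)
    counts = {"complier": 0, "non_complier": 0}
    deaths = {"complier": 0, "non_complier": 0}
    for _ in range(n):
        frail = rng.random() < 0.30          # hidden prognostic factor
        p_comply = 0.50 if frail else 0.75   # frail patients adhere less
        p_death = 0.30 if frail else 0.12    # risk does not depend on pills taken
        group = "complier" if rng.random() < p_comply else "non_complier"
        counts[group] += 1
        deaths[group] += rng.random() < p_death
    return {g: deaths[g] / counts[g] for g in counts}

rates = simulate_placebo_arm()
print({group: round(rate, 3) for group, rate in rates.items()})
```

Non-compliers show clearly higher mortality even though the treatment is inert, so a compliers-only comparison reflects who the compliers are, not what the drug does. Analyzing by assigned group avoids this selection effect.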

4. Real UF Example:

• Effectiveness of Nesiritide on Dialysis or All-Cause Mortality in Patients Undergoing Cardiothoracic Surgery. Clinical Cardiology. 2006 Jan;29(1):18-24. With T. Beaver et al.

• Motivation: Shands' impression was that it was harmful and costly.

4. Nesiritide Example

• Study Null Hypothesis: The 20-day death/dialysis rate in patients getting nesiritide within two days of surgery is the same as in “similar” patients not getting it.

• Design Suggestions?

4. Possible Designs (+/-)

• Observational: Historical Control. Compare the period before the drug with the period after the drug started to be given to a sizable fraction of patients (leaving a gap during the ramp-up of use). Must include all comers and use electronic chart review.

• Observational: Compare those getting the drug to those not getting it.

• Randomized controlled prospective trial.

4. Sources of Variation

• Within treatments, why might we not get the same result for every patient?

• Historical Control?

• Comparing concurrent nesiritide vs. not?

• Randomized prospective trial?

4. Sources of Bias (Confounders)

• Why might we see differences that might be totally unrelated to the treatment (nesiritide vs. not)?

• Historical Control?

• Comparing concurrent nesiritide vs. not?

• Randomized prospective trial?

4. Nesiritide: Propensity Scoring

• Actual Design: Compared Nesiritide vs. Not by Propensity Score Matching.

• Using 12 key covariates, we estimated the probability that a patient would get Nesiritide given these covariates. Then we matched the nesiritide patients to non nesiritide patients for the propensity, and did a matched analysis.
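The matching step can be sketched as follows. This is not the study's algorithm, which the slides do not describe: it is a generic greedy one-to-one nearest-neighbor match with a caliper, and it assumes the propensity scores have already been estimated (for example by logistic regression on the covariates). The patient IDs, scores, and caliper width are hypothetical.

```python
def greedy_match(treated, controls, caliper=0.05):
    """Pair each treated patient with the closest unused control within the caliper.

    treated, controls: dicts mapping patient id -> estimated propensity score.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda item: item[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs

# Hypothetical propensity scores (probability of receiving nesiritide)
treated = {"T1": 0.62, "T2": 0.35, "T3": 0.90}
controls = {"C1": 0.60, "C2": 0.33, "C3": 0.10, "C4": 0.40}

pairs = greedy_match(treated, controls)
print(pairs)
```

Here T3 stays unmatched because no control has a comparable propensity. Treated patients without a plausible counterpart drop out of a matched analysis, which limits the population the conclusions apply to.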

4. Conclusions

• Nesiritide showed no significant difference (inconclusive) within CABG patients.

• Nesiritide showed promise in aneurysm subjects with baseline elevated serum creatinine (SCR), but was inconclusive in other such patients.

• Run a future randomized double-blind trial in aneurysms with elevated SCR. (Just completed and close to being in press with an inconclusive result.)

4. Conclusion (continued)

• Note that the Shands study data were very important in designing the randomized follow-up study, in terms of the number of subjects needed (power analysis).

Take-home Messages

• The P-value is a quantity that allows us to infer something about whether a scientific hypothesis is false.

• Non-significant results are inconclusive.

• Randomization and intent-to-treat are vital components in sound clinical research.

Design One Together

• Medical Question: Does Caffeine Withdrawal cause Headaches?

Eligibility

Design

• What are the sources of variation besides caffeine consumption?

• How do we control caffeine consumption?

• Should we use deception (hide the purpose of the study)? Is this ethical?

Design

• Pre-Post?

• Double Blind Parallel Study?

• Double Blind Crossover Study?

Forensics for Irregularity

Phenylephrine

Phenylephrine Crossover Studies

Phenylephrine (Baseline NAR)

Study (10 mg vs Placebo)    Std Dev    CV = 100 SD/Mean
1 (N=16) (EB)               2.0        15.3%
2 (N=10) (EB)               0.9        6.7%
3 (N=16)                    7.8        36.3%
4 (N=15)                    9.5        35.6%
5 (N=16)                    6.2        29.3%
6 (N=16)                    9.8        40.4%
7 (N=14)                    9.4        35.3%

How do we test for Data Irregularities?

• Background: Baseline NAR (Nasal Airway resistance) measures are typically xx.x (e.g. 20.2), and are always based on the mean of 10 observations (5 from each nostril).

• What null hypothesis can we test to find potential irregularities? What P-value might we use to declare significance?

Baseline Last Digit (3rd significant digit)

Study 1: 0:2 1:4 2:2 3:6 4:2 5: 6:8 7:9 8:3 9:5

9 4 7 5 Study 2 5 2 1 10 3 4
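One candidate null hypothesis is that the last recorded digit is equally likely to be any of 0 through 9. A chi-square goodness-of-fit statistic against that uniform distribution can be sketched as below. The counts are hypothetical, not the study data, and the 5% critical value for 9 degrees of freedom (16.919) is used purely for illustration; with only about five expected observations per digit the chi-square approximation is rough.

```python
def last_digit_chi_square(counts):
    """Chi-square statistic for observed digit counts against a uniform distribution."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)

# Upper 5% point of the chi-square distribution with 9 degrees of freedom
CRITICAL_5PCT_DF9 = 16.919

# Hypothetical counts of last digits 0..9 (50 baseline readings)
hypothetical_counts = [12, 3, 4, 2, 10, 11, 3, 2, 1, 2]

stat = last_digit_chi_square(hypothetical_counts)
flagged = stat > CRITICAL_5PCT_DF9
print(f"chi-square = {stat:.1f}, exceeds 5% critical value: {flagged}")
```

A large statistic only says the digits are unlikely to be uniform. Whether that reflects rounding habits, instrument behavior, or something worse needs a separate look at how the data were recorded.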

• Thank You!!

Coronary Drug Project Data

Five-Year Mortality (Clofibrate)

• Compliers: 15.0% (15.7%) (N=708)

• Non-Compliers: 24.6% (22.5%) (N=357)

• Compliers took >80% of their meds to death or to 5 years, whichever was first.

• In () is 5-year mortality, adjusted for prognostic factors.

Coronary Drug Project

Five-Year Mortality (Placebo)

• Compliers: 15.1% (16.4%) (N=1813)

• Non-Compliers: 28.2% (25.8%) (N=882)

• Compliers took >80% of their meds to death or to 5 years, whichever was first.

• In () is 5-year mortality, adjusted for prognostic factors.

Coronary Drug Project

Five-Year Mortality (As randomized)

• Clofibrate: 20.0% (N=1103)

• Placebo: 20.9% (N=2789)

• NB: Compliance could not be assessed in a small number of patients.
