Transcript Document

Clinical Epidemiology

Needs Assessment (Pick 2/3) • Intent-to-treat analysis • Relative risk, odds ratio, risk ratio, NNT, ARR, RRR • Sample size estimation, study power, superiority, non-inferiority, equivalence

Dr. Jeff Mahon

I have no conflicts of interest.

Intent-to-Treat Analysis

In a randomised trial, all subjects randomised to an arm of the trial are analyzed in that arm no matter what happens.

Synonym: “intention-to-treat” analysis
~ Antonyms: “non-intent-to-treat” analysis, “per-protocol” analysis, “on-treatment” analysis, “modified intent-to-treat” analysis, “safety” analysis

Guidelines to Assess Therapy Articles

1. Are the results valid?
• Were patients randomised?
• Was randomisation concealed?
• Were patients analysed in the groups to which they were randomised?
• Were treatment and control groups similar re: known prognostic factors?
• Were patients aware of group allocation?
• Were clinicians aware of group allocation?
• Were outcome assessors aware of group allocation?
• Was follow-up complete?

Randomised? Blinding? ITT? Losses to FU?

2. What are the results?
• How large was the treatment effect?
• How precise was the estimate of the treatment effect?

3. Can, and should, I apply the results to my patients?
• Were the study patients similar to my patients?
• Were all clinically important outcomes considered?
• Are the likely treatment benefits worth the potential harm and costs?

Intent-to-Treat (ITT) Analysis

The Principle.

Subjects whose data are withdrawn from the group to which they were randomised, and before the outcome is assessed, may differ in ways that affect the risk for the outcome and/or how they respond to the test treatment. The result is BIAS: the risk of falsely concluding that the treatment works (or doesn’t work). ITT analysis preserves the main strength of randomisation (protection against confounding).

Terms for withdrawn subjects and their data can be a problem.

lost, lost to follow-up, withdrew, ineligible, dropped out/dropped in, non-complier, protocol violator, cross-over, missing data, censored

Follow the data (it’s not about the subject - it’s about the subject’s data)

Why are data withdrawn before the outcome is assessed?

• Subject not eligible in the first place
• Subject “lost”
  - completely disappears
  - refuses/unable to return
  - formally withdraws consent
• Subject stops originally assigned treatment
  - crosses over to alternate trial treatment
  - has a side-effect
  - has a competing event that’s not a trial outcome but requires withdrawal, eg. death, pregnancy
  - non-compliant
• Poor quality/missing data

Intent-to-Treat Analysis

Two Key Ideas

1. Effectiveness, not efficacy.

2. Be exactly sure what outcome is being assessed in the ITT analysis.

Effectiveness, Not Efficacy

• ITT rule is about the treatment in real-life (“effectiveness”)
• NOT about the treatment in a perfect world (“efficacy”)
• Clinicians are usually more interested in effectiveness

Effectiveness, Not Efficacy

Example
• RCT acarbose vs. placebo, 100 persons, type 2 diabetes
• Main outcome - A1C at 6 months after entry
• 50 allocated to acarbose
  - 15/50 gut grief - stop early
  - 35/50 go to 6 months
• 50 allocated to placebo, all of whom go to 6 months

Results
• A1: mean A1C level, acarbose 7.0% (35/50 subjects)
• A2: mean A1C level, acarbose 7.6% (50/50 subjects)
• Placebo: mean A1C level, placebo 8.3% (50/50 subjects)
• A1 vs. Placebo (p = 0.023)
• A2 vs. Placebo (p = 0.049)

Effectiveness, Not Efficacy

Mean A1C levels at 6 months:
• A1 vs. P (p = 0.023) Non-ITT analysis
• A2 vs. P (p = 0.049) ITT analysis

“In real-life (in which about 30% of patients who start acarbose have to stop because of gut-grief), acarbose resulted in a lower A1C level 6 months later.” (ITT analysis)

“The effect of acarbose on lowering the A1C level over 6 months may be even better in persons who tolerate the drug. However, we can’t rule out other factors in these persons (eg., more adherent?) that account for their better A1C level.” (non-ITT analysis)

Effectiveness, Not Efficacy

What if….

Mean A1C levels at 6 months:
• A1: Acarbose 7.0% (35/50 subjects)
• A2: Acarbose 8.0% (50/50 subjects)
• Placebo: 8.3% (50/50 subjects)
• A1 vs. Placebo (p = 0.023) Non-ITT analysis
• A2 vs. Placebo (p = 0.11) ITT analysis

“In real-life (in which about 30% of patients who start acarbose have to stop because of gut-grief), acarbose did not result in a lower A1C level 6 months later.” (ITT Analysis)
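The ITT and non-ITT results differ only in which subjects' data enter the acarbose mean. A minimal Python sketch of that distinction follows. The six records and the names `records` and `arm_means` are invented for illustration; they are not data from the example trial.

```python
from statistics import mean

# Hypothetical records (NOT trial data): assigned arm, whether the subject
# stayed on the assigned treatment, and A1C (%) measured at 6 months.
records = [
    ("acarbose", True, 6.9),
    ("acarbose", True, 7.1),
    ("acarbose", False, 8.4),  # stopped the drug early, outcome still collected
    ("placebo", True, 8.2),
    ("placebo", True, 8.3),
    ("placebo", True, 8.4),
]


def arm_means(rows, per_protocol=False):
    """Mean 6-month A1C per randomised arm.

    ITT (default): every randomised subject is analysed in the assigned arm.
    Per-protocol: subjects who stopped the assigned treatment are dropped.
    """
    by_arm = {}
    for arm, stayed_on_treatment, a1c in rows:
        if per_protocol and not stayed_on_treatment:
            continue
        by_arm.setdefault(arm, []).append(a1c)
    return {arm: mean(values) for arm, values in by_arm.items()}


itt = arm_means(records)
pp = arm_means(records, per_protocol=True)
print("ITT:", itt)
print("Per-protocol:", pp)
```

In this toy data set the per-protocol acarbose mean is lower than the ITT mean because the one subject who stopped the drug is left out, which is the same mechanism that separates A1 from A2.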

Intent-to-Treat Analysis

Be exactly sure what outcome is being assessed in the ITT analysis.

Clinical trials always assess multiple outcomes:
• benefit
• harm

Another way to think about these:
• Primary Outcome: outcome upon which the trial sample size was based (see “Sample size calculation” in Methods section)
• All-The-Rest: secondary/tertiary outcomes, adverse events, etc. etc. etc.

Intensive blood-glucose control with sulfonylureas or insulin compared with conventional treatment and risk of complications in patients with type 2 diabetes (UKPDS 33). UKPDS Study Group. Lancet 1998.

One or more of sudden death, death due to hyper/hypoglycemia, fatal/non-fatal MI, angina, heart failure, CVA, renal failure, amputation of ≥ 1 digit, vitreous bleed, laser therapy, blindness in 1 eye, and cataract surgery.

Multiple Hypothesis Testing

The Principle. The probability, by chance alone, that at least one statistical test will be “significant” goes up as you do more tests.

• “False discovery rate” (strictly, the family-wise error rate, assuming the tests are independent) in percentage given by $[1 - (1 - \alpha)^k] \times 100$, where α is the probability of a false positive result (in effect, the p value) and k is the number of tests
• For 10 statistical tests (k) and an α = 0.05: $[1 - (1 - 0.05)^{10}] \times 100 = 40\%$
• With 10 tests for significance, there’s a 40% chance that at least 1 test will be statistically significant (p < 0.05) by chance alone
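The arithmetic above is easy to check for any number of tests. A minimal Python sketch follows; it assumes the tests are independent, and the function name `prob_any_false_positive` is made up for this illustration.

```python
def prob_any_false_positive(alpha, k):
    """Probability that at least one of k independent tests is
    'significant' by chance alone when each is run at level alpha."""
    return 1 - (1 - alpha) ** k


for k in (1, 5, 10, 20):
    pct = 100 * prob_any_false_positive(0.05, k)
    print(f"{k:>2} tests at alpha = 0.05: {pct:.0f}% chance of at least one false positive")
```

With k = 10 this gives the 40% figure quoted above.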

This comes up all the time in clinical trials (and clinical research period)

• multiple endpoints, eg. all cause mortality, CV mortality, non-fatal MI, etc
• interim analysis for early stopping due to benefit, harm or futility
• analysis of effects within subgroups
• comparisons across more than 2 treatment arms
• comparison of baseline characteristics between 2 (or more) treatment groups

Was it really an ITT analysis?

Many possible outcomes combined with misinterpreting ITT principle = pseudo-ITT analyses

Acarbose RCT (again):
• primary outcome - A1C at 6 months
• acarbose group
  - 35/50 complete trial, give blood for A1C at 6 mos.
  - 15/50 stop drug, give blood for A1C on last day drug taken (range: days 45 to 62) and also at 6 mos.
• placebo group: 50/50 complete trial including A1C at 6 months

A1: mean A1C at 6 mos. acarbose group (35/50 subjects) = 7.0%
A2: mean A1C at 6 mos. acarbose group (50/50 subjects) = 8.0%
A3: mean A1C at last test (2 - 6 mos) acarbose group (50/50 subjects) = 7.1%
Placebo: mean A1C level at 6 mos. in placebo (50/50 subjects) = 8.3%

A3 vs. Placebo: p = 0.032 (Is this really an ITT analysis?)

Was it really an ITT analysis?

STEP 1. Be sure what the primary outcome was:
• Best option - variable used to determine the sample size
• See Methods section - “sample size calculation”

STEP 2. Find Figure 1 (“Subject Enrollment and Followup”)
• Tracks what happened to all randomised subjects, esp. in terms of ascertainment of the primary outcome

Theophylline for irreversible chronic airflow limitation - a randomized study comparing n of 1 trials to standard practice. Chest 1999.

Outcome Measures and Sample Size

“A sample of 80 patients was chosen based on an α error of 5%, 80% power, 20% loss-to-followup, and the CRQ Physical and Emotional Function Scores, and the 6-minute walking distance as the primary outcomes. The minimal important differences were changes over 1 year of 4.5 for the Physical Function Score, 5.5 for the Emotional Function score, and 30 meters for the 6-minute walk.”

Participant Flow and Follow-up (typical “Figure 1”)

Intent-to-Treat Analysis

Can the ITT rule be waived? On occasion, yes.

Always trial specific:
• depends exactly on question the trial hopes to answer

Easier to justify if:
• rule to waive ITT determined a priori
• reason to exclude subjects occurred pre-randomisation

Waiving ITT Rule - Example

Time-sensitive Entry Criteria:
• RCT of two different infant formulae to prevent type 1 diabetes in high-risk infants (TRIGR)
• “High-risk” based on HLA typing in cord blood which takes about a week
• Exposure to one or the other test formulae in the first 7 days of life may affect diabetes risk
• Therefore, randomise all potentially eligible babies at birth, test HLA, and withdraw non-HLA eligible infants later

Hydrolyzed formula and early (beta)-cell autoimmunity: a randomized clinical trial. The TRIGR Study Group. JAMA 2014.

Waiving ITT Rule - Example

“Bomb Proof” argument that non-ITT will not be biased:
• RCT of coumadin vs. placebo in patients with chronic non-valvular atrial fibrillation
• Primary outcome - CVA (ischemic or hemorrhagic) and peripheral embolism
• A priori rule - primary outcomes occurring ≥ 28 days after stopping placebo or coumadin not counted
• Rationale - no biological reason for coumadin to exert anticoagulating effects 28 days after it’s stopped

Intent-To-Treat - Bottom Lines

• There are always patients who stop the test drug early, disappear, etc. before the main outcome is assessed. The fewer the better, but the study isn’t automatically ruined when it happens.
• ITT analysis preserves the main advantage of randomisation.
• ITT analysis gives a better indication of the treatment’s “effectiveness” - what happens in real-life - whereas non-ITT (“per-protocol”) indicates the treatment’s “efficacy” - what happens in the ideal world.
• ITT is conservative - tends to make it harder to conclude that new Rx really works.
• If Rx “wins” in an ITT, it’s reassuring - likely only better in non-ITT analysis.
• Look for analyses both ways. Concordant results (Rx wins in ITT and non-ITT analyses) are reassuring. For discordant results, usually safer to favor ITT results.

Relative Risk, Risk Ratio, Odds Ratio, RRR, ARR, and NNT

Measures of association that apply to dichotomous health outcomes (dead/alive, DM/no DM, MI/no MI, etc)

Relative Risk (RR)
• Probability of an event in exposed persons divided by the probability of event in unexposed persons
• In Epidemiology, synonymous with “Risk Ratio”

Odds Ratio (OR)
• Odds is the probability of an event occurring divided by the probability of an event not occurring
• An odds ratio is the ratio of two odds

Relative Risk, Odds Ratio, Risk Ratio

Relative Risk/Risk Ratio and Odds Ratio:
• Are nearly identical if the risk of the outcome is low (< 5% in exposed and unexposed groups)
• “Relative Risk” is often therefore used to refer to any of the three

Relative Risk vs. Odds Ratio

An RCT (or a prospective cohort study):
• Outcome - fatal MI
• Exposure - atorvastatin
• 4000 persons - half get atorvastatin and half don’t
• Overall incidence of fatal MI = 150/4000 = 3.75%

                Atorvastatin
Fatal MI        Yes       No        Total
Yes             50        100       150
No              1950      1900      3850
Total           2000      2000      4000

Relative Risk vs. Odds Ratio

Relative risk for fatal MI = (50/2000)/(100/2000) = 0.50

$$\text{Odds ratio for fatal MI} = \frac{\text{prob(MI)} / \text{prob(no MI) in atorvastatin group}}{\text{prob(MI)} / \text{prob(no MI) in no atorvastatin group}} = \frac{(50/2000)/(1950/2000)}{(100/2000)/(1900/2000)} = 0.49$$

Odds Ratios by the Cross Product

                Atorvastatin
Fatal MI        Yes         No          Total
Yes             50 (a)      100 (c)     150
No              1950 (b)    1900 (d)    3850
Total           2000        2000        4000

OR given by ad/bc = (50 x 1900)/(1950 x 100) = 0.49

Absolute Risk Reduction, Relative Risk Reduction, and Number-Needed-To-Treat

                Atorvastatin
Fatal MI        Yes       No        Total
Yes             50        100       150
No              1950      1900      3850
Total           2000      2000      4000

RRR = (1 – RR) = 1 – 0.5 = 0.5 (100) = 50%
ARR = 100/2000 – 50/2000 = 0.025 (100) = 2.5%
NNT = 1/ARR = 100/2.5 = 40

“Atorvastatin cut the risk for fatal MIs in half.”
“Atorvastatin resulted in 2.5% fewer fatal MIs.”
“I’ve got to Rx 40 persons with atorvastatin for x yrs to prevent 1 fatal MI.”
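All five measures come from the same four cell counts. A minimal Python sketch using the atorvastatin table follows. The function name `risk_measures` and its argument names are made up for this illustration, and it assumes the treated group has the lower risk (ARR above zero).

```python
def risk_measures(events_treated, n_treated, events_control, n_control):
    """RR, OR, RRR, ARR and NNT from a 2x2 table of a dichotomous outcome."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    odds_treated = events_treated / (n_treated - events_treated)
    odds_control = events_control / (n_control - events_control)
    rr = risk_treated / risk_control
    arr = risk_control - risk_treated  # assumed > 0, otherwise NNT is undefined
    return {
        "RR": rr,
        "OR": odds_treated / odds_control,
        "RRR": 1 - rr,
        "ARR": arr,
        "NNT": 1 / arr,
    }


# Atorvastatin example: 50/2000 fatal MIs on drug vs. 100/2000 off drug
result = risk_measures(50, 2000, 100, 2000)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The printed values should match the hand calculations above: RR 0.50, OR 0.49, RRR 50%, ARR 2.5%, NNT 40.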

The Heart Protection Study Collaborative Group*

Cause of death          Simvastatin (N = 10,269)   Placebo (N = 10,267)   p
Vascular causes
  Coronary              587 (5.7%)                 707 (6.9%)
  Other vascular        194 (1.9%)                 230 (2.2%)
  Any vascular          781 (7.6%)                 937 (9.1%)             p < 0.0001
Non-vascular causes
  Neoplastic            359 (3.5%)                 345 (3.4%)
  Respiratory           90 (0.9%)                  114 (1.1%)
  Other medical         82 (0.8%)                  90 (0.9%)
  Non-medical           16 (0.2%)                  21 (0.2%)
  Any non-vascular      547 (5.3%)                 570 (5.6%)             p = 0.4
Any death               1328 (12.9%)               1507 (14.7%)           p < 0.0003

* Lancet 2002; 360:7-22

The NNT helps clinicians and patients weigh the treatment’s effort (risks, costs) against its benefit.

HPS Lancet 2002
• a really good RCT with internally valid results
• placebo deaths = 1507/10267 = 14.7%
• simvastatin deaths = 1328/10269 = 12.9%
• p value death simvastatin vs. placebo < 0.0003
• mean followup 5 years

RR for death from simvastatin exposure = 12.9/14.7 = 0.88
RRR = 1 – 0.88 = 12%
ARR death = 14.7/100 – 12.9/100 = 1.8%
NNT 5yr death = 100/1.8 = 56
NNT 5yr vascular death = 100/1.5 = 67

“I need to treat 56 persons with simvastatin for 5 yrs to prevent 1 death.”
1. Good deal? (you and the patient decide)
2. What about the other 55 patients who get simvastatin?
3. What’s the NNT over 10 years?

Relative Risk vs. Odds Ratio

An RCT (or a prospective cohort study):
• Outcome - fatal MI
• Exposure - atorvastatin
• 4000 persons - half get atorvastatin and half do not
• Incidence of fatal MI = 1500/4000 = 37.5%

                Atorvastatin
Fatal MI        Yes       No        Total
Yes             500       1000      1500
No              1500      1000      2500
Total           2000      2000      4000

Relative Risk vs. Odds Ratio

Relative risk (risk ratio) for fatal MI = (500/2000)/(1000/2000) = 0.50

Odds ratio for fatal MI = ad/bc = (500)(1000) / (1500)(1000) = 0.33

RRR = (1 – RR) = 1 – 0.5 = 0.5 (100) = 50%
ARR = 1000/2000 – 500/2000 = 0.25 (100) = 25%
NNT = 1/ARR = 100/25 = 4

“Atorvastatin cut the risk for fatal MIs in half.”
“Atorvastatin resulted in 25% fewer fatal MIs.”
“I’ve got to Rx 4 persons with atorvastatin for x yrs to prevent 1 fatal MI.”
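The two atorvastatin examples share the same relative risk (0.50) but give different odds ratios because the baseline risk differs. A minimal Python sketch of how the OR drifts away from the RR as the control-group risk rises is shown here. The function name `odds_ratio_from_rr` and the list of control risks are chosen for this illustration only.

```python
def odds_ratio_from_rr(rr, control_risk):
    """Odds ratio implied by a given relative risk and control-group risk."""
    treated_risk = rr * control_risk
    odds_treated = treated_risk / (1 - treated_risk)
    odds_control = control_risk / (1 - control_risk)
    return odds_treated / odds_control


# Hold RR fixed at 0.50 and vary the risk in the unexposed group
for control_risk in (0.01, 0.05, 0.20, 0.50):
    odds_ratio = odds_ratio_from_rr(0.5, control_risk)
    print(f"control risk {control_risk:.0%}: RR = 0.50, OR = {odds_ratio:.2f}")
```

A control risk of 5% corresponds to the first atorvastatin table (OR 0.49) and 50% to the second (OR 0.33).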

Effect of Simvastatin on First Major Vascular Event (fatal/non-fatal MI/CVA or revascularization) in Patients with Diabetes*

Subgroup                  Simvastatin          Placebo              p
Diabetes, prior CHD       325/972 (33.4%)      381/1009 (37.8%)     p < 0.01
Diabetes, no prior CHD    276/2006 (13.8%)     367/1976 (18.6%)     p < 0.001

NNT 5 yrs to prevent 1 major vascular event in persons with diabetes without CHD ~ 21 (100/4.8)
NNT 5 yrs to prevent 1 major vascular event in persons with diabetes with CHD ~ 23 (100/4.4)

* Heart Protection Study Group. Lancet 2002; 360:7-22

Relative Risk, Odds Ratio, RRR, ARR, NNT, GHG, etc

For the average lazy clinical reader (such as myself…)
• In clinical trials, including RCTs, the RR and OR aren’t that useful – the NNT, ARR, and to a lesser degree the RRR, are more informative
• ORs turn up in prospective cohort studies – when they do, they’re usually a good approximation for the RR (but double check the risk for the outcome)
• ORs are needed for certain statistical procedures (eg. logistic regression) and are the only option to assess risk in case-control studies.

Intensive vs. Conventional Glucose Control in Critically Ill Patients. The NICE-SUGAR Study Investigators. NEJM 2009.

Primary outcome - death @ 90 days
Control group: 751/3012 (24.9%)
Intensive group: 829/3010 (27.5%)
p = 0.02

OR = (829 x 2261)/(751 x 2181) = 1.14 (95% CI 1.02 – 1.28)
RR = (829/3010)/(751/3012) = 1.10
RRR/I = [(829/3010) – (751/3012)] / (751/3012) = 10.4% increase in mortality
ARR/I = 829/3010 – 751/3012 = 2.6%
NNH (harm) = 100/2.6 = 38

I need to Rx 38 critically ill persons with intensive control to kill one within 90 days…
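The slide quotes a 95% CI for the odds ratio without saying how it was obtained. One common approximation works on the log odds ratio (the Woolf, or logit, method), and a minimal Python sketch of it follows. Using this method here is an assumption for illustration, not necessarily what the NICE-SUGAR investigators did, and the function name `odds_ratio_ci` is invented.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% CI on the log scale.

    a, b = events and non-events in the exposed group
    c, d = events and non-events in the unexposed group
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Intensive control: 829 deaths, 3010 - 829 = 2181 survivors
# Conventional control: 751 deaths, 3012 - 751 = 2261 survivors
odds_ratio, lower, upper = odds_ratio_ci(829, 2181, 751, 2261)
print(f"OR = {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Under this approximation the interval agrees with the one quoted on the slide to two decimal places.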

Some Really Fundamental Epidemiology Rules (That Really Don’t Matter to Clinicians)

RISK = the probability that an event will occur

RATE = the frequency at which an event occurs over time

RATIO = the value obtained by dividing one quantity by another

PROPORTION = a sub-type of a ratio in which the numerator is included in the population defined by the denominator

RELATIVE RISK = the ratio of the risk among exposed persons to the risk among un-exposed persons (syn: RISK RATIO)

ODDS = the probability that an event occurs divided by the probability that it does not occur

ODDS RATIO = the ratio of two odds, eg., the ratio of the odds among exposed persons to the odds among un-exposed persons

RATE RATIO = the ratio of the rates among exposed persons to un-exposed persons

HAZARD RATE = theoretical measure of the risk of an event at a point in time, t, defined as the limit, as ∆t approaches zero, of the probability that a person well at time t will experience the event by (t + ∆t), divided by ∆t

HAZARD RATIO = the ratio of two hazard rates, eg. exposed vs. un-exposed persons (“relative risk in time-to-event analyses”)

95% CL on an NNT is based on:

$$(P_c - P_e) \pm 1.96 \sqrt{\frac{P_c (1 - P_c)}{N_c} + \frac{P_e (1 - P_e)}{N_e}}$$
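The expression above gives confidence limits for the ARR, where Pc and Pe are the event proportions in the control and experimental groups and Nc and Ne are the group sizes. Taking reciprocals of those limits gives limits for the NNT. A minimal Python sketch using the all-cause death counts from the Heart Protection Study table follows. The function name `arr_confidence_interval` is invented, and the reciprocal step only makes sense when the ARR interval excludes zero.

```python
import math


def arr_confidence_interval(events_c, n_c, events_e, n_e, z=1.96):
    """ARR = Pc - Pe with an approximate 95% confidence interval."""
    p_c = events_c / n_c
    p_e = events_e / n_e
    arr = p_c - p_e
    se = math.sqrt(p_c * (1 - p_c) / n_c + p_e * (1 - p_e) / n_e)
    return arr, arr - z * se, arr + z * se


# HPS all-cause deaths: placebo 1507/10267, simvastatin 1328/10269
arr, arr_low, arr_high = arr_confidence_interval(1507, 10267, 1328, 10269)

# The larger ARR limit gives the smaller NNT limit, and vice versa
nnt, nnt_low, nnt_high = 1 / arr, 1 / arr_high, 1 / arr_low
print(f"ARR = {arr:.4f} (95% CL {arr_low:.4f} to {arr_high:.4f})")
print(f"NNT = {nnt:.0f} (95% CL {nnt_low:.0f} to {nnt_high:.0f})")
```

The point estimate here uses unrounded proportions, so it can differ by about one from an NNT computed from percentages rounded to one decimal place.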

Sample Size, Statistical Power

An RCT of A vs B

                                The Truth
Trial Conclusion                A is better than B      A is not better than B
A is better than B              TP “power”              FP “type 1 error”
A is not better than B          FN “type 2 error”       TN

TP = true positive conclusion
FP = false positive conclusion
TN = true negative conclusion
FN = false negative conclusion

Probability of a type 1 (FP or α) error = 5% (or p = 0.05)
Probability of a type 2 (FN or β) error = 10 to 20%
Power = 1 - (β) error = 80 to 90% (or the probability of a TP result)

Sample Size, Statistical Power

α error probability: probability of concluding there is a real difference* when there really is not a difference (false pos)

β error probability: probability of concluding there is not a real difference* when there really is a difference (false neg)

Power (1 - β error): probability of concluding there is a real difference* when there really is a difference (true pos)

* A difference greater than you’d expect by chance alone where the investigator gets to decide what she/he means by “chance alone”. Most of the time, it’s less often than 5 in 100 times (or p = 0.05).

Sample Size, Statistical Power

• Type 1 (α or false positive) error probability:
  - Set by convention at 5%, though it can be (and sometimes is) set lower (1%) or higher (10%)
  - The only thing that directly affects it is the investigator
• Type 2 (β or false negative) error probability (and its complement, power):
  - Set by convention at ~ 10 to 20% (so power ~ 80 to 90%)
  - Directly depends on the size of the difference between the two groups that is of interest (the “effect size”)
  - Directly depends on the total number of subjects and/or events of interest in the trial (the “sample size”)

Sample Size and Statistical Power

Study sample size estimation
• Derive the sample size:
• In effect, recreates, before starting the trial, the main statistical analysis that’s going to be done when the study’s finished

Example:
• Drug A (experimental group) vs. placebo to prevent T2DM
• Inputs:
  - α-error = 5% (Zα = 1.96), β-error = 10% (Zβ = 1.28)
  - event rate (eg. risk for DM/5yrs) in controls = 30% (PC)
  - event rate in experimental group = 20% (PE)
  - effect size (PC – PE) or Δ = 30% - 20% = 10%

$$N/\text{group} = \frac{(Z_\alpha + Z_\beta)^2 \, [P_E (1 - P_E) + P_C (1 - P_C)]}{(P_C - P_E)^2}$$

= 389/group, or ~ 780 overall
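The calculation above can be reproduced in a few lines. A minimal Python sketch follows; the function name `sample_size_per_group` is invented, a two-sided α is assumed (which is what Zα = 1.96 implies), and the Z values are taken from the standard normal distribution rather than typed in.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p_c, p_e, alpha=0.05, power=0.90):
    """Subjects per group needed to compare two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 1.28 for power = 0.90
    variance_sum = p_e * (1 - p_e) + p_c * (1 - p_c)
    n = (z_alpha + z_beta) ** 2 * variance_sum / (p_c - p_e) ** 2
    return math.ceil(n)  # round up to whole subjects


# Drug A vs. placebo: 30% event rate in controls, 20% in the experimental group
n_per_group = sample_size_per_group(p_c=0.30, p_e=0.20)
print(n_per_group, "per group,", 2 * n_per_group, "overall")
```

Halving the effect size (eg. 30% vs. 25%) roughly quadruples the required sample size, because the effect size is squared in the denominator.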

Sample Size and Statistical Power

1. The primary outcome of a study (eg. a randomized trial) is the outcome upon which the pre-study sample size was based. Look for something in the Methods that describes the primary outcome (“Sample Size Calculation”).
2. Above everything else, the value of the sample size calculation is that it declares the study’s primary hypothesis. This helps readers (you) not to get confused by the problem of multiple hypothesis testing.
3. The effect size (Δ) is the hardest thing to define, but is also the variable that clinicians are the best at assessing. One way to think about it is “What’s the smallest effect size that would make me want to use the new treatment?”

Sample Size and Statistical Power

Some simple rules to get you started. All other things being equal:
• the larger the study, the greater the power
• the larger the study, the smaller the effect size that can be detected
• if the study is “negative” (ie. statistically significant difference not seen), ask yourself “Did the study have enough power (ie. a big enough sample size) to detect a clinically important difference?”
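The first rule can be checked numerically by turning the two-proportion sample size formula around and solving for power. A minimal Python sketch follows, assuming the same normal approximation and a two-sided α of 5%. The function name `power_two_proportions` and the event rates (30% vs. 20%) are illustrative choices.

```python
import math
from statistics import NormalDist


def power_two_proportions(n_per_group, p_c, p_e, alpha=0.05):
    """Approximate power to detect p_c vs. p_e with n_per_group subjects per arm."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    spread = math.sqrt(p_c * (1 - p_c) + p_e * (1 - p_e))
    z_beta = abs(p_c - p_e) * math.sqrt(n_per_group) / spread - z_alpha
    return NormalDist().cdf(z_beta)


for n in (100, 200, 389, 800):
    print(f"{n} per group: power = {power_two_proportions(n, 0.30, 0.20):.2f}")
```

A “negative” trial with 100 per group would have had well under 50% power for this effect size, which is the situation the last rule warns about.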

Risk of cardiovascular events and rofecoxib: cumulative meta-analysis. Juni et al. Lancet 2004.