Cases - SACEMA


Advanced Epi, August 15-19th, SACEMA 2011

Matthew Fox
Boston University
Center for Global Health and Development
Department of Epidemiology
Health Economics and Epidemiology Research Office

Introductions

Who are you?

Where do you work/study?

What do you study?

Welcome

About me

Week long short course on epi methods

 2 sessions/day, each about 3 hours (depending)
 Assumes intro/intermediate epi, practical experience with epi and stats

Mix of lecture and discussion

 Too much material, take good notes, go back to them 

Finish mid-day on Friday

Course works if you read and participate

Course Overview

Review basic epidemiologic principles

 Reinterpret them in a new light 

Think through problems/implications of what we learned in intro/intermed epi

 Develop causal framework(s) on which to hang our epidemiologic thinking

Learn/apply advanced epi methods

Modern Epidemiology III

Questions for Today

 

What is epidemiology, what is its goal?

   

What are measures of association and measures of effect?

What do these measures really mean?

Which ones have causal meanings?

What is the odds ratio really about? Why does everyone use it?

The goal of epidemiologic research

Epidemiology is study of:

 The distribution and determinants of disease in human populations and the application of that knowledge to the control of disease 

But the goal is:

 To obtain a valid and precise (and generalizable) estimate of the effect of an exposure on a disease
 Validity is the opposite of bias, precision is the opposite of random error
 Fundamentally concerned with measurement

Anyone remember Type I and Type II error?

What are they?

Basic Statistics

                        Truth: Effect           Truth: No effect
Our study: Effect       Correct                 Type I error (alpha)
Our study: No effect    Type II error (beta)    Correct

Type I: If there is truly no effect, what is the chance that we reject the null?

Type II: If there truly is an effect, what is the chance that we fail to reject the null?

How do we know a particular epidemiologic finding is true?

Find that the relative risk of exposure to vitamin # on cancer @ is 2.5, p=0.049

Assume we did the perfect study

 No bias (confounding, selection, information)
 80% power, alpha = 0.05

What is the chance there is really no effect of vitamins on cancer?

 i.e. True relative risk is 1

Syphilis testing in the US

In US pre-2005, Massachusetts required a syphilis test before marriage

 Assume the test was 95% sensitive and 95% specific

If I test positive, how likely is it that I truly have syphilis?

 Answer is that it depends

Syphilis

Se = 95%, Sp = 95%, prevalence = 1%

            Truth +    Truth -     Total
Test +           95        495       590
Test -            5       9405      9410
Total           100       9900    10,000

PPV = 95/590 = 16%
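The "it depends" answer can be made concrete with a short calculation. The sketch below is illustrative only: the function name `ppv` and the list of prevalences are assumptions, not part of the slides. It applies Bayes' rule to the sensitivity, specificity and prevalence from the table.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value: P(truly diseased | test positive)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Same test (Se = Sp = 95%), different prevalences (illustrative values).
for prevalence in (0.001, 0.01, 0.10, 0.50):
    value = ppv(0.95, 0.95, prevalence)
    print(f"prevalence {prevalence:.1%}: PPV = {value:.1%}")
```

At the 1% prevalence used in the table this reproduces 95/590, about 16%. The same test gives a much higher PPV when the condition is common.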

Back to our study

                        Truth: Effect           Truth: No effect
Our study: Effect       Correct                 Type I error (alpha)
Our study: No effect    Type II error (beta)    Correct

Alpha and beta use the TRUTH as the denominator and so are like Se and Sp

Back to our study

Judging the “correctness” of a single study is the PPV, and depends on the prevalence of true hypotheses

Back to our study

alpha = 5% (Sp 95%), beta = 5% (Se 95%), prevalence of true hypotheses = 10%

                 Truth +    Truth -     Total
Our study +          950        450      1400
Our study -           50       8550      8600
Total               1000       9000    10,000

950/1400 = 68% chance our study is right
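The same arithmetic can be written as a function of alpha, power and the prior prevalence of true hypotheses. This is a minimal sketch: the function name `prob_finding_true` and the list of priors are illustrative assumptions. It treats a "significant" result like a positive test, with power playing the role of sensitivity and 1 - alpha the role of specificity.

```python
def prob_finding_true(alpha, power, prior):
    """P(effect is real | study rejects the null), by Bayes' rule.

    prior is the assumed proportion of tested hypotheses that are true.
    """
    true_pos = power * prior
    false_pos = alpha * (1 - prior)
    return true_pos / (true_pos + false_pos)


# The table above: alpha = 5%, power = 95% (beta = 5%), prior = 10%.
print(f"{prob_finding_true(0.05, 0.95, 0.10):.0%}")

# The "perfect study" posed earlier had 80% power; the answer still
# depends on the prior, which is rarely known.
for prior in (0.01, 0.10, 0.50):
    print(f"prior {prior:.0%}: {prob_finding_true(0.05, 0.80, prior):.0%}")
```

The first line reproduces the 68% in the table. The remaining lines show how strongly the answer moves with the assumed prior, which is the point of the slide.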

Take home message: We need to critically examine the way we have been taught to design and interpret epidemiologic research

Review of basic concepts

Study design, measures of disease frequency, measures of effect/association

The Source Population

The population that gives rise to cases

It is defined:

 In time and place
 With respect to population characteristics
 With respect to external influences (modifiers)
 Not as a sample of the general population

Cohorts

Membership in a cohort requires a person meet admissibility criteria

 Have common admissibility-defining events 

Membership begins once the temporally last criterion is met

 Once a member, a person never leaves (membership is static or closed)
 A closed cohort adds no new members and loses members only to death; an open cohort is adding new members

Dynamic population

Membership requires a person satisfy the membership status criteria

 They have common admissibility-defining characteristics 

Membership exists so long as all of the status criteria are satisfied

A person can enter a dynamic population, leave it, and then re-enter

Cohorts vs. Dynamic Populations

Framingham heart study

Cohort

– the admissibility criterion is enrolling in the study in 1948. Members never leave the cohort once they enroll.

Dynamic population

– could have instead studied all residents of Framingham from 1948 onwards, the catchment population for a case registry there. Some will leave, new people will join.

STUDY DESIGN: How to harvest information from the base

Census (cohort) or Sample (case-control)

Cases are valuable (information rich)

 In SE calcs, these drive your standard error
 Ex. SE(ln(RR)) = sqrt(1/A - 1/N_1 + 1/B - 1/N_0)
 Include all the cases in the population

Information density of population that gave rise to cases is not great

 Can include all or sample
 Nearly all of the base's information is harvested when the sample of the base is a small multiple of the cases
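A rough numerical sketch of why the cases drive precision. The counts are hypothetical and the base is assumed to be 50% exposed. The cohort formula is the one quoted above. For the sampled design the sketch uses the usual Woolf approximation SE(ln(OR)) = sqrt(1/A + 1/B + 1/C + 1/D), which is an assumption added here and is reasonable when the disease is rare.

```python
import math


def se_log_rr(a, n1, b, n0):
    """SE of ln(RR) in a cohort: a of n1 exposed and b of n0 unexposed become cases."""
    return math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n0)


def se_log_or(a, b, c, d):
    """Woolf SE of ln(OR): a, b are exposed/unexposed cases; c, d exposed/unexposed controls."""
    return math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)


# Hypothetical study: 40 exposed cases and 20 unexposed cases.
a, b = 40, 20

# Census of a large base (50,000 per exposure group).
print("census of base:", round(se_log_rr(a, 50_000, b, 50_000), 3))

# Sample k controls per case instead, split evenly by exposure (assumed).
for k in (1, 2, 4, 10, 100):
    controls = k * (a + b)
    c = d = controls / 2
    print(f"{k:>3} controls per case:", round(se_log_or(a, b, c, d), 3))
```

The 1/A and 1/B terms set a floor on the standard error. Adding controls beyond a small multiple of the cases moves the result only slightly closer to what a census of the base would give.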

Which is the best measure to assess causal effects?

1) Risk Difference
2) Risk Ratio
3) Odds Ratio

In a case-control study, from what population do we sample controls?

1) Those with disease
2) Those without disease
3) Everyone, regardless of whether they have the disease

Cohort Study

Case-control Study

Kramer and Boivin 1987

 “We define a cohort study as a study in which subjects are followed forward from exposure to outcome… Inferential reasoning is from cause to effect. In case control studies, the directionality is the reverse. Study subjects are investigated backwards from outcome to exposure, and the reasoning is from effect to cause.”

Cohort Study: Relative Risks

                  Cases    Non-cases    Total
Index (E+)          A          C         N_1
Reference (E-)      B          D         N_0

Relative risk: (A/N_1) / (B/N_0)

 Risk in exposed / risk in unexposed
 Risk is number of cases / total at risk
 Numerator is number of cases

Denominator is cases and controls!

Cohort Concept

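[Figure: at t_0 the cohort contains N_E+ exposed and N_E- unexposed people. Over follow-up, A exposed cases and B unexposed cases occur, leaving C = (N_E+ - A) exposed and D = (N_E- - B) unexposed non-cases.]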

Cohort Study: Relative Risks

                  Cases    Non-cases    Total
Index (E+)          A          C         N_1
Reference (E-)      B          D         N_0

Relative risk:

 (A/N_1)/(B/N_0) can be rearranged as (A/B)/(N_1/N_0)
 A/B is ratio of exposed to unexposed cases
 N_1/N_0 is ratio of exposed to unexposed in population

Relative risk has meaning: average increase in risk produced by exposure

Case-control: Cases

Members of population who develop disease over the follow-up period

 Same cases as the analogous cohort study
 Case ascertainment is influenced by design
 Primary base: population defined first
 Secondary base: cases defined first

Case-control: Controls

A sample of the population experience that gave rise to the cases

3 options (paradigms)

 Un-diseased experience
 Population at risk at beginning of the study
 Population experience over follow-up

             0 mos    6 mos    12 mos    18 mos    24 mos
Cases           0        5        10        15        20
Non-cases     100       95        90        85        80

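Case-control Concept

[Figure: the cohort diagram starting at t_0, with three options for sampling controls. Option 1 (cumulative): sample the non-cases at the end of follow-up, C = (N_E+ - A) and D = (N_E- - B). Option 2 (case-cohort): sample the population at risk at t_0, N_E+ and N_E-. Option 3 (density sampling): sample the population experience over follow-up.]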

Case-control study

             Cases    Controls
Index          A         C
Reference      B         D

Now we can’t estimate the risks A/N_1 and B/N_0 because we don’t know the denominators

Left with an odds ratio

 But how to interpret?

2 ways to calculate an OR

             Cases    Controls
Index          A         C
Reference      B         D

Cross product ratio:

 (A*D)/(B*C)
 Not particularly meaningful, but it works

2 ways to calculate an OR

             Cases    Controls
Index          A         C
Reference      B         D

Case ratio/base ratio:

 (A/B) / (C/D)
 A/B is the ratio of exposed to unexposed cases
 C/D is the ratio of exposed to unexposed controls
 Remember back to Relative Risk
 Here C/D fills in for N_1/N_0

The trohoc fallacy

             Cases    Non-cases    Total
Index         400        600        1000
Reference     100        900        1000

RR = (400/1000) / (100/1000) = 4.0

10% sample of non-cases:

             Cases    Controls
Index         400        60
Reference     100        90

OR = (400/60) / (100/90) = 6.0

The trohoc fallacy is the idea that a case-control study is a cohort study done backwards (heteropalindrome)

Requires a rare disease assumption for the odds ratio to approximate the relative risk

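Case-control Concept

[Figure: the cohort diagram starting at t_0. Option 1 (cumulative) samples the non-cases at the end of follow-up, C = (N_E+ - A) and D = (N_E- - B). Option 2 (case-cohort) samples the population at risk at t_0, N_E+ and N_E-.]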

10% sample of population that gave rise to cases

The trohoc fallacy revealed

             Cases    Non-cases    Total
Index         400        600        1000
Reference     100        900        1000

RR = (400/1000) / (100/1000) = 4.0

Controls sampled from the total population (non-cases are not sampled separately):

             Cases    Controls
Index         400       100
Reference     100       100

OR = (400/100) / (100/100) = 4.0

Sample the total population that gave rise to the cases (which includes the cases), not the undiseased at the end of follow-up

 Cases can be their own controls if randomly sampled 

Requires no rare disease assumption
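The two sampling schemes can be compared side by side with the numbers from the slides. This is a sketch that uses expected control counts, so it ignores sampling variability. The function and variable names are illustrative.

```python
def risk_ratio(a, n1, b, n0):
    """RR from exposed cases a of n1 and unexposed cases b of n0."""
    return (a / n1) / (b / n0)


def odds_ratio(a, b, c, d):
    """Case ratio (a/b) divided by control ratio (c/d)."""
    return (a / b) / (c / d)


# Full cohort from the slides.
A, N1 = 400, 1000   # exposed cases, exposed total
B, N0 = 100, 1000   # unexposed cases, unexposed total
f = 0.10            # sampling fraction for controls

rr = risk_ratio(A, N1, B, N0)

# Option 1 (cumulative): controls are 10% of the non-cases at the end.
or_cumulative = odds_ratio(A, B, f * (N1 - A), f * (N0 - B))

# Option 2 (case-cohort): controls are 10% of everyone at risk at t_0.
or_case_cohort = odds_ratio(A, B, f * N1, f * N0)

print(f"RR = {rr:.1f}")
print(f"OR, controls from non-cases = {or_cumulative:.1f}")
print(f"OR, controls from the base  = {or_case_cohort:.1f}")
```

With this common disease, sampling non-cases gives an odds ratio of 6.0, while sampling the base returns the relative risk of 4.0. The sampling fraction cancels in both cases, so only the choice of what to sample matters.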

Miettinen on the trohoc fallacy

 

“Consider the clinical trial: the concern is, as always, to contrast categories of treatment as to subsequent occurrence of some outcome phenomenon, whereas comparing different categories of the outcome as to the antecedent distribution of treatment is uninteresting if not downright perverse.”

Miettinen preferred terms like “case-referent” and “case-base” studies, as “the base sample is no more a control series than a census of the base is”

Why it works

OR = [A*D] / [B*C] = [A/B] / [C/D]

 If we sample 10% of the base then the odds ratio is: 

OR = [A/B] / [(10%*N_1) / (10%*N_0)]

   = [A/B] / (N_1/N_0) = RR

             Cases    Non-cases    Total
Index          A          C         N_1
Reference      B          D         N_0

Cohort studies exclude those who are not at risk for disease (though they don’t need to). In a case-control study, should we exclude those not at risk for exposure? Ex. In a study of hormonal contraception and heart disease, should we exclude nuns?

With appropriate sampling, odds ratio is interpreted as estimate of relative risk, which has meaning. Case control studies are cohort studies done efficiently, not cohort studies done backwards.

Measures of Disease Frequency

Provide an estimate of the occurrence of disease in a population

 Typically we study first occurrence as later occurrences are often affected by first 

Incorporates:

 Disease state
 Time
 Population definition

Measures of Disease Frequency

Prevalence:

 Proportion of population with disease at a particular time
 Cross-sectional
 Reflects rate of disease occurrence and survival with disease

Measures of Disease Frequency

Cumulative Incidence (Simple)

 Proportion of a population that develops disease over a follow-up period
 Also called incidence proportion or risk
 Bounded by 0 and 1
 Time not part of measure but must report
 Difficult to measure in dynamic populations

CI(t_0, t) = I(t_0, t) / N_0

Measures of Disease Frequency

Incidence rate (density)

 Number of newly developed cases divided by accumulated person time
 Time is part of the denominator
 Can be used in dynamic populations/cohorts
 Ignores distinction between individuals (2/100 py could be 2 followed 50 yrs each, both get event, or 100 followed 1 yr each, 2 get event)

IR(t_0, t) = I(t_0, t) / PT, where PT = \sum_{i=1}^{N} t_i or PT = N * t
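The point that a rate ignores the distinction between individuals can be checked directly. This is a small sketch. The data layout, a list of (years at risk, had event) pairs, and the names are assumptions made for illustration.

```python
def incidence_rate(follow_up):
    """Cases divided by summed person-time.

    follow_up is a list of (years_at_risk, had_event) pairs, one per person.
    """
    cases = sum(1 for _, had_event in follow_up if had_event)
    person_time = sum(years for years, _ in follow_up)
    return cases / person_time


# Two people followed 50 years each, both get the event.
two_long = [(50, True), (50, True)]

# One hundred people followed 1 year each, two get the event.
hundred_short = [(1, True)] * 2 + [(1, False)] * 98

print(incidence_rate(two_long))       # 2 cases / 100 person-years
print(incidence_rate(hundred_short))  # 2 cases / 100 person-years
```

Both cohorts give 2 per 100 person-years even though the cumulative incidence is 100% in the first and 2% in the second.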

Measures of Disease Frequency

Rules for counting person time

 Start disease free, free of history of disease at entry
 At risk for outcome? Not necessary, but wasteful
 Start after exposure is complete (not during) and after minimum induction period
 Stop when disease occurs (date or midpoint)
 Stop if withdrawn (lost to follow up, death from another cause, study ends, no longer at risk)

Only those eligible to be counted in numerator are in denominator

 Ask, if became a case, would I have counted them?

Person Time Issues I

We conduct a cohort study of continuous smoking vs. no smoking and prostate cancer

 Enroll 1000 smokers and 1000 non-smokers 

At end, find 100 non-smokers became smokers. Should we exclude them?

 Can’t because if they became cases while not smoking we would have included them

Person Time Issues II

Study HAART regimens and death

 But much death and LTFU in first 6-months and we care about long term mortality 

Exclude any deaths in first 6-months

 OK if all we care about is long-term effects 

When should person time start?

 Immortal person-time biases towards null

Black triangle Prevalence = 2/8 = 0.25

Black triangle Cum Inc = 2/9

Person-time per subject: 2, 5, 5, 5, 5, 5, 5, 5, 5 (total 42)

Black triangle Inc Rate = 2/42

Measure of Effect

Comparison of occurrence of outcome in the same population at same time under two different conditions

 Only one can be observed
 Second is “counterfactual” (we will come back to this)

Theoretical, as such we substitute measure of association

 But as an approximation to measure of effect

Measures of Association

Comparison of incidence in 2+ populations

Relative:

 Comparison by division
 Null (no effect) is 1
 Log scale (distance from 0 to 1 is the same as from 1 to infinity)

Difference:

 Comparison by subtraction
 Null (no effect) is 0
 Distance above and below null is equivalent

Calculations

RD = CI_{E+} - CI_{E-}

RR = CI_{E+} / CI_{E-}

IRD = IR_{E+} - IR_{E-}

IRR = IR_{E+} / IR_{E-}
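The four measures can be computed in a few lines. This is a sketch: the case counts reuse the 400/1000 and 100/1000 cohort from the earlier slides, while the person-years are hypothetical values added only to illustrate the rate-based measures.

```python
def risk_measures(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk difference and risk ratio from cumulative incidences."""
    ci_exp = cases_exp / n_exp
    ci_unexp = cases_unexp / n_unexp
    return {"RD": ci_exp - ci_unexp, "RR": ci_exp / ci_unexp}


def rate_measures(cases_exp, pt_exp, cases_unexp, pt_unexp):
    """Incidence rate difference and ratio from cases and person-time."""
    ir_exp = cases_exp / pt_exp
    ir_unexp = cases_unexp / pt_unexp
    return {"IRD": ir_exp - ir_unexp, "IRR": ir_exp / ir_unexp}


# Cumulative incidence: 400 of 1000 exposed and 100 of 1000 unexposed become cases.
risks = risk_measures(400, 1000, 100, 1000)

# Rates: person-years below are hypothetical, not from the slides.
rates = rate_measures(400, 4000, 100, 4750)

for name, value in {**risks, **rates}.items():
    print(f"{name} = {value:.3f}")
```

Note that the ratio measures are unitless, while the IRD carries units of cases per person-year.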

Conclusion

Objective is a VALID and PRECISE estimate of the effect of an exposure on an outcome

Need to think critically about the logic of the methods we have been taught

 Make sure we understand how to validly design studies and how to correctly interpret study findings 

Odds ratios are odd

 Correct sampling means we can reduce reliance on them