Measures of disease frequency (I)

MEASURES OF DISEASE FREQUENCY

Absolute measures of disease frequency:

– Incidence
– Prevalence
– Odds

Measures of association:

– Ratios (Relative risk-type measures)
– Differences (Attributable risk-type measures)

Two types of disease frequency measures

• Incidence and Odds of Incidence
  – New disease
  – Deaths in the population (mortality)
  – Deaths in patients (case-fatality)
  – Recurrences
  – etc.

• Prevalence and Odds of Prevalence

Analyses of cohort (prospective) data:

• Calculation of incidence rates
• Comparison of incidence rates across (exposure) groups

[Figure: unexposed and exposed groups]

What is "incidence"?

Two major ways to define incidence:

• Cumulative incidence (cumulative probability, hazards) → SURVIVAL ANALYSIS
• Rate → ANALYSIS BASED ON PERSON-TIME

Calculation of incidence Strategy #1

SURVIVAL ANALYSIS

• Variable of interest: TIME to occurrence of an EVENT (death, disease, relapse)

• Primary objectives:

1) TO ESTIMATE CUMULATIVE INCIDENCE (q) or SURVIVAL FUNCTION (1-q)

• Methods:
  – LIFE TABLE (Actuarial)
  – KAPLAN-MEIER

[Figure: survival curve starting at 1.0, plotted against time]

2) TO COMPARE SURVIVAL IN DIFFERENT GROUPS

• Methods:
  – PROPORTIONAL HAZARDS (COX) REGRESSION
  – LOGRANK TEST

[Figure: survival curves for unexposed and exposed groups, starting at 1.0, plotted against time]

Examples:

• Clinical trial (experimental study): Group of infants with acute diarrhea, randomized to 3 treatment groups (NEJM 1993;328:1653):
  – Bismuth (100 mg/kg)
  – Bismuth (150 mg/kg)
  – Placebo

Which treatment results in earlier remission of diarrhea?

Examples:

• Cohort (observational) study: Group of Johns Hopkins University medical students, classes 1948-64 (Precursors Study):
  – Positive family history of hypertension
  – Negative family history of hypertension

Which group has a higher cumulative incidence of hypertension?

Is there evidence that the hypertension diagnoses occur earlier in one of the groups?

OBJECTIVE OF SURVIVAL ANALYSIS:

To compare the “cumulative incidence” of an event (or the cumulative probability of surviving event-free) in exposed and unexposed groups (characteristic present or absent).

BASIS FOR THE ANALYSIS

• NUMBER of EVENTS
• TIME of occurrence

Need to precisely define:

• “EVENT” (failure):
  – Death
  – Disease (diagnosis, start of symptoms, relapse)
  – Remission of diarrhea
  – Quit smoking
  – Menopause

• “TIME”:
  – Time from recruitment into the study
  – Time from employment
  – Time from diagnosis (prognostic studies)
  – Time from infection
  – Calendar time
  – Age

Why is survival analysis “tricky”?

• Different follow-up for the study participants:
  – Because of staggered (late) entries
  – Because of losses to follow-up
  – Example:
    • Follow-up of 6 patients (2 yrs)
      – 3 deaths
      – 2 censored before 2 years
      – 1 survived 2 years

Question: What is the Cumulative Incidence (or the Cumulative Survival) up to 2 years?

[Figure: follow-up of persons 1 to 6 in calendar time (Jan 1999 to Jan 2001), marking deaths and censored observations (lost to follow-up, withdrawal); number of months of follow-up in parentheses: 3, 6, 12, 15, 18, 24]

Crude Survival: 3/6 = 50%

Change time scale from calendar time to follow-up time:

[Figure: the same six persons on a follow-up time axis (0 to 2 years), with follow-up of 3, 6, 12, 15, 18 and 24 months]

What is the 2-year cumulative survival?

• Assume both censored individuals survived up to 2 years:

$$S(2\,\text{yrs}) = \frac{3}{6} = 0.50$$

• Assume both censored individuals died before 2 years:

$$S(2\,\text{yrs}) = \frac{1}{6} = 0.17$$

[Figure: follow-up time (years) for persons 1 to 6]

“True” survival is probably somewhere in between these extreme estimates …, but where?
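The two extreme assumptions above can be checked with a few lines of Python. This is only a minimal sketch; the variable names are illustrative and do not come from the slides.

```python
# Six patients followed for up to 2 years (counts from the example above).
n_total = 6      # persons at risk at baseline
n_deaths = 3     # deaths observed before 2 years
n_censored = 2   # censored (lost to follow-up) before 2 years
n_survived = n_total - n_deaths - n_censored  # observed alive at 2 years

# Upper bound: every censored person actually survived to 2 years.
s_upper = (n_survived + n_censored) / n_total
# Lower bound: every censored person actually died before 2 years.
s_lower = n_survived / n_total

print(f"2-year survival lies between {s_lower:.2f} and {s_upper:.2f}")
```

The gap between the two bounds is what the life-table and Kaplan-Meier methods try to narrow by using the partial follow-up of the censored persons.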

• Calculating CUMULATIVE INCIDENCE (up to time “t”):

$$q_t = \frac{\text{Number of individuals with the event by } t}{\text{Number at risk at baseline}}$$

• Calculating CUMULATIVE SURVIVAL:

$$S(t) = 1 - q_t = \frac{\text{Number of individuals alive beyond } t}{\text{Number at risk at baseline}}$$

Problem: the calculation requires accounting for censoring (losses to follow-up).

One solution:

• Actuarial life table

Assume that each observation censored during the period contributes one-half of a person to the number at risk in the denominator.

[Figure: follow-up time (years) for persons 1 to 6, with follow-up of 3, 6, 12, 15, 18 and 24 months]

$$q_{2\,\text{yrs}} = \frac{3}{6 - \frac{1}{2}(2)} = \frac{3}{5} = 0.60$$

$$S(2\,\text{yrs}) = 1 - q_{2\,\text{yrs}} = 0.40$$
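The half-person correction can be written as a small helper. This is a sketch under the assumption stated above (each censored observation counts as half a person at risk); the function name `actuarial_risk` is hypothetical.

```python
def actuarial_risk(n_at_risk, n_events, n_censored):
    """Cumulative incidence over one interval, actuarial correction.

    Each censored person is assumed to contribute half a person
    to the denominator.
    """
    corrected_at_risk = n_at_risk - n_censored / 2
    return n_events / corrected_at_risk


# Six patients, 3 deaths, 2 censored before 2 years.
q_2yrs = actuarial_risk(n_at_risk=6, n_events=3, n_censored=2)
s_2yrs = 1 - q_2yrs
print(f"q(2 yrs) = {q_2yrs:.2f}, S(2 yrs) = {s_2yrs:.2f}")
```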

Further problem: If the follow-up is long, the risk cannot be assumed to be constant, and thus the follow-up time needs to be partitioned into intervals.

Methods:

– LIFE TABLE – KAPLAN-MEIER

LIFE TABLE (Actuarial method)

Non-parametric method for grouped data

Example: Precursors Study, incidence of CHD by follow-up time

i   Time (yr)   N_i    d_i   c_i
1   0-9         1170   2     23
2   10-19       1145   21    48
3   20-24       1076   24    72
4   25-29       980    24    269
5   30-34       687    23    268
6   35+         396    23    373

Data:
N_i: Number alive (disease-free) at the beginning of each interval
d_i: Number of cases during each interval
c_i: Number of losses during each interval

LIFE TABLE (Actuarial method), cont’d

Calculations:
N_i*: “Corrected” number at risk in the interval: N_i* = N_i – c_i / 2
q_i: Probability of the event in each interval: q_i = d_i / N_i*
p_i: Probability of survival in each interval: p_i = 1 – q_i
S(t_i): Cumulative probability of survival

i   Time (yr)   N_i    d_i   c_i   N_i*     q_i       p_i       S(t_i)†
1   0-9         1170   2     23    1158.5   0.00173   0.99827   0.99827
2   10-19       1145   21    48    1121.0   0.0187    0.9813    0.9796
3   20-24       1076   24    72    1040.0   0.0231    0.9769    0.9570
4   25-29       980    24    269   845.5    0.0284    0.9716    0.9298
5   30-34       687    23    268   553.0    0.0416    0.9584    0.8911
6   35+         396    23    373   209.5    0.1098    0.8902    0.7933

†Cumulative survival by the end of each interval

Note: q_i and p_i are “conditional” probabilities (of the event and of survival, respectively).

That is, in order to have the event in a given interval, or to survive throughout it, one has to have survived through all the previous intervals (the probability is conditioned on that survival).

S(t_i): Cumulative probability of survival at (or “up to”) time t_i:

$$S(t_i) = \prod_{t_j \le t_i} p_j$$

Example in English: the cumulative survival up to the beginning of year 25 (end of interval 20-24) is the product of the conditional survival probabilities through all previous intervals up to that date:

$$S(25) = p_1 \times p_2 \times p_3 = 0.99827 \times 0.9813 \times 0.9769 = 0.9570$$
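The life-table calculations can be reproduced from the Precursors Study counts (N_i, d_i, c_i) given earlier. The following is a minimal sketch, not the original authors' code; the function and variable names are illustrative.

```python
# Counts per interval from the Precursors Study example.
intervals = ["0-9", "10-19", "20-24", "25-29", "30-34", "35+"]
n_start = [1170, 1145, 1076, 980, 687, 396]   # N_i: disease-free at start
events = [2, 21, 24, 24, 23, 23]              # d_i: cases in the interval
losses = [23, 48, 72, 269, 268, 373]          # c_i: losses in the interval


def life_table(n_start, events, losses):
    """Return one (N_i*, q_i, p_i, S(t_i)) tuple per interval."""
    rows = []
    cum_surv = 1.0
    for n, d, c in zip(n_start, events, losses):
        n_star = n - c / 2   # corrected number at risk
        q = d / n_star       # conditional probability of the event
        p = 1 - q            # conditional probability of survival
        cum_surv *= p        # product of the p_j up to this interval
        rows.append((n_star, q, p, cum_surv))
    return rows


rows = life_table(n_start, events, losses)
for label, (n_star, q, p, s) in zip(intervals, rows):
    print(f"{label:>6}  N*={n_star:7.1f}  q={q:.5f}  p={p:.5f}  S={s:.4f}")
```

The third row gives the cumulative survival at year 25, and the last row the cumulative survival at the end of follow-up.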

Plotting the survival function:

[Figure: cumulative survival (axis from 0.70 to 1.00) against years of follow-up (0 to 50), drawn from the S(t_i) values 0.9983, 0.9796, 0.9570, 0.9298, 0.8911 and 0.7933 for the intervals 0-9, 10-19, 20-24, 25-29, 30-34 and 35+]

VARIANCE FOR CUMULATIVE SURVIVAL ESTIMATE

Method described by Greenwood in 1926*

$$\mathrm{Var}[\hat{S}(t_i)] \approx [\hat{S}(t_i)]^2 \sum_{t_j \le t_i} \frac{d_j}{N_j^* (N_j^* - d_j)}$$

And a 95% confidence interval can then be obtained:

$$\hat{S}(t_i) \pm 1.96 \times \sqrt{\mathrm{Var}[\hat{S}(t_i)]}$$

*Greenwood M. A report on the natural duration of cancer. Rep Pub Health Med Subjects 1926;33:1-26.

Example:

$$\mathrm{Var}[\hat{S}(25)] \approx [0.957]^2 \left[ \frac{2}{1158.5\,(1158.5 - 2)} + \frac{21}{1121\,(1121 - 21)} + \frac{24}{1040\,(1040 - 24)} \right] = 0.0000378$$

95% CI:

$$0.957 \pm 1.96 \times \sqrt{0.0000378} = 0.957 \pm 1.96 \times 0.00614 = [0.945,\ 0.969]$$
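The Greenwood calculation for S(25) can be checked numerically. This is a sketch using the corrected numbers at risk and event counts of the first three life-table intervals; `greenwood_ci` is a hypothetical helper name, not something defined in the slides.

```python
import math


def greenwood_ci(surv, events, n_corrected, z=1.96):
    """Greenwood variance and normal-approximation CI for a cumulative
    survival estimate, summing over the intervals up to the time of interest."""
    total = sum(d / (n * (n - d)) for d, n in zip(events, n_corrected))
    variance = surv ** 2 * total
    half_width = z * math.sqrt(variance)
    return variance, surv - half_width, surv + half_width


# Intervals 0-9, 10-19 and 20-24 of the Precursors Study example.
var_25, lower_25, upper_25 = greenwood_ci(
    surv=0.957,
    events=[2, 21, 24],
    n_corrected=[1158.5, 1121.0, 1040.0],
)
print(f"Var = {var_25:.7f}, 95% CI = [{lower_25:.3f}, {upper_25:.3f}]")
```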

KAPLAN-MEIER METHOD

E.L. Kaplan and P. Meier, 1958*

Calculate the cumulative probability of event (and survival) based on conditional probabilities at each event time

[Figure: follow-up time (years) for persons 1 to 6, with follow-up of 3, 6, 12, 15, 18 and 24 months]

*Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. J Am Stat Assoc 1958;53:457-81.

Step 1: Sort the survival times from shortest to longest

Person ID:                 6    2    5    4    3    1
Follow-up time (months):   3    6    12   15   18   24

Step 2: For each time of occurrence of an event, compute the conditional survival

When the first event occurs (3 months after beginning of follow-up), there are 6 persons at risk. One of them dies at that point; 5 of the 6 survive beyond that point. Thus:
• Instantaneous incidence of event at time 3 months: 1/6
• Probability of survival beyond 3 months: 5/6

When the second event occurs (12 months), there are 4 persons at risk (the person censored at 6 months is no longer counted). One of them dies at that point; 3 of the 4 survive beyond that point. Thus:
• Incidence of event at time 12 months: 1/4
• Probability of survival beyond 12 months: 3/4

When the third event occurs (18 months), there are 2 persons at risk (the person censored at 15 months is no longer counted). One of them dies at that point; 1 of the 2 survives beyond that point. Thus:
• Incidence of event at time 18 months: 1/2
• Probability of survival beyond 18 months: 1/2

CONDITIONAL PROBABILITY OF AN EVENT (or of survival) The probability of an event (or of survival) at time t for the individuals at risk at time t, that is, conditioned on being at risk at time t.

Step 3: For each time of occurrence of an event, compute the cumulative probability of surviving beyond that time (the survival function) by multiplying the conditional probabilities of survival.

3 months: S(3) = 5/6 = 0.833
12 months: S(12) = 5/6 × 3/4 = 0.625
18 months: S(18) = 5/6 × 3/4 × 1/2 = 0.3125

In Greek:

$$S(t_i) = \prod_{t_j \le t_i} \left(1 - \frac{d_j}{n_j}\right)$$
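Steps 1 to 3 can be sketched in plain Python for the six-person example. This is an illustrative implementation only (the function name and the data layout are assumptions); the deaths are taken at 3, 12 and 18 months, and the other three follow-up times are treated as censored.

```python
def kaplan_meier(times, died):
    """Return [(event_time, n_at_risk, n_deaths, cumulative_survival), ...].

    times: follow-up time of each person
    died:  True if follow-up ended with the event, False if censored
    """
    # Step 1: sort the distinct times at which an event occurred.
    event_times = sorted({t for t, d in zip(times, died) if d})
    surv = 1.0
    steps = []
    for t in event_times:
        # Step 2: conditional survival among those still at risk at time t.
        n_at_risk = sum(1 for u in times if u >= t)
        n_deaths = sum(1 for u, d in zip(times, died) if u == t and d)
        # Step 3: multiply the conditional survival probabilities.
        surv *= 1 - n_deaths / n_at_risk
        steps.append((t, n_at_risk, n_deaths, surv))
    return steps


# Follow-up in months; the person followed for 24 months was alive at the end.
times = [3, 6, 12, 15, 18, 24]
died = [True, False, True, False, True, False]

km_steps = kaplan_meier(times, died)
for t, n, d, s in km_steps:
    print(f"t={t:2d} months  at risk={n}  deaths={d}  S(t)={s:.4f}")
```

Censored persons never trigger a step of their own; they only shrink the number at risk at later event times.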

Plotting the survival function when using the Kaplan-Meier approach:

[Figure: step function of cumulative survival against month of follow-up (0 to 25), dropping to 0.833 at 3 months, 0.625 at 12 months and 0.3125 at 18 months]

The cumulative incidence (up to 24 months): 1 - 0.3125 = 0.6875 (or ≈ 69%)

Greenwood’s formula for variance calculation also works for the KM estimate, and thus confidence limits for the cumulative survival estimates can be calculated and plotted.
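As an illustration that is not part of the original slides, Greenwood's formula can be applied to the six-person Kaplan-Meier example, using the number at risk n_j at each event time in place of the corrected N_j*. Truncating the limits to the range 0 to 1 is an assumption made here for presentation; with so few persons the interval is very wide.

```python
import math

# (number at risk, deaths) at each event time: 3, 12 and 18 months.
risk_sets = [(6, 1), (4, 1), (2, 1)]

# Kaplan-Meier estimate of survival beyond 18 months.
surv_18 = 1.0
for n, d in risk_sets:
    surv_18 *= 1 - d / n

# Greenwood variance: S^2 times the sum of d / (n * (n - d)).
var_18 = surv_18 ** 2 * sum(d / (n * (n - d)) for n, d in risk_sets)
half_width = 1.96 * math.sqrt(var_18)

# Assumption: limits are truncated so they stay within [0, 1].
lower_18 = max(0.0, surv_18 - half_width)
upper_18 = min(1.0, surv_18 + half_width)
print(f"S(18) = {surv_18:.4f}, Var = {var_18:.4f}, "
      f"95% CI = [{lower_18:.3f}, {upper_18:.3f}]")
```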

Life table vs. Kaplan-Meier

• Generally (if N is large and/or if the life-table intervals are small), it won’t make much difference. E.g.: Survival after diagnosis of Ewing’s sarcoma.

[Figure: survival curves; solid line, actuarial life table estimate; broken line, KM estimate]

ASSUMPTIONS IN SURVIVAL ESTIMATES

• (For the actuarial life table only) Risk is constant within each interval

• (If individuals are recruited over a long period of time) No secular trends

[Figure: calendar time vs. follow-up time]

ASSUMPTIONS IN SURVIVAL ESTIMATES (Cont’d)

• Censoring is independent of survival (uninformative censoring): Those censored at time t have the same prognosis as those remaining.

Types of censoring:
• Lost to follow-up
  – Migration
  – Refusal
• Death (from another cause)
• Administrative withdrawal (study finished)

• If censored observations tend to have worse prognosis than those remaining in the study:

[Figure: survival vs. time; the curve observed in the study lies above the true survival curve]

• If censored observations tend to have better prognosis than those remaining in the study:

[Figure: survival vs. time; the true survival curve lies above the curve observed in the study]

Note: This assumption is generic to any kind of analysis (absolute risk calculation) of prospective data.