
Measures of disease frequency (I)

MEASURES OF DISEASE FREQUENCY

• Absolute measures of disease frequency:
– Incidence
– Prevalence
– Odds

• Measures of association:
– Ratios (Relative risk-type measures)
– Differences (Attributable risk-type measures)

Two types of disease frequency measures

• Incidence and Odds of Incidence
– New disease
– Deaths in the population (mortality)
– Deaths in patients (case-fatality)
– Recurrences
– etc.

• Prevalence and Odds of Prevalence

Analyses of cohort (prospective) data:

• Calculation of incidence rates
• Comparison of incidence rates across (exposure) groups, e.g., unexposed vs. exposed

What is "incidence"?

Two major ways to define incidence:

• Cumulative incidence (cumulative probability, hazards): SURVIVAL ANALYSIS
• Rate: ANALYSIS BASED ON PERSON-TIME

Calculation of incidence, Strategy #1: SURVIVAL ANALYSIS

• Variable of interest: TIME to occurrence of an EVENT (death, disease, relapse)

• Primary objectives:

1) TO ESTIMATE CUMULATIVE INCIDENCE (q) or SURVIVAL FUNCTION (1-q)

Methods:
– LIFE TABLE (Actuarial)
– KAPLAN-MEIER

[Figure: survival curve starting at 1.0; x-axis: Time]

2) TO COMPARE SURVIVAL IN DIFFERENT GROUPS

Methods:
– PROPORTIONAL HAZARDS (COX) REGRESSION
– LOGRANK TEST

[Figure: survival curves starting at 1.0 for Unexposed and Exposed; x-axis: Time]

Examples:

• Clinical trial (experimental study): Group of infants with acute diarrhea, randomized to 3 treatment groups (NEJM 1993;328:1653):
– Bismuth (100 mg/kg)
– Bismuth (150 mg/kg)
– Placebo

Which treatment results in earlier remission of diarrhea?

Examples:

• Cohort (observational) study: Group of Johns Hopkins University medical students, classes 1948-64 (Precursors Study):
– Positive family history of hypertension
– Negative family history of hypertension

Questions:
– Which group has a higher cumulative incidence of hypertension?
– Is there evidence that the hypertension diagnoses occur earlier in one of the groups?

• OBJECTIVE OF SURVIVAL ANALYSIS: To compare the “cumulative incidence” of an event (or the cumulative probability of surviving event-free) in exposed and unexposed (characteristic present or absent)

• BASIS FOR THE ANALYSIS:
– NUMBER of EVENTS
– TIME of occurrence


Need to precisely define:

• “EVENT” (failure):
– Death
– Disease (diagnosis, start of symptoms, relapse)
– Remission of diarrhea
– Quit smoking
– Menopause

• “TIME”:
– Time from recruitment into the study
– Time from employment
– Time from diagnosis (prognostic studies)
– Time from infection
– Calendar time
– Age

Why is survival analysis “tricky”?

• Different follow-up for the study participants:
– Because of staggered (late) entries
– Because of losses to follow-up

• Example: Follow-up of 6 patients (2 yrs)
– 3 deaths
– 2 censored before 2 years
– 1 survived 2 years

Question: What is the Cumulative Incidence (or the Cumulative Survival) up to 2 years?

[Figure: follow-up of persons 1 to 6 in calendar time, Jan 1999 to Jan 2001. Each line ends in a death or a censored observation (lost to follow-up, withdrawal); numbers in parentheses are months of follow-up: 3, 6, 12, 15, 18, 24.]

Crude Survival: 3/6 = 50%

Change time scale from calendar time to follow-up time:

[Figure: the same six persons plotted on follow-up time (years), all starting at time 0; months of follow-up in parentheses: 3, 6, 12, 15, 18, 24.]


What is the 2-year cumulative survival?

• Assume both censored individuals survived up to 2 years: $S(2\ \text{yrs}) = 3/6 = 0.50$

• Assume both censored individuals died before 2 years: $S(2\ \text{yrs}) = 1/6 = 0.17$

[Figure: follow-up time (years) for persons 1 to 6; months of follow-up in parentheses: 3, 6, 12, 15, 18, 24.]

“True” survival is probably somewhere in between these extreme estimates …, but where?
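The two extreme assumptions above bracket the 2-year survival. A minimal Python sketch of that arithmetic, using the counts from the six-patient example (the function name `survival_bounds` is hypothetical, introduced only for illustration):

```python
def survival_bounds(n_baseline, deaths, censored):
    """Return (lower, upper) bounds for cumulative survival.

    upper: every censored person is assumed to have survived the full period.
    lower: every censored person is assumed to have died before the end.
    """
    upper = (n_baseline - deaths) / n_baseline
    lower = (n_baseline - deaths - censored) / n_baseline
    return lower, upper


# Six patients followed for 2 years: 3 deaths, 2 censored, 1 survivor.
lower, upper = survival_bounds(n_baseline=6, deaths=3, censored=2)
print(f"2-year survival lies between {lower:.2f} and {upper:.2f}")
```

The wider the gap between the two bounds, the more the final estimate depends on how censored observations are handled.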

• Calculating CUMULATIVE INCIDENCE (up to time “t”):

$$q_t = \frac{\text{Number of individuals with the event by } t}{\text{Number at risk at baseline}}$$

• Calculating CUMULATIVE SURVIVAL:

$$S(t) = 1 - q_t = \frac{\text{Number of individuals alive beyond } t}{\text{Number at risk at baseline}}$$

Problem: requires accounting for censoring (losses to follow-up)

One solution:

• Actuarial life table: Assume that censored observations over the period contribute one-half the persons at risk in the denominator.

[Figure: follow-up time (years) for persons 1 to 6; months of follow-up in parentheses: 3, 6, 12, 15, 18, 24.]

$$q(2\ \text{yrs}) = \frac{3}{6 - \tfrac{1}{2}(2)} = \frac{3}{5} = 0.60$$

$$S(2\ \text{yrs}) = 1 - q(2\ \text{yrs}) = 0.40$$

Further problem: If the follow-up is long, the risks cannot be assumed to be constant, and thus the follow-up time needs to be partitioned into intervals.

Methods:

– LIFE TABLE
– KAPLAN-MEIER

LIFE TABLE (Actuarial method)

Non-parametric method for grouped data

Example: Precursors Study, incidence of CHD by follow-up time

i   Time (yr)   N_i    d_i   c_i
1   0-9         1170   2     23
2   10-19       1145   21    48
3   20-24       1076   24    72
4   25-29       980    24    269
5   30-34       687    23    268
6   35+         396    23    373

Data:
N_i: Number alive (disease-free) at the beginning of each interval
d_i: Number of cases during each interval
c_i: Number of losses during each interval

LIFE TABLE (Actuarial method), cont’d

i   Time (yr)   N_i    d_i   c_i   N_i*     q_i       p_i       S(t_i)†
1   0-9         1170   2     23    1158.5   0.00173   0.99827   0.99827
2   10-19       1145   21    48    1121.0   0.0187    0.9813    0.9796
3   20-24       1076   24    72    1040.0   0.0231    0.9769    0.9570
4   25-29       980    24    269   845.5    0.0284    0.9716    0.9298
5   30-34       687    23    268   553.0    0.0416    0.9584    0.8911
6   35+         396    23    373   209.5    0.1098    0.8902    0.7933

†Cumulative survival by the end of each interval

Calculations:
N_i*: “Corrected” number at risk in the interval: $N_i^* = N_i - c_i/2$
q_i: Probability of the event in each interval: $q_i = d_i / N_i^*$
p_i: Probability of survival in each interval: $p_i = 1 - q_i$
S(t_i): Cumulative probability of survival

Note: p i and q i are “conditional” probabilities (of the event and of survival, respectively).

I.e., in order to have the event or to survive throughout a given interval one has to have survived (is conditioned on…) through all the previous ones.

S(t_i): Cumulative probability of survival at (or “up to”) time t_i:

$$S(t_i) = \prod_{j:\ t_j \le t_i} p_j$$

Example in English: the cumulative survival up to the beginning of year 25 (end of interval 20-24) is the product of the conditional survival probabilities through all previous intervals up to that date:

$$S(25) = p_1 \times p_2 \times p_3 = 0.99827 \times 0.9813 \times 0.9769 = 0.9570$$
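The actuarial calculations can be reproduced in a few lines of Python. This is a sketch under the slide's definitions (N_i* = N_i − c_i/2, q_i = d_i/N_i*, p_i = 1 − q_i, cumulative survival as the running product of p_i), using the Precursors Study counts quoted above; the function name `actuarial_life_table` and the dictionary keys are hypothetical.

```python
# Precursors Study life-table data as given in the slides.
intervals = ["0-9", "10-19", "20-24", "25-29", "30-34", "35+"]
n_start = [1170, 1145, 1076, 980, 687, 396]  # N_i: at risk at interval start
events = [2, 21, 24, 24, 23, 23]             # d_i: cases in the interval
losses = [23, 48, 72, 269, 268, 373]         # c_i: censored in the interval


def actuarial_life_table(n_start, events, losses):
    """Return one row per interval with N*, q, p and cumulative survival S."""
    rows = []
    cumulative = 1.0
    for n, d, c in zip(n_start, events, losses):
        n_corrected = n - c / 2   # censored persons count as half at risk
        q = d / n_corrected       # conditional probability of the event
        p = 1 - q                 # conditional probability of survival
        cumulative *= p           # product of conditional survivals so far
        rows.append({"n_star": n_corrected, "q": q, "p": p, "S": cumulative})
    return rows


table = actuarial_life_table(n_start, events, losses)
for label, row in zip(intervals, table):
    print(f"{label:>6}  N*={row['n_star']:7.1f}  q={row['q']:.4f}  "
          f"p={row['p']:.4f}  S={row['S']:.4f}")
```

The third row gives the cumulative survival to the end of interval 20-24, which is S(25) in the slide's notation.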

Plotting the survival function:

[Figure: cumulative survival (y-axis: Survival, 0.70 to 1.00) against years of follow-up (x-axis, 0 to 50).]

Time (yr)   S(t_i)†
0-9         0.9983
10-19       0.9796
20-24       0.9570
25-29       0.9298
30-34       0.8911
35+         0.7933

VARIANCE FOR CUMULATIVE SURVIVAL ESTIMATE

Method described by Greenwood in 1926*

$$\mathrm{Var}[\hat{S}(t_i)] = [\hat{S}(t_i)]^2 \sum_{j:\ t_j \le t_i} \frac{d_j}{N_j^* (N_j^* - d_j)}$$

A 95% confidence interval can then be obtained:

$$\hat{S}(t_i) \pm 1.96 \sqrt{\mathrm{Var}[\hat{S}(t_i)]}$$

*Greenwood M. A report on the natural duration of cancer. Rep Pub Health Med Subjects 1926;33:1-26.


Example:

$$\mathrm{Var}[\hat{S}(25)] = [0.957]^2 \left[ \frac{2}{1158.5\,(1158.5 - 2)} + \frac{21}{1121\,(1121 - 21)} + \frac{24}{1040\,(1040 - 24)} \right] = 0.0000378$$

95% CI:

$$0.957 \pm 1.96 \sqrt{0.0000378} = 0.957 \pm 1.96 \times 0.00614 = [0.945,\ 0.969]$$
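The same variance and confidence interval can be checked numerically. The following Python sketch applies Greenwood's formula as written on the slide to the first three life-table intervals; the function name `greenwood_ci` and its `z` parameter are hypothetical, chosen only for this illustration.

```python
import math

# Corrected numbers at risk (N_i*) and cases (d_i) for intervals 0-9, 10-19, 20-24.
n_star = [1158.5, 1121.0, 1040.0]
events = [2, 21, 24]


def greenwood_ci(n_at_risk, events, z=1.96):
    """Cumulative survival, Greenwood variance and a normal-approximation CI."""
    survival = 1.0
    greenwood_sum = 0.0
    for n, d in zip(n_at_risk, events):
        survival *= 1 - d / n               # running product of p_j
        greenwood_sum += d / (n * (n - d))  # d_j / (N_j* (N_j* - d_j))
    variance = survival ** 2 * greenwood_sum
    half_width = z * math.sqrt(variance)
    return survival, variance, (survival - half_width, survival + half_width)


s25, var25, (ci_low, ci_high) = greenwood_ci(n_star, events)
print(f"S(25) = {s25:.3f}, Var = {var25:.7f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

Because the interval is symmetric around the estimate, its limits can fall outside the 0 to 1 range when the sample is small or survival is close to 0 or 1.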

KAPLAN-MEIER METHOD

E.L. Kaplan and P. Meier, 1958*

Calculate the cumulative probability of event (and survival) based on conditional probabilities at each event time

[Figure: follow-up time (years) for persons 1 to 6; months of follow-up in parentheses: 3, 6, 12, 15, 18, 24.]

*Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. J Am Stat Assoc 1958;53:457-81.

Step 1: Sort the survival times from shortest to longest

Person ID   Follow-up (months)
6           3
2           6
5           12
4           15
3           18
1           24

Step 2: For each time of occurrence of an event, compute the conditional survival

When the first event occurs (3 months after beginning of follow-up), there are 6 persons at risk. One of them dies at that point; 5 of the 6 survive beyond that point. Thus:
• Instantaneous incidence of event at time 3 months: 1/6
• Probability of survival beyond 3 months: 5/6

When the second event occurs (12 months), there are 4 persons at risk, because one person was censored at 6 months. One of them dies at that point; 3 of the 4 survive beyond that point. Thus:
• Incidence of event at time 12 months: 1/4
• Probability of survival beyond 12 months: 3/4

When the third event occurs (18 months), there are 2 persons at risk, because another person was censored at 15 months. One of them dies at that point; 1 of the 2 survives beyond that point. Thus:
• Incidence of event at time 18 months: 1/2
• Probability of survival beyond 18 months: 1/2

CONDITIONAL PROBABILITY OF AN EVENT (or of survival): The probability of an event (or of survival) at time t for the individuals at risk at time t, that is, conditioned on being at risk at time t.

Step 3: For each time of occurrence of an event, compute the cumulative survival (survival function), i.e., the probability of surviving beyond that time, by multiplying conditional probabilities of survival.

3 months: S(3) = 5/6 = 0.833
12 months: S(12) = 5/6 × 3/4 = 0.625
18 months: S(18) = 5/6 × 3/4 × 1/2 = 0.3125

In Greek:

$$S(t_i) = \prod_{j:\ t_j \le t_i} \left( 1 - \frac{d_j}{n_j} \right)$$
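Steps 1 to 3 can be sketched in Python for the six-patient example. The follow-up times per person come from the sorted list in Step 1; the death indicators follow the deaths described in Step 2 (3, 12 and 18 months), with the remaining persons treated as censored. The function name `kaplan_meier` is hypothetical.

```python
# (months of follow-up, died) for persons 1..6. Persons 2 and 4 are censored
# at 6 and 15 months; person 1 is still alive at the end of follow-up (24 months).
observations = [(24, False), (6, False), (18, True),
                (15, False), (12, True), (3, True)]


def kaplan_meier(observations):
    """Return (time, at_risk, deaths, cumulative survival) at each death time."""
    death_times = sorted({t for t, died in observations if died})
    survival = 1.0
    curve = []
    for t in death_times:
        # At risk: everyone whose follow-up lasts at least until t.
        at_risk = sum(1 for time, _ in observations if time >= t)
        deaths = sum(1 for time, died in observations if time == t and died)
        survival *= 1 - deaths / at_risk  # multiply conditional survivals
        curve.append((t, at_risk, deaths, survival))
    return curve


curve = kaplan_meier(observations)
for t, at_risk, deaths, s in curve:
    print(f"{t:>2} months: at risk={at_risk}, deaths={deaths}, S={s:.4f}")
```

Censored persons never trigger a step in the curve; they only shrink the number at risk at later death times.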

Plotting the survival function when using the Kaplan-Meier approach:

[Figure: Kaplan-Meier step function; y-axis: Survival (up to 1.00), x-axis: Month of follow-up (0 to 25).]

Time (mo)   S(t_i)
3           0.833
12          0.625
18          0.3125

The cumulative incidence (up to 24 months): 1 - 0.3125 = 0.6875 (or ≈ 69%)

Greenwood’s formula for variance calculation also works for the KM estimate … and thus, confidence limits for the cumulative survival estimates can be calculated and plotted. E.g.:
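As a worked illustration that is not part of the original slides, Greenwood's formula can be applied to the six-patient Kaplan-Meier estimate at 18 months, taking the number at risk at each death time (6, 4 and 2) in place of the corrected N_j*:

```latex
\begin{align*}
\mathrm{Var}[\hat{S}(18)]
  &= [0.3125]^2 \left[ \frac{1}{6\,(6-1)} + \frac{1}{4\,(4-1)} + \frac{1}{2\,(2-1)} \right] \\
  &= 0.0977 \times 0.6167 \approx 0.0602 \\
\mathrm{SE}[\hat{S}(18)] &= \sqrt{0.0602} \approx 0.245 \\
95\%\ \mathrm{CI} &: \; 0.3125 \pm 1.96 \times 0.245 \approx 0.3125 \pm 0.481
\end{align*}
```

The lower limit falls below 0 and would be truncated at 0 in practice. With only six persons the interval is very wide, and the normal approximation behind the 1.96 multiplier is rough.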

Life table vs. Kaplan-Meier

• Generally (if N is large and/or if life-table intervals are small), it won’t make much difference. E.g.: Survival after diagnosis of Ewing’s sarcoma

[Figure: survival after diagnosis of Ewing’s sarcoma. Solid line, actuarial life table estimate; broken line, KM estimate.]

ASSUMPTIONS IN SURVIVAL ESTIMATES

• (For the actuarial life table only) Risk is constant within each interval

• (If individuals are recruited over a long period of time) No secular trends

[Figure: the same follow-up shown on calendar time and on follow-up time.]

ASSUMPTIONS IN SURVIVAL ESTIMATES (Cont’d)

• Censoring is independent of survival (uninformative censoring): Those censored at time t have the same prognosis as those remaining.

Types of censoring:
• Lost to follow-up
– Migration
– Refusal
• Death (from another cause)
• Administrative withdrawal (study finished)

• If censored observations tend to have worse prognosis than those remaining in the study, the survival observed in the study overestimates the true survival.

[Figure: survival (starting at 1.0) against time; the curve observed in the study lies above the true survival curve.]

• If censored observations tend to have better prognosis than those remaining in the study, the survival observed in the study underestimates the true survival.

[Figure: survival (starting at 1.0) against time; the true survival curve lies above the curve observed in the study.]

Note: This assumption is generic to any kind of analysis (absolute risk calculation) of prospective data.
