Survival analysis

Brian Healy, PhD
Previous classes

Regression
– Linear regression
– Multiple regression
– Logistic regression
What are we doing today?

Survival analysis
– Kaplan-Meier curve
– Dichotomous predictor
– How to interpret results

Cox proportional hazards
– Continuous predictor
– How to interpret results
Big picture
In medical research, we often confront
continuous, ordinal or dichotomous
outcomes
One other common outcome is time to
event (survival time)

– Clinical trials often measure time to death or
time to relapse

We would like to estimate the survival
distribution
Types of analysis: independent samples

Outcome        Explanatory    Analysis
Continuous     Dichotomous    t-test, Wilcoxon test
Continuous     Categorical    ANOVA, linear regression
Continuous     Continuous     Correlation, linear regression
Dichotomous    Dichotomous    Chi-square test, logistic regression
Dichotomous    Continuous     Logistic regression
Time to event  Dichotomous    Log-rank test
Definitions
Survival time: time to event
Survival function: probability that the survival time
is greater than a specific value
S(t) = P(T > t)
Hazard function: risk of having the event at time t
λ(t) = # who had event / # at risk
These two functions are mathematically
related
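
One way to make the relationship concrete is the discrete-time form used for the calculations later in these slides. This is a sketch: the hazard is written as \lambda(t), the product index s (running over time points up to t) is notation introduced here, and S(t-1) means the survival just before time t.

```latex
S(t) = P(T > t) = \prod_{s \le t} \bigl(1 - \lambda(s)\bigr),
\qquad
\lambda(t) = 1 - \frac{S(t)}{S(t-1)} \quad \text{when } S(t-1) > 0
```

Knowing the hazard at every time point therefore determines the survival function, and vice versa.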

Example

An important marker of disease activity in MS is
the occurrence of a relapse
– This is the presence of new symptoms that lasts for
at least 24 hours

Many clinical trials in MS have demonstrated that
treatments increase the time until the next
relapse
– How does the time to next relapse look in the clinic?

What is the distribution of survival times?
Kaplan-Meier curve
[Figure: Kaplan-Meier curve]
Each drop in the curve represents an event
Survival data

To create this curve, patients placed on
treatment were followed and the time of the first
relapse on treatment was recorded
– Survival time
If everyone had an event, some of the methods
we have already learned could be applied
Often, not everyone has an event

– Loss to follow-up
– End of study
Censoring

The patients who did not have the event
are considered censored
– We know that they survived a specific amount
of time, but do not know the exact time of the
event
– We believe that the event would have
happened if we observed them long enough

These patients provide some information,
but not complete information
Censoring

How could we account for censoring?
– Ignore it and say the event occurred at the time of censoring: incorrect, because this is almost certainly not true
– Remove the patient from the analysis: potential bias and loss of power
– Survival analysis

Our objective is to estimate the survival
distribution of patients in the presence of
censoring
Example
For simplicity, let’s focus on 10 patients whose time to relapse is provided here
We assume that no one is censored initially
We would like to estimate S(t) and λ(t)

Patient  Time
1        3
2        8
3        15
4        27
5        32
6        46
7        49
8        51
9        55
10       70
[Figure: Kaplan-Meier survival estimate; x-axis: analysis time (0 to 80), y-axis: estimated survival (0 to 1)]
What do we see from our curve?
1) Drops in the curve only occur at time of event
2) Between events, the estimated survival remains constant
What is the size of the drops?
Calculating size of drop
To calculate the hazard at each time point: # events / # at risk
– If no event, hazard = 0
To calculate estimated survival use:

\[ \hat{S}(t) = \hat{S}(t-1) \times \bigl(1 - \hat{\lambda}(t)\bigr) \]

Patient  Time  \hat{\lambda}(t)  \hat{S}(t)
-        0     0                 1
1        3     1/10              0.9
2        8     1/9               0.8
3        15    1/8               0.7
4        27    1/7               0.6
5        32    1/6               0.5
6        46    1/5               0.4
7        49    1/4               0.3
8        51    1/3               0.2
9        55    1/2               0.1
10       70    1/1               0
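
A short worked check of the first rows, under the same assumption that no one is censored. Each factor is the fraction of at-risk patients who remain relapse-free, so the product telescopes to the fraction of the original patients still relapse-free. The symbols n (number of patients) and k (number of events so far) are notation introduced here.

```latex
\hat{S}(3) = 1 \times \left(1 - \tfrac{1}{10}\right) = 0.9, \qquad
\hat{S}(8) = 0.9 \times \left(1 - \tfrac{1}{9}\right) = 0.8, \qquad
\hat{S}(15) = 0.8 \times \left(1 - \tfrac{1}{8}\right) = 0.7

\hat{S}(t_k) = \prod_{i=1}^{k} \frac{n - i}{n - i + 1} = \frac{n - k}{n}
\quad \text{(no censoring, no tied times)}
```

With n = 10 this gives 0.9, 0.8, ..., 0, matching the table.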
Example: censoring
For simplicity, let’s focus on 10 patients whose time to relapse is provided here
Now some patients are censored; a censored time is marked with +
We would like to estimate S(t) and λ(t)

Patient  Time
1        3
2        8+
3        15
4        27
5        32+
6        46
7        49
8        51
9        55+
10       70
[Figure: Kaplan-Meier survival estimate; x-axis: analysis time (0 to 80), y-axis: estimated survival (0 to 1)]
What do we see from our curve?
1) Drops in the curve only occur at time of event
2) Between events, the estimated survival remains constant
3) Survival curve does not drop at censored times
Calculating size of drop
To calculate the hazard at each time point: # events / # at risk
– If no event, hazard = 0
To calculate estimated survival use:

\[ \hat{S}(t) = \hat{S}(t-1) \times \bigl(1 - \hat{\lambda}(t)\bigr) \]

Patient  Time  \hat{\lambda}(t)  \hat{S}(t)
-        0     0                 1
1        3     1/10              0.9
2        8+    0                 0.9
3        15    1/8               0.79
4        27    1/7               0.68
5        32+   0                 0.68
6        46    1/5               0.54
7        49    1/4               0.41
8        51    1/3               0.27
9        55+   0                 0.27
10       70    1/1               0
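
The table above can be reproduced with a few lines of code. The following is a minimal sketch in plain Python, not the software used for the slides. The names `kaplan_meier`, `times` and `events` are illustrative, the data follow the rows of the calculation table (times 8, 32 and 55 censored), and the function assumes no two patients share the same time.

```python
from fractions import Fraction

def kaplan_meier(times, events):
    """Return one (time, hazard, survival) row per patient, in time order.

    events[i] is True if patient i had the event and False if censored.
    Assumes no two patients share the same time.
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    survival = 1.0
    rows = []
    for i in order:
        # Hazard is events / at risk; a censored time contributes no event.
        hazard = Fraction(1, at_risk) if events[i] else Fraction(0)
        survival *= 1 - float(hazard)
        rows.append((times[i], hazard, survival))
        at_risk -= 1  # event or censored, the patient leaves the risk set
    return rows

times = [3, 8, 15, 27, 32, 46, 49, 51, 55, 70]
events = [True, False, True, True, False, True, True, True, False, True]

for time, hazard, survival in kaplan_meier(times, events):
    print(time, hazard, f"{survival:.2f}")
```

Censored patients still count in the denominator until their censoring time, which is how they contribute partial information.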
Confidence interval for survival curve
A confidence interval can be placed around the estimated survival curve
– Greenwood’s formula
[Figure: Kaplan-Meier survival estimate with 95% CI; x-axis: analysis time (0 to 80); legend: 95% CI, Survivor function]
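
Greenwood's formula estimates the variance of the Kaplan-Meier estimate as the squared survival estimate times the sum, over event times up to t, of d / (n (n - d)), where d is the number of events and n the number at risk. The sketch below applies it to the censored 10-patient example (times 8, 32 and 55 censored). It is plain Python with illustrative names (`greenwood_ci`, `times`, `events`), and it uses the simple "estimate ± 1.96 × SE" limits as an assumption; statistical packages often compute the interval on a transformed scale, so a plotted band may differ.

```python
import math

def greenwood_ci(times, events, z=1.96):
    """Kaplan-Meier estimate with Greenwood standard errors at each event time.

    Returns (time, survival, se, lower, upper) rows. Uses plain, untransformed
    limits clipped to [0, 1]. Assumes no two patients share the same time.
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    survival = 1.0
    var_sum = 0.0  # running sum of d / (n * (n - d)) with d = 1
    rows = []
    for i in order:
        if events[i]:
            survival *= 1 - 1 / at_risk
            if at_risk > 1:  # the term is undefined when the last patient at risk has the event
                var_sum += 1 / (at_risk * (at_risk - 1))
            se = survival * math.sqrt(var_sum)
            lower = max(0.0, survival - z * se)
            upper = min(1.0, survival + z * se)
            rows.append((times[i], survival, se, lower, upper))
        at_risk -= 1
    return rows

times = [3, 8, 15, 27, 32, 46, 49, 51, 55, 70]
events = [True, False, True, True, False, True, True, True, False, True]

for time, survival, se, lower, upper in greenwood_ci(times, events):
    print(time, f"{survival:.2f}", f"{se:.3f}", f"({lower:.2f}, {upper:.2f})")
```

At the first event the formula reduces to the familiar binomial standard error, sqrt(0.9 × 0.1 / 10).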
Summary
Kaplan-Meier curve represents the
distribution of survival times
Drops only occur at event times
Censoring is easily accommodated
If the last time is not an event, the curve does not go to zero

Comparison of survival curves
One important aspect of survival analysis
is the comparison of survival curves
Null hypothesis: S1(t)=S2(t)
Method: log-rank test

Example

Untreated        Treated
Patient  Time    Patient  Time
1        3       1        30
2        8+      2        38
3        15      3        52+
4        27      4        58
5        32+     5        66
6        46      6        73+
7        49      7        77
8        51      8        89
9        55+     9        107+
10       70

[Figure: Kaplan-Meier survival estimates by group (group = 0: untreated, group = 1: treated); x-axis: analysis time (0 to 100), y-axis: estimated survival (0 to 1)]
Log-rank test: technical

To compare survival curves, a log-rank
test creates 2x2 tables at each event time
and combines across the tables
– Similar to the Mantel-Haenszel (MH) test
Provides a χ² statistic with 1 degree of
freedom (for a two sample comparison)
and a p-value
Same procedure for hypothesis testing

Hypothesis test
1) H0: S1(t)=S2(t)
2) Time to event outcome, dichotomous predictor
3) Log-rank test
4) Test statistic: χ²=4.4
5) p-value=0.036
6) Since the p-value is less than 0.05, we reject the null hypothesis
7) We conclude that there is a significant difference in the survival time in the treated compared to untreated
    . sts test group, logrank

             failure _d:  event
       analysis time _t:  weeks

    Log-rank test for equality of survivor functions

          |   Events     Events
    group |  observed   expected
    ------+---------------------
    0     |         7       3.81
    1     |         6       9.19
    ------+---------------------
    Total |        13      13.00

          chi2(1) =     4.38
          Pr>chi2 =   0.0364    (p-value)
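
The 2x2-table construction behind this output can be sketched directly. The following is a minimal plain-Python illustration, not the Stata implementation. The names `logrank`, `untreated` and `treated` are illustrative, and the untreated times are coded as in the Kaplan-Meier calculation table (times 8, 32 and 55 censored). At each event time it counts the number at risk and the events in each group, accumulates the expected events and the hypergeometric variance for group 0, and compares the result to a χ² distribution with 1 degree of freedom.

```python
import math

def logrank(times, events, groups):
    """Two-sample log-rank test.

    events[i] is True for an event and False if censored; groups[i] is 0 or 1.
    Returns (observed0, expected0, chi2, p_value) for group 0.
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    observed0 = expected0 = variance = 0.0
    for t in event_times:
        # Margins of the 2x2 table at this event time.
        n0 = sum(1 for x, g in zip(times, groups) if x >= t and g == 0)
        n1 = sum(1 for x, g in zip(times, groups) if x >= t and g == 1)
        d0 = sum(1 for x, e, g in zip(times, events, groups) if x == t and e and g == 0)
        d1 = sum(1 for x, e, g in zip(times, events, groups) if x == t and e and g == 1)
        n, d = n0 + n1, d0 + d1
        observed0 += d0
        expected0 += d * n0 / n
        if n > 1:
            variance += d * (n0 / n) * (n1 / n) * (n - d) / (n - 1)
    chi2 = (observed0 - expected0) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    return observed0, expected0, chi2, p_value

# (time, event) pairs; event is False where the slide marks the time with +
untreated = [(3, True), (8, False), (15, True), (27, True), (32, False),
             (46, True), (49, True), (51, True), (55, False), (70, True)]
treated = [(30, True), (38, True), (52, False), (58, True), (66, True),
           (73, False), (77, True), (89, True), (107, False)]

times = [t for t, _ in untreated + treated]
events = [e for _, e in untreated + treated]
groups = [0] * len(untreated) + [1] * len(treated)

observed0, expected0, chi2, p_value = logrank(times, events, groups)
print(observed0, round(expected0, 2), round(chi2, 2), round(p_value, 4))
```

The expected count for group 1 is the total number of events minus the expected count for group 0.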
Notes
Inspection of Kaplan-Meier curve will allow
you to determine which of the groups had
the significantly longer survival time
Other tests are possible

– Gehan’s generalized Wilcoxon test
– Tarone-Ware test
– Peto-Peto-Prentice test

These tests generally give similar results, but
emphasize different parts of the survival curve
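
One common way to see the difference in emphasis is to write these tests as weighted versions of the log-rank statistic. This is a sketch with notation introduced here: at event time i, d_{0i} is the observed number of events in group 0, e_{0i} the expected number, v_i the variance from the 2x2 table, and n_i the total number at risk.

```latex
Z_w = \frac{\sum_i w_i \,(d_{0i} - e_{0i})}{\sqrt{\sum_i w_i^2 \, v_i}},
\qquad
w_i =
\begin{cases}
1 & \text{log-rank} \\
n_i & \text{Gehan's generalized Wilcoxon} \\
\sqrt{n_i} & \text{Tarone-Ware}
\end{cases}
```

Because n_i is largest early in follow-up, the Gehan and Tarone-Ware weights give more influence to early differences between the curves, while the log-rank test weights all event times equally. The Peto-Peto-Prentice test uses weights based on an estimate of the pooled survival function instead of n_i. In each case Z_w squared is compared to a χ² distribution with 1 degree of freedom.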