Introduction to medical survival analysis

John Pearson
Biostatistics consultant
University of Otago Canterbury
7 October 2008
Objectives
• Describe survival data
• Define survival analysis terms
• Compare survival of groups
• Describe study design
Acknowledgement:
Thanks to Colm Fahy for providing the
example data.
Omissions
• Not covered:
– most methodology issues
– mathematical justification
• See
– Collett: Modelling Survival Data in Medical Research
– Hosmer & Lemeshow: Applied Survival Analysis
– Many other good texts.
Example: Metastatic Parotid SCC
• Disease risk factors:
– >50 yo
– Male
– Exposure to sun
– Caucasian ancestry
• 61 patients operated on since 1990
• Audit done 1/6/2008
• 14 patients died from SCCMP, 20 died from other causes, 1 couldn’t be found
Example: Patient data
Died, OpDate: 7/05/2002, 15/11/2007, 1/03/2008, 12/10/2007, 1/08/1993, 17/04/1992, 1/04/1997, 7/10/1996, 1/05/1991, 1/05/2005, 12/03/2003

Status   Preserved   RadioTx   ICOMP
ALIVE    PARTIAL     YES       N
ALIVE    NO          YES       N
DOC      YES         YES       N
DOD      YES         YES       Y
DOC      NO          YES       N
LOST     YES         YES       N
DOC      YES         YES       Y

Only 7 patients shown.
Dates have been confidentialized.
Example: Patient data
[Figure: Parotidectomy patient medical records. One line per patient (1 to 7) against calendar year, 1990 to 2005, with the audit at 6/2008. Legend: Alive, Dead OC, Dead OD; "?" marks lost to follow up.]
Example: Survival Data
[Figure: Parotidectomy patient survival data. One line per patient (1 to 7) against years post operation (0 to 15). Legend: Alive, Dead OC, Dead OD; one patient is marked "?".]
Date formats and manipulation can cause headaches.
Check what happens when your software subtracts dates to get survival time.
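As a minimal sketch of that check, the Python below subtracts two day-first dates and shows how the same string changes meaning when read month-first. The helper names and the 365.25/12 days-per-month convention are assumptions for illustration, and the operation date is a hypothetical record rather than one taken from the audit.

```python
from datetime import date, datetime

DAYS_PER_MONTH = 365.25 / 12  # assumed convention for converting days to months


def parse_day_first(text: str) -> date:
    """Parse a d/m/yyyy string (day first, as in the example data)."""
    return datetime.strptime(text, "%d/%m/%Y").date()


def survival_days(op_date: str, end_date: str) -> int:
    """Days from operation to death or last follow-up."""
    days = (parse_day_first(end_date) - parse_day_first(op_date)).days
    if days < 0:
        raise ValueError("end date is before operation date")
    return days


# Hypothetical patient: operated 1 May 2005, alive at the audit on 1 June 2008.
days = survival_days("1/05/2005", "1/06/2008")
months = days / DAYS_PER_MONTH

# The same audit string read month-first becomes 6 January 2008.
misread = datetime.strptime("1/06/2008", "%m/%d/%Y").date()
```

Comparing a few hand-calculated intervals against the software's result is usually enough to catch a day/month swap or an unexpected unit.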
Example: Survival Data
[Figure: Parotidectomy patient survival data. One line per patient (1 to 7) against years post operation (0 to 15). Legend: Alive, Dead OC, Dead OD. Several patients are labelled "censored"; the patient marked "?" is labelled "Missing data".]
Example: Survival Data
Censored data is explicitly addressed by survival analysis; using simple linear regression is not recommended.
Options:
1. SPSS
2. SAS
3. R
4. Other software
Example: Survival Data
Missing data can have a large effect on results and requires careful management.
Options:
1. Omit
2. Impute
3. Model
What is survival analysis
• Time to event data
– Continuous
– Right skewed, ≥0, not normal
– Censored
– Analyse risk (hazard function)
• Examples
– Time to death
– Time to onset/relapse of disease
– Length of stay in hospital
[Figure: Post operative survival. Number of patients (0 to 15) against years (0 to 10).]
Censoring
• Right censoring
• Left censoring
• Interval censoring
Censoring is also categorised by
1. Fixed study length
2. Fixed number of events
3. Random entry to study
Censoring
• Right censoring
– observed survival time is less than actual
– Study ends before event
• Left censoring
• Interval censoring
[Figure: Parotidectomy patient medical records. One line per patient (1 to 7) against calendar year, 1990 to 2005, with the audit at 6/2008. Legend: Alive, Dead OC, Dead OD; "?" marks lost to follow up.]
Censoring
• Right censoring
• Left censoring
– Time to relapse
[Timeline: surgery at 0, recurrence at time t, 3 month exam.]
– Time to event is less than observed: t < 3
• Interval censoring
Censoring
• Right censoring
• Left censoring
• Interval censoring
– Time to relapse
[Timeline: surgery at 0, free of disease at the 3 month exam, recurrence at time t, 6 month exam.]
– 3 < t < 6
Censoring
Independent censoring
Survival time is independent of censoring
process.
A censored patient is representative of those at
risk at censoring time.
The methods described here assume
independent censoring
Informative censoring
Patients removed from study if condition
deteriorates.
Censoring example
How are the SCCMP patients censored?
• Enter study on surgery date
• Last known status is at audit
Random right censoring.
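One way to turn these rules into analysis-ready data is sketched below, assuming the Status codes from the example table (ALIVE, DOC, DOD, LOST). The function name, the choice to return None for a lost patient, and the records themselves are hypothetical; for disease specific survival a death from other causes is treated as censored at the date of death.

```python
from datetime import date

AUDIT_DATE = date(2008, 6, 1)  # audit date from the example


def to_time_and_event(op_date, status, died=None, disease_specific=True):
    """Return (years_post_operation, event_observed), or None for a LOST patient."""
    if status == "LOST":
        return None  # outcome unknown: handle as missing data, not as censored
    if status in ("DOC", "DOD"):
        if died is None:
            raise ValueError("a death needs a date of death")
        end = died
    elif status == "ALIVE":
        end = AUDIT_DATE  # right censored at the audit
    else:
        raise ValueError(f"unknown status: {status}")
    years = (end - op_date).days / 365.25
    if disease_specific:
        event = status == "DOD"  # deaths from other causes count as censored
    else:
        event = status in ("DOC", "DOD")
    return years, event


# Hypothetical records, not taken from the audit data.
alive = to_time_and_event(date(2005, 5, 1), "ALIVE")
other_cause = to_time_and_event(date(1997, 4, 1), "DOC", died=date(2002, 5, 7))
all_cause = to_time_and_event(
    date(1997, 4, 1), "DOC", died=date(2002, 5, 7), disease_specific=False
)
lost = to_time_and_event(date(1992, 4, 17), "LOST")
```

The same patient contributes the same follow-up time to both analyses; only the event indicator changes.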
Survival function
The survival function S(t) is the probability of
surviving longer than time t.
S(t) = P(T>t)
Where T is the survival time.
$$S(t) = \frac{\text{number of patients surviving longer than } t}{\text{total number of patients}}$$
Hazard function
The hazard function λ(t) is the
probability of dying “at” time t.
$$\lambda(t) = \frac{f(t)}{S(t)}$$
Also called the instantaneous failure
rate and force of mortality.
Usually plotted is the cumulative
hazard function, that is the
accumulated hazard until time t.
$$\Lambda(t) = -\log S(t)$$
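The link between the cumulative hazard and the survival function can be checked numerically. The sketch below applies the negative log to the S(t) column of the Kaplan Meier table that appears later in this deck; the variable names are illustrative, and other estimators of the cumulative hazard exist that this sketch does not cover.

```python
import math

# Months and S(t) columns copied from the Kaplan Meier table in this deck.
months = [2.2, 6.12, 10.32, 10.78, 10.88, 13.08, 13.35,
          16.11, 26.2, 29.42, 37.48, 45.86, 59.08, 65.33]
survival = [0.982, 0.963, 0.942, 0.921, 0.9, 0.878, 0.856,
            0.833, 0.808, 0.782, 0.752, 0.719, 0.682, 0.633]

# Cumulative hazard is the negative log of the survival function.
cumulative_hazard = [-math.log(s) for s in survival]

for t, h in zip(months, cumulative_hazard):
    print(f"{t:6.2f} months  cumulative hazard {h:.3f}")
```

Because S(t) can only fall, the cumulative hazard can only rise, which is why it is plotted as an increasing step curve.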
Survival function
For censored data the survival function can only
be estimated.
[Figure: Parotidectomy patient survival data. One line per patient (1 to 7) against years post operation (0.0 to 3.0). Legend: Alive, Dead OC, Dead OD.]
Survival function
Life table estimates
[Figure: All causes mortality. Percent surviving (0 to 100) against age (0 to 100) for NZ, Australia and Chad. Source: WHO, StatsNZ.]
Survival function
Kaplan Meier estimates
      Months    n    d   (n-d)/n   S(t)
 1     2.2     57    1    0.982    0.982
 2     6.12    51    1    0.980    0.963
 3    10.32    46    1    0.978    0.942
 4    10.78    45    1    0.978    0.921
 5    10.88    44    1    0.977    0.9
 6    13.08    41    1    0.976    0.878
 7    13.35    39    1    0.974    0.856
 8    16.11    37    1    0.973    0.833
 9    26.2     34    1    0.971    0.808
10    29.42    31    1    0.968    0.782
11    37.48    26    1    0.962    0.752
12    45.86    23    1    0.957    0.719
13    59.08    19    1    0.947    0.682
14    65.33    14    1    0.929    0.633
Survival function
Kaplan Meier estimates
1. Order data by time to event (death)
2. Number at risk of event is number surviving less number censored.
3. Estimate of probability of surviving to next event
4. Multiply probabilities to estimate survival
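The four steps can be sketched in a few lines of Python. This is an illustration rather than the software's implementation: it starts from the n (number at risk) and d (deaths) columns of the table, so steps 1 and 2 are taken as already done, and the variable names are assumptions.

```python
from itertools import accumulate
from operator import mul

# n and d columns from the Kaplan Meier table, ordered by time to event.
n_at_risk = [57, 51, 46, 45, 44, 41, 39, 37, 34, 31, 26, 23, 19, 14]
deaths = [1] * len(n_at_risk)

# Step 3: probability of surviving each event time, given survival so far.
conditional = [(n - d) / n for n, d in zip(n_at_risk, deaths)]

# Step 4: multiply the conditional probabilities to estimate S(t).
survival = list(accumulate(conditional, mul))

for p, s in zip(conditional, survival):
    print(f"(n-d)/n = {p:.3f}   S(t) = {s:.3f}")
```

The printed values should agree with the (n-d)/n and S(t) columns of the table to within rounding.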
Kaplan Meier plot
[Figure: SCCMP. Kaplan Meier estimate of the survivor function (0.0 to 1.0) against months (0 to 120).]
Standard errors and 95% CIs are calculated by most software (SPSS, R, SAS).
Usually use Greenwood’s or Tsiatis’ formula, software dependent.
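For orientation, Greenwood's formula estimates the variance of S(t) as S(t)² times the sum of d/(n(n-d)) over the event times up to t. The sketch below applies it to the n and d columns of the Kaplan Meier table and builds a plain S(t) ± 1.96 SE interval. The interval form is an assumption for illustration; some packages use a transformed interval by default, so their limits may differ.

```python
import math

# n and d columns from the Kaplan Meier table in this deck.
n_at_risk = [57, 51, 46, 45, 44, 41, 39, 37, 34, 31, 26, 23, 19, 14]
deaths = [1] * len(n_at_risk)

survival, std_error, ci_95 = [], [], []
s, greenwood_sum = 1.0, 0.0
for n, d in zip(n_at_risk, deaths):
    s *= (n - d) / n                      # Kaplan Meier product
    greenwood_sum += d / (n * (n - d))    # running Greenwood sum
    se = s * math.sqrt(greenwood_sum)     # standard error of S(t)
    survival.append(s)
    std_error.append(se)
    # Plain interval, clipped to the valid range for a probability.
    ci_95.append((max(0.0, s - 1.96 * se), min(1.0, s + 1.96 * se)))
```

The standard error grows along the curve as fewer patients remain at risk, which is why the confidence band widens at later times.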
Cumulative Hazard
[Figure: SCCMP. Cumulative hazard function (0.0 to 0.4) against months (0 to 120).]
Summary statistics
1. Median survival: time when S(t) = 0.5
• Must have enough data
2. Mean survival: area under the survival
curve
3. 5 year survival is survival rate at 5 years
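These three summaries can be read off a Kaplan Meier step function. The sketch below uses the Months and S(t) columns of the table earlier in this deck; the function names are illustrative, and the mean is restricted to a chosen horizon (60 months here, an assumption) because the curve does not reach zero.

```python
months = [2.2, 6.12, 10.32, 10.78, 10.88, 13.08, 13.35,
          16.11, 26.2, 29.42, 37.48, 45.86, 59.08, 65.33]
survival = [0.982, 0.963, 0.942, 0.921, 0.9, 0.878, 0.856,
            0.833, 0.808, 0.782, 0.752, 0.719, 0.682, 0.633]


def survival_at(t, times, surv):
    """Value of the step function at time t (1.0 before the first event)."""
    value = 1.0
    for ti, si in zip(times, surv):
        if ti > t:
            break
        value = si
    return value


def median_survival(times, surv):
    """First event time with S(t) <= 0.5, or None if the curve stays above 0.5."""
    for ti, si in zip(times, surv):
        if si <= 0.5:
            return ti
    return None


def restricted_mean(times, surv, horizon):
    """Area under the step function from 0 to the horizon."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for ti, si in zip(times, surv):
        if ti >= horizon:
            break
        area += prev_s * (ti - prev_t)
        prev_t, prev_s = ti, si
    return area + prev_s * (horizon - prev_t)


median = median_survival(months, survival)
five_year = survival_at(60, months, survival)
mean_to_60 = restricted_mean(months, survival, 60)
```

With these data the curve never falls to 0.5, so the median is not estimable, which illustrates the "must have enough data" caveat.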
Kaplan Meier estimate
KM and life tables are non-parametric methods: no assumptions are made about the distribution of the survival times.
Parametric methods assume a distribution, typically exponential or Weibull. They are more powerful but can be sensitive to getting the distribution right.
Disease specific survival
[Figure: SCCMP survival. Estimated survivor function (0.0 to 1.0) against months (0 to 120) for disease specific and all causes survival.]
Comparing 2 groups
Log rank test
• Computed in SPSS, SAS, R
• Most popular
– Bland & Altman, BMJ 2004;328:1073 (1 May)
• Limitations
– No estimate of the size of the difference
– Unlikely to detect a difference when risk is not
consistent
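To show what the software computes, here is a minimal sketch of the two-group log rank test: at each event time it compares the deaths observed in one group with the number expected if both groups shared the same risk. The function names and the toy data are hypothetical, and the 1 degree of freedom p value uses the normal tail via math.erfc.

```python
import math


def chi2_p_value_1df(chi_square):
    """Upper tail probability of a chi-square statistic with 1 df."""
    return math.erfc(math.sqrt(chi_square / 2))


def logrank(times_a, events_a, times_b, events_b):
    """Two-group log rank test. Returns (chi_square, p_value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    observed_minus_expected = 0.0
    variance = 0.0
    for t in sorted({t for t, e, _ in data if e}):
        n = sum(1 for ti, _, _ in data if ti >= t)              # at risk, both groups
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, group A
        d = sum(1 for ti, e, _ in data if ti == t and e)          # deaths at t
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi_square = observed_minus_expected ** 2 / variance
    return chi_square, chi2_p_value_1df(chi_square)


# Toy data (hypothetical): times in months, True means the event was observed.
chi_square, p_value = logrank([1, 2], [True, True], [3, 4], [True, True])
```

The function returns only a test statistic and a p value, which illustrates the first limitation: it says nothing about how large the difference is.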
Immuno compromised
[Figure: SCCMP survival: Immuno Compromised. Estimated survivor function (0.0 to 1.0) against months (0 to 140) for No and Yes.]

Case Processing Summary
ICOMP     Total N   N of Events   Censored N   Censored Percent
N         53        9             44           83.0%
Y         7         5             2            28.6%
Overall   60        14            46           76.7%

Means and Medians for Survival Time
ICOMP     Mean(a) Estimate   Std. Error   Median Estimate   Std. Error
N         101.048            7.616        .                 .
Y         22.978             7.653        16.110            3.293
Overall   91.761             7.842        .                 .
a. Estimation is limited to the largest survival time if it is censored.

Overall Comparisons
                        Chi-Square   df   Sig.
Log Rank (Mantel-Cox)   19.579       1    .000
Test of equality of survival distributions for the different levels of ICOMP.
Age group
[Figure: SCCMP survival: Age group. Estimated survivor function (0.0 to 1.0) against months (0 to 140) for <75 and 75+.]
Call:
survdiff(formula = Surv(mths, Status == "DOD") ~ Age75)

            N Observed Expected (O-E)^2/E (O-E)^2/V
Age75=<75  24        7     5.63     0.332     0.557
Age75=75+  36        7     8.37     0.224     0.557

 Chisq= 0.6  on 1 degrees of freedom, p= 0.455
Facial Nerve
[Figure: SCCMP survival: Facial Nerve Preserved. Estimated survivor function (0.0 to 1.0) against months (0 to 140) for YES, PARTIAL and NO.]
Log rank p value: 0.09
Multiple independent variables
Cox proportional hazards model
• Most common model
• Linear model for the log of the hazard ratio
$$\frac{h_1(t)}{h_0(t)} = e^{B_1 Z_1 + B_2 Z_2 + \cdots}$$
• Baseline hazard unspecified
SCCMP example
CPH model:
Survival ~ Preserved + Age + ICOMP
Preserved and ICOMP categorical
Age continuous
Plot survival for patients with each of Y/N/partial nerve preservation, adjusted for age and immuno compromised status
SCCMP example - SPSS
Analyze > Survival > Cox Regression
COXREG Months
  /STATUS=Status('DEAD')
  /PATTERN BY Preserved
  /CONTRAST (Preserved)=Indicator
  /CONTRAST (ICOMP)=Indicator(1)
  /METHOD=ENTER Preserved Age ICOMP
  /PLOT SURVIVAL
  /SAVE=PRESID XBETA
  /PRINT=CI(95) CORR SUMMARY BASELINE
  /CRITERIA=PIN(.05) POUT(.10) ITERATE(20) .
SCCMP example - SPSS
Variables in the Equation
                                                            95.0% CI for Exp(B)
              B       SE      Wald     df   Sig.   Exp(B)   Lower    Upper
Preserved                     8.493    2    .014
  No          2.535   .871    8.470    1    .004   12.617   2.288    69.564
  Partial     2.091   1.110   3.549    1    .060   8.093    .919     71.279
ICOMP         3.588   .918    15.274   1    .000   36.166   5.981    218.676
Age           -.011   .028    .149     1    .700   .989     .936     1.046
Patients whose facial nerve was not preserved have 12.6 times the hazard of patients whose nerve was preserved (95% CI 2.3 to 69.6).
Preserving the facial nerve significantly reduces patients’ risk (p = 0.004 for No versus preserved, CPH model).
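The Exp(B), Wald and confidence interval columns follow directly from B and SE. The sketch below recomputes them for the coefficients in the SPSS output above, assuming a normal approximation with z = 1.96; the function and dictionary names are illustrative.

```python
import math


def hazard_ratio(b, se, z=1.96):
    """Return (Exp(B), lower 95% limit, upper 95% limit, Wald statistic)."""
    return (
        math.exp(b),            # hazard ratio
        math.exp(b - z * se),   # lower confidence limit
        math.exp(b + z * se),   # upper confidence limit
        (b / se) ** 2,          # Wald chi-square on 1 df
    )


# B and SE columns from the "Variables in the Equation" output.
coefficients = {
    "Preserved: No": (2.535, 0.871),
    "Preserved: Partial": (2.091, 1.110),
    "ICOMP": (3.588, 0.918),
    "Age": (-0.011, 0.028),
}

results = {name: hazard_ratio(b, se) for name, (b, se) in coefficients.items()}
for name, (hr, lower, upper, wald) in results.items():
    print(f"{name:20s} HR {hr:7.3f}  95% CI {lower:.3f} to {upper:.3f}  Wald {wald:.3f}")
```

An interval that includes 1, as for Partial and Age, corresponds to a coefficient that is not significant at the 5% level.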
SCCMP CPH model
[Figure: SCCMP survival: Facial nerve preserved. Estimated survivor function (0.0 to 1.0) against months (0 to 70) for YES, PARTIAL and NO, adjusted for age and immuno compromised patients.]
Next Steps:
• Check proportional hazards assumption
– Residual plots for groups
• Time dependent covariates
• More complex models
• We also didn’t do power calculations
Summary
• Survival analysis accounts for censoring in
time to event data
• Log rank test: difference in survival
between 2 groups
• Cox proportional hazard model
• More complex/powerful models available
• SPSS, R, SAS, Stata