Transcript Slide 1
Introduction to Survival Analysis
Public Health Intelligence Training Day 6: Further Statistics and Analytical Techniques 24/03/2009 Contributors: Vivian Mak & Dr Daniela Tataru
Thames Cancer Registry
http://www.tcr.org.uk
Topics
• Definitions
• Basic concepts
• Types of survival analysis
• Published survival data
• Exercise
• Discussions
Survival analysis - definitions
Definition of cases – who are they?
Definition of starting point – when?
Defined outcome – recurrence of disease, death
Survival analysis - definitions
Studying the time between entry to a study and a subsequent event.
Survival times often refer to the time from the development of a particular symptom to death.
Timeline: diagnosis to death/censor date
Survival analysis - definitions
[Figure: calendar-time axis showing the period of diagnosis, the period of follow-up and the censor date.]
How many patients are alive at censor date?
Survival analysis - censoring
• Survival data are usually censored and not normally distributed.
• Censoring occurs when incomplete information is available about the survival time of some individuals.
• Right censoring: the subject did not have the event (death) during the study period (lost to follow-up, or had not experienced the event by the end of the follow-up period).
Survival analysis - background
• There are three approaches to estimating survival:
- observed (crude) survival
- net (corrected) survival
- relative survival
Survival analysis - Observed survival
Example
n = 10 patients followed for 2 years
d = 3 deaths (failures)
probability of death/failure is d/n = 3/10 = 0.3
probability of survival is 1 − d/n = 0.7
[Diagram: 10 patients split into 3 failures (F) and 7 survivors (S).]
Survival analysis - Observed survival
• Limitations of a single follow-up interval:
- it ignores any information about when the deaths and censoring took place
- to compare the survival experiences of different cohorts it is good to have short intervals
• It is better to use a number of shorter consecutive intervals of time.
Survival analysis - Observed survival
[Diagram: of 10 patients, 3 fail (F) and 7 survive (S) the first year, with probabilities π(1) and 1−π(1); of the 7 survivors, 2 fail and 5 survive the second year, with probabilities π(2) and 1−π(2).]
- the follow-up times can be divided into consecutive time bands
- there are 3 possible outcomes:
  - failure during the first year
  - failure during the second year
  - survival for two years
- the probability of surviving 2 years is the probability of surviving the first year multiplied by the probability of surviving the second year
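The multiplication rule can be checked with a few lines of Python. This is a minimal sketch using the numbers from the diagram (10 patients, 3 failures in year 1, 2 failures among the 7 survivors in year 2). The function name `cumulative_survival` is illustrative, and the sketch assumes no censoring.

```python
def cumulative_survival(intervals):
    """Multiply conditional survival probabilities over consecutive intervals.

    `intervals` is a list of (number at start, number of failures) pairs.
    Assumes nobody is censored within an interval.
    """
    surviving = 1.0
    for at_risk, failures in intervals:
        pi = failures / at_risk   # probability of failure in this interval
        surviving *= 1.0 - pi     # probability of surviving this interval
    return surviving


# Year 1: 3 of 10 fail. Year 2: 2 of the remaining 7 fail.
two_year = cumulative_survival([(10, 3), (7, 2)])
print(round(two_year, 3))  # 0.5
```

The result equals 5/10, the proportion still alive after two years, which is what the product of the two conditional probabilities should give when there is no censoring.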
Survival analysis - Observed survival
There may be more than 1 outcome.
Time from diagnosis
Survival analysis - Observed survival
Calculation of observed (crude) survival rates (actuarial life-table method)
N = no. at start of interval; D = no. of deaths; L = no. lost to follow-up (last seen during year); Pi = probability of dying during year; 1-Pi = probability of surviving the year

Year  N   D  L  Corrected no. at risk (N-0.5L)  Pi = D/(N-0.5L)  1-Pi  Cumulative survival
0     40  7  0  40                              0.18             0.83  0.83
1     33  3  6  30                              0.10             0.90  0.74
2     24  4  3  22.5                            0.18             0.82  0.61
3     17  4  4  15                              0.27             0.73  0.45
4     9   2  3  7.5                             0.27             0.73  0.33
5     4   1  2  3                               0.33             0.67  0.22

Remember to adjust the numbers at risk.
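The life-table arithmetic can be reproduced as follows. This is a minimal sketch, assuming the counts of deaths (D) and losses to follow-up (L) from the slide; the function name `actuarial_life_table` is illustrative. Subtracting half of L reflects the usual actuarial assumption that patients lost during a year were, on average, at risk for half of it.

```python
def actuarial_life_table(n_start, deaths, lost):
    """Return one row per interval:
    (N, D, L, corrected no. at risk, P(dying), P(surviving), cumulative survival).
    """
    rows = []
    n = n_start
    cumulative = 1.0
    for d, l in zip(deaths, lost):
        at_risk = n - 0.5 * l        # corrected number at risk
        p_die = d / at_risk          # Pi
        p_survive = 1.0 - p_die      # 1 - Pi
        cumulative *= p_survive      # running product of 1 - Pi
        rows.append((n, d, l, at_risk, p_die, p_survive, cumulative))
        n = n - d - l                # number entering the next interval
    return rows


table = actuarial_life_table(40, deaths=[7, 3, 4, 4, 2, 1], lost=[0, 6, 3, 4, 3, 2])
for year, (n, d, l, at_risk, p_die, p_survive, cumulative) in enumerate(table):
    print(year, n, d, l, at_risk, f"{p_die:.3f}", f"{p_survive:.3f}", f"{cumulative:.3f}")
```

The number at the start of each interval is derived inside the loop as N minus D minus L, so only the first N has to be supplied.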
Survival analysis - Observed survival
Calculation of observed (crude) survival rates (actuarial life-table method)
Life-table survival curve
[Figure: life-table survival curve; x-axis: years (1 to 7).]
The survival probability is 1 at the start.
The curve shows the proportion surviving at any point in time.
Survival analysis - Observed survival
Survival curve using the Kaplan-Meier method (x indicates censoring times)
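For readers who want to see how such a curve is computed, here is a minimal sketch of the Kaplan-Meier product-limit estimate. The follow-up times and event indicators are hypothetical, since the slide shows only the plotted curve, and the function name `kaplan_meier` is illustrative.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate.

    times  : follow-up time for each patient
    events : 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability), one entry per death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # The curve steps down only at times when a death occurs.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times in months (not taken from the slides).
times = [2, 3, 3, 5, 8, 8, 12, 12]
events = [1, 1, 0, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Unlike the actuarial method, no fixed intervals are needed: censored patients simply leave the risk set at their censoring time.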
Survival analysis – Net/cause specific cancer survival
• Interpreted as the probability of survival from cancer in the absence of other causes of death.
• For cancer patients the mortality risk has 2 separate components: the background risk in the general population and the extra risk of death due to cancer, which is to be estimated.
• These risks are assumed to be independent of each other.
• Those certified as dying of other (non-cancer) causes are treated as censored observations.
• Such information may not be available for population-based survival estimates.
Survival analysis - Relative survival
• The relative survival rate is the ratio of the survival rate observed among the cancer patients to the survival that would have been expected if they had the same overall mortality rate as the background population.
• So, relative survival can be interpreted as the survival of cancer patients relative to, or compared with, that of the population.
• For example, if five-year survival is 40% among a group of cancer patients of whom 80% would have been expected to survive that long, then their relative survival is 40/80 = 50%.
• The 2 common types of methods used are cohort and period analysis.
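The worked example on this slide amounts to a single division. A minimal sketch, with an illustrative function name and a basic range check added as an assumption:

```python
def relative_survival(observed, expected):
    """Ratio of observed survival to expected (background population) survival."""
    if not 0 < expected <= 1:
        raise ValueError("expected survival must be in (0, 1]")
    return observed / expected


# 40% observed five-year survival, 80% expected in the general population.
print(relative_survival(0.40, 0.80))  # 0.5
```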
Survival analysis – life tables
Columns of a life table:
Age: x
Death probability: q_{x,x+t}
Number of persons alive: l_x
Number of deaths: d_{x,x+t}
Life expectancy: e_x
• Age
Complete life table: data for all single ages. Abridged life table: data for 5- or 10-year age classes.
• Death probability
This column is estimated from empirical data. It is the probability of death between the ages x and x+t for a person alive at age x. From this column we derive the entire life table.
• Survival probability
p_{x,x+t} = 1 − q_{x,x+t}
Survival analysis – life tables
• A life table describes the general mortality in a population.
• It can be constructed by age, by geographical area, by sex and by period.
• We need life tables to calculate the expected survival of a group of cancer patients, i.e. the survival they would have had if they had been subject only to the mortality rates of the general population.
• As survival = 1 − mortality, the life tables also describe the general survival of a population.
Survival analysis – life tables

Age  Sex  Period     Region        Annual probability of death  Annual probability of survival
0    1    1986-1995  South Thames  0.010639                     0.989361
1    1    1986-1995  South Thames  0.000617                     0.999383
2    1    1986-1995  South Thames  0.000373                     0.999627
66   2    1986-1995  South Thames  0.013309                     0.986691
67   2    1986-1995  South Thames  0.014702                     0.985298
68   2    1986-1995  South Thames  0.016293                     0.983707

• Expected individual 5-year survival probabilities can be obtained by multiplication of annual probabilities of survival.
• Cohort’s expected 5-year survival probability is the average of the individual expected survival probabilities.
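The two steps above (multiply annual probabilities for each patient, then average over the cohort) can be sketched as follows. The annual survival probabilities are the three values shown for ages 66 to 68 (sex 2) in the life-table excerpt. Because the excerpt covers only three ages, the sketch uses a 2-year horizon rather than 5 years, and the two patient ages are hypothetical.

```python
# Annual survival probabilities for ages 66-68 (sex 2), South Thames, 1986-1995,
# copied from the life-table excerpt on the slide.
ANNUAL_SURVIVAL = {66: 0.986691, 67: 0.985298, 68: 0.983707}


def expected_survival(age_at_diagnosis, years, annual_survival=ANNUAL_SURVIVAL):
    """Expected survival of one person: product of annual survival probabilities."""
    prob = 1.0
    for age in range(age_at_diagnosis, age_at_diagnosis + years):
        prob *= annual_survival[age]
    return prob


def cohort_expected_survival(ages, years):
    """Cohort expected survival: average of the individual expected survivals."""
    individual = [expected_survival(age, years) for age in ages]
    return sum(individual) / len(individual)


# Hypothetical cohort: two patients aged 66 and 67 at diagnosis.
two_year = cohort_expected_survival([66, 67], years=2)
print(round(two_year, 4))
```

A 5-year calculation works the same way once the life table supplies annual probabilities for five consecutive ages per patient.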
Survival analysis – cohort method
• Based on the follow-up of a defined cohort, often patients diagnosed during a specified period.
• Cohort = patients diagnosed in 1998-2000
• 5-year survival follow-up period: 1998-2005, so that each patient is followed up for 5 years.
Survival analysis – cohort method
Rows: years of follow-up. Columns: years of diagnosis.

Follow-up  1998  1999  2000  2001  2002  2003  2004  2005
1998       1
1999       1/2   1
2000       2/3   1/2   1
2001       3/4   2/3   1/2   1
2002       4/5   3/4   2/3   1/2   1
2003       5     4/5   3/4   2/3   1/2   1
2004             5     4/5   3/4   2/3   1/2   1
2005                   5     4/5   3/4   2/3   1/2   1

Period of follow-up: 1998-2005
Year of diagnosis of informative patients: 1998-2000
Survival analysis – period analysis
• Computation of survival probability is based on a selected period of follow-up.
• Informative patients:
- any cancer patient who is alive at the beginning of the follow-up period
- any cancer patient diagnosed during the follow-up period
• Interpreted as the expected (predicted) survival of patients diagnosed in 2003-2005, assuming that the conditional survival probabilities (for each year following diagnosis) remain constant at the levels observed in 2003-2005.
Survival analysis – period analysis
Rows: years of follow-up. Columns: years of diagnosis.

Follow-up  1998  1999  2000  2001  2002  2003  2004  2005
1998       1
1999       1/2   1
2000       2/3   1/2   1
2001       3/4   2/3   1/2   1
2002       4/5   3/4   2/3   1/2   1
2003       5     4/5   3/4   2/3   1/2   1
2004             5     4/5   3/4   2/3   1/2   1
2005                   5     4/5   3/4   2/3   1/2   1

Period of follow-up: 2003-2005
Year of diagnosis of informative patients: 1998-2005
Survival analysis – cohort vs period
Cohort
• It describes the survival experience of patients diagnosed a long time ago.
• It fails to show the survival effects of recent changes in cancer care.
Period
• It includes more data and, more importantly, more recent data.
• It anticipates changes in survival probabilities that cohort analysis could reveal only much later.
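The difference between the two methods comes down to which diagnosis-year and follow-up-year cells are used. The sketch below is one way to enumerate them, assuming the cell labels in the diagrams denote the year(s) since diagnosis (1, 1/2, 2/3, ..., 5). The function names are illustrative.

```python
def cell_label(diagnosis_year, follow_up_year, max_years=5):
    """Year(s) since diagnosis covered by one cell, or None if outside the grid."""
    k = follow_up_year - diagnosis_year
    if k < 0 or k > max_years:
        return None
    if k == 0:
        return "1"                 # calendar year of diagnosis: first year only
    if k == max_years:
        return str(max_years)      # only the final year of follow-up remains
    return f"{k}/{k + 1}"          # straddles two years since diagnosis


def informative_cells(diagnosis_years, follow_up_years, max_years=5):
    """Map (diagnosis year, follow-up year) to its label for all cells in use."""
    cells = {}
    for d in diagnosis_years:
        for f in follow_up_years:
            label = cell_label(d, f, max_years)
            if label is not None:
                cells[(d, f)] = label
    return cells


# Cohort method: diagnosed 1998-2000, followed up 1998-2005.
cohort = informative_cells(range(1998, 2001), range(1998, 2006))
# Period analysis: diagnosed 1998-2005, follow-up restricted to 2003-2005.
period = informative_cells(range(1998, 2006), range(2003, 2006))

print(sorted(set(f for _, f in cohort)))  # follow-up years used by the cohort method
print(sorted(set(f for _, f in period)))  # follow-up years used by period analysis
```

The cohort selection runs down three diagnosis columns, whereas the period selection takes the three most recent follow-up rows, which is why period estimates reflect more recent experience.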
Survival analysis – sources of cancer survival data
Survival analysis – sources of data
Survival analysis – age-standardisation
• The calculation of the expected survival probability adjusts only for the age-specific mortality from other causes.
• If an overall (all-ages) estimate of relative survival for cancer patients is used to compare survival rates for two populations with very different age structures, the results may be misleading.
• Age-adjustment is also important for the analysis of time trends in relative survival because, if survival varies markedly with age, a change in the age distribution of cancer patients over time can produce spurious survival trends (or obscure real trends).
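A small numerical illustration of the second point, using entirely hypothetical figures. It assumes direct standardisation (a weighted average of age-specific survival using a common set of weights), which the slide does not name, and it treats the all-ages estimate as a case-weighted average of the age-specific estimates, which is a simplification.

```python
def weighted_survival(age_specific_survival, weights):
    """Weighted average of age-specific survival estimates."""
    total = sum(weights.values())
    return sum(age_specific_survival[g] * weights[g] for g in weights) / total


# All numbers below are hypothetical and chosen only for illustration.
survival = {"15-64": 0.70, "65+": 0.40}   # same age-specific survival in both populations
cases_a = {"15-64": 800, "65+": 200}      # population A: mostly younger patients
cases_b = {"15-64": 300, "65+": 700}      # population B: mostly older patients
standard = {"15-64": 0.5, "65+": 0.5}     # assumed common standard weights

crude_a = weighted_survival(survival, cases_a)
crude_b = weighted_survival(survival, cases_b)
adjusted_a = weighted_survival(survival, standard)
adjusted_b = weighted_survival(survival, standard)

print(round(crude_a, 2), round(crude_b, 2))        # 0.64 0.49
print(round(adjusted_a, 2), round(adjusted_b, 2))  # 0.55 0.55
```

Although the age-specific survival is identical, the all-ages figures differ because of the age mix alone; applying the same weights to both populations removes that artefact.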
Survival analysis – period analysis
Survival analysis – interpretation
• Care is needed in interpreting survival patterns, as screening can introduce a number of biases.
• These include lead-time bias, where earlier diagnosis lengthens the measured survival time without improving true survival; volunteer bias, where those being screened tend to be more health conscious and hence at lower risk; and a tendency for tumours detected by screening to be slower growing and therefore to have a better prognosis.
• Increases in breast cancer survival can be partly attributed to early detection through screening.
• Similarly, improved prostate cancer survival can be partly attributed to the detection of more latent, earlier, slow-growing tumours.
• Small numbers of cases also call for careful interpretation.
Survival analysis - Exercise
Question 1: What is the probability of surviving 2 years?
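[Diagram: of 10 patients, 3 fail (F) and 7 survive (S) the first year; of the 7 survivors, 2 fail (F) and 5 survive (S) the second year.]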
Survival analysis - Exercise
Question 2: At the next interval, there were 3 deaths due to breast cancer and 1 death due to a road traffic accident. 3 patients left the study. Calculate the cause-specific cumulative survival at the end of the second year.
N = no. at start of interval; D = no. of deaths; L = no. lost to follow-up; Pi = probability of dying during year; 1-Pi = probability of surviving the year

Time  N   D  L  Corrected no. at risk (N-0.5L)  Pi = D/(N-0.5L)  1-Pi   Cumulative survival
0-    40  7  0  40                              0.175            0.825  0.825
1-    33  3  6  30                              0.100            0.900  0.743
2-    24

Survival analysis - Exercise
Question 3: We want to analyse the 1 year relative survival of a group of patients diagnosed with cancer between 2003 and 2005. Fill in the number of informative years in these cells if we are using the cohort method.
Rows: years of follow-up. Columns: years of diagnosis.

Follow-up  2003  2004  2005
2003
2004
2005
2006

Survival analysis - Exercise
Question 4: Which of these cancer sites showed the largest improvement in survival?
Survival analysis - Exercise
Question 5: For this particular cancer site, which set of data would you find more useful: one-year or five-year relative survival?
Survival analysis – Thank you
Office for National Statistics
Colleagues at East Midlands PHO
Colleagues at Thames Cancer Registry
Survival analysis - Exercise
Question 1: What is the probability of surviving 2 years?
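[Diagram: of 10 patients, 3 fail (F) and 7 survive (S) the first year; of the 7 survivors, 2 fail (F) and 5 survive (S) the second year.]
Answer: (1 − 3/10) × (1 − 2/7) = 0.5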
Survival analysis - Exercise
Question 2: At the next interval, there were 3 deaths due to breast cancer and 1 death due to a road traffic accident. 3 patients left the study. Calculate the cause-specific cumulative survival at the end of the second year.
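N = no. at start of interval; D = no. of deaths; L = no. lost to follow-up; Pi = probability of dying during year; 1-Pi = probability of surviving the year

Time  N   D  L  Corrected no. at risk (N-0.5L)  Pi = D/(N-0.5L)  1-Pi   Cumulative survival
0-    40  7  0  40                              0.175            0.825  0.825
1-    33  3  6  30                              0.100            0.900  0.743
2-    24  3  4  22                              0.136            0.864  0.641

Adjust the no. of deaths and the no. lost to follow-up: the road traffic accident death is not a breast cancer death, so it is counted with those lost to follow-up (D = 3, L = 3 + 1 = 4). The cumulative survival is 0.825 × 0.900 × 0.864 = 0.641.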
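The adjustment can be checked with a short calculation. This is a minimal sketch: it assumes that all deaths in the first two intervals were due to the cancer (the table does not split them by cause), and the function name `cause_specific_interval` is illustrative.

```python
def cause_specific_interval(n_start, cause_deaths, other_deaths, withdrawn):
    """One life-table interval for cause-specific survival.

    Deaths from other causes are treated as censored, so they are added to
    the patients lost to follow-up rather than to the deaths.
    """
    censored = other_deaths + withdrawn
    at_risk = n_start - 0.5 * censored   # corrected number at risk
    p_die = cause_deaths / at_risk       # Pi, counting cancer deaths only
    return at_risk, p_die, 1.0 - p_die


# (N, cancer deaths, other-cause deaths, patients who left the study)
# Assumption: every death in intervals 0- and 1- was a cancer death.
intervals = [(40, 7, 0, 0), (33, 3, 0, 6), (24, 3, 1, 3)]

cumulative = 1.0
for n, cancer, other, left in intervals:
    at_risk, p_die, p_survive = cause_specific_interval(n, cancer, other, left)
    cumulative *= p_survive

print(round(cumulative, 3))  # 0.641
```

Counting the road traffic accident as a death would instead give D = 4 and L = 3 for the third interval, which is the observed (crude) rather than the cause-specific calculation.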
Survival analysis - Exercise
Question 3: We want to analyse the 1 year relative survival of a group of patients diagnosed with cancer between 2003 and 2005. Fill in the number of informative years in these cells if we are using the cohort method.
Rows: years of follow-up. Columns: years of diagnosis.

Follow-up  2003  2004  2005
2003       1
2004       1/2   1
2005             1/2   1
2006                   1/2

Survival analysis - Exercise
Question 4: Which of these cancer sites showed the largest improvement in survival?
Survival analysis - Exercise
Question 5: For this particular cancer site, which set of data would you find more useful: one-year or five-year relative survival?
Probably the one-year relative survival.
Survival analysis – contact details
020 7378 7688