Comparing Kaplan Meier Estimates at a Fixed Time

Comparing Kaplan Meier
Estimates at a Fixed Time
Timothy Costigan
Kyoungah See
1
Outline
♦ Motivation of Topic
♦ Review Practice in Therapeutic Areas
♦ Issues in Implementation
• Type 1 error inflation
– Transformations
– Estimation of Variance
♦ Simulation Study
2
Objective
♦ Focus on cumulative proportion of patients
experiencing an outcome at a specific point in
time where some patients are censored
♦ Compare Kaplan Meier Estimates at a fixed time
rather than using a technique based on the
pattern of the curves over time such as the
logrank test or the Cox proportional hazards
model
3
Scenarios for Use
♦ Know that one treatment has an early advantage
and late disadvantage
• Invasive vs. less invasive treatment
♦ Primary outcome is 30 Day mortality with no
interest in earlier differences
• Sepsis
♦ Intermediate term study (6 months, 12 months)
with uniform follow-up of patients
• 6 month CV outcome studies
• Fractures in osteoporosis
• Revision surgery in fracture healing studies
4
Potential for Late Appearing Differences
♦ Logrank test is most powerful when the treatment
effect is constant throughout the study and gives
equal weight to early and late appearing differences,
so late appearing differences are not its best scenario
♦ Wilcoxon test gives more weight to early events
than to late events and should not be used in
this situation
♦ Comparing KM estimates at a fixed point is a
good choice here, especially when follow-up of
all subjects is equal
5
Common Methods in Sepsis Studies
♦ Landmark analysis – Analysis population is
patients who survive to the 30 day landmark or
die within 30 days; censored patients are excluded
from the analysis and the outcome is treated as binary
• Sensitivity analyses assume that all censored
patients die, or that all censored patients in the
experimental arm die
♦ Clearly the comparison of Kaplan Meier
estimates at 30 days is a better method
6
Common Methods for Osteoporosis with
Common Follow-up
♦ M1: Analyze completers, similar to sepsis
• Potential bias
♦ M2: Use jackknife to analyze proportions at endpoint
subject to dropout
• Comparing KM estimates is a more fundamental way
to control for censoring
♦ M3: Use the log rank test
• Usually the pattern of emerging differences is not of
interest
♦ We advocate comparing KM estimates at endpoint
instead
7
Common Methods for 6 Month CV
Outcome Studies
♦ Use the same method as CV outcome studies
where follow-up varies (typically 6-15 months, or 6-30 months)
• Logrank test, hazard ratio estimate from Cox model
with 95% CI, KM estimates at endpoint
♦ The case for using a method that is based on the
overall pattern is less compelling with equal follow-up
♦ We understand the desire to use the same
technique as for variable follow-up, but there are
advantages to using a method less dependent on
the pattern of the curves
8
Kaplan Meier Estimates (1)
♦ Estimates the cumulative proportion of patients
not experiencing an event over time
♦ Subtraction yields cumulative proportion
experiencing an event over time
♦ Incorporates censoring (administrative due to
unequal length of follow-up and drop-outs)
♦ When there is no censoring the KM estimate at
the endpoint is the proportion of patients not
experiencing the event
9
Kaplan Meier Estimates (2)
♦ In a 7 month study patients who are lost to follow
up at 1 month contribute to estimation of the KM
curve through 1 month only
♦ In a 7 month study patients who are lost to follow
up at 6 months contribute to estimation of the
KM curve through 6 months only
10
Kaplan Meier Estimate Case 1
♦ 1000 patients followed 7 days with no censoring
♦ 50 patients experience the event on day 1, 40 on day
2, 30 on each of days 3-7
♦ KM estimates through each day
• Day:          1     2     3     4     5     6     7
• KM estimate:  0.95  0.91  0.88  0.85  0.82  0.79  0.76
• 1 - KM:       0.05  0.09  0.12  0.15  0.18  0.21  0.24
• KM at day 7 = (950/1000) x (910/950) x (880/910) x (850/880) x (820/850) x (790/820) x (760/790) = 0.76
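The Case 1 arithmetic can be reproduced with a few lines of Python. This is a minimal sketch, not the authors' code: the function name `km_curve` and its arguments are illustrative assumptions, and it treats each day as one event time with any censoring applied after that day's events.

```python
def km_curve(n_start, events, censored=None):
    """Kaplan-Meier estimates after each period.

    events[i]   : number of events in period i
    censored[i] : number censored after the events of period i
                  (assumed convention; defaults to no censoring)
    """
    if censored is None:
        censored = [0] * len(events)
    n_at_risk = n_start
    surv = 1.0
    curve = []
    for d, c in zip(events, censored):
        surv *= (n_at_risk - d) / n_at_risk  # conditional survival in this period
        curve.append(surv)
        n_at_risk -= d + c                   # remove events and censored patients
    return curve


# Case 1: 1000 patients, 50 events on day 1, 40 on day 2, 30 on each of days 3-7
events = [50, 40, 30, 30, 30, 30, 30]
surv = km_curve(1000, events)
for day, s in enumerate(surv, start=1):
    print(f"day {day}: KM = {s:.2f}, cumulative event proportion = {1 - s:.2f}")
```

With no censoring the product telescopes, so the estimate at day 7 is simply 760/1000.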
11
Kaplan Meier Estimate Case 2
10% Dropout Between Month 6 and 7
♦ KM with no censoring
• Month:        1     2     3     4     5     6     7
• KM estimate:  0.95  0.91  0.88  0.85  0.82  0.79  0.76
• 1 - KM at Month 7: 0.240
♦ KM with 10% censoring at Month 6
• Month:        1     2     3     4     5     6     7
• KM estimate:  0.95  0.91  0.88  0.85  0.82  0.79  0.756
• 1 - KM at Month 7: 0.244
(final factor is 660/690 instead of 760/790 due to censoring)
♦ Naïve with 10% Censoring at Month 6
• 660/900 = 73.33% event free; 240/900 = 26.67% with event
• Worst case at Month 7: 340/1000 = 34.00%
• Best case at Month 7: 240/1000 = 24.00%
12
Kaplan Meier Estimate Case 3
10% Dropout Between Month 1 and 2
♦ KM with no censoring
• Month:        1     2     3     4     5     6     7
• KM estimate:  0.95  0.91  0.88  0.85  0.82  0.79  0.76 (0.24)
♦ KM with 10% censoring at Month 6.1
• Month:        1     2     3     4     5     6     7
• KM estimate:  0.95  0.91  0.88  0.85  0.82  0.79  0.756
• 1 - KM at Month 7: 0.244
♦ KM with 10% censoring at Month 1.1
• Month:        1     2     3     4     5     6     7
• KM estimate:  0.95  0.905 0.87  0.84  0.805 0.77  0.737
• 1 - KM at Month 7: 0.263
• Naïve is 0.267 - less difference with early censoring
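The effect of early versus late dropout can be checked numerically. The sketch below is illustrative only: `km_final` is a hypothetical helper, and it follows the slides' simplifying assumption that the same monthly event counts (50, 40, 30, ...) are observed whether 100 patients drop out just after Month 1 or just after Month 6.

```python
def km_final(n_start, events, censored_after):
    """KM estimate at the last period; censored_after[i] patients leave
    the risk set after the events of period i (assumed convention)."""
    n_at_risk = n_start
    surv = 1.0
    for d, c in zip(events, censored_after):
        surv *= (n_at_risk - d) / n_at_risk
        n_at_risk -= d + c
    return surv


events = [50, 40, 30, 30, 30, 30, 30]

no_cens = km_final(1000, events, [0, 0, 0, 0, 0, 0, 0])
late = km_final(1000, events, [0, 0, 0, 0, 0, 100, 0])   # dropout at Month 6.1
early = km_final(1000, events, [100, 0, 0, 0, 0, 0, 0])  # dropout at Month 1.1
naive = 240 / 900                                         # completers-only event proportion

print(f"no censoring   : 1 - KM = {1 - no_cens:.3f}")
print(f"late censoring : 1 - KM = {1 - late:.3f}")
print(f"early censoring: 1 - KM = {1 - early:.3f}")
print(f"naive          : {naive:.3f}")
```

The earlier the dropout, the closer the KM event proportion moves toward the naïve completers-only proportion, which is the point made on the Case 3 slide.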
13
KM curves are step functions
14
Log Rank Test 1
♦ Compares cumulative proportion with the event
through the end of study
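♦ $Z = \sum_i [d_1(i) - e_1(i)] \,/\, \sqrt{\sum_i v(i)}$, where $d_1(i)$ is the observed and $e_1(i)$ the expected number of events in one group at event time $i$, and $v(i)$ is the variance of $d_1(i)$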
♦ Best used when treatment effect is constant over
time
♦ Commonly used in CV outcome studies
♦ Focus on relative risk of an event through the
study, not on time to event
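For comparison with the fixed-time approach, the standard two-sample log-rank statistic (observed minus expected events in one group, divided by the square root of the summed hypergeometric variances) can be sketched as follows. The function name `logrank_z` and the small data set are made up for illustration; this is the textbook formula, not a reproduction of PROC LIFETEST.

```python
import math


def logrank_z(times1, events1, times2, events2):
    """Two-sample log-rank Z statistic based on group 1.

    times*  : follow-up times
    events* : 1 if the event was observed at that time, 0 if censored
    """
    data = [(t, e, 1) for t, e in zip(times1, events1)]
    data += [(t, e, 2) for t, e in zip(times2, events2)]
    event_times = sorted({t for t, e, _ in data if e})

    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n1 = sum(1 for u, _, g in data if g == 1 and u >= t)  # at risk, group 1
        n2 = sum(1 for u, _, g in data if g == 2 and u >= t)  # at risk, group 2
        d1 = sum(1 for u, e, g in data if g == 1 and u == t and e)
        d2 = sum(1 for u, e, g in data if g == 2 and u == t and e)
        n, d = n1 + n2, d1 + d2
        obs_minus_exp += d1 - d * n1 / n                       # observed - expected
        if n > 1:
            variance += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    return obs_minus_exp / math.sqrt(variance)


# Hypothetical data: group 1 has fewer and later events than group 2
t1, e1 = [2, 4, 6, 8, 10], [1, 1, 0, 1, 0]
t2, e2 = [1, 3, 5, 7, 9], [1, 1, 1, 1, 1]
z12 = logrank_z(t1, e1, t2, e2)
z21 = logrank_z(t2, e2, t1, e1)
print(f"Z (group 1 vs 2) = {z12:.3f}, Z (group 2 vs 1) = {z21:.3f}")
```

Every event time contributes to the statistic, which is why the test reflects the pattern of the curves over the whole study rather than the difference at one time point.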
Log Rank Test 2
♦ “The log-rank test can be derived as the score
test for the Cox proportional hazards model
comparing two groups (with one binary covariate
for treatment). It is therefore asymptotically
equivalent to the likelihood ratio test statistic
based from that model.”
• Wikipedia: Log-rank test
♦ The Wilcoxon test is a modification of the logrank test which gives more weight to early
events
16
Comparing KM Estimates at Endpoint
♦ One obtains KM estimates at endpoint for each
treatment
♦ Test statistic is the difference in KM estimates
♦ One calculates the SD of each KM estimate by
Greenwood’s formula; these are used to calculate
the SD of the test statistic
♦ Analysis of binary data adjusted for censoring
♦ If there is no censoring and all events occur at one
time point, Greenwood’s estimate is the same as the
SD for a proportion based on the binomial
distribution
♦ $\sigma^2 = \sum_i d(i) / \{ n(i)\,[n(i) - d(i)] \}$ is the multiplier in Greenwood’s formula
17
KM estimates, Greenwood’s formula
$$\hat S_{KM}(t) = \prod_{t_i < t} \frac{n_i - d_i}{n_i}$$

$$\hat V_G\big(\hat S_{KM}(t)\big) = \hat S_{KM}(t)^2 \sum_{t_i < t} \frac{d_i}{n_i (n_i - d_i)} = \hat S_{KM}(t)^2 \, \hat\sigma^2$$
18
Abbreviated Notation
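$$S_1 = \prod_{t_{1i} < t} \frac{n_{1i} - d_{1i}}{n_{1i}}, \qquad S_2 = \prod_{t_{2i} < t} \frac{n_{2i} - d_{2i}}{n_{2i}}$$

$$V_G(S_1) = S_1^2 \sum_{t_{1i} < t} \frac{d_{1i}}{n_{1i}(n_{1i} - d_{1i})} = S_1^2 \sigma_1^2$$

$$V_G(S_2) = S_2^2 \sum_{t_{2i} < t} \frac{d_{2i}}{n_{2i}(n_{2i} - d_{2i})} = S_2^2 \sigma_2^2$$

$$Z = \frac{S_1(t) - S_2(t)}{\sqrt{V_G(S_1) + V_G(S_2)}}$$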
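A minimal Python sketch of this test statistic follows. The helper names `km_greenwood` and `fixed_time_z` and the risk-set counts are illustrative assumptions; each group is summarised by the numbers at risk and the numbers of events at its event times up to the fixed time.

```python
import math


def km_greenwood(n_at_risk, n_events):
    """Return (KM estimate, Greenwood variance) from per-event-time counts."""
    surv = 1.0
    sigma2 = 0.0
    for n, d in zip(n_at_risk, n_events):
        surv *= (n - d) / n
        sigma2 += d / (n * (n - d))   # Greenwood multiplier term
    return surv, surv * surv * sigma2


def fixed_time_z(group1, group2):
    """Z statistic for the difference in KM estimates at a fixed time."""
    s1, v1 = km_greenwood(*group1)
    s2, v2 = km_greenwood(*group2)
    return (s1 - s2) / math.sqrt(v1 + v2)


# Hypothetical counts: (numbers at risk, numbers of events) at three event times
control = ([1000, 930, 860], [60, 60, 60])
experimental = ([1000, 940, 880], [50, 50, 45])
z = fixed_time_z(experimental, control)
print(f"Z = {z:.3f}")

# With one event time and no censoring the variance is the binomial p(1-p)/N
s, v = km_greenwood([1000], [240])
print(f"S = {s:.2f}, Greenwood variance = {v:.7f}, binomial = {0.24 * 0.76 / 1000:.7f}")
```

Only the estimates and variances at the fixed time enter the statistic, so the shape of the curves before that time does not affect the comparison.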
19
Confidence Intervals for Difference in KM
estimates
♦ 95% confidence intervals for the difference in KM
estimates based on the quadratic form
$$Z = \frac{S_1(t) - S_2(t)}{\sqrt{V_G(S_1) + V_G(S_2)}}$$
are very close to CIs for differences in proportions based on
the binomial distribution from naïve estimators from landmark
analyses
• CIs for differences are closer than the estimates
themselves
♦ Klein and Moeschberger (2005) Section 7.8
♦ For the situations of interest to us (large phase 3
studies) CIs will be above zero
• If not, alternative solutions exist
20
Issues with Standard Method
♦ Inflated type 1 error with small sample size
• Common in Oncology
• Klein et al (Stat Med 2007) propose
transformations and show via simulation that
issue is mitigated
– log, log(-log), arcsine (√ )
♦ Inflated type 1 error when proportion is close to
0 or 1 (<0.2, >0.8) in a simulation study with
n=100
• Barber and Jennison (1998) discuss use of
alternative estimators of the variance to mitigate
issue
21
Oncology
♦ Outcome is survival and there is interest in the
patterns of the curves which often approach 0
♦ Sample size is sometimes small
♦ Transformations sometimes used to protect type
1 error when comparing KM estimates
♦ log(-log) and arcsine (√ ) transformations
preserve type 1 error with sample sizes as small
as 25 and up to 50% censoring (Klein et al
2007)
22
Variance by the Delta Method
♦ The variance of $\Phi(S_1(t))$ is approximately
$V_G(S_1(t)) \, [\Phi'(S_1(t))]^2$
23
Log (-log) Transformation
$$X^2 = \frac{[\log(-\log S_1(t)) - \log(-\log S_2(t))]^2}{\sigma_1^2 / [\log S_1(t)]^2 + \sigma_2^2 / [\log S_2(t)]^2}$$
CONFTYPE=LOGLOG in PROC LIFETEST
24
Arcsine (√) Transformation
$$X^2 = \frac{[\arcsin\sqrt{S_1(t)} - \arcsin\sqrt{S_2(t)}]^2}{\nu_1(t) + \nu_2(t)}$$
where
$$\nu_1(t) = \frac{S_1(t)\,\sigma_1^2}{4\,(1 - S_1(t))}, \qquad \nu_2(t) = \frac{S_2(t)\,\sigma_2^2}{4\,(1 - S_2(t))}$$
CONFTYPE=ASINSQRT in PROC LIFETEST
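The two transformed statistics can be compared with the untransformed one in a short sketch. The function names and the input values are assumptions chosen for illustration: survival estimates of 0.84 and 0.80 with 1843 patients per group and no censoring, in which case the Greenwood multiplier is (1 - S)/(N S).

```python
import math


def chi2_untransformed(s1, sig2_1, s2, sig2_2):
    return (s1 - s2) ** 2 / (s1 ** 2 * sig2_1 + s2 ** 2 * sig2_2)


def chi2_loglog(s1, sig2_1, s2, sig2_2):
    num = (math.log(-math.log(s1)) - math.log(-math.log(s2))) ** 2
    den = sig2_1 / math.log(s1) ** 2 + sig2_2 / math.log(s2) ** 2
    return num / den


def chi2_arcsine(s1, sig2_1, s2, sig2_2):
    num = (math.asin(math.sqrt(s1)) - math.asin(math.sqrt(s2))) ** 2
    nu1 = s1 * sig2_1 / (4 * (1 - s1))   # delta-method variance, group 1
    nu2 = s2 * sig2_2 / (4 * (1 - s2))   # delta-method variance, group 2
    return num / (nu1 + nu2)


# Hypothetical inputs: no censoring, so sigma^2 = (1 - S) / (N * S)
n_per_group = 1843
s1, s2 = 0.84, 0.80
sig2_1 = (1 - s1) / (n_per_group * s1)
sig2_2 = (1 - s2) / (n_per_group * s2)

x_raw = chi2_untransformed(s1, sig2_1, s2, sig2_2)
x_loglog = chi2_loglog(s1, sig2_1, s2, sig2_2)
x_arcsine = chi2_arcsine(s1, sig2_1, s2, sig2_2)
print(f"untransformed: {x_raw:.2f}, log(-log): {x_loglog:.2f}, arcsine: {x_arcsine:.2f}")
```

Each statistic is referred to a chi-square distribution with 1 degree of freedom. At this sample size the three values are close, which is in line with the later simulation finding that Greenwood and the two transformations behave alike in large studies.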
25
Properties of the arcsin(√)
♦ Variance stabilizing transformation for a
proportion
♦ Typically used when the unit of measurement is
a proportion based on summaries of daily diaries
• Fleiss (1986, Wiley)
• Linear in middle, large effect on tails
26
Properties of Greenwood's Variance
Estimator
♦ Good censoring conditioning properties
♦ When all events occur at the same time then
Greenwood’s formula reduces to the standard
variance estimator of a proportion:
$$\hat V_G\big(\hat S_{KM}(t)\big) = \hat S_{KM}(t)^2 \sum_{t_i < t} \frac{d_i}{n_i (n_i - d_i)}$$
reduces to $(D/N)\,[(N-D)/N]\,/\,N = p(1-p)/N$
• Reduces to Simon and Lee’s modification of Peto’s
estimator of variance
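The reduction to the binomial variance can be written out in one line. This is a worked sketch under the stated condition only: a single event time with $N$ patients at risk and $D$ events, so that $\hat S_{KM}(t) = (N-D)/N$ and $p = D/N$.

$$\hat V_G\big(\hat S_{KM}(t)\big) = \left(\frac{N-D}{N}\right)^2 \frac{D}{N\,(N-D)} = \frac{D\,(N-D)}{N^3} = \frac{1}{N}\cdot\frac{D}{N}\cdot\frac{N-D}{N} = \frac{p\,(1-p)}{N}$$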
27
Peto’s Variance Estimate
♦ Peto’s estimator of Variance:
Assume $t_k < t < t_{k+1}$
$$V_P\big(\hat S_{KM}(t)\big) = \hat S_{KM}(t)\,\big(1 - \hat S_{KM}(t)\big) / n_k$$
• Proposed when proportion is close to 0 or 1
♦ Barber and Jennison (1998) show that both Peto’s
estimate and Greenwood’s estimate result in severe
type 1 error inflation and asymmetry in the tails in a
simulation study of 100 observations
♦ Does not depend on censoring pattern
♦ Standard estimator of a proportion when nk is
replaced by nk+1 (Simon and Lee, 1982)
28
Other Variance Estimators
♦ Thomas and Grunkemeier (1975) JASA –
Constrained estimator – One sample under Ho
♦ Rothman (1978) J Chronic Disease – Adjusted
effective sample size estimator incorporating
constrained and Peto’s estimate
♦ Jennison and Turnbull’s (1998) simulation
shows less asymmetry and better type 1 error
protection of constrained and adjusted sample
size estimators
♦ Zhao (1996) Stat in Med. – Homogenetic estimate
29
Bootstrap Methods
♦ Efron (1981) JASA – Censored data and the
bootstrap
♦ Efron (1987) JASA - Better bootstrap confidence
intervals
♦ Akritas (1986)Biometrics – Bootstrapping the
Kaplan Meier estimator
30
Simulation Study 1: Six Analysis
Techniques
♦ Compare KMs with Greenwood variance
♦ Compare KMs with Peto variance
♦ Log (-log) transformation of KMs
♦ Arcsine square root transformation of KMs
♦ Logrank test
♦ Wilcoxon test
♦ TEST=LOGRANK, WILCOXON, PETO in PROC LIFETEST
31
Simulation Study 1: Event Patterns 1
Case  Event Accumulation  Treatment Effect
1     Consistent          Consistent
2     Late                Consistent
3     Early               Consistent
4     Consistent          Late only
5     Consistent          Early only
6     Consistent          Consistent, half expected
32
Simulation Study 1: Event Patterns 2
Case  Event Accumulation  Treatment Effect
7     Late                Late only
8     Late                Early only
9     Early               Early only
10    Early               Late only
33
Simulation Study 1 (a,b,c): Parameters
♦ Generate piecewise exponential distributions over 0-3 and 3-6 months for selected CDFs in SAS
♦ Control event rate at 6 months is 20%: planned and
attained
♦ 10% (5%,20%) censoring – exponential distribution:
planned and attained
♦ Sample size for 80% power for a 20% risk reduction (risk ratio 0.80):
N=1843 (1796, 1948) per group for 10% (5%, 20%)
censoring: n-Query Advisor 2.0
♦ Parameters: type1 error, power
♦ 1000 replicates – limitation for determining type 1
error
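The data-generation step can be sketched in Python (the slides used SAS). The sketch assumes a constant hazard on 0-3 months and another on 3-6 months, chosen so the CDF hits the target values F(3) and F(6); the function names and the seed are illustrative, and the exponential censoring step is omitted because its rate is not given here.

```python
import math
import random


def piecewise_hazards(f3, f6):
    """Hazards on (0, 3] and (3, 6] months that reproduce F(3) and F(6)."""
    lam1 = -math.log(1 - f3) / 3
    lam2 = -math.log((1 - f6) / (1 - f3)) / 3
    return lam1, lam2


def cdf(t, lam1, lam2):
    """Piecewise exponential CDF for 0 <= t <= 6."""
    if t <= 3:
        return 1 - math.exp(-lam1 * t)
    return 1 - math.exp(-3 * lam1 - lam2 * (t - 3))


def draw_event_time(lam1, lam2, rng):
    """Inverse-CDF draw; times beyond 6 months would be administratively censored."""
    target = -math.log(1 - rng.random())   # cumulative hazard to reach
    if target <= 3 * lam1:
        return target / lam1
    return 3 + (target - 3 * lam1) / lam2


# Control arm with consistent event accumulation: F(3) = 0.10, F(6) = 0.20
lam1, lam2 = piecewise_hazards(0.10, 0.20)
rng = random.Random(12345)
times = [draw_event_time(lam1, lam2, rng) for _ in range(20000)]
prop3 = sum(t <= 3 for t in times) / len(times)
prop6 = sum(t <= 6 for t in times) / len(times)
print(f"F(3) = {cdf(3, lam1, lam2):.3f}, F(6) = {cdf(6, lam1, lam2):.3f}")
print(f"simulated proportions with event by 3 and 6 months: {prop3:.3f}, {prop6:.3f}")
```

Changing the F(3) target while holding F(6) fixed gives the early and late event accumulation patterns, and choosing different targets for the experimental arm gives the early, late, or consistent treatment effects.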
34
Simulation Study 1d: Parameters
♦ Generate piecewise exponential distributions
over 0-3 and 3-6 months for selected CDFs
♦ Control event rate at 6 months is 20%
♦ 10% censoring – exponential distribution
♦ Sample size for 50% power for a 20% risk reduction (risk ratio 0.80):
N=902 per group (versus 1843 for 80% power)
♦ Parameters: type1 error, power
♦ 1000 replicates
35
Simulation Study 1e: Parameters
♦ Generate piecewise exponential distributions
over 0-3 and 3-6 months for selected CDFs
♦ Control event rate at 6 months is 20%: planned
and attained
♦ Experimental event rate at 6 months is 16%
♦ 10% censoring – exponential distribution: planned
and attained
♦ Available sample of 500 per group
♦ Parameters: type1 error, power
♦ 1000 replicates
36
Simulation Study 2-1: Parameters
♦ Generate piecewise exponential distributions over 0-3 and 3-6 months for selected CDFs
♦ Control event rate at 6 months is 10%: planned and
attained
♦ Experimental event rate at 6 months is 8%
♦ 10% censoring – exponential distribution: planned
and attained
♦ Sample size for 80% power for a 20% risk reduction (risk ratio 0.80):
N=3705 per group (versus 1843 for 20% control
event rate)
♦ Parameters: type1 error, power
♦ 1000 replicates
37
Simulation Study 2-2: Parameters
♦ Generate piecewise exponential distributions
over 0-3 and 3-6 months for selected CDFs
♦ Control event rate at 6 months is 10%: planned
and attained
♦ Experimental event rate at 6 months is 8%
♦ 25% censoring – exponential distribution when
10% planned: N=3705 per group as in study 2-1
♦ Parameters: type1 error, power
♦ 1000 replicates
38
Simulation Study 1: CDFs
Case              CDF    Control  Experimental  Risk Ratio
1: C-C            F(3)   0.10     0.08          0.80
                  F(6)   0.20     0.16          0.80
2: L-C            F(3)   0.05     0.04          0.80
                  F(6)   0.20     0.16          0.80
3: E-C            F(3)   0.15     0.12          0.80
                  F(6)   0.20     0.16          0.80
4: C-L            F(3)   0.10     0.10          1.00
                  F(6)   0.20     0.16          0.80
5: C-E            F(3)   0.10     0.06          0.60
                  F(6)   0.20     0.16          0.80
6: C-C half exp.  F(3)   0.10     0.09          0.90
                  F(6)   0.20     0.18          0.90
39
Simulation Study 1: CDFs
Case     CDF    Control  Experimental  Risk Ratio
7: L-L   F(3)   0.05     0.05          1.00
         F(6)   0.20     0.16          0.80
8: L-E   F(3)   0.05     0.03          0.60
         F(6)   0.20     0.16          0.80
9: E-E   F(3)   0.15     0.09          0.60
         F(6)   0.20     0.16          0.80
10: E-L  F(3)   0.15     0.15          1.00
         F(6)   0.20     0.16          0.80
40
Figure 1: Simulation 1a: Power Results (For 80% power with 10% censoring)
(nSim=1000, censoring=10%, N=1843/group, Control F(6)=0.2)
[Plot: power (0 to 1) by case (event accumulation - treatment effect), cases 1:C-C through 10:E-L with case 6 at risk ratio 0.9; methods: Arcsine (sqrt), Greenwood, Log (-log), Log-Rank, Peto, Wilcoxon]
41
Power Conclusions for Simulation 1a
♦ When the treatment effect is consistent all methods
have similar high power regardless of event
accumulation
♦ Greenwood and the two transformations are always
close to each other and fairly constant across the 9
event accumulation/treatment effect patterns
♦ For C-L Greenwood > Log-rank> Peto> Wilcoxon;
• this pattern more pronounced for E-L and less
pronounced for L-L
♦ For C-E the pattern is reversed
• Wilcoxon>Peto>Log-rank>Greenwood
• more pronounced for E-E and less pronounced for L-E
42
Figure 2: Simulation 1d: Power Results (For 50% power with 10% censoring)
(nSim=1000, censoring=10%, N=902/group, Control F(6)=0.2)
[Plot: power (0 to 1) by case (event accumulation - treatment effect), cases 1:C-C through 10:E-L with case 6 at risk ratio 0.9; methods: Arcsine (sqrt), Greenwood, Log (-log), Log-Rank, Peto, Wilcoxon]
43
Figure 3: Simulation 1a: Type 1 Error Results (For 80% power with 10% censoring)
(nSim=1000, censoring=10%, N=1843/group, Control F(6)=0.2)
[Plot: type 1 error (0 to 0.04) by case (event accumulation - treatment effect), cases 1:C-C through 10:E-L with case 6 at risk ratio 0.9; methods: Arcsine (sqrt), Greenwood, Log (-log), Log-Rank, Peto, Wilcoxon]
44
Figure 4: Simulation 1d: Type 1 Error Results (For 50% power with 10% censoring)
(nSim=1000, censoring=10%, N=902/group, Control F(6)=0.2)
[Plot: type 1 error (0 to 0.04) by case (event accumulation - treatment effect), cases 1:C-C through 10:E-L with case 6 at risk ratio 0.9; methods: Arcsine (sqrt), Greenwood, Log (-log), Log-Rank, Peto, Wilcoxon]
45
Simulation 1a Case 1 (C-C) Results
Method            Power   Type 1 error
KM - Arcsine (√)  0.864   0.027
KM - Greenwood    0.864   0.027
KM - Log (-log)   0.864   0.027
Log-rank test     0.860   0.024
KM - PETO         0.856   0.023
Wilcoxon test     0.854   0.024
46
Simulation 1a Case 4 (C-L) Results
Method            Power   Type 1 error
KM - Arcsine (√)  0.870   0.027
KM - Greenwood    0.870   0.027
KM - Log (-log)   0.869   0.027
Log-rank test     0.814   0.024
KM - PETO         0.752   0.025
Wilcoxon test     0.728   0.024
47
Simulation 1a Case 5 (C-E) Results
Method            Power   Type 1 error
KM - Arcsine (√)  0.853   0.027
KM - Greenwood    0.853   0.027
KM - Log (-log)   0.852   0.027
Log-rank test     0.898   0.024
KM - PETO         0.918   0.025
Wilcoxon test     0.928   0.024
48
Simulation 1a Case 7 (L-L) Results
Method            Power   Type 1 error
KM - Arcsine (√)  0.853   0.022
KM - Greenwood    0.853   0.022
KM - Log (-log)   0.852   0.021
Log-rank test     0.819   0.021
KM - PETO         0.806   0.023
Wilcoxon test     0.792   0.023
49
Simulation 1a Case 10 (E-L) Results
Method            Power   Type 1 error
KM - Arcsine (√)  0.866   0.034
KM - Greenwood    0.865   0.034
KM - Log (-log)   0.865   0.034
Log-rank test     0.779   0.032
KM - PETO         0.707   0.033
Wilcoxon test     0.660   0.033
50
Figure 5: Simulation 2-1: Power Results (For N=3705/group with 10% censoring)
(nSim=1000, censoring=10%, N=3705/group, half the event rate of simulation 1, Control F(6)=0.1)
[Plot: power (0 to 1) by case (event accumulation - treatment effect), cases 1:C-C through 10:E-L with case 6 at risk ratio 0.9; methods: Arcsine (sqrt), Greenwood, Log (-log), Log-Rank, Peto, Wilcoxon]
51
Comparison of 10% versus 20% Control
Cumulative Event Rate
♦ Power is adequate but lower for 10% than 20%
even though the sample size was modified
accordingly
♦ Greenwood and the two transformations are
always close to each other, above 80%, and
fairly constant across the 9 event
accumulation/treatment effect patterns
♦ Patterns of the 6 methods are similar for 10%
and 20% cumulative event rate for control
• Greenwood > Log-rank for late treatment effects
• Log-rank > Greenwood for early treatment effects
52
Figure 6: Simulation 2-2: Power Results (For N=3705/group with 25% censoring)
(nSim=1000, censoring=25%, N=3705/group, half the event rate of simulation 1, Control F(6)=0.1)
[Plot: power (0 to 1) by case (event accumulation - treatment effect), cases 1:C-C through 10:E-L with case 6 at risk ratio 0.9; methods: Arcsine (sqrt), Greenwood, Log (-log), Log-Rank, Peto, Wilcoxon]
53
Comparison of 10% versus 25% for
Proportion Censored
♦ Power is adequate but lower for 25% censored
with 10% planned than for 10% censored
♦ Greenwood and the two transformations are
always close to each other, above or close to
80%, and fairly constant across the 9 event
accumulation/treatment effect patterns
♦ Patterns of the 6 methods are similar for 10%
and 25% censoring
• Greenwood > Log-rank for late treatment effects
• Log-rank > Greenwood for early treatment effects
54
Figure 7: Simulation 2-1: Type 1 Error Results (For N=3705/group with 10% censoring)
(nSim=1000, censoring=10%, N=3705/group, half the event rate of simulation 1, Control F(6)=0.1)
[Plot: type 1 error (0 to 0.04) by case (event accumulation - treatment effect), cases 1:C-C through 10:E-L with case 6 at risk ratio 0.9; methods: Arcsine (sqrt), Greenwood, Log (-log), Log-Rank, Peto, Wilcoxon]
55
Figure 8: Simulation 2-2: Type 1 Error Results (For N=3705/group with 25% censoring)
(nSim=1000, censoring=25%, N=3705/group, half the event rate of simulation 1, Control F(6)=0.1)
[Plot: type 1 error (0 to 0.04) by case (event accumulation - treatment effect), cases 1:C-C through 10:E-L with case 6 at risk ratio 0.9; methods: Arcsine (sqrt), Greenwood, Log (-log), Log-Rank, Peto, Wilcoxon]
56
Conclusion 1
♦ If the cumulative event rate is accurately
predicted for a study with 80% or 90% power
and 5% to 25% censoring then Greenwood’s
formula works well and transformations are not
required
• Endpoint driven CV studies
57
Conclusion 2
♦ For late appearing treatment differences
Greenwood’s formula has higher power than the
log-rank test
• Primary prevention efficacy CV outcome studies
where delayed benefit is anticipated
• CV safety outcome studies where delayed harm
is a possibility
• Indications where time is required for the treatment
effect to evolve, such as osteoporosis
58
Conclusion 3
♦ For early appearing differences which diminish
over time the log-rank test is more powerful than
Greenwood’s method. However, Greenwood’s
method has good power.
• Secondary prevention CV efficacy outcome
studies
59
Primary References 1
♦ Barber S. and Jennison C. (1998). A review of
inferential methods for the Kaplan-Meier
estimator, Research report 98:02, statistics
group, University of Bath, UK
♦ Klein JP, Logan B, Harhoff M, and Andersen
PK (2007). Analyzing survival curves at a fixed
point in time. Statistics in Medicine 26, 4505-4519.
60
Primary References 2
♦ Klein JP and Moeschberger ML (2005). Survival
Analysis, Springer, 2nd edition.
♦ Fleiss, J. L. (1986). Design and Analysis of
Clinical Experiments. New York: John Wiley &
Sons.
61
Variance Estimation References 1
♦ Rothman KJ (1978). Estimation of confidence
intervals for the cumulative probability of survival
in life-table analysis. Journal of Chronic
Diseases 31, 557-560.
♦ Simon R and Lee WJ (1982). Nonparametric
confidence limits for survival probabilities and
median survival time. Cancer Treatment Reports
66, 37-42.
62
Variance Estimation References 2
♦ Thomas DR and Grunkemeier GL (1975).
Confidence interval estimates of survival
probabilities for censored data. Journal of the
American Statistical Association 70, 865-871.
♦ Zhao GL (1996). The homogenetic estimate for
the variance of a survival rate. Statistics in
Medicine: 15: 51-60.
63
Bootstrap References
♦ Akritas MG (1986). Bootstrapping the Kaplan
Meier estimator. Journal of the American
Statistical Association 81, 1032-1038.
♦ Efron B (1981). Censored data and the
bootstrap. Journal of the American Statistical
Association 76, 312-319.
♦ Efron B (1987). Better bootstrap confidence
intervals (with discussion). Journal of the
American Statistical Association 82, 171-200.
64
Backup Slides TOC
♦ References for testing the proportional hazards
assumption – slides 66, 67
♦ Models with different early and late treatment
effects – slide 68
♦ References for testing for a change point – slide
69
♦ Alternative Summary Measures of KM curves –
slide 70
65
References for testing the proportional
hazards assumption 1
♦ Andersen PK (1982). Biometrics 38, 67-77.
♦ Gill RD and Schumacher M (1987). Biometrika
74, 289-300.
♦ Grambsch P and Therneau T (1994). Biometrika
81, 515-526.
♦ Lin DY (1991). Journal of the American
Statistical Association 86, 725-728.
♦ Moreau T et al (1985). Applied Statistics 34,
212-218
66
References for testing the proportional
hazards assumption 2
♦ Moreau T et al (1986). Biometrika 73, 513-515.
♦ Parzen M (1999). Biometrics 55, 580-584
♦ Schoenfeld D (1980). Biometrika 67, 145-153.
67
Models with different early and late
treatment effects
♦ Include time period (early, late) and the
interaction of time period by treatment as time
dependent covariates in addition to treatment
and other factors in the primary Cox regression
model
• Can be used to test proportional hazards
assumption
• Can be used to allow for different early and late
treatment effects.
68
References for testing for a change point
♦ Karasoy DS and Kadilar C (2006).
Computational Statistics and Data Analysis 51,
2993-3001.
♦ Gijbels I and Gurler U (2003). Lifetime Data
Analysis 9, 395-411.
69
Alternative Summary Measures of KM
curves
♦ Area under the KM curve to a fixed time
♦ Topic of Haoda Fu’s talk during this session
70