The View From Space: understanding the methods and context of a new major paper in the cancer screening literature

Ben Larson (MS-3) December 2012
Screening criteria: a reminder
Two necessary, though not sufficient, criteria for
any screening program, both particularly
relevant to screening mammography in average-risk
patients:
1. There should be scientific evidence of
screening program effectiveness
2. The overall benefits of screening should
outweigh the harm
“Effectiveness”
In the context of cancer screening, effectiveness
means:
1. Deadly cancers are diagnosed earlier
2. Earlier diagnosis leads to decreased mortality
November 22, 2012
N Engl J Med 2012;367:1998-2005
The Question:
There has been a decrease in breast cancer
mortality over the past 30 years.
How much of this decrease is due to screening
mammography, which began around that time?
The Problem:
Many things have changed over the past 30
years, especially the surgical, medical, and
radiation treatment of breast carcinomata.
How can the effect of screening itself upon
mortality be isolated?
Possible Solution:
Screening attempts to detect pre-clinical
disease, i.e., increase its lead time.
Putting aside whether and how much increasing
the lead time improves mortality, one could use
stage at initial presentation as a meaningful
evaluation of screening’s ability to diagnose
deadly cancers earlier.
Their Conclusion
Even with “very extreme” assumptions biasing
in favor of screening, the authors estimate there
are about a million women who underwent
treatment for breast cancer who would not
otherwise have presented clinically with
advanced disease.
Seriously?
A million women?
Is that even possible?
Where exactly did that number come from?
The Nitty Gritty
To determine this with 30 years’ data, Drs. Bleyer
and Welch needed to:
• Determine a reasonable estimate for the baseline
incidence of breast cancer
• Determine if and how much incidence has
changed over 30 years of screening
• Control for any major external drivers of changing
incidence
• Present data in multiple ways, preferably biasing
toward screening
Estimate for the baseline incidence
• Data on incidence was first collected in 1973
• The years 1976-1978 were chosen to estimate
true baseline incidence
• This interval was a few years after an artificial
uptick in breast cancer incidence (due to the
so-called “Betty Ford blip”), but also just before
screening mammography became truly
widespread
Adjustments
1. Adjust for increasing incidence
2. Adjust for hormone replacement therapy
1. Increasing incidence
Must attempt to separate a true increase in baseline
incidence from the apparent “increased incidence” produced by lead time
Women under 40 are not screened, and their
incidence increased about 0.25% / year
(95% [CI] 0.04% – 0.47%) for the past 30 years
This increasing incidence was hypothesized to apply
to women over 40, though this cannot be proven
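The projection described here can be sketched in a few lines. This is a minimal illustration, assuming the annual increase compounds year over year (the slides do not say whether the paper compounds it or applies it linearly). The starting rate of 100 per 100,000 and the function name `projected_baseline` are hypothetical, chosen only to show the arithmetic.

```python
def projected_baseline(base_rate, annual_growth, years):
    """Compound a baseline incidence rate forward by a fixed annual fraction.

    Returns a list with one rate per year, starting at year 0.
    Assumption: growth compounds annually.
    """
    return [base_rate * (1 + annual_growth) ** t for t in range(years + 1)]


# Hypothetical starting rate (cases per 100,000 women); not from the paper.
start_rate = 100.0

# 0.25% per year is the increase observed in women under 40.
trend = projected_baseline(start_rate, 0.0025, 30)
print(f"Year 0: {trend[0]:.1f}, year 30: {trend[-1]:.1f} per 100,000")
```

Under these assumptions a 0.25% annual increase raises the baseline by roughly 8% over 30 years, which is the amount of "extra" incidence the authors would attribute to the underlying trend rather than to screening.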
2. HRT
A strong link exists between therapeutic postmenopausal sex steroids and diagnosed cancers.
The study authors proposed “capping” the
incidence of these early cancers between 1990
and 2005 (where the bulk of HRT’s effect was
felt), by using today’s incidence and not
counting incidence during those years above
that cap.
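A rough sketch of the capping step, assuming the cap is simply the most recent year's rate and that any rate above it between 1990 and 2005 is truncated to the cap. The rates below and the name `cap_for_hrt` are hypothetical and are not taken from the paper.

```python
def cap_for_hrt(rates_by_year, cap, start=1990, end=2005):
    """Return a copy of the series with rates in [start, end] limited to `cap`.

    Years outside the window are left unchanged.
    """
    return {
        year: min(rate, cap) if start <= year <= end else rate
        for year, rate in rates_by_year.items()
    }


# Hypothetical incidence per 100,000 women; illustrative values only.
observed = {1988: 210.0, 1995: 245.0, 2002: 250.0, 2008: 234.0}

# "Today's incidence" is taken here to be the last year in the series.
capped = cap_for_hrt(observed, cap=observed[2008])
print(capped)
```

The difference between the observed and capped values in the window is the "excess incidence" that is attributed to HRT and therefore not counted toward screening.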
DEFINITIONS
Stage categories labeled in the figure:
• Ductal carcinoma in situ
• Confined to the breast
• Nodal involvement or direct extension
• Distant metastases
Figure annotation: These areas under the curve represent
proposed “excess incidence” due to HRT
This line represents a
zoomed-in view of the red,
“Late-stage” line on the
previous slide (note the y-axis change)
These lines represent the two
components of late-stage disease
Inference?
Almost all the variability in late-stage disease is
mirrored by changes in regional disease,
whereas the number of women presenting with
distant metastases over the past 30 years is
remarkably constant.
We would not expect this if screening
mammography were able to detect small lesions
destined to spread widely.
“Overdiagnosis”
The authors use this term to mean the
calculated number of women who underwent
treatment for breast carcinomata, who
otherwise would not have presented with “late
stage” clinical disease.
Late-stage disease, in this case, refers to the sum
of regional disease and distant metastases.
“Best guess” represents the estimated baseline rate of incidence increase, assuming the
increase among women under 40 holds for those over 40. “Extreme assumption” doubles
that rate; or, if you will, uses a value just outside the 95% confidence interval (95% [CI]
0.04% – 0.47%).
The “very extreme assumption” not only uses this larger value (0.5%) for increasing
baseline incidence, but also “assumes the highest rate of baseline disease ever observed”
(113 cases per 100,000 women, observed in 1985).
How exactly do these assumptions bias in favor of screening?
The “excess detection” column is synonymous with “overdiagnosis.” It is the difference
between the extra diagnoses of early-stage disease yielded by screening and the reduction
in diagnoses of late stage disease for each set of assumptions. “Surplus” and “reduction”
imply a known baseline incidence; that baseline rate is what changes between the Base
Case, Best Guess, and Extreme Assumption, in a manner outlined on the previous slide.
The Very Extreme Assumption combines the generous assumption of increasing underlying
incidence with a generous assumption of late stage disease presentation. Remember, the
difference between surplus of early disease and reduction in late disease equals “overdiagnosis.”
There are multiple arithmetic operations going into these numbers, which are detailed in the
paper’s appendix and explained on the next slide.
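The subtraction described above can be written out as a short sketch. All rates here are hypothetical (per 100,000 women) and the function names are illustrative; the point is only to show how changing the assumed baseline changes the result.

```python
def surplus(observed_early, baseline_early):
    """Extra early-stage diagnoses relative to the assumed baseline."""
    return observed_early - baseline_early


def reduction(observed_late, baseline_late):
    """Drop in late-stage diagnoses relative to the assumed baseline."""
    return baseline_late - observed_late


def excess_detection(observed_early, baseline_early, observed_late, baseline_late):
    """Excess detection ("overdiagnosis") = surplus early minus reduction late."""
    return surplus(observed_early, baseline_early) - reduction(observed_late, baseline_late)


# Hypothetical rates per 100,000 women; not the paper's figures.
modest = excess_detection(200.0, 110.0, 95.0, 100.0)

# Same observations, but a higher assumed late-stage baseline.
generous = excess_detection(200.0, 110.0, 95.0, 113.0)

print(modest, generous)
```

Raising the assumed late-stage baseline enlarges the calculated reduction and so shrinks the excess, which is why the more generous assumptions favor screening.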
This column is the incidence of late-stage breast cancer presentation per 100,000 women
capped for HRT. You can see how the Very Extreme Assumption was generated: take the
highest incidence ever recorded (1985’s), assign that number to 1979, and then increase
that base rate by the 0.5%/yr used in the Extreme Assumption.
We can now see that the Very Extreme Assumption favors screening by maximizing the
base rate of late-stage diagnoses, and therefore also maximizes the calculated reduction in
late-stage disease. The actual number of observed cases, shown in Table 1 above, is simply
a measured value.
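As a sketch of how such a baseline column could be built, the snippet below starts from 113 cases per 100,000 in 1979 and raises it 0.5% per year. Compounding the increase annually and ending the series 30 years later are assumptions made for illustration; the paper's appendix gives the actual procedure.

```python
def very_extreme_baseline(start_year=1979, end_year=2009,
                          start_rate=113.0, annual_growth=0.005):
    """Build a year-indexed late-stage baseline under the Very Extreme Assumption.

    Assumptions: growth compounds annually; end_year is illustrative.
    """
    return {
        year: start_rate * (1 + annual_growth) ** (year - start_year)
        for year in range(start_year, end_year + 1)
    }


baseline = very_extreme_baseline()
for year in (1979, 1989, 1999, 2009):
    print(year, round(baseline[year], 1))
```

Subtracting the measured late-stage rate for a given year from this baseline gives the calculated reduction for that year, so a higher baseline always yields a larger apparent reduction.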
“Breast cancer overdiagnosis is a complex and sometimes
contentious issue...Our investigation takes a different view,
which might be considered the view from space. It does not
involve a selected group of patients, a specific protocol, or a
single point in time. Instead, it considers national data over a
period of three decades and details what has actually happened
since the introduction of screening mammography. There has
been plenty of time for the surplus of diagnoses of early-stage
cancer to translate into a reduction in diagnoses of late-stage
cancer — thus eliminating concern about lead time. This broad
view is the major strength of our study.”
N Engl J Med 2012;367:1998-2005
A bit of context
What Barron Lerner, a physician and historian,
has written about a debate over mammography
in 1997 still applies:
“Experts on opposing sides of the screening
debate had not really disagreed about what the
data showed. Rather, they had interpreted and
then presented the statistics differently.”
Barron H. Lerner, “To See With the Eyes of Tomorrow: A History of Screening Mammography,”
Background Paper for the Institute of Medicine report: Mammography and Beyond: Developing
Technologies for the Early Detection of Breast Cancer, March 2001.
A single example among many
A 2002 Swedish paper purports to demonstrate
to its critics that invitation to mammography
reduces mortality.
What’s interesting is not how they arrived at
their conclusion. Rather, it’s how excited they
are about what their data show:
Lancet 2002; 359: 909–19
Mortality RR = 0.98
(95% [CI] 0.96 – 1.00)
A parting thought from Dr. Lerner
“The production of better data alone cannot
eliminate the role that economics, authority and
ideology play in the assessment of
mammography and other early detection
technologies.
Sociocultural factors not only influence the
answers to questions about cancer screening,
but also the questions themselves.”