SCREENING FOR DISEASE
Nigel Paneth
THREE KEY MEASURES
OF VALIDITY
1. SENSITIVITY
2. SPECIFICITY
3. PREDICTIVE VALUE
SENSITIVITY
Sensitivity tells us how well a positive test
detects disease.
It is defined as the fraction of the diseased
who test positive.
Its complement is the false negative rate,
defined as the fraction of the diseased who
test negative.
Sensitivity and false negative rate add up to
one.
SENSITIVITY AND THE FALSE
NEGATIVE RATE ARE
COMPLEMENTARY
(N who test positive / All with disease) + (N who test negative / All with disease) = 1
SENSITIVITY + FALSE NEGATIVE RATE = 1
SPECIFICITY
Specificity tells us how well a negative test
detects non-disease.
It is defined as the fraction of the non-diseased who test negative.
Its complement is the false positive rate,
defined as the fraction of the non-diseased
who test positive.
Specificity and the false positive rate add up
to one.
SPECIFICITY AND THE FALSE
POSITIVE RATE ARE
COMPLEMENTARY
(N who test negative / All without disease) + (N who test positive / All without disease) = 1
SPECIFICITY + FALSE POSITIVE RATE = 1
DENOMINATORS OF THESE RATES
• Note that all the denominators of the
four rates so far defined (sensitivity,
specificity and the false + and false –
rates) are DISEASE STATES
• The denominator of sensitivity and
the false negative rate is PEOPLE
WITH DISEASE
• The denominator of specificity and
the false positive rate is PEOPLE
WITHOUT DISEASE
PREDICTIVE VALUE
Positive predictive value is the
proportion of all people with positive
tests who have the disease.
Negative predictive value is the
proportion of all people with negative
tests who do not have the disease.
PREDICTIVE VALUES
DEFINED
• POSITIVE PREDICTIVE VALUE =
People with disease who test positive / All people with a positive test
• NEGATIVE PREDICTIVE VALUE =
People without disease who test negative / All people with a negative test
POINTS TO NOTE
• Note that the numerators and denominators
are reversed compared to sensitivity and
specificity. In predictive values, the
denominator is the test result, and the
numerator is disease or non-disease
• In general, the positive predictive value is
the one most used. Positive predictive value
and sensitivity are perhaps the two most
important parameters in understanding the
usefulness of a test under field conditions.
CRITICAL DIFFERENCE BETWEEN
DISEASE-DENOMINATORED AND
TEST-DENOMINATORED MEASURES
• Sensitivity and specificity do not
vary according to the prevalence of
the disease in the population.
• The predictive value of a test, however, is
HIGHLY DEPENDENT on the
prevalence of the disease in the
population.
CALCULATING THE RATES
A test is used in 50 people with disease
and 50 people without. These are the
results:
              Disease +   Disease -   Total
Test +            48           3        51
Test -             2          47        49
Total             50          50       100
Sensitivity = 48/50 = 96%
Specificity = 47/50 = 94%
Positive predictive value = 48/51 = 94%
Negative predictive value = 47/49 = 96%
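The four calculations above can be reproduced with a short Python sketch. The function name `validity_measures` and its argument names are illustrative and do not come from the slides; the counts are those of the example with 50 diseased and 50 non-diseased people.

```python
def validity_measures(tp, fp, fn, tn):
    """Return validity measures for a 2 x 2 table of counts.

    tp: diseased, test positive      fp: non-diseased, test positive
    fn: diseased, test negative      tn: non-diseased, test negative
    """
    diseased = tp + fn          # denominator for sensitivity and false negative rate
    non_diseased = fp + tn      # denominator for specificity and false positive rate
    return {
        "sensitivity": tp / diseased,
        "specificity": tn / non_diseased,
        "false_negative_rate": fn / diseased,
        "false_positive_rate": fp / non_diseased,
        "ppv": tp / (tp + fp),  # denominator: all positive tests
        "npv": tn / (tn + fn),  # denominator: all negative tests
    }


# Counts from the example: 50 people with disease, 50 without.
measures = validity_measures(tp=48, fp=3, fn=2, tn=47)
for name, value in measures.items():
    print(f"{name}: {value:.1%}")
```

Note that the first four measures divide by a disease state, while the two predictive values divide by a test result.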
Now let's take this test out into a population
where 2% of people have the disease, not
50% as in the previous example. Assume
there are 10,000 people, and the same
sensitivity and specificity as before,
namely 96% and 94%, respectively.
              Disease +   Disease -    Total
Test +           192         588         780
Test -             8       9,212       9,220
Total            200       9,800      10,000
• What is the positive predictive value
now?
192/780 = 24.6%
• When the prevalence of disease is 50%,
94% of positive tests indicate disease. But
when prevalence is only 2%, fewer than one
in four positive test results indicates a person
with disease, and 2% would actually represent
quite a common disease.
• False positives tend to swamp true
positives in populations, because most
diseases we test for are rare.
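The dependence of positive predictive value on prevalence can be sketched in a few lines of Python. The function name is illustrative; the 96% sensitivity and 94% specificity are the values used above, and the 10% and 0.1% prevalences are added assumptions to show the trend.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV = true positives / all positives, per person tested."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Same test (sensitivity 96%, specificity 94%) at different prevalences.
# 50% and 2% are the two worked examples; the others are illustrative.
for prevalence in (0.50, 0.10, 0.02, 0.001):
    ppv = positive_predictive_value(0.96, 0.94, prevalence)
    print(f"prevalence {prevalence:.1%}: PPV = {ppv:.1%}")
```

Sensitivity and specificity are held fixed throughout; only the mix of diseased and non-diseased people changes.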
CHANGING THE THRESHOLD FOR A TEST
When disease is defined by a threshold on a
continuous test, the test characteristics can be
altered by changing the threshold or cut-off point.
Lowering the threshold improves sensitivity, but
often at the price of lowered specificity (i.e. more
false-positives).
Raising the threshold improves specificity, but
often at the price of lowered sensitivity (i.e. more
false negatives).
This can be especially important when the
distribution of a characteristic is unimodal, as
with blood pressure, cholesterol, weight, etc.,
because the gray area is so large.
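The trade-off between sensitivity and specificity as the cut-off moves can be illustrated with a small Python sketch. It assumes, purely for illustration, that the measurement is normally distributed in both groups; the means and standard deviations below are hypothetical and do not come from the slides.

```python
from statistics import NormalDist

# Hypothetical distributions of a continuous measurement (arbitrary units).
# The means and standard deviations are assumptions chosen for illustration.
non_diseased = NormalDist(mu=120, sigma=15)
diseased = NormalDist(mu=150, sigma=15)


def rates_at_threshold(threshold):
    """Treat a value above the threshold as a positive test."""
    sensitivity = 1 - diseased.cdf(threshold)  # diseased who test positive
    specificity = non_diseased.cdf(threshold)  # non-diseased who test negative
    return sensitivity, specificity


for threshold in (125, 135, 145):
    sens, spec = rates_at_threshold(threshold)
    print(f"threshold {threshold}: sensitivity {sens:.1%}, specificity {spec:.1%}")
```

Moving the two hypothetical means closer together enlarges the overlap between the groups, so any single threshold gives up more of one measure to gain the other.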
PROBLEMS WITH SCREENING
1. Do we have the right threshold?
2. Is there a truly effective treatment
available for the discovered disease?
3. Is that treatment more effective in
screened than non-screened cases?
4. What are the side effects of the screening
process?
5. How efficient is screening, i.e. how many
people must be screened to find one
case?
EXAMPLE OF SCREENING
ASSESSMENT
A randomized trial to assess a
screening program for colon
cancer is instituted. The
intervention group gets regular
screening, the control group is
left to its own devices.
After five years it is found that:
1. More cases are discovered in the
screened group than in the controls.
2. The cases are discovered at an earlier
stage of the cancer in the screened group.
3. Five-year survival is higher for the people
with cancer in the screened group.
Can we conclude that this screening
program is necessarily effective?
NO, THE PROGRAM IS NOT NECESSARILY
EFFECTIVE
The apparent benefits may only demonstrate
the effects of LEAD-TIME BIAS.
If it is possible to diagnose a condition earlier,
but not to improve survival after diagnosis, the
screening program will have an overrepresentation of earlier diagnosed cases,
whose survival will be increased by exactly the
amount of time their diagnosis was advanced
by the screening program. Thus they have not
benefited, but the amount of time they know
they have cancer has been increased.
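Consider how time of diagnosis
changes with screening in the
scenario below:
[Figure: two timelines running from age 50 to age 55, one for the unscreened group and one for the screened group. Each marks the age at diagnosis (Dx) and the age at death. Dx falls earlier on the screened timeline; death falls at the same age on both.]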
In the previous scenario, incidence of
disease is initially higher, diagnosis is made
earlier, stage of diagnosis is earlier, and
duration of survival from diagnosis is longer.
All of these give the impression of benefit
from screening.
However, the patient does not benefit, as
death is not postponed.
The only proper evidence of effectiveness of
a screening program is a reduction of total
age-specific mortality or morbidity, ideally
demonstrated by randomized trial.
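The arithmetic of lead-time bias can be made concrete with a short Python sketch. The ages used here are hypothetical, chosen only for illustration: screening moves diagnosis from age 53 to age 51, while death occurs at age 55 in both cases.

```python
def survival_after_diagnosis(age_at_diagnosis, age_at_death):
    """Years lived between diagnosis and death."""
    return age_at_death - age_at_diagnosis


# Hypothetical ages, chosen only for illustration: screening advances the
# diagnosis by two years, but the age at death is unchanged.
age_at_death = 55
dx_unscreened = 53
dx_screened = 51

unscreened = survival_after_diagnosis(dx_unscreened, age_at_death)
screened = survival_after_diagnosis(dx_screened, age_at_death)
lead_time = dx_unscreened - dx_screened

print(f"Unscreened survival from diagnosis: {unscreened} years")
print(f"Screened survival from diagnosis:   {screened} years")
print(f"Apparent gain: {screened - unscreened} years (lead time: {lead_time} years)")
```

The apparent gain in survival equals the lead time exactly, which is why survival measured from diagnosis cannot by itself show that screening works.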
MAMMOGRAPHY
EXERCISE
The next two slides are answers to
questions on the following website:
http://mammography.ucsf.edu/inform/index.cfm
QUESTION 12
Part 1. Under age 50, sensitivity is 75%,
over 50, sensitivity is 90%.
Part 2. Under age 50, specificity is about
97%, over 50, about 98.5%.
Part 3. Under age 50, PP+ (positive predictive
value) is about 3%; over 50, about 6-7%. At all
ages, about 5%.
QUESTIONS 13 AND 14
• These questions raise the concept of the
number needed to screen: how many women
in each age group must be screened to save
one life from breast cancer?
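Number needed to screen is commonly computed as the reciprocal of the absolute reduction in mortality between unscreened and screened groups over the same follow-up period. A minimal Python sketch follows; the function name and the mortality figures are hypothetical and are not taken from the exercise website.

```python
def number_needed_to_screen(deaths_per_1000_unscreened, deaths_per_1000_screened):
    """NNS = 1 / absolute risk reduction, over the same follow-up period."""
    absolute_risk_reduction = (
        deaths_per_1000_unscreened - deaths_per_1000_screened
    ) / 1000
    if absolute_risk_reduction <= 0:
        raise ValueError("screening must lower mortality for NNS to be defined")
    return 1 / absolute_risk_reduction


# Hypothetical mortality figures, used only to show the calculation.
nns = number_needed_to_screen(deaths_per_1000_unscreened=4.0,
                              deaths_per_1000_screened=3.0)
print(f"About {nns:.0f} women screened per death averted")
```

Because the absolute risk reduction depends on baseline mortality, the same relative benefit yields a larger number needed to screen in an age group where the disease is rarer.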