
Lecture 3
Validity of screening and diagnostic tests

• Reliability: kappa coefficient
• Criterion validity:
  – “Gold” or criterion/reference standard
  – Sensitivity, specificity, predictive value
  – Relationship to prevalence
  – Likelihood ratio
  – ROC curve
  – Diagnostic odds ratio
Clinical/public health applications

• screening:
  – for asymptomatic disease (e.g., Pap test, mammography)
  – for risk (e.g., family history of breast cancer)
• case-finding: testing of patients for diseases unrelated to their complaint
• diagnostic: to help make a diagnosis in symptomatic disease or to follow up on a screening test
Evaluation of screening and diagnostic tests

• Performance characteristics:
  – test alone
• Effectiveness (on outcomes of disease):
  – test + intervention
Criteria for test selection

• Reliability
• Validity
• Feasibility
• Simplicity
• Cost
• Acceptability
Measures of inter- and intra-rater reliability: categorical data

• Percent agreement
  – limitation: value is affected by prevalence; it is inflated when prevalence is very low or very high
• Kappa statistic
  – takes chance agreement into account
  – defined as the fraction of observed agreement not due to chance
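To make the limitation concrete, a minimal sketch in Python (the rating rates are hypothetical): two raters who each independently call a very high proportion of readings positive already agree most of the time by chance alone, so a high percent agreement at extreme prevalence is unremarkable.

```python
# Chance agreement between two raters who each independently call a
# proportion p of readings "positive" (hypothetical rates): they agree
# when both say positive (p * p) or both say negative ((1-p) * (1-p)).
for p in (0.50, 0.80, 0.95):
    chance = p * p + (1 - p) * (1 - p)
    print(f"p = {p:.2f}: chance agreement = {chance:.3f}")
# p = 0.50: 0.500 / p = 0.80: 0.680 / p = 0.95: 0.905
```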
Kappa statistic
Kappa = (p(obs) - p(exp)) / (1 - p(exp))
p(obs): proportion of observed agreement
p(exp): proportion of agreement expected by chance
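A minimal sketch in Python of this formula for a 2×2 agreement table; the cell labels a–d (a = both raters positive, d = both negative, b and c = the disagreements) and the example counts are assumptions for illustration, not from the slides.

```python
def kappa(a, b, c, d):
    """Cohen's kappa for a 2x2 agreement table:
    a = both raters positive, d = both raters negative,
    b and c = the two kinds of disagreement."""
    n = a + b + c + d
    p_obs = (a + d) / n  # observed agreement
    # agreement expected by chance, from the marginal totals
    p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)

print(round(kappa(45, 5, 10, 40), 2))  # hypothetical counts -> 0.7
```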
Example of Computation of Kappa

Agreement between the First and the Second Readings to Identify Atherosclerosis Plaque in the Left Carotid Bifurcation by B-Mode Ultrasound Examination in the Atherosclerosis Risk in Communities (ARIC) Study

                 Second reading
First reading    Plaque   Normal   Total
Plaque              140       69     209
Normal               52      725     777
Total               192      794     986

Observed agreement = (140 + 725)/986 = 0.877
Chance agreement for plaque-plaque cell = (209 × 192)/986 = 40.7
Chance agreement for normal-normal cell = (777 × 794)/986 = 625.7
Total chance agreement = (40.7 + 625.7)/986 = 0.676
Kappa = (0.877 - 0.676)/(1 - 0.676) = 0.62
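The same arithmetic as a quick Python check, using the counts from the ARIC table above:

```python
n = 986
plaque_plaque, normal_normal = 140, 725  # the two agreement cells

p_obs = (plaque_plaque + normal_normal) / n  # 0.877

# expected (chance) counts for the two agreement cells,
# computed from the marginal totals of the table
chance_plaque = 209 * 192 / n  # 40.7
chance_normal = 777 * 794 / n  # 625.7
p_exp = (chance_plaque + chance_normal) / n  # 0.676

print(round((p_obs - p_exp) / (1 - p_exp), 2))  # 0.62
```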
Interpretation of kappa

• Various suggested interpretations
• Example: Landis & Koch, Fleiss
  – excellent: over 0.75
  – fair to good: 0.40 – 0.75
  – poor: less than 0.40
Validity (accuracy) of screening/diagnostic tests

• Face validity, content validity: judgement of the appropriateness of the content of measurement
• Criterion validity
  – concurrent
  – predictive
Normal vs abnormal

• Statistical definition
  – “Gaussian” or “normal” distribution
• Clinical definition
  – using a criterion
Selection of criterion
(“gold” or criterion standard)
• Concurrent
– salivary screening test for HIV
– history of cough more than 2 weeks (for TB)
• Predictive
– APACHE (acute physiology and chronic health evaluation) instrument for ICU patients
– blood lipid level
– maternal height
Sensitivity and specificity
Assess correct classification of:
• People with the disease (sensitivity)
• People without the disease (specificity)
"True" Disease Status
Present
Absent
Screening
test results
Positive
"True positives"
A
"False positives"
B
Negative
"False negatives"
C
"True negatives"
D
Sensitivity of screening test =
A
A+C
Specificity of screening test =
D
B+D
Predictive value of positive test =
A
A+B
Predictive value of negative test =
D
C+D
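A minimal sketch in Python of these four measures, keeping the A/B/C/D cell labels from the table above; the example counts are hypothetical.

```python
def test_performance(a, b, c, d):
    """a = true positives, b = false positives,
    c = false negatives, d = true negatives."""
    return {
        "sensitivity": a / (a + c),  # diseased correctly classified
        "specificity": d / (b + d),  # non-diseased correctly classified
        "ppv": a / (a + b),          # predictive value of positive test
        "npv": d / (c + d),          # predictive value of negative test
    }

# hypothetical counts: 100 diseased, 100 non-diseased
print(test_performance(a=80, b=10, c=20, d=90))
# sensitivity 0.80, specificity 0.90, ppv ~0.89, npv ~0.82
```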
Predictive value
• More relevant to clinicians and patients
• Affected by prevalence
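A minimal sketch in Python of the prevalence effect: holding sensitivity and specificity fixed (the 0.80/0.90 values echo the LR example later in the lecture), the predictive value of a positive test falls sharply as prevalence falls. The calculation is Bayes' theorem: true positives as a fraction of all positive results.

```python
def ppv(sensitivity, specificity, prevalence):
    """Predictive value of a positive test:
    true positives as a fraction of all positive results."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# hypothetical test with sensitivity 0.80 and specificity 0.90
for prev in (0.25, 0.05, 0.01):
    print(f"prevalence {prev:.2f}: PPV = {ppv(0.80, 0.90, prev):.2f}")
# prevalence 0.25: PPV = 0.73
# prevalence 0.05: PPV = 0.30
# prevalence 0.01: PPV = 0.07
```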
Choice of cut-point
If higher score increases probability of disease
• Lower cut-point:
– increases sensitivity, reduces specificity
• Higher cut-point:
– reduces sensitivity, increases specificity
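A minimal sketch in Python of this trade-off, using small hypothetical score distributions (a higher score indicates disease):

```python
# hypothetical test scores; higher scores indicate disease
diseased_scores = [4, 6, 7, 8, 9]
healthy_scores = [1, 2, 3, 5, 6]

def sens_spec(cutpoint):
    """Classify score >= cutpoint as positive."""
    sens = sum(s >= cutpoint for s in diseased_scores) / len(diseased_scores)
    spec = sum(s < cutpoint for s in healthy_scores) / len(healthy_scores)
    return sens, spec

print(sens_spec(4))  # lower cut-point:  (1.0, 0.6) -- sensitive, less specific
print(sens_spec(7))  # higher cut-point: (0.6, 1.0) -- specific, less sensitive
```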
Considerations in selection of
cut-point
Implications of false positive results
• burden on follow-up services
• labelling effect
Implications of false negative results
• Failure to intervene
Receiver operating characteristic
(ROC) curve
• Evaluates test over range of cut-points
• Plot of sensitivity against 1-specificity
• Area under curve (AUC) summarizes
performance:
– AUC of 0.5 = no better than chance
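A minimal sketch in Python of an ROC analysis, reusing the hypothetical scores from the cut-point sketch (redefined here so the block runs on its own): sweep every candidate cut-point, record (1 - specificity, sensitivity), and approximate the AUC by the trapezoidal rule.

```python
# same hypothetical scores as in the cut-point sketch above
diseased_scores = [4, 6, 7, 8, 9]
healthy_scores = [1, 2, 3, 5, 6]

def roc_points(diseased, healthy):
    """(1 - specificity, sensitivity) at every candidate cut-point,
    classifying score >= cut as positive."""
    cuts = sorted(set(diseased + healthy)) + [max(diseased + healthy) + 1]
    pts = []
    for cut in cuts:
        sens = sum(s >= cut for s in diseased) / len(diseased)
        fpr = sum(s >= cut for s in healthy) / len(healthy)
        pts.append((fpr, sens))
    return sorted(pts)  # runs from (0, 0) up to (1, 1)

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

print(round(auc(roc_points(diseased_scores, healthy_scores)), 2))  # 0.9
```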
Likelihood ratio

• Likelihood ratio (LR) of a positive test = sensitivity / (1 - specificity)
• Used to compute post-test odds of disease from pre-test odds:
  post-test odds = pre-test odds × LR
• pre-test odds derived from prevalence
• post-test odds can be converted to predictive value of positive test
Example of LR

• prevalence of disease in a population is 25%
• sensitivity is 80%
• specificity is 90%
• pre-test odds = 0.25/(1 - 0.25) = 1/3
• likelihood ratio = 0.80/(1 - 0.90) = 8
Example of LR (cont)

• If prevalence of disease in a population is 25%, pre-test odds = 0.25/(1 - 0.25) = 1/3
• post-test odds = 1/3 × 8 = 8/3
• predictive value of positive result = odds/(1 + odds) = (8/3)/(1 + 8/3) = 8/11 = 73%
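The worked example as a quick Python check:

```python
prevalence, sensitivity, specificity = 0.25, 0.80, 0.90

pre_test_odds = prevalence / (1 - prevalence)  # 1/3
lr = sensitivity / (1 - specificity)           # 8
post_test_odds = pre_test_odds * lr            # 8/3

# convert odds back to a probability: odds / (1 + odds)
print(round(post_test_odds / (1 + post_test_odds), 2))  # 0.73
```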
Diagnostic odds ratio

• Ratio of the odds of a positive test in the diseased to the odds of a positive test in the non-diseased:
  DOR = (a × d)/(b × c)
• From the previous example, with counts scaled to a population of 40 (10 diseased: a = 8, c = 2; 30 non-diseased: b = 3, d = 27):
  OR = (8 × 27)/(2 × 3) = 36
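A quick Python check of this computation; the 2×2 counts come from scaling the example (prevalence 25%, sensitivity 80%, specificity 90%) to a hypothetical population of 40:

```python
# hypothetical population of 40 at 25% prevalence:
# 10 diseased, 30 non-diseased
a, c = 8, 2   # diseased:     8 test positive (sensitivity 0.80), 2 negative
b, d = 3, 27  # non-diseased: 3 test positive, 27 negative (specificity 0.90)

print(a * d / (b * c))  # (8 x 27)/(3 x 2) = 36.0
```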
Summary: LR and DOR

• Values:
  – 1 indicates that the test performs no better than chance
  – >1 indicates better than chance
  – <1 indicates worse than chance
• Relationship to prevalence? Unlike predictive values, neither LR nor DOR depends on prevalence.
Applications of LR and DOR

• Likelihood ratio: Primarily in the clinical context, when interest is in how much the likelihood of disease is increased by use of a particular test
• Diagnostic odds ratio: Primarily in research, when interest is in factors that are associated with test performance (e.g., using logistic regression)