
Appraising diagnostic studies
Matthew J. Thompson
GP & Senior Clinical Scientist
Department of Primary Health Care
University of Oxford
Overview of talk
• Diagnostic reasoning
• Appraising diagnostic studies
What is diagnosis?
• Increase certainty about presence/absence of disease
• Disease severity
• Monitor clinical course
• Assess prognosis – risk/stage within diagnosis
• Plan treatment e.g., location
• Stalling for time!
Knottnerus, BMJ 2002
Diagnostic errors
• Most diagnostic errors are due to cognitive errors:
  • Conditions of uncertainty
  • Thinking is pressured
  • Shortcuts are used
(Croskerry. Ann Emerg Med 2003)

Human toll of diagnostic errors (Diagnostic errors – the next frontier for patient safety. Newman-Toker, JAMA 2009)
• 40,000–80,000 US hospital deaths from misdiagnosis per year
• Adverse events, negligence cases, and serious disability are more likely to be related to misdiagnosis than to drug errors
Diagnostic reasoning
• Diagnostic strategies are particularly important where patients present with a variety of conditions and possible diagnoses.
Diagnostic reasoning
• For example, what causes cough?
• Comprehensive history…examination…differential diagnosis…final diagnosis

Cardiac failure, left sided, Chronic obstructive pulmonary disease, Lung abscess
Pulmonary alveolar proteinosis, Wegener's granulomatosis, Bronchiectasis
Pneumonia, Atypical pneumonia, Pulmonary hypertension
Measles, Oropharyngeal cancer, Goodpasture's syndrome
Pulmonary oedema, Pulmonary embolism, Mycobacterium tuberculosis
Foreign body in respiratory tract, Diffuse panbronchiolitis, Bronchogenic carcinoma
Broncholithiasis, Pulmonary fibrosis, Pneumocystis carinii
Captopril, Whooping cough, Fasciola hepatica
Gastroesophageal reflux, Schistosoma haematobium, Visceral leishmaniasis
Enalapril, Pharyngeal pouch, Suppurative otitis media
Upper respiratory tract infection, Arnold's nerve cough syndrome, Allergic bronchopulmonary aspergillosis
Chlorine gas, Amyloidosis, Cyclophosphamide
Tropical pulmonary eosinophilia, Simple pulmonary eosinophilia, Sulphur dioxide
Tracheolaryngobronchitis, Extrinsic allergic alveolitis, Laryngitis
Fibrosing alveolitis, cryptogenic, Toluene di-isocyanate, Coal worker's pneumoconiosis
Lisinopril, Functional disorders, Nitrogen dioxide, Fentanyl
Asthma, Omapatrilat, Sinusitis
Gabapentin, Cilazapril

……diagnostic reasoning
[Slide: the cough differential above shown in full – 53 possible diagnoses!]
Diagnostic reasoning strategies
• Aim: identify the types and frequency of diagnostic strategies used in primary care
• 6 GPs collected and recorded the strategies used on 300 patients
(Diagnostic strategies used in primary care. Heneghan, Glasziou, Thompson et al. BMJ, in press)
Diagnostic stages & strategies

Stage: Initiation of the diagnosis
Strategies: Spot diagnoses • Self-labelling • Presenting complaint • Pattern recognition

Stage: Refinement of the diagnostic causes
Strategies: Restricted rule-outs • Stepwise refinement • Probabilistic reasoning • Pattern recognition fit • Clinical prediction rule

Stage: Defining the final diagnosis
Strategies: Known diagnosis • Further tests ordered • Test of treatment • Test of time • No label
Initiation: Spot diagnosis
• Unconscious recognition of a non-verbal pattern, e.g.:
  • visual (skin condition)
  • auditory (barking cough with croup)
• Fairly instantaneous, no further history needed
• 20% of consultations
*Brooks LR. Role of specific similarity in a medical diagnostic task. J Exp Psychol Gen 1991;120:278-87
Initiation: Self-labelling
• “It’s tonsillitis, doc – I’ve had it before”
• “I have a chest infection, doctor”
• 20% of consultations

Accuracy of self-diagnosis in recurrent UTI:
• 88 women with 172 self-diagnosed UTIs
• Uropathogen in 144 (84%)
• Sterile pyuria in 19 cases (11%)
• No pyuria or bacteriuria in 9 cases (5%)
(Gupta et al. Ann Int Med 2001)

Refining: Restricted rule-out (or Murtagh’s) process
• A learned diagnostic strategy for each presentation
• Think of the most common/likely condition
• AND… what needs to be ruled out as well?
• Example: a patient with headache… learn to check for migraine and tension-type headache, but to rule out temporal arteritis, subarachnoid haemorrhage, etc.
• Used in 30% of consultations
Murtagh. Australian Fam Phys 1990. Croskerry. Ann Emerg Med 2003
Refining: Probabilistic reasoning
• The use of a specific but probably imperfect symptom, sign or diagnostic test to rule a diagnosis in or out
• E.g. urine dipstick for UTI, arterial tenderness in temporal arteritis
• Used in 10% of cases
Refining: Pattern recognition
• Symptoms and signs volunteered or elicited from the patient are compared to previous patterns or cases, and a disease is recognized when the actual pattern fits
• Relies on memory of known patterns, but no specific rule is employed
• Used in 40% of cases
Refining: Clinical prediction rules
• A formal version of pattern recognition, based on a well-defined and validated series of similar cases
• Examples: Ottawa ankle rule, streptococcal sore throat
• Rarely used: <10% of cases




Final diagnostic stage: defining the final diagnoses
[Bar chart: percentage of consultations (0–100%) using each strategy – known diagnosis, further tests ordered, test of treatment, test of time, no label]
Appraising diagnostic tests
1. Are the results valid?
2. What are the results?
3. Will they help me look after my patients?
Basic design of a diagnostic accuracy study:
Series of patients → Index test → Reference (“gold”) standard → Blinded cross-classification
Validity of diagnostic studies
1. Was an appropriate spectrum of patients included?
2. Were all patients subjected to the gold standard?
3. Was there an independent, blind or objective comparison with the gold standard?
1. Was an appropriate spectrum of patients included? Spectrum bias
Selected patients → Index test → Reference standard → Blinded cross-classification
1. Was an appropriate spectrum of patients included? Spectrum bias
• You want to find out how good chest X-rays are for diagnosing pneumonia in the Emergency Department
• Best = all patients presenting with difficulty breathing get a chest X-ray
• Spectrum bias = only those patients in whom you really suspect pneumonia get a chest X-ray
2. Were all patients subjected to the gold standard? Verification (work-up) bias
Series of patients → Index test → Reference standard → Blinded cross-classification
2. Were all patients subjected to the gold standard? Verification (work-up) bias
• You want to find out how good the exercise ECG (“treadmill test”) is for identifying patients with angina
• The gold standard is angiography
• Best = all patients get angiography
• Verification (work-up) bias = only patients who have a positive exercise ECG get angiography
3. Was there an independent, blind or objective comparison with the gold standard? Observer bias
Series of patients → Index test → Reference standard → Unblinded cross-classification
3. Was there an independent, blind or objective comparison with the gold standard? Observer bias
• You want to find out how good the exercise ECG (“treadmill test”) is for identifying patients with angina
• All patients get the gold standard (angiography)
• Observer bias = the cardiologist who does the angiography knows what the exercise ECG showed (not blinded)
Incorporation bias
Series of patients → Index test → Reference standard (includes parts of the index test) → Unblinded cross-classification
Differential reference bias
Series of patients → Index test → Reference standard A / Reference standard B → Blinded cross-classification
Validity of diagnostic studies
1. Was an appropriate spectrum of patients included?
2. Were all patients subjected to the gold standard?
3. Was there an independent, blind or objective comparison with the gold standard?
Appraising diagnostic tests
1. Are the results valid?
2. What are the results?
3. Will they help me look after my patients?

Sensitivity, specificity, positive & negative predictive values, likelihood ratios… aaarrrggh!!
2 by 2 table

         Disease +              Disease -
Test +   a (true positives)     b (false positives)
Test -   c (false negatives)    d (true negatives)
2 by 2 table: sensitivity
Sensitivity = a / (a + c)
• Proportion of people with the disease who have a positive test result
• …a highly sensitive test will not miss many people
2 by 2 table: sensitivity
Example: of 100 people with the disease, a = 99 test positive and c = 1 tests negative.
Sensitivity = a / (a + c) = 99/100 = 99%
2 by 2 table: specificity
Specificity = d / (b + d)
• Proportion of people without the disease who have a negative test result
• …a highly specific test will not falsely identify people as having the disease
Tip…
• Sensitivity is useful to me
• Specificity isn’t… I want to know about the false positives
• …so… use 1 - specificity, which is the false positive rate
2 by 2 table:
Sensitivity = a / (a + c)
False positive rate = b / (b + d) (same as 1 - specificity)
2 by 2 table:
Example: a = 99, b = 10, c = 1, d = 90
Sensitivity = 99%
False positive rate = 10% (same as 1 - specificity)
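
These formulas are easy to check in code. A minimal sketch in Python, using the example counts above (the function name and layout are mine, not from the talk):

```python
def two_by_two_metrics(a, b, c, d):
    """Accuracy metrics from a 2x2 table:
    a = true positives, b = false positives,
    c = false negatives, d = true negatives."""
    sensitivity = a / (a + c)          # diseased who test positive
    specificity = d / (b + d)          # non-diseased who test negative
    false_positive_rate = b / (b + d)  # same as 1 - specificity
    return sensitivity, specificity, false_positive_rate

sens, spec, fpr = two_by_two_metrics(a=99, b=10, c=1, d=90)
print(f"sensitivity {sens:.0%}, specificity {spec:.0%}, FPR {fpr:.0%}")
# sensitivity 99%, specificity 90%, FPR 10%
```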
Example
Your father went to his doctor and was told that his test for a disease was positive. He is really worried, and comes to ask you for help!
After doing some reading, you find that for men of his age:
• The prevalence of the disease is 30%
• The test has sensitivity of 50% and specificity of 90%
“Son, tell me, what’s the chance I have this disease?”
A disease with a prevalence of 30%. The test has sensitivity of 50% and specificity of 90%.
[Audience vote on the chance of disease: 0% (never) … 50% (maybe) … 100% (always)]
Prevalence of 30%, sensitivity of 50%, specificity of 90%
• Of 100 people, 30 are disease +ve and 70 are disease -ve
• Sensitivity = 50%, so 15 of the 30 with the disease test positive
• False positive rate = 10%, so 7 of the 70 without the disease test positive
• 22 people test positive… of whom 15 have the disease
• So, the chance of disease is 15/22, about 70%
Try it again
• A disease with a prevalence of 4% must be diagnosed.
• It has a sensitivity of 50% and a specificity of 90%.
• If the patient tests positive, what is the chance they have the disease?
Prevalence of 4%, sensitivity of 50%, specificity of 90%
• Of 100 people, 4 are disease +ve and 96 are disease -ve
• Sensitivity = 50%, so 2 of the 4 with the disease test positive
• False positive rate = 10%, so 9.6 of the 96 without the disease test positive
• 11.6 people test positive… of whom 2 have the disease
• So, the chance of disease is 2/11.6, about 17%
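
Both worked examples use the same natural-frequency arithmetic, so they can be scripted once. A hedged sketch (the function and the notional cohort of 100 are illustrative choices, not from the talk):

```python
def post_test_probability(prevalence, sensitivity, specificity):
    """Chance of disease given a positive test, by natural frequencies.
    This is the positive predictive value at that prevalence."""
    diseased = 100 * prevalence                # notional cohort of 100 people
    healthy = 100 - diseased
    true_pos = diseased * sensitivity          # diseased who test positive
    false_pos = healthy * (1 - specificity)    # healthy who test positive
    return true_pos / (true_pos + false_pos)

print(post_test_probability(0.30, 0.50, 0.90))  # 0.68... -> "about 70%"
print(post_test_probability(0.04, 0.50, 0.90))  # 0.17... -> "about 17%"
```

Because prevalence enters the calculation directly, the same sketch also shows why predictive values travel with prevalence, the complaint raised about PPV and NPV below.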
Doctors with an average of 14 years’ experience gave answers ranging from 1% to 99%, with half of them estimating the probability as 50%.
Gigerenzer G. BMJ 2003;327:741-744
Sensitivity and specificity don’t vary with prevalence
• Test performance can vary in different settings, patient groups, etc.
• This is occasionally attributed to differences in disease prevalence, but is more likely due to differences in the spectrum of diseased and non-diseased patients
2 x 2 table: positive predictive value
PPV = a / (a + b)
• Proportion of people with a positive test who have the disease
2 x 2 table: negative predictive value
NPV = d / (c + d)
• Proportion of people with a negative test who do not have the disease
What’s wrong with PPV and NPV?
• They depend on the accuracy of the test and the prevalence of the disease
Likelihood ratios
• Can be used in situations with more than 2 test outcomes
• Direct link from pre-test probabilities to post-test probabilities
2 x 2 table: positive likelihood ratio
LR+ = (a / (a + c)) / (b / (b + d)), or LR+ = sensitivity / (1 - specificity)
• How much more often a positive test occurs in people with the disease compared to those without it
2 x 2 table: negative likelihood ratio
LR- = (c / (a + c)) / (d / (b + d)), or LR- = (1 - sensitivity) / specificity
• How much less likely a negative test result is in people with the disease compared to those without it
LR < 0.1 = strong negative test result
LR = 1 = no diagnostic value
LR > 10 = strong positive test result
McGee: Evidence-Based Physical Diagnosis (Saunders Elsevier)
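
A minimal sketch of the two LR formulas, run on the running example’s sensitivity of 50% and specificity of 90% (pairing those numbers with this code is my choice, not the talk’s):

```python
def likelihood_ratios(sensitivity, specificity):
    """LR+ and LR- from sensitivity and specificity."""
    lr_pos = sensitivity / (1 - specificity)   # LR+ = sens / (1 - spec)
    lr_neg = (1 - sensitivity) / specificity   # LR- = (1 - sens) / spec
    return lr_pos, lr_neg

lr_pos, lr_neg = likelihood_ratios(0.50, 0.90)
print(lr_pos, lr_neg)
# LR+ ≈ 5.0 (a moderately useful positive result);
# LR- ≈ 0.56 (close to 1, so a negative result changes little)
```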
Bayesian reasoning: the Fagan nomogram
[Fagan nomogram: a straight line from the pre-test probability (%) through the likelihood ratio gives the post-test probability (%)]
? Appendicitis: McBurney tenderness, LR+ = 3.4
Pre-test probability 5% → post-test probability ~20%
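
The nomogram is a graphical shortcut for the odds form of Bayes’ theorem: convert the pre-test probability to odds, multiply by the likelihood ratio, and convert back. A sketch of that update (note the exact arithmetic for these numbers gives ≈15%; the slide’s ~20% is an approximate reading off the nomogram):

```python
def bayes_update(pre_test_prob, likelihood_ratio):
    """Post-test probability via the odds form of Bayes' theorem."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)  # probability -> odds
    post_odds = pre_odds * likelihood_ratio         # update odds by the LR
    return post_odds / (1 + post_odds)              # odds -> probability

# McBurney tenderness in suspected appendicitis, LR+ = 3.4
print(bayes_update(0.05, 3.4))  # ≈ 0.15
```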
Do doctors use quantitative methods of test accuracy?
• Survey of 300 US physicians: 8 used Bayesian methods, 3 used ROC curves, 2 used LRs
• Why?
  • …indices unavailable…
  • …lack of training…
  • …not relevant to setting/population…
  • …other factors more important…
(Reid et al. Academic calculations versus clinical judgements: practicing physicians’ use of quantitative measures of test accuracy. Am J Med 1998)
Appraising diagnostic tests
1. Are the results valid?
2. What are the results?
3. Will they help me look after my patients?
Will the test apply in my setting?
• Reproducibility of the test and its interpretation in my setting
• Do the results apply to the mix of patients I see?
• Will the results change my management?
• Impact on outcomes that are important to patients?
• Where does the test fit into the diagnostic strategy?
• Costs to the patient/health service?
Reliability – how reproducible is the test?
• Kappa = measure of intraobserver reliability

Test                        Kappa value
Tachypnoea                  0.25
Crackles on auscultation    0.41
Pleural rub                 0.52
CXR for cardiomegaly        0.48
MRI spine for disc          0.59

Value of Kappa    Strength of Agreement
<0.20             Poor
0.21–0.40         Fair
0.41–0.60         Moderate
0.61–0.80         Good
0.81–1.00         Very Good
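
Kappa compares the agreement actually observed with the agreement expected by chance. A minimal sketch for two observers making yes/no judgements; the counts in the usage example are invented purely for illustration:

```python
def cohens_kappa(both_yes, yes_no, no_yes, both_no):
    """Cohen's kappa for two raters' yes/no judgements on the same cases."""
    n = both_yes + yes_no + no_yes + both_no
    observed = (both_yes + both_no) / n                 # raw agreement
    chance_yes = ((both_yes + yes_no) / n) * ((both_yes + no_yes) / n)
    chance_no = ((no_yes + both_no) / n) * ((yes_no + both_no) / n)
    expected = chance_yes + chance_no                   # chance agreement
    return (observed - expected) / (1 - expected)

# Hypothetical: two clinicians listening for crackles in 100 patients
print(round(cohens_kappa(30, 10, 15, 45), 2))  # 0.49 -> "moderate"
```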
Will the result change management?
Probability of disease runs from 0% to 100%, with two thresholds dividing it into three zones:
• Below the testing threshold: no action
• Between the testing threshold and the action threshold: test
• Above the action threshold: action (e.g. treat)
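
In code, the two-threshold model reduces to a simple decision rule. A sketch, with the threshold values purely illustrative (the slide gives no numbers):

```python
def next_step(prob, testing_threshold=0.10, action_threshold=0.80):
    """Two-threshold model: below -> no action, between -> test, above -> act."""
    if prob < testing_threshold:
        return "no action"
    if prob < action_threshold:
        return "test"
    return "action (e.g. treat)"

print(next_step(0.05), next_step(0.40), next_step(0.90))
# no action  test  action (e.g. treat)
```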
Any questions?