Transcript Slide 1

Diagnosis Articles

Much Thanks to: Rob Hayward & Tanya Voth, CCHE

Outline

• Philosophy of Diagnosis:
– Probability of disease
– Test and treatment thresholds
• ANALYZING STUDIES
• Validity:
– Gold (reference) standard
• Numbers:
– Sensitivity, Specificity, Likelihood ratio
• Applicability:
– Observer agreement, Kappa

Philosophy of Diagnosis?

• Pre-test Probability
– The probability that a disease is present before doing a test.
– A clinical best guess.
• Post-test Probability
– The probability that a disease is present after doing a test.
– A combination of clinical best guess & test result.

Philosophy of Diagnosis?

• When tests are good:

[Figure: Target Negative (Normal) and Target Positive (Severely ill) groups shown along a test-result axis running from Very Normal to Very Abnormal, with test results A and B marked.]

Philosophy of Diagnosis?

• When tests aren’t so good:

[Figure: Target Positive and Target Negative groups shown along a test-result axis running from Very Normal to Very Abnormal, with two test results marked, one with LR = 1 and one with LR = 4.]

EBM TP: Diagnostic Tests

• How good are:
– Phalen’s Test,
– Shifting Dullness,
– Patient Report of Fever,
– Interstitial Edema on C-Xray,
– Ottawa Ankle Rules,
– Canadian C-Spine Rules vs NEXUS?

Users Guides: Diagnosis

Are the results valid?

• Did clinicians face diagnostic uncertainty?
– Were subjects drawn from a common group in which it is not known whether the condition of interest is present or absent?
– E.g. the first CEA studies used known bowel cancer patients. [1]

1. Proc Natl Acad Sci USA 1969; 64: 161-7

Are the results valid?

Was an acceptable gold standard used?

• Imagine a study investigating WBC for appendicitis that uses U/S as the gold standard.

Are the results valid?

The test being studied and the gold standard should be completely separate.

1) Were the test and gold standard independent?

• E.g. a study looking at serum amylase for pancreatitis that used a gold standard made of a combination of tests including serum amylase. [1]

2) Were the test & gold standard results assessed blindly?

• Imagine a study investigating Ottawa Ankle Rules, in which the radiologist was told the results of the Ankle rules before reading the films.

1. NEJM 1997; 336: 1788-93

Are the results valid?

• Did the result of the test being studied affect whether the gold standard was done?

– Was a different gold standard applied to subjects testing negative?

– E.g. when evaluating VQ scans for PE, those with normal scans often did not go on to the gold standard (pulmonary angiography). [1]

– In these (frequent) cases we need to be assured of a reasonable back-up gold standard.

1. JAMA 1990; 263: 2753-59.

Users Guides: Diagnosis

EBM Tool for Diagnostic Tests Should:

• Tell if a symptom, sign or test is useful
• Useful in which way:
– Screening (Ruling out)
– Making a Diagnosis (Ruling in)
• Help us determine the probability of a disease

EBM Diagnostic test Standards

• Sensitivity
– SNOUT: Sensitive tests, if Negative, rule OUT disease.
• Specificity
– SPIN: Specific tests, if Positive, rule IN disease.
• Helpful to sort out if a test is good for Screening (Ruling out) or Diagnosis (Ruling in)

LR Advantage

• LRs:
– Take into account all elements (false positives/negatives and true positives/negatives)
– Have Criteria for Usefulness of each Test
– Can be used over a Range of Test Results (e.g. WBC)
– Can be used to calculate the actual Likelihood of a disease

Key Concept

• Likelihood Ratio: Determine the usefulness of tests.
• (Positive) Likelihood Ratios > 1:
– ↑ Likelihood Ratio (1 to ∞) = ↑ likelihood of disease
– Make the diagnosis (Rule in disease)
• (Negative) Likelihood Ratio < 1:
– ↓ Likelihood Ratio (1 to 0) = ↓ likelihood of disease
– Exclude the diagnosis (Rule out disease)

What does the LR mean?

(Criteria for Usefulness)

Usefulness        LR (increase probability)   LR (decrease probability)
Excellent         > 10                        < 0.1
Good              5-10                        0.1-0.2
Moderate/Small    2-5                         0.2-0.5
Poor              1-2                         0.5-1

How do I use the LR?

Nomogram LR calculator
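The nomogram is a graphical shortcut for a short calculation on the odds scale. As a sketch of the standard relationship (the symbols $p_{\text{pre}}$ and $p_{\text{post}}$ for pre-test and post-test probability are notation introduced here, not taken from the slides):

```latex
\begin{align}
\text{odds}_{\text{pre}}  &= \frac{p_{\text{pre}}}{1 - p_{\text{pre}}} \\
\text{odds}_{\text{post}} &= \text{odds}_{\text{pre}} \times \mathrm{LR} \\
p_{\text{post}}           &= \frac{\text{odds}_{\text{post}}}{1 + \text{odds}_{\text{post}}}
\end{align}
```

An LR of 1 leaves the odds, and therefore the probability, unchanged, which is why such a test result is uninformative.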

What are the results?

• What range of likelihood ratios were associated with the range of possible test results?

– Ferritin to detect Fe deficiency (GS = bone marrow)

Serum Ferritin      Iron Deficient Patients   Not Iron Deficient
Positive (< 45)     70                        15
Negative (> 45)     15                        135

Sensitivity = 82%   Specificity = 90%   LR+ = 8.2   LR- = 0.2
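The summary figures for the ferritin cutoff of 45 can be recomputed from the four cell counts. The following is a minimal sketch; the function name `two_by_two_metrics` and its parameter names are illustrative choices, not something taken from the slides.

```python
def two_by_two_metrics(tp, fn, fp, tn):
    """Return (sensitivity, specificity, LR+, LR-) from 2x2 cell counts.

    tp/fn: diseased patients with a positive/negative test.
    fp/tn: non-diseased patients with a positive/negative test.
    Note: LR+ is undefined when specificity is exactly 1, and LR- is
    undefined when specificity is 0 (division by zero).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return sensitivity, specificity, lr_pos, lr_neg


# Cell counts copied from the ferritin table (positive = ferritin < 45).
sens, spec, lr_pos, lr_neg = two_by_two_metrics(tp=70, fn=15, fp=15, tn=135)
print(f"Sensitivity = {sens:.0%}, Specificity = {spec:.0%}")
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.1f}")
```

With these counts the results round to the values shown on the slide (82%, 90%, 8.2 and 0.2).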

What are the results?

• What range of likelihood ratios were associated with the range of possible test results?

– Ferritin to detect Fe deficiency (GS = bone marrow)

Serum Ferritin    Iron Deficient Patients   Not Iron Deficient
< 18              47                        2
19 – 45           23                        13
46 – 100          7                         27
> 100             8                         108
Total patients    85                        150

What are the results?

• What range of likelihood ratios were associated with the range of possible test results?

– Ferritin to detect Fe deficiency (GS = bone marrow)

Serum Ferritin    Iron Deficient Patients (L1)   Not Iron Deficient (L2)   LR = L1/L2
< 18              47/85 = 0.553                  2/150 = 0.013             42.5
19 – 45           23/85 = 0.271                  13/150 = 0.086            3.15
46 – 100          7/85 = 0.082                   27/150 = 0.180            0.46
> 100             8/85 = 0.094                   108/150 = 0.720           0.13
Total patients    85                             150
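The level-specific likelihood ratios can be recomputed directly from the counts in each ferritin band. This is a sketch only; the `bands` dictionary and the function name `interval_likelihood_ratios` are illustrative. Because it does not round L1 and L2 before dividing, the first two ratios come out at roughly 41.5 and 3.12 rather than the 42.5 and 3.15 shown on the slide; the last two agree at 0.46 and 0.13.

```python
# Counts per ferritin band, copied from the table:
# band label -> (iron deficient, not iron deficient)
bands = {
    "< 18": (47, 2),
    "19-45": (23, 13),
    "46-100": (7, 27),
    "> 100": (8, 108),
}


def interval_likelihood_ratios(bands):
    """Return {band: LR}, where LR = L1 / L2 for that band.

    L1 = proportion of diseased patients whose result falls in the band.
    L2 = proportion of non-diseased patients whose result falls in the band.
    """
    total_diseased = sum(d for d, _ in bands.values())
    total_not_diseased = sum(n for _, n in bands.values())
    ratios = {}
    for label, (diseased, not_diseased) in bands.items():
        l1 = diseased / total_diseased
        l2 = not_diseased / total_not_diseased
        ratios[label] = l1 / l2
    return ratios


ratios = interval_likelihood_ratios(bands)
for label, lr in ratios.items():
    print(f"{label:>7}: LR = {lr:.2f}")
```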

Applying LR: Examples

• A 30 y.o. woman complaining of fatigue and vague MDD Sx (Normal periods).

– Guess 20% anemia before test.
– Ferritin = 12 (LR = 42.5): Anemia = 90%
• Same woman:
– Ferritin = 108 (LR = 0.13): Anemia = 2%
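The two post-test figures in this example can be checked by converting the pre-test probability to odds, multiplying by the LR, and converting back. The sketch below assumes that standard odds calculation; the function name `post_test_probability` is illustrative. The exact arithmetic gives about 91% and 3%, close to the rounded 90% and 2% on the slide.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Apply a likelihood ratio to a pre-test probability (0 < p < 1)."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


pre_test = 0.20  # the clinical best guess from the example
results = {}
for ferritin, lr in [(12, 42.5), (108, 0.13)]:
    results[ferritin] = post_test_probability(pre_test, lr)
    print(f"Ferritin = {ferritin}, LR = {lr}: "
          f"post-test probability = {results[ferritin]:.0%}")
```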

LR Examples

• Phalen Test (Carpal Tunnel): LR = 1.3
• Shifting Dullness (Ascites): LR = 2.3
• Patient Reporting Fever (> 38 Temp): LR = 4.9
• Interstitial Edema on Chest X-Ray (CHF): LR = 12.7
• Ottawa Ankle Rules (Ankle #): -ve LR = 0.08
• Canadian C-Spine Rules (C-spine #): -ve LR = 0.013 (vs NEXUS -ve LR = 0.25)

JAMA 2000; 283: 3110-7. J Gen Intern Med 1988: 423-8. Ann Emerg Med 1996; 27: 693-5. Am J Med 2004; 116: 363-8. BMJ 2003; 326: 417. NEJM 2003; 349: 2510-8.

Math Diagnostic Tests: Summary

• Likelihood Ratios are the best we have
• Tell if a symptom, sign or test is useful
• Help us determine the probability of a diagnosis

Users Guides: Diagnosis

Apply to patient care?

• Are the test and its interpretation reproducible (Kappa)?

• Is the test result the same when reapplied by the same observer (intra-observer variability)?

• Do different observers agree about the test result (inter-observer variability)?

• Examples (Kappa):
– Specialist doing JVP = 0.42
– Specialist assessing DM retinopathy from photograph = 0.55
– Interpreting mammogram = 0.67

Greenhalgh T. How to Read a Paper (The basics of evidence based medicine). 2001
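Kappa measures agreement between observers beyond what chance alone would produce. The sketch below computes Cohen's kappa for two observers making a positive/negative call; the function name and the patient counts are hypothetical and chosen only for illustration, not drawn from any study cited here.

```python
def cohens_kappa(both_pos, a_pos_b_neg, a_neg_b_pos, both_neg):
    """Cohen's kappa for two observers (A and B) rating the same patients.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    n = both_pos + a_pos_b_neg + a_neg_b_pos + both_neg
    observed = (both_pos + both_neg) / n
    a_pos_rate = (both_pos + a_pos_b_neg) / n  # how often A calls positive
    b_pos_rate = (both_pos + a_neg_b_pos) / n  # how often B calls positive
    # Agreement expected if A and B rated independently at those rates.
    expected = a_pos_rate * b_pos_rate + (1 - a_pos_rate) * (1 - b_pos_rate)
    return (observed - expected) / (1 - expected)


# HYPOTHETICAL counts: 100 patients, observers agree on 80 of them.
kappa = cohens_kappa(both_pos=40, a_pos_b_neg=10, a_neg_b_pos=10, both_neg=40)
print(f"kappa = {kappa:.2f}")
```

In this made-up case the observers agree 80% of the time, but half of that agreement would be expected by chance, so kappa is 0.60.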

Apply to patient care?

• Are the results applicable to the patient in my practice?

– Are the patients in the study like mine?

Apply to patient care?

• Will the results change my management strategy?

– Are the test LRs high or low enough to shift post-test probability across a test or treatment threshold?
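One way to make this question concrete is to compare the post-test probability with the two thresholds. The sketch below is illustrative only: the function name `management_zone` and the threshold values (5% and 80%) are hypothetical assumptions; real thresholds depend on the disease, the risks of further testing, and the benefits and harms of treatment.

```python
def management_zone(pre_test_probability, likelihood_ratio,
                    test_threshold, treatment_threshold):
    """Return (post-test probability, zone) for a given test result.

    Zones: "ignore" (below the test threshold), "act" (above the
    treatment threshold), or "keep testing" (in between).
    """
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    post = post_odds / (1 + post_odds)
    if post < test_threshold:
        return post, "ignore"
    if post > treatment_threshold:
        return post, "act"
    return post, "keep testing"


# HYPOTHETICAL thresholds; pre-test 20% and LRs reuse the ferritin example.
zones = {}
for lr in (42.5, 3.15, 0.13):
    post, zone = management_zone(0.20, lr,
                                 test_threshold=0.05,
                                 treatment_threshold=0.80)
    zones[lr] = zone
    print(f"LR = {lr}: post-test probability = {post:.0%} -> {zone}")
```

Under these assumed thresholds, only the extreme results move the patient out of the zone of uncertainty; the intermediate result (LR = 3.15) does not change management.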

Apply to patient care?

• Will patients be better off as a result of the test?

– Will the anticipated changes do more good than harm?

– Effect of clinically insignificant disease

Summary

Key concepts:

Reference Standard

– You cannot decide if a test works unless you have a “gold standard”.

Likelihood Ratio

– To determine the utility of a test, find how much a given result will shift the likelihood of a diagnosis.

Who cares?

– Think about the “ignore” and “act” thresholds and whether the test moves you from uncertainty into either zone.

The End
