Approaches to test evaluation - AusVet Animal Health Services

Approaches to test evaluation
Evan Sergeant
AusVet Animal Health Services
10 May 2010
Comparing tests
▪ Kappa: how well do the tests agree?
▪ McNemar’s chi-squared: are the tests significantly different?
Kappa
                 Test 2 result
Test 1 result    T2+     T2-     Total
T1+              121     36      157
T1-              34      931     965
Total            155     967     1122

Expected no. both +ve = (157 x 155)/1122 = 21.7
Expected no. both -ve = (965 x 967)/1122 = 831.7
Total agreement = 121 + 931 = 1052
Chance agreement = 21.7 + 831.7 = 853.4
K = (1052 - 853.4)/(1122 - 853.4) = 0.739
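The kappa arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation on the slide; the function name `cohen_kappa` and the cell labels `a`, `b`, `c`, `d` are assumptions introduced here, not part of the original material.

```python
def cohen_kappa(a, b, c, d):
    """Kappa for a 2x2 table.

    a = positive on both tests, b = T1+/T2-, c = T1-/T2+, d = negative on both.
    """
    n = a + b + c + d
    observed = a + d                       # total agreement
    expected_pos = (a + b) * (a + c) / n   # expected no. both +ve by chance
    expected_neg = (c + d) * (b + d) / n   # expected no. both -ve by chance
    chance = expected_pos + expected_neg   # chance agreement
    return (observed - chance) / (n - chance)


# Counts from the kappa table: T1+/T2+ = 121, T1+/T2- = 36, T1-/T2+ = 34, T1-/T2- = 931
kappa = cohen_kappa(121, 36, 34, 931)
print(round(kappa, 3))  # 0.739
```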
McNemar Chi-Squared
                 Test 2 result
Test 1 result    T2+     T2-     Total
T1+              58      37      95
T1-              5       196     201
Total            63      233     296

McNemar's Chi-squared test with continuity correction:
McNemar's chi-squared = 22.881, df = 1, p-value = 1.724e-06
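The statistic reported above depends only on the two discordant cells (37 and 5). A minimal Python sketch of the continuity-corrected calculation follows; the function name `mcnemar_cc` is hypothetical, and the p-value uses the identity that, for 1 degree of freedom, the upper tail of the chi-squared distribution equals erfc(sqrt(x/2)).

```python
import math


def mcnemar_cc(b, c):
    """McNemar's chi-squared with continuity correction.

    b, c = the two discordant cell counts (T1+/T2- and T1-/T2+).
    Returns (chi-squared statistic, p-value on 1 df).
    """
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # Upper-tail probability of a chi-squared variable with df = 1
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


chi2, p_value = mcnemar_cc(37, 5)
print(round(chi2, 3))  # 22.881
print(p_value)
```

Only the discordant pairs carry information about a difference between the tests, which is why the concordant cells (58 and 196) do not appear in the formula.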
OJD AGID and ELISA
                 ELISA
AGID             +       –       Total
+                34      15      49
–                21      154     175
Total            55      169     224

▪ Enter data into EpiTools
• Application of diagnostic tests > compare 2 tests
• see kappa, McNemar’s and level of agreement

Statistic                          Value
Kappa                              0.5496
SE for kappa = 0                   0.0666
Z(kappa)                           8.25
p(kappa) - one-tailed              0
Proportion positive agreement      0.6538
Proportion negative agreement      0.8953
Overall proportion agreement       0.8393
McNemar's Chi sq                   0.6944
p(Chi sq)                          0.4
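The three agreement proportions in this output can be reproduced from the AGID/ELISA counts. The sketch below assumes the usual definitions (positive agreement = 2a / (2a + b + c), negative agreement = 2d / (2d + b + c)); these match the values shown here, but the slides do not state the formulas, and the function name is hypothetical.

```python
def agreement_proportions(a, b, c, d):
    """Overall, positive and negative agreement for a 2x2 table.

    a = positive on both tests, b and c = discordant cells, d = negative on both.
    """
    n = a + b + c + d
    overall = (a + d) / n
    positive = 2 * a / (2 * a + b + c)
    negative = 2 * d / (2 * d + b + c)
    return overall, positive, negative


# OJD counts: AGID+/ELISA+ = 34, AGID+/ELISA- = 15, AGID-/ELISA+ = 21, AGID-/ELISA- = 154
overall, positive, negative = agreement_proportions(34, 15, 21, 154)
print(round(overall, 4), round(positive, 4), round(negative, 4))  # 0.8393 0.6538 0.8953
```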
Gold Standard Tests
▪ Use tests with perfect sensitivity and/or specificity to identify the true disease status of the individual from which the samples were taken.
▪ What are the advantages and disadvantages of this approach?
Gold Standard Tests
▪ Advantages
• Known disease status
• Relatively simple calculations
▪ Disadvantages
• May not exist, or be prohibitively expensive
• Rare diseases may only allow small sample size
• Disease may not be present in the country?
• Difficult to get representative (or even comparable) samples of diseased/non-diseased individuals
Exercises
▪ Calculate Se and Sp for OJD AGID using data provided in OJD_AGID_Data.xls
• Calculate confidence limits using EpiTools
Non-gold standard methods
▪ Do not depend on determining the true infection status of the individual.
▪ Rely on statistical approaches to calculate best-fit values for Se and Sp.
▪ Tests must satisfy some important assumptions.
Comparison with a known reference test
▪ Assumptions
• Independence of tests
• Se/Sp of reference test is known.
▪ For ~100% specific reference test,
• Se(new test) = Number positive to both tests / Total number positive to the reference test
Culture vs Serology
▪ Estimate sensitivity of culture and serology (as flock tests)
▪ Serology followed up by histopathology to confirm flock status
▪ Both tests have 100% specificity (as flock tests)
▪ How would you estimate sensitivity for these test(s)?
▪ Which test has better Se? Is the difference significant?
All Flocks
                 Serology
PFC              +ve     -ve     Total
+ve              58      37      95
-ve              5       196     201
Total            63      233     296
Example
▪ Se (PFC) = 58/63 = 92% (83% - 97%)
▪ Se (Serology) = 58/95 = 61% (51% - 70%)

Statistic                          Value
Kappa                              0.6427
SE for kappa = 0                   0.0559
Z(kappa)                           11.49
p(kappa) - one-tailed              0
Proportion positive agreement      0.7342
Proportion negative agreement      0.9032
Overall proportion agreement       0.8581
McNemar's Chi sq                   22.881
p(Chi sq)                          0
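Because both flock tests are taken to be 100% specific, each can act as the reference for the other: the sensitivity of one test is the number of flocks positive on both, divided by the number positive on the other test. A hedged Python sketch follows. The Wilson score interval used for the confidence limits is an assumption (the slides do not say which method produced the limits shown), and the function and variable names are hypothetical.

```python
import math


def wilson_interval(x, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion x/n."""
    p = x / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


both_pos = 58   # flocks positive on PFC and serology
pfc_pos = 95    # flocks positive on PFC (all truly infected if Sp = 100%)
sero_pos = 63   # flocks positive on serology (all truly infected if Sp = 100%)

se_pfc = both_pos / sero_pos    # serology-positive flocks are the reference
se_sero = both_pos / pfc_pos    # PFC-positive flocks are the reference

for name, x, n in [("PFC", both_pos, sero_pos), ("Serology", both_pos, pfc_pos)]:
    lo, hi = wilson_interval(x, n)
    print(f"Se({name}) = {x / n:.2f} ({lo:.2f} - {hi:.2f})")
```

Whether the difference in sensitivity is significant is answered by McNemar's test on the discordant flocks (37 and 5), as reported in the output above.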
Estimation from routine testing data
▪ Test-positives are subject to follow-up, and truly infected animals are identified and removed from the population
▪ Can be used to estimate specificity when the disease is rare in the population of interest.
▪ Sp = 1 - (Number of reactors / Total number tested)
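As a small worked illustration of the formula above, the following Python sketch treats every reactor as a false positive, which is only reasonable when the disease is rare. The counts (12 reactors among 4000 animals tested) are invented purely for illustration and do not come from the slides; the function name is hypothetical.

```python
def specificity_from_routine_testing(reactors, total_tested):
    """Sp = 1 - (number of reactors / total number tested).

    Assumes disease is rare, so that essentially all reactors are false positives.
    """
    if total_tested <= 0:
        raise ValueError("total_tested must be positive")
    return 1 - reactors / total_tested


# Hypothetical counts, for illustration only
sp = specificity_from_routine_testing(reactors=12, total_tested=4000)
print(round(sp, 4))  # 0.997
```

If some reactors are in fact truly infected, this estimate is biased downward, so it is best read as a lower bound on specificity.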
Se and Sp of equine influenza ELISA
▪ During the equine influenza outbreak in Australia, horses were tested by PCR and serology:
• to confirm infection;
• to demonstrate seroconversion and/or absence of infection >30 days later;
• as part of random and targeted surveillance for case detection, to confirm area status and for zone progression in presumed “EI free” areas.
▪ How could you use the resulting data to estimate sensitivity and specificity of the ELISA?
Equine influenza ELISA
▪ 475 PCR-positive horses, 471 also positive on ELISA
▪ 1323 horses from properties in areas with no infection, 1280 ELISA negative
▪ Analyse in EpiTools
• Application of diagnostic tests > test evaluation against gold standard

Sergeant, E. S. G., Kirkland, P. D. & Cowled, B. D. 2009. Field Evaluation of an
equine influenza ELISA used in New South Wales during the 2007 Australian
outbreak response. Preventive Veterinary Medicine, 92, 382-385.
               Point Estimate   Lower 95% CL   Upper 95% CL
Sensitivity    0.9916           0.9786         0.9977
Specificity    0.9675           0.9565         0.9764
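The point estimates in this table follow directly from the two reference groups: PCR-positive horses stand in for truly infected animals, and horses from areas with no infection stand in for truly uninfected animals. A minimal Python sketch of that arithmetic (variable names are assumptions; confidence limits are left to EpiTools):

```python
# Reference-positive group: PCR-positive horses
pcr_positive = 475
elisa_positive_among_pcr_positive = 471

# Reference-negative group: horses from properties in areas with no infection
free_area_horses = 1323
elisa_negative_among_free = 1280

sensitivity = elisa_positive_among_pcr_positive / pcr_positive
specificity = elisa_negative_among_free / free_area_horses

print(f"Se = {sensitivity:.4f}, Sp = {specificity:.4f}")  # Se = 0.9916, Sp = 0.9675
```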
Mixture modelling
▪ Assumptions
• The observed distribution of test results (for a test with a continuous outcome reading, such as an ELISA) is actually a mixture of two frequency distributions, one for infected individuals and one for uninfected individuals

Opsteegh, M., Teunis, P., Mensink, M., Zuchner, L., Titilincu, A., Langelaar, M.
& van der Giessen, J. 2010. Evaluation of ELISA test characteristics and
estimation of Toxoplasma gondii seroprevalence in Dutch sheep using mixture
models. Preventive Veterinary Medicine.
Latent Class Analysis
▪ What is Latent Class Analysis?
▪ Maximum Likelihood
▪ Bayesian
Maximum likelihood estimation
▪ Assumptions
• The tests are independent conditional on disease status (the sensitivity [specificity] of one test is the same, regardless of the result of the other test);
• The tests are compared in two or more populations with different prevalence between populations;
• Test sensitivity and specificity are constant across populations; and
• There are at least as many populations as there are tests being evaluated.
▪ TAGS software
• Hui, S. L. & Walter, S. D. 1980. Estimating the error rates of diagnostic tests. Biometrics, 36, 167-171.
TAGS
▪ Open R (shortcut in root directory of stick)
▪ Open tags.R in a text editor or Word
▪ Select all and copy/paste into R console
▪ Type TAGS() and <Enter> to run
▪ Hui Walter example
• 2 tests for TB
• Test 1 = Mantoux
• Test 2 = Tine test
▪ Follow the prompts to enter data:
• Data set = new
• Name = test
• Number of tests = 2, Number of populations = 2
• Reference population? = No (0)
• Enter results for each population from table below
• Best guesses use defaults
• Bootstrap CI = Yes (1000 iterations)

Data
Test 1          0      1      0      1
Test 2          0      0      1      1
Population 1    528    4      9      14
Population 2    367    31     37     887

▪ $Estimations
         pre1     pre2     Sp1      Sp2      Se1      Se2
Est      0.0268   0.7168   0.9933   0.9841   0.9661   0.9688
CIinf    0.0159   0.6911   0.9797   0.9684   0.9495   0.9540
CIsup    0.0450   0.7412   0.9978   0.9921   0.9774   0.9790
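For readers who want to see what a two-test, two-population maximum likelihood fit involves, here is a rough Python sketch that uses an EM algorithm on the Mantoux/Tine counts above. It is not the TAGS code and makes no claim about how TAGS is implemented; the function name, starting values and iteration count are assumptions chosen for illustration, and no bootstrap confidence intervals are computed.

```python
def hui_walter_em(populations, iterations=2000):
    """EM fit of a 2-test latent class model assuming conditional independence.

    populations: list of dicts mapping (test1, test2) results (0/1) to counts.
    Returns (prevalences, sensitivities, specificities).
    """
    se = [0.9, 0.9]                      # starting values (assumed)
    sp = [0.9, 0.9]
    prev = [0.5] * len(populations)
    for _ in range(iterations):
        infected = [0.0] * len(populations)   # expected infected per population
        se_num = [0.0, 0.0]                   # expected infected and test-positive
        sp_num = [0.0, 0.0]                   # expected uninfected and test-negative
        inf_sum = uninf_sum = 0.0
        for k, counts in enumerate(populations):
            for results, n in counts.items():
                # E-step: probability that animals in this cell are infected
                p_inf, p_un = prev[k], 1 - prev[k]
                for j, t in enumerate(results):
                    p_inf *= se[j] if t else 1 - se[j]
                    p_un *= (1 - sp[j]) if t else sp[j]
                w = p_inf / (p_inf + p_un)
                infected[k] += n * w
                inf_sum += n * w
                uninf_sum += n * (1 - w)
                for j, t in enumerate(results):
                    if t:
                        se_num[j] += n * w
                    else:
                        sp_num[j] += n * (1 - w)
        # M-step: update parameters from the expected counts
        prev = [infected[k] / sum(populations[k].values())
                for k in range(len(populations))]
        se = [s / inf_sum for s in se_num]
        sp = [s / uninf_sum for s in sp_num]
    return prev, se, sp


pop1 = {(0, 0): 528, (1, 0): 4, (0, 1): 9, (1, 1): 14}
pop2 = {(0, 0): 367, (1, 0): 31, (0, 1): 37, (1, 1): 887}
prev, se, sp = hui_walter_em([pop1, pop2])
print([round(v, 4) for v in prev + sp + se])
```

Starting the sensitivities and specificities above 0.5 matters: the likelihood has a mirror-image solution (Se and Sp swapped with their complements), and the starting values decide which one the algorithm settles on.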
Bayesian estimation
▪ What is Bayesian estimation?
• Combines prior knowledge/belief (what you think you know) with data to give best estimate
• Incorporates existing knowledge on parameters (Se, Sp, prevalence)
• “Priors” entered as probability (usually Beta) distributions
• Uses Monte Carlo simulation to solve
• Outputs also as probability distributions
• Can get very complex
▪ Assumptions
• Independence of the tests
• Appropriate prior distributions chosen.
• Need information on prior probabilities
• Some methods can adjust for correlated tests
• Multiple tests in multiple populations
▪ Methods
• EpiTools (only allows one population so must have good information on one or more test characteristics)
• WinBUGS models
Bayesian analysis surra data
                   Test 2 (ELISA)
Test 1 (CATT)      +ve     -ve     Total
+ve                0       39      39
-ve                0       251     251
Total              0       290     290
Inputs for Bayesian analysis for revised sensitivity and specificity estimates
Prior distributions for Bayesian analysis
Parameter             n      x      alpha   beta
Prev                                1       1
Se_CATT (81%)         100    81     82      20
Sp_CATT (99.4%)       160    159    160     2
Se_ELISA_2 (75%)      100    75     76      26
Sp_ELISA_2 (97.5%)    120    117    118     4
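The alpha and beta values in this table are consistent with the common convention of turning "x correct results out of n" into a Beta(x + 1, n - x + 1) prior. The slides do not state the rule explicitly, so treat the sketch below as an inferred illustration; the helper name `beta_prior` is hypothetical.

```python
def beta_prior(x, n):
    """Beta(alpha, beta) parameters from x 'successes' in n trials.

    Uses alpha = x + 1, beta = n - x + 1, so the prior mode equals x / n.
    """
    return x + 1, n - x + 1


priors = {
    "Se_CATT": beta_prior(81, 100),      # 81%
    "Sp_CATT": beta_prior(159, 160),     # 99.4%
    "Se_ELISA_2": beta_prior(75, 100),   # 75%
    "Sp_ELISA_2": beta_prior(117, 120),  # 97.5%
}

for name, (alpha, beta) in priors.items():
    mode = (alpha - 1) / (alpha + beta - 2)
    print(f"{name}: Beta({alpha}, {beta}), mode = {mode:.3f}")
```

A larger n gives a narrower prior, so the choice of n expresses how much confidence is placed in the earlier estimate. Beta(1, 1), used for prevalence, corresponds to x = 0 and n = 0, that is, no prior information.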
EpiTools
▪ Run EpiTools > Estimating true prevalence > Bayesian estimation with two tests
▪ Enter parameters:
• Data from 2x2 table: 0, 39, 0, 251
• Prevalence = Beta(1,1) (uniform = don’t know)
• Test 1 (CATT): Se = Beta(82, 20), Sp = Beta(160, 2)
• Test 2 (ELISA): Se = Beta(76, 26), Sp = Beta(118, 4)
• Starting values: 0, 38, 0, 245
• Other values as defaults and click submit
              Prevalence   Sensitivity-1   Specificity-1   Sensitivity-2   Specificity-2
Minimum       <0.0001      0.6219          0.8535          0.5475          0.9554
2.5%          0.0001       0.7210          0.8818          0.6510          0.9789
Median        0.0038       0.8064          0.9109          0.7418          0.9910
97.5%         0.0201       0.8749          0.9354          0.8217          0.9973
Maximum       0.0567       0.9370          0.9517          0.8891          0.9998
Mean          0.0055       0.8044          0.9103          0.7406          0.9903
SD            0.0055       0.0393          0.0136          0.0436          0.0048
Iterations    20000        20000           20000           20000           20000