Approaches to test evaluation - AusVet Animal Health Services
Approaches to test evaluation
Evan Sergeant
AusVet Animal Health Services
10 May 2010
Comparing tests
Kappa – how well tests agree
McNemar’s chi-sq – are tests significantly different?
Kappa
                 Test 2 result
Test 1 result    T2+     T2-     Total
T1+              121     36      157
T1-              34      931     965
Total            155     967     1122
Expected no. both +ve = (157 x 155)/1122 = 21.7
Expected no. both -ve = (965 x 967)/1122 = 831.7
Total Agreement = 121 + 931 = 1052
Chance Agreement = 21.7 + 831.7 = 853.4
K = (1052 - 853.4)/(1122 - 853.4) = 0.739
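The kappa arithmetic above can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides; the function name `cohen_kappa` and the cell labels a, b, c, d are illustrative choices.

```python
def cohen_kappa(a, b, c, d):
    """Kappa for a 2x2 table: a = both +ve, b = T1+/T2-, c = T1-/T2+, d = both -ve."""
    n = a + b + c + d
    observed = (a + d) / n
    # chance agreement, computed from the row and column totals
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (observed - expected) / (1 - expected)

kappa = cohen_kappa(121, 36, 34, 931)
print(round(kappa, 3))  # 0.739
```

Working in proportions rather than counts gives the same result as the slide, because numerator and denominator are both divided by the total of 1122.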
McNemar Chi-Squared
                 Test 2 result
Test 1 result    T2+     T2-     Total
T1+              58      37      95
T1-              5       196     201
Total            63      233     296
McNemar's Chi-squared test with continuity correction
McNemar's chi-squared = 22.881, df = 1, p-value = 1.724e-06
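The statistic above depends only on the two discordant cells (37 and 5). A minimal sketch of the continuity-corrected calculation follows; the function name `mcnemar_corrected` is illustrative, and the p-value uses the identity that the upper tail of a chi-squared distribution with 1 df equals erfc(sqrt(x/2)).

```python
import math

def mcnemar_corrected(b, c):
    """McNemar chi-squared (1 df) with continuity correction from the discordant cells."""
    chi_sq = (abs(b - c) - 1) ** 2 / (b + c)
    # upper tail of chi-squared with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi_sq / 2))
    return chi_sq, p_value

chi_sq, p_value = mcnemar_corrected(37, 5)
print(round(chi_sq, 3), p_value)
```

The concordant cells (58 and 196) do not enter the calculation, which is why McNemar's test answers a different question from kappa.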
OJD AGID and ELISA
                 ELISA
AGID             +       –       Total
+                34      15      49
–                21      154     175
Total            55      169     224
Enter data into EpiTools
• Application of diagnostic tests > compare 2 tests
• see kappa, McNemar’s and level of agreement
Kappa                              0.5496
SE for kappa = 0                   0.0666
Z(kappa)                           8.25
p(kappa) - one-tailed              0
Proportion positive agreement      0.6538
Proportion negative agreement      0.8953
Overall proportion agreement       0.8393
McNemar's Chi sq                   0.6944
p(Chi sq)                          0.4
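The slides do not give formulas for the agreement proportions. The sketch below uses the usual definitions (positive agreement = 2a/(2a+b+c), negative agreement = 2d/(2d+b+c)), which are an assumption here but do reproduce the values listed above for the OJD table; the function name is illustrative.

```python
def agreement_proportions(a, b, c, d):
    """Positive, negative and overall agreement for a 2x2 table.

    a = both +ve, b and c = discordant cells, d = both -ve.
    """
    positive = 2 * a / (2 * a + b + c)
    negative = 2 * d / (2 * d + b + c)
    overall = (a + d) / (a + b + c + d)
    return positive, negative, overall

pos, neg, overall = agreement_proportions(34, 15, 21, 154)
print(round(pos, 4), round(neg, 4), round(overall, 4))  # 0.6538 0.8953 0.8393
```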
Gold Standard Tests
Use tests with perfect sensitivity and/or
specificity to identify the true disease
status of the individual from which the
samples were taken.
What are the advantages and
disadvantages of this approach?
Gold Standard Tests
Advantages
• Known disease status
• Relatively simple calculations
Disadvantages
• May not exist, or be prohibitively expensive
• Rare diseases may only allow small sample size
• Disease may not be present in the country?
• Difficult to get representative (or even comparable) samples of diseased/non-diseased individuals
Exercises
Calculate Se and Sp for OJD AGID using data provided in OJD_AGID_Data.xls
• Calculate confidence limits using EpiTools
Non-gold standard methods
Do not depend on determining true
infection status of individual.
Rely on statistical approaches to
calculate best fit values for Se and Sp.
Tests must satisfy some important
assumptions.
Comparison with a known reference test
Assumptions
• Independence of tests
• Se/Sp of reference test is known.
For ~100% specific reference test,
• Se(new test) = Number positive both tests /
Total number positive to the reference test
Culture vs Serology
Estimate sensitivity of culture and serology (as flock tests)
Serology followed up by histopathology to confirm flock status
Both tests 100% specificity (as flock tests)
How would you estimate sensitivity for these test(s)?
Which test has better Se? Is the difference significant?
All Flocks
                 Serology
PFC              +ve     -ve     Total
+ve              58      37      95
-ve              5       196     201
Total            63      233     296
Example
Se (PFC) = 58/63 = 92% (83% - 97%)
Se (Serology) = 58/95 = 61% (51% - 70%)
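Because both tests are taken as 100% specific, each one can serve as the reference for the other: Se of one test = number positive to both / number positive to the other test. The sketch below applies that to the flock table. The slides do not say how the confidence limits were calculated; the Wilson score interval used here is an assumption, although it rounds to the same percentages.

```python
import math

def wilson_ci(x, n, z=1.96):
    """Wilson score interval for a proportion x/n (z = 1.96 for 95% limits)."""
    p = x / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half

both_positive = 58
serology_positive = 63   # reference for PFC
pfc_positive = 95        # reference for serology

se_pfc = both_positive / serology_positive
se_serology = both_positive / pfc_positive
print(round(se_pfc, 2), [round(v, 2) for v in wilson_ci(both_positive, serology_positive)])
print(round(se_serology, 2), [round(v, 2) for v in wilson_ci(both_positive, pfc_positive)])
```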
                                   Value
Kappa                              0.6427
SE for kappa = 0                   0.0559
Z(kappa)                           11.49
p(kappa) - one-tailed              0
Proportion positive agreement      0.7342
Proportion negative agreement      0.9032
Overall proportion agreement       0.8581
McNemar's Chi sq                   22.881
p(Chi sq)                          0
Estimation from routine testing data
Test-positives are subject to follow-up, and truly infected animals are identified and removed from the population
Can be used to estimate specificity when the disease is rare in the population of interest, since nearly all remaining reactors are then false positives
Sp = 1 – (Number of reactors / Total number tested)
Se and Sp of equine influenza ELISA
During the equine influenza outbreak in Australia, horses were tested by PCR and serology:
• to confirm infection;
• to demonstrate seroconversion and/or absence of infection >30 days later;
• as part of random and targeted surveillance for case detection, to confirm area status and for zone progression in presumed “EI free” areas.
How could you use the resulting data to estimate sensitivity and specificity of the ELISA?
Equine influenza ELISA
475 PCR-positive horses, 471 also positive on ELISA
1323 horses from properties in areas with no infection, 1280 ELISA negative
Analyse in EpiTools
• Application of diagnostic tests > test evaluation against gold standard
Sergeant, E. S. G., Kirkland, P. D. & Cowled, B. D. 2009. Field Evaluation of an
equine influenza ELISA used in New South Wales during the 2007 Australian
outbreak response. Preventive Veterinary Medicine, 92, 382-385.
               Point Estimate   Lower 95% CL   Upper 95% CL
Sensitivity    0.9916           0.9786         0.9977
Specificity    0.9675           0.9565         0.9764
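The point estimates follow directly from the counts: Se = 471/475 and Sp = 1280/1323. The sketch below also computes exact binomial (Clopper-Pearson) limits by bisection. Which interval method lies behind the table is an assumption on my part, and the helper names `binom_cdf` and `exact_ci` are illustrative.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    log_p, log_q = math.log(p), math.log(1 - p)
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                    + i * log_p + (n - i) * log_q)
        total += math.exp(log_term)
    return min(total, 1.0)

def exact_ci(x, n, alpha=0.05, steps=40):
    """Clopper-Pearson limits found by bisection on the binomial tail probabilities."""
    lower, upper = 0.0, 1.0
    if x > 0:
        lo, hi = 0.0, 1.0
        for _ in range(steps):
            mid = (lo + hi) / 2
            # P(X >= x) increases with p; the lower limit is where it equals alpha/2
            if 1 - binom_cdf(x - 1, n, mid) < alpha / 2:
                lo = mid
            else:
                hi = mid
        lower = (lo + hi) / 2
    if x < n:
        lo, hi = 0.0, 1.0
        for _ in range(steps):
            mid = (lo + hi) / 2
            # P(X <= x) decreases with p; the upper limit is where it equals alpha/2
            if binom_cdf(x, n, mid) > alpha / 2:
                lo = mid
            else:
                hi = mid
        upper = (lo + hi) / 2
    return lower, upper

se, se_ci = 471 / 475, exact_ci(471, 475)
sp, sp_ci = 1280 / 1323, exact_ci(1280, 1323)
print(round(se, 4), [round(v, 4) for v in se_ci])
print(round(sp, 4), [round(v, 4) for v in sp_ci])
```

The log-space sum avoids overflow in the binomial coefficients for the larger sample.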
Mixture modelling
Assumptions
• observed distribution of test results (for a test with a continuous outcome reading such as an ELISA) is actually a mixture of two frequency distributions, one for infected individuals and one for uninfected individuals
Opsteegh, M., Teunis, P., Mensink, M., Zuchner, L., Titilincu, A., Langelaar, M.
& van der Giessen, J. 2010. Evaluation of ELISA test characteristics and
estimation of Toxoplasma gondii seroprevalence in Dutch sheep using mixture
models. Preventive Veterinary Medicine.
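To make the mixture idea concrete, here is a minimal sketch that fits two normal components to simulated readings with the EM algorithm and then reads Se and Sp off the fitted components at a chosen cut-off. Everything in it is an assumption for illustration: the readings are simulated (not from the cited study), the components are taken to be normal, and the cut-off and function name are hypothetical.

```python
import math
import random
from statistics import NormalDist

def fit_two_normal_mixture(values, n_iter=100):
    """EM fit of a two-component normal mixture.

    Returns (prevalence, (mean, sd) of lower component, (mean, sd) of upper component).
    """
    xs = sorted(values)
    n = len(xs)
    mean_all = sum(xs) / n
    sd_all = math.sqrt(sum((x - mean_all) ** 2 for x in xs) / n)
    # crude starting values (an assumption): 10th and 90th percentiles as component means
    mu_lo, mu_hi = xs[n // 10], xs[(9 * n) // 10]
    sd_lo = sd_hi = sd_all
    prev = 0.5
    for _ in range(n_iter):
        lo, hi = NormalDist(mu_lo, sd_lo), NormalDist(mu_hi, sd_hi)
        # E-step: probability that each reading belongs to the upper (infected) component
        resp = []
        for x in xs:
            p_hi = prev * hi.pdf(x)
            p_lo = (1 - prev) * lo.pdf(x)
            resp.append(p_hi / (p_hi + p_lo))
        # M-step: weighted updates of prevalence, means and standard deviations
        w_hi = sum(resp)
        w_lo = n - w_hi
        prev = w_hi / n
        mu_hi = sum(r * x for r, x in zip(resp, xs)) / w_hi
        mu_lo = sum((1 - r) * x for r, x in zip(resp, xs)) / w_lo
        sd_hi = math.sqrt(sum(r * (x - mu_hi) ** 2 for r, x in zip(resp, xs)) / w_hi)
        sd_lo = math.sqrt(sum((1 - r) * (x - mu_lo) ** 2 for r, x in zip(resp, xs)) / w_lo)
    return prev, (mu_lo, sd_lo), (mu_hi, sd_hi)

# Simulated readings (hypothetical): 1500 uninfected and 500 infected animals
random.seed(1)
readings = ([random.gauss(0.2, 0.1) for _ in range(1500)]
            + [random.gauss(1.0, 0.25) for _ in range(500)])
prev, (mu_lo, sd_lo), (mu_hi, sd_hi) = fit_two_normal_mixture(readings)

cutoff = 0.5  # hypothetical cut-off for calling a reading positive
se = 1 - NormalDist(mu_hi, sd_hi).cdf(cutoff)   # infected readings above the cut-off
sp = NormalDist(mu_lo, sd_lo).cdf(cutoff)       # uninfected readings below the cut-off
print(round(prev, 3), round(se, 3), round(sp, 3))
```

The mixing proportion of the upper component is the prevalence estimate, so one fit yields prevalence, Se and Sp together without knowing any animal's true status.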
Latent Class Analysis
What is Latent Class Analysis?
Maximum Likelihood
Bayesian
Maximum likelihood estimation
Assumptions
• The tests are independent conditional on disease status (the
sensitivity [specificity] of one test is the same, regardless of
the result of the other test);
• The tests are compared in two or more populations with
different prevalence between populations;
• Test sensitivity and specificity are constant across
populations; and
• There are at least as many populations as there are tests
being evaluated.
TAGS software
• Hui, S. L. & Walter, S. D. 1980. Estimating the error rates of diagnostic tests. Biometrics, 36, 167-171.
TAGS
Open R – shortcut in root directory of stick
Open tags.R in a text editor or Word
Select all and copy/paste into R console
Type TAGS() and <Enter> to run
Hui Walter example
• 2 tests for TB
• Test 1 = Mantoux
• Test 2 = Tine test
Follow the prompts to enter data:
• Data set = new
• Name = test
• Number of tests = 2, Number of populations = 2
• Reference population? = No (0)
• Enter results for each population from table below
• Best guesses use defaults
• Bootstrap CI = Yes (1000 iterations)
Data
Test 1           0       1       0       1
Test 2           0       0       1       1
Population 1     528     4       9       14
Population 2     367     31      37      887
$Estimations
        pre1    pre2    Sp1     Sp2     Se1     Se2
Est     0.0268  0.7168  0.9933  0.9841  0.9661  0.9688
CIinf   0.0159  0.6911  0.9797  0.9684  0.9495  0.9540
CIsup   0.0450  0.7412  0.9978  0.9921  0.9774  0.9790
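The point estimates in the TAGS output can be approximated outside R with a short EM routine for the two-test, two-population model, under the assumptions listed earlier (conditional independence, constant Se and Sp across populations). This is a sketch of the general technique, not the TAGS implementation; the function name and starting values are arbitrary, and it gives no bootstrap confidence limits.

```python
def hui_walter_em(populations, n_iter=5000):
    """EM estimates for two tests applied in several populations.

    populations: list of dicts mapping (test1, test2) results (0/1) to counts.
    Returns (prevalences, [se1, se2], [sp1, sp2]).
    """
    prev = [0.5] * len(populations)   # starting values are arbitrary assumptions
    se = [0.9, 0.9]
    sp = [0.9, 0.9]
    for _ in range(n_iter):
        inf_total = 0.0
        non_total = 0.0
        inf_pos = [0.0, 0.0]   # expected infected animals positive to each test
        non_neg = [0.0, 0.0]   # expected uninfected animals negative to each test
        new_prev = []
        for k, table in enumerate(populations):
            inf_k = 0.0
            for results, count in table.items():
                p_inf = prev[k]
                p_non = 1 - prev[k]
                for j, t in enumerate(results):
                    p_inf *= se[j] if t else 1 - se[j]
                    p_non *= 1 - sp[j] if t else sp[j]
                # E-step: expected number of infected animals in this cell
                w = count * p_inf / (p_inf + p_non)
                inf_k += w
                for j, t in enumerate(results):
                    if t:
                        inf_pos[j] += w
                    else:
                        non_neg[j] += count - w
            n_k = sum(table.values())
            new_prev.append(inf_k / n_k)
            inf_total += inf_k
            non_total += n_k - inf_k
        # M-step: update prevalences, sensitivities and specificities
        prev = new_prev
        se = [x / inf_total for x in inf_pos]
        sp = [x / non_total for x in non_neg]
    return prev, se, sp

# Hui & Walter TB data from the table above, keyed by (Test 1, Test 2) result
pop1 = {(0, 0): 528, (1, 0): 4, (0, 1): 9, (1, 1): 14}
pop2 = {(0, 0): 367, (1, 0): 31, (0, 1): 37, (1, 1): 887}
prev, se, sp = hui_walter_em([pop1, pop2])
print([round(v, 4) for v in prev], [round(v, 4) for v in se], [round(v, 4) for v in sp])
```

With two tests and two populations there are six parameters and six degrees of freedom, so the fitted model reproduces the observed cell counts exactly.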
Bayesian estimation
What is Bayesian estimation?
• Combines prior knowledge/belief (what you think you know)
with data to give best estimate
• Incorporates existing knowledge on parameters (Se, Sp,
prevalence)
• “Priors” entered as probability (usually Beta) distributions
• Uses Monte Carlo simulation to solve
• Outputs also as probability distributions
• Can get very complex
Assumptions
• Independence of the tests
• Appropriate prior distributions chosen.
• Need information on prior probabilities
• Some methods can adjust for correlated tests
• Multiple tests in multiple populations
Methods
• EpiTools (only allows one population so must have good information on one or more test characteristics)
• WinBUGS models
Bayesian analysis: surra data
                 Test 2 (ELISA)
Test 1 (CATT)    +ve     -ve     Total
+ve              0       39      39
-ve              0       251     251
Total            0       290     290
Inputs for Bayesian analysis for revised sensitivity and specificity estimates
Prior distributions for Bayesian analysis
Parameter              n       x       alpha   beta
Prev                                   1       1
Se_CATT (81%)          100     81      82      20
Sp_CATT (99.4%)        160     159     160     2
Se_ELISA_2 (75%)       100     75      76      26
Sp_ELISA_2 (97.5%)     120     117     118     4
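The alpha and beta values in the prior table follow the pattern alpha = x + 1 and beta = n - x + 1, where x is the number of "successes" in n notional trials. A minimal sketch of that conversion is below; the function name `beta_prior` is illustrative.

```python
def beta_prior(n, x):
    """Beta(alpha, beta) parameters for x successes in n notional trials."""
    return x + 1, n - x + 1

priors = {
    "Se_CATT": beta_prior(100, 81),
    "Sp_CATT": beta_prior(160, 159),
    "Se_ELISA_2": beta_prior(100, 75),
    "Sp_ELISA_2": beta_prior(120, 117),
}
for name, (alpha, beta) in priors.items():
    # the mode of Beta(alpha, beta) is (alpha - 1) / (alpha + beta - 2), i.e. x / n
    mode = (alpha - 1) / (alpha + beta - 2)
    print(name, alpha, beta, round(mode, 3))
```

A larger n gives a narrower prior around the same mode, so n expresses how much weight the prior belief carries relative to the data.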
EpiTools
Run EpiTools > Estimating true prevalence > Bayesian estimation with two tests
Enter parameters:
• Data from 2x2 table: 0, 39, 0, 251
• Prevalence = Beta(1,1) (uniform = don’t know)
• Test 1 (CATT): Se = Beta(82, 20), Sp = Beta(160, 2)
• Test 2 (ELISA): Se = Beta(76, 26), Sp = Beta(118, 4)
• Starting values: 0, 38, 0, 245
• Other values as defaults and click submit
             Prevalence   Sensitivity-1   Specificity-1   Sensitivity-2   Specificity-2
Minimum      <0.0001      0.6219          0.8535          0.5475          0.9554
2.5%         0.0001       0.7210          0.8818          0.6510          0.9789
Median       0.0038       0.8064          0.9109          0.7418          0.9910
97.5%        0.0201       0.8749          0.9354          0.8217          0.9973
Maximum      0.0567       0.9370          0.9517          0.8891          0.9998
Mean         0.0055       0.8044          0.9103          0.7406          0.9903
SD           0.0055       0.0393          0.0136          0.0436          0.0048
Iterations   20000        20000           20000           20000           20000
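For readers who want to see what the Monte Carlo step involves, the following is a rough sketch of a Gibbs sampler for two conditionally independent tests in one population, using the surra counts and the Beta priors listed above. It is not the EpiTools code: the function name, starting values, iteration count and burn-in are arbitrary choices. Because it is a simulation, its medians should be close to, but will not exactly match, the table above.

```python
import random
import statistics

def gibbs_two_tests(table, priors, n_iter=3000, burn_in=500, seed=1):
    """Gibbs sampler sketch; returns posterior medians.

    table: counts keyed by (test1, test2) result, 1 = positive, 0 = negative.
    priors: Beta (alpha, beta) pairs keyed 'prev', 'se1', 'sp1', 'se2', 'sp2'.
    """
    rng = random.Random(seed)
    prev, se, sp = 0.1, [0.8, 0.8], [0.9, 0.9]   # arbitrary starting values
    n_total = sum(table.values())
    draws = {name: [] for name in priors}
    for it in range(n_iter):
        infected = 0
        inf_pos = [0, 0]   # infected animals positive to test 1, test 2
        non_neg = [0, 0]   # uninfected animals negative to test 1, test 2
        for results, count in table.items():
            p_inf, p_non = prev, 1 - prev
            for j, t in enumerate(results):
                p_inf *= se[j] if t else 1 - se[j]
                p_non *= 1 - sp[j] if t else sp[j]
            p = p_inf / (p_inf + p_non)
            # draw the latent number of truly infected animals in this cell
            y = sum(rng.random() < p for _ in range(count))
            infected += y
            for j, t in enumerate(results):
                if t:
                    inf_pos[j] += y
                else:
                    non_neg[j] += count - y
        uninfected = n_total - infected
        # update each parameter from its Beta full conditional
        a, b = priors["prev"]
        prev = rng.betavariate(a + infected, b + uninfected)
        for j in range(2):
            a, b = priors[f"se{j + 1}"]
            se[j] = rng.betavariate(a + inf_pos[j], b + infected - inf_pos[j])
            a, b = priors[f"sp{j + 1}"]
            sp[j] = rng.betavariate(a + non_neg[j], b + uninfected - non_neg[j])
        if it >= burn_in:
            for name, value in (("prev", prev), ("se1", se[0]), ("sp1", sp[0]),
                                ("se2", se[1]), ("sp2", sp[1])):
                draws[name].append(value)
    return {name: statistics.median(values) for name, values in draws.items()}

# Test 1 = CATT, Test 2 = ELISA; counts from the surra 2x2 table
surra = {(1, 1): 0, (1, 0): 39, (0, 1): 0, (0, 0): 251}
priors = {"prev": (1, 1), "se1": (82, 20), "sp1": (160, 2),
          "se2": (76, 26), "sp2": (118, 4)}
medians = gibbs_two_tests(surra, priors)
print({name: round(value, 4) for name, value in medians.items()})
```

The 39 CATT-positive, ELISA-negative animals pull the CATT specificity well below its prior, which is the main revision visible in the results.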