Diagnostics
Methods Comparison Studies for Quantitative Nucleic Acid Assays
Jacqueline Law, Art DeVault
Roche Molecular Systems
Sept 19, 2003


Outline
- Introduction
- PCR-based quantitative nucleic acid assays
- Literature references
- Acceptance criteria
- Examples
- References

Methods Comparison Studies: to validate a new assay
- Purposes:
  - To show that the new assay agrees well with the reference assays
  - To show that the assay performs similarly with different types of specimen
- Premises of methods comparison studies:
  - A linear relationship between the two assays
  - LOD and dynamic range must already be established
  - An appropriate transformation to normalize the data
- Analysis:
  - To detect constant bias and proportional bias

Constant Bias: the difference between the two methods is constant across the data range
[Figure: two panels, Method Y vs. Method X and Difference vs. Average]

Proportional Bias: the difference between the two methods is linear across the data range
[Figure: two panels, Method Y vs. Method X and Difference vs. Average]

PCR based nucleic acid assays
- To quantify the viral load by a PCR method
- Characteristics:
  - A wide dynamic range (e.g. 10 cp/mL to 1E7 cp/mL)
  - Skewed (non-normal) distribution: a log10 transformation is typically applied to the data
  - Heteroscedasticity: variance is higher at higher titer levels
    - log10 transformation may not achieve homogeneity of variance (variance at the lower end may increase)
    - Other transformation: $\log\left(x + \sqrt{x^2 + \lambda^2}\right)$
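
A short worked approximation may help explain why log10 is the usual first choice. It assumes, purely for illustration, that the assay has a constant coefficient of variation (CV) on the cp/mL scale, so the SD grows in proportion to the mean titer $\mu$. The delta method then gives a variance on the log10 scale that no longer depends on $\mu$:

```latex
\mathrm{SD}(X) = \mathrm{CV}\cdot\mu
\quad\Longrightarrow\quad
\mathrm{Var}(\log_{10} X) \approx \left(\frac{1}{\mu \ln 10}\right)^{2} \mathrm{Var}(X)
= \left(\frac{\mathrm{CV}}{\ln 10}\right)^{2}
```

Under that assumption a CV of 30% corresponds to an SD of roughly 0.13 log10 at every titer level. If the CV itself rises near the lower end of the range, the log-scale variance rises there too, which is the situation where log10 alone does not achieve homogeneity.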

PCR based assays: a wide dynamic range - data are log10 transformed
[Figure: Observed Titer (cp/mL) and Observed Titer (log cp/mL) vs. Nominal Titer (cp/mL)]

PCR based assays: log10 transformation may remove some skewness
[Figure: distributions of Titer at individual nominal levels, Untransformed vs. log10 transformed]

Literature references on Methods Comparison Studies
- Correlation coefficient
- Other coefficients
- T-test
- Bland-Altman plot
- Ordinary least squares regression
- Passing-Bablok regression
- Deming regression

Correlation coefficient R or R²
- Measures the strength of the linear relationship between two assays
- Does not measure agreement: cannot detect constant or proportional bias
- Can be artificially high for assays that cover a wide range: how high is high? 0.95? 0.99? 0.995?
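
The point that correlation does not measure agreement can be illustrated with a minimal sketch. The titers below are hypothetical, and the bias values (+0.5 constant, slope 1.2) are chosen only for illustration; `pearson_r` is a helper written for this example.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

# Hypothetical log10 titers spanning a wide range (2 to 7 log10 cp/mL).
x = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
# Second assay with a constant bias (+0.5) and a proportional bias (slope 1.2).
y = [0.5 + 1.2 * v for v in x]

r = pearson_r(x, y)
max_abs_diff = max(abs(b - a) for a, b in zip(x, y))
print(f"r = {r:.3f}, largest |Y - X| = {max_abs_diff:.2f} log10")
```

Because Y is an exact linear function of X, the correlation is essentially perfect even though the two assays disagree by almost 2 log10 at the top of the range.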

Other coefficients
- Concordance coefficient (Lin, 1989):
  - Measures the strength of the relationship between two assays that fall on the 45° line through the origin
  - $\rho_c = \dfrac{2\sigma_{12}}{\sigma_1^2 + \sigma_2^2 + (\mu_1 - \mu_2)^2}$
- Gold-standard correlation coefficient (St. Laurent, 1998):
  - Measures the agreement between a new assay and a gold standard
  - $\rho_G = \sqrt{\dfrac{S_{GG}}{S_{DD} + S_{GG}}}$
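
As a rough sketch of how the concordance coefficient penalizes bias, the function below computes Lin's coefficient from paired data using moment estimates with divisor n. The function name and the data are hypothetical and only meant for illustration.

```python
def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # Moment estimates with divisor n (an implementation choice for this sketch).
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)

x = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]        # hypothetical log10 titers
y_shifted = [v + 0.5 for v in x]          # constant bias of 0.5 log10

print(concordance_ccc(x, x))              # identical measurements
print(concordance_ccc(x, y_shifted))      # drops below 1 although Pearson r stays at 1
```

The squared difference in means sits in the denominator, so any shift away from the 45° line lowers the coefficient even when the points are perfectly collinear.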

T-test
- Paired t-test on the difference in the measurements by the two assays
  - Can only detect constant bias
  - Cannot detect proportional bias
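
A small sketch of why the paired t-test misses proportional bias: in the hypothetical data below the second assay has slope 1.2 but crosses the identity line at the center of the range, so the positive and negative differences cancel. The helper `paired_t` is written for this example and returns only the statistic, not a p-value.

```python
import math
import statistics

def paired_t(x, y):
    """Return (mean difference, t statistic, degrees of freedom) for Y - X."""
    d = [b - a for a, b in zip(x, y)]
    n = len(d)
    mean_d = statistics.mean(d)
    # Assumes the differences are not all identical (otherwise the SD is zero).
    se_d = statistics.stdev(d) / math.sqrt(n)
    return mean_d, mean_d / se_d, n - 1

x = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]            # hypothetical log10 titers
y = [4.5 + 1.2 * (v - 4.5) for v in x]        # proportional bias pivoting at 4.5

mean_d, t_stat, df = paired_t(x, y)
print(f"mean difference = {mean_d:.3f}, t = {t_stat:.3f}, df = {df}")
```

The individual differences range from -0.5 to +0.5 log10, yet the mean difference and the t statistic are essentially zero.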

Bland-Altman graphical analysis (Bland and Altman, 1986)
- Methods:
  - Plot the Difference of the two assays (D = X - Y) vs. the Average of the two assays (A = (X + Y)/2)
  - Visually inspect the plot for any trend → proportional bias
  - Summarize the bias between the two assays by the mean, SD and 95% CI → constant bias
  - Modification: regress D on A and test whether slope = 0 (Hawkins, 2002)
- A useful visual tool for checking:
  - transformation, heteroscedasticity, outliers, curvature
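
The numerical side of the Bland-Altman analysis can be sketched as follows. The function name, the returned keys and the data are hypothetical; the confidence interval and the significance test for the slope are left out because they need a t distribution, which the standard library does not provide.

```python
import statistics

def bland_altman(x, y):
    """Summaries for a Bland-Altman plot of paired measurements x and y."""
    d = [a - b for a, b in zip(x, y)]             # D = X - Y
    avg = [(a + b) / 2 for a, b in zip(x, y)]     # A = (X + Y) / 2
    mean_d = statistics.mean(d)
    sd_d = statistics.stdev(d)
    # Least-squares slope of D on A: a non-zero slope suggests proportional bias.
    mean_a = statistics.mean(avg)
    num = sum((a - mean_a) * (di - mean_d) for a, di in zip(avg, d))
    den = sum((a - mean_a) ** 2 for a in avg)
    return {"average": avg, "difference": d,
            "mean_diff": mean_d, "sd_diff": sd_d, "slope_D_on_A": num / den}

# Hypothetical paired log10 titers.
x = [3.1, 4.0, 5.2, 6.1, 6.9]
y = [3.0, 4.1, 5.0, 6.2, 6.8]
result = bland_altman(x, y)
print(result["mean_diff"], result["sd_diff"], result["slope_D_on_A"])
```

The `average` and `difference` lists are what would be plotted; the mean difference summarizes constant bias and the slope of D on A flags proportional bias.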

Bland-Altman plot (continued)
[Figure: two panels, Method Y vs. Method X (log Titer) and Difference (log Titer) vs. Average (log Titer)]

Ordinary least-squares regression
- Methods:
  - Regress the observed data of the new assay (Y) on those of the reference assay (X)
  - Minimize the squared deviations from the fitted line in the vertical direction
  - Modifications: weighted least squares
- Assumptions:
  - The reference assay (X) is error free, or its error is small relative to the range of the measurements
  - e.g. in clinical chemistry studies, the measurement errors are minimal

Ordinary least-squares regression (continued)
- If measurement errors exist in both assays, the estimates are biased
  - slope tends to be smaller
  - intercept tends to be larger
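
The direction of the OLS bias can be made explicit with a standard errors-in-variables argument. The notation here is an assumption introduced for this sketch: the reference assay reads $X = \tau + \epsilon$ and the new assay reads $Y = \alpha + \beta\tau + \delta$, where $\tau$ is the true concentration with mean $\mu_\tau$ and variance $\sigma_\tau^2$, and the errors $\epsilon$ and $\delta$ are independent of $\tau$ and of each other. In large samples the OLS estimates then tend to:

```latex
\operatorname*{plim}\ \hat\beta_{\mathrm{OLS}}
  = \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(X)}
  = \beta \cdot \frac{\sigma_\tau^2}{\sigma_\tau^2 + \sigma_\epsilon^2}
  \le \beta \quad (\beta > 0),
\qquad
\operatorname*{plim}\ \hat\alpha_{\mathrm{OLS}}
  = \alpha + \beta\,\mu_\tau \left(1 - \frac{\sigma_\tau^2}{\sigma_\tau^2 + \sigma_\epsilon^2}\right)
```

For example, if the true titers have an SD of 1.0 log10 and the reference assay error has an SD of 0.2 log10, the ratio is 1/1.04, so the slope shrinks by about 4% and the intercept is pushed upward when $\beta\mu_\tau > 0$.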

Passing-Bablok regression (Passing and Bablok, 1983)
- A nonparametric approach - robust to outliers
- Methods:
  - Estimate the slope by the shifted median of the slopes between all possible pairs of points (Theil estimate)
  - Confidence intervals by rank techniques
- Assumptions:
  - The measurement errors in both assays follow the same type of distribution (not necessarily normal)
  - The ratio of the variances is constant (the variances themselves need not be constant across the range of data)
  - The sampling distributions of the samples are arbitrary
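
The shifted-median slope can be sketched in a few lines. This is a simplified illustration of the point estimates only (no rank-based confidence intervals), it assumes the two assays are positively related, and the function name and data are hypothetical.

```python
import statistics

def passing_bablok(x, y):
    """Passing-Bablok point estimates (slope, intercept) for paired data."""
    n = len(x)
    slopes = []
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx, dy = x[j] - x[i], y[j] - y[i]
            if dx == 0 and dy == 0:
                continue                          # identical points give no slope
            if dx == 0:
                slopes.append(float("inf") if dy > 0 else float("-inf"))
                continue
            s = dy / dx
            if s == -1:
                continue                          # slopes of exactly -1 are discarded
            slopes.append(s)
    slopes.sort()
    m = len(slopes)
    k = sum(1 for s in slopes if s < -1)          # offset that shifts the median
    if m % 2 == 1:
        slope = slopes[(m + 1) // 2 + k - 1]      # 1-based rank (m + 1)/2 + k
    else:
        slope = 0.5 * (slopes[m // 2 + k - 1] + slopes[m // 2 + k])
    intercept = statistics.median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept

# Hypothetical data on the line y = 0.1 + 1.05 x, with one gross outlier at x = 4.
x = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0.1 + 1.05 * v for v in x]
y[2] = 9.0
print(passing_bablok(x, y))
```

Only the pairwise slopes that involve the outlier are distorted, and they fall in the tails of the sorted list, so the shifted median still recovers the slope and intercept of the underlying line.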

Deming regression (Linnet, 1990)
- Methods:
  - Orthogonal least squares estimates: minimize the squared deviations of the observed data from the regression line
  - Standard errors for the estimates obtained by the jackknife method
  - Weighted Deming regression when the data are heteroscedastic
- Assumptions:
  - Measurement errors for both assays follow independent normal distributions with mean 0
  - Error variances are assumed to be proportional (variance not necessarily constant across the range of data)

Comparison of the 3 regression methods (Linnet, 1993)
- Electrolyte study (homogeneous variance):
  - OLS, Passing-Bablok: biased slope, large Type I error, larger RMSE than Deming
  - Deming: unbiased slope, correct Type I error
- Metabolite study (heterogeneous variance):
  - All have unbiased slope estimates
  - Weighted LS and weighted Deming are most efficient
  - Type I error is large for OLS, weighted LS, Deming and Passing-Bablok
- Presence of outliers:
  - Passing-Bablok is robust to outliers
  - Deming regression requires detection of outliers

Software
- Statistical packages: SAS, S-PLUS
- Other packages (for Bland-Altman plot, OLS regression, Passing-Bablok regression, Deming regression):
  - Analyse-it (Excel add-on): does not support weighted Deming regression
  - Method Validator (freeware)
  - CBStat (Linnet K.)

Acceptance criteria for regression type analysis
- Independent acceptance criteria for slope and intercept estimates:
  - e.g. slope estimate within (0.9, 1.1), intercept estimate within (-0.2, 0.2)
- Drawback: asymmetrical acceptance region across the data range

Asymmetrical acceptance region
[Figure: two panels for Slope = (0.9, 1.1), Method Y vs. Method X (Log Titer) and Bias = Method Y - X vs. Method X (Log Titer)]

Proposed acceptance criteria
- Goals:
  - to show that the new assay is ‘equivalent’ to the reference assay
  - to demonstrate that the bias between the two assays is within some acceptable threshold across the clinical range
- Acceptance criteria: $|E(\mathrm{Bias})| = |E(Y - X)| < A$
- Choice of tolerance level A:
  - accuracy specification for the new assay

Mathematical models
Reference assay: $X_i = \tau_i + \epsilon_i$
New assay: $Y_i = \alpha + \beta\tau_i + \delta_i$
where $\tau_i$ is the true concentration, and $\epsilon_i$ and $\delta_i$ are the independent random measurement errors
Bias: $Y_i - X_i = \alpha + (\beta - 1)\tau_i + \delta_i - \epsilon_i$
Acceptance criteria: $|E(Y_i - X_i)| = |\alpha + (\beta - 1)\tau_i| < A$

Comparison of the acceptance criteria: {Intercept ∈ (-0.2, 0.2), Slope ∈ (0.9, 1.1)} vs. {A = 0.5, L = 2, U = 7}
[Figure: two panels over the range 2 to 7, Method Y vs. Method X (Log Titer) and Bias: Method Y - X vs. Method X (Log Titer), showing the acceptance regions]

Acceptance region for the parameters: criteria for the intercept and slope are dependent
[Figure: acceptance region in the plane of Slope (Beta) vs. Intercept (Alpha)]
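
The difference between the two kinds of criteria can be checked numerically. The sketch below uses the example limits quoted on the slides (intercept in (-0.2, 0.2), slope in (0.9, 1.1), A = 0.5 over the range L = 2 to U = 7 log10); the function names and the two intercept/slope pairs are hypothetical.

```python
def max_abs_bias(intercept, slope, lower, upper):
    """Largest |intercept + (slope - 1) * tau| for tau in [lower, upper].

    The expected bias is linear in tau, so the maximum is at an endpoint.
    """
    return max(abs(intercept + (slope - 1.0) * t) for t in (lower, upper))

def passes_independent(intercept, slope):
    """Independent criteria on the intercept and slope estimates."""
    return -0.2 < intercept < 0.2 and 0.9 < slope < 1.1

def passes_bias_bound(intercept, slope, A=0.5, L=2.0, U=7.0):
    """Bias-based criterion: |expected bias| < A across the range [L, U]."""
    return max_abs_bias(intercept, slope, L, U) < A

for a, b in [(0.15, 1.08), (-0.4, 1.1)]:
    print(a, b, passes_independent(a, b), passes_bias_bound(a, b),
          round(max_abs_bias(a, b, 2.0, 7.0), 2))
```

The first pair passes the independent criteria yet has a bias of about 0.71 log10 at the top of the range, while the second fails the intercept limit although its bias never exceeds about 0.3 log10. A criterion on the bias ties the admissible intercept to the slope, which is why the acceptance region for the two parameters is not a rectangle.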

Equivalence test
$H_0: |\mathrm{Bias}| \ge A$ vs. $H_a: |\mathrm{Bias}| < A$
where A is the accuracy specification of the new assay
- Methods:
  - If the 90% two-sided confidence interval of the Bias lies entirely within the acceptance region (-A, A), then the two assays are equivalent
  - Deming-Jackknife is used to do the estimation

Deming regression: (a.k.a. errors-in-variables regression, a structural or functional relationship model)
- Minimize the sum of squares:
  $S = \sum_{i=1}^{n} \left[ (x_i - \tau_i)^2 + \lambda\,(y_i - \alpha - \beta\tau_i)^2 \right]$
  where $\lambda = \mathrm{Var}(\epsilon)/\mathrm{Var}(\delta)$ (assumed known or to be estimated)
- The solutions are given by:
  $\hat\beta = \dfrac{(\lambda S_{yy} - S_{xx}) + \sqrt{(S_{xx} - \lambda S_{yy})^2 + 4\lambda S_{xy}^2}}{2\lambda S_{xy}}$
  $\hat\alpha = \bar{y} - \hat\beta\,\bar{x}$
- Weighted Deming regression:
  $w_i = \dfrac{1}{SD_i^2} \propto \dfrac{1}{\left[(\hat{X}_i + \hat{Y}_i)/2\right]^2}$
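
A minimal sketch of the unweighted closed-form Deming solution is given below. It takes the error variance ratio as Var(error in x) / Var(error in y), which is an assumption about the convention; the function name, the argument `lam` and the data are hypothetical.

```python
import math

def deming(x, y, lam=1.0):
    """Unweighted Deming regression; returns (intercept, slope).

    lam is assumed to be Var(error in x) / Var(error in y).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = ((lam * syy - sxx)
             + math.sqrt((sxx - lam * syy) ** 2 + 4.0 * lam * sxy ** 2)) / (2.0 * lam * sxy)
    intercept = my - slope * mx
    return intercept, slope

# Hypothetical paired log10 titers from a reference assay (x) and a new assay (y).
x = [2.1, 2.9, 4.2, 5.0, 5.8, 7.1]
y = [2.0, 3.2, 3.9, 5.3, 6.0, 6.9]
print(deming(x, y, lam=1.0))
```

With `lam = 1.0` this reduces to orthogonal regression, whose slope is never smaller than the OLS slope of y on x when the two assays are positively correlated.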

Estimation of λ in Deming regression
- Duplicate measurements:
  $SD_X^2 = \dfrac{1}{2N}\sum_i (x_{i1} - x_{i2})^2, \quad SD_Y^2 = \dfrac{1}{2N}\sum_i (y_{i1} - y_{i2})^2, \quad \hat\lambda = \dfrac{SD_X^2}{SD_Y^2}$
- >2 replicates: residual errors by ANOVA
- Mis-specification of λ (Linnet 1998):
  - biased slope estimate
  - large Type I error
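
For duplicate measurements the variance ratio can be computed directly from the within-pair differences. The sketch below assumes each sample was measured twice by each assay; the function names and the duplicate values are hypothetical.

```python
def error_variance_from_duplicates(pairs):
    """Within-sample error variance from duplicates: sum of d^2 / (2N)."""
    n = len(pairs)
    return sum((first - second) ** 2 for first, second in pairs) / (2 * n)

def estimate_lambda(x_duplicates, y_duplicates):
    """Ratio of the error variance of the x assay to that of the y assay."""
    return (error_variance_from_duplicates(x_duplicates)
            / error_variance_from_duplicates(y_duplicates))

# Hypothetical duplicate log10 titers for three samples on each assay.
x_duplicates = [(3.0, 3.1), (4.2, 4.1), (5.0, 5.2)]
y_duplicates = [(3.1, 3.3), (4.0, 4.2), (5.1, 5.3)]
print(estimate_lambda(x_duplicates, y_duplicates))
```

Here the x assay has an error variance of about 0.01 and the y assay about 0.02, so the estimated ratio is about 0.5.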

Jackknife estimation: to obtain the final parameter estimates and the SEs
- Omit one pair of data at a time and obtain the Deming-regression estimates: $\hat\alpha_{(i)}, \hat\beta_{(i)}$
- The ith pseudo-values of the intercept and slope are:
  $\alpha_i = n\hat\alpha - (n - 1)\hat\alpha_{(i)}$
  $\beta_i = n\hat\beta - (n - 1)\hat\beta_{(i)}$
- Final estimates and SEs for $\alpha$ and $\beta$ are the mean and standard error of $\alpha_i$ and $\beta_i$

Bias estimation by Jackknife
- At each nominal level $\tau$, the ith pseudo-value of the Bias is:
  $\mathrm{Bias}_i = n\left[\hat\alpha + (\hat\beta - 1)\tau\right] - (n - 1)\left[\hat\alpha_{(i)} + (\hat\beta_{(i)} - 1)\tau\right]$
- The bias estimate and the SE at each nominal level are the mean and SE of $\mathrm{Bias}_i$
- The 90% CI of the bias at each nominal level is compared to the acceptance region (-A, A)
- The two assays are concluded to be equivalent if all the CIs lie entirely within (-A, A)
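
The jackknife bias estimate and the equivalence decision can be sketched end to end as below. Several choices here are assumptions made for the sketch: the Deming fit is unweighted, `lam` is taken as Var(error in x) / Var(error in y), and the 90% interval uses the normal quantile 1.645 rather than a t quantile, which would be wider for small samples. All names and data are hypothetical.

```python
import math
import statistics

def deming(x, y, lam=1.0):
    """Unweighted Deming regression; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = ((lam * syy - sxx)
             + math.sqrt((sxx - lam * syy) ** 2 + 4.0 * lam * sxy ** 2)) / (2.0 * lam * sxy)
    return my - slope * mx, slope

def jackknife_bias(x, y, tau, lam=1.0):
    """Jackknife estimate and SE of intercept + (slope - 1) * tau (x, y are lists)."""
    n = len(x)
    a_all, b_all = deming(x, y, lam)
    full = a_all + (b_all - 1.0) * tau
    pseudo = []
    for i in range(n):
        a_i, b_i = deming(x[:i] + x[i + 1:], y[:i] + y[i + 1:], lam)  # leave pair i out
        pseudo.append(n * full - (n - 1) * (a_i + (b_i - 1.0) * tau))
    return statistics.mean(pseudo), statistics.stdev(pseudo) / math.sqrt(n)

def equivalent(x, y, levels, A, lam=1.0, z=1.645):
    """True if the approximate 90% CI of the bias lies within (-A, A) at every level."""
    for tau in levels:
        est, se = jackknife_bias(x, y, tau, lam)
        if not (-A < est - z * se and est + z * se < A):
            return False
    return True

# Hypothetical paired log10 titers.
x = [2.1, 2.9, 4.2, 5.0, 5.8, 7.1]
y = [2.0, 3.2, 3.9, 5.3, 6.0, 6.9]
for tau in (3.0, 5.0, 7.0):
    print(tau, jackknife_bias(x, y, tau))
print(equivalent(x, y, levels=[3.0, 5.0, 7.0], A=0.5))
```

Evaluating the interval at several nominal levels, rather than only at the mean, is what lets the check pick up a proportional bias that becomes unacceptable toward one end of the range.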

Example 1: methods comparison for two HIV-1 assays
[Figure: New Method vs. Reference Method (log Titer)]

Bland-Altman plot: potential outliers in the data
[Figure: Bland-Altman plot, Difference = New - Reference (log Titer) vs. Average]

Identify outliers: fitting a linear regression line to the Bland-Altman plot
[Figure: Leverage Plot (Leverage vs. Samples) and Residual Plot (Studentized Residual vs. Fitted Difference); labelled samples: 34860, 34944, 34851, 34794]

Remove outliers: Bland-Altman plot shows no trend in Difference vs. Average
- mean difference = 0.02 (95% CI: -0.06, 0.10)
- slope = 0.033 (p-value = 0.5)
[Figure: Bland-Altman plot, Difference = New - Reference (log Titer) vs. Average]

Regression analysis: results from the 3 methods are very similar
[Figure: New Method vs. Reference (log Titer) with fitted lines from OLS, Passing-Bablok and Deming-Jackknife regression]

Bias estimation: almost all 90% CIs lie within the tolerance bounds (-0.2, +0.2)
[Figure: Estimated bias, Difference = New - Reference (log Titer) vs. Reference]

Example 2: to show matrix equivalency between EDTA Plasma and Serum
[Figure: Serum vs. EDTA (log Titer)]

Bland-Altman plot on average titer: most titers higher than 1E5 IU/mL, heteroscedasticity?
- slope = 0.03 (p-value = 0.6)
- mean difference = -0.06 (95% CI: -0.16, 0.04)
[Figure: Bland-Altman plot, Difference = Serum - EDTA (log titer) vs. Average]

Checking for heteroscedasticity: residual errors from random effects models
[Figure: two panels, SD EDTA vs. Mean EDTA (log titer) and SD Serum vs. Mean Serum (log titer)]

λ ≈ 1:
Pooled within-sample SD for EDTA = 0.0706
Pooled within-sample SD for Serum = 0.0715
[Figure: Lambda = Var(EDTA errors)/Var(Serum errors) vs. Average, with the median Lambda marked]

Bias estimation: large variability at low titers due to sparse data; fails to demonstrate equivalency at the low end
[Figure: Estimated bias, Difference = Serum - EDTA (log titer) vs. EDTA (log titer)]

References
- Bland M., Altman D. (1986). ‘Statistical methods for assessing agreement between two methods of clinical measurement’. Lancet 327: 307-310.
- Hawkins D. (2002). ‘Diagnostics for conformity of paired quantitative measurements’. Stat in Med 21: 1913-1935.
- Lin L.K. (1989). ‘A concordance correlation coefficient to evaluate reproducibility’. Biometrics 45: 255-268.
- Linnet K. (1990). ‘Estimation of the linear relationship between the measurements of two methods with proportional errors’. Stat in Med 9: 1463-1473.
- Linnet K. (1993). ‘Evaluation of regression procedures for methods comparison studies’. Clin Chem 39: 424-432.
- Linnet K. (1998). ‘Performance of Deming regression analysis in case of misspecified analytical error ratio in method comparison studies’. Clin Chem 44: 1024-1031.
- Linnet K. (1999). ‘Necessary sample size for method comparison studies based on regression analysis’. Clin Chem 45: 882-894.
- Passing H., Bablok W. (1983). ‘A new biometrical procedure for testing the equality of measurements from two different analytical methods’. J Clin Chem Clin Biochem 21: 709-720.
- Passing H., Bablok W. (1984). ‘Comparison of several regression procedures for method comparison studies and determination of sample sizes’. J Clin Chem Clin Biochem 22: 431-445.
- St. Laurent R.T. (1998). ‘Evaluating Agreement with a Gold Standard in Method Comparison Studies’. Biometrics 54: 537-545.