Regression Analysis - School of Public Health

Chapter 9.2
ROC Curves
How does this relate to logistic regression?
Two Types of Error
False positive (“false alarm”), FP
alarm sounds but person is not carrying metal
false positive rate = 1 - specificity
False negative (“miss”), FN
alarm doesn’t sound but person is carrying metal
false negative rate = 1 - sensitivity
Slide copied from: Lecture on Cost-Sensitive Classifier Evaluation by Robert Holte, Computing Science Dept., University of Alberta
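The two error rates can be checked with a little arithmetic. The following is a minimal Python sketch, not part of the original lecture; the function name error_rates and the counts are hypothetical and chosen only for illustration. It computes sensitivity and specificity from a 2x2 table of results and shows that the false positive rate equals 1 - specificity and the false negative rate equals 1 - sensitivity.

```python
def error_rates(tp, fp, tn, fn):
    """Return the four rates implied by a 2x2 table of test results."""
    return {
        "sensitivity": tp / (tp + fn),          # true positives among those with the condition
        "specificity": tn / (tn + fp),          # true negatives among those without it
        "false_positive_rate": fp / (fp + tn),  # "false alarm" rate
        "false_negative_rate": fn / (fn + tp),  # "miss" rate
    }


# Hypothetical counts, chosen only for illustration.
rates = error_rates(tp=40, fp=20, tn=130, fn=10)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

The two error rates are complements of sensitivity and specificity, so reporting one pair determines the other.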
ROC
Receiver Operating Characteristic
(historic name from radar studies)
Relative Operating Characteristic
(psychology, psychophysics)
Operating Characteristic
(preferred by some)
Slide adapted from: An Overview of Contemporary ROC Methodology in Medical Imaging and Computer-Assist Modalities by Robert F. Wagner, Ph.D., OST, CDRH, FDA
Figure: non-diseased and diseased patients along a test result value axis, with a threshold marked. The test result value may instead be the likelihood that the patient is diseased, P(Y = 1) = p (based on proc logistic).
Typical Results of Testing
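Figure: non-diseased and diseased patients along the test result value axis (logistic P(Y=1) = p), with the threshold at p = 0.5. Results with p<0.5 are a negative test and results with p>0.5 are a positive test. The figure marks the false positives and the false negatives.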
Figure: non-diseased and diseased patients with the threshold set so that p>0.7 is a positive test (less aggressive mindset). Axes: TPF (sensitivity) against FPF (1-specificity).
Figure: non-diseased and diseased patients with the threshold set so that p>0.5 is a positive test (moderate mindset). Axes: TPF (sensitivity) against FPF (1-specificity).
Figure: non-diseased and diseased patients with the threshold set so that p>0.3 is a positive test (more aggressive mindset). Axes: TPF (sensitivity) against FPF (1-specificity).
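The effect of moving the threshold from 0.7 to 0.5 to 0.3 can be sketched numerically. The Python example below is an illustration only: the function tpf_fpf and the twelve predicted probabilities are hypothetical and do not come from the lecture data. It counts, for each threshold, the fraction of diseased patients called positive (TPF) and the fraction of non-diseased patients called positive (FPF).

```python
def tpf_fpf(probs, labels, threshold):
    """Return (TPF, FPF) when p > threshold is called a positive test."""
    tp = sum(1 for p, y in zip(probs, labels) if p > threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p > threshold and y == 0)
    n_pos = sum(labels)            # diseased patients
    n_neg = len(labels) - n_pos    # non-diseased patients
    return tp / n_pos, fp / n_neg


# Hypothetical predicted probabilities: six non-diseased (0), six diseased (1).
probs = [0.10, 0.20, 0.35, 0.40, 0.60, 0.80, 0.25, 0.45, 0.55, 0.65, 0.75, 0.90]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

results = {t: tpf_fpf(probs, labels, t) for t in (0.7, 0.5, 0.3)}
for t, (tpf, fpf) in results.items():
    print(f"p > {t}: TPF = {tpf:.2f}, FPF = {fpf:.2f}")
```

In this toy data a lower threshold (a more aggressive mindset) raises both TPF and FPF; each threshold gives one point on the ROC axes.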
Figure: non-diseased and diseased patients with the threshold set so that p>0 is a positive test, together with the entire ROC curve. Axes: TPF (sensitivity) against FPF (1-specificity).
Figure: the entire ROC curve, labeled as the skill to predict Y=1 correctly. Axes: TPF (sensitivity) against FPF (1-specificity).
Suppose n individuals undergo a test for predicting the event, and the test is based on the estimated probability of the event (p).
Higher values of this estimated probability are assumed to be associated with the event.
A receiver operating characteristic (ROC) curve can be constructed by varying the cut-point that determines which estimated event probabilities are considered to predict the event.
The statistic c estimates the area under the ROC curve.
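The construction described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SAS procedure itself: the function names and the twelve probabilities are hypothetical. The sketch sweeps the cut-point over every distinct estimated probability, collects the (FPF, TPF) points, takes the trapezoidal area under them, and compares that area with a pairwise concordance calculation in which tied pairs count one half.

```python
def roc_points(probs, labels):
    """(FPF, TPF) for each cut-point, calling p >= cut a predicted event."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # a cut-point above every p predicts no events
    for cut in sorted(set(probs), reverse=True):
        tp = sum(1 for p, y in zip(probs, labels) if p >= cut and y == 1)
        fp = sum(1 for p, y in zip(probs, labels) if p >= cut and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_area(points):
    """Area under a piecewise-linear curve through the given points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def c_statistic(probs, labels):
    """Share of (event, non-event) pairs ranked correctly; ties count 1/2."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    score = sum(1.0 if p1 > p0 else 0.5 if p1 == p0 else 0.0
                for p1 in pos for p0 in neg)
    return score / (len(pos) * len(neg))


# Hypothetical estimated probabilities: six non-events (0), six events (1).
probs = [0.10, 0.20, 0.35, 0.40, 0.60, 0.80, 0.25, 0.45, 0.55, 0.65, 0.75, 0.90]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

points = roc_points(probs, labels)
area = trapezoid_area(points)
c = c_statistic(probs, labels)
print(f"area under ROC = {area:.4f}, c = {c:.4f}")
```

For this toy data the two numbers agree, which is the sense in which c estimates the area under the ROC curve.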
ROC Curves in SAS
/* The area under the curve is the c statistic; generally speaking, a bigger c is better. */
proc logistic descending;
  /* outroc= saves the ROC coordinates (_sensit_, _1mspec_) in data set roc1 */
  model event = diabetes gender diabetes_gender / outroc=roc1;
run;

/* Plot the ROC curve: sensitivity against 1 - specificity */
symbol1 i=join v=none c=blue;
proc gplot data=roc1;
  title 'ROC Curve';
  plot _sensit_*_1mspec_=1 / vaxis=0 to 1 by .1 cframe=ligr;
run;
quit;
Model event=diabetes gender diabetes_gender

Association of Predicted Probabilities and Observed Responses
Percent Concordant    47.7     Somers' D    0.206
Percent Discordant    27.1     Gamma        0.276
Percent Tied          25.2     Tau-a        0.055
Pairs                41741     c            0.603

c measures the area under the ROC curve.
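The entries in this table are related by simple arithmetic, which the short Python sketch below reproduces from the reported percentages. The variable names are ours, and the formulas are the standard definitions for these rank-association measures, with tied pairs counting one half toward c.

```python
# Percentages as reported in the table above.
pct_concordant = 47.7
pct_discordant = 27.1
pct_tied = 25.2

# c: concordant pairs plus half of the tied pairs, as a share of all pairs.
c = (pct_concordant + 0.5 * pct_tied) / 100

# Somers' D: concordant minus discordant, as a share of all pairs (equals 2c - 1).
somers_d = (pct_concordant - pct_discordant) / 100

# Gamma: the same difference, as a share of the untied pairs only.
gamma = (pct_concordant - pct_discordant) / (pct_concordant + pct_discordant)

print(f"c = {c:.3f}, Somers' D = {somers_d:.3f}, Gamma = {gamma:.3f}")
```

The values of c and Somers' D match the table. Gamma comes out at about 0.275 rather than the reported 0.276 because the percentages used here are already rounded to one decimal place.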
Model event=diabetes gender
diabetes_gender
( Chest film study by E. James Potchen, M.D., 1999 )
ROC curve in Logistic Regression
Go to: www.biostat.umn.edu/~susant/PH6415DATA.html
 C
• Slim down the data to contain any event, history of diabetes, gender and history of hypertension.
• Use Proc Logistic to model the relationship between cardiac event and diabetes history, gender and hypertension history.
• Use Gplot to plot the ROC curve.
• Compare the value of c to 0.603, our previous value.