Transcript Slide 1

Predicting Parolee Risk of Recidivism
--Challenge of Finding Instruments with Sufficient Predictive Power
Association for Criminal Justice Research (California)
66th Semi-annual Meeting, October 11-12, 2007
Sheldon Zhang, SDSU
David Farabee, UCLA
Robert Roberts, CSU San Marcos
The Need for Reentry Risk Assessment





Five million adults are on probation and parole nationwide.
High rates of incarceration in the U.S. mean high volumes of
prisoner reentry. High rates of parole failure lead to additional
imprisonments.
Risk/needs assessments can help allocate resources and inform
appropriate supervision plans. These assessments can guide
sentencing, institutional placement, treatment plans, parole
supervision intensity, and the restrictiveness of conditions for
community reentry.
Risk/needs assessment has again gained traction in recent
years in correctional agencies in several states.
A recent study by the Girls Study Group identified some 300
risk/needs assessment tools of various kinds for youth
offenders alone. Most lack evidence of sufficient validation.
Many studies report reliability, but fewer report validity.*
* Margaret A. Zahn. 2006. Issues in Assessing Risk with Delinquent Girls. Girls Study Group. Crime, Violence, and Justice Program, RTI.
Available at: http://girlsstudygroup.rti.org/docs/2006_NIJ_Conference_Risk_Assessment.pdf.
Need to Test and Validate Risk Instruments






Risk instruments are often developed on specific
correctional populations and do not transplant easily.
The LSI-R model was developed in Canada and found to be
predictive in Canadian correctional populations in several
studies.
Studies in Washington and Pennsylvania show that many
factors used in the LSI-R scale were not predictive of reoffending (Austin 2004).
In one study in Pennsylvania, only eight of the 54 LSI-R items
were found to be associated with recidivism. Significant interrater reliability problems were also found (Austin et al. 2003).
A risk assessment instrument needs to be tested in its intended
population.
Instruments developed and tested with general populations or
unintended populations may lead to over-classification (an
unreasonable number of misclassifications in either direction).
Predictive Accuracy of LSI-R*



In 1999, the Washington State Department of
Corrections began using LSI-R, as part of the
offender risk classification system.
A 2003 Washington State Institute for Public Policy study
found that this instrument is not a strong predictor of felony
and violent felony recidivism for Washington State offenders.
A later analysis again found that LSI-R as a whole
predicts felony sex recidivism with weak accuracy
(AUC=.65). Five items on the LSI-R can be
combined to predict felony sex recidivism with
moderate accuracy.
* Robert Barnoski. 2006. Sex offender sentencing in Washington State: Predicting recidivism based on the LSI-R.
Available at: http://www.wsipp.wa.gov/rptfiles/06-02-1201.pdf.
Some Examples



Level of Service Inventory-Revised (LSI-R). Comprises 54 static and
dynamic items across ten sub-scales (O’Keefe and Wensus, 2001);
developed in the late 1970s in Canada through a collaboration of probation
officers, correctional managers, practitioners and researchers (AUC .65 in a
Washington State validation study).
Washington State Department of Corrections Static Risk Instrument (based
on LSI-R) (AUC .74) (http://www.wsipp.wa.gov/rptfiles/07-03-1201.pdf).
Virginia’s Risk Assessment Instrument, developed by the Virginia Criminal
Sentencing Commission for sentencing and diversion purposes
(http://www.ncsconline.org/WC/Publications/Res_Senten_RiskAssessPub.pdf):



Higher “risk scores” on the instrument have been associated with a greater
likelihood of recidivism
Diversion through risk assessment has produced positive net benefits for the
state
No AUC was computed.
Ways to Assess Risk Assessment Tools



Correlation analysis
Multivariate regression
Stepwise logistic regression
Area Under the ROC Curve
The best measure of predictive accuracy between risk
assessment and recidivism is the Area Under the
Receiver Operating Characteristic Curve. AUC
measures discrimination--the ability of the
instrument to correctly classify different levels of risk
in anticipation of recidivism.
Instrumentation: suppose we have a group of parolees
who were already correctly classified (those who
failed parole and those who didn’t). You randomly
select one who failed parole and one who didn’t and
develop a profile of risk factors for each. The one with
the higher level of risk should be the one who failed.
AUC is the proportion of randomly drawn
pairs for which the risk classification is correct. AUC
varies between .50 (pure chance) and 1.00 (perfect
prediction). An AUC below .60 is considered weak, .70
moderate, and .80 strong.*
*T.G. Tape, 2003, Interpreting Diagnostic Tests, The Area Under the ROC Curve, Omaha:
University of Nebraska Medical Center, see: http://gim.unmc.edu/dxtests/roc3.htm.
Source: http://gim.unmc.edu/dxtests/roc3.htm
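
The pairwise reading of AUC described on this slide can be sketched in a few lines of Python. This is only an illustration: the risk scores below are invented, the function name pairwise_auc is hypothetical, and counting tied pairs as half correct is a common convention that the slide does not specify.

```python
from itertools import product


def pairwise_auc(scores_failed, scores_succeeded):
    """Share of (failed, succeeded) pairs in which the parolee who failed
    has the higher risk score. Tied pairs count as half (an assumption)."""
    pairs = list(product(scores_failed, scores_succeeded))
    if not pairs:
        raise ValueError("need at least one parolee in each group")
    correct = sum(
        1.0 if failed > succeeded else 0.5 if failed == succeeded else 0.0
        for failed, succeeded in pairs
    )
    return correct / len(pairs)


# Invented risk scores (deciles) for illustration only.
failed_parole = [8, 6, 9, 5]
completed_parole = [3, 6, 2, 7]

auc = pairwise_auc(failed_parole, completed_parole)
print(f"AUC = {auc:.2f}")
```

With these made-up scores, 12.5 of the 16 possible pairs are ordered correctly, so the sketch reports an AUC of about .78.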
The Challenge of Finding Instruments with
Sufficient Predictive Power
—A Canadian Comparison Study

Five actuarial instruments and one guided clinical instrument
designed to assess risk for recidivism were compared on 215 sex
offenders who had been released from prison for an average of
4.5 years. The five actuarial instruments are objectively scored
and provide probabilistic estimates of risk based on the empirical
relationships between their combination of items and the outcome
of interest.






Violence Risk Appraisal Guide (VRAG) (Harris, Rice, & Quinsey, 1993)
Sex Offender Risk Appraisal Guide (SORAG) (Quinsey, Harris, Rice, & Cormier, 1998)
Rapid Risk Assessment of Sexual Offense Recidivism (RRASOR) (Hanson, 1997)
Static-99 (Hanson & Thornton, 1999)
Minnesota Sex Offender Screening Tool–Revised (MnSOST-R) (Epperson, Kaul, & Hesselton, 1998)
Psychopathy Checklist–Revised (PCL-R) (Hare, 1991)
AUC of the Receiver Operating Characteristic for the Six Risk Assessment Instruments

Outcome              Rate   PCL-R   VRAG   SORAG   RRASOR   Static-99   MnSOST-R
Any Re-offense       38%    0.71    0.77   0.76    0.60     0.71        0.65
Serious Re-offense   24%    0.65    0.69   0.73    0.65     0.70        0.58
Sexual Re-offense    9%     0.61    0.61   0.70    0.77     0.70        0.65
No one instrument was found to be superior in predicting recidivism outcomes.
Barbaree et al. 2001. Evaluating the predictive accuracy of six risk assessment instruments for adult sex offenders.
Criminal Justice and Behavior 28(4): 490-521.
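
The conclusion that no single instrument was superior can be checked directly against the AUC values reported on this slide. The short Python sketch below simply looks up the highest AUC for each outcome; the variable names are illustrative, and the comparison ignores sampling error, which the slide does not report.

```python
# AUC values as reported on the slide (Barbaree et al. 2001).
auc_by_outcome = {
    "Any re-offense": {
        "PCL-R": 0.71, "VRAG": 0.77, "SORAG": 0.76,
        "RRASOR": 0.60, "Static-99": 0.71, "MnSOST-R": 0.65,
    },
    "Serious re-offense": {
        "PCL-R": 0.65, "VRAG": 0.69, "SORAG": 0.73,
        "RRASOR": 0.65, "Static-99": 0.70, "MnSOST-R": 0.58,
    },
    "Sexual re-offense": {
        "PCL-R": 0.61, "VRAG": 0.61, "SORAG": 0.70,
        "RRASOR": 0.77, "Static-99": 0.70, "MnSOST-R": 0.65,
    },
}

# Instrument with the highest AUC for each outcome.
best_instrument = {
    outcome: max(aucs, key=aucs.get)
    for outcome, aucs in auc_by_outcome.items()
}

for outcome, instrument in best_instrument.items():
    print(f"{outcome}: {instrument} ({auc_by_outcome[outcome][instrument]:.2f})")
```

A different instrument has the highest AUC for each of the three outcomes (VRAG, SORAG and RRASOR respectively), which is consistent with the slide's conclusion.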
Relative Predictive Accuracy of the RRASOR, SACJ-Min and Static-99

                          Combined Sample (n = 1,208)               Rapists     Child Molesters
                                                                    (n = 363)   (n = 799)
                          ROC area   95% C.I.   r      95% C.I.     ROC area    ROC area
Sexual recidivism
  RRASOR                  .68        .65-.72    .28    .23-.33      .68         .69
  SACJ-Min                .67        .63-.71    .23    .18-.28      .69         .68
  Static-99               .71        .68-.74    .33    .28-.38      .71         .72
Any violent recidivism
  RRASOR                  .64        .60-.67    .22    .16-.27      .64         .66
  SACJ-Min                .64        .61-.68    .22    .16-.27      .62         .66
  Static-99               .69        .66-.72    .32    .27-.37      .69         .71
R. Karl Hanson and David Thornton. 2002. Static 99: Improving Actuarial Risk Assessments for Sex Offenders, 1999-02.
Available at: http://ww2.ps-sp.gc.ca/publications/corrections/199902_e.pdf.
COMPAS

COMPAS (Correctional Offender Management Profiling for
Alternative Sanctions) is a computerized database and
analysis system that helps criminal justice practitioners
make decisions regarding the placement, supervision and
case management of offenders in community and secure
settings.

The system includes several modules:





risk/needs assessment,
criminal justice agency decision tracking,
treatment and intervention tracking,
outcome monitoring,
agency integrity and programming implementation monitoring.
COMPAS—Risk and Needs Assessment


CDCR adopted the risk/needs components.
The current study evaluates the risk assessment
component, which includes four dimensions:





recidivism
violence
failure to appear
community failure
Offenders are classified into three categories:
high, medium, and low risk.


Previous validation study by the instrument
developers (Northpointe) found encouraging
psychometric properties and concurrent
validity, based on retrospective data.
Our study attempts to address COMPAS’
predictive validity.

Observation period=365 days
Demographics

Show word file.
Status of COMPAS Subjects at One Year

Parolee Status One Year after Release      Number   Percent of Sample   Percent of Violation Type
Continuous Parole--No Return to Custody    261      50.7                -----
Returned to Custody                        254      49.3                -----
Total Sample                               515      100.0               -----
Had Technical Violation                    52       10.1                100.0
Returned for Technical Violation           48       9.3                 92.3
Had Non-Technical Violation                247      48.0                100.0
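
The two percentage columns use different denominators: "Percent of Sample" divides by the 515 parolees in the sample, while "Percent of Violation Type" divides by the number of parolees who had that type of violation. A small Python sketch, using only the counts shown in the table (the helper name pct is hypothetical), reproduces the reported figures:

```python
def pct(part, whole):
    """Percentage rounded to one decimal place, as in the table."""
    return round(100.0 * part / whole, 1)


# Counts taken from the table.
total_sample = 515
returned_to_custody = 254
had_technical = 52
returned_for_technical = 48
had_non_technical = 247

percent_returned = pct(returned_to_custody, total_sample)
percent_technical = pct(had_technical, total_sample)
percent_non_technical = pct(had_non_technical, total_sample)
# Share of parolees with a technical violation who were returned for it.
percent_technical_returned = pct(returned_for_technical, had_technical)

print(percent_returned, percent_technical,
      percent_non_technical, percent_technical_returned)
```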
COMPAS Recidivism Scale
(Outcome: Returned To Custody in 365 Days of Parole)
[Figure: ROC curve with decile cutpoints 1-9 marked. X-axis: False Positive Rate (1 - Specificity); y-axis: True Positive Rate (Sensitivity). Area Under Curve = 0.67]
COMPAS Community Non-Compliance Scale
(Outcome: Returned To Custody in 365 Days of Parole)
[Figure: ROC curve with decile cutpoints 1-9 marked. X-axis: False Positive Rate (1 - Specificity); y-axis: True Positive Rate (Sensitivity). Area Under Curve = 0.61]
Failure-To-Appear Risk Scale Score Decile
(Outcome: Technical Parole Violation in 365 Days of Parole)
[Figure: ROC curve with decile cutpoints 1-9 marked. X-axis: False Positive Rate (1 - Specificity); y-axis: True Positive Rate (Sensitivity). Area Under Curve = 0.64]
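
Each ROC curve for the COMPAS scales plots one point per decile cutpoint. A minimal Python sketch of that construction follows. It assumes a parolee is flagged as high risk when the decile score is at or above the cutpoint, and it estimates the area with the trapezoid rule. The scores and outcomes are invented, the function names are hypothetical, and this is not the study's actual computation.

```python
def roc_points(scores, outcomes, cutpoints):
    """Return sorted (false positive rate, true positive rate) points.

    A parolee is flagged when score >= cutpoint (an assumption).
    outcomes: 1 = returned to custody, 0 = not returned.
    """
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    points = {(0.0, 0.0), (1.0, 1.0)}  # curve end points
    for cut in cutpoints:
        flagged = [(s >= cut, y) for s, y in zip(scores, outcomes)]
        true_pos = sum(1 for hit, y in flagged if hit and y == 1)
        false_pos = sum(1 for hit, y in flagged if hit and y == 0)
        points.add((false_pos / negatives, true_pos / positives))
    return sorted(points)


def trapezoid_auc(points):
    """Area under the piecewise-linear curve through the sorted points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Invented decile scores and one-year outcomes, for illustration only.
decile_scores = [8, 6, 9, 5, 3, 6, 2, 7]
returned = [1, 1, 1, 1, 0, 0, 0, 0]

curve = roc_points(decile_scores, returned, cutpoints=range(1, 11))
auc = trapezoid_auc(curve)
print(f"AUC = {auc:.2f}")
```

Lower cutpoints flag more parolees, which raises both the true positive rate and the false positive rate; the curve traces that trade-off across the cutpoints.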
Statistical Analysis



AUC for COMPAS for Recidivism = .67
AUC for COMPAS for Non-Technical Parole
Violation = .61
Adding other static variables from existing CDCR
warehouse data can raise the AUC of the COMPAS
Recidivism subscale to .72.
Odds-Ratios from Logistic Regression of Return to Custody within One Year on COMPAS Risk Measures and Parolee Characteristics (Males only, N = 457)

Predictor                                Model 1   Model 2   Model 3   Model 4
Failure-to-Appear Risk Decile            ---       1.06      1.00      ---
Violence Risk Decile                     ---       0.99      0.99      ---
Community Non-Compliance Risk Decile     ---       1.05      ~1.08     ---
Recidivism Risk Decile                   ---       ***1.21   ***1.23   ***1.24
0.99
---
1.02
1.02
Number Prior Prison Incarcerations
**1.12
---
1.06
~1.08
Paroled to Region III
***.41
***0.44
***0.43
1.02
1.01
---
Age
Recidivism Risk of Principal Commitment
Offense
African American
~1.52
---
1.45
---
0.89
---
0.76
---
**2.33
---
*2.23
*2.09
Test Accuracy (AUC)                      0.68      0.67      0.72      0.71
Likelihood Ratio Chi-Square              43.33     42.35     73.26     63.16
Mexican
Latino
Note: ~: p < .10; *: p < .05; **: p < .01; ***: p < .001; two-tailed tests.
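
To read the odds ratios in this table: an odds ratio is the exponential of a logistic regression coefficient, so an odds ratio of 1.21 for the Recidivism Risk Decile (Model 2) means the odds of return to custody are about 21% higher for each one-decile increase, holding the other predictors constant. The Python sketch below works through that arithmetic. It assumes the decile enters the model as a single linear term, which the slide does not state, and the function names are hypothetical.

```python
import math


def odds_multiplier(odds_ratio_per_unit, units):
    """Factor by which the odds change over a gap of `units`,
    assuming a linear term on the log-odds scale."""
    return odds_ratio_per_unit ** units


def coefficient_from_odds_ratio(odds_ratio):
    """Logistic regression coefficient implied by an odds ratio."""
    return math.log(odds_ratio)


# Odds ratio for Recidivism Risk Decile, Model 2, from the table.
recidivism_decile_or = 1.21

beta = coefficient_from_odds_ratio(recidivism_decile_or)
one_decile = odds_multiplier(recidivism_decile_or, 1)
# Comparing decile 10 with decile 1 is a gap of nine deciles.
nine_deciles = odds_multiplier(recidivism_decile_or, 9)

print(f"coefficient = {beta:.3f}")
print(f"odds multiplier across nine deciles = {nine_deciles:.2f}")
```

Under that linearity assumption, the implied odds of return for a parolee in the top decile are roughly five to six times those of a parolee in the bottom decile. These are odds, not probabilities.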
Next Step



Search for static variables to increase AUC.
Wait for larger sample size for validation.
Explore possibilities to conduct a head-to-head
comparison between parole agents’ judgments and
COMPAS assessment.


Example: In 1998, ADJC collaborated with NCCD to
develop the Arizona Risk/Needs Instrument.
Subsequent validation found the assessment method was
less accurate at predicting risk than probation officers’
judgments.