Application of the Empirical Bayes Method

The Empirical Bayes Method
for Safety Estimation
Doug Harwood
MRIGlobal
Kansas City, MO
Key Reference
Hauer, E., D.W. Harwood, F.M. Council, and M.S. Griffith.
“The Empirical Bayes method for estimating safety: A tutorial.”
Transportation Research Record 1784, pp. 126-131.
National Academies Press, Washington, D.C., 2002.
http://www.ctre.iastate.edu/educweb/CE552/docs/Bayes_tutor_hauer.pdf
The Problem

You are a safety engineer for a highway
agency. The agency plans next year to
implement a countermeasure that will
reduce crashes by 35% over the next
three years. To estimate the benefits
of this countermeasure, what safety
measure will you multiply by 0.35?
What Do We Need To Know?
- You need to know – or, rather, estimate – what would be
  expected to happen in the future if no action is taken
- Then, you can apply crash modification factors (CMFs) for
  the known effects of planned actions to estimate their
  effects quantitatively

Common Approach:
Use Last 3 Years of Crash Data
Year    Observed Crashes
2008    30
2009    19
2010    21
More Data Gives a Different Result
Year    Observed Crashes
2001    22
2002    23
2003    16
2004    16
2005    9
2006    14
2007    17
2008    30
2009    19
2010    21
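To make the difference concrete, here is a minimal sketch that averages the observed counts listed above over the last 3 years and over all 10 years. The helper name `mean_crashes` is illustrative only.

```python
# Observed crash counts by year, copied from the slides above.
crashes = {
    2001: 22, 2002: 23, 2003: 16, 2004: 16, 2005: 9,
    2006: 14, 2007: 17, 2008: 30, 2009: 19, 2010: 21,
}


def mean_crashes(counts, first_year, last_year):
    """Average observed crashes per year over an inclusive range of years."""
    values = [counts[year] for year in range(first_year, last_year + 1)]
    return sum(values) / len(values)


three_year = mean_crashes(crashes, 2008, 2010)   # the "common approach"
ten_year = mean_crashes(crashes, 2001, 2010)     # using all available data

print(f"3-year average:  {three_year:.1f} crashes/year")
print(f"10-year average: {ten_year:.1f} crashes/year")
```

The 3-year window is dominated by the 30 crashes observed in 2008, so it gives a noticeably higher baseline than the 10-year record.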
RTM Example with Average Observed Crashes
[Figure: crashes per year (0 to 7) plotted by year (1993 to 2005), showing the 3-year average (Xa), the long-term average (m), and the random error]
“True Safety Impact of a Measure”
[Figure: crashes per year (0 to 7) plotted by year (1993 to 2007), showing the 3-year average ‘before’ (Xa), the long-term average (m), the 3-year average ‘after’, the observed safety effect, and the true safety effect]
Regression to the mean problem …
- High crash locations are chosen for one reason (high number
  of crashes!) – might be truly high or might be just random
  variation
- Even with no treatment, we would expect, on average, this
  high crash frequency to decrease
- This needs to be accounted for, but often is not, e.g., when
  crash reductions after treatment are reported by comparing
  before and after frequencies over short periods
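Regression to the mean can be illustrated with a small simulation. This is a sketch under invented assumptions, not data from the presentation: every site has the same hypothetical long-term mean, counts are Poisson, and the "worst" sites are selected only on their before-period counts. No treatment is applied, yet the selected sites appear to improve.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k = 0
    p = 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


rng = random.Random(0)
TRUE_MEAN = 5.0   # hypothetical long-term crashes per period, identical at all sites
N_SITES = 2000    # hypothetical number of sites
N_SELECTED = 100  # sites flagged as "high crash locations"

before = [poisson(rng, TRUE_MEAN) for _ in range(N_SITES)]
after = [poisson(rng, TRUE_MEAN) for _ in range(N_SITES)]  # no treatment applied

# Select the sites with the most before-period crashes.
worst = sorted(range(N_SITES), key=lambda i: before[i], reverse=True)[:N_SELECTED]

before_avg = sum(before[i] for i in worst) / N_SELECTED
after_avg = sum(after[i] for i in worst) / N_SELECTED

print(f"Selected sites, before: {before_avg:.2f} crashes per site")
print(f"Selected sites, after:  {after_avg:.2f} crashes per site")
```

Because the selected sites were chosen for their high counts, their after-period average falls back toward the long-term mean even though nothing changed. A naive before-after comparison would credit that drop to a treatment.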

The “imprecision” problem …
With 100 crashes per year and 3 years
of data, we can reliably estimate the
number of crashes per year, with a (Poisson)
standard deviation of about…
or 5.7% of the mean
However, if there are relatively
few crashes per time period (say, 1
crash per 10 years), the estimate
varies greatly …
or 180% of the mean!
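The formulas on this slide did not survive extraction. The sketch below is one way to arrive at figures of this size. It assumes Poisson counts and 3 years of data in both cases, so the standard deviation of the per-year estimate is the square root of the expected total count divided by the number of years. The function name `relative_sd` is illustrative.

```python
import math


def relative_sd(mean_per_year, years):
    """Standard deviation of the estimated crashes/year, as a fraction of the mean.

    Assumes the total count over `years` is Poisson, so its variance equals
    its expected value (mean_per_year * years).
    """
    expected_total = mean_per_year * years
    sd_per_year = math.sqrt(expected_total) / years
    return sd_per_year / mean_per_year


high_volume = relative_sd(100.0, 3)  # 100 crashes per year
low_volume = relative_sd(0.1, 3)     # 1 crash per 10 years

print(f"100 crashes/year:    {high_volume:.1%} of the mean")
print(f"1 crash per 10 years: {low_volume:.0%} of the mean")
```

Under these assumptions the first case comes out near 5.8% (the slide quotes 5.7%) and the second near 180%.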
Things change…
BEWARE of assuming that everything will remain the same …
- Future conditions will not be identical to past conditions
- Most especially, traffic volumes will likely change
- Past trends can help forecast future volume changes

Focus on Crash Frequency vs. AADT
Relationships: Use of Crash Rates May
Be Misleading
[Figure: crash frequency (0 to 30) versus AADT (0 to 30,000), with Before and After conditions marked and points labeled F1, F2, F3, R1, C1, C2, E1, and E2]
The Empirical Bayes Approach

- Empirical Bayes: an approach to estimating what crashes
  will occur in the future if no countermeasure is implemented
  (or what would have happened if no countermeasure had been
  implemented)
  - Simply assuming that what occurred in a recent short-term
    “before period” will happen again in the future is naïve
    and potentially very inaccurate
  - Yet, this assumption has been the norm for many years
The Empirical Bayes Approach
- The observed crash history for the site being analyzed is
  one useful and important piece of information
- What other information do we have available?

The Empirical Bayes Approach
- We know the short-term crash history for the site
- The long-term average crash history for that site would be
  even better, BUT…
  - Long-term crash records may not be available
  - If the average crash frequency is low, even the long-term
    average crash frequency may be imprecise
  - Geometrics, traffic control, lane use, and other site
    conditions change over time
- We can get the crash history for other similar sites,
  referred to as a REFERENCE GROUP
Empirical Bayes
- Increases precision
- Reduces RTM bias
- Uses information from the site, plus …
- Information from other, similar sites

Safety Performance Functions
SPF = Mathematical relationship between
crash frequency per unit of time (and road
length) and traffic volumes (AADT)
[Figure: SPF curve of crash frequency (0 to 30) versus AADT (0 to 30,000)]
How Are SPFs Derived?
- SPFs are developed using negative binomial regression
  analysis
- SPFs are based on several years of crash data
- SPFs are specific to a given reference group of sites and
  severity level
  - Different road types = different SPFs
  - Different severity levels = different SPFs
The overdispersion parameter
- The negative binomial is a generalized Poisson where the
  variance is larger than the mean (overdispersed)
- The “standard deviation-type” parameter of the negative
  binomial is the overdispersion parameter φ
- variance = η[1 + η/(φL)]
- Where …
  - μ = average crashes/km-yr (or /yr for intersections)
  - η = μYL (or μY for intersections) = number of crashes/time
  - φ = estimated by the regression (units must be
    complementary with L; for intersections, L is taken as one)
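As a rough illustration of the variance formula above, the sketch below evaluates it for a road segment. It assumes that Y is the number of years and L the segment length in km, which the units of μ and η suggest. All input values are hypothetical, and the function name `nb_variance` is not from the presentation.

```python
def nb_variance(eta, phi, length=1.0):
    """Negative binomial variance: eta * (1 + eta / (phi * length)).

    eta    -- expected number of crashes in the period (mu * Y * L)
    phi    -- overdispersion parameter estimated by the regression
    length -- segment length L; taken as 1 for intersections
    """
    if eta < 0 or phi <= 0 or length <= 0:
        raise ValueError("eta must be >= 0; phi and length must be > 0")
    return eta * (1.0 + eta / (phi * length))


# Hypothetical segment: 2 crashes/km-yr, 3 years, 1.5 km, phi = 3 per km.
mu, years, length_km, phi = 2.0, 3, 1.5, 3.0
eta = mu * years * length_km               # expected crashes in the period
variance = nb_variance(eta, phi, length_km)

print(f"eta = {eta:.1f}, variance = {variance:.1f}")
```

For a pure Poisson process the variance would equal η. Here it is three times larger, which is what overdispersion means in practice.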
SPF Example
Regression model for total crashes
at rural 4-leg intersections with
minor-road STOP control
Np = exp(-8.69 + 0.65 ln(ADT1) + 0.47 ln(ADT2))
where:
Np = Predicted number of intersection-related crashes
per year within 250 ft of intersection
ADT1 = Major-road traffic flow (veh/day)
ADT2 = Minor-road traffic flow (veh/day)
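The regression model above translates directly into a few lines of code. In this sketch the coefficients are taken from the slide, while the traffic volumes in the example and the function name `predicted_crashes` are hypothetical.

```python
import math


def predicted_crashes(adt_major, adt_minor):
    """Predicted intersection-related crashes per year (Np) from the SPF above.

    adt_major -- major-road traffic flow, veh/day (ADT1)
    adt_minor -- minor-road traffic flow, veh/day (ADT2)
    """
    if adt_major <= 0 or adt_minor <= 0:
        raise ValueError("traffic volumes must be positive")
    return math.exp(-8.69 + 0.65 * math.log(adt_major) + 0.47 * math.log(adt_minor))


# Hypothetical intersection: 8,000 veh/day on the major road, 1,000 on the minor road.
np_example = predicted_crashes(8000, 1000)
print(f"Np = {np_example:.2f} crashes per year")
```

Because both exponents are below 1, the predicted crash frequency grows more slowly than traffic volume, which is one reason crash rates can mislead.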
Calculating the Long-Term Average
Expected Crash Frequency
The estimate of expected crash
frequency:
Ne = w (Np) + (1 – w) (No)
where:
Ne = Expected accident frequency
Np = Predicted accident frequency
No = Observed accident frequency
Weight (w; 0<w<1) is calculated from
the overdispersion parameter
Weight (w) Used in EB
Computations
w = 1 / ( 1 + k Np)
w = weight
k = overdispersion parameter for the
SPF
Np = predicted accident frequency for
site
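The weighting equations above can be sketched as two small functions. The numeric inputs are hypothetical, the function names are illustrative, and the sketch assumes that Np and No refer to the same time period.

```python
def eb_weight(k, n_predicted):
    """Weight w = 1 / (1 + k * Np) placed on the SPF prediction."""
    return 1.0 / (1.0 + k * n_predicted)


def eb_expected(n_predicted, n_observed, k):
    """EB estimate Ne = w * Np + (1 - w) * No.

    n_predicted -- Np, predicted accident frequency from the SPF
    n_observed  -- No, observed accident frequency at the site
    k           -- overdispersion parameter for the SPF
    (Np and No are assumed to cover the same time period.)
    """
    w = eb_weight(k, n_predicted)
    return w * n_predicted + (1.0 - w) * n_observed


# Hypothetical site: SPF predicts 1.5 crashes/yr, 3.0 crashes/yr observed, k = 0.4.
w = eb_weight(0.4, 1.5)
ne = eb_expected(1.5, 3.0, 0.4)
print(f"w = {w:.3f}, Ne = {ne:.4f}")
```

The estimate always falls between the predicted and observed values. A larger k, meaning a less reliable SPF, lowers w and pulls the estimate toward the site's own crash history.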
Graphical Representation of the EB Method
Predicting Future Safety Levels from
Past Safety Performance
Ne(future) = Ne(past) x (Np(future) / Np(past))
Ne = expected accident frequency
Np = predicted accident frequency
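A minimal sketch of the projection above, using hypothetical values: the past EB estimate is scaled by the ratio of the SPF prediction under future conditions to the prediction under past conditions. The function name `project_expected` is illustrative.

```python
def project_expected(ne_past, np_past, np_future):
    """Ne(future) = Ne(past) * (Np(future) / Np(past))."""
    if np_past <= 0:
        raise ValueError("np_past must be positive")
    return ne_past * (np_future / np_past)


# Hypothetical site: the SPF prediction rises from 1.5 to 1.8 crashes/yr
# because of forecast traffic growth; the past EB estimate is 2.0625 crashes/yr.
ne_future = project_expected(ne_past=2.0625, np_past=1.5, np_future=1.8)
print(f"Ne(future) = {ne_future:.3f} crashes per year")
```

The site keeps its own relative position against the SPF, 2.0625 / 1.5 in this example, while the SPF carries the effect of the changed conditions.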
Predicting Future Safety Levels from
Past Safety Performance

- The Np(future)/Np(past) ratio can reflect changes in:
  - Traffic volume
  - Countermeasures (based on CMFs)
CMFs—How to Use Them

- CMFs are expressed as a decimal factor:
  - CMF of 0.80 indicates a 20% crash reduction
  - CMF of 1.20 indicates a 20% crash increase
CMFs—How to Use Them

- Expected crash frequencies and CMFs can be multiplied
  together:
Ne(with) = Ne(without) x CMF
Crashes Reduced = Ne(without) - Ne(with)
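Returning to the opening problem, a countermeasure that reduces crashes by 35% corresponds to a CMF of 0.65. The sketch below applies it to a hypothetical expected crash frequency. The value of 20 crashes and the function name `apply_cmf` are illustrative only.

```python
def apply_cmf(ne_without, cmf):
    """Return (Ne(with), crashes reduced) for one CMF.

    Ne(with) = Ne(without) x CMF
    Crashes reduced = Ne(without) - Ne(with)
    """
    if cmf < 0:
        raise ValueError("a CMF cannot be negative")
    ne_with = ne_without * cmf
    return ne_with, ne_without - ne_with


# Hypothetical: 20 expected crashes over the analysis period without treatment.
ne_with, reduced = apply_cmf(ne_without=20.0, cmf=0.65)
print(f"Ne(with) = {ne_with:.1f}, crashes reduced = {reduced:.1f}")
```

The quantity multiplied by 0.35 to obtain the benefit is Ne(without), the expected crash frequency if no action is taken, not the raw count from the last few years.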
CMFs—Single Factor

- CMF for shoulder rumble strips
  - Rural freeways (CMFTOT = 0.79)
Ne(with) = Ne(without) x 0.79
CMF Functions
CMFs for Lane Width (two-lane rural roads) (Harwood et al.,
2000)
CMFs for Combined
Countermeasures

- CMFs can be multiplied together if their effects are
  independent:
Ne(with) = Ne(without) x CMF1 x CMF2
Are countermeasure effects independent?
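A short sketch of combining CMFs by multiplication. The 0.79 value is the rumble-strip CMF quoted earlier in the presentation. The second CMF of 0.90 and the expected crash frequency of 10 are hypothetical, and the result is only meaningful if the two effects really are independent.

```python
import math


def combine_cmfs(cmfs):
    """Combined CMF for countermeasures whose effects are assumed independent."""
    return math.prod(cmfs)


# 0.79 is the shoulder rumble strip CMF; 0.90 is a hypothetical second CMF.
combined = combine_cmfs([0.79, 0.90])

ne_without = 10.0                   # hypothetical expected crashes without treatment
ne_with = ne_without * combined     # Ne(with) = Ne(without) x CMF1 x CMF2

print(f"Combined CMF = {combined:.3f}, Ne(with) = {ne_with:.2f}")
```

If two countermeasures address the same crash types, their effects overlap, and the simple product will tend to overstate the combined reduction.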
EB applications
- HSM
- IHSDM
- Safety Analyst

EB applications
HSM Part C
- Estimate long-term expected crash frequency for a location
  under current conditions
- Estimate long-term expected crash frequency for a location
  under future conditions
- Estimate long-term expected crash frequency for a location
  under future conditions with one or more countermeasures
  in place
HSM Part B
- Evaluate countermeasure effectiveness using before and
  after data
EB applications
Site-Specific EB Method
- Based on equations in this presentation
Project-Level EB Method
- If project is made up of components with different SPFs,
  then there is no single value of k, the overdispersion
  parameter
EB Before-After Effectiveness Evaluation
- See Chapter 9 in HSM Part B
Questions?