Toward a unified approach to fitting loss models
Jacques Rioux and Stuart Klugman, for presentation at the IAC, Feb. 9, 2004
Handout/slides: e-mail me at [email protected]
Overview
What problem is being addressed?
The general idea
The specific ideas
Models to consider
Recording the data
Representing the data
Testing a model
Selecting a model
The problem
Too many models
Two books: 26 distributions!
Can mix or splice to get even more
Data can be confusing
Deductibles, limits
Too many tests and plots
Chi-square, K-S, A-D, p-p, q-q, D
The general idea
Limited number of distributions
Standard way to present data
Retain flexibility on testing and selection
Distributions
Should be
Familiar
Few
Flexible
A few familiar distributions
Exponential
Only one parameter
Gamma
Two parameters, a mode if a > 1.
Lognormal
Two parameters, a mode
Pareto
Two parameters, a heavy right tail
Flexible
Add by allowing mixtures
That is, $f(x) = a_1 f_1(x) + \cdots + a_k f_k(x)$,
where $a_1 + \cdots + a_k = 1$ and all $a_j > 0$.
Some restrictions:
Only the exponential can be used more than once.
Cannot use both the gamma and lognormal.
Why mixtures?
Allows different shape at beginning and
end (e.g. mode from lognormal, tail from
Pareto).
By using several exponentials, can have almost any tail weight (see Keatinge).
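A minimal sketch of the mixture density defined on the "Flexible" slide. The function names and the two-exponential parameters are hypothetical, chosen only to illustrate how a second exponential thickens the tail; the exponential is parameterized by its mean, which is an assumption.

```python
import math

def exponential_pdf(x, theta):
    """Exponential density with mean theta (assumed parameterization)."""
    return math.exp(-x / theta) / theta

def mixture_pdf(x, weights, densities):
    """Evaluate f(x) = a1*f1(x) + ... + ak*fk(x)."""
    if len(weights) != len(densities):
        raise ValueError("need one weight per component")
    if any(a <= 0 for a in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be positive and sum to 1")
    return sum(a * f(x) for a, f in zip(weights, densities))

# Hypothetical two-exponential mixture: most of the mass on small losses,
# plus a second component with a much larger mean for the right tail.
weights = [0.8, 0.2]
densities = [lambda x: exponential_pdf(x, 500.0),
             lambda x: exponential_pdf(x, 5000.0)]
density_at_1000 = mixture_pdf(1000.0, weights, densities)
```

Passing the components as callables keeps the weight check in one place, whichever of the four families is mixed.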
Estimating parameters
Use only maximum likelihood
Asymptotically optimal
Can be applied in all settings, regardless of
the nature of the data
Likelihood value can be used to compare
different models
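To illustrate "can be applied in all settings", here is a sketch of maximum likelihood for the exponential when each observation has its own deductible (left truncation) and may be capped by a limit (right censoring). The record layout (d, x, censored), with x on the ground-up loss scale, and the sample records are assumptions made for illustration. For the exponential the maximizer has a closed form; the other families would need a numerical optimizer.

```python
import math

def exponential_loglik(theta, records):
    """Loglikelihood for records (d, x, censored).

    d is the truncation point, x is the observed loss or the censoring
    point, and censored flags whether x is only a lower bound on the loss.
    """
    total = 0.0
    for d, x, censored in records:
        total += -(x - d) / theta        # log of S(x) / S(d)
        if not censored:
            total += -math.log(theta)    # extra term turning S(x) into f(x)
    return total

def exponential_mle(records):
    """Closed-form maximizer: total excess over d per uncensored value."""
    uncensored = sum(1 for _, _, censored in records if not censored)
    if uncensored == 0:
        raise ValueError("need at least one uncensored observation")
    return sum(x - d for d, x, _ in records) / uncensored

# Hypothetical records: (deductible, loss or censoring point, censored?)
records = [(100, 300, False), (100, 1100, True), (250, 650, False)]
theta_hat = exponential_mle(records)
```

The maximized loglikelihood, exponential_loglik(theta_hat, records), is the value that would be compared across models.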
Representing the data
Why do we care?
Graphical tests require a graph of the empirical density or distribution function.
Hypothesis tests require the functions themselves.
What is the issue?
None if,
All observations are discrete or grouped
No truncation or censoring
But if so,
For discrete data the Kaplan-Meier product-limit estimator provides the empirical distribution function (and is the nonparametric mle as well).
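A sketch of the product-limit calculation, assuming each record is (d, x, censored) on the ground-up loss scale. The risk set at a value y counts the records with d < y <= x, the usual form when truncation points differ. The function name and record layout are hypothetical, and the result is conditional on exceeding the smallest truncation point.

```python
def kaplan_meier_cdf(records):
    """Return [(y, F(y))] at each distinct uncensored value y.

    records holds (d, x, censored): d is the truncation point, x the
    observed value or censoring point.
    """
    event_points = sorted({x for _, x, censored in records if not censored})
    survival = 1.0
    steps = []
    for y in event_points:
        # Records that have entered observation and are still present at y.
        at_risk = sum(1 for d, x, _ in records if d < y <= x)
        # Uncensored values exactly equal to y.
        events = sum(1 for _, x, censored in records if not censored and x == y)
        survival *= 1.0 - events / at_risk
        steps.append((y, 1.0 - survival))
    return steps

# With no truncation and no censoring this is the ordinary empirical cdf.
complete = [(0, 1, False), (0, 2, False), (0, 3, False), (0, 4, False)]
steps = kaplan_meier_cdf(complete)
```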
Issue – grouped data
For grouped data,
If completely grouped, the histogram represents the pdf, the ogive the cdf.
If some grouped, some not, or multiple deductibles, limits, our suggestion is to replace the observations in the interval with that many equally spaced points.
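A sketch of the replacement step. The slide does not say exactly where the points sit, so this assumes the interval is cut into as many equal subintervals as there are observations and one point is placed at the midpoint of each; the function name is hypothetical.

```python
def spread_group(lower, upper, count):
    """Replace `count` grouped observations in (lower, upper] by points.

    Assumption: one point at the midpoint of each of `count` equal-width
    subintervals. Other equally spaced placements are possible.
    """
    if count == 0:
        return []
    width = (upper - lower) / count
    return [lower + width * (j + 0.5) for j in range(count)]

# Four observations known only to lie between 0 and 100.
points = spread_group(0.0, 100.0, 4)
```

The resulting points can then be treated as individual observations alongside the ungrouped data.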
Review
Given a data set, we have the following:
A way to represent the data.
A limited set of models to consider.
Parameter estimates for each model.
The remaining tasks are:
Decide which models are acceptable.
Decide which model to use.
Example
The paper has two examples; we will look only at the second one.
Data are individual payments, but the
policies that produced them had different
deductibles (100, 250, 500) and different
maximum payments (1,000, 3,000, 5,000).
There are 100 observations.
Empirical cdf
[Figure: Kaplan-Meier estimate of the empirical cdf, F-emp(x) (0 to 1) against loss (0 to 6000).]
Distribution function plot
Plot the empirical and model cdfs together.
Note, because in this example the
smallest deductible is 100, the empirical
cdf begins there.
To be comparable, the model cdf is calculated as
$$F_d(x) = \frac{F(x) - F(d)}{1 - F(d)}$$
Example model
All plots and tests that follow are for a
mixture of a lognormal and exponential
distribution. The parameters are
lognormal: 7.109459, 0.254236
exponential: 1839.174
$a_1 = 0.238301$
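A sketch of the model cdf for these parameter values, adjusted for the smallest deductible of 100 so that it starts where the empirical cdf starts. Three things are assumptions, since the slide does not spell them out: the lognormal values are read as (mu, sigma) in that order, the exponential value is read as the mean, and the weight 0.238301 is taken to belong to the lognormal component.

```python
import math

MU, SIGMA = 7.109459, 0.254236   # assumed to be lognormal mu and sigma
THETA = 1839.174                 # assumed to be the exponential mean
A1 = 0.238301                    # assumed to be the lognormal weight

def lognormal_cdf(x, mu, sigma):
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def exponential_cdf(x, theta):
    return 1.0 - math.exp(-x / theta) if x > 0.0 else 0.0

def model_cdf(x):
    """Mixture cdf: weighted sum of the component cdfs."""
    return A1 * lognormal_cdf(x, MU, SIGMA) + (1.0 - A1) * exponential_cdf(x, THETA)

def truncated_cdf(x, d=100.0):
    """Model cdf conditional on exceeding d: (F(x) - F(d)) / (1 - F(d))."""
    F_d = model_cdf(d)
    return (model_cdf(x) - F_d) / (1.0 - F_d)
```

Evaluating truncated_cdf over a grid of losses from 100 to 6000 gives a curve that can be drawn against the empirical cdf.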
Distribution function plot
[Figure: Distribution function plot showing F-emp and F-model, F(x) (0 to 1) against loss (0 to 6000).]
Confidence bands
It is possible to create 95% confidence
bands. That is, we are 95% confident that
the true distribution is completely within
these bands.
Formulas adapted from Klein and
Moeschberger with a modification for
multiple truncation points (their formula
allows only multiple censoring points).
CDF plot with bounds
[Figure: CDF plot with 95% bounds showing F-emp, F-model, lower and upper, F(x) (0 to 1) against loss (0 to 6000).]
Other CDF pictures
Any function of the cdf, such as the limited
expected value, could be plotted.
The only one shown here is the difference
plot – magnify the previous plot by plotting
the difference of the two distribution
functions.
CDF difference plot
[Figure: CDF difference plot showing the difference with lower and upper bounds (-0.3 to 0.6) against loss (0 to 6000).]
Histogram plot
Plot a histogram of the data against the
density function of the model.
For data that were not grouped, can use
the empirical cdf to get cell probabilities.
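A sketch of turning an empirical cdf into histogram heights: each cell probability is the change in the cdf across the cell, divided by the cell width so the height is on the density scale. The function name, cell boundaries and toy cdf are hypothetical.

```python
def histogram_heights(boundaries, empirical_cdf):
    """Height per cell = (F(upper) - F(lower)) / (upper - lower)."""
    heights = []
    for lower, upper in zip(boundaries, boundaries[1:]):
        cell_probability = empirical_cdf(upper) - empirical_cdf(lower)
        heights.append(cell_probability / (upper - lower))
    return heights

# Toy cdf that rises linearly from 0 at loss 0 to 1 at loss 100.
toy_cdf = lambda x: min(max(x / 100.0, 0.0), 1.0)
heights = histogram_heights([0.0, 50.0, 100.0], toy_cdf)
```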
Histogram plot
Histogram plot
0.0007
0.0006
0.0005
0.0004
hist
0.0003
model
0.0002
0.0001
0
0
1000
2000
3000
loss
4000
5000
6000
Hypothesis tests
Null: the model fits
Alternative: it doesn't
Three tests
Kolmogorov-Smirnov
Anderson-Darling
Chi-square
Kolmogorov-Smirnov
Test statistic is maximum difference
between the empirical and model cdfs.
Each difference is multiplied by a scaling
factor related to the sample size at that
point.
Critical values are way off when
parameters estimated from data.
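For reference, a sketch of the ordinary K-S statistic for complete individual data. This is the textbook version, not the authors' calculation: it leaves out the sample-size scaling factor mentioned on this slide and makes no adjustment for truncation or censoring. The function name and toy data are hypothetical.

```python
def ks_statistic(data, model_cdf):
    """Largest gap between the empirical step function and the model cdf.

    The empirical cdf jumps at each data point, so the gap is checked
    just before and just after every jump.
    """
    xs = sorted(data)
    n = len(xs)
    largest = 0.0
    for i, x in enumerate(xs):
        F = model_cdf(x)
        largest = max(largest, abs((i + 1) / n - F), abs(i / n - F))
    return largest

# Toy check against a uniform(0, 1) model.
d_stat = ks_statistic([0.25, 0.75], lambda x: x)
```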
Anderson-Darling
Test statistic looks complex:
$$A^2 = \int_d^u \frac{[F_e(x) - F_m(x)]^2}{F_m(x)[1 - F_m(x)]} f_m(x)\,dx$$
where e is empirical and m is model.
The paper shows how to turn this into a sum.
More emphasis on fit in tails than for K-S test.
Chi-square test
You have seen this one before.
It is the only one with an adjustment for
estimating parameters.
Results
K-S: 0.5829
A-D: 0.2570
Chi-square p-value of 0.5608
The model is clearly acceptable.
Simulation study needed to get p-values
for these tests. Simulation indicates that
the p-values are over 0.9.
Comparing models
Good picture
Better test numbers
Likelihood criterion such as the Schwarz Bayesian criterion (SBC). The SBC is the loglikelihood minus (r/2)ln(n), where r is the number of parameters and n is the sample size.
Several models
Model   Loglike   A-D      K-S      Chi-sq p-value   SBC
Exp     -628.23   1.2245   0.9739   0.1054           -630.53
Ln      -626.26   0.6682   0.9375   0.2126           -630.87
Gam     -627.35   0.8369   1.0355   0.2319           -631.96
L/E     -623.77   0.2579   0.5829   0.5608           -632.98
G/E     -623.64   0.2804   0.5773   0.5260           -632.85
L/E/E   -623.39   0.1484   0.4494   0.3472           -637.21
G/E/E   -623.26   0.1353   0.4652   0.3348           -637.08
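The SBC column can be reproduced from the loglikelihoods with n = 100. The parameter counts r used here are an assumption (each component's own parameters plus the free mixing weights), though they do recover the tabulated values to two decimals.

```python
import math

def sbc(loglik, r, n):
    """Schwarz Bayesian criterion: loglikelihood minus (r/2) ln(n)."""
    return loglik - (r / 2.0) * math.log(n)

N = 100  # number of observations in the example

# model: (loglikelihood from the table, assumed parameter count r)
models = {
    "Exp":   (-628.23, 1),
    "Ln":    (-626.26, 2),
    "Gam":   (-627.35, 2),
    "L/E":   (-623.77, 4),
    "G/E":   (-623.64, 4),
    "L/E/E": (-623.39, 6),
    "G/E/E": (-623.26, 6),
}

sbc_values = {name: sbc(ll, r, N) for name, (ll, r) in models.items()}
best_by_loglik = max(models, key=lambda name: models[name][0])
best_by_sbc = max(sbc_values, key=sbc_values.get)

for name, value in sbc_values.items():
    print(f"{name:6s} {value:8.2f}")
```

Ranking by loglikelihood alone and ranking by SBC pick different models, which is why the choice of criterion matters.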
Which is the winner?
Referee A: loglikelihood rules, pick gamma/exp/exp mixture
This is a world of one big model and the best is the best, simplicity is never an issue.
Referee B: SBC rules, pick exponential
Parsimony is most important, pay a penalty for extra parameters.
Me: lognormal/exp. Great pictures, better numbers than exponential, but simpler than three component mixture.
Can this be automated?
We are working on software
Test version can be downloaded at
www.cbpa.drake.edu/mixfit.
MLEs are good. Pictures and test
statistics are not quite right.
May crash.
Here is a quick demo.