Toward a unified approach to fitting loss models

Jacques Rioux and Stuart Klugman, for presentation at the IAC, Feb. 9, 2004
 Handout/slides: E-mail me, [email protected]

Overview
 What problem is being addressed?
 The general idea
 The specific ideas
   Models to consider
   Recording the data
   Representing the data
   Testing a model
   Selecting a model
The problem
 Too many models
   Two books – 26 distributions!
   Can mix or splice to get even more
 Data can be confusing
   Deductibles, limits
 Too many tests and plots
   Chi-square, K-S, A-D, p-p, q-q, D
The general idea
Limited number of distributions
 Standard way to present data
 Retain flexibility on testing and selection

Distributions

Should be
 Familiar
 Few
 Flexible
A few familiar distributions
 Exponential
   Only one parameter
 Gamma
   Two parameters, a mode if a>1.
 Lognormal
   Two parameters, a mode
 Pareto
   Two parameters, a heavy right tail
Flexible
 Add by allowing mixtures
 That is, $f(x) = a_1 f_1(x) + \cdots + a_k f_k(x)$, where $a_1 + \cdots + a_k = 1$ and all $a_j > 0$
 Some restrictions:
   Only the exponential can be used more than once.
   Cannot use both the gamma and lognormal.
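The mixture formula and its restrictions can be sketched in a few lines of Python. This is only an illustration: the function names, the family labels and the two-exponential example with its weights and means are hypothetical, not taken from the paper.

```python
import math

def exponential_pdf(x, theta):
    """Density of an exponential distribution with mean theta."""
    return math.exp(-x / theta) / theta if x >= 0 else 0.0

def check_mixture(families, weights, tol=1e-9):
    """Raise ValueError if a proposed mixture breaks the stated restrictions."""
    if len(families) != len(weights):
        raise ValueError("one weight per component is required")
    if any(a <= 0 for a in weights):
        raise ValueError("all weights must be positive")
    if abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must sum to 1")
    for name in set(families):
        if name != "exponential" and families.count(name) > 1:
            raise ValueError("only the exponential may be used more than once")
    if "gamma" in families and "lognormal" in families:
        raise ValueError("gamma and lognormal cannot both appear")

def mixture_pdf(x, weights, pdfs):
    """f(x) = a_1 f_1(x) + ... + a_k f_k(x)."""
    return sum(a * f(x) for a, f in zip(weights, pdfs))

# Hypothetical two-exponential mixture (illustrative numbers only).
families = ["exponential", "exponential"]
weights = [0.7, 0.3]
check_mixture(families, weights)
pdfs = [lambda x: exponential_pdf(x, 500.0),
        lambda x: exponential_pdf(x, 5000.0)]
density_at_1000 = mixture_pdf(1000.0, weights, pdfs)
```

Keeping the restriction check separate from the density makes it easy to reject a candidate model before any fitting is attempted.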
Why mixtures?
 Allows a different shape at the beginning and end (e.g. mode from lognormal, tail from Pareto).
 By using several exponentials, can have almost any tail weight (see Keatinge).

Estimating parameters
 Use only maximum likelihood
   Asymptotically optimal
   Can be applied in all settings, regardless of the nature of the data
   Likelihood value can be used to compare different models
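As a rough illustration of why maximum likelihood works "regardless of the nature of the data", here is a sketch for the simplest case: a single exponential with deductibles (left truncation) and limits (right censoring). The record layout, the function names and the data are assumptions made for this example; the records hold loss amounts and are made up.

```python
import math

def exp_loglik(theta, records):
    """Loglikelihood for exponential losses with mean theta.

    Each record is (x, d, censored): x is the loss amount (or the censoring
    point when censored is True) and d is the deductible (truncation point).
    """
    total = 0.0
    for x, d, censored in records:
        total += -(x - d) / theta        # ln[S(x) / S(d)]
        if not censored:
            total += -math.log(theta)    # density adds a factor 1/theta
    return total

def exp_mle(records):
    """Closed-form maximizer of exp_loglik for the exponential case."""
    exposure = sum(x - d for x, d, _ in records)
    n_uncensored = sum(1 for _, _, c in records if not c)
    return exposure / n_uncensored

# Made-up records, not the data from the paper.
records = [(350.0, 100.0, False), (1100.0, 100.0, True),
           (900.0, 250.0, False), (3250.0, 250.0, True),
           (1200.0, 500.0, False)]
theta_hat = exp_mle(records)
loglik_at_mle = exp_loglik(theta_hat, records)
```

For mixtures there is no closed form, so the same loglikelihood would be handed to a numerical optimizer instead.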
Representing the data
 Why do we care?
   Graphical tests require a graph of the empirical density or distribution function.
   Hypothesis tests require the functions themselves.
What is the issue?
 None if
   All observations are discrete or grouped
   No truncation or censoring
 But if there is truncation or censoring,
   For discrete data the Kaplan-Meier product-limit estimator provides the empirical distribution function (and is the nonparametric mle as well).
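A minimal sketch of the product-limit estimator with truncation and censoring follows. The record layout (loss or censoring point, deductible, censored flag) and the data are assumptions for illustration. The risk set at a point y is taken to be the records with deductible below y that are still under observation at y, so the result is the cdf conditional on exceeding the smallest deductible.

```python
def kaplan_meier(records):
    """Return [(y_j, F_emp(y_j))] at each distinct uncensored loss y_j.

    Each record is (x, d, censored): x is the loss (or censoring point),
    d is the deductible (left-truncation point).
    """
    event_points = sorted({x for x, _, censored in records if not censored})
    survival = 1.0
    cdf = []
    for y in event_points:
        # Risk set: entered observation before y and not yet removed.
        at_risk = sum(1 for x, d, _ in records if d < y <= x)
        events = sum(1 for x, _, c in records if x == y and not c)
        survival *= 1.0 - events / at_risk
        cdf.append((y, 1.0 - survival))
    return cdf

# Made-up records, not the data from the paper.
records = [(350.0, 100.0, False), (1100.0, 100.0, True),
           (900.0, 250.0, False), (3250.0, 250.0, True),
           (1200.0, 500.0, False)]
empirical_cdf = kaplan_meier(records)
```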
Issue – grouped data
 For grouped data,
   If completely grouped, the histogram represents the pdf, the ogive the cdf.
   If some grouped, some not, or multiple deductibles, limits, our suggestion is to replace the observations in the interval with that many equally spaced points.
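The replacement of grouped observations by equally spaced points might look like the sketch below. The exact placement rule is not given on the slide; centering each point in its own equal-width sub-interval is an assumption, and the function name is hypothetical.

```python
def spread_points(lower, upper, count):
    """Place `count` equally spaced points strictly inside (lower, upper).

    Assumption: the interval is cut into `count` equal pieces and one point
    is put at the middle of each piece.
    """
    width = (upper - lower) / count
    return [lower + (j + 0.5) * width for j in range(count)]

# Four observations known only to lie between 1,000 and 2,000.
points = spread_points(1000.0, 2000.0, 4)
```

The resulting points can then be treated like individual observations when building the empirical distribution function.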
Review
 Given a data set, we have the following:
   A way to represent the data.
   A limited set of models to consider.
   Parameter estimates for each model.
 The remaining tasks are:
   Decide which models are acceptable.
   Decide which model to use.
Example
 The paper has two examples; we will look only at the second one.
 Data are individual payments, but the policies that produced them had different deductibles (100, 250, 500) and different maximum payments (1,000, 3,000, 5,000).
 There are 100 observations.

Empirical cdf
[Figure: Kaplan-Meier estimate. F-emp(x) plotted against loss, 0 to 6,000.]
Distribution function plot
Plot the empirical and model cdfs together.
Note, because in this example the
smallest deductible is 100, the empirical
cdf begins there.
 To be comparable, the model cdf is calculated as
   $F_d(x) = \frac{F(x) - F(d)}{1 - F(d)}$
Example model
 All plots and tests that follow are for a mixture of a lognormal and exponential distribution. The parameters are
   lognormal: $\mu = 7.109459$, $\sigma = 0.254236$
   exponential: $\theta = 1839.174$
   $a_1 = 0.238301$
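For readers who want to reproduce the model curve, here is a sketch that evaluates the fitted mixture cdf conditional on exceeding the smallest deductible, d = 100. Two things are assumed rather than stated on the slide: that a1 is the weight on the lognormal component (the first one listed), and that the exponential parameter is its mean. The function names are hypothetical.

```python
import math

MU, SIGMA = 7.109459, 0.254236   # lognormal parameters from the slide
THETA = 1839.174                 # exponential parameter, ASSUMED to be the mean
A1 = 0.238301                    # ASSUMED to be the lognormal weight

def lognormal_cdf(x, mu, sigma):
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def exponential_cdf(x, theta):
    return 1.0 - math.exp(-x / theta) if x > 0 else 0.0

def model_cdf(x):
    """Unconditional mixture cdf F(x)."""
    return (A1 * lognormal_cdf(x, MU, SIGMA)
            + (1.0 - A1) * exponential_cdf(x, THETA))

def truncated_cdf(x, d=100.0):
    """F_d(x) = [F(x) - F(d)] / [1 - F(d)] for x >= d, and 0 below d."""
    if x < d:
        return 0.0
    return (model_cdf(x) - model_cdf(d)) / (1.0 - model_cdf(d))

# Model curve on a grid of loss amounts, for plotting against F-emp.
grid = [100.0 + 100.0 * j for j in range(60)]
curve = [truncated_cdf(x) for x in grid]
```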
Distribution function plot
[Figure: Distribution function plot. F-emp and F-model, F(x) plotted against loss, 0 to 6,000.]
Confidence bands
It is possible to create 95% confidence
bands. That is, we are 95% confident that
the true distribution is completely within
these bands.
 Formulas adapted from Klein and
Moeschberger with a modification for
multiple truncation points (their formula
allows only multiple censoring points).

CDF plot with bounds
[Figure: CDF plot with 95% bounds. F-emp, F-model, lower and upper bounds, F(x) plotted against loss, 0 to 6,000.]
Other CDF pictures
Any function of the cdf, such as the limited
expected value, could be plotted.
 The only one shown here is the difference
plot – magnify the previous plot by plotting
the difference of the two distribution
functions.

CDF difference plot
[Figure: CDF difference plot. Difference with lower and upper bounds, plotted against loss, 0 to 6,000.]
Histogram plot
Plot a histogram of the data against the
density function of the model.
 For data that were not grouped, can use
the empirical cdf to get cell probabilities.
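Getting histogram heights from an empirical cdf is a small calculation: the cell probability is the change in the cdf across the cell, and the height is that probability divided by the cell width. The sketch below assumes the empirical cdf is stored as a list of (point, value) pairs; the cdf values and cell boundaries are hypothetical.

```python
def step_cdf(points, x):
    """Evaluate a right-continuous step cdf given [(y_j, F(y_j))] sorted by y_j."""
    value = 0.0
    for y, f in points:
        if y <= x:
            value = f
        else:
            break
    return value

def histogram_heights(points, boundaries):
    """Height of each cell = cell probability / cell width."""
    heights = []
    for left, right in zip(boundaries, boundaries[1:]):
        prob = step_cdf(points, right) - step_cdf(points, left)
        heights.append(prob / (right - left))
    return heights

# Hypothetical empirical cdf and cell boundaries.
emp = [(350.0, 0.25), (900.0, 0.4375), (1200.0, 0.71875)]
heights = histogram_heights(emp, [100.0, 500.0, 1000.0, 2000.0])
```

These heights are on the same scale as the model density, so the two can be drawn on one set of axes.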

Histogram plot
[Figure: Histogram plot. hist and model density plotted against loss, 0 to 6,000.]
Hypothesis tests
 Null: model fits
 Alternative: it doesn't
 Three tests
   Kolmogorov-Smirnov
   Anderson-Darling
   Chi-square
Kolmogorov-Smirnov
 Test statistic is the maximum difference between the empirical and model cdfs. Each difference is multiplied by a scaling factor related to the sample size at that point.
 Critical values are way off when parameters are estimated from the data.
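The sketch below shows the statistic in its textbook form for complete data (no truncation or censoring), with a single sqrt(n) scaling. That simplification is an assumption: the version described on the slide scales each difference by the sample size at that point, which is not reproduced here. The sample and the exponential model are made up.

```python
import math

def ks_statistic(sample, model_cdf):
    """Return (D, sqrt(n) * D) for complete data.

    The empirical cdf jumps at each observation, so the difference is
    checked just before and just after every jump.
    """
    xs = sorted(sample)
    n = len(xs)
    d_max = 0.0
    for i, x in enumerate(xs):
        f = model_cdf(x)
        d_max = max(d_max, abs((i + 1) / n - f), abs(i / n - f))
    return d_max, math.sqrt(n) * d_max

# Made-up losses tested against a hypothetical exponential with mean 1000.
losses = [120.0, 340.0, 560.0, 910.0, 1500.0, 2750.0]
d, scaled = ks_statistic(losses, lambda x: 1.0 - math.exp(-x / 1000.0))
```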

Anderson-Darling
 Test statistic looks complex:
   $A^2 = \int_d^u \frac{[F_e(x) - F_m(x)]^2}{F_m(x)[1 - F_m(x)]} f_m(x)\,dx$
   where e is empirical and m is model.
 The paper shows how to turn this into a sum.
 More emphasis on fit in tails than for K-S test.
Chi-square test
You have seen this one before.
 It is the only one with an adjustment for
estimating parameters.

Results
K-S: 0.5829
 A-D: 0.2570
 Chi-square p-value of 0.5608
 The model is clearly acceptable.
Simulation study needed to get p-values
for these tests. Simulation indicates that
the p-values are over 0.9.

Comparing models
 Good picture
 Better test numbers
 Likelihood criterion such as the Schwarz Bayesian criterion (SBC). The SBC is the loglikelihood minus (r/2)ln(n), where r is the number of parameters and n is the sample size.

Several models
Model   Loglike   A-D      K-S      Chi-sq p-value   SBC
Exp     -628.23   1.2245   0.9739   0.1054           -630.53
Ln      -626.26   0.6682   0.9375   0.2126           -630.87
Gam     -627.35   0.8369   1.0355   0.2319           -631.96
L/E     -623.77   0.2579   0.5829   0.5608           -632.98
G/E     -623.64   0.2804   0.5773   0.5260           -632.85
L/E/E   -623.39   0.1484   0.4494   0.3472           -637.21
G/E/E   -623.26   0.1353   0.4652   0.3348           -637.08
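The SBC column can be checked from the loglikelihoods with the formula loglikelihood minus (r/2)ln(n). In the sketch below the loglikelihoods and n = 100 come from the slides; the parameter counts are an assumption (component parameters plus one free weight per extra component). With those counts the results agree with the SBC column to within rounding.

```python
import math

def sbc(loglik, r, n):
    """Schwarz Bayesian criterion: loglikelihood minus (r/2) ln(n)."""
    return loglik - (r / 2.0) * math.log(n)

N = 100  # sample size from the example
models = {  # name: (loglikelihood from the table, ASSUMED parameter count)
    "Exp": (-628.23, 1),
    "Ln": (-626.26, 2),
    "Gam": (-627.35, 2),
    "L/E": (-623.77, 4),
    "G/E": (-623.64, 4),
    "L/E/E": (-623.39, 6),
    "G/E/E": (-623.26, 6),
}
scores = {name: sbc(ll, r, N) for name, (ll, r) in models.items()}
best_by_sbc = max(scores, key=scores.get)
best_by_loglik = max(models, key=lambda name: models[name][0])
```

The two selection rules disagree here: the largest SBC belongs to the exponential, while the largest loglikelihood belongs to the gamma/exp/exp mixture.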
Which is the winner?
 Referee A – loglikelihood rules – pick gamma/exp/exp mixture
   This is a world of one big model and the best is the best, simplicity is never an issue.
 Referee B – SBC rules – pick exponential
   Parsimony is most important, pay a penalty for extra parameters.
 Me – lognormal/exp. Great pictures, better numbers than exponential, but simpler than three component mixture.
Can this be automated?
We are working on software
 Test version can be downloaded at
www.cbpa.drake.edu/mixfit.
 MLEs are good. Pictures and test
statistics are not quite right.
 May crash.
 Here is a quick demo.

