Practical GLM Analysis of Homeowners David Cummings State Farm Insurance Companies

Practical GLM Analysis
of Homeowners
David Cummings
State Farm Insurance Companies
Overview
• How are GLMs different?
– Practical Implications
• Modeling Deductibles in GLMs
How are GLMs different?
• Multivariate Analysis
• Statistical Framework
• Flexible Modeling Tool
Multivariate Analysis
• Multivariate analyses reduce bias
• Practical implications
– Requires analysis of all rate factors
– Ensures consistency in analysis
– May change your processes
Consistent Analysis
• Consistent Exposure Base
[Chart: Pure Premium Relativities by Amount of Insurance, $0 to $800,000, calculated on an Earned Policies basis vs. an Earned Exposure basis]
Statistical Framework
• Enhances the analysis
• Practical Implications
– Re-learn hypothesis testing and
analysis of standard errors
– Different application of “Credibility”
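As an illustration of GLM-style hypothesis testing, a candidate rate factor can be judged by its coefficient standard errors and by a likelihood-ratio test between nested models. A minimal sketch with statsmodels on made-up data (the column names and parameter values are illustrative, not from the presentation):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from scipy import stats

# Illustrative policy-level data: claim counts, coverage amount, alarm indicator
rng = np.random.default_rng(0)
n = 50_000
policies = pd.DataFrame({
    "amount": rng.uniform(100_000, 500_000, n),
    "alarm": rng.integers(0, 2, n),
})
policies["claims"] = rng.poisson(0.08 * np.where(policies["alarm"] == 1, 0.9, 1.0))

# Nested Poisson frequency models, with and without the alarm factor
full = smf.glm("claims ~ np.log(amount) + alarm", data=policies,
               family=sm.families.Poisson()).fit()
reduced = smf.glm("claims ~ np.log(amount)", data=policies,
                  family=sm.families.Poisson()).fit()

# Likelihood-ratio test of the alarm factor (chi-square, 1 degree of freedom)
lr_stat = 2 * (full.llf - reduced.llf)
p_value = stats.chi2.sf(lr_stat, df=full.df_model - reduced.df_model)
print(f"LR statistic {lr_stat:.1f}, p-value {p_value:.4f}")

# Coefficient estimates with standard errors and Wald tests
print(full.summary())
```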
Flexible Modeling Tool
• Allows for many analyses
• Practical Implications
– Freq/Severity vs. Pure Premium vs.
Loss Ratio
– Design an analysis process
– Easily accommodates new data
– Fight the urge to overanalyze
Modeling Deductibles
• Traditional Deductible Analyses
• GLM Approaches to Deductibles
• Tests on simulated data
Empirical Method
All losses at $500 deductible: $1,000,000
Losses eliminated by $1000 deductible: $100,000
Loss Elimination Ratio: 10%
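The slide's arithmetic, as a trivial sketch in Python:

```python
# Empirical LER from the figures on the slide
all_losses_at_500_ded = 1_000_000    # ground-up losses observed at the $500 deductible
eliminated_by_1000_ded = 100_000     # losses a $1000 deductible would eliminate

ler = eliminated_by_1000_ded / all_losses_at_500_ded
print(f"Loss Elimination Ratio: {ler:.0%}")   # -> 10%
```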
Empirical Method
• Pros
– Simple
• Cons
– Need credible data at low deductible
– No $1000 deductible data is used to
price the $1000 deductible
Loss Distribution Method
• Fit a severity distribution to data
[Chart: fitted severity distribution, losses from $0 to $10,000]
Loss Distribution Method
• Fit a severity distribution to data
• Calculate expected value of truncated
distribution
[Chart: fitted severity distribution truncated at the deductible, losses from $0 to $10,000]
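A sketch of this method with scipy on simulated losses (the gamma parameters are illustrative): fit the severity curve, then use limited expected values of the truncated distribution to price a higher deductible.

```python
import numpy as np
from scipy import stats, integrate

# Simulated ground-up losses (gamma parameters are illustrative)
rng = np.random.default_rng(1)
losses = rng.gamma(1.2, 4_000.0, 20_000)

# Fit a gamma severity distribution with location fixed at zero
shape, loc, scale = stats.gamma.fit(losses, floc=0)
sev = stats.gamma(shape, loc=loc, scale=scale)

def limited_ev(dist, limit):
    """E[min(X, limit)], computed as the integral of the survival function."""
    return integrate.quad(lambda x: dist.sf(x), 0, limit)[0]

# Expected loss retained above each deductible, per ground-up claim
mean = sev.mean()
net_500 = mean - limited_ev(sev, 500)
net_1000 = mean - limited_ev(sev, 1000)

# Severity relativity of a $1000 deductible relative to a $500 deductible
print(f"$1000-deductible severity relativity: {net_1000 / net_500:.3f}")
```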
Loss Distribution Method
• Pros
– Provides framework to relate data at
different deductibles
– Direct calculation for any deductible
• Cons
– Need to reflect other rating factors
– Framework may be too rigid
Complications
• Deductible truncation is not clean
• “Pseudo-deductible” effect
– Due to claims awareness/self-selection
– May be difficult to detect in severity
distribution
[Chart: severity distribution near the deductible illustrating the pseudo-deductible effect, losses from $0 to $10,000]
GLM Modeling Approaches
1. Fit severity distribution using other
rating variables
2. Use deductible as a variable in
severity/frequency models
3. Use deductible as a variable in
pure premium model
GLM Approach 1
– Fit Distribution w/ variables
• Fit a severity model
• Linear predictor relates to untruncated
mean
• Maximum likelihood estimation adjusted
for truncation
• Reference:
– Guiahi, “Fitting Loss Distributions with
Emphasis on Rating Variables”, CAS Winter
Forum, 2001
GLM Approach 1
– Fit Distribution w/ variables
X = untruncated random variable ~ Gamma
Y = loss data, net of deductible d
$\log(\mu_X) = \beta_0 + \beta_1 v_1 + \cdots + \beta_n v_n$

$f_Y(y) = \dfrac{f_X(y + d;\, \mu_X)}{1 - F_X(d;\, \mu_X)}$
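A sketch of this truncation-adjusted fit with scipy (one hypothetical rating variable v1; all simulated parameters are illustrative): the linear predictor gives the untruncated gamma mean, and each observed net loss y at deductible d contributes f_X(y + d; μ_X) / (1 − F_X(d; μ_X)) to the likelihood.

```python
import numpy as np
from scipy import stats, optimize

# Simulated example: one hypothetical rating variable v1, gamma ground-up losses
rng = np.random.default_rng(2)
n = 5_000
v1 = rng.integers(0, 2, n)                    # e.g. a construction indicator
d = rng.choice([500.0, 1000.0], n)            # policy deductibles
true_mu = np.exp(8.0 + 0.3 * v1)              # untruncated means, log-linear in v1
x = rng.gamma(1.5, true_mu / 1.5)             # ground-up losses, shape 1.5
keep = x > d                                  # only losses above the deductible are observed
y, d, v1 = x[keep] - d[keep], d[keep], v1[keep]

def neg_loglik(params):
    """Negative log-likelihood of net losses y, adjusted for truncation at d."""
    b0, b1, log_shape = params
    a = np.exp(log_shape)
    mu = np.exp(b0 + b1 * v1)                 # log(mu_X) = b0 + b1 * v1
    scale = mu / a                            # gamma mean = shape * scale = mu
    logf = stats.gamma.logpdf(y + d, a, scale=scale)   # log f_X(y + d; mu_X)
    logS = stats.gamma.logsf(d, a, scale=scale)        # log(1 - F_X(d; mu_X))
    return -np.sum(logf - logS)

fit = optimize.minimize(neg_loglik, x0=[7.0, 0.0, 0.0], method="Nelder-Mead")
b0_hat, b1_hat, shape_hat = fit.x[0], fit.x[1], np.exp(fit.x[2])
print(b0_hat, b1_hat, shape_hat)
```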
GLM Approach 1
– Fit Distribution w/ variables
• Pros
– Applies GLM within framework
– Directly models truncation
• Cons
– Non-standard GLM application
– Difficult to adapt to rate plan
– No frequency data used in model
Practical Issues
• No standard statistical software
– Complicates analysis
– Less computationally efficient
$\log(\mu_X) = \beta_0 + \beta_1 v_1 + \cdots + \beta_n v_n$

$f_Y(y) = \dfrac{f_X(y + d;\, \mu_X)}{1 - F_X(d;\, \mu_X)}$
Not a member of Exponential Family of distributions
Practical Issues
• No clear translation into a rate plan
– Deductible effect depends on mean
– Mean depends on all other variables
– Deductible effect varies by other variables
$\log(\mu_X) = \beta_0 + \beta_1 v_1 + \cdots + \beta_n v_n$

$f_Y(y) = \dfrac{f_X(y + d;\, \mu_X)}{1 - F_X(d;\, \mu_X)}$
Practical Issues
• No use of frequency information
– Frequency effects derived from
severity fit
$1 - F_X(d;\, \mu_X)$
– Loss of information
GLM Approach 2
– Frequency/Severity Model
• Standard GLM approach
• Fit separate frequency and
severity models
• Use deductible as independent
variable
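A sketch of this approach with statsmodels on made-up data (column names and parameters are hypothetical, with one severity record per claiming policy for simplicity): a Poisson frequency GLM and a gamma severity GLM, each with deductible as a categorical predictor.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Illustrative policy-level data (column names and parameters are made up)
rng = np.random.default_rng(3)
n = 200_000
policies = pd.DataFrame({
    "deductible": rng.choice([500, 1000, 2500], n),
    "amount": rng.uniform(100_000, 500_000, n),
})
policies["claims"] = rng.poisson(0.08 * (500 / policies["deductible"]) ** 0.2)

# One severity record per claiming policy, for simplicity of the sketch
claims = policies.loc[policies["claims"] > 0].copy()
claims["severity"] = rng.gamma(1.5, 3000.0, len(claims))

# Frequency: Poisson GLM with deductible as a categorical rating variable
freq = smf.glm("claims ~ C(deductible) + np.log(amount)", data=policies,
               family=sm.families.Poisson()).fit()

# Severity: gamma GLM with log link on the claim-level data
sev = smf.glm("severity ~ C(deductible) + np.log(amount)", data=claims,
              family=sm.families.Gamma(link=sm.families.links.Log())).fit()

# Deductible factors from each model multiply into pure premium relativities
print(np.exp(freq.params.filter(like="deductible")))
print(np.exp(sev.params.filter(like="deductible")))
```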
GLM Approach 2
– Frequency/Severity Model
• Pros
– Utilizes standard GLM packages
– Incorporates deductible effects on
frequency and severity
– Allows model forms that fit rate plan
• Cons
– Potential inconsistency of models
– Specification of deductible effects
Test Data
• Simulated Data
– 1,000,000 policies
– 80,000 claims
• Risk Characteristics
– Amount of Insurance
– Deductible
– Construction
– Alarm System
• Gamma Severity Distribution
• Poisson Frequency Distribution
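A sketch of how such a test set could be simulated with numpy (all factor values and parameters below are made up for illustration; only the overall scale of 1,000,000 policies and roughly 80,000 claims follows the slide):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
n = 1_000_000

# Simulated risk characteristics (all distributions and factors are made up)
policies = pd.DataFrame({
    "amount": rng.uniform(100_000, 800_000, n),
    "deductible": rng.choice([250, 500, 1000, 2500, 5000], n,
                             p=[0.15, 0.40, 0.30, 0.10, 0.05]),
    "construction": rng.choice(["frame", "masonry"], n),
    "alarm": rng.integers(0, 2, n),
})

# Poisson frequency: roughly 8% of policies claim, fewer at higher deductibles
lam = (0.08
       * (policies["amount"] / 300_000) ** 0.1
       * (500 / policies["deductible"]) ** 0.25
       * np.where(policies["alarm"] == 1, 0.9, 1.0))
policies["claims"] = rng.poisson(lam)

# Gamma severity for policies with at least one claim
has_claim = policies["claims"] > 0
mu = 5_000 * (policies.loc[has_claim, "amount"] / 300_000) ** 0.5
policies.loc[has_claim, "severity"] = rng.gamma(1.5, mu / 1.5)

print(policies["claims"].sum(), "claims on", n, "policies")
```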
Conclusions from Test Data
– Frequency/Severity Models
• Deductible as categorical variable
– Good overall fit
– Highly variable estimates for higher
or less common deductibles
– When amount effect is incorrect,
interaction term improves model fit
Severity Relativities
Using Categorical Variable
[Chart: fitted severity relativities by deductible, $0 to $10,000]
Conclusions from Test Data
– Frequency/Severity Models
• Deductible as continuous variable
– Transformations with best likelihood
• Ratio of deductible to coverage amount
• Log of deductible
– Interaction terms with amount
improve model fit
– Carefully examine the results for
inconsistencies
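These transforms can be written directly as model formulas. A sketch with statsmodels (the data below is a small made-up stand-in for the simulated test set, not the presentation's data):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Minimal illustrative frame (stand-in for the simulated test data)
rng = np.random.default_rng(5)
n = 200_000
policies = pd.DataFrame({
    "deductible": rng.choice([500, 1000, 2500, 5000], n),
    "amount": rng.uniform(100_000, 500_000, n),
})
policies["claims"] = rng.poisson(0.08 * (500 / policies["deductible"]) ** 0.2)

# Ratio of deductible to coverage amount as the continuous transform
freq_ratio = smf.glm("claims ~ I(deductible / amount) + np.log(amount)",
                     data=policies, family=sm.families.Poisson()).fit()

# Log of deductible, interacted with log of amount
freq_log = smf.glm("claims ~ np.log(deductible) * np.log(amount)",
                   data=policies, family=sm.families.Poisson()).fit()

# Compare likelihood/AIC, then inspect the implied relativities for inconsistencies
print(freq_ratio.aic, freq_log.aic)
```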
Frequency Relativities
[Chart: frequency relativities by deductible, $0 to $5,000, for coverage amounts of $100,000 and $500,000]
Severity Relativities
[Chart: severity relativities by deductible, $0 to $5,000, for coverage amounts of $100,000 and $500,000]
Pure Premium Relativities
[Chart: pure premium relativities by deductible, $0 to $5,000, for coverage amounts of $100,000 and $500,000]
GLM Approach 3
– Pure Premium Model
• Fit pure premium model using
Tweedie distribution
• Use deductible as independent
variable
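A minimal sketch of a Tweedie pure premium fit in statsmodels (the variance power of 1.5 and all data values are assumptions for illustration):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Illustrative policy-level data with total loss per policy (all values made up)
rng = np.random.default_rng(6)
n = 200_000
policies = pd.DataFrame({
    "deductible": rng.choice([500, 1000, 2500, 5000], n),
    "amount": rng.uniform(100_000, 500_000, n),
})
counts = rng.poisson(0.08 * (500 / policies["deductible"]) ** 0.2)
policies["loss"] = rng.gamma(1.5 * np.maximum(counts, 1), 3000.0) * (counts > 0)

# Pure premium GLM: Tweedie family, log link, deductible as a predictor
pp = smf.glm("loss ~ C(deductible) + np.log(amount)",
             data=policies,
             family=sm.families.Tweedie(var_power=1.5,   # assumed power in (1, 2)
                                        link=sm.families.links.Log())).fit()

# Implied deductible relativities on the pure premium scale
print(np.exp(pp.params.filter(like="deductible")))
```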
GLM Approach 3
– Pure Premium Model
• Pros
– Incorporates frequency and severity
effects simultaneously
– Ensures consistency
– Analogous to Empirical LER
• Cons
– Specification of deductible effects
Conclusions from Test Data
– Pure Premium Models
• Deductible as categorical variable
– Good overall fit
– Some highly variable estimates
• Good fit with some continuous
transforms
– Can avoid inconsistencies with
good choice of transform
Extension of GLM
– Dispersion Modeling
• Double GLM
• Iteratively fit two models
–Mean model fit to data
–Dispersion model fit to residuals
• Reference
Smyth, Jørgensen, “Fitting Tweedie’s
Compound Poisson Model to Insurance
Claims Data: Dispersion Modeling,”
ASTIN Bulletin, 32:143-157
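A sketch of that iteration in statsmodels (a simplified illustration of the Smyth and Jørgensen procedure, not their code; the Tweedie power, column names, and data are assumptions): alternate between a Tweedie mean model weighted by the inverse of the fitted dispersions and a gamma GLM fit to the unit deviances.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Illustrative data; column names, parameters, and Tweedie power are assumptions
rng = np.random.default_rng(7)
n = 100_000
df = pd.DataFrame({
    "deductible": rng.choice([500, 1000, 2500, 5000], n),
    "amount": rng.uniform(100_000, 500_000, n),
})
counts = rng.poisson(0.08 * (500 / df["deductible"]) ** 0.2)
df["loss"] = rng.gamma(1.5 * np.maximum(counts, 1), 3000.0) * (counts > 0)
p = 1.5  # assumed Tweedie variance power

def tweedie_unit_deviance(y, mu, p):
    """Tweedie unit deviance for 1 < p < 2, handling exact zeros in y."""
    y, mu = np.asarray(y, float), np.asarray(mu, float)
    dev = 2 * mu ** (2 - p) / (2 - p)            # limiting value at y = 0
    pos = y > 0
    yp, mup = y[pos], mu[pos]
    dev[pos] = 2 * (yp * (yp ** (1 - p) - mup ** (1 - p)) / (1 - p)
                    - (yp ** (2 - p) - mup ** (2 - p)) / (2 - p))
    return dev

weights = np.ones(n)                             # start from constant dispersion
for _ in range(5):
    # Mean model: Tweedie GLM with prior weights = 1 / fitted dispersion
    mean_fit = smf.glm("loss ~ C(deductible) + np.log(amount)", data=df,
                       family=sm.families.Tweedie(var_power=p),
                       var_weights=weights).fit()
    # Dispersion model: gamma GLM with log link fit to the unit deviances
    df["unit_dev"] = tweedie_unit_deviance(df["loss"], mean_fit.fittedvalues, p)
    disp_fit = smf.glm("unit_dev ~ C(deductible)", data=df,
                       family=sm.families.Gamma(link=sm.families.links.Log())).fit()
    weights = np.asarray(1.0 / disp_fit.fittedvalues)

print(np.exp(mean_fit.params.filter(like="deductible")))
```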
Double GLM in Modeling
Deductibles
• Gamma distribution assumes that variance is proportional to μ²
• Deductible effect on severity
– Mean increases
– Variance increases more gradually
• Double GLM significantly
improves model fit on Test Data
– More significant than interactions
Pure Premium Relativities
Tweedie Model – $500,000 Coverage Amount
[Chart: pure premium relativities by deductible, $0 to $5,000, comparing the constant-dispersion fit with the Double GLM fit]
Conclusion
• Deductible modeling is difficult
• Tweedie model with Double GLM
seems to be the best approach
• Categorical vs. Continuous
– Need to compare various models
• Interaction terms may be
important