
Katholieke Universiteit Leuven
Faculty of Science
Department of Mathematics
MODERN RESERVING TECHNIQUES
FOR THE INSURANCE BUSINESS
by
Tom HOEDEMAKERS
Supervisors:
Prof. Dr. J. Beirlant
Prof. Dr. J. Dhaene
Thesis submitted in fulfilment of the
requirements for the degree of
Doctor of Science
Leuven 2005
Acknowledgments
Four years ago I became part of the stimulating and renowned academic
environment at K. U. Leuven, the Department of Applied Economics, and
the AFI Leuven Research Center in particular. As a researcher, I had the
opportunity to interact, work with and learn from many interesting people.
I consider myself extremely fortunate to have had the support of the
following people in the realization of this thesis.
I feel very privileged to have worked with my two supervisors, Jan
Beirlant and Jan Dhaene. To each of them I owe a great debt of gratitude
for their continuous encouragement, patience, inspiration and friendship.
I especially want to thank them for the freedom they allowed me to seek
satisfaction in research, for supporting me in my choices and for believing
in me. They carefully answered the many (sometimes not well-defined)
questions that I had and they always found a way to make themselves
available for yet another meeting. Each chapter of this thesis has benefitted
from their critical comments, which often inspired me to do further research
and to improve the vital points of the argument. It has been a privilege
to study under Jan and Jan, and to them goes my highest personal and
professional respect.
I am also grateful to Marc Goovaerts for giving me the opportunity to
start my thesis in one of the world-leading actuarial research centers. Marc
Goovaerts and Jan Dhaene have taught me a great deal about the field of
actuarial science by sharing with me the joy of discovery and investigation
that is the heart of research. They brought me in contact with a lot of
interesting people in the actuarial world and gave me the possibility to
present my work at different congresses all over the world.
I would also like to thank the other members of the doctoral committee,
Michel Denuit, Rob Kaas, Wim Schoutens and Jef Teugels, for their valuable contributions. Their detailed comments as
well as their broader reactions definitely helped me to improve the quality
of my research and its write-up.
Many thanks go also to my (ex-)colleagues Ales, Björn, David, Grzegorz,
Katrien, Marc, Piotr, Steven and Yuri for their enthusiasm and stimulating cooperation. A lot of sympathy goes to Emiliano Valdez for the serious
discussions and, even more importantly, for the fun we had during his stay
at the K. U. Leuven at the beginning of this year.
Beyond these professional colleagues, a word of thanks goes to all my
friends and fellow students for their friendship and support.
Finally, and not least, I would like to thank my parents and my sister
Leen for their love, guidance and support. They constantly reminded me of
their confidence in me and encouraged me to pursue my scientific vocation,
especially in moments of doubt. You have always believed in me, and that
has been a great moral support.
Tom
Leuven, 2005
Table of Contents

Acknowledgments
Preface
Publications
List of abbreviations and symbols

1 Risk and comonotonicity in the actuarial world
  1.1 Fundamental concepts in actuarial risk theory
    1.1.1 Dependent risks
    1.1.2 Risk measures
    1.1.3 Actuarial ordering of risks
  1.2 Comonotonicity

2 Convex bounds
  2.1 Introduction
  2.2 Convex bounds for sums of dependent random variables
    2.2.1 The comonotonic upper bound
    2.2.2 The improved comonotonic upper bound
    2.2.3 The lower bound
    2.2.4 Moments based approximations
  2.3 Upper bounds for stop-loss premiums
    2.3.1 Upper bounds based on lower bound plus error term
    2.3.2 Bounds by conditioning through decomposition of the stop-loss premium
    2.3.3 Partially exact/comonotonic upper bound
    2.3.4 The case of a sum of lognormal random variables
  2.4 Application: discounted loss reserves
    2.4.1 Framework and notation
    2.4.2 Calculation of convex lower and upper bounds
  2.5 Convex bounds for scalar products of random vectors
    2.5.1 Theoretical results
    2.5.2 Stop-loss premiums
    2.5.3 The case of log-normal discount factors
  2.6 Application: the present value of stochastic cash flows
    2.6.1 Stochastic returns
    2.6.2 Lognormally distributed payments
    2.6.3 Elliptically distributed payments
    2.6.4 Independent and identically distributed payments
  2.7 Proofs

3 Reserving in life insurance business
  3.1 Introduction
  3.2 Modelling stochastic decrements
  3.3 The distribution of life annuities
    3.3.1 A single life annuity
    3.3.2 A homogeneous portfolio of life annuities
    3.3.3 An ‘average’ portfolio of life annuities
    3.3.4 A numerical illustration
  3.4 Conclusion

4 Reserving in non-life insurance business
  4.1 Introduction
  4.2 The claims reserving problem
  4.3 Model set-up: regression models
    4.3.1 Lognormal linear models
    4.3.2 Loglinear location-scale models
    4.3.3 Generalized linear models
    4.3.4 Linear predictors and the discounted IBNR reserve
  4.4 Convex bounds for the discounted IBNR reserve
    4.4.1 Asymptotic results in generalized linear models
    4.4.2 Lower and upper bounds
  4.5 The bootstrap methodology in claims reserving
    4.5.1 Introduction
    4.5.2 Central idea
    4.5.3 Bootstrap confidence intervals
    4.5.4 Bootstrap in claims reserving
  4.6 Three applications
    4.6.1 Lognormal linear models
    4.6.2 Loglinear location-scale models
    4.6.3 Generalized linear models
  4.7 Conclusion

5 Other approximation techniques for sums of dependent random variables
  5.1 Introduction
  5.2 Moment matching approximations
    5.2.1 Two well-known moment matching approximations
    5.2.2 Application: discounted loss reserves
  5.3 Asymptotic approximations
    5.3.1 Preliminaries for heavy-tailed distributions
    5.3.2 Asymptotic results
    5.3.3 Application: discounted loss reserves
  5.4 The Bayesian approach
    5.4.1 Introduction
    5.4.2 Prior choice
    5.4.3 Iterative simulation methods
    5.4.4 Bayesian model set-up
  5.5 Applications in claims reserving
    5.5.1 The comonotonicity approach versus the Bayesian approximations
    5.5.2 The comonotonicity approach versus the asymptotic and moment matching approximations
  5.6 Proofs

Samenvatting in het Nederlands (Summary in Dutch)
Bibliography
Preface
Uncertainty is very much a part of the world in which we live. Indeed, one
often hears the well-known cliché that the only certainties in life are death
and taxes. However, even these supposed certainties are far from being
completely certain, as any actuary or accountant can attest. For although
one’s eventual death and the requirement that one pay taxes may be facts
of life, the timing of one’s death and the amount of taxes to pay are far from
certain and are generally beyond one’s control. Uncertainty can make life
interesting. Indeed, the world would likely be a very dull place if everything
were perfectly predictable. However, uncertainty can also cause grief and
suffering.
Actuarial science is the subject whose primary focus is analyzing the
financial consequences of future uncertain events. In particular, it is concerned with analyzing the adverse financial consequences of large, unpredictable losses and with designing mechanisms to cushion the harmful financial effects of such losses.
Insurance is based on the premise that individuals faced with large and
unpredictable losses can reduce the variability of such losses by forming a
group and sharing the losses incurred by the group as a whole. This important principle of loss sharing, known as the insurance principle, forms
the foundation of actuarial science. It can be justified mathematically
using the Central Limit Theorem from probability theory. For the insurance principle to be valid, essentially four conditions should hold (or very
nearly hold). The losses should be unpredictable. The risks should be
independent in the sense that a loss incurred by one member of the group
makes additional losses by other members of the group no more or less
likely. The risks should be homogeneous in the sense that a loss incurred
by one member of the group is not expected to be any different in size or
likelihood from losses incurred by other members of the group. Finally,
the group should be sufficiently large so that the portion of the total loss
that each individual is required to pay becomes relatively certain. In practice, risks are not truly independent or homogeneous. Moreover, there will
always be situations where the condition of unpredictability is violated.
Actuarial science seeks to address the following three problems associated
with any such insurance arrangement:
1. Given the nature of the risk being assumed, what price (i.e. premium)
should the insurance company charge?
2. Given the nature of the overall risks being assumed, how much of
the aggregate premium income should the insurance company set
aside in a reserve to meet contractual obligations (i.e. pay insurance
claims) as they arise?
3. Given the importance to society and the general economy of having
sound financial institutions able to meet all their obligations, how
much capital should an insurance company have above and beyond
its reserves to absorb losses that are larger than expected? Given the
actual level of an insurance company’s capital, what is the probability
of the company remaining solvent?
These are generally referred to as the problems of pricing, reserving, and
solvency.
This thesis focuses on the problem of reserving and total balance sheet
requirements. A reserving analysis involves the determination of the random present value of an unknown amount of future loss payments. For a
property/casualty insurance company this uncertain amount is usually the
most important number on its financial statement. The care and expertise
with which that number is developed are crucial to the company and to its
policyholders. It is important not to let the inherent uncertainties serve as
an excuse for providing anything less than a rigorous scientific analysis.
Among those who rely on reserve estimates, interests and priorities may
vary. To company management the reserve estimate should provide reliable
information in order to maximize the company’s viability and profitability. To the insurance regulator, concerned with company solvency, reserves
should be set conservatively to reduce the probability of failure of the insurance company. To the tax agent charged with ensuring timely reporting
of earned income, the reserves should reflect actual payments as “nearly
as it is possible to ascertain them”. The policyholder is most concerned
that reserves are adequate to pay insured claims, but does not want to be
overcharged.
Techniques aside, the primary goal of the reserving process can be stated
quite simply. As of a given date, an insurer is liable both for claims that
arise from events which have already occurred and for claims that arise
from risks covered by the insurer but for which the uncertain event has
not yet occurred. Costs
associated with these claims fall into two categories: those which have been
paid and those which have not. The primary goal of the reserving process
is to estimate those which have not yet been paid (i.e. unpaid losses). As
of a given reserve date, the distribution of possible aggregate unpaid loss
amounts may be represented as a probability density function. Much has
been written about the statistical distributions that have proven to be
most useful in the study of risk and insurance. In practice full information
about the underlying distributions is hardly ever available. For this reason
one often has to rely on partial information, for example estimations of the
first couple of moments. Not only the basic summary measures, but also
more sophisticated risk measures (such as measures of skewness or extreme
percentiles of the distribution) which require much deeper knowledge about
the underlying distributions are of interest. The computation of the first
few moments may be seen as just a first attempt to explore the properties of
a random distribution. Moreover, in general the variance is not the most
suitable risk measure for determining the solvency requirements of an
insurance portfolio. As a two-sided risk measure it takes into account
both positive and negative discrepancies, which leads to underestimation
of the reserve in the case of a skewed distribution; nor does it emphasize
the tail properties of the distribution. In this context it is much more
appropriate to use the Value-at-Risk (the p-th quantile) or the Tail
Value-at-Risk (essentially the average of all quantiles above a predefined
level p). Risk measures based on stop-loss premiums (for example the
Expected Shortfall) can also be used. These trends are reflected in the
recent regulatory changes in banking and insurance (Basel 2 and Solvency 2),
which stress the role of the risk-based approach in asset-liability
management. This creates a need for new methodological tools that provide
more sophisticated information about the underlying risks, such as upper
quantiles and stop-loss premiums.
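The risk measures named here are straightforward to estimate from a sample of simulated outcomes. The sketch below (the function names are ours, not taken from the thesis) computes the empirical Value-at-Risk and Tail Value-at-Risk for an arbitrary skewed loss distribution:

```python
import numpy as np

def value_at_risk(sample, p):
    """Empirical p-th quantile of the loss distribution."""
    return float(np.quantile(sample, p))

def tail_value_at_risk(sample, p):
    """Average of the outcomes at or beyond the empirical VaR,
    i.e. an estimate of the average of all quantiles above level p."""
    var_p = value_at_risk(sample, p)
    return float(sample[sample >= var_p].mean())

# Illustration on a skewed (lognormal) loss distribution.
rng = np.random.default_rng(0)
losses = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)
print(value_at_risk(losses, 0.95))       # upper quantile
print(tail_value_at_risk(losses, 0.95))  # always at least the VaR
```

Note that the Tail Value-at-Risk is by construction at least as large as the Value-at-Risk at the same level, which is why it is the more conservative of the two.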
Little in the actuarial literature considers the adequate computation of
the distribution of reserve outcomes. Several methods exist for efficiently
approximating the distribution functions of sums of independent risks (e.g.
Panjer’s recursion, convolution, ...). Moreover, if the number of risks in
an insurance portfolio is large enough, the Central Limit Theorem yields a
normal approximation for the aggregate claims. Therefore, even when the
independence assumption is not justified (e.g. when it is rejected by formal
statistical tests), it is often used in practice because of its mathematical
convenience. In many practical applications, however, the independence
assumption is violated, which can lead to significant underestimation of
the riskiness of the portfolio. This is the case, for example, when the
actuarial technical risk is combined with the financial investment risk.
Unlike in finance, in insurance the concept of stochastic interest rates
emerged only recently. Traditionally, actuaries rely on deterministic
interest rates. This simplification makes it easy to handle summary
measures of financial contracts such as the mean, the standard deviation
or the upper quantiles. However, because of the high uncertainty about
future investment results, actuaries are forced to adopt very conservative
assumptions when calculating insurance premiums or mathematical reserves.
As a result, the diversification effects between returns in different
investment periods cannot be taken into account (i.e. the fact that poor
investment results in some periods are usually compensated by very good
results in others). This additional cost is transferred either to the
insureds, who have to pay higher insurance premiums, or to the shareholders,
who have to provide more economic capital.
For these reasons, the need for models with stochastic interest rates is
now well understood in the actuarial world too. The move toward stochastic
modelling of interest rates is further driven by the latest regulatory
changes in banking and insurance (Basel 2, Solvency 2), which promote a
risk-based approach to determining economic capital: traditional actuarial
conservatism should be replaced by fair value reserving, with regulatory
capital determined solely on the basis of unexpected losses, estimated
e.g. by taking the Value-at-Risk measure at an appropriate probability
level p. Projecting cash flows with stochastic rates of return is also
crucial in pricing applications in insurance, such as the embedded value
(the present value of cash flows generated only by policies-in-force) and
the appraisal value (the present value of cash flows generated both by
policies-in-force and by new business, i.e. the policies which will be
written in the future).
A mathematical description of the problem under discussion can be summarized
as follows.
Let Xi denote a random amount to be paid at time ti, i = 1, . . . , n, and
let Vi denote the discount factor over the period [0, ti]. We consider the
present value of the future payments, a scalar product of the form

S = Σ_{i=1}^{n} Xi Vi.    (1)
The random vector (X1, X2, . . . , Xn) may reflect e.g. the insurance or
credit risk, while the vector (V1, V2, . . . , Vn) represents the financial/investment risk. In general we assume that these vectors are mutually
independent. In practical applications the independence assumption is often violated, e.g. due to an inflation factor which strongly influences both
payments and investment results. One can, however, tackle this problem by
considering sums of the form
S = Σ_{i=1}^{n} X̃i Ṽi,
where X̃i = Xi /Zi and Ṽi = Vi Zi are the adjusted values expressed in
real terms (Zi denotes here an inflation factor over period [0, ti ]). For this
reason the assumption of independence between the insurance risk and the
financial risk is in most cases realistic and can be efficiently applied to
obtain various quantities describing risk within financial institutions, e.g.
discounted insurance claims or the embedded/appraisal value of a company.
Typically these distribution functions are rather involved, mainly for two
reasons. First, the sum of random variables whose marginal distributions
belong to a given distribution class does not, in general, belong to that
class itself. Second, the stochastic dependence between the elements in the
sum precludes convolution and complicates matters considerably.
Consequently, in order to compute functionals of sums of dependent
random variables, approximation methods are generally indispensable. Provided that the whole dependence structure is known, one can use Monte
Carlo simulation to obtain empirical distribution functions. However, this
is typically a time-consuming approach, in particular if we want to approximate tail probabilities, which would require an excessive number of
simulations. Therefore, alternative methods need to be explored. In this
thesis we discuss the most frequently used approximation techniques for reserving applications.
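As a baseline for comparison, the Monte Carlo approach just described can be sketched in a few lines. The payment and return distributions below (lognormal payments, i.i.d. normal yearly log-returns) and all parameter values are illustrative assumptions of our own, not prescriptions from the thesis:

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_sim = 20, 50_000

# Illustrative assumptions: lognormal payments X_i and i.i.d.
# normal yearly log-returns driving the discount factors.
X = rng.lognormal(mean=3.0, sigma=0.4, size=(n_sim, n))
log_returns = rng.normal(loc=0.05, scale=0.10, size=(n_sim, n))

# V_i = exp(-(Y_1 + ... + Y_i)) is the discount factor over [0, t_i].
V = np.exp(-np.cumsum(log_returns, axis=1))

# Present value S = sum_i X_i V_i, one value per simulated scenario.
S = (X * V).sum(axis=1)
print(S.mean(), np.quantile(S, 0.995))
```

Estimating the 99.5% quantile this way already illustrates the drawback mentioned above: far more scenarios are needed for a stable tail estimate than for a stable mean.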
The central idea in this work is the concept of comonotonicity. We
propose to solve the problem described above by calculating upper and
lower bounds for the sum of dependent random variables that make efficient
use of the available information. These bounds are based on a general
technique for deriving lower and upper bounds for stop-loss premiums of
sums of dependent random variables, as explained in Kaas et al. (2000) and
Dhaene et al. (2002a,b), among others.
The first approximation we will consider for the distribution function of
the discounted reserve is derived by replacing the dependence structure
between the random variables involved by a comonotonic dependence
structure. In this way the multi-dimensional problem is reduced to a
two-dimensional one which can easily be solved by conditioning and using
some numerical techniques. This approach is plausible in actuarial
applications because it leads to prudent and conservative values of the
reserves and the solvency margin. If the dependence structure between
the summands of S is strong enough, this upper bound in convex order
performs reasonably well.
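The computational appeal of the comonotonic upper bound lies in the additivity of quantiles for a comonotonic sum: the p-th quantile of the sum equals the sum of the marginal p-th quantiles. A small numerical check of this property, with arbitrary lognormal marginals of our own choosing:

```python
import numpy as np
from scipy.stats import lognorm

# Arbitrary lognormal marginals standing in for the terms of S.
sigmas = [0.3, 0.5, 0.7]
scales = [np.exp(mu) for mu in (1.0, 0.8, 0.5)]
p = 0.99

# Quantile of the comonotonic sum = sum of the marginal quantiles.
q_upper = sum(lognorm.ppf(p, s, scale=sc) for s, sc in zip(sigmas, scales))

# Simulate the comonotonic vector directly: one common uniform U
# drives every component via X_i = F_i^{-1}(U).
rng = np.random.default_rng(2)
u = rng.uniform(size=200_000)
s_c = sum(lognorm.ppf(u, s, scale=sc) for s, sc in zip(sigmas, scales))
print(q_upper, np.quantile(s_c, p))  # the two agree closely
```

This is what reduces the multi-dimensional problem to one-dimensional quantile evaluations: no convolution of the marginals is ever needed.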
The second approximation, derived by taking conditional expectations,
retains part of the dependence structure. This lower bound in convex order
turns out to be extremely useful for evaluating the quality of the
approximation provided by the upper bound. The lower bound can also be
applied as an approximation of the underlying distribution itself. This
choice is not actuarially prudent; however, its relative error is typically
much smaller than that of the upper bound. For this reason, the lower bound
is preferable in applications which require highly precise approximations,
such as the pricing of exotic derivatives (e.g. Decamps et al. (2004),
Deelstra et al. (2004) and Vyncke et al. (2004)) or optimal portfolio
selection problems (e.g. Dhaene et al. (2005)).
This thesis is set out as follows.
The first chapter recalls the basics of actuarial risk theory. We define
some frequently used measures of dependence and the most important orderings of risks for actuarial applications. We further introduce several
well-known risk measures and the relations that hold between them. We
summarize properties of these risk measures that can be used to facilitate
decision-taking. Finally, we provide theoretical background for the concept of comonotonicity and we review the most important properties of
comonotonic risks.
In Chapter 2 we recall how the comonotonic bounds can be derived and
illustrate the theoretical results by means of an application in the context of discounted loss reserves. The advantage of working with a sum of
comonotonic variables is that the distribution of such a sum is easy to
calculate. In particular, this technique is very useful for finding
reliable estimates of upper quantiles and stop-loss premiums.
In practical applications the comonotonic upper bound seems to be
useful only in the case of a very strong dependency between successive
summands. Even then the bounds for stop-loss premiums provided by
the comonotonic approximation are often not satisfactory. In this chapter
we present a number of techniques which allow to determine much more
efficient upper bounds for stop-loss premiums. To this end, we use on the
one hand the method of conditioning as in Curran (1994) and in Rogers
and Shi (1995), and on the other hand the upper and lower bounds for
stop-loss premiums of sums of dependent random variables. We show also
how to apply the results to the case of sums of lognormally distributed
random variables. Such sums are widely encountered in practice, both in
actuarial science and in finance.
We derive comonotonic approximations for the scalar product of random vectors of the form (1) and explain a general procedure to obtain
accurate estimates for quantiles and stop-loss premiums. We study the
distribution of the present value function of a series of random payments
in a stochastic financial environment described by a lognormal discounting
process. Such distributions occur naturally in a wide range of applications
within the fields of insurance and finance. Accurate approximations are obtained by developing upper and lower bounds in the convex order sense for
such present value functions. Finally, we consider several applications for
discounted claim processes under the Black & Scholes setting. In particular
we analyze in detail the cases when the random variables Xi denote insurance losses modelled by lognormal, normal (more general: elliptical) and
gamma or inverse Gaussian (more general: tempered stable) distributions.
As we demonstrate by means of a series of numerical illustrations, the
methodology provides an excellent framework to get accurate and easily
obtainable approximations of distribution functions for random variables
of the form (1).
Chapters 3 and 4 apply the results obtained to two important reserving
problems in the insurance business and illustrate them numerically.
In Chapter 3 we consider an important application in the life insurance
business. We aim to provide some conservative estimates both for high
quantiles and stop-loss premiums for a single life annuity and for a whole
portfolio. We focus here only on life annuities; however, similar techniques
may be used to obtain analogous estimates for more general life contingencies.
Our solution makes it possible to solve, with great accuracy, personal
finance problems such as: how much does one need to invest now to ensure,
given a periodical (e.g. yearly) consumption pattern, that the probability
of outliving one’s money is very small (e.g. less than 1%)?
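In the notation of equation (1), this question amounts to finding a high quantile of the present value of the consumption stream. A simulation sketch under purely illustrative assumptions of ours (constant yearly consumption, i.i.d. normal log-returns, and a fixed horizon in place of a random lifetime):

```python
import numpy as np

rng = np.random.default_rng(3)
consumption = 20_000.0   # yearly consumption (illustrative figure)
horizon = 40             # years; a real model uses a random lifetime
n_sim = 100_000

# Stochastic discount factors V_i from i.i.d. normal log-returns.
log_returns = rng.normal(loc=0.04, scale=0.12, size=(n_sim, horizon))
V = np.exp(-np.cumsum(log_returns, axis=1))

# Present value of the consumption stream in each scenario.
pv = (consumption * V).sum(axis=1)

# Investing the 99% quantile now keeps the probability of
# outliving one's money at (approximately) 1% or less.
required = np.quantile(pv, 0.99)
print(required)
```

The thesis replaces the fixed horizon with stochastic survival, which turns the problem into exactly the scalar product of mutually independent random vectors discussed above.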
The case of a portfolio of life annuity policies has been studied
extensively in the literature, but only in the limiting case of homogeneous
portfolios, where the mortality risk is fully diversified. However, the
applicability of these results in insurance practice may be questioned:
especially in the life annuity business, a typical portfolio does not
contain enough policies to speak of full diversification. For this reason we
propose to approximate the number of active policies in subsequent years
using a normal power distribution (by fitting the first three moments of
the corresponding binomial distributions) and to model the present value
of future benefits as a scalar product of mutually independent random
vectors.
Chapter 4 focuses on the claims reserving problem. To get a correct
picture of its liabilities, a company should set aside a correctly estimated
amount to meet claims arising in the future on the policies it has written.
The past data used to construct estimates of the future payments consist of
a triangle of incremental claims.
The purpose is to complete this run-off triangle to a square, and even
to a rectangle if estimates are required pertaining to development years of
which no data are recorded in the run-off triangle at hand. To this end, the
actuary can make use of a variety of techniques. The inherent uncertainty
is described by the distribution of possible outcomes, and one needs to
arrive at the best estimate of the reserve. In this chapter we look at the
discounted reserve and impose an explicit margin based on a risk measure
from the distribution of the total discounted reserve. We will model the
claim payments using lognormal linear, loglinear location-scale and generalized linear models, and derive accurate comonotonic approximations for
the discounted loss reserve.
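Among the variety of techniques available to the actuary, the classical chain-ladder method completes the triangle by projecting each origin year forward with volume-weighted development factors. A minimal sketch on a toy cumulative triangle (the figures are invented for illustration; the thesis works with regression models rather than this deterministic recipe):

```python
import numpy as np

# Toy cumulative claims triangle (rows: origin years, columns:
# development years); NaN marks the future cells to be predicted.
tri = np.array([
    [100., 150., 170., 180.],
    [110., 168., 190., np.nan],
    [120., 175., np.nan, np.nan],
    [130., np.nan, np.nan, np.nan],
])
# Latest observed diagonal, recorded before any cells are filled.
latest = np.array([tri[i, ~np.isnan(tri[i])][-1] for i in range(len(tri))])

# Complete the triangle column by column with volume-weighted
# development factors estimated from the observed rows.
for j in range(tri.shape[1] - 1):
    obs = ~np.isnan(tri[:, j + 1])
    f = tri[obs, j + 1].sum() / tri[obs, j].sum()
    fill = np.isnan(tri[:, j + 1])
    tri[fill, j + 1] = tri[fill, j] * f

# Reserve = projected ultimates minus the latest observed diagonal.
reserve = tri[:, -1].sum() - latest.sum()
print(round(reserve, 2))
```

The point of the stochastic models in this chapter is precisely to attach a distribution, and hence risk measures, to this single point estimate.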
The bootstrap technique has proved to be a very useful tool in many
statistical applications and can be particularly interesting to assess the
variability of the claim reserving predictions and to construct upper limits
at an adequate confidence level. Its popularity is due to a combination of
available computing power and theoretical development. One advantage of
the bootstrap is that the technique can be applied to any data set without
having to assume an underlying distribution. Moreover, most computer
packages can handle very large numbers of repeated samplings, and this
should not limit the accuracy of the bootstrap estimates.
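The central resampling idea can be illustrated in its simplest, nonparametric form: resample the observed data with replacement, recompute the statistic of interest on each resample, and read confidence limits off the resulting empirical distribution. (Chapter 4 applies more refined, residual-based variants to run-off triangles; the data below are invented.)

```python
import numpy as np

rng = np.random.default_rng(4)
# Toy sample of observed claim sizes; no distribution is assumed.
claims = rng.lognormal(mean=7.0, sigma=0.8, size=60)

n_boot = 5_000
means = np.empty(n_boot)
for b in range(n_boot):
    # Resample the data with replacement and recompute the statistic.
    resample = rng.choice(claims, size=claims.size, replace=True)
    means[b] = resample.mean()

# Percentile bootstrap: an upper limit at 95% confidence for the
# mean claim size, read directly from the bootstrap distribution.
upper_95 = np.quantile(means, 0.95)
print(claims.mean(), upper_95)
```

Nothing in the procedure depends on the statistic being a mean; the same loop yields confidence limits for reserve predictions once the refitting step is put inside it.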
In the last chapter we derive, review and discuss some other methods
for obtaining approximations for S. In the first section we recall two
well-known moment matching approximations: the lognormal and the reciprocal
gamma approximation. Practitioners often use a moment matching lognormal
approximation for the distribution of S. The lognormal and reciprocal gamma
approximations are chosen such that their first two moments are equal to the
corresponding moments of S.
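For the lognormal case, matching the first two moments reduces to two closed-form equations: if E[S] = m1 and E[S²] = m2, then σ² = ln(m2/m1²) and μ = ln m1 − σ²/2. A sketch, checked against a simulated dependent sum of our own construction standing in for S:

```python
import numpy as np

def lognormal_moment_match(m1, m2):
    """Parameters (mu, sigma) of the lognormal whose first two raw
    moments are m1 = E[S] and m2 = E[S^2]."""
    sigma2 = np.log(m2 / m1**2)
    mu = np.log(m1) - sigma2 / 2.0
    return mu, np.sqrt(sigma2)

# A simulated sum of dependent lognormal terms (common shock z)
# standing in for S; the parameters are illustrative.
rng = np.random.default_rng(5)
z = rng.normal(size=(100_000, 1))                       # common shock
w = rng.normal(size=(100_000, 5))                       # idiosyncratic
S = np.exp(0.5 * z + 0.3 * w).sum(axis=1)

mu, sigma = lognormal_moment_match(S.mean(), (S**2).mean())
approx = rng.lognormal(mu, sigma, size=100_000)
print(S.mean(), approx.mean())   # first moments agree by construction
print(S.var(), approx.var())     # so do the variances
```

The reciprocal gamma version follows the same pattern with the corresponding closed-form moment equations.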
Although the comonotonic bounds in convex order have proven to be
good approximations in case the variance of the random sum is sufficiently
small, they perform much worse when the variance gets large. In actuarial
applications it is often merely the tail of the distribution function that is
of interest. Indeed, one may think of Value-at-Risk, Conditional Tail Expectation or Expected Shortfall estimations. Therefore, approximations
for functionals of sums of dependent random variables may alternatively
be obtained through the use of asymptotic relations. Although asymptotic
results are valid at infinity, they may as well serve as approximations near
infinity. We establish some asymptotic results for the tail probability of
a sum of heavy-tailed dependent random variables. In particular, we derive an asymptotic result for the randomly weighted sum of a sequence of
non-negative numbers. Furthermore, we establish under two different sets
of conditions, an asymptotic result for the randomly weighted sum of a
sequence of independent random variables that consist of a random and a
deterministic component. Throughout, the random weights are products of
i.i.d. random variables and thus exhibit an explicit dependence structure.
Since the early 1990s, statistics has seen an explosion in applied Bayesian
research. This explosion has had little to do with a warming of the statistics
and econometrics communities to the theoretical foundations of Bayesianism,
or with a sudden awakening to the merits of the Bayesian approach over
frequentist methods; instead it can be explained primarily on pragmatic
grounds. Bayesian inference is the process of fitting a probability
model to a set of data and summarizing the result by a probability distribution on the parameters of the model and on unobserved quantities such
as predictions for new observations. Simple simulation methods exist to
draw samples from posterior and predictive distributions, automatically
incorporating uncertainty in the model parameters. An advantage of the
Bayesian approach is that we can compute, using simulation, the posterior
predictive distribution for any data summary, so we do not need to put a
lot of effort into estimating the sampling distribution of test statistics. The
development of powerful computational tools (and the realization that existing statistical tools could prove quite useful for fitting Bayesian models)
has drawn a number of researchers to use the Bayesian approach in practice. Indeed, the use of such tools often enables researchers to estimate
complicated statistical models that would be quite difficult, if not virtually impossible, using standard frequentist techniques. The purpose of this
third section is to sketch, in very broad terms, basic elements of Bayesian
computation.
Finally, we compare these approximations with the comonotonic approximations of the previous chapter in the context of claims reserving. In
case the underlying variance of the statistical and financial part of the discounted IBNR reserve gets large, the comonotonic approximations perform
worse. We will illustrate this observation by means of a simple example
and propose to solve this problem using the derived asymptotic results for
the tail probability of a sum of dependent random variables, in the presence
of heavy-tailedness conditions. These approximations are compared with
the lognormal moment matching approximations. We finally consider the
distribution of the discounted loss reserve when the data in the run-off triangle is modelled by a generalized linear model and compare the outcomes
of the Bayesian approach with the comonotonic approximations.
Publications
• Ahcan A., Darkiewicz G., Hoedemakers T., Dhaene J. and Goovaerts
M.J. (2004), “Optimal portfolio selection: Applications in insurance
business”, Proceedings of the 8th International Congress on Insurance: Mathematics & Economics, June 14-16, Rome, pp. 40.
• Ahcan A., Darkiewicz G., Goovaerts M.J. and Hoedemakers T. (2005),
“Computation of convex bounds for present value functions of random payments”, Journal of Computational and Applied Mathematics, to be published.
• Antonio K., Goovaerts M.J. and Hoedemakers T. (2004), “On the
distribution of discounted loss reserves”, Medium Econometrische
Toepassingen, vol. 12, no. 2, pp. 14-18.
• Antonio K., Beirlant J. and Hoedemakers T. (2005), Discussion of
“A Bayesian generalized linear model for the Bornhuetter-Ferguson
method of claims reserving” by Richard Verrall, North American
Actuarial Journal, to be published.
• Antonio K., Beirlant J., Hoedemakers T. and Verlaak R. (2005),
“On the use of general linear mixed models in loss reserving”, North
American Actuarial Journal, submitted.
• Darkiewicz G. and Hoedemakers T. (2005), “How the co-integration
analysis can help in mortality forecasting”, British Actuarial Journal,
submitted.
• Hoedemakers T., Beirlant J., Goovaerts M.J. and Dhaene J. (2003),
“Confidence bounds for discounted loss reserves”, Insurance: Mathematics & Economics, vol. 33, no. 2, pp. 297-316.
• Hoedemakers T. and Goovaerts M.J. (2004), Discussion of “Risk and
discounted loss reserves” by Greg Taylor, North American Actuarial
Journal, vol. 8, no. 4, pp. 146-150.
• Hoedemakers T., Beirlant J., Goovaerts M.J. and Dhaene J. (2005),
“On the distribution of discounted loss reserves using generalized
linear models”, Scandinavian Actuarial Journal, vol. 2005, no. 1, pp.
25-45.
• Hoedemakers T., Darkiewicz G. and Goovaerts M.J. (2005), “Approximations for life annuity contracts in a stochastic financial environment”, Insurance: Mathematics & Economics, to be published.
• Hoedemakers T., Darkiewicz G., Deelstra G., Dhaene J. and Vanmaele M. (2005), “Bounds for stop-loss premiums of stochastic sums
(with applications to life contingencies)”, Scandinavian Actuarial
Journal, submitted.
• Hoedemakers T., Goovaerts M.J. and Dhaene J. (2003), “IBNR problematiek in historisch perspectief”, De Actuaris, vol. 11, no. 2, pp.
27-29.
• Hoedemakers T., Goovaerts M.J. and Dhaene J. (2004), “De IBNR-discussie”, De Actuaris, vol. 11, no. 4, pp. 26-29.
• Laeven R.J.A., Goovaerts M.J. and Hoedemakers T. (2005), “Some
asymptotic results for sums of dependent random variables with actuarial applications”, Insurance: Mathematics & Economics, to be
published.
• Vanduffel S., Hoedemakers T. and Dhaene J. (2005), “Comparing
approximations for risk measures of sums of non-independent lognormal random variables”, North American Actuarial Journal, to be
published.
List of abbreviations and symbols

Abbreviation or symbol : Explanation

ARMA(p, q) : AutoRegressive-Moving Average process of order (p, q)
cdf : cumulative distribution function
c.f. : characteristic function
CLT : Central Limit Theorem
Corr(X, Y) = r(X, Y) : Pearson’s correlation coefficient between the r.v.’s X and Y
Cov[X, Y] : covariance between the r.v.’s X and Y
D : class of dominatedly varying functions
d.f. : distribution function
E : exponential r.v.
E_n(µ, Σ, φ) : n-dimensional elliptical distribution with parameters µ, Σ and φ
F : d.f. and distribution of a r.v.
F̄ : tail of the d.f. F: F̄ = 1 − F
F^{*n} : n-fold convolution of the d.f. or distribution F
Γ(x) : gamma function: Γ(x) = ∫_0^∞ t^{x−1} e^{−t} dt, x > 0
Gamma(a, b) : gamma distribution with parameters a and b: f(x) = b^a (Γ(a))^{−1} x^{a−1} e^{−bx}, x ≥ 0
I(a, x) : incomplete gamma function: Γ_a(x) = (Γ(a))^{−1} ∫_x^∞ e^{−t} t^{a−1} dt, x ≥ 0
GPD : Generalized Pareto Distribution
I(.) : indicator function: I(c) = 1 if the condition c is true and I(c) = 0 if it is not
i.i.d. : independent, identically distributed
L : class of long-tailed distributions
LLN : Law of Large Numbers
logN(µ, σ²) : lognormal distribution with parameters µ and σ²: f(x) = (xσ√(2π))^{−1} e^{−(log x − µ)²/(2σ²)}, x > 0
MLE : Maximum Likelihood Estimator
N(µ, σ²), N(µ, Σ) : Gaussian (normal) distribution with mean µ and variance σ² or covariance matrix Σ
N(0, 1) : standard normal distribution
o(1) : a(x) = o(b(x)) as x → x0 means that lim_{x→x0} a(x)/b(x) = 0
O(1) : a(x) = O(b(x)) as x → x0 means that lim_{x→x0} |a(x)/b(x)| < ∞
ϕ_X(t) : c.f. of the r.v. X: ϕ_X(t) = E[e^{itX}]
Φ(.) : the cdf of the standard normal r.v.
lim sup_{n→∞}(x_n) : limit superior of the bounded sequence {x_n}: = lim(s_n), where s_n = sup_{k≥n} x_k = sup{x_n, x_{n+1}, . . .}
lim inf_{n→∞}(x_n) : limit inferior of the bounded sequence {x_n}: = lim(t_n), where t_n = inf_{k≥n} x_k = inf{x_n, x_{n+1}, . . .}
p.d.f. : probability density function
Pr[.] : probability measure
p(.|.) : conditional probability density
p(.) : marginal distribution
R : class of the d.f.’s with regularly varying right tail
R_α : class of the regularly varying functions with index α
R_{−∞} : class of the rapidly varying functions
r.v. : random variable
S : class of the subexponential distributions
σ²_X : variance of the r.v. X
σ_{X_i X_j} : Cov[X_i, X_j]
sign(a) : sign of the real number a
TS(δ, a, b) : tempered stable law with parameters δ, a and b
U(a, b) : uniform random variable on (a, b)
UMVUE : Uniformly Minimum Variance Unbiased Estimator
Var[X] : variance of the r.v. X
∼ : a(x) ∼ b(x) as x → x0 means that lim_{x→x0} a(x)/b(x) = 1; a(x) ∼ 0 means a(x) = o(1)
≈ : a(x) ≈ b(x) as x → x0 means that a(x) is approximately (roughly) of the same order as b(x) as x → x0; it is only used in a heuristic sense
≍ : a(x) ≍ b(x) as x → x0 means that 0 < lim inf_{x→x0} a(x)/b(x) ≤ lim sup_{x→x0} a(x)/b(x) < ∞
→_d : convergence in distribution
=_d : equal in distribution
⌊.⌋ : floor function: ⌊x⌋ is the largest integer less than or equal to x
⌈.⌉ : ceiling function: ⌈x⌉ is the smallest integer greater than or equal to x
(x − d)_+ : max(x − d, 0)
=: or := : notation
Chapter 1
Risk and comonotonicity in the actuarial world
Summary In order to make decisions one has to evaluate the (distribution function of the) multivariate risk (or random variable) one faces.
In this chapter we recall the basics of actuarial risk theory. We define
some frequently used measures of dependence and the most important orderings of risks for actuarial applications. We further introduce several
well-known risk measures and the relations that hold between them. We
summarize properties of these risk measures that can be used to facilitate
decision-making. Finally, we provide theoretical background for the concept of comonotonicity and we review the most important properties of
comonotonic risks.
1.1 Fundamental concepts in actuarial risk theory
In this section we briefly recall the most important concepts in actuarial risk theory. The study of dependence has become of major concern
in actuarial research. We start by defining three important measures of
dependence: Pearson’s correlation coefficient, Kendall’s τ and Spearman’s
ρ. Once dependence measures are defined, one could use them to compare
the strength of dependence between random variables.
The determination of capital requirements for an insurance company
is a complex and non-trivial task. From their nature, capital requirements
are numeric values expressed in monetary units and based on quantifiable
measures of risks. Formally a risk measure is defined as a mapping from
the set of risks at hand to the real numbers. In other words, with any
potential loss X one associates a real number ρ[X]. Thus a risk measure
summarizes the riskiness of the underlying distribution in one single number. Usually such quantification serves as a risk management tool (e.g. an
insurance premium or an economic capital), but it can be also helpful in
overall decision making. We review and place the four popular risk measures (Value-at-Risk, Tail Value-at-Risk, Conditional Tail Expectation and
Expected Shortfall) in their context.
In the actuarial literature, orderings of risks are an important tool for
comparing the attractiveness of different risks. The essential tool for the
comparison of different concepts of orderings of risks will be the stop-loss
transform/premium and its properties. In the actuarial literature it is a
common feature to replace a risk by a “less favorable” risk that has a
simpler structure, making it easier to determine the distribution function.
We clarify what we mean with a “less favorable” risk and define the three
most important orderings of risks for actuarial applications: stochastic
dominance, stop-loss order and convex order.
This chapter is essentially based on Dhaene, Denuit, Goovaerts, Kaas &
Vyncke (2002a) and Dhaene, Vanduffel, Tang, Goovaerts, Kaas & Vyncke
(2004).
1.1.1 Dependent risks
In risk theory, all the random variables are traditionally assumed to be
mutually independent. It is clear that this assumption is made for mathematical convenience. In some situations, however, insured risks tend to act
similarly. The independence assumption is then violated and is not an adequate way to describe the relations between the different random variables
involved. The individual risks of an earthquake or flooding risk portfolio
which are located in the same geographic area are correlated, since individual claims are contingent on the occurrence and severity of the same
earthquake or flood. On a foggy day all cars of a region have higher probability to be involved in an accident. During dry hot summers, all wooden
cottages are more exposed to fire. More generally, one can say that if the
density of insured risks in a certain area or organization is high enough,
then catastrophes such as storms, explosions, earthquakes, epidemics and
so on can cause an accumulation of claims for the insurer. In life insurance,
there is ample evidence that the lifetimes of husbands and their wives are
positively associated. There may be certain selection mechanisms in the
matching of couples (“birds of a feather flock together”): both partners
often belong to the same social class and have the same life style. Further, it is known that the mortality rate increases after the passing away
of one’s spouse (the “broken heart syndrome”). These phenomena have
implications on the valuation of aggregate claims in life insurance portfolios. Another example in a life insurance context is a pension fund that
covers the pensions of persons working for the same company. These persons work at the same location, they take the same flights. It is evident
that the mortality of these persons will be dependent, at least to a certain
extent.
The study of dependence has become of major concern in actuarial
research. There are a variety of ways to measure dependence.
First, Pearson’s product-moment correlation coefficient captures the linear dependence between couples of random variables. For a random couple (X1, X2) having marginals with finite variances, Pearson’s product-moment correlation coefficient is defined by

    Corr(X1, X2) = Cov[X1, X2] / √(Var[X1] Var[X2]).
Pearson’s correlation coefficient contains information on both the strength
and direction of a linear relationship between two random variables. If one
variable is an exact linear function of the other variable, a positive relationship leads to correlation coefficient 1, while a negative relationship leads
to correlation coefficient −1. If there is no linear predictability between
the two variables, the correlation is 0.
Kendall’s τ is a nonparametric measure of association based on the
probabilities of concordances and discordances in paired observations. Concordance occurs when paired observations vary together, and discordance
occurs when paired observations vary differently. Specifically, Kendall’s τ
for a random couple (X1 , X2 ) of random variables with continuous cdf’s is
defined as
    τ(X1, X2) = Pr[(X1 − X1′)(X2 − X2′) > 0] − Pr[(X1 − X1′)(X2 − X2′) < 0]
              = 2 Pr[(X1 − X1′)(X2 − X2′) > 0] − 1,
where (X1′, X2′) is an independent copy of (X1, X2).
Contrary to Pearson’s r, Kendall’s τ is invariant under strictly monotone transformations: if φ1 and φ2 are strictly increasing (or decreasing) functions on the supports of X1 and X2, respectively, then τ(φ1(X1), φ2(X2)) = τ(X1, X2), provided the cdf’s of X1 and X2 are continuous. Further, X1 and X2 are perfectly dependent if and only if |τ(X1, X2)| = 1.
Another very useful dependence measure is Spearman’s ρ. The idea
behind this dependence measure is very simple. Given random variables X1
and X2 with continuous cdf’s FX1 and FX2 , we first create U1 = FX1 (X1 )
and U2 = FX2 (X2 ), which are uniformly distributed over [0, 1] and then
use Pearson’s r. Spearman’s ρ is thus defined as ρ(X1 , X2 ) = r(U1 , U2 ).
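All three measures can be estimated directly from paired observations. The sketch below is a minimal, self-contained illustration (the helper functions and the toy data are ours, not from the thesis); it shows that Kendall’s τ and Spearman’s ρ see a perfectly monotone but non-linear relationship as perfect dependence, while Pearson’s r does not:

```python
import statistics
from itertools import combinations

def pearson_r(xs, ys):
    # Pearson's product-moment correlation coefficient
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def kendall_tau(xs, ys):
    # Sample tau: (concordant pairs - discordant pairs) / total pairs
    conc = disc = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(xs, ys)), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            conc += 1
        elif s < 0:
            disc += 1
    n = len(xs)
    return (conc - disc) / (n * (n - 1) / 2)

def spearman_rho(xs, ys):
    # Pearson's r applied to ranks (no ties handled; for illustration only)
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order):
            r[i] = float(rank + 1)
        return r
    return pearson_r(ranks(xs), ranks(ys))

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [x ** 2 for x in xs]          # strictly increasing transform of xs
print(kendall_tau(xs, ys))         # 1.0: invariant under monotone transforms
print(spearman_rho(xs, ys))        # 1.0 (up to rounding)
print(pearson_r(xs, ys))           # < 1: the relationship is not linear
```

Applying a strictly increasing transform to one coordinate leaves τ and ρ unchanged, exactly as the invariance property above states.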
Dependence measures can be used to compare the strength of dependence between random variables.
1.1.2 Risk measures
Measuring risk and measuring preferences are not the same. When ordering preferences, activities (for example, alternatives A and B with financial consequences XA and XB) are compared in order of preference under conditions of risk. A preference order A ≻ B means that A is preferred to B. This order is represented by a preference function Ψ with A ≻ B ⇔ Ψ[XA] > Ψ[XB]. In contrast, a risk order A ≻R B means that A is riskier than B and is represented by a function ρ with A ≻R B ⇔ ρ[XA] > ρ[XB]. Every such function ρ is called a risk measure.
Models in actuarial science are used both for quantifying risks and
for pricing risks. Quantifying risk requires a risk measure to convert a
random future gain or loss into a certainty equivalent that can then be
used to order different risks and for decision-making purposes. In order
to quantify risk, it is necessary to specify the probability distributions of
the risks involved and to apply a preference function to these probability
distributions. Thus, this process involves both statistical assumptions and
economic assumptions. Individuals are assumed to be risk averse and to
have a preference to diversify risks.
Banks and regulatory agencies use monetary measures of risk to assess
the risk taken by financial investors; important examples are given by the
so-called Value-at-Risk and Tail Value-at-Risk.
Two-sided risk measures measure the magnitude of the distance (in
both directions) from X to E[X]. Different functions of distance lead to
different risk measures. Looking, for instance, at quadratic deviations,
this leads to the risk measure variance or to the risk measure standard
deviation. These risk measures have been the traditional measures in economics and finance since the pioneering work of Markowitz. They exhibit
a number of nice technical properties. For instance, the variance of a portfolio return is the sum of the variances and covariances of the individual
returns. Furthermore, the variance is used as a standard optimization
function (quadratic optimization).
On the other hand, a two-sided risk measure contradicts the intuitive
notion of risk that only negative deviations are dangerous. In addition
variance does not account for fat tails of the underlying distribution and
for the corresponding tail risk. For this reason, people include higher
(normalized) central moments, as for example, skewness and kurtosis, into
the analysis to assess risk more properly.
Perhaps the most popular risk measure is the Value-at-Risk (VaR). Let
L be the potential loss of a financial position. The VaR at confidence level
p (0 < p < 1) is then defined by the requirement
    Pr[L > VaRp[L]] = 1 − p.    (1.1)
An intuitive interpretation of the VaR is that of a probable maximum loss or, more concretely, a 100 × p% maximal loss, because Pr[L ≤ VaRp[L]] = p, which means that in 100 × p% of the cases the loss is smaller than or equal to
relation (1.1) implies that this capital will, on average, not be exhausted in
100 × p% of the cases. Obviously, the VaR is identical to the p-quantile of
the loss distribution, that is VaRp [L] = FL−1 (p). It is important to remark
that the VaR does not take into account the severity of potential losses
in the 100 × (1 − p)% worst cases. A regulator for instance is not only
concerned with the frequency of default, but also about the severity of
default. Also shareholders and management should be concerned with the
question “how bad is bad?” when they want to evaluate the risks at hand
in a consistent way. Therefore, one often uses another risk measure which
is called the Tail Value-at-Risk (TVaR), defined by

    TVaRp[L] = (1/(1 − p)) ∫_p^1 VaRq[L] dq,    p ∈ (0, 1).
It is the arithmetic average of the quantiles of L, from p on. Note that the
TVaR is always larger than the corresponding VaR.
We will define the other popular risk measures in terms of L for a
better comparison to the VaR. The Conditional Tail Expectation (CTE)
at confidence level p is defined by
    CTEp[L] = E[L | L > VaRp[L]],    p ∈ (0, 1).
On the basis of the interpretation of the VaR as a 100 × p%-maximum
loss, the CTE can be interpreted as the average maximal loss in the worst
100 × (1 − p)% cases. Notice that in case of continuous distributions the
CTE and TVaR coincide.
Measures of shortfall risk are one-sided risk measures: they measure the shortfall risk relative to a target variable. This may be the expected
value, but in general, it is an arbitrary deterministic target or a stochastic
benchmark. The Expected Shortfall (ESF) at confidence level p is defined
as

    ESFp[L] = E[max(L − VaRp[L], 0)],    p ∈ (0, 1).
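Empirically, the four measures can be estimated from a loss sample. The sketch below is an illustrative implementation of ours (not code from the thesis): the VaR is the empirical p-quantile, the ESF is an average of excesses, and the TVaR is obtained through the identity TVaRp = VaRp + ESFp/(1 − p) of Theorem 1 below.

```python
import math

def var_p(losses, p):
    # Empirical VaR: smallest observation x with F_n(x) >= p
    xs = sorted(losses)
    return xs[max(math.ceil(p * len(xs)) - 1, 0)]

def esf_p(losses, p):
    # Expected Shortfall: E[(L - VaR_p[L])_+]
    v = var_p(losses, p)
    return sum(max(l - v, 0.0) for l in losses) / len(losses)

def tvar_p(losses, p):
    # Tail VaR via the identity TVaR_p = VaR_p + ESF_p / (1 - p)
    return var_p(losses, p) + esf_p(losses, p) / (1.0 - p)

def cte_p(losses, p):
    # Conditional Tail Expectation: average loss given L > VaR_p[L]
    v = var_p(losses, p)
    tail = [l for l in losses if l > v]
    return sum(tail) / len(tail)

losses = [float(k) for k in range(1, 101)]   # toy loss sample 1, ..., 100
print(var_p(losses, 0.95))    # 95.0
print(cte_p(losses, 0.95))    # 98.0
print(tvar_p(losses, 0.95))   # ~98.0: TVaR and CTE coincide here
```

For this sample the worst 5% of outcomes are 96, ..., 100, so the CTE is their average, 98, in line with its interpretation as the average maximal loss in the worst 100 × (1 − p)% cases.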
The following relations hold between the four risk measures defined above.
Theorem 1 (Relation between VaR, TVaR, CTE and ESF).
For p ∈ (0, 1), we have that

    TVaRp[X] = VaRp[X] + (1/(1 − p)) ESFp[X],
    CTEp[X] = VaRp[X] + ESFp[X] / (1 − FX(VaRp[X])),
    CTEp[X] = TVaR_{FX(VaRp[X])}[X].
Proof. See Dhaene et al. (2004).
Researchers have always aimed to find a set of properties (axioms) that any
risk measure should satisfy. Recently the class of coherent risk measures,
introduced in Artzner (1999) and Artzner et al. (1999), has drawn a lot
of attention in the actuarial literature. The authors postulated that every
‘coherent’ risk measure should satisfy the following four properties:
1. monotonicity, i.e. X ≤ Y ⇒ ρ[X] ≤ ρ[Y];
2. subadditivity, i.e. ρ[X + Y] ≤ ρ[X] + ρ[Y];
3. translation invariance, i.e. ρ[X + c] = ρ[X] + c for all c ∈ R;
4. positive homogeneity, i.e. ρ[aX] = aρ[X] for all a ≥ 0.
It can be demonstrated that the Value-at-Risk and the Expected Shortfall are in general not subadditive. On the other hand, the TVaR is subadditive. The desirability of the subadditivity property of risk measures has
been a major topic for research and discussion. Some researchers believe
that the axiom of subadditivity of risk measures used to determine the
solvency capital, reflects the risk diversification. However other authors
argue that the diversification benefits should be considered rather in terms
of subadditivity of the corresponding shortfalls.
It is an open question whether the coherent set of axioms is indeed the
‘best one’. For a relevant discussion we refer to e.g. Dhaene et al. (2003),
Goovaerts et al. (2003, 2004) and Darkiewicz et al. (2005a). It should
be noted that in spite of the disagreement in the scientific community
about the axioms of coherency, a lot of well-known risk measures satisfy
conditions (1)-(4) (e.g. the TVaR).
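That the VaR fails subadditivity can be seen with a deliberately simple two-point loss. The sketch below is an illustrative construction of ours (the numbers are chosen only for the demonstration): each stand-alone loss has a 95%-VaR of zero, yet the VaR of the sum of two independent copies is strictly positive.

```python
import itertools

def var_discrete(dist, p):
    # dist: list of (loss, prob) pairs; VaR_p = smallest x with F(x) >= p
    cum = 0.0
    for x, pr in sorted(dist):
        cum += pr
        if cum >= p:
            return x
    return max(x for x, _ in dist)

X = [(0.0, 0.96), (100.0, 0.04)]        # a single loss: 100 w.p. 0.04, else 0

# Distribution of X + Y for two independent copies of the loss above
S = {}
for (x, px), (y, py) in itertools.product(X, X):
    S[x + y] = S.get(x + y, 0.0) + px * py
S = sorted(S.items())

p = 0.95
print(var_discrete(X, p))   # 0.0   for each stand-alone loss
print(var_discrete(S, p))   # 100.0 for the sum: VaR[X+Y] > VaR[X] + VaR[Y]
```

The pooled position is hit by at least one large loss with probability about 7.8% > 5%, so the 95%-VaR jumps to 100 while each marginal VaR is zero; merging the two positions looks worse under the VaR, contrary to the diversification intuition.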
The expressions for the discussed risk measures of normal and lognormal losses are given in the next two examples, which will be used in the
remainder of this thesis. For a proof of these examples, we refer to Dhaene
et al. (2004).
Example 1 (normal losses).
Consider a random variable X ∼ N(µ, σ²). The VaR, ESF and CTE at confidence level p (p ∈ (0, 1)) of X are given by

    VaRp[X] = µ + σΦ^{-1}(p),    (1.2)
    ESFp[X] = σφ(Φ^{-1}(p)) − σΦ^{-1}(p)(1 − p),    (1.3)
    CTEp[X] = µ + σφ(Φ^{-1}(p)) / (1 − p),    (1.4)

where φ(x) = Φ′(x) denotes the density function of the standard normal distribution.
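As a sanity check on (1.2)-(1.4), the sketch below evaluates them with Python’s standard library (`statistics.NormalDist` supplies Φ, φ and Φ^{-1}; the parameter values are arbitrary illustrative choices) and verifies the Theorem 1 relation CTEp[X] = VaRp[X] + ESFp[X]/(1 − p), which applies here because FX(VaRp[X]) = p for a continuous distribution:

```python
from statistics import NormalDist

Z = NormalDist()                      # standard normal: Z.pdf = phi, Z.cdf = Phi
mu, sigma, p = 2.0, 3.0, 0.99         # arbitrary illustrative parameters

z = Z.inv_cdf(p)                                  # Phi^{-1}(p)
var_p = mu + sigma * z                            # (1.2)
esf_p = sigma * Z.pdf(z) - sigma * z * (1 - p)    # (1.3)
cte_p = mu + sigma * Z.pdf(z) / (1 - p)           # (1.4)

# Theorem 1: CTE_p = VaR_p + ESF_p / (1 - F_X(VaR_p)), and F_X(VaR_p) = p here
assert abs(cte_p - (var_p + esf_p / (1 - p))) < 1e-9
print(var_p, cte_p)
```

The relation holds term by term: substituting (1.2) and (1.3) into VaRp + ESFp/(1 − p) cancels the σΦ^{-1}(p) contributions and leaves exactly (1.4).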
Example 2 (lognormal losses).
Consider a random variable X ∼ logN(µ, σ²). The VaR, ESF and CTE at confidence level p (p ∈ (0, 1)) of X are given by

    VaRp[X] = e^{µ + σΦ^{-1}(p)},    (1.5)
    ESFp[X] = e^{µ + σ²/2} Φ(σ − Φ^{-1}(p)) − e^{µ + σΦ^{-1}(p)} (1 − p),    (1.6)
    CTEp[X] = e^{µ + σ²/2} Φ(σ − Φ^{-1}(p)) / (1 − p).    (1.7)
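The lognormal expressions (1.5)-(1.7) admit the same consistency check as the normal case; the sketch below is again an illustration of ours with arbitrary parameter values, relying only on `statistics.NormalDist` for Φ and Φ^{-1}:

```python
import math
from statistics import NormalDist

Z = NormalDist()
mu, sigma, p = 0.5, 0.8, 0.975        # arbitrary illustrative parameters
z = Z.inv_cdf(p)                      # Phi^{-1}(p)

var_p = math.exp(mu + sigma * z)                                            # (1.5)
esf_p = math.exp(mu + sigma ** 2 / 2) * Z.cdf(sigma - z) - var_p * (1 - p)  # (1.6)
cte_p = math.exp(mu + sigma ** 2 / 2) * Z.cdf(sigma - z) / (1 - p)          # (1.7)

# Same consistency check as before: CTE_p = VaR_p + ESF_p / (1 - p)
assert abs(cte_p - (var_p + esf_p / (1 - p))) < 1e-9 * cte_p
```

As in the normal case, substituting (1.5) and (1.6) into VaRp + ESFp/(1 − p) cancels the e^{µ + σΦ^{-1}(p)} terms and returns (1.7) exactly.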
We end this section with a note about inverse distribution functions.
Inverse distribution functions
The cdf FX (x) = Pr[X ≤ x] of a random variable X is a right continuous
non-decreasing function with
FX (−∞) = lim FX (x) = 0,
x→−∞
FX (+∞) = lim FX (x) = 1.
x→+∞
The classical definition of the inverse of a distribution function is the nondecreasing and left-continuous function FX−1 (p) defined by
FX−1 (p) = inf{x ∈ R|FX (x) ≥ p},
p ∈ [0, 1]
with inf ∅ = +∞ by convention. For all x ∈ R and p ∈ [0, 1], we have
FX−1 (p) ≤ x ⇔ p ≤ FX (x).
(1.8)
In this thesis we will use a more sophisticated definition for inverses of
distribution functions. For any real p ∈ [0, 1], a possible choice for the
inverse of FX in p is any point in the closed interval
    [inf{x ∈ R | FX(x) ≥ p}, sup{x ∈ R | FX(x) ≤ p}],
where, as before, inf ∅ = +∞, and also sup ∅ = −∞. Taking the left hand
border of this interval to be the value of the inverse cdf at p, we get FX−1 (p).
Similarly, we define FX−1+ (p) as the right hand border of the interval:
FX−1+ (p) = sup{x ∈ R | FX (x) ≤ p},
p ∈ [0, 1]
which is a non-decreasing and right-continuous function. Note that FX−1 (0)
= −∞, FX−1+ (1) = +∞ and that all the probability mass of X is contained
in the interval [FX−1+ (0), FX−1 (1)]. Also note that FX−1 (p) and FX−1+ (p) are
finite for all p ∈ (0, 1). In the sequel we will always use p as a value ranging
over the open interval (0, 1), unless stated otherwise.
In the following lemma, we state the relation between the inverse distribution functions of the random variables X and g(X) for a monotone
function g.
Lemma 1 (Inverse distribution function of g(X)).
Let X and g(X) be real-valued random variables and 0 < p < 1.
(a) If g is non-decreasing and left-continuous, then F_{g(X)}^{-1}(p) = g(F_X^{-1}(p)).
(b) If g is non-decreasing and right-continuous, then F_{g(X)}^{-1+}(p) = g(F_X^{-1+}(p)).
(c) If g is non-increasing and left-continuous, then F_{g(X)}^{-1+}(p) = g(F_X^{-1}(1 − p)).
(d) If g is non-increasing and right-continuous, then F_{g(X)}^{-1}(p) = g(F_X^{-1+}(1 − p)).
Proof. See Dhaene et al. (2002a).
Hereafter, we will reserve the notation U and V for U(0, 1) random variables, i.e. FU(p) = p and FU^{-1}(p) = p for all 0 < p < 1, and the same for V. One can prove that

    X =_d F_X^{-1}(U) =_d F_X^{-1+}(U).    (1.9)
The first distributional equality is known as the quantile transform theorem and follows immediately from (1.8). It states that a sample of random
numbers from a general cumulative distribution function FX can be generated from a sample of uniform random numbers. Note that FX has at
most a countable number of horizontal segments, implying that the last
two random variables in (1.9) only differ in a null-set of values of U . This
means that these random variables are equal with probability one.
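The quantile transform in (1.9) is the basis of inversion sampling. As an illustration of ours (the exponential distribution is chosen because its inverse cdf has a closed form), the sketch below generates X = FX^{-1}(U) from uniform variates and checks the sample mean against the known expectation:

```python
import math
import random

def exp_inv_cdf(u, lam):
    # X ~ Exp(lam): F(x) = 1 - exp(-lam*x), hence F^{-1}(u) = -log(1 - u)/lam
    return -math.log(1.0 - u) / lam

random.seed(7)
lam, n = 2.0, 200_000
sample = [exp_inv_cdf(random.random(), lam) for _ in range(n)]  # X = F^{-1}(U)

mean = sum(sample) / n
print(mean)   # close to E[X] = 1/lam = 0.5
```

This is exactly the quantile transform theorem in action: feeding uniform random numbers through FX^{-1} yields a sample with distribution function FX.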
1.1.3 Actuarial ordering of risks
In the actuarial literature, orderings of risks are an important tool for
comparing the attractiveness of different risks. Many examples and results
can be found in the work of Goovaerts et al. (1990), Van Heerwaarden
(1991) and Kaas et al. (1998).
The essential tool for the comparison of different concepts of orderings of risks will be the stop-loss transform/premium and its properties.
Throughout this section a risk X will be a random variable with finite
mean. The distribution function of X is denoted by FX , and F X = 1 − FX
is the corresponding survival function.
In the actuarial literature it is a common feature to replace a risk by
a “less favorable” risk that has a simpler structure, making it easier to
determine the distribution function. Of course, we have to clarify what we
mean with a “less favorable” risk. Therefore, we first introduce the notion
of “stop-loss premium” of a distribution function.
Definition 1 (Stop-loss premium).
The stop-loss premium with retention d of a risk X is defined by
    π(X, d) := E[(X − d)_+] = ∫_d^∞ F̄_X(x) dx,    −∞ < d < +∞,    (1.10)

with the notation (x − d)_+ = max(x − d, 0).
From this formula it is clear that the stop-loss premium with retention
d can be considered as the weight of an upper tail of (the distribution
function of) X. Indeed, it is the surface between the cdf FX of X and
the constant function 1, from d on. For these reasons stop-loss premiums
contain a lot of information about riskiness of underlying distributions.
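The representation (1.10) of the stop-loss premium as the area under the survival function from d on can be checked numerically. In the sketch below (an Exp(λ) risk of ours, chosen because the closed form π(X, d) = e^{−λd}/λ is elementary), a trapezoidal integration of the tail reproduces the exact value:

```python
import math

def stop_loss_exact(d, lam):
    # X ~ Exp(lam): pi(X, d) = int_d^inf exp(-lam*x) dx = exp(-lam*d) / lam
    return math.exp(-lam * d) / lam

def stop_loss_from_tail(d, lam, upper=50.0, steps=200_000):
    # pi(X, d) as the integral of the survival function from d on
    surv = lambda x: math.exp(-lam * x)   # survival function of Exp(lam)
    h = (upper - d) / steps
    s = 0.5 * (surv(d) + surv(upper))     # trapezoidal rule end points
    s += sum(surv(d + i * h) for i in range(1, steps))
    return s * h

lam, d = 1.5, 2.0
print(stop_loss_exact(d, lam))      # ~0.0332
print(stop_loss_from_tail(d, lam))  # agrees to several decimals
```

Truncating the integral at a large upper bound is harmless here because the exponential tail beyond it is negligible.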
The following properties of the stop-loss premium can easily be deduced
from the definition.
Theorem 2 (Stop-loss properties).
The stop-loss premium π(X, .) has the following properties:
(i) π(X, .) is decreasing and convex;
(ii) the right-hand derivative π′_+(X, .) exists and −1 ≤ π′_+(X, .) ≤ 0;
(iii) lim_{d→+∞} π(X, d) = 0.
To every function π : R+ → R that fulfils (i)-(iii) there is a risk X, such that π is the stop-loss premium of X. The distribution function of X is given by FX(d) = π′_+(X, d) + 1.
There are many concepts for comparing random variables. The most familiar one is the usual stochastic order introduced by Lehmann (1955).
In the actuarial and economic literature this ordering is sometimes called
stochastic dominance, see e.g. Goovaerts et al. (1990) and Van Heerwaarden (1991).
Definition 2 (Stochastic order).
We say that risk Y stochastically dominates risk X, written X ≤st Y , if
and only if FX (t) ≥ FY (t) for all t ∈ R.
In other words, X ≤st Y if their corresponding quantiles are ordered.
Note that the condition for stochastic dominance is very strong — it can
be easily seen that X ≤st Y if and only if there exists a bivariate random vector (X′, Y′) with the same marginal distributions as X and Y and such that X′ ≤ Y′ almost surely.
Several results for this ordering can be found in Shaked and Shanthikumar (1994). In the following theorem, some equivalent characterizations
are given for stochastic dominance.
Lemma 2 (Characterizations for stochastic dominance).
X ≤st Y holds if and only if any of the following equivalent conditions is satisfied:
1. Pr[X ≥ t] ≤ Pr[Y ≥ t] for all t ∈ R;
2. Pr[X > t] ≤ Pr[Y > t] for all t ∈ R;
3. E[φ(X)] ≤ E[φ(Y)] for all non-decreasing functions φ(.);
4. E[ψ(−X)] ≥ E[ψ(−Y)] for all non-decreasing functions ψ(.);
5. the function t → π(Y, t) − π(X, t) is non-increasing.
A consequence of stochastic order X ≤st Y, i.e. a necessary condition for it, is obviously that E[X] ≤ E[Y], and even E[X] < E[Y] unless X =_d Y.
The stochastic dominance has a natural interpretation in terms of utility
theory. We have that X ≤st Y holds if and only if E[u(−X)] ≥ E[u(−Y )]
for every non-decreasing utility function u. So the pairs of risks X and
Y with X ≤st Y are exactly those pairs of losses about which all decision
makers with an increasing utility function agree.
For actuarial applications the stop-loss order is much more interesting.
This ordering was investigated by Bühlmann et al. (1977), Goovaerts et al.
(1990) and Van Heerwaarden (1991). It is equivalent to increasing convex
order, which is well known in operations research and statistics.
Definition 3 (Stop-loss order).
If X and Y are two risks, then X precedes Y in stop-loss order, written
X ≤sl Y, if and only if

    π(X, d) ≤ π(Y, d)    for all −∞ < d < +∞.    (1.11)
In other words two risks are ordered in the stop-loss sense if their corresponding stop-loss premiums are ordered. It is clear that stochastic order
induces stop-loss order.
Like stochastic order, stop-loss order between two risks X and Y implies
a corresponding ordering of their means. To prove this, assume that d < 0. From the expression (1.10) in Definition 1 of stop-loss premiums as upper tails, we immediately find the following equality:

    d + π(X, d) = − ∫_d^0 FX(x) dx + ∫_0^∞ (1 − FX(x)) dx    (1.12)

and also, letting d → −∞,

    lim_{d→−∞} (d + π(X, d)) = E[X].

Hence, adding d to both sides of the inequality (1.11) in Definition 3 and taking the limit for d → −∞, we get E[X] ≤ E[Y].
A sufficient condition for X ≤sl Y to hold is that E[X] ≤ E[Y ], together
with the condition that their cumulative distribution functions only cross
once. This means that there exists a real number c such that FX (x) ≥
FY (x) for x ≥ c, but FX (x) ≤ FY (x) for x < c. Indeed, considering the
function f (d) = π(Y, d) − π(X, d), we have that
    lim_{d→−∞} f(d) = E[Y] − E[X] ≥ 0,   and   lim_{d→+∞} f(d) = 0.

Further, f(d) first increases, and then decreases (from c on), but remains non-negative.
If two risks X and Y are ordered in the stop-loss sense, X ≤sl Y , this
means that X has uniformly smaller upper tails than Y, which in turn means that a risk X is more attractive than a risk Y for an insurance company. Moreover, stop-loss order has a natural economic interpretation in terms of expected utility. Indeed, it can be shown that X ≤sl Y if and only if E[u(−X)] ≥ E[u(−Y)] holds for all non-decreasing concave real functions u. This means that any risk-averse decision maker will prefer to pay X instead of Y, which implies that acting as if the obligations X are replaced by Y indeed leads to conservative or prudent decisions. This characterization of stop-loss order in terms of utility functions is equivalent to E[v(X)] ≤ E[v(Y)] holding for all non-decreasing convex functions v. For this reason stop-loss order is alternatively called increasing convex order and denoted by ≤icx.
Recall that our original problem was to replace a risk X by a less
favorable risk Y , for which the distribution function is easier to obtain. If
X ≤sl Y , then also E[X] ≤ E[Y ], and it is intuitively clear that the best
approximations arise in the borderline case where E[X] = E[Y ]. This leads
to the so-called convex order.
Definition 4 (Convex order).
If X and Y are two risks, then X precedes Y in convex order, written
X ≤cx Y , if and only if
E[X] = E[Y ]
and
π(X, d) ≤ π(Y, d)
for all −∞ < d < +∞. (1.13)
A sufficient condition for X ≤cx Y to hold is that E[X] = E[Y ], together
with the condition that their cumulative distribution functions only cross
once. This once-crossing condition can be observed to hold in most natural
examples, but it is of course easy to construct examples with X ≤cx Y and
distribution functions that cross more than once. It can also be proven
that X ≤cx Y if and only if E[v(X)] ≤ E[v(Y )] for all convex functions v.
This explains the name “convex order”. Note that
when characterizing stop-loss order, the convex functions v are additionally
required to be non-decreasing. Hence, stop-loss order is weaker: more pairs
of random variables are ordered.
In the utility context one will reformulate this condition to E[X] = E[Y ]
and E[u(−X)] ≥ E[u(−Y )] for all non-decreasing concave functions u.
These conditions represent the common preferences of all risk-averse
decision makers between risks with equal mean. We summarize the properties
of convex order in the following lemma.
Chapter 1 - Risk and comonotonicity in the actuarial world
Lemma 3 (Characterizations for convex order).
X ≤cx Y if and only if any of the following equivalent conditions is satisfied:
1. E[X] = E[Y ] and π(X, d) ≤ π(Y, d) for all d ∈ R;
2. E[X] = E[Y ] and E[(d − X)+ ] ≤ E[(d − Y )+ ] for all d ∈ R;
3. π(X, d) ≤ π(Y, d) and E[(d − X)+ ] ≤ E[(d − Y )+ ] for all d ∈ R;
4. E[X] = E[Y ] and E[u(−X)] ≥ E[u(−Y )] for all concave functions u(·);
5. E[v(X)] ≤ E[v(Y )] for all convex functions v(·).
In case X ≤cx Y , the upper tails as well as the lower tails of Y eclipse the
corresponding tails of X, which means that extreme values are more likely
to occur for Y than for X. This observation also implies that X ≤cx Y is
equivalent to −X ≤cx −Y . Hence, the interpretation of risks as payments
or as incomes is irrelevant for the convex order.
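As a small numerical sketch of the once-crossing criterion (the pair of risks below is our own illustrative choice, not taken from the text): let X be degenerate at 1 and Y ∼ Exp(1). Both have mean 1 and their cdfs cross exactly once, at x = 1, so X ≤cx Y , and the closed-form stop-loss premiums can be compared for every retention d.

```python
import numpy as np

# Toy check of the once-crossing sufficient condition (illustrative choice):
# X is degenerate at 1, Y ~ Exp(1). E[X] = E[Y] = 1 and the cdfs cross once
# at x = 1, hence X <=cx Y, i.e. pi(X, d) <= pi(Y, d) for every retention d.

def pi_X(d):
    # stop-loss premium of the degenerate risk: E[(X - d)_+] = (1 - d)_+
    return max(1.0 - d, 0.0)

def pi_Y(d):
    # stop-loss premium of Exp(1): E[(Y - d)_+] = exp(-d) for d >= 0,
    # and 1 - d for d < 0 (then (Y - d)_+ = Y - d almost surely)
    return float(np.exp(-d)) if d >= 0 else 1.0 - d

retentions = np.linspace(-2.0, 6.0, 801)
assert all(pi_X(d) <= pi_Y(d) + 1e-12 for d in retentions)
```

For d < 0 both premiums equal 1 − d, and for d ≥ 0 convexity keeps e^{−d} above (1 − d)+, so the ordering of stop-loss premiums holds for all retentions.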
Note that with stop-loss order, we are concerned with large values of
a random loss, and call the risk Y less attractive than X if the expected
values of all top parts (Y − d)+ are larger than those of X. Negative
values for these random variables are actually gains. With stability in
mind, excessive gains might also be unattractive for the decision maker,
for instance for tax reasons. In this situation, X could be considered to
be more attractive than Y if both the top parts (X − d)+ and the bottom
parts (d − X)+ have a lower expected value than for Y . Both conditions
just define the convex order introduced above.
Corollary 1 (Convex order and variance).
If X ≤cx Y then Var[X] ≤ Var[Y ].
Proof. It suffices to take the convex function v(x) = x2 .
Notice that the reverse implication does not hold in general. Comparing variances is meaningful when comparing stop-loss premiums of convex
ordered risks. The following corollary links variances and stop-loss premiums.
Corollary 2 (Variance and stop-loss premiums).
For any random variable X we can write
Var[X] = 2 ∫_{−∞}^{+∞} ( π(X, t) − (E[X] − t)+ ) dt.   (1.14)
Proof. See e.g. Kaas et al. (1998).
From relation (1.14) in Corollary 2 we deduce that if X ≤cx Y ,
∫_{−∞}^{+∞} ( π(Y, t) − π(X, t) ) dt = (1/2) ( Var[Y ] − Var[X] ).   (1.15)
Thus, if X ≤cx Y , their stop-loss distance, i.e. the integrated absolute
difference of their respective stop-loss premiums, equals half the variance
difference between these two random variables.
As the integrand in (1.15) is non-negative, we find that if X ≤cx Y and
in addition Var[X] = Var[Y ], then X and Y must necessarily have equal
stop-loss premiums and hence the same distribution. We also find that
if X ≤cx Y , and X and Y are not equal in distribution, then Var[X] <
Var[Y ] must hold. Note that (1.14) and (1.15) have been derived under
the additional condition that X and Y have finite second moments, hence
both lim_{x→∞} x^2 (1 − FX (x)) and lim_{x→−∞} x^2 FX (x) are equal to 0 (and
similarly for Y ).
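Relations (1.14) and (1.15) can be verified numerically. As an illustration (again our own choice of risks), take Y ∼ Exp(1), whose stop-loss premium is π(Y, t) = e^{−t} for t ≥ 0 and 1 − t for t < 0, together with X degenerate at 1:

```python
import numpy as np

# Numerical check (illustrative): for Y ~ Exp(1), with E[Y] = 1, Var[Y] = 1,
# the identity Var[Y] = 2 * integral of (pi(Y,t) - (E[Y] - t)_+) dt holds;
# with X degenerate at 1 (so X <=cx Y and Var[X] = 0), the stop-loss distance
# integral of (pi(Y,t) - pi(X,t)) dt equals (Var[Y] - Var[X]) / 2 = 1/2.

t, dt = np.linspace(-5.0, 40.0, 400001, retstep=True)
pi_Y = np.where(t >= 0, np.exp(-np.maximum(t, 0.0)), 1.0 - t)
pi_X = np.maximum(1.0 - t, 0.0)          # also equals (E[Y] - t)_+ here

def trap(y):
    # plain trapezoid rule on the uniform grid
    return float((y.sum() - 0.5 * (y[0] + y[-1])) * dt)

assert abs(2.0 * trap(pi_Y - np.maximum(1.0 - t, 0.0)) - 1.0) < 1e-3   # (1.14)
assert abs(trap(pi_Y - pi_X) - 0.5) < 1e-3                             # (1.15)
```

Note that the integrand vanishes for t < 0 (there π(Y, t) = E[Y ] − t), so only the upper tail contributes, as the finite-second-moment condition above suggests.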
In the following theorem we recall the characterization of stochastic dominance in terms of Value-at-Risk, and a similar result characterizing stop-loss order by Tail Value-at-Risk.
Theorem 3. For any random pair (X, Y ) we have that
1. X ≤st Y ⇔ VaRp [X] ≤ VaRp [Y ] for all p ∈ (0, 1);
2. X ≤sl Y ⇔ TVaRp [X] ≤ TVaRp [Y ] for all p ∈ (0, 1).
Proof. See Dhaene et al. (2004).
1.2 Comonotonicity
In an insurance context, one is often interested in the distribution function
of a sum of random variables. Such a sum appears for instance when considering the aggregate claims of an insurance portfolio over a certain reference period. In traditional risk theory, the individual risks of a portfolio are
usually assumed to be mutually independent. This is very convenient from
a mathematical point of view as the standard techniques for determining
the distribution function of aggregate claims, such as Panjer’s recursion
and convolution, are based on the independence assumption. Moreover, in
general the statistics gathered by the insurer only give information about
the marginal distributions of the risks, not about their joint distribution,
i.e. the way these risks are interrelated. The assumption of mutual independence,
however, does not always comply with reality, which may result
in an underestimation of the total risk. On the other hand, the mathematics
for dependent variables is less tractable, except when the variables are
comonotonic.
This section provides theoretical background for the concept of comonotonicity.
We start by defining comonotonicity of a set A of n-vectors in Rn . We
will denote an n-vector (x1 , x2 , . . . , xn ) by ~x. For two n-vectors ~x and ~y , the
notation ~x ≤ ~y will be used for the componentwise order which is defined
by xi ≤ yi for all i = 1, 2, . . . , n. We will denote the (i, j)-projection of a
set A in Rn by Ai,j . It is formally defined by Ai,j = {(xi , xj ) | ~x ∈ A}.
Definition 5 (Comonotonic set).
The set A ⊆ Rn is said to be comonotonic if for any ~x and ~y in A, either
~x ≤ ~y or ~y ≤ ~x holds.
A set A ⊆ Rn is comonotonic if for any ~x and ~y in A, if xi < yi for some
i, then ~x ≤ ~y must hold. Hence, a comonotonic set is simultaneously nondecreasing in each component. Notice that a comonotonic set is a ‘thin’
set: it cannot contain any subset of dimension larger than 1. Any subset of
a comonotonic set is also comonotonic. The proof of the following lemma
is straightforward.
Lemma 4. A ⊆ Rn is comonotonic if and only if the set Ai,j is comonotonic for all i ≠ j in {1, 2, . . . , n}.
For a general set A, comonotonicity of the (i, i + 1)-projections Ai,i+1 ,
(i = 1, 2, . . . , n − 1), will not necessarily imply that A is comonotonic. As
a counterexample, consider the set A = {(x1 , 1, x3 ) | 0 < x1 , x3 < 1}. This
set is not comonotonic, although A1,2 and A2,3 are comonotonic.
Next, we define the notion of support of an n-dimensional random
vector X~ = (X1 , . . . , Xn ). Any subset A ⊆ Rn will be called a support
of X~ if Pr[X~ ∈ A] = 1 and Pr[X~ ∉ A] = 0. In general we will be
interested in supports which are “as small as possible”. Informally, the
smallest support of a random vector X~ is the subset of Rn that is obtained
by deleting from Rn all points which have a zero-probability neighborhood
(with respect to X~ ). This support can be interpreted as the set of all
possible outcomes of X~ .
Definition 6 (Comonotonic random vector).
A random vector X~ = (X1 , X2 , . . . , Xn ) is said to be comonotonic if it
has a comonotonic support.
From Definition 6 we can conclude that comonotonicity is a very strong
positive dependency structure. Indeed, if ~x and ~y are elements of the
comonotonic support of X~ , i.e. ~x and ~y are possible outcomes of X~ , then
they must be ordered component by component. This explains the term
comonotonic (common monotonic).
Comonotonicity of a random vector X~ implies that the higher the value
of one component Xj , the higher the value of any other component Xk .
This means that comonotonicity entails that no Xj is in any way a ‘hedge’
for another component Xk .
In the following theorem, some equivalent characterizations are given
for comonotonicity of a random vector.
Theorem 4 (Characterizations for comonotonicity).
A random vector X~ = (X1 , X2 , . . . , Xn ) is comonotonic if and only if one
of the following equivalent conditions is satisfied:
1. X~ has a comonotonic support;
2. For all ~x = (x1 , x2 , . . . , xn ), we have
   FX~ (~x) = min{FX1 (x1 ), FX2 (x2 ), . . . , FXn (xn )};   (1.16)
3. For U ∼ U (0, 1), we have
   X~ =d ( F^{-1}_{X1}(U ), F^{-1}_{X2}(U ), . . . , F^{-1}_{Xn}(U ) );   (1.17)
4. There exist a random variable Z and non-decreasing functions fi
   (i = 1, 2, . . . , n), such that
   X~ =d ( f1 (Z), f2 (Z), . . . , fn (Z) ).
Proof. See Dhaene et al. (2002a).
From (1.16) we see that, in order to find the probability of all the outcomes
of n comonotonic risks Xi being less than xi (i = 1, . . . , n), one simply
takes the probability of the least likely of these n events. It is obvious
that for any random vector (X1 , . . . , Xn ), not necessarily comonotonic,
the following inequality holds:
Pr[X1 ≤ x1 , . . . , Xn ≤ xn ] ≤ min{FX1 (x1 ), . . . , FXn (xn )},   (1.18)
and it is well-known that the function min{FX1 (x1 ), . . . , FXn (xn )} is indeed
the multivariate cdf of the random vector ( F^{-1}_{X1}(U ), . . . , F^{-1}_{Xn}(U ) ), which
has the same marginal distributions as (X1 , . . . , Xn ). Inequality (1.18)
states that in the class of all random vectors (X1 , . . . , Xn ) with the same
marginal distributions, the probability that all Xi simultaneously realize
large values is maximized if the vector is comonotonic, suggesting that
comonotonicity is indeed a very strong positive dependency structure. In
the special case that all marginal distribution functions FXi are identical,
we find from (1.17) that comonotonicity of X~ is equivalent to saying that
X1 = X2 = · · · = Xn holds almost surely.
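A Monte Carlo sketch of characterization (1.16), with three illustrative marginals of our own choosing coupled through a common uniform U:

```python
import numpy as np

# Monte Carlo check of (1.16): for the comonotonic vector
# (F_{X1}^{-1}(U), F_{X2}^{-1}(U), F_{X3}^{-1}(U)) the joint cdf at any point
# equals the minimum of the marginal cdfs. Marginals chosen for illustration:
# X1 ~ Exp(1), X2 ~ Uniform(0, 2), X3 ~ Pareto with cdf 1 - (1 + x)^(-2).

rng = np.random.default_rng(0)
u = rng.random(1_000_000)

x1 = -np.log1p(-u)                 # Exp(1) quantile function
x2 = 2.0 * u                       # Uniform(0, 2) quantile function
x3 = (1.0 - u) ** (-0.5) - 1.0     # Pareto quantile function

point = (1.0, 1.5, 0.8)
joint = np.mean((x1 <= point[0]) & (x2 <= point[1]) & (x3 <= point[2]))
marginals = (1 - np.exp(-point[0]), point[1] / 2.0, 1 - (1 + point[2]) ** (-2))
assert abs(joint - min(marginals)) < 5e-3
```

Since all three components are increasing functions of the same U, the joint event reduces to the single least likely marginal event, which is exactly what (1.16) expresses.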
A standard way of modelling situations where individual random variables
X1 , . . . , Xn are subject to the same external mechanism is to use a
secondary mixing distribution. The uncertainty about the external mechanism
is then described by a structure variable z, which is a realization of
a random variable Z and acts as a (random) parameter of the distribution
of X~ . The aggregate claims can then be seen as a two-stage process: first,
the external parameter Z = z is drawn from the distribution function FZ
of Z. The claim amount of each individual risk Xi is then obtained as a
realization from the conditional distribution function of Xi given Z = z.
A special type of such a mixing model is the case where, given Z = z, the
claim amounts Xi are degenerate on xi , where the xi = xi (z) are non-decreasing
in z. This means that (X1 , . . . , Xn ) =d (f1 (Z), . . . , fn (Z)) where
all functions fi are non-decreasing. Hence, (X1 , . . . , Xn ) is comonotonic.
Such a model is in a sense an extreme form of a mixing model, as in this
case the external parameter Z = z completely determines the aggregate
claims.
If U ∼ U (0, 1), then also 1 − U ∼ U (0, 1). This implies that comonotonicity
of X~ can also be characterized by
X~ =d ( F^{-1}_{X1}(1 − U ), F^{-1}_{X2}(1 − U ), . . . , F^{-1}_{Xn}(1 − U ) ).
Similarly, one can prove that X~ is comonotonic if and only if there exist a
random variable Z and non-increasing functions fi (i = 1, 2, . . . , n), such
that
X~ =d ( f1 (Z), f2 (Z), . . . , fn (Z) ).
In the sequel, for any random vector (X1 , . . . , Xn ), the notation
(X1^c , . . . , Xn^c ) will be used to indicate a comonotonic random vector with
the same marginals as (X1 , . . . , Xn ). From (1.17) we find that for any
random vector X~ , the outcome of its comonotonic counterpart
X~ c = (X1^c , . . . , Xn^c ) lies with probability one in the following set:
{ ( F^{-1}_{X1}(p), F^{-1}_{X2}(p), . . . , F^{-1}_{Xn}(p) ) | 0 < p < 1 }.
The following theorem states essentially that comonotonicity of a random
vector is equivalent to pairwise comonotonicity.
Theorem 5 (Pairwise comonotonicity).
A random vector X~ is comonotonic if and only if the couples (Xi , Xj ) are
comonotonic for all i and j in {1, 2, . . . , n}.
The next theorem characterizes a comonotonic random couple by means
of Pearson’s correlation coefficient r.
Theorem 6 (Comonotonicity and maximum correlation).
For any random vector (X1 , X2 ) the following inequality holds:
r(X1 , X2 ) ≤ r( F^{-1}_{X1}(U ), F^{-1}_{X2}(U ) ),   (1.19)
with strict inequality when (X1 , X2 ) is not comonotonic.
As a special case of (1.19), we find that r( F^{-1}_{X1}(U ), F^{-1}_{X2}(U ) ) ≥ 0 always
holds. In Denuit & Dhaene (2003) it is shown that other dependence
measures such as Kendall’s τ and Spearman’s ρ equal 1 (and thus are also
maximal) if and only if the variables are comonotonic.
In the following theorem we recall that the Value-at-Risk (VaRp ), the
Tail Value-at-Risk (TVaRp ) and the Expected Shortfall (ESFp ) are additive for comonotonic risks.
Theorem 7 (Comonotonicity and risk measures).
Consider a comonotonic random vector (X1^c , X2^c , . . . , Xn^c ), and let
S^c = X1^c + X2^c + · · · + Xn^c . Then for all p ∈ (0, 1) one has that
VaRp [S^c ] = Σ_{i=1}^n VaRp [Xi ];   (1.20)
TVaRp [S^c ] = Σ_{i=1}^n TVaRp [Xi ];   (1.21)
ESFp [S^c ] = Σ_{i=1}^n ESFp [Xi ].   (1.22)
Proof. See Dhaene et al. (2004).
The computation of the most important risk measures is very easy for sums
of comonotonic random variables, since it suffices to perform calculations
for marginal distributions and add up the resulting values. Throughout
the rest of this thesis we will use the property of additivity of the quantile
function for comonotonic risks.
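The additivity (1.20) of quantiles can be checked on a small simulated example (the two marginals below are our own illustrative choice):

```python
import numpy as np

# Check of (1.20): quantiles are additive for a comonotonic sum.
# Illustrative marginals: X1 ~ Exp(1), X2 ~ Uniform(0, 2), coupled through
# a common U, so S^c = F_{X1}^{-1}(U) + F_{X2}^{-1}(U).

rng = np.random.default_rng(1)
u = rng.random(1_000_000)
s_c = -np.log1p(-u) + 2.0 * u        # comonotonic sum, term by term

for p in (0.90, 0.95, 0.99):
    var_sum = np.quantile(s_c, p)              # VaR_p of the comonotonic sum
    var_marginals = -np.log1p(-p) + 2.0 * p    # VaR_p[X1] + VaR_p[X2]
    assert abs(var_sum - var_marginals) < 5e-2
```

The check works because S^c is an increasing function of the single uniform U, so its p-quantile is obtained by evaluating that function at p, i.e. by adding the marginal quantiles.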
Chapter 2
Convex bounds
Summary In many actuarial and financial problems the distribution of
a sum of dependent random variables is of interest. In general, however,
this distribution function cannot be obtained analytically because of the
complex underlying dependency structure. Kaas et al. (2000) and Dhaene
et al. (2002a) propose a possible way out by considering upper and lower
bounds for (the distribution function of) such a sum that allow explicit calculations of various actuarial quantities. When lower and upper bounds are
close to each other, together they can provide reliable information about
the original and more complex variable. In particular this technique is very
useful to find reliable estimates of upper quantiles and stop-loss premiums.
We summarize the main results for deriving lower and upper bounds
and we construct sharper upper bounds for stop-loss premiums, based upon
the traditional comonotonic bounds. The idea of convex upper and lower
bounds is generalized to the case of scalar products of non-negative random
variables. We apply the derived results to the case of general discounted
cash flows, with stochastic payments. Numerous numerical illustrations
are provided, demonstrating that the derived methodology gives very accurate approximations for the underlying distribution functions and the
corresponding risk measures, like quantiles and stop-loss premiums.
2.1 Introduction
In many financial and actuarial applications where a sum of stochastic
terms is involved, the distribution of the quantity under investigation is too
difficult to obtain. It is well-known that in general the distribution function
of a sum of dependent random variables cannot be determined analytically.
Therefore, instead of aiming to calculate the exact distribution, we will look
for approximations (bounds), in the convex order sense, with a simpler
structure.
The first approximation we will consider for the distribution function
of a sum of dependent random variables is derived by approximating the
dependence structure between the random variables involved by a comonotonic dependence structure. If the dependency structure between the summands of such a sum is strong enough, this upper bound in convex order
performs reasonably well.
The second approximation, which is derived by considering conditional
expectations, partly takes the dependence structure into account. This
lower bound in convex order turns out to be extremely useful to evaluate
the quality of approximation provided by the upper bound. The lower
bound can also be applied as an approximation of the underlying distribution.
This choice is not actuarially prudent, but the relative error of this
approximation is significantly smaller than that of the upper bound.
When lower and upper bounds are close to each other, together they
can provide reliable information about the original and more complex variable. We emphasize that the bounds are in convex order, which does not
mean that the real value always lies between these two approximations. In
particular this technique is very useful to find reliable estimations of upper
quantiles and stop-loss premiums.
Section 2 recalls these theoretical results of Dhaene et al. (2002a).
The lower bound approximates the real stop-loss premium very accurately,
but the comonotonic upper bounds perform rather poorly. Therefore, in
Section 3 we construct sharper upper bounds based upon the traditional
comonotonic bounds. Making use of the ideas of Rogers and Shi (1995),
the first upper bound is obtained as the comonotonic lower bound plus
an error term. Next, this bound is refined by making the error term dependent on the retention in the stop-loss premium. Further, we study the
case where the stop-loss premium can be decomposed into two parts: one
part can be evaluated exactly, while comonotonic bounds are applied to the
other part. The application to the lognormal case is presented at the end of
Section 3.
In Section 4 we illustrate the accuracy of the comonotonic approximations by means of an application in the context of discounted reserves.
Section 5 extends the methodology of Dhaene et al. (2002a,b) for deriving lower and upper bounds of a sum of dependent variables to the case of
scalar products of independent random vectors. We derive a procedure for
calculating the lower and upper bounds in case one of the vectors follows
the multivariate lognormal law.
In Section 6 we apply these results to the case of general discounted
cash flows, with stochastic payments. Numerous numerical illustrations
are provided, demonstrating that the derived methodology gives very accurate approximations for the underlying distribution functions and the
corresponding risk measures.
Sections 2 and 3 in this chapter are mainly based on Hoedemakers, Darkiewicz, Deelstra, Dhaene & Vanmaele (2005). The results in Section 4
come from Hoedemakers & Goovaerts (2004). The generalization to the
scalar product of two random vectors in Section 5 is based on Hoedemakers, Darkiewicz & Goovaerts (2005) and Section 6 is taken from Ahcan,
Darkiewicz, Goovaerts & Hoedemakers (2005).
2.2 Convex bounds for sums of dependent random variables
In the actuarial context one quite often encounters random variables of the
type
S = X1 + X2 + · · · + Xn ,
where the terms Xi are not mutually independent, but the multivariate
distribution function of the random vector X~ = (X1 , X2 , . . . , Xn ) is not
completely specified and one only knows the marginal distribution functions
of the random variables Xi . In such cases, to be able to make decisions
it may be helpful to find the dependence structure for the random vector
(X1 , . . . , Xn ) producing the least favorable aggregate claims S with given
marginals. Therefore, given the marginal distributions of the terms in a
random variable S = Σ_{i=1}^n Xi , we shall look for a joint distribution with
a smaller resp. larger sum, in the convex order sense.
If S consists of a sum of random variables (X1 , . . . , Xn ), replacing the
joint distribution of (X1 , . . . , Xn ) by the comonotonic joint distribution
yields an upper bound for S in the convex order. On the other hand, applying conditioning to S provides us with a lower bound. Finally, if we combine
both ideas, then we end up with an improved upper bound. This is formalized in the following theorem, which is taken from Dhaene et al. (2002a)
and Kaas et al. (2000).
Theorem 8 (Bounds for a sum of random variables).
Consider a sum of random variables S = X1 + X2 + . . . + Xn and define
the following related random variables:
S^l = E[X1 |Λ] + E[X2 |Λ] + . . . + E[Xn |Λ],   (2.1)
S^c = F^{-1}_{X1}(U ) + F^{-1}_{X2}(U ) + . . . + F^{-1}_{Xn}(U ),   (2.2)
S^u = F^{-1}_{X1|Λ}(U ) + F^{-1}_{X2|Λ}(U ) + . . . + F^{-1}_{Xn|Λ}(U ),   (2.3)
with U a U(0,1) random variable and Λ an arbitrary random variable.
Here F^{-1}_{Xi|Λ}(U ) is the notation for the random variable fi (U, Λ), with the
function fi defined by fi (u, λ) = F^{-1}_{Xi|Λ=λ}(u).
The following relations then hold:
S^l ≤cx S ≤cx S^u ≤cx S^c .
Proof. See e.g. Dhaene et al. (2002a).
The comonotonic upper bound changes the original copula, but keeps the
marginal distributions unchanged. The comonotonic lower bound on the
other hand, changes both the copula and the marginals involved. Intuitively, one can expect that an appropriate choice of the conditioning variable Λ will lead to much better approximations compared to the upper
bound.
The upper bound S c is the most dangerous sum of random variables
with the same marginal distributions as the original terms Xj in S. Indeed,
the upper bound S c now consists of a sum of comonotonic variables all
depending on the same random variable U . If one can find a conditioning
random variable Λ with the property that all random variables E[Xj |Λ] are
non-increasing functions of Λ (or all are non-decreasing functions of Λ),
then the lower bound S^l = Σ_{j=1}^n E[Xj |Λ] is also a sum of n comonotonic
random variables.
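A simulation sketch of Theorem 8 for an illustrative sum of dependent lognormals, with Λ chosen as the sum of the underlying normals; the closed form used for E[exp(Zi) | Λ] is the standard conditional expectation of a lognormal given a jointly normal variable, and all parameter values are our own assumptions:

```python
import numpy as np

# Monte Carlo sketch of Theorem 8 for S = exp(Z1) + exp(Z2) + exp(Z3) with
# (Z1, Z2, Z3) multivariate normal (illustrative: sigma = 0.5, pairwise
# correlation 0.5) and conditioning variable Lambda = Z1 + Z2 + Z3.
# We verify Var[S^l] <= Var[S] <= Var[S^c]; all three share the same mean.

rng = np.random.default_rng(2)
n, sigma, size = 3, 0.5, 1_000_000
rho = 0.5
cov = sigma**2 * (rho * np.ones((n, n)) + (1 - rho) * np.eye(n))

z = rng.multivariate_normal(np.zeros(n), cov, size=size)
s = np.exp(z).sum(axis=1)              # the original sum S

w = rng.standard_normal(size)
s_c = n * np.exp(sigma * w)            # S^c: identical marginals, common U

lam = z.sum(axis=1)                    # Lambda = Z1 + Z2 + Z3
var_lam = cov.sum()
b = cov.sum(axis=1)                    # Cov(Z_i, Lambda)
c, v = b / var_lam, sigma**2 - b**2 / var_lam
# E[exp(Z_i) | Lambda] = exp(c_i * Lambda + v_i / 2), all increasing in Lambda
s_l = sum(np.exp(c[i] * lam + v[i] / 2) for i in range(n))

assert np.var(s_l) <= np.var(s) <= np.var(s_c)
assert abs(s_l.mean() - s.mean()) < 0.02 and abs(s_c.mean() - s.mean()) < 0.02
```

Since each E[exp(Zi) | Λ] is increasing in Λ, this S^l is itself a comonotonic sum, exactly as described above.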
We recall from Dhaene et al. (2002a) and the references therein the
procedures for obtaining the lower and upper bounds for stop-loss premiums of sums S of dependent random variables by using the notion of
comonotonicity.
2.2.1 The comonotonic upper bound
As proven in Dhaene et al. (2002a), the convex-largest sum of the components
of a random vector with given marginals is obtained by the comonotonic
sum S^c = X1^c + X2^c + · · · + Xn^c with
S^c =d Σ_{i=1}^n F^{-1}_{Xi}(U ),   (2.4)
where U denotes in the following a U(0, 1) random variable.
Kaas et al. (2000) have proved that the inverse distribution function of
a sum of comonotonic random variables is simply the sum of the inverse
distribution functions of the marginal distributions. See also Theorem 7.
Therefore, given the inverse functions F^{-1}_{Xi}, the cumulative distribution
function of S^c = X1^c + X2^c + · · · + Xn^c can be determined as follows:
F_{S^c}(x) = sup{ p ∈ (0, 1) | F_{S^c}(x) ≥ p }
         = sup{ p ∈ (0, 1) | F^{-1}_{S^c}(p) ≤ x }
         = sup{ p ∈ (0, 1) | Σ_{i=1}^n F^{-1}_{Xi}(p) ≤ x }.   (2.5)
Moreover, in case of strictly increasing and continuous marginals, the cdf
F_{S^c}(x) is uniquely determined by
F^{-1}_{S^c}(F_{S^c}(x)) = Σ_{i=1}^n F^{-1}_{Xi}(F_{S^c}(x)) = x,   F^{-1+}_{S^c}(0) < x < F^{-1}_{S^c}(1).   (2.6)
Hereafter we restrict ourselves to this case of strictly increasing and continuous marginals.
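Equation (2.6) suggests a simple numerical recipe: solve Σ F^{-1}_{Xi}(p) = x for p, e.g. by bisection. A minimal sketch with two exponential marginals (our own illustrative choice, picked so that the answer is also available in closed form):

```python
import numpy as np

# Sketch of (2.5)-(2.6): with strictly increasing continuous marginals, the
# cdf of the comonotonic sum S^c solves sum_i F_{Xi}^{-1}(p) = x in p.
# Illustrative marginals: X1 ~ Exp(1) and X2 ~ Exp with rate 2.

def q_sum(p):
    # F_{X1}^{-1}(p) + F_{X2}^{-1}(p) for the two exponentials
    return -np.log1p(-p) - 0.5 * np.log1p(-p)

def cdf_comonotonic_sum(x, tol=1e-12):
    lo, hi = 0.0, 1.0 - 1e-15          # bisection on p; q_sum is increasing
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if q_sum(mid) <= x:
            lo = mid
        else:
            hi = mid
    return lo

# Here q_sum(p) = 1.5 * (-log(1 - p)), so the root is explicit:
# F_{S^c}(x) = 1 - exp(-2x/3); compare the numerical solution against it.
for x in (0.5, 1.0, 3.0):
    assert abs(cdf_comonotonic_sum(x) - (1 - np.exp(-2 * x / 3))) < 1e-9
```

The same bisection works for any set of marginals whose quantile functions can be evaluated, which is what makes (2.5) computationally attractive.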
In the following theorem Dhaene et al. (2000) have proved that the
stop-loss premiums of a sum of comonotonic random variables can easily
be obtained from the stop-loss premiums of the terms.
Theorem 9 (Stop-loss premium of comonotonic sum).
The stop-loss premium, denoted by π^cub(S, d), of the sum S^c of the components
of the comonotonic random vector (X1^c , X2^c , . . . , Xn^c ) at retention
d is given by
π^cub(S, d) = Σ_{i=1}^n π( Xi , F^{-1}_{Xi}(F_{S^c}(d)) ),   F^{-1+}_{S^c}(0) < d < F^{-1}_{S^c}(1).   (2.7)
If the only information available concerning the multivariate distribution
function of the random vector (X1 , . . . , Xn ) consists of the marginal distribution
functions of the Xi , then the distribution function of S^c =
F^{-1}_{X1}(U ) + F^{-1}_{X2}(U ) + · · · + F^{-1}_{Xn}(U ) is a prudent choice for approximating
the unknown distribution function of S = X1 + X2 + · · · + Xn . It is a
supremum in terms of convex order. It is the best upper bound that can
be derived under the given conditions.
We end this part about the comonotonic upper bound by summarizing
the main advantages of using S^c = X1^c + X2^c + · · · + Xn^c instead of
S = X1 + X2 + · · · + Xn :
• Replacing the distribution function of S by the distribution function
of S c is a prudent strategy in the framework of utility theory: the
real distribution function is replaced by a less attractive one.
• The random variables S and S c have the same expected value. As
these random variables are ordered in the convex order sense, we
have that every moment of order 2k (k = 1, 2, . . .) of S is smaller
than the corresponding moment of S c . Many actuarially relevant
quantities reflect convex order, for instance both the ruin probability
and the Lundberg upper bound for it increase when the claim size
distribution is replaced by a convex larger one. Other examples are
zero-utility premiums such as the exponential premium, and of course
stop-loss premiums for any retention d.
• The cdf of S^c can easily be obtained; essentially, S^c has a one-dimensional distribution, depending only on the random variable U .
The distribution function of S can only be obtained if the dependency
structure is known. Even if this dependency structure is known, it
can be hard to determine the distribution function of S from it.
• The stop-loss premiums of S c follow from stop-loss premiums of the
marginal random variables involved. Computing the stop-loss premiums of S can only be carried out when the dependency structure
is known, and in general requires n integrations to be performed.
2.2.2 The improved comonotonic upper bound
Let us now assume that we have some additional information available
concerning the stochastic nature of (X1 , . . . , Xn ). More precisely, we as-
sume that there exists some random variable Λ with a given distribution
function, such that we know the conditional cumulative distribution functions, given Λ = λ, of the random variables Xi , for all possible values of λ.
In fact, Kaas et al. (2000) define the improved comonotonic upper bound
S u as
S^u = F^{-1}_{X1|Λ}(U ) + F^{-1}_{X2|Λ}(U ) + · · · + F^{-1}_{Xn|Λ}(U ).   (2.8)
In order to obtain the distribution function of S u , observe that given the
event Λ = λ, the random variable S u is a sum of comonotonic random
variables.
Hence,
F^{-1}_{S^u|Λ=λ}(p) = Σ_{i=1}^n F^{-1}_{Xi|Λ=λ}(p),   p ∈ (0, 1).
Given Λ = λ, the cdf of S^u is defined by
F_{S^u|Λ=λ}(x) = sup{ p ∈ (0, 1) | Σ_{i=1}^n F^{-1}_{Xi|Λ=λ}(p) ≤ x }.
The cdf of S^u then follows from
F_{S^u}(x) = ∫_{−∞}^{+∞} F_{S^u|Λ=λ}(x) dFΛ (λ).
If the marginal cdf’s FXi |Λ=λ are strictly increasing and continuous, then
FS u |Λ=λ (x) is a solution to
Σ_{i=1}^n F^{-1}_{Xi|Λ=λ}( F_{S^u|Λ=λ}(x) ) = x,   x ∈ ( F^{-1+}_{S^u|Λ=λ}(0), F^{-1}_{S^u|Λ=λ}(1) ).
In this case, we also find that for any d ∈ ( F^{-1+}_{S^u|Λ=λ}(0), F^{-1}_{S^u|Λ=λ}(1) ):
E[(S^u − d)+ |Λ = λ] = Σ_{i=1}^n E[ ( Xi − F^{-1}_{Xi|Λ=λ}(F_{S^u|Λ=λ}(d)) )+ |Λ = λ ],   (2.9)
from which the stop-loss premium at retention d of S u , which we will
denote by π icub (S, d, Λ), can be determined by weighted integration with
respect to λ over the real line.
2.2.3 The lower bound
Let X~ = (X1 , . . . , Xn ) be a random vector with given marginal cumulative
distribution functions FX1 , FX2 , . . . , FXn . Let us now assume that we
have some additional information available concerning the stochastic nature
of (X1 , . . . , Xn ). More precisely, we assume that there exists some
random variable Λ with a given distribution function, such that we know
the conditional distribution, given Λ = λ, of the random variables Xi , for
all possible values of λ. We recall from Kaas et al. (2000) that a lower
bound, in the sense of convex order, for S = X1 + X2 + · · · + Xn is
S^l = E [S|Λ] .   (2.10)
This idea can also be found in Rogers and Shi (1995) for the continuous
and lognormal case. Let us further assume that the random variable Λ is
such that all E [Xi |Λ] are non-decreasing and continuous functions of Λ,
then S l is a comonotonic sum.
The quantiles of the lower bound S^l then follow from
F^{-1}_{S^l}(p) = Σ_{i=1}^n F^{-1}_{E[Xi|Λ]}(p) = Σ_{i=1}^n E[ Xi |Λ = F^{-1}_Λ(p) ],   p ∈ (0, 1),   (2.11)
and the cdf of S^l is according to (2.5) given by
F_{S^l}(x) = sup{ p ∈ (0, 1) | Σ_{i=1}^n E[ Xi |Λ = F^{-1}_Λ(p) ] ≤ x }.   (2.12)
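A small numerical sketch of (2.11) for a bivariate lognormal example of our own choosing, where each E[Xi | Λ] is an increasing function of Λ, so the p-quantile of S^l is obtained by plugging the p-quantile of Λ into the conditional expectations:

```python
import numpy as np
from scipy.stats import norm

# Sketch of (2.11) for an illustrative lower bound: X_i = exp(Z_i) with
# (Z1, Z2) bivariate normal (sigma = 0.5, correlation 0.5), Lambda = Z1 + Z2.
# Each E[X_i | Lambda] = exp(c * Lambda + v / 2) is increasing in Lambda, so
# the p-quantile of S^l follows by evaluating at the p-quantile of Lambda.

sigma, rho = 0.5, 0.5
var_lam = 2 * sigma**2 * (1 + rho)                  # Var[Z1 + Z2]
c = sigma**2 * (1 + rho) / var_lam                  # regression coefficient
v = sigma**2 - (sigma**2 * (1 + rho))**2 / var_lam  # conditional variance

rng = np.random.default_rng(4)
lam = np.sqrt(var_lam) * rng.standard_normal(1_000_000)
s_l = 2 * np.exp(c * lam + v / 2)                   # the two terms coincide here

for p in (0.5, 0.9, 0.99):
    q_formula = 2 * np.exp(c * norm.ppf(p) * np.sqrt(var_lam) + v / 2)  # (2.11)
    assert abs(np.quantile(s_l, p) - q_formula) / q_formula < 1e-2
```

The conditional-expectation formula is the standard one for a lognormal given a jointly normal conditioning variable; the parameter values are purely illustrative.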
Using Theorem 9, the stop-loss premiums with retention d,
F^{-1+}_{S^l}(0) < d < F^{-1}_{S^l}(1), read
π^lb(S, d, Λ) = Σ_{i=1}^n π( E[Xi |Λ], F^{-1}_{E[Xi|Λ]}(F_{S^l}(d)) ).
When in addition the cdf’s of the random variables E [Xi |Λ] are strictly
increasing and continuous, then the cdf of S^l is also strictly increasing and
continuous, and we get analogously to (2.6), for all x ∈ ( F^{-1+}_{S^l}(0), F^{-1}_{S^l}(1) ),
Σ_{i=1}^n F^{-1}_{E[Xi|Λ]}(F_{S^l}(x)) = x  ⇔  Σ_{i=1}^n E[ Xi |Λ = F^{-1}_Λ(F_{S^l}(x)) ] = x,   (2.13)
which unambiguously determines the cdf of the convex order lower bound
S l for S. In order to derive the above equivalence, we used the results of
Lemma 1.
Invoking Theorem 9, the stop-loss premium π^lb(S, d, Λ) of S^l can be
computed as:
π^lb(S, d, Λ) = Σ_{i=1}^n π( E[Xi |Λ], E[ Xi |Λ = F^{-1}_Λ(F_{S^l}(d)) ] ),   (2.14)
which holds for all retentions d ∈ ( F^{-1+}_{S^l}(0), F^{-1}_{S^l}(1) ).
So far, we considered the case that all E [Xi |Λ] are non-decreasing functions of Λ. The case where all E [Xi |Λ] are non-increasing and continuous
functions of Λ also leads to a comonotonic vector (E [X1 |Λ] , . . . , E [Xn |Λ]),
and can be treated in a similar way.
In case the cumulative distribution functions of the random variables
E [Xi |Λ] are neither continuous nor strictly increasing or decreasing functions
of Λ, the stop-loss premiums of S^l , which is not comonotonic anymore,
can be determined as follows:
π^lb(S, d, Λ) = ∫_{−∞}^{+∞} ( Σ_{i=1}^n E [Xi |Λ = λ] − d )+ dFΛ (λ).
2.2.4 Moments based approximations
The lower and upper bounds can be considered as approximations for the
distribution of a sum S of random variables. On the other hand, any
convex combination of the stop-loss premiums of the lower bound S l and
the upper bounds S c or S u also could serve as an approximation for the
stop-loss premium of S. Since the bounds S l and S c have the same mean
as S, any random variable S m defined by its stop-loss premiums
π m (S, d, Λ) = zπ lb (S, d, Λ) + (1 − z)π cub (S, d),
0 ≤ z ≤ 1,
will also have the same mean as S. By taking the (right-hand) derivative
we find
FS m (x) = zFS l (x) + (1 − z)FS c (x), 0 ≤ z ≤ 1,
so the distribution function of the approximation can be calculated fairly
easily. By choosing the optimal weight z, we want S m to be as close as
possible to S. In Vyncke et al. (2004) z is chosen as
z = ( Var[S^c ] − Var[S] ) / ( Var[S^c ] − Var[S^l ] ).   (2.15)
This choice does not depend on the retention and it leads to equal variances
Var[S m ] = Var[S].
As an alternative one could consider the improved upper bound S^u and
define a second approximation as follows:
π^{m2}(S, d, Λ) = z π^lb(S, d, Λ) + (1 − z) π^icub(S, d, Λ),
now with
z = ( Var[S^u ] − Var[S] ) / ( Var[S^u ] − Var[S^l ] ).
2.3 Upper bounds for stop-loss premiums
One of the most important tasks of actuaries is to assess the degree of dangerousness of a risk X — either by finding the (approximate) distribution
or at least by summarizing its properties quantitatively by means of risk
measures to determine an insurance premium or a sufficient reserve with
solvency margin.
A stop-loss premium π(X, d) = E[(X −d)+ ] = E[max(0, X −d)] is one of
the most important risk measures. The retention d is usually interpreted as
an amount retained by an insured (or an insurer) while an amount X − d
is ceded to an insurer (or a reinsurer). In this case π(X, d) has a clear
interpretation as a pure insurance (reinsurance) premium.
Another practical application of stop-loss premiums is the following:
Suppose that a financial institution faces a risk X to which a capital K is
allocated. Then the residual risk R = (X − K)+ is a quantity of concern to
the society and regulators. Indeed, it represents the pessimistic case when
the random loss X exceeds the available capital. The value E[R] is often
referred to as the “expected shortfall” as explained in Subsection 1.1.2,
with K a VaR at some level.
It is not always straightforward to compute stop-loss premiums. In
the actuarial literature a lot of attention has been devoted to determine
bounds for stop-loss premiums in case only partial information about the
2.3. Upper bounds for stop-loss premiums
31
claim size distribution is available (e.g. De Vylder & Goovaerts (1982),
Jansen et al. (1986), Hürlimann (1996, 1998), among others).
Other types of problems appear in the case of sums of random variables S = X_1 + ⋯ + X_n when full information about the marginal distributions is available but the dependence structure is not known. In the previous section it is explained how the upper bound S^c of the sum S in the so-called convex order sense can be calculated by replacing the unknown joint distribution of the random vector (X_1, X_2, …, X_n) by the most dangerous comonotonic joint distribution. One can also obtain a lower bound S^l through conditioning. Such an approach allows one to determine analytical bounds for stop-loss premiums:

    π^lb(S, d, Λ) ≤ π(S, d) ≤ π^cub(S, d).

In practical applications the comonotonic upper bound seems to be useful only in the case of a very strong dependence between successive summands. Even then the bounds for stop-loss premiums provided by the comonotonic approximation are often not satisfactory. In this section we present a number of techniques which allow one to determine much more efficient upper bounds for stop-loss premiums. To this end, we use on the one hand the method of conditioning as in Curran (1994) and in Rogers & Shi (1995), and on the other hand the upper and lower bounds for stop-loss premiums of sums of dependent random variables as explained in the previous subsection.
2.3.1 Upper bounds based on lower bound plus error term

Following the ideas of Rogers & Shi (1995), we derive an upper bound based on the lower bound S^l.

Lemma 5. For any random variable X we have the following inequality:

    E[X_+] ≤ E[X]_+ + (1/2) Var^{1/2}[X].   (2.16)

Proof. Define X_+^- as follows:

    X_+^- := max(−X, 0) = (−X)_+ = −min(X, 0).

Using Jensen's inequality twice we have

    0 ≤ E[X_+] − E[X]_+
      = (1/2) { E[X_+] − E[X]_+ + E[X_+^-] − E[X]_+^- }
      = (1/2) { E[X_+ + X_+^-] − E[X]_+ − E[X]_+^- }
      = (1/2) { E[|X|] − |E[X]| }
      ≤ (1/2) E[|X − E[X]|]
      ≤ (1/2) Var^{1/2}[X].

Applying now Lemma 5, which gives for any random variables Y and Z

    0 ≤ E[ E[Y_+ | Z] − E[Y | Z]_+ ] ≤ (1/2) E[ √(Var[Y | Z]) ],   (2.17)

to the case of Y equal to S − d and Z equal to our conditioning variable Λ, we obtain an error bound

    0 ≤ E[ E[(S − d)_+ | Λ] − (S^l − d)_+ ] ≤ (1/2) E[ √(Var[S | Λ]) ],   (2.18)

which is only useful if the retention d is strictly positive.

Consequently, we find as upper bound for the stop-loss premium of S

    π(S, d) ≤ π^eub(S, d, Λ),   (2.19)

with π^eub(S, d, Λ) given by

    π^eub(S, d, Λ) = π^lb(S, d, Λ) + (1/2) E[ √(Var[S | Λ]) ].   (2.20)

The second term on the right-hand side takes the form

    E[ √(Var[S | Λ]) ] = E[ ( E[S² | Λ] − (E[S | Λ])² )^{1/2} ]   (2.21)
                       = E[ ( Σ_{i=1}^n Σ_{j=1}^n E[X_i X_j | Λ] − (S^l)² )^{1/2} ],

and once the distributions of the X_i and Λ are specified and known, it can be written out more explicitly.
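To make (2.18)–(2.21) concrete, the following sketch (illustrative two-term lognormal sum and conditioning variable, not from the thesis) estimates π^lb and the error term ½E[√Var[S|Λ]] by simulating over Λ, using the known conditional moments of a bivariate lognormal pair, and checks that the resulting π^eub dominates a crude Monte Carlo estimate of π(S, d):

```python
import random
from math import exp, sqrt
from statistics import NormalDist

nd = NormalDist()
random.seed(1)
mu, sig, rho12 = [0.0, 0.1], [0.3, 0.4], 0.5     # S = e^{Z1} + e^{Z2}, illustrative
cov12 = rho12 * sig[0] * sig[1]
s_lam = sqrt(sig[0]**2 + sig[1]**2 + 2 * cov12)  # Lambda = Z1 + Z2
r = [(sig[i]**2 + cov12) / (sig[i] * s_lam) for i in range(2)]
d = 2.5

def cond_mean(u):
    """E[S | Lambda] with u = Phi^{-1}(V); each term is a conditional lognormal mean."""
    return sum(exp(mu[i] + r[i]*sig[i]*u + 0.5*(1 - r[i]**2)*sig[i]**2) for i in range(2))

def cond_second(u):
    """E[S^2 | Lambda]: each product e^{Zi+Zj} is again lognormal given Lambda."""
    tot = 0.0
    for i in range(2):
        for j in range(2):
            cij = cov12 if i != j else sig[i]**2
            s2 = sig[i]**2 + sig[j]**2 + 2*cij          # Var[Zi + Zj]
            a = r[i]*sig[i] + r[j]*sig[j]               # = r_ij * sigma_{Zij}
            tot += exp(mu[i] + mu[j] + a*u + 0.5*(s2 - a*a))
    return tot

us = [nd.inv_cdf(random.random()) for _ in range(100_000)]
pi_lb = sum(max(cond_mean(u) - d, 0.0) for u in us) / len(us)
err = 0.5 * sum(sqrt(max(cond_second(u) - cond_mean(u)**2, 0.0)) for u in us) / len(us)
pi_eub = pi_lb + err                                    # upper bound (2.20)

mc = 0.0                                                # crude estimate of pi(S, d)
for _ in range(100_000):
    z1 = random.gauss(0, 1)
    z2 = rho12*z1 + sqrt(1 - rho12**2)*random.gauss(0, 1)
    mc += max(exp(mu[0] + sig[0]*z1) + exp(mu[1] + sig[1]*z2) - d, 0.0)
mc /= 100_000
print(pi_lb <= mc + 0.005, mc <= pi_eub)
```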
2.3.2 Bounds by conditioning through decomposition of the stop-loss premium

Decomposition of the stop-loss premium

In this part we show how to improve the bounds introduced in Section 2.2 and Subsection 2.3.1. By conditioning S on some random variable Λ, the stop-loss premium can be decomposed into two parts, one of which can be computed either exactly or by numerical integration, depending on the distribution of the underlying random variable. For the remaining part we first derive a lower and an upper bound based on comonotonic risks, and another upper bound equal to that lower bound plus an error term. This idea of decomposition goes back at least to Curran (1994).

By the tower property for conditional expectations, the stop-loss premium π(S, d) with S = Σ_{i=1}^n X_i equals

    π(S, d) = E[ E[(S − d)_+ | Λ] ],

for every conditioning variable Λ, say with cdf F_Λ.

If in addition there exists a d_Λ such that Λ ≥ d_Λ implies S ≥ d, we can decompose the stop-loss premium of S as follows:

    π(S, d) = ∫_{−∞}^{d_Λ} E[(S − d)_+ | Λ = λ] dF_Λ(λ) + ∫_{d_Λ}^{+∞} E[S − d | Λ = λ] dF_Λ(λ) =: I_1 + I_2.   (2.22)

Notice that the other case (Λ ≤ d_Λ implies S ≥ d) can be treated in a similar way with the appropriate integration bounds. In practical applications the existence of such a d_Λ depends on the actual form of S and Λ.

The second integral can further be simplified to

    I_2 = ∫_{d_Λ}^{+∞} Σ_{i=1}^n E[X_i | Λ = λ] dF_Λ(λ) − d (1 − F_Λ(d_Λ)),   (2.23)

and can be written out explicitly if the bivariate distribution of (X_i, Λ) is known for all i.

Deriving bounds for the first part I_1 in decomposition (2.22) and adding the exact part (2.23) gives us the bounds for the stop-loss premium.
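A small simulation sketch (illustrative two-term lognormal sum, not from the thesis) confirming the decomposition (2.22): with the threshold d_Λ built from the first-order expansion of S (valid because e^x ≥ 1 + x, as used later in this section for the Kaas et al. conditioning variable), the positive part can be dropped on {Λ ≥ d_Λ}, so that I_1 + I_2 reproduces π(S, d) exactly, sample by sample:

```python
import random
from math import exp, sqrt

random.seed(2)
mu, sig, rho12 = [0.0, 0.1], [0.3, 0.4], 0.5      # S = e^{Z1} + e^{Z2}, illustrative
d = 2.6

# Lambda = sum_i e^{mu_i} Z_i; since e^x >= 1 + x, S >= C + Lambda with
# C = sum_i e^{mu_i} (1 - mu_i), so Lambda >= d - C forces S >= d.
C = sum(exp(m) * (1.0 - m) for m in mu)
d_lam = d - C

n = 100_000
lhs = rhs = 0.0
forced = True
for _ in range(n):
    z1 = random.gauss(0, 1)
    z2 = rho12*z1 + sqrt(1 - rho12**2)*random.gauss(0, 1)
    Z = [mu[0] + sig[0]*z1, mu[1] + sig[1]*z2]
    S = sum(exp(z) for z in Z)
    lam = sum(exp(m)*z for m, z in zip(mu, Z))
    lhs += max(S - d, 0.0)                 # direct stop-loss sample
    if lam >= d_lam:
        forced = forced and (S >= d)       # exact part I_2: no positive part needed
        rhs += S - d
    else:
        rhs += max(S - d, 0.0)             # remaining part I_1
print(forced, lhs == rhs)
```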
Lower bound

By means of Jensen's inequality, the first integral I_1 of (2.22) can be bounded below:

    I_1 ≥ ∫_{−∞}^{d_Λ} ( E[S | Λ = λ] − d )_+ dF_Λ(λ) = ∫_{−∞}^{d_Λ} ( Σ_{i=1}^n E[X_i | Λ = λ] − d )_+ dF_Λ(λ).   (2.24)

By adding the exact part (2.23) and introducing notation (2.10), we end up with the inequality of Section 2.2.3:

    π(S, d) ≥ π^lb(S, d, Λ).

When S^l is a sum of n comonotonic risks we can apply (2.14), which holds even when we do not know or cannot find a d_Λ. When S^l is not comonotonic we use the decomposition

    π^lb(S, d, Λ) = ∫_{−∞}^{d_Λ} ( Σ_{i=1}^n E[X_i | Λ = λ] − d )_+ dF_Λ(λ) + ∫_{d_Λ}^{+∞} Σ_{i=1}^n E[X_i | Λ = λ] dF_Λ(λ) − d (1 − F_Λ(d_Λ)).
Upper bound based on lower bound

In this part we improve the bound (2.19) by applying (2.17) to (2.24):

    0 ≤ E[ E[(S − d)_+ | Λ] − (S^l − d)_+ ]
      = ∫_{−∞}^{d_Λ} ( E[(S − d)_+ | Λ = λ] − ( E[S | Λ = λ] − d )_+ ) dF_Λ(λ)   (2.25)
      ≤ (1/2) ∫_{−∞}^{d_Λ} Var[S | Λ = λ]^{1/2} dF_Λ(λ)
      ≤ (1/2) E[ Var[S | Λ] I_{(Λ<d_Λ)} ]^{1/2} E[ I_{(Λ<d_Λ)} ]^{1/2} =: ε(d_Λ),   (2.26)

where Hölder's inequality has been applied in the last step. We will denote the resulting upper bound by π^deub(S, d, Λ), so that

    π^deub(S, d, Λ) = π^lb(S, d, Λ) + ε(d_Λ).   (2.27)

We remark that the error bound (2.18), and hence also the upper bound π^eub(S, d, Λ), is independent of d_Λ and corresponds to the limiting case of (2.25) where d_Λ equals infinity. Obviously, the error bound (2.25) improves the error bound (2.18). In practical applications, the additional error introduced by Hölder's inequality turns out to be much smaller than the difference (1/2) E[√(Var[S | Λ])] − ε(d_Λ).
2.3.3 Partially exact/comonotonic upper bound

We bound the first term I_1 of (2.22) from above by replacing S, conditionally on Λ = λ, by its improved comonotonic upper bound S^u (in the convex order sense):

    ∫_{−∞}^{d_Λ} E[(S − d)_+ | Λ = λ] dF_Λ(λ) ≤ ∫_{−∞}^{d_Λ} E[(S^u − d)_+ | Λ = λ] dF_Λ(λ).   (2.28)

Adding (2.28) to the exact part (2.23) of the decomposition (2.22) results in the so-called partially exact/comonotonic upper bound for a stop-loss premium. We will use the notation π^pecub(S, d, Λ) to indicate this upper bound.

It is easily seen that

    π^pecub(S, d, Λ) ≤ π^icub(S, d, Λ),

while for two distinct conditioning variables Λ_1 and Λ_2 it does not necessarily hold that

    π^pecub(S, d, Λ_1) ≤ π^icub(S, d, Λ_2).
2.3.4 The case of a sum of lognormal random variables

We show how to apply our results to the case of sums of lognormally distributed random variables. Such sums are widely encountered in practice, both in actuarial science and in finance. Typical examples are present values of future cash flows with stochastic (Gaussian) returns (see Dhaene et al. (2002b)), Asian options (see e.g. Simon et al. (2000), Vanmaele et al. (2004b) and Albrecher et al. (2005)) and basket options (see Deelstra et al. (2004) and Vanmaele et al. (2004a)).

We assume that X_i = α_i e^{Z_i} with Z_i ∼ N(E[Z_i], σ²_{Z_i}) and α_i ∈ R. We develop the expressions for the lower and upper bounds for the sum

    S = Σ_{i=1}^n X_i = Σ_{i=1}^n α_i e^{Z_i}.   (2.29)

In this case the stop-loss premium π(X_i, d_i) with retention d_i is well known from the following lemma.
Lemma 6 (Stop-loss premium of a lognormal random variable).
Let X be a lognormal random variable of the form αe^Z with Z ∼ N(E[Z], σ²_Z) and α ∈ R. Then the stop-loss premium with retention d equals, for αd > 0,

    π(X, d) = sign(α) e^{μ + σ²/2} Φ(sign(α) b_1) − d Φ(sign(α) b_2),   (2.30)

where

    μ = ln|α| + E[Z],   σ = σ_Z,   b_1 = (μ + σ² − ln|d|)/σ,   b_2 = b_1 − σ.   (2.31)

The case αd < 0 is trivial.
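Lemma 6 is easy to sanity-check numerically. The sketch below (with illustrative parameters, not from the thesis) implements (2.30)–(2.31) and compares against simulation:

```python
import random
from math import exp, log
from statistics import NormalDist

Phi = NormalDist().cdf
random.seed(3)

def lognormal_stoploss(alpha, EZ, sZ, d):
    """Closed-form E[(alpha e^Z - d)_+] for Z ~ N(EZ, sZ^2) and alpha*d > 0 (Lemma 6)."""
    sgn = 1.0 if alpha > 0 else -1.0
    m = log(abs(alpha)) + EZ                       # mu in (2.31)
    b1 = (m + sZ**2 - log(abs(d))) / sZ
    b2 = b1 - sZ
    return sgn * exp(m + sZ**2 / 2) * Phi(sgn * b1) - d * Phi(sgn * b2)

alpha, EZ, sZ, d = 1.0, 0.05, 0.4, 1.2             # illustrative inputs
exact = lognormal_stoploss(alpha, EZ, sZ, d)
n = 400_000
mc = sum(max(alpha * exp(EZ + sZ * random.gauss(0, 1)) - d, 0.0) for _ in range(n)) / n
print(exact > 0.0, abs(exact - mc) < 0.01)
```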
We now consider a normally distributed random variable Λ. The following results are analogous to Theorem 1 in Dhaene et al. (2002b).

Theorem 10 (Bounds for a sum of lognormal random variables).
Let S be given by (2.29) and consider a normally distributed random variable Λ which is such that (Z_i, Λ) is bivariate normally distributed for all i. Then the lower bound S^l, the improved comonotonic upper bound S^u and the comonotonic upper bound S^c are given by

    S^l = Σ_{i=1}^n α_i e^{E[Z_i] + r_i σ_{Z_i} Φ^{-1}(V) + ½(1 − r_i²) σ²_{Z_i}},   (2.32)

    S^u = Σ_{i=1}^n α_i e^{E[Z_i] + r_i σ_{Z_i} Φ^{-1}(V) + sign(α_i) √(1 − r_i²) σ_{Z_i} Φ^{-1}(U)},   (2.33)

    S^c = Σ_{i=1}^n α_i e^{E[Z_i] + sign(α_i) σ_{Z_i} Φ^{-1}(U)},   (2.34)

where U and V = Φ((Λ − E[Λ])/σ_Λ) are mutually independent U(0,1) random variables, and r_i, i = 1, …, n, are correlations defined by

    r_i = Corr(Z_i, Λ) = Cov[Z_i, Λ] / (σ_{Z_i} σ_Λ).

If sign(α_i) = sign(r_i) for all i, or sign(α_i) = −sign(r_i) for all i, with r_i ≠ 0, then S^l is comonotonic.

Proof. See Dhaene et al. (2002b).
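The three bounds of Theorem 10 are straightforward to sample, since each is an explicit function of two independent uniforms U and V. The sketch below (illustrative parameters, α_i = 1, Λ = Z1 + Z2, not from the thesis) checks the equality of means and the variance ordering implied by convex order:

```python
import random
from math import exp, sqrt
from statistics import NormalDist

nd = NormalDist()
random.seed(4)
mu, sig, rho12 = [0.0, 0.1], [0.3, 0.4], 0.5
cov12 = rho12 * sig[0] * sig[1]
s_lam = sqrt(sig[0]**2 + sig[1]**2 + 2 * cov12)      # Lambda = Z1 + Z2
r = [(sig[i]**2 + cov12) / (sig[i] * s_lam) for i in range(2)]

def draw():
    """One joint draw of (S^l, S^u, S^c) from (2.32)-(2.34) with alpha_i = 1."""
    u = nd.inv_cdf(random.random())                  # Phi^{-1}(U)
    v = nd.inv_cdf(random.random())                  # Phi^{-1}(V)
    sl = sum(exp(mu[i] + r[i]*sig[i]*v + 0.5*(1 - r[i]**2)*sig[i]**2) for i in range(2))
    su = sum(exp(mu[i] + r[i]*sig[i]*v + sqrt(1 - r[i]**2)*sig[i]*u) for i in range(2))
    sc = sum(exp(mu[i] + sig[i]*u) for i in range(2))
    return sl, su, sc

n = 200_000
samples = [draw() for _ in range(n)]
mean = [sum(s[k] for s in samples) / n for k in range(3)]
var = [sum(s[k]**2 for s in samples) / n - mean[k]**2 for k in range(3)]
# convex order S^l <=cx S^u <=cx S^c: equal means, ordered variances
print(max(mean) - min(mean) < 0.02, var[0] < var[1] <= 1.02 * var[2])
```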
Comonotonic upper bound

The quantile function of S^c results from (1.20) in Theorem 7 and is given by

    F^{-1}_{S^c}(p) = Σ_{i=1}^n α_i e^{E[Z_i] + sign(α_i) σ_{Z_i} Φ^{-1}(p)},   p ∈ (0, 1).   (2.35)

Since the cdfs F_{X_i} are strictly increasing and continuous, it follows from (2.6) and (2.34) that for x ∈ ( F^{-1+}_{S^c}(0), F^{-1}_{S^c}(1) ) the cdf of the comonotonic sum, F_{S^c}(x), can be found by solving

    Σ_{i=1}^n α_i e^{E[Z_i] + sign(α_i) σ_{Z_i} Φ^{-1}(F_{S^c}(x))} = x.

Combining Theorem 9 and Lemma 6 yields the following expression for the stop-loss premium of S^c at retention d with F^{-1+}_{S^c}(0) < d < F^{-1}_{S^c}(1):

    π^cub(S, d) = Σ_{i=1}^n α_i e^{E[Z_i] + ½σ²_{Z_i}} Φ( sign(α_i) σ_{Z_i} − Φ^{-1}(F_{S^c}(d)) ) − d (1 − F_{S^c}(d)).
Improved comonotonic upper bound

We now determine the cdf of S^u and the stop-loss premium π^icub(S, d, Λ), where we condition on a normally distributed random variable Λ, or equivalently on the U(0,1) random variable introduced in Theorem 10:

    V = Φ( (Λ − E[Λ]) / σ_Λ ).

The conditional distribution function F_{S^u|V=v}(x), also denoted by F_{S^u}(x | V = v), is the cdf of a sum of n comonotonic random variables and follows, for F^{-1+}_{S^u|V=v}(0) < x < F^{-1}_{S^u|V=v}(1), according to (2.9) and (2.33), implicitly from

    Σ_{i=1}^n α_i e^{E[Z_i] + r_i σ_{Z_i} Φ^{-1}(v) + sign(α_i) √(1 − r_i²) σ_{Z_i} Φ^{-1}(F_{S^u}(x|V=v))} = x.   (2.36)

The cdf of S^u is then given by

    F_{S^u}(x) = ∫_0^1 F_{S^u|V=v}(x) dv.

We now look for an expression for the stop-loss premium of S^u at retention d with F^{-1+}_{S^u|V=v}(0) < d < F^{-1}_{S^u|V=v}(1):

    π^icub(S, d, Λ) = ∫_0^1 E[(S^u − d)_+ | V = v] dv = Σ_{i=1}^n ∫_0^1 E[ ( F^{-1}_{X_i|Λ}(U | V = v) − d_i )_+ ] dv,

with d_i = F^{-1}_{X_i|Λ}( F_{S^u}(d | V = v) | V = v ) and with U a random variable which is uniformly distributed on (0, 1). Since sign(α_i) F^{-1}_{X_i|Λ}(U | V = v) follows a lognormal distribution with parameters

    μ_v(i) = ln|α_i| + E[Z_i] + r_i σ_{Z_i} Φ^{-1}(v),   σ_v(i) = √(1 − r_i²) σ_{Z_i},

one obtains that

    d_i = α_i exp( E[Z_i] + r_i σ_{Z_i} Φ^{-1}(v) + sign(α_i) √(1 − r_i²) σ_{Z_i} Φ^{-1}(F_{S^u|V=v}(d)) ).

Formula (2.30) then yields

    E[(S^u − d)_+ | V = v] = Σ_{i=1}^n ( sign(α_i) e^{μ_v(i) + ½σ_v²(i)} Φ(sign(α_i) b_{i,1}) − d_i Φ(sign(α_i) b_{i,2}) ),

with, according to (2.31),

    b_{i,1} = ( μ_v(i) + σ_v²(i) − ln|d_i| ) / σ_v(i),   b_{i,2} = b_{i,1} − σ_v(i).
Substitution of the corresponding expressions and integration over the interval [0, 1] leads to the following result:

    π^icub(S, d, Λ) = Σ_{i=1}^n α_i e^{E[Z_i] + ½σ²_{Z_i}(1 − r_i²)} ∫_0^1 e^{r_i σ_{Z_i} Φ^{-1}(v)} Φ( sign(α_i) √(1 − r_i²) σ_{Z_i} − Φ^{-1}(F_{S^u|V=v}(d)) ) dv − d (1 − F_{S^u}(d)).   (2.37)

Lower bound
In this subsection, we study the case that, for all i, sign(α_i) = sign(r_i) when r_i ≠ 0. For simplicity we take all α_i ≥ 0 and assume that the conditioning variable Λ is normally distributed and has the right sign, so that the correlation coefficients r_i are all positive. These conditions ensure that S^l is the sum of n comonotonic random variables. The case that, for all i, sign(α_i) = −sign(r_i) when r_i ≠ 0 can be dealt with in an analogous way.

The quantile function of S^l results from (1.20) in Theorem 7 and is given by

    F^{-1}_{S^l}(p) = Σ_{i=1}^n α_i e^{E[Z_i] + r_i σ_{Z_i} Φ^{-1}(p) + ½(1 − r_i²) σ²_{Z_i}},   p ∈ (0, 1).   (2.38)

Since by our assumptions E[X_i | Λ] is increasing, we can obtain F_{S^l}(x) according to (2.13) and (2.32) from

    Σ_{i=1}^n α_i e^{E[Z_i] + r_i σ_{Z_i} Φ^{-1}(F_{S^l}(x)) + ½(1 − r_i²) σ²_{Z_i}} = x.   (2.39)

Moreover, as S^l is a sum of n lognormally distributed random variables, the stop-loss premium at retention d (> 0) can be expressed explicitly by invoking Theorem 9 and Lemma 6:

    π^lb(S, d, Λ) = Σ_{i=1}^n α_i e^{E[Z_i] + ½σ²_{Z_i}} Φ( r_i σ_{Z_i} − Φ^{-1}(F_{S^l}(d)) ) − d (1 − F_{S^l}(d)).   (2.40)
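A sketch (illustrative parameters, α_i = 1, Λ = Z1 + Z2, not from the thesis) that solves (2.39) by bisection, evaluates (2.40), and confirms against simulation that the result sits below the true stop-loss premium:

```python
import random
from math import exp, sqrt
from statistics import NormalDist

nd = NormalDist()
Phi, Phinv = nd.cdf, nd.inv_cdf
random.seed(6)
mu, sig, rho12 = [0.0, 0.1], [0.3, 0.4], 0.5
cov12 = rho12 * sig[0] * sig[1]
s_lam = sqrt(sig[0]**2 + sig[1]**2 + 2 * cov12)    # Lambda = Z1 + Z2
r = [(sig[i]**2 + cov12) / (sig[i] * s_lam) for i in range(2)]
d = 2.6

def q_l(p):
    """Quantile of S^l, formula (2.38) with alpha_i = 1."""
    return sum(exp(mu[i] + r[i]*sig[i]*Phinv(p) + 0.5*(1 - r[i]**2)*sig[i]**2)
               for i in range(2))

lo, hi = 1e-12, 1.0 - 1e-12        # bisection: solve (2.39) for p = F_{S^l}(d)
for _ in range(80):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if q_l(mid) < d else (lo, mid)
p = 0.5 * (lo + hi)

pi_lb = sum(exp(mu[i] + sig[i]**2 / 2) * Phi(r[i]*sig[i] - Phinv(p))
            for i in range(2)) - d * (1 - p)       # formula (2.40)

mc = 0.0                           # crude Monte Carlo of the true premium pi(S, d)
n = 200_000
for _ in range(n):
    z1 = random.gauss(0, 1)
    z2 = rho12*z1 + sqrt(1 - rho12**2)*random.gauss(0, 1)
    mc += max(exp(mu[0] + sig[0]*z1) + exp(mu[1] + sig[1]*z2) - d, 0.0)
mc /= n
print(pi_lb > 0.0, pi_lb <= mc + 0.005)
```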
Upper bound based on lower bound

From (2.21) we obtain that

    E[√(Var[S | Λ])] = ∫_{−∞}^{+∞} ( Σ_{i=1}^n Σ_{j=1}^n E[X_i X_j | Λ = λ] − (E[S | Λ = λ])² )^{1/2} dF_Λ(λ).   (2.41)

Now consider the first term on the right-hand side of (2.41). Because of the properties of lognormally distributed random variables, the product of lognormals is again lognormal if the underlying vector is multivariate normally distributed, and conditioning a lognormal variate on a normal variate yields a lognormally distributed variable.

We proceed by denoting Z_{ij} = Z_i + Z_j with E[Z_{ij}] = E[Z_i] + E[Z_j] and

    σ²_{Z_{ij}} = σ²_{Z_i} + σ²_{Z_j} + 2σ_{Z_iZ_j},

where σ_{Z_iZ_j} := Cov[Z_i, Z_j]. Note that

    r_{ij} = Cov[Z_{ij}, Λ] / (σ_{Z_{ij}} σ_Λ)
           = Cov[Z_i, Λ] / (σ_{Z_{ij}} σ_Λ) + Cov[Z_j, Λ] / (σ_{Z_{ij}} σ_Λ)
           = (σ_{Z_i}/σ_{Z_{ij}}) r_i + (σ_{Z_j}/σ_{Z_{ij}}) r_j.

Conditionally, given Λ = λ, the random variable Z_{ij} is normally distributed with parameters μ(i, j) = E[Z_{ij}] + r_{ij} (σ_{Z_{ij}}/σ_Λ)(λ − E[Λ]) and σ²(i, j) = (1 − r²_{ij}) σ²_{Z_{ij}}. Hence, conditionally, given Λ = λ, the random variable e^{Z_{ij}} is lognormally distributed with parameters μ(i, j) and σ²(i, j). As E[e^{Z_{ij}} | Λ = λ] = e^{μ(i,j) + ½σ²(i,j)}, we find

    E[e^{Z_{ij}} | Λ] = e^{E[Z_{ij}] + r_{ij} σ_{Z_{ij}} Φ^{-1}(V) + ½(1 − r²_{ij}) σ²_{Z_{ij}}},

where the random variable V = Φ((Λ − E[Λ])/σ_Λ) is uniformly distributed on the interval (0, 1).
Thus, the first term in (2.41) equals

    Σ_{i=1}^n Σ_{j=1}^n E[X_i X_j | Λ] = Σ_{i=1}^n Σ_{j=1}^n α_i α_j exp( E[Z_{ij}] + r_{ij} σ_{Z_{ij}} Φ^{-1}(V) + ½(1 − r²_{ij}) σ²_{Z_{ij}} ),   (2.42)

while the second term is the square of (2.32). Hence (2.41) can be written out explicitly and, by using (2.20), the upper bound (2.19) is given by

    π^eub(S, d, Λ) = Σ_{i=1}^n α_i e^{E[Z_i] + ½σ²_{Z_i}} Φ( r_i σ_{Z_i} − Φ^{-1}(F_{S^l}(d)) ) − d (1 − F_{S^l}(d))
      + ½ ∫_0^1 ( Σ_{i=1}^n Σ_{j=1}^n α_i α_j e^{E[Z_{ij}] + r_{ij} σ_{Z_{ij}} Φ^{-1}(v) + ½(1 − r²_{ij}) σ²_{Z_{ij}}} − ( Σ_{i=1}^n α_i e^{E[Z_i] + r_i σ_{Z_i} Φ^{-1}(v) + ½(1 − r_i²) σ²_{Z_i}} )² )^{1/2} dv.
Bounds by conditioning through decomposition of the stop-loss premium

In this part we apply the theory of Subsection 2.3.2 to the sum of lognormal random variables (2.29). We give here the analytical expressions for the two upper bounds π^deub(S, d, Λ) and π^pecub(S, d, Λ). For more details concerning the calculation of the bounds the reader is referred to the last section of this chapter.

The following auxiliary result is needed in order to write out the bounds explicitly.

Lemma 7.
For any constant a ∈ R and any normally distributed random variable Λ,

    ∫_{−∞}^{d_Λ} e^{a Φ^{-1}(v)} dF_Λ(λ) = e^{a²/2} Φ(d*_Λ − a),   (2.43)

where d*_Λ = (d_Λ − E[Λ])/σ_Λ and Φ^{-1}(v) = (λ − E[Λ])/σ_Λ.
Lower bound

Note that the lower bound via the decomposition equals the lower bound without the decomposition. So the lower bound in the lognormal and comonotonic case is given by expression (2.40).
Upper bound based on lower bound

The upper bound (2.27) can be written out explicitly as follows:

    π^deub(S, d, Λ) = Σ_{i=1}^n α_i e^{E[Z_i] + ½σ²_{Z_i}} Φ( r_i σ_{Z_i} − Φ^{-1}(F_{S^l}(d)) ) − d (1 − F_{S^l}(d))
      + ½ Φ(d*_Λ)^{1/2} ( Σ_{i=1}^n Σ_{j=1}^n α_i α_j e^{E[Z_{ij}] + ½(σ²_{Z_i} + σ²_{Z_j})} ( e^{σ_{Z_iZ_j}} − e^{σ_{Z_i} σ_{Z_j} r_i r_j} ) Φ( d*_Λ − r_i σ_{Z_i} − r_j σ_{Z_j} ) )^{1/2}.   (2.44)

Proof. See Section 2.7.
Partially exact/comonotonic upper bound

The partially exact/comonotonic upper bound of Subsection 2.3.3 is given by

    π^pecub(S, d, Λ) = Σ_{i=1}^n α_i e^{E[Z_i] + ½σ²_{Z_i}(1 − r_i²)} ( e^{r_i² σ²_{Z_i}/2} Φ(r_i σ_{Z_i} − d*_Λ) + ∫_0^{Φ(d*_Λ)} e^{r_i σ_{Z_i} Φ^{-1}(v)} Φ( sign(α_i) √(1 − r_i²) σ_{Z_i} − Φ^{-1}(F_{S^u|V=v}(d)) ) dv )
      − d ( 1 − ∫_0^{Φ(d*_Λ)} F_{S^u|V=v}(d) dv ).   (2.45)

Proof. See Section 2.7.
Choice of the conditioning variable

If X ≤_cx Y, and X and Y are not equal in distribution, then Var[X] < Var[Y] must hold. An equality in variance would imply that X =_d Y. This shows that if we want to replace S by the convex smaller S^l, the best approximations will occur when the variance of S^l is 'as close as possible' to the variance of S. Hence we should choose Λ such that the goodness-of-fit expressed by the ratio z = Var[S^l]/Var[S] is as close as possible to 1. Of course one can always use numerical procedures to optimize z, but this would outweigh one of the main features of the convex bounds, namely that the different relevant actuarial quantities (quantiles, stop-loss premiums) can be obtained easily. Having a ready-to-use approximation that can be easily implemented and used by all kinds of end-users is important from a business point of view.

Notice that the expected values of the random variables S, S^c and S^l are all equal:

    E[S] = E[S^l] = E[S^c] = Σ_{i=1}^n α_i e^{E[Z_i] + ½σ²_{Z_i}},   (2.46)
while their variances are given by

    Var[S] = Σ_{i=1}^n Σ_{j=1}^n α_i α_j e^{E[Z_i] + E[Z_j] + ½(σ²_{Z_i} + σ²_{Z_j})} ( e^{σ_{Z_iZ_j}} − 1 ),   (2.47)

    Var[S^l] = Σ_{i=1}^n Σ_{j=1}^n α_i α_j e^{E[Z_i] + E[Z_j] + ½(σ²_{Z_i} + σ²_{Z_j})} ( e^{r_i r_j σ_{Z_i} σ_{Z_j}} − 1 ),   (2.48)

and

    Var[S^c] = Σ_{i=1}^n Σ_{j=1}^n α_i α_j e^{E[Z_i] + E[Z_j] + ½(σ²_{Z_i} + σ²_{Z_j})} ( e^{σ_{Z_i} σ_{Z_j}} − 1 ),   (2.49)

respectively.
We propose here three conditioning random variables. The first two are linear combinations of the random variables Z_i:

    Λ = Σ_{i=1}^n γ_i Z_i,   (2.50)

for particular choices of the coefficients γ_i.

Kaas et al. (2000) propose the following choice for the parameters γ_i when computing the lower bound S^l:

    γ_i = α_i e^{E[Z_i]},   i = 1, …, n.   (2.51)

This choice makes Λ a linear transformation of a first-order approximation to S. This can be seen from the following derivation:

    S = Σ_{i=1}^n α_i e^{E[Z_i] + (Z_i − E[Z_i])} ≈ Σ_{i=1}^n α_i e^{E[Z_i]} (1 + Z_i − E[Z_i]) = C + Σ_{i=1}^n α_i e^{E[Z_i]} Z_i,   (2.52)

where C is a constant. Hence S^l will be "close" to S, provided (Z_i − E[Z_i]) is sufficiently small, or equivalently, σ²_{Z_i} is sufficiently small. One intuitively expects that for this choice of Λ, E[Var[S | Λ]] is "small" and, since Var[S] = E[Var[S | Λ]] + Var[S^l], this exactly means that one expects the ratio z = Var[S^l]/Var[S] to be close to one.

A possible decomposition variable is in this case given by

    d_Λ = d − C = d − Σ_{i=1}^n α_i e^{E[Z_i]} (1 − E[Z_i]).

Using the property that e^x ≥ 1 + x and (2.52), we have that Λ ≥ d_Λ implies S ≥ d.
A second conditioning variable is proposed by Vanduffel et al. (2004). They propose the following choice for the parameters γ_i when computing the lower bound S^l:

    γ_i = α_i e^{E[Z_i] + ½σ²_{Z_i}},   i = 1, …, n.   (2.53)

In this case the first-order approximation of the variance of S^l is maximized. Indeed, from (2.48) we find that

    Var[S^l] ≈ Σ_{i=1}^n Σ_{j=1}^n α_i α_j e^{E[Z_i] + E[Z_j] + ½(σ²_{Z_i} + σ²_{Z_j})} r_i r_j σ_{Z_i} σ_{Z_j}
             = Σ_{i=1}^n Σ_{j=1}^n α_i α_j e^{E[Z_i] + E[Z_j] + ½(σ²_{Z_i} + σ²_{Z_j})} Cov[Z_i, Λ] Cov[Z_j, Λ] / Var[Λ]
             = Cov[ Σ_{i=1}^n α_i e^{E[Z_i] + ½σ²_{Z_i}} Z_i, Λ ]² / Var[Λ]
             = Corr[ Σ_{i=1}^n α_i e^{E[Z_i] + ½σ²_{Z_i}} Z_i, Λ ]² Var[ Σ_{i=1}^n α_i e^{E[Z_i] + ½σ²_{Z_i}} Z_i ].

Hence, the first-order approximation of Var[S^l] is maximized when Λ is given by

    Λ = Σ_{i=1}^n α_i e^{E[Z_i] + ½σ²_{Z_i}} Z_i.   (2.54)

One can easily prove that the first-order approximation for Var[S^l] with Λ given by (2.54) is equal to the first-order approximation of Var[S]. This observation gives an additional indication that this particular choice for Λ will provide a good fit.

For this 'maximal variance' conditioning variable a possible choice for d_Λ is given by

    d_Λ = d − Σ_{i=1}^n α_i e^{E[Z_i] + ½σ²_{Z_i}} ( 1 − E[Z_i] − ½σ²_{Z_i} ).   (2.55)
A third conditioning variable is based on the standardized logarithm of the geometric average G = (∏_{i=1}^n X_i)^{1/n}, as in Nielsen and Sandmann (2003):

    Λ = (ln G − E[ln G]) / √(Var[ln G]) = Σ_{i=1}^n (Z_i − E[Z_i]) / √(Var[Σ_{i=1}^n Z_i]).

Using the fact that the geometric average is not greater than the arithmetic average, a possible decomposition variable is here given by

    d_Λ = ( n ln(d/n) − Σ_{i=1}^n E[Z_i] ) / √(Var[Σ_{i=1}^n Z_i]),

so that Λ ≥ d_Λ implies S ≥ d.
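The three proposals can be compared through the goodness-of-fit ratio z = Var[S^l]/Var[S], which is available in closed form via (2.47)–(2.48). The sketch below does this for an invented sum of independent lognormal terms (loosely mimicking discounted cash flows; all parameters are illustrative, α_i = 1):

```python
from math import exp, sqrt

# Independent Z_i (illustrative): growing log-variance, decaying log-mean.
n = 10
mu = [-0.05 * i for i in range(1, n + 1)]
sig = [0.15 * sqrt(i) for i in range(1, n + 1)]

def z_ratio(gamma):
    """z = Var[S^l]/Var[S] for Lambda = sum_k gamma_k Z_k with independent Z_i."""
    s_lam = sqrt(sum((g * s)**2 for g, s in zip(gamma, sig)))
    r = [g * s / s_lam for g, s in zip(gamma, sig)]      # Corr(Z_i, Lambda)
    A = [[exp(mu[i] + mu[j] + 0.5 * (sig[i]**2 + sig[j]**2))
          for j in range(n)] for i in range(n)]
    var_l = sum(A[i][j] * (exp(r[i]*r[j]*sig[i]*sig[j]) - 1)
                for i in range(n) for j in range(n))     # formula (2.48)
    var_s = sum(A[i][i] * (exp(sig[i]**2) - 1) for i in range(n))  # (2.47), indep.
    return var_l / var_s

z_kaas = z_ratio([exp(m) for m in mu])                        # choice (2.51)
z_mv = z_ratio([exp(m + 0.5*s*s) for m, s in zip(mu, sig)])   # choice (2.53)
z_geo = z_ratio([1.0] * n)     # geometric average: standardization leaves r_i unchanged
print(z_kaas, z_mv, z_geo)
```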
Generalization to sums of lognormals with a stochastic time horizon

Suppose that S is a sum of lognormal variables with a stochastic time horizon T:

    S = Σ_{i=1}^T α_i e^{Z_i},

with α_i ∈ R, T a random variable with lifetime probability distribution F_T(t), and Z_i ∼ N(E[Z_i], σ²_{Z_i}) independent of T. Using the tower property for conditional expectations, we can calculate the stop-loss premium of S as follows:

    π(S, d) = π( Σ_{i=1}^T α_i e^{Z_i}, d )
            = E_T[ E[( Σ_{i=1}^T α_i e^{Z_i} − d )_+ | T] ]
            = Σ_{j=1}^∞ Pr[T = j] π( Σ_{i=1}^j α_i e^{Z_i}, d )
            = Σ_{j=1}^∞ Pr[T = j] π(S_j, d),   (2.56)

with

    S_j := Σ_{i=1}^j α_i e^{Z_i}.
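The mixture formula (2.56) is directly checkable by simulation. A sketch with an invented three-point distribution for T and illustrative cash-flow parameters (not from the thesis):

```python
import random
from math import exp

random.seed(8)
mu, sig = [0.0, -0.05, -0.1], [0.2, 0.25, 0.3]   # illustrative horizon-3 cash flows
pT = {1: 0.3, 2: 0.5, 3: 0.2}                    # invented distribution of T
d = 1.5
n = 200_000

def sample_Sj(j):
    """One draw of S_j = sum_{i<=j} e^{Z_i} with independent Z_i."""
    return sum(exp(mu[i] + sig[i] * random.gauss(0, 1)) for i in range(j))

# Left side: direct simulation of S with a random horizon T.
lhs = 0.0
for _ in range(n):
    T = random.choices([1, 2, 3], weights=[0.3, 0.5, 0.2])[0]
    lhs += max(sample_Sj(T) - d, 0.0)
lhs /= n

# Right side: mixture (2.56) of fixed-horizon stop-loss premiums.
rhs = sum(p * sum(max(sample_Sj(j) - d, 0.0) for _ in range(n)) / n
          for j, p in pT.items())
print(abs(lhs - rhs) < 0.01)
```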
Notice that in practical applications the infinite time horizon is often replaced by a finite number. In this part of the thesis, the choice of Λ will depend on the time horizon n. To indicate this dependence, we introduce the notation Λ_n for the conditioning variable Λ. It is straightforward to obtain a lower bound, denoted by π^lb(S, d, Λ), by looking at the combination

    π^lb(S, d, Λ) = Σ_{j=1}^∞ Pr[T = j] π^lb(S_j, d, Λ_j),

with Λ = Λ_1, Λ_2, … and π^lb(S_j, d, Λ_j) given by (2.40) for n = j. The same reasoning can be followed to obtain the comonotonic upper bound π^cub(S, d), the improved comonotonic upper bound π^icub(S, d, Λ) and the partially exact/comonotonic upper bound π^pecub(S, d, Λ).

For each term π(S_j, d) in the sum (2.56) we can take the minimum of two or more of the upper bounds defined above. We propose two upper bounds based on this simple idea.

The first bound takes for each term the minimum of the error term (2.18), which is independent of the retention, and the error term (2.26), which depends on the retention. Combining this with the stop-loss premium of the lower bound S^l results in the following upper bound:

    π^emub(S, d, Λ) = Σ_{j=1}^∞ Pr[T = j] min( ½ E[√(Var[S_j | Λ_j])], ε(d_{Λ_j}) ) + π^lb(S, d, Λ).

Calculating for each term the minimum of all the presented upper bounds,

    π^min(S, d, Λ) = Σ_{j=1}^∞ Pr[T = j] min( π^cub(S_j, d), π^icub(S_j, d, Λ_j), π^pecub(S_j, d, Λ_j), π^emub(S_j, d, Λ_j) ),

will of course provide the best possible upper bound. Remark that

    π^emub(S_j, d, Λ_j) = π^lb(S_j, d, Λ_j) + min( ½ E[√(Var[S_j | Λ_j])], ε(d_{Λ_j}) ).

2.4 Application: discounted loss reserves
Loss reserving deals with the determination of the random present value of future payments. Since this amount is very important for an insurance company and its policyholders, the inherent uncertainties are no excuse for providing anything less than a rigorous scientific analysis. Since the reserve is a provision for future payments, the estimated loss reserve should reflect the time value of money. At the same time, it may be necessary or desirable for those reserves to contain a security margin that produces p × 100% confidence in their adequacy, where p is a suitably high number.

In many situations knowledge of the distribution function of this discounted reserve is useful, for example in dynamic financial analysis, assessing profitability and pricing, identifying risk-based capital needs, loss portfolio transfers, etc. This application is concerned with the evaluation of loss reserves of this type according to financial economics (see Panjer (1998)).
2.4.1 Framework and notation

Consider an insurance portfolio subject to liability payments L^{(i)} ≥ 0 at times i = 1, 2, …, where i = 0 denotes the present. Let L^{(i)} be a random variable and suppose that it is modified by certain forces that influence the liability over time.

For example, suppose that L_t^{(i)} denotes the amount of the liability expressed in money values of time t. Then L_t^{(i)} evolves in the sense that

    L_t^{(i)} = L_{t−1}^{(i)} R_{Lt},   t = 1, …, i,

where the R_{Lt} are strictly positive random variables of the form

    R_{Lt} = 1 + r_{Lt},

with r_{Lt} the inflation of claims costs over the interval (t − 1, t]. The liability finally paid is

    L^{(i)} = L_i^{(i)}.

As an example, L_{t−1}^{(i)} and R_{Lt} might be independently distributed as follows:

    L_{t−1}^{(i)} ∼ logN(ν, τ²) and R_{Lt} ∼ logN(μ, σ²).

It is emphasized that, in this example, r_{Lt} denotes claims inflation. This might include influences other than simple community inflation, such as the particular pressures of the legal and health care environments on claim costs.

Similarly, a holding of assets of value A_{t−1} at time t − 1 accumulates at time t to

    A_t = A_{t−1} R_{At},

with

    R_{At} = 1 + r_{At}.
Assume that R_{Xt}, where X is either A or L, follows the Capital Asset Pricing Model (CAPM):

    r_{Xt} = r_{Ft} + β_X Δ_t + ε_{Xt},   (2.57)

where r_{Ft} is the risk-free rate in period t, β_X is the CAPM beta associated with X, ε_{Xt} is the idiosyncratic risk associated with X, and

    Δ_t = r_{Mt} − r_{Ft},

with r_{Mt} denoting the period increase in value of the economy-wide portfolio of assets. The distribution of Δ_t is assumed independent of t. The assumption of CAPM returns is consistent with an assumption that assets and liabilities here are marked to market.

Henceforth, it will be assumed that r_{Ft} = r_F, independent of t. This simplifies the following algebraic development considerably. It should be emphasized, however, that the whole development generalizes to the case in which r_{Ft} varies with t. The generalization is theoretically straightforward, but adds considerable notational baggage without yielding any deeper insight.

Assume that the ε_{At} are i.i.d. and similarly the ε_{Lt}. Assume that all variables ε_{At}, ε_{Lt} and Δ_t are stochastically independent, and that E[ε_{Xt}] = 0. Let us further denote the variance of ε_{Xt} by ω²_X.

It follows that the R_{At} and R_{Lt} are independent and identically distributed. Suppose now the following distributional assumptions:

    L_0^{(i)} ∼ logN(ν_{L0}^{(i)}, τ_{L0}^{2(i)}) and R_{Xt} ∼ logN(μ_X, σ²_X),   (2.58)

with stochastic independence between L_0^{(i)} and R_{Xt} for all i, t, and X = A, L.

Denote

    ρ = Corr(log R_{At}, log R_{Lt})

and

    κ^{(rs)} = Corr(log L_0^{(r)}, log L_0^{(s)}).

Define the accumulation factor

    R_{Xt:u} = R_{X,t+1} R_{X,t+2} ⋯ R_{X,u},   for u = t + 1, t + 2, …

Note that R_{Xt:t+1} = R_{X,t+1}.
By relation (2.58) and the independence between distinct time intervals,

    R_{Xt:u} ∼ logN( (u − t) μ_X, (u − t) σ²_X ).

The implicit asset allocation is any that is consistent with relation (2.58). One might assume, for example, a constant allocation by asset sector, with continuous rebalancing and sector-specific returns that are constant over time. As remarked earlier in this section, the last of these assumptions could be weakened. Indeed, if the assumption of constant returns over time were weakened, no assumption would be required with respect to asset allocation.

Define the discounted liability payment

    V^{(i)} = L_i^{(i)} R_{A0:i}^{-1} = L_0^{(i)} R_{L0:i} R_{A0:i}^{-1} = L_0^{(i)} ∏_{j=1}^i (R_{Lj} R_{Aj}^{-1}) ∼ logN(α^{(i)}, δ^{2(i)}),

with α^{(i)} = ν_{L0}^{(i)} + i(μ_L − μ_A) and δ^{2(i)} = τ_{L0}^{2(i)} + i(σ²_L + σ²_A − 2ρσ_Lσ_A). The present value S is then given by

    S = Σ_{i=1}^n V^{(i)} := Σ_{i=1}^n e^{Z_i},   (2.59)

with n the number of cash-flow liabilities in the discounted value of the total outstanding losses of the portfolio.
In Taylor (2004), the mean and variance of S are calculated. The mean is given by

    E[S] = Σ_{s=1}^n E[V^{(s)}] = Σ_{s=1}^n E[L_0^{(s)}] ( (R̄_L/R̄_A) · (1 + (β_A² σ_M² + ω_A²)/R̄_A²) / (1 + β_A β_L σ_M²/(R̄_A R̄_L)) )^s,

and the variance by

    Var[S] = Σ_{r,s=1}^n Cov[V^{(r)}, V^{(s)}]
           = Σ_{r,s=1}^n E[V^{(r)}] E[V^{(s)}] ( exp( κ^{(rs)} τ_{L0}^{(r)} τ_{L0}^{(s)} + min(r, s)[σ_L² + σ_A² − 2ρσ_Aσ_L] ) − 1 ),

with R̄_X = E[R_{Xt}] and σ_M² = Var[r_{Mt}]. We will denote the variance of S by σ_S².
There are now three relevant values of the loss reserve:

• Σ_{s=1}^n E[L_0^{(s)}], which is the CAPM-based economic value of the liability.

• E[S], which is the expected value of the discounted liability cash flows, the discount rate taking into account the insurer's asset holdings.

• A_p = F_S^{-1}(p) = E[S] exp(σ_S Φ^{-1}(p) − ½σ_S²), which is the p × 100%-confidence loss reserve.

It may be convenient to write the last of these in the form

    A_p = [1 + η(p, σ_S)] E[S],

where η(p, σ_S) may be regarded as a security loading. Note, however, that the security loading in this formulation is applied to E[S] and not to the economic value of the liability. The first two of the above three possibilities for the loss reserve are the ones involved in the current debate over the appropriate rate(s) at which to discount liabilities. The quantity E[S] is obtained using the expectations of discount factors that reflect the insurer's expected returns. In broad (though not quite precise) terms, it may be thought of as the amount of assets which, accumulating with expected investment return, will be sufficient to meet the liabilities as they are required to be paid. This value depends on the insurer-specific asset holdings, and so cannot be the market or fair value of the liabilities. The latter is given by the first of the above three candidates for the loss reserve.
Taylor (1996) pointed out that for high security margins (Φ^{-1}(p) > σ_S) the size of the security margin increases with increasing asset beta. However, for low security margins (Φ^{-1}(p) < σ_S) the size of the security margin decreases with increasing asset beta. In this latter case the additional yield expected from the increased asset risk outweighs the additional risk.

Taylor (2004) defines the security margin for confidence level p as SM_p[S] := η(p, σ_S) = (VaR_p[S]/E[S]) − 1, which is based on the quantile risk measure of the distribution of the discounted reserve S. In general, it is hard or even impossible to determine the quantiles of the discounted reserve analytically, because in any realistic model for the return process the random variable S will be a sum of strongly dependent random variables. Here, S is a finite sum of correlated lognormal random variables. This implies that its cumulative distribution function cannot be determined exactly and is even too cumbersome to work with. An interesting solution to this difficulty consists of determining the lower bound S^l and the upper bound S^c as explained earlier in this chapter.
2.4.2 Calculation of convex lower and upper bounds

To calculate the security margin η(p, σ_S), expressions for the quantiles and the expected value of S^l and S^c are needed. The quantile functions of the lower and upper bound of a sum of lognormal random variables are given by (2.38) and (2.35) respectively, in the case α_i = 1 for all i. The expression for the expected value is given by (2.46). To calculate the lower bound we choose the 'maximal variance' conditioning variable given by (2.50) and (2.53):

    Λ = Σ_{i=1}^n e^{E[Z_i] + ½σ²_{Z_i}} Z_i.

We find that

    E[Z_i] = ν_{L0}^{(i)} + i log{ (R̄_L/R̄_A) [ (1 + (β_A² σ_M² + ω_A²)/R̄_A²) / (1 + (β_L² σ_M² + ω_L²)/R̄_L²) ]^{1/2} },

    Var[Z_i] = σ²_{Z_i} = τ_{L0}^{2(i)} + i σ̂²,

where the variability of the discounting structure, σ̂² := σ_L² + σ_A² − 2ρσ_Lσ_A, is given by

    σ̂² = log( [1 + (β_A² σ_M² + ω_A²)/R̄_A²] [1 + (β_L² σ_M² + ω_L²)/R̄_L²] / [1 + β_A β_L σ_M²/(R̄_A R̄_L)]² ).
The correlation between Z_i and Λ is given by

    r_i = \frac{Cov[Z_i, \Lambda]}{\sigma_{Z_i}\sigma_\Lambda}
        = \frac{\sum_{k=1}^{n} \beta_k \left( \hat{\sigma}^2 \min(i,k) + \eta^{(i,k)} \right)}{\sigma_{Z_i} \sqrt{\sum_{k=1}^{n}\sum_{l=1}^{n} \beta_k \beta_l \left( \hat{\sigma}^2 \min(k,l) + \eta^{(k,l)} \right)}},

with

    \eta^{(k,l)} = Cov\left[ \log L_0^{(k)}, \log L_0^{(l)} \right] = \kappa^{(kl)} \tau_{L_0}^{(k)} \tau_{L_0}^{(l)}.

Notice that if the liability cash flows are independent, \eta^{(k,l)} = \tau_{L_0}^{2(k)} I_{(k=l)}.
We will compare the performance of the lower and upper bound approach with
Monte Carlo simulation results, obtained by generating 1 000 000 random paths,
which serve as a benchmark. Note that the random paths are based on antithetic
variables in order to reduce the variance of the Monte Carlo estimate.
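The antithetic-variable idea can be sketched as follows. This is a minimal illustration with hypothetical parameters (a small lognormal discounted sum driven by a Brownian path), not the reserve model of this section: each vector of standard normal drivers z is paired with its mirror image −z, and the pair averages have a smaller standard error than plain sampling whenever the payoff is monotone in the drivers.

```python
import numpy as np

rng = np.random.default_rng(0)

def discounted_sum(z, mu=0.05, sigma=0.1):
    # Toy discounted cash flow: S = sum_i exp(-(mu*i + sigma*W(i))),
    # with the Brownian path W built from standard normal increments z.
    w = np.cumsum(z, axis=1)
    i = np.arange(1, z.shape[1] + 1)
    return np.exp(-(mu * i + sigma * w)).sum(axis=1)

n_pairs, n = 50_000, 8
z = rng.standard_normal((n_pairs, n))

# Antithetic estimate: average each path with its mirrored path.
pair_means = 0.5 * (discounted_sum(z) + discounted_sum(-z))
se_anti = pair_means.std(ddof=1) / np.sqrt(n_pairs)

# Plain Monte Carlo with the same budget of 2*n_pairs paths.
plain = discounted_sum(rng.standard_normal((2 * n_pairs, n)))
se_plain = plain.std(ddof=1) / np.sqrt(2 * n_pairs)
```

Because the discounted sum is monotone in every increment, a path and its mirror are negatively correlated, so se_anti comes out well below se_plain at the same simulation budget.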
We use the notation SM_p[S^l] and SM_p[S^c] to denote the security margin for
confidence level p approximated by the lower bound and the upper bound
approximation respectively. The tables display the Monte Carlo simulation
result (MC) for the security margin, as well as the percentage deviations of
the different approximation methods relative to the Monte Carlo result. These
percentage deviations are defined as follows:

    LB := \frac{SM_p[S^l] - SM_p[S^{MC}]}{SM_p[S^{MC}]} \times 100\%, \qquad
    UB := \frac{SM_p[S^c] - SM_p[S^{MC}]}{SM_p[S^{MC}]} \times 100\%,

where S^l and S^c correspond to the lower bound approach and the upper bound
approach, and S^{MC} denotes the Monte Carlo simulation result. The figures
displayed in bold in the tables correspond to the best approximations, i.e. the
ones with the smallest percentage deviation from the Monte Carlo results.
We set β_L equal to zero and choose as financial parameters r_F = 6%,
E[∆] = 6% and β_A = 0.9. The tables list the results for different values of
the parameters ω_L, ω_A, σ_M and n.
We construct two different cash flow structures. Table 2.1 displays the
first structure of the liability cash flows (ex. 1), each of which is assumed
to be lognormally distributed, and all of which are stochastically independent.
Time i    E[L^(i)]    E[L_0^(i)]    ν_0^(i)     τ_0^(i)
1         5%          4.7%          −3.059      10%
2         15%         13.3%         −2.019      10%
3         25%         21.0%         −1.566      10%
4         20%         15.8%         −1.854      15%
5         15%         11.2%         −2.120      15%
6         10%         7.0%          −2.663      15%
7         5%          3.3%          −3.424      20%
8         5%          3.1%          −3.493      25%
Total     100%        79.6%

Table 2.1: Structure of stochastic liability cash flow (ex. 1).
The profile of the cash flows is intended to resemble a medium-term casualty
payment pattern. It is assumed that ω_L = 5% and as financial parameters
σ_M = 20% and ω_A = 0. It follows from equation (2.57) that R_L = 1.06.
Further, we have for this example µ_L = 0.0570 and σ_L = 0.0471.
Table 2.2 summarizes the results for the 70% security margin for different
market volatilities σ_M. The lower bound turns out to fit the security margins
best for all values of the parameters. The standard error of the Monte Carlo
estimate is displayed between brackets.
Table 2.3 compares the approximations for some selected confidence levels p.
For this example we have σ_A = 16.1%, σ_L = 4.7%, µ_A = 9.5% and µ_L = 5.7%,
with µ_X and σ_X^2 such that R̄_X = exp(µ_X + ½σ_X^2). The results are in line
with the previous ones. The lower bound approach gives excellent results for
high as well as for low values of p.
Table 2.4 displays the approximated and simulated 97.5% margins for
some selected market volatilities. These parameters are consistent with
historical capital market values as reported by Ibbotson Associates (2002).
The presented figures again indicate that the lower bound is the most
precise method.
σ_M:            0.05      0.15      0.25      0.35
LB              −0.25%    −0.09%    −0.12%    −0.00%
UB              +19.86%   +12.12%   +5.37%    −1.62%
MC              0.0853    0.1090    0.1309    0.1370
(s.e. × 10^7)   (1.11)    (2.47)    (6.15)    (8.18)

Table 2.2: (ex. 1) Approximations for the security margin SM_{0.70}[V] for
different market volatilities and ω_L = 0.1 and ω_A = 0.05.
p:              0.995     0.975     0.95      0.90      0.80      0.70
LB              −0.38%    −0.21%    −0.16%    −0.08%    −0.00%    −0.00%
UB              +26.26%   +23.44%   +21.80%   +19.76%   +16.38%   +11.25%
MC              1.0348    0.6927    0.5421    0.3859    0.2192    0.1124
(s.e. × 10^5)   (2.49)    (0.46)    (0.26)    (0.10)    (0.06)    (0.04)

Table 2.3: (ex. 1) Approximations for some selected confidence levels
of SM_p[V]. The market volatility is set equal to 20%. (ω_L = 0.05 and
ω_A = 0)
σ_M:            0.05      0.10      0.15      0.20      0.25      0.30      0.35
LB              −0.19%    −0.15%    −0.23%    −0.16%    −0.11%    −0.17%    −0.38%
UB              +31.74%   +27.72%   +24.12%   +21.81%   +20.31%   +19.18%   +18.13%
MC              0.4390    0.5250    0.6528    0.8103    0.9924    1.1970    1.4232
(s.e. × 10^5)   (0.15)    (0.29)    (0.41)    (0.69)    (1.22)    (3.78)    (4.16)

Table 2.4: (ex. 1) Approximations for the security margin SM_{0.975}[V] for
different market volatilities.
We include an additional example (ex. 2) with a different stochastic liability
cash-flow structure. We fix the number of liabilities at n = 30. Further,
we choose ν_0^(i) = −4.46 for i = 1, . . . , 30 and

    τ_0^(i) =  5%   for i ≤ 5;
              10%   for 5 < i ≤ 15;
              15%   for 15 < i ≤ 25;
              20%   for 25 < i ≤ 28;
              25%   for 28 < i ≤ 30.
p:              0.995     0.975     0.95      0.90      0.80      0.70
LB              −0.93%    −0.04%    −0.02%    −0.18%    −0.03%    −0.6%
UB              +24.59%   +19.86%   +16.94%   +12.95%   +5.16%    −30.40%
MC              4.4521    2.2264    1.4998    0.8814    0.3508    0.0761
(s.e. × 10^5)   (37.63)   (2.99)    (7.44)    (2.79)    (0.78)    (0.27)

Table 2.5: (ex. 2) Approximations for some selected confidence levels of
SM_p[V]. The market volatility is set equal to 25%.
This means that the sum of the expected cash flows E[L^(i)] is equal to 100%
and the sum of the expected discounted cash flows E[L_0^(i)] equals 35.51%. In
this example we fix the parameters ω_L and ω_A equal to 10% and 5%
respectively.
The same conclusions as for ex. 1 can be drawn from the results in Table 2.5.
This table reports the discussed approximations for SM_p[V] for different
probability levels and a fixed market volatility σ_M = 0.25. Note that for the
parameters in Table 2.5, σ_A = 20.5%, σ_L = 9.4%, µ_A = 8.7% and µ_L = 5.4%.
Overall, the comonotonic lower bound approach provides a very accurate fit
under different parameter assumptions. These assumptions are in line with
realistic market values. Moreover, the comonotonic approximations have the
advantage that they are easily computable for any risk measure that is additive
for comonotonic risks, such as Value-at-Risk and the wider class of distortion
risk measures (see e.g. Dhaene et al. (2004)).
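The additivity of the quantile risk measure for comonotonic risks is easy to verify numerically. The sketch below uses an arbitrary toy example (three lognormal terms driven by one common normal variable; all parameter values are illustrative), not the reserve model of this section:

```python
import numpy as np
from statistics import NormalDist

q95 = NormalDist().inv_cdf(0.95)

# Comonotonic lognormal terms X_i = exp(mu_i + sigma_i * Z) with a common Z.
mus    = np.array([0.0, 0.2, 0.5])
sigmas = np.array([0.3, 0.2, 0.4])

# Additivity: VaR_p of the comonotonic sum = sum of the marginal VaR_p's.
var_formula = float(np.sum(np.exp(mus + sigmas * q95)))

# Empirical check on a comonotonic sample driven by one common normal.
rng = np.random.default_rng(1)
z = rng.standard_normal(200_000)
s = np.exp(mus[:, None] + sigmas[:, None] * z).sum(axis=0)
var_empirical = float(np.quantile(s, 0.95))
```

The empirical 95% quantile of the comonotonic sum agrees with the sum of the three marginal 95% quantiles up to Monte Carlo error, which is exactly the additivity property exploited by the comonotonic approximations.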
2.5 Convex bounds for scalar products of random vectors
Within the fields of finance and actuarial science one is often confronted
with the problem of determining the distribution function of a scalar product
of two random vectors of the form

    S = \sum_{i=1}^{n} X_i Y_{t_i},    (2.60)

where the nominal random payments X_i are due at fixed and known times t_i,
i = 1, . . . , n, and Y_t denotes the nominal discount factor over the interval
[0, t], t ≥ 0. This means that the amount one needs to invest at time 0
to get an amount 1 at time t is the random variable Y_t. By nominal we mean
that there is no correction for inflation. Notice that here the random vector
X = (X_1, X_2, . . . , X_n) may reflect e.g. the insurance or credit risk,
while the vector Y = (Y_{t_1}, Y_{t_2}, . . . , Y_{t_n}) represents the
financial/investment risk. If the payments X_i at time t_i are independent of
inflation, then the vectors X and Y can be assumed to be mutually independent.
On the other hand, if the payments are adjusted for inflation, the vectors X
and Y are not mutually independent anymore. Denoting the inflation factor over
the period [0, t] by Z_t, the random variable S can be rewritten as

    S = \sum_{i=1}^{n} \tilde{X}_i \tilde{Y}_{t_i},

where the real payments \tilde{X}_i and the real discount factors
\tilde{Y}_{t_i} are given by \tilde{X}_i = X_i/Z_{t_i} and
\tilde{Y}_{t_i} = Y_{t_i} Z_{t_i}. Hence, in this case S is the scalar product
of two mutually independent random vectors (\tilde{X}_1, \tilde{X}_2, . . . ,
\tilde{X}_n) and (\tilde{Y}_{t_1}, \tilde{Y}_{t_2}, . . . , \tilde{Y}_{t_n}).
For this reason the assumption of independence between the insurance risk and
the financial risk is in most cases realistic and can be efficiently deployed
to obtain various quantities describing risk within financial institutions,
e.g. discounted insurance claims or the embedded/appraisal value of a company.
Distributions of sums of the form (2.60) are often encountered in practice and
need to be analyzed thoroughly by actuaries and other practitioners involved in
the risk management process. Not only the basic summary measures (like the
first few moments) have to be computed, but also more sophisticated risk
measures which require much deeper knowledge of the underlying distributions
(e.g. the Value-at-Risk).

Unfortunately there are no analytical methods to compute distribution functions
for random variables of this form. That is why one usually has to rely on
volatile and time-consuming Monte Carlo simulations. In spite of the enormous
increase in computational power over the last few decades, computing time
remains a serious drawback of Monte Carlo simulations, especially when one is
interested in estimating very high quantiles (note that the solvency capital of
an insurance company may be determined e.g. as the 99.95% quantile, which is
extremely difficult to estimate within reasonable time by simulation methods).
In this section we propose an alternative solution. By extending the
methodology of Section 2.2 to the case of scalar products of independent
random vectors, we obtain convex upper and lower bounds for sums of the form
(2.60). As we demonstrate by means of a series of numerical illustrations, the
methodology provides an excellent framework to get accurate and easily
obtainable approximations of distribution functions for random variables of
the form (2.60).

We first give the theoretical foundations for convex lower and upper bounds in
the case of scalar products of independent random vectors. Next, we demonstrate
how to obtain the bounds for (2.60) in the convex order sense in the case when
Y follows the lognormal law. Finally, we present several applications for
discounted claim processes in a Black & Scholes setting.
2.5.1 Theoretical results
Consider sums of the form:

    S = X_1 Y_1 + X_2 Y_2 + \ldots + X_n Y_n,    (2.61)

where the random vectors X = (X_1, X_2, . . . , X_n) and
Y = (Y_1, Y_2, . . . , Y_n) are assumed to be mutually independent.
Theoretically, the techniques developed in Section 2.2 can also be applied in
this case (one can take V_j = X_j Y_j). Such an approach is however not very
practical. First of all, it is not always easy to find the marginal
distributions of V_j. Secondly, it is usually very difficult to find a suitable
conditioning random variable Λ which will be a good approximation to the whole
scalar product, taking into account the riskiness of the random vectors X and
Y simultaneously.
The following theorem provides a more suitable approach to deal with scalar
products. Before we prove the theorem we recall a helpful lemma.

Lemma 8 (Scalar products and convex order).
Assume that X = (X_1, . . . , X_n), Y = (Y_1, . . . , Y_n) and
Z = (Z_1, . . . , Z_n) are non-negative random vectors and that X is mutually
independent of the vectors Y and Z. If for all possible outcomes
x_1, . . . , x_n of X

    \sum_{i=1}^{n} x_i Y_i \leq_{cx} \sum_{i=1}^{n} x_i Z_i,

then the corresponding scalar products are ordered in the convex order sense,
i.e.

    \sum_{i=1}^{n} X_i Y_i \leq_{cx} \sum_{i=1}^{n} X_i Z_i.
Proof. Let φ be a convex function. By conditioning on X and taking the
assumptions into account, we find that

    E\left[ \varphi\Big( \sum_{i=1}^{n} X_i Y_i \Big) \right]
    = E_X\left[ E\Big[ \varphi\Big( \sum_{i=1}^{n} X_i Y_i \Big) \Big| X \Big] \right]
    \leq E_X\left[ E\Big[ \varphi\Big( \sum_{i=1}^{n} X_i Z_i \Big) \Big| X \Big] \right]
    = E\left[ \varphi\Big( \sum_{i=1}^{n} X_i Z_i \Big) \right]

holds for any convex function φ.
Theorem 11 (Bounds for scalar products of random vectors).
Consider the following sum of random variables

    S = \sum_{i=1}^{n} X_i Y_i.    (2.62)

Assume that the vectors X = (X_1, X_2, . . . , X_n) and
Y = (Y_1, Y_2, . . . , Y_n) are mutually independent. Define the following
quantities:

    S^c = \sum_{i=1}^{n} F_{X_i}^{-1}(U)\, F_{Y_i}^{-1}(V),    (2.63)

    S^l = \sum_{i=1}^{n} E[X_i | \Gamma]\, E[Y_i | \Lambda],    (2.64)

where U and V are independent standard uniform random variables, Γ is a random
variable independent of Y and Λ, and the second conditioning random variable Λ
is independent of X and Γ. Then, the following relation holds:

    S^l \leq_{cx} S \leq_{cx} S^c.
Proof. The proof is based on a multiple application of Lemma 8.

1. First, we prove that \sum_{i=1}^{n} X_i Y_i \leq_{cx} \sum_{i=1}^{n} F_{X_i}^{-1}(U) F_{Y_i}^{-1}(V).
From Theorem 8 it follows that for all possible outcomes (x_1, . . . , x_n)
of X the following inequality holds:

    \sum_{i=1}^{n} x_i Y_i \leq_{cx} \sum_{i=1}^{n} F_{x_i Y_i}^{-1}(V) = \sum_{i=1}^{n} x_i F_{Y_i}^{-1}(V).

Thus from Lemma 8 it follows immediately that
\sum_{i=1}^{n} X_i Y_i \leq_{cx} \sum_{i=1}^{n} X_i F_{Y_i}^{-1}(V). The same
reasoning can be applied to show that

    \sum_{i=1}^{n} X_i F_{Y_i}^{-1}(V) \leq_{cx} \sum_{i=1}^{n} F_{X_i}^{-1}(U) F_{Y_i}^{-1}(V).

2. In a similar way, one can show that

    \sum_{i=1}^{n} E[X_i|\Gamma] E[Y_i|\Lambda] \leq_{cx} \sum_{i=1}^{n} X_i E[Y_i|\Lambda] \leq_{cx} \sum_{i=1}^{n} X_i Y_i.
Remark 1. Notice that
\sum_{i=1}^{n} F_{X_i}^{-1}(U) F_{Y_i}^{-1}(V) \leq_{cx} \sum_{i=1}^{n} F_{X_i Y_i}^{-1}(U).
Thus the upper bound (2.63) is an improvement on the comonotonic upper bound:
it takes into account the information that the vectors X and Y are
independent.
Remark 2. One can also calculate the improved upper bound

    S^u = \sum_{i=1}^{n} F_{X_i|\Gamma}^{-1}(U)\, F_{Y_i|\Lambda}^{-1}(V),

but since the improved upper bound S^u is very close to the comonotonic upper
bound S^c and it requires much more computational time, we concentrate in this
thesis only on the lower bound S^l and the comonotonic upper bound S^c as
approximations for S.
Remark 3. Having obtained the convex upper and lower bounds, one can also
derive the moments based approximation S^m as described in Subsection 2.2.4,
i.e. by determining the distribution function as follows:

    F_{S^m}(t) = z F_{S^l}(t) + (1 - z) F_{S^c}(t),    (2.65)

where

    z = \frac{Var[S^c] - Var[S]}{Var[S^c] - Var[S^l]}.    (2.66)
2.5.2 Stop-loss premiums
The stop-loss premiums of S^c and S^l provide natural bounds for the stop-loss
premiums of the underlying scalar product of random vectors. More precisely,
one has the following relationship:

    \pi^{lb}(S, d, \Gamma, \Lambda) \leq \pi(S, d) \leq \pi^{cub}(S, d).

The values π^{cub}(S, d) and π^{lb}(S, d, Γ, Λ) can be easily computed. Below
we give the computational procedure in detail.
First, consider a sum of the form

    (S^c | U = u) = \sum_{i=1}^{n} F_{X_i}^{-1}(u)\, F_{Y_i}^{-1}(V).

It can easily be seen that this is a sum of the components of a comonotonic
vector, and hence the conditional stop-loss premiums of S^c (given U = u) can
be found, in the case that the distribution functions of the Y_i are continuous
and strictly increasing, by applying Theorem 9. The overall stop-loss premium
of S^c can then be computed by conditioning:

    \pi^{cub}(S, d) = E\left[ E\big[ (S^c - d)_+ \,|\, U \big] \right]
    = \int_0^1 \sum_{i=1}^{n} F_{X_i}^{-1}(u)\, \pi\Big( Y_i,\; F_{Y_i}^{-1}\big( F_{S^c|U=u}(d) \big) \Big)\, du.    (2.67)
In general it is more difficult to calculate stop-loss premiums for the lower
bound. However, it can be done similarly as in the case of the upper bound if
one additionally assumes that the conditioning variables Γ and Λ can be chosen
in such a way that, for any fixed γ ∈ supp(Γ), all components
E[X_i | Γ = γ] E[Y_i | Λ = λ] are non-decreasing (or equivalently
non-increasing) in λ. Then the vector

    \big( E[X_1|\Gamma=\gamma] E[Y_1|\Lambda],\; E[X_2|\Gamma=\gamma] E[Y_2|\Lambda],\; \ldots,\; E[X_n|\Gamma=\gamma] E[Y_n|\Lambda] \big)

is comonotonic and Theorem 9 can be applied. Thus, one gets

    \pi^{lb}(S, d, \Gamma, \Lambda) = E\left[ E\big[ (S^l - d)_+ \,|\, \Gamma \big] \right]
    = \int_0^1 \sum_{i=1}^{n} E\big[ X_i \,|\, \Gamma = F_\Gamma^{-1}(u) \big]\, \pi\Big( E[Y_i|\Lambda],\; F_{E[Y_i|\Lambda]}^{-1}\big( F_{S^l|\Gamma=F_\Gamma^{-1}(u)}(d) \big) \Big)\, du.    (2.68)
Hence, as soon as one can compute stop-loss premiums of Y_i and E[Y_i|Λ], one
can also compute stop-loss premiums of S^c and S^l.

Note that stop-loss premiums of the moments based approximation S^m can be
easily calculated as

    \pi^m(S, d, \Gamma, \Lambda) = z \pi^{lb}(S, d, \Gamma, \Lambda) + (1 - z) \pi^{cub}(S, d).
2.5.3 The case of log-normal discount factors
In the sequel we develop a framework for computing convex bounds for random
variables of the form

    S = \sum_{i=1}^{n} \alpha_i X_i e^{Z_i},    (2.69)

where the vectors X and Z satisfy the usual conditions (see Section 2.5.1). We
assume α_i > 0 and Z_i ∼ N(E[Z_i], σ_{Z_i}^2). In this section we consider the
problem in general, without imposing any conditions on the random variables
X_i. In particular we do not discuss the choice of the conditioning variable Γ.
The upper bound

From Theorem 11 it follows that

    S^c = \sum_{i=1}^{n} F_{X_i}^{-1}(U)\, F_{\alpha_i e^{Z_i}}^{-1}(V)
        = \sum_{i=1}^{n} F_{X_i}^{-1}(U)\, \alpha_i e^{E[Z_i] + \mathrm{sign}(\alpha_i)\sigma_{Z_i}\Phi^{-1}(V)},    (2.70)

where U and V are independent standard uniform random variables.

The cumulative distribution function of S^c can be calculated in three steps:
1. Suppose that U = u is fixed. Then from (2.70) it follows that conditional
quantiles can be computed as

    F_{S^c|U=u}^{-1}(p) = \sum_{i=1}^{n} F_{X_i}^{-1}(u)\, \alpha_i e^{E[Z_i] + \mathrm{sign}(\alpha_i)\sigma_{Z_i}\Phi^{-1}(p)};    (2.71)

2. Obviously, for any u the function given by (2.71) is continuous and
strictly increasing. Thus for any y ≥ 0 one can compute the value of the
conditional distribution function, using one of the well-known numerical
methods (e.g. Newton-Raphson), as the solution of

    \sum_{i=1}^{n} F_{X_i}^{-1}(u)\, \alpha_i e^{E[Z_i] + \mathrm{sign}(\alpha_i)\sigma_{Z_i}\Phi^{-1}(F_{S^c|U=u}(y))} = y;    (2.72)

3. The cumulative distribution function of S^c can now be derived as

    F_{S^c}(y) = \int_0^1 F_{S^c|U=u}(y)\, du.
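The three steps above can be sketched directly in code. The example below uses hypothetical parameters (n = 3, all α_i = 1, lognormal X_i), solves (2.72) by bisection rather than Newton-Raphson, and approximates the final integral with a midpoint rule; only the procedure mirrors (2.70)-(2.72):

```python
import numpy as np
from statistics import NormalDist

nd = NormalDist()

# Hypothetical example with n = 3 and alpha_i = 1: X_i lognormal(mx_i, sx_i^2)
# and Z_i ~ N(mz_i, sz_i^2).
mx, sx = np.array([0.0, 0.1, 0.2]), np.array([0.25, 0.25, 0.25])
mz, sz = np.array([-0.05, -0.10, -0.15]), np.array([0.1, 0.15, 0.2])

def q_cond(u, p):
    """Conditional quantile (2.71) of S^c given U = u (all alpha_i = 1 > 0)."""
    x = np.exp(mx + sx * nd.inv_cdf(u))            # F_{X_i}^{-1}(u)
    return float(np.sum(x * np.exp(mz + sz * nd.inv_cdf(p))))

def F_cond(u, y, tol=1e-10):
    """Solve (2.72) for F_{S^c|U=u}(y) by bisection (monotone in p)."""
    lo, hi = tol, 1.0 - tol
    if y <= q_cond(u, lo):
        return 0.0
    if y >= q_cond(u, hi):
        return 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if q_cond(u, mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def F_Sc(y, m=200):
    """Step 3: average the conditional cdf over u on a midpoint grid."""
    us = (np.arange(m) + 0.5) / m
    return float(np.mean([F_cond(u, y) for u in us]))
```

Because (2.71) is strictly increasing in p for every u, the bisection always brackets the unique root, and the outer integral is a smooth function of u, so a coarse grid already gives several digits of accuracy.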
The stop-loss premiums of the upper bound can be computed as follows. For
simplicity of notation, let us denote

    d_{u,i} = F_{\alpha_i e^{Z_i}}^{-1}\big( F_{S^c|U=u}(d) \big)
            = \alpha_i e^{E[Z_i] + \mathrm{sign}(\alpha_i)\sigma_{Z_i}\Phi^{-1}(F_{S^c|U=u}(d))}.    (2.73)

Then one has

    \pi\big( \alpha_i e^{Z_i}, d_{u,i} \big)
    = \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2}\, \Phi\big( \mathrm{sign}(\alpha_i) b_{u,i}^{(1)} \big)
    - d_{u,i}\, \Phi\big( \mathrm{sign}(\alpha_i) b_{u,i}^{(2)} \big),    (2.74)

where, using Lemma 6,

    b_{u,i}^{(1)} = \frac{E[Z_i] + \sigma_{Z_i}^2 - \ln(d_{u,i})}{\sigma_{Z_i}}, \qquad
    b_{u,i}^{(2)} = b_{u,i}^{(1)} - \sigma_{Z_i}.
Then the stop-loss premium of S^c with retention d can be computed by plugging
(2.74) into (2.67) and is given by

    \pi^{cub}(S, d) = \int_0^1 \sum_{i=1}^{n} F_{X_i}^{-1}(u)\, \pi\big( \alpha_i e^{Z_i}, d_{u,i} \big)\, du
    = \sum_{i=1}^{n} \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2} \int_0^1 F_{X_i}^{-1}(u)\, \Phi\Big( \mathrm{sign}(\alpha_i)\sigma_{Z_i} - \Phi^{-1}\big( F_{S^c|U=u}(d) \big) \Big)\, du
    - d\big( 1 - F_{S^c}(d) \big).    (2.75)
The lower bound

The computations for the lower bound are performed similarly; however, the
quality of the bound heavily depends on the choice of the conditioning random
variables. Recall that from Theorem 11 it follows that

    S^l = \sum_{i=1}^{n} E\big[ X_i | \Gamma \big]\, E\big[ \alpha_i e^{Z_i} | \Lambda \big],    (2.76)

where the first conditioning variable Γ is independent of Λ and Y and where
the second conditioning variable Λ is independent of Γ and X. In this section
the choice of Γ will not be discussed and the random variable Λ will be
assumed to be of the 'maximal variance' form (2.54):

    \Lambda = \sum_{i=1}^{n} \beta_i Z_i = \sum_{i=1}^{n} \alpha_i E[X_i]\, e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2} Z_i.    (2.77)
Under these assumptions the vectors (Z_i, Λ) have a bivariate normal
distribution. Thus, Z_i | Λ = λ is normally distributed with mean µ_{i,λ} and
variance σ_{i,λ}^2 given by

    \mu_{i,\lambda} = E[Z_i] + \frac{Cov[Z_i, \Lambda]}{Var[\Lambda]}\big( \lambda - E[\Lambda] \big)
    \quad \text{and} \quad
    \sigma_{i,\lambda}^2 = \sigma_{Z_i}^2 - \frac{Cov[Z_i, \Lambda]^2}{Var[\Lambda]}.
The lower bound (2.76) can be written out as

    S^l = \sum_{i=1}^{n} E[X_i|\Gamma]\, E\big[ \alpha_i e^{Z_i} | \Lambda \big]
        = \sum_{i=1}^{n} E[X_i|\Gamma]\, \alpha_i e^{\mu_{i,\Lambda} + \frac{1}{2}\sigma_{i,\Lambda}^2}
        = \sum_{i=1}^{n} E[X_i|\Gamma]\, \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1 - r_i^2) + \sigma_{Z_i} r_i \Phi^{-1}(U)},    (2.78)

with U a standard uniform random variable and correlations given by

    r_i = Corr(Z_i, \Lambda) = \frac{Cov[Z_i, \Lambda]}{\sigma_{Z_i}\sigma_\Lambda}
        = \frac{\sum_{j=1}^{n} \alpha_j E[X_j]\, e^{E[Z_j] + \frac{1}{2}\sigma_{Z_j}^2}\, \sigma_{Z_i Z_j}}{\sigma_{Z_i} \sqrt{\sum_{1 \leq k,l \leq n} \alpha_k \alpha_l E[X_k] E[X_l]\, e^{E[Z_k] + E[Z_l] + \frac{1}{2}(\sigma_{Z_k}^2 + \sigma_{Z_l}^2)}\, \sigma_{Z_k Z_l}}}.    (2.79)
Note that the r_i's are non-negative and that the random variable S^l is
(given a value Γ = γ) the sum of the components of a comonotonic vector. Thus
the cumulative distribution function of the lower bound S^l can be computed,
similarly to the case of the upper bound S^c, in three steps:

1. From (2.78) it follows that the conditional quantiles (given Γ = γ) can be
computed as

    F_{S^l|\Gamma=\gamma}^{-1}(p) = \sum_{i=1}^{n} E[X_i|\Gamma=\gamma]\, \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1-r_i^2) + \sigma_{Z_i} r_i \Phi^{-1}(p)};    (2.80)

2. The conditional distribution function is computed as the solution of

    \sum_{i=1}^{n} E[X_i|\Gamma=\gamma]\, \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1-r_i^2) + \sigma_{Z_i} r_i \Phi^{-1}(F_{S^l|\Gamma=\gamma}(y))} = y;    (2.81)

3. Finally, the cumulative distribution function of S^l can be derived as

    F_{S^l}(y) = \int_0^1 F_{S^l|\Gamma=F_\Gamma^{-1}(u)}(y)\, du.
The stop-loss premiums are computed as follows. Let us denote

    d_{\gamma,i} = F_{E[\alpha_i e^{Z_i}|\Lambda]}^{-1}\big( F_{S^l|\Gamma=\gamma}(d) \big)
                 = \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1-r_i^2) + \sigma_{Z_i} r_i \Phi^{-1}(F_{S^l|\Gamma=\gamma}(d))}.

Then one has

    \pi\big( E[\alpha_i e^{Z_i}|\Lambda], d_{\gamma,i} \big)
    = \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2}\, \Phi\big( \mathrm{sign}(\alpha_i) b_{\gamma,i}^{(1)} \big)
    - d_{\gamma,i}\, \Phi\big( \mathrm{sign}(\alpha_i) b_{\gamma,i}^{(2)} \big),    (2.82)

with

    b_{\gamma,i}^{(1)} = \frac{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1-r_i^2) + \sigma_{Z_i}^2 r_i^2 - \ln(d_{\gamma,i})}{\sigma_{Z_i} r_i}, \qquad
    b_{\gamma,i}^{(2)} = b_{\gamma,i}^{(1)} - \sigma_{Z_i} r_i.

Then the stop-loss premium of S^l with retention d can be computed by plugging
(2.82) into (2.68) and is given by

    \pi^{lb}(S, d, \Gamma, \Lambda) = \int_0^1 \sum_{i=1}^{n} E\big[ X_i | \Gamma = F_\Gamma^{-1}(u) \big]\, \pi\big( E[\alpha_i e^{Z_i}|\Lambda], d_{\gamma,i} \big)\, du
    = \sum_{i=1}^{n} \alpha_i e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2} \int_0^1 E\big[ X_i | \Gamma = F_\Gamma^{-1}(u) \big]\, \Phi\Big( r_i \sigma_{Z_i} - \Phi^{-1}\big( F_{S^l|\Gamma=F_\Gamma^{-1}(u)}(d) \big) \Big)\, du
    - d\big( 1 - F_{S^l}(d) \big).    (2.83)
Moments based approximations

For computing the moments based approximation as defined in (2.65), one has to
calculate the variance of S, S^l and S^c. In general the problem is easily
solvable for the upper and the lower bound. For the exact distribution it is
more difficult to find a universal solution and the problem needs to be
considered individually. In the general case one faces the problem of
computing multiple integrals, which usually requires too much computational
time.

Note that the upper and the lower bound of S, as described in Subsection
2.5.3, can be seen as special cases of the following random variable X with
general form given by

    X = \sum_{i=1}^{n} \alpha_i f_i(U) g_i(V),    (2.84)

where (α_1, α_2, . . . , α_n) is a vector of non-negative numbers, f_i(·) and
g_i(·) are non-negative functions and U and V are two independent standard
uniform random variables. Indeed, in the case of the upper bound one takes

    f_i(U) = F_{X_i}^{-1}(U) \quad \text{and} \quad g_i(V) = F_{e^{Z_i}}^{-1}(V),

and in the case of the lower bound

    f_i(U) = E[X_i|\Gamma] \quad \text{and} \quad g_i(V) = E[e^{Z_i}|\Lambda].
The variance of X in expression (2.84) can be computed as follows:

    Var[X] = E\big[ Var[X|U] \big] + Var\big[ E[X|U] \big]
    = \int_0^1 Var\Big[ \sum_{i=1}^{n} \alpha_i f_i(u) g_i(V) \Big]\, du
    + \int_0^1 \Big( E\Big[ \sum_{i=1}^{n} \alpha_i f_i(u) g_i(V) \Big] \Big)^2 du
    - \Big( \int_0^1 E\Big[ \sum_{i=1}^{n} \alpha_i f_i(u) g_i(V) \Big]\, du \Big)^2.
Thus the problem of computing the variance of X is always solvable if one is
able to compute the expectation and the variance of random variables X̃ of the
form

    \tilde{X} = \sum_{i=1}^{n} \tilde{\alpha}_i g_i(V),

for any vector of non-negative numbers (α̃_1, α̃_2, . . . , α̃_n) (here
α̃_i = α_i f_i(u)). For the comonotonic upper bound (2.70), i.e.
g_i(V) = e^{E[Z_i] + σ_{Z_i}Φ^{-1}(V)}, the variance of X̃ is given by

    Var[\tilde{X}] = \sum_{i=1}^{n} \sum_{j=1}^{n} \tilde{\alpha}_i \tilde{\alpha}_j\, e^{E[Z_i] + E[Z_j] + \frac{\sigma_{Z_i}^2 + \sigma_{Z_j}^2}{2}} \big( e^{\sigma_{Z_i}\sigma_{Z_j}} - 1 \big),

and for the lower bound (2.76), i.e.
g_i(V) = e^{E[Z_i] + \frac{1}{2}\sigma_{Z_i}^2(1-r_i^2) + \sigma_{Z_i} r_i \Phi^{-1}(V)}, by

    Var[\tilde{X}] = \sum_{i=1}^{n} \sum_{j=1}^{n} \tilde{\alpha}_i \tilde{\alpha}_j\, e^{E[Z_i] + E[Z_j] + \frac{\sigma_{Z_i}^2 + \sigma_{Z_j}^2}{2}} \big( e^{r_i r_j \sigma_{Z_i}\sigma_{Z_j}} - 1 \big).
2.6 Application: the present value of stochastic cash flows

In this section we derive convex upper and lower bounds for general discounted
cash flows of the form

    S = \sum_{i=1}^{n} X_i e^{-Y(i)},    (2.85)

where the random variables X_i denote future (non-negative) payments due at
time i and Y(t) is a stochastic process describing returns on investment in
the period (0, t).
We give explicit results for convex upper and lower bounds in three specific
cases:

(i) The vector ln X = (ln(X_1), ln(X_2), . . . , ln(X_n)) has a multivariate
normal distribution and hence the losses are log-normally distributed.

(ii) The vector X = (X_1, X_2, . . . , X_n) has a multivariate elliptical
distribution. Formally, the described methodology is valid only in the case
when X_i > 0.

(iii) The yearly payments X_i are independent and identically distributed.
2.6.1 Stochastic returns

We start with a general definition of a Gaussian process.

Definition 7 (Gaussian process).
A stochastic process (Y(t) | t ≥ 0) is called Gaussian if for any
0 < t_1 < t_2 < . . . < t_n the vector (Y(t_1), Y(t_2), . . . , Y(t_n)) has a
multivariate normal distribution.

Gaussian processes have a lot of desirable properties. They are very easy to
handle since they are completely determined by their mean and covariance
functions

    m(t) = E[Y(t)] \quad \text{and} \quad c(s,t) = Cov[Y(s), Y(t)].    (2.86)
For an introduction to Gaussian processes, see e.g. Karatzas & Shreve (1991).
The normality assumption for modelling returns on investment has been
questioned in the financial literature for the short-term setting (e.g. daily
returns; see Schoutens (2003)). In the long term, however, Gaussian models
provide a satisfactory approximation, since the Central Limit Theorem is
applicable under the reasonable assumptions of independent returns with finite
variance (some empirical evidence is provided e.g. in Cesari & Cremonini
(2003)). Therefore, in the framework of this thesis we restrict ourselves to
two simple Gaussian models for the future returns Y(t). More precisely, we
will focus on modelling returns by means of a Brownian motion with drift (the
Black & Scholes model) and an Ornstein-Uhlenbeck process. This limitation is
very convenient because it leads to closed-form formulas for convex upper and
lower bounds of future cash flows.
The Black & Scholes setting (B-SM)

We assume that a process X(t) satisfies the following stochastic differential
equation:

    dX(t) = X(t)\Big( \mu + \frac{1}{2}\sigma^2 \Big) dt + X(t)\sigma\, dW_1(t),    (2.87)

where W_1(t) denotes a standard Brownian motion. It is well known that (2.87)
has a unique solution of the form

    X(t) = X(0)\, e^{\mu t + \sigma W_1(t)},

and thus the return on investment process Y(t) = log(X(t)/X(0)) is Gaussian
with mean and covariance functions given by

    m(t) = \mu t \quad \text{and} \quad c(s,t) = \min(s,t)\,\sigma^2.

One of the most important features of the return process Y(t) is the property
of independent increments. Indeed, it is straightforward to verify that for
every 0 < s < t < u one has

    Cov\big[ Y(u) - Y(t),\; Y(t) - Y(s) \big] = 0.

For this reason we often consider the yearly rates of return

    Y_i = Y(i) - Y(i-1) \quad \text{for } i = 1, 2, \ldots    (2.88)

which are independent and normally distributed with mean µ and variance σ^2.
The Ornstein-Uhlenbeck model (O-UM)

In the Ornstein-Uhlenbeck model the return process is described as

    Y(t) = \mu t + Z(t),

where Z(t) is the solution of the following stochastic differential equation:

    dZ(t) = -a Z(t)\, dt + \sigma\, dW_1(t),

with a and σ positive constants. Then Y(t) is again Gaussian, with mean and
covariance functions given by

    m(t) = \mu t \quad \text{and} \quad c(s,t) = \frac{\sigma^2}{2a}\big( e^{-a|t-s|} - e^{-a(t+s)} \big).    (2.89)

We refer to e.g. Arnold (1974) for more details about the derivation.

Note that for a = 0 the Ornstein-Uhlenbeck process degenerates to an ordinary
Brownian motion with drift and is equivalent to the Black & Scholes setting.
When a > 0, the process Y(t) no longer has independent increments. Moreover,
it becomes mean reverting. Intuitively, the property of mean reversion means
that the process Y(t) cannot deviate too far from its mean function m(t). In
fact, the parameter a measures how strongly paths of Y(t) are attracted by the
mean function. The value a = 0 corresponds to the case where there is no
attraction, and as a consequence the increments become independent. Figure 2.1
illustrates typical sample paths of the Ornstein-Uhlenbeck model for different
values of the parameter a.
In particular, we will concentrate on the case where Y(i) is defined by one of
these models. Then the sum S in (2.85) has a clear interpretation: it is the
discounted value of future benefits X_i, with returns described by one of the
well-known Gaussian models. The input variables of the two discussed return
models are displayed in Table 2.6.
[Figure 2.1: four panels showing sample paths Y(t) for t ∈ [0, 10] of the
Ornstein-Uhlenbeck process with a) a = 0, b) a = 0.02, c) a = 0.1 and
d) a = 0.5.]

Figure 2.1: Typical paths for the Ornstein-Uhlenbeck process with mean
µ = 0.05, volatility σ = 0.07 and different values of parameter a.
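Paths like those in Figure 2.1 can be generated with the exact Gaussian transition of the Ornstein-Uhlenbeck process, so no discretization bias is introduced regardless of the step size. A minimal sketch (defaults chosen to match the figure's µ and σ; everything else is an assumption of this illustration):

```python
import numpy as np

def ou_paths(mu=0.05, sigma=0.07, a=0.1, T=10.0, steps=200, n_paths=4, seed=0):
    """Sample paths of Y(t) = mu*t + Z(t) with dZ = -a Z dt + sigma dW_1,
    using the exact transition of the Ornstein-Uhlenbeck process:
    Z(t+h) = e^{-a h} Z(t) + eps,  eps ~ N(0, sigma^2 (1 - e^{-2 a h})/(2a)).
    For a = 0 this degenerates to a Brownian motion with drift (B-SM)."""
    rng = np.random.default_rng(seed)
    h = T / steps
    decay = np.exp(-a * h)
    v_h = sigma**2 * h if a == 0 else sigma**2 * (1 - np.exp(-2 * a * h)) / (2 * a)
    z = np.zeros((n_paths, steps + 1))
    for k in range(steps):
        z[:, k + 1] = decay * z[:, k] + np.sqrt(v_h) * rng.standard_normal(n_paths)
    t = np.linspace(0.0, T, steps + 1)
    return t, mu * t + z
```

For large a the simulated deviation Z(T) settles around the stationary variance σ²/(2a), which is exactly the mean-reverting behaviour visible in panels c) and d) of the figure.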
Model   Variable        Formula
B-SM    E[Y(i)]         iµ
        Var[Y(i)]       iσ^2
        Var[Λ]          \sum_{j=1}^{n} j \beta_j^2 \sigma^2 + \sum_{1 \leq j < k \leq n} 2 j \beta_j \beta_k \sigma^2
        Cov[Y(i), Λ]    \sum_{j=1}^{n} \min(i,j) \beta_j \sigma^2
O-UM    E[Y(i)]         iµ
        Var[Y(i)]       \frac{\sigma^2}{2a}\big( 1 - e^{-2ia} \big)
        Var[Λ]          \frac{\sigma^2}{2a}\Big[ \sum_{j=1}^{n} \beta_j^2 \big( 1 - e^{-2ja} \big) + \sum_{1 \leq j < k \leq n} 2 \beta_j \beta_k \big( e^{-(k-j)a} - e^{-(j+k)a} \big) \Big]
        Cov[Y(i), Λ]    \frac{\sigma^2}{2a} \sum_{j=1}^{n} \beta_j \big( e^{-|i-j|a} - e^{-(i+j)a} \big)

Table 2.6: Input variables for returns. We take Λ = \sum_{i=1}^{n} \beta_i Y(i).
2.6.2 Lognormally distributed payments
Consider a sum of the form

    S_{LN} = \sum_{i=1}^{n} e^{N_i} e^{-Y(i)},    (2.90)

where N = (N_1, N_2, . . . , N_n) = (ln(X_1), ln(X_2), . . . , ln(X_n)) is a
normally distributed random vector with mean
µ_N = (µ_{N_1}, µ_{N_2}, . . . , µ_{N_n}) and covariance matrix
Σ_N = (\tilde{\sigma}_{ij}^N)_{1 \leq i,j \leq n}. The corresponding variances
are denoted by σ_{N_i}^2 := \tilde{\sigma}_{ii}^N.
There are two different approaches to derive convex upper and lower bounds for
S_{LN} as defined in (2.90). In the first approach the independent parts of
the scalar product are treated separately (this approach is consistent with
the methodology described in Subsections 2.5.1 and 2.5.3). In the second
approach we treat S_{LN} unidimensionally, by noticing that it can be
rewritten as

    S_{LN} = \sum_{i=1}^{n} \hat{X}_i = \sum_{i=1}^{n} e^{\hat{N}_i},    (2.91)

where N̂ = (N̂_1, N̂_2, . . . , N̂_n) = (N_1 − Y(1), N_2 − Y(2), . . . ,
N_n − Y(n)) has a multivariate normal distribution with parameters

    \mu_{\hat{N}} = (\mu_{\hat{N}_1}, \mu_{\hat{N}_2}, \ldots, \mu_{\hat{N}_n})
    \quad \text{and} \quad
    \Sigma_{\hat{N}} = \big( \sigma_{ij}^{\hat{N}} \big)_{1 \leq i,j \leq n},    (2.92)

with

    \mu_{\hat{N}_i} = \mu_{N_i} - m(i) \quad \text{and} \quad \sigma_{ij}^{\hat{N}} = \tilde{\sigma}_{ij}^N + c(i,j),

where m(·) and c(·,·) denote the mean and covariance functions of the process
Y(·), as defined in (2.86). We further use the notations
σ_{N̂_i}^2 := σ_{ii}^{N̂}, µ_i := −m(i) and σ_i^2 := c(i,i). Thus one can
derive convex upper and lower bounds of (2.91) just by adapting the
methodology described in Section 2.3.4.

Below we work out both approaches explicitly. The main advantage of the first
method is a better recognition of the dependence structure, which results in
more precise estimates (especially for the upper bound). On the other hand,
the second method is much less time-consuming because the problem is reduced
to only one dimension.
The upper bound

The upper bound can be written as

    S_{LN}^c = \sum_{i=1}^{n} e^{\mu_{N_i} + \mu_i + \sigma_{N_i}\Phi^{-1}(U) + \sigma_i \Phi^{-1}(V)},

and its distribution function can be computed as described in Subsection
2.5.3.
The lower bound

To compute the lower bound we propose to define a conditioning random variable
Γ symmetrically to the conditioning variable Λ, i.e.

    \Gamma = \sum_{i=1}^{n} E\big[ e^{-Y(i)} \big]\, e^{\mu_{N_i} + \frac{1}{2}\sigma_{N_i}^2} N_i
           = \sum_{i=1}^{n} e^{\mu_{N_i} + \mu_i + \frac{1}{2}(\sigma_{N_i}^2 + \sigma_i^2)} N_i.

The conditioning variable Λ is chosen as in (2.77), which gives after the
obvious substitution

    \Lambda = -\sum_{i=1}^{n} e^{\mu_{N_i} + \mu_i + \frac{1}{2}(\sigma_{N_i}^2 + \sigma_i^2)}\, Y(i).    (2.93)
Now the corresponding lower bound can be written as

    S_{LN}^{l_1} = \sum_{i=1}^{n} e^{\mu_{N_i} + \mu_i + \frac{1}{2}\sigma_{N_i}^2(1 - r_{N_i}^2) + \frac{1}{2}\sigma_i^2(1 - r_i^2) + \sigma_{N_i} r_{N_i} \Phi^{-1}(U) + \sigma_i r_i \Phi^{-1}(V)},

where the correlations r_i = r(−Y(i), Λ) are defined as in (2.79) and

    r_{N_i} = r(N_i, \Gamma)
            = \frac{\sum_{j=1}^{n} e^{\mu_{N_j} + \mu_j + \frac{1}{2}(\sigma_{N_j}^2 + \sigma_j^2)}\, \tilde{\sigma}_{ij}^N}{\sigma_{N_i} \sqrt{\sum_{k,l=1}^{n} e^{\mu_{N_k} + \mu_{N_l} + \mu_k + \mu_l + \frac{1}{2}(\sigma_{N_k}^2 + \sigma_{N_l}^2 + \sigma_k^2 + \sigma_l^2)}\, \tilde{\sigma}_{kl}^N}}.

Its distribution function can be computed by conditioning on U, as described
in Subsection 2.5.3.
From Remark 1 it follows that

    S_{LN}^c \leq_{cx} \sum_{i=1}^{n} F_{e^{\hat{N}_i}}^{-1}(U),

and thus we do not consider the comonotonic upper bound for (2.91). To compute
the lower bound we apply directly the results of Section 2.3.4. Therefore, we
take as conditioning random variable

    \tilde{\Lambda} = \sum_{i=1}^{n} e^{\mu_{\hat{N}_i} + \frac{1}{2}\sigma_{\hat{N}_i}^2}\, \hat{N}_i.    (2.94)
Then the lower bound is given explicitly as
l2
SLN
=
n
X
e
µN̂ + 21 σ 2 (1−r 2 )+σN̂ rN̂ Φ−1 (U )
N̂i
N̂i
i
i
i
,
i=1
where
rN̂i = r(N̂i , Λ̃) =
σN̂i
Pn
r
Pn
j=1 e
k,l=1 e
µN̂ + 12 σ 2
N̂j
j
~
N̂
σij
µN̂ +µN̂ + 12 σ 2 +σ 2
k
N̂k
l
N̂l
~
N̂
σkl
Note that in order to obtain a comonotonic lower bound one has to ensure
additionally that $r_{\hat{N}_i} > 0$ for all $i$.
Suppose that this lower bound is comonotonic. Then its quantiles are
given by a closed-form expression:
$$F^{-1}_{S^{l_2}_{LN}}(p) = \sum_{i=1}^{n}
e^{\mu_{\hat{N}_i}+\frac{1}{2}\sigma^2_{\hat{N}_i}(1-r^2_{\hat{N}_i})+\sigma_{\hat{N}_i} r_{\hat{N}_i}\Phi^{-1}(p)},$$
from which one can easily find values of the corresponding distribution
function, e.g. by means of the Newton-Raphson method.
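To make the inversion concrete, the following sketch (with illustrative parameter values playing the roles of $\mu_{\hat{N}_i}$, $\sigma_{\hat{N}_i}$ and $r_{\hat{N}_i}$; none of the numbers come from the text) evaluates the closed-form quantile function of a comonotonic lower bound and recovers the distribution function at a point $d$ by bisection, a derivative-free alternative to Newton-Raphson:

```python
import math

# Illustrative parameters (assumed, not from the text); all r[i] > 0 so the
# lower bound is comonotonic and its quantile function is strictly increasing.
mu = [0.0, 0.05, 0.10]
sigma = [0.10, 0.12, 0.15]
r = [0.9, 0.8, 0.85]

def phi_inv(p):
    # Standard normal quantile via bisection of the erf-based cdf
    # (dependency-free; accuracy is ample for this sketch).
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def quantile(p):
    # Closed-form quantile of the comonotonic lower bound:
    # sum_i exp(mu_i + 0.5*sigma_i^2*(1 - r_i^2) + sigma_i*r_i*Phi^{-1}(p))
    z = phi_inv(p)
    return sum(math.exp(m + 0.5 * s * s * (1 - ri * ri) + s * ri * z)
               for m, s, ri in zip(mu, sigma, r))

def cdf_at(d):
    # Invert the strictly increasing quantile function by bisection on p.
    lo, hi = 1e-12, 1.0 - 1e-12
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if quantile(mid) < d:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

d = quantile(0.95)
print(round(cdf_at(d), 6))  # recovers approximately 0.95
```

Bisection trades speed for robustness; Newton-Raphson would converge faster here since the derivative of the quantile function is also available in closed form.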
The moments based approximation
It is also possible to derive the moments based approximations $S^{m_1}$ and
$S^{m_2}$ as described in (2.65), since there are explicit solutions for the variances:
$$\mathrm{Var}[S_{LN}] = \sum_{i=1}^{n}\sum_{j=1}^{n}
e^{\mu_{\hat{N}_i}+\mu_{\hat{N}_j}+\frac{1}{2}(\sigma^2_{\hat{N}_i}+\sigma^2_{\hat{N}_j})}
\Big(e^{\sigma^{\vec{\hat{N}}}_{ij}}-1\Big),$$
$$\mathrm{Var}[S^{c}_{LN}] = \sum_{i=1}^{n}\sum_{j=1}^{n}
e^{\mu_{\hat{N}_i}+\mu_{\hat{N}_j}+\frac{1}{2}(\sigma^2_{\hat{N}_i}+\sigma^2_{\hat{N}_j})}
\Big(e^{\sigma_{N_i}\sigma_{N_j}+\sigma_i\sigma_j}-1\Big),$$
$$\mathrm{Var}[S^{l_1}_{LN}] = \sum_{i=1}^{n}\sum_{j=1}^{n}
e^{\mu_{\hat{N}_i}+\mu_{\hat{N}_j}+\frac{1}{2}(\sigma^2_{\hat{N}_i}+\sigma^2_{\hat{N}_j})}
\Big(e^{r_{N_i}r_{N_j}\sigma_{N_i}\sigma_{N_j}+r_ir_j\sigma_i\sigma_j}-1\Big),$$
$$\mathrm{Var}[S^{l_2}_{LN}] = \sum_{i=1}^{n}\sum_{j=1}^{n}
e^{\mu_{\hat{N}_i}+\mu_{\hat{N}_j}+\frac{1}{2}(\sigma^2_{\hat{N}_i}+\sigma^2_{\hat{N}_j})}
\Big(e^{r_{\hat{N}_i}r_{\hat{N}_j}\sigma_{\hat{N}_i}\sigma_{\hat{N}_j}}-1\Big).$$
After obvious substitutions in formulas (2.75) and (2.83) one gets the following expressions for the stop-loss premiums in the first approach:
$$\pi^{cub}(S_{LN}, d) = \sum_{i=1}^{n} e^{\mu_i+\frac{1}{2}\sigma^2_i}
\int_0^1 e^{\mu_{N_i}+\sigma_{N_i}\Phi^{-1}(u)}\,
\Phi\Big(\sigma_i - \Phi^{-1}\big(F_{S^{c}_{LN}|U=u}(d)\big)\Big)\, du
- d\big(1 - F_{S^{c}_{LN}}(d)\big),$$
$$\pi^{lb_1}(S_{LN}, d, \Gamma, \Lambda) = \sum_{i=1}^{n} e^{\mu_i+\frac{1}{2}\sigma^2_i}
\int_0^1 e^{\mu_{N_i}+\frac{1}{2}(1-r^2_{N_i})\sigma^2_{N_i}+r_{N_i}\sigma_{N_i}\Phi^{-1}(u)}\,
\Phi\Big(r_i\sigma_i - \Phi^{-1}\big(F_{S^{l_1}_{LN}|\Gamma=F^{-1}_{\Gamma}(u)}(d)\big)\Big)\, du
- d\big(1 - F_{S^{l_1}_{LN}}(d)\big).$$
In the second approach the expression for the stop-loss premium of the lower
bound follows straightforwardly from (2.40):
$$\pi^{lb_2}(S_{LN}, d, \Lambda) = \sum_{i=1}^{n} e^{\mu_{\hat{N}_i}+\frac{1}{2}\sigma^2_{\hat{N}_i}}\,
\Phi\Big(r_{\hat{N}_i}\sigma_{\hat{N}_i} - \Phi^{-1}\big(F_{S^{l_2}_{LN}}(d)\big)\Big)
- d\big(1 - F_{S^{l_2}_{LN}}(d)\big).$$
Finally, the corresponding stop-loss premiums for the moments based approximations are given by
$$\pi^{m_1}(S_{LN}, d) = z_1\,\pi^{lb_1}(S_{LN}, d) + (1-z_1)\,\pi^{cub}(S_{LN}, d),$$
$$\pi^{m_2}(S_{LN}, d) = z_2\,\pi^{lb_2}(S_{LN}, d) + (1-z_2)\,\pi^{cub}(S_{LN}, d),$$
      p     S^{l1}_{LN}  S^{l2}_{LN}  S^{m1}_{LN}  S^{m2}_{LN}  S^{c}_{LN}   MC (s.e. ×10^4)
    0.75      14.6818      14.6822      14.6847      14.6839     15.0295     14.6795 (0.71)
    0.90      17.0976      17.1024      17.1067      17.1078     18.0976     17.1019 (1.06)
    0.95      18.7642      18.7723      18.7788      18.7815     20.2580     18.7769 (1.45)
    0.975     20.3631      20.3753      20.3843      20.3882     22.3610     20.3881 (2.08)
    0.995     23.9603      23.9823      24.0032      24.0082     27.1914     24.0237 (4.59)

Table 2.7: Approximations for some selected quantiles with probability
level p of S_{LN}.
where
$$z_1 = \frac{\mathrm{Var}[S^{c}_{LN}] - \mathrm{Var}[S_{LN}]}{\mathrm{Var}[S^{c}_{LN}] - \mathrm{Var}[S^{l_1}_{LN}]}
\qquad \text{and} \qquad
z_2 = \frac{\mathrm{Var}[S^{c}_{LN}] - \mathrm{Var}[S_{LN}]}{\mathrm{Var}[S^{c}_{LN}] - \mathrm{Var}[S^{l_2}_{LN}]}.$$
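Once the three variances are available the weight is immediate; a minimal sketch with assumed (purely illustrative) variance values:

```python
def moment_weight(var_s, var_upper, var_lower):
    # z is chosen so that z*Var[S^l] + (1-z)*Var[S^c] = Var[S],
    # i.e. the convex combination matches the variance of S.
    return (var_upper - var_s) / (var_upper - var_lower)

# Illustrative values (assumed) satisfying Var[S^l] <= Var[S] <= Var[S^c]:
z1 = moment_weight(var_s=10.279, var_upper=12.35, var_lower=10.21)
assert 0.0 <= z1 <= 1.0
print(round(z1, 4))
```

The convex-order relation $S^{l} \le_{cx} S \le_{cx} S^{c}$ guarantees the ordering of the variances, so the weight always lies in $[0, 1]$.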
A numerical illustration
We examine the accuracy and efficiency of the derived approximations for
the present value of a cash flow with lognormally distributed payments.
For the purpose of this numerical illustration we choose the parameters
$\mu_{N_i} = -\frac{\ln(1.01)}{2}$ and $\sigma^2_{N_i} = \ln(1.01)$ (note that these values correspond to
$E[X] = 1$ and $\mathrm{Var}[X] = 0.01$). Moreover, we allow for some dependencies between
the payments by imposing correlations between the normal exponents:
$$r(N_i, N_j) = \begin{cases} 1 & \text{if } i = j, \\ 0.5 & \text{if } |i-j| = 1, \\ 0.2 & \text{if } |i-j| = 2, \\ 0 & \text{if } |i-j| > 2. \end{cases}$$
We restrict ourselves to the case of a Black & Scholes setting with drift
µ = 0.05 and volatility σ = 0.1. We compare the distribution functions of
the upper bound $S^{c}_{LN}$ and of the lower bounds $S^{l_1}_{LN}$ (obtained by taking two
conditioning random variables) and $S^{l_2}_{LN}$ (with one conditioning variable)
with the original distribution function of $S_{LN}$, obtained by means of a
Monte Carlo (MC) simulation based on generating 500 × 100 000 sample
paths.
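For intuition, such a simulation can be sketched as follows. This is a scaled-down version (one batch of 100 000 paths instead of 500, and the horizon n = 20 and RNG seed are assumptions for illustration); the payment parameters and the banded correlation of the exponents are those given above:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20                      # number of yearly payments (assumed horizon)
mu_d, sigma_d = 0.05, 0.10  # Black & Scholes drift and volatility
mu_N = -np.log(1.01) / 2    # lognormal payment parameters: E[X]=1, Var[X]=0.01
s2_N = np.log(1.01)

# Banded correlation matrix for the normal exponents N_i.
R = np.eye(n)
for i in range(n):
    for j in range(n):
        if abs(i - j) == 1:
            R[i, j] = 0.5
        elif abs(i - j) == 2:
            R[i, j] = 0.2

L = np.linalg.cholesky(s2_N * R)

n_sim = 100_000
N = mu_N + rng.standard_normal((n_sim, n)) @ L.T        # correlated exponents
X = np.exp(N)                                           # lognormal payments
B = np.cumsum(rng.standard_normal((n_sim, n)), axis=1)  # Brownian motion at t=1..n
t = np.arange(1, n + 1)
Y = (mu_d - sigma_d**2 / 2) * t + sigma_d * B           # return process Y(i)
S = (X * np.exp(-Y)).sum(axis=1)                        # present value S_LN

print(np.quantile(S, [0.75, 0.90, 0.95, 0.975, 0.995]))
```

In the text the whole exercise is repeated 500 times to attach a standard error to each estimated quantile; a single batch as above gives only the point estimates.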
Table 2.7 illustrates the performance of the different approximations.
One can see that the upper bound $S^{c}_{LN}$ gives a poor approximation. The
main reason for this is the relatively weak dependence between the payments,
      d     S^{l1}_{LN}  S^{l2}_{LN}  S^{m1}_{LN}  S^{m2}_{LN}  S^{c}_{LN}   MC (s.e. ×10^4)
      0      12.8928      12.8928      12.8928      12.8928     12.8928     12.8931 (4.37)
      5       7.8928       7.8928       7.8928       7.8928      7.8931      7.8931 (4.37)
     10       3.0854       3.0856       3.0871       3.0866      3.2521      3.0870 (4.11)
     15       0.5589       0.5602       0.5615       0.5618      0.8216      0.5613 (2.14)
     20       0.0658       0.0663       0.0668       0.0669      0.1647      0.0672 (0.72)
     25       0.0070       0.0071       0.0072       0.0072      0.0315      0.0074 (0.25)
     30       0.0008       0.0008       0.0008       0.0008      0.0062      0.0008 (0.08)

Table 2.8: Approximations for some selected stop-loss premiums with
retention d of S_{LN}.
for which the comonotonic approximation significantly overestimates the
tails. On the other hand, both lower bounds $S^{l_1}_{LN}$ and $S^{l_2}_{LN}$ give excellent
approximations. One may be especially surprised by the performance
of the second lower bound — it turns out that the results obtained with
one conditioning random variable are no less accurate than those obtained
with two conditioning random variables. In the table we also include the
two moments based approximations $S^{m_1}_{LN}$ and $S^{m_2}_{LN}$, which perform
excellently as well.
Finally, the stop-loss premiums for the different approximations are
compared in Table 2.8. This study confirms the high accuracy of the
lower bounds and moments based approximations, which are very close to
the Monte Carlo estimates. The overestimation of the stop-loss premiums
provided by the convex upper bound is considerable.
2.6.3 Elliptically distributed payments
The class of elliptical distributions is a natural extension of the normal
law. We say that a random vector $\vec{X} = (X_1, X_2, \ldots, X_n)$ has an $n$-dimensional
elliptical distribution with parameters $\vec{\mu} = (\mu_1, \mu_2, \ldots, \mu_n)$,
$\Sigma = (\sigma_{ij})_{1\le i,j\le n}$ (a symmetric and positive definite matrix) and characteristic
generator $\phi(\cdot)$, if the characteristic function of $\vec{X}$ is given by
$$\varphi_{\vec{X}}(\vec{t}\,) = e^{i\vec{t}^{\,\prime}\vec{\mu}}\,\phi\big(\vec{t}^{\,\prime}\Sigma\,\vec{t}\,\big).$$
We write $\vec{X} \sim E_n(\vec{\mu}, \Sigma, \phi)$. Obviously the normal distribution satisfies this
definition, with $\phi(y) = e^{-\frac{1}{2}y}$. Elliptical distributions are very useful for
several reasons. First of all they are very easy to manipulate because they
inherit surprisingly many properties from the normal law. On the other
hand, the normal distribution is not very flexible in modelling tails (in practice we often encounter much heavier tails than Gaussian ones). The
class of elliptical laws offers a full variety of distributions, from very
heavy-tailed ones (like the Cauchy or stable distributions) and distributions with
polynomial-type tails (Student's t), through the exponentially-tailed
Laplace and logistic distributions, to the light-tailed Gaussian distribution.
Below we give a brief overview of the properties of elliptical distributions. For more information we refer to Fang
et al. (1990). The generalization of some of the results on comonotonic
bounds for $\sum_{i=1}^{n} X_i$ to the multivariate elliptical case can be found in
Valdez & Dhaene (2004).
1. $E[X_i] = \mu_i$, $\mathrm{Var}[X_i] = -2\phi'(0)\sigma_{ii}$ and $\mathrm{Cov}[X_i, X_j] = -2\phi'(0)\sigma_{ij}$, provided the corresponding moments exist. Here, $\phi'(\cdot)$ is the first derivative of the characteristic generator $\phi(\cdot)$.
2. Let $\vec{Y} = A\vec{X} + \vec{b}$, where $A$ denotes an $m \times n$ matrix and $\vec{b}$ is a vector in $\mathbb{R}^m$. Then $\vec{Y} \sim E_m\big(A\vec{\mu}+\vec{b},\, A\Sigma A',\, \phi\big)$;
3. If the density function $f_{\vec{X}}(\cdot)$ exists, it is given by the formula
$$f_{\vec{X}}(\vec{x}) = \frac{c}{\sqrt{\det \Sigma}}\, g\big((\vec{x}-\vec{\mu})'\Sigma^{-1}(\vec{x}-\vec{\mu})\big)$$
for any non-negative function $g$ satisfying
$$0 < \int_0^{\infty} z^{\frac{n}{2}-1} g(z)\,dz < \infty$$
and $c$ being a normalizing constant. The function $g(\cdot)$ is called the
density generator of the distribution $E_n(\vec{\mu}, \Sigma, \phi)$. A detailed proof
of these results, using spherical transformations of rectangular coordinates, can be found in Landsman & Valdez (2002).
4. Let $\vec{X} = (\vec{X}_1, \vec{X}_2)$ denote an $E_{n+m}(\vec{\mu}, \Sigma, \phi)$-random vector, where
$\vec{\mu} = (\vec{\mu}_1, \vec{\mu}_2)$ and
$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}.$$
Then, given conditionally that $\vec{X}_2 = \vec{x}_2$, the vector $\vec{X}_1$ has the
$E_n(\vec{\mu}_{1|2}, \Sigma_{11|2}, \phi_{\vec{x}_2})$-distribution with parameters given by
$$\vec{\mu}_{1|2} = \vec{\mu}_1 + \Sigma_{12}\Sigma^{-1}_{22}(\vec{x}_2-\vec{\mu}_2)
\qquad \text{and} \qquad
\Sigma_{11|2} = \Sigma_{11} - \Sigma_{12}\Sigma^{-1}_{22}\Sigma_{21}.$$
Notice that in general (unlike in the normal case) the characteristic
generator of the conditional distribution is not known explicitly and
depends on the value of $\vec{x}_2$.
Consider now sums of the form
$$S_{el} = \sum_{i=1}^{n} X_i e^{-Y(i)},$$
where the return process $Y(t)$ is, as in the previous example, described
by the Black & Scholes model and $\vec{X} = (X_1, X_2, \ldots, X_n)$ is elliptically
distributed with parameters $\vec{\mu}_{\vec{X}} = (\mu_{X_1}, \mu_{X_2}, \ldots, \mu_{X_n})$,
$\Sigma_{\vec{X}} = (\sigma^{\vec{X}}_{ij})_{1\le i,j\le n}$
and characteristic generator $\phi(\cdot)$. Here we note only that for $\phi(u) = e^{-\frac{u}{2}}$
one gets a multivariate normal distribution with mean vector $\vec{\mu}_{\vec{X}}$ and
covariance matrix $\Sigma_{\vec{X}}$.
Note that elliptical random variables take both positive and negative
values and therefore one cannot immediately apply Theorem 11. We
propose to consider pragmatically only the cases where the probability
$\Pr[X_i < 0]$ is very small. This can be achieved by choosing the parameters
in such a way that $\frac{\mu_{X_i}}{\sigma_{X_i}}$ is much larger than 0, where we use the conventional
notation $\sigma^2_{X_i} := \sigma^{\vec{X}}_{ii}$.
The upper bound
The computation of the upper bound is straightforward if the inverse distribution function of the specific elliptical distribution is available in the
software package. In other words, the comonotonic upper bound is given
by
$$S^{c}_{el} = \sum_{i=1}^{n} F^{-1}_{E_n(\mu_{X_i},\sigma^2_{X_i},\phi)}(U)\, e^{\mu_i+\sigma_i\Phi^{-1}(V)}, \tag{2.95}$$
where by convention $\mu_i = -m(i)$ and $\sigma^2_i = c(i,i)$, for $m(\cdot)$ and $c(\cdot,\cdot)$ denoting the mean and covariance functions of the process $Y(i)$ described
previously in this subsection.
Note that for the most interesting case of a multivariate normal distribution one gets
$$S^{c}_{N} = \sum_{i=1}^{n} \big(\mu_{X_i} + \sigma_{X_i}\Phi^{-1}(U)\big)\, e^{\mu_i+\sigma_i\Phi^{-1}(V)}.$$
The corresponding expressions for the stop-loss premiums are given by
$$\pi^{cub}(S_{el}, d) = \sum_{i=1}^{n} e^{\mu_i+\frac{1}{2}\sigma^2_i}
\int_0^1 F^{-1}_{E_n(\mu_{X_i},\sigma^2_{X_i},\phi)}(u)\,
\Phi\Big(\sigma_i - \Phi^{-1}\big(F_{S^{c}_{el}|U=u}(d)\big)\Big)\, du
- d\big(1 - F_{S^{c}_{el}}(d)\big) \tag{2.96}$$
and
$$\pi^{cub}(S_{N}, d) = \sum_{i=1}^{n} e^{\mu_i+\frac{1}{2}\sigma^2_i}
\int_0^1 \big(\mu_{X_i} + \sigma_{X_i}\Phi^{-1}(u)\big)\,
\Phi\Big(\sigma_i - \Phi^{-1}\big(F_{S^{c}_{N}|U=u}(d)\big)\Big)\, du
- d\big(1 - F_{S^{c}_{N}}(d)\big).$$
The lower bound
To compute the lower bound, we define the conditioning random variable
$\Gamma$ as follows:
$$\Gamma = \sum_{j=1}^{n} E\big[e^{-Y(j)}\big]\, X_j = \sum_{j=1}^{n} e^{\mu_j+\frac{1}{2}\sigma^2_j}\, X_j.$$
Then the random vector $(X_i, \Gamma)$ has a bivariate elliptical distribution with
parameters $\vec{\mu}_{\Gamma,i} = (\mu_{X_i}, \mu_{\Gamma})$ and $\Sigma_{\Gamma,i} = (\sigma^{\Gamma,i}_{kl})_{1\le k,l\le 2}$, where
$$\mu_{\Gamma} = \sum_{j=1}^{n} e^{\mu_j+\frac{1}{2}\sigma^2_j}\,\mu_{X_j}, \qquad
\sigma^{\Gamma,i}_{11} := \sigma^2_{X_i}, \qquad
\sigma^{\Gamma,i}_{12} = \sigma^{\Gamma,i}_{21} = \sum_{j=1}^{n} e^{\mu_j+\frac{1}{2}\sigma^2_j}\,\sigma^{\vec{X}}_{ij}$$
and
$$\sigma^2_{\Gamma} := \sigma^{\Gamma,i}_{22} = \sum_{j=1}^{n}\sum_{k=1}^{n}
e^{\mu_j+\mu_k+\frac{1}{2}(\sigma^2_j+\sigma^2_k)}\,\sigma^{\vec{X}}_{jk}.$$
From property (4) of elliptical distributions it follows that — given
$\Gamma = \gamma$ — the r.v. $X_i$ is elliptically distributed with parameters
$$\mu_{X_i,\Gamma} = \mu_{X_i} + \frac{\sigma^{\Gamma,i}_{12}}{\sigma^2_{\Gamma}}\big(\Gamma - \mu_{\Gamma}\big), \qquad
\sigma^2_{X_i,\Gamma} = \sigma^2_{X_i} - \frac{\big(\sigma^{\Gamma,i}_{12}\big)^2}{\sigma^2_{\Gamma}} \tag{2.97}$$
and an unknown characteristic generator $\phi_a(\cdot)$ depending on $a = \frac{(\Gamma-\mu_{\Gamma})^2}{\sigma^2_{\Gamma}}$
(recall that in the multivariate normal case the conditional distribution remains normal). Note that in our application it does not really
matter that the characteristic generator $\phi_a(\cdot)$ is not known — it suffices
to notice that
$$E[X_i \,|\, \Gamma] = \mu_{X_i,\Gamma} = \mu_{X_i} + \frac{\sigma^{\Gamma,i}_{12}}{\sigma^2_{\Gamma}}\big(\Gamma - \mu_{\Gamma}\big).$$
The second conditioning random variable is chosen analogously to (2.93):
$$\Lambda = -\sum_{i=1}^{n} E[X_i]\, e^{\mu_i+\frac{1}{2}\sigma^2_i}\, Y(i)
= -\sum_{i=1}^{n} \mu_{X_i}\, e^{\mu_i+\frac{1}{2}\sigma^2_i}\, Y(i).$$
From Section 2.5.1 it follows that the lower bound is given by the following
expression:
$$S^{l}_{el} = \sum_{i=1}^{n}
\Big(\mu_{X_i} + \frac{\sigma^{\Gamma,i}_{12}}{\sigma^2_{\Gamma}}\big(F^{-1}_{\Gamma}(U) - \mu_{\Gamma}\big)\Big)\,
e^{\mu_i+\frac{1}{2}\sigma^2_i(1-r^2_i)+r_i\sigma_i\Phi^{-1}(V)}, \tag{2.98}$$
where the correlations $r_i = r(-Y(i), \Lambda)$ are defined as in (2.79) (with $E[X_i]$
substituted by $\mu_{X_i}$). Note that expression (2.98) simplifies in the normal
case to
$$S^{l}_{N} = \sum_{i=1}^{n} \big(\mu_{X_i} + r_{X_i}\sigma_{X_i}\Phi^{-1}(U)\big)\,
e^{\mu_i+\frac{1}{2}\sigma^2_i(1-r^2_i)+r_i\sigma_i\Phi^{-1}(V)}$$
with
$$r_{X_i} = r(X_i, \Gamma) =
\frac{\sum_{j=1}^{n} \mu_{X_j}\, e^{\mu_j+\frac{1}{2}\sigma^2_j}\,\sigma^{\vec{X}}_{ij}}
{\sigma_{X_i}\sqrt{\sum_{k,l=1}^{n} \mu_{X_k}\mu_{X_l}\, e^{\mu_k+\mu_l+\frac{1}{2}(\sigma^2_k+\sigma^2_l)}\,\sigma^{\vec{X}}_{kl}}}.$$
Finally, the corresponding stop-loss premiums are computed according to
the following expressions:
$$\pi^{lb}(S_{el}, d, \Gamma, \Lambda) = \sum_{i=1}^{n} e^{\mu_i+\frac{1}{2}\sigma^2_i}
\int_0^1 \Big(\mu_{X_i} + \frac{\sigma^{\Gamma,i}_{12}}{\sigma^2_{\Gamma}}\big(F^{-1}_{\Gamma}(u) - \mu_{\Gamma}\big)\Big)\,
\Phi\Big(r_i\sigma_i - \Phi^{-1}\big(F_{S^{l}_{el}|\Gamma=F^{-1}_{\Gamma}(u)}(d)\big)\Big)\, du
- d\big(1 - F_{S^{l}_{el}}(d)\big),$$
$$\pi^{lb}(S_{N}, d, \Gamma, \Lambda) = \sum_{i=1}^{n} e^{\mu_i+\frac{1}{2}\sigma^2_i}
\int_0^1 \big(\mu_{X_i} + r_{X_i}\sigma_{X_i}\Phi^{-1}(u)\big)\,
\Phi\Big(r_i\sigma_i - \Phi^{-1}\big(F_{S^{l}_{N}|\Gamma=F^{-1}_{\Gamma}(u)}(d)\big)\Big)\, du
- d\big(1 - F_{S^{l}_{N}}(d)\big).$$
The moments based approximation
It is also possible to find the moments based approximation $S^{m}_{N}$ from
formula (2.65), since one can compute the variance of $S_N$ as
$$\mathrm{Var}[S_N] = E_{\vec{X}}\big[\mathrm{Var}[S_N \,|\, \vec{X}]\big] + \mathrm{Var}_{\vec{X}}\big[E[S_N \,|\, \vec{X}]\big]$$
$$= E_{\vec{X}}\Big[\sum_{i=1}^{n}\sum_{j=1}^{n} X_i X_j\, e^{\mu_i+\mu_j+\frac{1}{2}(\sigma^2_i+\sigma^2_j)}\big(e^{\sigma_{ij}}-1\big)\Big]
+ \mathrm{Var}_{\vec{X}}\Big[\sum_{i=1}^{n} X_i\, e^{\mu_i+\frac{1}{2}\sigma^2_i}\Big]$$
$$= \sum_{i=1}^{n}\sum_{j=1}^{n} \big(\sigma^{\vec{X}}_{ij} + \mu_{X_i}\mu_{X_j}\big)\,
e^{\mu_i+\mu_j+\frac{1}{2}(\sigma^2_i+\sigma^2_j)+\sigma_{ij}}
- \sum_{i=1}^{n}\sum_{j=1}^{n} \mu_{X_i}\mu_{X_j}\,
e^{\mu_i+\mu_j+\frac{1}{2}(\sigma^2_i+\sigma^2_j)}.$$
Here, the variances of the upper and the lower bound are computed as
explained in Section 2.5.3.
We remark that for $\vec{X}$ having a multivariate elliptical distribution the
computations are almost identical, the only difference being the formula
for the covariances
$$\mathrm{Cov}[X_i, X_j] = -2\phi'(0)\,\sigma^{\vec{X}}_{ij}.$$
Then the stop-loss premium of the moments based approximation is
obtained as a convex combination
$$\pi^{m}(S_{el}, d, \Gamma, \Lambda) = z\,\pi^{lb}(S_{el}, d, \Gamma, \Lambda) + (1-z)\,\pi^{cub}(S_{el}, d),$$
where z is defined as in (2.66).
A numerical illustration
We study the case of normally distributed payments with mean $\mu_{X_i} = 1$
and variance $\sigma^2_{X_i} = 0.01$. Note that the mean and the variance are the same
as in the lognormal case. Moreover, we assume the following correlation
pattern for the payments:
$$r(X_i, X_j) = \begin{cases} 1 & \text{if } i = j, \\ 0.5 & \text{if } |i-j| = 1, \\ 0.2 & \text{if } |i-j| = 2, \\ 0 & \text{if } |i-j| > 2. \end{cases}$$
As in the previous example, we work in the Black & Scholes setting with
drift parameter µ = 0.05 and volatility σ = 0.1. We compare the performances
of the lower bound $S^{l}_{N}$, the upper bound $S^{c}_{N}$ and the moments
based approximation $S^{m}_{N}$ with the real distribution of the present value
$S_N$, obtained by a Monte Carlo simulation (MC) based on 500 × 100 000
simulated paths.
The performance of the approximations is illustrated by the numerical
values of some upper quantiles displayed in Table 2.9. The same conclusions
can be drawn as in the lognormal case — the upper bound $S^{c}_{N}$ gives
a quite poor approximation, while the lower bound $S^{l}_{N}$ and the moments
based approximation perform excellently.
The study of stop-loss premiums in Table 2.10 confirms this observation.
      p      S^{l}_{N}    S^{m}_{N}    S^{c}_{N}    MC (s.e. ×10^3)
    0.75      14.6820      14.6849      15.0368     14.6820 (0.70)
    0.90      17.0978      17.1068      18.0992     17.1025 (1.02)
    0.95      18.7642      18.7787      20.2522     18.7789 (1.46)
    0.975     20.3630      20.3840      22.3456     20.3895 (2.11)
    0.995     23.9599      24.0020      27.1468     24.0354 (4.61)

Table 2.9: Approximations for some selected quantiles with probability
level p of S_N.
      d      S^{l}_{N}    S^{m}_{N}    S^{c}_{N}    MC (s.e. ×10^4)
      0      12.8928      12.8928      12.8928     12.8923 (4.50)
      5       7.8928       7.8929       7.8931      7.8923 (4.50)
     10       3.0855       3.0872       3.2544      3.0863 (4.16)
     15       0.5589       0.5615       0.8213      0.5610 (2.11)
     20       0.0658       0.0668       0.1636      0.0671 (0.74)
     25       0.0070       0.0072       0.0309      0.0073 (0.25)
     30       0.0008       0.0008       0.0060      0.0008 (0.08)

Table 2.10: Approximations for some selected stop-loss premiums with
retention d of S_N.
2.6.4 Independent and identically distributed payments
Finally, we consider the case where the payments $X_i$ are independent and
identically distributed. The independence assumption allows for more
flexibility in modelling the underlying marginal distributions; however —
unlike in the lognormal and elliptical cases — it imposes a rigid condition
on the dependence structure. We start by defining the class of tempered
stable distributions, for which the methodology works particularly efficiently.
Tempered stable distributions
The tempered stable law $TS(\delta, a, b)$, for $a, b > 0$ and $0 < \delta < 1$, is a
one-dimensional distribution given by the characteristic function
$$\varphi_{TS}(t; \delta, a, b) = e^{ab - a\left(b^{1/\delta} - 2it\right)^{\delta}}. \tag{2.99}$$
For more details we refer to e.g. Schoutens (2003). This class of distributions has the special property that the sum of independent and identically
distributed tempered stable random variables is again tempered stable.
This is formalized in the following lemma:
Lemma 9 (Sum of tempered stable random variables).
If $X_i$, $i = 1, 2, \ldots, n$, are i.i.d. $TS(\kappa, a, b)$-distributed random variables,
then their sum $X_1 + X_2 + \cdots + X_n$ is $TS(\kappa, na, b)$-distributed.
Proof. Consider the corresponding characteristic functions. We get
$$\varphi_{X_1+X_2+\cdots+X_n}(t) = \big(\varphi_{TS}(t;\kappa,a,b)\big)^n
= e^{(na)b-(na)\left(b^{1/\kappa}-2it\right)^{\kappa}}
= \varphi_{TS}(t;\kappa,na,b).$$
The first two moments of a random variable $X \sim TS(\delta, a, b)$ are given by
$E[X] = 2a\delta b^{\frac{\delta-1}{\delta}}$ and $\mathrm{Var}[X] = 4a\delta(1-\delta)\, b^{\frac{\delta-2}{\delta}}$.
In the sequel we provide more details about two well-known special
cases: the gamma distribution and the inverse Gaussian distribution.
The gamma distribution $Gamma(a, b)$ corresponds to the limiting case
$\delta \to 0$. The characteristic function of the gamma distribution is given by
$$\varphi(t; a, b) = \Big(1 - \frac{it}{b}\Big)^{-a}.$$
Notice that for $X \sim Gamma(a, b)$ one has $E[X] = \frac{a}{b}$ and $\mathrm{Var}[X] = \frac{a}{b^2}$.
The inverse Gaussian distribution is a member of the class of tempered
stable distributions with $\delta = \frac{1}{2}$. Thus, the characteristic function is given
by
$$\varphi(t; a, b) = e^{-a\left(\sqrt{-2it+b^2}-b\right)}.$$
Moreover, the mean and variance of $X \sim IG(a, b)$ are given by $E[X] = \frac{a}{b}$
and $\mathrm{Var}[X] = \frac{a}{b^3}$.
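The closure property of Lemma 9 can be checked by simulation in the gamma limiting case (standard library only; the parameter values are chosen for illustration): for i.i.d. $Gamma(a, b)$ terms the sum is $Gamma(na, b)$, with mean $na/b$ and variance $na/b^2$.

```python
import random
import statistics

random.seed(42)
a, b, n = 100.0, 100.0, 5   # Gamma(a, b): shape a, rate b; n i.i.d. terms

# By the gamma case of Lemma 9, X_1 + ... + X_n ~ Gamma(n*a, b),
# so the sum has mean n*a/b = 5.0 and variance n*a/b**2 = 0.05.
sums = [sum(random.gammavariate(a, 1.0 / b) for _ in range(n))
        for _ in range(20_000)]

print(round(statistics.mean(sums), 2))      # close to 5.0
print(round(statistics.variance(sums), 3))  # close to 0.05
```

Note that `random.gammavariate` takes a scale parameter, hence the `1.0 / b` for a rate-b gamma.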
We consider now sums of the form
$$S_{ind} = \sum_{i=1}^{n} X_i e^{-Y(i)}, \tag{2.100}$$
where the process $Y(i)$ is defined as in the previous examples and the
payments $X_i$ are independent and follow the law defined by the cdf $F_X(\cdot)$.
The upper bound
The computation of the upper bound is straightforward (as described in
Section 2.5.3):
$$S^{c}_{ind} = F^{-1}_X(U) \sum_{i=1}^{n} e^{\mu_i+\sigma_i\Phi^{-1}(V)}. \tag{2.101}$$
The stop-loss premiums for the upper bound are given by an expression
analogous to (2.75), with $S^c$ replaced by $S^{c}_{ind}$.
The lower bound
To compute the lower bound, we start by defining the conditioning random variables $\Gamma$ and $\Lambda$. Let
$$\Gamma = X_1 + X_2 + \cdots + X_n.$$
If we know the distributions of the $X_i$, the distribution of the sum $\Gamma$ is also
known. In particular, for gamma distributed $X_i$ the sum $\Gamma$ remains gamma
distributed, and the same holds for inverse Gaussian distributed $X_i$.
As in the previous examples, the conditioning random variable $\Lambda$ is
chosen as
$$\Lambda = -\sum_{i=1}^{n} E[X_i]\, e^{\mu_i+\frac{1}{2}\sigma^2_i}\, Y(i). \tag{2.102}$$
Now the lower bound can be written as
$$S^{l}_{ind} = \frac{1}{n} F^{-1}_{\Gamma}(U) \sum_{i=1}^{n}
e^{\mu_i+\frac{1}{2}(1-r^2_i)\sigma^2_i+r_i\sigma_i\Phi^{-1}(V)},$$
where the correlations $r_i = r(-Y(i), \Lambda)$ are defined as in (2.79).
Note that the computation of the stop-loss premiums of the lower bound
is straightforward, by applying (2.83) and replacing $S^l$ by $S^{l}_{ind}$.
Cumulative distribution functions
In this case there is a more efficient method to compute the distribution
functions than the one described in Section 2.5.3.
Remark 4. The cumulative distribution function of the product $W$ of two
non-negative independent random variables $X$ and $Y$ can be written as
$$F_W(z) = \int_{-\infty}^{\infty} F_Y\Big(\frac{z}{x}\Big)\,dF_X(x)
= \int_0^1 F_Y\Big(\frac{z}{F^{-1}_X(u)}\Big)\,du. \tag{2.103}$$
Using this result one can compute the cumulative distribution functions of
the upper and the lower bound as
$$F_{S^{c}_{ind}}(y) = \int_0^1 F_X\Big(\frac{y}{F^{-1}_{\tilde{S}^c}(v)}\Big)\,dv, \qquad
F_{S^{l}_{ind}}(y) = \int_0^1 F_{\frac{1}{n}\Gamma}\Big(\frac{y}{F^{-1}_{\tilde{S}^l}(v)}\Big)\,dv,$$
where
$$\tilde{S}^c = \sum_{i=1}^{n} e^{\mu_i+\sigma_i\Phi^{-1}(V)}, \qquad
\tilde{S}^l = \sum_{i=1}^{n} e^{\mu_i+\frac{1}{2}(1-r^2_i)\sigma^2_i+r_i\sigma_i\Phi^{-1}(V)},$$
$$F^{-1}_{\tilde{S}^c}(v) = \sum_{i=1}^{n} e^{\mu_i+\sigma_i\Phi^{-1}(v)}, \qquad
F^{-1}_{\tilde{S}^l}(v) = \sum_{i=1}^{n} e^{\mu_i+\frac{1}{2}(1-r^2_i)\sigma^2_i+r_i\sigma_i\Phi^{-1}(v)}.$$
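As a sanity check of formula (2.103), the sketch below computes the product cdf by a midpoint-rule discretization of the unit-interval integral and compares it with a crude simulation, for an illustrative choice X ~ Exp(1), Y ~ Uniform(0, 1) (neither distribution is taken from the text; any non-negative pair with known cdf and quantile would do):

```python
import math
import random

random.seed(1)

def Fy(t):
    # cdf of Y ~ Uniform(0, 1)
    return max(0.0, min(t, 1.0))

def Fx_inv(u):
    # quantile function of X ~ Exp(1)
    return -math.log(1.0 - u)

def cdf_product(z, m=20_000):
    # F_W(z) = int_0^1 F_Y(z / F_X^{-1}(u)) du, midpoint rule on (0, 1).
    return sum(Fy(z / Fx_inv((k + 0.5) / m)) for k in range(m)) / m

# Crude Monte Carlo check of the same quantity.
sim = [random.expovariate(1.0) * random.random() for _ in range(200_000)]
z = 0.5
emp = sum(w <= z for w in sim) / len(sim)
print(round(cdf_product(z), 4), round(emp, 4))
```

The midpoint points $(k+0.5)/m$ avoid evaluating the quantile function at 0 or 1, where it is degenerate or infinite.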
The moments based approximation
The moments based approximation of $S_{ind}$ can be found in a similar way
to the moments based approximation for elliptical distributions. The key
step is to compute the variance of $S_{ind}$:
$$\mathrm{Var}[S_{ind}] = E_{\vec{X}}\big[\mathrm{Var}[S_{ind}\,|\,\vec{X}]\big] + \mathrm{Var}_{\vec{X}}\big[E[S_{ind}\,|\,\vec{X}]\big]$$
$$= E_{\vec{X}}\Big[\sum_{i=1}^{n}\sum_{j=1}^{n} X_iX_j\,e^{\mu_i+\mu_j+\frac{1}{2}(\sigma^2_i+\sigma^2_j)}\big(e^{\sigma_{ij}}-1\big)\Big]
+ \mathrm{Var}_{\vec{X}}\Big[\sum_{i=1}^{n} X_i\,e^{\mu_i+\frac{1}{2}\sigma^2_i}\Big]$$
$$= \sum_{i=1}^{n}\sum_{j=1}^{n} E[X_i]E[X_j]\,e^{\mu_i+\mu_j+\frac{1}{2}(\sigma^2_i+\sigma^2_j)}\big(e^{\sigma_{ij}}-1\big)
+ \sum_{i=1}^{n} \mathrm{Var}[X_i]\,e^{2\mu_i+2\sigma^2_i}. \tag{2.104}$$
      p     S^{l}_{ind}  S^{m}_{ind}  S^{c}_{ind}   MC (s.e. ×10^3)
    0.75      14.6709      14.6723      15.0320     14.6820 (0.70)
    0.90      17.0767      17.0810      18.0984     17.1025 (1.02)
    0.95      18.7372      18.7443      20.2563     18.7789 (1.46)
    0.975     20.3309      20.3412      22.3560     20.3895 (2.11)
    0.995     23.9183      23.9390      27.1762     24.0354 (4.61)

Table 2.11: Approximations for some selected quantiles with probability
level p of S_ind for gamma i.i.d. liabilities.
The variances of the upper and the lower bound are computed as explained
in Subsection 2.5.3.
Consequently, the stop-loss premium of the moments based approximation is obtained as a convex combination
$$\pi^{m}(S_{ind}, d, \Gamma, \Lambda) = z\,\pi^{lb}(S_{ind}, d, \Gamma, \Lambda) + (1-z)\,\pi^{cub}(S_{ind}, d),$$
where z is defined as in (2.66).
A numerical illustration
We consider in this application independent Gamma(100, 100) distributed
future payments. Note that this choice of parameters implies that $E[X] = 1$
and $\mathrm{Var}[X] = 0.01$, i.e. we take the same mean and variance of the liabilities
as in the lognormal and normal cases. As before, we work in a Black &
Scholes setting with drift µ = 0.05 and volatility σ = 0.1. We compare
the performances of the lower bound $S^{l}_{ind}$, the upper bound $S^{c}_{ind}$ and the
moments based approximation $S^{m}_{ind}$ with the real distribution of $S_{ind}$
obtained by a Monte Carlo simulation (MC) based on 500 × 100 000 simulated
paths.
The results are very similar to those in the normal and lognormal cases. It is
worth noticing that the variance of $S_{ind}$ (10.1489) is a bit lower than in the
lognormal case (10.2789) and in the normal case (10.2792). This is due
to the independence of the gamma payments, while we imposed a slight positive
dependence in the previous cases.
The quality of the approximations is illustrated by some upper quantiles
displayed in Table 2.11. The lower bound $S^{l}_{ind}$ and the moments
based approximation $S^{m}_{ind}$ perform well, but not as well as in the lognormal
and normal cases (probably because the conditioning random variable
      d     S^{l}_{ind}  S^{m}_{ind}  S^{c}_{ind}   MC (s.e. ×10^4)
      0      12.8928      12.8928      12.8928     12.8921 (4.44)
      5       7.8928       7.8928       7.8931      7.8921 (4.44)
     10       3.0813       3.0821       3.2528      3.0821 (4.06)
     15       0.5540       0.5553       0.8215      0.5549 (2.08)
     20       0.0647       0.0652       0.1644      0.0655 (0.77)
     25       0.0068       0.0069       0.0313      0.0071 (0.27)
     30       0.0007       0.0008       0.0061      0.0008 (0.09)

Table 2.12: Approximations for some selected stop-loss premiums with
retention d of S_ind for gamma i.i.d. liabilities.
Γ does not take discounting factors into account). The study of stop-loss
premiums in Table 2.12 goes in line with these findings.
2.7 Proofs
Upper bound based on lower bound (2.44)
In the following we shall derive an easily computable expression for (2.26).
The second expectation in the product (2.26) equals, denoting by $F_{\Lambda}(\cdot)$
the normal cumulative distribution function of $\Lambda$,
$$E\big[I_{(\Lambda<d_{\Lambda})}\big] = 0\cdot\Pr[\Lambda\ge d_{\Lambda}] + 1\cdot\Pr[\Lambda<d_{\Lambda}]
= F_{\Lambda}(d_{\Lambda}) = \Phi(d^*_{\Lambda}). \tag{2.105}$$
The first expectation in the product (2.26) can be expressed as
$$E\big[\mathrm{Var}[S\,|\,\Lambda]\, I_{(\Lambda<d_{\Lambda})}\big]
= E\big[E[S^2\,|\,\Lambda]\, I_{(\Lambda<d_{\Lambda})}\big]
- E\big[(E[S\,|\,\Lambda])^2\, I_{(\Lambda<d_{\Lambda})}\big]. \tag{2.106}$$
Now consider the second term on the right-hand side of (2.106):
$$E\big[(E[S\,|\,\Lambda])^2 I_{(\Lambda<d_{\Lambda})}\big]
= \int_{-\infty}^{d_{\Lambda}} \big(E[S\,|\,\Lambda=\lambda]\big)^2\, dF_{\Lambda}(\lambda). \tag{2.107}$$
According to (2.32), and using the notation $Z_{ij}$ introduced before, we can
express (2.107) as
$$E\big[(E[S\,|\,\Lambda])^2 I_{(\Lambda<d_{\Lambda})}\big]
= \int_{-\infty}^{d_{\Lambda}} \Big(\sum_{i=1}^{n} E[X_i\,|\,\Lambda=\lambda]\Big)^2 dF_{\Lambda}(\lambda)$$
$$= \int_{-\infty}^{d_{\Lambda}} \Big(\sum_{i=1}^{n} \alpha_i\,
e^{E[Z_i]+r_i\sigma_{Z_i}\Phi^{-1}(v)+\frac{1}{2}(1-r^2_i)\sigma^2_{Z_i}}\Big)^2 dF_{\Lambda}(\lambda)$$
$$= \int_{-\infty}^{d_{\Lambda}} \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\,
e^{E[Z_{ij}]+(r_i\sigma_{Z_i}+r_j\sigma_{Z_j})\Phi^{-1}(v)}\,
e^{\frac{1}{2}\left((1-r^2_i)\sigma^2_{Z_i}+(1-r^2_j)\sigma^2_{Z_j}\right)}\, dF_{\Lambda}(\lambda)$$
$$= \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\,
e^{E[Z_{ij}]+\frac{1}{2}\left((1-r^2_i)\sigma^2_{Z_i}+(1-r^2_j)\sigma^2_{Z_j}\right)}
\int_{-\infty}^{d_{\Lambda}} e^{(r_i\sigma_{Z_i}+r_j\sigma_{Z_j})\Phi^{-1}(v)}\, dF_{\Lambda}(\lambda). \tag{2.108}$$
Next, applying Lemma 7 to (2.108) with $a = r_i\sigma_{Z_i}+r_j\sigma_{Z_j}$ yields
$$E\big[(E[S\,|\,\Lambda])^2 I_{(\Lambda<d_{\Lambda})}\big]
= \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\,
e^{E[Z_{ij}]+\frac{1}{2}\left(\sigma^2_{Z_i}+\sigma^2_{Z_j}+2r_ir_j\sigma_{Z_i}\sigma_{Z_j}\right)}\,
\Phi\big(d^*_{\Lambda} - (r_i\sigma_{Z_i}+r_j\sigma_{Z_j})\big). \tag{2.109}$$
Now consider the first term on the right-hand side of expression (2.106),
$E\big[E[S^2\,|\,\Lambda]\, I_{(\Lambda<d_{\Lambda})}\big]$. The term $E[S^2\,|\,\Lambda]$ is given by (2.42). By applying
(2.43) with $a = r_{ij}\sigma_{Z_{ij}} = r_i\sigma_{Z_i}+r_j\sigma_{Z_j}$, and simplifying, we obtain
$$E\big[E[S^2\,|\,\Lambda]\, I_{(\Lambda<d_{\Lambda})}\big]
= \sum_{i=1}^{n}\sum_{j=1}^{n} \int_{-\infty}^{d_{\Lambda}} \alpha_i\alpha_j\,
e^{E[Z_{ij}]+r_{ij}\sigma_{Z_{ij}}\Phi^{-1}(v)+\frac{1}{2}(1-r^2_{ij})\sigma^2_{Z_{ij}}}\, dF_{\Lambda}(\lambda)$$
$$= \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\,
e^{E[Z_{ij}]+\frac{1}{2}(1-r^2_{ij})\sigma^2_{Z_{ij}}}
\int_{-\infty}^{d_{\Lambda}} e^{r_{ij}\sigma_{Z_{ij}}\Phi^{-1}(v)}\, dF_{\Lambda}(\lambda)$$
$$= \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\,
e^{E[Z_{ij}]+\frac{1}{2}(1-r^2_{ij})\sigma^2_{Z_{ij}}+\frac{1}{2}r^2_{ij}\sigma^2_{Z_{ij}}}\,
\Phi\big(d^*_{\Lambda} - r_{ij}\sigma_{Z_{ij}}\big)$$
$$= \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\,
e^{E[Z_{ij}]+\frac{1}{2}\sigma^2_{Z_{ij}}}\,
\Phi\big(d^*_{\Lambda} - (r_i\sigma_{Z_i}+r_j\sigma_{Z_j})\big). \tag{2.110}$$
Combining (2.110) and (2.109) into (2.106), and then substituting (2.105)
and (2.106) into (2.26), we get the following expression for the error bound
$\varepsilon(d_{\Lambda})$ in (2.26):
$$\varepsilon(d_{\Lambda})
= \frac{1}{2}\big(\Phi(d^*_{\Lambda})\big)^{\frac{1}{2}}
\Bigg\{\sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j
\Big[e^{E[Z_{ij}]+\frac{1}{2}\sigma^2_{Z_{ij}}}
- e^{E[Z_{ij}]+\frac{1}{2}\left(\sigma^2_{Z_i}+\sigma^2_{Z_j}+2r_ir_j\sigma_{Z_i}\sigma_{Z_j}\right)}\Big]
\Phi\big(d^*_{\Lambda} - (r_i\sigma_{Z_i}+r_j\sigma_{Z_j})\big)\Bigg\}^{\frac{1}{2}}$$
$$= \frac{1}{2}\big(\Phi(d^*_{\Lambda})\big)^{\frac{1}{2}}
\Bigg\{\sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j\,
e^{E[Z_{ij}]+\frac{1}{2}\left(\sigma^2_{Z_i}+\sigma^2_{Z_j}\right)}
\Big[e^{\sigma_{Z_iZ_j}} - e^{r_ir_j\sigma_{Z_i}\sigma_{Z_j}}\Big]
\Phi\big(d^*_{\Lambda} - (r_i\sigma_{Z_i}+r_j\sigma_{Z_j})\big)\Bigg\}^{\frac{1}{2}}.$$
Partially exact/comonotonic upper bound (2.45)
Applying Lemma 7 with $a = r_i\sigma_{Z_i}$, and using (2.32), we can express the
second term $I_2$ in (2.22) in closed form:
$$\int_{d_{\Lambda}}^{+\infty} E[S-d\,|\,\Lambda=\lambda]\, dF_{\Lambda}(\lambda)
= \int_{d_{\Lambda}}^{+\infty} E[S\,|\,\Lambda=\lambda]\, dF_{\Lambda}(\lambda)
- d\big(1-F_{\Lambda}(d_{\Lambda})\big)$$
$$= \sum_{i=1}^{n} \alpha_i\, e^{E[Z_i]+\frac{1}{2}(1-r^2_i)\sigma^2_{Z_i}}
\int_{d_{\Lambda}}^{+\infty} e^{r_i\sigma_{Z_i}\Phi^{-1}(v)}\, dF_{\Lambda}(\lambda)
- d\big(1-\Phi(d^*_{\Lambda})\big)$$
$$= \sum_{i=1}^{n} \alpha_i\, e^{E[Z_i]+\frac{1}{2}\sigma^2_{Z_i}}\,
\Phi\big(r_i\sigma_{Z_i}-d^*_{\Lambda}\big) - d\,\Phi(-d^*_{\Lambda}). \tag{2.111}$$
Substituting (2.33) in (2.28), we end up with the following upper bound for
$I_1$, similar to (2.37) but now with an integral from zero to $\Phi(d^*_{\Lambda})$:
$$\int_{-\infty}^{d_{\Lambda}} E[(S-d)_+\,|\,\Lambda=\lambda]\, dF_{\Lambda}(\lambda)
\le \int_{-\infty}^{d_{\Lambda}} E[(S^u-d)_+\,|\,\Lambda=\lambda]\, dF_{\Lambda}(\lambda)
= \int_{0}^{\Phi(d^*_{\Lambda})} E[(S^u-d)_+\,|\,V=v]\, dv$$
$$= \sum_{i=1}^{n} \alpha_i\, e^{E[Z_i]+\frac{1}{2}\sigma^2_{Z_i}(1-r^2_i)}
\int_{0}^{\Phi(d^*_{\Lambda})} e^{r_i\sigma_{Z_i}\Phi^{-1}(v)}\,
\Phi\Big(\mathrm{sign}(\alpha_i)\sqrt{1-r^2_i}\,\sigma_{Z_i}
- \Phi^{-1}\big(F_{S^u|V=v}(d)\big)\Big)\, dv$$
$$\qquad - d\Big(\Phi(d^*_{\Lambda}) - \int_{0}^{\Phi(d^*_{\Lambda})} F_{S^u|V=v}(d)\, dv\Big), \tag{2.112}$$
where we recall that $d^*_{\Lambda}$ is defined as in (2.43), and the cumulative distribution
function $F_{S^u}(d)$ is, according to (2.36), determined by
$$\sum_{i=1}^{n} \alpha_i\,
e^{E[Z_i]+r_i\sigma_{Z_i}\Phi^{-1}(v)+\mathrm{sign}(\alpha_i)\sqrt{1-r^2_i}\,\sigma_{Z_i}\Phi^{-1}(F_{S^u}(d|V=v))} = d.$$
Finally, adding (2.112) to the exact part (2.111) of the decomposition (2.22)
results in the partially exact/comonotonic upper bound.
Chapter 3
Reserving in life insurance business
Summary. In the traditional approach to life contingencies only decrements are assumed to be stochastic. In this contribution we consider the
distribution of a life annuity (and of a portfolio of life annuities) when
the stochastic nature of interest rates is also taken into account. Although the
literature concerning this topic is already quite rich, authors usually
restrict themselves to the computation of the first two or three moments.
However, if one wants to determine e.g. capital requirements using more
sophisticated risk measures, like Value-at-Risk or Tail Value-at-Risk, more
detailed knowledge about the underlying distributions is required. For this
purpose, we propose to use the theory of comonotonic risks introduced in
Chapter 2. This methodology allows one to obtain reliable approximations of
the underlying distribution functions, in particular very accurate estimates
of upper quantiles and stop-loss premiums. Several numerical illustrations
confirm the very high accuracy of the methodology.
3.1 Introduction
Unlike in finance, in insurance the concept of stochastic interest rates
emerged quite recently. In the traditional approach to life contingencies
only decrements are assumed to be stochastic — see e.g. Bowers et al.
(1986), Wolthuis & Van Hoek (1986). Such a simplification makes it possible to treat
effectively summary measures of financial contracts such as the mean, the
standard deviation or the upper quantiles. For a more detailed discussion
about the distributions in life insurance under deterministic interest rates,
see e.g. Dhaene (1990).
In non-life insurance the use of deterministic interest rates may be
justified by short terms of insurance commitments. In the case of the life
insurance and the life annuity business, durations of contracts are typically
very long (often 30 or even more years). Then uncertainty about future
rates of return becomes very high. Moreover the financial and investment
risk — unlike the mortality risk — cannot be diversified with an increase in
the number of policies. Therefore in order to calculate insurance premiums
or mathematical reserves, actuaries are forced to adopt very conservative
assumptions. As a result the diversification effects between interest rates
in different investment periods may not be taken into account (i.e. that
poor investment results in some periods are usually compensated by very
good ones in others) and the life insurance business becomes too expensive,
both for the insureds who have to pay higher insurance premiums and for
the shareholders who have to provide more capital than necessary. Profit-sharing can partially solve this problem. For these reasons the necessity to
introduce models with stochastic interest rates has been well understood
in the actuarial world.
In the actuarial literature numerous papers have treated random
interest rates. In Boyle (1976) autoregressive models of order one are introduced to model interest rates. Bellhouse & Panjer (1980, 1981) use similar
models to compute moments of insurance and annuity functions. In Wilkie
(1976) the force of interest is assumed to follow a Gaussian random walk.
Waters (1978) computes the moments of actuarial functions when the interest rates are independent and identically Gaussian distributed. He also computes moments of portfolios of policies and approximates the limiting
distribution by Pearson’s curves. In Dhaene (1989) the force of interest
is modelled as an ARMA(p, d, q) process. He uses this model to compute
the moments of present value functions. Norberg (1990) provides an axiomatic approach to stochastic interest rates and the valuation of payment
streams. Parker (1994d) compares two approaches to the randomness of interest rates: by modelling only the accumulated interest and by modelling
the force of interest. Both methodologies are illustrated by calculating the
mean, the standard deviation and the skewness of the annuity-immediate.
An overview of stochastic life contingencies with solvency valuation is
presented in Frees (1990). In the papers of Beekman & Fuelling (1990,
1991) the mean and the standard deviation of continuous-time life annuities are calculated with the force of mortality modelled as an Ornstein-Uhlenbeck and a Wiener process respectively. In Beekman & Fuelling
(1993) expressions are given for the mean and the standard deviation of
the future life insurance payments. Norberg (1993) derives the first two
moments of the present value of stochastic payment streams. The first
three moments of homogeneous portfolios of life insurance and endowment
policies are calculated in Parker (1994a,b) and the results are generalized
to heterogeneous portfolios in Parker (1997). The same author (1994c,
1996) provides a recursive formula to calculate an approximate distribution function of the limiting homogeneous portfolio of term life insurance
and endowment policies. In Dȩbicka (2003) the mean and the variance are
calculated for the present value of discrete-time payment streams in life
insurance.
Although the literature on stochastic interest rates in life insurance is
already quite rich, for most of the problems no satisfactory solutions have
been found as yet. In almost all papers the authors restrict themselves
to calculating the first two or three moments of the present value function
(except Waters (1978), Parker (1994d, 1996)). The computation of the first
few moments may be seen as just a first attempt to explore the properties of
a random distribution. Moreover in general the variance does not appear to
be the most suitable risk measure to determine the solvency requirements
for an insurance portfolio. As a two-sided risk measure it takes into account
both positive and negative discrepancies which leads to underestimation of
the reserve in the case of a skewed distribution. It does not emphasize the
tail properties of the distribution and does not give any reliable estimates of
the Value-at-Risk or other tail-related risk measures, for which simulation
methods have to be deployed. The same applies to risk measures based on
stop-loss premiums, like Expected Shortfall.
In this chapter we aim to provide some conservative estimates both
for high quantiles and stop-loss premiums for an individual policy and for
a whole portfolio. We focus here only on life annuities, however similar
techniques may be used to get analogous estimates for more general life
contingencies. Using the results of Chapter 2 we will approximate the
quantiles of the present value of a life annuity and a portfolio of life annuities.
We perform our analysis separately for a single life annuity and a whole portfolio of policies. Our solution makes it possible to solve, with great accuracy, personal finance problems such as: How much does one need to invest now to ensure — given a periodical (e.g. yearly) consumption pattern — that the probability of outliving one's money is very small (e.g. less than 1%)?
Similar problems were studied by Dufresne (2004) and Milevsky & Wang
(2004).
The case of a portfolio of life annuity policies has been studied extensively in the literature, but only in the limiting case of homogeneous portfolios, where the mortality risk is fully diversified. However, the applicability of these results in insurance practice may be questioned: especially in the life annuity business a typical portfolio does not contain enough policies to speak of full diversification. For this reason we
propose to approximate the number of active policies in subsequent years
using a normal power distribution and to model the present value of future
benefits as a scalar product of mutually independent random vectors.
This chapter is mainly based on Hoedemakers, Darkiewicz & Goovaerts
(2005) and is organized as follows. In Section 2 we give a summary of
the model assumptions and properties for the mortality process that are
needed to reach our goal. In the first part of Section 3 we apply the
results of Chapter 2 to the present value of a single life annuity policy.
In the second part of this section we present the convex bounds for a
homogeneous portfolio of policies. A numerical illustration is provided at
the end of each part. We also illustrate the obtained results graphically.
3.2 Modelling stochastic decrements
A life annuity may be defined as a series of periodic payments where each
payment will actually be made only if a designated life is alive at the time
the payment is due. Let us consider a person aged x years, also called a
life aged x and denoted by (x). We denote his or her future lifetime by Tx .
Thus x + Tx will be the age of death of the person. The future lifetime Tx
is a random variable with a probability distribution function
Gx(t) = Pr[Tx ≤ t] = t qx,    t ≥ 0.

The function Gx represents the probability that the person will die within t years, for any fixed t. We assume that Gx is known. We define Kx = ⌊Tx⌋, the number of completed future years lived by (x), or the curtate future
lifetime of (x). The probability distribution of the integer valued random
variable Kx is given by
Pr[Kx = k] = Pr[k ≤ Tx < k + 1] = k+1 qx − k qx = k| qx,    k = 0, 1, . . . .
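The probability masses k| qx follow directly from any survival function. A minimal sketch (the helper name and the survival-function argument are ours, not from the text):

```python
def curtate_pmf(x, k, tpx):
    """Pr[K_x = k] = k p_x - k+1 p_x (= k| q_x), given a survival
    function tpx(x, t) that returns t p_x."""
    return tpx(x, k) - tpx(x, k + 1)
```

For any proper lifetime distribution these masses sum to one over k = 0, 1, . . . , which gives a quick sanity check on a supplied life table.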
Let us denote the lifetime from birth by the random variable T . We assume

Pr[Tx ≤ t] = Pr[T ≤ x + t | T ≥ x].

With this notation, T =d T0 (equality in distribution). Further, the ultimate age of the life table is denoted by ω; this means that ω − x is the first remaining lifetime of (x) for which ω−x qx = 1, or equivalently, Gx^{−1}(1) = ω − x.
In the remainder of this chapter we will always use the standard actuarial notation:
Pr[Tx > t] = t px , Pr[Tx > 1] = px , Pr[Tx ≤ t] = t qx , Pr[Tx ≤ 1] = qx .
In this chapter we consider three types of annuities. The present value of
a single life annuity for a person aged x paying periodically (e.g. yearly) a
fixed amount of αi (i = 1, . . . , ⌊ω − x⌋) can be expressed as

Ssp,x = Σ_{i=1}^{Kx} αi e^{−Y(i)} = Σ_{i=1}^{⌊ω−x⌋} I(Tx>i) αi e^{−Y(i)}.    (3.1)
We consider also the present value of a homogeneous portfolio of life annuities — this random variable is particularly interesting for an insurer who has to determine a sufficient level of the reserve and the solvency margin. Assuming that every beneficiary gets a fixed amount of αi (i = 1, . . . , ⌊ω − x⌋) per year, the present value can be expressed as follows

Spp,x = Σ_{i=1}^{⌊ω−x⌋} αi Ni e^{−Y(i)},    (3.2)
where Ni denotes the remaining number of policies-in-force in year i.
Finally, consider a portfolio of N0 homogeneous life annuity contracts for which the future lifetimes of the insureds Tx^(1), Tx^(2), . . . , Tx^(N0) are assumed to be independent. Then the insurer faces two risks: mortality risk and investment risk. Note that from the Law of Large Numbers the mortality risk decreases with the number of policies N0 while the investment risk remains the same (each of the policies is exposed to the same investment risk). Thus, for sufficiently large N0 we have that

Σ_{i=1}^{⌊ω−x⌋} αi Ni e^{−Y(i)} = N0 Σ_{i=1}^{⌊ω−x⌋} αi (Ni/N0) e^{−Y(i)} ≈ N0 Σ_{i=1}^{⌊ω−x⌋} αi i px e^{−Y(i)}.
Hence in the case of large portfolios of life annuities it suffices to compute risk measures of an ‘average’ portfolio Sapp,x given by

Sapp,x = Σ_{i=1}^{⌊ω−x⌋} αi i px e^{−Y(i)} = E[Ssp,x | Y(1), · · · , Y(⌊ω − x⌋)].    (3.3)
Remark 5. For the random variables Sapp,x and Ssp,x one has that

Sapp,x ≤cx Ssp,x

and consequently Var[Sapp,x] ≤ Var[Ssp,x].
Indeed, let Γ denote a random variable independent of Tx. Then, it follows immediately from Theorem 8 that

Ssp,x = Σ_{i=1}^{⌊ω−x⌋} I(Tx>i) αi e^{−Y(i)} ≥cx Σ_{i=1}^{⌊ω−x⌋} E[I(Tx>i) | Γ] αi e^{−Y(i)} = Σ_{i=1}^{⌊ω−x⌋} i px αi e^{−Y(i)} = Sapp,x.
Obviously Ssp,x , Spp,x and Sapp,x depend on the distribution of the total
lifetime T . We assume that T follows the Gompertz-Makeham law, i.e.
the force of mortality at age ξ is given by the formula
µξ = α + βcξ ,
where α > 0 is a constant component, interpreted as capturing accident
hazard, and βcξ is a variable component capturing the hazard of aging
with β > 0 and c > 1. This leads to the survival probability
t px = Pr[Tx > t] = e^{−∫_x^{x+t} µξ dξ} = s^t g^{c^{x+t} − c^x},

where

s = e^{−α}  and  g = e^{−β/log c}.    (3.4)
In numerical illustrations we use the Belgian analytic life tables MR and
FR for life annuity valuation, with corresponding constants for males: s =
0.999441703848, g = 0.999733441115 and c = 1.101077536030 and for females: s = 0.999669730966, g = 0.999951440171 and c = 1.116792453830.
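With these constants the Gompertz-Makeham survival probability can be evaluated directly; a small sketch (function name ours, MR male constants quoted from the text):

```python
# Belgian MR (male) constants quoted in the text
s, g, c = 0.999441703848, 0.999733441115, 1.101077536030

def tpx(x, t):
    """Gompertz-Makeham survival probability t p_x = s^t * g^(c^(x+t) - c^x)."""
    return s ** t * g ** (c ** (x + t) - c ** x)
```

As expected, the resulting probabilities start at one and decrease in t.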
Denote by T′ and Tx′ the corresponding random variables from the Gompertz family — the subclass of the Makeham-Gompertz family with the force of mortality given by

µ′ξ = βc^ξ.

It is straightforward to show that

Tx =d min(Tx′, E/α),    (3.5)

where E denotes a random variable from the standard exponential distribution, independent of T′. Indeed, one has that

Pr[min(Tx′, E/α) > t] = Pr[Tx′ > t] Pr[E > αt] = e^{−∫_x^{x+t} µ′ξ dξ} e^{−αt} = e^{−∫_x^{x+t} µξ dξ} = Pr[Tx > t].
The cumulative distribution function for the Gompertz law, unlike for the
Makeham-Gompertz law in general, has an analytical expression for the
inverse function and therefore (3.5) can be used for simulations.
For generating one random variate from Makeham’s law, we use the composition method (Devroye, 1986) and perform the following steps:
1. Generate G from the Gompertz law by the well-known inversion method;
2. Generate E from the exponential(1) distribution;
3. Retain T = min(G, E/α),
where α = − log s, see (3.4).
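The three steps can be sketched as follows (a sketch, not the authors' code; MR male constants taken from the text, and the Gompertz inversion uses t p′_x = g^(c^x (c^t − 1))):

```python
import math
import random

s, g, c = 0.999441703848, 0.999733441115, 1.101077536030  # MR male constants
alpha = -math.log(s)

def makeham_lifetime(x, rng=random):
    """Future lifetime T_x under Makeham's law via the composition method."""
    # Step 1: Gompertz variate by inversion of u = g^(c^x (c^t - 1))
    u = rng.random()
    gomp = math.log(1.0 + math.log(u) / (c ** x * math.log(g))) / math.log(c)
    # Step 2: standard exponential variate
    e = rng.expovariate(1.0)
    # Step 3: retain the minimum
    return min(gomp, e / alpha)
```

Since log g < 0 and log u < 0, the argument of the outer logarithm is always greater than one, so the inversion is well defined for every u in (0, 1).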
3.3 The distribution of life annuities
This section is organized into two subsections. In the first subsection we derive upper and lower bounds in convex order for the distribution of the present value of a single life annuity, given a mortality law T and a model for the returns. This distribution is very important in the context of so-called personal finance problems. Suppose that (x) disposes of a lump sum L. What is the amount that (x) can consume yearly so as to be sure, with a sufficiently high probability (e.g. p = 99%), that the money will not run out before death? Obviously, to answer this question one has to compute the Value-at-Risk measure of the distribution at an appropriately chosen level.
In the second part of this section we will consider the distribution of
a homogeneous and ‘average’ portfolio of life annuities. An insurer has
to derive this distribution to determine its future liabilities and solvency
margin. Notice that the presented methodology is appropriate not only in the case of large portfolios, where the limiting distribution can be used on the basis of the law of large numbers, but also for portfolios of average size (e.g. 1 000 - 5 000), which are typical for the life annuity business.
The vector ~Y = (Y(1), Y(2), . . . , Y(n)) is assumed to have an n-dimensional normal distribution with given mean vector

~µ = (µ1, . . . , µn) = (E[Y(1)], E[Y(2)], . . . , E[Y(n)])

and covariance matrix

Σ = [σij]_{1≤i,j≤n} = [Cov(Y(i), Y(j))]_{1≤i,j≤n}.

In the above notation we will denote σii by σi².
3.3.1 A single life annuity
In this subsection we consider a whole life annuity of αi (> 0) payable at the end of each year i while (x) survives, described by the formula

Ssp,x = Σ_{i=1}^{Kx} αi e^{−Y(i)} = Σ_{i=1}^{⌊ω−x⌋} I(Tx>i) αi e^{−Y(i)}.
The upper bound
The random variable Xi = I(Tx>i) is Bernoulli(i px) distributed and thus the inverse distribution function is given by

F_{Xi}^{−1}(p) = 1 for p > i qx,  and  F_{Xi}^{−1}(p) = 0 for p ≤ i qx.
This leads to the following formula for the upper bound

S^c_{sp,x} = Σ_{i=1}^{⌊ω−x⌋} F_{Xi}^{−1}(U) F^{−1}_{αi e^{−Y(i)}}(V) = Σ_{i=1}^{⌊F_{Tx}^{−1}(U)⌋} F^{−1}_{αi e^{−Y(i)}}(V),
where U and V are independent standard uniformly distributed random variables. Thus the conditional quantiles are given by

F^{−1}_{S^c_{sp,x}|Tx=t}(p) = Σ_{i=1}^{⌊t⌋} F^{−1}_{αi e^{−Y(i)}}(p)

and the conditional distribution function can be computed numerically from the identity

Σ_{i=1}^{⌊t⌋} αi e^{−µi + sign(αi) σi Φ^{−1}(F_{S^c_{sp,x}|Tx=t}(y))} = Σ_{i=1}^{k} αi e^{−µi + sign(αi) σi Φ^{−1}(F_{S^c_{sp,x}|Kx=k}(y))} = y.
Define S̃k as follows:

S̃k = Σ_{i=1}^{k} αi e^{−Y(i)},    (3.6)
then S̃k =d Ssp,x | Kx = k. Hence, the distribution function of S^c_{sp,x} can be computed as

F_{S^c_{sp,x}}(y) = Σ_{k=1}^{⌊ω−x⌋} Pr[Kx = k] F_{S^c_{sp,x}|Kx=k}(y)
= Σ_{k=1}^{⌊ω−x⌋} k| qx F_{S̃^c_k}(y)
= Σ_{k=1}^{⌊ω−x⌋} k| qx Pr[ Σ_{i=1}^{k} αi e^{−µi + sign(αi) σi Φ^{−1}(U)} ≤ y ],

with S̃^c_k = Σ_{i=1}^{k} F^{−1}_{αi e^{−Y(i)}}(U) and U a standard uniform random variable.
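For positive payments αi the comonotonic sum S̃^c_k is increasing in U, so its distribution function F_{S̃^c_k}(y) can be evaluated by simple bisection on u. A sketch (helper names and tolerances are ours):

```python
import math
from statistics import NormalDist

def comonotonic_cdf(y, alphas, mus, sigmas):
    """cdf of S~^c_k = sum_i alpha_i exp(-mu_i + sigma_i Phi^{-1}(U)),
    alpha_i > 0, found by bisection on u (the sum is increasing in u)."""
    inv = NormalDist().inv_cdf
    def total(u):
        z = inv(u)
        return sum(a * math.exp(-m + sg * z)
                   for a, m, sg in zip(alphas, mus, sigmas))
    lo, hi = 1e-12, 1.0 - 1e-12
    if y <= total(lo):
        return 0.0
    if y >= total(hi):
        return 1.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if total(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For a single lognormal term this reduces to the exact lognormal cdf, which is a convenient correctness check.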
The computation of the corresponding stop-loss premiums is also straightforward:

π^{cub}(Ssp,x, d) = E_{Kx}[ E[(S^c_{sp,x} − d)+ | Kx] ]
= Σ_{k=1}^{⌊ω−x⌋} k| qx π^{cub}(S̃k, d)
= Σ_{k=1}^{⌊ω−x⌋} k| qx Σ_{i=1}^{k} π(αi e^{−Y(i)}, d^c_{k,i}),

where d^c_{k,i} is defined analogously to (2.73) as

d^c_{k,i} = αi e^{−µi + sign(αi) σi Φ^{−1}(F_{S̃^c_k}(d))}

and the values of π(αi e^{−Y(i)}, d^c_{k,i}) are computed as in (2.74). The stop-loss premium of S^c_{sp,x} at retention d can be written out explicitly as follows

π^{cub}(Ssp,x, d) = Σ_{k=1}^{⌊ω−x⌋} k| qx { Σ_{i=1}^{k} αi e^{−µi + σi²/2} Φ( sign(αi) σi − Φ^{−1}(F_{S̃^c_k}(d)) ) − d (1 − F_{S̃^c_k}(d)) }.
The lower bound
For the lower bound one faces the problem of choosing appropriate conditioning random variables Γ and Λ. The random variables Xi are in fact comonotonic and depend only on the future lifetime Tx, thus Γ = Tx is the most natural choice. As a result one simply gets

E[I(Tx>i) | Tx] = I(Tx>i).

The choice of the second conditioning random variable Λ is less obvious. We propose two different approaches:
1. Λ^(a) = Σ_{i=1}^{⌊ω−x⌋} i px αi e^{−µi + ½σi²} Y(i). Intuitively it means that the conditioning random variable is chosen as a first order approximation to the present value of the limiting portfolio Sapp,x in (3.3).
2. Consider the ‘maximal variance’ conditioning random variables of the form Λj = Σ_{i=1}^{j} αi e^{−µi + ½σi²} Y(i), j = 1, . . . , ⌊ω − x⌋, and the corresponding lower bounds

S^{l,j}_{sp,x} = Σ_{i=1}^{Kx} E[αi e^{−Y(i)} | Λj],    j = 1, . . . , ⌊ω − x⌋,

from which one chooses the lower bound with the largest variance. The corresponding conditioning random variable will be denoted as Λ^(m). This choice can be motivated as follows. For two random variables X and Y with X ≤cx Y one has that Var[X] ≤ Var[Y]. As discussed in Chapter 2 we should choose Λ such that the goodness-of-fit expressed by the ratio z = Var[S^l_{sp,x}] / Var[Ssp,x] is as close as possible to 1. Hence one can expect that a lower bound with a larger variance will provide a better fit to the original random variable.
Having chosen the conditioning random variable Λ one proceeds as in the case of the upper bound: the first step requires the computation of the conditional distribution of the lower bound from the formula

Σ_{i=1}^{k} αi e^{−µi + ½σi²(1−ri²) + σi ri Φ^{−1}(F_{S^l_{sp,x}|Kx=k}(y))} = y.
The cumulative distribution function of S^l_{sp,x} can then be computed as

F_{S^l_{sp,x}}(y) = Σ_{k=1}^{⌊ω−x⌋} k| qx F_{S^l_{sp,x}|Kx=k}(y)
= Σ_{k=1}^{⌊ω−x⌋} k| qx F_{S̃^l_k}(y)
= Σ_{k=1}^{⌊ω−x⌋} k| qx Pr[ Σ_{i=1}^{k} αi e^{−µi − ri σi Φ^{−1}(U) + ½(1−ri²)σi²} ≤ y ],

with S̃^l_k = E[S̃k | Λ] and U a standard uniform random variable.
The computation of the corresponding stop-loss premium is similar to the one of the upper bound and as a result one gets the following explicit solution

π^{lb}(Ssp,x, d, Γ, Λ) = E_{Kx}[ E[(S^l_{sp,x} − d)+ | Kx] ]
= Σ_{k=1}^{⌊ω−x⌋} k| qx π^{lb}(S̃k, d, Λ)
= Σ_{k=1}^{⌊ω−x⌋} k| qx Σ_{i=1}^{k} π( E[αi e^{−Y(i)} | Λ], d^l_{k,i} ),

with d^l_{k,i} given by

d^l_{k,i} = αi e^{−µi + ½σi²(1−ri²) + σi ri Φ^{−1}(F_{S^l_{sp,x}|Kx=k}(d))}.
Note that the values of π( E[αi e^{−Y(i)} | Λ], d^l_{k,i} ) can be computed as in (2.82). The stop-loss premium of S^l_{sp,x} at retention d can be written out explicitly as follows

π^{lb}(Ssp,x, d, Γ, Λ) = Σ_{k=1}^{⌊ω−x⌋} k| qx { Σ_{i=1}^{k} αi e^{−µi + σi²/2} Φ( ri σi − Φ^{−1}(F_{S̃^l_k}(d)) ) − d (1 − F_{S̃^l_k}(d)) }.
The lower bound based on a lifetime-dependent conditioning random variable
In this subsection we show how it is possible to improve the lower bound of a scalar product if one of the vectors is comonotonic. We state this result in the following lemma.
Lemma 10.
Consider a scalar product of random variables S = Σ_{i=1}^{n} Xi Yi, where the random vectors ~X and ~Y are independent and ~X is additionally assumed to be comonotonic, i.e. ~X = (F_{X1}^{−1}(U), F_{X2}^{−1}(U), . . . , F_{Xn}^{−1}(U)). Let Λ(u) be a random variable which is defined for each u ∈ (0, 1) separately. Define S^{cl}(u) as follows:

S^{cl}(u) = Σ_{i=1}^{n} F_{Xi}^{−1}(u) E[Yi | Λ(u)],

then S^{cl}(u) =d (S^{cl} | U = u). Define the random variable S^{cl} through its distribution function

F_{S^{cl}}(y) = ∫_0^1 F_{S^{cl}|U=u}(y) du.

Then S^{cl} ≤cx S.
Remark 6. Obviously the conditioning random variable U can be replaced by any other random variable which determines the comonotonic vector ~X by a functional relationship. We consider here the case where Xi = I(Tx>i) = I(Kx≥i) and therefore it is convenient to condition on the future lifetime Kx.
Proof. Let S(u) denote a random variable distributed as S given that U = u. From Definition 1b of convex order, it follows immediately that

S^{cl}(u) ≤cx S(u).

Indeed, let v(·) be an arbitrary convex function. Then we get

E[v(S^{cl})] = ∫_0^1 E[v(S^{cl}(u))] du ≤ ∫_0^1 E[v(S(u))] du = E[v(S)],

which completes the proof.
Because of Lemma 10, one can determine a lower bound of a single life annuity using the following conditioning random variable:

Λ_{Kx} = Σ_{i=1}^{Kx} αi e^{−µi + ½σi²} Y(i).
Intuitively it is clear that the lower bound defined by the random variable Λ_{Kx} should approximate the underlying distribution better than those defined by the conditioning random variables Λ^(a) and Λ^(m). As before, one starts with computing the conditional distributions for the lower bound S^{cl}_{sp,x} numerically by considering the equation

Σ_{i=1}^{k} αi e^{−µi + ½(1−r_{i,k}²)σi² + r_{i,k} σi Φ^{−1}(F_{S^{cl}_{sp,x}|Kx=k}(y))} = y,
with correlations r_{i,k} given by

r_{i,k} = Cov[Y(i), Λk] / ( √Var[Y(i)] √Var[Λk] ).
Consequently, the distribution function of S^{cl}_{sp,x} can be obtained as

F_{S^{cl}_{sp,x}}(y) = Σ_{k=1}^{⌊ω−x⌋} Pr[Kx = k] F_{S^{cl}_{sp,x}|Kx=k}(y) = Σ_{k=1}^{⌊ω−x⌋} k| qx F_{S̃^{cl}_k}(y),

with

S̃^{cl}_k = E[S̃k | Λk].
The stop-loss premiums of S^{cl}_{sp,x} can be computed as follows

π^{clb}(Ssp,x, d, Γ, Λ) = E_{Kx}[ E[(S^{cl}_{sp,x} − d)+ | Kx] ]
= Σ_{k=1}^{⌊ω−x⌋} k| qx π^{lb}(S̃k, d, Λk)
= Σ_{k=1}^{⌊ω−x⌋} k| qx Σ_{i=1}^{k} π( E[αi e^{−Y(i)} | Λk], d^{cl}_{k,i} ),    (3.7)
with d^{cl}_{k,i} given by

d^{cl}_{k,i} = αi e^{−µi + ½σi²(1−r_{i,k}²) + σi r_{i,k} Φ^{−1}(F_{S̃^{cl}_k}(d))}.
The stop-loss premium of S^{cl}_{sp,x} at retention d can be written out explicitly as follows

π^{clb}(Ssp,x, d, Γ, Λ) = Σ_{k=1}^{⌊ω−x⌋} k| qx { Σ_{i=1}^{k} αi e^{−µi + σi²/2} Φ( r_{i,k} σi − Φ^{−1}(F_{S̃^{cl}_k}(d)) ) − d (1 − F_{S̃^{cl}_k}(d)) }.
The moments based approximation
Having computed the upper bound S^c_{sp,x} and the lower bounds S^l_{sp,x} and S^{cl}_{sp,x}, one can compute two moments based approximations as described in Subsection 2.2.4. To find the coefficient z given by (2.15) one needs to calculate the variances of S^c_{sp,x}, S^l_{sp,x}, S^{cl}_{sp,x} and Ssp,x. The variance of S^c_{sp,x} and S^l_{sp,x} can be computed as explained in Subsection 2.5.3. The variance of Ssp,x and S^{cl}_{sp,x} can be treated very similarly. Indeed, after some simple calculations one gets

Var[S^{cl}_{sp,x}] = E_{Kx}[ E[(S^{cl}_{sp,x})² | Kx] ] − (E[S^{cl}_{sp,x}])²
= Σ_{k=1}^{⌊ω−x⌋} k| qx E[(S̃^{cl}_k)²] − (E[S^{cl}_{sp,x}])²,

Var[Ssp,x] = E_{Kx}[ E[(Ssp,x)² | Kx] ] − (E[Ssp,x])²
= Σ_{k=1}^{⌊ω−x⌋} k| qx E[(S̃k)²] − (E[Ssp,x])²,
where S̃^{cl}_k and S̃k are defined as in (3.7) and (3.6) respectively. Thus it suffices to plug in

E[S̃^{cl}_k] = E[S̃k] = Σ_{i=1}^{k} αi e^{−µi + σi²/2},

E[(S̃^{cl}_k)²] = Σ_{i=1}^{k} Σ_{j=1}^{k} αi αj e^{−µi − µj + ½(σi² + σj²) + r_{i,k} r_{j,k} σi σj},

E[(S̃k)²] = Σ_{i=1}^{k} Σ_{j=1}^{k} αi αj e^{−µi − µj + ½(σi² + σj²) + σij}

and

E[S^{cl}_{sp,x}] = E[Ssp,x] = Σ_{k=1}^{⌊ω−x⌋} k| qx E[S̃k].
Now one can compute distributions of the moments based approximations from the formulas

F_{S^m_{sp,x}}(y) = z1 F_{S^l_{sp,x}}(y) + (1 − z1) F_{S^c_{sp,x}}(y),
F_{S^{cm}_{sp,x}}(y) = z2 F_{S^{cl}_{sp,x}}(y) + (1 − z2) F_{S^c_{sp,x}}(y)

and their corresponding stop-loss premiums as

π^m(Ssp,x, d, Γ, Λ) = z1 π^{lb}(Ssp,x, d, Γ, Λ) + (1 − z1) π^{cub}(Ssp,x, d),
π^{cm}(Ssp,x, d, Γ, Λ) = z2 π^{clb}(Ssp,x, d, Γ, Λ) + (1 − z2) π^{cub}(Ssp,x, d),

where

z1 = (Var[S^c_{sp,x}] − Var[Ssp,x]) / (Var[S^c_{sp,x}] − Var[S^l_{sp,x}])  and  z2 = (Var[S^c_{sp,x}] − Var[Ssp,x]) / (Var[S^c_{sp,x}] − Var[S^{cl}_{sp,x}]).
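The blending step itself is a one-liner; a sketch of the cdf combination (function and argument names ours; the bound cdfs and the variances are assumed to be available from the earlier computations):

```python
def moments_based_cdf(y, cdf_lower, cdf_upper, var_exact, var_lower, var_upper):
    """Moments based approximation: mix the lower- and upper-bound cdfs with
    weight z = (Var[S^c] - Var[S]) / (Var[S^c] - Var[S^l]).  Since both bounds
    have the same mean as S, this weight makes the mixture match Var[S]."""
    z = (var_upper - var_exact) / (var_upper - var_lower)
    return z * cdf_lower(y) + (1.0 - z) * cdf_upper(y)
```

Because Var[S^l] ≤ Var[S] ≤ Var[S^c] by convex order, the weight z always lies in [0, 1].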
A numerical illustration
We examine the accuracy and efficiency of the derived approximations for a single life annuity of a 65-year-old male person with yearly unit payments. We restrict ourselves to the case of a Black & Scholes setting (model BS) with drift µ = 0.05 and volatility σ = 0.1. We assume further that the future lifetime T65 follows the Makeham-Gompertz law with the corresponding coefficients of the Belgian analytic life table MR (see Section 3.2). We compare the distribution functions of the upper bound S^c_{sp,65} and the lower bounds S^l_{sp,65} and S^{cl}_{sp,65}, as described in the previous sections, with the original distribution function of Ssp,65 based on extensive Monte Carlo (MC) simulation. We generated 500 × 100 000 paths and for each estimate we computed the standard error (s.e.). As is well known, the (asymptotic) 95% confidence interval is given by the estimate plus or minus 1.96 times the standard error. Note also that the random paths are based on antithetic variables in order to reduce the variance. Notice that to compute the lower bound we use as conditioning random variable Λ^(m) = Λ24 (the value j = 24 was found to be the one that maximizes the variance, as described in Section 3.3.1).
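The Monte Carlo reference computation can be reproduced in outline as follows (a sketch under the stated BS and MR assumptions; the seed, path count and helper names are ours, and the text's antithetic-variable refinement is omitted):

```python
import math
import random

MU, SIGMA = 0.05, 0.1                                      # BS drift and volatility
s, g, c = 0.999441703848, 0.999733441115, 1.101077536030   # MR male constants
ALPHA = -math.log(s)

def makeham_lifetime(x, rng):
    # composition method: min(Gompertz variate, exponential / alpha)
    u = rng.random()
    gomp = math.log(1.0 + math.log(u) / (c ** x * math.log(g))) / math.log(c)
    return min(gomp, rng.expovariate(1.0) / ALPHA)

def simulate_ssp(x, rng):
    """One path of S_sp,x for unit yearly payments."""
    k = math.floor(makeham_lifetime(x, rng))
    y, pv = 0.0, 0.0
    for _ in range(k):
        y += rng.gauss(MU, SIGMA)   # cumulative log-return Y(i)
        pv += math.exp(-y)          # discounted unit payment
    return pv

rng = random.Random(1)
samples = sorted(simulate_ssp(65, rng) for _ in range(20000))
q95 = samples[int(0.95 * len(samples))]
```

With this modest number of paths the sample mean and upper quantiles land close to the values reported in Tables 3.1 and 3.2.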
Figure 3.1 shows the cumulative distribution functions of the approximations, compared to the empirical distribution. One can see that the lower bound S^{cl}_{sp,65} is almost indistinguishable from the original distribution. In order to have a better view on the behavior of the approximations in the tail, we consider a QQ-plot where the quantiles of S^{cl}_{sp,65}, S^l_{sp,65} and S^c_{sp,65} are plotted against the quantiles of Ssp,65 obtained by simulation. The different bounds will be good approximations if the plotted points (F^{−1}_{S^l_{sp,65}}(p), F^{−1}_{Ssp,65}(p)), (F^{−1}_{S^{cl}_{sp,65}}(p), F^{−1}_{Ssp,65}(p)) and (F^{−1}_{S^c_{sp,65}}(p), F^{−1}_{Ssp,65}(p)) for all values of p in (0, 1) do not deviate too much from the line y = x. From the QQ-plot in Figure 3.2, we can conclude that the comonotonic upper bound slightly overestimates the tails of Ssp,65, whereas the accuracy of the lower bounds S^l_{sp,65} and S^{cl}_{sp,65} is extremely high; the corresponding QQ-plot is indistinguishable from a perfect straight line. These visual observations are confirmed by the numerical values of some upper quantiles displayed in Table 3.1, which also reports the moments based approximations S^m_{sp,65} and S^{cm}_{sp,65}.
Stop-loss premiums for the different approximations are compared in Figure 3.3 and Table 3.2. This study confirms the high accuracy of the derived bounds. Note that for very high values of d the differences become larger; however, these cases are of no practical importance. All Monte Carlo estimates are very close to π^{clb}(Ssp,65, d, Γ, Λ) and some of them even turn out to be smaller than this lower bound. This not only demonstrates the difficulty of estimating stop-loss premiums by simulation, but it also indicates the accuracy of the lower bound π^{clb}(Ssp,65, d, Γ, Λ). Indeed, since the Monte Carlo estimate is based on random paths, it can be smaller than π^{clb}(Ssp,65, d, Γ, Λ) and this is very likely to happen if the
p       S^l_{sp,65}   S^{cl}_{sp,65}   S^m_{sp,65}   S^{cm}_{sp,65}   S^c_{sp,65}   MC (s.e. × 10³)
0.75    14.1741       14.1887          14.1750       14.1887          14.1867       14.1887 (0.978)
0.90    17.5905       17.5972          17.6250       17.6008          18.0797       17.5969 (1.420)
0.95    19.9565       19.9713          20.0232       19.9783          20.8754       19.9731 (1.896)
0.975   22.2495       22.2875          22.3559       22.2986          23.6574       22.2839 (2.816)
0.995   27.5124       27.6700          27.7498       27.6943          30.2983       27.6933 (6.324)

Table 3.1: Approximations for some selected quantiles with probability level p of Ssp,65.
d    S^l_{sp,65}   S^{cl}_{sp,65}   S^m_{sp,65}   S^{cm}_{sp,65}   S^c_{sp,65}   MC (s.e. × 10⁴)
0    11.0944       11.0944          11.0944       11.0944          11.0944       11.0937 (9.43)
5     6.3715        6.3756           6.3721        6.3756           6.3792        6.3748 (8.67)
10    2.5956        2.6071           2.6029        2.6078           2.6900        2.6068 (5.89)
15    0.7151        0.7201           0.7265        0.7213           0.8629        0.7201 (0.34)
20    0.1628        0.1664           0.1698        0.1671           0.2536        0.1668 (0.21)
25    0.0357        0.0379           0.0388        0.0382           0.0758        0.0382 (0.10)
30    0.0080        0.0091           0.0092        0.0092           0.0239        0.0093 (0.02)
35    0.0019        0.0023           0.0024        0.0023           0.0081        0.0024 (0.004)

Table 3.2: Approximations for some selected stop-loss premiums with retention d of Ssp,65.
lower bound is close to the real stop-loss premium. Table 3.3 compares the stop-loss premium of the comonotonic upper bound (CUB) with the partially exact/comonotonic upper bound π^{pecub}(Ssp,65, d, Λ, Γ) (PECUB) and the two combination bounds π^{eub}(Ssp,65, d, Λ, Γ) (EMUB) (an upper bound based on the lower bound S^l_{sp,65}) and π^{min}(Ssp,65, d, Λ, Γ) (MIN). For the partially exact/comonotonic upper bound we use the same conditioning variable as for the lower bound S^{cl}_{sp,65}. Remark that the decomposition variable is of the form (2.55) with Λ ≡ Λn.
For the important retentions d = 5, 10, 15 and 20 the upper bound π^{min}(Ssp,65, d, Λ, Γ) really improves on the comonotonic upper bound. Notice that for the extreme cases the values are more or less the same.
Figure 3.1: The cdf’s of ‘Ssp,65’ (MC) (solid grey line), S^l_{sp,65} (•-line), S^{cl}_{sp,65} (▲-line) and S^c_{sp,65} (dashed line).

Figure 3.2: QQ-plot of the quantiles of S^l_{sp,65} (◦) / S^{cl}_{sp,65} (△) and S^c_{sp,65} (□) versus those of ‘Ssp,65’ (MC).

Figure 3.3: Stop-loss premiums for ‘Ssp,65’ (MC) (solid grey line), S^l_{sp,65} (•-line), S^{cl}_{sp,65} (▲-line) and S^c_{sp,65} (dashed line).
d    MIN       EMUB      PECUB     CUB       MC (s.e. × 10⁴)
0    11.0944   11.0944   11.0944   11.0944   11.0937 (9.43)
5     6.3759    6.3761    6.3775    6.3792    6.3748 (8.67)
10    2.6153    2.6164    2.6523    2.6900    2.6068 (5.89)
15    0.7484    0.7532    0.8025    0.8629    0.7201 (0.34)
20    0.2066    0.2207    0.2331    0.2536    0.1668 (0.21)
25    0.0684    0.1009    0.0711    0.0758    0.0382 (0.10)
30    0.0223    0.0738    0.0223    0.0239    0.0093 (0.02)
35    0.0074    0.0672    0.0074    0.0081    0.0024 (0.004)

Table 3.3: Upper bounds for some selected stop-loss premiums with retention d of Ssp,65.
3.3.2 A homogeneous portfolio of life annuities
We consider now the distribution of the present value of a homogeneous portfolio of N0 life annuities paying a fixed amount of αi (> 0) at the end of each year i. This present value can be expressed by the formula

Spp,x = Σ_{i=1}^{⌊ω−x⌋} Ni αi e^{−Y(i)},

where Ni denotes the number of survivors in year i and can be written as

Ni = I(Tx^(1)>i) + I(Tx^(2)>i) + . . . + I(Tx^(N0)>i),

where Tx^(j) denotes the future lifetime of the j-th insured. We assume that these random variables are mutually independent. So the random variables Ni are binomially distributed with parameters n = N0 and success parameter i px.
Note that

Spp,x = Σ_{j=1}^{N0} S^{(j)}_{sp,x},    (3.8)

with S^{(j)}_{sp,x} given by

S^{(j)}_{sp,x} = Σ_{i=1}^{⌊ω−x⌋} I(Tx^(j)>i) αi e^{−Y(i)}.
The computation of the convex upper and lower bounds for the case of a portfolio of life annuities is more complicated than in the case of a single life annuity. The binomially distributed random variables Ni are not very useful in practical computations, because there exist no closed-form expressions for the cumulative and the inverse distribution functions. This problem can be dealt with by replacing the random variables Ni by more handy continuous approximations Ñi. We propose to approximate the distribution of Ni by the Normal Power Approximation (NPA). In contrast with a Normal approximation, this allows to incorporate the skewness, because the binomial distribution is very skewed (unless either the parameter n is very high or the success parameter p is close to ½). The distribution function of the NPA Ñi is given by the formula
F_{Ñi}(x) = Φ( −3/γ_{Ni} + √( 9/γ_{Ni}² + 6(x − µ_{Ni})/(γ_{Ni} σ_{Ni}) + 1 ) ),

where

µ_{Ni} = E[Ni] = N0 i px,
σ²_{Ni} = Var[Ni] = N0 i px i qx,
γ_{Ni} = E[(Ni − µ_{Ni})³] / σ³_{Ni} = (1 − 2 i px) / √(N0 i px i qx).

Then the p-th quantile of Ñi is given by

F^{−1}_{Ñi}(p) = µ_{Ni} + σ_{Ni} Φ^{−1}(p) + (γ_{Ni} σ_{Ni}/6) ( (Φ^{−1}(p))² − 1 ).    (3.9)
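Formula (3.9) translates directly into code; a sketch (function name ours):

```python
import math
from statistics import NormalDist

def npa_quantile(p, n0, tpx_i):
    """p-th quantile of N_i ~ Binomial(n0, i p_x) via the Normal Power
    Approximation, formula (3.9)."""
    tqx_i = 1.0 - tpx_i
    mu = n0 * tpx_i
    sigma = math.sqrt(n0 * tpx_i * tqx_i)
    gamma = (1.0 - 2.0 * tpx_i) / math.sqrt(n0 * tpx_i * tqx_i)  # binomial skewness
    z = NormalDist().inv_cdf(p)
    return mu + sigma * z + gamma * sigma / 6.0 * (z * z - 1.0)
```

Note that γσ simplifies to 1 − 2·i px, so at the median (z = 0) the NPA shifts the normal quantile by −(1 − 2·i px)/6.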
The upper bound
The upper bound S^c_{pp,x} is computed as described in Section 2.5.3. The only difference is that in the formulas (2.71), (2.72) and (2.75) F^{−1}_{Xi}(u) has to be replaced by the approximation given in (3.9).
The lower bound
To compute the lower bound one has to choose two conditioning variables: Γ and Λ. For the first conditioning random variable Γ we propose to take Ni0 — the number of policies-in-force in the year i0. Note that

E[Ni | Ni0 = n0] = i−i0 px+i0 · n0  for i ≥ i0.

For i < i0, Pr[Ni = n | Ni0 = n0] can be computed from Bayes’ theorem. As a result one gets the following formula for the conditional expectation (writing C(n, k) for the binomial coefficient):

E[Ni | Ni0 = n0] = Σ_{k=n0}^{N0} k Pr[Ni0 = n0 | Ni = k] Pr[Ni = k] / Pr[Ni0 = n0]
= Σ_{k=n0}^{N0} k C(k, n0) (i0−i px+i)^{n0} (i0−i qx+i)^{k−n0} C(N0, k) (i px)^k (i qx)^{N0−k} / ( C(N0, n0) (i0 px)^{n0} (i0 qx)^{N0−n0} )
= Σ_{k=n0}^{N0} k C(N0 − n0, k − n0) (i px · i0−i qx+i)^{k−n0} (i qx)^{N0−k} / (i0 qx)^{N0−n0}.
For mathematical convenience we rewrite this formula for non-integer values of Ni0 as follows

E[Ni | Ni0 = y] = Σ_{k=⌈y⌉}^{N0} k C(N0 − ⌈y⌉, k − ⌈y⌉) (i px · i0−i qx+i)^{k−⌈y⌉} (i qx)^{N0−k} / (i0 qx)^{N0−⌈y⌉}.    (3.10)
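Formula (3.10) can be evaluated directly; a sketch (function name ours; spx is a user-supplied survival function t ↦ t p_x):

```python
import math

def cond_expected_inforce(y, n_policies, i, i0, spx):
    """E[N_i | N_{i0} = y] for i < i0, formula (3.10).
    spx(t) returns the survival probability t p_x."""
    ipx, i0px = spx(i), spx(i0)
    iqx, i0qx = 1.0 - ipx, 1.0 - i0px
    q_mid = 1.0 - i0px / ipx                     # i0-i q_{x+i}
    m = math.ceil(y)
    return sum(k * math.comb(n_policies - m, k - m)
               * (ipx * q_mid) ** (k - m)
               * iqx ** (n_policies - k)
               / i0qx ** (n_policies - m)
               for k in range(m, n_policies + 1))
```

Since ipx·q_mid + iqx = i0qx, the weights are Binomial(n_policies − m, ipx·q_mid/i0qx) probabilities, so the sum equals m + (n_policies − m)·ipx·q_mid/i0qx; this closed form gives a quick consistency check.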
We propose to take Λ^(a), as defined in Section 3.3.1, for the second conditioning random variable Λ. Now one can perform step by step the computations described in Subsection 2.5.3, with the only exception that E[Xi | Γ = γ] has to be replaced in the formulas (2.80) and (2.81) by E[Ni | Ni0 = y] in (3.10).
Also the stop-loss premiums are calculated according to the methodology presented in Subsection 2.5.3, with the only difference the replacement of E[Xi | Γ = F^{−1}_Γ(u)] in formula (2.83) by the approximation given in (3.10).
The moments based approximation
As in the case of a single life annuity, the only problem in the computation of the weight z given by (2.66) is to find expressions for the variances of S^c_{pp,x}, S^l_{pp,x} and Spp,x. For the upper and the lower bound we have deployed the procedure described in Section 2.5.3, with fi(u) replaced by

fi(u) = F^{−1}_{Ñi}(u)  for the upper bound

and

fi(u) = E[Ni | Ni0 = F^{−1}_{Ñi0}(u)]  for the lower bound.

The variance of Spp,x can be computed from (3.8) and by noticing that, given the returns ~Y = (Y(1), . . . , Y(⌊ω − x⌋)), the random variables S^{(1)}_{sp,x}, S^{(2)}_{sp,x}, . . . , S^{(N0)}_{sp,x} are conditionally independent. Hence, we have that

Var[Spp,x] = E_{~Y}[ Var[Spp,x | ~Y] ] + Var_{~Y}[ E[Spp,x | ~Y] ]
= N0 E_{~Y}[ Var[Ssp,x | ~Y] ] + N0² Var_{~Y}[ E[Ssp,x | ~Y] ]
= N0 Var[Ssp,x] + (N0² − N0) Var_{~Y}[ E[Ssp,x | ~Y] ],
p       S^l_{pp,65}   S^m_{pp,65}   S^c_{pp,65}   MC (s.e.)
0.75    12 574        12 577        12 821        12 577 (3.90)
0.90    14 565        14 574        15 290        14 568 (5.08)
0.95    15 937        15 951        17 029        15 947 (8.15)
0.975   17 252        17 272        18 722        17 276 (8.80)
0.995   20 209        20 250        22 620        20 242 (22.09)

Table 3.4: Approximations for some selected quantiles with probability level p of Spp,65.
where Var[Ssp,x] is calculated in Subsection 3.3.1 and

Var_{~Y}[ E[Ssp,x | ~Y] ] = Σ_{i=1}^{⌊ω−x⌋} Σ_{j=1}^{⌊ω−x⌋} i px j px αi αj e^{−µi − µj + (σi² + σj²)/2} ( e^{σij} − 1 ).
A numerical illustration
To test the quality of the derived approximations we present a numerical illustration similar to that of Subsection 3.3.1. As before we work in a Black & Scholes setting with drift µ = 0.05 and volatility σ = 0.1, and we use the Makeham-Gompertz law to describe the mortality process of 65-year-old male persons. We compare the performance of the lower bound S^l_{pp,65}, the upper bound S^c_{pp,65} and the moments based approximation S^m_{pp,65} with the real value of Spp,65, obtained by extensive simulation, for a portfolio of 1 000 policies. The number of policies-in-force after the first year, N1, is taken as the conditioning random variable Γ for the lower bound. This choice seems to us to be reasonable — other choices can improve the performance of the lower bound only a bit, at the cost of a significant increase in computational time. The Monte Carlo (MC) study of Spp,65 is based on 30 × 50 000 simulated paths. Antithetic variables are used in order to reduce the variance of the Monte Carlo estimates.
The quality of the approximations is illustrated in Figures 3.4 and 3.5. One can see that the lower bound S^l_{pp,65} indeed performs very well. The fit of the upper bound is a bit poorer but still reasonable. The moments based approximation S^m_{pp,65} performs extremely well. These visual observations are confirmed by the numerical values of some upper quantiles displayed in Table 3.4 and by the study of stop-loss premiums in Figure 3.6 and in Table 3.5.
Figure 3.4: The cdf’s of ‘Spp,65’ (MC) (solid grey line), S^l_{pp,65} (•-line), S^m_{pp,65} (▲-line) and S^c_{pp,65} (dashed line).

Figure 3.5: QQ-plot of the quantiles of S^l_{pp,65} (◦) / S^m_{pp,65} (△) and S^c_{pp,65} (□) versus those of ‘Spp,65’ (MC).

Figure 3.6: Stop-loss premiums for ‘Spp,65’ (MC) (solid grey line), S^l_{pp,65} (•-line), S^m_{pp,65} (▲-line) and S^c_{pp,65} (dashed line).
d        S^l_{pp,65}   S^m_{pp,65}   S^c_{pp,65}   MC (s.e.)
0        11 094        11 094        11 094        11 098 (2.11)
5 000     6 094         6 094         6 095         6 098 (2.10)
10 000    1 608         1 610         1 793         1 611 (1.95)
15 000    153.7         155.3         278.4         155.3 (1.78)
20 000    10.23         10.57         36.02         10.67 (1.26)
25 000    0.680         0.734         4.816         0.743 (0.09)
30 000    0.051         0.059         0.711         0.036 (0.02)

Table 3.5: Approximations for some selected stop-loss premiums with retention d of S_{pp,65}.
3.3.3 An 'average' portfolio of life annuities
As explained in Section 3.2, in the case of large portfolios of life annuities it suffices to compute risk measures of an 'average' portfolio given by

S_{app,x} = \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_{i}p_{x}\, \alpha_i\, e^{-Y(i)},

where we assume that the payments α_i are positive and due at times i = 1, ..., ⌊ω − x⌋ (payable at the end of each year). Notice that S_{app,x} is of the form (2.29) and that S_{app,x} = E[S_{sp,x} | Y(1), ..., Y(⌊ω − x⌋)]. Comonotonic approximations for this type of sums have been studied extensively by Kaas et al. (2000), Dhaene et al. (2002a,b), Vyncke (2003), Darkiewicz (2005b) and Vanduffel (2005), among others.
It turns out that for this application the conditioning variable of the 'maximal variance' form gives very accurate results. This means that we define Λ here as

\Lambda = \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_{i}p_{x}\, \alpha_i\, e^{-\mu_i + \frac{1}{2}\sigma_i^2}\, Y(i).

Notice that this conditioning variable could also be used in order to derive the lower bound for a single life annuity.
To compute the comonotonic approximations for the quantiles and stop-loss premiums, notice that the correlations r_i are given by

r_i = \mathrm{Corr}(Y(i), \Lambda) = \frac{\mathrm{Cov}[Y(i), \Lambda]}{\sigma_i \sigma_\Lambda}.

Because all correlation coefficients r_i are positive, we have seen that the lower bound is a comonotonic sum (all the terms in the sum are non-decreasing functions of the same standard uniform random variable U). This implies that the quantiles related to the lower and upper bound can be computed by summing the corresponding quantiles of the marginals involved. We find the following expressions for the quantiles and stop-loss premiums of S^l_{app,x} and S^c_{app,x}:
F^{-1}_{S^l_{app,x}}(p) = \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_{i}p_{x}\, \alpha_i\, e^{-\mu_i + r_i \sigma_i \Phi^{-1}(p) + \frac{1}{2}(1 - r_i^2)\sigma_i^2},

F^{-1}_{S^c_{app,x}}(p) = \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_{i}p_{x}\, \alpha_i\, e^{-\mu_i + \mathrm{sign}({}_{i}p_{x}\alpha_i)\, \sigma_i \Phi^{-1}(p)},

\pi^{lb}(S_{app,x}, d, \Lambda) = \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_{i}p_{x}\, \alpha_i\, e^{-\mu_i + \frac{\sigma_i^2}{2}}\, \Phi\!\left(r_i \sigma_i - \Phi^{-1}\!\big(F_{S^l_{app,x}}(d)\big)\right) - d\left(1 - F_{S^l_{app,x}}(d)\right),

\pi^{cub}(S_{app,x}, d) = \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_{i}p_{x}\, \alpha_i\, e^{-\mu_i + \frac{\sigma_i^2}{2}}\, \Phi\!\left(\mathrm{sign}({}_{i}p_{x}\alpha_i)\, \sigma_i - \Phi^{-1}\!\big(F_{S^c_{app,x}}(d)\big)\right) - d\left(1 - F_{S^c_{app,x}}(d)\right).
To calculate the moments based approximation we need the expressions for the variances of S_{app,x}, S^l_{app,x} and S^c_{app,x}. These are given by

\mathrm{Var}[S_{app,x}] = \sum_{i=1}^{\lfloor \omega - x \rfloor} \sum_{j=1}^{\lfloor \omega - x \rfloor} {}_{i}p_{x}\, {}_{j}p_{x}\, \alpha_i \alpha_j\, e^{-\mu_i - \mu_j + \frac{\sigma_i^2 + \sigma_j^2}{2}} \left(e^{\sigma_{ij}} - 1\right),

\mathrm{Var}[S^l_{app,x}] = \sum_{i=1}^{\lfloor \omega - x \rfloor} \sum_{j=1}^{\lfloor \omega - x \rfloor} {}_{i}p_{x}\, {}_{j}p_{x}\, \alpha_i \alpha_j\, e^{-\mu_i - \mu_j + \frac{\sigma_i^2 + \sigma_j^2}{2}} \left(e^{r_i r_j \sigma_i \sigma_j} - 1\right),

\mathrm{Var}[S^c_{app,x}] = \sum_{i=1}^{\lfloor \omega - x \rfloor} \sum_{j=1}^{\lfloor \omega - x \rfloor} {}_{i}p_{x}\, {}_{j}p_{x}\, \alpha_i \alpha_j\, e^{-\mu_i - \mu_j + \frac{\sigma_i^2 + \sigma_j^2}{2}} \left(e^{\sigma_i \sigma_j} - 1\right).
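The double sums above, together with one common way of building a moments-based approximation (a convex combination of the cdfs of the two bounds chosen so that the variance of S_{app,x} is matched), can be sketched as follows. The weight formula is our reconstruction of this standard construction, not necessarily the thesis's exact recipe, and all inputs are hypothetical.

```python
import numpy as np

def variances(px, alpha, mu, sigma, r, cov):
    """Var[S], Var[S^l], Var[S^c] via the double-sum formulas above;
    cov[i, j] = Cov(Y(i), Y(j)) = sigma_ij."""
    w = px * alpha * np.exp(-mu + 0.5 * sigma**2)     # common weight factors
    base = np.outer(w, w)
    var_s  = np.sum(base * (np.exp(cov) - 1.0))
    var_lb = np.sum(base * (np.exp(np.outer(r * sigma, r * sigma)) - 1.0))
    var_ub = np.sum(base * (np.exp(np.outer(sigma, sigma)) - 1.0))
    return var_s, var_lb, var_ub

def moment_weight(var_s, var_lb, var_ub):
    """Weight z with z*Var[S^l] + (1-z)*Var[S^c] = Var[S];
    the approximating cdf is then F_m = z*F_l + (1-z)*F_c."""
    return (var_ub - var_s) / (var_ub - var_lb)
```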
3.3.4 A numerical illustration
In this subsection we illustrate our findings numerically and graphically. We use the same parameters for the financial and mortality processes as in the two previous illustrations, namely a Black & Scholes model for the returns with µ = 0.05 and σ = 0.1, and the Makeham-Gompertz law with the corresponding coefficients of the Belgian analytic life table MR. We compare the different approximations for quantiles and stop-loss premiums with the values obtained by Monte Carlo simulation (MC). The simulation results are based on generating 500 × 100 000 random paths. The estimates obtained from this time-consuming simulation will serve as a benchmark. Antithetic variables are used in generating the random paths in order to reduce the variance of the Monte Carlo estimates.
Figure 3.7 shows the distribution functions of the lower bound S^l_{app,65}, the upper bound S^c_{app,65}, the moments based approximation S^m_{app,65} and the simulated S_{app,65}. Again the lower bound and the moments based approximation prove to be very good approximations for the real cumulative distribution function of S_{app,65}. To assess the accuracy of the bounds in the tails, we plot their quantiles against those of S_{app,65} in Figure 3.8. The largest quantile (p = 0.995) of S^m_{app,65} in this QQ-plot underestimates the exact quantile by only 0.06%. Table 3.6 shows the numerical values for some high quantiles. The stop-loss premiums for different choices of d are shown in Figure 3.9 and in Table 3.7. The lower bound and the moments based approximation give very accurate results compared to the real value of the stop-loss premium. The comonotonic upper bound performs rather badly for some retentions. However, using the results of Chapter 2 we can construct sharper upper bounds than the traditional comonotonic upper bound.
In Table 3.8 we compare the stop-loss premium of the comonotonic upper bound (CUB) with the partially exact/comonotonic upper bound π^{pecub}(S_{app,65}, d, Λ) (PECUB) and the two upper bounds based on the lower bound S^l_{app,65} plus an error term, one dependent on the retention, π^{deub}(S_{app,65}, d, Λ) (DEUB), and one independent of the retention, π^{eub}(S_{app,65}, d, Λ) (EUB). For the partially exact/comonotonic upper bound we use the same conditioning variable as for the lower bound S^l_{app,65}. The decomposition variable used in this illustration is given by

d_\Lambda = d - \sum_{i=1}^{\lfloor \omega - x \rfloor} {}_{i}p_{x}\, \alpha_i\, e^{-\mu_i + \frac{\sigma_i^2}{2}} \left(1 + \mu_i - \frac{1}{2}\sigma_i^2\right).
The results for the different upper bounds are in line with the previous ones for a single life annuity. Note that for very high values of d the differences become larger, but these cases are of no practical importance. We can conclude that in both cases the upper bound based on the lower bound plus an error term dependent on the retention, π^{deub}(·, d, Λ), performs very well.
p       S^l_{app,65}   S^m_{app,65}   S^c_{app,65}   MC (s.e. × 10^4)
0.75    12.5745        12.5760        12.8192        12.574  (0.03)
0.90    14.5649        14.5698        15.2819        14.5699 (0.07)
0.95    15.9364        15.9444        17.0152        15.9448 (0.14)
0.975   17.2513        17.2628        18.703         17.2683 (0.24)
0.995   20.2073        20.2303        22.5847        20.2425 (1.58)

Table 3.6: Approximations for some selected quantiles with probability level p of S_{app,65}.
d    S^l_{app,65}   S^m_{app,65}   S^c_{app,65}   MC (s.e. × 10^4)
0    11.0944        11.0944        11.0944        11.0948 (8.22)
5    6.0945         6.0945         6.0951         6.0948 (7.67)
10   1.6081         1.6094         1.7910         1.6097 (4.45)
15   0.1536         0.1545         0.2766         0.1549 (1.01)
20   0.0102         0.0104         0.0355         0.0105 (0.31)
25   0.0007         0.0007         0.0047         0.0007 (0.01)

Table 3.7: Approximations for some selected stop-loss premiums with retention d of S_{app,65}.
Notice that for the retention d = 0 all values in both tables are identical and equal to 11.0944, except the value for EUB, because there the error term is independent of the retention. This follows from the fact that in this case the expected value of S_{sp,65} equals the expected value of S_{app,65}. Note also that the values in Tables 3.2 and 3.3 are typically larger than the corresponding values in Tables 3.7 and 3.8. This is not surprising. From Remark 5 it immediately follows that S_{app,65} ≤_{cx} S_{sp,65}, and hence for any retention d > 0 one has

\pi(S_{app,65}, d) \le \pi(S_{sp,65}, d).
[Figure 3.7: The cdf's of S_{app,65} (MC) (solid grey line), S^l_{app,65} (•-line), S^m_{app,65} (▲-line) and S^c_{app,65} (dashed line).]
[Figure 3.8: QQ-plot of the quantiles of S^l_{app,65} (◦)/S^m_{app,65} (△) and S^c_{app,65} (□) versus those of S_{app,65} (MC).]
[Figure 3.9: Stop-loss premiums for S_{app,65} (MC) (solid grey line), S^l_{app,65} (•-line), S^m_{app,65} (▲-line) and S^c_{app,65} (dashed line).]
d    EUB       DEUB      PECUB     CUB       MC (s.e. × 10^4)
0    11.1652   11.0944   11.0944   11.0944   11.0948 (8.22)
5    6.1653    6.0948    6.0948    6.0951    6.0948 (7.67)
10   1.6789    1.6240    1.6980    1.7910    1.6097 (4.45)
15   0.2244    0.2144    0.2559    0.2766    0.1549 (1.01)
20   0.0810    0.0809    0.0328    0.0355    0.0105 (0.31)
25   0.0715    0.0715    0.0041    0.0047    0.0007 (0.01)

Table 3.8: Upper bounds for some selected stop-loss premiums with retention d of S_{app,65}.
3.4 Conclusion
In this chapter we studied the case of life annuities. The aggregate distribution function of such stochastic sums of dependent random variables is very difficult to calculate. Usually it is only possible to obtain formulae for the first few moments. To compute more cumbersome risk measures, such as stop-loss premiums or upper quantiles, one has to rely on time-consuming simulations.

We derived comonotonicity based approximations both for the case of a single life annuity and for a homogeneous portfolio of life annuities. The numerical illustrations confirm the very high accuracy of the bounds (especially the lower bound). These observations are confirmed by the results for the stop-loss premiums. One may get the impression that the upper bound, which performs more poorly than the lower bound in all cases, is not worth studying. In actuarial applications, however, the upper bound deserves a lot of attention, because one is usually interested in conservative estimates of the quantities of interest. Indeed, when an actuary calculates reserves he has to take into account some additional sources of uncertainty, such as the choice of the interest rate model, the estimation of parameters, the assumptions about mortality, the longevity risk and many others. For these reasons the estimates provided by the upper bound in convex order can in many cases be more appropriate than the more accurate approximations obtained from the lower bound in convex order.
Chapter 4
Reserving in non-life insurance business
Summary. In this chapter we present some methods to set up confidence bounds for the discounted IBNR reserve. We first model the claim payments by means of a lognormal and a loglinear location-scale regression model. We then extend this to the class of generalized linear models. Knowledge of the distribution function of the discounted IBNR reserve will help us to determine the initial reserve, e.g. through the quantile risk measure. The results are based on the comonotonic approximations explained in Chapter 2.
4.1 Introduction
To get a correct picture of its liabilities, a company should set aside a correctly estimated amount of money to meet claims arising in the future on the written policies. The past data used to construct estimates for the future payments consist of a triangle of incremental claims Yij, as depicted in Figure 4.1. This is the simplest shape of data that can be obtained and it avoids having to introduce complicated notation to cope with all possible situations. We use the standard notation, with the random variables Yij for i = 1, 2, ..., t; j = 1, 2, ..., s denoting the claim figures for year of origin (or accident year) i and development year j, meaning that the claim amounts were paid in calendar year i + j − 1. Year of origin, year of development and calendar year act as possible explanatory variables for the observation Yij.
Year of            Development year
origin      1      2     ···     j     ···    t−1       t
1          Y11    Y12    ···    Y1j    ···   Y1,t−1    Y1t
2          Y21    Y22    ···    Y2j    ···   Y2,t−1
...
i          Yi1    ···    Yij
...
t          Yt1

Figure 4.1: Random variables in a run-off triangle
Most claims reserving methods assume that t = s. For (i, j) combinations
with i + j ≤ t + 1, Yij has already been observed, otherwise it is a future
observation. Next to claims actually paid, these figures can also be used
to denote quantities such as loss ratios. To a large extent, it is irrelevant
whether incremental or cumulative data are used when considering claims
reserving in a stochastic context.
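The observed/future split just described (Yij observed when i + j ≤ t + 1) is easy to encode as a pair of boolean masks; a small sketch:

```python
import numpy as np

t = 5
i, j = np.indices((t, t)) + 1      # 1-based year of origin i, development year j
observed = (i + j) <= t + 1        # upper-left run-off triangle: already paid
future = ~observed                 # lower-right triangle: to be predicted
print(observed.astype(int))
```

Such masks make it straightforward to build the design matrices for the upper triangle and for the completed square later in the chapter.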
We consider annual development (the methods can be extended easily
to semi-annual, quarterly or monthly development) and we assume that
the time it takes for the claims to be completely paid is fixed and known.
The triangle is augmented each year by the addition of a new diagonal.
The purpose is to complete this run-off triangle to a square, or to
a rectangle if estimates are required pertaining to development years of
which no data are recorded in the run-off triangle at hand. To this end, the
actuary can make use of a variety of techniques. The inherent uncertainty
is described by the distribution of possible outcomes, and one needs to
arrive at the best estimate of the reserve.
The choice of an appropriate statistical model is an important matter.
Furthermore, within a stochastic framework, there is considerable flexibility in the choice of predictor structures. In England & Verrall (2002) the
reader finds an excellent review of possible stochastic models. An appropriate model will enable the calculation of the distribution of the reserve
that reflects the process variability producing the future payments, and
accounts for the estimation error and statistical uncertainty (in the sense
given in Taylor & Ashe (1983)). It is necessary to be able to estimate the
variability of claims reserves, and ideally to be able to estimate the full distribution of possible outcomes so that percentiles (or other risk measures
of this distribution) can be obtained. Next, recognizing the estimation
error involved with the parameter estimates, confidence intervals for these
measures constitute another desirable part of the output.
Here, putting the emphasis on the discounting aspect of the reserve,
we first consider simple lognormal linear models. Doray (1996) studied
the loglinear models extensively, taking into account the estimation error
on the parameters and the statistical prediction error in the model. Such
models have some significant disadvantages. Predictions from this model
can yield unusable results, and we need to impose that each incremental
value should be greater than zero. So, it is not possible to model negative
or zero claims. From the nature of the claims reserving problem, it is
expected that a higher proportion of zeros would be observed in the later
stages of the incremental loss data triangle. In reinsurance, zero claims
are also frequently observed in incremental loss data triangles for excess
layers. Negative incremental values will be the result of salvage recoveries,
payments from third parties, total or partial cancellation of outstanding
claims (due to initial overestimation of the loss or to a jury decision in favor of the insurer), rejection by the insurer, or just plain errors.
In Goovaerts & Redant (1999) a lognormal linear regression model is used
to model the random fluctuations in the direction of the calendar years,
taking into account the apparatus of financial mathematics. The results are
based on supermodularity order, such that, in the framework of stop-loss
ordering one obtains the distribution of the IBNR reserve corresponding
to an extremal element in this ordering, when some marginals are fixed.
The lognormal linear model is a member of the broader class of loglinear location-scale regression models. In Doray (1994) the reader can find
an overview of the main characteristics of the different distributions in
this class. The logarithm of the error is assumed to follow certain known
distributions (normal, extreme value, generalized loggamma, logistic and
log inverse Gaussian). Doray studied these models extensively. He has derived certain theoretical properties of these distributions and proved that
the MLE’s of the regression and scale parameters exist and are unique,
when the error has a log-concave density.
Claim sizes can often be described by distributions with a subexponential right tail. Furthermore, the phenomena to be modelled are rarely
additive in the collateral data. A multiplicative model is much more plausible. These problems cannot be solved by working with ordinary linear
models, but they can be solved with generalized linear models. The generalization is twofold. First, the random deviations from the mean are allowed to follow a distribution other than the normal. In fact, one can take any distribution from the exponential dispersion family, including for instance the Poisson, the binomial, the gamma and the inverse Gaussian distributions. Second, it is no longer necessary that the mean of the random variable is a linear function of the explanatory variables; it only has to be linear on a certain scale. If this scale is for instance logarithmic, we have in fact a multiplicative model instead of an additive model.
Loss reserving deals with the determination of the (characteristics of
the) d.f. of the random present value of an unknown amount of future
payments. Since this d.f. is very important for an insurance company and
its policyholders, these inherent uncertainties are no excuse for providing
anything less than a rigorous scientific analysis. In order for the reserve
estimate truly to represent the actuary’s “best estimate” of the needed
reserve, both the determination of the expected value of unpaid losses and
the appropriate discount should reflect the actuary’s best estimates (i.e.
should not be dictated by others or by regulatory requirements). Since
the reserve is a provision for the future payment of unpaid losses, we believe the estimated loss reserve should reflect the time value of money. In
many situations this discounted reserve is useful, for example in dynamic financial analysis, assessing profitability and pricing, identifying risk-based capital needs, loss portfolio transfers, profit testing, and so on. Ideally the
discounted loss reserve would also be acceptable for regulatory reporting.
However, many current regulations do not permit it. It could be argued
that reserves set on an undiscounted basis include an implicit margin for
prudence, although, in the current climate of low interest rates, that margin is very much reduced. If reserves are set on a discounted basis, there is
a strong case for including an explicit prudential margin. As such, a risk
margin based on a risk measure from a predictive distribution of claims
reserves is a strong contender.
One of the sub-problems in this respect consists of the discounting of
the future estimates in the run-off triangle, where returns (and inflation)
are not known for certain. We will model the stochastic discount factor
using a Brownian motion with drift. When determining the discounted
loss reserve, we impose an explicit margin based on a risk measure (for
example Value-at-Risk) from the total distribution of the discounted reserve. Considering the discounted IBNR reserve, we have to incorporate a
4.2. The claims reserving problem
131
certain dependence structure. In general, it is hard or even impossible to
determine the quantiles of the discounted loss reserve analytically, because
in any realistic model for the return process this random variable will be
a sum of strongly dependent random variables. The “true” multivariate
distribution function of the lower triangle cannot be determined analytically in most cases, because the mutual dependencies are not known, or
are difficult to cope with. We suggest solving this problem by calculating upper and lower bounds that make efficient use of the available information.
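A sketch of the discount-factor model just described: the accumulated return Y(k) follows a Brownian motion with drift, so yearly increments are i.i.d. normal. The drift and volatility values below are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
delta, sigma = 0.05, 0.1   # illustrative drift and volatility of the return process

def discount_factors(n_paths, n_years):
    """v(k) = exp(-Y(k)) with Y(k) a Brownian motion with drift:
    Y(k) is a sum of k i.i.d. N(delta, sigma^2) yearly increments."""
    incr = rng.normal(delta, sigma, size=(n_paths, n_years))
    return np.exp(-np.cumsum(incr, axis=1))

v = discount_factors(100_000, 10)
print(v[:, -1].mean())   # close to exp(-10*delta + 10*sigma**2/2), the lognormal mean
```

Because each v(k) is lognormal and the v(k) are built from the same increments, a discounted IBNR reserve becomes a sum of strongly dependent lognormal terms, which is precisely the situation the comonotonic bounds of Chapter 2 address.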
This chapter is set out as follows. Section 2 places the claims reserving
problem in a broader context. Section 3 gives a brief review of loglinear
and generalized linear models and their applications to claims reserving.
To be able to use the results of Chapter 2 we need some asymptotic results
for model parameter estimates in generalized linear models. Section 4 describes how convex lower and upper bounds can be obtained for discounted
IBNR evaluations. Some numerical illustrations for a simulated data set
are provided in Section 5, together with a discussion of the estimation error
using a bootstrap approach. We also graphically illustrate the obtained
bounds.
The results of this chapter come from Hoedemakers, Beirlant, Goovaerts
& Dhaene (2003, 2005).
4.2 The claims reserving problem
As a rule, not all claims on a general insurance portfolio will have been paid by the end of the calendar year of an insurance company. There can be several reasons for the delay in payment, e.g. delays in reporting the claim, long legal procedures, difficulties in determining the size of the claim, and so on. It is also possible that the claim still has to occur, while its cause lies in the past (e.g. exposure to asbestos). This of course depends on what is insured in the policy. The delay in payment can vary from a couple of days up to several years, depending on the complexity and the severity of the damage. To be able to pay these claims the insurer has to keep reserves which should enable him to pay all future outstanding claims.
Claims reserving is a vital area of insurance company management,
which is receiving close attention from shareholders, auditors, tax authorities and regulators. For insurance companies, the claims reserve is a very
substantial balance sheet item, which can be large in relation to shareholders' funds. Actuaries are now well established in the area of claims reserving for non-life insurance business. In many countries there is already
a statutory requirement for actuarial certification of reserves. Even in
jurisdictions where there is no such requirement, the substantial contribution actuaries can make to estimating future liabilities has been recognized
across the market.
Failure to reserve accurately for outstanding and IBNR claims will
adversely affect a company’s future financial development. Any current
reserve inadequacy will give rise to losses in subsequent years. Conversely,
premium calculations based on a too pessimistic evaluation of current liabilities will damage the company’s competitive position.
The reserves held by a general insurance company can be divided into the
following categories:
• Claims reserves representing the estimated outstanding claims payments that are to be covered by premiums already earned by the
company. These reserves are sometimes called IBNS reserves (Incurred But Not Settled). These can in turn be divided into
1. IBNYR reserves, representing the estimated claims payments for claims which have already been Incurred, But are Not Yet Reported to the company.
2. RBNS reserves, being the reserves required in respect of claims
which have been Reported to the company, But are Not yet
fully Settled. A special case of RBNS reserves are case reserves,
which are the individual reserves set by the claim handlers in
the claims handling process.
• Unearned premium reserves (UPR). Because the insurance premiums
are paid up-front, the company will, at any given accounting date,
need to hold a reserve representing the liability that a part of the
paid premium should be paid back to the policyholder in the event
that insurance policies were to be cancelled at that date. Unearned
premium reserves are pure accounting reserves, calculated on a pro
rata basis.
• Unexpired risk reserves (URR). While the policyholder only in special
cases has the option to cancel a policy before the agreed insurance
term has expired, he certainly always has the option to continue the
policy for the rest of the term. The insurance company, therefore,
runs the risk that the unearned premium will prove insufficient to
cover the corresponding unexpired risk, and hence the unexpired risk
reserve is set up to cover the probable losses resulting from insufficient
written but yet unearned premiums.
• CBNI reserves. Essentially the same as unearned premium reserves,
but to take into account possible seasonal variations in the risk pattern, they are not necessarily calculated pro rata, so that they also
incorporate the function of the unexpired risk reserves. Their purpose is to provide for Covered But Not Incurred (CBNI) claims.
• The sum of the CBNI and IBNS reserves is sometimes called the
Covered But Not Settled (CBNS) reserve.
• Fluctuation reserves (equalization reserves) do not represent a future
obligation, but are used as a buffer capital to safeguard against random fluctuations in future business results. The use of fluctuation
reserves varies from country to country.
The loss reserves considered here only refer to the claims that result from
already occurred events; the so-called IBNS reserves. Notice that often the
terminology is not used uniformly: the abbreviation IBNR is used when
speaking of loss reserving problems as a whole.
4.3 Model set-up: regression models
The problem of estimating IBNR claims consists in predicting, for each
accident year, the ultimate amount of claims incurred. The amount paid
by the insurance company for those claims, when it comes due, is then subtracted, leaving the reserve the insurer should hold for future payments. To
calculate the reserve, all methods or models usually assume that the pattern of cumulative or incremental claims incurred or paid is stable across
the development years, for each accident year. Since for the last accident
year, only one amount will be available, the reserve will be highly sensitive
to this amount. Moreover, because of growth experienced by the company,
it will be larger than any other amount in the data set, hence the importance of verifying that the development pattern of the claims has not
changed over the years. One of the earliest methods, and now the most
commonly used in the actuarial profession, is the chain-ladder method.
Assuming that for each accident year, the development pattern remains
stable, development factors are calculated by dividing cumulative paid or
incurred claims after j periods by the cumulative amount after j − 1 periods. The year-to-year development factors are then applied to the most
recent amount for each accident year, i.e. the amounts on the right-most
diagonal. Many variations have been presented for the basic chain-ladder
method just introduced; a linear trend or an exponential growth may be
assumed to be present among the development factors. Instead of taking
their weighted average, they could be extrapolated into the future. The
chain-ladder method can also be adjusted for inflation. However, the chain-ladder method suffers from the following deficiencies:
1. It explicitly assumes too many parameters (one for each column).
2. It does not give any idea of the variability of the reserve estimate, or
a confidence interval for the reserve.
3. It is negatively biased, which could lead to serious underreserving, a
threat to the insurer’s solvency.
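The basic chain-ladder mechanics described above (volume-weighted development factors applied to the latest diagonal) can be sketched on a fabricated cumulative triangle:

```python
import numpy as np

# Fabricated cumulative triangle (rows: accident years, cols: development years).
C = np.array([[1000., 1500., 1700., 1750.],
              [1100., 1650., 1870., np.nan],
              [1200., 1800., np.nan, np.nan],
              [1300., np.nan, np.nan, np.nan]])
t = C.shape[0]

# Volume-weighted development factors f_j = sum_i C[i, j+1] / sum_i C[i, j],
# using only the accident years for which both columns are observed.
f = [np.nansum(C[:t-1-j, j+1]) / np.nansum(C[:t-1-j, j]) for j in range(t - 1)]

# Complete the triangle by rolling each latest amount forward.
full = C.copy()
for i in range(1, t):
    for j in range(t - i, t):
        full[i, j] = full[i, j-1] * f[j-1]

# Reserve = estimated ultimates minus the latest observed diagonal.
reserve = np.nansum(full[:, -1]) - np.nansum([C[i, t-1-i] for i in range(t)])
print(f, reserve)
```

Note that this point estimate comes with none of the distributional information the stochastic models below provide, which is exactly deficiency 2 in the list above.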
Therefore stochastic models have been developed which make it possible to calculate an amount such that there is a high probability that the reserve will be sufficient to cover the liabilities generated by the current block of business. In claims reserving, we are interested in the aggregated value

\sum_{i=2}^{t} \sum_{j=t+2-i}^{t} Y_{ij}.
In this section we give an overview of the different regression models used in claims reserving. We use the following notation throughout this section: \vec{Y} = (Y_{11}, \dots, Y_{1t}, Y_{21}, \dots, Y_{t1}) is the vector of claims, \vec{\beta} = (\beta_1, \dots, \beta_p) are the model parameters, U is the regression matrix corresponding to the upper triangle, of dimension \frac{t(t+1)}{2} \times p, and R is the regression matrix corresponding to the complete square, of dimension t^2 \times p.
4.3.1 Lognormal linear models
We consider the following loglinear regression model in matrix notation:

\vec{Z} = \ln \vec{Y} = R\vec{\beta} + \vec{\varepsilon}, \qquad \vec{\varepsilon} \sim N(0, \sigma^2 I), \qquad (4.1)

where \vec{\varepsilon} is the vector of independent normal random errors with mean 0 and variance \sigma^2. So the normal responses Z_{ij} are assumed to decompose (additively) into a deterministic non-random component with mean (R\vec{\beta})_{ij} and a homoscedastic, normally distributed random error component with zero mean.
The parameters are estimated by the maximum likelihood method, which in the case of the normal error structure is equivalent to minimizing the residual sum of squares. The unknown variance \sigma^2 is estimated by the residual sum of squares divided by the degrees of freedom (the number of observations minus the number of regression parameters estimated):

\check{\sigma}^2 = \frac{1}{n-p} (\vec{Z} - U\hat{\vec{\beta}})' (\vec{Z} - U\hat{\vec{\beta}}). \qquad (4.2)

This is an unbiased estimator of \sigma^2. The maximum likelihood estimator of \sigma^2 is given by

\hat{\sigma}^2 = \frac{1}{n} (\vec{Z} - U\hat{\vec{\beta}})' (\vec{Z} - U\hat{\vec{\beta}}), \qquad (4.3)

while the maximum likelihood estimator of \vec{\beta} is

\hat{\vec{\beta}} = (U'U)^{-1} U' \vec{Z}. \qquad (4.4)
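Equations (4.2)-(4.4) amount to ordinary least squares on the log-claims; a sketch with a small fabricated design matrix U standing in for the upper-triangle regression matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 10, 3
U = np.column_stack([np.ones(n), rng.random((n, p - 1))])  # fabricated design
beta_true = np.array([5.0, -0.5, -1.0])
Z = U @ beta_true + 0.2 * rng.standard_normal(n)           # Z = ln Y

beta_hat = np.linalg.solve(U.T @ U, U.T @ Z)               # (4.4): (U'U)^{-1} U'Z
resid = Z - U @ beta_hat
sigma2_check = resid @ resid / (n - p)                     # (4.2): unbiased estimator
sigma2_hat = resid @ resid / n                             # (4.3): MLE
print(beta_hat, sigma2_check, sigma2_hat)
```

Dividing by n instead of n − p is the only difference between the two variance estimators, so the MLE is always the smaller of the two.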
Now we can forecast the total IBNR reserve with

\text{IBNR reserve} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij} + \varepsilon_{ij}}. \qquad (4.5)

This definition of the IBNR reserve can, among others, be found in Doray (1996). Here (R\hat{\vec{\beta}})_{ij} and \varepsilon_{ij} are independent. We have that

\varepsilon_{ij} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2), \qquad (4.6)

(R\hat{\vec{\beta}})_{ij} \sim N\big((R\vec{\beta})_{ij},\; \sigma^2 (R(U'U)^{-1}R')_{ij}\big). \qquad (4.7)
Starting from model (4.1), we now summarize some properties of the IBNR reserve (4.5), which can be found in Doray (1996).

1. The mean of the IBNR reserve equals

W = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij} + \frac{1}{2}\sigma^2 (1 + (R(U'U)^{-1}R')_{ij})}. \qquad (4.8)

2. The unique UMVUE of the mean of the IBNR reserve is given by

\hat{W}_U = {}_0F_1\!\left(\frac{n-p}{2}; \frac{SS_z}{4}\right) \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij}}, \qquad (4.9)

where {}_0F_1(\alpha; z) denotes the hypergeometric function.

3. The MLE of the mean of the IBNR reserve is

\hat{W}_M = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij} + \frac{1}{2}\hat{\sigma}^2 (1 + (R(U'U)^{-1}R')_{ij})}. \qquad (4.10)

Verrall (1991) has considered an estimator similar to \hat{W}_M, but with \hat{\sigma}^2 replaced by \check{\sigma}^2:

\hat{W}_V = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij} + \frac{1}{2}\check{\sigma}^2 (1 + (R(U'U)^{-1}R')_{ij})}. \qquad (4.11)

The simple estimator

\hat{W}_D = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij} + \frac{1}{2}\check{\sigma}^2} \qquad (4.12)

was considered in Doray (1996). Also, we have the order relation

\hat{W}_U < \hat{W}_D < \hat{W}_V, \qquad (4.13)

which implies that

W = E[\hat{W}_U] < E[\hat{W}_D] < E[\hat{W}_V]. \qquad (4.14)
Hence both the estimators ŴD and ŴV exhibit a positive bias.
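Part of the ordering (4.13), namely \hat{W}_D < \hat{W}_V, is easy to check numerically; a sketch with hypothetical fitted quantities (the UMVUE \hat{W}_U is omitted, since its hypergeometric factor needs extra care):

```python
import numpy as np

# Hypothetical fitted quantities for the cells (i, j) of the lower triangle.
eta = np.array([7.0, 7.2, 6.5])        # (R beta_hat)_{ij}
h = np.array([0.3, 0.4, 0.5])          # leverages (R (U'U)^{-1} R')_{ij}
s2_check, s2_hat = 0.09, 0.07          # sigma-check^2 > sigma-hat^2 (the MLE)

W_V = np.sum(np.exp(eta + 0.5 * s2_check * (1 + h)))   # (4.11), Verrall
W_M = np.sum(np.exp(eta + 0.5 * s2_hat * (1 + h)))     # (4.10), MLE plug-in
W_D = np.sum(np.exp(eta + 0.5 * s2_check))             # (4.12), simple estimator
print(W_M, W_D, W_V)
```

Since the leverages are positive, the extra factor e^{σ̌²h/2} in \hat{W}_V makes it exceed \hat{W}_D for any data set, consistent with (4.13).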
This Lognormal Linear (LL) model with normal random error is a special case of the class of loglinear location-scale models. Other possible choices for the distribution of the random error are the extreme value distribution, leading to the Weibull-extreme value regression model, the generalized loggamma, the logistic, and the log inverse Gaussian distribution. In what follows we briefly recall this class of regression models.
4.3.2 Loglinear location-scale models
For a general introduction to survival analysis we refer to Kalbfleisch & Prentice (1980), Lawless (1982) and Cohen & Whitten (1985), among others. In this section we recall the structure of this model and the main characteristics of the distributions for the error component.

A location-scale model has a cumulative distribution function of the form

F_X(x) = G\!\left(\frac{x - \mu}{\sigma}\right), \qquad (4.15)

where \mu is the location parameter, \sigma is the scale parameter, and G is the standardized form (\mu = 0, \sigma = 1) of the cumulative distribution function. The parameter vector is \vec{\theta} = (\mu, \sigma).
We consider the following Loglinear Location-Scale (LLS) regression
model in matrix notation
~ = lnY
~ = Rβ~ + σ̃~,
Z
(4.16)
~ ij is the linear predictor or location parameter for Zij , σ̃ is the
where (Rβ)
scale parameter and ~ is a random error with known density f~(·).
It should also be noticed that in general the scale parameter estimator
is not independent of the location parameter estimator, as is the case in
normal regression.
It is clear that the random variable Z_{ij} has the following density
\frac{1}{\tilde{\sigma}}\, f_{\epsilon}\!\left(\frac{z_{ij} - (R\vec{\beta})_{ij}}{\tilde{\sigma}}\right), \qquad -\infty < z_{ij} < \infty.
This model can only be applied if all data points are strictly positive. The parameters are estimated by maximum likelihood.
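For the lognormal linear special case, maximum likelihood for the regression parameters reduces to ordinary least squares on the log-claims. A minimal sketch under a chain-ladder type predictor \eta_{ij} = \alpha_i + \beta_j with \beta_1 = 0; the 3×3 triangle and its values are purely illustrative, not from the text:

```python
import numpy as np

# Hypothetical 3x3 incremental run-off triangle; keys are (origin i, development j).
triangle = {(1, 1): 100.0, (1, 2): 60.0, (1, 3): 30.0,
            (2, 1): 110.0, (2, 2): 66.0,
            (3, 1): 120.0}

# Chain-ladder type predictor eta_ij = alpha_i + beta_j with the corner
# constraint beta_1 = 0; columns: alpha_1, alpha_2, alpha_3, beta_2, beta_3.
cells = sorted(triangle)
z = np.log([triangle[c] for c in cells])       # Z_ij = ln Y_ij
R = np.zeros((len(cells), 5))
for row, (i, j) in enumerate(cells):
    R[row, i - 1] = 1.0                        # alpha_i
    if j > 1:
        R[row, 2 + j - 1] = 1.0                # beta_j (j = 2, 3)

# With normal errors, the MLE of beta is the least-squares solution.
beta_hat, *_ = np.linalg.lstsq(R, z, rcond=None)
residuals = z - R @ beta_hat
```

Because this toy triangle is exactly multiplicative, the log-linear fit is exact and exp(beta_hat) recovers the origin-year levels and development fractions.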
Chapter 4 - Reserving in non-life insurance business
Doray (1994) showed that the maximum likelihood estimators of the regression and scale parameters exist and are unique when the error \vec{\epsilon} in the loglinear location-scale regression model has a log-concave density. This is the case for the five distributions we consider in Table 4.1. Note that the exponential distribution is a special case of the Weibull distribution when the shape parameter is equal to 1. The generalized gamma distribution is a flexible family of distributions containing as special cases the exponential, the Weibull and the gamma distributions.
The IBNR reserve under this class of regression models is given by
\text{IBNR reserve} = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij} + \tilde{\sigma}\epsilon_{ij}}.
Table 4.2 displays the mean, cumulative distribution function and inverse distribution function of X_{ij} = e^{(R\vec{\beta})_{ij} + \tilde{\sigma}\epsilon_{ij}} for the different regression models in the LLS family.
Notice that the definition of the IBNR reserve here differs from definition (4.3.2) under the lognormal linear model. We use here e^{(R\vec{\beta})_{ij} + \tilde{\sigma}\epsilon_{ij}} instead of e^{(R\hat{\vec{\beta}})_{ij} + \hat{\tilde{\sigma}}\epsilon_{ij}}, where \hat{\vec{\beta}} and \hat{\tilde{\sigma}} represent the MLEs of \vec{\beta} and \tilde{\sigma} respectively. This definition of the IBNR reserve can also be found, among others, in Doray (1996). This approach partly uses the information contained in the upper triangle (through \hat{\vec{\beta}}) and acknowledges the underlying stochastic structure (through \epsilon_{ij}).
For each regression model, the random error \epsilon_{ij} and its density are:

• Lognormal linear: \epsilon_{ij} \sim i.i.d. N(0,1); density \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}, \; -\infty < x < \infty.
• Weibull-extreme value: \epsilon_{ij} \sim Gumbel; density e^{x - e^x}, \; -\infty < x < \infty.
• Logistic: \epsilon_{ij} \sim standard logistic; density \frac{e^x}{(1+e^x)^2}, \; -\infty < x < \infty.
• Generalized loggamma: \epsilon_{ij} \sim loggamma; density \frac{k^{k-\frac{1}{2}}}{\Gamma(k)} e^{\sqrt{k}\,x - k e^{x/\sqrt{k}}}, \; -\infty < x < \infty \; (0 < k < +\infty).
• Log inverse Gaussian: \epsilon_{ij} \sim log inverse Gaussian; density (2\pi\lambda)^{-\frac{1}{2}} e^{-\frac{x}{2}} e^{\frac{1}{\lambda}} e^{-\frac{1}{\lambda}\cosh(x)}, \; -\infty < x < \infty \; (\lambda > 0).

Table 4.1: Characteristics of the random error \epsilon_{ij} in the regression models of the LLS family.
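The Weibull-extreme value row of Table 4.1 can be checked by simulation: an error with density e^{x-e^x} is the logarithm of a unit exponential variable, so X_{ij} = e^{\mu + \tilde{\sigma}\epsilon} is Weibull. A sketch with illustrative location and scale values (not taken from the text):

```python
import math
import random

random.seed(0)
mu, sigma = 1.0, 0.5        # illustrative (R*beta)_ij and scale sigma-tilde
n = 200_000

# eps has density e^{x - e^x}  <=>  eps = ln(E) with E ~ Exp(1),
# so X = e^{mu + sigma*eps} = e^mu * E**sigma is Weibull distributed.
xs = [math.exp(mu) * random.expovariate(1.0) ** sigma for _ in range(n)]

# Empirical mean versus E[X] = e^mu * Gamma(1 + sigma).
mean_theory = math.exp(mu) * math.gamma(1.0 + sigma)
mean_mc = sum(xs) / n

# Empirical cdf at one point versus F(x) = 1 - exp(-(x e^{-mu})^{1/sigma}).
x0 = 2.0
cdf_theory = 1.0 - math.exp(-((x0 * math.exp(-mu)) ** (1.0 / sigma)))
cdf_mc = sum(x <= x0 for x in xs) / n
```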
For each regression model in the LLS family, the mean, cumulative distribution function and inverse distribution function of X_{ij} = e^{(R\vec{\beta})_{ij} + \tilde{\sigma}\epsilon_{ij}} are:

• Lognormal linear: E[X_{ij}] = e^{(R\vec{\beta})_{ij} + \tilde{\sigma}^2/2}; \; F_{X_{ij}}(x_{ij}) = \Phi\left(\frac{\ln(x_{ij}) - (R\vec{\beta})_{ij}}{\tilde{\sigma}}\right); \; F_{X_{ij}}^{-1}(p) = e^{(R\vec{\beta})_{ij} + \tilde{\sigma}\Phi^{-1}(p)}.
• Weibull-extreme value: E[X_{ij}] = e^{(R\vec{\beta})_{ij}}\,\Gamma(1+\tilde{\sigma}); \; F_{X_{ij}}(x_{ij}) = 1 - \exp\left(-\left(x_{ij} e^{-(R\vec{\beta})_{ij}}\right)^{1/\tilde{\sigma}}\right); \; F_{X_{ij}}^{-1}(p) = e^{(R\vec{\beta})_{ij}}\left(-\ln(1-p)\right)^{\tilde{\sigma}}.
• Logistic: E[X_{ij}] = e^{(R\vec{\beta})_{ij}}\,\Gamma(1+\tilde{\sigma})\Gamma(1-\tilde{\sigma}) \; (\tilde{\sigma} < 1); \; F_{X_{ij}}(x_{ij}) = 1 - \left(1 + \left(x_{ij} e^{-(R\vec{\beta})_{ij}}\right)^{1/\tilde{\sigma}}\right)^{-1}; \; F_{X_{ij}}^{-1}(p) = e^{(R\vec{\beta})_{ij}}\left(\frac{p}{1-p}\right)^{\tilde{\sigma}}.
• Generalized loggamma: E[X_{ij}] = k^{-\tilde{\sigma}\sqrt{k}}\, e^{(R\vec{\beta})_{ij}}\, \frac{\Gamma(k+\tilde{\sigma}\sqrt{k})}{\Gamma(k)}; \; F_{X_{ij}}(x_{ij}) = I\!\left(k,\, k\left(x_{ij} e^{-(R\vec{\beta})_{ij}}\right)^{1/(\tilde{\sigma}\sqrt{k})}\right), with I(k,\cdot) the incomplete gamma ratio and F_{X_{ij}}^{-1}(p) its inverse in the second argument.
• Log inverse Gaussian: with v_{ij} = \left(x_{ij} e^{-(R\vec{\beta})_{ij}}\right)^{1/\tilde{\sigma}}, \; F_{X_{ij}}(x_{ij}) = \Phi\!\left[\sqrt{\frac{v_{ij}}{\lambda}} - \sqrt{\frac{1}{\lambda v_{ij}}}\right] + e^{2/\lambda}\,\Phi\!\left[-\sqrt{\frac{v_{ij}}{\lambda}} - \sqrt{\frac{1}{\lambda v_{ij}}}\right].

Table 4.2: Characteristics of X_{ij} = e^{(R\vec{\beta})_{ij} + \tilde{\sigma}\epsilon_{ij}} in the regression models of the LLS family.
4.3.3 Generalized linear models
For a general introduction to Generalized Linear Models (GLIMs) we refer
to McCullagh & Nelder (1992). This family encompasses normal error
linear regression models and the nonlinear exponential, logistic and Poisson
regression models, as well as many other models, such as loglinear models
for categorical data. In this subsection we recall the structure of GLIMs
in the framework of claims reserving.
The first component of a GLIM, the random component, assumes that
the response variables Yij are independent and that the density function
of Yij belongs to the exponential family with densities of the form
f (yij ; θij , φ) = exp {[yij θij − b(θij )] /a(φ) + c(yij , φ)} ,
(4.17)
where a(·), b(·) and c(·, ·) are given functions. The function a(φ) often has
the form a(φ) = φ, where φ is called the dispersion parameter.
When φ is a known constant, (4.17) simplifies to the natural exponential family
f (yij ; θij ) = ã(θij )b̃(yij )exp {yij Q(θij )} .
(4.18)
We identify Q(θ) with θ/a(φ), ã(θ) with exp{−b(θ)/a(φ)}, and b̃(y) with
exp{c(y, φ)}. The more general formula (4.17) is useful for two-parameter
families, such as the normal or gamma, in which φ is a nuisance parameter.
Denoting the mean of Y_{ij} by \mu_{ij}, it is known that
\mu_{ij} = E[Y_{ij}] = b'(\theta_{ij}) \quad \text{and} \quad Var[Y_{ij}] = b''(\theta_{ij})\,a(\phi), \qquad (4.19)
where the primes denote derivatives with respect to θ. The variance can
be expressed as a function of the mean by
Var[Yij ] = a(φ)V (µij ) = φV (µij ),
where V (·) is called the variance function. The variance function V captures the relationship, if any, between the mean and variance of Yij .
The possible distributions to work with in claims reserving include for
instance the normal, Poisson, gamma and inverse Gaussian distributions.
Table 4.3 shows some of their characteristics. For a given distribution,
link functions other than the canonical link function can also be used. For
example, the log-link is often used with the gamma distribution.
The systematic component of a GLIM is based on a linear predictor
\eta_{ij} = (R\vec{\beta})_{ij} = \beta_1 R_{ij,1} + \cdots + \beta_p R_{ij,p}, \qquad i, j = 1, \ldots, t. \qquad (4.20)
For each distribution, the density, dispersion parameter \phi, canonical link \theta(\mu), mean function \mu(\theta) = b'(\theta) and variance function V(\mu) = b''(\theta) are:

• N(\mu, \sigma^2): \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(y-\mu)^2}{2\sigma^2}\right); \; \phi = \sigma^2; \; \theta(\mu) = \mu; \; \mu(\theta) = \theta; \; V(\mu) = 1.
• Poisson(\mu): e^{-\mu}\frac{\mu^y}{y!}; \; \phi = 1; \; \theta(\mu) = \log(\mu); \; \mu(\theta) = e^{\theta}; \; V(\mu) = \mu.
• Gamma(\mu, \nu): \frac{1}{\Gamma(\nu)}\left(\frac{\nu y}{\mu}\right)^{\nu} \exp\left(-\frac{\nu y}{\mu}\right)\frac{1}{y}; \; \phi = \frac{1}{\nu}; \; \theta(\mu) = 1/\mu; \; \mu(\theta) = -1/\theta; \; V(\mu) = \mu^2.
• IG(\mu, \sigma^2): \frac{y^{-3/2}}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y-\mu)^2}{2y\sigma^2\mu^2}\right); \; \phi = \sigma^2; \; \theta(\mu) = 1/\mu^2; \; \mu(\theta) = (-2\theta)^{-1/2}; \; V(\mu) = \mu^3.

Table 4.3: Characteristics of some frequently used distributions in loss reserving.
Various choices are possible for this linear predictor. In Subsection 4.3.4
we give a short overview of frequently used parametric structures in claims
reserving applications.
The link function, the third component of a GLIM, connects the expectation \mu_{ij} of Y_{ij} to the linear predictor by
\eta_{ij} = g(\mu_{ij}), \qquad (4.21)
where g is a monotone, differentiable function. Thus, a GLIM links the expected value of the response to the explanatory variables through the equation
g(\mu_{ij}) = (R\vec{\beta})_{ij}, \qquad i, j = 1, \ldots, t. \qquad (4.22)
For the canonical link g, for which g(\mu_{ij}) = \theta_{ij} in (4.17), there is a direct relationship between the natural parameter and the linear predictor. Since \mu_{ij} = b'(\theta_{ij}), the canonical link is the inverse function of b'.
Generalized linear models may have nonconstant variances \sigma^2_{ij} for the responses Y_{ij}. The variance \sigma^2_{ij} can then be taken as a function of the predictor variables through the mean response \mu_{ij}, or the variance can be modelled using a parameterized structure (see Renshaw (1994)). Any regression model that belongs to the family of generalized linear models can be analyzed in a unified fashion. The maximum likelihood estimates of the regression parameters can be obtained by iteratively reweighted least squares (naturally extending ordinary least squares for normal error linear regression models).
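As a sketch of that algorithm, the following fits a Poisson GLIM with the canonical log link by iteratively reweighted least squares; the design matrix and counts below are illustrative:

```python
import numpy as np

def irls_poisson(R, y, n_iter=50):
    """Poisson GLIM with log link via iteratively reweighted least squares:
    repeatedly regress the working response y* = eta + (y - mu) * deta/dmu
    on R with weights w = Var[Y]^{-1} (dmu/deta)^2 = mu."""
    mu = y.copy()                        # standard starting value
    eta = np.log(mu)
    for _ in range(n_iter):
        w = mu                           # for Poisson/log: Var[Y] = mu, dmu/deta = mu
        ystar = eta + (y - mu) / mu
        beta = np.linalg.solve(R.T @ (R * w[:, None]), R.T @ (w * ystar))
        eta = R @ beta
        mu = np.exp(eta)
    return beta

# Toy data: intercept plus a single covariate.
R = np.column_stack([np.ones(6), np.arange(6.0)])
y = np.array([2.0, 3.0, 7.0, 12.0, 20.0, 40.0])
beta_hat = irls_poisson(R, y)
mu_hat = np.exp(R @ beta_hat)
```

At convergence the canonical-link score equations R'(y - mu) = 0 are satisfied, which is what the assertion below checks.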
Supposing that the claim amounts follow a lognormal distribution,
taking the logarithm of all Yij ’s implies that they have a normal distribution. So, the link function is given by ηij = µij and the scale parameter
is the variance of the normal distribution, i.e. φ = σ 2 . We remark that
each incremental claim must be greater than zero, and predictions from
this model can yield unusable results.
The predicted value under a generalized linear model will be given by
\text{IBNR reserve} = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} \hat{\mu}_{ij}, \qquad (4.23)
with \hat{\mu}_{ij} = g^{-1}\left((R\hat{\vec{\beta}})_{ij}\right) for a given link function g.
We end this section with some extra comments concerning GLIMs. The need for more general GLIM models for modelling claims reserves becomes clear from the column of variance functions in Table 4.3. If the variance of the claims is proportional to the square of the mean, the gamma family of distributions can accommodate this characteristic. The Poisson and inverse Gaussian provide alternative variance functions. However, it may be that the relationship between the mean and the variance falls somewhere between the inverse Gaussian and the gamma models. Quasi-likelihood is designed to handle this broader class of mean-variance relationships. This is a very simple and robust alternative, introduced in Wedderburn (1974), which uses only the most elementary information about the response variable, namely the mean-variance relationship. This information alone is often sufficient to stay close to the full efficiency of maximum likelihood estimators. Suppose that we know that the response is always positive, the data are invariably skewed to the right, and the variance increases with the mean. This does not enable us to specify a particular distribution (for example it does not discriminate between Poisson and negative binomial errors), hence one cannot use techniques like maximum likelihood or likelihood ratio tests. However, quasi-likelihood estimation allows one to model the response variable in a regression context without specifying its distribution. We need only to specify the link and variance functions to
estimate regression coefficients. Although the link and variance functions determine a theoretical likelihood, the likelihood itself is not specified, so fewer assumptions are required for estimation and inference. This is analogous to the connection between normal-theory regression models and least-squares estimates. Least-squares estimation provides identical parameter estimates to those obtained from normal-theory models, but least-squares estimation assumes far less. Only second moment assumptions are made by least-squares compared to full distribution assumptions of normal-theory
models. For quasi-likelihood, specification of a variance function determines a corresponding quasi-likelihood element for each observation:
Q(\mu_{ij}; y_{ij}) = \int_{y_{ij}}^{\mu_{ij}} \frac{y_{ij} - t}{\phi V(t)}\, dt, \qquad (4.24)
where Q(\mu_{ij}; y_{ij}) satisfies a number of properties in common with the log-likelihood. Specifically, if K = k(\mu_{ij}; Y_{ij}) = (Y_{ij} - \mu_{ij})/(\phi V(\mu_{ij})), then
E(K) = 0, \qquad Var(K) = \frac{1}{\phi V(\mu_{ij})}, \qquad -E\!\left[\frac{\partial K}{\partial \mu_{ij}}\right] = \frac{1}{\phi V(\mu_{ij})}. \qquad (4.25)
According to McCullagh & Nelder (1992), since most first-order asymptotic theory regarding likelihood functions is based on the three properties (4.25), we can expect Q(\mu_{ij}; y_{ij}) to behave like a log-likelihood under certain broad conditions. Summing (4.24) over all y_{ij}-values yields the quasi-likelihood for the complete data. The quasi-deviance D(y_{ij}; \mu_{ij}) is similarly defined to be the sum over all y_{ij}-values of
-2\phi Q(\mu_{ij}; y_{ij}) = 2 \int_{\mu_{ij}}^{y_{ij}} \frac{y_{ij} - t}{V(t)}\, dt. \qquad (4.26)
Parameter estimation proceeds by maximizing the quasi-likelihood. Since the quasi-likelihood behaves like an ordinary likelihood, it inherits all the large sample properties of likelihoods: approximate unbiasedness and normality of the parameter estimates. For example, through the use of the quasi-likelihood
Q(\mu_{ij}; y_{ij}) = \int_{y_{ij}}^{\mu_{ij}} \frac{Y_{ij} - t}{\phi t^{2.5}}\, dt = \frac{1}{\phi\mu_{ij}^{2.5}}\left(\frac{\mu_{ij} y_{ij}}{(-1.5)} - \frac{\mu_{ij}^2}{(-0.5)}\right) \qquad (4.27)
we could model a variance function between those of the gamma and inverse Gaussian families: V(\mu_{ij}) = \mu_{ij}^{2.5}.
When using the canonical link function, the quasi-likelihood equations are given by
\sum_{j=1}^{t+1-i} \mu_{ij} = \sum_{j=1}^{t+1-i} Y_{ij}, \qquad 1 \le i \le t;
\sum_{i=1}^{t+1-j} \mu_{ij} = \sum_{i=1}^{t+1-j} Y_{ij}, \qquad 1 \le j \le t. \qquad (4.28)
As can easily be seen from these equations, in the case of the Poisson model with logarithmic link function it is necessary to impose the constraint that the sum of the incremental claims in every row and column is non-negative. For example, this assumption makes the model unsuitable for incurred triangles, which may contain many negatives in the later development periods due to overestimates of case reserves in the earlier development periods.
We recall that the only distributional assumptions used in GLIMs are the functional relationship between variance and mean and the fact that the distribution belongs to the exponential family. When we consider the Poisson case, this relationship can be expressed as
Var[Y_{ij}] = E[Y_{ij}]. \qquad (4.29)
One can allow for more or less dispersion in the data by generalizing (4.29) to Var[Y_{ij}] = \phi E[Y_{ij}] (\phi \in (0, \infty)) without any change in the form and solution of the likelihood equations. For example, it is well known that an over-dispersed Poisson model with the chain-ladder type linear predictor gives the same predictions as those obtained by the deterministic chain-ladder method (see Renshaw & Verrall, 1994).
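That equivalence can be checked numerically: fitting a Poisson GLIM with log link and the chain-ladder type predictor by iteratively reweighted least squares reproduces the deterministic chain-ladder reserve exactly. The 3×3 triangle below is illustrative:

```python
import numpy as np

# Illustrative 3x3 incremental triangle (rows: origin year, cols: development year).
tri = np.array([[100.0,  60.0,  40.0],
                [110.0,  66.0, np.nan],
                [120.0, np.nan, np.nan]])
t = tri.shape[0]

# --- Deterministic chain-ladder on cumulative claims ---
cum = np.nancumsum(tri, axis=1)
cum[np.isnan(tri)] = np.nan
f = [cum[: t - 1 - j, j + 1].sum() / cum[: t - 1 - j, j].sum()
     for j in range(t - 1)]                       # development factors
ult = np.array([cum[i, t - 1 - i] for i in range(t)])
for i in range(t):
    for j in range(t - 1 - i, t - 1):
        ult[i] *= f[j]
latest = np.array([cum[i, t - 1 - i] for i in range(t)])
reserve_cl = float((ult - latest).sum())

# --- Poisson GLM, eta_ij = alpha_i + beta_j (beta_1 = 0), fitted by IRLS ---
obs = [(i, j) for i in range(t) for j in range(t) if not np.isnan(tri[i, j])]
R = np.zeros((len(obs), 2 * t - 1))
for r, (i, j) in enumerate(obs):
    R[r, i] = 1.0
    if j > 0:
        R[r, t + j - 1] = 1.0
y = np.array([tri[i, j] for i, j in obs])
mu, eta = y.copy(), np.log(y)
for _ in range(50):
    w = mu
    ystar = eta + (y - mu) / mu
    beta = np.linalg.solve(R.T @ (R * w[:, None]), R.T @ (w * ystar))
    eta, mu = R @ beta, np.exp(R @ beta)

# Predicted lower-triangle increments give the GLM reserve.
future = [(i, j) for i in range(t) for j in range(t) if i + j > t - 1]
reserve_glm = float(sum(np.exp(beta[i] + (beta[t + j - 1] if j > 0 else 0.0))
                        for i, j in future))
```

The over-dispersion parameter \phi scales all variances but leaves these mean predictions untouched, which is why the two reserves agree.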
Modelling the incremental claim amounts as independent gamma response variables, with a logarithmic link function and the chain-ladder type linear predictor, produces exactly the same results as obtained by Mack (1991). The relationship between this generalized linear model and the model proposed by Mack was first pointed out by Renshaw & Verrall (1994). The mean-variance relationship for the gamma model is given by
Var[Y_{ij}] = \phi\left(E[Y_{ij}]\right)^2. \qquad (4.30)
Using this model gives predictions close to those from the deterministic
chain-ladder technique, but not exactly the same. Notice that we need to
impose that each incremental value should be positive (non-negative) if
we work with gamma (Poisson) models. This restriction can be overcome
using a quasi-likelihood approach.
As in normal regression, the search for a suitable model may encompass
a wide range of possibilities. The Bayesian information criterion (BIC)
and the Akaike Information Criterion (AIC) are model selection devices
that emphasize parsimony by penalizing models for having large numbers
of parameters. Tests for model development to determine whether some
predictor variables may be dropped from the model can be conducted
using partial deviances. Two measures for the goodness-of-fit of a given
generalized linear model are the scaled deviance and Pearson’s chi-square
statistic.
In cases where the dispersion parameter is not known, an estimate can
be used to obtain an approximation to the scaled deviance and Pearson’s
chi-square statistic. One strategy is to fit a model that contains a sufficient
number of parameters so that all systematic variation is removed, estimate
φ from this model, and then use this estimate in computing the scaled
deviance of sub-models. The deviance or Pearson’s chi-square divided by
its degrees of freedom is sometimes used as an estimate of the dispersion
parameter φ.
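The Pearson version of this estimate can be sketched directly; the observed and fitted values, the variance function and the parameter count below are all illustrative:

```python
# Estimate the dispersion parameter phi by Pearson's chi-square divided by
# its degrees of freedom: phi_hat = (1/(n - p)) * sum (y - mu)^2 / V(mu).
def pearson_dispersion(y, mu, V, n_params):
    resid2 = [(yi - mi) ** 2 / V(mi) for yi, mi in zip(y, mu)]
    return sum(resid2) / (len(y) - n_params)

# Illustrative observations and fitted values under a gamma-type model,
# for which the variance function is V(mu) = mu**2.
y  = [100.0, 60.0, 40.0, 110.0, 66.0, 120.0]
mu = [ 98.0, 62.0, 41.0, 108.0, 68.0, 118.0]
phi_hat = pearson_dispersion(y, mu, V=lambda m: m * m, n_params=3)
```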
4.3.4 Linear predictors and the discounted IBNR reserve
Various choices are possible for the linear predictor in claims reserving
applications. We give here a short overview of frequently used parametric
structures.
A well-known and widely used predictor is the chain-ladder type
\eta_{ij} = \alpha_i + \beta_j \qquad (4.31)
(αi is the parameter for each year of origin i and βj for each development
year j). It should be noted that this representation implies the same
development pattern for all years of origin, where that pattern is defined
by the parameters βj . Notice that a parameter, for example β1 , must be set
equal to zero, in order to have a non-singular regression matrix. Another
natural and frequently used restriction on the parameters is to impose that
β1 + · · · + βt = 1, since this allows the βj to be interpreted as the fraction
of claims settled in development year j.
The separation predictor takes into account the calendar years and
replaces in (4.31) αi with γk (k = i + j − 1). It combines the effects of
monetary inflation and changing jurisprudence.
For a general model with parameters in the three directions, we refer to
De Vylder & Goovaerts (1979). We give here some frequently used special
cases:
• The probabilistic trend family (PTF) of models as suggested in Barnett & Zehnwirth (1998)
\eta_{ij} = \alpha_i + \sum_{k=1}^{j-1} \beta_k + \sum_{t=1}^{i+j-2} \gamma_t, \qquad (4.32)
where \gamma denotes the calendar year effect; it combines the effects of monetary inflation and changing jurisprudence.
• The Hoerl curve as in Zehnwirth (1985)
\eta_{ij} = \alpha_i + \beta_i \log(j) + \gamma_i j \quad (j > 0). \qquad (4.33)
This model has the advantage that one can predict payments by extrapolation for j > t, because development year j is considered as a continuous covariate. This is useful in estimating tail factors. Wright (1990) extends this Hoerl curve further to model possible claim inflation.
• A mixture of models (4.31) and (4.33) as in England & Verrall (2001)
\eta_{ij} = \begin{cases} \alpha_i + \beta_j & \text{if } j \le q; \\ \alpha_i + \beta_i \log(j) + \gamma_i j & \text{if } j > q \end{cases} \qquad (4.34)
for some integer q specified by the modeller.
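To make the corner constraint for predictor (4.31) concrete, the following sketch builds the corresponding design matrix over an upper run-off triangle; the dimension t = 4 is arbitrary:

```python
import numpy as np

def chain_ladder_design(t):
    """Design matrix for eta_ij = alpha_i + beta_j over the upper triangle
    (i + j <= t + 1, 1-based), with the corner constraint beta_1 = 0 so the
    matrix has full column rank. Columns: alpha_1..alpha_t, beta_2..beta_t."""
    rows, cells = [], []
    for i in range(1, t + 1):
        for j in range(1, t + 2 - i):
            r = np.zeros(2 * t - 1)
            r[i - 1] = 1.0                  # alpha_i
            if j > 1:
                r[t + j - 2] = 1.0          # beta_j, j >= 2
            rows.append(r)
            cells.append((i, j))
    return np.array(rows), cells

R, cells = chain_ladder_design(4)
```

Without the constraint (i.e. with a beta_1 column included) the matrix would be rank deficient, which is exactly the singularity issue noted above.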
In the case that the type of business allows for discounting we add a discounting process. Of course, the level of the required reserve will strongly depend on how we will invest this reserve. We define the discounted IBNR reserve S under one of the discussed regression models as follows
lognormal linear model: S_{LL} = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij} + \epsilon_{ij} - Y(i+j-t-1)},

loglinear location-scale model: S_{LLS} = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij} + \tilde{\sigma}\epsilon_{ij} - Y(i+j-t-1)},

generalized linear model: S_{GLIM} = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} g^{-1}\left((R\hat{\vec{\beta}})_{ij}\right) e^{-Y(i+j-t-1)},

where the returns are modelled by means of a Brownian motion described by the following equation
Y(i) = \left(\delta + \frac{\varsigma^2}{2}\right) i + \varsigma B(i), \qquad (4.35)
where B(i) is the standard Brownian motion, \varsigma is the volatility and \delta is a constant force of interest.
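Note that the drift \delta + \varsigma^2/2 in (4.35) makes the expected discount factor equal to e^{-\delta i}, since E[e^{-Y(i)}] = e^{-(\delta+\varsigma^2/2)i + \frac{1}{2}\varsigma^2 i} = e^{-\delta i}. A quick Monte Carlo sketch with illustrative parameter values:

```python
import math
import random

random.seed(1)
delta, vol = 0.04, 0.10          # force of interest and volatility (illustrative)
i, n = 5, 200_000

# Y(i) = (delta + vol^2/2) * i + vol * B(i), with B(i) ~ N(0, i).
disc = [math.exp(-((delta + 0.5 * vol**2) * i
                   + vol * random.gauss(0.0, math.sqrt(i))))
        for _ in range(n)]
mc_mean = sum(disc) / n
theory = math.exp(-delta * i)    # lognormal mean: E[e^{-Y(i)}] = e^{-delta*i}
```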
4.4 Convex bounds for the discounted IBNR reserve

Before we can apply the results of Chapter 2 in order to derive the comonotonic approximations for S, we have to specify further the distribution of \hat{\mu}_{ij} = g^{-1}\left((R\hat{\vec{\beta}})_{ij}\right). This is done in what follows.
4.4.1 Asymptotic results in generalized linear models
Let \hat{\phi}, \hat{\vec{\beta}}, \hat{\vec{\eta}} = R\hat{\vec{\beta}} and \hat{\vec{\mu}} = g^{-1}(\hat{\vec{\eta}}) be the maximum likelihood estimates of \phi, \vec{\beta}, \vec{\eta} and \vec{\mu} respectively. The estimation equation for \hat{\vec{\beta}} is then given by
U'\hat{W}U\hat{\vec{\beta}} = U'\hat{W}\hat{\vec{y}}^*, \qquad (4.36)
where W = \operatorname{diag}\{w_{11}, \cdots, w_{t1}\}, with w_{ij} = Var[Y_{ij}]^{-1}(d\mu_{ij}/d\eta_{ij})^2, \vec{y}^* = (y^*_{11}, \cdots, y^*_{t1})', and y^*_{ij} = \eta_{ij} + (y_{ij} - \mu_{ij})\,d\eta_{ij}/d\mu_{ij}, where the y_{ij} denote the sample values. Note that \hat{W} is W evaluated at \hat{\vec{\beta}}. It is well known that for asymptotically normal statistics, many functions of such statistics are also asymptotically normal. Because R\hat{\vec{\beta}} = \left((R\hat{\vec{\beta}})_{11}, \cdots, (R\hat{\vec{\beta}})_{tt}\right) is asymptotically multivariate normal with mean R\vec{\beta} = \left((R\vec{\beta})_{11}, \cdots, (R\vec{\beta})_{tt}\right) and variance-covariance matrix \Sigma(R\hat{\vec{\beta}}) = \Sigma^a = \{\sigma^a_{ij}\} = R(U'WU)^{-1}R', and g^{-1}(\eta_{11}, \cdots, \eta_{tt}) has a nonzero differential \vec{\psi} = (\psi_{11}, \cdots, \psi_{tt}) at (R\vec{\beta}), where \psi_{ij} = d\mu_{ij}/d\eta_{ij}, it follows from the delta method that
\hat{\vec{\mu}} - \vec{\mu} \overset{d}{\to} N\!\left(0, \Sigma(\hat{\vec{\mu}})\right), \qquad (4.37)
where \Sigma(\hat{\vec{\mu}}) = \vec{\psi}'\Sigma^a\vec{\psi}. Hence, for large samples the distribution of \hat{\vec{\mu}} can be approximated by a normal distribution with mean \hat{\vec{\mu}} = g^{-1}(R\hat{\vec{\beta}}) and variance-covariance matrix \Sigma(\hat{\vec{\mu}}).
Maximum likelihood estimates may be biased when the sample size or the total Fisher information is small. The bias is usually ignored in practice, because it is negligible compared with the standard errors. In small or moderate-sized samples, however, a bias correction can be necessary, and it is helpful to have a rough estimate of its size. In deriving the convex bounds, one needs the expected values. Since there is no exact expression for the expectation of \hat{\vec{\mu}}, we approximate it using a general formula for the first-order bias of the estimate of \vec{\mu}.
Cordeiro & McCullagh (1991) derived the first-order bias of \hat{\vec{\beta}}. In matrix notation this bias reduces to the simple form
B(\hat{\vec{\beta}}) = -\frac{1}{2}\Sigma^b U' \Sigma^c_d F_d \bar{1}, \qquad (4.38)
with \Sigma^b = \Sigma(\hat{\vec{\beta}}) = \{\sigma^b_{ij}\} = (U'WU)^{-1}, \Sigma^c = \Sigma(U\hat{\vec{\beta}}) = \{\sigma^c_{ij}\} = U\Sigma^b U', \Sigma^a_d = \operatorname{diag}\{\sigma^a_{11}, \cdots, \sigma^a_{tt}\}, \Sigma^c_d = \operatorname{diag}\{\sigma^c_{11}, \cdots, \sigma^c_{t1}\}, \bar{1} a \frac{t(t+1)}{2} \times 1 vector of ones, and F_d = \operatorname{diag}\{f_{11}, \cdots, f_{t1}\} with f_{ij} = Var[Y_{ij}]^{-1} \frac{d\mu_{ij}}{d\eta_{ij}} \frac{d^2\mu_{ij}}{d\eta_{ij}^2}.
It follows that the n^{-1} bias of \hat{\vec{\eta}} also has a simple expression:
B(\hat{\vec{\eta}}) = -\frac{1}{2} R\Sigma^b U' \Sigma^c_d F_d \bar{1}. \qquad (4.39)
To evaluate the n^{-1} biases of \hat{\vec{\beta}} and \hat{\vec{\eta}} we need only the variance and the link functions with their first and second derivatives. In the right-hand sides of equations (4.38) and (4.39), which are of order n^{-1}, consistent estimates of the parameters \vec{\mu} can be inserted to define the corrected maximum
likelihood estimates \hat{\vec{\eta}}_c = \hat{\vec{\eta}} - \hat{B}(\hat{\vec{\eta}}) and \hat{\vec{\beta}}_c = \hat{\vec{\beta}} - \hat{B}(\hat{\vec{\beta}}), which should have smaller biases than the corresponding \hat{\vec{\eta}} and \hat{\vec{\beta}}. From now on \hat{B}(\cdot) means the value of B(\cdot) at the point \hat{\vec{\mu}}. Expressions (4.38) and (4.39) are applicable even if the link is not the same for each observation. For the linear model with any distribution in the exponential family, B(\hat{\vec{\beta}}) and B(\hat{\vec{\eta}}) are zero. This is to be expected for the normal linear model or for the inverse Gaussian non-intercept linear regression model. However, it is not obvious that this happens for any distribution in the exponential family (4.17) with identity link, since \hat{\vec{\beta}} is obtained, apart from these cases, from the non-linear equation (4.36), because of the dependence of \hat{\vec{\beta}} on \hat{W} and \hat{\vec{y}}^*. We now give the n^{-1} bias of \hat{\vec{\mu}}. Because \mu_{ij} = g^{-1}(\eta_{ij}) = g^{-1}((R\vec{\beta})_{ij}) and the link function is monotone and twice differentiable, we can apply a Taylor series expansion of \hat{\mu}_{ij} around \eta_{ij}:
can apply a Taylor series expansion of µ̂ij around ηij :
µ̂ij
dµij
1 d2 µij
2
∼
(η̂ij − ηij ) +
= µij +
2 (η̂ij − ηij ) ,
dηij
2 dηij
∼
=
dµij
1 d2 µij
2
(η̂ij − ηi ) +
2 (η̂ij − ηij ) ,
dηij
2 dηij
E[µ̂ij − µij ] ∼
=
dµij
1 d2 µij
E[(η̂ij − ηij )] +
2 Var[η̂ij ].
dηij
2 dηij
µ̂ij − µij
In matrix notation
E[\hat{\vec{\mu}} - \vec{\mu}] \cong G_1 E[\hat{\vec{\eta}} - \vec{\eta}] + \frac{1}{2} G_2 [Var(\hat{\vec{\eta}})]
\cong -\frac{1}{2} G_1 R\Sigma^b U' \Sigma^c_d F_d \bar{1} + \frac{1}{2} G_2 \Sigma^a_d \tilde{1}
= \frac{1}{2}\left\{ G_2 \Sigma^a_d \tilde{1} - G_1 R\Sigma^b U' \Sigma^c_d F_d \bar{1} \right\}.
So, the first-order bias of \hat{\vec{\mu}} in matrix notation is given by the following equation:
B(\hat{\vec{\mu}}) = \frac{1}{2}\left\{ G_2 \Sigma^a_d \tilde{1} - G_1 R\Sigma^b U' \Sigma^c_d F_d \bar{1} \right\}, \qquad (4.40)
where \tilde{1} is a t^2 \times 1 vector of ones and G_1 = \operatorname{diag}\{\psi_{11}, \cdots, \psi_{tt}\}, G_2 = \operatorname{diag}\{\varphi_{11}, \cdots, \varphi_{tt}\}, where \psi_{ij} = \frac{d\mu_{ij}}{d\eta_{ij}} and \varphi_{ij} = \frac{d^2\mu_{ij}}{d\eta_{ij}^2}.
So, we can define adjusted values as \hat{\vec{\mu}}_c = \hat{\vec{\mu}} - \hat{B}(\hat{\vec{\mu}}), which should have smaller biases than the corresponding \hat{\vec{\mu}}. Note that \hat{B}(\cdot) means here the value of B(\cdot) taken at (\hat{\phi}, \hat{\vec{\mu}}).
4.4.2 Lower and upper bounds

In this subsection we will derive the upper and lower bounds in convex order, as described in Chapter 2, for the discounted IBNR reserves S_{LL}, S_{LLS} and S_{GLIM} under the different regression models.
Using the results of Chapter 2, we derive a convex lower and upper bound for S = \sum_i \sum_j X_{ij} Z_{ij} given by
\underbrace{\sum_i \sum_j E[X_{ij}]\, E[Z_{ij}|\Lambda]}_{S^l} \;\le_{cx}\; \underbrace{\sum_i \sum_j X_{ij} Z_{ij}}_{S} \;\le_{cx}\; \underbrace{\sum_i \sum_j F^{-1}_{X_{ij}}(U)\, F^{-1}_{Z_{ij}}(V)}_{S^c},
with
X_{ij} = e^{\epsilon_{ij}} \; (S_{LL}); \quad e^{(R\vec{\beta})_{ij} + \tilde{\sigma}\epsilon_{ij}} \; (S_{LLS}); \quad \hat{\mu}_{ij} \; (S_{GLIM}),
and
Z_{ij} = e^{(R\hat{\vec{\beta}})_{ij} - Y(i+j-t-1)} \; (S_{LL}); \quad e^{-Y(i+j-t-1)} \; (S_{LLS}); \quad e^{-Y(i+j-t-1)} \; (S_{GLIM}).
We introduce the random variables W_{ij} and \tilde{W}_{ij} defined by
W_{ij} = (R\hat{\vec{\beta}})_{ij} - Y(i+j-t-1) \quad \text{and} \quad \tilde{W}_{ij} = -Y(i+j-t-1), \qquad (4.41)
with
E[W_{ij}] = (R\vec{\beta})_{ij} - \left(\delta + \tfrac{1}{2}\varsigma^2\right)(i+j-t-1),
E[\tilde{W}_{ij}] = -\left(\delta + \tfrac{1}{2}\varsigma^2\right)(i+j-t-1),
Var[W_{ij}] = \sigma^2_{W_{ij}} = \sigma^2\left(R(U'U)^{-1}R'\right)_{ij} + (i+j-t-1)\varsigma^2,
Var[\tilde{W}_{ij}] = \sigma^2_{\tilde{W}_{ij}} = (i+j-t-1)\varsigma^2.
The lower bound

To compute the lower bound we consider the following conditioning normal random variable of the form (2.53)
\Lambda = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} \nu_{ij}\, Y(i+j-t-1), \qquad (4.42)
with
\nu_{ij} = e^{(R\vec{\beta})_{ij}} e^{-(i+j-t-1)\delta} \; (S^l_{LL}); \quad E\!\left[e^{(R\vec{\beta})_{ij} + \tilde{\sigma}\epsilon_{ij}}\right] e^{-(i+j-t-1)\delta} \; (S^l_{LLS}); \quad \left(\mu_{ij} + B(\hat{\vec{\mu}})_{ij}\right) e^{-(i+j-t-1)\delta} \; (S^l_{GLIM}). \qquad (4.43)
Notice that (W_{ij}, \Lambda) has a bivariate normal distribution. Conditionally given \Lambda = \lambda, W_{ij} has a univariate normal distribution with mean and variance given by
E[W_{ij}|\Lambda = \lambda] = E[W_{ij}] + \rho_{ij}\, \frac{\sigma_{W_{ij}}}{\sigma_{\Lambda}}\, (\lambda - E[\Lambda]) \qquad (4.44)
and
Var[W_{ij}|\Lambda = \lambda] = \sigma^2_{W_{ij}}\left(1 - \rho^2_{ij}\right), \qquad (4.45)
where \rho_{ij} denotes the correlation between \Lambda and W_{ij}. The same is true for (\tilde{W}_{ij}, \Lambda), where we denote the correlation between \Lambda and \tilde{W}_{ij} by \tilde{\rho}_{ij}.
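Equations (4.44)-(4.45) are the standard conditional moments of a bivariate normal pair, which can be checked by simulation. The moments below are illustrative numbers, not the W_{ij} and \Lambda of the model:

```python
import math
import random

random.seed(2)
# Bivariate normal (W, Lambda) with chosen moments (illustrative).
mW, mL = 1.0, -0.5
sW, sL, rho = 0.8, 1.2, -0.6
n = 200_000

pairs = []
for _ in range(n):
    z1 = random.gauss(0.0, 1.0)
    z2 = random.gauss(0.0, 1.0)
    lam = mL + sL * z1
    w = mW + sW * (rho * z1 + math.sqrt(1 - rho**2) * z2)
    pairs.append((w, lam))

# Condition on Lambda near lam0 and compare with
#   E[W | Lambda = lam0]  = E[W] + rho * (sW/sL) * (lam0 - E[Lambda]),
#   Var[W | Lambda = lam0] = sW^2 * (1 - rho^2).
lam0, eps = 0.0, 0.05
sel = [w for w, lam in pairs if abs(lam - lam0) < eps]
cond_mean_mc = sum(sel) / len(sel)
cond_var_mc = sum((w - cond_mean_mc) ** 2 for w in sel) / len(sel)
cond_mean = mW + rho * sW / sL * (lam0 - mL)
cond_var = sW**2 * (1 - rho**2)
```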
The lower bound can be written as
S^l_{LL} = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} E[X_{ij}]\, e^{E[W_{ij}] + \rho_{ij}\sigma_{W_{ij}}\Phi^{-1}(V) + \frac{1}{2}(1-\rho^2_{ij})\sigma^2_{W_{ij}}},

S^l_{LLS} = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} E[X_{ij}]\, e^{E[\tilde{W}_{ij}] + \tilde{\rho}_{ij}\sigma_{\tilde{W}_{ij}}\Phi^{-1}(V) + \frac{1}{2}(1-\tilde{\rho}^2_{ij})\sigma^2_{\tilde{W}_{ij}}},

S^l_{GLIM} = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} E[X_{ij}]\, e^{E[\tilde{W}_{ij}] + \tilde{\rho}_{ij}\sigma_{\tilde{W}_{ij}}\Phi^{-1}(V) + \frac{1}{2}(1-\tilde{\rho}^2_{ij})\sigma^2_{\tilde{W}_{ij}}},

with
E[X_{ij}] = E[e^{\epsilon_{ij}}] = e^{\frac{1}{2}\sigma^2} \; (S_{LL}); \quad E\!\left[e^{(R\vec{\beta})_{ij} + \tilde{\sigma}\epsilon_{ij}}\right] = \text{see Table 4.2} \; (S_{LLS}); \quad E\!\left[g^{-1}\!\left((R\hat{\vec{\beta}})_{ij}\right)\right] = \mu_{ij} + B(\hat{\vec{\mu}})_{ij} \; (S_{GLIM}).
The correlations \rho_{ij} and \tilde{\rho}_{ij} are given by
\rho_{ij} = \frac{Cov[\Lambda, W_{ij}]}{\sigma_{\Lambda}\sigma_{W_{ij}}}, \qquad \tilde{\rho}_{ij} = \frac{Cov[\Lambda, \tilde{W}_{ij}]}{\sigma_{\Lambda}\sigma_{\tilde{W}_{ij}}},
with
Cov[\Lambda, W_{ij}] = Cov[\Lambda, \tilde{W}_{ij}] = -\varsigma^2 \sum_{k=2}^{t}\sum_{l=t+2-k}^{t} \nu_{kl}\, \min(i+j-t-1,\, k+l-t-1)
and
Var[\Lambda] = \sigma^2_{\Lambda} = \varsigma^2 \sum_{r=2}^{t}\sum_{s=t+2-r}^{t}\sum_{v=2}^{t}\sum_{w=t+2-v}^{t} \nu_{rs}\nu_{vw}\, \min(r+s-t-1,\, v+w-t-1).
By conditioning on one of the standard uniform random variables one can compute the distribution function of the lower bound. See Subsection 2.5.3 for more details.
For the lognormal linear and loglinear location-scale models there exists a closed-form expression for the quantile function of S^l. Taking into account that \Lambda = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} \nu_{ij} Y(i+j-t-1) is normally distributed, we find that
F^{-1}_{\Lambda}(1-p) = E[\Lambda] - \sigma_{\Lambda}\Phi^{-1}(p),
and hence
F^{-1}_{S^l}(p) = F^{-1}_{\sum_i\sum_j E[X_{ij}]E[Z_{ij}|\Lambda]}(p) = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} F^{-1}_{E[X_{ij}]E[Z_{ij}|\Lambda]}(p) = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} E[X_{ij}]\, E[Z_{ij}|\Lambda = F^{-1}_{\Lambda}(1-p)], \qquad p \in (0,1).
In order to derive the above result, we used the fact that for a non-increasing continuous function g, we have
F^{-1}_{g(X)}(p) = g\left(F^{-1}_X(1-p)\right), \qquad p \in (0,1). \qquad (4.46)
Here, g = E[Z_{ij}|\Lambda] is a non-increasing function of \Lambda since \rho_{ij} (\tilde{\rho}_{ij}) is always negative. So, we have that
F^{-1}_{S^l}(p) = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} E[X_{ij}]\, e^{E[W_{ij}] - \rho_{ij}\sigma_{W_{ij}}\Phi^{-1}(p) + \frac{1}{2}(1-\rho^2_{ij})\sigma^2_{W_{ij}}} \quad (LL)

= \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} E[X_{ij}]\, e^{E[\tilde{W}_{ij}] - \tilde{\rho}_{ij}\sigma_{\tilde{W}_{ij}}\Phi^{-1}(p) + \frac{1}{2}(1-\tilde{\rho}^2_{ij})\sigma^2_{\tilde{W}_{ij}}} \quad (LLS)

and F_{S^l}(x) can be obtained from solving the equation
\sum_{i=2}^{t}\sum_{j=t+2-i}^{t} E[X_{ij}]\, e^{E[W_{ij}] - \rho_{ij}\sigma_{W_{ij}}\Phi^{-1}(F_{S^l_{LL}}(x)) + \frac{1}{2}(1-\rho^2_{ij})\sigma^2_{W_{ij}}} = x, \quad (LL)

\sum_{i=2}^{t}\sum_{j=t+2-i}^{t} E[X_{ij}]\, e^{E[\tilde{W}_{ij}] - \tilde{\rho}_{ij}\sigma_{\tilde{W}_{ij}}\Phi^{-1}(F_{S^l_{LLS}}(x)) + \frac{1}{2}(1-\tilde{\rho}^2_{ij})\sigma^2_{\tilde{W}_{ij}}} = x. \quad (LLS)
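For the lognormal linear case this closed-form quantile is easy to evaluate. The sketch below uses made-up values for (R\vec{\beta})_{ij}, \sigma^2(R(U'U)^{-1}R')_{ij}, \sigma^2, \delta and \varsigma, and a bisection-based \Phi^{-1}:

```python
import math

# Illustrative setup: t = 3 with lower-triangle cells (i, j), discount
# horizon tau = i+j-t-1; rb plays the role of (R beta)_ij and s2 of
# sigma^2 (R (U'U)^{-1} R')_ij (all numbers are made up).
t = 3
delta, zeta = 0.04, 0.10                       # force of interest, volatility
cells = [(2, 3), (3, 2), (3, 3)]
rb = {(2, 3): 3.0, (3, 2): 3.5, (3, 3): 2.5}
s2 = {(2, 3): 0.02, (3, 2): 0.02, (3, 3): 0.03}
sigma2 = 0.04                                  # error variance of the LL model
tau = {c: c[0] + c[1] - t - 1 for c in cells}

# Conditioning variable Lambda = sum nu_ij Y(tau_ij), eqs. (4.42)-(4.43).
nu = {c: math.exp(rb[c] - tau[c] * delta) for c in cells}

# Moments of W_ij = (R beta)_ij - Y(tau_ij), eq. (4.41).
EW = {c: rb[c] - (delta + 0.5 * zeta**2) * tau[c] for c in cells}
VW = {c: s2[c] + tau[c] * zeta**2 for c in cells}

# Cov[Lambda, W_ij], Var[Lambda] and the (negative) correlations rho_ij.
cov = {c: -zeta**2 * sum(nu[d] * min(tau[c], tau[d]) for d in cells)
       for c in cells}
varL = zeta**2 * sum(nu[c] * nu[d] * min(tau[c], tau[d])
                     for c in cells for d in cells)
rho = {c: cov[c] / math.sqrt(varL * VW[c]) for c in cells}

def phi_inv(p, lo=-10.0, hi=10.0):
    """Standard normal quantile by bisection on the cdf."""
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if cdf(mid) < p else (lo, mid)
    return 0.5 * (lo + hi)

def q_lower(p):
    """Quantile of the comonotonic lower bound S^l, lognormal linear case."""
    z = phi_inv(p)
    EX = math.exp(0.5 * sigma2)                # E[e^{eps_ij}]
    return sum(EX * math.exp(EW[c] - rho[c] * math.sqrt(VW[c]) * z
                             + 0.5 * (1 - rho[c]**2) * VW[c])
               for c in cells)
```

Because every rho_ij is negative, -rho_ij is positive and q_lower is increasing in p, as a quantile function must be.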
The upper bound

The upper bound can be written as
S^c_{LL} = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} F^{-1}_{X_{ij}}(U)\, e^{E[W_{ij}] + \sigma_{W_{ij}}\Phi^{-1}(V)},

S^c_{LLS} = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} F^{-1}_{X_{ij}}(U)\, e^{E[\tilde{W}_{ij}] + \sigma_{\tilde{W}_{ij}}\Phi^{-1}(V)},

S^c_{GLIM} = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} F^{-1}_{X_{ij}}(U)\, e^{E[\tilde{W}_{ij}] + \sigma_{\tilde{W}_{ij}}\Phi^{-1}(V)},

with
F^{-1}_{X_{ij}}(U) = F^{-1}_{e^{\epsilon_{ij}}}(U) = e^{\sigma\Phi^{-1}(U)} \; (S_{LL}); \quad F^{-1}_{e^{(R\vec{\beta})_{ij} + \tilde{\sigma}\epsilon_{ij}}}(U) = \text{see Table 4.2} \; (S_{LLS}); \quad F^{-1}_{g^{-1}\left((R\hat{\vec{\beta}})_{ij}\right)}(U) = \mu_{ij} + B(\hat{\vec{\mu}})_{ij} + \sqrt{\Sigma(\hat{\vec{\mu}})_{ij}}\,\Phi^{-1}(U) \; (S_{GLIM}).
The cdf of the upper bound can be computed as described in Subsection 2.5.3. Using Remark 4 one can calculate the distribution functions of S^c_{LL} and S^c_{LLS} more efficiently. We start with the cdf of S^c_{LL}. From previous results
F_{S^c_{LL}}(y) = \int_0^1 F_N\!\left(\ln(y) - \ln F^{-1}_{S^{c0}_{LL}}(u)\right) du,
with F_N(x) the cdf of N(0, \sigma^2) and
S^{c0}_{LL} = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} \exp\left(F^{-1}_{(R\hat{\vec{\beta}})_{ij} - Y(i+j-t-1)}(U)\right)
= \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij} - (\delta + \frac{1}{2}\varsigma^2)(i+j-t-1)} \times e^{\sqrt{\sigma^2(R(U'U)^{-1}R')_{ij} + \varsigma^2(i+j-t-1)}\,\Phi^{-1}(U)}
and
F^{-1}_{S^{c0}_{LL}}(u) = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij} - (\delta + \frac{1}{2}\varsigma^2)(i+j-t-1)} \times e^{\sqrt{\sigma^2(R(U'U)^{-1}R')_{ij} + \varsigma^2(i+j-t-1)}\,\Phi^{-1}(u)}.
We can write the upper bound of S_{LLS} as
S^c_{LLS} = G \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} e^{E[\tilde{W}_{ij}] + \sigma_{\tilde{W}_{ij}}\Phi^{-1}(V)}\, e^{(R\vec{\beta})_{ij}},
with
G = e^{\tilde{\sigma}\Phi^{-1}(U)} \; (\text{Lognormal linear}); \quad \left(-\log(1-U)\right)^{\tilde{\sigma}} \; (\text{Weibull-extreme value}); \quad \left(\frac{U}{1-U}\right)^{\tilde{\sigma}} \; (\text{Logistic}).
The distribution function of G is given by
F_G(x) = \Phi\left(\frac{\ln x}{\tilde{\sigma}}\right) \; (\text{Lognormal linear}); \quad 1 - e^{-x^{1/\tilde{\sigma}}} \; (\text{Weibull-extreme value}); \quad 1 - \left(1 + x^{1/\tilde{\sigma}}\right)^{-1} \; (\text{Logistic}).
Using Remark 4 we can write the cdf of S^c_{LLS} for the lognormal linear, the Weibull-extreme value and the logistic regression model as follows
F_{S^c_{LLS}}(y) = \int_0^1 F_G\!\left(\frac{y}{F^{-1}_{S^{c0}_{LLS}}(u)}\right) du,
with
S^{c0}_{LLS} = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} \exp\left(F^{-1}_{(R\vec{\beta})_{ij} - Y(i+j-t-1)}(U)\right)
= \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij} - (\delta + \frac{1}{2}\varsigma^2)(i+j-t-1) + \varsigma\sqrt{i+j-t-1}\,\Phi^{-1}(U)}
and
F^{-1}_{S^{c0}_{LLS}}(u) = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} e^{(R\vec{\beta})_{ij} - (\delta + \frac{1}{2}\varsigma^2)(i+j-t-1) + \varsigma\sqrt{i+j-t-1}\,\Phi^{-1}(u)}.
Remark 7. Since we have no equality of the first moments in the GLIM
framework, the convex order relationship between the two approximations
and S is not valid. This does not impose any restrictions on the use of
the approximations. In fact, we can say that the convex order only holds
asymptotically in this case.
Remark 8. The estimator \hat{W}_D (4.12), for the mean of the IBNR reserve, constitutes a close upper bound for the UMVUE of the mean of the IBNR reserve if \frac{t(t+1)}{2} - p is large and the residual sum of squares is small. It should be noted that e^{(R\hat{\vec{\beta}})_{ij} + \check{\sigma}^2/2} is the estimator of the mean of a lognormal distribution logN((R\vec{\beta})_{ij}, \sigma^2) obtained by replacing the parameters \vec{\beta} and \sigma^2 by their unbiased estimates. Adding now a discount process to \hat{W}_D gives
\hat{W}_{DD} = \sum_{i=2}^{t}\sum_{j=t+2-i}^{t} e^{(R\hat{\vec{\beta}})_{ij} - Y(i+j-t-1) + \frac{1}{2}\check{\sigma}^2}. \qquad (4.47)
Now, we can apply the same methodology as explained before. The results for the lognormal linear model are still applicable. The only difference is that \epsilon_{ij} is changed by \frac{1}{2}\check{\sigma}^2, with
\frac{1}{2}\check{\sigma}^2 \sim \text{Gamma}\left(\frac{n-p}{2}, \frac{\sigma^2}{n-p}\right). \qquad (4.48)

4.5 The bootstrap methodology in claims reserving

4.5.1 Introduction
The bootstrap technique as an inferential, computer-intensive statistical device was introduced by Efron (1979) as a quite intuitive and simple way of making approximations to distributions which are very hard or even impossible to compute analytically. This technique has proved to be a very useful tool in many statistical applications and can be particularly interesting to assess the variability of the claims reserving predictions and to construct upper limits at an adequate confidence level. Its popularity is due to a combination of available computing power and theoretical development. One advantage of the bootstrap technique is that it can be applied to any data set without having to assume an underlying distribution. Moreover, most computer packages can handle very large numbers of repeated samplings.
Our goal is to obtain quantiles of the loss reserve for which the predictive distribution is not known. If we do not know the distribution, then
our best guess at the distribution is provided by the data. The main idea
in bootstrapping is that we (a) pretend that the data constitute the population and (b) take samples from this pretended population (which we call
“resamples”). Substituting the sample for the population means that we
are interested in the frequency with which the observed values occurred.
This is done by sampling with replacement. From the re-sample, we
calculate the statistic we are interested in. This is called a “bootstrap
statistic”. After storing this value, one repeats the above steps collecting a large number (B) of bootstrap statistics. The general idea is that
the relationship of the bootstrap statistics to the observed statistic is the
same as the relationship of the observed statistic to the true value. Under
mild regularity conditions, the bootstrap yields an approximation to the
distribution of an estimator or test statistic that is at least as accurate
as the approximation obtained from first-order asymptotic theory. For an
introduction explaining the bootstrap technique, see Efron & Tibshirani
(1993).
4.5.2 Central idea
The concept of bootstrap relies on the consideration of the discrete empirical distribution generated by a random sample of size n from an unknown
distribution F . This empirical distribution assigns equal probability to
each sample item. In the discussion which follows, we will write F̂n for
that distribution. By generating an independent, identically distributed
random sequence (resample) from the distribution F̂n or its appropriately
smoothed version, we can arrive at new estimates of various parameters
and nonparametric characteristics of the original distribution F .
As we have already mentioned, the central idea of bootstrap lies in
sampling the empirical cdf F̂n . This idea is closely related to the following,
well-known statistical principle, henceforth referred to as the “plug-in”
principle. Given a parameter of interest θ(F ) depending upon an unknown
population cdf F , we estimate this parameter by θ̂ = θ(F̂n ). That is, we
simply replace F in the formula for θ by its empirical counterpart F̂n
obtained from the observed data. The plug-in principle will not provide
good results if F̂n poorly approximates F , or if there is information about
F other than that provided by the sample. For instance, in some cases we
might know (or be willing to assume) that F belongs to some parametric
family of distributions. However, the plug-in principle and the bootstrap
may be adapted to this latter situation as well. To illustrate the idea,
let us consider a parametric family of cdf’s {Fµ } indexed by a parameter
µ (possibly a vector), and for some given µ0 , let µ̂0 denote its estimate
calculated from the sample. The plug-in principle in this case states that
we should estimate θ(Fµ0 ) by θ(Fµ̂0 ). In this case, bootstrap is often called
parametric, since a resample is now collected from Fµ̂0 . Here, we refer to
any replica of θ̂ calculated from a resample as “a bootstrap estimate of
θ(F )” and denote it by θ̂∗ .
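The plug-in principle and a single bootstrap replicate can be sketched in a few lines of Python; the sample, the statistic θ and the seed are illustrative assumptions, not taken from the thesis:

```python
import random

random.seed(12345)

# A random sample from an unknown cdf F (here, for illustration, lognormal).
sample = [random.lognormvariate(0.0, 0.5) for _ in range(200)]

def theta(data):
    """Parameter of interest theta(F): here the 90th percentile."""
    ordered = sorted(data)
    return ordered[int(0.9 * len(ordered))]

# Plug-in principle: estimate theta(F) by theta(F_hat_n).
theta_hat = theta(sample)

# One non-parametric bootstrap estimate theta*: theta applied to a
# resample drawn with replacement from the empirical distribution F_hat_n.
resample = random.choices(sample, k=len(sample))
theta_star = theta(resample)
```

Repeating the last two lines B times yields the collection of bootstrap estimates used in the next subsection.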
4.5.3
Bootstrap confidence intervals
Let us now turn to the problem of using the bootstrap methodology to
construct confidence intervals. This area has been a major focus of theoretical work on the bootstrap, and several different methods of approaching
the problem have been suggested. The “naive” procedure described below
is not the most efficient one and can be significantly improved in both
rate of convergence and accuracy. It is, however, intuitively obvious and
easy to justify, and seems to work well enough for the cases considered here. For a complete review of available approaches to bootstrap
confidence intervals, see Efron & Tibshirani (1993). Let us consider θ̂∗,
a bootstrap estimate of θ based on a resample of size n from the original sample X1 , . . . , Xn , and let G∗ be its distribution function given the
observed sample values:

G∗(x) = Pr[θ̂∗ ≤ x | X1 = x1, . . . , Xn = xn].

The bootstrap percentile method gives G∗⁻¹(α) and G∗⁻¹(1 − α) as, respectively, the lower and upper bounds of the (1 − 2α) confidence interval for θ̂.
Let us note that for most statistics θ̂, the distribution function of the bootstrap estimator θ̂∗ is not available. In practice, G∗⁻¹(α) and G∗⁻¹(1 − α)
are approximated by taking multiple resamples and then calculating the
empirical percentiles. In most cases B ≥ 1000 is recommended.
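A minimal sketch of this naive percentile procedure in Python; the sample, the statistic and the seed are illustrative assumptions:

```python
import random

random.seed(0)

def percentile_interval(sample, theta, alpha=0.025, B=1000):
    """Naive bootstrap percentile interval: approximate G*^{-1}(alpha) and
    G*^{-1}(1 - alpha) by the empirical percentiles of B bootstrap
    replicates of theta, giving a (1 - 2*alpha) confidence interval."""
    reps = sorted(theta(random.choices(sample, k=len(sample)))
                  for _ in range(B))
    return reps[int(alpha * B)], reps[int((1 - alpha) * B) - 1]

data = [random.gauss(10.0, 2.0) for _ in range(100)]
lo, hi = percentile_interval(data, lambda xs: sum(xs) / len(xs))  # 95% interval
```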
4.5.4 Bootstrap in claims reserving
As already mentioned above, with bootstrapping, we treat the obtained
data as if they are an accurate reflection of the parent population, and
then draw many bootstrapped samples by sampling, with replacement,
from a pseudo-population consisting of the obtained data. Technically,
this is called “non-parametric bootstrapping”, because we are sampling
from the actual data and we have made no assumptions about the distribution of the parent population, other than that the raw data adequately
reflect the population’s shape. If we were willing to make more assumptions, such as an assumption that the parent population follows a normal
distribution, then we could do our sampling, with replacement, from a
normal distribution. This is called “parametric bootstrapping”.
For a description of the bootstrap methodology in claims reserving we refer
to England & Verrall (1999) and Pinheiro et al. (2003). In these papers the
bootstrap technique is used to obtain prediction errors for different claims
reserving methods, namely methods based on the chain-ladder technique
and on generalized linear models. Applications of the bootstrap technique
to claims reserving can also be found in Lowe (1994), in Taylor (2000) and
in England & Verrall (2002).
Starting from the original run-off triangle one can create a large number
of bootstrap run-off triangles by repeatedly resampling, with replacement,
from the appropriate residuals. For each bootstrap sample the regression
model is refitted and the bootstrap statistic is calculated.
In England & Verrall (1999) the bootstrap technique is used to compute the bootstrap root mean squared error of prediction (RMSEPbs), also
known as the bootstrap standard error of prediction. This equals, in their
terminology, the square root of the sum of the squares of parameter variability and data variability. To make the bootstrap standard error comparable with the analytic standard error, they correct it for the number of
parameters used in fitting the model. The bootstrap standard error is the
standard deviation of the bootstrap reserve estimates, so parameter variability is defined as the bootstrap standard error multiplied by the square
root of n/(n − p) (n: sample size; p: number of parameters). Data variability is the square root of the uniformly minimum variance unbiased
estimator of the variance of the IBNR reserve; this estimator was already
calculated by Doray (1996). Note that if the full predictive distribution
can be found, the RMSEP can be obtained directly by calculating its standard deviation. Using a normal approximation, a 100(1 − α)% bootstrap
prediction interval for the total reserve is calculated as
[R ± Φ⁻¹(1 − α/2) · RMSEPbs(R)], with R the initial forecast of
the IBNR reserve.
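The interval above can be sketched as follows, assuming the ingredients (the bootstrap reserve estimates, the data-variability estimate, the sample size n and the parameter count p) are supplied; all numbers are hypothetical:

```python
import math
import statistics

def rmsep_interval(R, bootstrap_reserves, n, p, data_var_sd, alpha=0.05):
    """Normal-approximation bootstrap prediction interval for the reserve:
    parameter variability is the bootstrap s.e. scaled by sqrt(n/(n-p)),
    data variability is a supplied process s.d., and RMSEP_bs is the
    square root of the sum of their squares."""
    se_boot = statistics.stdev(bootstrap_reserves)
    param_var = se_boot * math.sqrt(n / (n - p))       # parameter variability
    rmsep = math.sqrt(param_var ** 2 + data_var_sd ** 2)
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)  # Phi^{-1}(1 - alpha/2)
    return R - z * rmsep, R + z * rmsep

# Hypothetical numbers: 6 bootstrap reserves, a 66-cell triangle with
# 21 estimated parameters, and a given data-variability estimate.
boot = [9.8e6, 1.01e7, 1.05e7, 9.9e6, 1.03e7, 1.00e7]
lo, hi = rmsep_interval(1.0e7, boot, n=66, p=21, data_var_sd=3.0e5)
```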
The second approach is more robust against deviations from the hypotheses of the model. For a detailed presentation of this method see
Davison & Hinkley (1997). A new bootstrap statistic is defined here as
a function of the bootstrap estimate and a bootstrap simulation of the
future reality. This statistic is called the prediction error. (This is very
confusing because in the literature the term prediction error is also used
for the RMSEP or the standard error of prediction.) For each bootstrap
loop the prediction error is then kept in a vector and the percentile method
is used to obtain the desired percentile of this prediction error (PPE). In
a final stage an upper limit of the prediction interval for the total reserve
is calculated as [R + PPE].
The reader can find a complete list of the required steps for these two
procedures in the paper of Pinheiro et al. (2003). These authors have also
compared and discussed the two bootstrap procedures; their main conclusion is that the differences between the results obtained with the two
procedures, RMSEP and PPE, are not very important, with the PPE procedure generally generating smaller values. They further suggest eliminating
the residuals with value 0 and working with standardized residuals, since
only the latter can be considered identically distributed.
The third approach is explained in England & Verrall (2002). As
in the previous methods, a stochastic model is first fitted to the data
and a run-off triangle is bootstrapped. For this pseudo triangle the parameters are estimated in order to calculate the future incremental claim payments Ŷ∗ij. The second stage of the procedure replicates
the process variance. This is achieved by simulating an observed claim
payment for each future cell in the run-off triangle, using the bootstrap
value Ŷij∗ as the mean, and using the process distribution assumed in the
underlying model. For each iteration the reserves are calculated by adding
up the simulated forecast payments. The set of reserves obtained in this
way forms the predictive distribution. The percentile method is then used
to obtain the required prediction interval.
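The second stage can be sketched as follows; the gamma process distribution (mean μ, variance φμ) and the cell means are illustrative assumptions, not the thesis's specific choices:

```python
import random

random.seed(2005)

def predictive_distribution(mu_future, phi=1.0, n_sims=2000):
    """For each future cell, draw an 'observed' payment around its
    bootstrap mean and add the draws up to a reserve; the set of
    simulated reserves forms the predictive distribution.  A gamma with
    mean mu and variance phi*mu stands in for the process distribution
    assumed in the underlying model (an illustrative assumption)."""
    reserves = []
    for _ in range(n_sims):
        total = sum(random.gammavariate(mu / phi, phi) for mu in mu_future)
        reserves.append(total)
    return sorted(reserves)

future_cells = [250.0, 180.0, 90.0, 40.0]       # hypothetical bootstrap means
dist = predictive_distribution(future_cells)
upper_95 = dist[int(0.95 * len(dist))]           # percentile method
```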
In a practical case study one can bootstrap a high percentile of the distribution of the lower bound in order to describe the estimation error
involved. Taylor & Ashe (1983) used the terminology estimation error for
Var[(Rβ̂)ij] and statistical or random error for Var[εij]. The estimation
error arises from the estimation of the parameter vector β from the data,
and the statistical error stems from the stochastic nature of the regression
model.
We bootstrap an upper triangle using the non-parametric procedure. This
involves resampling, with replacement, from the original residuals and then
creating a new triangle of past claim payments using the resampled residuals together with the fitted values.
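This non-parametric procedure can be sketched for a toy triangle; the fitted values and residuals are hypothetical, and the lognormal-model form y∗ = exp(ln μ̂ + r∗) is used:

```python
import math
import random

random.seed(42)

# Hypothetical fitted values mu_hat[i][j] of a small 3x3 triangle
# (None marks the unobserved future cells) and the fitted residuals
# r_ij = z_ij - ln(mu_hat_ij) of a lognormal linear model.
mu_hat = [[100.0, 60.0, 30.0],
          [110.0, 65.0, None],
          [120.0, None, None]]
residuals = [0.05, -0.02, 0.01, -0.04, 0.03, -0.03]

def bootstrap_triangle(mu_hat, residuals):
    """One pseudo-triangle of past payments: y*_ij = exp(ln(mu_hat_ij) + r*_ij),
    with the r* drawn with replacement from the original residuals."""
    tri = []
    for row in mu_hat:
        new_row = []
        for mu in row:
            if mu is None:
                new_row.append(None)        # future cell: nothing to resample
            else:
                r_star = random.choice(residuals)
                new_row.append(math.exp(math.log(mu) + r_star))
        tri.append(new_row)
    return tri

pseudo = bootstrap_triangle(mu_hat, residuals)
```

Refitting the regression model to each such pseudo-triangle gives one bootstrap statistic per loop.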
With regression type problems the resampling procedure is applied to the
residuals of the model. Residuals are approximately independent and identically distributed. In a statistical analysis they are commonly used in
order to explore the adequacy of the fit of the model, with respect to
the choice of the variance function, link function and terms in the linear
predictor. Residuals may also indicate the presence of anomalous values
requiring further investigation.
For generalized linear models an extended definition of residuals is required, applicable to all the distributions that may replace the normal
distribution. It is convenient if these residuals can be used for the same
purposes as standard normal residuals. Three well-known forms of generalized residuals are the Pearson, Anscombe and deviance residuals. Pearson
residuals are easy to interpret: they are just the raw residuals scaled by the
estimated standard deviation of the response variable. A disadvantage of
the Pearson residual is that its distribution for non-normal response distributions is often markedly skewed, and so it may fail to have properties
similar to those of a normal theory residual. Anscombe and deviance residuals are more appropriate for checking approximate normality.
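A minimal Pearson-residual helper, with a Poisson variance function V(μ) = μ as an illustrative default; the observed and fitted values are hypothetical:

```python
import math

def pearson_residuals(y, mu_hat, phi=1.0, V=lambda m: m):
    """Pearson residuals for a GLM: the raw residual scaled by the
    estimated standard deviation of the response, (y - mu)/sqrt(phi*V(mu)).
    The default variance function V(mu) = mu corresponds to a Poisson
    error distribution; both defaults are illustrative assumptions."""
    return [(yi - mi) / math.sqrt(phi * V(mi)) for yi, mi in zip(y, mu_hat)]

observed = [105.0, 58.0, 33.0]   # hypothetical incremental claims
fitted = [100.0, 60.0, 30.0]     # hypothetical fitted values
r = pearson_residuals(observed, fitted)
```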
In general the lower bound S^l turns out to perform very well. A final
method to obtain a confidence bound for the predictive distribution combines the power of this lower bound with bootstrapping: we bootstrap a
high percentile of the distribution of the lower bound. This is done as
follows:
1. The preliminaries:

   • Estimate the model parameters β̂ and σ² (LL), σ̃² (LLS) or φ (GLIM).

   • Calculate the fitted values (i = 1, . . . , t; j = 1, . . . , t + 1 − i):
     μ̂ij = e^{(Rβ̂)ij} (LL); see Table 4.2 (LLS); μ̂ij = g⁻¹((Rβ̂)ij) (GLIM).

   • Calculate the residuals (i = 1, . . . , t; j = 1, . . . , t + 1 − i):
     rij = zij − ln μ̂ij (LL, LLS); rij = (yij − μ̂ij)/√(φ̂ V(μ̂ij)) (GLIM).

2. Bootstrap loop (to be repeated B times):

   • Generate a set of residuals r∗ij by sampling with replacement from the
     original residuals rij (i = 1, . . . , t; j = 1, . . . , t + 1 − i).

   • Create a new upper triangle y∗ij (i = 1, . . . , t; j = 1, . . . , t + 1 − i):

     – non-parametric bootstrap (NPB):
       y∗ij = e^{ln(μ̂ij) + r∗ij} (LL, LLS);
       y∗ij = μ̂ij + √(φ̂ V(μ̂ij)) r∗ij (GLIM).

     – parametric bootstrap (PB):
       y∗ij = e^{(Rβ̂)ij + σ̌ N(0,1)} (LL); see Table 4.2 (LLS);
       y∗ij ≈ μ̂ij + √(B(μ̂)ij + Σ(μ̂)ij) N(0, 1) (GLIM).

     Now we have bootstrapped a run-off triangle.

   • Calculate for this bootstrapped triangle the parameters β̂∗ and
     (σ̌²)∗ (LL), (σ̃²)∗ (LLS) or φ̂∗ (GLIM).

   • Calculate the percentile k of the distribution of S^l, S^l∗_(k), using
     these parameters.

   • Return to the beginning of step 2 until the B repetitions are
     completed.

3. Analysis of the bootstrap data:

   • Apply the percentile method to the bootstrap observations to
     obtain the required prediction interval.
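The three steps can be sketched as follows; `fit_model` and `lower_bound_percentile` are hypothetical stand-ins (a normal location-scale fit and its plug-in percentile) for the actual regression fit and the percentile of the convex lower bound S^l:

```python
import random
import statistics

random.seed(7)

def fit_model(data):
    # Hypothetical stand-in for estimating the regression parameters.
    return statistics.mean(data), statistics.stdev(data)

def lower_bound_percentile(mu, sigma, k=0.95):
    # Hypothetical stand-in for percentile k of the lower bound S^l:
    # here simply the plug-in k-th percentile of the fitted normal.
    return statistics.NormalDist(mu, sigma).inv_cdf(k)

data = [random.gauss(100.0, 10.0) for _ in range(50)]
mu_hat, sigma_hat = fit_model(data)                 # step 1: preliminaries
residuals = [x - mu_hat for x in data]

B = 500
boot_percentiles = []
for _ in range(B):                                  # step 2: bootstrap loop
    pseudo = [mu_hat + random.choice(residuals) for _ in data]
    mu_b, sigma_b = fit_model(pseudo)
    boot_percentiles.append(lower_bound_percentile(mu_b, sigma_b))

boot_percentiles.sort()                             # step 3: percentile method
interval = (boot_percentiles[int(0.025 * B)],
            boot_percentiles[int(0.975 * B)])
```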
4.6 Three applications
In this section we illustrate the effectiveness of the bounds derived for the
discounted IBNR reserve S, under the model studied. We investigate the
accuracy of the proposed bounds, by comparing their cumulative distribution function to the empirical distribution obtained with Monte Carlo
simulation (MC), which serves as a close approximation to the exact distribution of S. The simulation results are based on generating 100 000
random paths. The estimates obtained from this time-consuming simulation will serve as a benchmark. The random paths are based on antithetic
variables in order to reduce the variance of the Monte Carlo estimates.
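Antithetic variates can be sketched for a single discounted payment e^{−Y} with Y ∼ N(δ + ς²/2, ς²): pairing each normal draw z with −z and averaging the two discounted values reduces the variance of the Monte Carlo estimate. This is a one-period illustration, not the thesis's full multi-period path simulation:

```python
import math
import random
import statistics

random.seed(1)

def mc_antithetic(n_paths, delta=0.08, sigma=0.11):
    """Each path uses one normal draw z and its antithetic partner -z;
    the pair average is one (lower-variance) estimate of e^{-Y}."""
    est = []
    for _ in range(n_paths):
        z = random.gauss(0.0, 1.0)
        y = delta + 0.5 * sigma ** 2 + sigma * z
        y_anti = delta + 0.5 * sigma ** 2 - sigma * z
        est.append(0.5 * (math.exp(-y) + math.exp(-y_anti)))
    return est

anti = mc_antithetic(2000)
plain = [math.exp(-(0.08 + 0.5 * 0.11 ** 2 + 0.11 * random.gauss(0.0, 1.0)))
         for _ in range(2000)]
```

Here E[e^{−Y}] = e^{−δ}, which the antithetic estimate recovers with a visibly smaller standard deviation than the plain one.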
In order to illustrate the power of the bounds, namely inspecting the
deviation of the cdfs of the convex bounds S^l and S^c from the true distribution of the total IBNR reserve S, we simulate a triangle from a particular
model. We created a non-cumulative run-off triangle based on the chain-ladder predictor (4.31) with parameters given in Table 4.4. So, the run-off
α1    α2    α3    α4    α5    α6    α7    α8    α9    α10   α11
12.8  12.9  13.6  13.5  13.4  13.2  13.8  13.7  13.1  13.0  13.9

β1    β2     β3     β4     β5     β6     β7     β8     β9     β10    β11
0     0.31   −0.11  −0.42  −0.37  −0.87  −0.96  −1.33  −1.63  −1.92  −2.31

Table 4.4: Model parameters.
triangle has only trends in the two main directions, namely in the year of
origin and in the development year. The parameter β1 is set equal to zero
in order to have a non-singular regression matrix.
We also specify the multivariate distribution function of the random
vector (Y1, Y2, . . . , Yt−1). In particular, we will assume that the random
variables Yi are i.i.d. and N(δ + ς²/2, ς²) distributed with δ = 0.08 and
ς = 0.11. This enables us to simulate the cdfs, while there is no way to
compute them analytically.
4.6.1 Lognormal linear models
The simulated run-off triangle for this model is displayed in Table 4.5.
Fitting the lognormal linear model with a chain-ladder type predictor gives
the parameter estimates and standard errors shown in Table 4.6.
i\j         1          2          3        4        5        6        7        8        9       10      11
 1      363,346    492,947    322,511  236,555  249,319  151,228  138,373   95,703  71,742  53,788  35,997
 2      397,798    543,864    358,855  263,325  276,817  167,045  153,095  106,272  78,515  58,790
 3      806,154  1,096,841    727,977  530,683  557,870  336,716  310,022  213,706 157,504
 4      727,102    995,988    654,059  476,665  502,405  303,132  278,280  192,436
 5      659,846    900,386    591,633  433,425  457,482  276,056  253,301
 6      541,187    736,205    487,730  353,255  373,921  226,091
 7      979,636  1,342,832    882,924  651,920  682,307
 8      890,641  1,219,406    798,007  582,415
 9      486,340    666,405    442,457
10      445,174    604,206
11    1,084,253

Table 4.5: Simulated run-off triangle with non-cumulative claim figures for the lognormal linear regression model.
Parameter   Value     Estimate   Standard error
α1          12.8      12.7976    0.0018
α2          12.9      12.8968    0.0018
α3          13.6      13.5994    0.0018
α4          13.5      13.4957    0.0019
α5          13.4      13.3996    0.0019
α6          13.2      13.1997    0.0020
α7          13.8      13.7999    0.0021
α8          13.7      13.6983    0.0023
α9          13.1      13.0999    0.0025
α10         13.0      13.0035    0.0029
α11         13.9      13.8964    0.0039
β2          0.31      0.3109     0.0018
β3          −0.11     −0.1060    0.0018
β4          −0.42     −0.4198    0.0019
β5          −0.37     −0.3677    0.0020
β6          −0.87     −0.8717    0.0021
β7          −0.96     −0.9579    0.0022
β8          −1.33     −1.3267    0.0024
β9          −1.63     −1.6249    0.0027
β10         −1.92     −1.9100    0.0032
β11         −2.31     −2.3064    0.0043
σ           0.0004    0.0037     —

Table 4.6: Model specification, maximum likelihood estimates and standard errors for the run-off triangle in Table 4.5.
Figure 4.2 shows the cdfs of the upper and lower bounds, compared to
the empirical distribution based on 100 000 randomly generated, normally
distributed vectors (Y1, Y2, . . . , Yt−1) and the vector of error terms ε.
Since S^l_LL ≤cx S_LL ≤cx S^c_LL, the same ordering holds for the tails
of their respective distribution functions, which can be observed to cross
only once. We see that the cdf of S^l_LL is very close to the distribution
of S_LL. The “real” standard deviation equals 1,617,912 whereas the
standard deviation of the lower bound equals 1,590,233. A lower bound
for the 95th percentile is given by 13,638,620.

The comonotonic upper bound S^c_LL performs badly in this case. This
comes from the fact that in order to determine S^l_LL, we make use of the
(estimated values of the) correlations between the cells of the lower triangle, whereas in the case of S^c_LL, the distribution is an upper bound (in the
sense of convex order) for any possible dependence structure between the
components of the vector V. The standard deviation of the upper bound
is given by 1,890,298. The 95th percentile of the upper bound now equals
14,207,619, which is of course much higher than the 95th percentile of S^l_LL.

Table 4.7 summarizes the numerical values of the 95th percentiles of
the two bounds S^l_LL and S^c_LL, together with their means and standard
deviations. This is also provided for the row totals

S_LL,i = Σ_{j=t+2−i}^{t} e^{(Rβ̂)ij − Y(i+j−t−1) + εij},   i = 2, · · · , t.   (4.49)

We can conclude that the lower bound approximates the “real discounted
reserve” very well.
In order to have a better view on the behavior of the upper bound
S^c_LL and of the lower bound S^l_LL in the tails, we consider a QQ-plot where
the quantiles of S^c_LL and of the lower bound S^l_LL are plotted against the
quantiles of S_LL. The upper bound S^c_LL and the lower bound S^l_LL will
be good approximations for S_LL if the plotted points (F⁻¹_{S_LL}(p), F⁻¹_{S^c_LL}(p)),
respectively (F⁻¹_{S_LL}(p), F⁻¹_{S^l_LL}(p)), for all values of p in (0, 1), do not deviate too much from the line y = x. From the QQ-plot in Figure 4.3, we
can conclude that the upper bound (slightly) overestimates the tails of S,
whereas the accuracy of the lower bound is extremely high for the chosen
set of parameter values. Table 4.8 confirms these observations.

We remark that the improved upper bound S^u_LL is very close to the
comonotonic upper bound S^c_LL. This could be expected because ρij is close
to ρkl for any pair (ij, kl) with ij and kl sufficiently close. This implies that
for any such pair (ij, kl), F⁻¹_{e^{(Rβ̂)ij − Y(i+j−t−1)}|Λ}(U) is close to
F⁻¹_{e^{(Rβ̂)kl − Y(k+l−t−1)}|Λ}(U). Since the improved upper bound requires more computational time, the results for the
improved upper bound are not displayed in this thesis.
Figure 4.2: The cdfs of ‘S_LL’ (MC) (solid line), S^l_LL (dotted line) and
S^c_LL (dashed line) for the run-off triangle in Table 4.5.

Figure 4.3: QQ-plot of the quantiles of S^l_LL (◦) and S^c_LL versus those
of ‘S_LL’ (MC).
              S^l_LL                                 S_LL (MC)                              S^c_LL
year      mean        st. dev.    95%            mean        st. dev.    95%            mean        st. dev.    95%
2         36,694      3,043       41,913         36,690      4,072       43,742         36,694      4,096       43,796
3         178,522     18,580      210,781        178,510     21,334      215,958        178,522     22,805      218,463
4         280,596     33,568      339,371        280,570     36,069      344,231        280,596     39,738      350,678
5         396,861     51,644      487,782        396,817     53,804      492,575        396,861     60,357      503,873
6         491,311     66,663      609,034        491,252     68,525      614,094        491,311     77,971      630,052
7         1,206,735   174,414     1,515,794      1,206,571   177,891     1,526,990      1,206,735   203,422     1,570,251
8         1,574,772   226,804     1,976,955      1,574,556   230,635     1,986,766      1,574,772   267,668     2,053,898
9         1,095,585   166,894     1,392,268      1,095,420   169,890     1,403,295      1,095,585   196,744     1,449,017
10        1,287,052   199,051     1,641,355      1,286,851   203,005     1,657,107      1,287,052   236,658     1,713,161
11        4,267,416   649,616     5,423,367      4,266,762   662,975     5,473,462      4,267,416   781,003     5,674,518
total     10,815,543  1,590,233   13,638,620     10,814,002  1,617,912   13,718,215     10,815,543  1,890,298   14,207,619

Table 4.7: 95th percentiles, means and standard deviations of the distributions of S^l_LL and S^c_LL vs. ‘S_LL’ (MC).
p        S^l_LL        S_LL          S^c_LL
0.95     13,638,620    13,718,215    14,207,619
0.975    14,303,311    14,411,869    15,035,380
0.99     15,122,153    15,166,753    16,066,305
0.995    15,709,687    15,710,588    16,813,432
0.999    17,003,250    17,003,255    18,479,550

Table 4.8: Approximations for some selected quantiles with probability
level p of S_LL.
                      Distribution of bootstrapped       Simulated distribution
                      95th percentiles of S^l_LL         of F⁻¹_{S_LL}(0.95)
1st percentile        13,587,825                         13,578,331
2.5th percentile      13,589,852                         13,579,131
5th percentile        13,597,445                         13,585,813
10th percentile       13,616,522                         13,598,723
25th percentile       13,627,692                         13,619,389
50th percentile       13,637,841                         13,634,543
75th percentile       13,647,654                         13,651,195
90th percentile       13,661,140                         13,669,104
95th percentile       13,671,003                         13,678,393
97.5th percentile     13,678,085                         13,685,378
99th percentile       13,680,785                         13,688,379

Table 4.9: Percentiles of the bootstrapped 95th percentile of the distribution of the lower bound S^B_{l(95)} vs. the simulation.
Finally, for each bootstrap sample, we calculate the desired percentile of
the distribution of S^l_LL. This two-step procedure is repeated a large number
of times. The first column of Table 4.9 shows the results, concerning the
95th percentile, for 5000 bootstrap samples applied to the run-off triangle
in Table 4.5. When compared with the simulated distribution of F⁻¹_{S_LL}(0.95)
(obtained through 5000 simulated triangles), we can conclude that the
bootstrap distribution yields appropriate confidence bounds.
Parameter   Value    Estimate   Standard error
α1          12.8     12.805     0.0073
α2          12.9     12.909     0.0074
α3          13.6     13.599     0.0077
α4          13.5     13.506     0.0076
α5          13.4     13.411     0.0082
α6          13.2     13.203     0.0076
α7          13.8     13.788     0.0091
α8          13.7     13.708     0.0081
α9          13.1     13.103     0.0101
α10         13.0     13.982     0.0102
α11         13.9     13.905     0.0131
β2          0.31     0.310      0.0068
β3          −0.11    −0.118     0.0080
β4          −0.42    −0.424     0.0079
β5          −0.37    −0.370     0.0088
β6          −0.87    −0.883     0.0079
β7          −0.96    −0.967     0.0093
β8          −1.33    −1.325     0.0108
β9          −1.63    −1.643     0.0097
β10         −1.92    −1.956     0.0225
β11         −2.31    −2.311     0.0150
σ           0.01     0.0093     0.0001

Table 4.10: Model specification, maximum likelihood estimates and standard errors for the run-off triangle in Table 4.11.
4.6.2 Loglinear location-scale models
Table 4.11 displays the simulated run-off triangle for the logistic regression
model with the parameters displayed in Table 4.4.
Fitting the logistic regression model with a chain-ladder type predictor
gives the parameter estimates and standard errors shown in Table 4.10.
i\j         1          2          3        4        5        6        7        8        9       10      11
 1      362,573    487,703    327,399  247,297  248,321  151,494  137,722   98,983  70,587  50,118  36,110
 2      400,144    548,504    366,684  255,014  283,467  166,318  154,915  105,641  77,890  58,763
 3      819,562  1,109,572    665,960  520,160  566,065  330,429  302,985  216,361 156,159
 4      724,419    999,135    668,363  478,629  512,920  307,563  275,629  192,212
 5      675,791    893,821    597,618  434,052  442,722  276,007  262,520
 6      544,870    736,215    471,965  359,236  377,939  222,590
 7      990,881  1,341,576    850,040  639,613  658,638
 8      896,565  1,230,011    790,872  589,761
 9      482,297    674,219    437,692
10      432,302    595,206
11    1,093,549

Table 4.11: Simulated run-off triangle with non-cumulative claim figures for the logistic regression model.
We will compare the derived bounds with a time-consuming Monte Carlo
simulation based on 100 000 randomly generated, normally distributed vectors (Y1, Y2, . . . , Yt−1) and e^{σ̃ε}. Using the following properties, the simulation of these last terms can be done in any statistical software package.

• If εij is Gumbel distributed, then we have that e^{σ̃εij} is Weibull distributed with location parameter 1/σ̃ and scale parameter equal to 1.

• If εij is generalized loggamma distributed with parameter k, then
  we have that e^{σ̃εij} is generalized gamma distributed with parameters
  γ = 1/(σ̃√k) and α = k^{−σ̃√k}. One can generate a random number
  from a generalized gamma distribution as follows:

  1. Generate Gk from the gamma distribution with location parameter k and scale parameter 1.
  2. Retain α(Gk)^{1/γ}.

• If εij is log inverse Gaussian distributed, then we have that e^{σ̃εij}
  is inverse Gaussian distributed with location parameter and scale
  parameter equal to 1/σ̃. Michael et al. (1976) describe an algorithm
  to generate a random number from an inverse Gaussian distribution
  with parameters α and β as follows:

  1. Generate C from the χ²(1) distribution.
  2. Calculate x1 = α/β + C/(2β) − (1/(2β))√(4αC + C²),
     x2 = α²/(β² x1) and p1 = (1 + (β/α) x1)⁻¹.
  3. Generate U ∼ Uniform(0, 1).
  4. Retain x1 if U ≤ p1, else x2.
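The inverse Gaussian step can be sketched in Python, written in the common (μ, λ) parametrisation rather than the (α, β) one above; the parameter values in the usage example are illustrative:

```python
import math
import random

random.seed(3)

def inverse_gaussian(mu, lam):
    """One draw from an inverse Gaussian distribution by the
    transformation-with-rejection method of Michael, Schucany & Haas
    (1976): take a chi-square(1) variate, compute the smaller root x1,
    and accept it with probability mu/(mu + x1), else return mu^2/x1."""
    z = random.gauss(0.0, 1.0)
    c = z * z                                       # chi-square(1) variate
    x1 = mu + mu * mu * c / (2.0 * lam) - (mu / (2.0 * lam)) * math.sqrt(
        4.0 * mu * lam * c + mu * mu * c * c)       # smaller root
    if random.random() <= mu / (mu + x1):
        return x1
    return mu * mu / x1

draws = [inverse_gaussian(1.0, 9.0) for _ in range(5000)]
mean_draw = sum(draws) / len(draws)                 # should be close to mu
```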
In Figures 4.4 and 4.5 we compare the approximations (the convex upper
and lower bounds) for the distribution of the discounted loss reserve S_LLS
to the empirical distribution function obtained by a Monte Carlo (MC)
simulation study. One can see that the upper bound S^c_LLS gives a poor
approximation. We observe that this upper bound has heavier tails than
the original distribution: the deviation for upper quantiles reaches 25%.
The main reason for this is a relatively weak dependence between claims,
for which the comonotonic approximation significantly overestimates the
tails, which is very clear both from the plot of the cdfs and from the QQ-plot.
On the other hand the lower bound gives a much better fit to the original
distribution. These findings are confirmed in Table 4.12 for some chosen
quantiles.

p        S^l_LLS       S_LLS         S^c_LLS
0.95     13,517,204    13,524,010    14,125,203
0.975    14,175,492    14,165,083    14,950,838
0.99     14,988,558    15,009,978    15,979,224
0.995    15,573,369    15,483,938    16,724,584
0.999    16,865,068    16,623,928    18,386,959

Table 4.12: Approximations for some selected quantiles with probability
level p of S_LLS.
Similar conclusions can be drawn from the study of the reserves for the
row totals given by

S_LLS,i = Σ_{j=t+2−i}^{t} e^{(Rβ̂)ij + σ̃εij − Y(i+j−t−1)},   i = 2, · · · , t.   (4.50)

Table 4.13 summarizes the numerical values of the 95th percentiles of the
two bounds S^l_LLS and S^c_LLS, together with their means and standard deviations.
We end this illustration with a bootstrap study in order to incorporate
the estimation error involved. Starting from the run-off triangle in Table
4.11 we bootstrap 5000 pseudo run-off triangles and calculate for each
bootstrap sample the 95th percentile of the distribution of S^l_LLS. Table
4.14 displays the results of this study. One can observe that, compared
to the simulated distribution of F⁻¹_{S_LLS}(0.95), the bootstrap distribution
performs very well.
Figure 4.4: The cdfs of ‘S_LLS’ (MC) (solid line), S^l_LLS (dotted line) and
S^c_LLS (dashed line) for the run-off triangle in Table 4.11.

Figure 4.5: QQ-plot of the quantiles of S^l_LLS (◦) and S^c_LLS versus
those of ‘S_LLS’ (MC).
              S^l_LLS                                S_LLS (MC)                             S^c_LLS
year      mean        st. dev.    95%            mean        st. dev.    95%            mean        st. dev.    95%
2         36,990      2,705       41,609         36,990      4,124       44,057         36,990      4,133       44,146
3         173,309     16,524      201,904        173,309     20,367      209,182        173,309     21,756      212,567
4         276,778     30,930      330,812        276,778     35,288      339,482        276,778     38,786      346,424
5         395,969     48,841      481,854        395,969     53,075      487,849        395,969     60,567      503,109
6         487,188     63,572      599,443        487,188     67,227      605,245        487,188     77,321      625,306
7         1,178,908   166,351     1,473,927      1,178,908   171,499     1,484,295      1,178,908   201,482     1,535,494
8         1,576,489   222,631     1,971,886      1,576,489   226,440     1,979,501      1,576,489   263,879     2,057,661
9         1,090,821   164,670     1,384,144      1,090,821   165,568     1,386,684      1,090,821   192,212     1,443,768
10        1,248,692   192,941     1,593,022      1,248,692   193,174     1,589,451      1,248,692   224,982     1,663,714
11        4,278,076   650,272     5,438,603      4,278,076   653,437     5,443,520      4,278,076   763,183     5,693,117
total     10,743,220  1,559,369   13,517,204     10,743,220  1,583,892   13,524,010     10,743,220  1,884,508   14,125,203

Table 4.13: 95th percentiles, means and standard deviations of the distributions of S^l_LLS and S^c_LLS vs. ‘S_LLS’ (MC).
                      Distribution of bootstrapped       Simulated distribution
                      95th percentiles of S^l_LLS        of F⁻¹_{S_LLS}(0.95)
1st percentile        13,123,442                         13,111,532
2.5th percentile      13,227,201                         13,216,739
5th percentile        13,314,139                         13,301,539
10th percentile       13,363,615                         13,340,200
25th percentile       13,434,055                         13,421,181
50th percentile       13,510,262                         13,501,808
75th percentile       13,583,175                         13,585,042
90th percentile       13,646,792                         13,654,483
95th percentile       13,691,476                         13,698,756
97.5th percentile     13,716,004                         13,731,551
99th percentile       13,730,976                         13,740,991

Table 4.14: Percentiles of the bootstrapped 95th percentile of the distribution of the lower bound S^B_{l(95)} vs. the simulation.
4.6.3 Generalized linear models
In this last illustration we model the incremental claims Yij with a logarithmic link function, to obtain a multiplicative parametric structure, and
we link the expected value of the response to the chain-ladder type linear
predictor. Formally, this means that

E[Yij] = μij,   Var[Yij] = φ μij^κ,   log(μij) = ηij,   ηij = αi + βj.   (4.51)

The choice of the error distribution is determined by κ.
More specifically, we consider model (4.51) with the Poisson error distribution (κ = 1 and φ = 1). The simulated triangle for this model is depicted in
Table 4.15. Parameter estimates and standard errors for this fit are shown
in Table 4.16.
Since this model is a generalized linear model, standard statistical
software can be used to obtain maximum (quasi) likelihood parameter
estimates, fitted and predicted values. Standard statistical theory also
suggests goodness-of-fit measures and appropriate residual definitions for
diagnostic checks of the fitted model.
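The multiplicative structure in (4.51) can be sketched with hypothetical parameter estimates; summing the predicted lower-triangle cells gives the undiscounted reserve forecast:

```python
import math

# Hypothetical chain-ladder type estimates for a 3-year triangle
# (alpha_i for origin years, beta_j for development years, beta_1 = 0),
# illustrating the multiplicative structure log(mu_ij) = alpha_i + beta_j.
alpha = [4.6, 4.7, 4.8]
beta = [0.0, -0.5, -1.2]
t = 3

# Fitted/predicted values for every cell, past and future.
mu = [[math.exp(alpha[i] + beta[j]) for j in range(t)] for i in range(t)]

# The future (lower-triangle) cells have i + j > t - 1 in 0-based indexing;
# summing their predicted payments gives the undiscounted IBNR forecast.
reserve = sum(mu[i][j] for i in range(t) for j in range(t) if i + j > t - 1)
```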
i\j         1          2          3        4        5        6        7        8        9       10      11
 1      362,505    493,876    323,065  237,574  249,850  152,221  139,293   95,961  70,812  53,395  35,902
 2      399,642    545,274    357,788  263,414  276,500  168,064  153,603  105,760  78,736  58,612
 3      805,843  1,100,020    722,110  531,220  557,195  337,606  309,306  213,416 158,611
 4      728,762    994,975    653,231  478,728  502,797  306,071  278,436  193,201
 5      661,713    899,778    591,647  434,626  456,763  276,588  253,297
 6      539,789    737,394    484,415  355,175  372,800  226,865
 7      983,897  1,341,585    881,786  647,431  679,264
 8      889,268  1,217,248    798,387  585,099
 9      487,823    666,590    437,987
10      442,982    601,706
11    1,087,672

Table 4.15: Simulated run-off triangle with non-cumulative claim figures for the Poisson regression model.
Parameter   Value    Estimate      Standard error
α1          12.8     12.7990566    0.0007918770
α2          12.9     12.8989406    0.0007631003
α3          13.6     13.6001742    0.0006060520
α4          13.5     13.4989356    0.0006283423
α5          13.4     13.4007436    0.0006556928
α6          13.2     13.1997559    0.0007180990
α7          13.8     13.7991616    0.0005991796
α8          13.7     13.6998329    0.0006464691
α9          13.1     13.0989431    0.0008707837
α10         13.0     12.9987252    0.0010370987
α11         13.9     13.8995502    0.0009710197
β2          0.31     0.3106789     0.0005310346
β3          −0.11    −0.1099061    0.0006026958
β4          −0.42    −0.4189677    0.0006804776
β5          −0.37    −0.3700452    0.0007168115
β6          −0.87    −0.8685181    0.0009462170
β7          −0.96    −0.9585385    0.0010542829
β8          −1.33    −1.3284870    0.0013825136
β9          −1.63    −1.6269622    0.0018947413
β10         −1.92    −1.9170757    0.0030880359
β11         −2.31    −2.3105083    0.0054029754
φ           1        1.025663

Table 4.16: Model specification, maximum likelihood estimates and standard errors for the run-off triangle in Table 4.15.
Figure 4.6 shows the distribution functions of the different bounds compared to the empirical distribution obtained by Monte Carlo simulation
(MC). The distribution functions are remarkably close to each other and
enclose the simulated cdf nicely. This is confirmed by the QQ-plot in Figure 4.7, where we also see that the comonotonic upper bound has somewhat
heavier tails. Numerical values of some high quantiles of S_GLIM, S^l_GLIM
and S^c_GLIM are given in Table 4.18.
Table 4.17 summarizes the numerical values of the 95th percentiles of
the two bounds S^l_GLIM and S^c_GLIM vs. S_GLIM, together with their means
and standard deviations. This is also provided for the row totals

S_GLIM,i = Σ_{j=t+2−i}^{t} μ̂ij e^{−Y(i+j−t−1)},   i = 2, . . . , t.   (4.52)
Figure 4.6: The cdfs of ‘S_GLIM’ (MC) (solid line), S^l_GLIM (dotted line)
and S^c_GLIM (dashed line) for the run-off triangle in Table 4.15.

Figure 4.7: QQ-plot of the quantiles of S^l_GLIM (◦) and S^c_GLIM versus
those of ‘S_GLIM’ (MC).
                     S^l_GLIM                S_GLIM (MC)             S^c_GLIM
year     mean        st.dev.     95%         st.dev.     95%         st.dev.     95%
2        36,623      4,041       43,622      4,042       43,624      4,046       43,631
3        177,600     21,002      214,142     21,040      214,428     22,751      217,352
4        280,318     35,595      342,589     35,691      343,011     39,805      350,360
5        396,089     52,976      489,087     53,194      489,689     60,398      502,853
6        490,289     67,401      608,891     67,565      609,535     78,021      628,672
7        1,205,224   175,099     1,514,480   175,658     1,516,799   203,692     1,567,945
8        1,575,313   227,703     1,977,737   228,343     1,980,868   268,661     2,054,475
9        1,093,992   167,320     1,390,601   167,862     1,392,957   197,121     1,444,660
10       1,278,947   199,110     1,632,675   199,693     1,634,653   236,121     1,702,375
11       4,276,121   655,280     5,439,986   656,472     5,446,107   785,741     5,685,932
total    10,810,476  1,594,152   13,631,905  1,597,507   13,648,695  1,896,219   14,200,226

Table 4.17: 95th percentiles, means and standard deviations of the distributions of S^l_GLIM and S^c_GLIM vs. S_GLIM (MC). The mean is the same for all three distributions, since the convex order bounds preserve the mean.
p        S^l_GLIM      S_GLIM        S^c_GLIM
0.95     13,631,905    13,648,695    14,200,226
0.975    14,296,448    14,305,657    15,027,414
0.99     15,115,189    15,122,840    16,057,613
0.995    15,702,702    15,709,497    16,804,206
0.999    16,996,374    17,018,860    18,469,110

Table 4.18: Approximations for some selected quantiles with probability level p of S_GLIM.
                     Distribution of bootstrapped     Simulated distribution
                     95th percentiles of S^l_GLIM     of F^{-1}_{S_GLIM}(0.95)
1st percentile       13,614,404                       13,604,314
2.5th percentile     13,617,028                       13,609,425
5th percentile       13,619,474                       13,613,048
10th percentile      13,622,664                       13,618,053
25th percentile      13,626,759                       13,624,369
50th percentile      13,631,651                       13,631,622
75th percentile      13,636,506                       13,638,997
90th percentile      13,641,168                       13,645,812
95th percentile      13,643,882                       13,649,574
97.5th percentile    13,646,720                       13,652,995
99th percentile      13,648,833                       13,656,178

Table 4.19: Percentiles of the bootstrapped 95th percentile of the distribution of the lower bound S^B_{l(95)} vs. the simulation.
The bootstrap results in Table 4.19 are in line with the results of the previous applications. We can conclude that in the discussed applications the lower bound approximates the "real discounted reserve" very well. The precision of the bounds depends only on the underlying variance of the statistical and financial parts. As long as the yearly volatility does not exceed σ = 35%, the financial part of the comonotonic approximation provides a very accurate fit. These parameter values are consistent with historical capital market values as reported by Ibbotson Associates (2002). The underlying variance of the statistical part depends on the estimated dispersion parameter and the error distribution or mean-variance relationship. For example, in the case of the gamma distribution one obtains excellent results as long as the dispersion parameter is smaller than 1. This is again in line with the volatility structure in practical IBNR data sets. Since the parameters for the statistical part of the bounds, obtained through the quasi-likelihood approach, have small standard errors, results would be similar when simulating from a GLIM with the same linear predictor but, for instance, another distribution type. In that sense our findings are robust.
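The bootstrap procedure behind Table 4.19 can be sketched in a few lines of Python. The code below is our own illustration (hypothetical function name and placeholder data, not the computations of this chapter): it resamples a sample with replacement and records the empirical 95th percentile of each resample.

```python
import random
import statistics

def bootstrap_p95(sample, n_boot=200, seed=1):
    """Bootstrap distribution of the empirical 95th percentile of `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    p95s = []
    for _ in range(n_boot):
        resample = sorted(rng.choice(sample) for _ in range(n))
        p95s.append(resample[int(0.95 * n)])  # simple empirical 95th percentile
    return sorted(p95s)

# Illustration on a placeholder sample (the integers 0..999, standing in
# for simulated discounted reserves):
sample = list(range(1000))
boot = bootstrap_p95(sample)
print(statistics.median(boot))  # bootstrap median of the 95th percentile
```

The sorted bootstrap replicates directly yield the percentile rows of a table such as Table 4.19.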
4.7 Conclusion
In this chapter, we considered the problem of deriving the distribution function of the present value of a triangle of claim payments that are discounted using some given stochastic return process. We started by modelling the claim payments by means of a lognormal linear model, which belongs to the larger class of loglinear location-scale models. The use of generalized linear models offers a great gain in modelling flexibility over the simple lognormal model. The incremental claim amounts can for instance be modelled as independent normal, Poisson, gamma or inverse Gaussian response variables together with a logarithmic link function and a specified linear predictor.

Because an explicit expression for the distribution function is hard to obtain, we presented some approximations for this distribution function, in the sense that these approximations are larger or smaller, in the convex order sense, than the exact distribution. When the lower and upper bounds are close to each other, together they can provide reliable information about the original and more complex variable. An essential point in the derivation of the presented convex lower bound approximations is the choice of the conditioning random variable Λ.

When dealing with very large variances in the statistical and financial parts of our model, an adaptation of the random variable Λ will be necessary, or one can use other approximation techniques. This will be the topic of the next chapter.
Chapter 5

Other approximation techniques for sums of dependent random variables
Summary In this chapter we derive some asymptotic results for the tail
distribution of sums of heavy tailed dependent random variables. We show
how to apply the obtained results to approximate certain functionals of
(the d.f. of) sums of dependent random variables. Our numerical results
demonstrate that the asymptotic approximations are typically close to the
Monte Carlo value. We will further briefly recall the mathematical techniques behind the moment matching approximations and the Bayesian approach. Finally, we compare these approximations with the comonotonic
approximations of the previous chapter in the context of claims reserving.
5.1 Introduction
Many quantities of relevance in actuarial science concern functionals of
(the d.f. of) sums of dependent random variables. For example, one can
think of the Value-at-Risk of a stochastically discounted life annuity, or
the stop-loss premium for the aggregate claim amount of a number of interrelated policies. Therefore, distribution functions of sums of dependent
random variables are of particular interest. Typically these distribution
functions are of a complex form. Consequently, in order to compute functionals of sums of dependent random variables, approximation methods
are generally indispensable. Obviously, in many cases we could use Monte
Carlo simulation to obtain empirical distribution functions. However, this
is typically a time-consuming approach, in particular if we want to approximate tail probabilities, which would require an excessive number of
simulations. Therefore, alternative methods need to be explored.
Practitioners often use moment matching techniques to approximate
(the d.f. of) a sum of dependent lognormal random variables. In Section 2
we recall the lognormal and reciprocal gamma moment matching approach.
Both approximations are chosen such that their first two moments are equal
to the corresponding moments of the random variable of interest.
In Chapter 2 we discussed the concept of comonotonicity to obtain
bounds in convex order for sums of dependent random variables. Although these bounds in convex order have proven to be good approximations in case the variance of the random sum is sufficiently small, they
perform much worse when the variance gets large. Section 3 establishes
some asymptotic results for the tail probability of a sum of dependent
random variables, in the presence of heavy-tailedness conditions.
Section 4 sketches, in very broad terms, basic elements of Bayesian
computation. We discuss two major obstacles to its popularity. The first
is how to specify prior distributions, and the second is how to evaluate
the integrals required for inference, given that for most models, these are
analytically intractable.
In the last section we compare the discussed approximations with the
comonotonic approximations of the previous chapter in the context of
claims reserving. In case the underlying variance of the statistical and
financial part of the discounted IBNR reserve gets large, the comonotonic approximations perform worse. We will illustrate this observation
by means of a simple example and propose to solve this problem using the
derived asymptotic results for the tail probability of a sum of dependent
random variables, in the presence of heavy-tailedness conditions. These
approximations are compared with the lognormal moment matching approximations. We finally consider the distribution of the discounted loss
reserve when the data in the run-off triangle is modelled by a generalized
linear model and compare the outcomes of the Bayesian approach with the
comonotonic approximations.
This chapter is based on Laeven, Goovaerts & Hoedemakers (2005),
Vanduffel, Hoedemakers & Dhaene (2004) and Antonio, Beirlant & Hoedemakers (2005).
5.2 Moment matching approximations
Consider a sum S given by

    S = \sum_{i=1}^{n} \alpha_i e^{Z_i}.    (5.1)

Here, the α_i are non-negative real numbers and (Z_1, Z_2, ..., Z_n) is a multivariate normally distributed random vector.
The accumulated value at time n of a series of future deterministic
saving amounts αi can be written in the form (5.1), where Zi denotes the
random accumulation factor over the period [i, n]. Also the present value
of a series of future deterministic payments αi can be written in the form
(5.1), where now Zi denotes the random discount factor over the period
[0, i]. The valuation of Asian or basket options in a Black & Scholes model
and the setting of provisions and required capitals in an insurance context
boils down to the evaluation of risk measures related to the distribution
function of a random variable S as defined in (5.1).
The r.v. S defined in (5.1) will in general be a sum of non-independent
lognormal r.v.’s. Its distribution function cannot be determined analytically and is too cumbersome to work with. In the literature, a variety of
approximation techniques for this distribution function has been proposed.
Practitioners often use a moment matching lognormal approximation
for the distribution of S. The lognormal approximation is chosen such that
its first two moments are equal to the corresponding moments of S.
The present value of a continuous perpetuity with lognormal return
process has a reciprocal gamma distribution, see for instance Milevsky
(1997) and Dufresne (1990). This present value can be considered as the
limiting case of a random variable S as defined above. Motivated by this
observation, Milevsky & Posner (1998) and Milevsky & Robinson (2000)
propose a moment matching reciprocal gamma approximation for the d.f.
of S such that the first two moments match. They use this technique
for deriving closed form approximations for the price of Asian and basket
options.
5.2.1 Two well-known moment matching approximations
It belongs to the toolkit of any actuary to approximate the distribution
function of an unknown r.v. by a known distribution function in such a
way that the first moments are preserved. In this section we will briefly
describe the reciprocal gamma and the lognormal moment matching approximations. These two methods are frequently used to approximate the
distribution function of the r.v. S defined by (5.1).
The reciprocal gamma approximation
A r.v. X is said to be gamma distributed when its probability density
function is given by
    f_X(x; \alpha, \beta) = \frac{\beta^\alpha}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\beta x}, \qquad x > 0,    (5.2)
where α > 0, β > 0 and Γ(.) denotes the gamma function.
Consider now the r.v. Y = 1/X. This r.v. is said to be reciprocal gamma
distributed. Its p.d.f. is given by
    f_Y(y; \alpha, \beta) = f_X(1/y; \alpha, \beta)/y^2, \qquad y > 0.    (5.3)
It is straightforward to prove that the quantiles of Y are given by
    F_Y^{-1}(p) = \frac{1}{F_X^{-1}(1-p; \alpha, \beta)}, \qquad p \in (0, 1),    (5.4)
where FX (.; α, β) is the cdf of the gamma distribution with parameters
α and β. Since the inverse of the gamma distribution function is readily
available in many statistical software packages, quantiles can easily be
determined.
The first two moments of the reciprocal gamma distributed r.v. Y are given by

    E[Y] = \frac{1}{\beta(\alpha - 1)}, \qquad \alpha > 1,    (5.5)

and

    E[Y^2] = \frac{1}{\beta^2(\alpha - 1)(\alpha - 2)}, \qquad \alpha > 2.    (5.6)
Expressing the parameters α and β in terms of E[Y] and E[Y^2] gives

    \alpha = \frac{2E[Y^2] - E[Y]^2}{E[Y^2] - E[Y]^2}    (5.7)

and

    \beta = \frac{E[Y^2] - E[Y]^2}{E[Y]\, E[Y^2]}.    (5.8)
The d.f. of the r.v. defined in (5.1) is now approximated by a reciprocal
gamma distribution with first two moments (2.46) and (2.47). The coefficients α and β of the reciprocal gamma approximation follow from (5.7)
and (5.8). The reciprocal gamma approximation for the quantile function
is then given by (5.4).
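As a minimal sketch of this recipe (our own illustration; the function names are not from this text), the moment matching step (5.7)-(5.8) and the moment formulas (5.5)-(5.6) translate directly into Python; a gamma quantile routine from any statistical package would then supply the approximation (5.4):

```python
def reciprocal_gamma_params(m1, m2):
    """Match alpha and beta of a reciprocal gamma to the target moments
    m1 = E[Y] and m2 = E[Y^2], following equations (5.7) and (5.8)."""
    var = m2 - m1 ** 2
    alpha = (2.0 * m2 - m1 ** 2) / var   # equation (5.7)
    beta = var / (m1 * m2)               # equation (5.8)
    return alpha, beta

def reciprocal_gamma_moments(alpha, beta):
    """First two moments of a reciprocal gamma r.v., equations (5.5)-(5.6);
    they exist only for alpha > 1 and alpha > 2 respectively."""
    m1 = 1.0 / (beta * (alpha - 1.0))
    m2 = 1.0 / (beta ** 2 * (alpha - 1.0) * (alpha - 2.0))
    return m1, m2

# Round trip: matching and then re-computing the moments returns the targets.
alpha, beta = reciprocal_gamma_params(2.0, 5.0)   # gives alpha = 6, beta = 0.1
print(reciprocal_gamma_moments(alpha, beta))      # -> (2.0, 5.0) up to rounding
```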
The reciprocal gamma moment matching method appears naturally in case one wants to approximate the d.f. of stochastic present values. Indeed, for the limiting case of the constant continuous perpetuity

    S_\infty = \int_0^{+\infty} \exp\left( -\left(\mu - \frac{\sigma^2}{2}\right)\tau - \sigma B(\tau) \right) d\tau,    (5.9)

where B(τ) represents a standard Brownian motion and μ > σ²/2, the risk measures can be calculated very easily, since Dufresne (1990) proved that S_∞^{-1} is gamma distributed with parameters 2μ/σ² − 1 and σ²/2. An elegant proof of this result can be found in Milevsky (1997).
Expression (5.9) can be seen as a continuous counterpart of a discounted
sum such as in (5.1). One expects that the present value of a finite discrete annuity with a normal logreturn process with independent periodic
returns, can be approximated by a reciprocal gamma distribution, provided the time period involved is long enough. This idea was set forward
and explored in Milevsky & Posner (1998), Milevsky & Robinson (2000)
and Huang et al. (2004).
The lognormal approximation
A r.v. X is said to be lognormally distributed if its p.d.f. is given by

    f_X(x; \mu, \sigma^2) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-(\log x - \mu)^2 / (2\sigma^2)}, \qquad x > 0,    (5.10)
where σ > 0.
The quantiles of X are given by

    F_X^{-1}(p) = e^{\mu + \sigma \Phi^{-1}(p)}, \qquad p \in (0, 1).    (5.11)
The first two moments of X are given by

    E[X] = e^{\mu + \frac{1}{2}\sigma^2}    (5.12)

and

    E[X^2] = e^{2\mu + 2\sigma^2}.    (5.13)
Expressing the parameters μ and σ² of the lognormal distribution in terms of E[X] and E[X^2] leads to

    \mu = \log\left( \frac{E[X]^2}{\sqrt{E[X^2]}} \right)    (5.14)

and

    \sigma^2 = \log\left( \frac{E[X^2]}{E[X]^2} \right).    (5.15)
The same procedure as the one explained in the previous subsection can
be followed in order to obtain a lognormal approximation for S, with the
first two moments matched. Dufresne (2002) obtains a lognormal limit distribution for S as volatility σ tends to zero and this provides a theoretical
justification for the use of the lognormal approximation.
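To make the lognormal matching concrete, the sketch below (our own illustration with hypothetical function names) computes the exact first two moments of S in (5.1) from the mean vector and covariance matrix of (Z_1, ..., Z_n), using E[e^{Z_i + Z_j}] = exp(μ_i + μ_j + (σ_ii + σ_jj)/2 + σ_ij), and then solves (5.14)-(5.15):

```python
import math

def moments_of_S(alpha, mu, cov):
    """Exact E[S] and E[S^2] for S = sum_i alpha_i exp(Z_i) with
    (Z_1, ..., Z_n) multivariate normal (mean vector mu, covariance cov)."""
    n = len(alpha)
    m1 = sum(alpha[i] * math.exp(mu[i] + cov[i][i] / 2.0) for i in range(n))
    # E[exp(Z_i + Z_j)] = exp(mu_i + mu_j + (cov_ii + cov_jj)/2 + cov_ij)
    m2 = sum(alpha[i] * alpha[j] *
             math.exp(mu[i] + mu[j] + (cov[i][i] + cov[j][j]) / 2.0 + cov[i][j])
             for i in range(n) for j in range(n))
    return m1, m2

def lognormal_match(m1, m2):
    """Parameters of the matched lognormal, equations (5.14)-(5.15)."""
    mu = math.log(m1 ** 2 / math.sqrt(m2))   # equation (5.14)
    sigma2 = math.log(m2 / m1 ** 2)          # equation (5.15)
    return mu, sigma2

# Sanity check: for n = 1 the matched lognormal is the original one.
mu, sigma2 = lognormal_match(*moments_of_S([1.0], [0.2], [[0.09]]))
print(mu, sigma2)  # -> approximately 0.2 and 0.09
```

Quantiles of the matched distribution then follow from (5.11).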
5.2.2 Application: discounted loss reserves
We calculate the lognormal moment matching approximations for the application considered in Section 2.4 and compare the results with the convex lower bound. The results are given below.

We use the notation SM_p[V^l] and SM_p[V^{LN}] to denote the security margin for confidence level p approximated by the lower bound and by the lognormal moment matching technique respectively. The different tables display the Monte Carlo simulation result (MC) for the security margin, as well as the percentage deviations of the different approximation methods relative to the Monte Carlo result. These percentage deviations are defined as follows:

    LB := \frac{SM_p[V^l] - SM_p[V^{MC}]}{SM_p[V^{MC}]} \times 100\%,

    LN := \frac{SM_p[V^{LN}] - SM_p[V^{MC}]}{SM_p[V^{MC}]} \times 100\%,

where V^l and V^{LN} correspond to the lower bound approach and the lognormal moment matching approach, and V^{MC} denotes the Monte Carlo simulation result. The figures displayed in bold in the tables correspond to the best approximations, i.e. the ones with the smallest percentage deviation compared to the Monte Carlo results.
σ_M :            0.05      0.15      0.25      0.35
LB               −0.25%    −0.09%    −0.12%    −0.00%
LN               −1.66%    +1.28%    +4.09%    +7.52%
MC               0.0853    0.1090    0.1309    0.1370
(s.e. × 10^7)    (1.11)    (2.47)    (6.15)    (8.18)

Table 5.1: (ex. 1) Approximations for the security margin SM_{0.70}[V] for different market volatilities and ω_L = 0.1 and ω_A = 0.05.

p :              0.995     0.975     0.95      0.90      0.80      0.70
LB               −0.38%    −0.21%    −0.16%    −0.08%    −0.00%    −0.00%
LN               −4.30%    −2.96%    −2.29%    −1.43%    −0.11%    +1.74%
MC               1.0348    0.6927    0.5421    0.3859    0.2192    0.1124
(s.e. × 10^5)    (2.49)    (0.46)    (0.26)    (0.10)    (0.06)    (0.04)

Table 5.2: (ex. 1) Approximations for some selected confidence levels of SM_p[V]. The market volatility is set equal to 20%. (ω_L = 0.05 and ω_A = 0)

σ_M :            0.05      0.10      0.15      0.20      0.25      0.30      0.35
LB               −0.19%    −0.15%    −0.23%    −0.16%    −0.11%    −0.17%    −0.38%
LN               −4.94%    −3.92%    −3.17%    −2.49%    −1.95%    −1.56%    −1.30%
MC               0.4390    0.5250    0.6528    0.8103    0.9924    1.1970    1.4232
(s.e. × 10^5)    (0.15)    (0.29)    (0.41)    (0.69)    (1.22)    (3.78)    (4.16)

Table 5.3: (ex. 1) Approximations for the security margin SM_{0.975}[V] for different market volatilities.

p :              0.995     0.975     0.95      0.90      0.80      0.70
LB               −0.93%    −0.04%    −0.02%    −0.18%    −0.03%    −0.6%
LN               −3.94%    +3.78%    +7.22%    +11.29%   +19.68%   +53.46%
MC               4.4521    2.2264    1.4998    0.8814    0.3508    0.0761
(s.e. × 10^5)    (37.63)   (2.99)    (7.44)    (2.79)    (0.78)    (0.27)

Table 5.4: (ex. 2) Approximations for some selected confidence levels of SM_p[V]. The market volatility is set equal to 25%.

Overall the comonotonic lower bound approach provides a very accurate fit under different parameter assumptions. These assumptions are in line with realistic market values. Moreover, the comonotonic approximations have the advantage that they are easily computable for any risk measure that is additive for comonotonic risks, such as Value-at-Risk and Tail Value-at-Risk. We believe the comonotonic approach is preferable to any moment matching approximation, because it is more stable and accurate across all levels of volatility.
5.3 Asymptotic approximations
In actuarial applications it is often merely the tail of the distribution function that is of interest. Indeed, one may think of Value-at-Risk, Conditional Tail Expectation or Expected Shortfall estimations. Therefore, approximations for functionals of (the d.f. of) sums of dependent random variables may alternatively be obtained through the use of asymptotic relations. Although asymptotic results are valid at infinity, they may as well serve as approximations near infinity.

This section establishes some asymptotic results for the tail probabilities related to a sum of heavy-tailed dependent random variables. In particular, we establish an asymptotic result for the randomly weighted sum of a sequence of non-negative numbers. Furthermore, we establish, under two different sets of conditions, an asymptotic result for the randomly weighted sum of a sequence of independent random variables that consist of a random and a deterministic component. Throughout, the random weights are products of i.i.d. random variables and thus exhibit an explicit dependence structure. Next, we present an application that demonstrates how the derived asymptotic results can be employed to approximate certain functionals of (the d.f. of) sums of dependent random variables. To explore the quality of the asymptotic approximations, we also provide a numerical illustration that compares the asymptotic approximation values to Monte Carlo simulated values.
5.3.1 Preliminaries for heavy-tailed distributions
First we introduce some notational conventions. For a random variable X with a distribution function F, we denote its tail probability by \overline{F}(x) = 1 - F(x) = \Pr[X > x]. For two independent r.v.'s X and Y with d.f.'s F and G supported on (−∞, +∞), we write

    F * G(x) = \int_{-\infty}^{+\infty} F(x - t)\, G(dt), \qquad -\infty < x < +\infty,

for the convolution of F and G. We denote by F^{*n} = F * \cdots * F the n-fold convolution of F, and we write F ⊗ G for the d.f. of XY.
Throughout, unless otherwise stated, all limit relations are for x → +∞. Let a(x) ≥ 0 and b(x) > 0 be two functions satisfying

    l^- \le \liminf_{x \to +\infty} \frac{a(x)}{b(x)} \le \limsup_{x \to +\infty} \frac{a(x)}{b(x)} \le l^+.

We write a(x) = O(b(x)) if l⁺ < +∞, a(x) = o(b(x)) if l⁺ = 0 and a(x) ≍ b(x) if both l⁺ < +∞ and l⁻ > 0. We write a(x) ≲ b(x) if l⁺ = 1, a(x) ≳ b(x) if l⁻ = 1 and a(x) ∼ b(x) if both l⁺ = 1 and l⁻ = 1. We say that a(x) and b(x) are weakly equivalent if a(x) ≍ b(x), and say that a(x) and b(x) are (strongly) equivalent if a(x) ∼ b(x).
A r.v. X or its d.f. F is said to be heavy-tailed if E[e^{γX}] = +∞ for any γ > 0. Below we introduce some important classes of heavy-tailed distributions. A d.f. F supported on (0, +∞) belongs to the subexponential class S if

    \lim_{x \to +\infty} \overline{F^{*n}}(x) / \overline{F}(x) = n    (5.16)

for any (or equivalently, for some) n ≥ 2. More generally, a d.f. F supported on (−∞, +∞) belongs to the class S if \tilde{F}(x) = F(x) I_{(x>0)} does. A d.f. F supported on (−∞, +∞) belongs to the long-tailed class L if for any real number y (or equivalently, for y = 1) we have that

    \lim_{x \to +\infty} \overline{F}(x + y) / \overline{F}(x) = 1.    (5.17)
A class of heavy-tailed distributions that is closely related to the classes S and L is the class D of d.f.'s with dominatedly varying tails. A d.f. F supported on (−∞, +∞) belongs to the class D if its tail \overline{F} is of dominated variation in the sense that

    \limsup_{x \to +\infty} \frac{\overline{F}(xy)}{\overline{F}(x)} < +\infty    (5.18)

for any 0 < y < 1 (or equivalently, for some 0 < y < 1). It is well known that

    D ∩ L ⊂ S ⊂ L.

See e.g. Embrechts et al. (1997). We remark that the intersection D ∩ L contains many useful heavy-tailed distributions. In particular, the intersection D ∩ L covers the class R, which consists of all d.f.'s with regularly varying tails. A d.f. F supported on (−∞, +∞) has a regularly varying tail if there is some α > 0 such that the relation

    \lim_{x \to +\infty} \frac{\overline{F}(xy)}{\overline{F}(x)} = y^{-\alpha}

holds true for any y > 0. We denote F ∈ R_{−α}.
In addition to the classes of heavy-tailed distributions introduced above, we introduce the class R_{−∞} of d.f.'s with rapidly varying tails, containing both heavy-tailed and light-tailed distributions. For a d.f. F supported on (−∞, +∞) satisfying \overline{F}(x) > 0 for any x > 0, F belongs to the class R_{−∞} if

    \lim_{x \to +\infty} \frac{\overline{F}(xy)}{\overline{F}(x)} = \begin{cases} 0, & \text{for any } y > 1; \\ +\infty, & \text{for any } 0 < y < 1. \end{cases}    (5.19)

We remark that the intersection S ∩ R_{−∞} contains e.g. lognormal distributions and certain Weibull distributions, which are prominent distributions in actuarial applications.

For an elaboration on the classes of heavy-tailed distributions and the class of rapidly varying tailed distributions, and their applications in insurance and finance, the interested reader is referred to Bingham et al. (1987), Embrechts et al. (1997) and Beirlant et al. (2004).
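As a quick numerical illustration of the class R (our own sketch, not part of the original text), the tail ratio of a Pareto-type d.f. indeed approaches y^{−α} for large x:

```python
def pareto_tail(x, alpha=1.5, beta=1.0):
    """Tail probability F̄(x) = (1 + x/beta)^(-alpha) of a Pareto-type d.f."""
    return (1.0 + x / beta) ** (-alpha)

# F̄(xy)/F̄(x) -> y^(-alpha) as x -> +infinity, i.e. F belongs to R_{-alpha}.
x, y, alpha = 1e8, 2.0, 1.5
ratio = pareto_tail(x * y, alpha) / pareto_tail(x, alpha)
print(ratio, y ** -alpha)  # the two numbers agree to many decimal places
```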
In Table 5.5 we list some well-known distributions and their corresponding distribution class.
5.3.2 Asymptotic results
In this subsection, we derive some asymptotic results for the tail probability of sums of dependent r.v.’s, in the presence of heavy-tailedness. In
the following, we let {Xn , n = 1, 2, . . .} and {Yn , n = 1, 2, . . .} denote two
sequences of i.i.d. r.v.’s that are mutually independent. We write by FX
the d.f. of a r.v. X of which Xn , n = 1, 2, . . ., are considered to be independent replicates, and assume it is supported on (−∞, +∞). Similarly,
we write by FY the d.f. of a r.v. Y of which Yn , n = 1, 2, . . ., are considered
to be independent replicates, and assume it is supported on (0, +∞). For
notational convenience, we will use the device of independent replicates
throughout.
Name                 d.f. or density f                                              Parameters              Class
Lognormal            f(x) = (1/(√(2π) σx)) e^{−½((log x − μ)/σ)²}                   (μ ∈ R, σ > 0)          R_{−∞} ∩ S
Weibull              F(x) = 1 − e^{−c x^β}                                          (c > 0, 0 < β < 1)      R_{−∞} ∩ S
Benktander-I         F(x) = 1 − c x^{−α−1} e^{−β(log x)²} (α + 2β log x)            (c, α, β > 0)           R_{−∞} ∩ S
Benktander-II        F(x) = 1 − cα x^{−(1−β)} exp{−(α/β) x^β}                       (c, α > 0, 0 < β < 1)   R_{−∞} ∩ S
Pareto               F(x) = 1 − (x/β)^{−α}                                          (α, β > 0)              R
Burr                 F(x) = 1 − (1 + x^τ/β)^{−α}                                    (α, β, τ > 0)           R
Loggamma             f(x) = (β^α/Γ(α)) (log x)^{α−1} x^{−β−1}                       (α, β > 0)              R
Transformed β        f(x) = (|a|/B(p,q)) x^{ap−1} (1 + x^a)^{−(p+q)}                (a ∈ R, p, q > 0)       R
Truncated α-stable   F(x) = Pr[|X| ≤ x], X ∼ α-stable                               (1 < α < 2)             R

Table 5.5: Some well-known distributions and their distribution class.
We state the following theorem:

Theorem 12.
Let Z_i = Y_1 Y_2 ⋯ Y_i and 0 < a_i < +∞, i = 1, 2, .... If F_Y ∈ S ∩ R_{−∞}, then it holds for each n = 1, 2, ... and x → +∞ that

    \Pr\left[ \sum_{i=1}^{n} a_i Z_i > x \right] \sim \sum_{i=1}^{n} \Pr[a_i Z_i > x].    (5.20)

Proof. See Section 5.6.
In an actuarial context the sequence {a_i, i = 1, 2, ...} can be regarded as a sequence of deterministic payments. The following theorem applies to the case in which the payments consist of both a deterministic and a random component, and the deterministic component is either an additive or a multiplicative constant. The theorem is an extension of Theorems 5.1 and 5.2 of Tang & Tsitsiashvili (2003):

Theorem 13.
Let Z_i = Y_1 Y_2 ⋯ Y_i and 0 < a_i < +∞, i = 1, 2, .... If the following conditions are valid:

1. F_X ∈ D ∩ L,
2. F_Y ∈ R_{−∞},

then it holds for each n = 1, 2, ... and x → +∞ that

    \Pr\left[ \sum_{i=1}^{n} (a_i + X_i) Z_i > x \right] \sim \sum_{i=1}^{n} \Pr[(a_i + X) Z_i > x].    (5.21)

Furthermore, it holds for each n = 1, 2, ... and x → +∞ that

    \Pr\left[ \sum_{i=1}^{n} (a_i X_i) Z_i > x \right] \sim \sum_{i=1}^{n} \Pr[(a_i X) Z_i > x].    (5.22)

Proof. See Section 5.6.
Corollary 3.
Under the conditions stated in Theorem 13, we have for each n = 1, 2, ... and x → +∞ that

    \Pr\left[ \sum_{i=1}^{n} (a_i + X_i) Z_i > x \right] - \Pr\left[ \sum_{i=1}^{n-1} (a_i + X_i) Z_i > x \right] \sim \Pr[(a_n + X) Z_n > x].    (5.23)

Furthermore, it holds for each n = 1, 2, ... and x → +∞ that

    \Pr\left[ \sum_{i=1}^{n} (a_i X_i) Z_i > x \right] - \Pr\left[ \sum_{i=1}^{n-1} (a_i X_i) Z_i > x \right] \sim \Pr[a_n X Z_n > x].    (5.24)

Proof. See Section 5.6.
Corollary 4.
If condition 1. stated in Theorem 13 is replaced by "F_X ∈ R_{−α}", while the other conditions remain the same, then it holds for each n = 1, 2, ... and x → +∞ that

    \Pr\left[ \sum_{i=1}^{n} (a_i + X_i) Z_i > x \right] \sim \sum_{i=1}^{n} \overline{F}_X(x - a_i)\, (E[Y^\alpha])^i    (5.25)

and

    \Pr\left[ \sum_{i=1}^{n} (a_i X_i) Z_i > x \right] \sim \overline{F}_X(x) \sum_{i=1}^{n} a_i^\alpha (E[Y^\alpha])^i.    (5.26)

Proof. See Section 5.6.
We remark that the particular case of lognormally distributed payments is not covered by Theorem 13, since the lognormal distribution does not belong to the intersection D ∩ L. The lognormal distribution has a moderately heavy tail and has been a popular model for loss severity distributions. Hence, we state the following theorem:

Theorem 14.
Relations (5.21), (5.22), (5.23) and (5.24) remain valid if conditions 1. and 2. stated in Theorem 13 are replaced by

1'. X ∼ logN(μ_X, σ_X²), −∞ < μ_X < +∞ and σ_X > 0,
2'. Y ∼ logN(μ_Y, σ_Y²), −∞ < μ_Y < +∞ and σ_Y > 0,
3'. σ_X > σ_Y.

Proof. See Section 5.6.
5.3.3 Application: discounted loss reserves
In this subsection, we consider the problem of determining stop-loss premiums and quantiles for discounted loss reserves. We denote by the r.v. X_i from the i.i.d. sequence {X_i, i = 1, ..., n} the net loss in year i. Furthermore, the positive r.v. Y_i from the i.i.d. sequence {Y_i, i = 1, ..., n} represents the present value discounting factor from year i to year i − 1. The two sequences {X_i, i = 1, ..., n} and {Y_i, i = 1, ..., n} are considered to be mutually independent. Then the discounted loss reserve S̃ is given by

    \tilde{S} = \sum_{i=1}^{n} X_i \prod_{j=1}^{i} Y_j.    (5.27)

Henceforth, we impose that E[\tilde{S} I_{(\tilde{S} > 0)}] < +∞, which is implied by the conditions E[X I_{(X > 0)}] < +∞ and E[Y] < +∞.
Approximate values for the stop-loss premium and quantiles of the discounted loss reserve S̃ may be obtained by using the previously obtained asymptotic results. In particular, if X and Y satisfy the corresponding conditions under which Theorem 13 or Theorem 14 holds, then for sufficiently large values of the retention d, the stop-loss premium can be approximated by

    \pi(\tilde{S}, d) \approx \sum_{i=1}^{n} \int_d^{+\infty} \overline{F}_{X \prod_{j=1}^{i} Y_j}(s)\, ds = \sum_{i=1}^{n} \pi\left( X \prod_{j=1}^{i} Y_j,\, d \right).    (5.28)

Since the d.f. of X \prod_{j=1}^{i} Y_j will generally not be analytically tractable, Monte Carlo simulation may still be required. However, the number of simulations has been reduced considerably.
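A minimal Monte Carlo sketch of (5.27) (our own illustration, using the GPD claims and lognormal discounting factors of the numerical example below) shows how little such a simulation takes; the GPD is sampled by inverting its d.f.:

```python
import random

def simulate_reserve(n=3, n_sims=50_000, alpha=1.5, beta=1.0,
                     mu=-0.04, sigma=0.10, seed=1):
    """Monte Carlo draws of S~ = sum_i X_i * Y_1 ... Y_i, cf. (5.27),
    with X_i ~ GPD(alpha, beta) and Y_j ~ logN(mu, sigma^2)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_sims):
        s, z = 0.0, 1.0
        for _i in range(n):
            z *= rng.lognormvariate(mu, sigma)  # cumulated discount product Z_i
            # GPD inverse cdf: x = beta * ((1 - u)^(-1/alpha) - 1)
            x = beta * ((1.0 - rng.random()) ** (-1.0 / alpha) - 1.0)
            s += x * z
        draws.append(s)
    return sorted(draws)

draws = simulate_reserve()
q95 = draws[int(0.95 * len(draws))]  # empirical 95% quantile of S~
print(q95)
```

Empirical quantiles and stop-loss premiums follow directly from the sorted draws.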
In case F_X ∈ R_{−α}, 0 < α < +∞, and F_Y ∈ R_{−∞}, the asymptotic approximations for the stop-loss premium of S̃ reduce to

    \pi(\tilde{S}, d) \approx \int_d^{+\infty} \sum_{i=1}^{n} (E[Y^\alpha])^i\, \overline{F}_X(s)\, ds = \sum_{i=1}^{n} (E[Y^\alpha])^i\, \pi(X, d).    (5.29)
Furthermore, in this case we have for sufficiently large values of p that the asymptotic approximation for the p-quantile is given by

    F_{\tilde{S}}^{-1}(p) \approx \inf\left\{ s : \sum_{i=1}^{n} (E[Y^\alpha])^i\, \overline{F}_X(s) \le 1 - p \right\}.    (5.30)

Under the conditions of Theorem 14, we have for sufficiently large values of p that the asymptotic approximation for the p-quantile is given by

    F_{\tilde{S}}^{-1}(p) \approx \inf\left\{ s : \sum_{i=1}^{n} \overline{F}_{X \prod_{j=1}^{i} Y_j}(s) \le 1 - p \right\}.    (5.31)

We emphasize that the approximation (5.31) is not in general valid under the conditions of Theorem 13; it requires the additional condition that F_X ∈ R_{−α}, 0 < α < +∞.
As an example, we consider X_i ∼ GPD(α, β) and Y_i ∼ logN(μ, σ²), i = 1, ..., n, in which GPD(α, β) denotes the generalized Pareto distribution with d.f.

    F_X(x) = 1 - \left( 1 + \frac{x}{\beta} \right)^{-\alpha}, \qquad x > 0,

where α > 0 and β > 0. Then clearly F_X ∈ R_{−α} and F_Y ∈ R_{−∞}, so the asymptotic approximations (5.29) and (5.30) are valid. Notice that for the example considered, the asymptotic approximations can even be computed analytically. We performed 5 000 000 Monte Carlo (MC) simulations for quantiles and stop-loss premiums to assess the quality of the asymptotic approximations (5.29) and (5.30), under various specifications of the parameter n. We fix the parameter values α = 1.5, β = 1, μ = −0.04 and σ = 0.10. The results are presented in Table 5.6. Ndiff. refers to the normalized difference defined as (MC − Appr.)/MC × 100%. Our numerical results demonstrate that the asymptotic approximations are typically close to the Monte Carlo value.
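For these parameter choices the approximations (5.29) and (5.30) have closed forms: for lognormal Y one has E[Y^α] = exp(αμ + α²σ²/2), for the GPD with α > 1 one has π(X, d) = β(1 + d/β)^{1−α}/(α − 1), and (5.30) can be inverted analytically. The sketch below is our own code, not the implementation behind Table 5.6:

```python
import math

def asymptotic_stop_loss(d, n=3, alpha=1.5, beta=1.0, mu=-0.04, sigma=0.10):
    """Approximation (5.29) with GPD claims and lognormal discount factors."""
    ey_a = math.exp(alpha * mu + 0.5 * alpha ** 2 * sigma ** 2)  # E[Y^alpha]
    weight = sum(ey_a ** i for i in range(1, n + 1))
    pi_x = beta / (alpha - 1.0) * (1.0 + d / beta) ** (1.0 - alpha)  # pi(X, d)
    return weight * pi_x

def asymptotic_quantile(p, n=3, alpha=1.5, beta=1.0, mu=-0.04, sigma=0.10):
    """Approximation (5.30), solved analytically for the GPD tail."""
    ey_a = math.exp(alpha * mu + 0.5 * alpha ** 2 * sigma ** 2)
    weight = sum(ey_a ** i for i in range(1, n + 1))
    return beta * ((weight / (1.0 - p)) ** (1.0 / alpha) - 1.0)

print(round(asymptotic_stop_loss(15), 2))  # matches the 1.36 entry of Table 5.6
print(asymptotic_quantile(0.95))           # close to the tabulated value for p = 0.95
```

For d = 15 and n = 3 this reproduces the tabulated approximation 1.36; the p = 0.95 quantile comes out near 13.4, close to the tabulated value.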
n = 3
d      MC      Appr.   Ndiff.        p        MC     Appr.   Ndiff.
15     1.50    1.36    9%            0.95     16     14      15%
20     1.28    1.19    7%            0.975    25     22      11%
25     1.14    1.07    6%            0.99     44     41      7%
30     1.03    0.98    5%            0.995    69     66      4%
35     0.95    0.91    4%            0.999    198    194     2%
40     0.88    0.85    4%
50     0.78    0.76    3%
60     0.71    0.70    2%
80     0.61    0.61    1%
100    0.55    0.54    1%
150    0.44    0.44    0%
200    0.38    0.38    0%

n = 5
d      MC      Appr.   Ndiff.        p        MC     Appr.   Ndiff.
20     2.22    1.89    15%           0.95     24     19      22%
30     1.75    1.56    11%           0.975    36     30      17%
40     1.48    1.35    9%            0.99     63     57      10%
60     1.18    1.11    6%            0.995    96     90      6%
80     1.01    0.96    5%            0.999    274    265     3%
100    0.90    0.86    4%
150    0.72    0.70    3%
200    0.62    0.61    2%
250    0.56    0.55    2%
300    0.51    0.50    2%

n = 10
d      MC      Appr.   Ndiff.        p        MC     Appr.   Ndiff.
40     2.91    2.41    17%           0.95     40     28      30%
60     2.22    1.98    11%           0.975    58     45      23%
80     1.86    1.72    7%            0.99     98     84      14%
100    1.62    1.54    5%            0.995    148    133     10%
150    1.28    1.26    2%            0.999    402    390     3%
200    1.09    1.09    0%
300    0.87    0.88    -1%
400    0.74    0.75    -1%

Table 5.6: Approximations for stop-loss premiums and quantiles of S̃ for Pareto claim sizes and lognormal present value discounting factors.
5.4 The Bayesian approach
Some comments on notation are needed at this point. First, p(·|·) denotes a conditional probability density with the arguments determined by the context, and similarly for p(·), which denotes a marginal distribution. The same notation is used for continuous density functions and discrete probability mass functions.
5.4.1 Introduction
Bayesian theory is a powerful branch of statistics not yet fully explored by practising actuaries. One of its main benefits, which is the core of its philosophy, is the ability to include subjective information in a formal framework. Apart from this, the wide range of models offered by this branch of statistics is also one of the main reasons why it has received so much attention recently.
Since the early 1990s, statistics (and to a lesser extent econometrics) has seen an explosion in applied Bayesian research. This explosion has had little to do with a renewed interest of the statistics and econometrics communities in the theoretical foundations of Bayesianism, or with a sudden awakening to the merits of the Bayesian approach over frequentist methods; instead it can primarily be explained on pragmatic grounds. The recent developments are mainly due, firstly, to advances in computing that have made it easier to perform calculations by simulation and, secondly, to the failure of classical statistical methods to give solutions to many problems. Indeed, the use of such tools often enables researchers to estimate complicated statistical models that would be quite difficult, if not virtually impossible, to handle using standard frequentist techniques. But although so many developments have been occurring in Bayesian statistics, very few actuaries are aware of them and even fewer make use of them. The purpose of this section is to sketch, in very broad terms, basic elements of Bayesian computation.
Classical statistics provides methods to analyze data, from simple descriptive measures to complex and sophisticated models. The available data are processed and conclusions are then drawn about a hypothetical population, of which the available data are supposed to be a representative sample. It is not hard to imagine situations, however, in which data are not the only available source of information about the population. Bayesian
methods provide a principled way to incorporate this external information into the data analysis process. To do so, however, Bayesian methods change the vision of the data analysis process entirely with respect to the classical approach. In a Bayesian approach, the data analysis process starts with a given probability distribution. As this distribution is given before any data are considered, it is called the prior distribution.

Bayesian methods allow us to assign prior distributions to the parameters in the model which capture known qualitative and quantitative features, and then to update these priors in the light of the data, yielding a posterior distribution via Bayes' theorem

Posterior ∝ Likelihood × Prior,

where ∝ denotes that two quantities are proportional to each other. Hence the posterior distribution is found by combining the prior distribution for the parameters with the probability of observing the data given the parameters (the likelihood). The ability to include prior information in the model is not only an attractive pragmatic feature of the Bayesian approach; it is theoretically vital for guaranteeing coherent inferences.
More formally, Bayes' theorem is defined as follows. Consider a process in which observations $\vec{Y}$ (the vector of observations) are to be taken from a distribution with probability density function $p(\vec{Y} \mid \vec{\theta})$, where $\vec{\theta}$ is a set of unknown parameters. Before any observation is made, the analyst would include all his previous information and judgements about $\vec{\theta}$ in a prior distribution $p(\vec{\theta})$, which would be combined with the observations to give a posterior distribution $p(\vec{\theta} \mid \vec{Y})$ in the following way:

$$p(\vec{\theta} \mid \vec{Y}) \propto p(\vec{Y} \mid \vec{\theta})\, p(\vec{\theta}).$$
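This proportionality can be checked numerically in a few lines. A minimal sketch, using a hypothetical beta-binomial example (none of these numbers come from the text): the grid posterior obtained by multiplying prior and likelihood pointwise agrees with the known conjugate answer.

```python
import numpy as np

# Hypothetical data: 7 successes in n = 20 Bernoulli(theta) trials
n, successes = 20, 7

# Beta(a, b) prior; with a binomial likelihood the exact posterior is
# Beta(a + successes, b + n - successes), so the posterior mean is
# (a + successes) / (a + b + n) = 9 / 24 = 0.375
a, b = 2.0, 2.0

theta = np.linspace(0.0, 1.0, 10_001)
step = theta[1] - theta[0]
prior = theta**(a - 1) * (1 - theta)**(b - 1)
likelihood = theta**successes * (1 - theta)**(n - successes)

posterior = prior * likelihood          # Posterior ∝ Likelihood × Prior
posterior /= posterior.sum() * step     # normalize numerically

post_mean = (theta * posterior).sum() * step
print(round(post_mean, 3))
```

The normalizing constant never has to be computed analytically; dividing by the numerical integral is enough, which is exactly what makes the proportional form of Bayes' theorem so convenient.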
Bayesian modelling involves integrals over the parameters, whereas non-Bayesian methods often rely on optimization of the parameters. The main difference between these methods is that optimization fails to take into account the inherent uncertainty in the parameters. There is no single true value for each of the parameters that can be found by optimization; instead, there is a range of plausible values, each with some associated density.

The mechanics of the Bayesian approach to model fitting and inference consist of three basic steps:
1. Assign priors to all the unknown parameters;

2. Write down the likelihood of the data given the parameters;

3. Determine the posterior distribution of the parameters given the data using Bayes' theorem.
Bayesian inference is quite simple to describe probabilistically, but there have been two major obstacles to its popularity. The first is how to specify prior distributions, and the second is how to evaluate the integrals required for inference, given that for most models these are analytically intractable. Both issues are discussed briefly in the next two subsections.
5.4.2 Prior choice
The prior distribution can arise from data previously observed, or it can be the subjective assessment of some domain expert; as such, it represents the information we have about the problem at hand that is not conveyed by the sample data.

Several methods for eliciting prior densities from experts exist; see, e.g., O'Hagan (1994) for a comprehensive review. A common approach is to choose a prior distribution with density function similar to the likelihood function. In doing so, the posterior distribution of $\vec{\theta}$ will be in the same class, and the prior is said to be conjugate to the likelihood. The conjugate family is mathematically convenient in that the posterior distribution follows a known parametric form. Of course, if information is available that contradicts the conjugate parametric family, it may be necessary to use a more realistic, if inconvenient, prior distribution. The basic justification for the use of conjugate prior distributions is similar to that for using standard models for the likelihood: it is easy to understand the corresponding results, which can often be put in analytic form. Next, they are often a good approximation, and they simplify computations. Although they can make interpretations of posterior inferences less transparent and computation more difficult, non-conjugate prior distributions do not pose any new conceptual problem. In practice, for complicated models, conjugate prior distributions may not even be possible. In general, the exponential families are the only classes of distributions that have natural conjugate distributions, since, apart from certain irregular cases, the only distributions having a fixed number of sufficient statistics are of the exponential type.
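A small hypothetical illustration of conjugacy (the numbers are invented for the example): for Poisson counts with a Gamma prior on the mean, the posterior is again Gamma, and a brute-force grid evaluation of likelihood times prior reproduces the closed-form update.

```python
import numpy as np

# Hypothetical yearly claim counts, modelled as Poisson(lam)
y = np.array([4, 2, 5, 3, 6])

# Conjugate Gamma(a0, b0) prior on lam (shape a0, rate b0)
a0, b0 = 2.0, 1.0

# Conjugate update: the posterior is Gamma(a0 + sum(y), b0 + n)
a_post, b_post = a0 + y.sum(), b0 + len(y)
print("closed-form posterior mean:", a_post / b_post)

# Brute-force check: evaluate likelihood x prior on a grid and normalize
lam = np.linspace(0.01, 15.0, 2_000)
step = lam[1] - lam[0]
log_post = (a0 - 1 + y.sum()) * np.log(lam) - (b0 + len(y)) * lam
post = np.exp(log_post - log_post.max())
post /= post.sum() * step
grid_mean = (lam * post).sum() * step
print("grid posterior mean:       ", round(grid_mean, 3))
```

The "known parametric form" of the conjugate posterior is what makes the first line of the update a one-liner; the grid computation is only there to confirm it.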
Kass and Wasserman (1996) survey formal rules that have been suggested for choosing a prior. Many of these rules reflect the desire to let the "data speak for themselves", so that inferences are unaffected by information external to the current data. This has led to a variety of priors with names like conventional, default, diffuse, flat, formal, generic, indifference, neutral, non-informative, objective, reference, and vague priors. Prior distributions playing a minimal role in the posterior distribution are called 'reference prior distributions'. One interpretation of letting the data speak for themselves is to use classical techniques. Maximum likelihood estimates can be rationalized in a Bayesian framework by an appropriate choice of prior distribution, specifically a uniform prior.

There are many ways of defining a non-informative prior. The main objective is to give as little subjective information as possible, so usually a prior distribution with a large value for the variance is used. Another way of including minimal prior information is to estimate the parameters of the prior distribution from the data. This last approach is called the empirical Bayes method, and there is often a relationship between the two approaches, non-informative and empirical Bayes.
A commonly used reference prior in Bayesian analysis is Jeffreys' prior (see Jeffreys (1946)). This choice is based on considering one-to-one transformations of the parameter, $h(\vec{\theta})$. Jeffreys' general principle is that any rule for determining the prior density $p(\vec{\theta})$ should yield an equivalent result if applied to the transformed parameter. This non-informative prior is obtained by applying Jeffreys' rule, which is to take the prior density to be proportional to the square root of the determinant of the Fisher information matrix. This prior exhibits many nice features that make it an attractive reference prior. One such property is parametrization invariance. Although Jeffreys' rule has many desirable properties, it should be used with caution.
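As a standard textbook illustration (not worked out in the thesis), Jeffreys' rule applied to a single Bernoulli observation gives a proper Beta(1/2, 1/2) prior:

```latex
p(y \mid \theta) = \theta^{y}(1-\theta)^{1-y}, \qquad y \in \{0,1\},
\qquad
I(\theta) = E\!\left[\left(\frac{\partial}{\partial \theta}
   \log p(y \mid \theta)\right)^{\!2}\right] = \frac{1}{\theta(1-\theta)},
\qquad
p(\theta) \propto \sqrt{I(\theta)} = \theta^{-1/2}(1-\theta)^{-1/2}.
```

For a normal mean, by contrast, the same rule yields the flat prior $p(\mu) \propto 1$, which is improper.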
In most cases, Jeffreys’ prior is technically not a probability distribution, since the density function does not have a finite integral over the
parameter space. It is then termed an improper prior. It is often the case
that Bayesian inference based on improper priors returns proper posterior
distributions which then turn out to be numerically equivalent to the results of classical inference. Problems related to the use of improper prior
distributions can be overcome by assigning prior distributions that are as
uniform as possible but still remain probability distributions. The use of
uniform prior distributions to represent uncertainty clearly assumes that
“equally probable” is an adequate representation of “lack of information”.
Theoretically, a prior distribution could be specified for all the unknown parameters in a model, so that any model could be represented in a Bayesian way. However, this often leads to intractable problems (mainly integrals without analytic solution). So the main limitation of Bayesian theory is the difficulty, and in many cases the impossibility, of solving the required equations analytically.
In the last decade many simulation techniques have been developed in order to solve this problem and to obtain estimates of the posterior distribution. These techniques were turning points for Bayesian theory, making it possible to apply many of its models. On the one hand, the use of a final, closed-form solution is, generally speaking, more satisfactory than the use of an approximation through simulation. On the other hand, simulation yields a larger range of models for which solutions (or at least good approximations) can be obtained.
5.4.3 Iterative simulation methods
In order to illustrate the simulation philosophy, suppose that the posterior of a specific parameter $\vec{\theta}$ is needed. If an analytical solution were available, a formula would be derived in which the observed data and known parameters are plugged in, giving a final result. Depending on the model, however, this solution will not always be possible. In such cases an approximation for the posterior distribution of $\vec{\theta}$ is needed. One way of finding this approximation is by simulation, which substitutes the posterior distribution by a large sample of $\vec{\theta}$ based on the characteristics of the model. With this large sample of $\vec{\theta}$ many summary statistics can be calculated, like the mean, variance or histogram, extracting all the information needed from this sample of the posterior distribution.
There are a number of ways of simulating, and in all of them some checking should be carried out to guarantee that the simulated set is really representative of the required distribution. For instance, it must be checked whether the simulation is mixing well or, in other words, whether the simulation procedure visits all the possible values of $\vec{\theta}$. It should also be considered how large the sample should be, and whether the initial point where the simulation starts plays a big role. Among many other issues, the moment when convergence to the true distribution of $\vec{\theta}$ is achieved should also be monitored.
The most popular type of simulation in Bayesian theory is the class of Markov Chain Monte Carlo (MCMC) methods. This class of simulation models has been used in a large number and wide range of applications, and has been found to be very powerful. The essence of the MCMC method is that by sampling from specific simple distributions (derived from the combination of the likelihood and prior distributions), a sample from the posterior distribution is obtained asymptotically.

Iterative simulation methods, particularly the Gibbs sampler and the Metropolis-Hastings algorithm, are powerful statistical tools that facilitate computation in a variety of complex models. Though these two algorithms are commonly presented as useful yet distinct instruments for simulating joint posteriors, this distinction is rather artificial: indeed, one can regard the Gibbs sampler as a special case of the Metropolis-Hastings algorithm in which jumps along the complete conditional distributions are accepted with probability one. In conditionally conjugate models, the Gibbs sampler is typically the algorithm of choice (since the complete posterior conditionals are easily sampled).
The general strategy with iterative methods is to follow the steps of the algorithms to generate a series of draws (sometimes called a parameter chain), say $\vec{\theta}^{(0)}, \vec{\theta}^{(1)}, \vec{\theta}^{(2)}, \ldots$, that converge in distribution to some target density, in our case the posterior $f(\vec{\theta} \mid \vec{Y})$. The algorithms are constructed so that the posterior $f(\vec{\theta} \mid \vec{Y})$ is the unique stationary distribution of the parameter chain. Once convergence to the target density is "achieved", we can use these draws in the same way as with direct Monte Carlo integration to calculate posterior means, posterior standard deviations, and so on. In practice, we take care to diagnose that the parameter chain has approached convergence to the target density, to discard the initial set of pre-convergence draws (often called the burn-in period), and then to use the post-convergence sample to calculate the desired quantities. Unlike the non-iterative methods discussed previously, the post-convergence draws we obtain using these iterative methods will prove to be correlated, as the distribution of, say, $\vec{\theta}^{(t)}$ depends on the last parameter sampled in the chain, $\vec{\theta}^{(t-1)}$. If the correlation among the draws is severe, it may prove difficult to traverse the entire parameter space, and the numerical standard errors associated with the point estimates can be quite large. When the simulations are highly correlated, and our chain makes only small local movements from iteration to iteration, we refer to this as slow mixing of the parameter chain.
One can find an excellent overview and a detailed discussion of examples of MCMC algorithms in, for example, Gilks et al. (1996). Here we will describe Gibbs Sampling (GS), a special case of Metropolis-Hastings algorithms, which is becoming increasingly popular in the statistical community. GS is an iterative method that produces a Markov chain, that is, a sequence of values $\{\vec{\theta}^{(0)}, \vec{\theta}^{(1)}, \vec{\theta}^{(2)}, \ldots\}$ such that $\vec{\theta}^{(i+1)}$ is sampled from a distribution that depends on the current state $i$ of the chain. The algorithm works as follows.
Let $\vec{\theta}^{(0)} = (\theta_1^{(0)}, \ldots, \theta_k^{(0)})$ be a vector of initial values of $\vec{\theta}$ and suppose that the conditional distributions of $\theta_i \mid (\theta_1, \ldots, \theta_{i-1}, \theta_{i+1}, \ldots, \theta_k, \vec{Y})$ are known for each $i$. The first value in the chain is simulated as follows:

$\theta_1^{(1)}$ is sampled from the conditional distribution of $\theta_1 \mid (\theta_2^{(0)}, \ldots, \theta_k^{(0)}, \vec{Y})$;
$\theta_2^{(1)}$ is sampled from the conditional distribution of $\theta_2 \mid (\theta_1^{(1)}, \theta_3^{(0)}, \ldots, \theta_k^{(0)}, \vec{Y})$;
$\;\vdots$
$\theta_k^{(1)}$ is sampled from the conditional distribution of $\theta_k \mid (\theta_1^{(1)}, \theta_2^{(1)}, \ldots, \theta_{k-1}^{(1)}, \vec{Y})$.

Then $\vec{\theta}^{(0)}$ is replaced by $\vec{\theta}^{(1)}$ and the simulation is repeated to generate $\vec{\theta}^{(2)}$, and so forth. In general, the $i$-th value in the chain is generated by simulating from the distribution of $\vec{\theta}$ conditional on the value previously generated, $\vec{\theta}^{(i-1)}$. After an initial long chain, called burn-in, of say $b$ iterations, the values $\{\vec{\theta}^{(b+1)}, \vec{\theta}^{(b+2)}, \vec{\theta}^{(b+3)}, \ldots\}$ will be approximately a sample from the posterior distribution of $\vec{\theta}$, from which empirical estimates of the posterior means and any other function of the parameters can be computed. Critical issues for this method are the choice of the starting value $\vec{\theta}^{(0)}$, the length of the burn-in and the selection of a stopping rule. The program "WinBugs" provides an implementation of GS suitable for problems in which the likelihood function satisfies certain factorization properties.
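A minimal sketch of GS in Python for a toy model with two unknowns, the mean and variance of normal data (a hypothetical example with a flat prior on $\mu$ and $p(\sigma^2) \propto 1/\sigma^2$, chosen so that both complete conditionals are standard distributions):

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Toy data set playing the role of the observations Y
y = rng.normal(loc=5.0, scale=2.0, size=100)
n, ybar = len(y), y.mean()

n_iter, burn_in = 5_000, 1_000
mu, sigma2 = 0.0, 1.0                    # arbitrary starting value theta^(0)
draws = np.empty((n_iter, 2))

for it in range(n_iter):
    # step 1: mu | sigma2, y  ~  N(ybar, sigma2 / n)
    mu = rng.normal(ybar, np.sqrt(sigma2 / n))
    # step 2: sigma2 | mu, y  ~  Inv-Gamma(n/2, sum((y_i - mu)^2) / 2)
    sigma2 = 1.0 / rng.gamma(n / 2.0, 2.0 / np.sum((y - mu) ** 2))
    draws[it] = mu, sigma2

post = draws[burn_in:]                   # discard the burn-in period
print("posterior mean of mu:    ", round(post[:, 0].mean(), 2))
print("posterior mean of sigma2:", round(post[:, 1].mean(), 2))
```

Each sweep cycles through the complete conditionals exactly as in the algorithm above; the retained draws after the burn-in serve as the approximate posterior sample.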
5.4.4 Bayesian model set-up

In this subsection we explain how to set up the relevant Bayesian models and draw samples from posterior distributions for parameters $\vec{\theta}$ and future observables $\tilde{Y}$. We show how simple simulation methods can be used to draw samples from posterior and predictive distributions, automatically incorporating uncertainty in the model parameters, and draw samples for posterior predictive checks.
The simplest and most widely used version of this model is the normal linear model, in which the distribution of the response variable $\vec{Y}$ given the regression matrix $X$ is normal with mean a linear function of $X$:

$$E[Y_i \mid \vec{\beta}, X] = \beta_1 x_{i1} + \cdots + \beta_k x_{ik}, \qquad i = 1, \ldots, n.$$

We further restrict to the case of ordinary linear regression, in which the conditional variances are equal, $\mathrm{Var}[Y_i \mid \vec{\theta}, X] = \sigma^2$ for all $i$, and the observations are conditionally independent given $\vec{\theta}, X$. The parameter vector is then $\vec{\theta} = (\beta_1, \ldots, \beta_k, \sigma^2)$.

Under a standard non-informative prior distribution, the Bayesian estimates and standard errors coincide with the classical results. In the simplest case, called ordinary linear regression, the observation errors are independent and have equal variance. In vector notation,

$$\vec{Y} \mid \vec{\beta}, \sigma^2, X \sim N_n(X\vec{\beta}, \sigma^2 I),$$

where $I$ is the $n \times n$ identity matrix. In the normal regression model, a convenient non-informative prior distribution is uniform on $(\vec{\beta}, \log \sigma)$ or, equivalently,

$$p(\vec{\beta}, \sigma^2 \mid X) \propto \sigma^{-2}.$$

When there are many data points and only a few parameters, the non-informative prior distribution is useful: it gives acceptable results and takes less effort than specifying prior knowledge in probabilistic form. For a small sample size or a large number of parameters, the likelihood is less sharply peaked, and so prior distributions are more important.
We determine first the posterior distribution for $\vec{\beta}$, conditional on $\sigma^2$, and then the marginal posterior distribution for $\sigma^2$. That is, we factor the joint posterior distribution for $\vec{\beta}$ and $\sigma^2$ as $p(\vec{\beta}, \sigma^2 \mid \vec{Y}) = p(\vec{\beta} \mid \sigma^2, \vec{Y})\, p(\sigma^2 \mid \vec{Y})$.

1. Conditional posterior distribution of $\vec{\beta}$ given $\sigma^2$:

$$\vec{\beta} \mid \sigma^2, \vec{Y} \sim N\big(\hat{\vec{\beta}},\, V_{\vec{\beta}}\, \sigma^2\big),$$

with

$$\hat{\vec{\beta}} = (X'X)^{-1} X' \vec{Y} \qquad \text{and} \qquad V_{\vec{\beta}} = (X'X)^{-1}.$$

2. Marginal posterior distribution of $\sigma^2$:

$$\sigma^2 \mid \vec{Y} \sim \text{Inv-}\chi^2(n - k, s^2),$$

where

$$s^2 = \frac{1}{n-k}\, (\vec{Y} - X\hat{\vec{\beta}})' (\vec{Y} - X\hat{\vec{\beta}}).$$

The marginal posterior distribution of $\vec{\beta} \mid \vec{Y}$, averaging over $\sigma^2$, is multivariate $t$ with $n-k$ degrees of freedom, but we rarely use this fact in practice when drawing inferences by simulation, since to characterize the joint posterior distribution we can draw simulations of $\sigma^2$ and then $\vec{\beta} \mid \sigma^2$. The standard non-Bayesian estimates of $\vec{\beta}$ and $\sigma^2$ are $\hat{\vec{\beta}}$ and $s^2$, respectively, as just defined. The classical standard error estimate for $\vec{\beta}$ is obtained by setting $\sigma^2 = s^2$.

It is easy to draw samples from the posterior distribution: first compute $\hat{\vec{\beta}}$, $V_{\vec{\beta}}$ and $s^2$, then draw $\sigma^2$ from the scaled inverse-$\chi^2$ distribution and $\vec{\beta}$ from the multivariate normal distribution.
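A sketch of this two-step sampling in Python; the design matrix and response are simulated purely for illustration, nothing here comes from the thesis data:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Illustrative data: n observations, k = 3 regressors (including intercept)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([1.0, 2.0, -0.5])
Y = X @ beta_true + rng.normal(scale=0.3, size=n)

# Classical quantities: beta_hat = (X'X)^{-1} X'Y, V_beta = (X'X)^{-1}, s^2
V_beta = np.linalg.inv(X.T @ X)
beta_hat = V_beta @ (X.T @ Y)
resid = Y - X @ beta_hat
s2 = resid @ resid / (n - k)

# Posterior draws under p(beta, sigma^2) ∝ sigma^{-2}:
#   sigma^2 | Y ~ (n - k) s^2 / chi^2_{n-k}   (scaled inverse-chi^2)
#   beta | sigma^2, Y ~ N(beta_hat, V_beta sigma^2)
n_draws = 10_000
sigma2 = (n - k) * s2 / rng.chisquare(n - k, size=n_draws)
z = rng.multivariate_normal(np.zeros(k), V_beta, size=n_draws)
beta = beta_hat + z * np.sqrt(sigma2)[:, None]

print("posterior means of beta:", beta.mean(axis=0).round(2))
```

Because the factorization $p(\vec{\beta}, \sigma^2 \mid \vec{Y}) = p(\vec{\beta} \mid \sigma^2, \vec{Y})\,p(\sigma^2 \mid \vec{Y})$ is exact here, no MCMC is needed; the draws are independent samples from the joint posterior.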
The posterior predictive distribution of unobserved data, $p(\tilde{Y} \mid \vec{Y})$, has two components of uncertainty:

1. the fundamental variability of the model, represented by the variance $\sigma^2$ in $\vec{Y}$, and

2. the posterior uncertainty in $\vec{\beta}$ and $\sigma^2$ due to the finite sample size of $\vec{Y}$. As the sample size $n \to \infty$, the variance due to posterior uncertainty in $(\vec{\beta}, \sigma^2)$ decreases to zero, but the predictive uncertainty remains.
5.5 Applications in claims reserving

5.5.1 The comonotonicity approach versus the Bayesian approximations
In this subsection we apply a Bayesian model in the context of discounted loss reserves. The outcomes of this approach are compared with the comonotonic approximations for the distribution of the discounted loss reserve when the run-off triangle is modelled by a generalized linear model. We realize that the Bayesian posterior predictive distribution is a very general workhorse, which takes into account all sources of uncertainty in the model formulation and is applicable to different statistical domains, whereas the comonotonic approximations originate from a specific actuarial context. We want to illustrate, however, that the predictive distribution based on the comonotonic bounds provides results that are close to the results obtained via MCMC. The main advantage of the bounds is that several risk measures such as percentiles (VaRs), expected shortfalls (stop-loss premiums) and TailVaRs can be calculated easily from them.

As illustrated by Verrall (2004) (for GLIMs) and in earlier work by, for instance, de Alba (2002) (for lognormal models), Bayesian techniques are useful in this area as they provide the posterior predictive distribution of the reserve.
Bayesian methods for the analysis of GLIMs
We consider Bayesian methods for the analysis of generalized linear models, which provide a general framework for cases in which normality and linearity are not viable assumptions. These cases point out the major computational bottleneck of Bayesian methods: when the assumptions of normality and/or linearity are removed, usually the posterior distribution cannot be computed in closed form. We will discuss some computational methods to approximate this distribution.

Generalized linear models provide a unified framework to encompass several situations which are not adequately described by the assumptions of normality of the data and linearity in the parameters. As described in Chapter 4 (Section 4.3.3), the features of a GLIM are the fact that the distribution of $\vec{Y} \mid \vec{\theta}$ (where $\vec{\theta}$ denotes the parameter vector) belongs to the exponential family, and that a transformation of the expectation of the data, $g(\vec{\mu})$, is a linear function of the linear predictor $R\vec{\beta}$. The parameter vector is made up of $\vec{\beta}$ and of the dispersion parameter $\phi$.

Classical analyses of generalized linear models allow for the possibility of variation beyond that of the assumed sampling distribution, called overdispersion. A prior distribution can be placed on the dispersion parameter, and any prior information about $p(\vec{\beta}, \phi)$ can be described conditional on the dispersion parameter; that is, $p(\vec{\beta}, \phi) = p(\phi)\, p(\vec{\beta} \mid \phi)$.

The classical analysis of generalized linear models is obtained if a non-informative or flat prior distribution is assumed for $\vec{\beta}$. The posterior mode corresponding to a non-informative uniform prior density is the maximum likelihood estimate for the parameter $\vec{\beta}$, which can be obtained using iterative weighted linear regression.
The problem with a Bayesian analysis of GLIMs is that, in general, the posterior distribution of $\vec{\beta}$ cannot be calculated exactly, since the marginal density of the data

$$p(\vec{Y}) = \int p(\vec{Y} \mid \vec{\theta})\, p(\vec{\theta})\, d\vec{\theta} \tag{5.32}$$

cannot be evaluated in closed form.

Numerical integration techniques can be exploited to approximate (5.32), from which a numerical approximation of the posterior density of $\vec{\beta}$ can be found. When numerical integration techniques become infeasible, we are left with two main ways to perform approximate posterior analysis: (i) to provide an asymptotic approximation of the posterior distribution or (ii) to use stochastic methods to generate a sample from the posterior distribution.
When the sample size is large enough, posterior analysis can be based on an asymptotic approximation of the posterior distribution by a normal distribution with some mean and variance. This idea generalizes the asymptotic normal distribution of the maximum likelihood estimates when their exact sampling distribution cannot be derived or is too difficult to use. Asymptotic normality of the posterior distribution provides notable computational advantages, since marginal and conditional distributions are still normal, and hence inference on parameters of interest can easily be carried out. However, for relatively small samples, the assumption of asymptotic normality can be inaccurate. In that case, stochastic methods (or Monte Carlo methods) provide an approximate posterior analysis based on a sample of values generated from the posterior distribution; the task then reduces to generating such a sample from the posterior distribution of the parameters.
A numerical illustration
Consider now the run-off triangle in Table 5.7, taken from Taylor & Ashe
(1983) and used in various other publications on claims reserving.
These data are modelled using a gamma GLIM (see expression (4.51)
with κ = 2) with logarithmic link function.
 i      1         2          3          4          5        6        7        8        9       10
 1    357,848    766,940    610,542    482,940   527,326  574,398  146,342  139,950  227,299  67,948
 2    352,118    884,021    933,894  1,183,289   445,745  320,996  527,804  266,172  425,046
 3    290,507  1,001,799    926,219  1,016,654   750,816  146,923  495,992  280,405
 4    310,608  1,108,250    776,189  1,562,400   272,482  352,053  206,286
 5    443,160    693,190    991,983    769,488   504,851  470,639
 6    396,132    937,085    847,498    805,037   705,960
 7    440,832    847,631  1,131,398  1,063,269
 8    359,480  1,061,648  1,443,370
 9    376,686    986,608
10    344,014

Table 5.7: Run-off triangle with non-cumulative claim figures (rows: year of origin; columns: development year).
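The thesis fits a gamma GLIM to these data; as a quick classical benchmark, the deterministic chain-ladder method can be run on the same triangle in a few lines (this computation is a standard textbook exercise, not part of the thesis's analysis):

```python
import numpy as np

# Incremental claims from Table 5.7; NaN marks the unobserved lower triangle
data = [
    [357848, 766940, 610542, 482940, 527326, 574398, 146342, 139950, 227299, 67948],
    [352118, 884021, 933894, 1183289, 445745, 320996, 527804, 266172, 425046],
    [290507, 1001799, 926219, 1016654, 750816, 146923, 495992, 280405],
    [310608, 1108250, 776189, 1562400, 272482, 352053, 206286],
    [443160, 693190, 991983, 769488, 504851, 470639],
    [396132, 937085, 847498, 805037, 705960],
    [440832, 847631, 1131398, 1063269],
    [359480, 1061648, 1443370],
    [376686, 986608],
    [344014],
]
tri = np.full((10, 10), np.nan)
for i, row in enumerate(data):
    tri[i, :len(row)] = row

# Cumulative claims, keeping NaN below the diagonal
cum = np.cumsum(np.nan_to_num(tri), axis=1)
cum[np.isnan(tri)] = np.nan

# Chain-ladder development factors f_j = sum_i C_{i,j+1} / sum_i C_{i,j}
f = np.array([np.nansum(cum[:9 - j, j + 1]) / np.nansum(cum[:9 - j, j])
              for j in range(9)])

# Complete the triangle column by column and read off the reserve
proj = cum.copy()
for j in range(9):
    missing = np.isnan(proj[:, j + 1])
    proj[missing, j + 1] = proj[missing, j] * f[j]

reserve = proj[:, -1].sum() - np.nansum(tri)
print(f"chain-ladder reserve: {reserve:,.0f}")
```

The point estimate so obtained is undiscounted and carries no predictive distribution, which is precisely the gap the GLIM and Bayesian machinery of this section is meant to fill.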
[Figure: three panels of weighted residuals plotted against development year, year of origin and calendar year.]

Figure 5.1: Weighted residuals for the linear predictor in (5.33), together with average lines, which represent the average of the weighted standardized residuals in each period of interest. Note that the average is zero when no observations occur.
Although the linear predictor in the Probabilistic Trend Family of models (4.32) is over-parameterized, it provides a flexible modelling structure. For example, one might begin with three parameters $(\alpha, \beta, \gamma)$: one accident period level parameter, one development period trend parameter and one calendar period trend parameter, which equates to a linear predictor of the following form,

$$\eta_{ij} = \alpha + (j-1)\beta + (i+j-2)\gamma. \tag{5.33}$$

Adding more accident, development and calendar period parameters where necessary allows the structure to be extremely flexible.
The weighted residuals for this model (Figure 5.1) indicate that there are major trends in the development period that are not being captured. There also appears to be a level change between accident periods one, two-three, four and five. To capture these trends, extra development period trend parameters and extra accident period level parameters are required. The new form of the linear predictor is given by

$$\eta_{ij} = \alpha_1 I_{(i=1)} + \alpha_2 I_{(i=2,3)} + \alpha_3 I_{(i=4)} + \alpha_4 I_{(i>4)} + \beta_1 I_{(j>1)} + \beta_2 I_{(j>4)} + (j-5)\beta_3 I_{(5<j<9)} + 3\beta_3 I_{(j>8)}. \tag{5.34}$$

[Figure: three panels of weighted residuals plotted against development year, year of origin and calendar year.]

Figure 5.2: Weighted residuals for the linear predictor in (5.34), together with average lines, which represent the average of the weighted standardized residuals in each period of interest. Note that the average is zero when no observations occur.

The weighted residuals for this updated model (Figure 5.2) indicate that (5.34) appears to capture the significant levels and trends in the data.
We recall from the previous chapter the definition of the discounted IBNR reserve under a generalized linear model and normal logreturn process:

$$S_{\mathrm{GLIM}} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} g^{-1}\big((R\hat{\vec{\beta}})_{ij}\big)\, e^{-Y(i+j-t-1)},$$
year     $S^l_{\mathrm{GLIM}}$    $S^c_{\mathrm{GLIM}}$    Bayesian
2            360,725                  387,404              436,151
3            700,465                  765,451              760,177
4            945,845                  982,425              970,535
5          1,441,016                1,513,186            1,448,056
6          1,913,383                1,977,934            1,919,300
7          2,519,292                2,614,564            2,558,208
8          3,557,014                3,702,302            3,641,890
9          4,573,767                4,770,944            4,727,262
10         5,577,925                5,821,804            5,638,301
total     20,949,190               21,988,048           20,360,196

Table 5.8: 95th percentile of the predictive distribution of $S_{\mathrm{GLIM}}$.
where the returns are modelled by means of a Brownian motion described by the following equation

$$Y(i) = \Big(\delta + \frac{\varsigma^2}{2}\Big)\, i + \varsigma B(i),$$

where $B(i)$ is the standard Brownian motion, $\varsigma$ is the volatility and $\delta$ is a constant force of interest.
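A short simulation sketch of this discounting process, using the parameter values quoted in the surrounding text ($\delta = 0.08$, $\varsigma = 0.11$); the sanity check uses the lognormal identity $E[e^{-Y(i)}] = e^{-\delta i}$:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

delta, varsigma = 0.08, 0.11        # force of interest and volatility
t, n_sim = 10, 200_000

# Standard Brownian motion on the integer grid: B(i) = sum of N(0,1) increments
B = np.cumsum(rng.normal(size=(n_sim, t)), axis=1)

i = np.arange(1, t + 1)
Y = (delta + varsigma**2 / 2) * i + varsigma * B    # log-return process Y(i)
discount = np.exp(-Y)                               # stochastic discount factors

print("simulated E[exp(-Y(i))], i=1..3:", discount.mean(axis=0)[:3].round(3))
print("analytic  exp(-delta*i), i=1..3:", np.exp(-delta * i)[:3].round(3))
```

The drift correction $\varsigma^2/2$ in $Y(i)$ is exactly what makes the expected discount factor equal to deterministic discounting at force of interest $\delta$.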
The discounting process (with $\delta = 0.08$ and $\varsigma = 0.11$) is incorporated in the WinBugs code for the gamma GLIM. To enable comparisons with the results from the comonotonic bounds, flat priors were used both for the row and column parameters of the linear predictor and for the scale parameter in the gamma model. Table 5.8 contains the results obtained via MCMC simulations with the WinBugs program. A burn-in of 10,000 iterations was allowed, after which another 10,000 iterations were performed.

The bounds for the discounted loss reserve use the maximum likelihood estimates of the parameters in the linear predictor. To incorporate the error arising from the estimation of these parameters we apply the bootstrap algorithm as explained in Section 4.5. We bootstrapped 1,000 times, each time computing (analytically) the 95th percentile of the upper and lower bound. Table 5.8 compares the Bayesian 95th percentile and the bootstrapped 95th percentile of the lower and upper bound for the different reserves.
The results for the upper and lower bounds in convex order are given in the same table. One can see that the results from the comonotonic bounds are close to the results obtained via MCMC simulation. Thus, at least for this example, these bounds provide actuaries with accurate information concerning the predictive distribution of discounted loss reserves.
5.5.2 The comonotonicity approach versus the asymptotic and moment matching approximations
In case the underlying variance of the statistical and financial part of the
discounted IBNR reserve gets large, the comonotonic approximations perform worse. We will illustrate this by means of a simple example in the
context of loss reserving and propose to solve this problem using the asymptotic approximations introduced in Section 5.3.
In the following, we assume that the r.v.'s $Y_{ij}$, $i,j = 1,\ldots,t$ can be expressed as products of a deterministic component and an i.i.d. random component. In particular, we consider the following model

$$Y_{ij} = a_{ij}\, \overline{Y}_{ij}, \qquad i,j = 1,\ldots,t, \tag{5.35}$$

in which $\overline{Y}_{ij}$, $i,j = 1,\ldots,t$ are i.i.d. r.v.'s and $a_{ij} > 0$, $i,j = 1,\ldots,t$ are positive numbers.

We will consider in this part the simple lognormal linear model (4.1)

$$\ln \vec{Y} = R\vec{\beta} + \vec{\epsilon}, \qquad \vec{\epsilon} \sim N(0, \sigma^2 I),$$

with $\vec{Y}$ as before the vector of historical claim figures.

The accumulated IBNR reserve is given by

$$\text{IBNR reserve} = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} a_{ij}\, \overline{Y}_{ij}. \tag{5.36}$$

We will again incorporate stochastic discounting factors. We let the positive r.v. $V_k$ from the i.i.d. sequence $\{V_k, k = 1,\ldots,t-1\}$ denote the present value discounting factor from year $k$ to year $k-1$ and consider the two sequences $\{\overline{Y}_{ij}, i = 2,\ldots,t;\ j = t+2-i,\ldots,t\}$ and $\{V_k, k = 1,\ldots,t-1\}$ to be mutually independent. Furthermore, for notational convenience, we introduce the positive r.v. $Z_k = V_1 V_2 \cdots V_k$, $k = 1,\ldots,t-1$. Then the discounted IBNR reserve $S$ is given by

$$S = \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} a_{ij}\, \overline{Y}_{ij}\, Z_{i+j-t-1}. \tag{5.37}$$
Henceforth, we impose that $E[S] < +\infty$. Approximate values for stop-loss premiums and quantiles for $S$ may be obtained by using asymptotic results. In particular, if $\{\overline{Y}_{ij}, i = 2,\ldots,t;\ j = t+2-i,\ldots,t\}$ and $\{V_k, k = 1,\ldots,t-1\}$ satisfy the corresponding conditions under which Theorem 13 or Theorem 14 is valid, then for sufficiently large values of $d$, we have that

$$\pi(S, d) \approx \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} a_{ij}\, \pi\big(\overline{Y} Z_{i+j-t-1},\, d/a_{ij}\big). \tag{5.38}$$

Furthermore, if either $F_{\overline{Y}} \in \mathcal{R}_{-\alpha}$ for some $0 < \alpha < +\infty$ and $F_V \in \mathcal{R}_{-\infty}$, or the conditions of Theorem 14 apply, then for sufficiently large values of $p$, we have that

$$F_S^{-1}(p) \approx \inf\left\{ s : \sum_{i=2}^{t} \sum_{j=t+2-i}^{t} \overline{F}_{\overline{Y} Z_{i+j-t-1}}(s/a_{ij}) \le 1-p \right\}. \tag{5.39}$$
As an example, we consider a lognormal linear regression model with chain-ladder linear predictor to describe the random claims, and we use a geometric Brownian motion with drift to represent the stochastic discount factors. We remark that Theorem 14 applies to this specification. Furthermore, for this specification the products $\overline{Y}_{ij}Z_{i+j-t-1}$, $i = 2,\ldots,t$; $j = t+2-i,\ldots,t$, are lognormal, and therefore the present value of the IBNR reserve becomes a linear combination of dependent lognormal r.v.'s, given by
\[
S = \sum_{i=2}^{t}\,\sum_{j=t+2-i}^{t} a_{ij}\,\overline{Y}_{ij}\,Z_{i+j-t-1}
= \sum_{i=2}^{t}\,\sum_{j=t+2-i}^{t} e^{\eta_{ij}}\,e^{\varepsilon_{ij}}\,e^{-Y(i+j-t-1)}. \qquad (5.40)
\]
Notice that this definition is the same as (4.35) for the special case of the lognormal linear model (with $\sigma = \tilde{\sigma}$) and chain-ladder type linear predictor $\eta_{ij} = (R\vec{\beta})_{ij} = \alpha_i + \beta_j$.
In this illustration, we start with a given set of parameters and define
the reserve as expressed in (5.40). In a real reserving exercise, one has to
build an appropriate statistical model based on the incremental claims in
the run-off triangle and to estimate the parameters from this model.
Using the same notation as in the previous chapter, we have for $\widetilde{W}_{ij} := -Y(i+j-t-1)$ that
\[
E[\widetilde{W}_{ij}] = -\Bigl(\delta + \tfrac{1}{2}\varsigma^2\Bigr)(i+j-t-1), \qquad
\mathrm{Var}[\widetilde{W}_{ij}] = \sigma^2_{\widetilde{W}_{ij}} = (i+j-t-1)\,\varsigma^2.
\]
Chapter 5 - Approximation techniques for sums of r.v.’s
The asymptotic approximations (5.38) and (5.39) become
\begin{align*}
\pi(S,d) \approx \sum_{i=2}^{t}\,\sum_{j=t+2-i}^{t}
\biggl[\, & e^{\eta_{ij} + E[\widetilde{W}_{ij}] + \frac{1}{2}\bigl(\sigma^2_{\widetilde{W}_{ij}} + \sigma^2\bigr)}\,
\Phi\!\Bigl(\Bigl(\eta_{ij} + E[\widetilde{W}_{ij}] + \sigma^2_{\widetilde{W}_{ij}} + \sigma^2 - \log d\Bigr)\Big/\sqrt{\sigma^2_{\widetilde{W}_{ij}} + \sigma^2}\Bigr) \\
& - d\,\Phi\!\Bigl(\Bigl(\eta_{ij} + E[\widetilde{W}_{ij}] - \log d\Bigr)\Big/\sqrt{\sigma^2_{\widetilde{W}_{ij}} + \sigma^2}\Bigr) \biggr], \qquad d \in \mathbb{R}^+,
\end{align*}
\[
F_S^{-1}(p) \approx \inf\Bigl\{\, s : \sum_{i=2}^{t}\,\sum_{j=t+2-i}^{t} \overline{F}_{LN}(s) \le 1-p \,\Bigr\}, \qquad p \in (0,1),
\]
in which $F_{LN}$ is the cdf of $\log N\bigl(\eta_{ij} + E[\widetilde{W}_{ij}],\ \sigma^2_{\widetilde{W}_{ij}} + \sigma^2\bigr)$.
To compute the lognormal moment matching approximations as described in Section 5.2, we need expressions for the mean and variance of $S$. These are given by
\[
E[S] = \sum_{i=2}^{t}\,\sum_{j=t+2-i}^{t} e^{\eta_{ij} + E[\widetilde{W}_{ij}] + \frac{1}{2}\bigl(\sigma^2_{\widetilde{W}_{ij}} + \sigma^2\bigr)},
\]
\[
\mathrm{Var}[S] = \sum_{i=2}^{t}\,\sum_{j=t+2-i}^{t}\,\sum_{k=2}^{t}\,\sum_{l=t+2-k}^{t}
e^{\sigma^2 + \eta_{ij} + \eta_{kl} + E[\widetilde{W}_{ij}] + E[\widetilde{W}_{kl}] + \frac{1}{2}\bigl(\sigma^2_{\widetilde{W}_{ij}} + \sigma^2_{\widetilde{W}_{kl}}\bigr)}
\Bigl( e^{\varsigma^2 \min(i+j-t-1,\ k+l-t-1) + \sigma^{2*}} - 1 \Bigr),
\]
where
\[
\sigma^{2*} = \begin{cases} \sigma^2 & \text{if } (i,j) = (k,l);\\ 0 & \text{if } (i,j) \neq (k,l).\end{cases}
\]
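The moment matching step just described reduces to two closed-form computations once $E[S]$ and $\mathrm{Var}[S]$ are available: fit a lognormal to the first two moments, then read off quantiles and stop-loss premiums in closed form. A minimal sketch in Python (the mean/variance inputs here are purely illustrative, not the $E[S]$ and $\mathrm{Var}[S]$ of the numerical example that follows):

```python
from math import exp, log, sqrt
from statistics import NormalDist

def lognormal_match(mean, var):
    """Fit (mu, sigma) of a lognormal to a given mean and variance."""
    sigma2 = log(1.0 + var / mean**2)
    mu = log(mean) - 0.5 * sigma2
    return mu, sqrt(sigma2)

def lognormal_stop_loss(mu, sigma, d):
    """Closed-form stop-loss premium E[(S - d)_+] for S ~ logN(mu, sigma^2)."""
    Phi = NormalDist().cdf
    d1 = (mu + sigma**2 - log(d)) / sigma
    d2 = (mu - log(d)) / sigma
    return exp(mu + 0.5 * sigma**2) * Phi(d1) - d * Phi(d2)

def lognormal_quantile(mu, sigma, p):
    """Closed-form quantile F^{-1}(p) of logN(mu, sigma^2)."""
    return exp(mu + sigma * NormalDist().inv_cdf(p))

# Illustrative inputs only (not the E[S], Var[S] of the example below):
mu, sigma = lognormal_match(mean=5000.0, var=4.0e7)
print(lognormal_quantile(mu, sigma, 0.95), lognormal_stop_loss(mu, sigma, 10000.0))
```

By construction the fitted lognormal reproduces the prescribed mean and variance exactly, which is the defining property of the moment matching approach.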
We arbitrarily set $\sigma = 3$, $\delta = -0.07$, $\varsigma = 0.2$ and $t = 5$, and use the following chain-ladder parameters:
\[
(\alpha_1,\alpha_2,\alpha_3,\alpha_4,\alpha_5) = (1.1,\ 1.6,\ 1.9,\ 2.1,\ 2.2), \qquad
(\beta_1,\beta_2,\beta_3,\beta_4,\beta_5) = (0,\ -0.42,\ -0.38,\ -0.87,\ -0.96).
\]
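With these parameters fixed, approximation (5.38) is just a finite sum of univariate lognormal stop-loss premiums over the ten future cells of the triangle. The sketch below evaluates it, assuming unit weights $a_{ij} = 1$ (the weights are not spelled out in this example, so this is an illustrative assumption):

```python
from math import exp, log, sqrt
from statistics import NormalDist

# Parameters of the illustration: sigma is the model error s.d.,
# delta and varsigma the drift and volatility of the discount process.
sigma, delta, varsigma, t = 3.0, -0.07, 0.2, 5
alpha = {1: 1.1, 2: 1.6, 3: 1.9, 4: 2.1, 5: 2.2}
beta = {1: 0.0, 2: -0.42, 3: -0.38, 4: -0.87, 5: -0.96}
Phi = NormalDist().cdf

def asymptotic_stop_loss(d, a=1.0):
    """Approximation (5.38): sum of lognormal stop-loss premiums per IBNR cell."""
    total = 0.0
    for i in range(2, t + 1):
        for j in range(t + 2 - i, t + 1):
            k = i + j - t - 1                     # discounting horizon of cell (i, j)
            mu = alpha[i] + beta[j] - (delta + 0.5 * varsigma**2) * k
            s2 = k * varsigma**2 + sigma**2       # total log-variance of the cell
            s = sqrt(s2)
            d1 = (mu + s2 - log(d / a)) / s
            d2 = (mu - log(d / a)) / s
            total += a * (exp(mu + 0.5 * s2) * Phi(d1) - (d / a) * Phi(d2))
    return total

for d in (7500.0, 100000.0, 500000.0):
    print(d, asymptotic_stop_loss(d))
```

Each cell contributes the closed-form stop-loss premium of a lognormal with mean parameter $\eta_{ij} + E[\widetilde{W}_{ij}]$ and variance $\sigma^2_{\widetilde{W}_{ij}} + \sigma^2$, exactly as in the displayed formula above.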
      d        MC   Appr. 1   Appr. 2   Appr. 3   Ndiff. 1   Ndiff. 2   Ndiff. 3
   7500    1868.0    1771.6    2541.1    2277.6       5.2%     -36.0%     -21.9%
  10000    1743.5    1658.1    2459.2    2165.8       4.9%     -41.0%     -24.2%
  15000    1568.7    1496.9    2333.6    1998.4       4.6%     -48.8%     -27.4%
  20000    1446.7    1383.1    2237.8    1874.1       4.4%     -54.7%     -29.5%
  25000    1354.0    1295.8    2160.0    1775.2       4.3%     -59.5%     -31.1%
  30000    1279.7    1225.4    2094.4    1693.4       4.2%     -63.7%     -32.3%
  40000    1165.7    1116.7    1987.6    1563.3       4.2%     -70.5%     -34.1%
  50000    1080.4    1034.8    1902.4    1462.2       4.2%     -76.1%     -35.3%
  75000     933.3     892.5    1743.6    1280.5       4.4%     -86.8%     -37.2%
 100000     835.6     797.4    1600.2    1154.9       4.6%     -91.5%     -38.2%
 150000     708.5     673.0    1437.8     985.3       5.0%    -102.9%     -39.1%
 200000     626.2     592.0    1323.5     871.7       5.5%    -111.4%     -39.2%
 250000     566.5     533.4    1260.8     788.2       5.8%    -122.6%     -39.1%
 300000     520.7     488.4    1190.4     723.1       6.2%    -128.6%     -38.9%
 400000     453.9     422.6    1081.8     626.8       6.9%    -138.3%     -38.1%
 500000     406.5     375.9    1000.2     557.7       7.5%    -146.1%     -37.2%

      p        MC   Appr. 1   Appr. 2   Appr. 3   Ndiff. 1   Ndiff. 2   Ndiff. 3
  0.95       8650      7863      4814      7555       9.1%      44.3%      12.7%
  0.975     17000     15868     12436     17296       6.7%      26.8%      -1.7%
  0.99      38957     37496     37490     45306       3.8%       3.8%     -16.3%
  0.995     70795     68885     79477     87283       2.7%     -12.3%     -23.3%
  0.999    257090    253021    374188    337364       1.6%     -45.5%     -31.2%

Table 5.9: Monte Carlo (MC) versus approximate values of stop-loss premiums and quantiles for chain-ladder claim sizes and lognormal present value discounting factors.
In Table 5.9 we numerically compare the asymptotic approximations with a Monte Carlo (MC) study based on 5 000 000 simulations. Numerical results of the comonotonic and moment matching approximations have also been included. "Appr. 1" refers to the asymptotic approximation, "Appr. 2" to the convex upper bound and "Appr. 3" to the lognormal moment matching approach. "Ndiff." refers to the normalized difference, defined as $\frac{\text{MC} - \text{Appr.}}{\text{MC}} \times 100\%$. The numerical results demonstrate that the asymptotic approximation generally outperforms the comonotonic upper bound and the lognormal moment matching technique. Because the comonotonic lower bound performed remarkably badly, its numerical values were left out of the table.
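The MC benchmark can be reproduced in outline as follows. This is a small-scale sketch with far fewer simulations than the 5 000 000 used for Table 5.9 (and unit cell weights, an illustrative assumption), so its figures will only roughly track the tabulated values:

```python
import random
from math import exp

random.seed(1)
sigma, delta, varsigma, t = 3.0, -0.07, 0.2, 5
alpha = {1: 1.1, 2: 1.6, 3: 1.9, 4: 2.1, 5: 2.2}
beta = {1: 0.0, 2: -0.42, 3: -0.38, 4: -0.87, 5: -0.96}
cells = [(i, j) for i in range(2, t + 1) for j in range(t + 2 - i, t + 1)]

def simulate_S():
    """One realisation of the discounted IBNR reserve (5.40)."""
    # -Y(k) is a sum of k i.i.d. N(-(delta + varsigma^2/2), varsigma^2) increments.
    incr = [random.gauss(-(delta + 0.5 * varsigma**2), varsigma) for _ in range(t - 1)]
    logZ = [sum(incr[:k]) for k in range(t)]       # logZ[k] = -Y(k), logZ[0] = 0
    return sum(exp(alpha[i] + beta[j] + random.gauss(0.0, sigma) + logZ[i + j - t - 1])
               for (i, j) in cells)

sims = sorted(simulate_S() for _ in range(20000))
n = len(sims)
q95 = sims[int(0.95 * n)]                          # empirical 95% quantile
stop_loss_7500 = sum(max(s - 7500.0, 0.0) for s in sims) / n
print(q95, stop_loss_7500)
```

Note that all discounting horizons share the same simulated Brownian path, so the dependence between the cells of the triangle is preserved, which is precisely what makes the exact distribution of $S$ intractable.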
5.6 Proofs
Theorem 12
In order to prove the theorem, we first establish the following result from
Tang & Tsitsiashvili (2004):
Lemma 11.
Let $F_1$, $F_2$ and $G$ be three d.f.'s. Suppose that $\overline{F}_i(x) > 0$ for any real number $x$, $F_i(0)G(0) = 0$, $i = 1,2$, and $G \in \mathcal{R}_{-\infty}$. If $\overline{F}_1(x) \sim \overline{F}_2(x)$, then
\[
\overline{F_1 \otimes G}(x) \sim \overline{F_2 \otimes G}(x). \qquad (5.41)
\]
Proof. From the condition $\overline{F}_1(x) \sim \overline{F}_2(x)$ we know that, for any $0 < \varepsilon < 1$ and all large $x$, say $x \ge y_0$ for some $y_0 > 0$,
\[
(1-\varepsilon)\,\overline{F}_2(x) \le \overline{F}_1(x) \le (1+\varepsilon)\,\overline{F}_2(x). \qquad (5.42)
\]
It is not difficult to verify that, since $\overline{F}_i(y_0) > 0$ for all $y_0 > 0$, $i = 1,2$, we have by the definition of the class $\mathcal{R}_{-\infty}$, for $i = 1,2$, that
\begin{align*}
\limsup_{x\to+\infty} \frac{\int_0^{y_0} \overline{G}(x/y)\,F_i(dy)}{\int_{y_0}^{+\infty} \overline{G}(x/y)\,F_i(dy)}
&\le \limsup_{x\to+\infty} \frac{\int_0^{y_0} \overline{G}(x/y)\,F_i(dy)}{\int_{2y_0}^{+\infty} \overline{G}(x/y)\,F_i(dy)} \\
&\le \limsup_{x\to+\infty} \frac{\overline{G}(x/y_0)\bigl(F_i(y_0) - F_i(0)\bigr)}{\overline{G}(x/2y_0)\,\overline{F}_i(2y_0)} = 0,
\end{align*}
and hence that, for $i = 1,2$,
\begin{align*}
\overline{F_i \otimes G}(x)
&= \int_0^{y_0} \overline{G}(x/y)\,F_i(dy) + \int_{y_0}^{+\infty} \overline{G}(x/y)\,F_i(dy) \\
&\sim \int_{y_0}^{+\infty} \overline{G}(x/y)\,F_i(dy)
= \overline{G}(x/y_0)\,\overline{F}_i(y_0) + \int_0^{x/y_0} \overline{F}_i(x/y)\,G(dy).
\end{align*}
Substituting (5.42) into the above leads to
\[
(1-\varepsilon)\,\overline{F_2 \otimes G}(x) \lesssim \overline{F_1 \otimes G}(x) \lesssim (1+\varepsilon)\,\overline{F_2 \otimes G}(x).
\]
Hence, relation (5.41) follows from the arbitrariness of $0 < \varepsilon < 1$.
Then, we proceed with the proof of Theorem 12.
Proof. Clearly, it holds that
\[
\Pr\Bigl[\sum_{i=1}^{n} a_i Z_i > x\Bigr]
= \Pr\Bigl[Y_1\Bigl(a_1 + Y_2\bigl(a_2 + \cdots + Y_{n-1}(a_{n-1} + a_n Y_n)\cdots\bigr)\Bigr) > x\Bigr].
\]
Since $F_Y \in \mathcal{L}$ and $a_n > 0$, we have that
\[
\Pr[a_{n-1} + a_n Y_n > x] \sim \Pr[a_n Y_n > x].
\]
Hence, applying Lemma 11 we obtain that
\[
\Pr[Y_{n-1}(a_{n-1} + a_n Y_n) > x] \sim \Pr[a_n Y_{n-1} Y_n > x].
\]
Repeatedly applying Lemma 11, we finally obtain that
\[
\Pr\Bigl[Y_1\Bigl(a_1 + Y_2\bigl(a_2 + \cdots + Y_{n-1}(a_{n-1} + a_n Y_n)\cdots\bigr)\Bigr) > x\Bigr]
\sim \Pr[a_n Y_1 Y_2 \cdots Y_{n-1} Y_n > x].
\]
For the remainder of the proof it suffices to verify that the probabilities $\Pr[a_i Z_i > x]$, $i = 1,2,\ldots,n-1$, on the right-hand side of (5.20) can be neglected when compared with the probability $\Pr[a_n Z_n > x]$. Since the class $\mathcal{R}_{-\infty}$ is closed under product convolution, the d.f. of the product $\prod_{j=1}^{i} Y_j$ belongs to the class $\mathcal{R}_{-\infty}$ for each $i = 1,2,\ldots$. Hence, we verify that for each $i = 1,2,\ldots,n-1$ and some $0 < v < 1$,
\begin{align*}
\limsup_{x\to+\infty} \frac{\Pr\bigl[a_i \prod_{j=1}^{i} Y_j > x\bigr]}{\Pr\bigl[a_n \prod_{j=1}^{n} Y_j > x\bigr]}
&\le \limsup_{x\to+\infty} \frac{\Pr\bigl[a_i \prod_{j=1}^{i} Y_j > x\bigr]}{\Pr\bigl[\frac{a_n}{a_i}\prod_{j=i+1}^{n} Y_j > 1/v\bigr]\,\Pr\bigl[a_i \prod_{j=1}^{i} Y_j > vx\bigr]} \\
&= \frac{1}{\Pr\bigl[\frac{a_n}{a_i}\prod_{j=i+1}^{n} Y_j > 1/v\bigr]}\,
\limsup_{x\to+\infty} \frac{\Pr\bigl[a_i \prod_{j=1}^{i} Y_j > x\bigr]}{\Pr\bigl[a_i \prod_{j=1}^{i} Y_j > vx\bigr]} = 0.
\end{align*}
This proves that (5.20) holds.
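The property driving the last step, and used repeatedly in these proofs, is rapid variation: for a d.f. in the class $\mathcal{R}_{-\infty}$, $\overline{F}(x)/\overline{F}(vx) \to 0$ as $x \to +\infty$ for every $0 < v < 1$. For the standard lognormal law this can be checked numerically (a sketch for illustration only, not part of the proof):

```python
from math import log
from statistics import NormalDist

def lognormal_sf(x, mu=0.0, sigma=1.0):
    """Survival function of logN(mu, sigma^2)."""
    return 1.0 - NormalDist(mu, sigma).cdf(log(x))

# Rapid variation: the ratio sf(x)/sf(v*x) should tend to 0 as x grows.
v = 0.5
ratios = [lognormal_sf(x) / lognormal_sf(v * x) for x in (10.0, 100.0, 1000.0)]
print(ratios)
```

The decreasing sequence of ratios illustrates that the lognormal tail is lighter than any power tail, which is exactly the membership in $\mathcal{R}_{-\infty}$ exploited above.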
222
Chapter 5 - Approximation techniques for sums of r.v.’s
Theorem 13
To prove the theorem, we first state three lemmas.
Lemma 12.
Let $X$ and $Y$ be two independent r.v.'s, where $X$ is supported on $(-\infty,+\infty)$ with a d.f. $F$, and $Y$ is strictly positive with a d.f. $G$. Let $V = XY$ and denote by $H$ the d.f. of $V$. If $F \in \mathcal{D}\cap\mathcal{L}$ and $G \in \mathcal{R}_{-\infty}$, then $H \in \mathcal{D}\cap\mathcal{L} \subset \mathcal{S}$ and
\[
\overline{H}(x) \asymp \overline{F}(x).
\]
Proof. This lemma can easily be proved by Lemma 3.8 and Lemma 3.10 of Tang & Tsitsiashvili (2003).
Lemma 13.
If $F \in \mathcal{D}$ and $G \in \mathcal{R}_{-\infty}$, then there exists some $\varepsilon > 0$ such that
\[
\overline{G}\bigl(x^{1-\varepsilon}\bigr) = o\bigl(\overline{F}(x)\bigr).
\]
Proof. This lemma can be proved by Lemma 3.7 of Tang & Tsitsiashvili (2003).
Lemma 14.
Let $F = F_1 * F_2$, where $F_1$ and $F_2$ are two d.f.'s supported on $(-\infty,+\infty)$. If $F_1 \in \mathcal{S}$, $F_2 \in \mathcal{L}$ and $\overline{F}_2(x) = O\bigl(\overline{F}_1(x)\bigr)$, then $F \in \mathcal{S}$ and
\[
\overline{F}(x) \sim \overline{F}_1(x) + \overline{F}_2(x).
\]
Proof. This result can be obtained by fixing $\gamma = 0$ in Lemma 3.2 of Tang & Tsitsiashvili (2003).
We are now ready to prove Theorem 13.

Proof. First we prove (5.21), which says that
\begin{align*}
&\Pr\bigl[(a_1+X_1)Y_1 + \cdots + (a_{n-1}+X_{n-1})Y_{n-1}\cdots Y_1 + (a_n+X_n)Y_n Y_{n-1}\cdots Y_1 > x\bigr] \\
&\sim \Pr[(a_1+X_1)Y_1 > x] + \cdots + \Pr[(a_{n-1}+X_{n-1})Y_{n-1}\cdots Y_1 > x]
+ \Pr[(a_n+X_n)Y_n Y_{n-1}\cdots Y_1 > x].
\end{align*}
Applying Lemma 12, we have that the product $(a_n+X_n)Y_n$ is subexponentially distributed and
\[
\Pr[(a_n+X_n)Y_n > x] \asymp \overline{F}(x). \qquad (5.43)
\]
Applying Lemma 14, we have that
\[
\Pr[(a_{n-1}+X_{n-1}) + (a_n+X_n)Y_n > x] \sim \Pr[a_{n-1}+X_{n-1} > x] + \Pr[(a_n+X_n)Y_n > x].
\]
Since, by Lemma 13, there exists some $\varepsilon > 0$ such that $\overline{G}\bigl(x^{1-\varepsilon}\bigr) = o\bigl(\overline{F}(x)\bigr)$, we have that
\begin{align*}
&\Pr[(a_{n-1}+X_{n-1})Y_{n-1} + (a_n+X_n)Y_n Y_{n-1} > x] \\
&= \Bigl(\int_0^{x^{1-\varepsilon}} + \int_{x^{1-\varepsilon}}^{+\infty}\Bigr)
\Pr\bigl[(a_{n-1}+X_{n-1})y + (a_n+X_n)Y_n y > x\bigr]\,dG(y) \\
&= \int_0^{x^{1-\varepsilon}} \Pr\Bigl[(a_{n-1}+X_{n-1}) + (a_n+X_n)Y_n > \frac{x}{y}\Bigr]\,dG(y) + o\bigl(\overline{F}(x)\bigr) \\
&\sim \int_0^{x^{1-\varepsilon}} \Bigl(\Pr\Bigl[a_{n-1}+X_{n-1} > \frac{x}{y}\Bigr] + \Pr\Bigl[(a_n+X_n)Y_n > \frac{x}{y}\Bigr]\Bigr)\,dG(y) + o\bigl(\overline{F}(x)\bigr) \\
&= \Bigl(\int_0^{+\infty} - \int_{x^{1-\varepsilon}}^{+\infty}\Bigr)
\Bigl(\Pr\Bigl[a_{n-1}+X_{n-1} > \frac{x}{y}\Bigr] + \Pr\Bigl[(a_n+X_n)Y_n > \frac{x}{y}\Bigr]\Bigr)\,dG(y) + o\bigl(\overline{F}(x)\bigr) \\
&= \Pr[(a_{n-1}+X_{n-1})Y_{n-1} > x] + \Pr[(a_n+X_n)Y_n Y_{n-1} > x] + o\bigl(\overline{F}(x)\bigr) \\
&\sim \Pr[(a_{n-1}+X_{n-1})Y_{n-1} > x] + \Pr[(a_n+X_n)Y_n Y_{n-1} > x].
\end{align*}
Furthermore, by application of Lemmas 12 and 14, it follows that $(a_{n-1}+X_{n-1})Y_{n-1} + (a_n+X_n)Y_n Y_{n-1}$ is subexponentially distributed and that
\[
\Pr[(a_{n-1}+X_{n-1})Y_{n-1} + (a_n+X_n)Y_n Y_{n-1} > x] \asymp \overline{F}(x).
\]
Simply repeating the procedure above and observing that
\begin{align*}
&(a_{n-2}+X_{n-2})Y_{n-2} + (a_{n-1}+X_{n-1})Y_{n-1}Y_{n-2} + (a_n+X_n)Y_n Y_{n-1}Y_{n-2} \\
&= \bigl[(a_{n-2}+X_{n-2}) + (a_{n-1}+X_{n-1})Y_{n-1} + (a_n+X_n)Y_n Y_{n-1}\bigr]\,Y_{n-2},
\end{align*}
we obtain that
\begin{align*}
&\Pr[(a_{n-2}+X_{n-2})Y_{n-2} + (a_{n-1}+X_{n-1})Y_{n-1}Y_{n-2} + (a_n+X_n)Y_n Y_{n-1}Y_{n-2} > x] \\
&\sim \Pr[(a_{n-2}+X_{n-2})Y_{n-2} > x] + \Pr[(a_{n-1}+X_{n-1})Y_{n-1}Y_{n-2} > x]
+ \Pr[(a_n+X_n)Y_n Y_{n-1}Y_{n-2} > x].
\end{align*}
Hence, repeating the procedure above $n-1$ times yields the announced result (5.21). The proof of (5.22) can be given completely analogously to the above, since the distribution of $a_i X_i$ satisfies
\[
\Pr[a_i X_i > x] = \overline{F}(x/a_i) \asymp \overline{F}(x)
\]
and is subexponential.
Corollary 3
Proof. Using (5.43), one can easily verify that
\[
\liminf_{x\to+\infty} \frac{\Pr\bigl[\sum_{i=1}^{n}(a_i+X_i)Z_i > x\bigr]}{\Pr\bigl[\sum_{i=1}^{n-1}(a_i+X_i)Z_i > x\bigr]} > 1,
\]
and that
\[
\liminf_{x\to+\infty} \frac{\Pr\bigl[\sum_{i=1}^{n} a_i X_i Z_i > x\bigr]}{\Pr\bigl[\sum_{i=1}^{n-1} a_i X_i Z_i > x\bigr]} > 1.
\]
Hence, we can prove (5.23) and (5.24) by substituting (5.21) and (5.22) into the left-hand side of (5.23) and (5.24), respectively.
Corollary 4
Proof. Given the asymptotic results (5.21) and (5.22), the proof of this corollary follows immediately from a well-known result, which Cline (1986) attributes to Proposition 3 of Breiman (1965).
Theorem 14
In case conditions 1 and 2 of Theorem 13 are replaced by conditions 1', 2' and 3' of Theorem 14, the proof of (5.21) can be established completely analogously to the proof of Theorem 13, using the following three lemmas, which are the analogues of Lemma 12, Lemma 13 and Lemma 14, respectively:
Lemma 15.
Let $X$ and $Y$ be two independent lognormally distributed r.v.'s with $\sigma_Y < \sigma_X$. Furthermore, let $V = XY$ and denote by $H$ the d.f. of $V$. Then $V$ follows a lognormal law and $\overline{F}(x) = o\bigl(\overline{H}(x)\bigr)$.
Lemma 16.
If both $F$ and $G$ are lognormal laws with $\sigma_G < \sigma_F$, then there exists some $\varepsilon > 0$ such that
\[
\overline{G}\bigl(x^{1-\varepsilon}\bigr) = o\bigl(\overline{F}(x)\bigr).
\]
Lemma 17.
Let $F = F_1 * F_2$, where $F_1$ and $F_2$ are two lognormal laws. Then $F \in \mathcal{S}$ and
\[
\overline{F}(x) \sim \overline{F}_1(x) + \overline{F}_2(x).
\]
Proof. This is a special case of Corollary 1 of Cline (1986), and moreover a special case of Lemma 14.
We are now ready to prove Theorem 14.

Proof. The proof of (5.21) proceeds as indicated above. The proof of (5.22) can be given analogously, since the distribution of $a_i X_i$ is again lognormal with $\mathrm{Var}[\log(a_i X_i)] = \mathrm{Var}[\log X_i] = \sigma_X^2$.
Finally, we prove (5.23) and (5.24). By application of Lemma 11 and the same reasoning as in the proof of Theorem 12, we have for each $n = 1,2,\ldots$ and some $0 < v < 1$ that
\begin{align*}
\liminf_{x\to+\infty}\ &\frac{\sum_{i=1}^{n}\Pr\bigl[(a_i+X)\prod_{j=1}^{i}Y_j > x\bigr]}{\sum_{i=1}^{n-1}\Pr\bigl[(a_i+X)\prod_{j=1}^{i}Y_j > x\bigr]}
\ \ge\ \liminf_{x\to+\infty} \frac{\Pr\bigl[(a_n+X)\prod_{j=1}^{n}Y_j > x\bigr]}{\sum_{i=1}^{n-1}\Pr\bigl[(a_i+X)\prod_{j=1}^{i}Y_j > x\bigr]} \\
&\ge \frac{1}{\sum_{i=1}^{n-1}\limsup_{x\to+\infty}\dfrac{\Pr\bigl[(a_i+X)\prod_{j=1}^{i}Y_j > x\bigr]}{\Pr\bigl[(a_n+X)\prod_{j=1}^{n}Y_j > x\bigr]}} \\
&\ge \frac{1}{\sum_{i=1}^{n-1}\limsup_{x\to+\infty}\dfrac{\Pr\bigl[(a_i+X)\prod_{j=1}^{i}Y_j > x\bigr]}{\Pr\bigl[(a_n+X)\prod_{j=1}^{i}Y_j > vx\bigr]\,\Pr\bigl[\prod_{j=i+1}^{n}Y_j > 1/v\bigr]}} \\
&= \frac{1}{\sum_{i=1}^{n-1}\limsup_{x\to+\infty}\dfrac{\Pr\bigl[X\prod_{j=1}^{i}Y_j > x\bigr]}{\Pr\bigl[X\prod_{j=1}^{i}Y_j > vx\bigr]\,\Pr\bigl[\prod_{j=i+1}^{n}Y_j > 1/v\bigr]}}
= +\infty > 1,
\end{align*}
since each of the limsup terms in the last denominator vanishes: the d.f. of the lognormal product $X\prod_{j=1}^{i}Y_j$ belongs to the class $\mathcal{R}_{-\infty}$, so that $\Pr\bigl[X\prod_{j=1}^{i}Y_j > x\bigr]\big/\Pr\bigl[X\prod_{j=1}^{i}Y_j > vx\bigr] \to 0$. Similarly,
\begin{align*}
\liminf_{x\to+\infty}\ \frac{\sum_{i=1}^{n}\Pr\bigl[a_i X\prod_{j=1}^{i}Y_j > x\bigr]}{\sum_{i=1}^{n-1}\Pr\bigl[a_i X\prod_{j=1}^{i}Y_j > x\bigr]}
&\ge \liminf_{x\to+\infty} \frac{\Pr\bigl[a_n X\prod_{j=1}^{n}Y_j > x\bigr]}{\sum_{i=1}^{n-1}\Pr\bigl[a_i X\prod_{j=1}^{i}Y_j > x\bigr]} \\
&\ge \frac{1}{\sum_{i=1}^{n-1}\limsup_{x\to+\infty}\dfrac{\Pr\bigl[a_i X\prod_{j=1}^{i}Y_j > x\bigr]}{\Pr\bigl[a_n X\prod_{j=1}^{n}Y_j > x\bigr]}}
= +\infty > 1.
\end{align*}
Hence, we can prove (5.23) and (5.24) by substituting (5.21) and (5.22) into the left-hand side of (5.23) and (5.24), respectively.
Summary in Dutch (Samenvatting in het Nederlands)

Introduction
In this thesis we take a closer look at the reserving problem in the insurance world. Broadly speaking, a reserving exercise amounts to determining the present value of future claim payments. The expertise and accuracy with which this uncertain amount is established are therefore crucial for a company and its policyholders. Moreover, the intrinsic uncertainties involved may not serve as an excuse to forgo a scientifically well-founded analysis. Interests and priorities may differ among all those who have to deal with reserve estimates. For management, the estimate must provide reliable information with which to maximize the viability and profitability of the company. For the regulator, who is concerned with solvency, the reserves must be determined conservatively in order to reduce the probability of bankruptcy. For the tax authorities, the reserves must reflect the actual payments as closely as possible. The policyholder, finally, wants the reserves to be sufficient to pay insured claims, but does not want to be penalized with too high a premium for that guarantee.

The main purpose of the reserving process can be described simply as follows. From a certain day, agreed upon in advance, an insurer is responsible for all incurred claims. The costs entailed by a claim are divided into two categories: those that have already been paid and those that have not yet been (fully) paid. The main goal of the reserving process is to estimate the costs that have not yet been paid by the company. The distribution of possible aggregate unpaid claims can be represented by a probability density function. Much has been written about the statistical distributions that are suitable for the study of risks and insurance. In practice one does not have full information on the underlying distributions, and must therefore often rely on limited information, such as estimates of the first moments of the distribution. Not only the basic risk measures, but also more sophisticated measures (such as skewness measures, extreme percentiles of the distribution, ...) that require a deeper insight into the underlying distribution, are of great importance. The computation of the first moments can be seen as a first attempt to learn more about the properties of a distribution. Moreover, the variance is not the most suitable risk measure for determining the solvency requirements of an insurance portfolio. Being a two-sided risk measure, it takes both positive and negative deviations into account, which will lead to an underestimation of the reserve in the case of a skewed distribution. In addition, this measure does not emphasize the tail properties of the distribution. Here it seems more appropriate to use the VaR (the $p$-th quantile), or even the TVaR (which essentially amounts to an average of all quantiles above a predefined level $p$). Risk measures based on stop-loss premiums (e.g. the expected shortfall) can also be employed in this context. Obtaining the distribution, from which all such measures can then be computed, is the ultimate goal. These trends are also reflected in the current banking and insurance regulations (Basel 2 and Solvency 2), which emphasize the risk-based approach in ALM. This calls for a new methodological approach that allows more sophisticated information about the underlying risks to be obtained.

In the current actuarial scientific literature little can be found on suitable methods for computing the distribution of reserve outcomes. Several methods exist to approximate the distribution of sums of independent risks efficiently (such as Panjer's recursion, convolution, ...). If, moreover, the number of risks in a portfolio is large enough, one can invoke the Central Limit Theorem to approximate the aggregate claims by a normal distribution. Even when the independence assumption is not satisfied (e.g. when the assumption of independence is rejected by statistical tests), this approximation is widely used in practice because of its mathematical simplicity. In a number of practical applications, however, the independence assumption is violated, which may lead to a significant underestimation of the risk of the portfolio. This is the case, among others, when the actuarial technical risk is combined with the financial investment risk.

In contrast to the banking world, the concept of stochastic interest rates has surfaced only recently in the insurance world. Traditionally, actuaries rely on deterministic interest rates. Such a simplification makes it possible to determine efficient risk measures (such as the mean, the standard deviation, upper quantiles, ...) of financial contracts. Because of the high uncertainty about future investment results, however, actuaries are forced to make conservative assumptions in computing insurance premiums and mathematical reserves. As a consequence, the diversification effects of returns in different investment periods cannot be taken into account. By this we mean that poor investment results in certain periods are usually compensated by very good results in other periods. These additional costs are passed on either to the insured, who must pay higher premiums, or to the shareholders, who must provide more economic capital. The importance of introducing models with stochastic interest rates is therefore well understood in the actuarial world. The latest banking and insurance regulations (Basel 2, Solvency 2) also underline this importance; these regulations emphasize the risk-based approach to determining economic capital. Projecting cash flows with stochastic returns is also important in the pricing of insurance applications such as the 'embedded value' (the present value of the cash flows generated by the policies in force) and the 'appraisal value' (the present value of the cash flows generated by the policies in force and by policies to be written in the future).

A mathematical description of the problem outlined above can be summarized as follows. Let $X_i$ ($i = 1,\ldots,n$) be a random amount that has to be paid at time $t_i$, and let $V_i$ be the discount factor over the period $[0, t_i]$. We then consider the present value of the future payments, which can be written as a scalar product of the form
\[
S = \sum_{i=1}^{n} X_i V_i. \qquad (N.1)
\]
The random vector $\vec{X} = (X_1, X_2, \ldots, X_n)$ may represent, for instance, the insurance or credit risk, while the vector $\vec{V} = (V_1, V_2, \ldots, V_n)$ represents the financial/investment risk. In general we assume that these vectors are mutually independent. In practical applications this independence assumption may occasionally be violated, for instance by an inflation factor with a strong influence on both payment and investment results. One can, however, tackle this problem by considering sums of the form
\[
S = \sum_{i=1}^{n} \widetilde{X}_i \widetilde{V}_i,
\]
where $\widetilde{X}_i = X_i/Z_i$ and $\widetilde{V}_i = V_i Z_i$ are the adjusted values expressed in real terms ($Z_i$ being an inflation factor over the period $[0, t_i]$). The independence assumption between the insurance risk and the financial risk is therefore realistic in many cases, and can be exploited efficiently to obtain various quantities that describe the risk in financial institutions (e.g. discounted claims, or the embedded/appraisal value of a company).

These distribution functions are typically complex and intractable, for two important reasons. First of all, the distribution of a sum of random variables whose marginal distributions belong to a given distribution class does not, in general, belong to that class itself. Secondly, the stochastic dependence between the terms of the sum precludes the use of convolution and makes the problem considerably more complicated. Consequently, approximation methods for computing functions of sums of dependent variables become necessary. In many cases one can of course use Monte Carlo simulation to obtain empirical distribution functions. This is, however, typically a time-consuming approach, in particular when one wishes to approximate tail probabilities, which requires a large number of simulations. One must therefore look for new, alternative methods. In this thesis we study and evaluate the approximation techniques most frequently used for insurance applications.
The central idea in this work is the concept of comonotonicity. We propose to solve the problem set out above by computing lower and upper bounds for the sum of dependent variables, making use of the available information. These bounds are based on a general technique for computing lower and upper bounds for stop-loss premiums of a sum of dependent variables, as set out in Kaas et al. (2000).

The first approximation for the distribution function of the discounted reserve is derived by approximating the dependence structure between the random variables involved by a comonotonic dependence structure. In this way the multidimensional problem is reduced to a two-dimensional one, which can be solved by conditioning and by simple numerical techniques. This approach is plausible in actuarial applications, since it leads to prudent and conservative values of the reserves and solvency margins. If the underlying dependence structure is strong enough, this upper bound in convex order yields satisfactory results.
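The comonotonic upper bound makes quantiles particularly easy to evaluate, because the quantile function of a sum of comonotonic risks is additive: $F_{S^c}^{-1}(p) = \sum_i F_i^{-1}(p)$. A minimal sketch for lognormal marginals (the parameter values below are purely illustrative, not taken from the thesis):

```python
from math import exp
from statistics import NormalDist

def comonotonic_quantile(params, p):
    """Quantile of the comonotonic sum of lognormal marginals.

    For comonotone terms the quantile function is additive:
    F_{S^c}^{-1}(p) = sum_i F_i^{-1}(p), and each lognormal quantile
    is exp(mu_i + sigma_i * Phi^{-1}(p)).
    """
    z = NormalDist().inv_cdf(p)
    return sum(exp(mu + s * z) for (mu, s) in params)

# Illustrative marginal parameters (mu_i, sigma_i), hypothetical values:
params = [(0.5, 0.3), (0.2, 0.4), (-0.1, 0.5)]
print(comonotonic_quantile(params, 0.95))
```

This additivity is exactly what reduces the multidimensional problem to simple one-dimensional computations, at the price of overstating the dependence (and hence the upper quantiles) when the true dependence is weaker than comonotonic.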
The second approximation, derived by taking conditional expectations, takes part of the dependence structure into account. This lower bound in convex order is very useful for evaluating the quality of the upper bound as an approximation, and can also be used as an approximation of the underlying distribution itself. Although this choice is not (actuarially) prudent, the relative error of this approximation is significantly smaller than that of the upper bound. The lower bound will therefore be preferred in applications that require a high accuracy of the approximations employed (such as the pricing of exotic options or strategic portfolio selection problems).

This thesis is organized as follows.

The first chapter reviews the basics of actuarial risk theory. We define some commonly used dependence measures and the most important risk ordering relations for actuarial applications. We further introduce several well-known risk measures and the relations that hold between them. The first chapter also provides a theoretical background for the concept of comonotonicity and reviews the most important properties of comonotonic risks.

In Chapter 2 we recall how the convex bounds can be derived, and we illustrate the theoretical results by means of an application concerning discounted reserves. The advantage of working with a sum of comonotonic variables lies in the easy computation of the distribution involved. In particular, this technique is very useful for obtaining reliable estimates of upper quantiles and stop-loss premiums.

In practical applications the upper bound is useful only if the dependence between successive terms of the sum is strong enough, and even then the resulting approximations for stop-loss premiums are not satisfactory. In this chapter we therefore propose a number of techniques for determining more efficient upper bounds for stop-loss premiums. To this end we use, on the one hand, the conditioning method as in Curran (1994) and in Rogers & Shi (1995) and, on the other hand, the traditional lower and upper bounds for stop-loss premiums of sums of dependent random variables. We also show how these results can be applied in the special case of lognormal random variables. Such sums are frequently encountered in practice, both in the actuarial and in the financial world.
We derive comonotonic approximations for the scalar product of random vectors of the form (N.1). A general procedure for computing accurate estimates of quantiles and stop-loss premiums is set out. We study the distribution function of the present value of a series of random payments in a stochastic financial environment described by a lognormal discounting process. Such distributions occur frequently in a wide range of insurance and financial applications. We obtain accurate approximations by developing lower and upper bounds in convex order for such present value functions. We consider several applications of discounted claim processes in the Black & Scholes setting. In particular, we analyse in detail the cases in which the random variables $X_i$ represent insurance claims modelled by lognormal, normal (more generally, elliptical) and gamma or inverse Gaussian (more generally, tempered stable) distributions. By means of a series of numerical illustrations we show that the method provides very accurate and easily obtainable approximations to distribution functions of random variables of the form (N.1).

In Chapters 3 and 4 we apply the results obtained to two important reserving problems in insurance, and we illustrate the approximations both numerically and graphically.

In Chapter 3 we consider an important application in the field of life insurance. We aim to obtain conservative estimates of quantiles and stop-loss premiums of a single annuity and of a whole portfolio of annuities. Similar techniques can be employed to obtain estimates for more general life insurance products. Our technique also allows personal finance problems to be solved very accurately.

The case of a portfolio of annuities has already been investigated extensively in the scientific literature, but only in the limiting case: for homogeneous portfolios, when the mortality risk is fully diversified. The applicability of these results in insurance practice can, however, be questioned, in particular here, since a typical portfolio does not contain enough policies to speak of full diversification. We therefore propose to approximate the number of active policies in the successive years using a 'normal power' distribution, and to model the present value of the future payments as a scalar product of mutually independent vectors.
Chapter 4 focuses on the claims reserving problem. Correctly estimating the amount that a company must set aside to meet the liabilities (claims) that will arise in the future is an important task for insurance companies in order to obtain a correct picture of their obligations. The historical data needed to obtain estimates of future payments are usually presented as incremental payments in the form of a triangle. The aim is to complete this run-off triangle into a square, and possibly into a rectangle if estimates are required for development years for which no data appear in the triangle. To this end the actuary can make use of a number of techniques. The intrinsic uncertainty is described by the distribution of possible outcomes, and one always looks for the best estimate of the reserve.

Claims reserving deals with the determination of the uncertain present value of an unknown amount of future payments. Since this amount is very important for an insurance company and its policyholders, the intrinsic uncertainties are no excuse for setting aside a scientific analysis. For the reserve estimate to truly represent the actuary's best estimate, both the determination of the expected value of the unpaid claims and the appropriate discount rate must reflect the actuary's best estimate (by which we mean that they should not be imposed by others or by legislation). Since the reserve is a provision for the future payments of unsettled claims, we believe that the estimated claims reserve should reflect the time value of money. In many situations the discounted reserve is useful, e.g. in dynamic financial analysis, profit determination and pricing, risk capital, loss portfolio transfers, .... Ideally, the discounted reserve should also be acceptable for reporting purposes; current legislation, however, usually does not allow this. Undiscounted reserves in fact contain a certain risk margin, depending on the level of the interest rate. In this chapter we consider the discounted IBNR reserve and impose an implicit margin based on a risk measure of the distribution of the total discounted reserve. We model the claim payments using lognormal linear models, loglinear location-scale models and generalized linear models, and derive accurate comonotonic approximations for the discounted reserve.

The bootstrap technique has proved very useful in many statistical applications, and can be particularly interesting for determining the variability of the claims predictions and for constructing upper bounds at a suitable confidence level. Its popularity is due to a combination of computing power and theoretical development. An advantage of the bootstrap approach is that it can be applied to any data set, without assuming an underlying distribution. Moreover, most software can handle very large numbers of bootstrap iterations.
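The bootstrap idea can be sketched generically as follows. This is a simplified toy version for illustration, not the specific bootstrap procedure of the thesis, and all inputs below (residuals, fitted values) are hypothetical:

```python
import random
from statistics import mean, stdev

random.seed(0)

def bootstrap_reserve(residuals, fitted_future, n_boot=1000):
    """Generic residual bootstrap: resample residuals with replacement,
    add them to the fitted future values and record the total reserve."""
    totals = []
    for _ in range(n_boot):
        totals.append(sum(f + random.choice(residuals) for f in fitted_future))
    return totals

# Toy inputs (illustrative only): residuals of a fitted reserving model
# and fitted values for the future cells of the run-off triangle.
residuals = [-1.2, 0.4, 0.8, -0.3, 0.1, 1.1, -0.9]
fitted_future = [10.0, 8.0, 6.5, 4.0]
totals = bootstrap_reserve(residuals, fitted_future)
print(mean(totals), stdev(totals))
```

The spread of the bootstrapped totals then gives a distribution-free estimate of the prediction variability, and its empirical upper quantiles can serve as upper bounds at a chosen confidence level.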
In Chapter 5 we derive other methods to obtain approximations for S. We also briefly recall and evaluate some existing techniques. In the first section of this chapter we recall two well-known moment-based approximations: the lognormal and the inverse gamma approximation. Practitioners often use a moment-based lognormal approximation for the distribution of S. These approximations are chosen such that their first two moments coincide with the corresponding moments of S.
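For the lognormal case, matching the first two moments has a closed-form solution: if S has mean m and variance v, the parameters follow directly from the lognormal moment formulas. A small sketch (the numbers are illustrative):

```python
import numpy as np

def lognormal_moment_match(mean, var):
    """Parameters (mu, sigma) of a lognormal distribution whose first
    two moments equal the given mean and variance."""
    sigma2 = np.log(1.0 + var / mean**2)
    mu = np.log(mean) - 0.5 * sigma2
    return mu, np.sqrt(sigma2)

# Check: the matched lognormal reproduces the target moments exactly.
m, v = 500.0, 40.0**2
mu, sigma = lognormal_moment_match(m, v)
mean_back = np.exp(mu + 0.5 * sigma**2)
var_back = (np.exp(sigma**2) - 1.0) * np.exp(2 * mu + sigma**2)
print(mean_back, var_back)  # → 500.0, 1600.0 (up to rounding)
```

The inverse gamma approximation works analogously, solving its two parameters from the same pair of moment equations.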
Although the comonotonic approximations in the convex order have proved to be good approximations when the underlying variability is small, they perform considerably worse when the variance increases. Therefore we look here at approximations for functions of sums of dependent random variables based on asymptotic results. Although asymptotic results are valid at infinity, they can also serve as approximations near infinity. We derive some asymptotic results for the tail probability of a sum of heavy-tailed dependent random variables.
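The flavour of such tail asymptotics is easy to illustrate in the simplest setting of independent subexponential summands, where classically P(X1 + ... + Xn > x) ~ n P(X1 > x) as x grows (the "single big jump"): the sum is large essentially because one term is large. The Monte Carlo check below uses independent Pareto variables purely for illustration; the thesis treats the harder dependent case:

```python
import numpy as np

rng = np.random.default_rng(2)

# Pareto(alpha) tails: P(X > x) = x**(-alpha) for x >= 1 (heavy-tailed).
alpha, n_terms, n_sim = 1.5, 3, 1_000_000
x = 50.0

# Simulate sums of independent Pareto variables via the inverse transform.
u = rng.random((n_sim, n_terms))
samples = u ** (-1.0 / alpha)          # Pareto(alpha) samples on [1, inf)
tail_mc = np.mean(samples.sum(axis=1) > x)

# Single-big-jump asymptotic: P(S > x) ~ n * P(X > x).
tail_asym = n_terms * x ** (-alpha)
print(tail_mc, tail_asym)
```

Already at a moderate threshold the two numbers agree in order of magnitude, with the asymptotic value slightly below the simulated one because the finite-x contribution of the smaller terms is ignored.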
Since 1990 applied Bayesian research has grown enormously among statisticians. This explosion has had little to do with a growing interest of statisticians and econometricians in the theoretical foundations of Bayesian analysis, or with a sudden awareness of the advantages of the Bayesian approach over frequentist methods; its basis is above all pragmatic. The development of powerful computational tools (and the realization that existing statistical tools can be useful for fitting Bayesian models) has attracted a large number of researchers to the Bayesian approach in practice. The use of such methods allows researchers to estimate complicated statistical models that would be rather difficult, if not impossible, to handle with standard frequentist techniques. In this section we sketch, in fairly general terms, the basic elements of Bayesian computation. Bayesian inference amounts to fitting a probability model to a data set and summarizing the result by a probability distribution on the model parameters and on unobserved quantities such as predictions for new observations. Simple simulation methods exist to draw samples from the posterior and predictive distributions, in which uncertainty about the model parameters is automatically taken into account. An advantage of the Bayesian approach is that, using simulation, we can always compute the posterior predictive distribution, so that we do not have to spend much energy estimating the sampling distribution of test statistics.
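How parameter uncertainty propagates into predictions can be shown with a minimal conjugate sketch. The model and all numbers below are illustrative, not the reserving models of the thesis: log claim amounts are taken as Normal with unknown mean, known variance, and a Normal prior, so the posterior is available in closed form and the posterior predictive distribution is obtained by two nested draws:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data: log claim amounts, modelled as Normal(theta, sigma2)
# with known sigma2 and a conjugate Normal(mu0, tau2) prior on theta.
y = np.array([4.8, 5.1, 4.9, 5.3, 5.0])
sigma2 = 0.04            # known observation variance (assumed)
mu0, tau2 = 5.0, 1.0     # prior mean and variance (assumed)

# Conjugate update: the posterior of theta is Normal(mu_n, tau2_n).
n = y.size
tau2_n = 1.0 / (1.0 / tau2 + n / sigma2)
mu_n = tau2_n * (mu0 / tau2 + y.sum() / sigma2)

# Posterior predictive by simulation: draw theta from its posterior,
# then a new observation given theta, so that parameter uncertainty
# is carried through automatically.
theta = rng.normal(mu_n, np.sqrt(tau2_n), size=100_000)
y_new = rng.normal(theta, np.sqrt(sigma2))
print(mu_n, y_new.mean(), y_new.var())
```

The predictive variance comes out close to sigma2 + tau2_n, i.e. observation noise plus the remaining parameter uncertainty; in non-conjugate models the same two-stage simulation is done with MCMC draws of the parameters.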
Finally, we compare these approximations with the comonotonic approximations of the previous chapter in the context of the claims reserving problem. When the underlying variance of the statistical and financial parts of the discounted IBNR reserve becomes larger, the comonotonic approximations perform poorly. We illustrate this with a simple example and propose the asymptotic results of the previous chapter as an alternative. We also compare all these results with the lognormal moment-based approximations. Lastly, we examine the distribution of the discounted reserve when the data in the run-off triangle are modelled by a generalized linear model, and compare the results of the comonotonic approximations with the Bayesian approximations.