Past and present trends
in aggregate claims analysis
Gordon E. Willmot
Munich Re Professor of Insurance
Department of Statistics and Actuarial Science University of Waterloo
1st Quebec-Ontario Workshop on Insurance Mathematics January 28, 2011 Montreal, Quebec
• goal of talk is a discussion of modelling and analysis of aggregate claims on a portfolio of business
– historical perspective; techniques and complexity of models have changed over time
– models interdisciplinary
∗ credit risk
∗ operational risk
∗ profit analysis
• modelling incorporates two basic components
– random number of “events” of interest (frequency)
– each event generates a random quantity of interest (severity)
• main goal is to aggregate these quantities
– fixed period of time; aggregate claims analysis
– tracking of behaviour over time; surplus analysis
• complexity of models increased greatly recently
– attempt to realistically model quantities
• historical description of aggregate model
– interested in the evaluation of aggregate claims distribution function (df) $G(y) = 1 - \bar{G}(y) = \Pr(S \le y)$ where
$$S = Y_1 + Y_2 + \cdots + Y_N$$
$N$ = number of claims, $Y_i$ = amount of $i$-th claim
– traditionally, $\{Y_1, Y_2, \ldots\}$ assumed to be an iid sequence, independent of $N$
– also of interest is stop-loss random variable
$$(S - y)_+ = \max(S - y, 0)$$
∗ stop-loss premium $R_1(y) = E\{(S - y)_+\} = \int_y^\infty \bar{G}(t)\,dt$
• original approaches to evaluation involved parametric approximations applied to G(y) directly
– easy to use, typically requires simple quantities such as moments
– questionable accuracy, particularly in right tail
– difficult to incorporate changes in individual policy characteristics such as deductibles and maximums
• commonly used approximations
– normal-based
∗ normal approximations give light right tail
∗ normal power, Haldane’s, and Wilson-Hilferty all assume h(S) is normally distributed for some h(·)
– gamma-based
∗ Beekman-Bowers, translated gamma
– exponential approximations
∗ motivated by ruin theory (compound geometric)
∗ includes Cramér-Lundberg asymptotic formula, Tijms, De Vylder’s method
∗ light right tail
– subexponential approximations
∗ heavy right tail
∗ often based on extreme value arguments
– Esscher’s method
∗ surprisingly good accuracy
∗ gave rise to Esscher transform (applied probability and mathematical finance; change of measure)
∗ adopted by statistical community for approximating distribution of sample statistics (which involve sums of independent random variables)
∗ often referred to as saddlepoint approximations, exponential tilting
• numerical procedures
– simulation
∗ used in the 1970s
∗ advantage in that complicated models may be used
∗ disadvantage in that right tail may be inaccurate unless many values used
∗ disadvantage in that difficult to modify assumptions at individual claim level
– transform inversion techniques
∗ FFT in discrete case
∗ complex inversion based on aggregate pgf
∗ “black box”
∗ may require discretization of claim size distribution
– continuous inversion approaches
∗ Heckman-Meyers (characteristic function, piecewise constant density)
∗ Laplace transform inversion (much recent progress in queueing community)
– recursions
∗ computation of (discretized) probability mass function of S recursively, beginning with Pr(S = 0)
∗ let $p_n = \Pr(N = n)$ satisfy $p_n = \left(a + \frac{b}{n}\right) p_{n-1}$ for $n = 2, 3, \ldots$
∗ Panjer-type recursion
$$\Pr(S = y) = \{p_1 - (a + b)p_0\} \Pr(Y = y) + \sum_{x=0}^{y} \left(a + \frac{bx}{y}\right) \Pr(Y = x) \Pr(S = y - x)$$
– includes most of the basic compound models, e.g., Poisson, binomial, negative binomial, logarithmic series
– extensions to other models as well
– simple to understand and use
– compound Poisson due to Euler, Adelson in queueing context, Panjer (1981) in actuarial context
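The recursion can be sketched in a few lines for the compound Poisson case. This is a minimal illustration, not the talk's own code: the function name `compound_poisson_pmf` is hypothetical, the claim size distribution is assumed to be already discretized on 0, 1, 2, ..., and the Poisson values a = 0, b = λ are used, so that p1 = (a + b)p0 and the first term of the recursion vanishes. The x = 0 term of the sum is moved to the left-hand side, which gives the divisor 1 − a Pr(Y = 0).

```python
import math

def compound_poisson_pmf(lam, fy, ymax):
    """Return [Pr(S = y) for y = 0..ymax] for Poisson(lam) claim counts.

    fy[x] = Pr(Y = x) is an assumed discretized claim size pmf on 0..len(fy)-1.
    """
    a, b = 0.0, lam                        # Poisson: p_n = (a + b/n) p_{n-1}
    g = [math.exp(-lam * (1.0 - fy[0]))]   # Pr(S = 0) = P(Pr(Y = 0)), P the Poisson pgf
    for y in range(1, ymax + 1):
        total = 0.0
        for x in range(1, min(y, len(fy) - 1) + 1):
            total += (a + b * x / y) * fy[x] * g[y - x]
        g.append(total / (1.0 - a * fy[0]))  # x = 0 term solved for Pr(S = y)
    return g

# Usage: claims of size 1 or 2 (plus a mass at 0), expected claim count 1.5
pmf = compound_poisson_pmf(1.5, [0.2, 0.5, 0.3], 20)
```

Each value Pr(S = y) costs at most len(fy) multiplications, which is the practical appeal of the recursion over direct convolution.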
• individual policy modifications
– deductibles, maximums, and coinsurance on each claim
– easy to incorporate with numerical procedures such as recursions
– statistically, deductibles involve “left truncation” on loss sizes and “thinning” of loss numbers
– maximums involve “right censoring”; coinsurance results in scale changes, both on loss sizes
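As a small illustration of these modifications on a single loss, the sketch below applies a maximum, a deductible and coinsurance, and thins a Poisson loss rate. The function names are hypothetical, and the order of operations (cap the loss first, then subtract the deductible, then scale) is an assumption; actual policy wording may differ.

```python
import math

def per_loss_payment(loss, deductible=0.0, maximum=math.inf, coinsurance=1.0):
    """Payment on one loss under the assumed order: cap, deduct, scale."""
    capped = min(loss, maximum)                 # right censoring at the maximum
    excess = max(capped - deductible, 0.0)      # nothing is paid below the deductible
    return coinsurance * excess                 # scale change from coinsurance

def thinned_poisson_rate(lam, survival_at_deductible):
    """Expected number of payments when only losses above the deductible count."""
    return lam * survival_at_deductible

# Usage: exponential losses with mean 1000, deductible 500, Poisson rate 10
rate = thinned_poisson_rate(10.0, math.exp(-500.0 / 1000.0))
paid = per_loss_payment(1500.0, deductible=500.0, maximum=1000.0, coinsurance=0.8)
```

The thinned rate together with the left-truncated payment distribution can then be fed into a numerical procedure such as a recursion.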
• trends in aggregate loss modelling in last quarter century
– removal of independent and/or identically distributed assumptions
∗ possible due to mathematical and computational advances
∗ claim count dependencies
· time series models
· dependence through latent variables as in mixed Poisson and MAP models
∗ claim size dependencies
· MAP models, mixtures as in credibility
· strong dependency concepts such as comonotonicity
∗ dependence between claim sizes and numbers in dependent Sparre Andersen (renewal) risk models
∗ removal of identically distributed assumption
· discounted aggregate claims incorporating inflation
· claim sizes independent but depend on time of occurrence
· (mixed) Poisson process allows reduction to iid case
– stronger inter-disciplinary influences
∗ phase-type assumptions borrowed from queueing theory
· greater flexibility even for simple models
· very useful for complex models (fluid flow techniques)
∗ Gerber-Shiu techniques for option pricing and Esscher transform analysis in mathematical finance
∗ wide variety of probabilistic, statistical, and applied mathematical tools used in risk analysis
– use of ‘semiparametric’ distributional assumptions
∗ phase-type distributions, combinations of exponentials, mixture of Erlangs
∗ all are dense in the class of distributions on $\mathbb{R}^+$, hence flexible
∗ use of these models involves a hybrid of analytic and numerical approaches
∗ semi-parametric nature makes estimation nontrivial
∗ can be numerical root-finding difficulties with phase-type distributions (location of eigenvalues for calculation of matrix-exponentials) and combinations of exponentials (partial fraction expansions on Laplace transforms)
∗ phase-type distributions
· absorption time in time-homogeneous Markov chain
· particularly useful for fairly complex stochastic models (advantage over other two classes) as well as for simple models
· calculations of most quantities of interest straightforward
· disadvantages
1) knowledge of matrix calculus needed
2) often necessary to assume that all components of model are of phase-type
• mixed Erlang distributions
– huge class of distributions
∗ includes class of phase-type distributions
∗ includes many distributions whose membership in class is not obvious from definition
– extremely useful for simple risk models
∗ model for claim sizes
∗ all quantities of interest computed easily using infinite series (even finite time ruin probabilities)
∗ no root finding needed
∗ only requires use of simple algebra
– present discussion mainly from Willmot and Woo (2007) and Willmot and Lin (2011)
– mathematical introduction
∗ Erlang-$j$ pdf for $j = 1, 2, \ldots$, $\beta > 0$
$$e_j(y) = \frac{\beta (\beta y)^{j-1} e^{-\beta y}}{(j-1)!}, \quad y > 0$$
∗ mixed Erlang pdf
$$f(y) = \sum_{j=1}^{\infty} q_j e_j(y), \quad y > 0$$
where $\{q_1, q_2, \ldots\}$ is a discrete probability measure on $\{1, 2, \ldots\}$
∗ includes Erlang-$j$ as special case $q_j = 1$, and exponential as special case $q_1 = 1$; for many class members, $\{q_j; j = 1, 2, 3, \ldots\}$ is most easily expressed through its probability generating function (pgf)
· let $Q(z) = \sum_{j=1}^{\infty} q_j z^j$ be the pgf, then the mixed Erlang Laplace transform (LT) is
$$\tilde{f}(s) = \int_0^\infty e^{-sy} f(y)\,dy = Q\left(\frac{\beta}{\beta + s}\right),$$
implying that $f(y)$ is itself a compound pdf with an exponential secondary pdf, or exponential “phases” in queueing terminology
· if the LT may be put in this form then the distribution is a mixed Erlang
– loss model properties
∗ tail $\bar{F}(y) = \int_y^\infty f(x)\,dx$ is given by
$$\bar{F}(y) = e^{-\beta y} \sum_{k=0}^{\infty} \bar{Q}_k \frac{(\beta y)^k}{k!} = \sum_{k=1}^{\infty} \frac{\bar{Q}_{k-1}}{\beta} e_k(y), \quad \text{where } \bar{Q}_k = \sum_{j=k+1}^{\infty} q_j$$
∗ for value at risk (VaR) or quantiles: at level $p$, VaR $= v_p$ where $\bar{F}(v_p) = 1 - p$, and $v_p$ is easily obtained numerically
∗ asymptotic Lundberg type formula available for $\bar{F}(x)$ via compound distribution representation
∗ moments
$$\int_0^\infty y^k f(y)\,dy = \beta^{-k} \sum_{j=1}^{\infty} q_j \frac{(k+j-1)!}{(j-1)!}$$
∗ excess loss (residual lifetime) pdf (payment per payment basis with a deductible of $x$) still of mixed Erlang form
$$f_x(y) = \frac{f(x+y)}{\bar{F}(x)} = \sum_{j=1}^{\infty} q_{j,x}\, e_j(y), \quad \text{with } q_{j,x} = \frac{\sum_{i=j}^{\infty} q_i \frac{(\beta x)^{i-j}}{(i-j)!}}{\sum_{m=0}^{\infty} \bar{Q}_m \frac{(\beta x)^m}{m!}}$$
– force of mortality (failure or hazard rate) $\mu(y) = f(y)/\bar{F}(y)$ satisfies $\mu(0) = \beta q_1$, $\mu(\infty) = \beta\left(1 - z_0^{-1}\right)$ where $z_0$ is the radius of convergence of $Q(z)$ ($\mu(\infty) = \beta$ for finite mixtures), and $\mu(y) \le \beta$ (dominates exponential in failure rate and hence stochastic order)
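– equilibrium distribution (useful in tail classification and ruin theory) still of mixed Erlang form
$$f_e(y) = \frac{\bar{F}(y)}{\int_0^\infty \bar{F}(x)\,dx} = \sum_{j=1}^{\infty} q_j^* e_j(y), \quad \text{with } q_j^* = \frac{\bar{Q}_{j-1}}{\sum_{k=1}^{\infty} k q_k}$$
– mean excess loss (mean residual lifetime) is $r(y) = \int_0^\infty t f_y(t)\,dt$ (also reciprocal of equilibrium failure rate)
$$r(y) = \frac{1}{\mu_e(y)} = \frac{\sum_{j=0}^{\infty} \bar{Q}_j^* \frac{(\beta y)^j}{j!}}{\beta \sum_{j=0}^{\infty} q_{j+1}^* \frac{(\beta y)^j}{j!}}, \quad \text{with } \bar{Q}_j^* = \sum_{k=j+1}^{\infty} q_k^*$$
∗ also, $r(0) = \int_0^\infty y f(y)\,dy = \sum_{j=1}^{\infty} j q_j/\beta$ is the mean, $r(\infty) = z_0/\{\beta (z_0 - 1)\}$ and $r(\infty) = 1/\beta$ if $z_0 = \infty$, and $r(y) \ge 1/\beta$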
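The tail, mean and VaR formulas translate almost directly into code for a finite mixture. The sketch below is illustrative only: the function names are hypothetical, the weights are passed as a finite list with q[j-1] holding q_j and are assumed to sum to 1, and VaR is found by plain bisection on the tail rather than by any particular method from the talk.

```python
import math

def mixed_erlang_tail(y, q, beta):
    """Fbar(y) = exp(-beta*y) * sum_k Qbar_k (beta*y)^k / k! for finite weights q."""
    qbar = sum(q)          # Qbar_0 = q_1 + q_2 + ...
    term = 1.0             # (beta*y)^k / k!, starting at k = 0
    total = 0.0
    for k in range(len(q)):
        total += qbar * term
        qbar -= q[k]       # Qbar_{k+1} = Qbar_k - q_{k+1}
        term *= beta * y / (k + 1)
    return math.exp(-beta * y) * total

def mixed_erlang_mean(q, beta):
    """Mean = sum_j j*q_j / beta (the k = 1 moment formula)."""
    return sum(j * qj for j, qj in enumerate(q, start=1)) / beta

def mixed_erlang_var(p, q, beta, tol=1e-10):
    """Solve Fbar(v_p) = 1 - p by bisection; the tail is decreasing in y."""
    lo, hi = 0.0, 1.0
    while mixed_erlang_tail(hi, q, beta) > 1.0 - p:
        hi *= 2.0          # expand until the quantile is bracketed
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mixed_erlang_tail(mid, q, beta) > 1.0 - p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Usage: equal mixture of Erlang-1 and Erlang-3 with beta = 2, 99% VaR
v99 = mixed_erlang_var(0.99, [0.5, 0.0, 0.5], 2.0)
```

For an infinite mixture the same code applies after truncating the weights, at the cost of a truncation error in the tail.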
– aggregate claims with mixed Erlang claim sizes
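∗ let $\{c_0, c_1, c_2, \ldots\}$ have the compound pgf
$$C(z) = \sum_{n=0}^{\infty} c_n z^n = P\{Q(z)\}$$
where $P(z) = E(z^N) = \sum_{n=0}^{\infty} p_n z^n$, and $\{c_0, c_1, c_2, \ldots\}$ is itself a compound distribution which may often be computed recursively
∗ aggregate claims LT
$$\int_0^\infty e^{-sy}\,dG(y) = P\{\tilde{f}(s)\} = P\left\{Q\left(\frac{\beta}{\beta + s}\right)\right\} = C\left(\frac{\beta}{\beta + s}\right)$$
∗ for stop-loss moments ($k = 1 \Rightarrow$ stop-loss premium)
$$R_k(y) = e^{-\beta y} \sum_{n=0}^{\infty} r_{n,k} \frac{(\beta y)^n}{n!} = \sum_{n=1}^{\infty} \frac{r_{n-1,k}}{\beta} e_n(y), \quad \text{where } r_{n,k} = \beta^{-k} \sum_{j=1}^{\infty} c_{n+j} \frac{\Gamma(k+j)}{\Gamma(j)}$$
(valid for all $k \ge 0$, and $R_0(y) = \bar{G}(y)$)
∗ for TVaR,
$$E(S \mid S > x) = x + \frac{\sum_{j=0}^{\infty} \bar{C}_j^* \frac{(\beta x)^j}{j!}}{\beta \sum_{j=0}^{\infty} c_{j+1}^* \frac{(\beta x)^j}{j!}}$$
where $c_j^* = \bar{C}_{j-1} / \sum_{k=1}^{\infty} k c_k$ and $\bar{C}_j^* = \sum_{k=j+1}^{\infty} c_k^*$, with $\bar{C}_j = \sum_{n=j+1}^{\infty} c_n$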
also simpler asymptotic formulas (as x → ∞) for VaR and TVaR using Lundberg light-tailed approach
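A minimal numerical sketch of the stop-loss moment and TVaR formulas follows, assuming Poisson claim counts so that the compound coefficients c_n come from a Panjer recursion. The function names, the Poisson assumption and the truncation of all series at a finite nmax are choices made for this illustration. TVaR is computed as x + R_1(x)/R_0(x), which is algebraically the same as the ratio of series in the TVaR formula.

```python
import math

def compound_poisson_coeffs(lam, q, nmax):
    """c_0..c_nmax for C(z) = exp(lam*(Q(z) - 1)), with q[j-1] = q_j (finite list)."""
    c = [math.exp(-lam)]
    for n in range(1, nmax + 1):
        s = sum(j * q[j - 1] * c[n - j] for j in range(1, min(n, len(q)) + 1))
        c.append(lam * s / n)              # Panjer recursion with a = 0, b = lam
    return c

def stop_loss_moment(y, c, beta, k):
    """R_k(y) = exp(-beta*y) * sum_n r_{n,k} (beta*y)^n / n!, truncated at len(c) - 1."""
    nmax = len(c) - 1
    total, term = 0.0, 1.0                 # term = (beta*y)^n / n!
    for n in range(nmax):
        # r_{n,k} = beta^{-k} * sum_j c_{n+j} * Gamma(k+j) / Gamma(j)
        r = sum(c[n + j] * math.exp(math.lgamma(k + j) - math.lgamma(j))
                for j in range(1, nmax - n + 1)) / beta ** k
        total += r * term
        term *= beta * y / (n + 1)
    return math.exp(-beta * y) * total

def tvar(x, c, beta):
    """E(S | S > x) = x + R_1(x) / R_0(x), since R_0 is the aggregate tail."""
    return x + stop_loss_moment(x, c, beta, 1) / stop_loss_moment(x, c, beta, 0)

# Usage: Poisson(2) claim counts, exponential claims with mean 2 (q_1 = 1, beta = 0.5)
c = compound_poisson_coeffs(2.0, [1.0], 80)
premium = stop_loss_moment(5.0, c, 0.5, 1)   # stop-loss premium at retention 5
```

The log-gamma form of the ratio Γ(k+j)/Γ(j) is used only to avoid overflow for large j; nmax should be chosen large enough that the discarded c_n are negligible.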
– nontrivial examples of Erlang mixtures
∗ many distributions of mixed Erlang form, after changing the scale parameter
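∗ identity for Laplace transforms
$$\frac{\beta_1}{\beta_1 + s} = \frac{\frac{\beta}{\beta + s}\,\frac{\beta_1}{\beta}}{1 - \left(1 - \frac{\beta_1}{\beta}\right)\frac{\beta}{\beta + s}}$$
∗ for $\beta_1 < \beta$ this expresses the well known result that a zero-truncated geometric sum of exponential random variables is again exponential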
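The identity is a one-line algebraic check, and expanding its right-hand side as a geometric series in z = β/(β + s) shows the mixing weights explicitly. A short verification using only the symbols of the identity:

```latex
\begin{align*}
1 - \Bigl(1 - \tfrac{\beta_1}{\beta}\Bigr)\frac{\beta}{\beta+s}
  &= \frac{(\beta + s) - (\beta - \beta_1)}{\beta+s}
   = \frac{\beta_1 + s}{\beta+s}, \\[4pt]
\frac{\frac{\beta}{\beta+s}\,\frac{\beta_1}{\beta}}{(\beta_1+s)/(\beta+s)}
  &= \frac{\beta_1}{\beta_1+s}, \\[4pt]
\frac{\frac{\beta_1}{\beta}\,z}{1 - \bigl(1-\frac{\beta_1}{\beta}\bigr)z}
  &= \sum_{j=1}^{\infty} \frac{\beta_1}{\beta}
     \Bigl(1-\frac{\beta_1}{\beta}\Bigr)^{j-1} z^{j},
  \qquad z = \frac{\beta}{\beta+s}.
\end{align*}
```

So for β1 < β an exponential with rate β1 is a mixed Erlang with scale β and zero-truncated geometric weights q_j = (β1/β)(1 − β1/β)^{j−1}, j = 1, 2, ....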
– Example 1 (mixture of two exponentials)
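∗ suppose that (without loss of generality) $\beta_1 < \beta_2$, $0 < p < 1$, and
$$f(y) = p\beta_1 e^{-\beta_1 y} + (1 - p)\beta_2 e^{-\beta_2 y}, \quad y > 0$$
∗ then $\tilde{f}(s) = p\frac{\beta_1}{\beta_1 + s} + (1 - p)\frac{\beta_2}{\beta_2 + s}$, and using the identity with $\beta$ replaced by $\beta_2$, it follows that
$$\tilde{f}(s) = \frac{\beta_2}{\beta_2 + s}\left\{(1 - p) + \frac{p\frac{\beta_1}{\beta_2}}{1 - \left(1 - \frac{\beta_1}{\beta_2}\right)\frac{\beta_2}{\beta_2 + s}}\right\}$$
∗ that is, $\tilde{f}(s) = Q\left(\frac{\beta_2}{\beta_2 + s}\right)$ where
$$Q(z) = z\left\{(1 - p) + \frac{p\frac{\beta_1}{\beta_2}}{1 - \left(1 - \frac{\beta_1}{\beta_2}\right)z}\right\},$$
i.e., $q_1 = (1 - p) + p\frac{\beta_1}{\beta_2}$, and $q_j = p\frac{\beta_1}{\beta_2}\left(1 - \frac{\beta_1}{\beta_2}\right)^{j-1}$ for $j = 2, 3, \ldots$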
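Example 1 can be checked numerically by comparing the two-exponential density with the truncated mixed Erlang series built from the weights q_j. This is only a sanity-check sketch; the function names and the truncation point jmax are assumptions of the illustration.

```python
import math

def hyperexp_as_mixed_erlang(p, beta1, beta2, jmax):
    """Weights q_1..q_jmax of Example 1 (requires beta1 < beta2), truncated at jmax."""
    r = beta1 / beta2
    q = [(1.0 - p) + p * r]                                   # q_1
    q += [p * r * (1.0 - r) ** (j - 1) for j in range(2, jmax + 1)]
    return q

def mixed_erlang_pdf(y, q, beta):
    """f(y) = sum_j q_j e_j(y), using e_{j+1}(y) = e_j(y) * beta*y / j."""
    total = 0.0
    term = beta * math.exp(-beta * y)                         # e_1(y)
    for j, qj in enumerate(q, start=1):
        total += qj * term
        term *= beta * y / j
    return total

# Usage: p = 0.3, rates 1 and 3; the two densities should agree at any y > 0
q = hyperexp_as_mixed_erlang(0.3, 1.0, 3.0, 200)
direct = 0.3 * 1.0 * math.exp(-1.0 * 0.7) + 0.7 * 3.0 * math.exp(-3.0 * 0.7)
series = mixed_erlang_pdf(0.7, q, 3.0)
```

Because the weights decay geometrically at rate 1 − β1/β2, the truncation error is small unless β1 is much smaller than β2.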
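– Example 2 (countable mixture of Erlangs)
∗ suppose that
$$f(y) = \sum_{i=1}^{n} \sum_{k=1}^{\infty} p_{ik} \frac{\beta_i (\beta_i y)^{k-1} e^{-\beta_i y}}{(k-1)!}$$
∗ assuming that $\beta_i < \beta_n$ for $i < n$, the identity may be used with $\beta_1$ replaced by $\beta_i$ and $\beta$ by $\beta_n$ for each $i = 1, 2, \ldots, n$, to express the Laplace transform
$$\tilde{f}(s) = \sum_{i=1}^{n} \sum_{k=1}^{\infty} p_{ik} \left(\frac{\beta_i}{\beta_i + s}\right)^k$$
in the form $\tilde{f}(s) = Q\left(\frac{\beta_n}{\beta_n + s}\right)$, with
$$q_j = \sum_{i=1}^{n} \sum_{k=1}^{j} p_{ik} \binom{j-1}{k-1} \left(\frac{\beta_i}{\beta_n}\right)^k \left(1 - \frac{\beta_i}{\beta_n}\right)^{j-k}, \quad j = 1, 2, \ldots$$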
in the following example, the distribution is not necessarily of phase-type or a combination of exponentials, and there is no simple representation for the qj’s in general, but they may be obtained numerically in a straightforward manner
– Example 3 (a sum of gammas)
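∗ consider the Laplace transform of $f(y)$ given by
$$\tilde{f}(s) = \prod_{i=1}^{n} \left(\frac{\beta_i}{\beta_i + s}\right)^{\alpha_i},$$
corresponding to the distribution of the sum $X_1 + X_2 + \cdots + X_n$, with the $X_i$’s being independent random variables, and $X_i$ has the gamma pdf $\beta_i (\beta_i y)^{\alpha_i - 1} e^{-\beta_i y}/\Gamma(\alpha_i)$
∗ we assume that the $\alpha_i$’s are positive (not necessarily integers), but the sum $m = \sum_{i=1}^{n} \alpha_i$ is assumed to be a positive integer
∗ assuming that $\beta_i < \beta_n$ for $i < n$, it follows that $\tilde{f}(s) = Q\left(\frac{\beta_n}{\beta_n + s}\right)$ where
$$Q(z) = z^m \prod_{i=1}^{n-1} \left\{\frac{\frac{\beta_i}{\beta_n}}{1 - \left(1 - \frac{\beta_i}{\beta_n}\right) z}\right\}^{\alpha_i}$$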
∗ the probabilities $\{q_1, q_2, \ldots\}$ correspond to convolutions of negative binomial probabilities, shifted to the right by $m$
∗ simple analytic formulas for $\{q_1, q_2, \ldots\}$ may be derived in some cases, such as when $\alpha_i = 1$ for all $i$ or when $n = 2$
∗ in general, however, it follows that $q_j = 0$ for $j < m$, $q_m = \prod_{i=1}^{n-1} (\beta_i/\beta_n)^{\alpha_i}$, and $\{q_{m+1}, q_{m+2}, \ldots\}$ may be computed using the Panjer-type recursion
$$q_j = \frac{1}{j - m} \sum_{k=1}^{j-m} \left\{\sum_{i=1}^{n-1} \alpha_i \left(1 - \frac{\beta_i}{\beta_n}\right)^k\right\} q_{j-k}, \quad j = m+1, m+2, \ldots$$
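The recursion for the sum-of-gammas weights is short enough to sketch directly. The function name is hypothetical, and the code assumes that the last rate in `betas` is the largest (so it plays the role of β_n) and that the shape parameters sum to an integer m, as in the example.

```python
def sum_of_gammas_weights(alphas, betas, jmax):
    """Return a list q with q[j] = q_j for j = 0..jmax (q[0] is an unused placeholder).

    Assumes betas[-1] is the largest rate and sum(alphas) is a positive integer m.
    """
    m = round(sum(alphas))
    if abs(sum(alphas) - m) > 1e-9 or m < 1:
        raise ValueError("sum of the shape parameters must be a positive integer")
    bn = betas[-1]
    ratios = [b / bn for b in betas[:-1]]          # beta_i / beta_n for i < n
    q = [0.0] * (jmax + 1)                         # q_j = 0 for j < m
    qm = 1.0
    for a, r in zip(alphas[:-1], ratios):
        qm *= r ** a                               # q_m = prod (beta_i / beta_n)^alpha_i
    q[m] = qm
    # coef[k] = sum_i alpha_i * (1 - beta_i/beta_n)^k, the inner sum of the recursion
    coef = [sum(a * (1.0 - r) ** k for a, r in zip(alphas[:-1], ratios))
            for k in range(jmax + 1)]
    for j in range(m + 1, jmax + 1):
        q[j] = sum(coef[k] * q[j - k] for k in range(1, j - m + 1)) / (j - m)
    return q

# Usage: two gammas with non-integer shapes 0.5 and 1.5 (m = 2), rates 1 and 4
q = sum_of_gammas_weights([0.5, 1.5], [1.0, 4.0], 400)
```

With αi = 1 and n = 2 the output reduces to a shifted geometric sequence, which gives a quick check of the implementation.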
– applications in ruin and surplus analysis
∗ Sparre Andersen (renewal) risk model
∗ mixed Erlang claim sizes
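∗ let $h_\delta(x)$ be the ‘discounted’ (with parameter $\delta \ge 0$) density of the surplus immediately prior to ruin with zero initial surplus
∗ geometric parameter $\phi_\delta = \int_0^\infty h_\delta(x)\,dx$, $0 < \phi_\delta < 1$
∗ ladder height pdf
$$b_\delta(y) = \int_0^\infty f_x(y) \left\{\frac{h_\delta(x)}{\phi_\delta}\right\} dx$$
∗ Laplace transform of the time of ruin (ruin probability is special case $\delta = 0$) is the compound geometric tail
$$\bar{G}_\delta(x) = \sum_{n=1}^{\infty} (1 - \phi_\delta)\, \phi_\delta^n\, \bar{B}_\delta^{*n}(x), \quad x \ge 0$$
∗ in mixed Erlang claim size case with $f(y) = \sum_{j=1}^{\infty} q_j e_j(y)$, $f_x(y)$ is also mixed Erlang, in turn implying that the ladder height pdf
$$b_\delta(y) = \sum_{j=1}^{\infty} q_j(\delta)\, e_j(y)$$
is still mixed Erlang, with LT $\tilde{b}_\delta(s) = Q_\delta\left(\frac{\beta}{\beta + s}\right)$ where
$$Q_\delta(z) = \sum_{j=1}^{\infty} q_j(\delta)\, z^j$$
∗ hence, define the discrete compound geometric pgf
$$C_\delta(z) = \sum_{n=0}^{\infty} c_n(\delta) z^n = \frac{1 - \phi_\delta}{1 - \phi_\delta Q_\delta(z)},$$
and the previous results imply that
$$\bar{G}_\delta(x) = e^{-\beta x} \sum_{j=0}^{\infty} \bar{C}_j(\delta) \frac{(\beta x)^j}{j!} = \sum_{j=1}^{\infty} \frac{\bar{C}_{j-1}(\delta)}{\beta} e_j(x), \quad \text{where } \bar{C}_j(\delta) = \sum_{n=j+1}^{\infty} c_n(\delta)$$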
∗ explicit expressions for the mixed Erlang mixing weights are available for some interclaim time distributions (e.g. Coxian), in which case recursive numerical evaluation is straightforward
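Once the geometric parameter and the ladder height weights are known, the compound geometric tail is a short computation. The sketch below takes φ and the weights q_j(δ) as given inputs (obtaining them is model specific and not shown), uses hypothetical function names, and truncates the series at nmax. The coefficient recursion follows from multiplying out the compound geometric pgf: c_0 = 1 − φ and c_n = φ Σ_{j=1}^{n} q_j c_{n−j}.

```python
import math

def compound_geometric_coeffs(phi, q, nmax):
    """c_0..c_nmax for C(z) = (1 - phi) / (1 - phi*Q(z)), with q[j-1] = q_j."""
    c = [1.0 - phi]
    for n in range(1, nmax + 1):
        s = sum(q[j - 1] * c[n - j] for j in range(1, min(n, len(q)) + 1))
        c.append(phi * s)
    return c

def compound_geometric_tail(x, c, beta):
    """Gbar(x) = exp(-beta*x) * sum_j Cbar_j (beta*x)^j / j!, truncated at len(c) - 1."""
    cbar = 1.0 - c[0]          # Cbar_0 = c_1 + c_2 + ...
    term = 1.0                 # (beta*x)^j / j!
    total = 0.0
    for j in range(len(c) - 1):
        total += cbar * term
        cbar -= c[j + 1]       # Cbar_{j+1} = Cbar_j - c_{j+1}
        term *= beta * x / (j + 1)
    return math.exp(-beta * x) * total

# Usage: exponential ladder heights (q_1 = 1), phi = 0.8, beta = 1, initial surplus 5
c = compound_geometric_coeffs(0.8, [1.0], 400)
psi = compound_geometric_tail(5.0, c, 1.0)
```

In this exponential special case the result can be compared with the closed form φ exp{−β(1 − φ)x}, which is a convenient check of the truncation level.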
∗ deficit at ruin given initial surplus $x$ (relevant quantity for risk management decisions), denoted by $|U_T|$, has mixed Erlang pdf (given that ruin occurs)
$$h_x(y) = \sum_{m=1}^{\infty} p_{m,x}\, e_m(y)$$
where the distribution $\{p_{1,x}, p_{2,x}, \ldots\}$ is given by
$$p_{m,x} = \frac{\sum_{j=m}^{\infty} q_j(0)\, \tau_{j-m}(\beta x)}{\sum_{j=1}^{\infty} q_j(0) \sum_{i=0}^{j-1} \tau_i(\beta x)}, \quad \text{and } \tau_n(x) = \sum_{i=0}^{\infty} c_i(0) \frac{x^{i+n}}{(i+n)!}$$
∗ more generally, it can be shown that
$$E\left\{e^{-\delta T} w(|U_T|)\, I(T < \infty) \mid U_0 = x\right\} = \sum_{m=1}^{\infty} R_{m,\delta}\, e_m(x)$$