Disability Frequencies in Life Insurance
Bernhard König¹, Frank Weber¹, Mario V. Wüthrich²
Abstract: For the prediction of disability frequencies, not only the observed claims but also the incurred but not yet reported (IBNyR) claims have to be taken into account. In the present paper we discuss an IBNyR model which allows us to incorporate information from an external economic factor. The aim is to find an economic factor which precedes the disability frequency in the sense that its time-shifted process is strongly correlated with the occurrence of disability claims. This economic factor then serves as a pre-indicator to improve the quality of the disability frequency prediction.
1 Introduction
We consider the prediction of disability claims and frequencies in life insurance. A disability claim occurs if an insured becomes partially or completely unable to work for an extended time period and neither reactivates nor dies within a given waiting period (the waiting period is sometimes also called the deferred period or qualification period). Such a disability claim may be caused by sickness or an accident. As a consequence, the insured receives disability benefits according to the insurance contract and his degree of disability.
¹ AXA Winterthur, General-Guisan-Str. 40, Postfach 357, 8401 Winterthur, Switzerland
² ETH Zurich, RiskLab, Department of Mathematics, 8092 Zurich,
The claims frequency is an essential driver of the disability risk of a given portfolio. The prediction of this claims frequency, however, is complicated by the late reporting and recording of claims. This problem arises from the delay between the occurrence of a claim and its recording in the insurer’s administrative system, i.e. the point at which it becomes available for statistical analysis. Such delays, which can be months or sometimes even years, may have several causes:
• One essential cause for late reporting is the uncertainty whether the work incapacity due to either sickness or an accident leads to a long term disability. Another reason is that usually claims are not reported before the end of the waiting period.
• Once the insurer is aware of a (potential) claim, it needs to be verified whether the insured is actually entitled to any disability benefits. Frequently this process includes a comprehensive medical investigation which is often time-consuming.
Disregarding this late reporting and recording problem would lead to a systematic underestimation of disability frequencies and therefore to an underestimation of the disability risk. For this reason it is important to predict the number of those disability cases which have already occurred but are still not recorded in the insurer’s administrative system. This corresponds to the well-known task of estimating the number of incurred but not yet reported (IBNyR) claims in non-life insurance.
The analysis of this prediction problem is usually based on a so-called run-off scheme. We consider the matrix $\{N_{i,j};\ 0 \le i, j \le I\}$, where $N_{i,j}$ denotes the number of disability cases
• which have occurred in time period $i$ and
• which are recorded in the insurer’s administrative system during the accounting period $k = i + j$, i.e. which have a recording delay $j$.
Thus, the rows $i$ of this matrix correspond to the occurrence periods of the claims, whereas the columns $j$ correspond to their reporting and recording delays. The last row $i = I$ of the scheme refers to the latest complete time period before the statistics are established. Hence, the upper run-off triangle
$$\mathcal{D}_I = \{N_{i,j};\ i + j \le I\} \tag{1.1}$$
contains the numbers of the observed disability cases. In the following, the number $I$ is always assumed to be sufficiently large, such that all claims have been reported after $I$ time periods. For our numerical analysis we consider two different time periods (intervals): (1) years and (2) quarters, see Section 3 below.
A numerical example is given in Table 1. Apart from the recorded disability cases $N_{i,j} \in \mathcal{D}_I$, this scheme also contains the corresponding numbers $s_i$ of insured lives in occurrence periods $i$ (as an exposure and volume measure). The time periods in Table 1 are calendar years. Note that we have shifted the occurrence period labeling from $\{0, \ldots, I\}$ (with $I = 11$) to $\{1997, \ldots, 1997 + I\}$ because in the sequel this will allow us to relate these occurrence periods to economic factors. The prediction of the number of cases which have already occurred but are not yet recorded in the insurer’s administrative system corresponds to the task of predicting the outcome of the random variables in the lower run-off triangle
$$\mathcal{D}_I^c = \{N_{i,j};\ i + j > I,\ 0 \le i, j \le I\}. \tag{1.2}$$
To this end we rely on three different models:
1. Poisson model. This method, which is frequently used in non-life insurance, allows for the modeling of occurrence periods $i$ and recording delays $j$. Dependencies in the accounting periods $k = i + j$ are not considered. Note, however, that these accounting period dependencies are crucial for the prediction of disability frequencies. For this reason, the Poisson model is not appropriate for our purposes.
2. Verbeek model. This is an alternative method for the prediction of disability frequencies. It allows for the modeling of recording delays $j$ and accounting periods $k = i + j$, but it neglects dependencies on occurrence periods $i$.
3. Full model. This is a generalization of both the Poisson model and the Verbeek model. It allows for the modeling of all three directions $i$, $j$ and $k = i + j$.
The crucial point in our analysis is that the accounting period
dependencies are strongly correlated with preceding economic factors. The inclusion of such economic factors into our model will improve the quality of the disability frequency predictions. Therefore, we concentrate on the Full Model in the analysis, while the Poisson and Verbeek models only serve as benchmarks.
Organization of the paper. In Section 2 we first present the three different models mentioned above. Additionally, the basic properties of these methods are discussed. In Section 3 the models are applied to the data presented in Table 1. Here we consider time periods of a year and a quarter. In Section 4 it is shown how the quality of disability frequency predictions can be improved by the consideration of additional external information, such as preceding economic factors. In our concluding Section 5 we give a summary of the results and propose further possible steps.

Table 1: Recorded disability cases by occurrence period i (rows) and reporting delay j (columns), with underlying exposure s_i
i        0     1     2     3     4     5     6     7     8     9    10    11       s_i
1997  2016  3560  1049   316   130    52    16     8     4     0     0     0    480199
1998  1774  3660  1049   361    97    74    70    12     4     0     0          502661
1999  2292  3493  1019   405   125    62    20    12     0     8                515803
2000  1968  4081  1291   426   121    31     8     0     8                      536556
2001  2511  5070  1598   387    70    55    12     4                            582452
2002  2850  5933  1504   262    90    47    16                                  601253
2003  3304  5476  1090   285    94    47                                        609116
2004  2738  5031  1008   320    90                                              591749
2005  2617  4297  1242   293                                                    600378
2006  2086  4457   930                                                          622947
2007  2144  3746                                                                627236
2008  2379                                                                      669942
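To make the run-off scheme concrete, the following sketch loads the Table 1 counts into a matrix with the unobserved lower triangle left empty. It is only an illustration: the names `cases`, `exposure` and `build_triangle` are hypothetical, and the 1997 row assumes that the fused token "84000" in the extracted table stands for the five delays 8, 4, 0, 0, 0.

```python
import numpy as np

# Table 1: recorded cases per occurrence year 1997..2008 (rows) and delay j (columns).
# Assumption: the 1997 row ends with the delays 8, 4, 0, 0, 0.
cases = [
    [2016, 3560, 1049, 316, 130, 52, 16, 8, 4, 0, 0, 0],
    [1774, 3660, 1049, 361, 97, 74, 70, 12, 4, 0, 0],
    [2292, 3493, 1019, 405, 125, 62, 20, 12, 0, 8],
    [1968, 4081, 1291, 426, 121, 31, 8, 0, 8],
    [2511, 5070, 1598, 387, 70, 55, 12, 4],
    [2850, 5933, 1504, 262, 90, 47, 16],
    [3304, 5476, 1090, 285, 94, 47],
    [2738, 5031, 1008, 320, 90],
    [2617, 4297, 1242, 293],
    [2086, 4457, 930],
    [2144, 3746],
    [2379],
]
exposure = np.array([480199, 502661, 515803, 536556, 582452, 601253,
                     609116, 591749, 600378, 622947, 627236, 669942], dtype=float)


def build_triangle(rows):
    """Return the (I+1)x(I+1) matrix N and the mask of observed cells i + j <= I."""
    I = len(rows) - 1
    N = np.full((I + 1, I + 1), np.nan)  # NaN marks cells that are not yet observed
    for i, row in enumerate(rows):
        if len(row) != I - i + 1:
            raise ValueError(f"row {i} should have {I - i + 1} entries")
        N[i, :len(row)] = row
    i_idx, j_idx = np.indices(N.shape)
    return N, (i_idx + j_idx <= I)


N, observed = build_triangle(cases)

# Reported cases so far divided by exposure: a crude frequency that ignores IBNyR.
reported_so_far = np.nansum(N, axis=1)
crude_frequency = reported_so_far / exposure
```

The crude frequency of the most recent occurrence years is far below that of the fully developed years, which is exactly the systematic underestimation that the IBNyR prediction is meant to correct.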
2 The models
In this section we define the three models mentioned above. The first two models are used as benchmark models. Their parameters will be
estimated with maximum likelihood estimation (MLE) methods, whereas the third model (Full Model) will be put into a Bayesian framework.
2.1 Poisson model
Model 2.1 (Poisson Model) Assume that there are fixed exposures $s_i > 0$ and parameters $p_i, \gamma_j > 0$ ($i, j = 0, \ldots, I$) with $\sum_{j=0}^{I} \gamma_j = 1$ (normalization) such that the $N_{i,j}$ are independent Poisson distributed random variables with mean $s_i p_i \gamma_j$ for $i, j = 0, \ldots, I$.
The Poisson Model is well-established in non-life insurance claims reserving, see for example England-Verrall [2]. It has the following properties: The total number of claims in occurrence period $i$ up to reporting delay $j$ satisfies
$$C_{i,j} = \sum_{l=0}^{j} N_{i,l} \sim \mathrm{Poi}\Big(s_i p_i \sum_{l=0}^{j} \gamma_l\Big). \tag{2.1}$$
Hence, the expected total number of disability claims in occurrence period $i$ is given by
$$E[C_{i,I}] = s_i p_i \sum_{j=0}^{I} \gamma_j = s_i p_i. \tag{2.2}$$
This shows that the $p_i$’s are the disability frequencies and the $\gamma_j$’s give the reporting pattern. Parameter estimation is done with MLE methods, see Wüthrich-Merz [11], Section 2.3. The MLE’s for $p_i$ and $\gamma_j$, given the observations $\mathcal{D}_I$, can be found analytically, see Mack [6] and Wüthrich-Merz [11], Section 2.4: We initialize (using the normalization)
(2.3)
where the superscript $P$ is used for estimators from the Poisson Model. Then we obtain the MLE’s iteratively for $n = 1, \ldots, I$ by
(2.4)
The MLE predictors for $\mathcal{D}_I^c$ are then in the Poisson Model 2.1 given by the forecast (the notation $\hat{E}[\cdot]$ indicates that we estimate the mean $E[\cdot]$)
$$\hat{E}[N_{i,j} \mid \mathcal{D}_I] = s_i\, \hat{p}_i^{P}\, \hat{\gamma}_j^{P} \quad \text{for } i + j > I, \tag{2.5}$$
and the disability frequencies $p_i$ are estimated by the MLE’s $\hat{p}_i^{P}$. The choice of the Poisson distribution is justified by the fact that it may serve as an approximation to the binomial distribution for large portfolio size $s_i$ and small disability frequency $p_i$.
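The analytic MLE computation for the Poisson Model can be sketched in a few lines. The recursion below is derived directly from the Poisson score equations of Model 2.1 (observed row totals and column totals of the upper triangle equal their fitted means, together with the normalization of the $\gamma_j$); it is a minimal sketch under these assumptions rather than a transcription of the formulas in Mack [6] or Wüthrich-Merz [11], and the function name `poisson_mle` is hypothetical.

```python
import numpy as np


def poisson_mle(N, s):
    """MLEs (p_hat, gamma_hat) in the Poisson model with mean s_i * p_i * gamma_j.

    N : (I+1, I+1) array; only cells with i + j <= I are read.
    s : (I+1,) array of exposures.
    """
    N = np.asarray(N, dtype=float)
    s = np.asarray(s, dtype=float)
    I = N.shape[0] - 1
    p = np.zeros(I + 1)
    gamma = np.zeros(I + 1)
    for n in range(I + 1):
        # Row n is observed for delays 0..I-n. The pattern of the later delays
        # I-n+1..I is already estimated (empty sum for n = 0), so the share
        # reported so far is 1 minus that tail.
        reported_share = 1.0 - gamma[I - n + 1:].sum()
        p[n] = N[n, :I - n + 1].sum() / (s[n] * reported_share)
        # Column I-n is observed for rows 0..n, whose frequencies are now known.
        gamma[I - n] = N[:n + 1, I - n].sum() / (s[:n + 1] * p[:n + 1]).sum()
    return p, gamma


def poisson_forecast(N, s):
    """Fill the lower triangle i + j > I with s_i * p_hat_i * gamma_hat_j."""
    p, gamma = poisson_mle(N, s)
    fitted = np.outer(np.asarray(s, dtype=float) * p, gamma)
    I = len(p) - 1
    lower = np.add.outer(np.arange(I + 1), np.arange(I + 1)) > I
    return np.where(lower, fitted, np.asarray(N, dtype=float))
```

The estimated pattern sums to one by construction of the score equations, so the estimate `p[n]` can be read as reported cases per exposure, scaled up by the share of claims expected to be reported so far.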
2.2 Verbeek model
Model 2.2 (Verbeek [9] Model) Assume that there are parameters $\lambda_k > 0$ ($k = 0, \ldots, 2I$) and $\gamma_j > 0$ ($j = 0, \ldots, I$) with $\sum_{j=0}^{I} \gamma_j = 1$ (normalization) such that the $N_{i,j}$ are independent Poisson distributed random variables with mean $\lambda_{i+j} \gamma_j$ for $i, j = 0, \ldots, I$.
In the Verbeek Model the expected total number of disability claims $E[C_{i,I}] = s_i p_i$ from the Poisson Model, see (2.2), is replaced by the expression
$$E[C_{i,I}] = \sum_{j=0}^{I} \lambda_{i+j}\, \gamma_j. \tag{2.6}$$
If there are no accounting period effects, i.e. $\lambda_k = \lambda$ for all $k$, then we are in the Poisson Model 2.1 with $s_i p_i = \lambda$. Hence, the Verbeek Model 2.2 allows for the modeling of accounting period effects $\lambda_k$. The parameters are again estimated with MLE methods. They can be calculated analytically given $\mathcal{D}_I$, see Verbeek [9]: We initialize (using the normalization)
(2.7)
where the superscript $V$ is used for estimators from the Verbeek Model. Then we iterate for $n = I - 1, \ldots, 0$
(2.8)
Note that the information in the Verbeek Model is not sufficient to predict the IBNyR claims $\mathcal{D}_I^c$. For the prediction of $N_{i,j}$, $i + j > I$, we need an estimate for $\lambda_{i+j}$, which unfortunately is not given by (2.7)-(2.8) for $i + j > I$. Often in practice $\lambda_k$, for $k > I$, is estimated by linear regression from $\hat{\lambda}_0^{V}, \ldots, \hat{\lambda}_I^{V}$. We denote the resulting linear regression estimates by $\hat{\lambda}_k^{V}$ and hence obtain the Verbeek predictor of $N_{i,j}$ by
(2.9)
(2.10)
for given fixed exposures $s_i > 0$.
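The Verbeek estimation and the linear extrapolation of the accounting period parameters can be sketched as follows. The recursion is derived from the Poisson score equations of Model 2.2 (observed diagonal totals and column totals equal their fitted means) and runs from the latest diagonal backwards; it is an illustrative sketch, not a transcription of the formulas in Verbeek [9], and all function names are hypothetical.

```python
import numpy as np


def verbeek_mle(N):
    """MLEs (lambda_0..lambda_I, gamma_0..gamma_I) for mean lambda_{i+j} * gamma_j.

    N : (I+1, I+1) array; only cells with i + j <= I are read.
    """
    N = np.asarray(N, dtype=float)
    I = N.shape[0] - 1
    lam = np.zeros(I + 1)
    gamma = np.zeros(I + 1)
    for n in range(I, -1, -1):
        # Accounting period n contains delays 0..n; the pattern for the later
        # delays n+1..I is already estimated (empty sum for n = I).
        diagonal_total = sum(N[n - j, j] for j in range(n + 1))
        lam[n] = diagonal_total / (1.0 - gamma[n + 1:].sum())
        # Delay n is observed in accounting periods n..I.
        gamma[n] = N[:I - n + 1, n].sum() / lam[n:].sum()
    return lam, gamma


def extrapolate_lambda(lam):
    """Least-squares line through (k, lambda_k), k = 0..I, evaluated at k = I+1..2I."""
    I = len(lam) - 1
    slope, intercept = np.polyfit(np.arange(I + 1), lam, 1)
    return intercept + slope * np.arange(I + 1, 2 * I + 1)


def verbeek_forecast(N):
    """Fill the lower triangle i + j > I with lambda_hat_{i+j} * gamma_hat_j."""
    lam, gamma = verbeek_mle(N)
    lam_all = np.concatenate([lam, extrapolate_lambda(lam)])
    I = len(lam) - 1
    forecast = np.array(N, dtype=float)
    for i in range(I + 1):
        for j in range(I - i + 1, I + 1):
            forecast[i, j] = lam_all[i + j] * gamma[j]
    return forecast
```

Note that the quality of the forecast rests entirely on the extrapolated $\lambda_k$, $k > I$; a straight line is the simplest choice and cannot anticipate turning points.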
2.3 Full model
For the Full Model we take a Bayesian point of view. The Bayesian point of view has several advantages, e.g. rather simple numerical algorithms lead to the Bayesian predictors and we do not only get point predictors for the variables of interest, but full posterior distributions that allow for Bayesian inference on parameters.
Model 2.3 (Full Model) Assume there exist fixed exposures $s_i > 0$, $i = 0, \ldots, I$.
• For given parameters $\Theta = (p_0, \ldots, p_I, \gamma_0, \ldots, \gamma_I, \lambda_0, \ldots, \lambda_{2I})$ the random variables $N_{i,j}$ are independent Poisson distributed with mean $s_i p_i \lambda_{i+j} \gamma_j$ for $i, j = 0, \ldots, I$.
• All the components of $\Theta$ are independent and positive $P$-a.s. Moreover, $\lambda_k$ has a prior normalization, i.e. $E[\lambda_k] = 1$ for $k = 0, \ldots, 2I$.
Remarks.
• Conditionally given the parameters $\Theta$, we have an independent Poisson cells model that allows the modeling of all three directions: occurrence period $p_i$, reporting delay $\gamma_j$ and accounting period $\lambda_k$ with $k = i + j$.
• Because we do not know the true parameters $\Theta$, we choose a prior distribution for $\Theta$ that reflects our knowledge and parameter uncertainty. Then we can do Bayesian inference on $\Theta$, given the observations $\mathcal{D}_I$.
• If we have no accounting period effects, we set $\lambda_k \equiv 1$ and then we are in the Bayesian version of the Poisson Model 2.1, see also England et al. [3]. Our aim is to see whether $\lambda_k$ differs from 1, i.e. whether we have accounting period effects.
• Note that $\gamma_j$ is not normalized and therefore $p_i$ is not a disability frequency as in the Poisson Model 2.1.
The Bayesian predictor in the Full Model 2.3 for $i + j > I$ is given by
$$E[N_{i,j} \mid \mathcal{D}_I] = s_i\, E[p_i \gamma_j \mid \mathcal{D}_I]\, E[\lambda_{i+j}] = s_i\, E[p_i \gamma_j \mid \mathcal{D}_I], \tag{2.11}$$
$\lambda_k$ being independent of $\mathcal{D}_I$ for $k > I$. The estimator for the disability frequency is then given by
(2.12)
Note that the last term on the right-hand side cannot be calculated analytically any further because there is an implied posterior dependence between $p_i$ and $\gamma_j$, given $\mathcal{D}_I$. Moreover, a Bayesian inference analysis for the accounting period parameter $\lambda_k$, $k \le I$, allows us to check:
is $E[\lambda_k \mid \mathcal{D}_I]$ approximately equal to 1? (2.13)
Application of the Full Model in practice.
In order to apply the Full Model 2.3 we need to specify the prior distributions for $\Theta$. Often there is no canonical choice for the prior distributions. Therefore, one often makes a choice that allows for an easy inference analysis in (2.11) and (2.13). We choose independent gamma distributions for all components of $\Theta$ with prior means
(2.14)
with $\hat{\gamma}_j^{P}$ given by the MLE’s of the Poisson Model 2.1 and with coefficients of variation still to be determined.
The posterior distribution of $\Theta$, given $\mathcal{D}_I$, can then be calculated numerically. The two most popular methods are Gibbs sampling and the Metropolis-Hastings algorithm, both of which are Markov chain Monte Carlo (MCMC) methods. Because these methods are well-established in the literature, see Asmussen-Glynn [1], Gilks et al. [4] and Scollnik [8], we do not discuss these simulation methods further here. In our analysis we have used the Metropolis-Hastings [7, 5] algorithm for the MCMC simulation in a similar fashion as in Wüthrich [10]. For the explicit implementation we therefore refer to the latter reference.
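For orientation, a generic Metropolis-Hastings sampler for the Full Model 2.3 might look as follows. This is not the implementation of Wüthrich [10]: the block random-walk proposal on the log scale, the step size, the parameter ordering and the function names are all assumptions made for this sketch. Gamma priors are parameterized by their mean and coefficient of variation (shape $= 1/\mathrm{Vco}^2$, rate $=$ shape/mean). Only $\lambda_0, \ldots, \lambda_I$ are sampled, because $\lambda_k$ for $k > I$ does not enter the likelihood of $\mathcal{D}_I$ and keeps its prior distribution.

```python
import numpy as np


def log_posterior(log_theta, N, s, obs_i, obs_j, prior_mean, prior_vco):
    """Unnormalized log posterior of log(Theta), Theta = (p, gamma, lambda_0..lambda_I)."""
    I = N.shape[0] - 1
    theta = np.exp(log_theta)
    p, gamma, lam = theta[:I + 1], theta[I + 1:2 * I + 2], theta[2 * I + 2:]
    # Poisson log-likelihood over the observed cells i + j <= I (constants dropped).
    mu = s[obs_i] * p[obs_i] * lam[obs_i + obs_j] * gamma[obs_j]
    log_lik = np.sum(N[obs_i, obs_j] * np.log(mu) - mu)
    # Independent gamma priors given by mean and coefficient of variation.
    shape = 1.0 / prior_vco ** 2
    rate = shape / prior_mean
    log_prior = np.sum((shape - 1.0) * log_theta - rate * theta)
    # The last term is the Jacobian of the change of variables theta = exp(log_theta).
    return log_lik + log_prior + np.sum(log_theta)


def metropolis_hastings(N, s, prior_mean, prior_vco, n_iter=2000, step=0.02, seed=0):
    """Random-walk Metropolis-Hastings; returns an (n_iter, 3(I+1)) array of draws."""
    N = np.asarray(N, dtype=float)
    s = np.asarray(s, dtype=float)
    prior_mean = np.asarray(prior_mean, dtype=float)
    I = N.shape[0] - 1
    obs_i, obs_j = np.nonzero(np.add.outer(np.arange(I + 1), np.arange(I + 1)) <= I)
    rng = np.random.default_rng(seed)
    x = np.log(prior_mean)  # start the chain at the prior means
    lp = log_posterior(x, N, s, obs_i, obs_j, prior_mean, prior_vco)
    draws = np.empty((n_iter, x.size))
    for t in range(n_iter):
        proposal = x + step * rng.standard_normal(x.size)
        lp_prop = log_posterior(proposal, N, s, obs_i, obs_j, prior_mean, prior_vco)
        # Symmetric proposal: accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = proposal, lp_prop
        draws[t] = np.exp(x)
    return draws
```

In practice one would discard a burn-in phase and tune `step` (or update the components one at a time) until the acceptance rate is reasonable; posterior means such as $E[\lambda_k \mid \mathcal{D}_I]$ are then estimated by column averages of the retained draws.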
3 Disability frequency estimation: an example
In our initial analysis we study the data given in Table 1. These data comprise the observed (reported) number of disability claims over the observation period from 1997 until 2008 together with the exposure $s_i$ that counts the number of persons insured. For confidentiality reasons the data were scaled by a factor. In a first step we have applied the three models, the Poisson Model 2.1, the Verbeek Model 2.2 and the Full Model 2.3, to yearly (y) and quarterly (q) data. This gives the disability frequency estimates
$$\hat{p}_i^{P|y},\ \hat{p}_i^{P|q},\ \hat{p}_i^{V|y},\ \hat{p}_i^{V|q},\ \hat{p}_i^{F|y},\ \hat{p}_i^{F|q}, \tag{3.1}$$
where the first upper index denotes the method and the second the period length.
3.1 Poisson and Verbeek models
Figure 1: Disability frequency estimates $\hat{p}_i^{P|y}$, $\hat{p}_i^{P|q}$ from the Poisson model
Figure 1 gives the disability frequency estimates for the Poisson Model 2.1. We observe that the estimates become more volatile for shorter time periods. This already indicates that the Poisson model is not appropriate on short time periods because the parameter uncertainties are too large and because the Poisson model reacts too sensitively to small numbers of observations. Moreover, if we considered monthly data, we would observe strong seasonal effects which are smoothed out in the quarterly and yearly view.
The same figure for the Verbeek Model 2.2, see Figure 2, gives a much more stable picture. We see that the accounting period parameter $\lambda_k$ is
Figure 2: Disability frequency estimates $\hat{p}_i^{V|y}$, $\hat{p}_i^{V|q}$ from the Verbeek model
If we compare the quarterly disability frequency estimates $\hat{p}_i^{P|q}$ and $\hat{p}_i^{V|q}$ we see that the estimation levels are very similar, see Figure 3. The difference in the late occurrence periods (2007-2008) comes from the fact that the linear regression in the Verbeek method is more conservative than the Poisson method.
Overall we see a decrease in the disability frequencies between 2003 and 2008, which is mainly due to the recovery of the economy after the financial crisis of 2000-2002. This already gives a first hint that the disability frequencies follow economic factors with some delay.
3.2 Full model
In order to apply the Full Model 2.3 we need to specify the parameters of the prior gamma distributions of $\Theta$. This was already partly done in (2.14). Furthermore, expert judgment says
(3.2)
Hence, the coefficient of variation of $\lambda_k$ remains to be specified. We choose five different values:
(3.3)
Then we calculate numerically $\hat{p}_i^{F|y}$ and $\hat{p}_i^{F|q}$ for these five values of $\mathrm{Vco}(\lambda_k)$. The results for the estimated quarterly disability frequencies $\hat{p}_i^{F|q}$ are provided in Figure 4. We see that the different choices of the coefficient of variation only have an influence on the recent occurrence periods where only little information is available.
Moreover, we can compare the different methods (Poisson, Verbeek and Full Model). The results for the quarterly frequencies $\hat{p}_i^{P|q}$, $\hat{p}_i^{V|q}$, $\hat{p}_i^{F|q}$ with $\mathrm{std}(\lambda_k) = 0.2$ are presented in Figure 5.
Figure 3: Quarterly disability frequency estimates $\hat{p}_i^{P|q}$, $\hat{p}_i^{V|q}$ from the Poisson and Verbeek model
Figure 4: Quarterly disability frequency estimates $\hat{p}_i^{F|q}$ from the Full Model with different standard deviations for $\lambda_k$, see (3.3)
One might question this comparison because in the Poisson and the Verbeek models we use MLE for the parameter estimation and in the Full Model we use Bayesian inference methods. However, we would like to emphasize that these two estimation methods often give very similar results, see also the analysis in England et al. [3].
Figure 5: Comparison of the quarterly disability frequency estimates $\hat{p}_i^{F|q}$, $\hat{p}_i^{P|q}$, $\hat{p}_i^{V|q}$ for the three models. For the
In addition, the MCMC simulation method gives the whole posterior distribution of the number of IBNyR claims at time $I$
$$\sum_{i + j > I} N_{i,j} \tag{3.4}$$
conditionally given $\mathcal{D}_I$. The empirical results for the quarterly data with $\mathrm{std}(\lambda_k) = 0.2$ are provided in Figure 6. That is, in the full Bayesian model we obtain the whole empirical posterior distribution for the number of IBNyR claims. This distribution now allows for the calculation of any risk measure, e.g. the Value-at-Risk for the number of disability claims on a 95% level.
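Given posterior draws, such a risk measure is simply an empirical quantile of the simulated IBNyR totals. The sketch below assumes that `lower_means` holds, for each MCMC draw, the Poisson means of the unobserved cells $i + j > I$; both function names are hypothetical.

```python
import numpy as np


def simulate_ibnyr_totals(lower_means, seed=0):
    """One simulated IBNyR total per posterior draw.

    lower_means : (n_draws, n_cells) array of Poisson means of the unobserved cells,
                  each row computed from one posterior draw of the parameters.
    """
    rng = np.random.default_rng(seed)
    # Adding Poisson noise on top of the parameter draws gives the full
    # predictive distribution, not just the distribution of the mean.
    return rng.poisson(np.asarray(lower_means, dtype=float)).sum(axis=1)


def value_at_risk(ibnyr_totals, level=0.95):
    """Empirical Value-at-Risk: the `level`-quantile of the simulated totals."""
    return float(np.quantile(np.asarray(ibnyr_totals, dtype=float), level))
```

For example, `value_at_risk(simulate_ibnyr_totals(lower_means))` gives the 95% Value-at-Risk of the number of IBNyR claims under the simulated posterior.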
4 Improvement of disability frequency predictions
Both in the Full Model 2.3 and in the Verbeek Model 2.2 we analyze the accounting period parameters $\lambda_k$, which are estimated in the Full Model by Bayesian inference as
$$\hat{\lambda}_k^{F} = E[\lambda_k \mid \mathcal{D}_I], \tag{4.1}$$
see (2.13), and in the Verbeek Model by the MLE’s $\hat{\lambda}_k^{V}$. We provide the Bayesian estimators $\hat{\lambda}_k^{F|y}$ and $\hat{\lambda}_k^{F|q}$ for $\mathrm{std}(\lambda_k) = 0.2$ in Figure 7. If we normalize both $\hat{\lambda}_k^{F|q}$ and $\hat{\lambda}_k^{V|q}$ to empirical mean zero and empirical variance 1 we obtain the results presented in Figure 8.
Figure 7: Comparison of the estimators $\hat{\lambda}_k^{F|y}$ and $\hat{\lambda}_k^{F|q}$ for $\mathrm{std}(\lambda_k) = 0.2$
Figure 8: Comparison between $\hat{\lambda}_k^{V|q}$ and $\hat{\lambda}_k^{F|q}$ (both normalized)
First conclusions. In Figure 8 we see a very similar behavior for $\lambda_k$, $k \le I$, in both models. Therefore, the main question is whether this accounting period pattern is related to economic factors. If it is, the knowledge of these economic factors may help to improve the estimates and predictions of the accounting period parameters $\lambda_k$, $k > I$.
In our analysis we have explored the following economic factors: unemployment rate, credit spread, SMI stock market index and sickness daily allowance index. The most convincing one in our analysis was the credit spread between corporate bonds and federal government bonds. For this credit spread we have observed high correlations with the accounting period parameter $\lambda_k$ and also an appropriate time lag (this is discussed below). Surprisingly, the unemployment rate did not have any predictive power for improving disability rate forecasts. The (quarterly) credit spread graph is provided in Figure 9. We especially see the high spreads in the financial distress periods after 2000 and in 2008-2009.
We denote the quarterly time series of these credit spreads by $(S_t)_{t \in \mathcal{T}}$, where $\mathcal{T}$ denotes the set of time points where the credit spread is available.
Questions.
• Is this credit spread $(S_t)_{t \in \mathcal{T}}$ correlated with $(\hat{\lambda}_k^{F|q})_{k = 0, \ldots, I}$?
• If yes, what is the optimal time shift (time lag) between $t$ and $k$?
• What can we learn from the credit spread $(S_t)_{t \in \mathcal{T}}$ for $(\hat{\lambda}_k^{F|q})_{k = I+1, \ldots, 2I}$?
Analysis 1.
In a first analysis we calculate the empirical correlations between
$$(S_{k-\Delta})_{k = 0, \ldots, I} \quad \text{and} \quad (\hat{\lambda}_k^{F|q})_{k = 0, \ldots, I}, \tag{4.2}$$
where $\Delta \in \{0, \ldots, 16\}$ denotes the time shift in quarters between these two time series. The empirical correlations as a function of $\Delta \in \{0, \ldots, 16\}$ are provided in Figure 10.
Figure 10: Empirical correlation between the credit spread $S_{k-\Delta}$ and $\hat{\lambda}_k^{F|q}$ as a function of the time shift $\Delta$ (in quarters)
We see that a time shift $\Delta$ of 5 quarters gives an empirical correlation of almost 70%. This suggests that the credit spread $(S_k)_k$ runs about 5 quarters ahead of the accounting period effects $(\hat{\lambda}_k^{F|q})_k$, and hence the knowledge of the credit spread should improve the disability frequency prediction.
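The shift scan of Analysis 1 can be sketched as follows. For simplicity the sketch assumes that the credit spread and the accounting period estimates are given on the same quarterly grid, so that a shift of $\Delta$ quarters drops the first $\Delta$ points; in the paper the spread may be available on a longer time set. The function name is hypothetical.

```python
import numpy as np


def lagged_correlations(spread, lam_hat, max_shift=16):
    """Empirical correlation between S_{k-shift} and lam_hat_k for shift = 0..max_shift.

    spread, lam_hat : 1-d arrays of equal length on the same quarterly grid.
    """
    spread = np.asarray(spread, dtype=float)
    lam_hat = np.asarray(lam_hat, dtype=float)
    if spread.shape != lam_hat.shape:
        raise ValueError("both series must live on the same time grid")
    corr = np.empty(max_shift + 1)
    for shift in range(max_shift + 1):
        # Pair lam_hat[k] with spread[k - shift] for k = shift, ..., K.
        earlier_spread = spread[:len(spread) - shift]
        later_lambda = lam_hat[shift:]
        corr[shift] = np.corrcoef(earlier_spread, later_lambda)[0, 1]
    return corr
```

The shift with the largest correlation, `int(np.argmax(lagged_correlations(spread, lam_hat)))`, is the candidate lead time of the spread over the accounting period effect. With short series the correlations at large shifts rest on few points and should be read with care.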
Analysis 2.
In a second analysis we make a linear regression Ansatz: Assume that there exist parameters $a$, $b$ such that
$$\hat{\lambda}_k^{F|q} = a + b\, S_{k-\Delta} + \varepsilon_k, \tag{4.3}$$
where $\varepsilon_k$ denotes the error term and $\Delta$ is again the time shift. Using the least squares method we can determine the optimal $a$ and $b$. We observe a maximal slope $\hat{b}$ and a minimal p-value (for the null hypothesis $b = 0$ under Gaussian error terms $\varepsilon_k$) again for a shift $\Delta$ of 5 quarters. The corresponding estimates are
(4.4)
and the p-value is less than 0.1%. Hence, from the observations $(S_k)_{k \le I}$ and with the shift of $\Delta = 5$ we obtain the linear regression estimators (see also Figure 11)
(4.5)
Figure 11: Regression line for $\hat{\lambda}_k^{F|q}$, see (4.3)-(4.5)
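The lagged regression and the resulting forecasts of future accounting period parameters can be sketched as follows. The sketch again assumes that both series share one quarterly grid, computes only the least-squares coefficients (no p-value), and uses hypothetical function names.

```python
import numpy as np


def fit_lagged_regression(spread, lam_hat, shift):
    """Least-squares fit of lam_hat_k = a + b * S_{k-shift} + error; returns (a, b)."""
    spread = np.asarray(spread, dtype=float)
    lam_hat = np.asarray(lam_hat, dtype=float)
    x = spread[:len(spread) - shift]  # S_{k-shift}
    y = lam_hat[shift:]               # lam_hat_k for k = shift, ..., K
    b, a = np.polyfit(x, y, 1)        # polyfit returns the slope first
    return a, b


def forecast_lambda(spread, a, b, shift):
    """Predict lambda_k for the next `shift` periods after the last observed one.

    With a lead of `shift` quarters, the spreads observed up to the last period K
    determine the regression forecasts for k = K+1, ..., K+shift.
    """
    spread = np.asarray(spread, dtype=float)
    return a + b * spread[len(spread) - shift:]
```

This is the practical benefit of a leading indicator: with a lead of 5 quarters, the regression forecasts for the next 5 accounting periods use spreads that have already been observed, so no extrapolation of the $\lambda_k$ series itself is needed for those periods.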
In view of (2.11) and (2.12), this provides the improved predictors (based on economic factors)
(4.6)
(4.7)
The estimator for the disability frequency $p_i$ is then given by
(4.8)
The numerical results are presented in Figure 12.
Figure 12: Quarterly disability frequency estimates in the Full Model compared to the improved estimates which are based on the linear regression (4.3)-(4.6)
Interpretation of Figure 12.
We see in Figure 12 that the inclusion of the credit spread information gives much more conservative disability frequency predictions. This comes from the fact that we now include information about the financial crisis of 2008-2009 in the disability frequency prediction. This financial crisis will increase the frequencies, and if we neglected this information we would clearly underestimate the number of disability claims.
5 Conclusions and outlook
In a first study we analyze a disability development model that models all three time directions: occurrence period, reporting delay and accounting period. Bayesian inference methods allow us to predict the number of disability claims and we find that the accounting period parameter is a relevant parameter. In a second analysis we then study how this accounting period parameter is related to economic time series. We see that the credit spread runs five quarters ahead of the accounting period parameter. Therefore, credit spread information allows us to improve disability frequency predictions (note that the credit spread is often used as an indicator for future economic developments).
In a next step one should merge the economic time series model and the disability frequency model into a full stochastic framework in which the economic time series are also modeled stochastically. This would allow for the analysis of prediction uncertainty and would also relax the i.i.d. assumption on the accounting period parameters (which is too restrictive). Finally, one should not only model the number of disability claims but also their claim sizes (disability benefits).
References
1. Asmussen, S., Glynn, P.W. (2007). Stochastic Simulation. Springer.
2. England, P.D., Verrall, R.J. (2002). Stochastic claims reserving in general insurance. British Act. J. 8/3, 443-518.
3. England, P.D., Verrall, R.J., Wüthrich, M.V. (2011). Bayesian overdispersed Poisson model and the Bornhuetter-Ferguson claims reserving method. To appear in Annals of Actuarial Science.
4. Gilks, W.R., Richardson, S., Spiegelhalter, D.J. (1996). Markov Chain Monte Carlo in Practice. Chapman & Hall.
5. Hastings, W.K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57, 97-109.
6. Mack, T. (1991). A simple parametric model for rating automobile insurance or estimating IBNR claims reserves. ASTIN Bulletin 21/1, 93-109.
7. Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., Teller, E. (1953). Equation of state calculations by fast computing machines. J. Chem. Phys. 21/6, 1087-1092.
8. Scollnik, D.P.M. (2001). Actuarial modeling with MCMC and BUGS. North American Actuarial J. 5/2, 96-125.
9. Verbeek, H.G. (1972). An approach to the analysis of claims experience in motor liability excess of loss reinsurance. ASTIN Bulletin 6/3, 195-202.
10. Wüthrich, M.V. (2010). Accounting year effects modelling in the stochastic chain ladder reserving method. North American Actuarial J. 14/2, 235-255.
11. Wüthrich, M.V., Merz, M. (2008). Stochastic Claims Reserving Methods in Insurance. Wiley.