4.2 Generalised Linear Models
4.2.3 Graduation by Reference to a Standard Table
The process of graduation by reference to a standard mortality table involves relating the observed mortality experience to a set of graduated mortality rates by way of a mathematical function. That is, we write:
$$\frac{E(D_{x,t})}{E_{x,t}} = f\left(q^s_{x,t}\right); \tag{4.4}$$
or, alternatively,
$$\frac{E(D_{x,t})}{E^c_{x,t}} = f\left(m^s_{x,t}\right); \tag{4.5}$$
where $f(\cdot)$ denotes a given mathematical function, $q^s_{x,t}$ denotes the initial mortality rate given by the standard table for lives aged $x$ at time $t$, and $m^s_{x,t}$ denotes the central mortality rate given by the standard table for lives aged $x$ at time $t$.
8 One of the mortality data sets used in this thesis was supplied by the Institute of Actuaries of Australia Mortality Committee. A subset of this data was used to produce the Australian insured life mortality tables, IA95-97 M and F. The data is discussed in greater detail in Section 5.2.
Statistical Models for Mortality and Economic Data 35
Commonly, the mathematical function used to relate the observed mortality rates to the standard table mortality rate is a simple scaling:
$$\frac{E(D_{x,t})}{E_{x,t}} = \alpha q^s_{x,t}; \tag{4.6}$$
or
$$\frac{E(D_{x,t})}{E^c_{x,t}} = \alpha m^s_{x,t}; \tag{4.7}$$
where $\alpha > 0$ is a model parameter.
Once a possible relationship has been selected, the parameters are fitted using a method such as maximum likelihood estimation, least squares or weighted least squares.
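As an illustration of the fitting step, the following Python sketch estimates $\alpha$ in Equation (4.7) by maximum likelihood, under the added assumption that the deaths are independent Poisson counts with mean $\alpha E^c_{x,t} m^s_{x,t}$. Under that assumption the estimate has a closed form, namely total observed deaths divided by total deaths expected under the standard table. The function name and the data are hypothetical and serve only to show the calculation.

```python
def fit_scaling_factor(deaths, central_exposure, standard_m):
    """Poisson maximum likelihood estimate of alpha in E(D) = alpha * Ec * m_s.

    Setting the derivative of the Poisson log-likelihood to zero gives
    alpha_hat = sum(D) / sum(Ec * m_s).
    """
    expected = [e * m for e, m in zip(central_exposure, standard_m)]
    total_expected = sum(expected)
    if total_expected <= 0:
        raise ValueError("expected deaths under the standard table must be positive")
    return sum(deaths) / total_expected


# Hypothetical data for three age cells (not taken from the thesis).
deaths = [3, 5, 9]
central_exposure = [1000.0, 1200.0, 1500.0]
standard_m = [0.002, 0.004, 0.008]

alpha_hat = fit_scaling_factor(deaths, central_exposure, standard_m)

# Graduated central rates are the standard rates scaled by alpha_hat.
graduated_m = [alpha_hat * m for m in standard_m]
```

A least squares or weighted least squares fit would minimise a different criterion and would generally give a slightly different value of $\alpha$.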
This method is used in practice to produce graduated mortality rates for life insurance applications (such as pricing and reserving) when data are sparse but are believed to come from an experience similar to that for which a graduated table already exists. Nevertheless, there seem to be very few references to this method in the actuarial literature; two are Benjamin and Pollard (1980, pp. 328–338) and London (1985, pp. 24–25).
In Equations (4.4) and (4.5), the standard mortality table rates, $q^s_{x,t}$ and $m^s_{x,t}$, are denoted in such a way as to allow for different standard mortality tables in each year, $t$, but it is rarely the case that tables are compiled in such a way. In practice, mortality tables are generally produced for a single base year (with the base year denoted as year 0) and adjusted to allow for durational mortality improvements through the use of mortality reduction factors. That is, in the context of the initial mortality rates:
$$q^s_{x,t} = RF^q_{x,t}\, q^s_{x,0}, \tag{4.8}$$
where $RF^q_{x,t}$ denotes the reduction factor for a life aged $x$, in year $t$.
Similarly, in the context of the central mortality rates:
$$m^s_{x,t} = RF^m_{x,t}\, m^s_{x,0}, \tag{4.9}$$
where $RF^m_{x,t}$ denotes the reduction factor for a life aged $x$, in year $t$.
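The adjustment in Equations (4.8) and (4.9) is a simple multiplication, as the following Python sketch shows. The text does not specify a functional form for the reduction factors, so the constant annual improvement rate used here is purely a hypothetical assumption, as are the function names and the numerical values.

```python
def constant_improvement_rf(annual_improvement, t):
    """Hypothetical reduction factor: a constant proportional improvement
    each year, so RF = (1 - r) ** t and RF = 1 in the base year (t = 0)."""
    if not 0.0 <= annual_improvement < 1.0:
        raise ValueError("annual_improvement must lie in [0, 1)")
    return (1.0 - annual_improvement) ** t


def project_rate(base_rate, reduction_factor):
    """Equations (4.8)/(4.9): rate in year t = RF_{x,t} * base-year rate."""
    return reduction_factor * base_rate


# Illustrative values only: base-year initial rate and a 2% annual improvement.
q_base = 0.0100
rf_10 = constant_improvement_rf(0.02, 10)
q_10 = project_rate(q_base, rf_10)
```

In practice the reduction factors would be read from published tables for each age and year rather than generated from a single improvement rate.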
Methods exist for estimating mortality reduction factors by way of fitting a model, such as a GLM, to data collected over an extended number of years. For example, see Renshaw and Haberman (2003). However, due to the limited volume of mortality data available for use in this thesis, any mortality improvement factors estimated using these methods are likely to lead to unreliable forecasts, so we will not follow this practice.
In Australia, organisations such as the Australian Bureau of Statistics and the Australian Government Actuary estimate mortality reduction factors for the Australian population on a regular basis, but no such factors are publicly available for Australian insured life data.
In this thesis, mortality reduction factors based on population mortality data (such as those mentioned in the previous paragraph) will be used as proxies for insured life mortality reduction factors. The selection of these reduction factors is discussed in greater detail in Section 4.2.5.
4.2.4 Generalised Linear Models for Mortality Data Using a Standard Table
Two types of mortality data are used in this thesis: (i) unit record data⁹, and (ii) grouped data¹⁰. The unit record data gives the number of deaths, central exposures to risk and covariate information for each policyholder represented by the data set, while the grouped data gives the number of deaths and the central exposures to risk, subdivided by covariate information including age and year (this data is described in detail in Chapter 5). Consequently, and in light of the previous discussion, the most appropriate generalised linear model to use as the basis for tests involving either of these data sets is a GLM relating the observed central mortality rates¹¹ to a set of standard table central mortality rates, assuming a Poisson distribution for the number of deaths. One possible form for such a model is:
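$$\ln\left(E(D_{x,t})\right) = \ln\left(E^c_{x,t}\, m^s_{x,t}\right) + \eta; \tag{4.10}$$
or, equivalently,
$$E(D_{x,t}) = E^c_{x,t}\, m^s_{x,t} \exp(\eta); \tag{4.11}$$
where $\eta$ is a linear combination of the covariates, for example $\eta = \beta_0 + \beta_1 \text{Age} + \beta_2 \text{Year} + \cdots$.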
In this model, the log link has been selected as this is the canonical link function for the Poisson distribution, and $\ln\left(E^c_{x,t}\, m^s_{x,t}\right)$, the logarithm of the expected number of deaths using the standard table mortality rates, appears as an “offset”¹². The model presented in Equation (4.11) was selected because it is of the form of the graduation model presented in Equation (4.7), with the scaling factor $\alpha$ replaced by $\exp(\eta)$.
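To make the role of the offset concrete, the following Python sketch fits a Poisson GLM with a log link and a fixed offset by Newton-Raphson iteration (which coincides with Fisher scoring for the canonical link). It is a minimal illustration under several assumptions that are not from the text: a single covariate (age, centred at 42), hand-written iteration rather than a statistical package, and hypothetical data and function names.

```python
import math


def fit_poisson_offset_glm(deaths, offsets, covariate, n_iter=50, tol=1e-10):
    """Fit ln E(D) = offset + b0 + b1 * covariate by Newton-Raphson.

    The offset enters the linear predictor with its coefficient fixed at 1,
    so only b0 and b1 are estimated.
    """
    b0, b1 = 0.0, 0.0  # start from the standard table (eta = 0)
    for _ in range(n_iter):
        mu = [math.exp(o + b0 + b1 * z) for o, z in zip(offsets, covariate)]
        # Score vector: derivatives of the Poisson log-likelihood.
        u0 = sum(d - m for d, m in zip(deaths, mu))
        u1 = sum(z * (d - m) for d, m, z in zip(deaths, mu, covariate))
        # Information matrix (2 x 2, symmetric).
        i00 = sum(mu)
        i01 = sum(z * m for m, z in zip(mu, covariate))
        i11 = sum(z * z * m for m, z in zip(mu, covariate))
        det = i00 * i11 - i01 * i01
        # Newton step: solve the 2 x 2 system explicitly.
        step0 = (i11 * u0 - i01 * u1) / det
        step1 = (i00 * u1 - i01 * u0) / det
        b0 += step0
        b1 += step1
        if max(abs(step0), abs(step1)) < tol:
            break
    return b0, b1


# Hypothetical grouped data for ages 40 to 44 in a single year.
ages = [40, 41, 42, 43, 44]
deaths = [5, 6, 7, 8, 10]
central_exposure = [5000.0, 5200.0, 5100.0, 4900.0, 4800.0]
standard_m = [0.0012, 0.0013, 0.0015, 0.0017, 0.0019]

# Offset: log of expected deaths under the standard table.
offsets = [math.log(e * m) for e, m in zip(central_exposure, standard_m)]
centred_age = [a - 42 for a in ages]

b0_hat, b1_hat = fit_poisson_offset_glm(deaths, offsets, centred_age)
```

With $\eta = 0$ the fitted model reproduces the standard table exactly, so the estimated coefficients measure how the observed experience departs from it.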
9 Unit record data is data where information is provided for each individual represented by the data set.
10 Grouped data is data where information is only provided for sub-groups of the individuals represented by the data set, where the sub-groups are defined according to covariate information.
11 Calculated as $\hat{m}_{x,t} = D_{x,t} / E^c_{x,t}$.
12 An offset is a “quantitative variate whose regression coefficient is known to be 1” (McCullagh and Nelder, 1989, p. 206).
The (insured life) graduated mortality tables produced by the Institute of Actuaries of Australia, which are used as the standard table mortality rates in this thesis, give initial mortality rates but not central mortality rates. To adjust for this, $m^s_{x,t}$ is replaced by the approximation:
$$m^s_{x,t} \approx -\ln\left(1 - q^s_{x,t}\right); \tag{4.12}$$
as was suggested by Haberman and Renshaw (1996). This approximation is based on the identity:
$$1 - q_{x,t} = \exp\left(-\int_x^{x+1} \mu_{r,t}\,dr\right); \tag{4.13}$$
where $\mu_{r,t}$ is the instantaneous rate of mortality¹³ for a life aged $r$ exact in year $t$;
and the approximation:
$$m_{x,t} \approx \mu_{x+0.5,t} \approx \int_x^{x+1} \mu_{r,t}\,dr. \tag{4.14}$$
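The conversion in Equation (4.12) can be sketched in a few lines of Python. The check at the end uses the special case of a constant force of mortality over the year of age, in which $m_{x,t} = \mu$ and $q_{x,t} = 1 - e^{-\mu}$, so that the approximation holds exactly; this special case is an illustrative assumption and not part of the argument above. The function name and values are hypothetical.

```python
import math


def central_rate_from_initial(q):
    """Approximate the central rate m by -ln(1 - q), as in Equation (4.12)."""
    if not 0.0 <= q < 1.0:
        raise ValueError("q must lie in [0, 1)")
    # log1p(-q) evaluates ln(1 - q) accurately when q is small.
    return -math.log1p(-q)


# Constant force of mortality over the year (illustrative value).
mu = 0.01
q_constant_force = -math.expm1(-mu)  # q = 1 - exp(-mu)
m_recovered = central_rate_from_initial(q_constant_force)

# A typical small initial rate: the central rate is slightly larger than q.
m_example = central_rate_from_initial(0.01)
```

For small rates the two measures are close, with $-\ln(1 - q) \approx q + q^2/2$, so the adjustment matters most at the older ages where $q$ is large.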
Combining Equations (4.10) and (4.12) gives the model:
$$\ln\left(E(D_{x,t})\right) = \ln\left(-\ln\left(1 - q^s_{x,t}\right) E^c_{x,t}\right) + \eta. \tag{4.15}$$
This is the form of the GLM that is used as the basis for a number of tests involving mortality data throughout our subsequent analysis.
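As a rough illustration of how the offset in Equation (4.15) might be assembled from a standard table of initial rates, consider the Python sketch below. The function names, the exposure and the value of $\eta$ are hypothetical and are not taken from the analysis that follows.

```python
import math


def glm_offset(q_standard, central_exposure):
    """Offset ln(-ln(1 - q_s) * Ec) appearing in Equation (4.15)."""
    if not 0.0 < q_standard < 1.0:
        raise ValueError("q_standard must lie in (0, 1)")
    if central_exposure <= 0.0:
        raise ValueError("central_exposure must be positive")
    m_standard = -math.log1p(-q_standard)  # Equation (4.12)
    return math.log(m_standard * central_exposure)


def expected_deaths(q_standard, central_exposure, eta):
    """E(D) implied by Equation (4.15) for a given linear predictor eta."""
    return math.exp(glm_offset(q_standard, central_exposure) + eta)


# Illustrative cell: standard initial rate 0.005 and 2,000 years of exposure.
baseline = expected_deaths(0.005, 2000.0, 0.0)             # eta = 0: standard table
lighter = expected_deaths(0.005, 2000.0, math.log(0.9))    # 90% of standard
```

Setting $\eta = 0$ returns the deaths expected under the standard table, and any other value of $\eta$ scales that figure by $\exp(\eta)$.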