Natural Hedging Using Multi-population Mortality Forecasting Models

by

Shuang Chen

B.Sc., Nankai University, 2011

Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of

Master of Science

in the

Department of Statistics and Actuarial Science

Faculty of Science

Fall 2014

However, in accordance with the Copyright Act of Canada, this work may be reproduced without authorization under the conditions for “Fair Dealing.” Therefore, limited reproduction of this work for the purposes of private study, research, criticism, review and news reporting is likely to be in accordance with the law, particularly if cited appropriately.

APPROVAL

Name: Shuang Chen

Degree: Master of Science

Title: Natural Hedging Using Multi-population Mortality Forecasting Models

Examining Committee:

Chair: Dr. Gary Parker, Associate Professor

Dr. Cary Chi-Liang Tsai, Senior Supervisor, Associate Professor, Simon Fraser University

Dr. Yi Lu, Supervisor, Associate Professor, Simon Fraser University

Dr. David Campbell, Internal Examiner, Associate Professor, Simon Fraser University

Date Approved: December 11th, 2014

Abstract

No mortality projection model can capture future mortality changes exactly, so actual mortality rates will differ from the projected ones. Movements in mortality rates have opposite impacts on the values of life insurance and annuity products, which creates an opportunity for a natural hedge for both a life insurer and an annuity provider. A life insurer and an annuity provider can swap portions of their life insurance and annuity business with each other so that each holds a naturally hedged portfolio. This project focuses on determining the weights of a portfolio of life insurance and annuity products by minimizing the variance of the loss function of the portfolio, in order to reduce mortality and longevity risks for both the life insurer and the annuity provider. Four Lee-Carter-based models are applied to model the co-movement of the two populations of life insurance policyholders and annuitants, and the resulting weights are compared. The block bootstrap method, a model- and parameter-free approach, is also adopted, with numerical illustrations, to compare the hedging performances of the four models.

I dedicate this thesis to my parents, who always support me and cheer me up.

Acknowledgments

I would like to express my deepest appreciation to all those who supported and guided me in completing this project. Special gratitude goes to my supervisor, Dr. Cary Tsai, who provided me with patient guidance and valuable suggestions that inspired me and helped me improve throughout my two years of study at SFU. I will always be grateful for the opportunity to learn new ideas and conduct research under his supervision. Without the time he devoted to this project, I would not have completed it. Thank you for all the time you spent guiding me.

Furthermore, I would like to acknowledge, with great appreciation, the members of my project committee, Dr. Gary Parker, Dr. Yi Lu and Dr. David Campbell. Thank you for taking the time to read my project and for giving me insightful comments and suggestions.

Special thanks go to the whole Department of Statistics and Actuarial Science, where I gained advanced knowledge that will benefit me forever. Thank you to all the professors who gave me lectures and answered my questions. You helped me reach a high level of academic achievement.

Last but not least, I would like to take this opportunity to thank my family and my friends who supported me and encouraged me whenever I felt confused and lost. Especially, I would like to express my appreciation to my fellow graduate students, Annie, Biljana, Elena, Fei, Huijing, Sabrina, Vicky and Yi, who brought me fresh ideas and memorable moments during my study.

List of Tables

4.1 the median of 50 optimal weights

4.2 comparisons of sample variances (×1023) and HE’s

5.1 p-values for testing stationarity of $\{d_{x,t,i}\}$

5.2 comparisons of sample variances (×1023) and HE’s (block bootstrap)

List of Figures

3.1 cohort mortality sequence and period mortality sequence

3.2 $\hat{k}_{t,i}$ against $t$

3.3 95% predictive intervals on $\hat{q}_{x,2009+1,i}$ for the independent model

3.4 95% predictive intervals on $\hat{q}_{x,2009+1,i}$ for the joint-k model

3.5 95% predictive intervals on $\hat{q}_{x,2009+1,2}$ for the co-integrated and independent models

3.6 95% predictive intervals on $\hat{q}_{x,2009+1,i}$ for the augmented common factor model

3.7 mortality curve comparisons among models and actual rates

4.1 Mortality swap: minimizing Var(LL) + Var(LA)

4.2 Mortality swap: minimizing Var(LL)

4.3 Mortality swap: minimizing Var(LA)

4.4 age span [25, 100] and year span [1981, 2010]

4.5 optimal weights for the independent and joint-k models

4.6 optimal weights for the co-integrated and augmented common factor models

4.7 variances after swap

4.8 simulated loss distributions using $(\hat{w}^{L+A}_l, \hat{w}^{L+A}_a)$

4.9 simulated loss distributions using $(\hat{w}^{L}_l, \hat{w}^{L}_a)$

4.10 simulated loss distributions using $(\hat{w}^{A}_l, \hat{w}^{A}_a)$

5.1 $\{d_{x,t,i}\}$ for $x = 35, 45, 55, 65$

5.2 a circle diagram of $d_{t,i}$’s

5.3 simulated loss distributions using $(\hat{w}^{L+A}_l, \hat{w}^{L+A}_a)$

5.4 simulated loss distributions using $(\hat{w}^{L}_l, \hat{w}^{L}_a)$

5.5 simulated loss distributions using $(\hat{w}^{A}_l, \hat{w}^{A}_a)$

Chapter 1 Introduction

1.1 Motivations

Natural hedging is a strategy of hedging two risks that respond oppositely to a change in a common factor. Life insurers face mortality risk (the actual death probabilities are higher than expected), and annuity providers face longevity risk (the actual survival probabilities are higher than expected). Both can therefore adopt natural hedging by swapping a portion of their business with each other to reduce mortality and longevity risks.

In this project, we seek the optimal swapped weights of life and annuity business for a life insurer and an annuity provider, respectively, which minimize the variance of the loss function of the life insurer or of the annuity provider, or the sum of the variances of the two loss functions. To reach this goal, we need mortality models to forecast mortality rates for life insurance policyholders and annuitants.

Since life insurance and annuities tend to be issued to different populations, using the same mortality table to determine the premiums is inappropriate. Furthermore, even though we use two different mortality tables for life insurance policyholders and annuitants, we cannot ignore the dependence between these two populations. Generally speaking, two populations within a territory or country are exposed to the same medical and environmental conditions. Therefore, the mortality rates for two populations might not be independent. In this project, two different mortality tables and four multi-population mortality models are applied to forecasting deterministic and stochastic mortality rates.

The deterministic mortality rates are used to determine the premiums of life and annuity products, and the stochastic mortality rates are used to simulate the losses of the life insurer and annuity provider. Then we calculate the sample variances and sample covariance of the losses of the life insurer and annuity provider to obtain the formulas of the optimal weights. A variance reduction ratio, called hedge effectiveness, is proposed to compare the performances of hedging mortality and longevity risks.

This project also carries out robustness testing to study the sensitivity of the weights and variances obtained from the simulated mortality rates. Finally, because all the results are derived from parametric mortality models, we cannot be sure that the assumed mortality model is the true one. To avoid parameter and model risks, the model-free block bootstrap method is used to project mortality rates. We then recompute the numerical results and compare them with those produced by the parametric mortality models.

1.2 Outline

This project consists of five chapters. Chapter 2 is a review of the previous research on both mortality projection models and hedging of mortality and longevity risks.

In Chapter 3, some actuarial notations used in this project are provided, followed by the introduction of four mortality projection models: the independent Lee-Carter model, the joint-k Lee-Carter model, the co-integrated Lee-Carter model and the augmented common factor model. The last three can model the dependence structure between the mortality rates for two populations. These models can be used to forecast deterministic and stochastic mortality rates for both life insurance policyholders and annuitants.

Chapter 4 focuses on natural hedging with a mortality swap. In this chapter, three pairs of optimal weights for swapping life insurance and annuity business are studied. Numerical illustrations using the U.S. mortality tables are exhibited. Also, robustness testing is conducted to test the sensitivity of the numerical results.

In Chapter 5, the model-free block bootstrap method is introduced in detail and is then applied to testing the results produced by the multi-population models in Chapter 4.

Chapter 2 Literature Review

The two main parts of this project are mortality projection and natural hedging. The first section goes through the articles about mortality forecasting models, and the second section reviews those about hedging of mortality and longevity risks.

2.1 Mortality forecasting models

Mortality projection is one of the keys to pricing insurance and annuity products. The most widely used mortality projection model so far is the Lee-Carter model (Lee and Carter (1992)). Between 1900 and 1988, life expectancy in the United States increased from 47 to 75. Based on the assumption that life expectancy would continue to increase, Lee and Carter (1992) applied a time series model to forecasting long-term mortality rates. This model uses two age-varying factors and one time-varying factor to capture the downward trend in mortality rates. Based on empirical mortality data, the time-varying factor is assumed to follow an ARIMA (autoregressive integrated moving average) model.

One of the most significant features of the Lee-Carter model is that it is easy to interpret and its parameters are easy to estimate. Compared to other mortality projection models, the Lee-Carter model bases its forecast on long-term historical data, and neither explicit assumptions nor a life span limit is attached to the model. Moreover, the Lee-Carter model provides confidence regions for its forecasts, which further supports its usefulness in forecasting mortality rates.

Afterwards, Carter and Lee (1992) pointed out that, in order to forecast the mortality rates for two populations within a territory or country, the same time-varying index, called the joint-k index, should be used so that the same trend of mortality change over time is reflected for the two populations. This assumption makes sense because two populations in a country usually have the same medical services, environmental conditions, etc. Thus, when these aspects improve, the life expectancy of each population will increase simultaneously to the same level.

Li and Lee (2005) extended the original Lee-Carter model by adding an extra term to the model. They argued that the world is becoming more closely connected by modern communication technology, so that, in the long term, the forecasted mortality rates for different populations will converge. They called their model the augmented common factor model, which was developed in two steps. First, Li and Lee (2005) used a common factor to capture the overall mortality change; then the mortality rates of each population are specified by an augmented term that reflects the features of that population. This method was applied to forecasting the mortality rates of Canadian populations. Across the provinces, the common factor was used to achieve long-term convergence, and the augmented term was used to differentiate the mortality rates of each province.

With the same concern about the long-term convergence problem, Li and Hardy (2011) suggested the so-called co-integrated Lee-Carter model. By investigating the time-varying indices of two populations, they noticed that there exists a co-integrated relationship between the two time indices. In the co-integrated model, the two time-varying indices are linked by a linear function.

The last three models mentioned, all based on the Lee-Carter model, take into consideration the dependence structures among multiple populations; we call them the multi-population mortality projection models. In this project, because the natural hedging strategy is adopted by swapping portions of life and annuity business based on two populations of life insurance policyholders and annuitants, the multi-population mortality projection models are needed.

2.2 Hedging of mortality and longevity risks

Hedging of mortality and longevity risks has recently become one of the most popular research areas in actuarial science. Researchers have conducted a large amount of research on how to reduce mortality and longevity risks. Several financial instruments have been applied to hedging these two risks, such as mortality-linked securities and q-forwards. The practice of using mortality-linked securities was first suggested by Blake and Burrows (2001) as a way of transferring longevity risk to the capital market. They introduced a survivor bond for which the future coupons are based on the percentage of retirees who are alive at the future coupon payment dates.

In 2003, Swiss Re issued the first mortality-linked security, which protects insurers from catastrophic losses. The bonds were issued via an SPV (special purpose vehicle), and the payments depend on a mortality index. This bond operated successfully for years.

In 2004, BNP Paribas and the European Investment Bank issued a 25-year longevity bond to hedge longevity risk. This was the first longevity bond that came into real business. The coupons were linked to a survivor index based on the actual mortality rates of English and Welsh males aged 65 in 2002.

In 2006, the pension buyout market attracted a lot of attention in the UK. In this market, both the assets and liabilities of a pension plan are transferred to a life insurer. To realize the transfer, the pension plan has to pay the life insurer the amount of the deficit if the assets are less than the liabilities. Conversely, the insurer pays the surplus to the pension plan if the assets exceed the liabilities. Thus, the pension plan can be secured. This is an efficient method of transferring longevity risk; however, the pension plan may experience a loss, since the life insurer tends to re-evaluate the pension plan by measuring the assets and liabilities in a more risk-averse way.

In 2007, J.P. Morgan introduced q-forwards, another financial derivative, to transfer mortality risk to the capital market. A q-forward is described as a zero-coupon swap between a pre-fixed mortality rate at the inception date and the realized mortality rate at the maturity date of the q-forward contract.

Not only the pricing and structure of mortality-linked securities but also the evaluation of hedging methodologies have been systematically studied. Li and Ng (2011) pointed out that, because there is no longevity trading index from which to derive the market price of mortality-linked securities, how to evaluate hedging techniques remains an open question. Thus, they proposed a pricing framework for evaluating mortality-linked securities based on canonical valuation. To construct the framework, they suggested a nonparametric model, which helps to avoid the risks arising from the model itself and its parameters.

Gaillardetz et al. (2012) proposed a method of evaluating the hedging error under a stochastic mortality projection. When catastrophes occur, insurers are very likely to experience huge losses. Thus, evaluating the distribution of the hedging error is very important. They applied the regime-switching model (Milidonis et al. (2011)) to extract the hedging error and derived the error distribution, which made the evaluation feasible.

In recent years, another hedging method, called natural hedging, has become appealing to researchers. Instead of transferring the risk to the financial market, natural hedging can reduce mortality and longevity risks simply by swapping proportions of life and annuity business between a life insurer and an annuity provider. Cox and Lin (2007) were the first to thoroughly introduce this concept, noting the empirical evidence that companies adopting natural hedging within their business usually offer lower premiums than other companies.

Since natural hedging was proposed, many studies have been done in this field. Wang et al. (2010) suggested that, by matching mortality duration and convexity, the optimal weights for constructing an immunized portfolio can be obtained. Lin and Tsai (2013) gave more elaborate formulas for mortality duration and convexity, which can be applied to a two-product or three-product portfolio.

However, some researchers have cast doubt on natural hedging, arguing that all the studies are based on particular models, whereas in reality future mortality rates might not follow any model, which may make the method inefficient. Zhu and Bauer (2014) used a non-parametric model to test the natural hedging method and concluded that higher-order variations in mortality rates would affect the efficiency of natural hedging. Another problem of natural hedging, pointed out by Cox and Lin (2007), is that, for estimating future mortality rates, most studies use the same life table for both annuities and life insurance, which is not practical.

This project uses multi-population mortality projection models to forecast future mortality rates for life insurance policyholders and annuitants separately, based on two life tables. The method of Lagrange multipliers is used to obtain the optimal weights. Moreover, robustness testing, together with model testing based on the model-free block bootstrap method, is carried out.

Chapter 3 Introduction of Multi-population Models

In this chapter, we first introduce four mortality models for multiple populations, which are based on the well-known Lee-Carter model. Then we fit the models with mortality data from the Human Mortality Database to get the estimated parameters, which can be applied to forecasting future mortality rates. The deterministic mortality rates can be used to determine the prices of life insurance and annuity products, and the stochastic ones can be used to simulate the realized/actual prices in Chapter 4.

3.1 Concepts and notations

Let $q_{x,t,i}$ denote the probability that an individual aged $x$ in year $t$ from population $i$ will die within one year. The extra subscript $i$ is added for studying the multi-population mortality projection models. When forecasting future mortality rates, there are mainly two types of mortality sequences. One is the cohort mortality sequence $\{q_{x_0+j,\,t_0+j,\,i} : j = 0, 1, 2, \ldots\}$, and the other is the period mortality sequence $\{q_{x_0+j,\,t_0,\,i} : j = 0, 1, 2, \ldots\}$. In this project, the cohort mortality sequence is used for pricing. Figure 3.1 shows the difference between the cohort mortality sequence and the period one.
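The two sequences can be illustrated with a short sketch. It assumes the one-year death probabilities are stored in a NumPy array whose rows are ages and whose columns are calendar years; the function names and the toy table are hypothetical and serve only to show which entries each sequence reads.

```python
import numpy as np

def period_sequence(q, x0_idx, t0_idx):
    """q[x0+j, t0] for j = 0, 1, 2, ...: every age is read from the single year t0."""
    return q[x0_idx:, t0_idx]

def cohort_sequence(q, x0_idx, t0_idx):
    """q[x0+j, t0+j] for j = 0, 1, 2, ...: age and calendar year advance together."""
    steps = min(q.shape[0] - x0_idx, q.shape[1] - t0_idx)
    j = np.arange(steps)
    return q[x0_idx + j, t0_idx + j]

# Toy table (not real mortality data): entry = 10 * age_index + year_index,
# so each value reveals which (age, year) cell it came from.
q_toy = np.add.outer(10 * np.arange(4), np.arange(3))

print(period_sequence(q_toy, 1, 0))  # cells (1,0), (2,0), (3,0)
print(cohort_sequence(q_toy, 1, 0))  # cells (1,0), (2,1), (3,2)
```

The cohort sequence follows a diagonal of the table, which is why pricing a policy issued at age $x_0$ in year $t_0$ needs forecasts for every future year rather than a single projected life table.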

Figure 3.1: cohort mortality sequence and period mortality sequence

Next, let $\mu_{x,t,i}(s)$ denote the force of mortality, that is, the instantaneous death rate at age $x+s$ for an individual from population $i$ aged $x$ in year $t$. It can be shown (see Bowers et al. (1997)) that

$$1 - q_{x,t,i} = e^{-\int_0^1 \mu_{x,t,i}(s)\,ds}.$$

Under the assumption of constant force of mortality within one year, that is, $\mu_{x,t,i}(s) = \mu_{x,t,i}(0) = \mu_{x,t,i}$ for $s \in [0, 1)$, we have

$$q_{x,t,i} = 1 - e^{-\mu_{x,t,i}}.$$

Another frequently used mortality rate is the central death rate $m_{x,t,i}$, which is defined as the ratio of the number of deaths aged $x$ in year $t$ to the average number of people aged $x$ in year $t$. Again, under the assumption of constant force of mortality within one year, it can be shown that $\mu_{x,t,i} = m_{x,t,i}$, and thus

$$q_{x,t,i} = 1 - e^{-m_{x,t,i}}.$$
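As a small numerical illustration of this relation, the sketch below converts between a central death rate and a one-year death probability under the constant-force assumption. The function names are hypothetical; `math.expm1` and `math.log1p` are used only because they stay accurate when the rates are very small.

```python
import math

def q_from_m(m):
    """One-year death probability under a constant force of mortality: q = 1 - exp(-m)."""
    return -math.expm1(-m)

def m_from_q(q):
    """Inverse relation: m = -ln(1 - q)."""
    return -math.log1p(-q)

m = 0.01                  # illustrative central death rate, not taken from the data
q = q_from_m(m)
print(q, m_from_q(q))     # q is slightly below m; the round trip recovers m
```

For small rates, $q$ and $m$ are nearly equal, and the gap widens at older ages where $m$ is large.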

3.2 Multi-population models and constraints

3.2.1 Independent Lee-Carter model

Lee and Carter (1992) proposed a model in which the natural logarithm of the central death rates is expressed by two age-specific factors and one time-specific factor as follows:

$$\ln(m_{x,t,i}) = a_{x,i} + b_{x,i} \times k_{t,i} + \varepsilon_{x,t,i},$$

subject to two constraints,

• $\sum_x b_{x,i} = 1$, and

• $\sum_t k_{t,i} = 0$,

where

• $a_{x,i}$ is the average age-specific mortality factor at age $x$ for population $i$,

• $k_{t,i}$ is the general mortality level in year $t$ for population $i$,

• $b_{x,i}$ is the age-specific reaction to $k_{t,i}$ at age $x$ for population $i$, and

• $\varepsilon_{x,t,i}$ is the error term, which is assumed to be independent and identically normally distributed with mean 0 and variance $\sigma^2_{\varepsilon_{x,i}}$ for all $t$.

To fit the model with a given matrix of $\ln(m_{x,t,i})$ values for population $i$ and estimate all parameters, we may use the singular value decomposition (SVD) method. However, a simpler procedure gives a close approximation to the SVD solution.

First, $\hat{a}_{x,i}$ is obtained as the average of $\ln(m_{x,t,i})$ over a given year span $[t_0, t_0+n-1]$ by the second constraint, and $\hat{k}_{t,i}$ equals the sum of $[\ln(m_{x,t,i}) - \hat{a}_{x,i}]$ over a given age span $[x_0, x_0+m-1]$ by the first constraint. That is,

$$\sum_{t=t_0}^{t_0+n-1} \ln(m_{x,t,i}) = n \times a_{x,i} + b_{x,i} \times \sum_{t=t_0}^{t_0+n-1} k_{t,i},$$

implying

$$\hat{a}_{x,i} = \frac{\sum_{t=t_0}^{t_0+n-1} \ln(m_{x,t,i})}{n}, \quad x = x_0, x_0+1, \ldots, x_0+m-1.$$

Similarly,

$$\sum_{x=x_0}^{x_0+m-1} [\ln(m_{x,t,i}) - a_{x,i}] = k_{t,i} \times \sum_{x=x_0}^{x_0+m-1} b_{x,i},$$

yielding

$$\hat{k}_{t,i} = \sum_{x=x_0}^{x_0+m-1} [\ln(m_{x,t,i}) - \hat{a}_{x,i}], \quad t = t_0, t_0+1, \ldots, t_0+n-1.$$

Finally, $\hat{b}_{x,i}$ can be obtained by regressing $[\ln(m_{x,t,i}) - \hat{a}_{x,i}]$ on $\hat{k}_{t,i}$ without a constant term for each age $x$. To project the future time-varying index $k_{t,i}$, which is the key to projecting the mortality rates, an ARIMA(0,1,0) time series (a random walk with drift $\theta_i$) is used, which is given by

$$\hat{k}_{t,i} = \hat{k}_{t-1,i} + \theta_i + \epsilon_{t,i},$$

where the error term $\epsilon_{t,i}$ is independent and identically normally distributed with mean 0 and variance $\sigma^2_{\epsilon,i}$ for all $t$. The parameter $\theta_i$ can be estimated by

$$\hat{\theta}_i = \frac{1}{n-1} \sum_{t=t_0+1}^{t_0+n-1} (\hat{k}_{t,i} - \hat{k}_{t-1,i}) = \frac{\hat{k}_{t_0+n-1,i} - \hat{k}_{t_0,i}}{n-1}.$$
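The estimation steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `fit_lee_carter` and the noise-free toy data are hypothetical, and the input is assumed to be a matrix of log central death rates with ages in rows and years in columns.

```python
import numpy as np

def fit_lee_carter(log_m):
    """Fit one population; log_m has shape (ages, years) and holds ln(m_{x,t})."""
    a = log_m.mean(axis=1)                 # a_x: average of ln(m) over the year span
    centered = log_m - a[:, None]          # ln(m_{x,t}) - a_x
    k = centered.sum(axis=0)               # k_t: sum over ages, using sum_x b_x = 1
    b = centered @ k / (k @ k)             # b_x: no-intercept regression slope on k_t
    theta = (k[-1] - k[0]) / (k.size - 1)  # drift of the random walk for k_t
    return a, b, k, theta

# Hypothetical noise-free data built to satisfy both constraints
a_true = np.linspace(-7.0, -2.0, 6)
b_true = np.array([0.30, 0.25, 0.20, 0.10, 0.10, 0.05])  # sums to 1
k_true = np.linspace(12.0, -12.0, 25)                     # sums to 0
log_m = a_true[:, None] + np.outer(b_true, k_true)

a_hat, b_hat, k_hat, theta_hat = fit_lee_carter(log_m)
print(theta_hat)  # estimated drift per year of the fitted index
```

Because $\hat{k}_{t,i}$ is the column sum of the centered matrix, the no-intercept regression slopes always add up to one, so the first constraint holds automatically for the estimates.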

Figure 3.2 displays the plots of $\{\hat{k}_{t,i}\}$ from year 1981 to year 2009 ($t_0 = 1981$ and $n = 29$) based on the U.S. male ($i = 1$) and female ($i = 2$) mortality rates for the age span [25, 100] from the Human Mortality Database (www.mortality.org). This data set is used throughout this chapter. From Figure 3.2 we can see that the $\{\hat{k}_{t,i}\}$

Figure 3.2: $\hat{k}_{t,i}$ against $t$

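The logarithm of the stochastic central death rate for age $x$ in year $t_0+n-1+\tau$, denoted by $\tilde{m}_{x,t_0+n-1+\tau,i}$, is

$$\begin{aligned}
\ln(\tilde{m}_{x,t_0+n-1+\tau,i}) &= \hat{a}_{x,i} + \hat{b}_{x,i} \times (\hat{k}_{t_0+n-1,i} + \tau \times \hat{\theta}_i + \sqrt{\tau} \times \epsilon_{t,i}) + \varepsilon_{x,t,i} \\
&= \ln(\hat{m}_{x,t_0+n-1+\tau,i}) + \sqrt{\tau} \times \hat{b}_{x,i} \times \epsilon_{t,i} + \varepsilon_{x,t,i},
\end{aligned}$$

where $\hat{m}_{x,t_0+n-1+\tau,i}$ is the deterministic central death rate. Note that $\ln(\tilde{m}_{x,t_0+n-1+\tau,i})$ is normally distributed with mean

$$\ln(\hat{m}_{x,t_0+n-1+\tau,i}) = \hat{a}_{x,i} + \hat{b}_{x,i} \times (\hat{k}_{t_0+n-1,i} + \tau \times \hat{\theta}_i), \quad \tau = 1, 2, \ldots,$$

and variance

$$\sigma^2(\ln(\tilde{m}_{x,t_0+n-1+\tau,i})) = \tau \times \hat{b}^2_{x,i} \times \sigma^2_{\epsilon,i} + \sigma^2_{\varepsilon_{x,i}},$$

where $\{\varepsilon_{x,t,i}\}$ and $\{\epsilon_{t,i}\}$ are assumed independent.

Then

$$\hat{m}_{x,t_0+n-1+\tau,i} = \exp[\hat{a}_{x,i} + \hat{b}_{x,i} \times (\hat{k}_{t_0+n-1,i} + \tau \times \hat{\theta}_i)],$$

and, by the relation $q_{x,t,i} = 1 - e^{-m_{x,t,i}}$ from Section 3.1,

$$\hat{q}_{x,t_0+n-1+\tau,i} = 1 - \exp(-\hat{m}_{x,t_0+n-1+\tau,i}).$$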

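The estimate of the variance of $\varepsilon_{x,t,i}$ is given by

$$\hat{\sigma}^2_{\varepsilon_{x,i}} = \frac{1}{n-2} \sum_{t=t_0}^{t_0+n-1} \hat{\varepsilon}^2_{x,t,i} = \frac{\sum_{t=t_0}^{t_0+n-1} [\ln(m_{x,t,i}) - \hat{a}_{x,i} - \hat{b}_{x,i} \times \hat{k}_{t,i}]^2}{n-2},$$

where $x = x_0, x_0+1, \ldots, x_0+m-1$. The estimate of the variance of $\epsilon_{t,i}$ is given as

$$\hat{\sigma}^2_{\epsilon,i} = \frac{1}{n-2} \sum_{t=t_0+1}^{t_0+n-1} \hat{\epsilon}^2_{t,i} = \frac{\sum_{t=t_0+1}^{t_0+n-1} (\hat{k}_{t,i} - \hat{k}_{t-1,i} - \hat{\theta}_i)^2}{n-2}.$$

Therefore, the estimate of the variance of $\ln(\tilde{m}_{x,t_0+n-1+\tau,i})$ is obtained by

$$\hat{\sigma}^2(\ln(\tilde{m}_{x,t_0+n-1+\tau,i})) = \hat{\sigma}^2_{x,t_0+n-1+\tau,i} = \tau \times \hat{b}^2_{x,i} \times \hat{\sigma}^2_{\epsilon,i} + \hat{\sigma}^2_{\varepsilon_{x,i}},$$

and a $100(1-\gamma)\%$ predictive interval on $q_{x,t_0+n-1+\tau,i}$ can be constructed as

$$1 - \exp[-\exp(\ln(\hat{m}_{x,t_0+n-1+\tau,i}) \pm z_{\gamma/2} \times \hat{\sigma}_{x,t_0+n-1+\tau,i})].$$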
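The variance estimates and the predictive interval above can be combined in one routine, sketched below. The function name `lee_carter_interval` and the simulated input are hypothetical (the data are random numbers, not Human Mortality Database rates), and the fitting step repeats the approximate procedure of this section so that the block runs on its own.

```python
import numpy as np
from statistics import NormalDist

def lee_carter_interval(log_m, tau=1, gamma=0.05):
    """Return (lower, point, upper) forecasts of q for each age, tau years ahead."""
    n = log_m.shape[1]
    a = log_m.mean(axis=1)
    centered = log_m - a[:, None]
    k = centered.sum(axis=0)
    b = centered @ k / (k @ k)
    theta = (k[-1] - k[0]) / (n - 1)

    resid = centered - np.outer(b, k)
    var_eps = (resid ** 2).sum(axis=1) / (n - 2)         # model error, one value per age
    var_k = ((np.diff(k) - theta) ** 2).sum() / (n - 2)  # random-walk error
    sd = np.sqrt(tau * b ** 2 * var_k + var_eps)

    log_m_hat = a + b * (k[-1] + tau * theta)            # deterministic forecast
    z = NormalDist().inv_cdf(1 - gamma / 2)

    def to_q(x):
        return 1.0 - np.exp(-np.exp(x))                  # q = 1 - exp(-m)

    return to_q(log_m_hat - z * sd), to_q(log_m_hat), to_q(log_m_hat + z * sd)

# Simulated illustration only: 6 ages, 25 years, small Gaussian noise
rng = np.random.default_rng(0)
a_true = np.linspace(-7.0, -2.0, 6)
b_true = np.full(6, 1.0 / 6.0)
k_true = np.linspace(12.0, -12.0, 25)
log_m = a_true[:, None] + np.outer(b_true, k_true) + 0.02 * rng.standard_normal((6, 25))

lower, point, upper = lee_carter_interval(log_m, tau=1, gamma=0.05)
```

Since $x \mapsto 1 - \exp(-e^{x})$ is increasing, applying it to the endpoints of the normal interval for $\ln(\tilde{m})$ preserves their order, which is what the interval formula relies on.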

Figure 3.3 shows the projected period mortality rates in year 2010 based on the central death rates from year 1981 to 2009. The estimated mortality rates are quite close to the actual ones, and the predictive interval on the female mortality rates is narrower than that on the male ones.

3.2.2 Joint-k Lee-Carter model

Two independent models ignore the co-movements between the mortality rates for two populations. The males and females in a country live in the same environment, and their mortality rates are affected by some common factors. To relate one population’s mortality rates to the other’s, Carter and Lee (1992) introduced the joint-k model, where the time-varying index $k_{t,i}$, which shows the general mortality level over time, is the same for both populations, that is, $k_{t,1} = k_{t,2} = K_t$. The model can be expressed as

$$\ln(m_{x,t,i}) = a_{x,i} + b_{x,i} \times K_t + \varepsilon_{x,t,i}, \quad i = 1, 2,$$

subject to two new constraints:

• $\sum_{i=1}^{2} \sum_x b_{x,i} = 1$, and

• $\sum_t K_t = 0$.

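Similarly, $\hat{a}_{x,i}$ is obtained as the average of $\ln(m_{x,t,i})$ over a year span $[t_0, t_0+n-1]$, and $\hat{K}_t$ is equal to the sum of $[\ln(m_{x,t,i}) - \hat{a}_{x,i}]$ over an age span $[x_0, x_0+m-1]$ and the population index. That is,

$$\sum_{t=t_0}^{t_0+n-1} \ln(m_{x,t,i}) = n \times a_{x,i} + b_{x,i} \times \sum_{t=t_0}^{t_0+n-1} K_t,$$

implying

$$\hat{a}_{x,i} = \frac{\sum_{t=t_0}^{t_0+n-1} \ln(m_{x,t,i})}{n}, \quad x = x_0, x_0+1, \ldots, x_0+m-1;$$

and

$$\sum_{i=1}^{2} \sum_{x=x_0}^{x_0+m-1} \ln(m_{x,t,i}) = \sum_{i=1}^{2} \sum_{x=x_0}^{x_0+m-1} a_{x,i} + \sum_{i=1}^{2} \sum_{x=x_0}^{x_0+m-1} b_{x,i} \times K_t,$$

giving

$$\hat{K}_t = \sum_{i=1}^{2} \sum_{x=x_0}^{x_0+m-1} [\ln(m_{x,t,i}) - \hat{a}_{x,i}], \quad t = t_0, t_0+1, \ldots, t_0+n-1.$$

As for $b_{x,i}$, it is obtained by regressing $[\ln(m_{x,t,i}) - \hat{a}_{x,i}]$ on $\hat{K}_t$ without a constant term for each age $x$. The time-varying index $\hat{K}_t$ is assumed to follow a random walk with drift $\theta$. That is,

$$\hat{K}_t = \hat{K}_{t-1} + \theta + \epsilon_t,$$

where the error term $\epsilon_t$ is independent and identically normally distributed with mean 0 and variance $\sigma^2_{\epsilon}$, and $\epsilon_t$ is independent of $\varepsilon_{x,t,i}$. The parameter $\theta$ is estimated by

$$\hat{\theta} = \frac{\hat{K}_{t_0+n-1} - \hat{K}_{t_0}}{n-1}.$$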

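The logarithm of the forecasted central death rate for age $x$ in year $t_0+n-1+\tau$ is

$$\begin{aligned}
\ln(\tilde{m}_{x,t_0+n-1+\tau,i}) &= \hat{a}_{x,i} + \hat{b}_{x,i} \times (\hat{K}_{t_0+n-1} + \tau \times \hat{\theta} + \sqrt{\tau} \times \epsilon_t) + \varepsilon_{x,t,i} \\
&= \ln(\hat{m}_{x,t_0+n-1+\tau,i}) + \sqrt{\tau} \times \hat{b}_{x,i} \times \epsilon_t + \varepsilon_{x,t,i},
\end{aligned}$$

where

$$\ln(\hat{m}_{x,t_0+n-1+\tau,i}) = \hat{a}_{x,i} + \hat{b}_{x,i} \times (\hat{K}_{t_0+n-1} + \tau \times \hat{\theta}), \quad \tau = 1, 2, \ldots.$$

The estimate of the variance of $\varepsilon_{x,t,i}$ is

$$\hat{\sigma}^2_{\varepsilon_{x,i}} = \frac{\sum_{t=t_0}^{t_0+n-1} [\ln(m_{x,t,i}) - \hat{a}_{x,i} - \hat{b}_{x,i} \times \hat{K}_t]^2}{n-2},$$

and the estimate of the variance of $\epsilon_t$ is

$$\hat{\sigma}^2_{\epsilon} = \frac{\sum_{t=t_0+1}^{t_0+n-1} (\hat{K}_t - \hat{K}_{t-1} - \hat{\theta})^2}{n-2}.$$

Thus, the estimate of the variance of $\ln(\tilde{m}_{x,t_0+n-1+\tau,i})$ is

$$\hat{\sigma}^2_{x,t_0+n-1+\tau,i} = \tau \times \hat{b}^2_{x,i} \times \hat{\sigma}^2_{\epsilon} + \hat{\sigma}^2_{\varepsilon_{x,i}},$$

which can be used to construct a $100(1-\gamma)\%$ predictive interval on $q_{x,t_0+n-1+\tau,i}$ as

$$1 - \exp[-\exp(\ln(\hat{m}_{x,t_0+n-1+\tau,i}) \pm z_{\gamma/2} \times \hat{\sigma}_{x,t_0+n-1+\tau,i})].$$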
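A minimal sketch of the joint-k fitting steps follows, assuming both populations share the same age and year spans so that their log central death rates can be stacked into one array. The function name `fit_joint_k` and the noise-free toy data are hypothetical.

```python
import numpy as np

def fit_joint_k(log_m1, log_m2):
    """Fit the joint-k model; each input has shape (ages, years)."""
    log_m = np.stack([log_m1, log_m2])         # shape (2, ages, years)
    a = log_m.mean(axis=2)                     # a_{x,i}: average over years
    centered = log_m - a[:, :, None]
    K = centered.sum(axis=(0, 1))              # K_t: sum over populations and ages
    b = centered @ K / (K @ K)                 # b_{x,i}: no-intercept slopes, shape (2, ages)
    theta = (K[-1] - K[0]) / (K.size - 1)      # drift of the shared index
    return a, b, K, theta

# Hypothetical noise-free data; the b values sum to 1 across both populations
K_true = np.linspace(8.0, -8.0, 17)            # sums to 0
b1 = np.array([0.20, 0.15, 0.10, 0.05])
b2 = np.array([0.15, 0.15, 0.10, 0.10])
a1 = np.linspace(-6.0, -3.0, 4)
a2 = np.linspace(-6.5, -3.5, 4)
log_m1 = a1[:, None] + np.outer(b1, K_true)
log_m2 = a2[:, None] + np.outer(b2, K_true)

a_hat, b_hat, K_hat, theta_hat = fit_joint_k(log_m1, log_m2)
```

The only changes from the independent fit are that the index is summed over both populations and that the constraint on $b_{x,i}$ holds for the two populations jointly rather than separately.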

Using the same data set as that for Figure 3.3, we display the corresponding forecasted mortality rates and associated predictive intervals based on the joint-k model in Figure 3.4. The projected mortality rates and the predictive intervals are almost the same as those for the independent model: the estimated mortality rates are close to the actual ones, and the 95% predictive interval on the male mortality rates is wider than that on the female ones.

3.2.3 Co-integrated model

In the joint-k model, two populations share the same time-varying index $K_t$. In the co-integrated model, the time-varying index for population 2 is a linear function of the time-varying index for population 1. Specifically, assume the mortality rates for populations 1 and 2 are modeled by

$$\ln(m_{x,t,1}) = a_{x,1} + b_{x,1} \times k_{t,1} + \varepsilon_{x,t,1}$$

and

$$\ln(m_{x,t,2}) = a_{x,2} + b_{x,2} \times k_{t,2} + \varepsilon_{x,t,2}.$$

By fitting the model with given data, $\hat{k}_{t,1}$ and $\hat{k}_{t,2}$ can be estimated separately in the same way as those for two independent Lee-Carter models. Under the co-integrated model, we assume there is a linear relationship plus an error term $e_t$ between $\hat{k}_{t,1}$ and $\hat{k}_{t,2}$, that is,

$$\hat{k}_{t,2} = \alpha + \beta \times \hat{k}_{t,1} + e_t. \quad (3.1)$$

Since $\hat{k}_{t,1}$ and $\hat{k}_{t,2}$ are known, the estimates of the parameters $\alpha$ and $\beta$ can be obtained by simple linear regression. To build the link between $\hat{k}_{t,1}$ and $\hat{k}_{t,2}$, the co-integrated model suggests re-estimating $\hat{k}_{t,2}$ to get $\hat{\hat{k}}_{t,2}$, with $\hat{k}_{t,1}$ unchanged, by plugging the values of $\hat{\alpha}$ and $\hat{\beta}$ into (3.1). That is,

$$\hat{\hat{k}}_{t,2} = \hat{\alpha} + \hat{\beta} \times \hat{k}_{t,1}.$$

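In the co-integrated model, the estimates of the two variances for population 1, $\hat{\sigma}^2_{\varepsilon_{x,1}}$ and $\hat{\sigma}^2_{\epsilon,1}$, are the same as those for the original Lee-Carter model. However, for the second population, the drift of the time-varying index is estimated by

$$\hat{\theta}_2 = \frac{\hat{\hat{k}}_{t_0+n-1,2} - \hat{\hat{k}}_{t_0,2}}{n-1} = \hat{\beta} \times \frac{\hat{k}_{t_0+n-1,1} - \hat{k}_{t_0,1}}{n-1} = \hat{\beta} \times \hat{\theta}_1,$$

and the estimate of the variance of the error term for the time-varying index is

$$\hat{\sigma}^2_{\epsilon,2} = \frac{\sum_{t=t_0+1}^{t_0+n-1} (\hat{\hat{k}}_{t,2} - \hat{\hat{k}}_{t-1,2} - \hat{\theta}_2)^2}{n-2} = \hat{\beta}^2 \times \frac{\sum_{t=t_0+1}^{t_0+n-1} (\hat{k}_{t,1} - \hat{k}_{t-1,1} - \hat{\theta}_1)^2}{n-2} = \hat{\beta}^2 \times \hat{\sigma}^2_{\epsilon,1}.$$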
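The re-estimation step and the two identities above can be checked numerically with the sketch below. The function name `cointegrate` and the simulated index paths are hypothetical; `np.polyfit` with degree 1 is used here as one way to carry out the simple linear regression in (3.1).

```python
import numpy as np

def cointegrate(k1, k2):
    """Regress k2 on k1, refit k2 from the line, and return drifts and variances."""
    beta, alpha = np.polyfit(k1, k2, 1)        # slope first, then intercept
    k2_refit = alpha + beta * k1               # re-estimated index for population 2
    n = k1.size
    theta1 = (k1[-1] - k1[0]) / (n - 1)
    theta2 = (k2_refit[-1] - k2_refit[0]) / (n - 1)
    var1 = ((np.diff(k1) - theta1) ** 2).sum() / (n - 2)
    var2 = ((np.diff(k2_refit) - theta2) ** 2).sum() / (n - 2)
    return alpha, beta, k2_refit, theta1, theta2, var1, var2

# Simulated index paths for illustration only
rng = np.random.default_rng(1)
k1 = np.cumsum(-1.0 + 0.5 * rng.standard_normal(29))
k2 = 1.0 + 0.8 * k1 + 0.3 * rng.standard_normal(29)

alpha, beta, k2_refit, theta1, theta2, var1, var2 = cointegrate(k1, k2)
print(theta2 - beta * theta1)    # numerically zero
print(var2 - beta ** 2 * var1)   # numerically zero
```

Both differences vanish because the refitted index is an exact linear function of $\hat{k}_{t,1}$, so population 2 inherits its drift and index variance from population 1 through $\hat{\beta}$.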

The expression of the predictive interval on $\hat{q}_{x,t_0+n-1+\tau,i}$ is still the same as that for the independent Lee-Carter model. Figure 3.5 shows the forecasted mortality rates $\hat{q}_{x,2009+1,2}$ (female, $\tau = 1$) and the associated 95% predictive intervals for the co-integrated model, together with the corresponding ones for the independent model for comparison. We can see that the forecasted mortality rates and the predictive intervals for the two models are quite close to each other. Starting from age 40, the forecasted mortality rates for the co-integrated model become slightly higher than those for the independent model. The predictive intervals for the two models almost overlap.

Figure 3.5: 95% predictive intervals on $\hat{q}_{x,2009+1,2}$ for the co-integrated and independent models

3.2.4 Augmented common factor model

To avoid divergence in life expectancy in the long run, Li and Lee (2005) suggested adding a common factor to the original model. The common term has the following features:

• $b_{x,1} = b_{x,2} = B_x$ for all $x$, and

• for the time-varying index, $k_{t,1} = k_{t,2} = K_t$ for all $t$.

Two similar constraints apply, which are

• $\sum_{i=1}^{2} \sum_x w_i \times B_x = 1$, and

• $\sum_t K_t = 0$,

where $w_1$ and $w_2$ are the weights for populations 1 and 2, respectively, and $w_1 + w_2 = 1$. Thus, the common factor model can be expressed as

$$\ln(m_{x,t,i}) = a_{x,i} + B_x \times K_t + \varepsilon_{x,t,i}.$$

According to Li and Lee (2005), $\hat{a}_{x,i}$ can still be derived by averaging $\ln(m_{x,t,i})$ over a year span $[t_0, t_0+n-1]$, that is,

$$\sum_{t=t_0}^{t_0+n-1} \ln(m_{x,t,i}) = \sum_{t=t_0}^{t_0+n-1} a_{x,i} + B_x \times \sum_{t=t_0}^{t_0+n-1} K_t,$$

implying

$$\hat{a}_{x,i} = \frac{\sum_{t=t_0}^{t_0+n-1} \ln(m_{x,t,i})}{n},$$

and $K_t$ is obtained by

$$\sum_{i=1}^{2} \sum_{x=x_0}^{x_0+m-1} w_i \times [\ln(m_{x,t,i}) - \hat{a}_{x,i}] = K_t \times \sum_{i=1}^{2} \sum_{x=x_0}^{x_0+m-1} w_i \times B_x,$$

yielding

$$\hat{K}_t = \sum_{i=1}^{2} \sum_{x=x_0}^{x_0+m-1} w_i \times [\ln(m_{x,t,i}) - \hat{a}_{x,i}].$$

We set $w_1 = w_2 = 0.5$ for the U.S. male and female life tables. Since

$$\sum_{i=1}^{2} w_i \times [\ln(m_{x,t,i}) - \hat{a}_{x,i}] = B_x \times \hat{K}_t \times \sum_{i=1}^{2} w_i = B_x \times \hat{K}_t,$$

we regress $\sum_{i=1}^{2} w_i \times [\ln(m_{x,t,i}) - \hat{a}_{x,i}]$ on $\hat{K}_t$ to get $\hat{B}_x$ for each age $x$. To fit the data better, Li and Lee (2005) added a factor $b'_{x,i} \times k'_{t,i}$ for each population to the common factor model to form the so-called augmented common factor model as

$$\ln(m_{x,t,i}) = a_{x,i} + B_x \times K_t + b'_{x,i} \times k'_{t,i} + \varepsilon_{x,t,i}, \quad i = 1, 2,$$

with an extra constraint $\sum_x b'_{x,i} = 1$. Notice that $b'_{x,i}$ and $k'_{t,i}$ here are different from $b_{x,i}$ and $k_{t,i}$ in the independent model. The constraint $\sum_x b'_{x,i} = 1$ implies that

$$\hat{k}'_{t,i} = \sum_{x=x_0}^{x_0+m-1} [\ln(m_{x,t,i}) - \hat{a}_{x,i} - \hat{B}_x \times \hat{K}_t].$$

Finally, $\hat{b}'_{x,i}$ is derived by regressing $[\ln(m_{x,t,i}) - \hat{a}_{x,i} - \hat{B}_x \times \hat{K}_t]$ on $\hat{k}'_{t,i}$ without a constant term for each age $x$.

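The logarithm of the forecasted central death rate for age $x$ in year $t_0+n-1+\tau$ is

\[
\begin{aligned}
\ln(\tilde{m}_{x,t_0+n-1+\tau,i}) &= \hat{a}_{x,i} + \hat{B}_x \times (\hat{K}_{t_0+n-1} + \tau \times \hat{\theta} + \sqrt{\tau} \times \epsilon_t) + \hat{b}'_{x,i} \times (\hat{k}'_{t_0+n-1,i} + \tau \times \hat{\theta}_i + \sqrt{\tau} \times \epsilon_{t,i}) + \varepsilon_{x,t,i} \\
&= \ln(\hat{m}_{x,t_0+n-1+\tau,i}) + \sqrt{\tau} \times (\hat{B}_x \times \epsilon_t + \hat{b}'_{x,i} \times \epsilon_{t,i}) + \varepsilon_{x,t,i},
\end{aligned}
\]

which is normal with mean

\[ \ln(\hat{m}_{x,t_0+n-1+\tau,i}) = \hat{a}_{x,i} + \hat{B}_x \times (\hat{K}_{t_0+n-1} + \tau \times \hat{\theta}) + \hat{b}'_{x,i} \times (\hat{k}'_{t_0+n-1,i} + \tau \times \hat{\theta}_i), \]

and variance

\[ \sigma^2(\ln(\tilde{m}_{x,t_0+n-1+\tau,i})) = \tau \times (\hat{B}_x^2 \times \sigma^2_{\epsilon} + (\hat{b}'_{x,i})^2 \times \sigma^2_{\epsilon,i}) + \sigma^2_{\varepsilon_{x,i}}, \]

where the three error terms, $\{\varepsilon_{x,t,i}\}$, $\{\epsilon_t\}$ and $\{\epsilon_{t,i}\}$, are assumed independent,

\[ \hat{\theta} = \frac{\hat{K}_{t_0+n-1} - \hat{K}_{t_0}}{n-1} \quad \text{and} \quad \hat{\theta}_i = \frac{\hat{k}'_{t_0+n-1,i} - \hat{k}'_{t_0,i}}{n-1}, \quad i = 1, 2. \]

To construct the predictive intervals and simulate mortality rates, we need the estimates of $\sigma^2_{\varepsilon_{x,i}}$, $\sigma^2_{\epsilon}$ and $\sigma^2_{\epsilon,i}$, which are

\[ \hat{\sigma}^2_{\varepsilon_{x,i}} = \frac{\sum_{t=t_0}^{t_0+n-1} [\ln(m_{x,t,i}) - \hat{a}_{x,i} - \hat{B}_x \times \hat{K}_t - \hat{b}'_{x,i} \times \hat{k}'_{t,i}]^2}{n-3}, \quad i = 1, 2, \]

\[ \hat{\sigma}^2_{\epsilon} = \frac{\sum_{t=t_0+1}^{t_0+n-1} (\hat{K}_t - \hat{K}_{t-1} - \hat{\theta})^2}{n-2}, \]

and

\[ \hat{\sigma}^2_{\epsilon,i} = \frac{\sum_{t=t_0+1}^{t_0+n-1} (\hat{k}'_{t,i} - \hat{k}'_{t-1,i} - \hat{\theta}_i)^2}{n-2}, \quad i = 1, 2. \]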
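As a rough illustration of how the mean, the variance and a predictive interval for the forecasted log central death rate fit together, consider the sketch below. The function name `acf_log_forecast` and its argument names are hypothetical, and the interval is built on the log scale only; how it is mapped to an interval for $q$ is not covered here.

```python
from statistics import NormalDist


def acf_log_forecast(a, B, K_last, theta, b, k_last, theta_i,
                     var_K, var_k, var_eps, tau, level=0.95):
    """Mean, variance and predictive interval of ln(m~) at lead time tau.

    All arguments are scalars for one age x and one population i:
    a, B, b are the fitted age parameters, K_last and k_last the last fitted
    time indices, theta and theta_i the drifts, var_K and var_k the variances
    of the two random-walk errors, and var_eps the model error variance.
    (Names are assumptions of this sketch.)
    """
    mean = a + B * (K_last + tau * theta) + b * (k_last + tau * theta_i)
    # The three error terms are independent, so their variances add up.
    variance = tau * (B ** 2 * var_K + b ** 2 * var_k) + var_eps
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided normal quantile
    half_width = z * variance ** 0.5
    return mean, variance, (mean - half_width, mean + half_width)
```

The random-walk part of the variance grows linearly in $\tau$, while the model error contributes the same amount at every lead time.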

Figure 3.6 displays the forecasted $\hat{q}_{x,2009+1,i}$ and the associated predictive intervals, as well as the actual $q_{x,2009+1,i}$, for the augmented common factor model. When $x$ is small, the forecasted values are quite close to the true values. However, as $x$ increases, the projected mortality rates tend to be higher than the actual ones. Again, the predictive interval for the females is narrower than that for the males.

Figure 3.6: 95% predictive intervals on $\hat{q}_{x,2009+1,i}$ for the augmented common factor model

Figure 3.7 compares the predicted mortality rates from the four models with the actual mortality rates for year 2010 ($\tau = 1$). All forecasted values are close to the actual ones for small $x$. We notice that the forecasted values from the joint-k model are closer to the true values, while the mortality curve constructed from the augmented common factor model is not as close to the actual mortality curve as those from the other models.

Chapter 4 Application in Mortality Swap

4.1 Natural hedging and mortality swap

Mortality (longevity) risk is the risk that the number of deaths (survivors) is higher than expected. When there is an unexpected change in mortality rates, either life insurers or annuity providers will experience a loss. If the mortality rates increase unexpectedly in a year, the number of deaths during the year is higher than expected, so life insurance companies need to pay more death benefits; annuity providers, however, gain from the mortality increase. If the mortality rates decline unexpectedly, the impacts on the financial positions of life insurers and annuity providers are reversed. Natural hedge is a strategy of hedging two risks that respond in opposite directions to a change in a common factor. Since life insurers and annuity providers face mortality and longevity risks, respectively, both of them can adopt natural hedge by swapping a portion of their business with each other. When a life insurer (an annuity provider) owns both life and annuity business at the same time, the mortality and longevity risks of the portfolio offset each other to some degree no matter how the mortality rates change. A question then arises: what are the optimal portions of business to be swapped between the life insurer and the annuity provider such that the risk of the portfolio is minimized?

Let $L$ denote the loss function at time zero, which is the present value of future liabilities less the present value of future premium incomes. The values of both future liabilities and premium incomes depend on the future mortality rates. Before swapping the business, a life insurer (an annuity provider) has a portfolio of life (annuity) business, and the loss function for the portfolio is denoted by $L_l$ ($L_a$). To hedge mortality (longevity) risk, the life insurer (annuity provider) would like to swap a proportion $w_l$ ($w_a$) of its life (annuity) business to the annuity provider (life insurer); the resulting loss functions of the life insurer and annuity provider become

\[ L_L = (1 - w_l) \times L_l + w_a \times L_a \]

and

\[ L_A = (1 - w_a) \times L_a + w_l \times L_l, \]

respectively, with variances

\[ \mathrm{Var}(L_L) = (1 - w_l)^2 \times \mathrm{Var}(L_l) + w_a^2 \times \mathrm{Var}(L_a) + 2 \times (1 - w_l) \times w_a \times \mathrm{Cov}(L_l, L_a) \tag{4.1} \]

and

\[ \mathrm{Var}(L_A) = (1 - w_a)^2 \times \mathrm{Var}(L_a) + w_l^2 \times \mathrm{Var}(L_l) + 2 \times (1 - w_a) \times w_l \times \mathrm{Cov}(L_l, L_a). \tag{4.2} \]
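Equations (4.1) and (4.2) can be checked numerically with a short sketch such as the one below. The helper names `swapped_variances` and `sample_cov` are assumptions made for this example; with sample variances and a sample covariance as inputs, the formulas reproduce the sample variances of the swapped loss series exactly, up to floating-point error.

```python
def swapped_variances(V_l, V_a, cov, w_l, w_a):
    """Variances of L_L and L_A after the swap, following (4.1) and (4.2)."""
    var_LL = ((1 - w_l) ** 2 * V_l + w_a ** 2 * V_a
              + 2 * (1 - w_l) * w_a * cov)
    var_LA = ((1 - w_a) ** 2 * V_a + w_l ** 2 * V_l
              + 2 * (1 - w_a) * w_l * cov)
    return var_LL, var_LA


def sample_cov(u, v):
    """Unbiased sample covariance; sample_cov(u, u) is the sample variance."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)
```

For simulated losses `L_l` and `L_a`, one would call `swapped_variances(sample_cov(L_l, L_l), sample_cov(L_a, L_a), sample_cov(L_l, L_a), w_l, w_a)`.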

There are three ways to define the optimal pair of weights, which minimizes either the variance of one loss function or the sum of the variances of the two loss functions. The first pair of weights, $(w_l^{L+A}, w_a^{L+A})$, is used to minimize $\mathrm{Var}(L_L) + \mathrm{Var}(L_A)$ (see Figure 4.1); the second pair of weights, $(w_l^{L}, w_a^{L})$, is used to minimize $\mathrm{Var}(L_L)$ (see Figure 4.2); and the third pair of weights, $(w_l^{A}, w_a^{A})$, is used to minimize $\mathrm{Var}(L_A)$ (see Figure 4.3), where we place superscripts $L+A$, $L$ and $A$ on $w_l$ and $w_a$ to denote the weights that minimize $\mathrm{Var}(L_L) + \mathrm{Var}(L_A)$, $\mathrm{Var}(L_L)$ and $\mathrm{Var}(L_A)$, respectively.

Figure 4.1: Mortality swap: minimizing $\mathrm{Var}(L_L) + \mathrm{Var}(L_A)$

Figure 4.2: Mortality swap: minimizing $\mathrm{Var}(L_L)$

In mathematical optimization, the method of Lagrange multipliers is a strategy for finding the local maximum (minimum) of a function subject to some constraint(s). Let $P_l$ ($P_a$) stand for the present value of the future premiums of all life (annuity) policies in the portfolio before the swap. When the life insurer (annuity provider) swaps a proportion $w_l$ ($w_a$) of its life (annuity) policies to the annuity provider (life insurer), the life insurer (annuity provider) loses premium $w_l \times P_l$ ($w_a \times P_a$) and gets premium $w_a \times P_a$ ($w_l \times P_l$). We set a swap condition $w_l \times P_l = w_a \times P_a$, which is applied as the constraint in the three optimization problems mentioned above using the method of Lagrange multipliers. Specifically, we would like to find $(\hat{w}_l, \hat{w}_a)$ which minimizes $f(w_l, w_a)$ subject to $w_l \times P_l = w_a \times P_a$, where $f$ is $\mathrm{Var}(L_L) + \mathrm{Var}(L_A)$, $\mathrm{Var}(L_L)$ or $\mathrm{Var}(L_A)$.

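To obtain $(\hat{w}_l^{L+A}, \hat{w}_a^{L+A})$, define $f(w_l^{L+A}, w_a^{L+A}) = \mathrm{Var}(L_L) + \mathrm{Var}(L_A)$. By (4.1) and (4.2),

\[
\begin{aligned}
f(w_l^{L+A}, w_a^{L+A}) ={}& (1 - w_l^{L+A})^2 \times \mathrm{Var}(L_l) + (w_a^{L+A})^2 \times \mathrm{Var}(L_a) \\
& + 2 \times (1 - w_l^{L+A}) \times w_a^{L+A} \times \mathrm{Cov}(L_l, L_a) \\
& + (1 - w_a^{L+A})^2 \times \mathrm{Var}(L_a) + (w_l^{L+A})^2 \times \mathrm{Var}(L_l) \\
& + 2 \times (1 - w_a^{L+A}) \times w_l^{L+A} \times \mathrm{Cov}(L_l, L_a).
\end{aligned}
\]

According to the method of Lagrange multipliers with the constraint $w_l^{L+A} \times P_l = w_a^{L+A} \times P_a$, the Lagrange function is defined by

\[ \varphi(w_l^{L+A}, w_a^{L+A}, \lambda) = f(w_l^{L+A}, w_a^{L+A}) + \lambda (w_l^{L+A} \times P_l - w_a^{L+A} \times P_a), \]

where $\lambda$ is called a Lagrange multiplier. To obtain the optimal solution, we differentiate $\varphi$ with respect to $w_l^{L+A}$, $w_a^{L+A}$ and $\lambda$, respectively, and set all results to zero. That is,

\[ \frac{\partial \varphi}{\partial w_l^{L+A}} = 4 w_l^{L+A} V_l - 4 w_a^{L+A} \sigma^2 + 2\sigma^2 - 2V_l + \lambda P_l = 0, \]

\[ \frac{\partial \varphi}{\partial w_a^{L+A}} = 4 w_a^{L+A} V_a - 4 w_l^{L+A} \sigma^2 + 2\sigma^2 - 2V_a - \lambda P_a = 0, \]

and

\[ \frac{\partial \varphi}{\partial \lambda} = w_l^{L+A} P_l - w_a^{L+A} P_a = 0, \]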

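where $V_l = \mathrm{Var}(L_l)$, $V_a = \mathrm{Var}(L_a)$ and $\sigma^2 = \mathrm{Cov}(L_l, L_a)$. Then $(\hat{w}_l^{L+A}, \hat{w}_a^{L+A})$ can be solved as

\[ \hat{w}_l^{L+A} = \frac{1}{2} \times P_a \times \frac{P_a V_l + P_l V_a - (P_l + P_a)\sigma^2}{P_a^2 V_l - 2 P_l P_a \sigma^2 + P_l^2 V_a} \tag{4.3} \]

and

\[ \hat{w}_a^{L+A} = \frac{1}{2} \times P_l \times \frac{P_a V_l + P_l V_a - (P_l + P_a)\sigma^2}{P_a^2 V_l - 2 P_l P_a \sigma^2 + P_l^2 V_a}. \tag{4.4} \]

Similarly, when $f(w_l^{L}, w_a^{L}) = \mathrm{Var}(L_L)$ with the constraint $w_l^{L} \times P_l = w_a^{L} \times P_a$, the optimal weights, $\hat{w}_l^{L}$ and $\hat{w}_a^{L}$, become

\[ \hat{w}_l^{L} = P_a \times \frac{P_a V_l - P_l \sigma^2}{P_a^2 V_l - 2 P_l P_a \sigma^2 + P_l^2 V_a} \tag{4.5} \]

and

\[ \hat{w}_a^{L} = P_l \times \frac{P_a V_l - P_l \sigma^2}{P_a^2 V_l - 2 P_l P_a \sigma^2 + P_l^2 V_a}; \tag{4.6} \]

when $f(w_l^{A}, w_a^{A}) = \mathrm{Var}(L_A)$ with the constraint $w_l^{A} \times P_l = w_a^{A} \times P_a$, the optimal weights, $\hat{w}_l^{A}$ and $\hat{w}_a^{A}$, are

\[ \hat{w}_l^{A} = P_a \times \frac{P_l V_a - P_a \sigma^2}{P_a^2 V_l - 2 P_l P_a \sigma^2 + P_l^2 V_a} \tag{4.7} \]

and

\[ \hat{w}_a^{A} = P_l \times \frac{P_l V_a - P_a \sigma^2}{P_a^2 V_l - 2 P_l P_a \sigma^2 + P_l^2 V_a}. \tag{4.8} \]

Note that $\hat{w}_l^{L+A} = \frac{1}{2}(\hat{w}_l^{L} + \hat{w}_l^{A})$ and $\hat{w}_a^{L+A} = \frac{1}{2}(\hat{w}_a^{L} + \hat{w}_a^{A})$. It is very difficult to obtain the theoretical expressions of $V_l$, $V_a$ and $\sigma^2$. Instead, we would like to compute the corresponding sample variances and covariance by simulating thousands of $V_l$, $V_a$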
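The closed-form weights (4.3) to (4.8) translate directly into code. The sketch below is illustrative only; the function name `optimal_weights` and the dictionary keys are assumptions of this example, and the inputs would in practice be the sample variances, the sample covariance and the total premium values.

```python
def optimal_weights(V_l, V_a, cov, P_l, P_a):
    """Optimal swap weights under the constraint w_l * P_l = w_a * P_a.

    V_l, V_a: variances of L_l and L_a; cov: Cov(L_l, L_a) (sigma^2 in the
    text); P_l, P_a: present values of future premiums before the swap.
    Returns {"L+A": (w_l, w_a), "L": (w_l, w_a), "A": (w_l, w_a)}.
    """
    # Common denominator of (4.3)-(4.8); it equals Var(P_a*L_l - P_l*L_a).
    denom = P_a ** 2 * V_l - 2 * P_l * P_a * cov + P_l ** 2 * V_a

    num_L = P_a * V_l - P_l * cov            # numerator in (4.5), (4.6)
    num_A = P_l * V_a - P_a * cov            # numerator in (4.7), (4.8)
    num_sum = 0.5 * (num_L + num_A)          # numerator in (4.3), (4.4)

    return {
        "L+A": (P_a * num_sum / denom, P_l * num_sum / denom),
        "L": (P_a * num_L / denom, P_l * num_L / denom),
        "A": (P_a * num_A / denom, P_l * num_A / denom),
    }
```

Every returned pair satisfies the swap condition by construction, and the `"L+A"` pair is the average of the `"L"` and `"A"` pairs, in line with the note above.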

4.2 Numerical illustrations

4.2.1 Assumptions and portfolios

In practice, a life (an annuity) portfolio consists of a variety of life (annuity) products. For simplicity, we assume that the portfolio for the life insurer consists of $(65-x)$-payment whole life insurance issued to the insureds aged $x = 25, \ldots, 64$, with the death benefits paid at the end of the year of death, and that the portfolio for the annuity provider is composed of $(65-x)$-payment, $(65-x)$-year deferred whole life annuities-due issued to the insureds aged $x = 25, \ldots, 64$. Since life insurance (annuities) are more often purchased by those who have poorer (better) health conditions, and we have no life (annuity) tables for consecutive years for forecasting mortality rates with the models proposed in the preceding chapter, we use the U.S. male (female) mortality table for the life (annuity) insureds. Because the year span $[t_0, t_0+n-1]$ is used for estimating the parameters of the mortality models and forecasting the mortality rates for years $t_0+n-1+\tau$, $\tau = 1, 2, \ldots$, we set the beginning of year $t_1 = t_0+n$ as time 0. Denote by $l_{x,t_1,i}$ the initial number of insureds aged $x$ at time 0 for population $i$ ($i = 1$ for life and $i = 2$ for annuity), and set $l_{25,t_1,1} = l_{25,t_1,2} = 10^7$. By $l_{x+1,t_1,i} = l_{x,t_1,i} \times p_{x,t_1-1,i}$, $x = 25, \ldots, 63$, the initial numbers of insureds for the entire portfolio can be obtained. The death benefit for life insurance is $B_l = 100{,}000$ and the annual survival benefit is $B_a = 10{,}000$. Based on the assumptions, the loss functions of the life insurer and annuity provider at time zero are

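\[ L_l = \sum_{x=25}^{64} l_{x,t_1,1} \times (A_{x,1} \times B_l - \ddot{a}_{x:\overline{65-x}|,1} \times P_{x,l}), \tag{4.9} \]

\[ L_a = \sum_{x=25}^{64} l_{x,t_1,2} \times ({}_{65-x|}\ddot{a}_{x,2} \times B_a - \ddot{a}_{x:\overline{65-x}|,2} \times P_{x,a}), \tag{4.10} \]

where

\[ A_{x,1} = \sum_{k=0}^{100-x} {}_{k}p_{x,t_1,1} \cdot q_{x+k,t_1+k,1} \cdot v^{k+1}, \tag{4.11} \]

\[ \ddot{a}_{x:\overline{65-x}|,i} = \sum_{k=0}^{65-x-1} {}_{k}p_{x,t_1,i} \cdot v^{k}, \quad i = 1, 2, \tag{4.12} \]

\[ {}_{65-x|}\ddot{a}_{x,2} = \sum_{k=65}^{100} {}_{(k-x)}p_{x,t_1,2} \cdot v^{k-x}, \tag{4.13} \]

\[ P_{x,l} = \frac{A_{x,1} \times B_l}{\ddot{a}_{x:\overline{65-x}|,1}}, \tag{4.14} \]

and

\[ P_{x,a} = \frac{{}_{65-x|}\ddot{a}_{x,2} \times B_a}{\ddot{a}_{x:\overline{65-x}|,2}}, \tag{4.15} \]

where ${}_{k}p_{x,t_1,i} = \prod_{j=0}^{k-1} p_{x+j,t_1+j,i}$ with ${}_{0}p_{x,t_1,i} = 1$, $v = (1+i)^{-1}$ and $i = 6\%$ is the interest rate. Note that we set the limiting age equal to 100 for both populations, that is, $q_{100,t,i} = 1$, $i = 1, 2$. Moreover, $P_l$ and $P_a$, the total present values of the future premiums for the life insurer and annuity provider, respectively, in (4.3) to (4.8) are (see (4.9) and (4.10))

\[ P_l = \sum_{x=25}^{64} l_{x,t_1,1} \times \ddot{a}_{x:\overline{65-x}|,1} \times P_{x,l} \quad \text{and} \quad P_a = \sum_{x=25}^{64} l_{x,t_1,2} \times \ddot{a}_{x:\overline{65-x}|,2} \times P_{x,a}. \]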
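To make formulas (4.11) to (4.15) concrete, the following sketch computes the per-policy quantities for a single issue age from one cohort vector of death probabilities. It is a simplified illustration: all function names are hypothetical, and the input `q` is assumed to hold the probabilities along the cohort diagonal, with `q[k]` the death probability at age $x+k$ in year $t_1+k$ and the last entry equal to 1 at the limiting age 100.

```python
def survival(q):
    """Return [0p_x, 1p_x, ..., len(q)p_x] from one-year death probabilities."""
    kp = [1.0]
    for q_k in q:
        kp.append(kp[-1] * (1.0 - q_k))
    return kp


def policy_values(q, x, interest=0.06, pay_age=65):
    """Return (A_x, temporary annuity-due, deferred annuity-due) per unit."""
    v = 1.0 / (1.0 + interest)
    kp = survival(q)
    n_pay = pay_age - x  # number of premium payments, 65 - x
    A = sum(kp[k] * q[k] * v ** (k + 1) for k in range(len(q)))      # (4.11)
    a_temp = sum(kp[k] * v ** k for k in range(n_pay))                # (4.12)
    a_def = sum(kp[k] * v ** k for k in range(n_pay, len(q)))         # (4.13)
    return A, a_temp, a_def


def life_premium(q_expected, x, B_l=100_000.0):
    A, a_temp, _ = policy_values(q_expected, x)
    return A * B_l / a_temp                                           # (4.14)


def annuity_premium(q_expected, x, B_a=10_000.0):
    _, a_temp, a_def = policy_values(q_expected, x)
    return a_def * B_a / a_temp                                       # (4.15)


def life_loss(q_realized, x, P_xl, B_l=100_000.0):
    """Per-policy term of (4.9): benefits less premiums at realized rates."""
    A, a_temp, _ = policy_values(q_realized, x)
    return A * B_l - a_temp * P_xl


def annuity_loss(q_realized, x, P_xa, B_a=10_000.0):
    """Per-policy term of (4.10): benefits less premiums at realized rates."""
    _, a_temp, a_def = policy_values(q_realized, x)
    return a_def * B_a - a_temp * P_xa
```

When the realized rates equal the expected ones, both per-policy losses are zero. Scaling all death probabilities up produces a positive life loss and a negative annuity loss, which is the opposite response that the natural hedge relies on.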

The premiums $P_{x,l}$ and $P_{x,a}$ in (4.14) and (4.15), and the total premiums $P_l$ and $P_a$, are pre-determined and do not respond to a change in mortality rates, whereas $A_{x,1}$, $\ddot{a}_{x:\overline{65-x}|,i}$ and ${}_{65-x|}\ddot{a}_{x,2}$ in (4.11), (4.12) and (4.13) vary in response to the realized mortality rates. Therefore, the deterministic mortality rates, $\hat{q}_{x,t_0+n-1+\tau}$, $\tau = 1, 2, \ldots$, are used to calculate the premiums $P_{x,l}$ and $P_{x,a}$, and the stochastic ones, $\tilde{q}_{x,t_0+n-1+\tau}$, $\tau = 1, 2, \ldots$, are used for simulations to compute $A_{x,1}$, $\ddot{a}_{x:\overline{65-x}|,i}$ and ${}_{65-x|}\ddot{a}_{x,2}$. When the realized mortality rates are different from the expected ones, each of the loss functions $L_l$ and $L_a$ is either positive or negative. To forecast deterministic and stochastic mortality rates for ages 25 to 64 and years 2011 to 2086 using the four models given in Chapter 3, we set the year span $[1981, 2010]$ and age span $[25, 100]$ (see Figure 4.4) to estimate the parameters with the male and female mortality data from the Human Mortality Database for the life and annuity policies, respectively.

Figure 4.4: age span $[25, 100]$ and year span $[1981, 2010]$

Specifically, for the independent model, we give the following steps to forecast deterministic mortality rates for computing the premiums $P_{x,l}$ and $P_{x,a}$:

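1. compute $\ln(\hat{m}_{x,2010+\tau,i}) = \hat{a}_{x,i} + \hat{b}_{x,i} \times (\hat{k}_{2010,i} + \tau \times \hat{\theta}_i)$, $i = 1, 2$, $\tau = 1, 2, \ldots, 76$, $x = 25, \ldots, 64$;

2. convert $\ln(\hat{m}_{x,2010+\tau,i})$ to $\hat{q}_{x,2010+\tau,i}$; and

3. take the diagonal entries $\hat{q}_{x+\tau-1,2010+\tau,i}$, $i = 1, 2$, $x = 25, \ldots, 64$, $\tau = 1, 2, \ldots, (101-x)$.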
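Step 3 above picks, for each issue age, the rates that one cohort experiences as it ages through the forecast years. A minimal sketch of this diagonal extraction is given below; the function name `cohort_diagonal` and the layout of `q_surface` (rows indexed by age from 25 to 100, columns by lead time $\tau = 1, \ldots, 76$) are assumptions of this example.

```python
def cohort_diagonal(q_surface, x, first_age=25, last_age=100):
    """Return [q_{x,2010+1}, q_{x+1,2010+2}, ..., q_{100,2010+(101-x)}].

    q_surface[age - first_age][tau - 1] is assumed to hold the forecasted
    death probability for that age in year 2010 + tau.
    """
    return [q_surface[x + tau - 1 - first_age][tau - 1]
            for tau in range(1, last_age + 2 - x)]
```

For $x = 64$ the result has $101 - 64 = 37$ entries, running from age 64 in 2011 to age 100 in 2047.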

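To generate stochastic mortality rates for simulating $A_{x,1}$, $\ddot{a}_{x:\overline{65-x}|,i}$ and ${}_{65-x|}\ddot{a}_{x,2}$, a similar procedure is given as follows:

1. generate $s_{\epsilon,\tau,i}$ from $N(0, 1)$ and $s_{\varepsilon,\tau,i}$ from $N(0, 1)$, $i = 1, 2$, $\tau = 1, 2, \ldots, 76$;

2. multiply $s_{\epsilon,\tau,i}$ by $\hat{\sigma}_{\epsilon,i}$, and $s_{\varepsilon,\tau,i}$ by $\hat{\sigma}_{\varepsilon_{x,i}}$, such that $s_{\epsilon,\tau,i} \times \hat{\sigma}_{\epsilon,i} \sim N(0, \hat{\sigma}^2_{\epsilon,i})$ and $s_{\varepsilon,\tau,i} \times \hat{\sigma}_{\varepsilon_{x,i}} \sim N(0, \hat{\sigma}^2_{\varepsilon_{x,i}})$;

3. get simulated $\ln(\tilde{m}_{x,2010+\tau,i}) = \ln(\hat{m}_{x,2010+\tau,i}) + \sqrt{\tau} \times \hat{b}_{x,i} \times s_{\epsilon,\tau,i} \times \hat{\sigma}_{\epsilon,i} + s_{\varepsilon,\tau,i} \times \hat{\sigma}_{\varepsilon_{x,i}}$, $i = 1, 2$, $\tau = 1, 2, \ldots, 76$, $x = 25, \ldots, 64$;

4. convert $\ln(\tilde{m}_{x,2010+\tau,i})$ to $\tilde{q}_{x,2010+\tau,i}$;

5. take the diagonal entries $\tilde{q}_{x+\tau-1,2010+\tau,i}$, $i = 1, 2$, $x = 25, \ldots, 64$, $\tau = 1, 2, \ldots, (101-x)$; and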

The procedures of forecasting the deterministic and stochastic mortality rates for the joint-k, the co-integrated and the augmented common factor models are similar to those above. With $\{\hat{q}_{x+\tau-1,2010+\tau,i} : i = 1, 2,\ x = 25, \ldots, 64,\ \tau = 1, \ldots, (101-x)\}$ and $N$ sets of $\{\tilde{q}_{x+\tau-1,2010+\tau,i} : i = 1, 2,\ x = 25, \ldots, 64,\ \tau = 1, \ldots, (101-x)\}$, we can calculate $N$ realized values of $L_l$ and $L_a$, get the sample variances $V_l$ and $V_a$ and the sample covariance $\sigma^2$, and obtain the optimal weights with (4.3) to (4.8).
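One simulated path of log central death rates for a single population under the independent model can be sketched as below, following the first three listed steps. The function names are hypothetical. The conversion from $m$ to $q$ is not specified in this section, so `m_to_q` uses $q = 1 - e^{-m}$ purely as an assumption; another conversion could be substituted.

```python
import math
import random


def simulate_log_m(log_m_hat, b_hat, sigma_k, sigma_eps, rng=None):
    """Simulate ln(m~) for one population from the deterministic forecast.

    log_m_hat[x][tau - 1]: deterministic ln(m^) by age index and lead time;
    b_hat[x]: age sensitivities; sigma_k: standard deviation of the
    random-walk error; sigma_eps[x]: standard deviation of the model error.
    One standard normal draw per lead time is used for each error term,
    as in the listed steps. (Names and layout are assumptions.)
    """
    rng = rng or random.Random()
    n_age, n_tau = len(log_m_hat), len(log_m_hat[0])
    s_k = [rng.gauss(0.0, 1.0) for _ in range(n_tau)]     # step 1
    s_eps = [rng.gauss(0.0, 1.0) for _ in range(n_tau)]   # step 1
    simulated = []
    for x in range(n_age):
        row = []
        for t in range(n_tau):
            tau = t + 1
            shock_k = math.sqrt(tau) * b_hat[x] * s_k[t] * sigma_k  # step 2
            shock_eps = s_eps[t] * sigma_eps[x]                     # step 2
            row.append(log_m_hat[x][t] + shock_k + shock_eps)       # step 3
        simulated.append(row)
    return simulated


def m_to_q(log_m):
    """Convert ln(m) to q, assuming q = 1 - exp(-m) (an assumption here)."""
    return 1.0 - math.exp(-math.exp(log_m))
```

Repeating the call $N$ times with fresh random draws, converting each path to $q$ and taking the cohort diagonals gives the $N$ sets of simulated rates used for the loss functions.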

4.2.2 Robustness testing

In the preceding subsection, for each of the three optimization problems, a pair of optimal weights for the life and annuity portfolios is produced by a stochastic mortality model that generates $N$ sets of cohort mortality rates from age $x$ to age 100 at time zero for each of $x = 25, \ldots, 64$. If we re-run the procedure and generate another $N$ sets of cohort mortality rates, do we still obtain a pair of optimal weights with close values? In this subsection, we perform robustness testing. Robustness testing originated in computer science, where it checks whether a computer system continues to work well when given invalid inputs. In our case, robustness testing is a way to investigate whether the optimal weights produced by a model are insensitive to the simulated mortality rates.

To complete the robustness testing, we repeat the simulation procedure $M$ times ($M$ is set to 50), yielding $M$ pairs of optimal weights for each model. Figures 4.5 and 4.6 show scatter plots of the optimal weights $(\hat{w}_l, \hat{w}_a)$ generated from each model, from which we can see that the 50 pairs of weights obtained through the 50 simulation runs for each model are quite close to each other. That means the four models are robust to simulations. Within each model, the first two pairs of optimal weights, $(\hat{w}_l^{L+A}, \hat{w}_a^{L+A})$ and $(\hat{w}_l^{L}, \hat{w}_a^{L})$, seem to be more consistent than $(\hat{w}_l^{A}, \hat{w}_a^{A})$.

Table 4.1 summarizes the median values of the 50 optimal weights of each type obtained from the four models. The weight $\hat{w}_l$ is more than twice as big as $\hat{w}_a$ for all types because $w_l / w_a = P_a / P_l \approx 2.3$. All the optimal weights are within $(0, 1)$ except for $\hat{w}_l^{A}$ for the independent model. The pairs of optimal weights $(\hat{w}_l, \hat{w}_a)$ from the joint-k and co-integrated models are quite close to each other. The independent model produces the largest $(\hat{w}_l^{L+A}, \hat{w}_a^{L+A})$ and $(\hat{w}_l^{A}, \hat{w}_a^{A})$ but the lowest $(\hat{w}_l^{L}, \hat{w}_a^{L})$, whereas the augmented common factor model yields the lowest $(\hat{w}_l^{L+A}, \hat{w}_a^{L+A})$ and $(\hat{w}_l^{A}, \hat{w}_a^{A})$ but the highest $(\hat{w}_l^{L}, \hat{w}_a^{L})$ among the four models.

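| Type | IND $\hat{w}_l$ | IND $\hat{w}_a$ | JK $\hat{w}_l$ | JK $\hat{w}_a$ | Co-Int $\hat{w}_l$ | Co-Int $\hat{w}_a$ | ACF $\hat{w}_l$ | ACF $\hat{w}_a$ |
|---|---|---|---|---|---|---|---|---|
| L+A | 0.8505 | 0.3623 | 0.7385 | 0.3146 | 0.7431 | 0.3169 | 0.7110 | 0.3029 |
| L | 0.4797 | 0.2044 | 0.6459 | 0.2752 | 0.6384 | 0.2723 | 0.6868 | 0.2926 |
| A | 1.2213 | 0.5203 | 0.8311 | 0.3541 | 0.8479 | 0.3616 | 0.7352 | 0.3132 |

Table 4.1: the median of 50 optimal weights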

Figure 4.7 displays the $\mathrm{Var}(L_L)$'s and $\mathrm{Var}(L_A)$'s for the four models based on the 50 corresponding $(\hat{w}_l^{L+A}, \hat{w}_a^{L+A})$'s, $(\hat{w}_l^{L}, \hat{w}_a^{L})$'s and $(\hat{w}_l^{A}, \hat{w}_a^{A})$'s, respectively. These figures further confirm the comments on Table 4.1 above, and show the variability of the 50 variances. Under the joint-k and co-integrated models, the variances from the 50 simulation runs are quite close to each other and smaller, whereas for the independent and augmented common factor models, the sample variances are higher and not as stable as those based on the joint-k and co-integrated models.

Figures 4.8, 4.9 and 4.10 exhibit the simulated loss distributions before and after the swap using the median optimal weights for all four models. The loss distributions after the swap for the life insurer and annuity provider are narrower in almost all cases, which implies that the variance of the loss distribution is reduced significantly. Whether we use $(\hat{w}_l^{L+A}, \hat{w}_a^{L+A})$, $(\hat{w}_l^{L}, \hat{w}_a^{L})$ or $(\hat{w}_l^{A}, \hat{w}_a^{A})$, the loss distributions after the swap for the annuity provider under the joint-k and co-integrated models are much narrower than those under the independent and augmented common factor models; for the life insurer, there is not much difference in the loss distributions after the swap among the joint-k, co-integrated and augmented common factor models.

To further quantify and compare the performances of hedging mortality and longevity risks after the swap for the life insurer and annuity provider, respectively, we use a measure called hedge effectiveness (HE; see Li and Hardy (2011)), defined as follows:

\[ HE(L+A) = 1 - \frac{\mathrm{Var}(L_L) + \mathrm{Var}(L_A)}{\mathrm{Var}(L_l) + \mathrm{Var}(L_a)}, \]

\[ HE(L) = 1 - \frac{\mathrm{Var}(L_L)}{\mathrm{Var}(L_l)}, \]

and

\[ HE(A) = 1 - \frac{\mathrm{Var}(L_A)}{\mathrm{Var}(L_a)}. \]

The HE measure is the ratio of the variance reduction (the variance of a loss function before the hedge less the variance of the loss function after the hedge) to the variance before the hedge. Clearly, the larger the HE is, the more effective the hedge is. Table 4.2 shows the comparisons of HEs, which are consistent with the results from Figures 4.8, 4.9 and 4.10. The independent model overall performs the worst among all models. The $HE(L)$, $HE(A)$ and $HE(L+A)$ for the joint-k model are the largest among the four models, which implies that the joint-k model is the most effective in hedging mortality and longevity risks. However, the co-integrated model produces the smallest variances of the losses before and after the swap for both the life insurer and annuity provider.
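The three HE measures are simple ratios, as the short sketch below shows. The function name `hedge_effectiveness` and its argument names are assumptions of this example; the inputs are the variances before and after the swap.

```python
def hedge_effectiveness(var_l_before, var_L_after, var_a_before, var_A_after):
    """Return (HE(L+A), HE(L), HE(A)) from variances before and after swap."""
    he_L = 1.0 - var_L_after / var_l_before
    he_A = 1.0 - var_A_after / var_a_before
    he_sum = 1.0 - (var_L_after + var_A_after) / (var_l_before + var_a_before)
    return he_sum, he_L, he_A


# Joint-k model with the L+A weights, using the variances in Table 4.2.
print(hedge_effectiveness(2.6716, 0.3320, 5.5132, 1.6081))
```

With these inputs the function reproduces the joint-k values reported in Table 4.2 for the $(\hat{w}_l^{L+A}, \hat{w}_a^{L+A})$ row, namely 0.7630, 0.8757 and 0.7083 after rounding to four decimals. A negative HE indicates that the swap increases the variance.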

| Model | Weights | $\mathrm{Var}(L_l)$ | $\mathrm{Var}(L_L)$ | $\mathrm{Var}(L_a)$ | $\mathrm{Var}(L_A)$ | $HE(L+A)$ | $HE(L)$ | $HE(A)$ |
|---|---|---|---|---|---|---|---|---|
| Independent | $(\hat{w}_l^{L+A}, \hat{w}_a^{L+A})$ | 1.8904 | 1.5384 | 11.2661 | 6.1193 | 0.4180 | 0.1862 | 0.4568 |
| Independent | $(\hat{w}_l^{L}, \hat{w}_a^{L})$ | 1.8904 | 1.0158 | 11.2661 | 7.6872 | 0.3385 | 0.4627 | 0.3177 |
| Independent | $(\hat{w}_l^{A}, \hat{w}_a^{A})$ | 1.8904 | 3.1063 | 11.2661 | 5.5967 | 0.3385 | -0.6433 | 0.5032 |
| Joint-K | $(\hat{w}_l^{L+A}, \hat{w}_a^{L+A})$ | 2.6716 | 0.3320 | 5.5132 | 1.6081 | 0.7630 | 0.8757 | 0.7083 |
| Joint-K | $(\hat{w}_l^{L}, \hat{w}_a^{L})$ | 2.6716 | 0.2829 | 5.5132 | 1.7553 | 0.7510 | 0.8941 | 0.6816 |
| Joint-K | $(\hat{w}_l^{A}, \hat{w}_a^{A})$ | 2.6716 | 0.4792 | 5.5132 | 1.5590 | 0.7510 | 0.8206 | 0.7172 |
| Co-Integrated | $(\hat{w}_l^{L+A}, \hat{w}_a^{L+A})$ | 1.8272 | 0.3185 | 4.2565 | 1.5635 | 0.6906 | 0.8257 | 0.6327 |
| Co-Integrated | $(\hat{w}_l^{L}, \hat{w}_a^{L})$ | 1.8272 | 0.2768 | 4.2565 | 1.6887 | 0.6769 | 0.8485 | 0.6033 |
| Co-Integrated | $(\hat{w}_l^{A}, \hat{w}_a^{A})$ | 1.8272 | 0.4437 | 4.2565 | 1.5218 | 0.6769 | 0.7572 | 0.6425 |
| ACF | $(\hat{w}_l^{L+A}, \hat{w}_a^{L+A})$ | 6.3736 | 1.8738 | 15.4550 | 10.2970 | 0.4424 | 0.7060 | 0.3337 |
| ACF | $(\hat{w}_l^{L}, \hat{w}_a^{L})$ | 6.3736 | 1.8682 | 15.4550 | 10.3138 | 0.4419 | 0.7069 | 0.3320 |
| ACF | $(\hat{w}_l^{A}, \hat{w}_a^{A})$ | 6.3736 | 1.8906 | 15.4550 | 10.2914 | 0.4419 | 0.7034 | 0.3341 |

Table 4.2: comparisons of sample variances ($\times 10^{23}$) and HE's

[Figure 4.5 panels: $(\hat{w}_l^{L+A}, \hat{w}_a^{L+A})$, $(\hat{w}_l^{L}, \hat{w}_a^{L})$ and $(\hat{w}_l^{A}, \hat{w}_a^{A})$]

Figure 4.6: optimal weights for the co-integrated and augmented common factor models

[Figure 4.7 panels: $\mathrm{Var}(L_L)$ and $\mathrm{Var}(L_A)$, each showing the variances using $(\hat{w}_l^{L+A}, \hat{w}_a^{L+A})$, $(\hat{w}_l^{L}, \hat{w}_a^{L})$ and $(\hat{w}_l^{A}, \hat{w}_a^{A})$]

Figure 4.9: simulated loss distributions using $(\hat{w}_l^{L}, \hat{w}_a^{L})$

Figure 4.10: simulated loss distributions using $(\hat{w}_l^{A}, \hat{w}_a^{A})$

Chapter 5 Block Bootstrap Method

In the preceding chapter, we forecast deterministic and stochastic mortality rates with several two-population mortality models to determine the premiums of life and annuity products and simulate the loss functions of portfolios of life and annuity business. The weights of business for the swap are calculated from the sample variances and the sample covariance of the loss functions of the life insurer and annuity provider before the swap; these weights minimize the risk of the portfolio and produce high hedge effectiveness. All these results rest on the assumption that the assumed mortality model is the actual one, which, however, might not be true. Therefore, both the life insurer and annuity provider face model risk and parameter risk that will potentially affect the results. In this chapter, a bootstrap method, which is model and parameter free, is applied to generate samples of the future mortality rates in order to calculate the weighted loss functions and their variances with the weights obtained by each of the four models in Chapter 3.

The bootstrap is usually used to resample data with replacement in order to estimate some statistic of a population from the sampled data. The procedure of the bootstrap (naive bootstrap) is as follows:

1. Draw values from the original data set with replacement to form a new data set of size $n^*$.

2. Repeat the first step $N^*$ times to obtain $N^*$ new data sets.

3. Compute the test statistic using the new data sets.
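The three steps of the naive bootstrap can be sketched as follows. The function name `naive_bootstrap` and its parameters are assumptions for illustration; `statistic` can be any function of a list of values, such as the mean.

```python
import random
import statistics


def naive_bootstrap(data, statistic, n_star, N_star, rng=None):
    """Evaluate `statistic` on N_star resamples of size n_star."""
    rng = rng or random.Random()
    results = []
    for _ in range(N_star):                                    # step 2
        resample = [rng.choice(data) for _ in range(n_star)]   # step 1
        results.append(statistic(resample))                    # step 3
    return results


# Example: bootstrap distribution of the mean of a small data set.
means = naive_bootstrap([1.0, 2.0, 3.0, 4.0, 5.0], statistics.mean,
                        n_star=5, N_star=200, rng=random.Random(3))
```

Each draw is made independently of the position of the value in the data set, so any ordering or dependence structure in the original data is ignored.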


However, the naive bootstrap fails when applied to mortality rates. First, mortality rates over years can be treated as a time series displaying a decreasing trend because of improvements in medical and environmental conditions, which shows that mortality rates are not stationary. Second, the naive bootstrap is likely to destroy the dependency of mortality rates in both the age and time dimensions. The block bootstrap can address both problems. Regarding the non-stationarity problem, differencing is a popular and effective method of removing the trend from a time series and making the time series weakly stationary. The procedure is given below.

• Convert the empirical mortality rates $q_{x,t,i}$ to $\ln(m_{x,t,i})$, $i = 1, 2$, $x = 25, \ldots, 100$, $t = 1981, \ldots, 2010$.

• Divide $\ln(m_{x,t+1,i})$ by $\ln(m_{x,t,i})$ to get the ratio, denoted by $r_{x,t,i}$, that is, $r_{x,t,i} = \dfrac{\ln(m_{x,t+1,i})}{\ln(m_{x,t,i})}$, $t = 1981, \ldots, 2009$.

• Subtract $r_{x,t,i}$ from $r_{x,t+1,i}$ to get the difference, denoted by $d_{x,t,i}$, that is, $d_{x,t,i} = r_{x,t+1,i} - r_{x,t,i}$, $t = 1981, \ldots, 2008$.
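For a single age and population, the ratio and difference series described above can be computed as in the sketch below. The function name `ratio_and_difference` is an assumption of this example, and the input is taken to be the list of $\ln(m_{x,t,i})$ values in consecutive years.

```python
def ratio_and_difference(log_m_series):
    """Return (r, d) for one age and population.

    r[t] = ln(m_{t+1}) / ln(m_t), one entry fewer than the input;
    d[t] = r[t+1] - r[t], two entries fewer than the input.
    """
    r = [log_m_series[t + 1] / log_m_series[t]
         for t in range(len(log_m_series) - 1)]
    d = [r[t + 1] - r[t] for t in range(len(r) - 1)]
    return r, d
```

A series of 30 annual values therefore yields 29 ratios and 28 differences, which matches the index ranges of $r_{x,t,i}$ and $d_{x,t,i}$ above.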

Figure 5.1 shows the time series $\{d_{x,t,i}\}$, $i = 1, 2$, for the U.S. males and females

To further check that the time series $\{d_{x,t,i}\}$ is stationary, the Phillips-Perron (PP) and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) hypothesis tests are conducted. For the PP test, where an AR(1) model is assumed, the null hypothesis is that the time series has a unit root and is therefore non-stationary, so a small p-value suggests a stationary time series. For the KPSS test, the null hypothesis is that the time series is stationary; thus, a large p-value indicates a stationary time series. Table 5.1 exhibits the results of the hypothesis tests. All the p-values for the PP test are below 0.05, and those for the KPSS test are larger than 0.1, all of which suggest that the time series $\{d_{x,t,i}\}$ is stationary.

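| Population | Test | $x = 35$ | $x = 45$ | $x = 55$ | $x = 65$ |
|---|---|---|---|---|---|
| Male | PP | 0.01 | 0.01 | 0.01 | 0.01 |
| Male | KPSS | >0.1 | >0.1 | >0.1 | >0.1 |
| Female | PP | 0.01 | 0.01 | 0.01 | 0.01 |
| Female | KPSS | >0.1 | >0.1 | >0.1 | >0.1 |

Table 5.1: p-values for testing stationarity of $\{d_{x,t,i}\}$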

For the second problem regarding dependency, according to Li and Ng (2011), the two-dimensional mortality rate matrix $M_i$ for population $i$ is converted to a series of column vectors,

\[ m_{t,i} = (m_{x_0,t,i}, m_{x_0+1,t,i}, \ldots, m_{x_0+m-1,t,i}), \quad t = t_0, t_0+1, \ldots, t_0+n-1. \]

Thus, the matrix can be expressed as $M_i = \{m_{t_0,i}, m_{t_0+1,i}, \ldots, m_{t_0+n-1,i}\}$, which contains $n$ elements. In this way, the age dependency can be retained. Before proceeding to the next step, as discussed for the first problem above, the vector $m_{t,i}$ needs to be converted to $\ln(m_{t,i})$, from which we get the ratio vector $r_{t,i}$ and then form the difference vector

\[ d_{t,i} = (d_{x_0,t_0+t-1,i}, d_{x_0+1,t_0+t-1,i}, \ldots, d_{x_0+m-1,t_0+t-1,i}), \quad t = 1, 2, \ldots, n-2, \quad i = 1, 2. \]

For the time dependency problem, it is assumed that $k$ consecutive vectors of the time series, $d_{t,i}, \ldots, d_{t+k-1,i}$, are dependent, while the dependency between $d_{t,i}$ and $d_{t+k-1,i}$ gradually becomes weaker as $k$ increases and vanishes for a large $k$. Thus, the block bootstrapping suggests splitting $d$ into $n$ overlapped blocks
