ESTIMATION OF CLAIM NUMBERS IN AUTOMOBILE INSURANCE


Miklós Arató (Budapest, Hungary) László Martinek (Budapest, Hungary)

Dedicated to András Benczúr on the occasion of his 70th birthday

Communicated by László Lakatos

(Received June 1, 2014; accepted July 1, 2014)

Abstract. The use of bonus-malus systems in compulsory liability automobile insurance is a method applied worldwide for premium pricing. Under certain assumptions, an interesting and crucial task is to estimate the so-called claims frequency of individuals. Here we introduce 3 techniques: two are based on the bonus-malus class, and the third on the claims history. The article is devoted to choosing the method that best fits the frequency parameters for specific input parameters. To measure the goodness-of-fit we use scores, similar to the better known divergence measures. The detailed method is also suitable for comparing bonus-malus systems with respect to how much information they contain about drivers.

1. Introduction

The joint application of probabilistic, statistical and Bayesian methods to specific practical issues has played an important role in several works of András Benczúr (see [1] for example). These methods are applied in the present article to an actuarial problem.

Key words and phrases: bonus-malus; claims frequency; Bayesian; scores; Markov chains.

2010 Mathematics Subject Classification: 97M30

The European Union and the European Social Fund have provided financial support to the


1.1. Motivation

Bonus-malus systems are used worldwide, especially in Europe and Asia, in compulsory liability automobile insurance. They are constructed principally to support the premium calculation of drivers. The rules vary between countries, but the idea is the same: policyholders with a bad claim history should pay more than those without accidents in the past years. Mathematically, there is an underlying graph whose vertices are called classes, among them an initial vertex where every new driver begins. In practice the graph contains a finite number of classes; infinite systems are, however, also discussed in the literature, see book [10]. After a year without causing any accident the policyholder moves up to another class with a cheaper premium. Otherwise, after causing an accident, the insured person moves down to a class with a higher premium, unless he or she was already in the worst one.

Many other factors are taken into consideration when calculating a premium, such as the engine type, purpose of use, place of residence, age of the person, etc., which result in an a priori premium. The a posteriori premium then arises as the product of the a priori premium and the bonus-malus factor. A wide range of literature discusses these factors extensively, see [10], for example. Here we concentrate only on the bonus-malus system. Note that premium calculation often involves credibility techniques, which mix the insurer's own experience with available national data in an appropriate proportion. The discussion of credibility is also omitted from the present article; see [7] for details, for instance. We also recommend book [3].

Our aim is to estimate λ, the expected number of accidents caused by an insured driver. This is usually called the claims frequency of the policyholder. First we review the necessary assumptions, among others about the distribution of the claim number of a policyholder in one year and the Markov property of the random walk in such a system. Section 2 will discuss the basic problem, illustrated with Belgian, Brazilian and Hungarian examples. Since a Bayesian approach will also be used for the estimation of λ, a general a priori assumption is needed as well. Based on this and on incomplete information about the driver, the a posteriori expected value of λ will be calculated.

Finding the best possible estimate of λ is a crucial task in the insurer's operation, since the expected number of claims and, most importantly, the claims payments are forecast using λ. Note that to evaluate the size of the insurer's payments, the size of the property damages has to be approximated as well, not only their number. This part of the actuarial calculation is not taken into account in the present article; for further discussion see [6, 12], for instance.


1.2. Bonus-malus systems

We assume the following constraints, which simplify real-world scenarios.

Assumptions 1.1.

1. λ is constant in time for each policyholder, and is generally assumed to be the realization of a random variable Λ, independent for each person. Note that a time-dependent λ would require doubly stochastic processes (Cox processes, for instance), i.e., λ(t) would itself be a stochastic process.

2. The random walk on the graph of classes is a homogeneous Markov chain, i.e., the next step depends only on the last state, and is independent of time.

3. The claim number of a policyholder for a year is Poisson(λ) distributed, conditionally on {Λ = λ}.

Notation 1.2. For a bonus-malus system consisting of n classes, let $C_1$ denote the worst one with the highest premium, $C_2$ the second worst, etc., and finally $C_n$ the best premium class that a driver can achieve. Moreover, let $Y_t$ be the class of the policyholder after t steps (years).

Using this notation, the Markov property can be written as $P(Y_t = C_i \mid Y_{t-1}, \dots, Y_1) = P(Y_t = C_i \mid Y_{t-1})$. As the process $Y_t$ is assumed to be homogeneous, we may simply write $p_{ij}$ instead of $P(Y_{t+1} = j \mid Y_t = i)$. These values form an n×n stochastic matrix with non-negative elements, namely the transition probability matrix of the random walk on the states. Let us denote it by M(λ). To be more specific, we outline the examples of three different systems: the Belgian, the Brazilian and the Hungarian.

Example 1.3 (Hungarian system). In the Hungarian bonus-malus system there are 15 premium classes, namely an initial class (A0), 4 malus classes (M4, . . . , M1) and 10 bonus classes (B1, . . . , B10). In the above notation, $C_1 = M4, \dots, C_4 = M1$, $C_5 = A0$, $C_6 = B1, \dots, C_{15} = B10$. After every claim-free year the policyholder moves one class up, unless he or she was in B10, where there is no better class to go to. Every reported claim results in a relapse of 2 classes, and 4 or more claims in a year pull the driver back to the worst class M4. Thus the transition probability matrix takes the form of Equation 7.1, see Appendix.
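The transition rules of Example 1.3 can be turned into a matrix mechanically. The following Python sketch is only an illustration under the rules stated above (one class up after a claim-free year, two classes down per claim, four or more claims leading to the worst class); the function names `poisson_pmf` and `hungarian_transition_matrix` are hypothetical and do not come from the article.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k claims in a year for claims frequency lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def hungarian_transition_matrix(lam, n_classes=15):
    """Sketch of M(lam): index 0 is C1 = M4, index n_classes - 1 is C15 = B10."""
    matrix = [[0.0] * n_classes for _ in range(n_classes)]
    for i in range(n_classes):
        # Claim-free year: one class up, unless already in the best class.
        matrix[i][min(i + 1, n_classes - 1)] += poisson_pmf(0, lam)
        accounted = poisson_pmf(0, lam)
        # One, two or three claims: two classes down per claim, floored at C1.
        for k in (1, 2, 3):
            matrix[i][max(i - 2 * k, 0)] += poisson_pmf(k, lam)
            accounted += poisson_pmf(k, lam)
        # Four or more claims: back to the worst class.
        matrix[i][0] += max(0.0, 1.0 - accounted)
    return matrix
```

Every row of the resulting matrix sums to one, as required of a stochastic matrix, which is a convenient sanity check when adapting the sketch to the rules of another country.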

Example 1.4 (Brazilian system). There are 7 premium classes: A0, B1, B2, . . . , B6, sometimes written as classes 7, 6, 5, . . . , 1, e.g. in [11]. The transition rules can be found in the cited article.


Example 1.5 (Belgian system). The new Belgian system was introduced in 1992. We address the transition rules for business users, which can be found in article [11], among others. There are 23 premium classes: M8, M7, . . . , M1, A0, B1, B2, . . . , B14 (sometimes written as classes 23, 22, 21, . . . , 2, 1).

2. The Bayesian approach

Our practical problem is the following. When an insured person changes insurer, the new company does not necessarily receive his or her claim history, only the class in which the policy has to be continued. The new insurer also knows the number of years the policyholder has spent in the liability insurance system. Based on these two pieces of information we would like to provide the best possible estimate of the specific person's λ.

Recall that $Y_t = c$ denotes the event that the investigated policyholder has spent t years in the system (more precisely, has taken t steps from the initial class) and arrived in class c.

Notation 2.1. $\pi_0$ is the initial discrete distribution on the graph, which is a row vector of the form (0, . . . , 0, 1, 0, . . . , 0). The ith element is 1, which means that each driver begins in the initial state $C_i$.

As a Bayesian approach, suppose that the claims frequency is also a random variable, and denote it by Λ. In practice, the gamma distribution is a commonly used choice of mixing distribution, therefore only this case is discussed here. For other choices the calculations are similar, though not as tractable.

Notation 2.2. Γ(α, β) is the gamma distribution with shape parameter α and rate (inverse scale) parameter β, i.e., with density function $f(x) = \frac{x^{\alpha-1} \beta^{\alpha} e^{-\beta x}}{\Gamma(\alpha)}$.

Conditionally on {Λ = λ}, the number of property damages related to a driver is a random variable X ∼ Poisson(λ), i.e., the conditional probability is

\[ P(X = k \mid \Lambda = \lambda) = \frac{\lambda^k}{k!} e^{-\lambda}. \]

It is well known (see [3], page 28, for details) that the unconditional distribution is negative binomial with parameters $\left(\alpha, \frac{\beta}{1+\beta}\right)$ in our notation, i.e., if Λ ∼ Γ(α, β). Accordingly, the a priori parameters can be estimated by a standard method of moments or by the maximum likelihood method.
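As a small numerical illustration of the negative binomial statement, the following Python sketch evaluates the unconditional claim number distribution for given α and β. The function name `negative_binomial_pmf` is hypothetical, and the success probability β/(1+β) follows the parametrization quoted above.

```python
import math


def negative_binomial_pmf(k, alpha, beta):
    """P(X = k) when X | Lambda ~ Poisson(Lambda) and Lambda ~ Gamma(alpha, rate beta)."""
    p = beta / (1.0 + beta)
    log_pmf = (math.lgamma(alpha + k) - math.lgamma(alpha) - math.lgamma(k + 1)
               + alpha * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)


# Example: portfolio with alpha = 1.2, beta = 19 (values used in Figure 1).
probabilities = [negative_binomial_pmf(k, 1.2, 19.0) for k in range(200)]
mean_claims = sum(k * q for k, q in enumerate(probabilities))
```

With these parameters the probabilities sum to one up to rounding and `mean_claims` agrees with α/β, the prior mean of the claims frequency.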

Remark 2.1. The fit should subsequently be verified by hypothesis testing, because our assumptions regarding the mixing distribution might be inaccurate.


Remark 2.2. Besides the gamma mixing distribution, other claims frequency distributions might be worth considering in practice. For instance, if Λ is Inverse Gaussian, the unconditional distribution of X is Poisson Inverse Gaussian, see [5, 13]. Furthermore, the case Λ ∼ Log-normal is also realistic; for a description see [5]. For a more general parametric family see [14], which contains the Poisson, Negative Binomial and Poisson Inverse Gaussian distributions as special cases, although it requires 3 parameters. Besides parametric assumptions, non-parametric estimation of the mixing distribution is also worth considering, see [2].

2.1. Estimation of distribution parameters

In this subsection we describe a method for estimating the parameters α and β. Assume that the insurance company has claim statistics from the past few years for m policyholders. The ith insured person caused $X_i$ accidents through his or her own fault over a time period of $t_i$ years, where $t_i$ is not necessarily an integer. According to our assumption, the distribution of $X_i$ is Poisson($t_i \cdot \Theta$), where $t_i$ is a personal time factor and Θ is a Gamma(α, β)-distributed random variable. As mentioned above, the unconditional distribution of $X_i$ is negative binomial with parameters $\left(\alpha, \frac{\beta}{t_i+\beta}\right)$, thus its first two moments are

\[ E X_i = t_i \frac{\alpha}{\beta}, \tag{2.1} \]

\[ E X_i^2 = t_i^2 \frac{\alpha}{\beta^2} + t_i^2 \frac{\alpha^2}{\beta^2} + t_i \frac{\alpha}{\beta}. \tag{2.2} \]

Now we construct a method of moments estimator. Since this methodology admits several systems of equations, we have to choose one of them, as follows. On the one hand, we can say that

\[ \sum_{i=1}^m E X_i = \frac{\alpha}{\beta} \sum_{i=1}^m t_i \quad \text{and} \quad \sum_{i=1}^m E X_i^2 = \left( \frac{\alpha}{\beta^2} + \frac{\alpha^2}{\beta^2} \right) \sum_{i=1}^m t_i^2 + \frac{\alpha}{\beta} \sum_{i=1}^m t_i, \]

thus the estimators of the parameters are the solution of the following equations:

\[ \frac{\hat\alpha}{\hat\beta} = \frac{\sum_{i=1}^m X_i}{\sum_{i=1}^m t_i}, \tag{2.3} \]

\[ \frac{1}{\hat\beta} = \frac{\sum_{i=1}^m t_i}{\sum_{i=1}^m t_i^2} \left( \frac{\sum_{i=1}^m X_i^2}{\sum_{i=1}^m X_i} - 1 \right) - \frac{\sum_{i=1}^m X_i}{\sum_{i=1}^m t_i}. \tag{2.4} \]

On the other hand, $\sum_{i=1}^m E \frac{X_i}{t_i} = m \frac{\alpha}{\beta}$, thus the first equation for the estimators would be $\frac{\hat\alpha}{\hat\beta} = \frac{1}{m} \sum_{i=1}^m \frac{X_i}{t_i}$. Generally the two systems do not provide the same results, except if $t_1 = t_2 = \dots = t_m$. Numerous claims history scenarios were simulated for several portfolios, and according to our experience the solutions of the system of equations 2.3 and 2.4 provided the best estimates of α and β. Finally, we have to mention the maximum likelihood method as another obvious solution for this parameter estimation. Unfortunately, the likelihood function generally has no maximum, hence this method has been rejected.
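Equations 2.3 and 2.4 translate directly into a few lines of code. The sketch below is a minimal illustration, not the authors' implementation; the name `moment_estimates` and the error handling for samples without overdispersion are assumptions added here.

```python
def moment_estimates(claims, exposures):
    """Method of moments estimates (alpha_hat, beta_hat) from equations 2.3 and 2.4.

    claims[i] is the claim count X_i and exposures[i] the observed time t_i in years.
    """
    sum_x = float(sum(claims))
    sum_x2 = float(sum(x * x for x in claims))
    sum_t = float(sum(exposures))
    sum_t2 = float(sum(t * t for t in exposures))

    ratio = sum_x / sum_t                                          # equation 2.3
    inv_beta = (sum_t / sum_t2) * (sum_x2 / sum_x - 1.0) - ratio   # equation 2.4
    if inv_beta <= 0.0:
        # Assumption of this sketch: reject samples that show no overdispersion.
        raise ValueError("moment estimator undefined: 1 / beta_hat is not positive")

    beta_hat = 1.0 / inv_beta
    alpha_hat = ratio * beta_hat
    return alpha_hat, beta_hat


# Toy data: four policyholders observed for one year each, claims 0, 0, 0, 2.
alpha_hat, beta_hat = moment_estimates([0, 0, 0, 2], [1.0, 1.0, 1.0, 1.0])
```

For this toy sample the equations give $\hat\alpha / \hat\beta = 0.5$ and $1/\hat\beta = 0.5$, i.e., $\hat\alpha = 1$ and $\hat\beta = 2$.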

2.2. Conditional probabilities

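On the one hand, assume that the a priori parameters α, β have been estimated. On the other hand, consider the information $\{Y_t = c\}$ besides the initial class $Y_0$, which is fixed for each insured person. According to Bayes' theorem, the conditional density of Λ is

\[ f_{\Lambda \mid Y_t = c}(\lambda \mid c) = \frac{P(Y_t = c \mid \Lambda = \lambda) \cdot f_\Lambda(\lambda)}{\int_0^\infty P(Y_t = c \mid \Lambda = \lambda) \cdot f_\Lambda(\lambda) \, d\lambda}, \]

and the estimate of λ is the a posteriori expected value, denoted by

\[ \hat\lambda = \int_0^\infty \lambda \cdot f_{\Lambda \mid Y_t = c}(\lambda \mid c) \, d\lambda. \]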

Unfortunately, $P(Y_t = c \mid \Lambda = \lambda)$ introduces complications, because it can only be evaluated pointwise as a function of λ, by calculating the tth power of the transition matrix M(λ). If I denotes the index of the initial class in the graph, this probability is exactly the $(I, |c|)$ element of the matrix $M(\lambda)^t$, where |c| is the index of class c. Using numerical integration, the estimate can be computed relatively fast. Figure 1 shows the claim frequency estimates for different countries and different parameters α and β. For example, the first panel is based on the Brazilian bonus-malus system with parameters α = 1.2 and β = 19, which indicates a relatively high risk portfolio. There is a line in the chart for each bonus class, showing the estimated λ frequencies as a function of time. Note that these functions converge slowly to the stationary state, which might take even several decades. Thus the information about the time the individual has spent in the system is important.

We will refer to this method as method.1 later.
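The computation behind method.1 can be sketched as follows for the Hungarian rules of Example 1.3. This is an illustration under stated assumptions, not the authors' code: the midpoint rule, the integration bound `upper`, the grid size `steps` and all function names are choices made here, and instead of forming the matrix power explicitly the sketch propagates the class distribution year by year, which yields the same $(I, |c|)$ element.

```python
import math

N_CLASSES = 15   # Hungarian system: index 0 is C1 = M4, index 14 is C15 = B10
START = 4        # index of the initial class A0 = C5


def step(dist, lam):
    """Advance the class distribution by one year, i.e. multiply by M(lam)."""
    probs = [lam ** k * math.exp(-lam) / math.factorial(k) for k in range(4)]
    four_or_more = max(0.0, 1.0 - sum(probs))
    new = [0.0] * N_CLASSES
    for i, mass in enumerate(dist):
        if mass == 0.0:
            continue
        new[min(i + 1, N_CLASSES - 1)] += mass * probs[0]   # claim-free year
        for k in (1, 2, 3):
            new[max(i - 2 * k, 0)] += mass * probs[k]       # two classes down per claim
        new[0] += mass * four_or_more                       # back to the worst class
    return new


def class_probability(lam, years, cls):
    """P(Y_t = cls | Lambda = lam) for t = years, starting from A0."""
    dist = [0.0] * N_CLASSES
    dist[START] = 1.0
    for _ in range(years):
        dist = step(dist, lam)
    return dist[cls]


def gamma_density(lam, alpha, beta):
    """Density of Gamma(alpha, rate beta) at lam > 0."""
    return math.exp(alpha * math.log(beta) + (alpha - 1.0) * math.log(lam)
                    - beta * lam - math.lgamma(alpha))


def posterior_mean(years, cls, alpha, beta, upper=2.0, steps=1000):
    """A posteriori expected frequency given class and elapsed years (midpoint rule)."""
    h = upper / steps
    numerator = denominator = 0.0
    for s in range(steps):
        lam = (s + 0.5) * h
        weight = class_probability(lam, years, cls) * gamma_density(lam, alpha, beta)
        numerator += lam * weight
        denominator += weight
    if denominator == 0.0:
        raise ValueError("class cannot be reached in the given number of years")
    return numerator / denominator
```

A useful check: B10 can be reached from A0 in 10 years only through 10 claim-free years, so in that case the posterior is Gamma(α, β + 10) and `posterior_mean(10, 14, alpha, beta)` should be close to α/(β + 10).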

3. Other methods for frequency estimation

In this section we briefly introduce two other (known) methods for estimating policyholders' claim frequencies. Our assumptions about the distribution of claim numbers still hold.

3.1. Average claim numbers of classes

The following method is probably the simplest, and we will refer to it as method.2. Suppose that statistics on last year's claim numbers are available. The estimator of a policyholder's claims frequency in class C is defined as the average number of claims of the population in bonus class C in the previous year, i.e.,

\[ \hat\lambda_C = \frac{\sum_{j=1}^m k_j \cdot \chi_{\{j\text{th policyholder is in class } C\}}}{\sum_{j=1}^m \chi_{\{j\text{th policyholder is in class } C\}}}, \]

where m is the total number of policies in the previous year, $k_j$ is the claim number of the jth person, and χ is the indicator function.

For instance, if last year our portfolio contained 5 policyholders in class C with claims 0, 1, 0, 0, 2, then the estimate of the frequency of insured people in class C this year is $\hat\lambda_C = 0.6$. Although this might sound like an oversimplification, in some cases it gives the best results.
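The class averages of method.2 amount to a simple group-by computation. The sketch below is illustrative only; the record layout (pairs of class label and last year's claim count) and the function name `class_average_frequencies` are assumptions.

```python
from collections import defaultdict


def class_average_frequencies(records):
    """Average last-year claim number per bonus-malus class.

    records is an iterable of (bonus_class, claims_last_year) pairs.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for bonus_class, claims in records:
        totals[bonus_class] += claims
        counts[bonus_class] += 1
    return {cls: totals[cls] / counts[cls] for cls in counts}


# The example from the text: five policyholders in class C with claims 0, 1, 0, 0, 2.
estimates = class_average_frequencies([("C", 0), ("C", 1), ("C", 0), ("C", 0), ("C", 2)])
```

Here `estimates["C"]` reproduces the value 0.6 from the example above.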

3.2. Claim history of individuals

[Figure 1: six panels plotting the estimated claims frequency against time (years, 0 to 30), with one curve per bonus-malus class: (a) Brazilian, α = 1.2, β = 19; (b) Brazilian, α = 1.8, β = 12; (c) Hungarian, α = 1.2, β = 19; (d) Hungarian, α = 1.8, β = 12; (e) Belgian, α = 1.2, β = 19; (f) Belgian, α = 1.8, β = 12.]

Figure 1: Estimated λ parameters for different countries and α, β parameters on a time horizon of 30 years. (The chart for each class starts at the first point where the probability of being in that class is positive.)

The third method is generally prevalent in actuarial practice, and will be referred to as method.3. Here we use the insured person's claim history, i.e., suppose that he or she was insured by our company for t years, and the distribution of the claim number for each year is Poisson(λ). Let X denote the aggregate number of claims caused by this person during his or her t-year-long presence in the system, more precisely, in our field of vision. Then the conditional distribution of X is also Poisson, with parameter λt. Note that t does not have to be an integer (in most countries people often change insurer in the middle of the year). Based on this, our estimate of a particular λ is the conditional expected value of the gamma distributed Λ given X = x, i.e., as is well known,

\[ \hat\lambda = E(\Lambda \mid X = x) = \frac{x + \alpha}{t + \beta}. \]

This is also a Bayesian approach, as in the case of method.1, but the condition is different. Note also that the parameters α and β first have to be estimated, exactly as described in Section 2.1.
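In code, the method.3 estimate is the posterior mean of the gamma distribution after observing x claims in t years. The following one-function sketch uses the hypothetical name `claim_history_estimate`; the input validation is an addition made here.

```python
def claim_history_estimate(claims, years, alpha, beta):
    """Posterior mean (x + alpha) / (t + beta) of the claims frequency."""
    if claims < 0 or years < 0:
        raise ValueError("claims and years must be non-negative")
    return (claims + alpha) / (years + beta)


# A driver with 2 claims in 5.5 years, with prior parameters alpha = 1.2, beta = 19.
estimate = claim_history_estimate(2, 5.5, 1.2, 19.0)
```

With no history at all (x = 0, t = 0) the estimate falls back to the prior mean α/β, and it moves towards the observed average x/t as t grows.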

4. Comparison using scores

The aim of this section is to decide which method gives the most accurate estimate of claims frequencies. To this end we use the theory of scores, a tool of probabilistic forecasting. For much more detailed information see article [9]; here we discuss only the properties most important for our problem.

Scores are designed to measure the accuracy of probabilistic forecasts, i.e., the goodness-of-fit of our estimates. Let Ω be a sample space, $\mathcal{A}$ a σ-algebra of subsets of Ω and $\mathcal{P}$ a family of probability measures on $(\Omega, \mathcal{A})$. Let a scoring rule be a function $S : \mathcal{P} \times \Omega \to \bar{\mathbb{R}} = [-\infty, \infty]$. We work with the expected score $S(P, Q) = \int S(P, \omega) \, dQ(\omega)$, where the measure P is our estimate and Q is the real one. One of the most important properties of this function is the inequality $S(Q, Q) \ge S(P, Q)$ for all $P, Q \in \mathcal{P}$. In this case S is proper relative to $\mathcal{P}$. Additionally, we call S regular relative to the class $\mathcal{P}$ if it is real valued, except possibly for being −∞ in the case P ≠ Q. If these two properties hold, then the associated divergence function is $d(P, Q) = S(Q, Q) - S(P, Q)$.

Here we mention two main scoring rules, which will be used in the sections below for comparing our estimation methods. Recall that for an individual the conditional distribution of the number of accidents is Poisson, so in our notation let $p_i$ (i = 0, 1, 2, . . .) be $P(X = i \mid \Lambda = \hat\lambda)$, i.e., the probabilities of the possible claim numbers using the estimated $\hat\lambda$ as condition. Similarly, $q_i$ (i = 0, 1, 2, . . .) is the same probability, but for the real frequency λ.

Two essential score definitions are introduced below, although other concepts can be found in the literature, such as the spherical score. The application of the chosen scores is justified by the discrete probabilistic behavior of claim numbers. Note that for random variables with uncountable range one has to apply other measures. This is unavoidable when addressing claim severities, see [6, 12], for instance. For more details on density or distribution function forecasts in general see [9, 8].

4.1. Brier Score

Since the associated Bregman divergence is $d(p, q) = \sum_i (p_i - q_i)^2$, the Brier score is sometimes referred to as the quadratic score. For an analysis of precipitation forecasts using Brier scores see [4]. The score is defined as

\[ S(P, Q) = 2 \Big( \sum_i p_i q_i \Big) - \Big( \sum_i p_i^2 \Big) - 1. \tag{4.1} \]

Here we deliberately use the unknown probabilities $q_i$. Although they are not known in practice, in our simulations presented later in Section 5 we are able to make decisions using them. To calculate the score of an estimate after the fact, replace the $q_i$ values by a Dirac delta concentrated on the number of claims caused. In other words, if the examined policyholder had i claims last year, then the corresponding score is

\[ S(P, i) = 2 \sum_j p_j \delta_{ij} - \sum_j p_j^2 - 1 = 2 p_i - \sum_j p_j^2 - 1. \]

4.2. Logarithmic Score

The logarithmic score is defined as

\[ S(P, Q) = \sum_i q_i \log p_i. \tag{4.2} \]

(In the case of i caused accidents, $S(P, i) = \log p_i$.) We mention that the associated Bregman divergence is the Kullback-Leibler divergence $d(p, q) = \sum_i q_i \log \frac{q_i}{p_i}$.
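Both scores are easy to evaluate for Poisson forecasts. The sketch below is an illustration only: the function names are hypothetical, and the infinite sums over claim numbers are truncated at `kmax`, an assumption that is harmless for the small frequencies considered in this article.

```python
import math


def poisson_probabilities(lam, kmax=60):
    """Poisson(lam) probabilities of 0, 1, ..., kmax claims (truncated support)."""
    return [math.exp(k * math.log(lam) - lam - math.lgamma(k + 1)) for k in range(kmax + 1)]


def brier_score(lam_hat, claims, kmax=60):
    """S(P, i) = 2 p_i - sum_j p_j^2 - 1 for an observed claim number i = claims."""
    p = poisson_probabilities(lam_hat, kmax)
    return 2.0 * p[claims] - sum(x * x for x in p) - 1.0


def log_score(lam_hat, claims):
    """S(P, i) = log p_i for an observed claim number i = claims."""
    return claims * math.log(lam_hat) - lam_hat - math.lgamma(claims + 1)


def expected_brier_score(lam_hat, lam, kmax=60):
    """Equation 4.1 with p from the estimate lam_hat and q from the real lam."""
    p = poisson_probabilities(lam_hat, kmax)
    q = poisson_probabilities(lam, kmax)
    return 2.0 * sum(a * b for a, b in zip(p, q)) - sum(a * a for a in p) - 1.0


def expected_log_score(lam_hat, lam, kmax=60):
    """Equation 4.2 with p from the estimate lam_hat and q from the real lam."""
    q = poisson_probabilities(lam, kmax)
    return sum(q[k] * log_score(lam_hat, k) for k in range(kmax + 1))
```

Because both rules are proper, `expected_brier_score(lam_hat, lam)` and `expected_log_score(lam_hat, lam)` never exceed their values at `lam_hat = lam`, and the gaps are the quadratic and Kullback-Leibler divergences, respectively.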

5. Simulation and results

Example portfolios for testing were simulated in the R statistical software. The simulation technique can be used for frequency estimation in practice if the appropriate inputs are available. The main goal is to rank the estimation methods described in Sections 2 and 3 for given inputs according to their scores, in order to support the choice of estimation technique. It is important to emphasize that the methodology might be applied to other actuarial models as well. Note that the example simulations below are limited to Poisson distributed claim numbers with a gamma mixing distribution for the frequencies.

First we need a portfolio containing N insured individuals, which is used to estimate the parameters α and β of the negative binomial distribution. We think of it as the policyholders' histories available in the insurer's database. Henceforth we refer to this institution as our company. Using the entire claim and bonus-malus history, we carried out the parameter estimation exactly as described in Section 2.1. In the next step, we generated the history of a portfolio containing M policyholders, assuming that the distribution parameters are unchanged compared to the first portfolio. This might introduce some bias, but it is the best we can do based on the available data. The same situation occurs in the operation of insurance institutions: on the one hand, the details of the insurer's own portfolio are known; on the other hand, for a recently acquired (or targeted) portfolio only part of the relevant information is available.

After that, we estimated the claim frequency parameters of the policyholders separately with the three estimation methods described above, and compared them to the real λ parameters using scores. Our aim is to decide which method gives the best fit in given cases. In other words, we ask which method yields the best-fitting estimates in expected value for given input parameters, where a higher score value means a better goodness-of-fit.

Since we are interested in the expected values of the scores for given inputs, we apply a Monte-Carlo-type technique. This means that we generate the above mentioned two portfolios r times independently, each time keeping the same distribution inputs α and β. After that, based on the estimates $\hat\alpha$ and $\hat\beta$, we estimate the λ parameters of the second portfolios. Each method gives one score for each simulation, which is the average of the scores calculated for the individuals. (Of course aggregate scores would be equally appropriate, since they differ from the average only by a factor of M.) Finally we take the mean of the mean scores, and the method with the higher score is the better one.

For $i = 1, \dots, r$:

1. generate portfolio $P_i^{(1)}$ consisting of N individuals, using parameters α, β;

2. calculate the estimators $\hat\alpha^{(i)}, \hat\beta^{(i)}$ of α, β;

3. generate portfolio $P_i^{(2)}$ containing M policies, with parameters α, β; this results in the frequencies $\lambda_1^{(i)}, \dots, \lambda_M^{(i)}$;

4. according to $\hat\alpha^{(i)}, \hat\beta^{(i)}$ and an estimation method, calculate the estimates $\hat\lambda_1^{(i)}, \dots, \hat\lambda_M^{(i)}$;

5. to each individual j assign the score $S_j^{(i)}$, and calculate the mean value $S^{(i)} = \frac{1}{M} \big( S_1^{(i)} + \dots + S_M^{(i)} \big)$.

In accordance with the above notation, the ultimate score value of one estimation method is $S = \frac{S^{(1)} + \dots + S^{(r)}}{r}$.
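The loop above can be sketched in a few dozen lines. The version below is a deliberately simplified illustration, not the R code used for the article: it covers only method.3 with the logarithmic score, ignores the bonus-malus classes, gives every policyholder the same observation period `years`, and uses hypothetical function names throughout. Note that Python's `random.gammavariate` takes a scale parameter, hence the argument `1.0 / beta`.

```python
import math
import random


def sample_poisson(rng, mean):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_portfolio(rng, size, alpha, beta, years):
    """Draw real frequencies and aggregate claim counts over `years` years."""
    lams = [rng.gammavariate(alpha, 1.0 / beta) for _ in range(size)]
    claims = [sample_poisson(rng, lam * years) for lam in lams]
    return lams, claims


def moment_estimates(claims, years):
    """Equations 2.3 and 2.4 when every exposure t_i equals `years`."""
    sum_x = float(sum(claims))
    sum_x2 = float(sum(x * x for x in claims))
    ratio = sum_x / (len(claims) * years)
    inv_beta = (sum_x2 / sum_x - 1.0) / years - ratio
    if inv_beta <= 0.0:
        raise ValueError("moment estimator undefined for this sample")
    beta_hat = 1.0 / inv_beta
    return ratio * beta_hat, beta_hat


def expected_log_score(lam_hat, lam, kmax=60):
    """Equation 4.2 for Poisson forecasts, truncated at kmax claims."""
    total = 0.0
    for k in range(kmax + 1):
        log_p = k * math.log(lam_hat) - lam_hat - math.lgamma(k + 1)
        log_q = k * math.log(lam) - lam - math.lgamma(k + 1)
        total += math.exp(log_q) * log_p
    return total


def mean_scores(r, n, m, alpha, beta, years, seed=0):
    """Return (mean score of method.3, mean score of the real frequencies)."""
    rng = random.Random(seed)
    estimated = reference = 0.0
    for _ in range(r):
        _, first_claims = simulate_portfolio(rng, n, alpha, beta, years)
        alpha_hat, beta_hat = moment_estimates(first_claims, years)
        lams, claims = simulate_portfolio(rng, m, alpha, beta, years)
        for lam, x in zip(lams, claims):
            lam_hat = (x + alpha_hat) / (years + beta_hat)
            estimated += expected_log_score(lam_hat, lam) / m
            reference += expected_log_score(lam, lam) / m
    return estimated / r, reference / r
```

A call such as `mean_scores(r=3, n=5000, m=200, alpha=1.2, beta=14.0, years=5.0)` returns the mean score of the estimates together with the reference score of the real frequencies; since the logarithmic score is proper, the former cannot exceed the latter.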

Remark 5.1. For reference, we also compute and plot the scores obtained by comparing the real frequencies to themselves, since these scores are not equal to 0, unlike in the divergence case. As a function of year steps these scores should be constant, contrary to the charts below, where small differences can be observed. The reason is that in the Monte-Carlo simulations the λ parameters were also generated over and over.

Remark 5.2. In our simulations the input parameters are the following:

1. The distribution parameters α and β. These can be considered the true underlying parameters, unknown to the insurance company.

2. N, the number of policyholders in the first portfolio, which is used to estimate the parameters α and β.

3. M, the number of policyholders in the second portfolio. This contains the individuals whose λ parameters have to be estimated. In practice, there might be an overlap between these two files.

4. Number of year steps. This is the time elapsed, in years, since the individual has been insured by our company. Note that it implies knowledge of the claims history and bonus classification, thus it is a very important feature, as it affects the goodness-of-fit of our estimation methods.

5. Number of years elapsed before entering our company. This affects method.1 and method.2, because the Markov chain on the bonus classes converges slowly to the stationary distribution.

6. Transition rules of the examined country.

7. Number of simulated portfolios. As we approximate the scores via a Monte-Carlo-type technique, it has to be large enough.

Figure 2 shows an example. We simulated r = 50 times a portfolio containing 80 thousand people, estimated the parameters α and β, then estimated the λ parameters of the second portfolio and compared the three methods against the real frequencies using scores. For every simulation and country we obtained 2 scores, the Brier score and the Log score. The points of the charts are the average scores over these 50 simulations for given year steps. The year steps mean that we generated the second portfolios (the ones currently being analysed) as if we had information about the policyholders going back 1, 2, 5, 10, 15 and 20 years, respectively. Intermediate points are interpolated linearly. Note that the standard deviations of the sample of Brier scores in the Hungarian example are below 0.0009, and below 0.0016 for the Log scores.

Remark 5.3. In practice, claim histories of different lengths are available for different policyholders. Depending on this length, we can decide that the parameters of one group of insured people will be estimated using method.2 and the rest using method.3, for example.

Remark 5.4. Here (see Figure 2) we let the random walks of policyholders in the systems run for 15 years. Then they are assumed to be acquired by our company, which implies our observations to start at that point. In other words, year steps start at that time, when each driver has already spent some time randomly walking on the transition graph.

The example clearly shows the differentiation capability of systems containing more bonus-malus classes. In other words, the more classes the system has, the more years are needed for method.3 to overtake method.2. Here the Bayesian type method.1 is the worst method in every case, but we should not forget that it can be useful in certain circumstances, for example if the claim history is largely missing.

On the tested parameters the two types of scores gave almost the same results (the difference is not significant), as we expected. That is, if method.x is the best according to Brier scores, then it is also the best according to Log scores.

6. Conclusions

[Figure 2: six panels plotting mean score values against time (years, 5 to 20) for method.1.mean, method.2.mean, method.3.mean and real.frequencies: (a) Brazilian, Brier scores; (b) Brazilian, Log scores; (c) Hungarian, Brier scores; (d) Hungarian, Log scores; (e) Belgian, Brier scores; (f) Belgian, Log scores.]

Figure 2: Mean Brier and Log scores in the Brazilian, Hungarian and Belgian systems, when N = 80000, M = 20000, with real distribution parameters α = 1.2 and β = 14, and 50 portfolio simulations.

In this article we presented the principle of bonus-malus systems and the necessary assumptions about the distribution of the claim numbers of policyholders, inter alia that the number X of claims caused in a year by an insured person is conditionally Poisson distributed. The goal was to estimate the frequency parameters λ, i.e., the expected number of claims caused, and we did not deal with the size of the claims. Since the unconditional distribution is negative binomial, we can simply estimate the parameters α and β based on the insurer's claim history from past years. Although the second portfolio might have somewhat different parameters, this file of the insurer's previously observed claims experience is the best we can use.

We introduced 3 methods for frequency estimation, of which one (method.1) has, to our knowledge, never been used by actuaries, and the other two are known. Note that method.1 is well applicable in the case of Hungarian MTPL (motor third party liability) insurance, because the data required by this method are the data most frequently available for new policyholders. The main aim of this article was to decide which method is the most appropriate in given circumstances, i.e., for given parameters. Our decision is based on scores, which are designed to measure the discrepancy between two distributions. Assuming that method.x gives the frequency parameters $\hat\lambda_1, \dots, \hat\lambda_M$ and the real ones are $\lambda_1, \dots, \lambda_M$, method.x is the best choice among the methods if its average score is greater than that of the others. The proposed Monte-Carlo-type algorithm can be used to support decisions in practice.

Last but not least, the method presented in Section 2 includes a technique that is suitable for comparing bonus-malus systems in the following sense. The more years method.2 remains better than the others, the more informative the system is, since in that case we expect a more accurate estimate of claims frequencies from the past years' average claim numbers in the different classes than from the other methods. For the parameters chosen in the example of Figure 2, in the Brazilian system method.3, based on claims history, becomes the most appropriate in the second year, while in the Hungarian system this takes 7-8 years, and in the Belgian system 16-17 years. The ranking methodology can also be applied under other distributional and model constraints.

7. Appendix

Equation 7.1 shows the transition probability matrix of the Hungarian bonus-malus system.

\[
M(\lambda) =
\begin{pmatrix}
1-e^{-\lambda} & e^{-\lambda} & 0 & \cdots & 0 & 0 & 0 \\
1-e^{-\lambda} & 0 & e^{-\lambda} & \cdots & 0 & 0 & 0 \\
1-e^{-\lambda} & 0 & 0 & \ddots & 0 & 0 & 0 \\
1-(\lambda+1)e^{-\lambda} & \lambda e^{-\lambda} & 0 & \cdots & 0 & 0 & 0 \\
1-(\lambda+1)e^{-\lambda} & 0 & \lambda e^{-\lambda} & \ddots & 0 & 0 & 0 \\
1-(\lambda^2/2!+\lambda+1)e^{-\lambda} & \lambda^2 e^{-\lambda}/2! & 0 & \cdots & 0 & 0 & 0 \\
1-(\lambda^2/2!+\lambda+1)e^{-\lambda} & 0 & \lambda^2 e^{-\lambda}/2! & \ddots & 0 & 0 & 0 \\
1-(\lambda^3/3!+\lambda^2/2!+\lambda+1)e^{-\lambda} & \lambda^3 e^{-\lambda}/3! & 0 & \ddots & 0 & 0 & 0 \\
1-(\lambda^3/3!+\lambda^2/2!+\lambda+1)e^{-\lambda} & 0 & \lambda^3 e^{-\lambda}/3! & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
1-(\lambda^3/3!+\lambda^2/2!+\lambda+1)e^{-\lambda} & 0 & 0 & \cdots & 0 & 0 & e^{-\lambda} \\
1-(\lambda^3/3!+\lambda^2/2!+\lambda+1)e^{-\lambda} & 0 & 0 & \cdots & \lambda e^{-\lambda} & 0 & e^{-\lambda}
\end{pmatrix}
\tag{7.1}
\]

References

[1] Arató, M. and A. Benczúr, Dynamic placement of records and the classical occupancy problem, Computers and Mathematics with Applications, 7(2) (1981), 173-185.

[2] Denuit, M. and P. Lambert, Smoothed NPML estimation of the risk distribution underlying bonus-malus systems, Proc. of the Casualty Actuarial Soc. 2001 (to appear).

[3] Denuit, M., X. Marechal, S. Pitrebois and J.F. Walhin, Actuarial Modelling of Claim Counts: Risk Classification, Credibility and Bonus-Malus Systems, Wiley, 2007.

[4] Ferro, C.A.T., Comparing probabilistic forecasting systems with the Brier score, Weather and Forecasting, 22(5) (2007), 1076-1088.

[5] Fishman, G., Monte Carlo: Concepts, Algorithms and Applications, Springer Series in Operations Research and Financial Engineering, Springer, 1996.

[6] Frangos, N. and S. Vrontos, Design of optimal bonus-malus systems with a frequency and a severity component on an individual basis of automobile insurance, ASTIN Bulletin, 31(1) (2001), 1-22.

[7] Frees, E.W., Regression Modelling with Actuarial and Financial Applications, Cambridge University Press, 2010.

[8] Gneiting, T., F. Balabdaoui and A.E. Raftery, Probabilistic forecasts, calibration and sharpness, J. of the Royal Stat. Soc.: Series B (Statistical Methodology), 69(2) (2007), 243-268.

[9] Gneiting, T. and A.E. Raftery, Strictly proper scoring rules, prediction and estimation, J. of the American Statistical Association, 102(477) (2007), 359-378.

[10] Lemaire, J., Automobile Insurance, Kluwer-Nijhoff, 1996.

[11] Lemaire, J., Bonus-malus systems: The European and Asian approach to merit-rating, North American Actuarial J., 2(1) (1998), 26-38.

[12] Mahmoudvand, R. and H. Hassani, Generalized bonus-malus systems with a frequency and a severity component on an individual basis in automobile insurance, ASTIN Bulletin, 39(5) (2009), 307-315.

[13] Tremblay, L., Using the Poisson inverse Gaussian in bonus-malus systems, ASTIN Bulletin, 22(1) (1992), 97-106.

[14] Walhin, J.F. and J. Paris, Using mixed Poisson processes in connection with bonus-malus systems, ASTIN Bulletin, 29(1) (1999), 81-100.

Miklós Arató
Department of Probability Theory and Statistics
Eötvös Loránd University
Budapest
Hungary
aratonm@ludens.elte.hu

László Martinek
Department of Probability Theory and Statistics
Eötvös Loránd University
Budapest
Hungary
