Lifetime distributions involve parameters which have to be estimated from sample data in order to fit the models to the data. In this section, three such parameter estimation methods are reviewed, namely the maximum likelihood method, the least squares method and the Bayesian parameter estimation method.

3.7.1 The maximum likelihood method

One of the most robust parameter estimation methods is the maximum likelihood estimation (MLE) method [171]. This method determines the most likely parameter values for a selected distribution. The MLE method essentially takes a sample data set as fixed and then selects the model parameters which maximise a likelihood function for the given data set. The MLE method may be formulated mathematically by taking $T_n$ as a continuous random variable with $n-1$ independent observations $T_1, \ldots, T_{n-1}$, each of which has a pdf of

\[ f(T_i; \theta_1, \ldots, \theta_k), \quad i = 1, \ldots, n-1, \tag{3.45} \]

where $\theta_1, \ldots, \theta_k$ are the unknown parameters of the lifetime distribution model that have to be estimated [165]. The likelihood function for (3.45) is given by

\[ L(\theta_1, \ldots, \theta_k \mid T_1, \ldots, T_{n-1}) = \prod_{i=1}^{n-1} f(T_i; \theta_1, \ldots, \theta_k). \tag{3.46} \]

In many cases, the so-called logarithmic likelihood function (log-likelihood function) is more convenient to use, which is simply the logarithm of (3.46), i.e.

\[ \Lambda = \ln L(\theta_1, \ldots, \theta_k \mid T_1, \ldots, T_{n-1}) = \sum_{i=1}^{n-1} \ln f(T_i; \theta_1, \ldots, \theta_k). \tag{3.47} \]

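To obtain the best estimates for the parameter values $\theta_1, \ldots, \theta_k$, either (3.46) or (3.47) may be maximised, because the logarithm is a monotonically increasing function and both functions therefore attain their maxima at the same parameter values. The estimated parameter values are the simultaneous solutions to the $k$ equations

\[ \frac{\partial L}{\partial \theta_j} = 0 \quad \text{or, equivalently,} \quad \frac{\partial \Lambda}{\partial \theta_j} = 0, \quad j = 1, \ldots, k. \tag{3.48} \]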
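As a brief worked illustration of (3.46) to (3.48), assume (purely for the sake of the example; this distribution is not singled out in the discussion above) that the lifetimes follow an exponential distribution with a single unknown rate parameter $\lambda$, so that $k = 1$ and $f(T_i; \lambda) = \lambda e^{-\lambda T_i}$. In this special case the likelihood equation can be solved by hand:

\[
\begin{aligned}
L(\lambda \mid T_1, \ldots, T_{n-1}) &= \prod_{i=1}^{n-1} \lambda e^{-\lambda T_i} = \lambda^{n-1} \exp\Big(-\lambda \sum_{i=1}^{n-1} T_i\Big), \\
\Lambda &= (n-1) \ln \lambda - \lambda \sum_{i=1}^{n-1} T_i, \\
\frac{\partial \Lambda}{\partial \lambda} &= \frac{n-1}{\lambda} - \sum_{i=1}^{n-1} T_i = 0
\quad \Longrightarrow \quad
\hat{\lambda} = \frac{n-1}{\sum_{i=1}^{n-1} T_i}.
\end{aligned}
\]

The estimate is therefore the reciprocal of the sample mean of the observed lifetimes. For most other lifetime distributions no such closed-form solution exists and the equations have to be solved numerically.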

The MLE method is a popular estimation method because of several advantages. First, the parameter estimates converge to the true values of the parameters as the sample size increases. Second, the parameter estimates are approximately normally distributed for large samples, which enables the use of the Fisher information matrix to quantify their uncertainty. Third, the method can accommodate right-censored and interval-censored data. A large sample is, however, often required to obtain an accurate outcome from the MLE method, because the method is known to perform poorly if the sample size is too small [180].
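The following is a minimal Python sketch of how the likelihood equations (3.48) might be solved numerically. It rests on several assumptions that are not part of the discussion above: a two-parameter Weibull distribution with shape `beta` and scale `eta` is taken as the lifetime model, the data set is complete (uncensored) and hypothetical, and the function names are illustrative only. Setting the partial derivative of the log-likelihood with respect to `eta` to zero gives `eta` in closed form as a function of `beta`, so only one equation in `beta` remains, which is solved here by bisection.

```python
import math


def weibull_log_likelihood(times, beta, eta):
    """Log-likelihood (3.47) for a two-parameter Weibull pdf (assumed model)."""
    return sum(
        math.log(beta) - beta * math.log(eta)
        + (beta - 1.0) * math.log(t) - (t / eta) ** beta
        for t in times
    )


def weibull_mle(times, lo=1e-3, hi=50.0, tol=1e-10):
    """Solve the likelihood equations (3.48) for complete Weibull data.

    The search bracket [lo, hi] for the shape parameter is an assumption
    and may need widening for other data sets.
    """
    n = len(times)
    mean_log = sum(math.log(t) for t in times) / n

    def score(beta):
        # Proportional to d(Lambda)/d(beta) after eliminating eta;
        # increasing in beta, with a single root at the MLE.
        s0 = sum(t ** beta for t in times)
        s1 = sum(t ** beta * math.log(t) for t in times)
        return s1 / s0 - 1.0 / beta - mean_log

    # Bisection on the shape parameter.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    beta = 0.5 * (lo + hi)
    # d(Lambda)/d(eta) = 0 gives eta**beta = mean of t**beta.
    eta = (sum(t ** beta for t in times) / n) ** (1.0 / beta)
    return beta, eta


# Hypothetical complete failure times, in hours.
failure_times = [16.0, 34.0, 53.0, 75.0, 93.0, 120.0]
beta_hat, eta_hat = weibull_mle(failure_times)
print(f"shape = {beta_hat:.3f}, scale = {eta_hat:.2f}")
```

Bisection is used only because it needs no external library; a general-purpose optimiser applied directly to the log-likelihood would serve equally well, and censored observations would require additional terms in the log-likelihood.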

3.7.2 The least squares method

The least squares (LS) estimation method aims to minimise the sum of squared residuals between the observed values and the values provided by the fitted model [165]. Let $T_n$ be a continuous random variable with $n-1$ independent observations $T_1, \ldots, T_{n-1}$. Then the residuals may be calculated as

\[ r_i = y_i - f(T_i \mid \theta_1, \ldots, \theta_k), \quad i = 1, \ldots, n-1, \tag{3.49} \]

where $y_i$ is the observed value associated with the $i$-th observation and $f(T_i \mid \theta_1, \ldots, \theta_k)$ is the corresponding value of the fitted model, given the model parameter values $\theta_1, \ldots, \theta_k$. The sum of the squared residuals

\[ S = \sum_{i=1}^{n-1} r_i^2 \tag{3.50} \]

is calculated and minimised in this case. This may be achieved by taking the partial derivative of $S$ with respect to each of $\theta_1, \ldots, \theta_k$ and setting each partial derivative equal to zero, as was done in (3.48). The resulting set of equations is then solved to obtain the unknown parameter values. This process can easily be carried out for linear models, but if the model is nonlinear, an iterative numerical approximation algorithm is required to obtain the optimal values for the parameters of the model. When employing the LS method, it is therefore useful to linearise the function (if this is possible) in order to ease the process of obtaining good parameter values. If this is achievable, the LS method is straightforward to apply and a solution can be found relatively easily [180]. The LS method is, however, reported to perform poorly in some cases when the data are censored.
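The linearisation mentioned above may be sketched in Python as follows. The sketch makes assumptions that go beyond the text: a two-parameter Weibull model is chosen because its cdf $F(t) = 1 - e^{-(t/\eta)^\beta}$ linearises to $\ln(-\ln(1 - F)) = \beta \ln t - \beta \ln \eta$, the observed values $y_i$ are taken to be this transformation of median ranks obtained from Benard's approximation, and the data set and function names are hypothetical. Minimising (3.50) for a straight line then has the usual closed-form solution.

```python
import math


def fit_line(xs, ys):
    """Closed-form least squares fit of y = intercept + slope * x.

    Obtained by setting dS/d(intercept) = dS/d(slope) = 0 in (3.50).
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def weibull_least_squares(times):
    """Estimate Weibull shape and scale by regression on the linearised cdf."""
    ordered = sorted(times)
    n = len(ordered)
    xs = [math.log(t) for t in ordered]
    ys = []
    for rank in range(1, n + 1):
        # Benard's approximation to the median rank (an assumption here).
        unreliability = (rank - 0.3) / (n + 0.4)
        ys.append(math.log(-math.log(1.0 - unreliability)))
    intercept, slope = fit_line(xs, ys)
    beta = slope                          # slope of the line is the shape
    eta = math.exp(-intercept / slope)    # intercept equals -beta * ln(eta)
    return beta, eta


# Hypothetical complete failure times, in hours.
failure_times = [16.0, 34.0, 53.0, 75.0, 93.0, 120.0]
beta_ls, eta_ls = weibull_least_squares(failure_times)
print(f"shape = {beta_ls:.3f}, scale = {eta_ls:.2f}")
```

Note that the residuals are minimised on the transformed scale rather than on the original scale, so the estimates generally differ from those obtained by a nonlinear least squares fit or by the MLE method.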

3.7.3 The Bayesian parameter estimation method

Bayes’ theorem [212] describes the conditional probability of an event based on conditions which may be related to the event. It relates the probability that an event will occur to the probabilities of associated events occurring or not occurring. The theorem combines prior information with sample data [180]. This characteristic of Bayes’ theorem may be exploited to make inferences about a model fitted to sample data, where the prior knowledge concerns the assumed model parameter values. Let $\phi(\theta_1, \ldots, \theta_k)$ be the prior distribution, where $\theta_1, \ldots, \theta_k$ are the assumed model parameter values. The posterior distribution, given the sample data $T_1, \ldots, T_{n-1}$, is expressed by

\[ f(\theta_1, \ldots, \theta_k \mid T_1, \ldots, T_{n-1}) = \frac{L(T_1, \ldots, T_{n-1} \mid \theta_1, \ldots, \theta_k)\, \phi(\theta_1, \ldots, \theta_k)}{\int_{\zeta} L(T_1, \ldots, T_{n-1} \mid \theta_1, \ldots, \theta_k)\, \phi(\theta_1, \ldots, \theta_k)\, d\theta_1 \cdots d\theta_k}, \tag{3.51} \]

where $L(T_1, \ldots, T_{n-1} \mid \theta_1, \ldots, \theta_k)$ is the likelihood function described in (3.46) and $\zeta$ is the range of the parameter values $\theta_1, \ldots, \theta_k$. The denominator in (3.51) may be interpreted as the marginal probability of obtaining the sample data, that is, the likelihood averaged over all parameter values in $\zeta$ according to the prior distribution. It normalises the posterior distribution so that it integrates to one. The integral generally does not admit a closed-form evaluation and numerical methods are therefore employed in many cases to obtain a solution.
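A minimal Python sketch of such a numerical evaluation is given below for a single parameter. Everything specific in it is an assumption made for illustration: the lifetimes are taken to be exponentially distributed with rate `rate`, the prior is uniform on a hypothetical range from 0 to 0.1, the data are hypothetical, and the integral in the denominator of (3.51) is approximated by the trapezoidal rule on an evenly spaced grid.

```python
import math


def exponential_likelihood(rate, times):
    """Likelihood (3.46) for an exponential pdf (assumed model)."""
    value = 1.0
    for t in times:
        value *= rate * math.exp(-rate * t)
    return value


def trapezoid(xs, ys):
    """Trapezoidal approximation of the integral of ys over xs."""
    return sum(
        0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    )


def grid_posterior(times, prior, lower, upper, points=2001):
    """Evaluate the posterior (3.51) on a grid over [lower, upper]."""
    step = (upper - lower) / (points - 1)
    grid = [lower + i * step for i in range(points)]
    numerator = [exponential_likelihood(r, times) * prior(r) for r in grid]
    evidence = trapezoid(grid, numerator)  # denominator of (3.51)
    posterior = [value / evidence for value in numerator]
    return grid, posterior, evidence


# Hypothetical failure times (hours) and a hypothetical uniform prior.
failure_times = [16.0, 34.0, 53.0, 75.0, 93.0, 120.0]
lower, upper = 0.0, 0.1


def uniform_prior(rate):
    return 1.0 / (upper - lower)


grid, posterior, evidence = grid_posterior(
    failure_times, uniform_prior, lower, upper
)
print(f"posterior integrates to {trapezoid(grid, posterior):.6f}")
```

A grid of this kind is practical only for one or two parameters; for larger $k$ the integral is usually approximated by sampling-based methods instead.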

To estimate the model parameters, three approaches may be adopted, namely obtaining the expected values of $\theta_1, \ldots, \theta_k$, obtaining the median values of $\theta_1, \ldots, \theta_k$, or obtaining any other percentile of the parameter values $\theta_1, \ldots, \theta_k$. The most popular approach involves estimating the expected values (mean values) of the parameter values. This is achieved by calculating the mean

\[ E(\theta_i) = \mu_{\theta_i} = \int_{\zeta_i} \theta_i\, f(\theta_1, \ldots, \theta_k \mid T_1, \ldots, T_{n-1})\, d\theta_i \tag{3.52} \]

of each parameter $\theta_i \in \{\theta_1, \ldots, \theta_k\}$ individually, where $\zeta_i$ denotes the $i$-th orthotope of $\zeta$. A similar approach is taken to estimate the medians of the parameters. The $i$-th median is obtained by solving for $\theta_i^{0.5}$ in

\[ \int_{0}^{\theta_i^{0.5}} f(\theta_1, \ldots, \theta_k \mid T_1, \ldots, T_{n-1})\, d\theta_i = \frac{1}{2}. \tag{3.53} \]

This approach may be adapted to estimate any other percentile of the parameter values, by solving for $\theta_i^{z}$ in

\[ \int_{0}^{\theta_i^{z}} f(\theta_1, \ldots, \theta_k \mid T_1, \ldots, T_{n-1})\, d\theta_i = z, \tag{3.54} \]

where $z$ is the desired percentile, expressed as a probability between 0 and 1. An advantage of employing the Bayesian parameter estimation method is that it provides a theoretical framework for combining prior information with sample data. The inferences are exact and are conditional on the data, so no asymptotic assumptions have to be made. This approach does not, however, specify which prior distribution should be selected, and so an external justification for assuming a particular prior distribution is required [26].
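The point estimates in (3.52) to (3.54) may be sketched numerically in Python for a single parameter, as below. The posterior used is an assumption made for illustration only: it is proportional to $\lambda^{6} e^{-391\lambda}$, which is what a flat prior and six hypothetical exponential lifetimes summing to 391 hours would produce, and it is evaluated on a hypothetical grid from 0 to 0.1. The function names are likewise illustrative.

```python
import math


def trapezoid(xs, ys):
    """Trapezoidal approximation of the integral of ys over xs."""
    return sum(
        0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    )


def posterior_mean(grid, density):
    """Posterior mean, as in (3.52)."""
    return trapezoid(grid, [x * d for x, d in zip(grid, density)])


def posterior_percentile(grid, density, z):
    """Solve (3.54) for the value below which a posterior mass z lies.

    The cumulative integral is accumulated cell by cell and interpolated
    linearly inside the cell in which it first reaches z.
    """
    cumulative = 0.0
    for i in range(len(grid) - 1):
        width = grid[i + 1] - grid[i]
        area = 0.5 * (density[i] + density[i + 1]) * width
        if cumulative + area >= z:
            fraction = (z - cumulative) / area if area > 0.0 else 0.0
            return grid[i] + fraction * width
        cumulative += area
    return grid[-1]


# Hypothetical posterior: flat prior, 6 exponential lifetimes, total 391 h.
count, total_time = 6, 391.0
points, upper = 2001, 0.1
grid = [upper * i / (points - 1) for i in range(points)]
unnormalised = [r ** count * math.exp(-r * total_time) for r in grid]
area = trapezoid(grid, unnormalised)
density = [u / area for u in unnormalised]

mean_rate = posterior_mean(grid, density)
median_rate = posterior_percentile(grid, density, 0.5)   # (3.53)
p90_rate = posterior_percentile(grid, density, 0.9)      # (3.54), z = 0.9
print(mean_rate, median_rate, p90_rate)
```

For a right-skewed posterior such as this one the median lies below the mean, so the choice between the two estimates affects the fitted model.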