E-BAYESIAN AND HIERARCHICAL BAYESIAN ESTIMATION IN A FAMILY OF DISTRIBUTIONS

Academic year: 2021

are responsible for its content and its originality; (ii) any possible co-authors agreed to its submission to W-OSDCE.

E-BAYESIAN AND HIERARCHICAL BAYESIAN ESTIMATION IN A FAMILY OF DISTRIBUTIONS

KIAPOUR, A.1 AND NAGHIZADEH QOMI, M.2

1Department of Statistics, Babol branch, Islamic Azad University, Babol, Iran

azadeh [email protected]

2 Department of Statistics, University of Mazandaran, Babolsar, Iran

[email protected]

Abstract. In this paper, we deal with Bayesian, E-Bayesian and hierarchical Bayesian estimation in a family of distributions under a squared log error loss function. Specifically, E-Bayesian and hierarchical Bayesian estimators for the shape parameter of a Pareto distribution are provided when the scale parameter is known. A Monte Carlo simulation is conducted to compare the Bayes and E-Bayesian estimators. A real data set is used to illustrate the proposed estimators.

1. Introduction

A Bayesian approach to a statistical problem requires specifying a prior distribution over the parameter space and a loss function. Many Bayesians believe that a single prior can be elicited. In practice, however, prior knowledge is vague, and any elicited prior distribution is only an approximation to the true one. We therefore restrict attention to a given flexible family of priors. Various solutions to this problem have been proposed. One of them is the E-Bayesian approach, which has been applied over the last decades. The E-Bayesian method was first introduced by Han (1997). The E-Bayesian estimator of an unknown parameter is obtained on the basis of the distribution of the hyperparameter(s); for more details, see Han (2007, 2009, 2011), Jaheen and Okasha (2011) and Kiapour (2018). In some situations, the parameters of the prior distribution may themselves depend on hyperparameters. In such situations, the hierarchical Bayesian estimation method is often used. The hierarchical Bayes method was first introduced by Lindley and Smith (1972).

2010 Mathematics Subject Classification. 62F15.

Key words and phrases. E-Bayesian estimation, Hierarchical Bayes, Pareto distribution.


In Bayesian inference, the most commonly used loss function is the convex and symmetric squared error loss (SEL) function, which is widely used in decision theory because of its simple mathematical properties. In some cases, however, it does not represent the true loss structure. For example, it is not well suited to estimating a scale parameter, and it assigns the same penalty to overestimation and underestimation. For estimation of the scale parameter θ, Brown (1968) proposed the squared log error loss (SLEL) function, which is given by

$$L(\theta, \delta) = (\ln \delta - \ln \theta)^2 = \left[\ln \frac{\delta}{\theta}\right]^2, \tag{1.1}$$

where both θ and δ are positive. This loss is neither symmetric nor convex in δ: it is convex when δ/θ ≤ e and concave otherwise. It nevertheless has a unique minimum at δ = θ, and L(θ, δ) increases as δ moves away from θ in either direction. This loss is appropriate in estimation problems where underestimation is more serious than overestimation; see Kiapour and Nematollahi (2011).
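As a small illustration of the asymmetry described above, the following Python sketch evaluates the loss (1.1) for an underestimate and an overestimate that miss θ by the same absolute amount. The function name `slel` and the numerical values are assumptions chosen only for illustration.

```python
import math

def slel(theta: float, delta: float) -> float:
    """Squared log error loss (1.1): (ln delta - ln theta)^2 for theta, delta > 0."""
    if theta <= 0 or delta <= 0:
        raise ValueError("theta and delta must be positive")
    return math.log(delta / theta) ** 2

theta = 2.0                       # illustrative true value
under = slel(theta, theta - 1.0)  # estimate falls short of theta by 1
over = slel(theta, theta + 1.0)   # estimate exceeds theta by 1

# Missing low by a fixed amount is penalized more than missing high by it.
print(under > over)
```

On the ratio scale the loss is symmetric, since δ = kθ and δ = θ/k give the same value; the asymmetry appears only when errors are measured as differences.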

In this paper, Bayes, E-Bayesian and hierarchical Bayesian estimators in a family of distributions are obtained under the loss function (1.1). In Section 2, we state preliminary definitions and formulas for Bayesian, E-Bayesian and hierarchical Bayesian estimation of an unknown parameter. In Section 3, we find the Bayes estimator of the parameter θ in a family of distributions under the loss function (1.1). E-Bayesian estimators are developed in Section 4. In Section 5, a Monte Carlo simulation is used to compare the Bayes and E-Bayesian estimators of the shape parameter of a Pareto distribution. Hierarchical Bayesian estimators are obtained in Section 6. The golfers income data are used for practical illustration in Section 7. Finally, we end the paper with a discussion.

2. Preliminaries

Let $X^n = (X_1, \ldots, X_n)$ be independent and identically distributed (i.i.d.) random variables from a distribution $p_\theta$ indexed by a real unknown parameter $\theta$. Also, let $(\chi, \mathcal{B}, \mathcal{P})$ denote the probability space generated by $X^n$, where $\chi \subset \mathbb{R}^n$, $\mathcal{B}$ is the σ-field of $\chi$, $\mathcal{P} = \{p_\theta(x) \mid \theta \in \Theta\}$ and $\Theta$ is the parameter space. In estimating $\theta$, let $L(\theta, \delta)$ be the loss function (1.1). Then the posterior risk of $\delta$ based on the observations $x^n = (x_1, \ldots, x_n)$ can be expressed as

$$\rho(\pi, \delta) = \ln^2 \delta(x^n) + E[\ln^2 \theta \mid x^n] - 2 \ln \delta(x^n)\, E[\ln \theta \mid x^n]. \tag{2.1}$$

The Bayes estimate of $\theta$ based on the observations $x^n$ is any estimate $\delta_B(x^n)$ that minimizes the posterior risk (2.1), which is given by

$$\delta_B(x^n) = e^{E[\ln \theta \mid x^n]}. \tag{2.2}$$

Information on the appropriate prior is often inadequate to specify a prior distribution unambiguously. The problem of expressing uncertainty regarding prior information can be solved by using a class of prior distributions.

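E-Bayesian inference deals with such a problem by constructing methods which are stable under such a lack of information. Consider a prior π(θ|a, b) for θ with hyperparameters a and b. The E-Bayesian estimator of θ is the expectation of the Bayes estimator over the hyperparameters and is defined as

$$\delta_{EB}(x^n) = \iint_D \delta_B(x^n)\, \pi(a, b)\, da\, db, \tag{2.3}$$

where D is the domain of a and b, and π(a, b) is the prior density function of hyperparameters a and b.

Table 1. Representation of the family p

| Distribution | $p_\theta(x)$ | $s(x)$ | $t(x)$ |
|---|---|---|---|
| Poisson | $Poi(\theta)$ | $x$ | $1$ |
| Exponential | $E(\theta)$ | $1$ | $x$ |
| Gamma | $G(\alpha, \theta)$, $\alpha > 0$ known | $\alpha$ | $x$ |
| Pareto | $Par(\alpha, \theta)$, $\alpha > 0$ known | $1$ | $\ln(x/\alpha)$ |
| Power | $P(\lambda, \theta)$, $\lambda > 0$ known | $1$ | $\ln(\lambda/x)$ |
| Negative exponential | $NE(\mu, \theta)$, $\mu > 0$ known | $1$ | $x - \mu$ |
| Inverse gamma | $IG(\alpha, \theta)$, $\alpha > 0$ known | $\alpha$ | $1/x$ |
| Inverse Gaussian | $IGa(\mu, \theta)$, $\mu > 0$ known | $1/2$ | $\frac{(x-\mu)^2}{2x}$ |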

According to Lindley and Smith (1972), when the prior distribution includes hyperparameters, a prior distribution may in turn be assigned to those hyperparameters. The corresponding hierarchical prior density function of θ is

$$\pi(\theta) = \iint_D \pi(\theta \mid a, b)\, \pi(a, b)\, da\, db. \tag{2.4}$$

Therefore, the hierarchical Bayesian estimator is obtained from the hierarchical posterior distribution using (2.2) as $\delta_{HB}(x^n) = e^{E[\ln \theta \mid x^n]}$.

3. Bayesian estimation strategy

Let $\{p_\theta \mid \theta \in \Theta\}$ be a one-parameter family of distributions with probability density function (p.d.f.)

$$f_\theta(x) = c(x, n)\, \theta^{s(x)} e^{-t(x)\theta}, \quad x \in \mathbb{R},$$

where c(x, n) is a function of x and n, and s and t are fixed functions. Examples of such distributions are given in Table 1.
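To make the representation concrete, here is a minimal Python sketch that encodes three rows of Table 1 as pairs of functions (s, t) and sums them over a sample. The dictionary `FAMILY`, the helper `sufficient_stats` and the argument name `known` (the known second parameter, where one exists) are hypothetical names introduced for illustration only.

```python
import math

# (s, t) pairs taken from Table 1; `known` is the known second parameter,
# e.g. the Pareto scale alpha, and is ignored where no such parameter exists.
FAMILY = {
    "poisson":     (lambda x, known=None: x,   lambda x, known=None: 1.0),
    "exponential": (lambda x, known=None: 1.0, lambda x, known=None: x),
    "pareto":      (lambda x, known: 1.0,      lambda x, known: math.log(x / known)),
}

def sufficient_stats(name, data, known=None):
    """Return the sums of s(x_i) and t(x_i) over the sample."""
    s, t = FAMILY[name]
    total_s = sum(s(x, known) for x in data)
    total_t = sum(t(x, known) for x in data)
    return total_s, total_t

# Illustrative values only: a tiny Pareto sample with known scale 1.
print(sufficient_stats("pareto", [2.0, 4.0], known=1.0))
```

Only these two sums of the data enter the likelihood as a function of θ, which is why the remaining rows of Table 1 can be handled the same way.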

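Let $X_1, X_2, \ldots, X_n$ be a sequence of i.i.d. random variables with distribution $f_\theta$. Set $X = (X_1, X_2, \ldots, X_n)$. Also, let $\pi_{a,b}$ be a conjugate family of distributions with p.d.f.

$$\pi(\theta \mid a, b) = \frac{b^a}{\Gamma(a)}\, \theta^{a-1} e^{-b\theta}, \quad \theta > 0, \tag{3.1}$$

where $\Gamma(a) = \int_0^\infty x^{a-1} e^{-x}\,dx$ is the gamma function, and the hyperparameters satisfy a > 0 and b > 0. It is easy to verify that the posterior distribution of θ given x is Gamma(S + a, T + b), where $S = \sum_{i=1}^n s(x_i)$ and $T = \sum_{i=1}^n t(x_i)$. Since $E[\ln \theta \mid x] = \psi(S + a) - \ln(T + b)$ under this posterior, it follows from (2.2) that the Bayes estimator of θ under the loss function (1.1) is given by

$$\delta_B(x) = \frac{e^{\psi(S+a)}}{T + b}, \tag{3.2}$$

where $\psi(\nu) = \frac{d}{d\nu} \ln \Gamma(\nu) = \frac{\Gamma'(\nu)}{\Gamma(\nu)}$ is the digamma function.

4. E-Bayesian estimation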

According to Han (1997), a and b should be selected to guarantee that π(θ|a, b) is a decreasing function of θ. If we take the conjugate prior (3.1), this holds when the hyperparameters satisfy 0 < a < 1 and b > 0. A prior distribution with a thinner tail would reduce the robustness of the Bayesian estimate. Accordingly, b should not be too large when 0 < a < 1. It is better to choose 0 < a < 1 and 0 < b < c, where c > 0 is a constant.

Suppose that the prior distributions of a and b are the uniform distribution on (0, 1) and the uniform distribution on (0, c), respectively, and that a and b are independent. Then the joint prior distribution of a and b is given by

$$\pi_1(a, b) = \frac{1}{c}, \quad 0 < a < 1,\ 0 < b < c. \tag{4.1}$$

In the following theorem, we obtain the E-Bayesian estimator of θ under the loss function (1.1) and the prior distribution (4.1).

Theorem 4.1. Let $x^n = (x_1, x_2, \ldots, x_n)$ be the sample observations from the one-parameter exponential family. Then the E-Bayesian estimator of θ corresponding to the prior given in (4.1) under the loss function (1.1) is equal to

$$\delta_{EB1}(x^n) = \frac{1}{c} \ln\left(1 + \frac{c}{T}\right) \int_0^1 e^{\psi(S+a)}\,da. \tag{4.2}$$

Proof. For $\pi_1(a, b)$, the E-Bayesian estimator under the loss function (1.1) is given by

$$\delta_{EB1}(x^n) = \int_0^1 \int_0^c \frac{e^{\psi(S+a)}}{c\,(T + b)}\,db\,da = \frac{1}{c} \ln\left(1 + \frac{c}{T}\right) \int_0^1 e^{\psi(S+a)}\,da,$$

which ends the proof. □
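The proof rests on a single elementary integral over b. For completeness, it can be written out as follows, assuming T > 0; the limit on the right is an added remark, not part of the original proof, and indicates that the factor reduces to 1/T as the range of b shrinks to zero.

```latex
\int_0^c \frac{db}{c\,(T+b)}
  = \frac{1}{c}\Bigl[\ln(T+b)\Bigr]_{b=0}^{b=c}
  = \frac{1}{c}\ln\!\Bigl(1+\frac{c}{T}\Bigr),
\qquad
\lim_{c\to 0^{+}} \frac{1}{c}\ln\!\Bigl(1+\frac{c}{T}\Bigr) = \frac{1}{T}.
```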

Also, suppose that the prior distribution of a is the Beta distribution Beta(u, v) and the prior distribution of b is the uniform distribution on (0, c), with a and b independent. Then the joint prior distribution of a and b is given by

$$\pi_2(a, b) = \frac{1}{c\,B(u, v)}\, a^{u-1} (1 - a)^{v-1}, \quad 0 < a < 1,\ 0 < b < c, \tag{4.3}$$

where $B(u, v) = \int_0^1 x^{u-1} (1 - x)^{v-1}\,dx$ is the beta function. In the following theorem, we obtain the E-Bayesian estimator of θ under the loss function (1.1) and the prior distribution (4.3).

Theorem 4.2. If $x^n = (x_1, x_2, \ldots, x_n)$ are the sample observations from the one-parameter exponential family, then the E-Bayesian estimator of θ corresponding to the prior given in (4.3) under the loss function (1.1) is equal to

$$\delta_{EB2}(x^n) = \frac{1}{c} \ln\left(1 + \frac{c}{T}\right) \int_0^1 e^{\psi(S+a)}\, \frac{1}{B(u, v)}\, a^{u-1} (1 - a)^{v-1}\,da. \tag{4.4}$$

Proof. The E-Bayesian estimator under the loss function (1.1) is given by

$$\delta_{EB2}(x^n) = \int_0^1 \int_0^c \frac{e^{\psi(S+a)}}{T + b}\, \frac{1}{c\,B(u, v)}\, a^{u-1} (1 - a)^{v-1}\,db\,da = \frac{1}{c} \ln\left(1 + \frac{c}{T}\right) \int_0^1 e^{\psi(S+a)}\, \frac{1}{B(u, v)}\, a^{u-1} (1 - a)^{v-1}\,da.$$
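The estimators (3.2), (4.2) and (4.4) can be evaluated numerically once S and T are known. The sketch below is one possible way to do so using only the Python standard library; it is not the authors' code. The helper names, the series-based `digamma` approximation, the Simpson rule and the values of S and T in the usage lines are all assumptions made for illustration, and the Beta-weighted version assumes u ≥ 1 and v ≥ 1 so that the integrand stays bounded.

```python
import math

def digamma(x: float) -> float:
    """Approximate psi(x) for x > 0: recurrence up to x >= 6, then asymptotic series."""
    if x <= 0:
        raise ValueError("x must be positive")
    acc = 0.0
    while x < 6.0:
        acc -= 1.0 / x          # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))

def simpson(f, lo: float, hi: float, n: int = 200) -> float:
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def bayes(S, T, a, b):
    """Bayes estimate (3.2)."""
    return math.exp(digamma(S + a)) / (T + b)

def e_bayes_uniform(S, T, c):
    """E-Bayesian estimate (4.2): a uniform on (0, 1), b uniform on (0, c)."""
    scale = math.log(1.0 + c / T) / c
    return scale * simpson(lambda a: math.exp(digamma(S + a)), 0.0, 1.0)

def e_bayes_beta(S, T, c, u, v):
    """E-Bayesian estimate (4.4): a ~ Beta(u, v), b uniform on (0, c); assumes u, v >= 1."""
    beta_uv = math.exp(math.lgamma(u) + math.lgamma(v) - math.lgamma(u + v))
    def integrand(a):
        return math.exp(digamma(S + a)) * a ** (u - 1) * (1 - a) ** (v - 1)
    scale = math.log(1.0 + c / T) / c
    return scale * simpson(integrand, 0.0, 1.0) / beta_uv

# Hypothetical sufficient statistics, for illustration only.
S, T = 50.0, 21.9
print(bayes(S, T, a=0.6, b=2.0))
print(e_bayes_uniform(S, T, c=2.5))
print(e_bayes_beta(S, T, c=2.5, u=3.0, v=2.0))
```

Because the integral over b is available in closed form, only a one-dimensional integral over a has to be computed numerically.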


5. Simulation study

In this section, we perform a numerical comparison between the Bayes and E-Bayesian estimators for the shape parameter of a Pareto distribution. For this purpose, we generate samples of n independent observations from a Pareto distribution with true parameter values α = 200 and θ = 3.

Let $\delta_i^k$, k = 1, 2, 3, denote, respectively, the Bayes estimator $\delta_B(x^n)$ given by (3.2) with a = 0.6 and b = 2, and the E-Bayesian estimators $\delta_{EB1}(x^n)$ and $\delta_{EB2}(x^n)$ under the priors (4.1) and (4.3), in the ith replication, for selected values c = 2.5, 3, 3.5, u = 3 and v = 2. These steps are repeated $M = 10^4$ times, and the estimated risk (ER) is calculated using the following formula:

$$ER(\delta^k) = \frac{1}{M} \sum_{i=1}^{M} \left(\ln \delta_i^k - \ln \theta\right)^2. \tag{5.1}$$

The results are summarized in Table 2. It is seen from Table 2 that the E-Bayesian estimators have a smaller estimated risk than the Bayes estimator for every n and c considered. Moreover, the estimated risk decreases as the sample size increases.

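Table 2. Results of ER for Bayes and E-Bayesian estimators

| n | c | δB | δEB1 | δEB2 |
|---|---|---|---|---|
| 20 | 2.5 | 0.10209 | 0.07367 | 0.07223 |
| | 3 | | 0.08048 | 0.07875 |
| | 3.5 | | 0.08870 | 0.08669 |
| 50 | 2.5 | 0.03120 | 0.02572 | 0.02545 |
| | 3 | | 0.02718 | 0.02686 |
| | 3.5 | | 0.02898 | 0.02860 |
| 100 | 2.5 | 0.01910 | 0.01637 | 0.01624 |
| | 3 | | 0.01715 | 0.01699 |
| | 3.5 | | 0.01807 | 0.01789 |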
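The simulation procedure of this section can be sketched in Python as below. This is an illustrative reimplementation under stated assumptions, not the code used to produce Table 2: the function names, the random seed, the inverse-CDF sampler and the series-based `digamma` are all choices made here, and only the Bayes estimator (3.2) and the E-Bayesian estimator (4.2) are compared. Its output depends on the seed and is not expected to reproduce the table digit for digit.

```python
import math
import random

def digamma(x: float) -> float:
    """Approximate psi(x) for x > 0: recurrence up to x >= 6, then asymptotic series."""
    acc = 0.0
    while x < 6.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))

def simpson(f, lo: float, hi: float, n: int = 200) -> float:
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def estimated_risks(n, theta=3.0, alpha=200.0, a=0.6, b=2.0, c=2.5, M=10_000, seed=0):
    """Estimated risk (5.1) of the Bayes (3.2) and E-Bayesian (4.2) estimators."""
    rng = random.Random(seed)
    S = n  # s(x) = 1 for the Pareto row of Table 1, so S equals the sample size
    num_bayes = math.exp(digamma(S + a))
    num_eb1 = simpson(lambda t: math.exp(digamma(S + t)), 0.0, 1.0)
    log_theta = math.log(theta)
    risk_bayes = risk_eb1 = 0.0
    for _ in range(M):
        # Inverse-CDF sampling: alpha * U^(-1/theta) is Pareto(alpha, theta), U in (0, 1]
        xs = [alpha * (1.0 - rng.random()) ** (-1.0 / theta) for _ in range(n)]
        T = sum(math.log(x / alpha) for x in xs)
        d_bayes = num_bayes / (T + b)
        d_eb1 = num_eb1 * math.log(1.0 + c / T) / c
        risk_bayes += (math.log(d_bayes) - log_theta) ** 2
        risk_eb1 += (math.log(d_eb1) - log_theta) ** 2
    return risk_bayes / M, risk_eb1 / M

if __name__ == "__main__":
    for n in (20, 50, 100):
        print(n, estimated_risks(n))
```

Both estimators are evaluated on the same simulated samples, so the comparison between them is less affected by Monte Carlo noise than the individual risk values are.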

6. Hierarchical Bayesian estimation

In this section, we obtain hierarchical Bayesian estimators of θ based on the two proposed prior distributions π1(a, b) and π2(a, b). First, consider the prior distribution π1(a, b). Then the hierarchical prior distribution is given by

$$\pi_1(\theta) = \int_0^1 \int_0^c \pi(\theta \mid a, b)\, \pi_1(a, b)\,db\,da = \frac{1}{c} \int_0^1 \int_0^c \frac{b^a}{\Gamma(a)}\, \theta^{a-1} e^{-b\theta}\,db\,da, \quad \theta > 0. \tag{6.1}$$

In the following theorem, we obtain the hierarchical Bayesian estimator of θ under the loss function (1.1) and the hierarchical prior distribution of θ in (6.1).

Theorem 6.1. Let $x^n = (x_1, x_2, \ldots, x_n)$ be the sample observations from the one-parameter exponential family. Then the hierarchical Bayesian estimator of θ under the loss function (1.1) is equal to

$$\delta_{HB1}(x^n) = \exp\left( \frac{\int_0^1 \int_0^c \frac{b^a\, \Gamma(S+a)}{\Gamma(a)\,(T+b)^{S+a}} \left(\psi(S+a) - \ln(T+b)\right) db\,da}{\int_0^1 \int_0^c \frac{b^a\, \Gamma(S+a)}{(T+b)^{S+a}\, \Gamma(a)}\,db\,da} \right). \tag{6.2}$$

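Proof. The hierarchical posterior density function of θ is given by

$$\begin{aligned}
\pi_1(\theta \mid x^n) &= \frac{\pi_1(\theta)\, L(\theta \mid x^n)}{\int_0^\infty \pi_1(\theta)\, L(\theta \mid x^n)\,d\theta}
= \frac{\int_0^1 \int_0^c \frac{b^a}{\Gamma(a)}\, \theta^{S+a-1} e^{-(T+b)\theta}\,db\,da}{\int_0^1 \int_0^c \frac{b^a}{\Gamma(a)} \int_0^\infty \theta^{S+a-1} e^{-(T+b)\theta}\,d\theta\,db\,da} \\
&= \frac{\int_0^1 \int_0^c \frac{b^a}{\Gamma(a)}\, \theta^{S+a-1} e^{-(T+b)\theta}\,db\,da}{\int_0^1 \int_0^c \frac{b^a\, \Gamma(S+a)}{(T+b)^{S+a}\, \Gamma(a)}\,db\,da}.
\end{aligned} \tag{6.3}$$

We have

$$\begin{aligned}
E[\ln \theta \mid x^n] &= \int_0^\infty (\ln \theta)\, \pi_1(\theta \mid x^n)\,d\theta
= \frac{\int_0^1 \int_0^c \frac{b^a}{\Gamma(a)} \int_0^\infty (\ln \theta)\, \theta^{S+a-1} e^{-(T+b)\theta}\,d\theta\,db\,da}{\int_0^1 \int_0^c \frac{b^a\, \Gamma(S+a)}{(T+b)^{S+a}\, \Gamma(a)}\,db\,da} \\
&= \frac{\int_0^1 \int_0^c \frac{b^a\, \Gamma(S+a)}{\Gamma(a)\,(T+b)^{S+a}} \left(\psi(S+a) - \ln(T+b)\right) db\,da}{\int_0^1 \int_0^c \frac{b^a\, \Gamma(S+a)}{(T+b)^{S+a}\, \Gamma(a)}\,db\,da}.
\end{aligned} \tag{6.4}$$

Thus, the proof is completed. □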
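The estimator of Theorem 6.1 has no closed form, but the ratio of double integrals in (6.4) can be approximated on a grid. The following Python sketch uses a plain midpoint rule with weights computed on the log scale; the function names, the grid size and the values of S, T and c in the usage line are assumptions for illustration, the `digamma` helper is a series approximation, and S > 0 is assumed.

```python
import math

def digamma(x: float) -> float:
    """Approximate psi(x) for x > 0: recurrence up to x >= 6, then asymptotic series."""
    acc = 0.0
    while x < 6.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))

def hierarchical_bayes_uniform(S: float, T: float, c: float, grid: int = 200) -> float:
    """Midpoint-rule approximation of the Theorem 6.1 estimator on (0, 1) x (0, c)."""
    # Reference log-scale (assumes S > 0); it cancels in the ratio and only
    # keeps the exponentials in a comfortable floating-point range.
    ref = math.lgamma(S) - S * math.log(T)
    num = den = 0.0
    for i in range(grid):
        a = (i + 0.5) / grid
        for j in range(grid):
            b = c * (j + 0.5) / grid
            log_w = (a * math.log(b) + math.lgamma(S + a) - math.lgamma(a)
                     - (S + a) * math.log(T + b) - ref)
            w = math.exp(log_w)
            num += w * (digamma(S + a) - math.log(T + b))
            den += w
    # The common cell area cancels between numerator and denominator.
    return math.exp(num / den)

# Hypothetical sufficient statistics, for illustration only.
print(hierarchical_bayes_uniform(50.0, 21.9, 2.5))
```

Written this way, the estimate is the exponential of a weighted average of ψ(S + a) − ln(T + b), so it always lies between the smallest and largest values of the Bayes estimate (3.2) over the hyperparameter region.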

Now, consider the prior distribution π2(a, b). Then the hierarchical prior distribution is given by

$$\pi_2(\theta) = \frac{1}{c\,B(u, v)} \int_0^1 \int_0^c \frac{b^a}{\Gamma(a)}\, \theta^{a-1} e^{-b\theta}\, a^{u-1} (1 - a)^{v-1}\,db\,da, \quad \theta > 0. \tag{6.5}$$

In the following theorem, we obtain the hierarchical Bayesian estimator of θ under the loss function (1.1) and the hierarchical prior distribution of θ in (6.5).

Theorem 6.2. Let $x^n = (x_1, x_2, \ldots, x_n)$ be the sample observations from the one-parameter exponential family. Then the hierarchical Bayesian estimator of θ under the loss function (1.1) is equal to

$$\delta_{HB2}(x^n) = \exp\left( \frac{\int_0^1 \int_0^c \frac{b^a\, \Gamma(S+a)\, a^{u-1} (1-a)^{v-1}}{\Gamma(a)\,(T+b)^{S+a}} \left(\psi(S+a) - \ln(T+b)\right) db\,da}{\int_0^1 \int_0^c \frac{b^a\, \Gamma(S+a)\, a^{u-1} (1-a)^{v-1}}{(T+b)^{S+a}\, \Gamma(a)}\,db\,da} \right). \tag{6.6}$$

Proof. The hierarchical posterior density function of θ is given by

$$\begin{aligned}
\pi_2(\theta \mid x^n) &= \frac{\pi_2(\theta)\, L(\theta \mid x^n)}{\int_0^\infty \pi_2(\theta)\, L(\theta \mid x^n)\,d\theta}
= \frac{\int_0^1 \int_0^c \frac{b^a\, a^{u-1} (1-a)^{v-1}}{\Gamma(a)}\, \theta^{S+a-1} e^{-(T+b)\theta}\,db\,da}{\int_0^1 \int_0^c \frac{b^a\, a^{u-1} (1-a)^{v-1}}{\Gamma(a)} \int_0^\infty \theta^{S+a-1} e^{-(T+b)\theta}\,d\theta\,db\,da} \\
&= \frac{\int_0^1 \int_0^c \frac{b^a\, a^{u-1} (1-a)^{v-1}}{\Gamma(a)}\, \theta^{S+a-1} e^{-(T+b)\theta}\,db\,da}{\int_0^1 \int_0^c \frac{b^a\, \Gamma(S+a)\, a^{u-1} (1-a)^{v-1}}{(T+b)^{S+a}\, \Gamma(a)}\,db\,da}.
\end{aligned} \tag{6.7}$$

We have

$$\begin{aligned}
E[\ln \theta \mid x^n] &= \int_0^\infty (\ln \theta)\, \pi_2(\theta \mid x^n)\,d\theta
= \frac{\int_0^1 \int_0^c \frac{b^a\, a^{u-1} (1-a)^{v-1}}{\Gamma(a)} \int_0^\infty (\ln \theta)\, \theta^{S+a-1} e^{-(T+b)\theta}\,d\theta\,db\,da}{\int_0^1 \int_0^c \frac{b^a\, \Gamma(S+a)\, a^{u-1} (1-a)^{v-1}}{(T+b)^{S+a}\, \Gamma(a)}\,db\,da} \\
&= \frac{\int_0^1 \int_0^c \frac{b^a\, \Gamma(S+a)\, a^{u-1} (1-a)^{v-1}}{\Gamma(a)\,(T+b)^{S+a}} \left(\psi(S+a) - \ln(T+b)\right) db\,da}{\int_0^1 \int_0^c \frac{b^a\, \Gamma(S+a)\, a^{u-1} (1-a)^{v-1}}{(T+b)^{S+a}\, \Gamma(a)}\,db\,da},
\end{aligned} \tag{6.8}$$

which ends the proof. □

7. A real example

Consider the golfers income data (Arnold, 2015). The incomes, up to the end of 1980, of the 50 golfers earning more than 700,000 dollars are shown in Table 3 (unit: 1000 dollars). A Pareto distribution with scale parameter α = 703 and shape parameter θ = 2.23 fits the data well. The Bayes estimates with a = 0.6 and b = 2, and the E-Bayesian and hierarchical Bayesian estimates with u = 3, v = 2 and selected values of c = 2.5, 3, 3.5, are summarized in Table 4. It is observed that the E-Bayesian and hierarchical Bayesian estimates are very close. Also, these estimates are all robust to the choice of c.

Table 3. The golfers income data

3581 1960 1433 1184 1066 1005 883 841 778 753
2474 1684 1410 1171 1056 1001 878 825 778 746
2202 1627 1374 1109 1051 965 871 820 771 729
1858 1537 1338 1095 1031 944 849 816 769 712
1829 1519 1208 1092 1016 912 844 814 759 708

Table 4. Results for Bayes, E-Bayesian and hierarchical Bayesian estimates

| c | δB | δEB1 | δEB2 | δHB1 | δHB2 |
|---|---|---|---|---|---|
| 2.5 | 2.1084 | 2.1749 | 2.1793 | 2.2327 | 2.2318 |
| 3 | | 2.1524 | 2.1567 | 2.2306 | 2.2301 |
| 3.5 | | 2.1305 | 2.1348 | 2.2296 | 2.2293 |

8. Discussion

The aim of this paper is to study Bayes, E-Bayesian and hierarchical Bayesian estimation of the unknown scale parameter for an exponential family of distributions under the SLEL function. First, we derive the Bayes estimator by choosing an explicit prior distribution over the parameter of interest. In practical situations, prior knowledge is vague and any elicited prior distribution is only an approximation to the true one, so E-Bayesian and hierarchical Bayesian analysis can be employed. We therefore investigated the performance of the E-Bayesian estimators for selected values of c (an upper bound for b) in comparison with the Bayes estimator. Our findings in a simulation study showed that the E-Bayesian estimators work better than the Bayes estimator. We also considered the golfers income data. In this case, the E-Bayesian estimators performed better than other estimators.


References

1. Arnold, B. C. (2015), Pareto distributions, Chapman and Hall/CRC Press.

2. Brown, L. D. (1968), Inadmissibility of the usual estimator of scale parameters in problems with unknown location and scale parameters, Annals of Mathematical Statistics, 39, 29-48.

3. Han, M. (1997), The structure of hierarchical prior distribution and its applications, Chin. Oper. Res. Manag. Sci. 6, 31-40.

4. Han, M. (2007), E-Bayesian estimation of failure probability and its application, Math. Comput. Model. 45, 1272-1279.

5. Han, M. (2009), E-Bayesian estimation and hierarchical Bayesian estimation of failure rate, Appl. Math. Model. 33, 1915-1922.

6. Han, M. (2011), E-Bayesian estimation and hierarchical Bayesian estimation of failure probability, Commun. Stat. Theory Methods. 40, 3303-3314.

7. Jaheen, Z. F. and Okasha, H. M. (2011), E-Bayesian estimation for the Burr type XII model based on type-2 censoring, Appl. Math. Model. 35, 4730-4737.

8. Kiapour, A. (2018), Bayes, E-Bayes and robust Bayes premium estimation and prediction under the squared log error loss function, Journal of the Iranian Statistical Society, 17, 33-47.

9. Kiapour, A., and Nematollahi, N. (2011), Robust Bayesian prediction and estimation under a square error loss function, Statistics and Probability Letters, 81, 1717-1724.

10. Lindley, D. V., and Smith, A. F. M. (1972), Bayes estimates for the linear model, J. Stat. Soc.
