Estimation of a common location parameter


Dissertation approved by the Faculty of Mathematics, Computer Science and Natural Sciences of RWTH Aachen University for the award of the academic degree of Doctor of Natural Sciences

submitted by Diplom-Mathematikerin

Xiaofang Wang from Henan

Reviewers: Univ.-Prof. Dr. Udo Kamps, Univ.-Prof. Dr. Erhard Cramer

Date of the oral examination: 27 June 2014


I would like to hereby sincerely express my gratitude to all those people who have made this dissertation possible and supported me in many ways during the time of this thesis.

My deepest gratitude is to my supervisor Professor Udo Kamps. He gave me the freedom to research on my own and guided me to recover when I faltered in my study. His patience and support helped me to overcome many crisis situations and finish this dissertation.

Moreover, I would like to offer my special thanks to Professor Erhard Cramer for his advice, which helped me to improve this dissertation.

I also wish to acknowledge the help provided by the former and current members of the Institute of Statistics. I have been amazingly fortunate to work with such friendly and helpful people. In particular, I would like to thank my colleague Dr. Stefan Bedbur, for the many valuable discussions that helped me understand my research area better, and Dr. Quon Nhon Vuong, for his continuous encouragement.

I am deeply indebted to my parents, Yazhen and Kexuan Wang, for giving me the financial support during the years of my study in Germany and moral support during this thesis. Moreover, I want to thank the lovely brothers and sisters in the Evangelical Chinese Church Aachen for encouraging me and praying for me through these difficult years.

Finally, I give my special thanks to my dear husband Bin Sun with all my heart. I thank him for helping me to adjust to a new country, for teaching me to think optimistically in the face of difficulties and for going through the difficult years together with me. Without his support, encouragement, patience and unwavering love, it would not have been possible for me to finish this thesis.


Chapter 1 Introduction

As an extension of (ordinary) order statistics, sequential order statistics were introduced to model sequential (n-r+1)-out-of-n systems (r ≤ n) (cf. Kamps (1995a, 1995b)). An (n-r+1)-out-of-n system describes a technical structure in which n components of the same kind are involved. The n components start working simultaneously, and the structure works as long as at least n-r+1 components are functioning. That is, the system fails as soon as r components have failed. For example, an airplane with four turbines that keeps working as long as at least three turbines are operational is a 3-out-of-4 system. If, in addition, the failure of a component in an (n-r+1)-out-of-n system affects the failure rate of the remaining components, the system is called a sequential (n-r+1)-out-of-n system. In the turbine example, the breakdown of a turbine can increase the stress on the remaining turbines, which could lead to an increase of their failure rate. The failure times of the components and the failure time of a sequential (n-r+1)-out-of-n system can be described by sequential order statistics.

1.1 Sequential Order Statistics

A definition of sequential order statistics has been introduced by Kamps (1995a), and we restate it in Def. 1.1.1. Later, Cramer and Kamps (2003) gave an alternative definition, which is stated in Def. 1.1.2. Def. 1.1.2 coincides with Def. 1.1.1 provided that the distribution functions $F_1, \dots, F_n$ in Def. 1.1.2 are continuous.

Definition 1.1.1 (Sequential order statistics)

Let $(Y_j^{(i)})_{1 \le i \le n,\ 1 \le j \le n-i+1}$ be a sequence of independent random variables with $Y_j^{(i)} \sim F_i$, $1 \le j \le n-i+1$, $1 \le i \le n$, where $F_1, \dots, F_n$ are continuous distribution functions with $F_1^{-1}(1) \le \dots \le F_n^{-1}(1)$.

Furthermore, let $X_j^{(1)} = Y_j^{(1)}$, $1 \le j \le n$, and $X_*^{(1)} = \min\{X_1^{(1)}, \dots, X_n^{(1)}\}$, and for $2 \le i \le n$ let

$$X_j^{(i)} = F_i^{-1}\Big( F_i(Y_j^{(i)})\big(1 - F_i(X_*^{(i-1)})\big) + F_i(X_*^{(i-1)}) \Big), \quad 1 \le j \le n-i+1,$$

and $X_*^{(i)} = \min\{X_1^{(i)}, \dots, X_{n-i+1}^{(i)}\}$. Then the random variables $X_*^{(1)}, \dots, X_*^{(n)}$ are called sequential order statistics (SOSs) based on $F_1, \dots, F_n$.

Definition 1.1.2 (Sequential order statistics (General Definition))

Let $F_1, \dots, F_n$ be distribution functions with $F_1^{-1}(1) \le \dots \le F_n^{-1}(1)$, and let $V_1, \dots, V_n$ be independent random variables with $V_i \sim \mathrm{Beta}(n-i+1, 1)$, $1 \le i \le n$.

Then the random variables

$$X_*^{(i)} = F_i^{-1}(X^{(i)}) \quad \text{with} \quad X^{(i)} = 1 - V_i \bar{F}_i(X_*^{(i-1)}), \quad 1 \le i \le n, \quad X_*^{(0)} = -\infty,$$

where $\bar{F}_i = 1 - F_i$, $1 \le i \le n$, are called sequential order statistics (SOSs) based on $F_1, \dots, F_n$.
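The recursion in Def. 1.1.2 can be illustrated with a short simulation. The following Python sketch is only an illustration under stated assumptions: it assumes distribution functions of the form $F_i = 1-(1-F)^{\alpha^{(i)}}$ with an exponential baseline $F(x) = 1-\exp(-\sigma(x-\mu))$, and the function name `simulate_sos` and its arguments are hypothetical names chosen here, not taken from the thesis.

```python
import math
import random


def simulate_sos(n, r, alphas, sigma, mu, rng):
    """Draw the first r SOSs via X*(i) = F_i^{-1}(1 - V_i * (1 - F_i(X*(i-1)))).

    Assumption (for illustration): F_i = 1 - (1 - F)^alpha_i with the
    exponential baseline F(x) = 1 - exp(-sigma * (x - mu)), x >= mu.
    """
    def survival(i, x):
        # 1 - F_i(x); equals 1 for x = -inf (the start value X*(0)).
        if x == -math.inf:
            return 1.0
        return math.exp(-alphas[i] * sigma * (max(x, mu) - mu))

    def inverse_survival(i, s):
        # Solves 1 - F_i(x) = s for x.
        return mu - math.log(s) / (alphas[i] * sigma)

    x_prev = -math.inf
    failures = []
    for i in range(r):                      # i = 0 is the first failure
        v = rng.betavariate(n - i, 1)       # V_{i+1} ~ Beta(n - i, 1)
        x_prev = inverse_survival(i, v * survival(i, x_prev))
        failures.append(x_prev)
    return failures


if __name__ == "__main__":
    rng = random.Random(1)
    print(simulate_sos(4, 3, [1.0, 1.5, 2.0, 2.5], sigma=2.0, mu=1.0, rng=rng))
```

Because each $V_i$ lies in $(0, 1]$, every simulated failure time is at least as large as the previous one, which mirrors the ordering of the SOSs.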

Def. 1.1.1 shows why SOSs are used to describe the life spans of the components in a sequential (n-r+1)-out-of-n system. The system starts with n components whose life spans are described by random variables $Y_1^{(1)}, \dots, Y_n^{(1)}$ that are iid with distribution function $F_1$. After the first failure at time $x_*^{(1)}$, the life spans of the remaining n − 1 components are described by $Y_1^{(2)}, \dots, Y_{n-1}^{(2)}$, which are iid with distribution function $F_2$ truncated on the left at $x_*^{(1)}$. This truncation ensures that the second failure time $x_*^{(2)}$ is not prior to the first one, $x_*^{(1)}$. Proceeding this way results in an ordered sample of failure times $x_*^{(1)} \le \dots \le x_*^{(n)}$. In an (n-r+1)-out-of-n system, we consider only the first r (r ≤ n) failure times, after which the system does not work any more. Thus, $x_*^{(r)}$ also describes the life span of the system. Let $X_*^{(j)}$ denote the random variable with realization $x_*^{(j)}$, $1 \le j \le n$. The joint density function of the first r (r ≤ n) SOSs $X_*^{(1)}, \dots, X_*^{(r)}$ has been derived in Kamps (1995a) as follows:

$$f^{X_*^{(1)}, \dots, X_*^{(r)}}(x^{(1)}, \dots, x^{(r)}) = \frac{n!}{(n-r)!} \prod_{j=1}^{r} \left[ \left( \frac{1 - F_j(x^{(j)})}{1 - F_j(x^{(j-1)})} \right)^{n-j} \frac{f_j(x^{(j)})}{1 - F_j(x^{(j-1)})} \right], \tag{1.1}$$

where $F_1, \dots, F_n$ are absolutely continuous distribution functions with corresponding density functions $f_1, \dots, f_n$.

Throughout this doctoral thesis, we will restrict ourselves to a particular choice of the distribution functions $F_1, \dots, F_n$, namely

$$F_j = 1 - (1 - F)^{\alpha^{(j)}}, \quad 1 \le j \le n, \tag{1.2}$$

where F is an absolutely continuous baseline distribution function with density function f and $\alpha^{(1)}, \dots, \alpha^{(n)}$ are positive real parameters. Note that the failure rate of this particularly chosen $F_j$, $1 \le j \le n$, is proportional to the failure rate of the baseline function F; it is given by $\alpha^{(j)} \frac{f}{1-F}$.

The first r SOSs $X_*^{(1)}, \dots, X_*^{(r)}$ (r ≤ n) based on $F_j$ in (1.2), $1 \le j \le n$, are random variables that represent the first r failure times of the components in an (n-r+1)-out-of-n system with conditional proportional failure rates. All of the n components in the system start working with failure rate $\alpha^{(1)} \frac{f}{1-F}$. Once the $j$th failure occurs, $1 \le j \le r$, the failure rate of the n − j components that are still functioning changes from $\alpha^{(j)} \frac{f}{1-F}$ to $\alpha^{(j+1)} \frac{f}{1-F}$. The parameters $\alpha^{(1)}, \dots, \alpha^{(n)}$ are usually called the model parameters of the system.

The joint density function of the first r SOSs $X_*^{(1)}, \dots, X_*^{(r)}$ (r ≤ n) based on $F_j$ in (1.2), $1 \le j \le n$, is given by

$$f^{X_*^{(1)}, \dots, X_*^{(r)}}(x^{(1)}, \dots, x^{(r)}) = \frac{n!}{(n-r)!} \left( \prod_{j=1}^{r} \alpha^{(j)} \right) \left[ \prod_{j=1}^{r-1} \big(1 - F(x^{(j)})\big)^{m^{(j)}} f(x^{(j)}) \right] \big(1 - F(x^{(r)})\big)^{\alpha^{(r)}(n-r+1)-1} f(x^{(r)}), \tag{1.3}$$

where $F^{-1}(0) \le x^{(1)} \le \dots \le x^{(r)} < F^{-1}(1)$, r ≤ n, and $m^{(j)} = (n-j+1)\alpha^{(j)} - (n-j)\alpha^{(j+1)} - 1$, $1 \le j \le n-1$ (cf. Cramer and Kamps (2001a, p. 311)).

More details about the distribution theory and properties of SOSs and generalized order statistics (GOSs) (such as the relationship between them) can be found in Kamps (1995a, 1995b) and Cramer and Kamps (2001a). Another approach to the stochastic modeling of reliability systems with conditional proportional failure rates has been given by Hollander and Peña (1995).

1.2 Location-scale Family

The joint density function of the first r SOSs in (1.3) depends on the absolutely continuous baseline distribution function F. Out of the many possible choices of F, we concentrate in this thesis only on those from a location-scale family $\mathcal{F}$ defined by its distribution function F with

$$F(x) = 1 - \exp\big(-\sigma (g(x) - \mu)\big), \quad x \ge g^{-1}(\mu), \tag{1.4}$$

where g is a differentiable, strictly increasing function on $[g^{-1}(\mu), g^{-1}(\infty))$, and $g^{-1}(\infty)$ is defined as the point $a_0 \in \mathbb{R}$ with $\lim_{x \nearrow a_0} g(x) = \infty$. The parameters µ and σ in F are usually called the location and scale parameter, respectively. Some members of the family $\mathcal{F}$ are shown in Table 1.1.

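Choosing $F \in \mathcal{F}$, (1.3) can be rewritten as

$$f^{X_*^{(1)}, \dots, X_*^{(r)}}(x^{(1)}, \dots, x^{(r)}) = \frac{n!}{(n-r)!}\, \sigma^r \prod_{j=1}^{r} \alpha^{(j)} g'(x^{(j)}) \exp\Big( -\sigma \alpha^{(j)} (n-j+1) \big( g(x^{(j)}) - g(x^{(j-1)}) \big) \Big), \tag{1.5}$$

where $g^{-1}(\mu) =: x^{(0)} \le x^{(1)} \le \dots \le x^{(r)} < g^{-1}(\infty)$.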
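To make the structure of (1.5) concrete, the following Python sketch evaluates its logarithm for a given observation. It is a minimal illustration, not part of the thesis: the function name `sos_log_density` and its argument names are hypothetical, and the caller is assumed to supply `g` and its derivative `g_prime` as ordinary Python functions.

```python
import math


def sos_log_density(x, n, alphas, sigma, mu, g, g_prime):
    """Logarithm of the joint density (1.5) at x = (x^(1), ..., x^(r)).

    Returns -inf if the ordering g^{-1}(mu) <= x^(1) <= ... <= x^(r)
    is violated (checked on the g-scale, where g(x^(0)) = mu).
    """
    r = len(x)
    g_vals = [mu] + [g(v) for v in x]
    if any(later < earlier for earlier, later in zip(g_vals, g_vals[1:])):
        return -math.inf
    # log(n! / (n - r)!) + r * log(sigma)
    log_f = math.lgamma(n + 1) - math.lgamma(n - r + 1) + r * math.log(sigma)
    for j in range(1, r + 1):
        a = alphas[j - 1]
        log_f += math.log(a) + math.log(g_prime(x[j - 1]))
        log_f -= sigma * a * (n - j + 1) * (g_vals[j] - g_vals[j - 1])
    return log_f


if __name__ == "__main__":
    # Ordinary order statistics of an exponential sample: g(x) = x, alpha = 1.
    print(sos_log_density([0.5, 1.0], 2, [1.0, 1.0], 1.0, 0.0,
                          lambda v: v, lambda v: 1.0))
```

With g(x) = x, n = r = 1 and $\alpha^{(1)} = 1$, the expression reduces to the log-density of a shifted exponential distribution, which gives a quick sanity check.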

This doctoral thesis is concerned with the estimation of the location parameter µ and the scale parameter σ in the baseline function $F \in \mathcal{F}$, based on observations of several different (n-r+1)-out-of-n systems. The observations are described in Situation 1.1.

As described in Situation 1.1, all system parameters $r_{ki}$ and $n_{ki}$, model parameters $\alpha_{ki} = (\alpha_{ki}^{(1)}, \dots, \alpha_{ki}^{(n_{ki})})$ and functions $g_{ki}$, $1 \le i \le s_k$, $1 \le k \le m$, are assumed to be known. The location parameter µ is supposed to be the same in all baseline functions, whereas the scale parameters $\sigma_k$, k = 1, …, m, differ from each other. The (n-r+1)-out-of-n systems based on the $F_{ki}$ with the same µ and m different $\sigma_k$ are called m different systems with the common location parameter µ. The systems are said to be different iff they possess different $\sigma_k$. This means that the other parameters $\alpha_{ki}$, $r_{ki}$, $n_{ki}$ and the functions $g_{ki}$ within the kth system may differ. From the kth system, $s_k$ observations $(x_{k1}, \dots, x_{ks_k})$, $1 \le k \le m$, are obtained, and we call these $s = \sum_{k=1}^{m} s_k$ observations a sample.

So far, studies on the estimation problem with a common location parameter do not involve density functions as complex as (1.5). Most of them considered only normal or exponential distributions. They estimated the unknown common location parameter of some simple distributions based on complete samples or type II censored samples. The scale parameters in the distributions are unknown and possibly unequal.

First, we refer to some papers that considered point estimation of the common location parameter of several exponential distributions based on complete samples, when the different scale parameters are assumed to be unknown. Ghosh and Razmpour (1984) have proposed the maximum likelihood estimator (MLE), a modified MLE (MMLE) and the uniformly minimum variance unbiased estimator (UMVUE) of the common location parameter. Pal and Sinha (1990) have given a small class of estimators, each of which dominates the MLE in terms of mean squared error and Pitman closeness. Pal and Sinha's results have been extended by Jin and Crouse (1998a, 1998b) in two respects: they introduced a wider class of estimators, and they used a class of risk functions that includes the mean squared error for the comparison of estimators.


The estimation of the common location parameter in type II censored distributions was first discussed by Jin and Elfessi (2001). In that paper the MLE of the common scale parameter of type II censored Pareto distributions has been given, and a class of estimators dominating the MLE in terms of mean squared error has been obtained. Since estimating the common scale parameter of Pareto distributions and estimating the common location parameter of exponential distributions are equivalent under a proper transformation, this work can be viewed as a study on the estimation of the common location parameter of several type II censored exponential distributions.

Besides point estimation, many previous works concern interval estimation of the common location parameter. Methods proposed for constructing confidence intervals for the common mean of several normal distributions can be found in Fairweather (1972) and Jordan and Krishnamoorthy (1996), where complete samples were used.

Since there exists a one-to-one correspondence between confidence intervals and hypothesis tests, methods for testing hypotheses can also be seen as methods for constructing confidence intervals, such as Cohen and Sackrowitz's method (Cohen and Sackrowitz (1984)) and the methods in the area of combined independent significance tests. Some examples in the area of combined independent significance tests are: Tippett's method (Tippett (1931)), Wilkinson's method (Wilkinson (1951)), Fisher's method (Fisher (1958)), the weighted Fisher's method (Good (1955)), the inverse normal method (Stouffer, Suchman, DeVinney, Star, and Williams (1949)), the weighted inverse normal method (Lipták (1958)), and the logit method (George (1977)).

Except for Jin and Elfessi (2001), all studies listed above used complete samples. For the estimation problem concerning type II censored SOSs, where the common location parameter of the $(n_{1i}-r_{1i}+1)$-out-of-$n_{1i}$ system, $1 \le i \le s_1$, is known, we refer to the work done by Cramer and Kamps (2001b). They gave the MLE of the scale parameter $\sigma_1$ of the baseline functions by using type II censored samples.

As mentioned above, we will study the estimation problem of the common location and scale parameters in the density function of SOSs (see (1.5)) in this thesis. Henceforth, the systems that we handle in this thesis always refer to systems with a common location parameter. Since the exponential and Pareto distributions are special cases of SOSs with density function (1.5) and complete samples are special cases of type II censored samples, the results in this thesis are more general than the results in the references. In other words, when proper values are assigned to some parameters in (1.5), the results in this thesis should coincide with the results in the references.

In the following, we will first use the observations from one or more systems to estimate the scale parameter(s) where the common location parameter µ is assumed to be known. Afterwards, the common location parameter as well as the scale parameters will be estimated under the assumption that µ is unknown.

Table 1.1: Elements of $\mathcal{F}$ ($\sigma > 0$, $a > 0$, $b \in \mathbb{R}$)

| Nr. | $g(x)$ | $F(x) = 1 - \exp(-\sigma(g(x) - \mu))$ | Support and parameters | Distribution |
|---|---|---|---|---|
| 1 | $x$ | $1 - \exp(-\sigma(x - \mu))$ | $x \ge \mu$ | Exponential |
| 2 | $ax + (bx)^2$ | $1 - \exp(-\sigma(ax + (bx)^2 - \mu))$ | $x \ge \frac{-a + \sqrt{a^2 + 4b^2\mu}}{2b^2}$, $\mu \ge -\frac{a^2}{4b^2}$, $b \ne 0$ | $\mu = 0$: Lifetime distribution (cf. Flehinger and Lewis (1959)); $a = 0$, $b = 1$: Rayleigh |
| 3 | $ax + (cx)^3$ | $1 - \exp(-\sigma(ax + (cx)^3 - \mu))$ | $x \ge \frac{1}{c}\Big(\frac{\mu}{2} + \sqrt{\frac{\mu^2}{4} + \frac{a^3}{27c^3}}\Big)^{1/3} + \frac{1}{c}\Big(\frac{\mu}{2} - \sqrt{\frac{\mu^2}{4} + \frac{a^3}{27c^3}}\Big)^{1/3}$, $c > 0$ | $\mu = 0$: Lifetime distribution (cf. Flehinger and Lewis (1959)) |
| 4 | $ax + (bx)^2 + (cx)^3$ | $1 - \exp(-\sigma(ax + (bx)^2 + (cx)^3 - \mu))$ | $c > 0$, $x \ge x_0$ (1) | extra complexity |
| 5 | $x^a$ | $1 - \exp(-\sigma(x^a - \mu))$ | $x \ge \mu^{1/a} \ge 0$ | Weibull; $a = 2$: Rayleigh |
| 6 | $(x - c)^a$ | $1 - \exp(-\sigma((x - c)^a - \mu))$ | $x \ge c + \mu^{1/a}$, $\mu^{1/a} \ge 0$ | Weibull |
| 7 | $\log\big(\frac{x + c}{a}\big)$ | $1 - \big(\frac{a \exp(\mu)}{x + c}\big)^{\sigma}$ | $x \ge a\exp(\mu) - c$ | $c = 0$: Pareto of Type I (2) |
| 8 | $\log(ax^{\lambda} - b)$ | $1 - \big(\frac{\exp(\mu)}{ax^{\lambda} - b}\big)^{\sigma}$ | $x \ge \big(\frac{b + \exp(\mu)}{a}\big)^{1/\lambda}$, $\lambda > 0$ | $a = 1$, $b = 0$: Pareto of Type I (2) |
| 9 | $-\log(b - ax^{\lambda})$ | $1 - \exp(\mu\sigma)(b - ax^{\lambda})^{\sigma}$ | $\big(\frac{b - \exp(-\mu)}{a}\big)^{1/\lambda} \le x < \big(\frac{b}{a}\big)^{1/\lambda}$, $\lambda > 0$ | $\mu = 0$, $b = 1$, $\lambda = 1$, $a = \frac{1}{\sigma}$: generalized Pareto (cf. Hosking and Wallis (1987) and Dargahi-Noubary (1989)) |
| 10 | $\log\big(\frac{ax^{\lambda} + c}{d}\big) + mx$ | $1 - \exp(\mu\sigma)\big(\frac{d}{ax^{\lambda} + c}\big)^{\sigma}\exp(-m\sigma x)$ | $c, m, d, \lambda > 0$, $x \ge x_0$ (1) | $a = 1$, $\lambda = 1$, $\mu = 0$, $c = d$: Pareto of Type III (2) |
| 11 | $(\log(ax + c))^{\beta + 1}$ | $1 - \exp(\mu\sigma)\exp(-\sigma(\log(ax + c))^{\beta + 1})$ | $x \ge \frac{\exp(\mu^{1/(\beta+1)}) - c}{a}$, $\mu \ge 0$, $\beta \ge 0$ | $c = 1$, $\mu = 0$, $\sigma = 1$: Lifetime distribution (cf. Dhillon (1981)) |
| 12 | $\exp((ax)^{\beta}) - d$ | $1 - \exp(\mu\sigma)\exp(\sigma(d - \exp((ax)^{\beta})))$ | $x \ge \frac{(\ln(\mu + d))^{1/\beta}}{a}$, $\beta > 0$, $d \ge 1 - \mu$ | $\mu = 0$, $\sigma = 1$, $d = 1$: Exponential Power (cf. Dhillon (1981)) |

(1) $x_0$ is the unique root of $g(x) = \mu$.
(2) cf. Johnson, Kotz, and Balakrishnan (1994), Vol. 1, p. 574.

Situation 1.1: For $m \in \mathbb{N}$ and $s_k \in \mathbb{N}$, $1 \le k \le m$, the notations and the observations are as follows:

| System | Model parameters | System parameters | Baseline function | Observation |
|---|---|---|---|---|
| 1 | $\alpha_{11}^{(1)}, \dots, \alpha_{11}^{(n_{11})}$ | $r_{11}, n_{11}$ | $F_{11}(x) = 1 - \exp(-\sigma_1(g_{11}(x) - \mu))$ | $x_{11} = (x_{11}^{(1)}, \dots, x_{11}^{(r_{11})})$ |
| 1 | $\alpha_{12}^{(1)}, \dots, \alpha_{12}^{(n_{12})}$ | $r_{12}, n_{12}$ | $F_{12}(x) = 1 - \exp(-\sigma_1(g_{12}(x) - \mu))$ | $x_{12} = (x_{12}^{(1)}, \dots, x_{12}^{(r_{12})})$ |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 1 | $\alpha_{1s_1}^{(1)}, \dots, \alpha_{1s_1}^{(n_{1s_1})}$ | $r_{1s_1}, n_{1s_1}$ | $F_{1s_1}(x) = 1 - \exp(-\sigma_1(g_{1s_1}(x) - \mu))$ | $x_{1s_1} = (x_{1s_1}^{(1)}, \dots, x_{1s_1}^{(r_{1s_1})})$ |
| 2 | $\alpha_{21}^{(1)}, \dots, \alpha_{21}^{(n_{21})}$ | $r_{21}, n_{21}$ | $F_{21}(x) = 1 - \exp(-\sigma_2(g_{21}(x) - \mu))$ | $x_{21} = (x_{21}^{(1)}, \dots, x_{21}^{(r_{21})})$ |
| 2 | $\alpha_{22}^{(1)}, \dots, \alpha_{22}^{(n_{22})}$ | $r_{22}, n_{22}$ | $F_{22}(x) = 1 - \exp(-\sigma_2(g_{22}(x) - \mu))$ | $x_{22} = (x_{22}^{(1)}, \dots, x_{22}^{(r_{22})})$ |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 2 | $\alpha_{2s_2}^{(1)}, \dots, \alpha_{2s_2}^{(n_{2s_2})}$ | $r_{2s_2}, n_{2s_2}$ | $F_{2s_2}(x) = 1 - \exp(-\sigma_2(g_{2s_2}(x) - \mu))$ | $x_{2s_2} = (x_{2s_2}^{(1)}, \dots, x_{2s_2}^{(r_{2s_2})})$ |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| m | $\alpha_{m1}^{(1)}, \dots, \alpha_{m1}^{(n_{m1})}$ | $r_{m1}, n_{m1}$ | $F_{m1}(x) = 1 - \exp(-\sigma_m(g_{m1}(x) - \mu))$ | $x_{m1} = (x_{m1}^{(1)}, \dots, x_{m1}^{(r_{m1})})$ |
| m | $\alpha_{m2}^{(1)}, \dots, \alpha_{m2}^{(n_{m2})}$ | $r_{m2}, n_{m2}$ | $F_{m2}(x) = 1 - \exp(-\sigma_m(g_{m2}(x) - \mu))$ | $x_{m2} = (x_{m2}^{(1)}, \dots, x_{m2}^{(r_{m2})})$ |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| m | $\alpha_{ms_m}^{(1)}, \dots, \alpha_{ms_m}^{(n_{ms_m})}$ | $r_{ms_m}, n_{ms_m}$ | $F_{ms_m}(x) = 1 - \exp(-\sigma_m(g_{ms_m}(x) - \mu))$ | $x_{ms_m} = (x_{ms_m}^{(1)}, \dots, x_{ms_m}^{(r_{ms_m})})$ |

For each k, $1 \le k \le m$, the vectors $x_{k1}, \dots, x_{ks_k}$ are considered as observations of system k, where

- $r_{ki}$ and $n_{ki}$, $1 \le i \le s_k$, $1 \le k \le m$, are positive integers with $r_{ki} \le n_{ki}$,
- $\alpha_{ki}^{(j)}$, $1 \le j \le n_{ki}$, $1 \le i \le s_k$, $1 \le k \le m$, is a positive real number,
- $g_{ki}$, $1 \le i \le s_k$, $1 \le k \le m$, is a differentiable, strictly increasing function on $[g_{ki}^{-1}(\mu), g_{ki}^{-1}(\infty))$,
- µ is an arbitrary real number and $\sigma_k$, $1 \le k \le m$, is a positive real number.

Chapter 2 Models with known common location parameter

In this chapter we focus our attention on the point estimation of the unknown scale parameter(s) σ (or σ’s), when the common location parameter µ, system parameters, model parameters and the g-functions (see (1.5) & Situation 1.1) are known. First, we give some estimators of the unique unknown scale parameter σ (observations are obtained from one system, meaning m = 1 in Situation 1.1), followed by estimators of m different unknown σ’s (observations of m possibly different systems, as described in Situation 1.1). These estimators and their properties have been studied in Kamps (1995a), Kamps (1995b), Cramer and Kamps (2001a) and Cramer and Kamps (2001b). The difference between our study and the former studies is that we obtain the estimators and their properties by using the properties of the exponential family, which simplifies the proofs. We begin with the definition of the exponential family.

Definition 2.1

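Let $(\mathcal{X}, \mathcal{B})$ be a measurable space, let $\Theta \ne \emptyset$ be a parameter set and let $\mathcal{P} = \{P_\theta : \theta \in \Theta\}$ be a family of probability measures on $(\mathcal{X}, \mathcal{B})$. If there exist a σ-finite measure µ on $(\mathcal{X}, \mathcal{B})$ that dominates $\mathcal{P}$ and a µ-density of $P_\theta$, $\theta \in \Theta$, which is given by

$$\frac{dP_\theta}{d\mu}(x) = C(\theta) \exp\left( \sum_{k=1}^{m} \zeta_k(\theta) T_k(x) \right) h(x), \quad x \in \mathcal{X},$$

then $\mathcal{P}$ is called an m-parametric exponential family in $\zeta_1, \dots, \zeta_m$ and $T_1, \dots, T_m$, provided that $m \in \mathbb{N}$, $C, \zeta_1, \dots, \zeta_m$ are real-valued functions on Θ and $h, T_1, \dots, T_m$ are $\mathcal{B}$-$\mathcal{B}^1$-measurable functions on $(\mathcal{X}, \mathcal{B})$ with $h > 0$.

Frequently, it is more convenient to use $\zeta_1, \dots, \zeta_m$ as the parameters and write the density in the canonical form

$$\frac{dP_\zeta}{d\mu}(x) = C(\zeta) \exp\left( \sum_{k=1}^{m} \zeta_k T_k(x) \right) h(x), \quad x \in \mathcal{X}.$$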

2.1 Estimation of the single unknown scale parameter

In this section, we estimate the scale parameter based on s independent observations of a system. We take the observations as given in Situation 1.1 with m = 1 and $s_1 = s$. Since we have just one system, we can simplify the notation used in Situation 1.1 by dropping the first subscript of all parameters, functions and observations. The modified Situation 1.1 is described as follows, and we name it Situation 2.1. As explained in the previous chapter, the model parameters, system parameters and the g-functions in the system could be different.

Situation 2.1

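| Model parameters | System parameters | Baseline function | Observation |
|---|---|---|---|
| $\alpha_1^{(1)}, \dots, \alpha_1^{(n_1)}$ | $r_1, n_1$ | $F_1(x) = 1 - \exp(-\sigma(g_1(x) - \mu))$ | $x_1 = (x_1^{(1)}, \dots, x_1^{(r_1)})$ |
| $\alpha_2^{(1)}, \dots, \alpha_2^{(n_2)}$ | $r_2, n_2$ | $F_2(x) = 1 - \exp(-\sigma(g_2(x) - \mu))$ | $x_2 = (x_2^{(1)}, \dots, x_2^{(r_2)})$ |
| ⋮ | ⋮ | ⋮ | ⋮ |
| $\alpha_s^{(1)}, \dots, \alpha_s^{(n_s)}$ | $r_s, n_s$ | $F_s(x) = 1 - \exp(-\sigma(g_s(x) - \mu))$ | $x_s = (x_s^{(1)}, \dots, x_s^{(r_s)})$ |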

We begin with the assumptions on the parameters and functions concerned:

- $r_i$ and $n_i$, $1 \le i \le s$, are positive integers with $r_i \le n_i$,
- $\alpha_i = (\alpha_i^{(1)}, \dots, \alpha_i^{(n_i)})$, $1 \le i \le s$, is a vector with positive real elements,
- $g_i$, $1 \le i \le s$, is a differentiable and strictly increasing function on $[g_i^{-1}(\mu), g_i^{-1}(\infty))$,
- and µ is an arbitrary real number.

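All these parameters and functions are supposed to be known. In Situation 2.1, $x_i = (x_i^{(1)}, \dots, x_i^{(r_i)})$, $1 \le i \le s$, is an observation of the $(n_i - r_i + 1)$-out-of-$n_i$ system based on the function $F_i(x) = 1 - \exp(-\sigma(g_i(x) - \mu))$. Let $X_i = (X_i^{(1)}, \dots, X_i^{(r_i)})$, $1 \le i \le s$, be the corresponding SOSs, which take values in the measurable space $(\mathbb{R}^{r_i}_{\le}, \mathbb{R}^{r_i}_{\le} \cap \mathcal{B}^{r_i})$, where $\mathbb{R}^{r_i}_{\le} := \{x \in \mathbb{R}^{r_i} \mid g_i^{-1}(\mu) \le x^{(1)} \le \dots \le x^{(r_i)} < g_i^{-1}(\infty)\}$ and $\mathbb{R}^{r_i}_{\le} \cap \mathcal{B}^{r_i}$ denotes the Borel sets of $\mathbb{R}^{r_i}_{\le}$. In the following, we will estimate the unknown scale parameter σ based on these s observations.

Let $\mathcal{P}^{X_i} = \{P_\sigma^{X_i} = f_\sigma^{X_i} \lambda^{r_i}|_{\mathbb{R}^{r_i}_{\le}} : \sigma \in \mathbb{R}_+\}$ be a parametric family of probability measures on the space $(\mathbb{R}^{r_i}_{\le}, \mathbb{R}^{r_i}_{\le} \cap \mathcal{B}^{r_i})$, where $\lambda^{r_i}$, $1 \le i \le s$, denotes the Lebesgue measure on $(\mathbb{R}^{r_i}_{\le}, \mathbb{R}^{r_i}_{\le} \cap \mathcal{B}^{r_i})$ and $\cdot|_B$ denotes the restriction of a measure to a measurable subset $B \in \mathcal{B}^{r_i}$. Then,

$$f_\sigma^{X_i}(x_i) = \sigma^{r_i} \exp\left( \sigma \sum_{j=1}^{r_i} \alpha_i^{(j)} (n_i - j + 1) \big( g_i(x_i^{(j-1)}) - g_i(x_i^{(j)}) \big) \right) \frac{n_i!}{(n_i - r_i)!} \prod_{j=1}^{r_i} \alpha_i^{(j)} g_i'(x_i^{(j)}), \tag{2.1}$$

for $x_i \in \mathbb{R}^{r_i}_{\le}$ with $x_i^{(0)} := g_i^{-1}(\mu)$, $\lambda^{r_i}$-a.e.

Moreover, we define $\tilde{X}^{(s)} = (X_1, \dots, X_s)$ and let $\mathcal{P}^{\tilde{X}^{(s)}} = \{P_\sigma^{\tilde{X}^{(s)}} = \otimes_{i=1}^{s} P_\sigma^{X_i} : \sigma \in \mathbb{R}_+\}$ be the family of probability measures on the space $(\times_{i=1}^{s} \mathbb{R}^{r_i}_{\le}, \otimes_{i=1}^{s} (\mathbb{R}^{r_i}_{\le} \cap \mathcal{B}^{r_i}))$. Since $X_1, \dots, X_s$ are independent, the density function of $\tilde{X}^{(s)}$ with respect to the Lebesgue measure $\otimes_{i=1}^{s} \lambda^{r_i}$ on this space can be easily derived and is given by

$$f_\sigma^{\tilde{X}^{(s)}}(\tilde{x}^{(s)}) = c^{(s)}(\sigma) \exp\big( \sigma\, T^{(s)}(\tilde{x}^{(s)}) \big)\, h^{(s)}(\tilde{x}^{(s)}), \tag{2.2}$$

for $\tilde{x}^{(s)} \in \times_{i=1}^{s} \mathbb{R}^{r_i}_{\le}$ with $x_i^{(0)} := g_i^{-1}(\mu)$, $1 \le i \le s$, $\otimes_{i=1}^{s} \lambda^{r_i}$-a.e., where

$$c^{(s)}(\sigma) = \sigma^{\sum_{i=1}^{s} r_i}, \tag{2.3}$$

$$T^{(s)}(\tilde{x}^{(s)}) = \sum_{i=1}^{s} \sum_{j=1}^{r_i} \alpha_i^{(j)} (n_i - j + 1) \big( g_i(x_i^{(j-1)}) - g_i(x_i^{(j)}) \big), \tag{2.4}$$

and

$$h^{(s)}(\tilde{x}^{(s)}) = \left( \prod_{i=1}^{s} \frac{n_i!}{(n_i - r_i)!} \right) \left( \prod_{i=1}^{s} \prod_{j=1}^{r_i} \alpha_i^{(j)} g_i'(x_i^{(j)}) \right). \tag{2.5}$$

Evidently, $\mathcal{P}^{\tilde{X}^{(s)}}$ forms a one-parametric exponential family in σ and the statistic $T^{(s)}$. Using the properties of an exponential family (cf. Bedbur (2011, Chapter 2)), many results for estimators of σ can be acquired immediately. In the rest of this section, R always denotes the known constant $\sum_{i=1}^{s} r_i$.

Theorem 2.1.1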

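(i) The maximum likelihood estimator (MLE) of σ exists and is given by

$$\sigma^{*(s)} = -\frac{R}{T^{(s)}(\tilde{X}^{(s)})}.$$

(ii) $\sigma^{*(s)} \sim \mathrm{InvGam}(R\sigma, R)$, i.e.,

$$f^{\sigma^{*(s)}}(t) = \frac{(R\sigma)^R}{\Gamma(R)} \left( \frac{1}{t} \right)^{R+1} \exp\left( -\frac{R\sigma}{t} \right), \quad t > 0. \tag{2.6}$$

(iii) $E\big((\sigma^{*(s)})^k\big) = \frac{\Gamma(R-k)}{\Gamma(R)} (R\sigma)^k$, $k < R$; in particular, $E(\sigma^{*(s)}) = \frac{R\sigma}{R-1}$ and $\mathrm{Var}(\sigma^{*(s)}) = \frac{(R\sigma)^2}{(R-1)^2 (R-2)}$.

(iv) $\mathrm{MSE}_\sigma(\sigma^{*(s)}) = E(\sigma - \sigma^{*(s)})^2 = \frac{R+2}{(R-1)(R-2)} \sigma^2$.

(v) $\sigma^{*(s)}$ is minimal sufficient and complete for $\mathcal{P}^{\tilde{X}^{(s)}}$.

(vi) $\frac{1}{\sigma^{*(s)}}$ is an efficient estimator of $\frac{1}{\sigma}$ based on s independent observations, i.e., $\frac{1}{\sigma^{*(s)}}$ has uniformly minimum variance among all unbiased estimators of $\frac{1}{\sigma}$. However, $\sigma^{*(s)}$ is not an efficient estimator of σ.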

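Proof. Let $\pi : \mathbb{R}_+ \to \pi(\mathbb{R}_+) : \sigma \mapsto \frac{d}{d\sigma}\big( -\ln(c^{(s)}(\sigma)) \big) = -\frac{R}{\sigma}$. Then π is a bijective function, and the inverse function of π is $\pi^{-1} : \mathbb{R}_- \to \mathbb{R}_+ : y \mapsto -\frac{R}{y}$.

Let $\sigma \in \mathbb{R}_+$ be fixed and let $U = (-\frac{\sigma}{2}, \frac{\sigma}{2})$. According to La. 2.1.18 of Bedbur (2011, p. 20), the moment generating function of $-T^{(s)}$ at $t \in U$ is given by

$$m_{-T^{(s)}}(t) = \frac{c^{(s)}(\sigma)}{c^{(s)}(\sigma - t)} = \frac{\sigma^R}{(\sigma - t)^R} = \left( 1 - \frac{t}{\sigma} \right)^{-R}, \quad t \in U. \tag{2.7}$$

Hence, $-T^{(s)} \sim \Gamma(\sigma, R)$, whose density function is given by

$$f^{-T^{(s)}}(t) = \frac{\sigma^R}{\Gamma(R)}\, t^{R-1} \exp(-\sigma t), \quad t > 0. \tag{2.8}$$

(i) Following from (2.8), $P_\sigma(T^{(s)}(\tilde{X}^{(s)}) \in \mathbb{R}_-) = 1$ for all σ > 0. With the application of Thm. 2.2.3 in Bedbur (2011, p. 26), $\sigma^{*(s)} = \pi^{-1}\big(T^{(s)}(\tilde{x}^{(s)})\big) = -\frac{R}{T^{(s)}(\tilde{x}^{(s)})}$.

(ii), (iii) and (iv) are obvious from (i) and the density function of $-T^{(s)}$.

(v) Since $\mathrm{Var}(T^{(s)}) \ne 0$, the assertion follows from Thm. 2.1.9 of Bedbur (2011, p. 15) and La. 2.1.27 of Bedbur (2011, p. 23).

(vi) Let $g(\sigma) = \frac{1}{\sigma}$. First, we prove that $\frac{1}{\sigma^{*(s)}}$ is an unbiased estimator of g(σ). With the application of Thm. 2.1.15 of Bedbur (2011, p. 18), we see that $E_\sigma\big(\frac{1}{\sigma^{*(s)}}\big) = \frac{1}{R} \frac{d}{d\sigma} \ln(c^{(s)}(\sigma)) = \frac{1}{\sigma} = g(\sigma)$. Furthermore, following from Thm. 2.1.22 of Bedbur (2011, p. 21), $I_{f_\sigma^{\tilde{X}^{(s)}}}(\sigma) = \mathrm{Var}(T^{(s)}) = \frac{R}{\sigma^2}$ holds, where $I_{f_\sigma^{\tilde{X}^{(s)}}}(\sigma)$ denotes the Fisher information of $\mathcal{P}^{\tilde{X}^{(s)}}$. Then,

$$\mathrm{Var}\left( \frac{1}{\sigma^{*(s)}} \right) = \frac{1}{R\sigma^2} = \left( \frac{d g(\sigma)}{d\sigma} \right)^2 \Big( I_{f_\sigma^{\tilde{X}^{(s)}}}(\sigma) \Big)^{-1},$$

which implies that $\mathrm{Var}\big(\frac{1}{\sigma^{*(s)}}\big)$ attains the lower bound of the Cramér-Rao inequality for g(σ) (cf. Thm. 3.3 of Shao (1994, p. 169)). Thus, $\frac{1}{\sigma^{*(s)}}$ has the minimum variance among all unbiased estimators of g(σ) based on s independent observations.

Since $E(\sigma^{*(s)}) \ne \sigma$, it is obvious that $\sigma^{*(s)}$ is not efficient for σ.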
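As an illustration of Thm. 2.1.1 (i), the statistic $T^{(s)}$ from (2.4) and the resulting MLE can be computed directly from the observations. The sketch below is a minimal Python example under assumptions: the data layout (a list of dictionaries with the hypothetical keys `x`, `n`, `alphas`, `g`) and the function names are chosen here for illustration and do not come from the thesis.

```python
def t_statistic(observations, mu):
    """Statistic T^(s) from (2.4); each observation is a dict with keys
    'x' (first r_i failure times), 'n' (n_i), 'alphas' (model parameters)
    and 'g' (the function g_i). Uses g_i(x_i^(0)) = mu."""
    total = 0.0
    for obs in observations:
        g_prev = mu
        pairs = zip(obs["x"], obs["alphas"])   # stops after r_i entries
        for j, (x_j, alpha_j) in enumerate(pairs, start=1):
            g_j = obs["g"](x_j)
            total += alpha_j * (obs["n"] - j + 1) * (g_prev - g_j)
            g_prev = g_j
    return total


def scale_mle(observations, mu):
    """MLE of sigma: -R / T^(s), with R the total number of failure times."""
    big_r = sum(len(obs["x"]) for obs in observations)
    return -big_r / t_statistic(observations, mu)


if __name__ == "__main__":
    # One 2-out-of-3 system (n = 3, r = 2), g(x) = x, mu = 0, all alphas = 1.
    data = [{"x": [1.0, 2.0], "n": 3, "alphas": [1.0, 1.0, 1.0], "g": lambda v: v}]
    print(t_statistic(data, 0.0), scale_mle(data, 0.0))
```

In this toy example $T^{(s)} = 3(0 - 1) + 2(1 - 2) = -5$ and R = 2, so the MLE equals 2/5.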

Remark 2.1.1

(i) In the special case of s = 1 and $\alpha_1^{(j)} = 1$, $j = 1, \dots, n_1$, i.e., estimation with a sample of type II censored ordinary order statistics of an exponential distribution, the MLE of $\frac{1}{\sigma}$ can be rewritten as

$$\frac{1}{\sigma^{*}} = \frac{1}{r_1} \left( \sum_{j=1}^{r_1} X_1^{(j)} + (n_1 - r_1) X_1^{(r_1)} \right),$$

which coincides with the result in Epstein (1956).

(ii) Given $g_i(x) = x$, $1 \le i \le s$, and with σ replaced by $\frac{1}{\sigma}$, the above-mentioned results can be found in Thm. 3.2 of Cramer and Kamps (2001b), where Cramer and Kamps have given the MLE of the single unknown scale parameter σ and its properties based on a sample of an $(n_i - r_i + 1)$-out-of-$n_i$ system with baseline function $F(t) = 1 - \exp\big(-\frac{t - \mu}{\sigma}\big)$.

Since we have found a sufficient and complete statistic for $\mathcal{P}^{\tilde{X}^{(s)}}$, the unique uniformly minimum variance unbiased estimator (UMVUE) of σ can easily be obtained according to the theorem of Lehmann-Scheffé (cf., e.g., Thm. 3.1 of Shao (1994, p. 162)).

Theorem 2.1.2

The UMVUE of σ is given by

$$\sigma^{**(s)} = -\frac{R - 1}{T^{(s)}(\tilde{X}^{(s)})}. \tag{2.9}$$

Proof. The statement follows clearly from Thm. 2.1.1 (iii) and (v).

Since σ??(s) is just a linear transformation of σ?(s), the density function, moments and the properties of sufficiency and completeness of σ??(s) can be easily derived. We will only list the results in the following theorem.

Theorem 2.1.3

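The UMVUE has the following properties:

(i) $\sigma^{**(s)} \sim \mathrm{InvGam}((R-1)\sigma, R)$, i.e.,

$$f^{\sigma^{**(s)}}(t) = \frac{((R-1)\sigma)^R}{\Gamma(R)} \left( \frac{1}{t} \right)^{R+1} \exp\left( -\frac{(R-1)\sigma}{t} \right), \quad t > 0.$$

(ii) $E\big((\sigma^{**(s)})^k\big) = \frac{\Gamma(R-k)}{\Gamma(R)} ((R-1)\sigma)^k$, $k < R$; in particular, $E(\sigma^{**(s)}) = \sigma$ and $\mathrm{Var}(\sigma^{**(s)}) = \frac{\sigma^2}{R-2}$.

(iii) $\mathrm{MSE}_\sigma(\sigma^{**(s)}) = E(\sigma - \sigma^{**(s)})^2 = \mathrm{Var}(\sigma^{**(s)}) = \frac{\sigma^2}{R-2}$.

(iv) $\sigma^{**(s)}$ is minimal sufficient and complete for $\mathcal{P}^{\tilde{X}^{(s)}}$.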
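The first moments stated for the MLE and the UMVUE can be checked numerically. The following Python sketch is a hedged illustration: it relies on the fact from (2.8) that $-T^{(s)}$ is gamma distributed with shape R and rate σ, and draws this statistic directly instead of simulating SOSs. The function name `simulate_estimators`, the sample size and the seed are arbitrary choices made here.

```python
import random


def simulate_estimators(big_r, sigma, n_draws, seed=0):
    """Draw -T ~ Gamma(shape=R, rate=sigma) and return samples of the
    MLE R / (-T) and the UMVUE (R - 1) / (-T)."""
    rng = random.Random(seed)
    mle_draws, umvue_draws = [], []
    for _ in range(n_draws):
        # random.gammavariate takes (shape, scale); scale = 1 / rate.
        minus_t = rng.gammavariate(big_r, 1.0 / sigma)
        mle_draws.append(big_r / minus_t)
        umvue_draws.append((big_r - 1) / minus_t)
    return mle_draws, umvue_draws


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    R, sigma = 10, 2.0
    mle_draws, umvue_draws = simulate_estimators(R, sigma, 100_000)
    print("mean MLE  :", mean(mle_draws), "theory:", R * sigma / (R - 1))
    print("mean UMVUE:", mean(umvue_draws), "theory:", sigma)
```

The empirical means should lie close to $R\sigma/(R-1)$ and σ, respectively, up to Monte Carlo error.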

Now, we turn our attention to the dominance of the two estimators σ?(s) and σ??(s). First, we compare them in the traditional way: by mean squared error (MSE). From that point of view, the UMVUE is preferred to the MLE. However, when another criterion, Pitman closeness, is used to compare them, MLE shows a better performance. Before the comparison, we give a definition of Pitman closeness:


Definition 2.1.1 (Pitman closeness)

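For the estimation problem with parameter space Θ, an estimator $\hat{\theta}$ of $\theta \in \Theta$ is said to be Pitman closer than another estimator $\tilde{\theta}$ if

$$P_\theta\big( |\hat{\theta} - \theta| < |\tilde{\theta} - \theta| \big) \ge 0.5, \quad \forall\, \theta \in \Theta, \tag{2.10}$$

with strict inequality for at least one $\theta \in \Theta$.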

Theorem 2.1.4

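(i) $\mathrm{MSE}(\sigma^{**(s)}) < \mathrm{MSE}(\sigma^{*(s)})$ for R > 2.

(ii) $\sigma^{*(s)}$ is Pitman closer than $\sigma^{**(s)}$.

Proof. (i) Both mean squared errors are finite for R > 2 (cf. Thm. 2.1.1 (iv) and Thm. 2.1.3 (iii)), and

$$\mathrm{MSE}(\sigma^{**(s)}) < \mathrm{MSE}(\sigma^{*(s)}) \iff \frac{\sigma^2}{R-2} < \frac{R+2}{R-1} \cdot \frac{\sigma^2}{R-2} \iff R - 1 < R + 2 \iff -1 < 2.$$

(ii) Similar to the proof of La. 3.2.4 in Bedbur (2011, p. 59), we can prove that $P_\sigma\big( |\sigma^{*(s)} - \sigma| < |\sigma^{**(s)} - \sigma| \big) > 0.5$ for all σ > 0.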
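Both criteria can be compared empirically. The following Python sketch is a Monte Carlo illustration under assumptions, not a proof: it again draws $-T^{(s)}$ from a gamma distribution with shape R and rate σ (cf. (2.8)), and the function name `compare_estimators`, the choice R = 20 and the number of draws are hypothetical choices made here.

```python
import random


def compare_estimators(big_r, sigma, n_draws, seed=0):
    """Monte Carlo estimates of MSE(MLE), MSE(UMVUE) and of the
    probability that the MLE is strictly closer to sigma than the UMVUE."""
    rng = random.Random(seed)
    sq_err_mle = 0.0
    sq_err_umvue = 0.0
    mle_closer = 0
    for _ in range(n_draws):
        minus_t = rng.gammavariate(big_r, 1.0 / sigma)  # shape R, scale 1/sigma
        mle = big_r / minus_t
        umvue = (big_r - 1) / minus_t
        sq_err_mle += (mle - sigma) ** 2
        sq_err_umvue += (umvue - sigma) ** 2
        if abs(mle - sigma) < abs(umvue - sigma):
            mle_closer += 1
    return sq_err_mle / n_draws, sq_err_umvue / n_draws, mle_closer / n_draws


if __name__ == "__main__":
    mse_mle, mse_umvue, p_closer = compare_estimators(20, 1.0, 200_000)
    print("MSE MLE  :", mse_mle, "theory:", 22 / (19 * 18))
    print("MSE UMVUE:", mse_umvue, "theory:", 1 / 18)
    print("P(MLE closer):", p_closer)
```

For moderate R the estimated probability exceeds 0.5 only slightly, so a large number of draws is needed to see the Pitman-closeness effect clearly.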

In the remainder of this subsection, we deduce some asymptotic properties of $\sigma^{*(s)}$ and $\sigma^{**(s)}$, namely strong consistency and asymptotic efficiency. The strong consistency of an estimator describes the behavior of the estimator when the sample size tends to infinity, whereas the asymptotic efficiency describes the asymptotic distribution of the estimator.

Definition 2.1.2 (Strong consistency)

Let $\hat{\theta}^{(s)}$ be a point estimator of a parameter $\theta \in \Theta$ for every sample size s, where θ is a k-dimensional vector. Then, the sequence $\{\hat{\theta}^{(s)}\}_{s \in \mathbb{N}}$ is called strongly consistent for θ iff $\hat{\theta}^{(s)} \xrightarrow{a.s.} \theta$ as $s \to \infty$, i.e., iff $\hat{\theta}^{(s)}$ converges almost surely towards θ as s tends to infinity.

Theorem 2.1.5

Both $(\sigma^{*(s)})_{s \in \mathbb{N}}$ and $(\sigma^{**(s)})_{s \in \mathbb{N}}$ are strongly consistent for σ.

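Proof. The density function $f_\sigma^{\tilde{X}^{(s)}}(\tilde{x}^{(s)})$ in (2.2) is equal to

$$f_\sigma^{\tilde{X}^{(s)}}(\tilde{x}^{(s)}) = \left( \prod_{i=1}^{s} \prod_{j=1}^{r_i} \eta_i^{(j)} \right) \exp\left( \sum_{i=1}^{s} \sum_{j=1}^{r_i} \eta_i^{(j)} T_i^{(j)}(x_i) \right) \left( \prod_{i=1}^{s} \frac{n_i!}{(n_i - r_i)!} \right) \left( \prod_{i=1}^{s} \prod_{j=1}^{r_i} g_i'(x_i^{(j)}) \right),$$

where $\eta_i^{(j)} := \sigma \alpha_i^{(j)}$ and $T_i^{(j)}(x_i) := (n_i - j + 1)\big( g_i(x_i^{(j-1)}) - g_i(x_i^{(j)}) \big)$, $j = 1, \dots, r_i$, $i = 1, \dots, s$. Let $\eta := (\eta_1^{(1)}, \dots, \eta_1^{(r_1)}, \dots, \eta_s^{(1)}, \dots, \eta_s^{(r_s)})$ and let $\tilde{T}(\tilde{X}^{(s)}) := (T_1^{(1)}(\tilde{X}^{(s)}), \dots, T_1^{(r_1)}(\tilde{X}^{(s)}), \dots, T_s^{(1)}(\tilde{X}^{(s)}), \dots, T_s^{(r_s)}(\tilde{X}^{(s)}))$. Thus, $\mathcal{P}^{\tilde{X}^{(s)}}$ forms a multivariate exponential family in η and $\tilde{T}$. Applying the properties of the exponential family, we have

$$m_{-\tilde{T}}(t) = \frac{\prod_{i=1}^{s} \prod_{j=1}^{r_i} \eta_i^{(j)}}{\prod_{i=1}^{s} \prod_{j=1}^{r_i} (\eta_i^{(j)} - t_i^{(j)})} = \prod_{i=1}^{s} \prod_{j=1}^{r_i} \frac{1}{1 - t_i^{(j)} / \eta_i^{(j)}}, \quad t \in \times_{i=1}^{s} \times_{j=1}^{r_i} \Big( -\frac{\eta_i^{(j)}}{2}, \frac{\eta_i^{(j)}}{2} \Big).$$

Therefore, $-T_i^{(j)}$, $j = 1, \dots, r_i$, $i = 1, \dots, s$, are mutually independent and exponentially distributed with location parameter 0 and scale parameter $\eta_i^{(j)}$. After a linear transformation, we obtain $-\alpha_i^{(j)} T_i^{(j)} \overset{iid}{\sim} \exp(0, \sigma)$, $j = 1, \dots, r_i$, $i = 1, \dots, s$.

Applying the strong law of large numbers,

$$-\frac{T^{(s)}}{R} = \frac{1}{\sum_{i=1}^{s} r_i} \sum_{i=1}^{s} \sum_{j=1}^{r_i} \big( -\alpha_i^{(j)} T_i^{(j)} \big)$$

converges almost surely (a.s.) towards $E_\sigma\big( -\alpha_1^{(1)} T_1^{(1)}(\tilde{X}^{(s)}) \big) = \frac{1}{\sigma}$, where $T^{(s)}$ is defined as in (2.4). Consequently, $\sigma^{*(s)} = -R / T^{(s)}$ converges a.s. towards σ, and since $\sigma^{**(s)} = \frac{R-1}{R} \sigma^{*(s)}$ with $\frac{R-1}{R} \to 1$, the same holds for $\sigma^{**(s)}$.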

Remark 2.1.2

If $\varphi : \mathbb{R}_+ \to \Gamma$ is a bijective continuous function, then both $(\varphi(\sigma^{**(s)}))_{s \in \mathbb{N}}$ and $(\varphi(\sigma^{*(s)}))_{s \in \mathbb{N}}$ are strongly consistent for estimating ϕ(σ). This statement follows directly from Thm. 1.10 in Shao (1994, p. 59).

Definition 2.1.3 (Asymptotic efficiency)

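Let $X_1, \dots, X_s$ be iid random variables with probability measure from a parametric family $\mathcal{P}^X$ indexed by θ, and let $(\hat{\theta}^{(s)})_{s \in \mathbb{N}}$ be a sequence of estimators of θ based on the sequence $(\tilde{X}^{(s)})_{s \in \mathbb{N}}$, where θ is a k-dimensional vector and $\tilde{X}^{(s)} = (X_1, \dots, X_s)$, s = 1, 2, … . Then, the sequence $\{\hat{\theta}^{(s)}\}_{s \in \mathbb{N}}$ is said to be asymptotically efficient if, for every $\theta \in \Theta$,

$$\sqrt{s}\, \big( \hat{\theta}^{(s)} - \theta \big) \xrightarrow{d} N_k\Big( 0, \big( I_{f^X}(\theta) \big)^{-1} \Big), \tag{2.11}$$

where $I_{f^X}(\theta)$ denotes the Fisher information matrix of $\mathcal{P}^X$ at $\theta \in \Theta$.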

From Def. 2.1.3, we see that if we want to discuss the asymptotic efficiency of the sequences of estimators $(\sigma^{*(s)})_{s \in \mathbb{N}}$ and $(\sigma^{**(s)})_{s \in \mathbb{N}}$, we need iid random variables. As a result, we will add some conditions on the system parameters $r_i$ in the following theorem.

Theorem 2.1.6

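(i) Let $r_i = r$ in (2.1) for $i = 1, \dots, s$, $s \in \mathbb{N}$. Then, $(\sigma^{*(s)})_{s \in \mathbb{N}}$ and $(\sigma^{**(s)})_{s \in \mathbb{N}}$ are asymptotically efficient for estimating σ, i.e.,

$$\sqrt{s}\, (\sigma^{*(s)} - \sigma) \xrightarrow{d} N\Big( 0, \big( I_{f^{X_1}}(\sigma) \big)^{-1} \Big) = N\Big( 0, \frac{\sigma^2}{r} \Big) \tag{2.12}$$

and

$$\sqrt{s}\, (\sigma^{**(s)} - \sigma) \xrightarrow{d} N\Big( 0, \big( I_{f^{X_1}}(\sigma) \big)^{-1} \Big) = N\Big( 0, \frac{\sigma^2}{r} \Big). \tag{2.13}$$

(ii) If $\varphi : \mathbb{R}_+ \to \Gamma$ is a continuous function such that $\varphi'(\sigma)$ exists and $\varphi'(\sigma) \ne 0$ for all $\sigma \in \mathbb{R}_+$, then $(\varphi(\sigma^{*(s)}))_{s \in \mathbb{N}}$ and $(\varphi(\sigma^{**(s)}))_{s \in \mathbb{N}}$ are asymptotically efficient for estimating ϕ(σ), i.e.,

$$\sqrt{s}\, \big( \varphi(\sigma^{*(s)}) - \varphi(\sigma) \big) \xrightarrow{d} N\Big( 0, \big( I_{f^{X_1}}(\sigma) \big)^{-1} (\varphi'(\sigma))^2 \Big) = N\Big( 0, \frac{\sigma^2}{r} (\varphi'(\sigma))^2 \Big) \tag{2.14}$$

and

$$\sqrt{s}\, \big( \varphi(\sigma^{**(s)}) - \varphi(\sigma) \big) \xrightarrow{d} N\Big( 0, \big( I_{f^{X_1}}(\sigma) \big)^{-1} (\varphi'(\sigma))^2 \Big) = N\Big( 0, \frac{\sigma^2}{r} (\varphi'(\sigma))^2 \Big). \tag{2.15}$$

Proof. (i) Putting $r_i = r$, $i = 1, \dots, s$, the density $f_\sigma^{X_i}$ in (2.1) takes the following form:

$$f_\sigma^{X_i}(x_i) = \sigma^r \exp\big( \sigma T_i(x_i) \big) \frac{n_i!}{(n_i - r)!} \prod_{j=1}^{r} \alpha_i^{(j)} g_i'(x_i^{(j)}), \tag{2.16}$$

where $T_i(x_i) := \sum_{j=1}^{r} \alpha_i^{(j)} (n_i - j + 1) \big( g_i(x_i^{(j-1)}) - g_i(x_i^{(j)}) \big)$, $x_i \in \mathbb{R}^{r}_{\le}$ and $x_i^{(0)} := g_i^{-1}(\mu)$.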

Obviously, $-T_1(X_1), \dots, -T_s(X_s)$ are independent random variables. Since $\mathcal{P}^{X_i} = \{P_\sigma^{X_i} = f_\sigma^{X_i}\,\lambda^r|_{\mathbb{R}^r_{\le}} : \sigma \in \mathbb{R}_+\}$ forms an exponential family in $\sigma$ and $T_i$, the density function of $-T_i(X_i)$, $i = 1, \dots, s$, which can be obtained analogously to the proof of Thm. 2.1.1 by computing the moment generating function, is given by
\[
f^{-T_i(X_i)}(t) = \frac{\sigma^r}{\Gamma(r)}\,t^{r-1}\exp(-\sigma t), \qquad t > 0. \tag{2.17}
\]

It follows from (2.17) that the $-T_i(X_i)$ are identically distributed and that the probability measure of each $-T_i(X_i)$ is from an exponential family. Using the properties of the exponential family, we know that the MLE of $\sigma$ in (2.17) (based on $s$ iid observations) exists and is given by
\[
\hat\sigma^{(s)} = -\frac{rs}{\sum_{i=1}^{s} T_i(X_i)}.
\]

The asymptotic efficiency of $\hat\sigma^{(s)}$ was proven in Exa. 3.12 of Lehmann and Casella (1998, pp. 450-451). Since $\hat\sigma^{(s)} = \sigma^{*(s)}$ and the Fisher information of (2.16) and (2.17) are the same, we know that (2.12) holds.

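Now, we prove the asymptotic efficiency of $\sigma^{**(s)}$. Setting $X^{(s)} := -\frac{R}{T^{(s)}}$ and $Y^{(s)} := -\frac{R-1}{T^{(s)}} + \frac{R}{T^{(s)}}$, $s = 1, 2, \dots$, the assertion about the asymptotic efficiency of $\sigma^{**(s)}$ follows from $(Y^{(s)})_{s\in\mathbb{N}} \xrightarrow{p} 0$ (cf. Thm. 3.4.3 (i) in Sen and Singer (1993, p. 130)). To prove $(Y^{(s)})_{s\in\mathbb{N}} \xrightarrow{p} 0$, we calculate the asymptotic distribution of $-Y^{(s)}$.

For $y \in (0, \infty)$,
\[
\lim_{s\to\infty} P(-Y^{(s)} \le y) = \lim_{s\to\infty} P\Bigl(-T^{(s)} \ge \frac{1}{y}\Bigr) = \exp\Bigl(-\frac{\sigma}{y}\Bigr)\lim_{s\to\infty}\sum_{q=0}^{rs-1}\frac{(\sigma/y)^q}{q!} = \exp\Bigl(-\frac{\sigma}{y}\Bigr)\exp\Bigl(\frac{\sigma}{y}\Bigr) = 1.
\]
For $y \in (-\infty, 0)$,
\[
\lim_{s\to\infty} P(-Y^{(s)} \le y) = \lim_{s\to\infty} P\Bigl(\frac{1}{-T^{(s)}} \le y\Bigr) = 0.
\]

As a result, $-Y^{(s)} \xrightarrow{d} Z$ with $P(Z = 0) = 1$. Applying Thm. 1.8 (vii) of Shao (1994, p. 51), we know that $(Y^{(s)})_{s\in\mathbb{N}} \xrightarrow{p} 0$ holds.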
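The limit in (2.12) can be checked by simulation. The sketch below is an illustration under stated assumptions rather than part of the proof: it uses the fact from (2.17) that $-\sum_i T_i(X_i)$ is Gamma distributed with shape $rs$ and rate $\sigma$, and compares the empirical variance of $\sqrt{s}(\sigma^{*(s)} - \sigma)$ with $\sigma^2/r$. The function name `scaled_mle_errors`, the seed and the parameter values are hypothetical choices.

```python
import math
import random

def scaled_mle_errors(sigma, r, s, reps, rng):
    """Draw `reps` values of sqrt(s) * (sigma_star - sigma).

    sigma_star = r*s / G, where G ~ Gamma(shape=r*s, rate=sigma) plays the
    role of -sum_i T_i(X_i) from (2.17).
    """
    errors = []
    for _ in range(reps):
        # random.gammavariate takes (shape, scale); scale = 1 / rate.
        g = rng.gammavariate(r * s, 1.0 / sigma)
        errors.append(math.sqrt(s) * (r * s / g - sigma))
    return errors

rng = random.Random(1)          # arbitrary fixed seed
sigma, r, s = 2.0, 3, 400       # illustrative values
errs = scaled_mle_errors(sigma, r, s, 4000, rng)
mean = sum(errs) / len(errs)
var = sum((e - mean) ** 2 for e in errs) / (len(errs) - 1)
print(var, sigma ** 2 / r)      # empirical variance vs. limit sigma^2 / r
```

The two printed numbers should be close for large `s`; the small remaining gap reflects both Monte Carlo error and the finite-sample variance of the MLE.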


2.2 Estimation of the multiple unknown scale parameters

In the previous section, we have obtained estimators of the single unknown scale parameter based on s independent observations of one system. In this section, we continue to estimate scale parameters, but based on observations of m possibly different systems. The observations are exactly the same as described in Situation 1.1.

Let $\tilde X^{(s)} := (X_{11}, \dots, X_{1s_1}, \dots, X_{k1}, \dots, X_{ks_k}, \dots, X_{m1}, \dots, X_{ms_m})$ be SOSs of observations $(x_{11}, \dots, x_{1s_1}, \dots, x_{k1}, \dots, x_{ks_k}, \dots, x_{m1}, \dots, x_{ms_m})$ (as in Situation 1.1) and let $\sum_{k=1}^{m} s_k = s$.

Since the $x_{ki}$ are independent observations of systems with possibly different parameters and baseline functions, the $X_{ki}$ are independent, but possibly not identically distributed. $X_{ki} = (X_{ki}^{(1)}, \dots, X_{ki}^{(r_{ki})})$, $1 \le i \le s_k$, $1 \le k \le m$, takes values in the measurable space $(\mathbb{R}^{r_{ki}}_{\le}, \mathbb{R}^{r_{ki}}_{\le} \cap \mathcal{B}^{r_{ki}})$, where
\[
\mathbb{R}^{r_{ki}}_{\le} := \bigl\{(x_{ki}^{(1)}, \dots, x_{ki}^{(r_{ki})}) \in \mathbb{R}^{r_{ki}} \;\big|\; g_{ki}^{-1}(\mu) \le x_{ki}^{(1)} \le \dots \le x_{ki}^{(r_{ki})} < g_{ki}^{-1}(\infty)\bigr\}
\]
and $\mathbb{R}^{r_{ki}}_{\le} \cap \mathcal{B}^{r_{ki}}$ denotes the Borel sets of $\mathbb{R}^{r_{ki}}_{\le}$, $1 \le i \le s_k$, $1 \le k \le m$. In the following, we will estimate the vector $\sigma = (\sigma_1, \dots, \sigma_m)$ of scale parameters, given that the common location parameter $\mu$, the model parameters $\alpha_{11}, \dots, \alpha_{ms_m}$ and the functions $g_{11}, \dots, g_{ms_m}$ are known.

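Let $\mathcal{P}_m^{\tilde X^{(s)}} = \{P_\sigma^{\tilde X^{(s)}} = f_\sigma^{\tilde X^{(s)}} \otimes_{k=1}^{m}\otimes_{i=1}^{s_k}\lambda^{r_{ki}}|_{\mathbb{R}^{r_{ki}}_{\le}} : \sigma \in \mathbb{R}^m_+\}$, where $\lambda^{r_{ki}}$ denotes the Lebesgue measure on $(\mathbb{R}^{r_{ki}}, \mathcal{B}^{r_{ki}})$ and $\cdot|_B$ denotes the restriction of a measure to a measurable subset $B \in \mathcal{B}^{r_{ki}}$, $1 \le i \le s_k$, $1 \le k \le m$. The density function $f_\sigma^{\tilde X^{(s)}}$ is given by
\[
f_\sigma^{\tilde X^{(s)}}(\tilde x^{(s)}) = c^{(s)}(\sigma)\exp\Bigl(\sum_{k=1}^{m}\sigma_k T_k^{(s)}(\tilde x^{(s)})\Bigr)h^{(s)}(\tilde x^{(s)}), \tag{2.18}
\]
with
\[
c^{(s)}(\sigma) = \prod_{k=1}^{m}\sigma_k^{\sum_{i=1}^{s_k} r_{ki}},
\]
\[
T_k^{(s)}(\tilde x^{(s)}) = \sum_{i=1}^{s_k}\sum_{j=1}^{r_{ki}}\alpha_{ki}^{(j)}(n_{ki} - j + 1)\bigl(g_{ki}(x_{ki}^{(j-1)}) - g_{ki}(x_{ki}^{(j)})\bigr), \qquad 1 \le k \le m,
\]
\[
h^{(s)}(\tilde x^{(s)}) = \prod_{k=1}^{m}\prod_{i=1}^{s_k}\Bigl(\Bigl(\prod_{j=1}^{r_{ki}}\alpha_{ki}^{(j)} g_{ki}'(x_{ki}^{(j)})\Bigr)\frac{n_{ki}!}{(n_{ki} - r_{ki})!}\Bigr)
\]
and $x_{ki} = (x_{ki}^{(1)}, \dots, x_{ki}^{(r_{ki})}) \in \mathbb{R}^{r_{ki}}_{\le}$, $x_{ki}^{(0)} := g_{ki}^{-1}(\mu)$, $1 \le i \le s_k$, $1 \le k \le m$, $\otimes_{k=1}^{m}\otimes_{i=1}^{s_k}\lambda^{r_{ki}}|_{\mathbb{R}^{r_{ki}}_{\le}}$-a.e.

Then $\mathcal{P}_m^{\tilde X^{(s)}}$ forms an $m$-parametric exponential family in $\sigma$ and $T^{(s)} = (T_1^{(s)}, \dots, T_m^{(s)})$. It follows that we can use the properties of the exponential family to estimate the vector $\sigma$ of scale parameters. The MLE and its properties are listed in the following theorem. The Löwner ordering used in the theorem defines an ordering of two matrices $A$ and $B$: $A \ge B$ iff $A - B$ is a positive semidefinite matrix. From this point forward, $\mathrm{diag}(a_1, \dots, a_m)$ always denotes an $m \times m$ diagonal matrix with diagonal elements $a_1, \dots, a_m$, and $R_k$ always denotes $\sum_{i=1}^{s_k} r_{ki}$, $k = 1, \dots, m$.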

Theorem 2.2.1

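Using the properties of a multidimensional exponential family (cf. Bedbur (2011, chapter 2)), we obtain:

(i) The Maximum Likelihood Estimator (MLE) of $\sigma$ is given by
\[
\sigma^{*(s)} = \Bigl(-\frac{R_1}{T_1^{(s)}(\tilde X^{(s)})}, \dots, -\frac{R_m}{T_m^{(s)}(\tilde X^{(s)})}\Bigr).
\]

(ii) $E(\sigma^{*(s)}) = \Bigl(\frac{R_1\sigma_1}{R_1 - 1}, \dots, \frac{R_m\sigma_m}{R_m - 1}\Bigr)$ and $\mathrm{Cov}(\sigma^{*(s)}) = \mathrm{diag}\Bigl(\frac{(R_1\sigma_1)^2}{(R_1 - 1)^2(R_1 - 2)}, \dots, \frac{(R_m\sigma_m)^2}{(R_m - 1)^2(R_m - 2)}\Bigr)$.

(iii) $\sigma^{*(s)}$ is minimal sufficient and complete for $\mathcal{P}_m^{\tilde X^{(s)}}$.

(iv) $\Bigl(-\frac{T_1^{(s)}}{R_1}, \dots, -\frac{T_m^{(s)}}{R_m}\Bigr)$ is an efficient estimator for $\Bigl(\frac{1}{\sigma_1}, \dots, \frac{1}{\sigma_m}\Bigr)$, i.e., it has uniformly the minimal covariance matrix in the sense of the Löwner ordering among all unbiased estimators of $\Bigl(\frac{1}{\sigma_1}, \dots, \frac{1}{\sigma_m}\Bigr)$ based on $s$ independent observations. Its covariance matrix $\mathrm{Cov}\bigl(\frac{1}{\sigma^{*(s)}}\bigr)$ attains the lower bound of the Rao-Cramér inequality, $\mathrm{diag}\Bigl(\frac{1}{\sigma_1^2 R_1}, \dots, \frac{1}{\sigma_m^2 R_m}\Bigr)$.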

Proof. For fixed $\sigma \in \mathbb{R}^m_+$, let $\delta = \min\{\sigma_1, \dots, \sigma_m\}/2 > 0$, and let $U = (-\delta, \delta)^m$. The moment generating function of $-T^{(s)}$ at $t \in U$ can be calculated as follows:
\[
m_{-T^{(s)}}(t) = \frac{c^{(s)}(\sigma)}{c^{(s)}(\sigma - t)} = \prod_{k=1}^{m}\Bigl(\frac{\sigma_k}{\sigma_k - t_k}\Bigr)^{R_k} = \prod_{k=1}^{m}\Bigl(1 - \frac{t_k}{\sigma_k}\Bigr)^{-R_k}.
\]
Hence, $-T_1^{(s)}(\tilde X^{(s)}), \dots, -T_m^{(s)}(\tilde X^{(s)})$ are jointly independent random variables, and $-T_k^{(s)}(\tilde X^{(s)})$, $1 \le k \le m$, has a Gamma distribution with scale parameter $\sigma_k$ and shape parameter $R_k$, i.e.,
\[
f^{-T_k^{(s)}(\tilde X^{(s)})}(t_k) = \frac{\sigma_k^{R_k}}{\Gamma(R_k)}\,t_k^{R_k - 1}\exp(-\sigma_k t_k), \qquad t_k > 0,\ k = 1, \dots, m.
\]

(i) can be proven analogously to the proof of Thm. 2.1.1.

(ii) With the joint distribution function of $(-T_1^{(s)}(\tilde X^{(s)}), \dots, -T_m^{(s)}(\tilde X^{(s)}))$, (ii) can be easily computed.

(iii) Since $\mathrm{Cov}(-T^{(s)}) > 0$, the assertion follows directly from Thm. 2.1.9 (ii) of Bedbur (2011, p. 15) and Thm. 2.1.27 of Bedbur (2011, p. 23).


p. 29). To obtain the UMVUE of $\sigma_{\tilde k}$ for an arbitrary $\tilde k \in \{1, \dots, m\}$, we consider the following situation.

Situation 2.2

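Suppose that, for a $\tilde k \in \{1, \dots, m\}$, $\sigma_1, \dots, \sigma_{\tilde k - 1}, \sigma_{\tilde k + 1}, \dots, \sigma_m$ are considered as fixed nuisance parameters. Then, by setting
\[
\tilde h(\tilde x^{(s)}) = \prod_{k=1}^{m}\prod_{i=1}^{s_k}\Bigl(\Bigl(\prod_{j=1}^{r_{ki}}\alpha_{ki}^{(j)} g_{ki}'(x_{ki}^{(j)})\Bigr)\frac{n_{ki}!}{(n_{ki} - r_{ki})!}\Bigr)\Bigl(\prod_{k=1,\,k\neq\tilde k}^{m}\sigma_k^{\sum_{i=1}^{s_k} r_{ki}}\Bigr)\exp\Bigl(\sum_{k=1,\,k\neq\tilde k}^{m}\sigma_k T_k^{(s)}(\tilde x^{(s)})\Bigr), \tag{2.19}
\]
(2.18) can be rewritten as
\[
f_\sigma^{\tilde X^{(s)}}(\tilde x^{(s)}) = \sigma_{\tilde k}^{\sum_{i=1}^{s_{\tilde k}} r_{\tilde k i}}\exp\bigl(\sigma_{\tilde k} T_{\tilde k}^{(s)}(\tilde x^{(s)})\bigr)\,\tilde h(\tilde x^{(s)}), \qquad x_{ki} \in \mathbb{R}^{r_{ki}}_{\le},\ 1 \le i \le s_k,\ 1 \le k \le m.
\]
Hence, $\mathcal{P}_{\tilde k}^{\tilde X^{(s)}} = \{P_\sigma^{\tilde X^{(s)}} : \sigma_{\tilde k} > 0\}$ forms a one-parametric exponential family in $\sigma_{\tilde k}$ and $T_{\tilde k}$.

The UMVUE $\sigma^{**}_{\tilde k}$ of $\sigma_{\tilde k}$ in Situation 2.2 can be directly obtained by replacing $R$ with $R_{\tilde k}$ and $T^{(s)}(\tilde X^{(s)})$ with $T_{\tilde k}^{(s)}(\tilde X^{(s)})$ in Thm. 2.1.3.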

Theorem 2.2.2

In Situation 2.2, the UMVUE of $\sigma_k$ based on $s$ observations of $\tilde X^{(s)}$ is given by
\[
\sigma_k^{**(s)} = -\frac{R_k - 1}{T_k^{(s)}(\tilde X^{(s)})}, \qquad 1 \le k \le m. \tag{2.20}
\]

All of the properties of the MLE and the UMVUE in the previous section, such as the moments of the MLE, the dominance of the two estimators, etc., carry over to $\sigma_k^{*(s)}$ and $\sigma_k^{**(s)}$ by substituting $R_k$ for $R$ and $T_k^{(s)}(\tilde X^{(s)})$ for $T^{(s)}(\tilde X^{(s)})$.
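To make the relation between the two estimators concrete, the following sketch evaluates the componentwise moments from Thm. 2.2.1 (ii) and rescales them by the factor $(R_k - 1)/R_k$ that links the UMVUE (2.20) to the MLE. The function names `mle_moments` and `umvue_moments` and the numerical values are illustrative assumptions; the formulas themselves are those stated above and require $R_k > 2$.

```python
def mle_moments(R, sigma):
    """Mean and variance of one MLE component -R/T (Thm. 2.2.1 (ii)), for R > 2."""
    mean = R * sigma / (R - 1)
    var = (R * sigma) ** 2 / ((R - 1) ** 2 * (R - 2))
    return mean, var

def umvue_moments(R, sigma):
    """Mean and variance of -(R-1)/T, i.e. the MLE rescaled by (R-1)/R."""
    mean, var = mle_moments(R, sigma)
    factor = (R - 1) / R
    return factor * mean, factor ** 2 * var

R, sigma = 10, 2.0                       # illustrative values
mle_mean, mle_var = mle_moments(R, sigma)
umvue_mean, umvue_var = umvue_moments(R, sigma)
print(mle_mean, mle_var)                 # the MLE overestimates sigma on average
print(umvue_mean, umvue_var)             # the rescaled estimator is unbiased
```

The rescaling removes the bias and, since the factor is smaller than one, also reduces the variance, which simplifies to $\sigma_k^2/(R_k - 2)$.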

For the rest of this section, we focus on the asymptotic properties of the sequences $(\sigma^{*(s)})_{s\in\mathbb{N}}$ and $(\sigma^{**(s)})_{s\in\mathbb{N}}$, under the assumptions $\frac{s_k^{(s)}}{s} \to a_k > 0$ as $s \to \infty$, $1 \le k \le m$, and $\sum_{k=1}^{m} a_k = 1$, where $\sigma^{**(s)} = (\sigma_1^{**(s)}, \dots, \sigma_m^{**(s)})$.

Theorem 2.2.3

The sequence $(\sigma^{*(s)})_{s\in\mathbb{N}}$ of MLEs and the sequence $(\sigma^{**(s)})_{s\in\mathbb{N}}$ of UMVUEs are strongly consistent for estimating $\sigma$.

Proof. Since the elements in $\sigma^{*(s)}$ and $\sigma^{**(s)}$ are independent of each other, the statements follow from Thm. 2.1.5.

Remark 2.2.1

If $\varphi : \mathbb{R}^m_+ \to \Gamma$ is a bijective continuous function, the sequences $(\varphi(\sigma^{*(s)}))_{s\in\mathbb{N}}$ and $(\varphi(\sigma^{**(s)}))_{s\in\mathbb{N}}$


The concept of the asymptotic efficiency of a sequence of estimators from iid random variables has been introduced in Def. 2.1.3, and it can be extended to an independent but not necessarily identically distributed (inid) case in the following sense (Lehmann and Casella (1998, pp. 475-476)). Suppose that $X_{k1}, \dots, X_{ks_k}$, $k = 1, \dots, m$, are the random variables corresponding to the observations in the $k$th sample, and that $X_{k1}, \dots, X_{ks_k}$ are iid with density function $f_{k,\theta}$, where $\theta = (\theta_1, \dots, \theta_m)$. The variables $X_{11}, \dots, X_{1s_1}, \dots, X_{k1}, \dots, X_{ks_k}, \dots, X_{m1}, \dots, X_{ms_m}$ are assumed to be independently, but not necessarily identically distributed. The sample sizes $s_1, \dots, s_m$ with $s := \sum_{k=1}^{m} s_k$ satisfy $\frac{s_k}{s} \to a_k > 0$ as $s \to \infty$ and $\sum_{k=1}^{m} a_k = 1$. Then, a sequence $(\hat\theta^{(s)})_{s\in\mathbb{N}}$ of estimators of $\theta$ is said to be asymptotically efficient if
\[
\sqrt{s}\,(\hat\theta^{(s)} - \theta) \xrightarrow{d} N_m\Bigl(0, \Bigl(\sum_{k=1}^{m} a_k I_{f_k}(\theta)\Bigr)^{-1}\Bigr),
\]
where $I_{f_k}(\theta)$, $k = 1, \dots, m$, denote the Fisher information matrices corresponding to $f_{k,\theta}$.

In this extended definition of asymptotic efficiency, $m$ sets of random variables are needed, and the random variables in each set are iid. However, the random variables entering the MLE $\sigma^{*(s)}$ and the UMVUE $\sigma^{**(s)}$ may all be distributed differently from one another. It follows that when we discuss the asymptotic efficiency of the sequences $(\sigma^{*(s)})_{s\in\mathbb{N}}$ and $(\sigma^{**(s)})_{s\in\mathbb{N}}$, we have to impose additional conditions on the system parameters $r_{ki}$.

Theorem 2.2.4

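Suppose $r_{ki} = r_k$, $i = 1, \dots, s_k$, $k = 1, \dots, m$, $m \in \mathbb{N}$.

(i) The sequence $(\sigma^{*(s)})_{s\in\mathbb{N}}$ and the sequence $(\sigma^{**(s)})_{s\in\mathbb{N}}$ are asymptotically efficient for estimating $\sigma$, i.e.,
\[
\sqrt{s}\,(\sigma^{*(s)} - \sigma) \xrightarrow{d} N_m\bigl(0, (J(\sigma))^{-1}\bigr), \qquad \sqrt{s}\,(\sigma^{**(s)} - \sigma) \xrightarrow{d} N_m\bigl(0, (J(\sigma))^{-1}\bigr),
\]
where
\[
J(\sigma) = \sum_{k=1}^{m} a_k\,\mathrm{diag}\Bigl(0, \dots, 0, \frac{r_k}{\sigma_k^2}, 0, \dots, 0\Bigr) = \mathrm{diag}\Bigl(a_1\frac{r_1}{\sigma_1^2}, \dots, a_m\frac{r_m}{\sigma_m^2}\Bigr).
\]

(ii) If $\varphi : \mathbb{R}^m_+ \to \Gamma$ is a continuously differentiable function with $|D\varphi(\sigma)| \neq 0$ for all $\sigma \in \mathbb{R}^m_+$, the sequences $(\varphi(\sigma^{*(s)}))_{s\in\mathbb{N}}$ and $(\varphi(\sigma^{**(s)}))_{s\in\mathbb{N}}$ are asymptotically normally distributed, i.e.,
\[
\sqrt{s}\,\bigl(\varphi(\sigma^{*(s)}) - \varphi(\sigma)\bigr) \xrightarrow{d} N_m\bigl(0, D\varphi(\sigma)(J(\sigma))^{-1}D'\varphi(\sigma)\bigr)
\]
and
\[
\sqrt{s}\,\bigl(\varphi(\sigma^{**(s)}) - \varphi(\sigma)\bigr) \xrightarrow{d} N_m\bigl(0, D\varphi(\sigma)(J(\sigma))^{-1}D'\varphi(\sigma)\bigr).
\]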

Proof. (i) Analogously to the proof of Thm. 2.1.6, we obtain, for each $k = 1, \dots, m$, $s_k$ iid Gamma distributed random variables $-T_{k1}(X_{k1}), \dots, -T_{ks_k}(X_{ks_k})$ with scale parameter $\sigma_k$ and shape parameter $r_k$, where
\[
T_{ki}(x_{ki}) := \sum_{j=1}^{r_k}\alpha_{ki}^{(j)}(n_{ki} - j + 1)\bigl(g_{ki}(x_{ki}^{(j-1)}) - g_{ki}(x_{ki}^{(j)})\bigr), \qquad 1 \le i \le s_k,\ 1 \le k \le m.
\]
Also, the MLE of $\sigma$ in the distribution functions of the $-T_{ki}(X_{ki})$ can be rewritten as
\[
\hat\sigma^{(s)} = (\hat\sigma_1, \dots, \hat\sigma_m) = \Bigl(-\frac{r_1 s_1}{\sum_{i=1}^{s_1} T_{1i}(X_{1i})}, \dots, -\frac{r_m s_m}{\sum_{i=1}^{s_m} T_{mi}(X_{mi})}\Bigr).
\]
Applying Thm. 1 (iv) of Bradley and Gart (1962), we have
\[
\sqrt{s}\,(\sigma^{*(s)} - \sigma) = \sqrt{s}\,(\hat\sigma^{(s)} - \sigma) \xrightarrow{d} N_m\bigl(0, (J(\sigma))^{-1}\bigr),
\]
where
\[
J(\sigma) = -\sum_{k=1}^{m} a_k E\Bigl(\frac{\partial^2 \ln f^{-T_{k1}(X_{k1})}}{\partial\sigma_p\,\partial\sigma_q}\Bigr) = \sum_{k=1}^{m} a_k\,\mathrm{diag}\Bigl(0, \dots, 0, \frac{r_k}{\sigma_k^2}, 0, \dots, 0\Bigr) = \mathrm{diag}\Bigl(a_1\frac{r_1}{\sigma_1^2}, \dots, a_m\frac{r_m}{\sigma_m^2}\Bigr).
\]

The assertion for $(\sigma^{**(s)})_{s\in\mathbb{N}}$ follows from the asymptotic distribution of $(\sigma^{*(s)})_{s\in\mathbb{N}}$ in combination with the multivariate Slutsky's theorem (cf., e.g., La. 6.3 in Bilodeau and Brenner (1999, p. 78)).

(ii) can be proven analogously to the proof of the second part of La. 2.2.15 in Bedbur (2011,


Chapter 3 Model with unknown common location parameter

The basic model in this chapter is almost the same as the one in Section 2.2, which means s observations of m different systems (as in Situation 1.1) are available. The only difference is that the common location parameter µ in this chapter is assumed to be unknown. However, precisely because of this difference, we have to use a completely different approach to estimate the unknown parameters.

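We begin with the family of parametric probability measures. Let $X_{ki} = (X_{ki}^{(1)}, \dots, X_{ki}^{(r_{ki})})$, $1 \le i \le s_k$, $1 \le k \le m$, which takes values in the measurable space $(\mathbb{R}^{r_{ki}}_{\le}, \mathbb{R}^{r_{ki}}_{\le} \cap \mathcal{B}^{r_{ki}})$, be the SOSs of the observation $x_{ki}$, where
\[
\mathbb{R}^{r_{ki}}_{\le} := \bigl\{(x_{ki}^{(1)}, \dots, x_{ki}^{(r_{ki})}) \;\big|\; g_{ki}^{-1}(\mu) \le x_{ki}^{(1)} \le \dots \le x_{ki}^{(r_{ki})} < g_{ki}^{-1}(\infty)\bigr\}
\]
and $\mathbb{R}^{r_{ki}}_{\le} \cap \mathcal{B}^{r_{ki}}$ denotes the Borel sets of $\mathbb{R}^{r_{ki}}_{\le}$, $1 \le i \le s_k$, $1 \le k \le m$. Moreover, let $\tilde X^{(s)} := (X_{11}, \dots, X_{1s_1}, \dots, X_{k1}, \dots, X_{ks_k}, \dots, X_{m1}, \dots, X_{ms_m})$, and let
\[
\mathcal{P}_{mix}^{\tilde X^{(s)}} := \bigl\{P_{\mu,\sigma}^{\tilde X^{(s)}} = f_{\mu,\sigma}^{\tilde X^{(s)}} \otimes_{k=1}^{m}\otimes_{i=1}^{s_k}\lambda^{r_{ki}}|_{\mathbb{R}^{r_{ki}}_{\le}} : \mu \in \mathbb{R},\ \sigma \in \mathbb{R}^m_+\bigr\},
\]
where $\lambda^{r_{ki}}$, $1 \le i \le s_k$, $1 \le k \le m$, denotes the Lebesgue measure on $(\mathbb{R}^{r_{ki}}, \mathcal{B}^{r_{ki}})$ and $\cdot|_B$ denotes the restriction of a measure to a measurable subset $B \in \mathcal{B}^{r_{ki}}$. The density function $f_{\mu,\sigma}^{\tilde X^{(s)}}$ is given by
\[
\begin{aligned}
f_{\mu,\sigma}^{\tilde X^{(s)}}(\tilde x^{(s)}) ={}& \Bigl(\prod_{k=1}^{m}\sigma_k^{\sum_{i=1}^{s_k} r_{ki}}\Bigr)\exp\Bigl(\sum_{k=1}^{m}\sigma_k\sum_{i=1}^{s_k}\sum_{j=1}^{r_{ki}}\alpha_{ki}^{(j)}(n_{ki} - j + 1)\bigl(g_{ki}(x_{ki}^{(j-1)}) - g_{ki}(x_{ki}^{(j)})\bigr)\Bigr) \\
&\times \exp\Bigl(\sum_{k=1}^{m}\sigma_k\sum_{i=1}^{s_k}\alpha_{ki}^{(1)} n_{ki}\,\mu\Bigr)\prod_{k=1}^{m}\prod_{i=1}^{s_k}\Bigl(\Bigl(\prod_{j=1}^{r_{ki}}\alpha_{ki}^{(j)} g_{ki}'(x_{ki}^{(j)})\Bigr)\frac{n_{ki}!}{(n_{ki} - r_{ki})!}\Bigr),
\end{aligned} \tag{3.1}
\]
for $x_{ki} \in \mathbb{R}^{r_{ki}}_{\le}$, with $g_{ki}(x_{ki}^{(0)}) := 0$, $1 \le i \le s_k$, $1 \le k \le m$.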

In this chapter, we will estimate the common location parameter $\mu$ and the vector $\sigma = (\sigma_1, \dots, \sigma_m)$ of scale parameters, given that the model parameters $\alpha_{11}, \dots, \alpha_{ms_m}$ and the functions $g_{11}, \dots, g_{ms_m}$ are known. Since the location parameter $\mu$ is unknown, $\mathcal{P}_{mix}^{\tilde X^{(s)}}$ is not an exponential family any more, and we must compute all of the estimators and their properties using basic methods. For brevity, we will use the following notation:

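- $b_{ki} := \alpha_{ki}^{(1)} n_{ki}$, $i = 1, \dots, s_k$, $k = 1, \dots, m$,
- $b_k := \sum_{i=1}^{s_k} b_{ki}$, $R_k := \sum_{i=1}^{s_k} r_{ki}$,
- $a := \sum_{k=1}^{m} b_k\sigma_k$,
- and $h^{(s)}(\tilde x^{(s)}) := \prod_{k=1}^{m}\prod_{i=1}^{s_k}\Bigl(\Bigl(\prod_{j=1}^{r_{ki}}\alpha_{ki}^{(j)} g_{ki}'(x_{ki}^{(j)})\Bigr)\frac{n_{ki}!}{(n_{ki} - r_{ki})!}\Bigr)$.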

In this chapter, we first calculate the MLE and the UMVUE of $\mu$ and $\sigma = (\sigma_1, \dots, \sigma_m)$ in (3.1), and thereafter a modified MLE (MMLE) of $\mu$. Moreover, their (asymptotic) properties will be studied. At the end of this chapter, we introduce a new class of general estimators of $\mu$, which dominate the MLE of $\mu$ in terms of risk under a convex loss function and also in terms of Pitman closeness.

3.1 MLE of the common location parameter and the scale parameters

Theorem 3.1.1

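The joint MLE $(\mu^{*(s)}, \sigma^{*(s)})$ of $(\mu, \sigma)$ with $\sigma = (\sigma_1, \dots, \sigma_m)$ is given by
\[
\mu^{*(s)} = \min_{\substack{i = 1, \dots, s_k \\ k = 1, \dots, m}}\bigl\{g_{ki}(X_{ki}^{(1)})\bigr\}
\]
and
\[
\sigma^{*(s)} = \Bigl(-\frac{R_1}{\tilde T_1^{(s)}(\tilde X^{(s)})}, \dots, -\frac{R_m}{\tilde T_m^{(s)}(\tilde X^{(s)})}\Bigr),
\]
where
\[
\tilde T_k^{(s)}(\tilde X^{(s)}) = \sum_{i=1}^{s_k}\sum_{j=1}^{r_{ki}}\alpha_{ki}^{(j)}(n_{ki} - j + 1)\bigl(g_{ki}(X_{ki}^{(j-1)}) - g_{ki}(X_{ki}^{(j)})\bigr) + b_k\mu^{*(s)}, \qquad 1 \le k \le m.
\]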

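Proof. The likelihood function of $\mu$ and $\sigma$ is given by
\[
\begin{aligned}
L^{(s)}(\mu, \sigma) ={}& \Bigl(\prod_{k=1}^{m}\sigma_k^{R_k}\Bigr)\exp\Bigl(\sum_{k=1}^{m}\sigma_k\sum_{i=1}^{s_k}\sum_{j=1}^{r_{ki}}\alpha_{ki}^{(j)}(n_{ki} - j + 1)\bigl(g_{ki}(x_{ki}^{(j-1)}) - g_{ki}(x_{ki}^{(j)})\bigr)\Bigr) \\
&\times \exp(a\mu)\,h^{(s)}(\tilde x^{(s)})\prod_{k=1}^{m}\prod_{i=1}^{s_k}\mathbf{1}_{[g_{ki}^{-1}(\mu) \le x_{ki}^{(1)} \le \dots \le x_{ki}^{(r_{ki})} < g_{ki}^{-1}(\infty)]}.
\end{aligned} \tag{3.2}
\]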

If there exist a $k$ and an $i$ with $\mu > g_{ki}(x_{ki}^{(1)})$, then the likelihood function $L^{(s)}(\mu, \sigma)$ equals 0. For $\mu \le g_{ki}(x_{ki}^{(1)})$, $i = 1, \dots, s_k$, $k = 1, \dots, m$, the likelihood function is given by
\[
L^{(s)}(\mu, \sigma) = \tilde h_\sigma^{(s)}(\tilde x^{(s)})\exp(a\mu),
\]
where
\[
\tilde h_\sigma^{(s)}(\tilde x^{(s)}) = h^{(s)}(\tilde x^{(s)})\prod_{k=1}^{m}\sigma_k^{R_k}\exp\Bigl(\sum_{k=1}^{m}\sigma_k\Bigl(\sum_{i=1}^{s_k}\sum_{j=1}^{r_{ki}}\alpha_{ki}^{(j)}(n_{ki} - j + 1)\bigl(g_{ki}(x_{ki}^{(j-1)}) - g_{ki}(x_{ki}^{(j)})\bigr)\Bigr)\Bigr).
\]

For fixed $\sigma$, $L^{(s)}(\mu, \sigma)$ is a non-negative and non-decreasing function in $\mu$ for $\mu \le \min_{i,k}\{g_{ki}(x_{ki}^{(1)})\}$. Thus, the function $L^{(s)}(\mu, \sigma)$ is maximized for $\mu = \min_{i = 1, \dots, s_k;\ k = 1, \dots, m}\{g_{ki}(x_{ki}^{(1)})\}$ with any arbitrary $\sigma$.

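Since $\mu^{*(s)}$ does not depend on $\sigma$, $\sigma^{*(s)}$ is a value of $\sigma$ that maximizes $L^{(s)}(\mu^{*(s)}, \sigma)$. By substituting $\mu = \mu^{*(s)}$ into the logarithm of the likelihood function (3.2) and using $\ln x \le x - 1$, we have
\[
\begin{aligned}
l(\mu^{*(s)}, \sigma) &= \sum_{k=1}^{m} R_k\ln(\sigma_k) + \sum_{k=1}^{m}\sigma_k\tilde T_k(\tilde x^{(s)}) + \ln\bigl(h^{(s)}(\tilde x^{(s)})\bigr) \\
&= \sum_{k=1}^{m} R_k\ln\Bigl(\frac{\sigma_k(-\tilde T_k(\tilde x^{(s)}))}{R_k}\Bigr) + \sum_{k=1}^{m}\sigma_k\tilde T_k(\tilde x^{(s)}) + \ln\bigl(h^{(s)}(\tilde x^{(s)})\bigr) + \sum_{k=1}^{m} R_k\ln\Bigl(\frac{R_k}{-\tilde T_k(\tilde x^{(s)})}\Bigr) \\
&= \sum_{k=1}^{m} R_k\ln\Bigl(\frac{\sigma_k(-\tilde T_k(\tilde x^{(s)}))}{R_k}\Bigr) + \sum_{k=1}^{m}\sigma_k\tilde T_k(\tilde x^{(s)}) + \breve h^{(s)}(\tilde x^{(s)}) \\
&\le \sum_{k=1}^{m} R_k\Bigl(\frac{\sigma_k(-\tilde T_k(\tilde x^{(s)}))}{R_k} - 1\Bigr) + \sum_{k=1}^{m}\sigma_k\tilde T_k(\tilde x^{(s)}) + \breve h^{(s)}(\tilde x^{(s)}) \\
&= \breve h^{(s)}(\tilde x^{(s)}) - \sum_{k=1}^{m} R_k,
\end{aligned}
\]
where $\breve h^{(s)}(\tilde x^{(s)}) = \ln\bigl(h^{(s)}(\tilde x^{(s)})\bigr) + \sum_{k=1}^{m} R_k\ln\Bigl(\frac{R_k}{-\tilde T_k(\tilde x^{(s)})}\Bigr)$, and equality holds iff $\frac{\sigma_k(-\tilde T_k(\tilde x^{(s)}))}{R_k} = 1$, $k = 1, \dots, m$.

This implies that the likelihood function $L^{(s)}(\mu^{*(s)}, \sigma)$ attains its maximum at $\sigma_k = -\frac{R_k}{\tilde T_k(\tilde x^{(s)})}$, $k = 1, \dots, m$.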
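The closed forms of Thm. 3.1.1 translate directly into a short computation. The sketch below is a minimal illustration, not the author's implementation: the data layout (a list of systems, each a list of samples stored as dictionaries with keys `x`, `n`, `alpha`) and the function name `joint_mle` are assumptions made for this example, and the baseline functions $g_{ki}$ are taken to be a single common function `g` (the identity by default).

```python
def joint_mle(systems, g=lambda x: x):
    """Joint MLE (mu_star, sigma_star) following Thm. 3.1.1.

    systems[k] is a list of samples for system k; each sample is a dict with
      "x":     the r observed ordered failure times x^(1) <= ... <= x^(r),
      "n":     the number of components n_ki,
      "alpha": the model parameters alpha^(1), ..., alpha^(r).
    Assumption: all g_ki equal the single function g.
    """
    mu_star = min(g(sample["x"][0]) for samples in systems for sample in samples)
    sigma_star = []
    for samples in systems:
        T_tilde, R_k = 0.0, 0
        for sample in samples:
            x, n, alpha = sample["x"], sample["n"], sample["alpha"]
            prev = 0.0                                  # g(x^(0)) := 0 as in (3.1)
            for j, (xj, aj) in enumerate(zip(x, alpha), start=1):
                T_tilde += aj * (n - j + 1) * (prev - g(xj))
                prev = g(xj)
            T_tilde += alpha[0] * n * mu_star           # adds b_ki * mu_star
            R_k += len(x)
        # Undefined (division by zero) if every first failure time equals mu_star
        # and no further failures are observed in this system.
        sigma_star.append(-R_k / T_tilde)
    return mu_star, sigma_star

# Two uncensored exponential samples (r = n = 1, alpha = 1), illustrative data.
uncensored = [
    [{"x": [1.0], "n": 1, "alpha": [1.0]}, {"x": [3.0], "n": 1, "alpha": [1.0]}],
    [{"x": [2.0], "n": 1, "alpha": [1.0]}, {"x": [4.0], "n": 1, "alpha": [1.0]}],
]
mu_u, sigma_u = joint_mle(uncensored)

# One type II censored sample: n = 3 components, r = 2 observed failures.
censored = [[{"x": [1.0, 2.0], "n": 3, "alpha": [1.0, 1.0]}]]
mu_c, sigma_c = joint_mle(censored)
```

In the uncensored case, $1/\sigma_k^{*(s)}$ reduces to the sample mean of system $k$ minus $\mu^{*(s)}$; in the censored case it reduces to the total time on test, measured from $\mu^{*(s)}$, divided by the number of observed failures.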

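Now, we derive the density functions of the stated MLEs.

Theorem 3.1.2

For $r_{ki} \ge 2$, $i = 1, \dots, s_k$, $k = 1, \dots, m$, we obtain:

(i) $\mu^{*(s)}$ is exponentially distributed with location parameter $\mu$ and scale parameter $a$, i.e.,
\[
f^{\mu^{*(s)}}(z) = a\exp\bigl(-a(z - \mu)\bigr), \qquad z \ge \mu.
\]

(ii) The density function of $\sigma^{*(s)} = (\sigma_1^{*(s)}, \dots, \sigma_m^{*(s)})$ is given by
\[
f^{(\sigma_1^{*(s)}, \dots, \sigma_m^{*(s)})}(y_1, \dots, y_m) = \frac{1}{a}\Bigl(\prod_{k=1}^{m}\frac{\sigma_k^{R_k}}{\Gamma(R_k + 1)}\Bigl(\frac{R_k}{y_k}\Bigr)^{R_k + 1}\Bigr)\exp\Bigl(-\sum_{k=1}^{m}\frac{\sigma_k R_k}{y_k}\Bigr)\Bigl(\sum_{k=1}^{m}\Bigl(1 - \frac{1}{R_k}\Bigr)b_k y_k\Bigr), \qquad (y_1, \dots, y_m) \in \mathbb{R}^m_+.
\]

(iii) $\mu^{*(s)}$ and $\sigma^{*(s)}$ are independent.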

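Proof. (i) Considering La. 2.4 of Cramer and Kamps (2003) and the assumptions of the model in this chapter, the marginal distribution of $X_{ki}^{(1)}$, $i = 1, \dots, s_k$, $k = 1, \dots, m$, is given by
\[
X_{ki}^{(1)} \sim 1 - (1 - F_{ki})^{\alpha_{ki}^{(1)} n_{ki}}.
\]
This means that the distribution function of $X_{ki}^{(1)}$ has the following form:
\[
F^{X_{ki}^{(1)}}(x) = 1 - \exp\bigl(-\sigma_k\alpha_{ki}^{(1)} n_{ki}(g_{ki}(x) - \mu)\bigr).
\]
Then, the distribution function of $g_{ki}(X_{ki}^{(1)})$ is given by
\[
F^{g_{ki}(X_{ki}^{(1)})}(x) = 1 - \exp\bigl(-\sigma_k\alpha_{ki}^{(1)} n_{ki}(x - \mu)\bigr), \qquad x \ge \mu.
\]

Since the minimum of independent exponentially distributed random variables with common location parameter $\mu$ is again exponentially distributed with location parameter $\mu$, its scale parameter being the sum of the individual scale parameters, we have
\[
f^{\mu^{*(s)}}(z) = \Bigl(\sum_{k=1}^{m}\sigma_k b_k\Bigr)\exp\Bigl(-\Bigl(\sum_{k=1}^{m}\sigma_k b_k\Bigr)(z - \mu)\Bigr), \qquad z \ge \mu. \tag{3.3}
\]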
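The distribution in (3.3) can be checked with a small simulation. In the sketch below, each $g_{ki}(X_{ki}^{(1)})$ is drawn as $\mu$ plus an exponential variable with rate $b_{ki}\sigma_k$, and the sample mean of the minimum is compared with $\mu + 1/a$. The function name `simulate_mu_star`, the seed and the rate values are illustrative assumptions, not quantities from the text.

```python
import random

def simulate_mu_star(mu, rates, rng):
    """One draw of the minimum over all g_ki(X_ki^(1)).

    Each term is mu + Exp(rate), where rate stands for b_ki * sigma_k.
    """
    return min(mu + rng.expovariate(rate) for rate in rates)

rng = random.Random(2)                 # arbitrary fixed seed
mu = 1.0                               # illustrative location parameter
rates = [0.5, 1.0, 1.5]                # illustrative values of b_ki * sigma_k
a = sum(rates)                         # a = sum_k b_k sigma_k
draws = [simulate_mu_star(mu, rates, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
print(sample_mean, mu + 1.0 / a)       # empirical mean vs. mu + 1/a
```

The two printed values should agree up to Monte Carlo error, and no simulated value falls below `mu`, matching the support $z \ge \mu$.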

(ii) To prove the assertions of (ii) and (iii), we additionally need the joint density function of $(\mu^{*(s)}, \sigma^{*(s)})$ and the marginal density function of $\sigma^{*(s)}$. First, we derive the joint density function with the help of the joint density function of $(Q_{11}, \dots, Q_{1s_1}, \dots, Q_{m1}, \dots, Q_{ms_m}, \mu^{*(s)})$, where
\[
Q_{ki} := \sum_{j=1}^{r_{ki}}\alpha_{ki}^{(j)}(n_{ki} - j + 1)\bigl(g_{ki}(X_{ki}^{(j)}) - g_{ki}(X_{ki}^{(j-1)})\bigr), \qquad g_{ki}(X_{ki}^{(0)}) := 0,
\]
$i = 1, \dots, s_k$, $k = 1, \dots, m$. For $\mu \le z < \frac{q_{ki}}{\alpha_{ki}^{(1)} n_{ki}}$, $i = 1, \dots, s_k$, $k = 1, \dots, m$, we have
\[
\begin{aligned}
&P(Q_{ki} \le q_{ki},\ \mu^{*(s)} > z,\ i = 1, \dots, s_k,\ k = 1, \dots, m) \\
&= P\Bigl(\sum_{j=1}^{r_{ki}}\alpha_{ki}^{(j)}(n_{ki} - j + 1)\bigl(g_{ki}(X_{ki}^{(j)}) - g_{ki}(X_{ki}^{(j-1)})\bigr) \le q_{ki},\ g_{ki}(X_{ki}^{(1)}) > z,\ i = 1, \dots, s_k,\ k = 1, \dots, m\Bigr) \\
&= P\Bigl(\sum_{j=2}^{r_{ki}}\alpha_{ki}^{(j)}(n_{ki} - j + 1)\bigl(g_{ki}(X_{ki}^{(j)}) - g_{ki}(X_{ki}^{(j-1)})\bigr) \le q_{ki} - \alpha_{ki}^{(1)} n_{ki} g_{ki}(X_{ki}^{(1)}),\ \alpha_{ki}^{(1)} n_{ki} g_{ki}(X_{ki}^{(1)}) > \alpha_{ki}^{(1)} n_{ki} z,\ i = 1, \dots, s_k,\ k = 1, \dots, m\Bigr) \\
&= \prod_{k=1}^{m}\prod_{i=1}^{s_k}\int_{\alpha_{ki}^{(1)} n_{ki} z}^{\infty}\int_{0}^{q_{ki} - z_{ki}}\frac{\sigma_k^{r_{ki} - 1}x^{r_{ki} - 2}}{\Gamma(r_{ki} - 1)}\exp(-\sigma_k x)\,dx\,dP^{Z_{ki}}(z_{ki}) \qquad (3.4) \\
&=: I(q_{11}, \dots, q_{ms_m}, z).
\end{aligned}
\]

(3.4) holds because $\sum_{j=2}^{r_{ki}}\alpha_{ki}^{(j)}(n_{ki} - j + 1)\bigl(g_{ki}(X_{ki}^{(j)}) - g_{ki}(X_{ki}^{(j-1)})\bigr)$ is gamma distributed with scale parameter $\sigma_k$ and shape parameter $r_{ki} - 1$, which follows from the arguments in the proof of Thm. 2.1.5. In (i) we have proven that $g_{ki}(X_{ki}^{(1)})$, $i = 1, \dots, s_k$, $k = 1, \dots, m$, is exponentially distributed with location parameter $\mu$ and scale parameter $b_{ki}\sigma_k$. Then, it is obvious that the linear transformation $Z_{ki} := b_{ki}g_{ki}(X_{ki}^{(1)})$ is exponentially distributed with location parameter $b_{ki}\mu$ and scale parameter $\sigma_k$.

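Thus, the partial derivative of $I(q_{11}, \dots, q_{ms_m}, z)$ w.r.t. $q_{11}, \dots, q_{ms_m}$, provided that $b_{ki}\mu \le z_{ki} < q_{ki}$ and $r_{ki} \ge 2$, $i = 1, \dots, s_k$, $k = 1, \dots, m$, is given by
\[
\begin{aligned}
\frac{\partial I(q_{11}, \dots, q_{ms_m}, z)}{\partial q_{11}\cdots\partial q_{1s_1}\cdots\partial q_{m1}\cdots\partial q_{ms_m}}
&= \prod_{k=1}^{m}\prod_{i=1}^{s_k}\int_{b_{ki}z}^{\infty}\frac{\sigma_k^{r_{ki} - 1}(q_{ki} - z_{ki})^{r_{ki} - 2}}{\Gamma(r_{ki} - 1)}\exp\bigl(-\sigma_k(q_{ki} - z_{ki})\bigr)\,\sigma_k\exp\bigl(-\sigma_k(z_{ki} - b_{ki}\mu)\bigr)\,dz_{ki} \\
&= \prod_{k=1}^{m}\prod_{i=1}^{s_k}\frac{\sigma_k^{r_{ki}}}{(r_{ki} - 2)!}\exp\bigl(-\sigma_k(q_{ki} - b_{ki}\mu)\bigr)\int_{b_{ki}z}^{q_{ki}}(q_{ki} - z_{ki})^{r_{ki} - 2}\,dz_{ki} \\
&= \prod_{k=1}^{m}\prod_{i=1}^{s_k}\frac{\sigma_k^{r_{ki}}}{(r_{ki} - 2)!}\exp\bigl(-\sigma_k(q_{ki} - b_{ki}\mu)\bigr)\,\frac{-1}{r_{ki} - 1}\,(q_{ki} - z_{ki})^{r_{ki} - 1}\Big|_{z_{ki} = b_{ki}z}^{q_{ki}} \\
&= \prod_{k=1}^{m}\prod_{i=1}^{s_k}\frac{\sigma_k^{r_{ki}}}{(r_{ki} - 2)!}\exp\bigl(-\sigma_k(q_{ki} - b_{ki}\mu)\bigr)\,\frac{1}{r_{ki} - 1}\,(q_{ki} - b_{ki}z)^{r_{ki} - 1} \\
&=: I(z).
\end{aligned}
\]
Since
\[
\frac{\partial P(Q_{ki} \le q_{ki},\ \mu^{*(s)} \le z,\ k = 1, \dots, m,\ i = 1, \dots, s_k)}{\partial q_{11}\cdots\partial q_{ms_m}\,\partial z}
= \frac{\partial P(Q_{ki} \le q_{ki},\ k = 1, \dots, m,\ i = 1, \dots, s_k)}{\partial q_{11}\cdots\partial q_{ms_m}\,\partial z}
- \frac{\partial P(Q_{ki} \le q_{ki},\ \mu^{*(s)} > z,\ k = 1, \dots, m,\ i = 1, \dots, s_k)}{\partial q_{11}\cdots\partial q_{ms_m}\,\partial z}
\]
and the first part of the right-hand side of the above equation is 0, we obtain the joint density function of $(Q_{11}, \dots, Q_{ms_m}, \mu^{*(s)})$,
\[
f^{(Q_{11}, \dots, Q_{ms_m}, \mu^{*(s)})}(q_{11}, \dots, q_{ms_m}, z) = -\frac{\partial I(z)}{\partial z}
= c\exp\Bigl(-\sum_{k=1}^{m}\sigma_k\sum_{i=1}^{s_k}(q_{ki} - b_{ki}\mu)\Bigr)\Bigl(\sum_{k=1}^{m}\sum_{i=1}^{s_k}\frac{(r_{ki} - 1)b_{ki}}{q_{ki} - b_{ki}z}\Bigr)\Bigl(\prod_{k=1}^{m}\prod_{i=1}^{s_k}(q_{ki} - b_{ki}z)^{r_{ki} - 1}\Bigr),
\]
where $c := \prod_{k=1}^{m}\prod_{i=1}^{s_k}\frac{\sigma_k^{r_{ki}}}{(r_{ki} - 1)!}$, $\mu \le z < \frac{q_{ki}}{b_{ki}}$ and $r_{ki} \ge 2$, $i = 1, \dots, s_k$, $k = 1, \dots, m$.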

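In the following part, we will deduce the joint density function of $(-\tilde T_1, \dots, -\tilde T_m, \mu^{*(s)})$ by

$k = 1, \dots, m$, $Y_{ks_k} := \sum_{i=1}^{s_k} Q_{ki} - b_k\mu^{*(s)}$, $k = 1, \dots, m$, and $Y_{s+1} := \mu^{*(s)}$, i.e.,
\[
Y = \Bigl(Q_{11}, \dots, Q_{1(s_1 - 1)}, \sum_{i=1}^{s_1} Q_{1i} - b_1\mu^{*(s)}, \dots, Q_{m1}, \dots, Q_{m(s_m - 1)}, \sum_{i=1}^{s_m} Q_{mi} - b_m\mu^{*(s)}, \mu^{*(s)}\Bigr).
\]
Comparing the definition of $\tilde T_k(\tilde X^{(s)})$, $k = 1, \dots, m$, in Thm. 3.1.1 with $Y$, we see that $\tilde T_k(\tilde X^{(s)}) = -Y_{ks_k}$. The joint density function of $Y$ can be written as
\[
\begin{aligned}
f^{Y}(y_{11}, \dots, y_{1s_1}, \dots, y_{m1}, \dots, y_{ms_m}, y_{s+1})
= c\exp\Bigl(-\sum_{k=1}^{m}\sigma_k\bigl(y_{ks_k} + b_k(y_{s+1} - \mu)\bigr)\Bigr)\Biggl\{
&\sum_{k=1}^{m}\sum_{i=1}^{s_k - 1}\Biggl[(r_{ki} - 1)b_{ki}(y_{ki} - b_{ki}y_{s+1})^{r_{ki} - 2}\prod_{\substack{l \in \{1, \dots, m\} \\ p \in \{1, \dots, s_l - 1\} \\ (l,p) \neq (k,i)}}(y_{lp} - b_{lp}y_{s+1})^{r_{lp} - 1} \\
&\qquad\times\prod_{l=1}^{m}\Bigl(y_{ls_l} - \sum_{p=1}^{s_l - 1}y_{lp} + \sum_{p=1}^{s_l}b_{lp}y_{s+1} - b_{ls_l}y_{s+1}\Bigr)^{r_{ls_l} - 1}\Biggr] \\
&+ \sum_{k=1}^{m}\Biggl[(r_{ks_k} - 1)b_{ks_k}\Bigl(y_{ks_k} - \sum_{i=1}^{s_k - 1}y_{ki} + \sum_{i=1}^{s_k}b_{ki}y_{s+1} - b_{ks_k}y_{s+1}\Bigr)^{r_{ks_k} - 2}\prod_{\substack{l \in \{1, \dots, m\} \\ p \in \{1, \dots, s_l - 1\}}}(y_{lp} - b_{lp}y_{s+1})^{r_{lp} - 1} \\
&\qquad\times\prod_{\substack{l \in \{1, \dots, m\} \\ l \neq k}}\Bigl(y_{ls_l} - \sum_{p=1}^{s_l - 1}y_{lp} + \sum_{p=1}^{s_l}b_{lp}y_{s+1} - b_{ls_l}y_{s+1}\Bigr)^{r_{ls_l} - 1}\Biggr]\Biggr\},
\end{aligned}
\]
where
\[
y \in A := \Bigl\{y_{s+1} \ge \mu,\ y_{ks_k} > 0,\ b_{k(s_k - 1)}y_{s+1} < y_{k(s_k - 1)} < y_{ks_k} + b_{k(s_k - 1)}y_{s+1},\ k = 1, \dots, m,\ b_{ki}y_{s+1} < y_{ki} < y_{ks_k} + \sum_{l=i}^{s_k - 1}b_{kl}y_{s+1} - \sum_{l=i+1}^{s_k - 1}y_{kl},\ i = 1, \dots, s_k - 2,\ k = 1, \dots, m\Bigr\}
\]
and $r_{ki} \ge 2$, $i = 1, \dots, s_k$, $k = 1, \dots, m$.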

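Next, we integrate $f^{Y}(y)$ w.r.t. $y_{ki}$, $i = 1, \dots, s_k - 1$, $k = 1, \dots, m$, which gives the joint density function of $(-\tilde T_1, \dots, -\tilde T_m, \mu^{*(s)})$ for $(y_{1s_1}, \dots, y_{ms_m}, y_{s+1}) \in B := \{y_{s+1} \ge \mu,\ y_{ks_k} > 0,\ k = 1, \dots, m\}$:
\[
\begin{aligned}
&f^{(-\tilde T_1, \dots, -\tilde T_m, \mu^{*(s)})}(y_{1s_1}, \dots, y_{ms_m}, y_{s+1}) \\
&= \underbrace{\int\cdots\int_{A \setminus B}}_{s - m\ \text{times}} f^{Y}(y_{11}, \dots, y_{1s_1}, \dots, y_{m1}, \dots, y_{ms_m}, y_{s+1})\,dy_{11}\cdots dy_{1(s_1 - 1)}\cdots dy_{m1}\cdots dy_{m(s_m - 1)} \\
&= c\exp\Bigl(-\sum_{k=1}^{m}\sigma_k\Bigl(y_{ks_k} + \sum_{i=1}^{s_k}b_{ki}(y_{s+1} - \mu)\Bigr)\Bigr) \\
&\quad\times\Biggl\{\sum_{k=1}^{m}\sum_{i=1}^{s_k - 1}\Biggl[(r_{ki} - 1)b_{ki}\underbrace{\int\cdots\int_{D}}_{s - m\ \text{times}}\tilde y_{ki}^{\,r_{ki} - 2}\Bigl(\prod_{\substack{l \in \{1, \dots, m\} \\ p \in \{1, \dots, s_l - 1\} \\ (l,p) \neq (k,i)}}\tilde y_{lp}^{\,r_{lp} - 1}\Bigr)\prod_{l=1}^{m}\Bigl(y_{ls_l} - \sum_{p=1}^{s_l - 1}\tilde y_{lp}\Bigr)^{r_{ls_l} - 1}d_{11\ldots m(s_m - 1)}\Biggr] \\
&\qquad+ \sum_{k=1}^{m}\Biggl[(r_{ks_k} - 1)b_{ks_k}\underbrace{\int\cdots\int_{D}}_{s - m\ \text{times}}\Bigl(y_{ks_k} - \sum_{i=1}^{s_k - 1}\tilde y_{ki}\Bigr)^{r_{ks_k} - 2}\Bigl(\prod_{\substack{l \in \{1, \dots, m\} \\ p \in \{1, \dots, s_l - 1\}}}\tilde y_{lp}^{\,r_{lp} - 1}\Bigr)\prod_{\substack{l \in \{1, \dots, m\} \\ l \neq k}}\Bigl(y_{ls_l} - \sum_{p=1}^{s_l - 1}\tilde y_{lp}\Bigr)^{r_{ls_l} - 1}d_{11\ldots m(s_m - 1)}\Biggr]\Biggr\},
\end{aligned}
\]
where $d_{11\ldots m(s_m - 1)} := d\tilde y_{11}\cdots d\tilde y_{1(s_1 - 1)}\cdots d\tilde y_{m1}\cdots d\tilde y_{m(s_m - 1)}$, $\tilde y_{ki} := y_{ki} - b_{ki}y_{s+1}$, $i = 1, \dots, s_k - 1$, $k = 1, \dots, m$, and
\[
D = \Bigl\{0 < \tilde y_{ki} < y_{ks_k} - \sum_{l=i+1}^{s_k - 1}\tilde y_{kl},\ i = 1, \dots, s_k - 1,\ k = 1, \dots, m\Bigr\}.
\]
Let $D_{ki} := \bigl\{0 < \tilde y_{ki} < y_{ks_k} - \sum_{l=i+1}^{s_k - 1}\tilde y_{kl}\bigr\}$, $1 \le i \le s_k - 1$, $k = 1, \dots, m$. Then $D = D_{11} \cap \dots \cap D_{1(s_1 - 1)} \cap \dots \cap D_{m1} \cap \dots \cap D_{m(s_m - 1)}$. Furthermore, let $d_{ki\ldots k(s_k - 1)}$ denote $d\tilde y_{ki}\,d\tilde y_{k(i+1)}\cdots d\tilde y_{k(s_k - 1)}$, $1 \le i \le s_k - 1$, $1 \le k \le m$. Then, we have
\[
\begin{aligned}
&\int_{D_{l(s_l - 1)}}\cdots\int_{D_{l1}}\Bigl(\prod_{p=1}^{s_l - 1}\tilde y_{lp}^{\,r_{lp} - 1}\Bigr)\Bigl(y_{ls_l} - \sum_{p=1}^{s_l - 1}\tilde y_{lp}\Bigr)^{r_{ls_l} - 1}d_{l1\ldots l(s_l - 1)} \\
&= \int_{D_{l(s_l - 1)}}\cdots\int_{D_{l2}}\frac{\Gamma(r_{l1})\Gamma(r_{ls_l})}{\Gamma(r_{l1} + r_{ls_l})}\Bigl(\prod_{p=2}^{s_l - 1}\tilde y_{lp}^{\,r_{lp} - 1}\Bigr)\Bigl(y_{ls_l} - \sum_{p=2}^{s_l - 1}\tilde y_{lp}\Bigr)^{r_{l1} + r_{ls_l} - 1}d_{l2\ldots l(s_l - 1)} \\
&= \int_{D_{l(s_l - 1)}}\cdots\int_{D_{l3}}\frac{\Gamma(r_{l1})\Gamma(r_{ls_l})}{\Gamma(r_{l1} + r_{ls_l})}\,\frac{\Gamma(r_{l2})\Gamma(r_{l1} + r_{ls_l})}{\Gamma(r_{l1} + r_{l2} + r_{ls_l})}\Bigl(\prod_{p=3}^{s_l - 1}\tilde y_{lp}^{\,r_{lp} - 1}\Bigr)\Bigl(y_{ls_l} - \sum_{p=3}^{s_l - 1}\tilde y_{lp}\Bigr)^{r_{l1} + r_{l2} + r_{ls_l} - 1}d_{l3\ldots l(s_l - 1)} \\
&= \cdots = \int_{D_{l(s_l - 1)}}\frac{\Gamma(r_{ls_l})\prod_{p=1}^{s_l - 2}\Gamma(r_{lp})}{\Gamma\bigl(\sum_{p=1}^{s_l - 2}r_{lp} + r_{ls_l}\bigr)}\,\tilde y_{l(s_l - 1)}^{\,r_{l(s_l - 1)} - 1}\bigl(y_{ls_l} - \tilde y_{l(s_l - 1)}\bigr)^{\sum_{p=1}^{s_l - 2}r_{lp} + r_{ls_l} - 1}d\tilde y_{l(s_l - 1)} \\
&= \frac{\prod_{p=1}^{s_l}\Gamma(r_{lp})}{\Gamma\bigl(\sum_{p=1}^{s_l}r_{lp}\bigr)}\,y_{ls_l}^{\sum_{p=1}^{s_l}r_{lp} - 1} = \frac{\prod_{p=1}^{s_l}\Gamma(r_{lp})}{\Gamma(R_l)}\,y_{ls_l}^{R_l - 1}, \qquad l = 1, \dots, m.
\end{aligned}
\]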

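Analogously, the following integral can be calculated as
\[
\int_{D_{k(s_k - 1)}}\cdots\int_{D_{k1}}\tilde y_{ki}^{\,r_{ki} - 2}\Bigl(\prod_{\substack{p=1 \\ p \neq i}}^{s_k - 1}\tilde y_{kp}^{\,r_{kp} - 1}\Bigr)\Bigl(y_{ks_k} - \sum_{p=1}^{s_k - 1}\tilde y_{kp}\Bigr)^{r_{ks_k} - 1}d_{k1\ldots k(s_k - 1)} = \frac{R_k - 1}{r_{ki} - 1}\,\frac{\prod_{p=1}^{s_k}\Gamma(r_{kp})}{\Gamma(R_k)}\,y_{ks_k}^{R_k - 2}.
\]

Altogether, for this first set of integrals:
\[
\int\cdots\int_{D}\tilde y_{ki}^{\,r_{ki} - 2}\Bigl(\prod_{\substack{l \in \{1, \dots, m\} \\ p \in \{1, \dots, s_l - 1\} \\ (l,p) \neq (k,i)}}\tilde y_{lp}^{\,r_{lp} - 1}\Bigr)\prod_{l=1}^{m}\Bigl(y_{ls_l} - \sum_{p=1}^{s_l - 1}\tilde y_{lp}\Bigr)^{r_{ls_l} - 1}d\tilde y_{11}\cdots d\tilde y_{m(s_m - 1)} = y_{ks_k}^{-1}\,\frac{R_k - 1}{r_{ki} - 1}\Bigl(\prod_{l=1}^{m}\frac{\prod_{p=1}^{s_l}\Gamma(r_{lp})}{\Gamma(R_l)}\,y_{ls_l}^{R_l - 1}\Bigr).
\]

Similarly, for the integrals in the second bracket [ ], we have:
\[
\int\cdots\int_{D}\Bigl(y_{ks_k} - \sum_{i=1}^{s_k - 1}\tilde y_{ki}\Bigr)^{r_{ks_k} - 2}\prod_{\substack{l \in \{1, \dots, m\} \\ p \in \{1, \dots, s_l - 1\}}}\tilde y_{lp}^{\,r_{lp} - 1}\prod_{\substack{l \in \{1, \dots, m\} \\ l \neq k}}\Bigl(y_{ls_l} - \sum_{p=1}^{s_l - 1}\tilde y_{lp}\Bigr)^{r_{ls_l} - 1}d\tilde y_{11}\cdots d\tilde y_{m(s_m - 1)} = y_{ks_k}^{-1}\,\frac{R_k - 1}{r_{ks_k} - 1}\Bigl(\prod_{l=1}^{m}\frac{\prod_{p=1}^{s_l}\Gamma(r_{lp})}{\Gamma(R_l)}\,y_{ls_l}^{R_l - 1}\Bigr).
\]

Thus, the sum in the braces $\{\}$ in the density function $f^{(-\tilde T_1, \dots, -\tilde T_m, \mu^{*(s)})}$ is given by
\[
\begin{aligned}
\{\} &= \sum_{k=1}^{m}\sum_{i=1}^{s_k - 1}\Bigl[b_{ki}\,y_{ks_k}^{-1}(R_k - 1)\prod_{l=1}^{m}\frac{\prod_{p=1}^{s_l}\Gamma(r_{lp})}{\Gamma(R_l)}\,y_{ls_l}^{R_l - 1}\Bigr] + \sum_{k=1}^{m}\Bigl[b_{ks_k}\,y_{ks_k}^{-1}(R_k - 1)\prod_{l=1}^{m}\frac{\prod_{p=1}^{s_l}\Gamma(r_{lp})}{\Gamma(R_l)}\,y_{ls_l}^{R_l - 1}\Bigr] \\
&= \Bigl(\prod_{k=1}^{m}\prod_{i=1}^{s_k}\Gamma(r_{ki})\Bigr)\frac{1}{\prod_{k=1}^{m}\Gamma(R_k)}\Bigl(\prod_{k=1}^{m}y_{ks_k}^{R_k - 1}\Bigr)\Bigl(\sum_{k=1}^{m}b_k(R_k - 1)\,y_{ks_k}^{-1}\Bigr).
\end{aligned}
\]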

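Finally, we obtain the joint density function of $(-\tilde T_1, \dots, -\tilde T_m, \mu^{*(s)})$:
\[
\begin{aligned}
&f^{(-\tilde T_1, \dots, -\tilde T_m, \mu^{*(s)})}(y_{1s_1}, \dots, y_{ms_m}, y_{s+1}) \\
&= \Bigl(\prod_{k=1}^{m}\prod_{i=1}^{s_k}\frac{\sigma_k^{r_{ki}}}{\Gamma(r_{ki})}\Bigr)\Bigl(\prod_{k=1}^{m}\prod_{i=1}^{s_k}\Gamma(r_{ki})\Bigr)\frac{1}{\prod_{k=1}^{m}\Gamma(R_k)}\prod_{k=1}^{m}y_{ks_k}^{R_k - 1}\exp\Bigl(-\sum_{k=1}^{m}\sigma_k\bigl(y_{ks_k} + b_k(y_{s+1} - \mu)\bigr)\Bigr)\Bigl(\sum_{k=1}^{m}b_k(R_k - 1)\,y_{ks_k}^{-1}\Bigr) \\
&= \Bigl(\prod_{k=1}^{m}\frac{\sigma_k^{R_k}}{\Gamma(R_k)}\,y_{ks_k}^{R_k - 1}\Bigr)\exp\Bigl(-\sum_{k=1}^{m}\sigma_k y_{ks_k}\Bigr)\exp\Bigl(-\sum_{k=1}^{m}\sigma_k b_k(y_{s+1} - \mu)\Bigr)\Bigl(\sum_{k=1}^{m}b_k(R_k - 1)\,y_{ks_k}^{-1}\Bigr),
\end{aligned} \tag{3.5}
\]
where $y_{ks_k} > 0$, $k = 1, \dots, m$, and $y_{s+1} \ge \mu$.

With the joint density function of $(-\tilde T_1, \dots, -\tilde T_m, \mu^{*(s)})$, the density function of $\sigma^{*(s)}$ can be easily computed.

The joint density function of $(-\tilde T_1, \dots, -\tilde T_m)$ is given by
\[
f^{(-\tilde T_1, \dots, -\tilde T_m)}(t_1, \dots, t_m) = \int_{\mu}^{\infty}f^{(-\tilde T_1, \dots, -\tilde T_m, \mu^{*(s)})}(t_1, \dots, t_m, z)\,dz = \Bigl(\prod_{k=1}^{m}\frac{\sigma_k^{R_k}}{\Gamma(R_k)}\,t_k^{R_k - 1}\Bigr)\Bigl(\sum_{k=1}^{m}b_k(R_k - 1)\,t_k^{-1}\Bigr)\exp\Bigl(-\sum_{k=1}^{m}\sigma_k t_k\Bigr)\frac{1}{\sum_{k=1}^{m}b_k\sigma_k}, \tag{3.6}
\]
for $(t_1, \dots, t_m) \in \mathbb{R}^m_+$. Applying density transformation, we obtain the joint density function of $\sigma^{*(s)}$:
\[
\begin{aligned}
f^{(\sigma_1^{*(s)}, \dots, \sigma_m^{*(s)})}(y_1, \dots, y_m)
&= \Bigl(\prod_{k=1}^{m}\frac{R_k}{y_k^2}\Bigr)\frac{1}{a}\Bigl(\prod_{k=1}^{m}\frac{\sigma_k^{R_k}}{\Gamma(R_k)}\Bigl(\frac{R_k}{y_k}\Bigr)^{R_k - 1}\Bigr)\exp\Bigl(-\sum_{k=1}^{m}\sigma_k\frac{R_k}{y_k}\Bigr)\Bigl(\sum_{k=1}^{m}(R_k - 1)b_k\Bigl(\frac{R_k}{y_k}\Bigr)^{-1}\Bigr) \\
&= \frac{1}{a}\Bigl(\prod_{k=1}^{m}\frac{\sigma_k^{R_k}}{\Gamma(R_k + 1)}\Bigl(\frac{R_k}{y_k}\Bigr)^{R_k + 1}\Bigr)\exp\Bigl(-\sum_{k=1}^{m}\frac{\sigma_k R_k}{y_k}\Bigr)\Bigl(\sum_{k=1}^{m}\Bigl(1 - \frac{1}{R_k}\Bigr)b_k y_k\Bigr), \qquad (y_1, \dots, y_m) \in \mathbb{R}^m_+.
\end{aligned}
\]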

(iii) Obviously, the product of $f^{\mu^{*(s)}}(z)$, given in (3.3), and $f^{(-\tilde T_1, \dots, -\tilde T_m)}(t_1, \dots, t_m)$, given in (3.6), equals $f^{(-\tilde T_1, \dots, -\tilde T_m, \mu^{*(s)})}(t_1, \dots, t_m, z)$, $z \ge \mu$ and $t_1 > 0, \dots, t_m > 0$, which results in the independence of the statistics $\mu^{*(s)}$ and $(-\tilde T$

Remark 3.1.1

(i) Given $m = 2$, $r_{ki} = n_{ki} = 1$, $\alpha_{ki}^{(1)} = 1$, $g_{ki}(x) = x$, $x \ge \mu$, $i = 1, \dots, s_k$, $k = 1, 2$, the results in Thm. 3.1.1 and the density function of $\mu^{*(s)}$ in Thm. 3.1.2 coincide with the results in Ghosh and Razmpour (1984), where the MLEs of the common location parameter $\mu$ and the scale parameters $\frac{1}{\sigma} = \bigl(\frac{1}{\sigma_1}, \frac{1}{\sigma_2}\bigr)$ of two uncensored exponential distributions were given by
\[
\mu^{*(s)} = \min\{X_{11}^{(1)}, \dots, X_{1n_1}^{(1)}, X_{21}^{(1)}, \dots, X_{2n_2}^{(1)}\},
\]
\[
\sigma_k^{*(s)} = \frac{1}{s_k}\sum_{i=1}^{s_k}\bigl(X_{ki}^{(1)} - \mu^{*}\bigr) = \frac{1}{s_k}\sum_{i=1}^{s_k}X_{ki}^{(1)} - \mu^{*}, \qquad 1 \le k \le 2.
\]

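(ii) In the special case of several type II censored exponential distributions, i.e., $m = 1$, $\alpha_{1i}^{(j)} = 1$, $g_{1i}(x) = x$, $x \ge \mu$, $r_{1i} \in \mathbb{N}$, $n_{1i} \in \mathbb{N}$ with $r_{1i} \le n_{1i}$, $1 \le j \le n_{1i}$, $1 \le i \le s_1$, the MLEs of the common location parameter $\mu$ and the scale parameter $\sigma = \frac{1}{\sigma_1}$ were given in Epstein and Sobel (1954):
\[
\mu^{*(s)} = \min\{X_{11}^{(1)}, \dots, X_{1s_1}^{(1)}\},
\]
\[
\sigma^{*(s)} = \frac{\sum_{i=1}^{s_1}\Bigl(\sum_{j=1}^{r_{1i}}\bigl(X_{1i}^{(j)} - \mu^{*(s)}\bigr) + (n_{1i} - r_{1i})\bigl(X_{1i}^{(r_{1i})} - \mu^{*(s)}\bigr)\Bigr)}{\sum_{i=1}^{s_1}r_{1i}} = \frac{\sum_{i=1}^{s_1}\sum_{j=1}^{r_{1i}}(n_{1i} - j + 1)\bigl(X_{1i}^{(j)} - X_{1i}^{(j-1)}\bigr) - \Bigl(\sum_{i=1}^{s_1}n_{1i}\Bigr)\mu^{*(s)}}{\sum_{i=1}^{s_1}r_{1i}},
\]
with $X_{1i}^{(0)} := 0$, which coincide with the MLEs given in Thm. 3.1.1.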

(iii) Let $m = 1$, $\sigma = \frac{1}{\sigma_1}$ and $g_{1i}(x) = x$, $x \ge \mu$, $1 \le i \le s_1$. The MLEs of $\mu$ and $\sigma = \frac{1}{\sigma_1}$ can be found in Thm. 3.4 of Cramer and Kamps (2001b), given by
\[
\mu^{*(s)} = \min\{X_{11}^{(1)}, X_{12}^{(1)}, \dots, X_{1s_1}^{(1)}\},
\]
\[
\sigma^{*(s)} = \frac{1}{\sum_{i=1}^{s_1}r_{1i}}\sum_{i=1}^{s_1}\sum_{j=1}^{r_{1i}}\alpha_{1i}^{(j)}(n_{1i} - j + 1)\bigl(X_{1i}^{(j)} - X_{1i}^{(j-1)}\bigr),
\]
where $X_{1i}^{(0)} := \mu^{*}$, $1 \le i \le s_1$.

Furthermore, they have proven that $\mu^{*(s)}$ is exponentially distributed with location parameter $\mu$ and scale parameter $\sigma_1\sum_{i=1}^{s_1}\alpha_{1i}^{(1)}n_{1i}$.

(iv) In a special case of Situation 1.1 with $s_1 = \dots = s_m = 1$, $g_{11}(x) = g_{21}(x) = \dots =$

$\frac{1}{\sigma} = \bigl(\frac{1}{\sigma_1}, \dots, \frac{1}{\sigma_m}\bigr)$ of $m$ different $(n - r + 1)$-out-of-$n$ systems can be found in Schenk (2002, section 8.2), given by
\[
\mu^{*(s)} = \min\{X_{11}^{(1)}, X_{21}^{(1)}, \dots, X_{m1}^{(1)}\},
\]
\[
\sigma_k^{*(s)} = \frac{1}{r_{k1}}\sum_{j=1}^{r_{k1}}\alpha_{k1}^{(j)}(n_{k1} - j + 1)\bigl(X_{k1}^{(j)} - X_{k1}^{(j-1)}\bigr) - \frac{\alpha_{k1}^{(1)}n_{k1}}{r_{k1}}\,\mu^{*}, \qquad 1 \le k \le m,
\]
where $X_{k1}^{(0)} := 0$, $1 \le k \le m$. The density function of $\mu^{*(s)}$ in Schenk (2002) also coincides with the density function in Thm. 3.1.2.

Theorem 3.1.3

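For $r_{ki} \ge 2$, $i = 1, \dots, s_k$, $k = 1, \dots, m$, the MLE $\mu^{*(s)}$ obtained in Thm. 3.1.1 has the following properties:

(i) The $l$th moment of $\mu^{*(s)}$ is given by
\[
E\bigl((\mu^{*(s)})^l\bigr) = \sum_{q=0}^{l}\frac{l!}{q!}\,\frac{\mu^q}{a^{l-q}}, \qquad l \in \mathbb{N}.
\]
In particular, $E(\mu^{*(s)}) = \frac{1}{a} + \mu$ and $\mathrm{Var}(\mu^{*(s)}) = \frac{1}{a^2}$.

(ii) $\mu^{*(s)}$ is asymptotically unbiased for $\mu$, provided that $b_{ki} = \alpha_{ki}^{(1)}n_{ki} > \delta$ for some $\delta > 0$, $i = 1, \dots, s_k$, $k = 1, \dots, m$.

(iii) Under the same assumptions as in (ii) ($b_{ki} = \alpha_{ki}^{(1)}n_{ki} > \delta$ for some $\delta > 0$, $i = 1, \dots, s_k$, $k = 1, \dots, m$), $\mu^{*(s)}$ is strongly consistent for $\mu$.

Proof.

(i) follows from the density function of $\mu^{*(s)}$.

(ii) Since $\mathrm{Bias}(\mu^{*(s)}) = E(\mu^{*(s)}) - \mu = \frac{1}{a}$, the assertion is proven if $a \to \infty$ as $s \to \infty$. With the assumptions we have
\[
a = \sum_{k=1}^{m}\sum_{i=1}^{s_k}\alpha_{ki}^{(1)}n_{ki}\sigma_k > \delta\sum_{k=1}^{m}s_k\sigma_k \ge \delta s\min_{1 \le k \le m}\{\sigma_k\} \to \infty, \qquad \text{as } s \to \infty.
\]

(iii) For any $\varepsilon > 0$, it can be shown that
\[
\sum_{s=1}^{\infty}P\bigl(|\mu^{*(s)} - \mu| > \varepsilon\bigr) = \sum_{s=1}^{\infty}\exp(-a\varepsilon) \le \sum_{s=1}^{\infty}\Bigl(\exp\bigl(-\delta\varepsilon\min_{1 \le k \le m}\{\sigma_k\}\bigr)\Bigr)^{s} = \frac{\exp(-\beta)}{1 - \exp(-\beta)} < \infty,
\]
where $\beta := \delta\varepsilon\min_{1 \le k \le m}\{\sigma_k\}$. By the Borel-Cantelli lemma, this implies that $\mu^{*(s)} \to \mu$ almost surely.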
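The moment formula in Thm. 3.1.3 (i) is easy to evaluate directly. The following sketch, with the hypothetical helper name `mu_star_moment` and illustrative parameter values, computes the first two moments from the formula and recovers the stated mean $1/a + \mu$ and variance $1/a^2$.

```python
import math

def mu_star_moment(l, mu, a):
    """l-th moment of mu_star from Thm. 3.1.3 (i): sum_q l!/q! * mu^q / a^(l-q)."""
    return sum(
        math.factorial(l) / math.factorial(q) * mu ** q / a ** (l - q)
        for q in range(l + 1)
    )

mu, a = 1.5, 4.0                       # illustrative location and scale values
m1 = mu_star_moment(1, mu, a)          # first moment
m2 = mu_star_moment(2, mu, a)          # second moment
variance = m2 - m1 ** 2
print(m1, 1.0 / a + mu)                # both give the mean of mu_star
print(variance, 1.0 / a ** 2)          # both give the variance of mu_star
```

Since the bias equals $1/a$ and $a$ grows at least linearly in $s$ under the assumption of (ii), the same function can be used to tabulate how quickly the bias and the variance vanish as more samples are added.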
