"Prediction in Multivariate Mixed Linear Models"

(1)

Discussion Papers are a series of manuscripts in their draft form. They are not intended for circulation or distribution except as indicated by the author. For that reason Discussion Papers may not be reproduced or distributed without the written consent of the author

CIRJE-F-180

Prediction in Multivariate Mixed

Linear Models

Tatsuya Kubokawa University of Tokyo M. S. Srivastava University of Toronto October 2002

(2)

Prediction in Multivariate Mixed Linear Models

T. Kubokawa and

M.S. Srivastava

University of Tokyo and University of Toronto

October 30, 2002

The multivariate mixed linear model or multivariate components of variance model with equal replications is considered. The paper addresses the problem of predicting the sum of the regression mean and the random effects. When the feasible best linear unbiased predictors or empirical Bayes predictors are used, this prediction problem reduces to the estimation of the ratio of two covariance matrices. We propose scale invariant Stein type shrinkage estimators for the ratio of the two covariance matrices. Their dominance prop-erties over the usual estimators including the unbiased one are established, and further domination results are shown by using information of order restriction between the two covariance matrices. It is also demonstrated that the empirical Bayes predictors that em-ploys these improved estimators of the ratio of the two covariance matrices have uniformly smaller risks than the crude Efron-Morris type estimator in the context of estimation of a matrix mean in a fixed effects linear regression model where the components are unknown parameters.

Key words and phrases: Multivariate mixed linear model, multivariate components

of variances, small area estimation, prediction, shrinkage estimation, empirical Bayes procedure, order restriction, orthogonal equivariance, covariance matrix, Bartlett’s de-composition, decision-theory, posted land price data.

AMS subject classiﬁcations: Primary 62F11, 62J07, Secondary 62C15, 62C20.

1 Introduction

Mixed linear models or variance components models have been eﬀectively and extensively employed in practical data-analysis when the response is univariate. For example, in estimation of small area means, they have been used as a method of pooling or smoothing data to strengthen the accuracy of the estimators of small area means. Mixed linear models are related to empirical Bayes models. Practical applications and theoretical studies of these models have been given by Fay and Herriot (1979), Battese, Harter and Fuller (1988), Prasad and Rao (1990), Ghosh and Rao (1994) and references therein.

In contrast to these activities in the univariate mixed linear models, the multivariate mixed linear models have received little attention except in the multiple regression model

(3)

with replicates considered by Rao (1975) in which the vector of p response variables y_i follow the model

y_i =Xγ_i+_i, γ_i =γ + η_i, (1.1)

i = 1, . . . , k where X is a p × m matrix of known constants, γ is an m-vector of unknown

parameters and, η_i and _i are independently distributed with η_i i.i.d. as N_m(0,F ) and

i i.i.d. asNp(0, σ2I). Reinsel (1985) extended this model in which

γ_i =βb_i+η_i (1.2)

whereβ is an m×q matrix of unknown parameters and b_i are vectors of known constants. Battese et al . (1988)’s model is a variation of the model given in (1.1).

In both models (1.1) and (1.2), y_i are independently distributed random vectors with covariance matrix given by

Cov (y_i) =XF X+ σ2I_p.

Thus, when m < p, both σ2 and F can be unbiasedly estimated. However, when

Cov (_i) = Σ,

a completely unknown matrix, no optimality results are available in predicting, say,βb_i+

η_i whenη_i are random or whenη_i are ﬁxed (with some constraints). Thus, in this paper, we consider the following multivariate mixed linear models

y_ij =βb_ij +α_i+_ij, (1.3)

i = 1, . . . , k, j = 1, . . . , r, where β is a p × q matrix of unknown parameters and b_ij are

q× 1 known vectors. It is assumed that α_i and _ij are independently distributed where

αi are i.i.d. Np(0, ΣA) and ij are i.i.d. Np(0, Σ). Let

θi =βbi. +αi, bi. = r−1 j bij and Θ = (θ₁, . . . ,θ_k).

Our objective is to predict Θ under the squared loss function where Θ is chosen to

minimize the risk

R( Θ, ω) = E_ω

tr ( Θ− Θ)Σ−1( Θ− Θ)

,

where ω = (β, Σ, Σ_A) denotes the set of unknown parameters.

When the parameters of the model is known, the best linear unbiased predictor (BLUP) would be employed. The BLUP is interpreted as a Bayes rule. Since the pa-rameters are unknown in the model, their estimators are substituted in the BLUP, and the substituted predictor is called the feasible BLUP or empirical Bayes predictor. The

(4)

empirical Bayes predictor lies in between the predictor associated with each small area and the estimator pooling whole data.

In Section 2, we consider the case when b_ij = b_i, j = 1, . . . , r. It is shown that the problem of predicting small area means is reduced to that of estimating the ratio ∆ of covariance matrices. The estimator of the ratio matrix ∆ is important in the empirical Bayes predictor as it determines the extent to which the estimator associated with each small area should be shrunken towards the pooled estimator.

In the estimation of a covariance matrix and a ratio of two covariance matrices, several types of estimators improving upon usual estimators such as unbiased ones are available. One of them is the James-Stein type estimator based on the Bartlett’s lower triangular decomposition. It is known that the James-Stein type estimator depends on the coor-dinate system. We therefore consider scale invariant Stein-type shrinkage estimator for

∆ in Section 3. By taking into account the order restriction between the two covariance

matrices, another improved predictor is obtained in Section 3. It may be mentioned that an estimator of ∆ can also be obtained by considering multivariate beta distribution as in Bilodeau and Srivastava (1992) and Konno (1992b).

In Section 4, numerical comparisons of several improved predictors are given based on Monte Carlo simulation. The Monte Carlo studies show that the Efron-Morris type truncated predictor proposed in Section 3 has very nice risk-performances with much smaller risks than the sample mean vector. An example treating data derived by Monte Carlo simulation is given there.

The problem of estimating a matrix mean in a ﬁxed eﬀect linear regression model is addressed in Section 5. In Section 6, we consider the general model (1.3) and give an example analyzing the posted land price data in Kanagawa prefecture in Japan.

2 A Mixed Linear Model

In this section, we deal with the following multivariate mixed linear model with equal replications:

y_ij =βb_i+α_i+_ij, i = 1, . . . , k, j = 1, . . . , r, (2.1) where y_ij’s are p-variate observation vectors, β is a p × q common unknown regression coefficient, b_i’s are q × 1 covariates, α_i’s are p × 1 random effects having a p-variate normal distribution N_p(0, Σ_A), and _ij’s are p× 1 random error terms having N_p(0, Σ). It is supposed that α_i’s and _ij’s are mutually independent and that Σ_A and Σ are unknown positive-definite dispersion matrices.

Letθ_i =βb_i+α_i and Θ = (θ₁, . . . ,θ_k). Then the problem we consider in this paper is to predict the p× k matrix Θ where estimator Θ of Θ is evaluated in terms of the risk

function R_m( Θ, ω) = E_ω tr ( Θ− Θ)Σ−1( Θ− Θ) (2.2)

(5)

for unknown parameters ω = (β, Σ, Σ_A).

The multivariate mixed linear model is also interpreted as an empirical Bayes model:

y_ij ∼ Np(θi, Σ) andθihaving prior distributionNp(βbi, ΣA). In this situation, the Bayes

estimator of θ_i is given by

θB_i = θB_i (β, Σ, Σ_A) =y_i.− ΣΣ−1₂ (y_i.− βb_i),

for y_i. = r−1r_j=1y_ij and Σ₂ = Σ + rΣ_A. This is the best linear unbiased predictor (BLUP) in our setup.

Since the Bayes estimator θB_i depends on the unknown parametersβ, Σ and Σ_A. they should be estimated from the marginal distributions. Let

S = k i=1 r j=1 (y_ij − y_i.)(y_ij − y_i.), W = r k i=1 (y_i.− βb_i)(y_i.− βb_i), β =k i=1 y_i.b_i( k i=1 bibi)−,

where A− denotes the generalized inverse of the matrix A. Assume that the rank of _k

i=1bibi is q1 with q1 ≤ q < p. Then it is seen that

S ∼ Wp(Σ, n), n = k(r− 1),

W ∼ Wp(Σ2, m), m = k− q1

and that S, W and β are mutually independent. The parameters β, Σ and Σ_A (or Σ₂) are respectively estimated by β, Σ and Σ_A (or Σ₂) which are functions of β, S and W . This results in an empirical Bayes estimator (or feasible BLUP) of θ_i of the form

θEB i = θ B i (β, Σ, ΣA) =yi.− Σ Σ −1 2 (yi.− βbi), (2.3)

which means that the sample mean of the i-th small area is shrunk towards the common value that uses all the data. It is known that y_i. has an unstable variance because of

small data in the i-th small area. But this undesirable property can be avoided by using the estimator θEB_i which borrows the data from the surrounding small area. The ratio

Σ Σ−1₂ of the estimators of covariance matrices determines the extent to which y_i. should

be shrunk. Since the parameter space is restricted as

Σ1/2(Σ + rΣ_A)−1Σ1/2 = Σ1/2Σ−1₂ Σ1/2 ≤ I_p, (2.4) the ratio should be in 0 ≤ Σ1/2Σ−1₂ Σ1/2 ≤ I_p, where A1/2 denotes the positive deﬁnite factorization of the symmetric matrix A and A ≤ B means that B − A is non-negative deﬁnite. Two extreme cases of taking Σ1/2Σ−1₂ Σ1/2 = 0 and = I_p yield y_i. and βb_i, respectively, both of which are inappropriate. It is reasonable to choose Σ Σ−1₂ such that θEB_i has a uniformly smaller risk than the existing estimators including y_i..

(6)

Let ΘEB = (θEB₁ , . . . , θEB_k ). Noting that E_ω[tr ( ΘB − Θ)Σ−1( ΘB − Θ)] = pk/r −

(k/r)tr ΣΣ−1₂ , we see that the risk function of ΘEB is written by

R_m( ΘEB, ω) (2.5) = E_ω tr ( ΘEB− ΘB)Σ−1( ΘEB− ΘB) + E_ω tr ( ΘB− Θ)Σ−1( ΘB− Θ) = r−1E_ω tr ( Σ Σ−1₂ − ΣΣ−1₂ )Σ−1( Σ Σ−1₂ − ΣΣ−1₂ )W + kr−1p− tr ΣΣ−1₂ ,

so that the risk of ΘEB depends on the risk of the estimators of the ratio of covariance matrices deﬁned by R( ∆, ω) = E_ω tr ( ∆− ∆)Σ−1( ∆− ∆)W (2.6)

for ∆ = ΣΣ−1₂ and its estimator ∆. We shall look for an estimator ∆ that has a smaller

risk R( ∆, ω).

The usual unbiased estimator of ∆ is ∆UB = n−1(m− p − 1)SW−1 with risk

R( ∆UB, ω) =−n−1(n− p − 1)(m − p − 1) + mtr ∆.

Since the crude estimator (y₁., . . . ,y_k.) of Θ corresponds to the the estimator ∆ = 0, its

risk has R(0, ω) = mtr ∆ and this shows that ∆UB is better than the crude estimator ∆ = 0 for m > p+1 and n > p+1. More generally the estimator ∆(a) = aSW−1, a multiple of

SW−1_{, has the risk R(}_{∆(a), ω) =}_{{n(n + p + 1)(m − p − 1)}−1_a2_{− 2na + m} tr ∆, which}

can be minimized at a = (m− p − 1)/(n + p + 1) with risk

R( ∆₀, ω) =−n(n + p + 1)−1(m− p − 1) + mtr ∆, where

∆₀ = (n + p + 1)−1(m− p − 1)SW−1. (2.7)

3 Improved Estimators

In this section, we shall provide several types of estimators of ∆ improving upon ∆₀

relative to the risk R( ∆, ω).

We ﬁrst consider two types of James-Stein type estimators based on Bartlett’s decom-position. Let G+_L denote the group of lower triangular matrices with positive diagonal elements. Let T and U be matrices in G+_L such that S = T T and W = UU. Then we can consider two types of James-Stein estimators:

(7)

where C = diag (c₁, . . . , c_p) and D = diag (d₁, . . . , d_p) for c_i = m− i − 1, i = 1, . . . , p, (3.1) d_i = 1 n + p + 3− 2i i−1 j=1 1− 1 n + p + 3− 2j , i = 2, . . . , p, (3.2)

and d₁ = (n + p + 1)−1. It has been shown in Kubokawa and Srivastava (1999) that these estimators have smaller risk than the estimator ∆₀. However, these estimators depend on the coordinate systems and therefore no further consideration will be given to these estimators in this paper. Instead, we shall consider scale equivariant estimators as described below.

Let A be a p × p nonsingular matrix such that S = AA and W = AF A for

F = diag (f1, . . . , fp), f1 ≥ f2 ≥ . . . ≥ fp. Then we consider estimators of the form

∆(Ψ) =AΨ(f)A−1, Ψ(f) = diag (ψ₁(f), . . . , ψ_p(f)) (3.3) for positive functions ψ_i(f)’s of f = (f₁, . . . , f_p). This estimator is equivariant under the group of scale transformations (S, W ) → (BSB,BW B) for p× p nonsingular matrix

B. Using the same arguments as in Konno (1991, 92a), we can write the risk function of

∆(Ψ).

Proposition 1. The risk function of ∆(Ψ) is given by

R( ∆(Ψ), ω) = E_ω r( ∆(Ψ)) + mtr Λ, (3.4) where r( ∆(Ψ)) = p i=1 (n + p− 3)f_iψ_i2− 4f_i2ψ_i∂ψi ∂f_i − 2 j>i f_i2ψ2_i − f_j2ψ_j2 f_i− f_j (3.5) −2(m − p + 1)ψi− 4fi∂ψi ∂f_i − 4 j>i f_iψ_i− f_jψ_j f_i− f_j .

Proof. The method of proof is similar to Konno (1992a), using the Wishart identity

obtained by Stein (1977) and Haff (1979). Let G(S) be a p × p matrix such that the (i, j) element g_ij(S) is a function of S = (s_ij) and define the differential operatorD_S by

{DSG(S)}ij =

_p

a=1diagaj(S), where dia = 2−1(1 + δia) ∂/∂sia with δia = 1 for i = a

and δ_ia = 0 for i = a. When S is distributed as W_p(Σ, n), the Stein-Haﬀ identity is expressed by E_Σ[tr{G(S)Σ−1}] = E_Σ[(n− p − 1)tr {G(S)S−1} + 2tr {D_SG(S)}]. Also note that a similar identity is given for W having W(Σ₂, m). Applying the Stein-Haﬀ

identity, we observe that

R∗( ∆, ω) =E_ω tr ∆W ∆Σ−1 − 2tr ∆W Σ−1₂ =E_ω (n− p − 1)tr ∆W ∆S−1+ 2trD_S( ∆W ∆) −2(m − p − 1)tr ∆− 4tr D_W( ∆W ) .

(8)

Since ∆ =AΨ(f)A−1 with S = AA and W = AF A, R∗( ∆(Ψ), ω) is rewritten by R∗₍_{∆(Ψ), ω) =E}

ω[(n− p − 1)tr ΨF Ψ + 2tr DS(AΨF ΨA)

−2(m − p − 1)tr Ψ − 4tr DW(AΨF A)] . (3.6)

Here the following calculations due to Loh (1988) and Konno (1992a) are useful:

trD_S(AΦ(F )A) = p i=1 pφ_i− f_i∂φi ∂f_i − j>i f_iφ_i− f_jφ_j f_i− f_j , (3.7) trD_W(AΦ(F )A) = p i=1 ∂φ_i ∂f_i + j>i φ_i− φ_j f_i− f_j , (3.8)

where Φ() = diag (φ₁(), . . . , φ_p()). Combining (3.6), (3.7) and (3.8) provides that

R∗( ∆(Ψ), ω) = p i=1 E_ω (n− p − 1)f_iψ_i2+ 2pf_iψ_i2− 2f_i ψ_i2+ 2f_iψ_i∂ψi ∂f_i −2 j>i f_i2ψ2_i − f_j2ψ_j2 f_i− f_j − 2(m − p − 1)ψi− 4 ψ_i+ f_i∂ψi ∂f_i − 4 j>i f_iψ_i− f_jψ_j f_i− f_j ,

which leads to the expression (3.5) of Proposition 1.

In the next two subsections, we consider two scale invariant estimators which we call Stein type and Efron-Morris type of estimators.

3.1 Stein type scale-equivariant estimator

In this subsection, we consider the Stein type scale-equivariant estimator of the form

∆ST =Adiag (b₁/f₁, . . . , b_p/f_p)A−1, (3.9) for b_i = (m + p−2i −1)/(n −p + 2i + 1). These constants b_i’s were given by Konno (1991) in estimation of a matrix mean, and have the order relation that b₁ > b₂ > . . . > b_p. It has been shown by Srivastava and Solankey (2002) that this estimator has a very good risk performance as compared to many of its competitors including the ones proposed by Breiman and Friedman (1997). Using (3.4) and (3.5), we shall show that the estimator

∆ST dominates ∆₀ relative to the risk (2.6).

From (3.4) and (3.5), the risk function of the estimator ∆ST is given by

R( ∆ST, ω)− mtr Λ (3.10) = p i=1 E_ω (n + p + 1)b 2 i f_i − 2 j>i b2_i − b2_j f_i− f_j − 2(m − p − 1) b_i f_i − 4 j>i b_i− b_j f_i− f_j .

(9)

Following Konno (1991), we have that for k = 1, 2 and b_i > b_j for i < j, p i=1 p j=i+1 bk_i − bk_j f_i− f_j = p i=1 1 f_i p j=i+1 f_i f_i− f_j bk_i − bk_j (3.11) ≥ p i=1 1 f_i p j=i+1 bk_i − bk_j = p i=1 1 f_i (p− i)bk_i − p j=i+1 bk_j ,

since f_i/(f_i − f_j) > 1. On the other hand, the risk of the estimator ∆₀ is derived by putting b₁ =· · · = b_p = (m− p − 1)/(n + p + 1) in (3.10), and we observe that

R( ∆₀, ω)− mtr Λ = −(m− p − 1) 2 n + p + 1 p i=1 E_ω 1 f_i . (3.12)

Combining (3.10), (3.11) and (3.12), we see that the risk diﬀerence of ∆ST and ∆₀ is evaluated as R( ∆ST, ω)− R( ∆₀, ω) =_i=1p E_ωf_i−1h(i), where

h(i) = (n− p + 2i + 1)b2_i − 2(m + p − 2i − 1)b_i+ 2 p j=i+1 b_j(b_j+ 2) + (m− p − 1) 2 n + p + 1 ,

so that it is suﬃcient to show that

h(i)≤ 0 (3.13)

for i = 1, . . . , p− 1, since it is easy to verify that h(p) ≤ 0. Although this proof was given by Konno (1991), a simple proof is given below. Note that h(i− 1) is rewritten as

h(i− 1) = −(m + p− 2i + 1) 2 n− p + 2i − 1 + 2 p j=i b_j(b_j+ 2) + (m− p − 1) 2 n + p + 1 = h(i)−(a i+ 2)2 a_i− 2 + 2 a2_i a2_i + 4 a_i a_i + a2_i a_i = p k=i −(ak+ 2)2 a_k− 2 + 2 a2_k a2_k + 4 a_k a_k + a2_k a_k , (3.14)

where a_i = m + p− 2i and a_i = n− p + 2i + 1. It can be easily checked that for each k,

−(ak+ 2)2 a_k− 2 + 2 a2_k a2_k + 4 a_k a_k + a2_k a_k ≤ 0,

which shows the inequality (3.13).

Proposition 2. The scale-equivariant Stein type estimator ∆ST dominates ∆₀ rela-tive to the risk (2.6).

(10)

3.2 Efron-Morris type scale-equivariant estimator

Another scale-equivariant estimator we call Efron-Morris type is given by

∆EM = αSW−1+ β 1 trS−1WIp, (3.15) where α = m− p − 1 n + p + 1, β = (p− 1)(p + 2)(1 + α) n− p + 3 = (p− 1)(p + 2)(m + n) (n + p + 1)(n− p + 3).

That is, ∆EM =AΨ(f)A−1 where Ψ(f) = diag (ψ₁, . . . , ψ_p) and ψ_i = α/f_i+β/p_j=1f_j. Thus, from (3.5), it follows that the function r(·) for ∆EM is given by

r( ∆EM) = i f_i−1(n + p + 1)α2− 2(m − p − 1)α+ 4β2 ifi2 (_if_i)3 +1 ifi (n− p − 1)β2− 2 [(p − 1)(p + 2) + p(m − p − 1) − (np + 2)α] ≤ i f_i−1(n + p + 1)α2− 2(m − p − 1)α +1 ifi (n− p + 3)β2− 2 [(p − 1)(p + 2) + p(m − p − 1) − (np + 2)α] =− i f_i−1(m− p − 1) 2 n + p + 1 + 1 ifi (n− p + 3)β2− 2(p − 1)(p + 2)(1 + α),

where the last equality is derived by substituting α = (m− p − 1)/(n + p + 1). Since

β = (p− 1)(p + 2)(1 + α₃)/(n− p + 3), we get that r( ∆EM)≤ − i f_i−1(m− p − 1) 2 n + p + 1 − 1 ifi 1 n− p + 3 (p− 1)(p + 2)(n + m) n + p + 1 2 ,

which is smaller than r( ∆₀) =−_if_i−1(m− p − 1)2/(n + p + 1). Hence we get

Proposition 3. The Efron-Morris type scale equivariant estimator ∆EM dominates

∆₀ relative to the risk (2.6).

Konno (1992a) proposed the constant β∗ = (p− 1)(p + 2)/(n − p + 3) instead of β and we denote the Efron-Morris type estimator for the constant β∗ by ∆(β∗). Noting that β∗ < β and that_if_i2 ≤ (tr F )2, we can see that ∆EM given by (3.15) improves on

(11)

3.3 Improvement by use of order restriction

Recalling that Σ₂ = Σ + rΣ_A, we notice that there is the order restriction Σ₂ > Σ

between Σ and Σ₂. The estimators treated in the previous sections can be shown to be further improved upon by using this knowledge. The resulting estimators of Θ correspond to the positive-part Stein estimator for m = 1.

For the scale-equivariant estimator ∆(Ψ) =AΨ(f)A−1 with

Ψ(f) = diag (ψ₁(f), . . . , ψ_p(f)), given by (3.3), consider the truncated estimator of the form

∆(Ψ∗) =AΨ∗(f)A−1, Ψ∗(f) = diag (min{ψ₁(f), 1}, . . . , min{ψ_p(f), 1}) . Then we can get

Proposition 4. The truncated estimator ∆(Ψ∗) dominates ∆(Ψ) relative to the risk

(2.6).

Proof. The risk diﬀerence of the estimators ∆(Ψ) and ∆(Ψ∗) with Ψ = diag (ψ₁, . . . , ψ_p) and Ψ∗ = diag (ψ₁∗, . . . , ψ∗_p) is written by

R( ∆(Ψ), ω)− R( ∆(Ψ∗), ω) = E tr ∆(Ψ)− ∆(Ψ∗) Σ−1 ∆(Ψ) + ∆(Ψ∗)− 2∆ W = EtrΨ + Ψ∗ − 2A−1∆A F (Ψ − Ψ∗)AW A−1. (3.16) Since ∆≤ I and Ψ ≥ Ψ∗, it can be seen that the r.h.s. of the last equation in (3.16) is greater than or equal to E[2tr (Ψ∗− I)F (Ψ − Ψ∗)AΣ−1A], which is equal to zero since

ψ_i∗ = min(ψ, 1). This proves Proposition 4.

Applying the truncation rule to the Stein and Efron-Morris type scale-equivariant estimators ∆ST and ∆EM, we obtain the truncated estimators

∆ST ∗= ∆(ΨST ∗), ΨST ∗= diag min b_i f_i, 1 , i = 1, . . . , p , (3.17)

∆EM∗= ∆(ΨEM∗), ΨEM∗= diag min α f_i + β _p j=11/fj , 1 , i = 1, . . . , p , (3.18)

which improve on ∆ST and ∆EM.

4 Numerical Risk Compariasons and Empirical

Stud-ies

We shall investigate the risk-performances of estimators of Θ numerically. From (2.3) and (2.5), the risk function of the estimator ΘEB given by (2.3) is expressed by the risk given

(12)

by (2.6), that is, the problem is reduced to that of estimating the ratio of the covariance matrices ∆ = ΣΣ−1₂ . Since the estimator (y₁., . . . ,y_k.) of Θ is minimax in terms of the

risk (2.2) and it corresponds to the estimator ∆ = 0 in the reduced problem with the risk

R(0, ω) = mtr ∆, we use the relative risk eﬃciency Ef f ( ∆, ω) = E_ω

tr ( ∆− ∆)Σ−1( ∆− ∆)W

/(mtr ∆) (4.1)

to investigate the performances of the estimators estimators ∆ of ∆. When the eﬃciency

Ef f ( ∆, ω) is less than or equal to 1, it means that the estimator θEB_i =y_i.− ∆(y_i.−βb_i),

i = 1, . . . , k, with the ∆ is minimax, namely, it improves on (y₁., . . . ,y_k.) in terms of

the risk (2.2).

The estimators we want to compare are the unbiased estimator ∆UB = n−1(m−

p− 1)SW−1, the James-Stein type estimators ∆JS₁ = (n + p + 1)−1SU−1CU−1 and

∆JS₂ = (m− p − 1)T DTW−1 given in the begining of Section 3, the best scale multiple estimator ∆₀ = (n + p + 1)−1(m− p − 1)SW−1, the Stein type estimator

∆ST =Adiag (b₁/f₁, . . . , b_p/f_p)A−1 given by (3.9), the Efron-Morris type estimator

∆EM = αSW−1+ β 1 trS−1WIp given by (3.15), the other Efron-Morris type estimator

∆EMK = αSW−1+ β∗ 1 trS−1WIp

for the constant β∗ given by Konno (1992a), the truncated estimator ∆BE∗ = ∆(ΨBE∗) for

ΨBE∗= diag min(n + p + 1)−1(m− p − 1)/f_i, 1, i = 1, . . . , p ,

the Stein type truncated estimator ∆ST ∗= ∆(ΨST ∗) for

ΨST ∗= diag (min{b_i/f_i, 1} , i = 1, . . . , p)

given by (3.17) and the Efron-Morris type truncated estimator ∆EM∗= ∆(ΨEM∗) for

ΨEM∗ = diag min α f_i + β _p j=11/fj , 1 , i = 1, . . . , p

given by (3.18). These estimators are, respectively, abbreviated by the notations

(13)

Table 1: Relative Risk Eﬃciencies of the Estimators in the Case that p = 3, q = 2,

k = 20, r = 2, n = 20, m = 18, Σ = I₃ and Σ_A = Hdiag (γ/10, 2γ/10, 3γ/10)H for

γ = 0, . . . , 9 γ U B J S₁ J S₂ BE ST EM EM K BE∗ ST∗ EM∗ 0 0.377 0.348 0.352 0.351 0.247 0.267 0.282 0.268 0.159 0.117 1 0.377 0.348 0.352 0.351 0.251 0.268 0.284 0.274 0.170 0.131 2 0.377 0.348 0.352 0.351 0.259 0.272 0.286 0.282 0.187 0.153 3 0.377 0.349 0.352 0.351 0.267 0.275 0.289 0.289 0.203 0.172 4 0.377 0.349 0.352 0.351 0.275 0.279 0.292 0.294 0.216 0.187 5 0.377 0.349 0.352 0.351 0.282 0.282 0.294 0.297 0.227 0.200 6 0.377 0.349 0.352 0.351 0.288 0.285 0.297 0.299 0.236 0.209 7 0.377 0.349 0.352 0.351 0.293 0.288 0.299 0.301 0.243 0.217 8 0.377 0.349 0.352 0.351 0.298 0.291 0.301 0.302 0.249 0.224 9 0.377 0.349 0.352 0.351 0.302 0.293 0.303 0.303 0.254 0.229

Table 2: Relative Risk Eﬃciencies of the Estimators in the case that p = 10, q = 2,

k = 20, r = 2, n = 20, m = 18, Σ = I₁₀ and Σ_A =Hdiag (i × γ/10, i = 1, . . . , 10)H for γ = 0, . . . , 9 γ U B J S₁ J S₂ BE ST EM EM K BE∗ ST∗ EM∗ 0 0.824 0.732 0.749 0.747 0.345 0.255 0.299 0.676 0.239 0.122 1 0.824 0.735 0.750 0.747 0.382 0.303 0.343 0.689 0.308 0.208 2 0.824 0.736 0.750 0.747 0.421 0.352 0.387 0.700 0.367 0.283 3 0.824 0.737 0.751 0.747 0.452 0.389 0.421 0.708 0.408 0.334 4 0.824 0.738 0.751 0.747 0.476 0.418 0.447 0.713 0.439 0.372 5 0.824 0.739 0.751 0.747 0.496 0.441 0.468 0.716 0.464 0.402 6 0.824 0.739 0.751 0.747 0.514 0.460 0.486 0.719 0.484 0.425 7 0.824 0.739 0.751 0.747 0.528 0.477 0.501 0.720 0.501 0.445 8 0.824 0.740 0.751 0.747 0.541 0.492 0.514 0.721 0.515 0.461 9 0.824 0.740 0.751 0.747 0.553 0.504 0.526 0.722 0.527 0.476

(14)

Table 3: Relative Risk Eﬃciencies of the Estimators in the case that p = 10, q = 8, k = 20, r = 3, n = 40, m = 12, Σ = I₁₀ and Σ_A =Hdiag (i × γ/10, i = 1, . . . , 10)H for γ = 1, . . . , 10 γ U B J S₁ J S₂ BE ST EM EM K BE∗ ST∗ EM∗ 0 0.941 0.856 0.935 0.935 0.287 0.098 0.101 0.895 0.173 0.038 1 0.941 0.872 0.936 0.935 0.361 0.224 0.227 0.898 0.289 0.179 2 0.941 0.880 0.937 0.936 0.426 0.326 0.328 0.900 0.371 0.285 3 0.942 0.886 0.937 0.936 0.474 0.396 0.398 0.902 0.427 0.358 4 0.942 0.890 0.937 0.936 0.512 0.448 0.450 0.903 0.469 0.412 5 0.942 0.894 0.938 0.936 0.543 0.489 0.491 0.904 0.503 0.455 6 0.943 0.896 0.938 0.937 0.570 0.523 0.525 0.905 0.532 0.490 7 0.943 0.899 0.938 0.937 0.593 0.552 0.553 0.905 0.556 0.519 8 0.943 0.901 0.938 0.937 0.613 0.576 0.577 0.906 0.578 0.544 9 0.943 0.903 0.938 0.937 0.631 0.597 0.598 0.906 0.596 0.565

The risk functions or the relative risk eﬃciencies of the above estimators are computed through simulation experiments based on 50,000 replications. The simulation experiments are done in the following three cases:

Case 1: p = 3, q = 2, k = 20, r = 2, n = 20, m = 18, Σ = I₃ and

Σ_A=Hdiag (γ/10, 2γ/10, 3γ/10)H, Σ₂ =I₃+ rΣ_A for γ = 0, . . . , 9 and an orthogonal matrix H,

Case 2: p = 10, q = 2, k = 20, r = 2, n = 20, m = 18, Σ =I₁₀ and Σ_A=Hdiag (i × γ/10, i = 1, . . . , 10)H, Σ₂ =I₁₀+ rΣ_A for γ = 0, . . . , 9. Case 3: p = 10, q = 8, k = 20, r = 3, n = 40, m = 12, Σ =I₁₀ and Σ_A=Hdiag (i × γ/10, i = 1, . . . , 10)H, Σ₂ =I₁₀+ rΣ_A for γ = 0, . . . , 9.

The relative risk eﬃciencies Ef f ( ∆, ω) of the above estimators for the three cases are

given in Tables 1, 2 and 3, respectively. Form these tables, the following conclusions can be drawn.

(1) The Efron-Morris type truncated estimator EM∗ has very nice risk behaviours and it is the best among all the estimators so that it is highly recommended.

(2) The comparison between the Stein-type and Efron-Morris type estimators ST and

EM depend on the cases. Table 1 reveals that ST is better than EM while Tables 2 and

(15)

(3) The estimator EM is better than EM K, but the diﬀerences in the risks are quite small.

(4) The estimators U B, J S₁, J S₂, BE and BE∗ are minimax, but much worse than

EM∗ through the three tables.

We now give an example for data derived by Monte Carlo simulation.

Example 1. (Simulated Data) We ﬁrst treat a data sety_ij, i = 1, . . . , k, j = 1, . . . , r, derived by Monte Carlo simulation under the model (2.1) in the case that k = 20, r = 3,

q = 2, p = 5, n = k(r− 1), m = k − q, Σ = I₅, Σ_A =Hdiag (i/10, i = 1, . . . , 5)H and

β_ij = (β)_ij = i + j− 2. The covariates b₁, . . . ,b₂₀ are derived as independent random variables from a 2-variate normal distribution with

E[b_i] = i− 1 2 1 1 , Cov (b_i) = 1 ρ ρ 1 ,

for ρ = 0.4. Then (y₁., . . . ,y₂₀.), β, S and W are computed as the followings:

β =       −0.0343 1.0667 0.8944 2.0810 1.9547 3.0670 2.9659 4.0867 3.9309 5.0445      , (y₁., . . . ,y₂₀.) =                                     0.251 0.117 −1.170 −1.532 −1.471 −1.289 −0.259 −1.098 0.903 −0.639 2.022 3.771 4.620 7.731 7.409 2.280 9.107 13.457 17.784 24.354 1.465 6.646 10.798 14.410 19.976 1.982 6.993 11.499 16.937 21.567 5.674 13.799 21.814 30.496 39.932 2.055 7.280 14.994 19.060 22.942 5.436 10.756 20.050 30.312 35.811 4.825 13.262 21.981 29.531 38.479 6.975 19.493 30.814 42.024 53.244 5.159 15.304 27.075 37.801 47.510 3.978 13.889 23.322 31.381 42.676 5.160 17.225 29.040 42.796 53.938 6.515 21.040 33.741 49.448 62.569 11.707 29.466 49.589 68.192 87.503 6.078 22.272 37.262 52.430 65.752 10.216 26.370 46.310 64.047 80.195 9.751 29.300 46.661 66.522 84.010 10.239 24.600 43.285 60.447 77.882                                     ,

(16)

S =       41.921 3.293 0.951 0.720 3.357 3.293 45.926 −9.940 11.328 7.114 0.951 −9.940 46.946 10.198 −0.629 0.720 11.328 10.198 46.175 −8.015 3.357 7.114 −0.629 −8.015 58.362      , W =       43.722 −5.520 7.103 1.167 −3.000 −5.520 29.700 −2.892 −10.936 −2.106 7.103 −2.892 34.376 6.710 −0.493 1.167 −10.936 6.710 58.105 −3.625 −3.000 −2.106 −0.493 −3.625 13.327      . Based on these statistics, θ_i =βb_i+α_i can be predicted by

θi =yi.− ∆(yi.− βbi),

for an estimator ∆ of ∆ = Σ(Σ + rΣ_A)−1. We denote the predictors θ_i for the estimators

∆UB, ∆ST, ∆EM, ∆EM∗ by

pU B, pST, pEM, pEM∗, respectively.

The sample mean vector (y₁., . . . ,y₂₀.) corresponding to ∆ = 0 is also treated as a

predictor.

The predicted values by pST , pEM and pEM∗forθ₁,θ₁₁andθ₂₀are reported in Table 4 where θ_i = (θ_i1, θ_i2, θ_i3, θ_i4, θ_i5). For each i, the values of θ_i, y_i and βb_i are also given there. It is revealed in Table 4 that the predicted values for pST , pEM and pEM∗ are betweeny_i and βb_i because of shrinkingy_i towards βb_i. It is also seen that the predicted values by pEM∗ are, in most cases, closer to the true values ofθ_i than those byy_i. Since we know the true values of θ_i in this example, we can compute the prediction errors such as k_i=1(θ_i − θ_i)(θ_i− θ_i), which are given by {(y₁., . . . ,y₂₀.), 35.984}, {pUB, 27.232}, {pST, 23.032}, {pEM, 23.973} and {pEM∗_{, 20.364}_{}. The predictor pEM}∗_{with the}

Efron-Morris type truncated estimator ∆EM∗ has the smallest prediction error.

5 Estimation of Mean Matrix in a Fixed Eﬀects Model

In this section we consider the model (2.1)

y_ij =βb_i+α_i +_ij, i = 1, . . . , k, j = 1, . . . , r,

in which α_i are unknown parameters representing the ﬁxed eﬀects of the ith group. We shall assume that

k

i=1

αibi = 0,

and _ij are i.i.d. N_p(0, Σ). Thus, Cov (y_ij) = Σ. Our aim is to estimate Θ = (θ₁, . . . ,θ_k), where

(17)

Table 4: Predicted values of the Predictors for i = 2, 11, 20 θ_i1 θ_i2 θ_i3 θ_i4 θ_i5 θi -1.3065 -1.1717 -0.4064 -0.1669 -1.1618 y_i -1.2899 -0.2590 -1.0989 0.9033 -0.6391 βbi -0.2721 -0.4260 -0.5585 -0.7047 -0.8405 i = 2 U B -1.1024 -0.6585 -0.9092 0.4891 -0.9594 pST -0.9390 -0.6180 -0.8582 0.3247 -0.9145 pEM -0.9166 -0.6409 -0.8223 0.2111 -0.9592 pEM∗ -0.8906 -0.5870 -0.8319 0.2125 -0.8174 θi 6.389 19.253 29.775 41.484 53.350 y_i 6.975 19.493 30.814 42.024 53.244 βbi 6.805 18.282 30.256 42.196 53.491 i = 11 U B 6.862 18.946 30.679 41.762 53.310 pST 6.855 18.865 30.592 41.812 53.343 pEM 6.841 18.767 30.581 41.832 53.352 pEM∗ 6.849 18.783 30.578 41.832 53.394 θi 9.291 26.246 43.522 59.932 77.322 y_i 10.239 24.600 43.285 60.447 77.882 βbi 9.046 25.732 43.274 60.700 77.196 i = 20 U B 9.874 24.921 43.297 60.792 77.023 pST 9.729 25.054 43.301 60.790 77.109 pEM 9.675 25.113 43.293 60.799 76.993 pEM∗ 9.719 25.204 43.277 60.801 77.232

(18)

under the loss function

tr ( Θ− Θ)Σ−1( Θ− Θ),

where Θ is an estimator of Θ. In previous sections, several estimators dominating ∆₀

have been proposed, and from (2.3) and (2.5), it is seen that the resulting estimators of

Θ in the random eﬀects model are better than the estimator

Θ₀ = (y₁., . . . ,y_k.)− ∆₀Y ,# where #Y = y1.− βb1, . . . ,yk.− βbk and ∆₀ = (m− p − 1)(n + p + 1)−1SW−1. In this section, we demonstrates that these dominance results still hold in the ﬁxed eﬀects models.

Consider the fixed effects model (2.1) whereα₁, . . . ,α_k are p×1 unknown fixed effects such that k_i=1α_ib_i = 0. Let B = (b₁, . . . ,b_k) and assume that rank (B) = q₁ ≤ q < k. Letting P = B(BB)−B, we observe that

(y₁.− θ₁, . . . ,y_k.− θ_k)P =

β − βB,

(y₁.− θ₁, . . . ,y_k.− θ_k) (I_p− P ) = #Y − #α,

where α = (α# ₁, . . . ,α_k). When we look into estimators of the general form

Θ( ∆) = (y₁., . . . ,y_k.)− ∆ #Y (5.1) for p× p matrix ∆ = ∆(S, W ), the diﬀerence Θ( ∆)− Θ is written by

Θ( ∆)− Θ = (y₁.− θ₁, . . . ,y_k.− θ_k) (P + I_p− P ) − ∆ #Y = (β − β)B + # Y − #α − ∆ #Y .

Note that β, S and #Y (or W ) are mutually independent and that βB ∼ Np×kβB, r−1Σ,P ,

#

Y ∼ Np×kα, r# −1Σ,Ip − P ,

S ∼ Wp(Σ, n),

where rankBB = q₁ ≤ q and rank (I_p− P ) = m = k − q₁. Then the risk function of Θ

in terms of (2.2) is R_m( Θ( ∆), ω) =E_ω tr Σ−1(β − β)BB(β − β) + E_ω tr Σ−1( #Y − #α − ∆ #Y )( #Y − #α − ∆ #Y ) .

Since I_p− P is idempotent, there exists a k × m matrix Q₁ such that Q₁Q₁ =I_m and

Ip− P = Q1Q1. Deﬁne p× m matrices Z and µ by Z = √r #Y Q1 and µ = √rαQ# 1.

Then Z ∼ N_p×m(µ, Σ, I_m) and W = ZZ, so that R_m( Θ( ∆), ω) is expressed as

R_m( Θ( ∆), ω) = pq₁/r +(1/r)E_ω tr Z − µ − ∆(S, ZZ)Z Σ−1 Z − µ − ∆(S, ZZ)Z .

(19)

When the prior distribution ofµ is supposed as π : µ ∼ N_p×m(0, rΣ_A), the Bayes risk of Θ( ∆) is Eπ R_m( Θ( ∆), ω) =(pq₁+ pm)/r− (m/r)tr Σ(Σ + rΣ_A)−1 + (1/r)E_ω tr ∆− ∆ Σ−1 ∆− ∆ W =pk/r + (1/r)Eπ E_ω tr ∆Σ−1∆W − 2tr ∆Σ−1∆W .

If there exists an unbiased estimator R∗(S, ZZ) such that

Eπ E_ω R∗(S, ZZ) = Eπ E_ω tr ∆Σ−1∆W − 2tr ∆Σ−1∆W ,

the Bayes risk can be represented by

Eπ R_m( Θ( ∆), ω) = Eπ E_Σ,µµ pk/r + (1/r) R∗(S, ZZ) .

It is here noted that E_Σ,µµ[ R∗(S, ZZ)] is a function of Σ and µµ, and that µµ has

Wp(rΣA, m). Since the Wishart distribution is complete, the same arguments as used in

Efron and Morris (1996) shows that

R_m( Θ( ∆), ω) = E_Σ,µµ

pk/r + (1/r) R∗(S, ZZ)

.

Hence the risk function of Θ( ∆) in the ﬁxed eﬀects model can be derived automatically

from the risk of ∆ in the mixed linear model.

Proposition 5. In the ﬁxed eﬀects model, consider the problem of estimating the

unknown matrix of parameters Θ =β(b₁, . . . ,b_k) + (α₁, . . . ,α_k) by the estimator Θ( ∆)

given by (5.1) relative to the risk (2.2). For the scale-equivariant estimator ∆(Ψ) given

by (3.3), the unbiased estimator of the risk of Θ( ∆(Ψ)) is pk/r + (1/r)r( ∆(Ψ)), where

r( ∆(Ψ)) is given by (3.5).

Corollary 1. In the ﬁxed eﬀects model, the estimators Θ( ∆ST) and Θ( ∆EM) for

i = 1, 2, 3 dominate the estimator Θ₀ = Θ( ∆₀) for the risk (2.2).

For the case of b₁ =· · · = b_k= 0, Proposition 5 was given by Konno(1991). However, by using the arguments of Efron and Morris (1976), we obtain simpler proofs even in the general case.

6 An Extension of the Model and Remarks

We here consider the model given in (1.3) which is an extension of the model (2.1), and investigate whether the series of dominance results in the previous sections hold in the extended model.

(20)

A simple extension of the model is given by

y_ij =βb_ij +α_i+_ij, i = 1, . . . , k, j = 1, . . . , r, (6.1) whereb_ij’s are q× 1 known vectors and the other parameters and constants are the same as deﬁned in (2.1). Then the exponent in the joint distribution ofy_ij’s is written by

i,j (y_ij− βb_ij − α_i)Σ−1(y_ij − βb_ij − α_i) + i α iΣ−1A αi = i,j y_ij − y_i.− β(b_ij − b_i.) Σ−1y_ij − y_i.− β(b_ij − b_i.) + i (θ_i− θB_i )(rΣ−1+ Σ−1_A )(θ_i− θB_i ) + r i (y_i.− βb_i.)Σ−1₂ (y_i.− βb_i.),

where θ_i =βb_i. +α_i for b_i. = r−1_jb_ij and θB

i =yi.− ΣΣ−12

y_i.− βb_i. . (6.2)

Let U = (u₁₁, . . . ,u_1r;· · · ; u_k1, . . . ,u_kr) and C = (c₁₁, . . . ,c_1r;· · · ; c_k1, . . . ,c_kr) for

uij =yij− yi. and cij =bij − bi.. Also, let P2 =B(B B)−B and P1 =C(CC)−C

for B = (b₁., . . . ,b_k.). Then, i,j y_ij − y_i.− β(b_ij − b_i.) Σ−1 i,j y_ij − y_i.− β(b_ij − b_i.) = tr Σ−1(U − βC)(U − βC) = tr Σ−1S + tr Σ−1(β₁− β)CC(β₁− β),

where β₁ =UC(CC)− and S = (U − β₁C)(U − β₁C). Letting Y = (y₁., . . . ,y_k.),

we see that r i (y_i.− βb_i.)Σ−1₂ (y_i.− βb_i.) = rtr Σ−1₂ (Y − βB)(Y − βB) = tr Σ−1₂ W + rtr Σ−1₂ (β₂− β)B B(β₂− β),

where β₂ =Y B(B B)−andW = r(Y −β₂B)(Y −β₂B). Assuming that rank (CC) =

q₁ ≤ q and rank (B B) = q₂ ≤ q, we observe that S, W , β₁ and β₂ are mutually inde-pendent and that

S ∼ Wp(Σ, n), n = k(r− 1) − q1,

W ∼ Wp(Σ2, m), m = k− q2,

β₁C ∼ Np×kr(βC, Σ, P1),

(21)

where β_i has the degenerated normal distribution in the case that q_i < q for i = 1, 2.

As an empirical Bayes procedure suggested from (6.2), we consider the estimator θEB_i (β₂) =y_i.− ∆(y_i.− β₂b_i.), (6.3) where the estimator ∆ = Σ Σ−1₂ based on S and W is constructed. The risk of the estimator ΘEB(β₂) = (θEB₁ (β₂), . . . , θEB₁ (β₂)) is given by R_m( ΘEB(β₂), ω) = r−1E_ω tr ( ∆− ∆)Σ−1( ∆− ∆)W + r−1(pk− mtr ∆), so that all the dominance results in the previous sections can be applied to the model (6.1).

It is noted that the regression coeﬃcients β has two independent estimators β₁ and β₂ with diﬀerent covariance matrices. Hence it is natural to consider random weighted combined estimator β of the form

vec (β) = {(CC₎−_⊗_Σ_}−₊_{{(B B}₎−_{⊗ r}−1_Σ 2}− ₋ ×{(CC₎−_⊗_Σ_}−_{vec (}_β 1) +{(B B)−⊗ r−1Σ2}−vec (β2) ,

where vec (U) = (u₁, . . . ,u_q) for U = (u₁, . . . ,u_q) and ⊗ denotes the Kronecker prod-uct. However it is diﬃcult to study any exact dominance property for the combined estimator β.

A practically appealing model may be the case with unequal replications:

y_ij =βb_ij +α_i+_ij, i = 1, . . . , k, j = 1, . . . , n_i, (6.4) which was discussed by Fuller and Harter (1987) for estimation of small area. It seems, however, intractable to establish any exact dominance results in the model (6.4). The work of deriving eﬃcient estimators by using approximation (asymptotic) theories rests in the future.

Example 2. (Posted Land Price Data) We here treat the posted land price data in

Japan which are disclosed as the Digital National Information and also obtained from the web page of the Ministry of Land, Infrastructure and Transport. 45 small areas are chosen from Kanagawa prefecture next to Tokyo, and 5 spots are taken from each area. For the

j-th spot in the i-th small area, the posted land prices (Yen) per m2 of the spot in 2001 and 1996, denoted by y_ij1 and y_ij2 respectively, are observed with two covariates b_ij2 and

b_ij3 where b_ij2 is the distance from the spot to the nearby railway station and b_ij3 is the nearby distance from the station to the Tokyo station. All the data are transformed by logarithm before carrying out the analysis. Many people living in Kanagawa use railways to commute to Tokyo, so that the distances to the nearby station and to Tokyo aﬀect the land prices. In fact, the multiple correlation coeﬃcient between y_ij1’s and (b_ij2, b_ij3)’s is 0.7932.

(22)

Since the land price depends on the region such as city and town, we need to consider regional eﬀect in the model building. We thus employ the mixed linear model (6.1) to analyze the data, namely,

yij =βbij +αi+ij, i = 1, . . . , 45, j = 1, . . . , 5,

where p = 2, q = 3, k = 45, r = 5, y_ij = (y_ij1, y_ij2)t and b_ij = (b_ij1, b_ij2b_ij3)t for b_ij1 = 1. Then, n = k(r− 1) − q + 1 = 178 and m = k − q = 42. Also β₂, S and W are given by

β₂ = 15.204 −0.390 −0.196 15.569 −0.400 −0.212 , S = 3.488 4.885 4.885 7.560 , W = 3.018 3.510 3.510 4.685 ,

which yeild the unbiased estimates of Σ and Σ_A, given by Σ = 0.01959 0.02744 0.02744 0.04247 , Σ_A= 0.01045 0.01122 0.01122 0.01381 .

Calculating the modiﬁed likelihood ratio test for the hypothesis H₀ : Σ_A = 0 (see page 490 in Srivastava (2002)), we see that the null hypothesis is rejected, which implies that the regional eﬀects exist in the data.

Based on these statistics, the average land price θ_i =βb_i. +α_i in the i-th area can be estimated by (6.3)

θEB

i (β2) =yi.− ∆(yi.− β2bi.),

for an estimator ∆ of ∆ = Σ(Σ + rΣ_A)−1. Using the same notations as in Example 1, we denote the estimators θ_i for the estimators ∆UB, ∆ST and ∆EM by pU B, pST and pEM . The sample mean vector y_i., the regression synthetic estimate β₂b_i. and the

estimates by pU B, pST and pEM are reported in Table 5 for the nine choosed small areas,

i = 3, 8, 14, 17, 20, 25, 35, 36, 45, where the estimates given in Table 5 are transformed by

exponential, namely, they give estimates of average price (Yen) per m2. It is revealed in Table 5 that the estimates pU B, pST and pEM are between y_i. and β₂b_i. because of

shrinkingy_i. towards β₂b_i.. However, the shrinkage factors ∆UB, ∆ST and ∆EM are not so large as their values are given by

∆UB = −0.09717 0.30125 −0.43998 0.68319 , ∆ST = −0.07767 0.28610 −0.41786 0.66345 , ∆EM = −0.08156 0.29625 −0.43269 0.68586 .

It is also noted that the truncated rules (3.17) and (3.18) do not change the estimates for the data treated in this example, namely, ∆ST ∗ = ∆ST and ∆EM∗ = ∆EM.

Acknowledgments. The authors are grateful to Professor Y. Kanemoto, University

(23)

Table 5: Estimates of Average Land Prices

Area Year y_i. pU B pST pEM β₂b_i.

i = 3 2001 163,780 171,400 171,420 171,670 192,850 1996 192,830 206,240 206,160 206,600 236,380 i = 8 2001 228,900 230,240 230,130 230,180 226,250 1996 276,150 280,530 280,370 280,520 280,470 i = 14 2001 356,480 333,230 333,470 332,750 297,320 1996 494,000 441,170 441,860 440,250 372,460 i = 17 2001 334,830 318,560 318,310 317,820 267,080 1996 422,060 396,170 396,060 395,270 332,590 i = 20 2001 293,340 277,730 278,160 277,670 270,070 1996 414,270 372,660 373,510 372,220 336,410 i = 25 2001 278,510 264,090 263,740 263,310 211,900 1996 340,330 320,320 320,050 319,450 261,170 i = 35 2001 330,290 310,460 310,520 309,910 270,140 1996 438,920 398,650 399,000 397,780 334,950 i = 36 2001 262,970 278,630 278,850 279,350 339,050 1996 325,300 350,780 350,850 351,680 427,850 i = 45 2001 225,090 207,150 207,300 206,750 177,650 1996 307,290 268,120 268,580 267,410 216,110

(24)

ﬁrst author was supported in part by a grant from the Center for International Research on the Japanese Economy, the University of Tokyo, and by the Ministry of Education, Japan, Grant Nos. 11680320 and 13680371. The research of the second author was supported in part by Natural Sciences and Engineering Research Council of Canada. This work was done during the visit of the ﬁrst author to the University of Toronto, 1999 and 2002 summer.

REFERENCES

Amemiya, Y. (1994). On multivariate mixed model analysis. In: Multivariate Analysis

and Its Applications, IMS Lecture Notes - Monograph Series 24.

Battese, G.E., Harter, R.M. and Fuller, W.A. (1988). An error-components model for prediction of county crop areas using survey and satellite data. J. Amer. Statist. Assoc.,

83, 28-36.

Bilodeau, M. and Srivastava, M.S. (1992). Estimation of the eigenvalues of Σ₁Σ−1₂ . J.

Multivariate Anal., 41, 1-13.

Breiman, L. and Friedman, J.H. (1997). Predicting multivariate responses in multiple linear regression. J. Roy. Statist. Soc. Ser. B, 59 , 3-54.

Efron, B. and Morris, C. (1976). Multivariate empirical Bayes estimation of covariance matrices. Ann. Statist., 4, 22-32.

Fay, R.E. and Herriot, R. (1979). Estimates of income for small places: An application of James-Stein procedures to census data. J. Amer. Statist. Assoc., 74, 269-277.

Fuller and Harter (1987). The multivariate components of variance model for small area estimation. In: Small Area Statistics (R. Platek, J.N.K. Rao, C.E. Sarndal and M.P. Singh, eds.), John Wiley and Sons, New York.

Ghosh, M. and Rao, J.N.K. (1994). Small area estimation: An appraisal. Statist. Science,

9, 55-93.

Haﬀ, L.R. (1979). An identity for the Wishart distribution with applications. J.

Multi-variate Anal., 9, 531-542.

Konno, Y. (1990). Families of minimax estimators of matrix of normal means with un-known covariance matrix. J. Japan Statist. Soc., 20, 191-201.

Konno, Y. (1991). On estimation of a matrix of normal means with unknown covariance matrix. J. Multivariate Anal., 36, 44-55.

Konno, Y. (1992a). Improved estimation of matrix of normal mean and eigenvalues in the

multivariate F - distribution. Doctoral Dissertation, Institute of Mathematics, University

of Tsukuba.

Konno, Y. (1992b). On estimating eigenvalues of the scale matrix of the multivariate F distribution. Sankhya (Ser. A), 54, 241-251.

Kubokawa, T. and Srivastava, M.S. (1999). Prediction in multivariate mixed linear models with equal replications. Discussion Paper CIRJE-F-62, Faculty of Economics, University of Tokyo.

(25)

Loh, W.-L. (1988). Estimating covariance matrices. Ph.D. thesis. Stanford University. Prasad, N.G.N. and Rao, J.N.K. (1990). The estimation of the mean squared error of small-area estimators. J. Amer. Statist. Assoc., 85, 163-171.

Rao, C.R. (1975). Simultaneous estimation of parameters in diﬀerent linear models and applications to biometric problems. Biometrics, 31, 545-554.

Reinsel, G.C. (1985). Mean squared error properties of empirical Bayes estimators in a multivariate random eﬀects general linear model. J. Amer. Statist. Assoc., 80, 642-650. Srivastava, M.S. (2002). Methods of Multivariate Statistics, Wiley, New York.

Srivastava, M.S. and Solanky, T.K.S. (2002). Predicting multivariate response for linear regression model. Commun. Statist.-Comp. Simul., to appear.

Stein, C. (1977). Lectures on multivariate estimation theory. (In Russian.) In

Investi-gation on Statistical Estimation Theory I, 4-65. Zapiski Nauchych Seminarov LOMI im.

V.A. Steklova AN SSSR vol. 74, Leningrad.(In Russian) Tatsuya Kubokawa

Faculty of Economics, University of Tokyo, Hongo, Bunkyo-ku, Tokyo 113, JAPAN E-Mail: [email protected], Fax: +81-3-5841-5521, Tel: +81-3-5841-5656 M. S. Srivastava

Department of Statistics, University of Toronto, 100 St George Street, Toronto, Ontario, CANADA M5S 3G3

E-Mail: [email protected], Fax: +1-416-978-5133, Tel: +1-416-978-4450