
FITTING GENERAL LINEAR MODEL FOR LONGITUDINAL SURVEY DATA UNDER INFORMATIVE SAMPLING

ABDULHAKEEM A.H. EIDEH1

AL-QUDS UNIVERSITY, PALESTINE

Statistics in Transition-New Series, Vol 11, No 3, 2010

ABSTRACT

The purpose of this article is to account for informative sampling in fitting superpopulation model for multivariate observations, and in particular multivariate normal distribution, for longitudinal survey data. The idea behind the proposed approach is to extract the model holding for the sample data as a function of the model in the population and the first order inclusion probabilities, and then fit the sample model using maximum likelihood, pseudo maximum likelihood and estimating equations methods. As an application of the results, we fit the general linear model for longitudinal survey data under informative sampling using different covariance structures: the exponential correlation model, the uniform correlation model, and the random effect model, and using different conditional expectations of first order inclusion probabilities given the study variable. The main feature of the present estimators is their behaviours in terms of the informativeness parameters.

Key words: General Linear Model, Informative sampling, Longitudinal Survey Data, Maximum Likelihood, and Sample distribution.

1Address for correspondence: Dr. ABDULHAKEEM A.H. EIDEH, Department of Mathematics, College of Science and Technology, Al-Quds University, Abu-Dies campus, Palestine, P.O. Box 20002, Jerusalem. E-mail: [email protected].

1. Introduction

Sampling designs for surveys are often complex and informative, in the sense that the selection probabilities are correlated with the variables of interest, even when conditioned on explanatory variables. In this case conventional analysis that disregards the informativeness can be seriously biased, since the sample distribution differs from that of the population. Most of the studies in social surveys are based on data collected from complex sampling designs. Standard analysis of survey data often fails to account for the complex nature of the sampling design such as the use of unequal selection probabilities, clustering, and post-stratification. The effect of the sample design on the analysis is due to the fact that the models in use typically do not incorporate all the design variables determining the sample selection, either because there may be too many of them or because they are not of substantive interest.

However, if the sampling design is informative in the sense that the outcome variable (variable of interest) is correlated with the design variables not included in the model, even after conditioning on the model covariates, standard estimates of the model parameters can be severely biased, leading possibly to false inference. Pfeffermann (1993, 1996) reviews many examples reported in the literature that illustrate the effects of ignoring the sampling process when fitting models to survey data and discusses methods that have been proposed to deal with this problem; see also Skinner, Holt, and Smith (1989), Kasprzyk, Duncan, Kalton and Singh (1989), Hoem (1989), and Chambers and Skinner (2003). It should be emphasized that standard inference may be biased even when the original sample is a simple random sample, because non-response, attrition and imperfect frames result in de facto a posteriori differential inclusion probabilities.

To overcome the difficulties associated with the use of classical inference procedures for cross sectional survey data, Pfeffermann, Krieger and Rinott (1998) proposed the use of the sample distribution induced by the assumed population models, under informative sampling, and developed expressions for its calculation. Similarly, Eideh and Nathan (2006) fitted time series models for longitudinal survey data under informative sampling. Furthermore, Eideh (2008) fitted random effects or subject-specific effects models for analyzing normal data, which are assumed to be correlated, under the concept of informative sampling.

The plan of this paper is as follows. In Section 2 we define the sample distribution and the sample likelihood. In Section 3 we derive the sample distribution of the multivariate normal distribution under informative sampling. In Section 4 we fit the general linear model for longitudinal survey data. Section 5 provides a discussion of the results.

2. Sample distribution and sample likelihood

Let $U=\{1,\dots,N\}$ denote a finite population consisting of $N$ units. Let $y$ be the target or study variable of interest and let $y_i$ be the value of $y$ for the $i$th population unit. Let $x_i$, $i\in U$, be the value of an auxiliary variable(s), $x$, and $z=\{z_1,\dots,z_N\}$ be the values of a known design variable, used for the sample selection process but not included in the working model under consideration. In what follows we consider a sampling design with selection probabilities $\pi_i=\Pr(i\in s)>0$ and sampling weights $w_i=1/\pi_i$, $i=1,\dots,N$. In practice, the $\pi_i$'s may depend on the population values $(x,y,z)$. We express this by writing $\pi_i=\Pr(i\in s\mid x,y,z)$. The sample $s$ consists of the subset of $U$ selected at random by the sampling scheme with inclusion probabilities $\pi_1,\dots,\pi_N$. Denote by $I=(I_1,\dots,I_N)'$ the $N$ by 1 sample indicator (vector) variable such that $I_i=1$ if unit $i\in U$ is selected to the sample and $I_i=0$ otherwise. The sample $s$ is defined accordingly as $s=\{i\mid i\in U, I_i=1\}$ and its complement by $c=\bar s=\{i\mid i\in U, I_i=0\}$. We assume probability sampling, so that $\pi_i=\Pr(i\in s)>0$ for all units $i\in U$.

We now consider the population values $y_1,\dots,y_N$ as random variables, which are independent realizations from a distribution with probability density function $f_p(y_i\mid x_i,\theta)$, indexed by a vector parameter $\theta$.

According to Krieger and Pfeffermann (1997), the (marginal) sample probability density function of $y_i$ is defined as:

$$f_s(y_i\mid x_i,\theta,\gamma)=f_p(y_i\mid x_i,\theta,\gamma,i\in s)=\frac{\Pr(i\in s\mid x_i,y_i,\gamma)\,f_p(y_i\mid x_i,\theta)}{\Pr(i\in s\mid x_i,\theta,\gamma)}=\frac{E_p(\pi_i\mid x_i,y_i,\gamma)\,f_p(y_i\mid x_i,\theta)}{E_p(\pi_i\mid x_i,\theta,\gamma)} \tag{1}$$

where $\theta$ is the parameter of the population distribution, $\gamma$ is the parameter indexing $\Pr(i\in s\mid x_i,y_i,\gamma)$ and

$$E_p(\pi_i\mid x_i,\theta,\gamma)=\int E_p(\pi_i\mid y_i,x_i,\gamma)\,f_p(y_i\mid x_i,\theta)\,dy_i .$$

Having derived the sample distribution, Pfeffermann, Krieger and Rinott (1998) proved that if the population measurements $y_i$ are independent, then as $N\to\infty$ (with $n$ fixed, where $n$ is the sample size), the sample measurements are asymptotically independent, so we can apply standard inference procedures to complex survey data by using the marginal sample distribution for each unit. Based on the sample data $\{y_i,x_i,w_i;\ i\in s\}$, we can estimate the parameters of the population model in two steps:

Step-one: According to Pfeffermann and Sverchkov (1999), estimate the informativeness parameters $\gamma$ using the following relationship:

$$E_s(w_i\mid x_i,y_i)=\frac{1}{E_p(\pi_i\mid x_i,y_i)} \tag{2}$$

Thus the informativeness parameters can be estimated using regression analysis. Denote the resulting estimate of $\gamma$ by $\tilde\gamma$.

Step-two: Substitute $\tilde\gamma$ in the sample log-likelihood function, and then maximize the resulting sample log-likelihood function with respect to the population parameters, $\theta$:

$$l_{rs}(\theta,\tilde\gamma)=l_{srs}(\theta)-\sum_{i=1}^{n}\log E_p(\pi_i\mid x_i,\theta,\tilde\gamma)=l_{srs}(\theta)+\sum_{i=1}^{n}\log E_s(w_i\mid x_i,\theta,\tilde\gamma) \tag{3}$$

where $l_{rs}(\theta,\tilde\gamma)$ is the sample log-likelihood after substituting $\tilde\gamma$ in the sample log-likelihood function, and $l_{srs}(\theta)=\sum_{i\in s}\log\{f_p(y_i\mid x_i,\theta)\}$ is the classical log-likelihood obtained by ignoring the sample design.

3. Multivariate normal distribution under informative sampling

Most classical methods of multivariate analysis of continuous data are based on an examination of the structure of population mean vectors and covariance matrices. We consider the problem of estimating the mean vector $\mu$ and the covariance matrix $V$ of a vector of study variables $y$, from survey data obtained under informative sampling. The original theoretical basis for this is the multivariate normal distribution. The existing work in this area deals with the estimation problem when the sampling scheme is noninformative; see for example Smith and Holmes (1989).

The following theorem focuses on multivariate normal distribution under different modeling of the population conditional expectation of first order inclusion probabilities.

Theorem 1. Assume that the population distribution is q-dimensional multivariate normal with mean vector $\mu$ and covariance matrix $V$, that is:

$$y_i=(y_{i1},\dots,y_{iq})'\underset{p}{\sim}N_q(\mu,V),\quad i=1,\dots,N .$$

Let $E_p(y_{ij})=\mu_j$ and $Cov_p(y_{ij},y_{ik})=v_{jk}$, $i=1,\dots,N$; $j,k=1,2,\dots,q$.

1. Under the exponential inclusion probability model (exponential sampling):

$$E_p(\pi_i\mid y_i)=\exp(a_0+a'y_i)=\exp\Big(a_0+\sum_{j=1}^{q}a_jy_{ij}\Big) \tag{4}$$

the sample probability density function of $y_i$ is:

$$f_s(y_i)=(2\pi)^{-0.5q}|V|^{-0.5}\exp\big\{-0.5\,\big(y_i-(\mu+Va)\big)'V^{-1}\big(y_i-(\mu+Va)\big)\big\} \tag{5}$$

That is, $y_i=(y_{i1},\dots,y_{iq})'\underset{s}{\sim}N_q(\mu+Va,V)$, $i=1,\dots,n$. So that,

$$E_s(y_i)=E_p(y_i\mid i\in s)=\mu+Va=E_p(y_i)+Va$$

and

$$Var_s(y_i)=Var_p(y_i\mid i\in s)=V=Var_p(y_i).$$

2. Under the linear inclusion probability model (linear sampling):

$$E_p(\pi_i\mid y_i)=b_0+b'y_i=b_0+\sum_{j=1}^{q}b_jy_{ij} \tag{6}$$

the sample probability density function of $y_i$ is:

$$f_s(y_i)=\frac{b_0+b'y_i}{b_0+b'\mu}\,(2\pi)^{-0.5q}|V|^{-0.5}\exp\big\{-0.5\,(y_i-\mu)'V^{-1}(y_i-\mu)\big\} \tag{7}$$

Furthermore,

$$E_s(y_i)=\mu+\frac{Vb}{b_0+b'\mu}=E_p(y_i)+\frac{Vb}{b_0+b'\mu} \tag{8a}$$

and

$$Var_s(y_i)=Cov_s(y_i)=V-\frac{(Vb)(Vb)'}{(b_0+b'\mu)^2}=Var_p(y_i)-\frac{(Vb)(Vb)'}{(b_0+b'\mu)^2} \tag{8b}$$

Proofs:

1. As an extension of equation (1), the sample probability density function of $y_i$ is given by:

$$f_s(y_i)=f_p(y_i\mid i\in s)=\frac{E_p(\pi_i\mid y_i)\,f_p(y_i)}{E_p(\pi_i)} \tag{9}$$

So that,

$$f_s(y_i)=\frac{\exp(a'y_i)\,(2\pi)^{-0.5q}|V|^{-0.5}\exp\{-0.5\,(y_i-\mu)'V^{-1}(y_i-\mu)\}}{\exp(a'\mu+0.5\,a'Va)}=(2\pi)^{-0.5q}|V|^{-0.5}\exp(-0.5\,a'Va)\exp(-0.5\,Q) \tag{10}$$

where $Q=(y_i-\mu)'V^{-1}(y_i-\mu)-2a'(y_i-\mu)$.

Setting $C_i=y_i-\mu$, $Q$ can be written as $Q=C_i'V^{-1}C_i-2a'C_i$. Using theorems in multivariate statistical analysis, see Johnson and Wichern (1998), page 68, we have:

$$Q=C_i'V^{-0.5}V^{-0.5}C_i-2a'V^{0.5}V^{-0.5}C_i .$$

Now let $D_i=V^{-0.5}C_i$; then:

$$Q=D_i'D_i-2a'V^{0.5}D_i=D_i'D_i-2a'V^{0.5}D_i+a'V^{0.5}V^{0.5}a-a'V^{0.5}V^{0.5}a=(D_i-V^{0.5}a)'(D_i-V^{0.5}a)-a'Va .$$

But $C_i=y_i-\mu$ and $D_i=V^{-0.5}C_i$, so that $Q$ can be expressed equivalently as:

$$Q=\big(V^{-0.5}(y_i-\mu)-V^{0.5}a\big)'\big(V^{-0.5}(y_i-\mu)-V^{0.5}a\big)-a'Va=\big(y_i-(\mu+Va)\big)'V^{-1}\big(y_i-(\mu+Va)\big)-a'Va .$$

Thus after substituting this expression of $Q$ in (10), we get (5).

Hence the multivariate normal distribution in the sample is the same as in the population, except that the mean is shifted by the constant $Va$. Notice that the sample probability density function does not depend on $a_0$. Note also:

$$E_s(y_{ij})=\mu_j+a_1v_{j1}+\dots+a_jv_{jj}+\dots+a_qv_{jq} \tag{11a}$$

and

$$Var_s(y_{ij})=v_{jj},\qquad Cov_s(y_{ij},y_{ik})=v_{jk},\quad j\neq k=1,\dots,q \tag{11b}$$

2. Substituting (6) in (9), we get (7).

Let us now compute the first two moments of this sample pdf. In order to do this we will use the moment generating function technique. The moment generating function of the sample probability density function is given by:

$$M_s(u_i)=E_s\big(\exp(u_i'y_i)\big)=\int\exp(u_i'y_i)\,\frac{b_0+b'y_i}{b_0+b'\mu}\,f_p(y_i)\,dy_i=\frac{b_0M_p(u_i)+b'\,dM_p(u_i)/du_i}{b_0+b'\mu}$$

where

$$M_p(u_i)=\exp(u_i'\mu+0.5\,u_i'Vu_i)$$

and

$$\frac{dM_p(u_i)}{du_i}=M_p(u_i)\,(\mu+Vu_i).$$

Thus we have:

$$M_s(u_i)=M_p(u_i)\,\frac{b_0+b'(\mu+Vu_i)}{b_0+b'\mu} \tag{12}$$

This equation gives explicitly the relationship between the population and the sample moment generating functions. Notice that the sample and population moment generating functions are different unless $b=0$, in which case the sampling mechanism is noninformative.

Let $R_s(u_i)=\log M_s(u_i)$. Differentiating this expression twice and setting $u_i=0$, we get (8a) and (8b).

From (8a) and (8b), we can see that:

$$E_s(y_{ij})=\mu_j+\frac{b_1v_{j1}+\dots+b_jv_{jj}+\dots+b_qv_{jq}}{b_0+b_1\mu_1+\dots+b_j\mu_j+\dots+b_q\mu_q} \tag{13a}$$

$$Var_s(y_{ij})=v_{jj}-\frac{(b_1v_{j1}+\dots+b_jv_{jj}+\dots+b_qv_{jq})^2}{(b_0+b_1\mu_1+\dots+b_q\mu_q)^2} \tag{13b}$$

$$Cov_s(y_{ij},y_{ik})=v_{jk}-\frac{(b_1v_{j1}+\dots+b_jv_{jj}+\dots+b_qv_{jq})(b_1v_{k1}+\dots+b_jv_{kj}+\dots+b_qv_{kq})}{(b_0+b_1\mu_1+\dots+b_q\mu_q)^2} \tag{13c}$$

for $i=1,\dots,n$, $j\neq k=1,2,\dots,q$.

Also $Var_s(y_{ij})\le Var_p(y_{ij})$ and the equality holds if and only if $b=0$, that is when the sampling mechanism is noninformative. On the other hand, if $b\neq0$, then the means, the variances and the covariances change, in contrast to what happens for the sample probability density function (5), where only the means change.
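The two parts of Theorem 1 can be checked numerically. The sketch below is only an illustration under stated assumptions: the population is simulated, units are selected by independent (Poisson) draws with probabilities given exactly by (4) or (6), and all function names and parameter values are hypothetical choices, not part of the paper.

```python
import numpy as np

def sample_moments_exponential(mu, V, a):
    """Theorem 1, part 1: the sample law is N_q(mu + V a, V)."""
    mu, V, a = (np.asarray(z, dtype=float) for z in (mu, V, a))
    return mu + V @ a, V.copy()

def sample_moments_linear(mu, V, b0, b):
    """Theorem 1, part 2: sample mean (8a) and covariance (8b)."""
    mu, V, b = (np.asarray(z, dtype=float) for z in (mu, V, b))
    denom = b0 + b @ mu
    Vb = V @ b
    return mu + Vb / denom, V - np.outer(Vb, Vb) / denom**2

def poisson_sample(y, pi, rng):
    """Select each unit independently with probability pi[i] (assumed design)."""
    return y[rng.random(len(y)) < pi]

rng = np.random.default_rng(0)
mu = np.array([1.0, 2.0])
V = np.array([[1.0, 0.6], [0.6, 2.0]])
y_pop = rng.multivariate_normal(mu, V, size=400_000)

# Exponential sampling, equation (4); a0 only scales the sampling fraction.
a0, a = -6.0, np.array([0.5, 0.2])
y_exp = poisson_sample(y_pop, np.minimum(np.exp(a0 + y_pop @ a), 1.0), rng)

# Linear sampling, equation (6); b0 is large relative to b so pi stays in (0, 1).
b0, b = 0.05, np.array([0.005, 0.002])
y_lin = poisson_sample(y_pop, np.clip(b0 + y_pop @ b, 0.0, 1.0), rng)

mean_exp, cov_exp = sample_moments_exponential(mu, V, a)
mean_lin, cov_lin = sample_moments_linear(mu, V, b0, b)
print("exponential: theory", mean_exp, "simulated", y_exp.mean(axis=0))
print("linear:      theory", mean_lin, "simulated", y_lin.mean(axis=0))
```

In this setup the simulated sample means should sit close to the theoretical sample means rather than to the population mean, while under exponential sampling the sample covariance stays close to $V$.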

To illustrate the results, from now on, we only consider the particular cases of the exponential inclusion probability model – exponential sampling, see equation (4). The following are special cases of Theorem 1.

Corollary 1. (Univariate normal distribution, q=1)

Let $E_p(y_{i1})=\mu_1$ and $Var_p(y_{i1})=v_{11}$. Under the exponential sampling:

$$E_p(\pi_i\mid y_{i1})=\exp(a_0+a_1y_{i1}) \tag{14}$$

We get the following result:

$$y_{i1}\underset{s}{\sim}N(\mu_1+a_1v_{11},\,v_{11}) \tag{15}$$

Corollary 2. (Bivariate normal distribution, q=2).

Let $E_p(y_{i1})=\mu_1$, $E_p(y_{i2})=\mu_2$, $Var_p(y_{i1})=v_{11}$, $Var_p(y_{i2})=v_{22}$, $Cov_p(y_{i1},y_{i2})=v_{12}$ and $Cor_p(y_{i1},y_{i2})=\rho_{12}$. Under the exponential sampling:

$$E_p(\pi_i\mid y_{i1},y_{i2})=\exp(a_0+a_1y_{i1}+a_2y_{i2}) \tag{16}$$

and using the properties of multivariate normal distribution, see Johnson and Wichern (1998), page 171, we obtain the following results:

1. The joint sample probability density function of $(y_{i1},y_{i2})$ is:

$$(y_{i1},y_{i2})'\underset{s}{\sim}N_2\left(\begin{pmatrix}\mu_1+a_1v_{11}+a_2v_{12}\\ \mu_2+a_1v_{21}+a_2v_{22}\end{pmatrix},\begin{pmatrix}v_{11}&v_{12}\\ v_{21}&v_{22}\end{pmatrix}\right) \tag{17a}$$

2. The sample marginal probability density functions of $y_{i1}$ and $y_{i2}$ are respectively given by:

$$y_{i1}\underset{s}{\sim}N(\mu_1+a_1v_{11}+a_2v_{12},\,v_{11}) \tag{17b}$$

and

$$y_{i2}\underset{s}{\sim}N(\mu_2+a_1v_{21}+a_2v_{22},\,v_{22}) \tag{17c}$$

3. The conditional sample probability density function of $y_{i1}$ given $y_{i2}$ is univariate normal with conditional mean:

$$\begin{aligned}E_s(y_{i1}\mid y_{i2})&=E_s(y_{i1})+\rho_{12}\Big(\frac{v_{11}}{v_{22}}\Big)^{0.5}\big(y_{i2}-E_s(y_{i2})\big)\\&=\mu_1+a_1v_{11}+a_2v_{12}+\rho_{12}\Big(\frac{v_{11}}{v_{22}}\Big)^{0.5}\big(y_{i2}-\mu_2-a_1v_{21}-a_2v_{22}\big)\\&=\mu_1+\rho_{12}\Big(\frac{v_{11}}{v_{22}}\Big)^{0.5}(y_{i2}-\mu_2)+a_1v_{11}(1-\rho_{12}^2)\end{aligned} \tag{17d}$$

and conditional variance:

$$V_s(y_{i1}\mid y_{i2})=v_{11}(1-\rho_{12}^2)=V_p(y_{i1}\mid y_{i2}) \tag{17e}$$

That is,

$$y_{i1}\mid y_{i2}\underset{s}{\sim}N\Big(\mu_1+\rho_{12}\Big(\frac{v_{11}}{v_{22}}\Big)^{0.5}(y_{i2}-\mu_2)+a_1v_{11}(1-\rho_{12}^2),\ v_{11}(1-\rho_{12}^2)\Big) \tag{17f}$$

Notice that:

$$E_s(y_{i1}\mid y_{i2})=E_p(y_{i1}\mid y_{i2})+a_1V_p(y_{i1}\mid y_{i2}) \tag{17g}$$

4. Similarly, the conditional sample probability density function of $y_{i2}$ given $y_{i1}$ is:

$$y_{i2}\mid y_{i1}\underset{s}{\sim}N\Big(\mu_2+\rho_{12}\Big(\frac{v_{22}}{v_{11}}\Big)^{0.5}(y_{i1}-\mu_1)+a_2v_{22}(1-\rho_{12}^2),\ v_{22}(1-\rho_{12}^2)\Big) \tag{17h}$$

Thus the sample and population probability density functions are different, but belong to the same family of distributions, which is normal. Also the change occurs only for the means and conditional means, whereas the variances, conditional variances, and covariances do not change.

In particular if $a_2=0$, that is the inclusion probabilities depend only on $y_{i1}$ (which is the case in panel surveys), then we have:

1. The joint sample probability density function of $(y_{i1},y_{i2})$ is:

$$(y_{i1},y_{i2})'\underset{s}{\sim}N_2\left(\begin{pmatrix}\mu_1+a_1v_{11}\\ \mu_2+a_1v_{12}\end{pmatrix},\begin{pmatrix}v_{11}&v_{12}\\ v_{21}&v_{22}\end{pmatrix}\right) \tag{18a}$$

2. The marginal sample probability density functions of $y_{i1}$ and $y_{i2}$ are respectively given by:

$$y_{i1}\underset{s}{\sim}N(\mu_1+a_1v_{11},\,v_{11}) \tag{18b}$$

and

$$y_{i2}\underset{s}{\sim}N(\mu_2+a_1v_{12},\,v_{22}) \tag{18c}$$

3. The conditional sample probability density function of $y_{i1}$ given $y_{i2}$ is:

$$y_{i1}\mid y_{i2}\underset{s}{\sim}N\Big(\mu_1+a_1v_{11}(1-\rho_{12}^2)+\rho_{12}\Big(\frac{v_{11}}{v_{22}}\Big)^{0.5}(y_{i2}-\mu_2),\ v_{11}(1-\rho_{12}^2)\Big) \tag{18d}$$

4. The conditional sample probability density function of $y_{i2}$ given $y_{i1}$ is:

$$y_{i2}\mid y_{i1}\underset{s}{\sim}N\Big(\mu_2+\rho_{12}\Big(\frac{v_{22}}{v_{11}}\Big)^{0.5}(y_{i1}-\mu_1),\ v_{22}(1-\rho_{12}^2)\Big) \tag{18e}$$

Notice that the sample and population probability density functions of $y_{i2}\mid y_{i1}$ are the same, while the other sample and population distributions are different.

Birnbaum et al. (1950) studied the effect of selection performed on some coordinates of a multi-dimensional population, but from a different point of view.
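The bivariate results of Corollary 2 reduce to a few arithmetic shifts, which the following sketch collects. It is an illustration only: the function names and the numerical values are hypothetical, and the formulas coded are (17b), (17c), (17d), (17g) and (17h).

```python
import math

def bivariate_exponential_sampling(mu1, mu2, v11, v22, v12, a1, a2=0.0):
    """Sample means and conditional-mean shifts under exponential sampling, q = 2."""
    rho2 = v12 ** 2 / (v11 * v22)              # squared correlation rho_12^2
    return {
        "mean_y1": mu1 + a1 * v11 + a2 * v12,           # (17b)
        "mean_y2": mu2 + a1 * v12 + a2 * v22,           # (17c)
        "shift_y1_given_y2": a1 * v11 * (1.0 - rho2),   # (17g)
        "shift_y2_given_y1": a2 * v22 * (1.0 - rho2),   # (17h)
        "rho": v12 / math.sqrt(v11 * v22),
    }

def cond_mean_y1_given_y2(y2, mu1, mu2, v11, v22, v12, a1, a2=0.0):
    """Conditional sample mean (17d); it does not depend on a2."""
    rho2 = v12 ** 2 / (v11 * v22)
    return mu1 + (v12 / v22) * (y2 - mu2) + a1 * v11 * (1.0 - rho2)

# Panel case a2 = 0: selection depends on the first measurement only.
panel = bivariate_exponential_sampling(0.0, 0.0, 1.0, 4.0, 1.0, a1=0.4)
print(panel)
```

With $a_2=0$ the entry `shift_y2_given_y1` is zero, which is the statement that the distribution of $y_{i2}\mid y_{i1}$ is unchanged by the selection.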


Corollary 3. (Bivariate normal distribution, q=2).

Let $E_p(y_{i1})=\mu_1$, $E_p(y_{i2})=\mu_2$, $Var_p(y_{i1})=v_{11}$, $Var_p(y_{i2})=v_{22}$, $Cov_p(y_{i1},y_{i2})=v_{12}$ and $Cor_p(y_{i1},y_{i2})=\rho_{12}$. Under the linear model:

$$E_p(\pi_i\mid y_{i1},y_{i2})=b_0+b_1y_{i1}+b_2y_{i2}$$

and using the properties of multivariate normal distribution, we have the following results:

1. The joint sample probability density function of $(y_{i1},y_{i2})$ is:

$$f_s(y_{i1},y_{i2})=\frac{b_0+b_1y_{i1}+b_2y_{i2}}{b_0+b_1\mu_1+b_2\mu_2}\,f_p(y_{i1},y_{i2}) \tag{19a}$$

2. Integrating (19a) with respect to $y_{i2}$, we have:

$$\begin{aligned}f_s(y_{i1})&=\int f_s(y_{i1},y_{i2})\,dy_{i2}=\int\frac{b_0+b_1y_{i1}+b_2y_{i2}}{b_0+b_1\mu_1+b_2\mu_2}\,f_p(y_{i1},y_{i2})\,dy_{i2}\\&=\frac{(b_0+b_1y_{i1})\,f_p(y_{i1})+b_2\,f_p(y_{i1})\,E_p(y_{i2}\mid y_{i1})}{b_0+b_1\mu_1+b_2\mu_2}\\&=\frac{b_0+b_1y_{i1}+b_2\Big(\mu_2+\rho_{12}\big(\frac{v_{22}}{v_{11}}\big)^{0.5}(y_{i1}-\mu_1)\Big)}{b_0+b_1\mu_1+b_2\mu_2}\,f_p(y_{i1})\end{aligned} \tag{19b}$$

3. Similarly, integrating (19a) with respect to $y_{i1}$, we obtain the following marginal sample probability density function of $y_{i2}$:

$$f_s(y_{i2})=\frac{b_0+b_2y_{i2}+b_1\Big(\mu_1+\rho_{12}\big(\frac{v_{11}}{v_{22}}\big)^{0.5}(y_{i2}-\mu_2)\Big)}{b_0+b_1\mu_1+b_2\mu_2}\,f_p(y_{i2}) \tag{19c}$$

4. Using (19a), (19b), and (19c), we get the following conditional sample probability density function of $y_{i1}$ given $y_{i2}$:

$$f_s(y_{i1}\mid y_{i2})=\frac{f_s(y_{i1},y_{i2})}{f_s(y_{i2})}=\frac{b_0+b_1y_{i1}+b_2y_{i2}}{b_0+b_2y_{i2}+b_1E_p(y_{i1}\mid y_{i2})}\,f_p(y_{i1}\mid y_{i2}) \tag{19d}$$

where

$$E_p(y_{i1}\mid y_{i2})=\mu_1+\rho_{12}\Big(\frac{v_{11}}{v_{22}}\Big)^{0.5}(y_{i2}-\mu_2).$$

5. Similarly, the conditional sample probability density function of $y_{i2}$ given $y_{i1}$ is:

$$f_s(y_{i2}\mid y_{i1})=\frac{b_0+b_1y_{i1}+b_2y_{i2}}{b_0+b_1y_{i1}+b_2E_p(y_{i2}\mid y_{i1})}\,f_p(y_{i2}\mid y_{i1}) \tag{19e}$$

where

$$E_p(y_{i2}\mid y_{i1})=\mu_2+\rho_{12}\Big(\frac{v_{22}}{v_{11}}\Big)^{0.5}(y_{i1}-\mu_1).$$

Thus in this case we see that the sample probability density functions and population probability density functions are very different. Also notice that formulas (19b, c) are true, not only for the bivariate normal distribution, but for any joint probability density function of $(y_{i1},y_{i2})$, provided that the marginal and conditional distributions and the corresponding moments exist.
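The marginal formula (19b) can be checked against a direct numerical integration of the joint sample density (19a). The sketch below assumes a bivariate normal population and uses a plain Riemann sum on a wide grid; the helper names and the parameter values are hypothetical.

```python
import numpy as np

def normal_pdf(y, mean, var):
    return np.exp(-0.5 * (y - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def bivariate_normal_pdf(y1, y2, mu1, mu2, v11, v22, v12):
    det = v11 * v22 - v12 ** 2
    d1, d2 = y1 - mu1, y2 - mu2
    quad = (v22 * d1 ** 2 - 2.0 * v12 * d1 * d2 + v11 * d2 ** 2) / det
    return np.exp(-0.5 * quad) / (2.0 * np.pi * np.sqrt(det))

def marginal_sample_pdf_y1(y1, mu1, mu2, v11, v22, v12, b0, b1, b2):
    """Closed form (19b); note rho_12 * sqrt(v22 / v11) equals v12 / v11."""
    cond_mean = mu2 + (v12 / v11) * (y1 - mu1)      # E_p(y_i2 | y_i1)
    ratio = (b0 + b1 * y1 + b2 * cond_mean) / (b0 + b1 * mu1 + b2 * mu2)
    return ratio * normal_pdf(y1, mu1, v11)

def marginal_by_integration(y1, mu1, mu2, v11, v22, v12, b0, b1, b2, m=4001):
    """Integrate the joint sample pdf (19a) over y_i2 with a Riemann sum."""
    half = 10.0 * np.sqrt(v22)                      # grid covers +/- 10 sd of y_i2
    y2 = np.linspace(mu2 - half, mu2 + half, m)
    joint = ((b0 + b1 * y1 + b2 * y2) / (b0 + b1 * mu1 + b2 * mu2)
             * bivariate_normal_pdf(y1, y2, mu1, mu2, v11, v22, v12))
    return joint.sum() * (y2[1] - y2[0])

params = dict(mu1=1.0, mu2=2.0, v11=1.0, v22=2.0, v12=0.6, b0=5.0, b1=0.5, b2=0.3)
for y1 in (-1.0, 0.5, 1.0, 2.5):
    print(y1, marginal_sample_pdf_y1(y1, **params), marginal_by_integration(y1, **params))
```

The two printed columns should agree to numerical precision at each evaluation point.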

The following theorem provides the maximum likelihood estimators of the mean vector $\mu$ and covariance matrix $V$.

Theorem 2. Assume $y_i=(y_{i1},\dots,y_{iq})'\underset{p}{\sim}N_q(\mu,V)$, $i=1,\dots,N$, are independent. Let $y_1,\dots,y_n$ be a sample of size $n$ selected by informative sampling.

1. Under the exponential sampling, equation (4), the maximum likelihood estimators of $\mu$ and $V$ are given by:

$$\hat\mu=\bar y-\hat V\tilde a \tag{20a}$$

and

$$\hat V=n^{-1}\sum_{i=1}^{n}(y_i-\bar y)(y_i-\bar y)'=\{\hat V_{jk}\}=\{s_{jk}\} \tag{20b}$$

where $\tilde a$ is the least squares estimator under the model $E_s(w_i\mid y_i)=\exp(-a_0-a'y_i)$, $\bar y=(\bar y_1,\dots,\bar y_q)'$, $\bar y_j=n^{-1}\sum_{i=1}^{n}y_{ij}$, and $\hat V_{jk}=s_{jk}=n^{-1}\sum_{i=1}^{n}(y_{ij}-\bar y_j)(y_{ik}-\bar y_k)$.

2. Under the linear sampling, equation (6), the maximum likelihood estimators of $\mu$ and $V$ are defined by the equations:

$$(\bar y-\hat\mu)(\tilde b_0+\tilde b'\hat\mu)=\hat V\tilde b \tag{21a}$$

and

$$\hat V=n^{-1}\sum_{i=1}^{n}(y_i-\hat\mu)(y_i-\hat\mu)' \tag{21b}$$

where $\tilde b_0$ and $\tilde b$ are the least squares estimators under the model $E_s(w_i\mid y_i)=1/(b_0+b'y_i)$. Solve (21a) and (21b) iteratively, starting with the classical maximum likelihood estimators.

Proofs:

1. Exponential sampling. Using the two-step method of estimation.

Step-one. Estimation of informativeness parameters $a_0$ and $a$ via the relationship (2).

Step-two. After substituting $\tilde a$ in (5), the resulting sample log-likelihood is given by:

$$l_{rs}(\mu,V)=-0.5\,nq\log(2\pi)-0.5\,n\log|V|-0.5\sum_{i=1}^{n}(y_i-\mu^*)'V^{-1}(y_i-\mu^*) \tag{22}$$

where $\mu^*=\mu+V\tilde a$.

According to Johnson and Wichern (1998), page 182, the maximum likelihood estimators of $\mu^*$ and $V$ are given by:

$$\hat\mu^*=\bar y\quad\text{and}\quad\hat V=n^{-1}\sum_{i=1}^{n}(y_i-\bar y)(y_i-\bar y)'.$$

Hence the result, equation (20).

2. Linear model. Using the two-step method of estimation.

Step-one. Estimation of informativeness parameters, $b_0$ and $b$ via the relationship (2).

Step-two. After substituting $\tilde b_0$ and $\tilde b$ in (7), the resulting sample log-likelihood is given by:

$$l_{rs}(\mu,V)=-\frac{nq}{2}\log(2\pi)-\frac{n}{2}\log|V|-\frac12\sum_{i=1}^{n}(y_i-\mu)'V^{-1}(y_i-\mu)-n\log(\tilde b_0+\tilde b'\mu) \tag{23}$$

Now differentiating $l_{rs}(\mu,V)$ with respect to $\mu$ and $V$, we can show that the maximum likelihood estimators of $\mu$ and $V$ are defined by equation (21).
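As a rough illustration of the two-step estimator of Theorem 2 under exponential sampling, the sketch below fits step one by ordinary least squares of $\log w_i$ on $y_i$. This log-linear fit is a simplifying assumption made here (it matches the model $E_s(w_i\mid y_i)=\exp(-a_0-a'y_i)$ exactly only when the weights are a deterministic function of $y_i$, as in the simulated design); the function name, the Poisson selection and the parameter values are hypothetical.

```python
import numpy as np

def two_step_exponential(y, w):
    """Two-step estimates (20a)-(20b) from sample values y (n x q) and weights w (n,)."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    n = y.shape[0]
    # Step one (assumed log-linear fit): log w_i = -a0 - a'y_i, so a_tilde = -slope.
    design = np.column_stack([np.ones(n), y])
    coef, *_ = np.linalg.lstsq(design, np.log(w), rcond=None)
    a_tilde = -coef[1:]
    # Step two: closed-form maximisers of the sample log-likelihood (22).
    y_bar = y.mean(axis=0)
    centred = y - y_bar
    V_hat = centred.T @ centred / n          # (20b)
    mu_hat = y_bar - V_hat @ a_tilde         # (20a)
    return mu_hat, V_hat, a_tilde

# Simulated population and informative Poisson sample (illustrative values).
rng = np.random.default_rng(1)
mu = np.array([1.0, 2.0])
V = np.array([[1.0, 0.6], [0.6, 2.0]])
y_pop = rng.multivariate_normal(mu, V, size=400_000)
a0, a = -6.0, np.array([0.5, 0.2])
pi = np.minimum(np.exp(a0 + y_pop @ a), 1.0)
keep = rng.random(len(y_pop)) < pi
y_s, w_s = y_pop[keep], 1.0 / pi[keep]

mu_hat, V_hat, a_tilde = two_step_exponential(y_s, w_s)
print("naive mean:", y_s.mean(axis=0))
print("two-step  :", mu_hat, "true:", mu)
```

In this simulation the unweighted sample mean is shifted by roughly $Va$, while the two-step estimate of $\mu$ removes that shift.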

The following corollary provides the maximum likelihood estimators of the mean $\mu$ and the variance $\sigma^2$ when the population model is univariate normal and the sampling process is exponential or linear, which is a particular case of Theorem 2 with $q=1$.

Corollary 4. Assume $y_i\underset{p}{\sim}N(\mu,\sigma^2)$, $i=1,\dots,N$, are independent. Let $y_1,\dots,y_n$ be a sample of size $n$ selected under the following sampling schemes.

1. Exponential sampling. The maximum likelihood estimators of $\mu$ and $\sigma^2$ are given by:

$$\hat\mu=\bar y-\tilde a_1\hat\sigma^2 \tag{24a}$$

and

$$\hat\sigma^2=s^2=n^{-1}\sum_{i=1}^{n}(y_i-\bar y)^2 \tag{24b}$$

where $\tilde a_1$ is the least squares estimator of the informativeness parameter $a_1$.

2. Linear sampling. The maximum likelihood estimators of $\mu$ and $\sigma^2$ are given by:

$$\hat\sigma^{-2}\sum_{i=1}^{n}(y_i-\hat\mu)-n\tilde b_1(\tilde b_0+\tilde b_1\hat\mu)^{-1}=0 \tag{25a}$$

and

$$\hat\sigma^2=n^{-1}\sum_{i=1}^{n}(y_i-\hat\mu)^2 \tag{25b}$$

where $\tilde b_0$ and $\tilde b_1$ are the estimators of the informativeness parameters $b_0$ and $b_1$.

Corollary 5. Assume $y_i=(y_{i1},y_{i2})'\underset{p}{\sim}N_2(\mu,V)$, $i=1,\dots,N$. Under the exponential sampling, equation (4), the maximum likelihood estimators of $\mu_1$, $\mu_2$, $v_{11}$, $v_{22}$ and $v_{12}$ are given by:

$$\hat\mu_1=\bar y_1-\tilde a_1s_{11}-\tilde a_2s_{12} \tag{26a}$$

$$\hat\mu_2=\bar y_2-\tilde a_2s_{22}-\tilde a_1s_{12} \tag{26b}$$

$$\hat V=\{s_{ij}\}=\begin{pmatrix}s_{11}&s_{12}\\ s_{21}&s_{22}\end{pmatrix},\quad i,j=1,2 \tag{27}$$

4. Application – fitting general linear model for longitudinal survey data under informative sampling

4.1 Population model:

Let $y_{it}$, $i=1,\dots,N$; $t=1,\dots,T$, be the measurement on the $i$-th subject at time $t$. Associated with each $y_{it}$ are the (known) values, $x_{itk}$, $k=1,\dots,p$, of $p$ explanatory variables. We assume that the $y_{it}$ follow the regression model:

$$y_{it}=\beta_1x_{it1}+\dots+\beta_px_{itp}+\varepsilon_{it} \tag{28}$$

where the $\varepsilon_{it}$ are a random sequence of length $T$ associated with each of the $N$ subjects. In our context, the longitudinal structure of the data means that we expect the $\varepsilon_{it}$ to be correlated within subjects.

Let $y_i=(y_{i1},\dots,y_{iT})'$, $x_{it}=(x_{it1},\dots,x_{itp})'$ and let $\beta=(\beta_1,\dots,\beta_p)'$ be the vector of unknown regression coefficients. The general linear model for longitudinal survey data treats the random vectors $y_i$, $i=1,\dots,N$, as independent multivariate normal variables, that is, with $q=T$,

$$y_i\underset{p}{\sim}N_q(x_i\beta,V) \tag{29}$$

where $x_i$ is the matrix of size $T$ by $p$ of explanatory variables for subject $i$, and $V$ has $(j,k)$th element $v_{jk}=cov_p(y_{ij},y_{ik})$, $j,k=1,\dots,T$; see Diggle, Liang and Zeger (1994).

4.2 Covariance structure of $y_i$:

It is useful at this stage to consider what form the matrix $V$ might take. We consider three cases: the exponential correlation model and the uniform correlation model, see Diggle, Liang and Zeger (1994), and the random effect model, see Skinner and Holmes (2003).
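The three covariance structures named above can be built as small matrices. The sketch below is an illustration under the parametrisation used in this subsection ($\sigma^2$, $\rho$ and, for the random effect model, $\sigma_u^2$); the function names are hypothetical.

```python
import numpy as np

def exponential_correlation(T, sigma2, rho):
    """v_ts = sigma2 * rho**|t - s|: correlation decays with time separation."""
    t = np.arange(T)
    return sigma2 * rho ** np.abs(t[:, None] - t[None, :])

def uniform_correlation(T, sigma2, rho):
    """v_tt = sigma2 and v_ts = rho * sigma2 for t != s."""
    return sigma2 * ((1.0 - rho) * np.eye(T) + rho * np.ones((T, T)))

def random_effects(T, sigma2_u, sigma2, rho):
    """sigma2_u * J_T + sigma2 * V_T, with V_T having elements rho**|t - t'|."""
    return sigma2_u * np.ones((T, T)) + exponential_correlation(T, sigma2, rho)

print(exponential_correlation(3, 2.0, 0.5))
print(uniform_correlation(3, 2.0, 0.5))
print(random_effects(2, 1.0, 2.0, 0.5))
```

Each function returns a symmetric $T$ by $T$ matrix that can be passed wherever $V$ appears in the formulas of this section.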

Case 1: The exponential correlation model:

In this model, the $(t,s)$-th element of $V$ has the form:

$$v_{ts}=cov_p(y_{it},y_{is})=\sigma^2\rho^{|t-s|},\quad t,s=1,\dots,T \tag{30}$$

Note that the correlation, $v_{ts}/\sigma^2$, between a pair of measurements on the same unit decays toward zero as the time separation between the measurements increases.

Case 2: The uniform correlation model:

In this model, we assume that there is a positive correlation, $\rho$, between any two measurements on the same subject. So that the $(t,s)$-th element of $V$ has the form:

$$v_{ts}=cov_p(y_{it},y_{is})=\rho\sigma^2,\quad t\neq s=1,\dots,T;\qquad v_{tt}=\sigma^2,\quad t=1,\dots,T \tag{31}$$

Case 3: Random effects models:

Under this model the multivariate outcomes $y_i=(y_{i1},\dots,y_{iT})'$, $i=1,\dots,N$, are independent with mean vector and covariance matrix given respectively by:

$$E_p(y_i)=(\beta_1,\dots,\beta_T)'=\mu \tag{32a}$$

$$cov_p(y_i)=\sigma_u^2J_T+\sigma^2V_T=\Sigma \tag{32b}$$

where $J_T$ denotes the $T$ by $T$ matrix all of whose elements are one, and the $(t,t')$-th element of $V_T$ is $\rho^{|t-t'|}$; $t,t'=1,\dots,T$.

4.3 Sampling design:

We assume a single-stage informative sampling design, where the sample is a panel sample selected at time $t=1$ and all units remain in the sample till time $t=T$. Examples of longitudinal surveys, some of which are based on complex sample designs, and of the issues involved in their design and analysis can be found in Herriot and Kasprzyk (1984), and Nathan (1999). In many of the cases described in these papers, a sample is selected for the first round and continues to serve for several rounds. Then it is intuitively reasonable to assume that the first order inclusion probabilities, $\pi_i$, depend on the population values of the response variable at the first occasion only, the values $y_{i1}$, and on $x_{i1}=(x_{i11},\dots,x_{i1p})'$, and the values of known design variable, $z=\{z_1,\dots,z_N\}$, used for the sample selection, but not included in the working model under consideration.

4.4 Sample distribution:

Under exponential inclusion probability model:

$$E_p(\pi_i\mid y_{i1},x_{i1})=\exp(a_0^*+a_0y_{i1}+a_1x_{i11}+a_2x_{i12}+\dots+a_px_{i1p}) \tag{33}$$

Using (29), (33) and Theorem 1, we have:

$$y_i\mid x_i,\theta,a_0\underset{s}{\sim}N_q(\mu^*,V) \tag{34}$$

where

$$\mu^*=[x_{i1}'\beta+a_0v_{11},\ x_{i2}'\beta+a_0v_{12},\ \dots,\ x_{iT}'\beta+a_0v_{1T}]'.$$

Let $y_{i,T-1}=(y_{i2},y_{i3},\dots,y_{iT})'$. Alternatively, the sample probability density function of $y_i$ can be written as:

$$f_s(y_i\mid x_i)=f_s(y_{i1}\mid x_{i1})\,f_p(y_{i,T-1}\mid y_{i1},x_i) \tag{35}$$

where

$$f_s(y_{i1}\mid x_{i1},\theta,\gamma)=\frac{1}{\sqrt{2\pi v_{11}}}\exp\Big\{-\frac{1}{2v_{11}}\big(y_{i1}-x_{i1}'\beta-a_0v_{11}\big)^2\Big\} \tag{36}$$

$$f_p(y_{i,T-1}\mid y_{i1},x_i)=(2\pi)^{-(T-1)/2}|V_{T-1}|^{-1/2}\exp\Big\{-\frac12\big(y_{i,T-1}-\mu^*_{0,T-1}\big)'V_{T-1}^{-1}\big(y_{i,T-1}-\mu^*_{0,T-1}\big)\Big\} \tag{37}$$

$$\mu^*_{0,T-1}=E_p(y_{i,T-1}\mid y_{i1},x_i)=\Big[x_{i2}'\beta+\frac{v_{21}}{v_{11}}(y_{i1}-x_{i1}'\beta),\ \dots,\ x_{iT}'\beta+\frac{v_{T1}}{v_{11}}(y_{i1}-x_{i1}'\beta)\Big]'$$

and $V_{T-1}$ has general term:

$$v_{t+1,t'+1}-v_{t+1,1}\,v_{11}^{-1}\,v_{1,t'+1};\quad t,t'=1,\dots,T-1 .$$

So that we have the following sample model:

$$y_{it}=\beta_0+\beta_1x_{it1}+\dots+\beta_px_{itp}+\varepsilon_{it}=x_{it}^{*\prime}\beta^*+\varepsilon_{it};\quad i=1,2,\dots,n \tag{38}$$

where $\beta_0=a_0v_{11}$, $\beta^*=(\beta_0,\beta_1,\dots,\beta_p)'$, $x_i^*=(1,x_i)$ and the $\varepsilon_{it}$ are a random sequence correlated within subjects.

Note that if $a_0=0$, that is the sampling design is noninformative, then the population and sample models are the same.
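To make the mean shift in the sample distribution concrete, the sketch below evaluates $\mu^*$ for one subject: the population mean $x_i\beta$ plus $a_0$ times the first column of $V$. It is an illustration only; the function name, the covariates and the parameter values are hypothetical.

```python
import numpy as np

def sample_mean_vector(x_i, beta, V, a0):
    """mu* with t-th entry x_it' beta + a0 * v_1t (selection depends on y_i1 only)."""
    x_i, beta, V = (np.asarray(z, dtype=float) for z in (x_i, beta, V))
    return x_i @ beta + a0 * V[:, 0]

T = 3
t = np.arange(T)
V = 2.0 * 0.5 ** np.abs(t[:, None] - t[None, :])   # exponential correlation, sigma2 = 2, rho = 0.5
x_i = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])   # T x p matrix of covariates
beta = np.array([10.0, 1.5])

print(sample_mean_vector(x_i, beta, V, a0=0.3))
print(sample_mean_vector(x_i, beta, V, a0=0.0))   # noninformative case: equals x_i @ beta
```

The shift is largest at the first occasion and, under the exponential correlation model, fades at later occasions as $v_{1t}$ decays.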

4.5 Estimation:

We consider three methods of estimation, namely: unweighted maximum likelihood, pseudo maximum likelihood, and two-step estimation based on the sample distribution.

4.5.1 Unweighted maximum likelihood:

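Maximum likelihood for the case where the sampling design is ignorable: the value of $\theta=(\beta,V)$ that satisfies:

$$\frac{\partial l_{srs}(\mu,V)}{\partial\theta}=-\frac12\,\frac{\partial}{\partial\theta}\Big[nq\log2\pi+n\log|V|+\sum_{i=1}^{n}(y_i-\mu)'V^{-1}(y_i-\mu)\Big]=0 \tag{39}$$

where, from (29), $\mu$ stands for $x_i\beta$.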

4.5.2 Pseudo maximum likelihood:

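The pseudo maximum likelihood estimator of $\theta=(\beta,V)$ is defined as the solution of:

$$\hat U_w(\theta)=\sum_{i\in s}w_i\,\frac{\partial}{\partial\theta}\Big[-\frac12\Big(q\ln2\pi+\ln|V|+\{y_i-(\mu+Va)\}'V^{-1}\{y_i-(\mu+Va)\}\Big)\Big]=0 \tag{40}$$

Alternatively, based on the factorization (35):

$$\hat U_{ws}(\theta)=\sum_{i\in s}w_i\,\frac{\partial\ln\{f_s(y_{i1}\mid x_{i1},\theta)\}}{\partial\theta}+\frac{N}{n}\sum_{i\in s}\frac{\partial\ln\{f_p(y_{i,T-1}\mid y_{i1},x_i,\theta)\}}{\partial\theta}=0 \tag{41}$$

For more discussion, see Eideh and Nathan (2006).

4.5.3 Two-step method: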

Step one. Estimate $a_0$ via the model: $E_s(w_i\mid y_{i1},x_{i1})=\exp(-a_0^*-a_0y_{i1}-a'x_{i1})$, where $a=(a_1,\dots,a_p)'$.

Step two. Using Theorem 2, the maximum likelihood estimators of $\beta^*$ and $V$ are defined as the solution of:

$$\frac{\partial l_{rs}(\beta^*,V)}{\partial\theta}=-\frac12\,\frac{\partial}{\partial\theta}\Big[nq\log2\pi+n\log|V|+\sum_{i=1}^{n}(y_i-\mu^*)'V^{-1}(y_i-\mu^*)\Big]=0 \tag{42}$$

where $\beta^*=(\beta_0,\beta_1,\dots,\beta_p)'$ and $\mu^*=[x_{i1}'\beta+a_0v_{11},\ x_{i2}'\beta+a_0v_{12},\ \dots,\ x_{iT}'\beta+a_0v_{1T}]'$.

4.6 Variance estimation:

For variance estimation we use the inverse of Fisher information matrix and the bootstrap approach for variance, see Pfeffermann and Sverchkov (1999, 2003) and Eideh and Nathan (2006).
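The bootstrap approach mentioned above can be sketched generically: resample the observed units with replacement, re-estimate, and take the empirical covariance of the replicates with divisor $B$. The code is a minimal sketch under assumptions made here: `estimator` is a hypothetical user-supplied function standing in for whichever fitting method is used, and resampling is by simple random sampling with replacement from the original sample.

```python
import numpy as np

def bootstrap_variance(data, estimator, B=200, seed=0):
    """Empirical covariance (divisor B) of estimator over B bootstrap resamples."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n = data.shape[0]
    thetas = np.array([
        np.atleast_1d(estimator(data[rng.integers(0, n, size=n)]))  # same sample size n
        for _ in range(B)
    ])
    centred = thetas - thetas.mean(axis=0)   # deviations from the bootstrap mean
    return centred.T @ centred / B

# Usage with a toy estimator returning (mean, variance) of a univariate sample.
rng = np.random.default_rng(2)
sample = rng.normal(size=200)
V_boot = bootstrap_variance(sample, lambda d: np.array([d.mean(), d.var()]), B=500)
print(V_boot)
```

In practice `estimator` would wrap one of the estimation methods of Section 4.5, applied to each resampled set of subjects together with their weights.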

(a) Fisher information matrix approach:

The inverse of the observed Fisher information matrix evaluated at $\hat\theta=(\hat\beta,\hat V)$ is given by:

$$\hat V_s(\hat\theta)=[I_s(\hat\theta)]^{-1}=\left[-\frac1n\left.\frac{\partial^2l(\theta)}{\partial\theta\,\partial\theta'}\right|_{\theta=\hat\theta}\right]^{-1} \tag{43}$$

(b) Bootstrap approach:

Let $\hat\theta=(\hat\beta,\hat V)$ be the sample maximum likelihood estimator of $\theta=(\beta,V)$ based on any of the equations (39)-(42) and $\hat\theta_b=(\hat\beta_b,\hat V_b)$ be the ML estimator computed from the bootstrap sample $b=1,\dots,B$, with the same sample size, drawn by simple random sampling with replacement from the original sample, that is, the sample drawn under the informative sampling design. The bootstrap variance estimator of $\hat\theta=(\hat\beta,\hat V)$ is defined as:

$$\hat V_{boot}(\hat\theta)=\frac1B\sum_{b=1}^{B}(\hat\theta_b-\hat\theta_{boot})(\hat\theta_b-\hat\theta_{boot})' \tag{44}$$

where $\hat\theta_{boot}=\frac1B\sum_{b=1}^{B}\hat\theta_b$.

5. Conclusions

In this paper, we extend the definition of univariate sample distribution into multivariate random variables. Also, we consider a new method of estimating the parameters of the superpopulation model for analyzing multivariate normal observations from finite population when the sampling design is informative. Furthermore, the general linear model for longitudinal survey data under informative sampling using different covariance structures: the exponential correlation model, the uniform correlation model, and the random effect model, was fitted under informative sampling.

The main feature of the present estimators is their behaviours in terms of the informativeness parameters.

The paper is purely mathematical. The role of the informativeness of the sampling mechanism in adjusting various estimators for bias reduction, based on simulation studies under different population models and different models for the conditional expectations of first order inclusion probabilities given the response variable and covariates, can be found in Pfeffermann and Sverchkov (1999, 2003), Nathan and Eideh (2004), Eideh (2008), and Eideh and Nathan (2006, 2009).

I hope that the new mathematical results obtained will encourage further theoretical, empirical and practical research in these directions.

ACKNOWLEDGEMENTS

The author is grateful to the referees for their valuable comments.

REFERENCES

Birnbaum, Z.W., Paulson, E., and Andrews, F.C. (1950). On the Effect of Selection Performed on Some Coordinates of a Multi-Dimensional Population. Psychometrika, 15, pp 191-204.

Chambers, R. and Skinner, C. (2003). Analysis of Survey Data. New York: John Wiley.

Diggle, P. J., Liang, K. Y., and Zeger, S. L. (1994). Analysis of Longitudinal Data. Oxford: Science Publication.

Eideh, A. H. (2008). Estimation and Prediction of Random Effects Models for Longitudinal Survey Data under Informative Sampling. Statistics in Transition – New Series, Volume 9, Number 3, December 2008, pp 485-502.

Eideh, A. H. and Nathan, G. (2006). Journal of Statistical Planning and Inference, 136, 3052-3069.

Eideh, A. H. and Nathan, G. (2009). Two-Stage Informative Cluster Sampling with Application in Small Area Estimation. Journal of Statistical Planning and Inference, 139, pp 3088-3101.

Fahrmeir, L. and Tutz, G. (2001). Multivariate statistical modeling based on generalized linear models, 2nd edn. New York: Springer.

Herriot, R.A., and Kasprzyk, D. (1984). The survey of income and program participation. American Statistical Association, Proceedings of the Social Statistics Section, pp. 107-116.

Hoem, J.M. (1989). The issue of weights in panel surveys of individual behaviors. In Panel Surveys, (Eds.), Kasprzyk, D., Duncan, G.J., Kalton, G., and Singh, M.P., New York: Wiley, pp. 539-565.

Johnson, R.A., and Wichern, D.W. (1998). Applied Multivariate Statistical Analysis, 4th edn. New Jersey: Prentice Hall.

Kasprzyk, D., Duncan, G.J, Kalton, G., and Singh, M.P. (Eds.) (1989). Panel Surveys. New York: Wiley.

Krieger, A.M., and Pfeffermann, D. (1997). Testing of distribution functions from complex sample surveys. Journal of Official Statistics, 13: 123-142.

Mardia, K.V., Kent, J.T. and Bibby, J.M. (1979). Multivariate analysis. New York: Academic Press.

Nathan, G. (1999). A review of sample attrition and representativeness in three longitudinal surveys. GSS methodology Series No. 13. London: Office of National Statistics.


Nathan, G. and Eideh, A. H. (2004) : Échantillonage et Méthodes d’Enquêtes. (P. Ardilly – Ed.) Paris: Dunod, pp 227-240.

Pfeffermann, D. (1993). The role of sampling weight when modeling survey data. International Statistical Review 61: 317-337.

Pfeffermann, D. (1996). The use of sampling weights for survey data analysis. Statistical Methods in Medical Research, V.5: 239-261.

Pfeffermann, D., Krieger, A. M., and Rinott, Y. (1998). Parametric Distributions of Complex Survey Data under Informative Probability Sampling. Statistica Sinica, 8, 1087-1114.

Pfeffermann, D. and Sverchkov, M. (1999). Parametric and Semi-Parametric Estimation of Regression Models Fitted to Survey Data. Sankhya, 61, Ser. B, 166-186.

Pfeffermann, D. and Sverchkov, M. (2003). Fitting Generalized Linear Models under Informative Probability Sampling. Analysis of Survey Data. (eds. R. Chambers and C. J. Skinner), pp. 175-195. New York: Wiley.

Skinner, C.J., Holt, D., and Smith, T.M.F (Eds.) (1989). Analysis of Complex Surveys, New York: Wiley.

Skinner, C.J., and Holmes, D. (2003). Random Effects Models for Longitudinal Data. Analysis of Survey Data. (eds. R. Chambers and C. Skinner), pp. 175-195. New York: Wiley.

Smith, T.M.F. and Holmes, D.J. (1989). Multivariate analysis. In Analysis of Complex Surveys, eds. C.J. Skinner, D. Holt and T.M.F. Smith, New York: Wiley, pp. 165-190.
