INFORMATIVE SAMPLING ON TWO OCCASIONS: ESTIMATION AND PREDICTION


Abdulhakeem A.H. Eideh

Department of Mathematics

Faculty of Science and Technology Al-Quds University, Abu-Deis Campus Palestine

Abstract

The sample distribution is defined as the distribution of the sample measurements given the selected sample. Under informative sampling, this distribution may be different from the corresponding population distribution. Sampling on two occasions under informative sampling design, utilizing the sample and sample-complement distributions for occasion one, the matched sample and unmatched sample distributions, and matched sample-complement and unmatched sample-complement for occasion two, is proposed for predicting finite population total of a variable under study for the current (second) occasion, viewing information collected on the first (previous) occasion as auxiliary information. An interesting result of the present analysis is that known predictors in common use are shown to be special cases of the present predictors obtained under informative sampling, thus providing them a new justification.

Keywords: Matched distribution, Sample-complement distribution, Unmatched distribution.

1. Introduction

The practice of relying on samples for the collection of important series of data, published at regular intervals, has become common. In most surveys, interest centers on the current total or average. For discussions of repeated sampling in general, and of sampling on two occasions in particular, under noninformative sampling, see Cochran (1977). However, if the design is informative, in the sense that the study or response variable is correlated with design variables not included in the model, even after conditioning on the model covariates, standard estimates of the model parameters can be severely biased, possibly leading to false inference. See, for example, Pfeffermann, Krieger and Rinott (1998) and Eideh (2010).

In this paper we propose to deal with the prediction of finite population total of a variable under study for the second occasion, using information collected on the first occasion as an auxiliary variable, and under informative sampling, by combining two separate statistical methodologies: sampling on two occasions and methods of analysis under complex informative sampling.


Methods of prediction under informative sampling have been investigated by Sverchkov and Pfeffermann (2004), and Pfeffermann and Sverchkov (2007) in the context of analytical inference from complex surveys for cross-sectional analysis based on data from a single occasion. Later, Eideh and Nathan (2009) investigated the effects of informative two-stage cluster sampling on estimation and prediction with application in small area estimation.

Previous work in this area deals with sampling on two occasions under equal and unequal probability of selection sampling designs. See for example Arnab (1998), and Prasad and Graham (1994). In particular, none of the above studies extract the sample matched and sample un-matched distributions, for sampling on two occasions, from the population distribution and first order inclusion probabilities. The key reference about effects of informative design on sample distributions, applied here to the case of repeated sampling, is Sverchkov and Pfeffermann (2004).

As pointed out by Sverchkov and Pfeffermann (2004) in their Section 8, Concluding Remarks: “Further experimentation with this kind of predictors and MSE (mean square error) estimation is therefore highly recommended”.

The aims of the present study are thus to extend and develop the methods of prediction of finite population totals under informative sampling by utilizing the sample and sample-complement distributions for sampling on two occasions. In Section 2 a review of known results on the sample and sample-complement probability density functions (pdfs) is given. In Section 3 we introduce the marginal distributions of matched and unmatched sample observations for occasion two. Marginal distributions of matched-complement and unmatched-complement samples are discussed in Section 4. In Section 5 we present the results of prediction of finite population totals of a variable under study for the second occasion, under informative sampling. Prediction methods are discussed in Section 6. In Section 7 we give examples. Section 8 presents the estimation of mean square error and Section 9 provides a discussion of the results.

It should be noted that this paper is based completely on model-based inference (rather than randomization based).

2. Review of results on sample and sample-complement probability density functions

[...] chosen. The corresponding inclusion probabilities are denoted by $\pi_{1i}$ for element $i \in U$. Let $I_{1i} = 1$ if $i \in s_1$ and $I_{1i} = 0$ otherwise, be the sample membership indicator of element $i \in U$ for the first occasion. At the first occasion, we assume that the population values $(y_{11}, x_{11}), \ldots, (y_{1N}, x_{1N})$ are independent realizations of random variables $Y_1$ and $X_1$ with continuous joint pdf $f_p(y_{1i}, x_{1i})$. In the application, the variable $Y_1$ is the variable of interest for period one and is observed only for the sample on the first occasion. (In practice, the variable of interest for period one may often be observed also for the sample of the second occasion (retrospective studies).) The variable $X_1$ represents an auxiliary variable and its values are assumed known for the whole population. Let $z = (z_1, \ldots, z_N)$ be the values of a known design variable, used for the sample selection process but not included in the working model under consideration. In what follows we consider a sampling design with selection probabilities $\pi_i = \Pr(i \in s)$, where $i = 1, \ldots, N$. In practice, the $\pi_i$'s may depend on the population values $(x, y, z)$. We consider single stage sampling, with inclusion probabilities:

$$\pi_{1i} = \Pr(i \in s_1 \mid y_{1i}, x_{1i}, z_i) = g(y_{1i}, x_{1i}, z_i) \tag{2.1}$$

for some function $g$ and all units $i \in U$. See Eideh (2010) for further discussion on examples of $g$.

Since $\pi_{11}, \ldots, \pi_{1N}$ are defined by the realizations $(y_{1i}, x_{1i}, z_i)$, $i = 1, \ldots, N$, they are random realizations defined on the space of possible populations.

According to Pfeffermann, Krieger and Rinott (1998), the conditional marginal sample pdf of $Y_{1i}$ is defined as:

$$f_s(y_{1i} \mid x_{1i}) = f_p(y_{1i} \mid x_{1i}, I_{1i} = 1) = \frac{\Pr(I_{1i} = 1 \mid y_{1i}, x_{1i})\, f_p(y_{1i} \mid x_{1i})}{\Pr(I_{1i} = 1 \mid x_{1i})} \tag{2.2}$$

with the second equality obtained by application of Bayes theorem.

Note that the conditional marginal sample pdf is different from the superpopulation pdf generating the finite population values, unless $\Pr(I_{1i} = 1 \mid y_{1i}, x_{1i}) = \Pr(I_{1i} = 1 \mid x_{1i})$ for all possible values $y_{1i}$, in which case the sampling process or scheme is noninformative or can be ignored conditional on $x_{1i}$.

Denote by $E_p$ and $E_s$ the expectations under the population and sample pdfs, respectively. Then according to Pfeffermann, Krieger and Rinott (1998), (2.2) can be written as:

$$f_s(y_{1i} \mid x_{1i}) = \frac{E_p(\pi_{1i} \mid y_{1i}, x_{1i})\, f_p(y_{1i} \mid x_{1i})}{E_p(\pi_{1i} \mid x_{1i})} \tag{2.3}$$

where

$$E_p(\pi_{1i} \mid x_{1i}) = \int E_p(\pi_{1i} \mid y_{1i}, x_{1i})\, f_p(y_{1i} \mid x_{1i})\, dy_{1i} \tag{2.4}$$

It follows from (2.3) that the population and sample pdfs are different, unless $E_p(\pi_{1i} \mid y_{1i}, x_{1i}) = E_p(\pi_{1i} \mid x_{1i})$ for all $y_{1i}$, in which case the sampling process can be ignored for inferences that condition on $x_1$.
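As an illustration of how the sample pdf reweights the population pdf by the expected inclusion probability, consider the following worked case. It is a sketch under assumptions that are not taken from the text: a normal population model with mean $\mu_i$ and variance $\sigma^2$, and an expected inclusion probability of exponential form with hypothetical constants $k$ and $a$.

```latex
\begin{aligned}
&f_p(y_{1i} \mid x_{1i}) = N(\mu_i, \sigma^2), \qquad
E_p(\pi_{1i} \mid y_{1i}, x_{1i}) = k \exp(a\, y_{1i}) \\
&\Longrightarrow\quad
E_p(\pi_{1i} \mid x_{1i}) = k \exp\!\left(a \mu_i + \tfrac{1}{2} a^2 \sigma^2\right), \\
&\phantom{\Longrightarrow\quad}
f_s(y_{1i} \mid x_{1i})
  = \frac{k \exp(a\, y_{1i})\, f_p(y_{1i} \mid x_{1i})}{k \exp\!\left(a \mu_i + \tfrac{1}{2} a^2 \sigma^2\right)}
  = N(\mu_i + a \sigma^2, \sigma^2).
\end{aligned}
```

Under these assumptions the sample pdf stays normal with the same variance, but its mean is shifted by $a\sigma^2$; the shift vanishes only when $a = 0$, that is, when the design is noninformative.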

Comment 1. Note that $E_p(\pi_{1i} \mid y_{1i}, x_{1i}) = E_{z_i \mid y_{1i}, x_{1i}}\left[E_p(\pi_{1i} \mid y_{1i}, x_{1i}, z_i)\right]$, so that $z_i$ is integrated out in (2.3). See Remark 1 in Sverchkov and Pfeffermann (2004) for further discussion.

Let $w_{1i} = 1/\pi_{1i}$ define the sampling weight of unit $i \in U$. According to Pfeffermann and Sverchkov (1999), the following relationships hold:

$$E_p(\pi_{1i} \mid y_{1i}, x_{1i}) = \frac{1}{E_s(w_{1i} \mid y_{1i}, x_{1i})} \tag{2.5a}$$

$$E_p(\pi_{1i} \mid x_{1i}) = \frac{1}{E_s(w_{1i} \mid x_{1i})} \tag{2.5b}$$

$$E_p(y_{1i} \mid x_{1i}) = \frac{E_s(w_{1i}\, y_{1i} \mid x_{1i})}{E_s(w_{1i} \mid x_{1i})} \tag{2.5c}$$

$$E_p(y_{1i}) = \frac{E_s(w_{1i}\, y_{1i})}{E_s(w_{1i})} \tag{2.5d}$$

$$E_p(\pi_{1i}) = \frac{1}{E_s(w_{1i})} \tag{2.5e}$$

Similar to (2.2), the conditional marginal sample-complement pdf (for units not in $s_1$, denoted by $s_1^c$) is defined as:

$$f_{s_1^c}(y_{1i} \mid x_{1i}) = f_p(y_{1i} \mid x_{1i}, I_{1i} = 0) = \frac{\Pr(I_{1i} = 0 \mid y_{1i}, x_{1i})\, f_p(y_{1i} \mid x_{1i})}{\Pr(I_{1i} = 0 \mid x_{1i})} \tag{2.6a}$$

It follows from Sverchkov and Pfeffermann (2004) that this pdf can be written as:

$$f_{s_1^c}(y_{1i} \mid x_{1i}) = \frac{E_p(1 - \pi_{1i} \mid y_{1i}, x_{1i})\, f_p(y_{1i} \mid x_{1i})}{E_p(1 - \pi_{1i} \mid x_{1i})} = \frac{E_s(w_{1i} - 1 \mid y_{1i}, x_{1i})\, f_s(y_{1i} \mid x_{1i})}{E_s(w_{1i} - 1 \mid x_{1i})} \tag{2.6b}$$

Now, using (2.6a), (2.5b) and (2.5c), we have:

$$\begin{aligned}
E_{s_1^c}(y_{1i} \mid x_{1i}) &= \int y_{1i}\, f_{s_1^c}(y_{1i} \mid x_{1i})\, dy_{1i} \\
&= \int y_{1i}\, \frac{E_p(1 - \pi_{1i} \mid y_{1i}, x_{1i})\, f_p(y_{1i} \mid x_{1i})}{E_p(1 - \pi_{1i} \mid x_{1i})}\, dy_{1i} \\
&= \frac{E_p[(1 - \pi_{1i})\, y_{1i} \mid x_{1i}]}{E_p(1 - \pi_{1i} \mid x_{1i})}
 = \frac{E_p(y_{1i} \mid x_{1i}) - E_p(\pi_{1i}\, y_{1i} \mid x_{1i})}{1 - E_p(\pi_{1i} \mid x_{1i})} \\
&= \frac{E_s[(w_{1i} - 1)\, y_{1i} \mid x_{1i}]}{E_s(w_{1i} - 1 \mid x_{1i})}
\end{aligned} \tag{2.7}$$

where $E_{s_1^c}$ denotes the expectation under the sample-complement pdf. Using (2.3), (2.5a), (2.5b) and (2.6), we have the following results:

$$\frac{f_s(y_{1i} \mid x_{1i})}{f_p(y_{1i} \mid x_{1i})} = \frac{E_s(w_{1i} \mid x_{1i})}{E_s(w_{1i} \mid y_{1i}, x_{1i})} \tag{2.8}$$

$$\frac{f_{s_1^c}(y_{1i} \mid x_{1i})}{f_s(y_{1i} \mid x_{1i})} = \frac{E_s(w_{1i} - 1 \mid y_{1i}, x_{1i})}{E_s(w_{1i} - 1 \mid x_{1i})} \tag{2.9}$$

and

$$\frac{f_{s_1^c}(y_{1i} \mid x_{1i})}{f_p(y_{1i} \mid x_{1i})} = \frac{E_s(w_{1i} - 1 \mid y_{1i}, x_{1i})\, E_s(w_{1i} \mid x_{1i})}{E_s(w_{1i} - 1 \mid x_{1i})\, E_s(w_{1i} \mid y_{1i}, x_{1i})} \tag{2.10}$$

According to (2.8), the sample and population pdfs are different unless $E_s(w_{1i} \mid y_{1i}, x_{1i}) = E_s(w_{1i} \mid x_{1i})$ for all $y_{1i}$, in which case the sampling process can be ignored for inference that conditions on $x_1$.
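The weighted relationships of this section can be checked numerically. The sketch below is illustrative only and not part of the original analysis: it assumes a normal population, Poisson sampling, and an inclusion probability that increases exponentially in $y$ (all constants are hypothetical). It compares the unweighted sample mean with the sample analogue of (2.5d), a ratio of weighted sums with weights $w$, and with the sample analogue of the unconditional version of (2.7), the same ratio with weights $w - 1$.

```python
import math
import random


def simulate_informative_sample(pop_size=100_000, seed=2024):
    """Draw one Poisson sample with inclusion probability increasing in y."""
    rng = random.Random(seed)
    sampled, not_sampled = [], []  # (y, w) pairs for s, plain y values for s^c
    for _ in range(pop_size):
        y = rng.gauss(10.0, 2.0)                         # assumed population model
        pi = min(1.0, 0.3 * math.exp(0.2 * (y - 10.0)))  # assumed informative design
        if rng.random() < pi:
            sampled.append((y, 1.0 / pi))                # sampling weight w = 1 / pi
        else:
            not_sampled.append(y)
    return sampled, not_sampled


def weighted_mean(pairs, shift=0.0):
    """Return sum((w - shift) * y) / sum(w - shift) over (y, w) pairs."""
    numerator = sum((w - shift) * y for y, w in pairs)
    denominator = sum(w - shift for _, w in pairs)
    return numerator / denominator


sampled, not_sampled = simulate_informative_sample()
sample_total = sum(y for y, _ in sampled)
naive_mean = sample_total / len(sampled)
population_mean = (sample_total + sum(not_sampled)) / (len(sampled) + len(not_sampled))
complement_mean = sum(not_sampled) / len(not_sampled)

est_population_mean = weighted_mean(sampled)             # weights w, as in (2.5d)
est_complement_mean = weighted_mean(sampled, shift=1.0)  # weights w - 1, as in (2.7)

print(f"naive {naive_mean:.3f}  population {population_mean:.3f}  "
      f"weighted {est_population_mean:.3f}")
print(f"complement {complement_mean:.3f}  estimated {est_complement_mean:.3f}")
```

In this setup the unweighted sample mean overstates the population mean because large values of $y$ are over-represented, while the two weighted ratios track the population mean and the mean of the non-sampled units, respectively.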

The key references to the relationships between the population and sample distributions and their applications are the articles by Eideh and Nathan (2006, 2009), Eideh (2007, 2008, 2009), Krieger and Pfeffermann (1997), Pfeffermann, Krieger and Rinott (1998), Pfeffermann and Sverchkov (1999, 2003, 2007), Skinner (1994), and Sverchkov and Pfeffermann (2004).

3. Marginal distributions of matched and unmatched sample observations for occasion two

[...] complementary sample induced by the design $P_1(\cdot)$. When sampling on the second occasion we have more information than at the first occasion. For every $i \in s_1$, we know the values $(y_{1i}, x_{1i})$, $i = 1, \ldots, n$, and $x_{1i}$, $i = n+1, \ldots, N$. For the second sample, we can consider situations of no overlap, complete overlap, or partial overlap with the first sample. Cochran (1977) considers, under noninformative sampling, the optimal designs for the estimation of different parameters at the second occasion. It is intuitively clear that there are cases in which the information from the first occasion may be used to improve the current estimates. Hence we opt for dealing with the situation of partial overlap, for which the other situations can be considered as special cases.

At the second occasion, two independent samples are drawn, a matched sample and an unmatched sample. The matched sample, $s_{2m}$, of size $n_m$, is drawn from $s_1$ by the design $P_m(\cdot \mid s_1)$ such that $P_m(s_{2m} \mid s_1)$ is the conditional probability of choosing $s_{2m}$ on the second occasion, given that $s_1$ was selected on the first occasion. The inclusion probabilities under this design are denoted $\pi^m_{2i} = \Pr(i \in s_{2m} \mid i \in s_1)$ for elements $i \in s_1$. The unmatched sample, $s_{2u}$, of size $n_u = n - n_m$, is drawn from $s_1^c = U - s_1$ according to the design $P_u(\cdot \mid s_1^c)$ such that $P_u(s_{2u} \mid s_1^c)$ is the conditional probability of choosing $s_{2u}$, given the complementary sample $s_1^c$. The inclusion probabilities under this design are denoted $\pi^u_{2i} = \Pr(i \in s_{2u} \mid i \in s_1^c)$. Note that $s_{2m}$ and $s_{2u}$ are disjoint and are chosen independently. The total sample at the second occasion is $s_2 = s_{2m} \cup s_{2u}$. Let $y_i = (y_{1i}, y_{2i})$ and $x_i = (x_{1i}, x_{2i})$ be the values of $Y$ and $X$ for unit $i$ on the two occasions. The variable $Y_{2i}$ is observed for all elements in the second sample.

We assume that inclusion probabilities may depend on the values of $Y_{1i}$, $Y_{2i}$, $X_{1i}$ and $X_{2i}$ for the same unit:

$$\pi^m_{2i} = \Pr(i \in s_{2m} \mid i \in s_1, y_i, x_i) = g_{2m}(y_i, x_i) \tag{3.1}$$

for some function $g_{2m}$ and all elements $i \in s_1$, and

$$\pi^u_{2i} = \Pr(i \in s_{2u} \mid i \in s_1^c, y_i, x_i) = g_{2u}(y_i, x_i) \tag{3.2}$$

for some function $g_{2u}$ and all units $i \in s_1^c$.
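The partial-overlap design described above can be sketched in code. This is a simplified illustration, not the design used in the text: it assumes Poisson sampling at every step, so the sizes of the matched and unmatched samples are random rather than fixed, and the function name and all inclusion probabilities are hypothetical.

```python
import random


def draw_two_occasion_samples(pi_1, pi_2m, pi_2u, seed=0):
    """Poisson-sample s1 from U, then s2m from s1 and s2u from U - s1.

    pi_1[i], pi_2m[i] and pi_2u[i] are hypothetical inclusion probabilities
    of unit i for the first-occasion, matched and unmatched draws.
    """
    rng = random.Random(seed)
    units = range(len(pi_1))
    s1 = {i for i in units if rng.random() < pi_1[i]}
    s1_complement = set(units) - s1
    # The matched sample is drawn only from units already in s1.
    s2m = {i for i in sorted(s1) if rng.random() < pi_2m[i]}
    # The unmatched sample is drawn only from units outside s1.
    s2u = {i for i in sorted(s1_complement) if rng.random() < pi_2u[i]}
    return s1, s2m, s2u


N = 1000
s1, s2m, s2u = draw_two_occasion_samples([0.2] * N, [0.5] * N, [0.1] * N, seed=1)
s2 = s2m | s2u  # total second-occasion sample
print(len(s1), len(s2m), len(s2u), len(s2))
```

By construction the matched sample is a subset of $s_1$ and the unmatched sample is disjoint from $s_1$, so the two parts of the second-occasion sample never overlap.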

Let $I^m_{2i} = 1$ if $i \in s_{2m}$ and $I^m_{2i} = 0$ if $i \in s_1 - s_{2m}$. Also, let $I^u_{2i} = 1$ if $i \in s_{2u}$ and $I^u_{2i} = 0$ if $i \in s_1^c - s_{2u}$. To find the marginal distribution of the matched sample observations, we treat the first sample as if it were a population. Assume that $Y_{2i}$ denotes the value of a response variable $Y_2$, associated with unit $i$ that belongs to the new ‘population’ $s_1 = \{1, \ldots, n\}$. If $Y_{2i}$ depends on $Y_{1i}$ and $x_i$, then the conditional sample pdf of $Y_{2i}$ is defined analogously to (2.2) and (2.3) as:

$$f_{s_{2m}}(y_{2i} \mid y_{1i}, x_i) = f_{s_1}(y_{2i} \mid y_{1i}, x_i, i \in s_{2m}) = \frac{\Pr(i \in s_{2m} \mid y_{2i}, y_{1i}, x_i)\, f_{s_1}(y_{2i} \mid y_{1i}, x_i)}{\Pr(i \in s_{2m} \mid y_{1i}, x_i)} \tag{3.3}$$

which can be written as:

$$f_{s_{2m}}(y_{2i} \mid y_{1i}, x_i) = \frac{E_{s_1}(\pi^m_{2i} \mid y_{2i}, y_{1i}, x_i)\, f_{s_1}(y_{2i} \mid y_{1i}, x_i)}{E_{s_1}(\pi^m_{2i} \mid y_{1i}, x_i)} \tag{3.4}$$

where

$$E_{s_1}(\pi^m_{2i} \mid y_{1i}, x_i) = \int E_{s_1}(\pi^m_{2i} \mid y_{2i}, y_{1i}, x_i)\, f_{s_1}(y_{2i} \mid y_{1i}, x_i)\, dy_{2i} \tag{3.5}$$

and $E_{s_1}$ denotes the expectation under the first sample pdf $f_{s_1}$.

Similar to (2.5), we have:

$$E_{s_1}(\pi^m_{2i} \mid y_{2i}, y_{1i}, x_i) = \frac{1}{E_{s_{2m}}(w^m_{2i} \mid y_{2i}, y_{1i}, x_i)} \tag{3.6a}$$

$$E_{s_1}(\pi^m_{2i} \mid y_{1i}, x_i) = \frac{1}{E_{s_{2m}}(w^m_{2i} \mid y_{1i}, x_i)} \tag{3.6b}$$

$$E_{s_1}(y_{2i} \mid y_{1i}, x_i) = \frac{E_{s_{2m}}(w^m_{2i}\, y_{2i} \mid y_{1i}, x_i)}{E_{s_{2m}}(w^m_{2i} \mid y_{1i}, x_i)} \tag{3.6c}$$

$$E_{s_1}(y_{2i}) = \frac{E_{s_{2m}}(w^m_{2i}\, y_{2i})}{E_{s_{2m}}(w^m_{2i})} \tag{3.6d}$$

$$E_{s_1}(\pi^m_{2i}) = \frac{1}{E_{s_{2m}}(w^m_{2i})} \tag{3.6e}$$

where $w^m_{2i} = 1/\pi^m_{2i}$ and $E_{s_{2m}}$ denotes the expectation under the matched sample pdf.

Using (3.4), (3.6a) and (3.6b), we have the following:

$$\frac{f_{s_{2m}}(y_{2i} \mid y_{1i}, x_i)}{f_{s_1}(y_{2i} \mid y_{1i}, x_i)} = \frac{E_{s_{2m}}(w^m_{2i} \mid y_{1i}, x_i)}{E_{s_{2m}}(w^m_{2i} \mid y_{2i}, y_{1i}, x_i)} \tag{3.7}$$

In order to find the distribution of the unmatched sample, we treat the sample-complement units of the first occasion as if they were a population. In the same way as for the matched sample, one can obtain the conditional marginal unmatched sample pdf of $Y_{2i}$, which is given by:

$$f_{s_{2u}}(y_{2i} \mid x_{2i}) = \frac{E_{s_1^c}(\pi^u_{2i} \mid y_{2i}, x_{2i})\, f_{s_1^c}(y_{2i} \mid x_{2i})}{E_{s_1^c}(\pi^u_{2i} \mid x_{2i})} \tag{3.8}$$

where

$$E_{s_1^c}(\pi^u_{2i} \mid x_{2i}) = \int E_{s_1^c}(\pi^u_{2i} \mid y_{2i}, x_{2i})\, f_{s_1^c}(y_{2i} \mid x_{2i})\, dy_{2i} \tag{3.9}$$

and $E_{s_1^c}$ denotes the expectation under the first-occasion sample-complement pdf $f_{s_1^c}$.

Analogously to (3.6), we have the following relationships:

$$E_{s_1^c}(\pi^u_{2i} \mid y_{2i}, x_{2i}) = \frac{1}{E_{s_{2u}}(w^u_{2i} \mid y_{2i}, x_{2i})} \tag{3.10a}$$

$$E_{s_1^c}(\pi^u_{2i} \mid x_{2i}) = \frac{1}{E_{s_{2u}}(w^u_{2i} \mid x_{2i})} \tag{3.10b}$$

$$E_{s_1^c}(y_{2i} \mid x_{2i}) = \frac{E_{s_{2u}}(w^u_{2i}\, y_{2i} \mid x_{2i})}{E_{s_{2u}}(w^u_{2i} \mid x_{2i})} \tag{3.10c}$$

$$E_{s_1^c}(y_{2i}) = \frac{E_{s_{2u}}(w^u_{2i}\, y_{2i})}{E_{s_{2u}}(w^u_{2i})} \tag{3.10d}$$

$$E_{s_1^c}(\pi^u_{2i}) = \frac{1}{E_{s_{2u}}(w^u_{2i})} \tag{3.10e}$$

where $w^u_{2i} = 1/\pi^u_{2i}$.

Using (3.8), (3.10a) and (3.10b), we have the following:

$$\frac{f_{s_{2u}}(y_{2i} \mid x_{2i})}{f_{s_1^c}(y_{2i} \mid x_{2i})} = \frac{E_{s_{2u}}(w^u_{2i} \mid x_{2i})}{E_{s_{2u}}(w^u_{2i} \mid y_{2i}, x_{2i})} \tag{3.11}$$

4. Marginal distributions of matched-complement and unmatched-complement samples

The matched-complement and unmatched-complement sample pdf’s are needed to predict finite population totals for the second occasion using sample data for both occasions under informative sampling.

Similar to (2.6), the conditional marginal matched sample-complement pdf, i.e., the pdf for units $i \notin s_{2m}$, is defined as:

$$\begin{aligned}
f_{s_{2m}^c}(y_{2i} \mid y_{1i}, x_i) &= f_{s_1}(y_{2i} \mid y_{1i}, x_i, I^m_{2i} = 0) \\
&= \frac{\Pr(I^m_{2i} = 0 \mid y_{2i}, y_{1i}, x_i)\, f_{s_1}(y_{2i} \mid y_{1i}, x_i)}{\Pr(I^m_{2i} = 0 \mid y_{1i}, x_i)} \\
&= \frac{E_{s_{2m}}(w^m_{2i} - 1 \mid y_{2i}, y_{1i}, x_i)\, f_{s_{2m}}(y_{2i} \mid y_{1i}, x_i)}{E_{s_{2m}}(w^m_{2i} - 1 \mid y_{1i}, x_i)}
\end{aligned} \tag{4.1}$$

Also, the following relationship holds:

$$E_{s_{2m}^c}(y_{2i} \mid y_{1i}, x_i) = \frac{E_{s_1}[(1 - \pi^m_{2i})\, y_{2i} \mid y_{1i}, x_i]}{E_{s_1}(1 - \pi^m_{2i} \mid y_{1i}, x_i)} = \frac{E_{s_{2m}}[(w^m_{2i} - 1)\, y_{2i} \mid y_{1i}, x_i]}{E_{s_{2m}}(w^m_{2i} - 1 \mid y_{1i}, x_i)} \tag{4.2}$$

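Application of (3.7) and (4.1) yields the following ratios:

$$\frac{f_{s_{2m}^c}(y_{2i} \mid y_{1i}, x_i)}{f_{s_{2m}}(y_{2i} \mid y_{1i}, x_i)} = \frac{E_{s_{2m}}(w^m_{2i} - 1 \mid y_{2i}, y_{1i}, x_i)}{E_{s_{2m}}(w^m_{2i} - 1 \mid y_{1i}, x_i)} \tag{4.3}$$

and

$$\frac{f_{s_{2m}^c}(y_{2i} \mid y_{1i}, x_i)}{f_{s_1}(y_{2i} \mid y_{1i}, x_i)} = \frac{E_{s_{2m}}(w^m_{2i} - 1 \mid y_{2i}, y_{1i}, x_i)\, E_{s_{2m}}(w^m_{2i} \mid y_{1i}, x_i)}{E_{s_{2m}}(w^m_{2i} - 1 \mid y_{1i}, x_i)\, E_{s_{2m}}(w^m_{2i} \mid y_{2i}, y_{1i}, x_i)} \tag{4.4}$$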

Analogously to (4.1), the conditional marginal unmatched sample-complement pdf, i.e., the pdf for units $i \notin s_{2u}$, is defined as:

$$\begin{aligned}
f_{s_{2u}^c}(y_{2i} \mid x_{2i}) &= f_{s_1^c}(y_{2i} \mid x_{2i}, I^u_{2i} = 0) \\
&= \frac{\Pr(I^u_{2i} = 0 \mid y_{2i}, x_{2i})\, f_{s_1^c}(y_{2i} \mid x_{2i})}{\Pr(I^u_{2i} = 0 \mid x_{2i})} \\
&= \frac{E_{s_{2u}}(w^u_{2i} - 1 \mid y_{2i}, x_{2i})\, f_{s_{2u}}(y_{2i} \mid x_{2i})}{E_{s_{2u}}(w^u_{2i} - 1 \mid x_{2i})}
\end{aligned} \tag{4.5}$$

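Also, we have the relationship:

$$E_{s_{2u}^c}(y_{2i} \mid x_{2i}) = \frac{E_{s_1^c}[(1 - \pi^u_{2i})\, y_{2i} \mid x_{2i}]}{E_{s_1^c}(1 - \pi^u_{2i} \mid x_{2i})} = \frac{E_{s_{2u}}[(w^u_{2i} - 1)\, y_{2i} \mid x_{2i}]}{E_{s_{2u}}(w^u_{2i} - 1 \mid x_{2i})} \tag{4.6}$$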

Using (3.11) and (4.5) we obtain the following ratios:

$$\frac{f_{s_{2u}^c}(y_{2i} \mid x_{2i})}{f_{s_{2u}}(y_{2i} \mid x_{2i})} = \frac{E_{s_{2u}}(w^u_{2i} - 1 \mid y_{2i}, x_{2i})}{E_{s_{2u}}(w^u_{2i} - 1 \mid x_{2i})} \tag{4.7}$$

and

$$\frac{f_{s_{2u}^c}(y_{2i} \mid x_{2i})}{f_{s_1^c}(y_{2i} \mid x_{2i})} = \frac{E_{s_{2u}}(w^u_{2i} - 1 \mid y_{2i}, x_{2i})\, E_{s_{2u}}(w^u_{2i} \mid x_{2i})}{E_{s_{2u}}(w^u_{2i} - 1 \mid x_{2i})\, E_{s_{2u}}(w^u_{2i} \mid y_{2i}, x_{2i})} \tag{4.8}$$

5. Prediction of finite population totals under informative sampling

Sverchkov and Pfeffermann (2004) develop various methods for prediction of finite population totals under informative sampling for a single occasion using only information obtained from that occasion. In this and subsequent sections we extend these methods to predict finite population totals under informative sampling using data obtained from sampling on two occasions.

Let $T_2 = \sum_{i=1}^{N} y_{2i}$ define the population total that we want to predict using the [...] variables that may contain some or all of the design variables. For the prediction process we have the following available information:

(a) The information that comes from the first occasion, denoted by:

$$\Omega_1 = \{(y_{1i}, \pi_{1i}) : i \in s_1;\ (x_{1i}, I_{1i}) : i \in U\} \tag{5.1}$$

(b) The information that comes from the second occasion, denoted by:

$$\Omega_2 = \{(y_{2i}, \pi^m_{2i}) : i \in s_{2m};\ (x_{2i}, I^m_{2i}) : i \in s_1;\ (y_{2i}, \pi^u_{2i}) : i \in s_{2u};\ (x_{2i}, I^u_{2i}) : i \in s_1^c\} \tag{5.2}$$

Thus the available information for the prediction process is $\Omega = \Omega_1 \cup \Omega_2$.

Let $\hat{T}_2 = \hat{T}_2(\Omega)$ define the predictor. The mean square error (MSE) of $\hat{T}_2$ given $\Omega$ with respect to the population pdf is defined by:

$$MSE(\hat{T}_2) = E_p[(\hat{T}_2 - T_2)^2 \mid \Omega] = [\hat{T}_2 - E_p(T_2 \mid \Omega)]^2 + V_p(T_2 \mid \Omega) \tag{5.3}$$

Note that $V_p(T_2 \mid \Omega)$ does not depend on $\hat{T}_2$, thus $MSE(\hat{T}_2)$ is minimized when $\hat{T}_2 = E_p(T_2 \mid \Omega)$.
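The decomposition of the MSE into a squared deviation plus a variance can be verified in a few lines. In the sketch below, $\Omega$ stands for the available information and $\mu = E_p(T_2 \mid \Omega)$ is shorthand; both symbols are introduced here for illustration. The cross term vanishes because $\hat{T}_2$ and $\mu$ are functions of $\Omega$ and are therefore fixed under the conditional expectation.

```latex
\begin{aligned}
E_p[(\hat{T}_2 - T_2)^2 \mid \Omega]
&= E_p\{[(\hat{T}_2 - \mu) - (T_2 - \mu)]^2 \mid \Omega\},
   \qquad \mu = E_p(T_2 \mid \Omega) \\
&= (\hat{T}_2 - \mu)^2
   - 2(\hat{T}_2 - \mu)\, E_p(T_2 - \mu \mid \Omega)
   + E_p[(T_2 - \mu)^2 \mid \Omega] \\
&= (\hat{T}_2 - \mu)^2 + V_p(T_2 \mid \Omega),
   \qquad \text{since } E_p(T_2 - \mu \mid \Omega) = 0 .
\end{aligned}
```

Only the first term depends on the predictor, and it is zero exactly when the predictor equals the conditional expectation of the total.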

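Now $E_p(T_2 \mid \Omega)$ can be decomposed as:

$$\begin{aligned}
E_p(T_2 \mid \Omega) &= \sum_{i=1}^{N} E_p(y_{2i} \mid \Omega) \\
&= \sum_{i \in s_{2m}} E_p(y_{2i} \mid \Omega, I^m_{2i} = 1) + \sum_{i \in s_{2m}^c} E_p(y_{2i} \mid \Omega, I^m_{2i} = 0) \\
&\quad + \sum_{i \in s_{2u}} E_p(y_{2i} \mid \Omega, I^u_{2i} = 1) + \sum_{i \in s_{2u}^c} E_p(y_{2i} \mid \Omega, I^u_{2i} = 0) \\
&= \sum_{i \in s_{2m}} E_{s_{2m}}(y_{2i} \mid \Omega) + \sum_{j \in s_{2m}^c} E_{s_{2m}^c}(y_{2j} \mid y_{1j}, x_j) \\
&\quad + \sum_{i \in s_{2u}} E_{s_{2u}}(y_{2i} \mid \Omega) + \sum_{j \in s_{2u}^c} E_{s_{2u}^c}(y_{2j} \mid x_{2j})
\end{aligned} \tag{5.4}$$

where $s_{2m}^c = s_1 - s_{2m}$ and $s_{2u}^c = s_1^c - s_{2u}$, and where in the last equality we assume that $\{y_{2j} : j \notin s_{2m}\}$ and $\{(y_{2i}, \pi^m_{2i}) : i \in s_{2m}\}$ are independent given $(y_{1j}, x_j)$, and also that $\{y_{2j} : j \notin s_{2u}\}$ and $\{(y_{2i}, \pi^u_{2i}) : i \in s_{2u}\}$ are independent given $x_{2j}$.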

But we know the values $\{y_{2i} : i \in s_{2m}\}$ and $\{y_{2i} : i \in s_{2u}\}$, so that to predict $T_2$ we need to predict the $Y_2$ values not in $s_{2m}$ and not in $s_{2u}$. Equation (5.4) can be written as:

$$E_p(T_2 \mid \Omega) = \sum_{i \in s_{2m}} y_{2i} + \sum_{i \in s_{2u}} y_{2i} + \sum_{j \in s_{2m}^c} E_{s_{2m}^c}(y_{2j} \mid y_{1j}, x_j) + \sum_{j \in s_{2u}^c} E_{s_{2u}^c}(y_{2j} \mid x_{2j}) \tag{5.5}$$

Thus the prediction of $T_2$ reduces to the prediction of $E_{s_{2m}^c}(y_{2j} \mid y_{1j}, x_j)$ and $E_{s_{2u}^c}(y_{2j} \mid x_{2j})$.

6. Prediction methods

In this section we consider the non-parametric and semi-parametric prediction of $T_2$.

6.1 Non-parametric prediction

In this subsection we consider estimation of the expectations $E_{s_{2m}^c}(y_{2j} \mid y_{1j}, x_j)$ and $E_{s_{2u}^c}(y_{2j} \mid x_{2j})$, and hence prediction of $T_2$, based on estimation of only sample expectations. The key to this method is the pair of relationships (4.2) and (4.6). These relationships suggest the following two-step procedure:

Step-one:

(a) Estimate $E_{s_{2m}}(w^m_{2i} \mid y_{1i}, x_i)$, and hence $q_{mi} = (w^m_{2i} - 1) / E_{s_{2m}}(w^m_{2i} - 1 \mid y_{1i}, x_i)$, by regressing $w^m_{2i}$ against $(y_{1i}, x_i)$, $i \in s_{2m}$. Denote the resulting estimate by:

$$\hat{q}_{mi} = \frac{w^m_{2i} - 1}{\hat{E}_{s_{2m}}(w^m_{2i} \mid y_{1i}, x_i) - 1} \tag{6.1}$$

and let $y^m_{2i} = \hat{q}_{mi}\, y_{2i}$.

For further discussion on estimation of this conditional expectation, under single stage informative sampling, see Eideh (2010).

(b) Estimate $E_{s_{2u}}(w^u_{2i} \mid x_{2i})$, and hence $q_{ui} = (w^u_{2i} - 1) / E_{s_{2u}}(w^u_{2i} - 1 \mid x_{2i})$, by regressing $w^u_{2i}$ against $x_{2i}$, $i \in s_{2u}$. Denote the resulting estimate by:

$$\hat{q}_{ui} = \frac{w^u_{2i} - 1}{\hat{E}_{s_{2u}}(w^u_{2i} \mid x_{2i}) - 1} \tag{6.2}$$

and let $y^u_{2i} = \hat{q}_{ui}\, y_{2i}$.

Step-two:

(a) Estimate $E_{s_{2m}}(y^m_{2i} \mid y_{1i}, x_i)$ by regressing $y^m_{2i}$ against $(y_{1i}, x_i)$ and substitute in (4.2) to get the estimate of $E_{s_{2m}^c}(y_{2i} \mid y_{1i}, x_i)$.

(b) Estimate $E_{s_{2u}}(y^u_{2i} \mid x_{2i})$ by regressing $y^u_{2i}$ against $x_{2i}$ and substitute in (4.6) to get the estimate of $E_{s_{2u}^c}(y_{2i} \mid x_{2i})$.

Thus by (5.5) the prediction of $T_2$ is given by:

$$\hat{T}_{2,1} = \sum_{i \in s_{2m}} y_{2i} + \sum_{i \in s_{2u}} y_{2i} + \sum_{j \in s_{2m}^c} \hat{E}_{s_{2m}}(\hat{q}_{mj}\, y_{2j} \mid y_{1j}, x_j) + \sum_{j \in s_{2u}^c} \hat{E}_{s_{2u}}(\hat{q}_{uj}\, y_{2j} \mid x_{2j}) \tag{6.3}$$

This predictor depends on the models holding for the matched sample observations $y^m_{2i} = q_{mi}\, y_{2i}$, $i \in s_{2m}$, and the unmatched sample observations $y^u_{2i} = q_{ui}\, y_{2i}$, $i \in s_{2u}$.

Another predictor can be introduced, based on the estimation of $E_{s_{2m}}(y_{2j} \mid y_{1j}, x_j)$ from the matched sample data and the estimation of $E_{s_{2u}}(y_{2j} \mid x_{2j})$ from the unmatched sample data. The estimator depends on the relationship:

$$\begin{aligned}
&\sum_{j \in s_{2m}^c} E_{s_{2m}^c}(y_{2j} \mid y_{1j}, x_j) + \sum_{j \in s_{2u}^c} E_{s_{2u}^c}(y_{2j} \mid x_{2j}) \\
&\quad = \sum_{j \in s_{2m}^c} E_{s_{2m}}(y_{2j} \mid y_{1j}, x_j) + \sum_{j \in s_{2m}^c} E_{s_{2m}^c}\{[y_{2j} - E_{s_{2m}}(y_{2j} \mid y_{1j}, x_j)] \mid y_{1j}, x_j\} \\
&\qquad + \sum_{j \in s_{2u}^c} E_{s_{2u}}(y_{2j} \mid x_{2j}) + \sum_{j \in s_{2u}^c} E_{s_{2u}^c}\{[y_{2j} - E_{s_{2u}}(y_{2j} \mid x_{2j})] \mid x_{2j}\} \\
&\quad \approx \sum_{j \in s_{2m}^c} E_{s_{2m}}(y_{2j} \mid y_{1j}, x_j) + (n - n_m)\, E_{s_{2m}^c}[y_{2j} - E_{s_{2m}}(y_{2j} \mid y_{1j}, x_j)] \\
&\qquad + \sum_{j \in s_{2u}^c} E_{s_{2u}}(y_{2j} \mid x_{2j}) + (N - n - n_u)\, E_{s_{2u}^c}[y_{2j} - E_{s_{2u}}(y_{2j} \mid x_{2j})]
\end{aligned} \tag{6.4}$$

where $n_m$ is the size of the matched sample and $n_u$ is the size of the unmatched sample. The nature of this approximation is based on Sverchkov and Pfeffermann (2004), equation (4.10).

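Using (4.2) and (4.6), the matched sample-complement and the unmatched sample-complement means in the last two rows of (6.4) can be estimated, respectively, by:

$$\begin{aligned}
\hat{E}_{s_{2m}^c}[y_{2i} - E_{s_{2m}}(y_{2i} \mid y_{1i}, x_i)]
&= \frac{\hat{E}_{s_{2m}}\{(w^m_{2i} - 1)[y_{2i} - \hat{E}_{s_{2m}}(y_{2i} \mid y_{1i}, x_i)]\}}{\hat{E}_{s_{2m}}(w^m_{2i} - 1)} \\
&= \frac{1}{n_m} \sum_{i \in s_{2m}} \frac{w^m_{2i} - 1}{\frac{1}{n_m} \sum_{l \in s_{2m}} w^m_{2l} - 1}\, [y_{2i} - \hat{E}_{s_{2m}}(y_{2i} \mid y_{1i}, x_i)]
\end{aligned} \tag{6.5}$$

and

$$\begin{aligned}
\hat{E}_{s_{2u}^c}[y_{2i} - E_{s_{2u}}(y_{2i} \mid x_{2i})]
&= \frac{\hat{E}_{s_{2u}}\{(w^u_{2i} - 1)[y_{2i} - \hat{E}_{s_{2u}}(y_{2i} \mid x_{2i})]\}}{\hat{E}_{s_{2u}}(w^u_{2i} - 1)} \\
&= \frac{1}{n_u} \sum_{i \in s_{2u}} \frac{w^u_{2i} - 1}{\frac{1}{n_u} \sum_{l \in s_{2u}} w^u_{2l} - 1}\, [y_{2i} - \hat{E}_{s_{2u}}(y_{2i} \mid x_{2i})]
\end{aligned} \tag{6.6}$$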
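The correction terms above are weighted averages of sample residuals, with each residual multiplied by its sampling weight minus one and divided by the mean weight minus one. A minimal sketch follows; the function name and the input values are hypothetical, and the residuals are assumed to have been computed already from some fitted regression.

```python
def complement_mean_of_residuals(weights, residuals):
    """Average of ((w_i - 1) / (mean(w) - 1)) * r_i over the sample.

    weights   -- sampling weights w_i = 1 / pi_i of the sampled units
    residuals -- r_i = y_i minus the fitted sample regression at unit i
    """
    n = len(weights)
    if n == 0 or n != len(residuals):
        raise ValueError("weights and residuals must be non-empty and equal in length")
    mean_weight = sum(weights) / n
    if mean_weight <= 1.0:
        raise ValueError("the mean sampling weight must exceed 1")
    scaled = ((w - 1.0) / (mean_weight - 1.0) * r for w, r in zip(weights, residuals))
    return sum(scaled) / n


# With equal weights the result is the plain mean of the residuals.
print(complement_mean_of_residuals([2.0, 2.0, 2.0], [1.0, -1.0, 3.0]))
# With unequal weights, residuals of heavily weighted units count for more.
print(complement_mean_of_residuals([2.0, 4.0], [1.0, 3.0]))
```

The same function applies to the matched and the unmatched sample; only the weights and residuals passed in differ.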

Thus we have the following predictor for $T_2$:

$$\begin{aligned}
\hat{T}_{2,2} &= \sum_{i \in s_{2m}} y_{2i} + \sum_{i \in s_{2u}} y_{2i} + \sum_{j \in s_{2m}^c} \hat{E}_{s_{2m}}(y_{2j} \mid y_{1j}, x_j) + \sum_{j \in s_{2u}^c} \hat{E}_{s_{2u}}(y_{2j} \mid x_{2j}) \\
&\quad + (n - n_m)\, \frac{1}{n_m} \sum_{i \in s_{2m}} \frac{w^m_{2i} - 1}{\frac{1}{n_m} \sum_{l \in s_{2m}} w^m_{2l} - 1}\, [y_{2i} - \hat{E}_{s_{2m}}(y_{2i} \mid y_{1i}, x_i)] \\
&\quad + (N - n - n_u)\, \frac{1}{n_u} \sum_{i \in s_{2u}} \frac{w^u_{2i} - 1}{\frac{1}{n_u} \sum_{l \in s_{2u}} w^u_{2l} - 1}\, [y_{2i} - \hat{E}_{s_{2u}}(y_{2i} \mid x_{2i})]
\end{aligned} \tag{6.7}$$

This predictor is fully determined by estimating only the conditional expectations $E_{s_{2m}}(y_{2j} \mid y_{1j}, x_j)$ and $E_{s_{2u}}(y_{2j} \mid x_{2j})$, which can be carried out using an appropriate regression analysis.

6.2 Semi-parametric estimation under given matched sample-complement and given unmatched sample-complement models

Following Sverchkov and Pfeffermann (2004), in this section we show that if the models holding for units outside the matched and unmatched samples can be identified and estimated properly from the matched and unmatched data, it is possible to estimate the unknown parameters of these models without having to estimate the regressions $E_{s_{2m}}(w^m_{2i} \mid y_{1i}, x_i)$ and $E_{s_{2u}}(w^u_{2i} \mid x_{2i})$.

Assume that the matched sample-complement model for units outside the matched sample is:

$$y_{2j} = C_{\beta_m}(y_{1j}, x_j) + \varepsilon_{mj}, \qquad E_{s_{2m}^c}(\varepsilon_{mj} \mid y_{1j}, x_j) = 0, \qquad E_{s_{2m}^c}(\varepsilon_{mj}^2 \mid y_{1j}, x_j) = \sigma_m^2\, r_m(y_{1j}, x_j) \tag{6.8}$$

where $C_{\beta_m}(y_{1j}, x_j)$ is a known function of $(y_{1j}, x_j)$ that depends on an unknown vector parameter $\beta_m$, and $r_m(y_{1j}, x_j)$ is a known function of $(y_{1j}, x_j)$, with $\sigma_m^2$ unknown.

The vector parameter $\beta_m$ can be estimated by:

$$\begin{aligned}
\beta_m &= \arg\min_{\tilde{\beta}_m} E_{s_{2m}^c}\left\{ \frac{[y_{2j} - C_{\tilde{\beta}_m}(y_{1j}, x_j)]^2}{r_m(y_{1j}, x_j)} \,\Big|\, y_{1j}, x_j \right\} \\
&= \arg\min_{\tilde{\beta}_m} E_{s_{2m}}\left\{ \frac{w^m_{2j} - 1}{E_{s_{2m}}(w^m_{2j} - 1 \mid y_{1j}, x_j)}\, \frac{[y_{2j} - C_{\tilde{\beta}_m}(y_{1j}, x_j)]^2}{r_m(y_{1j}, x_j)} \,\Big|\, y_{1j}, x_j \right\}
\end{aligned} \tag{6.9}$$

where the second row of (6.9) is obtained using (4.2).

Thus the vector parameter $\beta_m$ can be estimated, based only on the matched sample data, by:

$$\hat{\beta}_{m1} = \arg\min_{\tilde{\beta}_m} \sum_{i \in s_{2m}} \hat{q}_{mi}\, \frac{[y_{2i} - C_{\tilde{\beta}_m}(y_{1i}, x_i)]^2}{r_m(y_{1i}, x_i)} \tag{6.10}$$

where $\hat{q}_{mi} = (w^m_{2i} - 1) / [\hat{E}_{s_{2m}}(w^m_{2i} \mid y_{1i}, x_i) - 1]$.

Similarly, suppose that the unmatched sample-complement model for units outside the unmatched sample is of the form:

$$y_{2j} = C_{\beta_u}(x_{2j}) + \varepsilon_{uj}, \qquad E_{s_{2u}^c}(\varepsilon_{uj} \mid x_{2j}) = 0, \qquad E_{s_{2u}^c}(\varepsilon_{uj}^2 \mid x_{2j}) = \sigma_u^2\, r_u(x_{2j}) \tag{6.11}$$

where $C_{\beta_u}(x_{2j})$ is a known function of $x_{2j}$ that depends on an unknown vector parameter $\beta_u$, and $r_u(x_{2j})$ is a known function of $x_{2j}$, with $\sigma_u^2$ unknown.

The vector parameter $\beta_u$ can be estimated by:

$$\begin{aligned}
\beta_u &= \arg\min_{\tilde{\beta}_u} E_{s_{2u}^c}\left\{ \frac{[y_{2j} - C_{\tilde{\beta}_u}(x_{2j})]^2}{r_u(x_{2j})} \,\Big|\, x_{2j} \right\} \\
&= \arg\min_{\tilde{\beta}_u} E_{s_{2u}}\left\{ \frac{w^u_{2j} - 1}{E_{s_{2u}}(w^u_{2j} - 1 \mid x_{2j})}\, \frac{[y_{2j} - C_{\tilde{\beta}_u}(x_{2j})]^2}{r_u(x_{2j})} \,\Big|\, x_{2j} \right\}
\end{aligned} \tag{6.12}$$

where the second row of (6.12) is obtained using (4.6).

Thus the vector parameter $\beta_u$ can be estimated, based only on the unmatched sample data, by:

$$\hat{\beta}_{u1} = \arg\min_{\tilde{\beta}_u} \sum_{i \in s_{2u}} \hat{q}_{ui}\, \frac{[y_{2i} - C_{\tilde{\beta}_u}(x_{2i})]^2}{r_u(x_{2i})} \tag{6.13}$$

where $\hat{q}_{ui} = (w^u_{2i} - 1) / [\hat{E}_{s_{2u}}(w^u_{2i} \mid x_{2i}) - 1]$.

In our situation we have the following estimates of $E_{s_{2m}^c}(y_{2j} \mid y_{1j}, x_j)$ and $E_{s_{2u}^c}(y_{2j} \mid x_{2j})$:

$$\hat{E}_{s_{2m}^c}(y_{2j} \mid y_{1j}, x_j) = C_{\hat{\beta}_{m1}}(y_{1j}, x_j) \tag{6.14}$$

and

$$\hat{E}_{s_{2u}^c}(y_{2j} \mid x_{2j}) = C_{\hat{\beta}_{u1}}(x_{2j}) \tag{6.15}$$

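Thus the predictor of the finite population total, $T_2$, is given by:

$$\begin{aligned}
\hat{T}_{2,3} &= \sum_{i \in s_{2m}} y_{2i} + \sum_{i \in s_{2u}} y_{2i} + \sum_{j \in s_{2m}^c} \hat{E}_{s_{2m}^c}(y_{2j} \mid y_{1j}, x_j) + \sum_{j \in s_{2u}^c} \hat{E}_{s_{2u}^c}(y_{2j} \mid x_{2j}) \\
&= \sum_{i \in s_{2m}} y_{2i} + \sum_{i \in s_{2u}} y_{2i} + \sum_{j \in s_{2m}^c} C_{\hat{\beta}_{m1}}(y_{1j}, x_j) + \sum_{j \in s_{2u}^c} C_{\hat{\beta}_{u1}}(x_{2j})
\end{aligned} \tag{6.16}$$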

Now we can base our prediction of the finite population total, $T_2$, on estimates obtained without conditioning on $(y_{1i}, x_i)$ or $x_{2i}$, because from (6.8) and (6.11) we can deduce that:

$$\beta_m = \arg\min_{\tilde{\beta}_m} E_{s_{2m}^c}\left\{ \frac{[y_{2j} - C_{\tilde{\beta}_m}(y_{1j}, x_j)]^2}{r_m(y_{1j}, x_j)} \,\Big|\, y_{1j}, x_j \right\} = \arg\min_{\tilde{\beta}_m} E_{s_{2m}^c}\left\{ \frac{[y_{2j} - C_{\tilde{\beta}_m}(y_{1j}, x_j)]^2}{r_m(y_{1j}, x_j)} \right\} \tag{6.17}$$

and

$$\beta_u = \arg\min_{\tilde{\beta}_u} E_{s_{2u}^c}\left\{ \frac{[y_{2j} - C_{\tilde{\beta}_u}(x_{2j})]^2}{r_u(x_{2j})} \,\Big|\, x_{2j} \right\} = \arg\min_{\tilde{\beta}_u} E_{s_{2u}^c}\left\{ \frac{[y_{2j} - C_{\tilde{\beta}_u}(x_{2j})]^2}{r_u(x_{2j})} \right\} \tag{6.18}$$

Thus, application of (4.2) and (4.6) to the right hand sides of (6.17) and (6.18), but without conditioning on $(y_{1i}, x_i)$ or $x_{2i}$, and since $E_{s_{2m}}(w^m_{2i})$ and $E_{s_{2u}}(w^u_{2i})$ are constants, leads to the following estimates:

$$\hat{\beta}_{m2} = \arg\min_{\tilde{\beta}_m} \sum_{i \in s_{2m}} (w^m_{2i} - 1)\, \frac{[y_{2i} - C_{\tilde{\beta}_m}(y_{1i}, x_i)]^2}{r_m(y_{1i}, x_i)} \tag{6.19}$$

and

$$\hat{\beta}_{u2} = \arg\min_{\tilde{\beta}_u} \sum_{i \in s_{2u}} (w^u_{2i} - 1)\, \frac{[y_{2i} - C_{\tilde{\beta}_u}(x_{2i})]^2}{r_u(x_{2i})} \tag{6.20}$$

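Hence we obtain the following predictor of the finite population total, $T_2$:

$$\hat{T}_{2,4} = \sum_{i \in s_{2m}} y_{2i} + \sum_{i \in s_{2u}} y_{2i} + \sum_{j \in s_{2m}^c} C_{\hat{\beta}_{m2}}(y_{1j}, x_j) + \sum_{j \in s_{2u}^c} C_{\hat{\beta}_{u2}}(x_{2j}) \tag{6.21}$$

Note that the predictor $\hat{T}_{2,4}$ does not require the identification and estimation of the expectations $E_{s_{2m}}(w^m_{2i} \mid y_{1i}, x_i)$ and $E_{s_{2u}}(w^u_{2i} \mid x_{2i})$, while $\hat{T}_{2,3}$ does.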
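For a concrete view of the semi-parametric estimates that use the weights minus one, the sketch below treats one special case. It assumes, purely for illustration, a linear matched sample-complement model in $y_1$ alone (an intercept plus a slope) and a variance function identically equal to one, so the minimisation reduces to weighted least squares with a closed-form solution. The function name and the data are hypothetical.

```python
def fit_linear_wls(y1, y2, weights):
    """Minimise sum_i weights[i] * (y2[i] - b0 - b1 * y1[i]) ** 2 over (b0, b1)."""
    total_weight = sum(weights)
    mean_y1 = sum(w * a for w, a in zip(weights, y1)) / total_weight
    mean_y2 = sum(w * b for w, b in zip(weights, y2)) / total_weight
    s_xx = sum(w * (a - mean_y1) ** 2 for w, a in zip(weights, y1))
    s_xy = sum(w * (a - mean_y1) * (b - mean_y2)
               for w, a, b in zip(weights, y1, y2))
    slope = s_xy / s_xx
    intercept = mean_y2 - slope * mean_y1
    return intercept, slope


# Hypothetical matched-sample data: first-occasion values, second-occasion
# values and matched sampling weights w = 1 / pi.
y1_matched = [1.0, 2.0, 3.0, 4.0]
y2_matched = [3.0, 5.0, 7.0, 9.0]
w_matched = [2.0, 5.0, 3.0, 10.0]

# The weights minus one play the role of regression weights.
regression_weights = [w - 1.0 for w in w_matched]
b0, b1 = fit_linear_wls(y1_matched, y2_matched, regression_weights)
print(b0, b1)
```

Predicted values for units outside the matched sample would then be the fitted line evaluated at their first-occasion values; nonlinear models or non-constant variance functions would need a numerical optimiser instead of this closed form.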

7. Examples

7.1 Prediction with no auxiliary variables

In this section we assume that there are no auxiliary variables $x_2$ and $y_1$, and in the next section we assume the auxiliary variable $y_1$. So the predictor is given by:

$$\hat{T}_2 = \sum_{i \in s_{2m}} y_{2i} + \sum_{i \in s_{2u}} y_{2i} + \sum_{j \in s_{2m}^c} \hat{E}_{s_{2m}^c}(y_{2j}) + \sum_{j \in s_{2u}^c} \hat{E}_{s_{2u}^c}(y_{2j}) \tag{7.1}$$

It follows from (4.2) and (4.6) that:

$$\begin{aligned}
\hat{T}_2 &= \sum_{i \in s_{2m}} y_{2i} + \sum_{i \in s_{2u}} y_{2i} + \sum_{j \in s_{2m}^c} \frac{\hat{E}_{s_{2m}}[(w^m_{2i} - 1)\, y_{2i}]}{\hat{E}_{s_{2m}}(w^m_{2i} - 1)} + \sum_{j \in s_{2u}^c} \frac{\hat{E}_{s_{2u}}[(w^u_{2i} - 1)\, y_{2i}]}{\hat{E}_{s_{2u}}(w^u_{2i} - 1)} \\
&= \sum_{i \in s_{2m}} y_{2i} + \sum_{i \in s_{2u}} y_{2i} + (n - n_m)\, \frac{\hat{E}_{s_{2m}}[(w^m_{2i} - 1)\, y_{2i}]}{\hat{E}_{s_{2m}}(w^m_{2i} - 1)} + (N - n - n_u)\, \frac{\hat{E}_{s_{2u}}[(w^u_{2i} - 1)\, y_{2i}]}{\hat{E}_{s_{2u}}(w^u_{2i} - 1)}
\end{aligned} \tag{7.2}$$

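Estimating the four unknown expectations in (7.2) by the respective sample means yields the following estimate:

$$\hat{T}_2 = \sum_{i \in s_{2m}} y_{2i} + \sum_{i \in s_{2u}} y_{2i} + (n - n_m)\, \frac{\sum_{i \in s_{2m}} (w^m_{2i} - 1)\, y_{2i}}{\sum_{i \in s_{2m}} (w^m_{2i} - 1)} + (N - n - n_u)\, \frac{\sum_{i \in s_{2u}} (w^u_{2i} - 1)\, y_{2i}}{\sum_{i \in s_{2u}} (w^u_{2i} - 1)} \tag{7.3}$$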
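When there are no auxiliary variables, replacing each expectation by a sample mean gives a predictor that is simple to compute: the observed second-occasion values are summed, and each group of unobserved units is predicted by a ratio of sums weighted by the sampling weight minus one. The sketch below assumes $n - n_m$ unobserved units in $s_1$ and $N - n - n_u$ unobserved units outside $s_1$; the function name and the numbers are hypothetical.

```python
def predict_total_no_aux(y_m, w_m, y_u, w_u, n, N):
    """Predict the second-occasion total from matched and unmatched samples.

    y_m, w_m -- second-occasion values and weights for the matched sample
    y_u, w_u -- second-occasion values and weights for the unmatched sample
    n, N     -- first-occasion sample size and population size
    """
    n_m, n_u = len(y_m), len(y_u)
    # Estimated mean of y2 for units of s1 that are not in the matched sample.
    mean_matched_complement = (
        sum((w - 1.0) * y for y, w in zip(y_m, w_m)) / sum(w - 1.0 for w in w_m)
    )
    # Estimated mean of y2 for units outside s1 that are not in the unmatched sample.
    mean_unmatched_complement = (
        sum((w - 1.0) * y for y, w in zip(y_u, w_u)) / sum(w - 1.0 for w in w_u)
    )
    return (sum(y_m) + sum(y_u)
            + (n - n_m) * mean_matched_complement
            + (N - n - n_u) * mean_unmatched_complement)


# Hypothetical data with equal weights within each sample, so each
# complement mean reduces to the corresponding plain sample mean.
total = predict_total_no_aux(
    y_m=[1.0, 2.0, 3.0], w_m=[2.0, 2.0, 2.0],
    y_u=[4.0, 6.0], w_u=[5.0, 5.0],
    n=6, N=16,
)
print(total)  # 6 + 10 + 3 * 2 + 8 * 5 = 62
```

With equal weights the predictor is the familiar expansion of each sample mean to its unobserved group; unequal weights shift the predicted means towards the units that were less likely to be sampled.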

For sampling designs such that $\sum_{i \in s_{2m}} w^m_{2i} = n$ and $\sum_{i \in s_{2u}} w^u_{2i} = N - n$