ESTIMATION OF FINITE POPULATION MEAN USING RANDOM NON–RESPONSE IN SURVEY SAMPLING


Housila P. Singh

School of Studies in Statistics Vikram University, Ujjain-456010 India


Ramkrishna S. Solanki

School of Studies in Statistics Vikram University, Ujjain-456010 India


Abstract

This paper considers the problem of estimating the population mean under three different situations of random non-response envisaged by Singh et al (2000). Some ratio and product type estimators are proposed and their properties are studied under the assumption that the number of sampling units on which information cannot be obtained owing to random non-response follows some distribution. The suggested estimators are compared with the usual ratio and product estimators. An empirical study is carried out to show the performance of the suggested estimators relative to the usual unbiased estimator and the ratio and product estimators. A generalized version of the proposed ratio and product estimators is also given.

Keywords: Auxiliary variable, Study variate, Random non-response, Bias and Variance.

1. Introduction


observation are usually encountered in opinion polls, market research surveys, mail enquiries, socio-economic investigations, medical studies and many other experiments. Statisticians have recognized for some time that failure to account for the stochastic nature of incompleteness can spoil inference. An obvious question arises: what does one need to assume to justify ignoring the incompleteness mechanism? Rubin (1976) advocated three concepts: missing at random (MAR), observed at random (OAR) and parameter distribution (PD). Rubin defined “the data are MAR if the probability of the observed missingness pattern, given the observed and unobserved data, does not depend on the value of the unobserved data”. Heitjan and Basu (1996) have clearly distinguished the meanings of missing at random (MAR) and missing completely at random (MCAR); see Singh et al (2000). Tracy and Osahan (1994) studied the effect of random non-response on the conventional ratio estimator of the population mean in two situations: (i) non-response on the study as well as the auxiliary variables and (ii) non-response on the study variable only. Singh et al (2000) suggested three regression type estimators, further generalized by Singh and Tracy (2001) and Singh et al (2007), in the presence of random non-response in different situations, under the assumption that the number of sampling units on which information cannot be obtained due to random non-response follows some distribution. Singh and Joarder (1998), Singh et al (2003) and Ahmed et al (2005) have studied the effect of random non-response on different estimators of population variance.

In this paper we study the effect of random non-response on the Tuteja and Bahl (1991) ratio and product estimators and on their generalized version in different situations, under the assumption that the number of sampling units on which information cannot be obtained owing to random non-response follows some distribution.

2. Distribution of Random Non-response and Some Expected values

Let U = (U1, U2, ..., UN) denote the population of N units from which a simple random sample of size n is drawn without replacement. If r {r = 0, 1, 2, …, (n-2)} denotes the number of sampling units on which information could not be obtained due to random non-response, then the remaining (n-r) units in the sample can be treated as a simple random sample without replacement (SRSWOR) from U. We assume that r lies between 0 and (n-2), i.e. 0 ≤ r ≤ (n-2). We also assume that if p denotes the probability of non-response among the (n-2) possible values of responses, then r has the discrete distribution:

$$P(r) = \frac{n-r}{nq+2p}\;{}^{n-2}C_{r}\;p^{r}q^{\,n-2-r} \tag{2.1}$$

where q = (1-p) and r = 0,1,2,…,(n-2).
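The distribution (2.1) can be checked numerically. The following is a minimal sketch, assuming illustrative values n = 20 and p = 0.35 (the function name `nonresponse_pmf` is hypothetical, not from the paper). It verifies that the probabilities sum to one and that the expected value of 1/(n-r) equals 1/(nq+2p), which is the quantity that takes the place of 1/n in the variance expressions that follow.

```python
from math import comb

def nonresponse_pmf(r, n, p):
    """P(r) from (2.1): probability that r of the n sampled units do not respond."""
    if not 0 <= r <= n - 2:
        return 0.0
    q = 1.0 - p
    return (n - r) / (n * q + 2 * p) * comb(n - 2, r) * p**r * q**(n - 2 - r)

# Illustrative values (assumptions for this sketch only)
n, p = 20, 0.35
probs = [nonresponse_pmf(r, n, p) for r in range(n - 1)]  # r = 0, ..., n-2

total = sum(probs)
# E[1/(n-r)] under (2.1)
mean_inverse_size = sum(pr / (n - r) for r, pr in enumerate(probs))

print(total, mean_inverse_size, 1.0 / (n * (1 - p) + 2 * p))
```

The factor (n-r)/(nq+2p) in (2.1) is what makes the second identity hold exactly.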

Let us define

$$\varepsilon_{0(n-r)} = \frac{\bar{y}_{(n-r)}}{\bar{Y}} - 1,\qquad \varepsilon_{1(n-r)} = \frac{\bar{x}_{(n-r)}}{\bar{X}} - 1,\qquad \varepsilon_{1} = \frac{\bar{x}}{\bar{X}} - 1,$$

$$\bar{y}_{(n-r)} = (n-r)^{-1}\sum_{i=1}^{n-r} y_i,\qquad \bar{x}_{(n-r)} = (n-r)^{-1}\sum_{i=1}^{n-r} x_i,\qquad \bar{x} = n^{-1}\sum_{i=1}^{n} x_i,$$

$$s^2_{y(n-r)} = (n-r-1)^{-1}\sum_{i=1}^{n-r}\{y_i - \bar{y}_{(n-r)}\}^2,\qquad s^2_{x(n-r)} = (n-r-1)^{-1}\sum_{i=1}^{n-r}\{x_i - \bar{x}_{(n-r)}\}^2,$$

$$s^2_{x} = (n-1)^{-1}\sum_{i=1}^{n}(x_i - \bar{x})^2,\qquad s_{xy(n-r)} = (n-r-1)^{-1}\sum_{i=1}^{n-r}\{x_i - \bar{x}_{(n-r)}\}\{y_i - \bar{y}_{(n-r)}\},$$

$$S^2_{y} = (N-1)^{-1}\sum_{i=1}^{N}(y_i - \bar{Y})^2,\qquad S^2_{x} = (N-1)^{-1}\sum_{i=1}^{N}(x_i - \bar{X})^2,\qquad S_{xy} = (N-1)^{-1}\sum_{i=1}^{N}(x_i - \bar{X})(y_i - \bar{Y}),$$

$$\bar{X} = N^{-1}\sum_{i=1}^{N} x_i \quad\text{and}\quad \bar{Y} = N^{-1}\sum_{i=1}^{N} y_i .$$

The probability model defined at (2.1) is free from actual data values, hence it can be considered as a model suitable for MAR situation. Then under the probability model given by (2.1), we have following results:

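$$E\{\varepsilon_{0(n-r)}\} = E\{\varepsilon_{1(n-r)}\} = E(\varepsilon_{1}) = 0,$$

$$E\{\varepsilon^2_{0(n-r)}\} = \left[\frac{1}{nq+2p} - \frac{1}{N}\right]C_y^2,\qquad E\{\varepsilon^2_{1(n-r)}\} = \left[\frac{1}{nq+2p} - \frac{1}{N}\right]C_x^2,\qquad E(\varepsilon_1^2) = \left(\frac{1}{n} - \frac{1}{N}\right)C_x^2,$$

$$E\{\varepsilon_{0(n-r)}\,\varepsilon_{1(n-r)}\} = \left[\frac{1}{nq+2p} - \frac{1}{N}\right]\rho\,C_yC_x,\qquad E\{\varepsilon_{0(n-r)}\,\varepsilon_{1}\} = \left(\frac{1}{n} - \frac{1}{N}\right)\rho\,C_yC_x,$$

$$E\{\varepsilon_{1(n-r)}\,\varepsilon_{1}\} = \left(\frac{1}{n} - \frac{1}{N}\right)C_x^2,$$

where $C_y = S_y/\bar{Y}$, $C_x = S_x/\bar{X}$ and $\rho = S_{xy}/(S_xS_y)$.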

It may be noted that if p = 0, i.e. there is no non-response, the above expected values coincide with the usual results.
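The reduction at p = 0 can be seen directly from the multiplier in the second-order moments. The sketch below is illustrative only: the function names and the values n = 20, N = 50 are assumptions for the example. It compares the non-response multiplier 1/(nq+2p) - 1/N with the usual SRSWOR multiplier 1/n - 1/N.

```python
def nonresponse_factor(n, N, p):
    """Multiplier 1/(nq + 2p) - 1/N appearing in the second-order moments."""
    q = 1.0 - p
    return 1.0 / (n * q + 2.0 * p) - 1.0 / N

def srswor_factor(n, N):
    """Usual finite-population multiplier 1/n - 1/N for a full-response sample."""
    return 1.0 / n - 1.0 / N

# Illustrative sizes (assumptions for this sketch only)
n, N = 20, 50
factors = [nonresponse_factor(n, N, p) for p in (0.0, 0.1, 0.2, 0.35)]
print(factors, srswor_factor(n, N))
```

For n > 2 the multiplier grows with p, so the second-order moments, and hence the first-order variances built from them, are inflated relative to the full-response case.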

3. Proposed Strategies for Finite Population Mean

3.1 Strategy – I:

We consider the situation in which random non-response exists on both the study variable y and the auxiliary variable x, and the population mean $\bar{X}$ of the auxiliary variable x is known. The classical ratio and product estimators in this situation are respectively defined by

$$t_{1R} = \bar{y}_{(n-r)}\left\{\frac{\bar{X}}{\bar{x}_{(n-r)}}\right\} \tag{3.1}$$

$$t_{1P} = \bar{y}_{(n-r)}\left\{\frac{\bar{x}_{(n-r)}}{\bar{X}}\right\} \tag{3.2}$$

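To the first degree of approximation, the biases and variances of $t_{1R}$ and $t_{1P}$ are respectively given by

$$B(t_{1R}) = \left[\frac{1}{nq+2p} - \frac{1}{N}\right]\left(\frac{S_x^2}{\bar{X}}\right)(R - \beta) \tag{3.3}$$

$$B(t_{1P}) = \left[\frac{1}{nq+2p} - \frac{1}{N}\right]\left(\frac{S_x^2}{\bar{X}}\right)\beta \tag{3.4}$$

$$\mathrm{Var}(t_{1R}) = \left[\frac{1}{nq+2p} - \frac{1}{N}\right]\left[S_y^2 + R\,S_x^2\,(R - 2\beta)\right] \tag{3.5}$$

$$\mathrm{Var}(t_{1P}) = \left[\frac{1}{nq+2p} - \frac{1}{N}\right]\left[S_y^2 + R\,S_x^2\,(R + 2\beta)\right] \tag{3.6}$$

where $R = \bar{Y}/\bar{X}$, $\beta = S_{xy}/S_x^2$ is the population regression coefficient of y on x, $S_{xy} = \sum_{i=1}^{N}(x_i - \bar{X})(y_i - \bar{Y})/(N-1)$, and $S_x^2$ is defined earlier.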

We suggest the modified ratio and product estimators on the lines of Tuteja and Bahl (1991) respectively as,

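$$t_{1Re} = \bar{y}_{(n-r)}\exp\left\{\frac{\bar{X} - \bar{x}_{(n-r)}}{\bar{X} + \bar{x}_{(n-r)}}\right\} \tag{3.7}$$

$$t_{1Pe} = \bar{y}_{(n-r)}\exp\left\{\frac{\bar{x}_{(n-r)} - \bar{X}}{\bar{x}_{(n-r)} + \bar{X}}\right\} \tag{3.8}$$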

The generalized version of $t_{1Re}$ and $t_{1Pe}$ is given by

$$t_1^{(a_1)} = \bar{y}_{(n-r)}\exp\left\{\frac{a_1\,(\bar{x}_{(n-r)} - \bar{X})}{(\bar{x}_{(n-r)} + \bar{X})}\right\} \tag{3.9}$$

where $a_1$ is a constant to be chosen suitably. For $a_1 = -1$, $t_1^{(a_1)}$ reduces to the modified ratio estimator $t_{1Re}$, while for $a_1 = 1$ it reduces to the modified product estimator $t_{1Pe}$. If we set $a_1 = 0$, the estimator $t_1^{(a_1)}$ boils down to the usual unbiased estimator $\bar{y}_{(n-r)}$.
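These special cases of (3.9) are easy to confirm numerically. The sketch below is illustrative: the function name `t1_general` is hypothetical, and the inputs are the sample summaries reported in the empirical study of Section 5 (response-sample means 580.177 and 1068.292, known population mean 878.16).

```python
from math import exp, isclose

def t1_general(ybar_resp, xbar_resp, Xbar, a1):
    """Generalized exponential-type estimator (3.9) for Strategy I."""
    return ybar_resp * exp(a1 * (xbar_resp - Xbar) / (xbar_resp + Xbar))

# Sample summaries taken from Section 5 of the paper
ybar_resp, xbar_resp, Xbar = 580.177, 1068.292, 878.16

t_ratio_exp = t1_general(ybar_resp, xbar_resp, Xbar, -1)   # a1 = -1: form (3.7)
t_product_exp = t1_general(ybar_resp, xbar_resp, Xbar, 1)  # a1 = +1: form (3.8)
t_unbiased = t1_general(ybar_resp, xbar_resp, Xbar, 0)     # a1 = 0: response-sample mean

print(t_ratio_exp, t_unbiased, t_product_exp)
```

Because the response-sample mean of x exceeds the population mean here, the ratio-type member adjusts the estimate downward and the product-type member adjusts it upward.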

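To obtain the biases and mean squared errors of $t_{1Re}$, $t_{1Pe}$ and $t_1^{(a_1)}$, we express $t_{1Re}$, $t_{1Pe}$ and $t_1^{(a_1)}$ in terms of ε's. We have

$$t_{1Re} = \bar{Y}\,(1 + \varepsilon_{0(n-r)})\exp\left[\frac{-\varepsilon_{1(n-r)}}{2 + \varepsilon_{1(n-r)}}\right] \tag{3.10}$$

$$t_{1Pe} = \bar{Y}\,(1 + \varepsilon_{0(n-r)})\exp\left[\frac{\varepsilon_{1(n-r)}}{2 + \varepsilon_{1(n-r)}}\right] \tag{3.11}$$

$$t_1^{(a_1)} = \bar{Y}\,(1 + \varepsilon_{0(n-r)})\exp\left[\frac{a_1\,\varepsilon_{1(n-r)}}{2 + \varepsilon_{1(n-r)}}\right] \tag{3.12}$$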

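Expanding the right hand sides of (3.10), (3.11) and (3.12) in terms of ε's and neglecting terms of ε's having power greater than two, we have

$$t_{1Re} \cong \bar{Y}\left[1 + \varepsilon_{0(n-r)} - \tfrac{1}{2}\varepsilon_{1(n-r)} + \tfrac{1}{8}\left(3\varepsilon^2_{1(n-r)} - 4\varepsilon_{0(n-r)}\varepsilon_{1(n-r)}\right)\right]$$

or

$$(t_{1Re} - \bar{Y}) \cong \bar{Y}\left[\varepsilon_{0(n-r)} - \tfrac{1}{2}\varepsilon_{1(n-r)} + \tfrac{1}{8}\left(3\varepsilon^2_{1(n-r)} - 4\varepsilon_{0(n-r)}\varepsilon_{1(n-r)}\right)\right] \tag{3.13}$$

$$t_{1Pe} \cong \bar{Y}\left[1 + \varepsilon_{0(n-r)} + \tfrac{1}{2}\varepsilon_{1(n-r)} + \tfrac{1}{8}\left(4\varepsilon_{0(n-r)}\varepsilon_{1(n-r)} - \varepsilon^2_{1(n-r)}\right)\right]$$

or

$$(t_{1Pe} - \bar{Y}) \cong \bar{Y}\left[\varepsilon_{0(n-r)} + \tfrac{1}{2}\varepsilon_{1(n-r)} + \tfrac{1}{8}\left(4\varepsilon_{0(n-r)}\varepsilon_{1(n-r)} - \varepsilon^2_{1(n-r)}\right)\right] \tag{3.14}$$

$$t_1^{(a_1)} \cong \bar{Y}\left[1 + \varepsilon_{0(n-r)} + \frac{a_1}{2}\varepsilon_{1(n-r)} + \frac{a_1(a_1 - 2)}{8}\varepsilon^2_{1(n-r)} + \frac{a_1}{2}\varepsilon_{0(n-r)}\varepsilon_{1(n-r)}\right]$$

or

$$(t_1^{(a_1)} - \bar{Y}) \cong \bar{Y}\left[\varepsilon_{0(n-r)} + \frac{a_1}{2}\varepsilon_{1(n-r)} + \frac{a_1}{8}\left\{(a_1 - 2)\varepsilon^2_{1(n-r)} + 4\varepsilon_{0(n-r)}\varepsilon_{1(n-r)}\right\}\right] \tag{3.15}$$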

Taking expectation of both sides of (3.13), (3.14) and (3.15), we get the biases of $t_{1Re}$, $t_{1Pe}$ and $t_1^{(a_1)}$ to the first degree of approximation, respectively, as

$$B(t_{1Re}) = \left[\frac{1}{nq+2p} - \frac{1}{N}\right]\left(\frac{S_x^2}{8\bar{X}}\right)(3R - 4\beta) \tag{3.16}$$

$$B(t_{1Pe}) = \left[\frac{1}{nq+2p} - \frac{1}{N}\right]\left(\frac{S_x^2}{8\bar{X}}\right)(4\beta - R) \tag{3.17}$$

$$B\{t_1^{(a_1)}\} = \left[\frac{1}{nq+2p} - \frac{1}{N}\right]\left(\frac{a_1 S_x^2}{8\bar{X}}\right)\{(a_1 - 2)R + 4\beta\} \tag{3.18}$$

It is to be noted that for $a_1 = -1$ and $a_1 = 1$ in (3.18) we get the biases of $t_{1Re}$ and $t_{1Pe}$ respectively given by (3.16) and (3.17). It is also interesting to note that if we set $a_1 = -2$ and $a_1 = 2$ in (3.18) we get the biases of the classical ratio ($t_{1R}$) and product ($t_{1P}$) estimators given by (3.3) and (3.4) respectively.
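The four special cases of the general bias expression (3.18) can be verified with a few lines of arithmetic. This is a sketch under assumed, purely hypothetical parameter values (the multiplier `theta` stands for the bracketed factor 1/(nq+2p) - 1/N); the function names are not from the paper.

```python
from math import isclose

def bias_general(a, theta, Sx2, Xbar, R, beta):
    """First-order bias (3.18) of the generalized estimator."""
    return theta * (a * Sx2 / (8.0 * Xbar)) * ((a - 2.0) * R + 4.0 * beta)

def bias_ratio(theta, Sx2, Xbar, R, beta):
    """First-order bias (3.3) of the classical ratio estimator."""
    return theta * (Sx2 / Xbar) * (R - beta)

def bias_product(theta, Sx2, Xbar, R, beta):
    """First-order bias (3.4) of the classical product estimator."""
    return theta * (Sx2 / Xbar) * beta

# Hypothetical parameter values, chosen only to exercise the formulas
args = dict(theta=0.05, Sx2=1.3e6, Xbar=878.16, R=0.6, beta=0.4)

b_minus2 = bias_general(-2.0, **args)
b_plus2 = bias_general(2.0, **args)
b_minus1 = bias_general(-1.0, **args)
b_plus1 = bias_general(1.0, **args)
print(b_minus2, b_plus2, b_minus1, b_plus1)
```

Setting a = -1 and a = 1 gives the forms (3.16) and (3.17), with (3R - 4β) and (4β - R) respectively.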

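Squaring both sides of (3.13), (3.14) and (3.15) and neglecting terms of ε's having power greater than two, we have

$$(t_{1Re} - \bar{Y})^2 = \bar{Y}^2\left[\varepsilon^2_{0(n-r)} + \tfrac{1}{4}\left\{\varepsilon^2_{1(n-r)} - 4\varepsilon_{0(n-r)}\varepsilon_{1(n-r)}\right\}\right] \tag{3.19}$$

$$(t_{1Pe} - \bar{Y})^2 = \bar{Y}^2\left[\varepsilon^2_{0(n-r)} + \tfrac{1}{4}\left\{\varepsilon^2_{1(n-r)} + 4\varepsilon_{0(n-r)}\varepsilon_{1(n-r)}\right\}\right] \tag{3.20}$$

$$\{t_1^{(a_1)} - \bar{Y}\}^2 = \bar{Y}^2\left[\varepsilon^2_{0(n-r)} + \frac{a_1}{4}\left\{a_1\varepsilon^2_{1(n-r)} + 4\varepsilon_{0(n-r)}\varepsilon_{1(n-r)}\right\}\right] \tag{3.21}$$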

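Taking expectation of both sides of (3.19), (3.20) and (3.21), we get the variances of $t_{1Re}$, $t_{1Pe}$ and $t_1^{(a_1)}$ to the first degree of approximation, respectively, as

$$\mathrm{Var}(t_{1Re}) = \left[\frac{1}{nq+2p} - \frac{1}{N}\right]\left[S_y^2 + \left(\frac{R\,S_x^2}{4}\right)(R - 4\beta)\right] \tag{3.22}$$

$$\mathrm{Var}(t_{1Pe}) = \left[\frac{1}{nq+2p} - \frac{1}{N}\right]\left[S_y^2 + \left(\frac{R\,S_x^2}{4}\right)(R + 4\beta)\right] \tag{3.23}$$

$$\mathrm{Var}\{t_1^{(a_1)}\} = \left[\frac{1}{nq+2p} - \frac{1}{N}\right]\left[S_y^2 + \left(\frac{a_1 R\,S_x^2}{4}\right)(a_1 R + 4\beta)\right] \tag{3.24}$$

$$\mathrm{Var}\{\bar{y}_{(n-r)}\} = \left[\frac{1}{nq+2p} - \frac{1}{N}\right]S_y^2 \tag{3.25}$$

where $\mathrm{Var}(t_{1R})$, $\mathrm{Var}(t_{1P})$, $\mathrm{Var}(t_{1Re})$ and $\mathrm{Var}(t_{1Pe})$ are respectively given by (3.5), (3.6), (3.22) and (3.23).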

3.2 Choice of the Optimum Value of the Constant ‘a1’ and Thus the Optimum Estimator

The variance of the proposed class of estimators $t_1^{(a_1)}$ is minimized for

$$a_1 = -\frac{2\beta}{R} \tag{3.26}$$

Thus the resulting minimum variance of $t_1^{(a_1)}$ is given by

$$\min.\mathrm{Var}\{t_1^{(a_1)}\} = \left[\frac{1}{nq+2p} - \frac{1}{N}\right]S_y^2\,(1 - \rho^2) \tag{3.27}$$
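The optimum in (3.26) and the minimum in (3.27) follow from the fact that (3.24) is quadratic in $a_1$. The sketch below checks this numerically. The inputs are stand-ins: the sample estimates reported in Section 5 are used in place of the unknown population quantities, purely for illustration, and the function name is hypothetical.

```python
from math import isclose

def var_general(a, theta, Sy2, Sx2, R, beta):
    """First-order variance (3.24) of the generalized estimator as a function of a."""
    return theta * (Sy2 + (a * R * Sx2 / 4.0) * (a * R + 4.0 * beta))

# Stand-in values: sample estimates from Section 5 used as if they were parameters
theta = 1.0 / 13.7 - 1.0 / 50.0
Sy2, Sx2, R, beta = 295169.6416, 1685838.731, 0.54309, 0.35353

a_opt = -2.0 * beta / R                  # (3.26)
rho2 = beta ** 2 * Sx2 / Sy2             # rho^2 = beta^2 * Sx2 / Sy2
v_min = var_general(a_opt, theta, Sy2, Sx2, R, beta)
v_closed_form = theta * Sy2 * (1.0 - rho2)   # (3.27)

print(a_opt, v_min, v_closed_form)
```

Moving $a_1$ away from the optimum in either direction increases the variance, which is the pattern the tables of Section 5 display.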

If p = 0, then the variance in (3.27) reduces to the variance of the usual regression estimator. In this situation, we can estimate $a_1$ by

$$\hat{a}_1 = -2\left\{\frac{\hat{\beta}_{(n-r)}}{\hat{R}_{(n-r)}}\right\}, \tag{3.28}$$

where $\hat{\beta}_{(n-r)} = s_{xy(n-r)}/s^2_{x(n-r)}$ and $\hat{R}_{(n-r)} = \bar{y}_{(n-r)}/\bar{x}_{(n-r)}$. Thus we have the following theorem:

Theorem 3.1 -: The estimator based on the estimated optimum value $\hat{a}_1 = -2\{\hat{\beta}_{(n-r)}/\hat{R}_{(n-r)}\}$, if random non-response exists on both the study and auxiliary variables, is given by

$$t_1^{(\hat{a}_1)} = \bar{y}_{(n-r)}\exp\left[\frac{2\hat{\beta}_{(n-r)}\{\bar{X} - \bar{x}_{(n-r)}\}}{\hat{R}_{(n-r)}\{\bar{X} + \bar{x}_{(n-r)}\}}\right] \tag{3.29}$$

It can be easily shown to the first degree of approximation that

$$\mathrm{Var}\{t_1^{(\hat{a}_1)}\} = \left[\frac{1}{nq+2p} - \frac{1}{N}\right]S_y^2\,(1 - \rho^2) = \min.\mathrm{Var}\{t_1^{(a_1)}\} \tag{3.30}$$

3.3 Strategy – II:

We consider the situation in which information on the study variable y could not be obtained for r units, while information on the auxiliary variable x is available for all sample units and the population mean $\bar{X}$ of the auxiliary variable x is known. For example, in an industrial survey the enumerator may not be able to obtain data on the output of some factories, while data on the number of workers or the size of machinery may be obtained readily from the records; see Tracy and Osahan (1994).

In such circumstances, the usual ratio and product estimators are defined by

$$t_{2R} = \bar{y}_{(n-r)}\left(\frac{\bar{X}}{\bar{x}}\right) \tag{3.31}$$

$$t_{2P} = \bar{y}_{(n-r)}\left(\frac{\bar{x}}{\bar{X}}\right) \tag{3.32}$$

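To the first degree of approximation, the biases and variances of the estimators $t_{2R}$ and $t_{2P}$ are respectively given by

$$B(t_{2R}) = \frac{1}{\bar{X}}\left(\frac{1}{n} - \frac{1}{N}\right)S_x^2\,(R - \beta) \tag{3.33}$$

$$B(t_{2P}) = \left(\frac{1}{n} - \frac{1}{N}\right)\frac{S_x^2}{\bar{X}}\,\beta \tag{3.34}$$

$$\mathrm{Var}(t_{2R}) = \left(\frac{1}{n} - \frac{1}{N}\right)\left[S_y^2 + R\,S_x^2\,(R - 2\beta)\right] + \left[\frac{1}{nq+2p} - \frac{1}{n}\right]S_y^2 \tag{3.35}$$

$$\mathrm{Var}(t_{2P}) = \left(\frac{1}{n} - \frac{1}{N}\right)\left[S_y^2 + R\,S_x^2\,(R + 2\beta)\right] + \left[\frac{1}{nq+2p} - \frac{1}{n}\right]S_y^2 \tag{3.36}$$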

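We suggest the modified ratio and product estimators for the population mean $\bar{Y}$ respectively as

$$t_{2Re} = \bar{y}_{(n-r)}\exp\left(\frac{\bar{X} - \bar{x}}{\bar{X} + \bar{x}}\right) \tag{3.37}$$

$$t_{2Pe} = \bar{y}_{(n-r)}\exp\left(\frac{\bar{x} - \bar{X}}{\bar{x} + \bar{X}}\right) \tag{3.38}$$

and the generalized version of $t_{2Re}$ and $t_{2Pe}$ is given by

$$t_2^{(a_2)} = \bar{y}_{(n-r)}\exp\left\{\frac{a_2\,(\bar{x} - \bar{X})}{(\bar{x} + \bar{X})}\right\} \tag{3.39}$$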

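To the first degree of approximation, the biases and variances of the estimators $t_{2Re}$, $t_{2Pe}$ and $t_2^{(a_2)}$ are respectively given by

$$B(t_{2Re}) = \left(\frac{1}{n} - \frac{1}{N}\right)\left(\frac{S_x^2}{8\bar{X}}\right)(3R - 4\beta) \tag{3.40}$$

$$B(t_{2Pe}) = \left(\frac{1}{n} - \frac{1}{N}\right)\left(\frac{S_x^2}{8\bar{X}}\right)(4\beta - R) \tag{3.41}$$

$$B\{t_2^{(a_2)}\} = \left(\frac{1}{n} - \frac{1}{N}\right)\left(\frac{a_2 S_x^2}{8\bar{X}}\right)\{(a_2 - 2)R + 4\beta\} \tag{3.42}$$

$$\mathrm{Var}(t_{2Re}) = \left(\frac{1}{n} - \frac{1}{N}\right)\left[S_y^2 + \left(\frac{R\,S_x^2}{4}\right)(R - 4\beta)\right] + \left[\frac{1}{nq+2p} - \frac{1}{n}\right]S_y^2 \tag{3.43}$$

$$\mathrm{Var}(t_{2Pe}) = \left(\frac{1}{n} - \frac{1}{N}\right)\left[S_y^2 + \left(\frac{R\,S_x^2}{4}\right)(R + 4\beta)\right] + \left[\frac{1}{nq+2p} - \frac{1}{n}\right]S_y^2 \tag{3.44}$$

$$\mathrm{Var}\{t_2^{(a_2)}\} = \left(\frac{1}{n} - \frac{1}{N}\right)\left[S_y^2 + \left(\frac{a_2 R\,S_x^2}{4}\right)(a_2 R + 4\beta)\right] + \left[\frac{1}{nq+2p} - \frac{1}{n}\right]S_y^2 \tag{3.45}$$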

The variance of the estimator $t_2^{(a_2)}$ at (3.45) is minimized for

$$a_2 = -\frac{2\beta}{R} \tag{3.46}$$

Substitution of (3.46) in (3.45) yields the minimum variance of the estimator $t_2^{(a_2)}$ as

$$\min.\mathrm{Var}\{t_2^{(a_2)}\} = \left[\frac{1}{nq+2p} - \frac{1}{n}\right]S_y^2 + \left(\frac{1}{n} - \frac{1}{N}\right)S_y^2\,(1 - \rho^2) \tag{3.47}$$

If p = 0 then the variance in (3.47) reduces to the variance of the usual regression estimator. We can estimate $a_2$ by $\hat{a}_2 = -2\{\hat{\beta}^*_{(n-r)}/\hat{R}^*_{(n-r)}\}$, where $\hat{\beta}^*_{(n-r)} = s_{yx(n-r)}/s_x^2$ and $\hat{R}^*_{(n-r)} = \bar{y}_{(n-r)}/\bar{x}$. Putting $\hat{a}_2$ in (3.39), we get an estimator based on the estimated optimum value of $a_2$ for the population mean $\bar{Y}$ as

$$t_2^{(\hat{a}_2)} = \bar{y}_{(n-r)}\exp\left\{\frac{2\hat{\beta}^*_{(n-r)}(\bar{X} - \bar{x})}{\hat{R}^*_{(n-r)}(\bar{X} + \bar{x})}\right\}. \tag{3.48}$$

It can be shown to the first degree of approximation that

$$\mathrm{Var}\{t_2^{(\hat{a}_2)}\} = \min.\mathrm{Var}\{t_2^{(a_2)}\}.$$

3.4 Strategy – III

Here we again consider the situation in which information on the study variate y could not be obtained for r units while information on the auxiliary variable x is obtained for all the sample units. The difference is that the population mean $\bar{X}$ of the auxiliary variable x is not known. In this situation the classical ratio and product estimators of the population mean $\bar{Y}$ are respectively defined by

$$t_{3R} = \bar{y}_{(n-r)}\left\{\frac{\bar{x}}{\bar{x}_{(n-r)}}\right\} \tag{3.49}$$

$$t_{3P} = \bar{y}_{(n-r)}\left\{\frac{\bar{x}_{(n-r)}}{\bar{x}}\right\} \tag{3.50}$$

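To the first degree of approximation the biases and variances of the estimators $t_{3R}$ and $t_{3P}$ are respectively given by

$$B(t_{3R}) = \left(\frac{S_x^2}{\bar{X}}\right)\left\{\frac{1}{nq+2p} - \frac{1}{n}\right\}(R - \beta) \tag{3.51}$$

$$B(t_{3P}) = \left\{\frac{1}{nq+2p} - \frac{1}{n}\right\}\beta\,\frac{S_x^2}{\bar{X}} \tag{3.52}$$

$$\mathrm{Var}(t_{3R}) = \left\{\frac{1}{nq+2p} - \frac{1}{n}\right\}\left[S_y^2 + R\,S_x^2\,(R - 2\beta)\right] + \left(\frac{1}{n} - \frac{1}{N}\right)S_y^2 \tag{3.53}$$

$$\mathrm{Var}(t_{3P}) = \left\{\frac{1}{nq+2p} - \frac{1}{n}\right\}\left[S_y^2 + R\,S_x^2\,(R + 2\beta)\right] + \left(\frac{1}{n} - \frac{1}{N}\right)S_y^2 \tag{3.54}$$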

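We further define the modified ratio and product estimators for the population mean $\bar{Y}$ as

$$t_{3Re} = \bar{y}_{(n-r)}\exp\left(\frac{\bar{x} - \bar{x}_{(n-r)}}{\bar{x} + \bar{x}_{(n-r)}}\right) \tag{3.55}$$

$$t_{3Pe} = \bar{y}_{(n-r)}\exp\left(\frac{\bar{x}_{(n-r)} - \bar{x}}{\bar{x}_{(n-r)} + \bar{x}}\right) \tag{3.56}$$

and the generalized version of $t_{3Re}$ and $t_{3Pe}$ is given by

$$t_3^{(a_3)} = \bar{y}_{(n-r)}\exp\left[\frac{a_3\,(\bar{x}_{(n-r)} - \bar{x})}{(\bar{x}_{(n-r)} + \bar{x})}\right], \tag{3.57}$$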

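where $a_3$ is a constant to be chosen suitably. To the first degree of approximation, the biases and variances of the estimators $t_{3Re}$, $t_{3Pe}$ and $t_3^{(a_3)}$ are respectively given by

$$B(t_{3Re}) = \left\{\frac{1}{nq+2p} - \frac{1}{n}\right\}\left(\frac{S_x^2}{8\bar{X}}\right)(3R - 4\beta) \tag{3.58}$$

$$B(t_{3Pe}) = \left\{\frac{1}{nq+2p} - \frac{1}{n}\right\}\left(\frac{S_x^2}{8\bar{X}}\right)(4\beta - R) \tag{3.59}$$

$$B\{t_3^{(a_3)}\} = \left\{\frac{1}{nq+2p} - \frac{1}{n}\right\}\left(\frac{a_3 S_x^2}{8\bar{X}}\right)\{R(a_3 - 2) + 4\beta\} \tag{3.60}$$

$$\mathrm{Var}(t_{3Re}) = \left\{\frac{1}{nq+2p} - \frac{1}{n}\right\}\left[S_y^2 + \left(\frac{R\,S_x^2}{4}\right)(R - 4\beta)\right] + \left(\frac{1}{n} - \frac{1}{N}\right)S_y^2 \tag{3.61}$$

$$\mathrm{Var}(t_{3Pe}) = \left\{\frac{1}{nq+2p} - \frac{1}{n}\right\}\left[S_y^2 + \left(\frac{R\,S_x^2}{4}\right)(R + 4\beta)\right] + \left(\frac{1}{n} - \frac{1}{N}\right)S_y^2 \tag{3.62}$$

$$\mathrm{Var}\{t_3^{(a_3)}\} = \left\{\frac{1}{nq+2p} - \frac{1}{n}\right\}\left[S_y^2 + \left(\frac{a_3 R\,S_x^2}{4}\right)(a_3 R + 4\beta)\right] + \left(\frac{1}{n} - \frac{1}{N}\right)S_y^2 \tag{3.63}$$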

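The variance of $t_3^{(a_3)}$ is minimized for

$$a_3 = -\frac{2\beta}{R} \tag{3.64}$$

Thus the resulting minimum variance of $t_3^{(a_3)}$ is given by

$$\min.\mathrm{Var}\{t_3^{(a_3)}\} = \left\{\frac{1}{nq+2p} - \frac{1}{n}\right\}S_y^2\,(1 - \rho^2) + \left(\frac{1}{n} - \frac{1}{N}\right)S_y^2 \tag{3.65}$$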

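The estimate of $a_3$ based on the sample data at hand is given by

$$\hat{a}_3 = \hat{a}_2 = -2\left\{\frac{\hat{\beta}^*_{(n-r)}}{\hat{R}^*_{(n-r)}}\right\} \tag{3.66}$$

Thus the resulting estimator based on the estimated optimum value $\hat{a}_3$ is given by

$$t_3^{(\hat{a}_3)} = \bar{y}_{(n-r)}\exp\left[\frac{2\hat{\beta}^*_{(n-r)}\{\bar{x} - \bar{x}_{(n-r)}\}}{\hat{R}^*_{(n-r)}\{\bar{x} + \bar{x}_{(n-r)}\}}\right] \tag{3.67}$$

It can be shown to the first degree of approximation that

$$\mathrm{Var}\{t_3^{(\hat{a}_3)}\} = \min.\mathrm{Var}\{t_3^{(a_3)}\} = \left\{\frac{1}{nq+2p} - \frac{1}{n}\right\}S_y^2\,(1 - \rho^2) + \left(\frac{1}{n} - \frac{1}{N}\right)S_y^2 \tag{3.68}$$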

3.5 Efficiency Comparison

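From (3.5){(3.35),(3.53)}, (3.6){(3.36),(3.54)}, (3.22){(3.43),(3.61)}, (3.23){(3.44),(3.62)}, (3.24){(3.45),(3.63)} and (3.25) we have

(i) $\mathrm{Var}(t_{jRe}) < \mathrm{Var}\{\bar{y}_{(n-r)}\}$, (j = 1, 2, 3) if

$$\frac{\beta}{R} > \frac{1}{4} \tag{3.69}$$

(ii) $\mathrm{Var}(t_{jRe}) < \mathrm{Var}(t_{jR})$, (j = 1, 2, 3) if

$$\frac{\beta}{R} < \frac{3}{4} \tag{3.70}$$

(iii) $\mathrm{Var}(t_{jPe}) < \mathrm{Var}\{\bar{y}_{(n-r)}\}$, (j = 1, 2, 3) if

$$\frac{\beta}{R} < -\frac{1}{4} \tag{3.71}$$

(iv) $\mathrm{Var}(t_{jPe}) < \mathrm{Var}(t_{jP})$, (j = 1, 2, 3) if

$$\frac{\beta}{R} > -\frac{3}{4} \tag{3.72}$$

(v) $\mathrm{Var}\{t_j^{(a_j)}\} < \mathrm{Var}\{\bar{y}_{(n-r)}\}$, (j = 1, 2, 3) if

$$\min.\left\{0,\ -\frac{4\beta}{R}\right\} < a_j < \max.\left\{0,\ -\frac{4\beta}{R}\right\} \tag{3.73}$$

(vi) $\mathrm{Var}\{t_j^{(a_j)}\} < \mathrm{Var}(t_{jR})$, (j = 1, 2, 3) if

$$\min.\left\{-2,\ 2\left(1 - \frac{2\beta}{R}\right)\right\} < a_j < \max.\left\{-2,\ 2\left(1 - \frac{2\beta}{R}\right)\right\} \tag{3.74}$$

(vii) $\mathrm{Var}\{t_j^{(a_j)}\} < \mathrm{Var}(t_{jRe})$, (j = 1, 2, 3) if

$$\min.\left\{-1,\ \left(1 - \frac{4\beta}{R}\right)\right\} < a_j < \max.\left\{-1,\ \left(1 - \frac{4\beta}{R}\right)\right\} \tag{3.75}$$

(viii) $\mathrm{Var}\{t_j^{(a_j)}\} < \mathrm{Var}(t_{jP})$, (j = 1, 2, 3) if

$$\min.\left\{2,\ -2\left(1 + \frac{2\beta}{R}\right)\right\} < a_j < \max.\left\{2,\ -2\left(1 + \frac{2\beta}{R}\right)\right\} \tag{3.76}$$

(ix) $\mathrm{Var}\{t_j^{(a_j)}\} < \mathrm{Var}(t_{jPe})$, (j = 1, 2, 3) if

$$\min.\left\{1,\ -\left(1 + \frac{4\beta}{R}\right)\right\} < a_j < \max.\left\{1,\ -\left(1 + \frac{4\beta}{R}\right)\right\} \tag{3.77}$$

Combining (3.69) and (3.70) we find that the proposed estimator $t_{jRe}$ (j = 1, 2, 3) is better than the usual unbiased estimator $\bar{y}_{(n-r)}$ and the ratio estimator $t_{jR}$ (j = 1, 2, 3) if

$$\frac{1}{4} < \frac{\beta}{R} < \frac{3}{4} \tag{3.78}$$

Further, combining (3.71) and (3.72) we obtain that the proposed estimator $t_{jPe}$ (j = 1, 2, 3) is more efficient than the usual unbiased estimator $\bar{y}_{(n-r)}$ and the product estimator $t_{jP}$ (j = 1, 2, 3) if

$$-\frac{3}{4} < \frac{\beta}{R} < -\frac{1}{4} \tag{3.79}$$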
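The ranges in (3.73) to (3.77) depend on the data only through the ratio k = β/R, so they are simple to tabulate. The following sketch is illustrative (the function and key names are hypothetical); it evaluates the ranges at k = 0.651, the Strategy I estimate reported in Section 5, and intersects the three ratio-side ranges.

```python
def efficiency_ranges(k):
    """Open intervals of a_j from (3.73)-(3.77), with k = beta / R."""
    def interval(u, v):
        return (min(u, v), max(u, v))
    return {
        "unbiased": interval(0.0, -4.0 * k),                   # (3.73)
        "ratio": interval(-2.0, 2.0 * (1.0 - 2.0 * k)),        # (3.74)
        "exp_ratio": interval(-1.0, 1.0 - 4.0 * k),            # (3.75)
        "product": interval(2.0, -2.0 * (1.0 + 2.0 * k)),      # (3.76)
        "exp_product": interval(1.0, -(1.0 + 4.0 * k)),        # (3.77)
    }

k_hat = 0.651  # estimate of beta / R for Strategy I, from Section 5
ranges = efficiency_ranges(k_hat)

# Values of a_j that beat the unbiased, ratio and exponential-ratio estimators at once
ratio_side = [ranges[name] for name in ("unbiased", "ratio", "exp_ratio")]
common = (max(lo for lo, _ in ratio_side), min(hi for _, hi in ratio_side))

print(ranges)
print(common)
```

At k = 0.651 the common interval is approximately (-1.604, -1.000), the range quoted for $a_1$ in the discussion of the empirical results.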

4. Estimation from the Sample of the Percent Relative Efficiency

In this section we have obtained the percent relative efficiencies (PRE) of the proposed estimators and their estimates from the sample in three different situations, see Sukhatme et al (1970, pp. 231-232).

4.1 Strategy -I:

Percent relative efficiencies (PREs) of $t_{1R}$, $t_{1Re}$ and $t_1^{(\hat{a}_1)}$ with respect to $\bar{y}_{(n-r)}$ are respectively given by

$$\mathrm{PRE}(t_{1R}, \bar{y}_{(n-r)}) = [1 + A(1 - 2k)]^{-1} \times 100 \tag{4.1}$$

$$\mathrm{PRE}(t_{1Re}, \bar{y}_{(n-r)}) = [1 + (A/4)(1 - 4k)]^{-1} \times 100 \tag{4.2}$$

$$\mathrm{PRE}(t_1^{(\hat{a}_1)}, \bar{y}_{(n-r)}) = (1 - \rho^2)^{-1} \times 100, \tag{4.3}$$

where $A = (R^2 S_x^2/S_y^2)$ and $k = (\beta/R)$.

Further, the percent relative efficiencies of $t_1^{(a_1)}$ with respect to $\bar{y}_{(n-r)}$, $t_{1R}$ and $t_{1Re}$ are respectively given by

$$\mathrm{PRE}(t_1^{(a_1)}, \bar{y}_{(n-r)}) = [1 + a_1(A/4)(a_1 + 4k)]^{-1} \times 100 \tag{4.4}$$

$$\mathrm{PRE}(t_1^{(a_1)}, t_{1R}) = \frac{[1 + A(1 - 2k)]}{[1 + a_1(A/4)(a_1 + 4k)]} \times 100 \tag{4.5}$$

$$\mathrm{PRE}(t_1^{(a_1)}, t_{1Re}) = \frac{[1 + (A/4)(1 - 4k)]}{[1 + a_1(A/4)(a_1 + 4k)]} \times 100 \tag{4.6}$$

It is observed from (4.1) to (4.6) that the percent relative efficiencies are functions of unknown population parameters such as $R$, $S_x^2$, $S_y^2$, $\beta$ and $\rho$. In practice, population parameters are not known exactly and hence these PRE formulae cannot be used directly. To overcome this difficulty, it is advisable to estimate the PREs by replacing the population parameters involved with their consistent estimates based on the sample. Thus the estimates of these PREs are respectively given by

$$\widehat{\mathrm{PRE}}(t_{1R}, \bar{y}_{(n-r)}) = [1 + \hat{A}(1 - 2\hat{k})]^{-1} \times 100 \tag{4.7}$$

$$\widehat{\mathrm{PRE}}(t_{1Re}, \bar{y}_{(n-r)}) = [1 + (\hat{A}/4)(1 - 4\hat{k})]^{-1} \times 100 \tag{4.8}$$

$$\widehat{\mathrm{PRE}}(t_1^{(\hat{a}_1)}, \bar{y}_{(n-r)}) = (1 - \hat{\rho}^2_{(n-r)})^{-1} \times 100 \tag{4.9}$$

$$\widehat{\mathrm{PRE}}(t_1^{(a_1)}, \bar{y}_{(n-r)}) = [1 + a_1(\hat{A}/4)(a_1 + 4\hat{k})]^{-1} \times 100 \tag{4.10}$$

$$\widehat{\mathrm{PRE}}(t_1^{(a_1)}, t_{1R}) = \frac{[1 + \hat{A}(1 - 2\hat{k})]}{[1 + a_1(\hat{A}/4)(a_1 + 4\hat{k})]} \times 100 \tag{4.11}$$

$$\widehat{\mathrm{PRE}}(t_1^{(a_1)}, t_{1Re}) = \frac{[1 + (\hat{A}/4)(1 - 4\hat{k})]}{[1 + a_1(\hat{A}/4)(a_1 + 4\hat{k})]} \times 100, \tag{4.12}$$

where $\hat{A} = \hat{R}^2_{(n-r)}\,s^2_{x(n-r)}/s^2_{y(n-r)}$, $\hat{k} = \hat{\beta}_{(n-r)}/\hat{R}_{(n-r)}$, $\hat{R}_{(n-r)} = \bar{y}_{(n-r)}/\bar{x}_{(n-r)}$, $\hat{\beta}_{(n-r)} = s_{xy(n-r)}/s^2_{x(n-r)}$ and $\hat{\rho}_{(n-r)} = s_{xy(n-r)}/\{s_{x(n-r)}\,s_{y(n-r)}\}$.
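The estimated PREs for Strategy I reduce to arithmetic on two numbers, the estimates of A and k. The sketch below evaluates them with the values reported in Section 5 (the estimate of A equal to 1.684565697 and the estimate of k rounded to 0.6510); the function names are hypothetical. With these inputs the entries of Table 5.2 at $a_1$ = -2, -1.75 and -1 are reproduced to two decimals.

```python
def pre_vs_unbiased(a, A, k):
    """Estimated PRE of the generalized estimator against the respondent mean, (4.10)."""
    return 100.0 / (1.0 + a * (A / 4.0) * (a + 4.0 * k))

def pre_vs_ratio(a, A, k):
    """Estimated PRE of the generalized estimator against the ratio estimator, (4.11)."""
    return 100.0 * (1.0 + A * (1.0 - 2.0 * k)) / (1.0 + a * (A / 4.0) * (a + 4.0 * k))

A_hat = 1.684565697  # estimate of A reported in Section 5
k_hat = 0.6510       # estimate of k reported in Section 5 (rounded)

pre_ratio = pre_vs_unbiased(-2.0, A_hat, k_hat)      # a1 = -2 matches (4.7)
pre_exp_ratio = pre_vs_unbiased(-1.0, A_hat, k_hat)  # a1 = -1 matches (4.8)
pre_at_opt = pre_vs_unbiased(-2.0 * k_hat, A_hat, k_hat)

for a in (-2.0, -1.75, -1.0, 0.0):
    print(a, round(pre_vs_unbiased(a, A_hat, k_hat), 2))
```

Setting $a_1$ = -2 in (4.10) gives the same expression as (4.7), and $a_1$ = -1 gives (4.8), which is why a single function covers the ratio and exponential-ratio cases.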

4.2 Strategy-II:

Percent relative efficiencies of $t_{2R}$, $t_{2Re}$ and $t_2^{(\hat{a}_2)}$ with respect to $\bar{y}_{(n-r)}$ are respectively given by

$$\mathrm{PRE}(t_{2R}, \bar{y}_{(n-r)}) = [1 + FA(1 - 2k)]^{-1} \times 100 \tag{4.13}$$

$$\mathrm{PRE}(t_{2Re}, \bar{y}_{(n-r)}) = [1 + F(A/4)(1 - 4k)]^{-1} \times 100 \tag{4.14}$$

$$\mathrm{PRE}(t_2^{(\hat{a}_2)}, \bar{y}_{(n-r)}) = (1 - F\rho^2)^{-1} \times 100, \tag{4.15}$$

where $\nu = (nq + 2p)$ and $F = \{\nu(N - n)\}/\{n(N - \nu)\}$.

Further, the percent relative efficiencies of $t_2^{(a_2)}$ with respect to $\bar{y}_{(n-r)}$, $t_{2R}$ and $t_{2Re}$ are respectively given by

$$\mathrm{PRE}(t_2^{(a_2)}, \bar{y}_{(n-r)}) = [1 + F(A/4)\,a_2(a_2 + 4k)]^{-1} \times 100 \tag{4.16}$$

$$\mathrm{PRE}(t_2^{(a_2)}, t_{2R}) = \frac{[1 + FA(1 - 2k)]}{[1 + F(A/4)\,a_2(a_2 + 4k)]} \times 100 \tag{4.17}$$

$$\mathrm{PRE}(t_2^{(a_2)}, t_{2Re}) = \frac{[1 + F(A/4)(1 - 4k)]}{[1 + F(A/4)\,a_2(a_2 + 4k)]} \times 100 \tag{4.18}$$

The estimates of the PREs given at (4.13)-(4.18) based on the sample are respectively given by

$$\widehat{\mathrm{PRE}}(t_{2R}, \bar{y}_{(n-r)}) = [1 + \hat{F}\hat{A}^*(1 - 2\hat{k}^*)]^{-1} \times 100 \tag{4.19}$$

$$\widehat{\mathrm{PRE}}(t_{2Re}, \bar{y}_{(n-r)}) = [1 + \hat{F}(\hat{A}^*/4)(1 - 4\hat{k}^*)]^{-1} \times 100 \tag{4.20}$$

$$\widehat{\mathrm{PRE}}(t_2^{(\hat{a}_2)}, \bar{y}_{(n-r)}) = (1 - \hat{F}\hat{\rho}^{*2}_{(n-r)})^{-1} \times 100 \tag{4.21}$$

$$\widehat{\mathrm{PRE}}(t_2^{(a_2)}, \bar{y}_{(n-r)}) = [1 + \hat{F}(\hat{A}^*/4)\,a_2(a_2 + 4\hat{k}^*)]^{-1} \times 100 \tag{4.22}$$

$$\widehat{\mathrm{PRE}}(t_2^{(a_2)}, t_{2R}) = \frac{[1 + \hat{F}\hat{A}^*(1 - 2\hat{k}^*)]}{[1 + \hat{F}(\hat{A}^*/4)\,a_2(a_2 + 4\hat{k}^*)]} \times 100 \tag{4.23}$$

$$\widehat{\mathrm{PRE}}(t_2^{(a_2)}, t_{2Re}) = \frac{[1 + \hat{F}(\hat{A}^*/4)(1 - 4\hat{k}^*)]}{[1 + \hat{F}(\hat{A}^*/4)\,a_2(a_2 + 4\hat{k}^*)]} \times 100, \tag{4.24}$$

where $\hat{F} = \{\hat{\nu}(N - n)\}/\{n(N - \hat{\nu})\}$, $\hat{\nu} = (n\hat{q} + 2\hat{p})$, $\hat{A}^* = \hat{R}^{*2}_{(n-r)}\,s^2_{x(n-r)}/s^2_{y(n-r)}$, $\hat{k}^* = \hat{\beta}^*_{(n-r)}/\hat{R}^*_{(n-r)}$, $\hat{\beta}^*_{(n-r)} = s_{yx(n-r)}/s_x^2$, $\hat{R}^*_{(n-r)} = \bar{y}_{(n-r)}/\bar{x}$ and $\hat{\rho}^*_{(n-r)} = s_{yx(n-r)}/\{s_x\,s_{y(n-r)}\}$.

However, one may also take the estimate of the population correlation coefficient $\rho$ as $\hat{\rho}_{(n-r)} = s_{xy(n-r)}/\{s_{x(n-r)}\,s_{y(n-r)}\}$ in place of $\hat{\rho}^*_{(n-r)}$; see Singh et al (2000).

4.3 Strategy-III:

Percent relative efficiencies of $t_{3R}$, $t_{3Re}$ and $t_3^{(\hat{a}_3)}$ with respect to $\bar{y}_{(n-r)}$ are respectively given by

$$\mathrm{PRE}(t_{3R}, \bar{y}_{(n-r)}) = [1 + F^*A(1 - 2k)]^{-1} \times 100 \tag{4.25}$$

$$\mathrm{PRE}(t_{3Re}, \bar{y}_{(n-r)}) = [1 + F^*(A/4)(1 - 4k)]^{-1} \times 100 \tag{4.26}$$

$$\mathrm{PRE}(t_3^{(\hat{a}_3)}, \bar{y}_{(n-r)}) = (1 - F^*\rho^2)^{-1} \times 100, \tag{4.27}$$

where $F^* = \{N(n - \nu)\}/\{n(N - \nu)\}$.
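The two weights that appear in the Strategy II and Strategy III PREs are the shares of the variance multiplier 1/(nq+2p) - 1/N attributable to sampling (1/n - 1/N) and to non-response (1/(nq+2p) - 1/n), so they sum to one. The sketch below checks this. It is illustrative only: the function names are hypothetical, the symbol `nu` stands for nq + 2p, and the inputs n = 20, N = 50, p = 0.35 are taken as round values in line with the empirical study.

```python
from math import isclose

def nu(n, p):
    """nu = nq + 2p, the effective-size term from (2.1)."""
    return n * (1.0 - p) + 2.0 * p

def weight_F(n, N, p):
    """F = nu (N - n) / { n (N - nu) }: share due to sampling (Strategy II PREs)."""
    v = nu(n, p)
    return v * (N - n) / (n * (N - v))

def weight_F_star(n, N, p):
    """F* = N (n - nu) / { n (N - nu) }: share due to non-response (Strategy III PREs)."""
    v = nu(n, p)
    return N * (n - v) / (n * (N - v))

n, N, p = 20, 50, 0.35  # round values in line with Section 5
F = weight_F(n, N, p)
F_star = weight_F_star(n, N, p)
print(nu(n, p), F, F_star)
```

With no non-response (p = 0) the first weight equals one and the second vanishes, so the Strategy II PREs reduce to the familiar full-response forms.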

Further, the percent relative efficiencies of $t_3^{(a_3)}$ with respect to $\bar{y}_{(n-r)}$, $t_{3R}$ and $t_{3Re}$ are respectively given by

$$\mathrm{PRE}(t_3^{(a_3)}, \bar{y}_{(n-r)}) = [1 + F^*(A/4)\,a_3(a_3 + 4k)]^{-1} \times 100 \tag{4.28}$$

$$\mathrm{PRE}(t_3^{(a_3)}, t_{3R}) = \frac{[1 + F^*A(1 - 2k)]}{[1 + F^*(A/4)\,a_3(a_3 + 4k)]} \times 100 \tag{4.29}$$

$$\mathrm{PRE}(t_3^{(a_3)}, t_{3Re}) = \frac{[1 + F^*(A/4)(1 - 4k)]}{[1 + F^*(A/4)\,a_3(a_3 + 4k)]} \times 100 \tag{4.30}$$

The estimates of the PREs given at (4.25)-(4.30) based on the sample are respectively given by

$$\widehat{\mathrm{PRE}}(t_{3R}, \bar{y}_{(n-r)}) = [1 + \hat{F}^*\hat{A}^*(1 - 2\hat{k}^*)]^{-1} \times 100 \tag{4.31}$$

$$\widehat{\mathrm{PRE}}(t_{3Re}, \bar{y}_{(n-r)}) = [1 + \hat{F}^*(\hat{A}^*/4)(1 - 4\hat{k}^*)]^{-1} \times 100 \tag{4.32}$$

$$\widehat{\mathrm{PRE}}(t_3^{(\hat{a}_3)}, \bar{y}_{(n-r)}) = (1 - \hat{F}^*\hat{\rho}^{*2}_{(n-r)})^{-1} \times 100 \tag{4.33}$$

$$\widehat{\mathrm{PRE}}(t_3^{(a_3)}, \bar{y}_{(n-r)}) = [1 + \hat{F}^*(\hat{A}^*/4)\,a_3(a_3 + 4\hat{k}^*)]^{-1} \times 100 \tag{4.34}$$

$$\widehat{\mathrm{PRE}}(t_3^{(a_3)}, t_{3R}) = \frac{[1 + \hat{F}^*\hat{A}^*(1 - 2\hat{k}^*)]}{[1 + \hat{F}^*(\hat{A}^*/4)\,a_3(a_3 + 4\hat{k}^*)]} \times 100 \tag{4.35}$$

$$\widehat{\mathrm{PRE}}(t_3^{(a_3)}, t_{3Re}) = \frac{[1 + \hat{F}^*(\hat{A}^*/4)(1 - 4\hat{k}^*)]}{[1 + \hat{F}^*(\hat{A}^*/4)\,a_3(a_3 + 4\hat{k}^*)]} \times 100, \tag{4.36}$$

where $\hat{F}^* = \{N(n - \hat{\nu})\}/\{n(N - \hat{\nu})\}$.

Remark 4.1 - In similar fashion, the PREs of the different product-type estimators {i.e. $\mathrm{PRE}(t_{jP}, \bar{y}_{(n-r)})$, $\mathrm{PRE}(t_{jPe}, \bar{y}_{(n-r)})$, $\mathrm{PRE}(t_j^{(\hat{a}_j)}, \bar{y}_{(n-r)})$, $\mathrm{PRE}(t_j^{(a_j)}, t_{jP})$, $\mathrm{PRE}(t_j^{(a_j)}, t_{jPe})$, (j = 1,2,3)} and their estimates {i.e. $\widehat{\mathrm{PRE}}(t_{jP}, \bar{y}_{(n-r)})$, $\widehat{\mathrm{PRE}}(t_{jPe}, \bar{y}_{(n-r)})$, $\widehat{\mathrm{PRE}}(t_j^{(\hat{a}_j)}, \bar{y}_{(n-r)})$, $\widehat{\mathrm{PRE}}(t_j^{(a_j)}, t_{jP})$ and $\widehat{\mathrm{PRE}}(t_j^{(a_j)}, t_{jPe})$} can be obtained without much difficulty.

5. Empirical Study

To illustrate the performance of different estimators of the population mean $\bar{Y}$ we consider the data given in Singh (2003, p. 990).

The descriptions of the variables are given below:

$y_i$ = Amount (in thousands of dollars) of real estate farm loans in different states during 1997.

$x_i$ = Amount (in thousands of dollars) of nonreal estate farm loans in different states during 1997.

The estimates of different parameters are given below:

$\bar{X}$ = 878.16; N = 50, n = 20, r = 6, $\bar{x}$ = 942.8615, $\bar{x}_{(n-r)}$ = 1068.292, $\bar{y}_{(n-r)}$ = 580.177,

$s_x^2$ = 1307911.82, $s^2_{x(n-r)}$ = 1685838.731, $s^2_{y(n-r)}$ = 295169.6416,

$s_x$ = 1143.639725, $s_{x(n-r)}$ = 1298.398525, $s_{y(n-r)}$ = 543.2951699,

$\hat{\rho}_{(n-r)}$ = 0.84487, $\hat{\rho}^*_{(n-r)}$ = 0.9392, $\{1 - \hat{\rho}^2_{(n-r)}\}$ = 0.2862,

$\hat{p}$ = 0.34993, $\hat{q} = (1 - \hat{p})$ = 0.65, $\hat{\nu} = (n\hat{q} + 2\hat{p})$ = 13.7,

$\hat{\beta}_{(n-r)} = s_{xy(n-r)}/s^2_{x(n-r)}$ = 0.35353, $\hat{\beta}^*_{(n-r)} = s_{xy(n-r)}/s_x^2$ = 0.45568,

$\hat{R}_{(n-r)}$ = 0.54309, $\hat{R}^*_{(n-r)}$ = 0.61534,

$\hat{\beta}_{(n-r)}/\hat{R}_{(n-r)}$ = 0.6510, $\hat{\beta}^*_{(n-r)}/\hat{R}^*_{(n-r)}$ = 0.7405, $4\hat{\beta}_{(n-r)}/\hat{R}_{(n-r)}$ = 2.6040,

$\hat{R}^2_{(n-r)}\,s^2_{x(n-r)}/s^2_{y(n-r)}$ = 1.684565697,

$\hat{A} = \hat{R}^2_{(n-r)}\,s^2_{x(n-r)}/s^2_{y(n-r)}$ = 1.6846, $\hat{A}^* = \hat{R}^{*2}_{(n-r)}\,s^2_{x(n-r)}/s^2_{y(n-r)}$ = 3.5145,

$\hat{F} = \hat{\nu}(N - n)/\{n(N - \hat{\nu})\}$ = 745.965, $\hat{F}^* = N(n - \hat{\nu})/\{n(N - \hat{\nu})\}$ = 571.725,

$\hat{k} = \hat{\beta}_{(n-r)}/\hat{R}_{(n-r)}$ = 0.651, $\hat{k}^* = \hat{\beta}^*_{(n-r)}/\hat{R}^*_{(n-r)}$.

We have computed the estimates of the PREs of the estimators:

(a) $t_{jR}$, $t_{jRe}$ and $t_j^{(\hat{a}_j)}$, (j = 1,2,3) with respect to the usual unbiased estimator $\bar{y}_{(n-r)}$ using the formulae {(4.7), (4.8), (4.9)}, {(4.13), (4.14), (4.15)} and {(4.19), (4.20), (4.21)}; the findings are shown in Tables 5.1, 5.6 and 5.11 respectively;

(b) $t_j^{(a_j)}$ with respect to the usual unbiased estimator $\bar{y}_{(n-r)}$, $t_{jR}$ and $t_{jRe}$, (j = 1,2,3) using the formulae {(4.10), (4.11), (4.12)}, {(4.16), (4.17), (4.18)} and {(4.22), (4.23), (4.24)}; the results are shown in Tables {5.2, 5.3, 5.4}, {5.7, 5.8, 5.9} and {5.12, 5.13, 5.14} respectively.

We have also computed the estimate of the range of the scalar $a_j$ in which the proposed estimator $t_j^{(a_j)}$ is better than $\bar{y}_{(n-r)}$, $t_{jR}$ and $t_{jRe}$, (j = 1,2,3); the findings are compiled in Tables 5.5, 5.10 and 5.15 respectively.

Strategy-I:

Table 5.1: The estimates of PREs of $\bar{y}_{(n-r)}$, $t_{1R}$, $t_{1Re}$ and $t_1^{(\hat{a}_1)}$

Estimator      $\bar{y}_{(n-r)}$      $t_{1R}$      $t_{1Re}$      $t_1^{(\hat{a}_1)}$

Table 5.2: The estimates of PRE of $t_1^{(a_1)}$ with respect to $\bar{y}_{(n-r)}$ for different values of $a_1$

$a_1$                PRE{$t_1^{(a_1)}$, $\bar{y}_{(n-r)}$}
-2.604               100.00
-2.50                112.30
-2.25                150.48
-2.00                203.56
-1.75                269.83
-1.50                330.48
-1.302 (opt)         349.55
-1.25                348.16
-1.00                308.18
-0.75                241.31
-0.50                179.55
-0.25                132.95
 0.00                100.00

Table 5.3: The estimates of PRE of $t_1^{(a_1)}$ with respect to the ratio estimator $t_{1R}$ for different values of $a_1$

$a_1$                PRE{$t_1^{(a_1)}$, $t_{1R}$}
-2.00                100.00
-1.75                132.56
-1.50                162.35
-1.302 (opt)         171.72
-1.25                171.04
-1.00                151.39
-0.75                118.55
-0.6020              100.00

Table 5.4: The estimates of PRE of $t_1^{(a_1)}$ with respect to the modified ratio type estimator $t_{1Re}$ for different values of $a_1$

$a_1$                PRE{$t_1^{(a_1)}$, $t_{1Re}$}
-1.604               100.00
-1.50                107.25
-1.3020 (opt)        113.43
-1.25                112.98
-1.00                100.00

Table 5.5: The estimates of range of $a_1$ for $t_1^{(a_1)}$ to be more efficient than different estimators of population mean

Estimator                      $\bar{y}_{(n-r)}$      $t_{1R}$      $t_{1Re}$
Estimate of range of $a_1$     (-2.6040 ,

Strategy – II:

Table 5.6: The estimates of PRE of $\bar{y}_{(n-r)}$, $t_{2R}$, $t_{2Re}$ and $t_2^{(\hat{a}_2)}$ with respect to $\bar{y}_{(n-r)}$

Estimator                        $\bar{y}_{(n-r)}$    $t_{2R}$    $t_{2Re}$    $t_2^{(\hat{a}_2)}$
PRE{. , $\bar{y}_{(n-r)}$}       100.00               184.11      187.22       208.69

Table 5.7: The estimates of PRE of $t_2^{(a_2)}$ with respect to $\bar{y}_{(n-r)}$ for different values of $a_2$

$a_2$                PRE{$t_2^{(a_2)}$, $\bar{y}_{(n-r)}$}
-2.962               100.00
-2.50                137.79
-2.25                161.39
-2.00                184.11
-1.75                201.46
-1.50                208.65
-1.4810 (opt)        208.69
-1.25                203.31
-1.00                187.22
-0.75                165.00
-0.50                141.30
-0.25                119.19
 0.00                100.00

Table 5.8: The estimates of PRE of $t_2^{(a_2)}$ with respect to the ratio estimator $t_{2R}$ for different values of $a_2$

$a_2$                PRE{$t_2^{(a_2)}$, $t_{2R}$}
-2.00                100.00
-1.75                109.43
-1.50                113.34
-1.4810 (opt)        113.36
-1.25                110.44
-1.00                101.70
-0.962               100.00

Table 5.9: The estimates of PRE of $t_2^{(a_2)}$ with respect to the modified ratio estimator $t_{2Re}$

$a_2$: -1.962, -1.50, -1.4810 (opt), -1.25, -1.00

Table 5.10: The estimates of range of $a_2$ for the proposed estimator $t_2^{(a_2)}$ to be more efficient than different estimators of the population mean

Estimator                          $\bar{y}_{(n-r)}$     $t_{2R}$             $t_{2Re}$
The estimates of range of $a_2$    (-2.962, 0.00)        (-2.000, -0.962)     (-1.962, -1.000)

Strategy – III:

Table 5.11: The estimates of PRE of different estimators $\bar{y}_{(n-r)}$, $t_{3R}$, $t_{3Re}$ and $t_3^{(\hat{a}_3)}$ with respect to the usual unbiased estimator $\bar{y}_{(n-r)}$

Estimator                        $\bar{y}_{(n-r)}$    $t_{3R}$    $t_{3Re}$    $t_3^{(\hat{a}_3)}$
PRE{(.) , $\bar{y}_{(n-r)}$}     100.00               153.88      155.54       166.44

Table 5.12: The estimates of PRE of $t_3^{(a_3)}$ with respect to the usual unbiased estimator $\bar{y}_{(n-r)}$ for different values of $a_3$

$a_3$                PRE{$t_3^{(a_3)}$, $\bar{y}_{(n-r)}$}
-2.962               100.00
-2.50                126.62
-2.25                141.16
-2.00                153.89
-1.75                162.87
-1.50                166.42
-1.4810 (opt)        166.44
-1.25                163.80
-1.00                155.54
-0.75                143.25
-0.50                128.87
-0.25                114.08
 0.00                100.00

Table 5.13: The estimates of PRE of $t_3^{(a_3)}$ with respect to the usual ratio estimator $t_{3R}$ for different values of $a_3$

$a_3$                PRE{$t_3^{(a_3)}$, $t_{3R}$}
-2.00                100.00
-1.75                105.84
-1.50                108.15
-1.4810 (opt)        108.16
-1.25                106.45
-1.00
-0.962

Table 5.14: The estimates of PRE of $t_3^{(a_3)}$ with respect to the modified ratio estimator $t_{3Re}$ for different values of $a_3$

$a_3$                PRE{$t_3^{(a_3)}$, $t_{3Re}$}
-1.962               100.00
-1.50                107.00
-1.4810 (opt)        107.01
-1.25                105.31
-1.00                100.00

Table 5.15: The estimates of range of $a_3$ for the proposed estimator $t_3^{(a_3)}$ to be more efficient than the different estimators of the population mean $\bar{Y}$

Estimator                          $\bar{y}_{(n-r)}$     $t_{3R}$             $t_{3Re}$
The estimates of range of $a_3$    (-2.962, 0.000)       (-2.000, -0.962)     (-1.962, -1.000)

It is observed from Tables (5.2, 5.4, 5.5), (5.7, 5.9, 5.10) and (5.12, 5.14, 5.15) that there is ample scope for selecting the scalar $a_j$, (j = 1,2,3), to obtain estimators better than the usual unbiased estimator $\bar{y}_{(n-r)}$, the usual ratio estimator $t_{jR}$ and the modified ratio type estimator $t_{jRe}$, (j = 1,2,3). Larger estimated gains in efficiency from using the proposed classes of estimators over $\bar{y}_{(n-r)}$, $t_{jR}$ and $t_{jRe}$ are observed in the neighborhood of the estimated optimum value $\hat{a}_{j(opt)}$ of the scalar $a_j$, (j = 1,2,3), and the largest estimated gain is seen at $\hat{a}_{j(opt)}$ itself. The estimated gain in efficiency from using the estimator $t_j^{(a_j)}$, (j = 1,2,3), over the usual unbiased estimator $\bar{y}_{(n-r)}$ (which does not utilize the auxiliary information) is larger, and holds over a wider range of the scalar $a_j$, than the gain over the usual ratio estimator $t_{jR}$ and $t_{jRe}$, (j = 1,2,3). Tables 5.5, 5.10 and 5.15 show that when $a_1 \in (-1.604, -1.000)$, $a_2 \in (-1.962, -1.000)$ and $a_3 \in (-1.962, -1.000)$, the proposed classes of estimators $t_1^{(a_1)}$, $t_2^{(a_2)}$ and $t_3^{(a_3)}$ are always more efficient than {$\bar{y}_{(n-r)}$, $t_{1R}$, $t_{1Re}$}, {$\bar{y}_{(n-r)}$, $t_{2R}$, $t_{2Re}$} and {$\bar{y}_{(n-r)}$, $t_{3R}$, $t_{3Re}$} respectively. It is further observed from Tables 5.1, 5.6 and 5.11 that the proposed modified exponential type ratio estimator $t_{jRe}$, (j = 1,2,3), is more efficient than the usual unbiased estimator $\bar{y}_{(n-r)}$ and the usual ratio estimator $t_{jR}$, (j = 1,2,3), with substantial estimated gain in efficiency. However, the estimators $t_1^{(\hat{a}_1)}$, $t_2^{(\hat{a}_2)}$ and $t_3^{(\hat{a}_3)}$ based on estimated optimum values are more efficient than the estimators {$\bar{y}_{(n-r)}$, $t_{1R}$, $t_{1Re}$}, {$\bar{y}_{(n-r)}$, $t_{2R}$, $t_{2Re}$} and {$\bar{y}_{(n-r)}$, $t_{3R}$, $t_{3Re}$} respectively, with considerable estimated gain in efficiency. Thus the proposed estimators $t_{jRe}$ and $t_j^{(\hat{a}_j)}$, (j = 1,2,3), are recommended for use in practice. The estimated gain in efficiency due to the estimators proposed in Strategy-I is the largest, followed by the estimators in Strategy-II and Strategy-III.

Acknowledgements

The authors are thankful to the learned referee for his valuable suggestions and for pointing out the mistakes in an earlier draft of the paper.

References

1. Ahmed, M.S., Al-Titi, O., Al-Rawi, Z., Abu-Dayyeh, W. (2005): Estimation of finite population variance in presence of random non-response using auxiliary variables. Information and Management Sciences, 16(2), 73-82.

2. Chaudhuri, A. and Stenger, H. (1992): Survey Sampling, theory and methods, Marcel Dekker, New York.

3. Chaudhuri, A and Vos, J.W.E. (1998): Unified theory and strategies of

survey sampling, North-Holland.

4. Cochran, W.G. (1940): The estimation of the yields of cereal experiments by sampling for the ratio of grain to total produce. Jour. Agric. Sci., 30, 262-275.

5. Cochran, W.G. (1977): Sampling techniques. Third edition, John Wiley and Sons, New York.

6. Heitjan, D.F. and Basu, S. (1996): Distinguishing ‘missing at random’ and ‘missing completely at random’. The American Statistician, 50, 207-213.

7. Krishnaiah, P.R. and Rao, C.R. (1998): Handbook of statistics, Vol. 6, North-Holland.

8. Murthy, M.N. (1964): Product method of estimation. Sankhya, 26, 69-74.

9. Rubin, D.B. (1976): Inference and missing data. Biometrika, 63(3), 581-592.

10. Singh, S. (2003): Advanced sampling theory with applications: How Michael ‘Selected’ Amy. Vol. I and II, Kluwer Academic Publishers.

11. Singh H.P., Chandra, P., Joarder, A. H. and Singh, S. (2007): Family of


12. Singh, H.P. Chandra, P. and Singh, S. (2003): Variance estimation using multiauxiliary information for random non-response in survey sampling. Statistica. LXIII, 23-40.

13. Singh, H.P. and Tracy, D.S. (2001): Estimation of population mean in

presence of random non-response in sample surveys. Statistica, LXI, 231-248.

14. Singh, R. and Mangat, N.S. (1996): Elements of survey sampling. Kluwer

Academic Publisher.

15. Singh, S. and Joarder, A.H. (1998): Estimation of finite population

variance using random non-response in survey sampling, Metrika, 47, 241-249.

16. Singh, S., Joarder, A.H. and Tracy, D.S. (2000): Regression type

estimators for random non-response in survey sampling. Statistica, LX, (1), 39-43.

17. Sukhatme, P.V. and Sukhatme, B.V. (1970): Sampling theory of surveys

with applications. Iowa State University Press, Ames, Iowa.

18. Tracy, D.S, and Osahan, S.S. (1994): Random non-response on study

variable versus on study as well as auxiliary variables. Statistica, 54, 163-168.

19. Toutenburg, H. and Srivastava, V.K. (1998): Estimation of ratio of

population means in survey sampling when some observation are missing. Metrika, 48, 177-187.

20. Tuteja, R.K. and Bahl, S. (1991): Ratio and product type exponential