Original Research
Some generalized classes of double sampling regression type estimators using auxiliary information
Shashi Bhushan1*, R. Karan Singh2 and Arvind Pandey1
1Department of Statistics, Pachhunga University College, Mizoram University, Aizawl 796 001, India
2Department of Statistics, University of Lucknow, Lucknow 226 007, India
Received 17 December 2010 | Revised 17 January 2011 | Accepted 27 January 2011
ABSTRAC T
In the present study, a double sampling regression type estimator representing a class of estimators is proposed. The bias and mean square error (MSE) of the proposed estimator is obtained. A more generalized class of double sampling regression type estimator utilizing the auxiliary information available at first phase in the form of mean and variance of the auxiliary variable is also proposed.
The bias and MSE of the proposed class were obtained for this class. The concluding remarks show that the proposed classes of estimators are better than that of double sampling estimator based on earlier proposition. An empirical study is included for illustration.
Key words Key wordsKey words
Key words: Auxiliary information; bias; double sampling regression type estimator; mean square error.
Mathematics Subject Classification 2000:
Mathematics Subject Classification 2000: Mathematics Subject Classification 2000:
Mathematics Subject Classification 2000: Primary 62D05 Secondary 62F10
Corresponding author: S. Bhushan Phone.
E-mail: [email protected]
INTRO DUC TIO N
If the supplementary information on the auxiliary variable is not known then the dou- ble sampling ratio and regression strategies are very well known. Many biased double sampling ratio type, double sampling regres- sion type transformed estimators and the bi- ased double sampling estimators obtained through parametric linear combination of ra- tio or regression estimator and the usual unbi- ased estimators are available for estimating
the population mean.1- 7 The use of population mean and variance of auxiliary variable for increasing the efficiency of the sampling strat- egy has been discussed recently by Singh,8 Bhushan1 and Bhushan et al.9 among others.
Let y be the characteristic under study and x be the auxiliary variable. Thus for a finite population of size N, we denote by
Yi: the observation on the i th unit of the population for the characteristic y under study (i =1,2,...,N),
Xi: the observation on the i th unit of the population for the auxiliary characteristic x under study (i =1,2,...,N),
January-March, 2011 ISSN (print) 0975-6175 ISSN (online) 2229-6026
,
(where ρ is the population correlation coeffi- cient between x and y) and
: the (p,q) th product mo- ment about mean between x and y.
If the information about the mean and variance of the auxiliary variable is not known then we resort to double sampling. Let the auxiliary characteristic x be observed on a large preliminary simple random sample of size drawn without replacement from a population of size N in the first phase. Also the characteristic of interest y and the auxil- iary characteristic x be observed on the second phase sample of size n drawn from first phase sample by simple random sampling without replacement.
and
be the sample mean and sample variance of auxiliary characteristic x based on first phase sample of size . Also based on second phase sub-sample of size n, let
1
1 N
i i
Y Y
N =
=
∑
1
1 N
i i
X X
N =
=
∑
2 2
1
1 ( )
1
N
y i
i
S Y Y
N =
= −
−
∑
2 2
1
1 ( )
1
N
x i
i
S X X
N =
= −
−
∑
2 2
1
1 ( )
N
x i
i
X X
σ
N=
=
∑
−1
1 ( )( )
1
N
xy i i x y
i
S X X Y Y S S
N
ρ
=
= − − =
−
∑
2
xy y
x x
S S
S S
β
= =ρ
and
be the sample mean of auxiliary characteristic x and characteristic y under study, respec- tively,
and
be the sample variance of characteristic x and characteristic y, respectively,
be the sample covariance between x and y, and
be the sample regression coefficient of y on x.
A PRO P O SED DO UBLE SA MP LIN G ES TI- MA TO R
A proposed double sampling estimator of population mean is
(1)
where θ is the characterizing scalar to be de- termined suitably. Note that for θ=0 the pro- posed estimator reduces to the usual double sampling linear regression estimator
(2)
Let
with
1
1 ( )( )
1
n
xy i i
i
s x x y y
n =
= − −
−
∑
2 xy
x
b s
= s
2
xy xy
s =S +e sx2=Sx2+e3
2 2
x x e4
σ
=σ
+σ
'2x =σ
x2+e4' 11 ( ) ( )
N
p q
pq i i
i
X X Y Y
µ
N=
=
∑
− −n'
'
' '
1
1 n
i i
x x
n =
=
∑
' 2 ' ' ' 21
1 ( )
n
x i
i
x x
σ
n=
=
∑
−n'
1
1 n
i i
y y
n =
=
∑
1
1 n
i i
x x
n =
=
∑
2 2
1
1 ( )
1
n
x i
i
s x x
n =
= −
−
∑
2 21
1 ( )
1
n
y i
i
s y y
n =
= −
−
∑
Y
( ' )
ylrD
= +
y b x−
xy = +Y e0 x
= +
X e1 x'= +
X e1'(3)
Then by using (1), we have
(4)
Using the results given in Sukhatme and Sukhatme,4 Bhushan1 and Bhushan et al.9
(5)
where and .
Using (4) and (5) it can be seen that
(6) showing that is a biased estimator of popu- lation mean. Using (4) and neglecting terms
' '
0 1 1 2 3 4 4
( ) ( ) ( ) ( ) ( ) ( ) ( ) 0
E e =E e =E e =E e =E e =E e =E e =
0 4 21
( ) n
E e e =
γ µ
E e e( 0 4')= γ µ
n' 211 2 21
( ) n
E e e =
γ µ
E e e( 1 2' )= γ µ
n' 211 3 30
( ) n
E e e =
γ µ
E e e( 1 3' )=γ µ
n' 30' 2 ' 2
4 4 4 ' 40 20
( ) ( ) n( )
E e =E e e =
γ µ
−µ
2 2
( 0 ) n y
E e =
γ
S E e( 12)=γ
nSx2 ( 0 1) n xyE e e =
γ
S E e( 42)=γ µ
n( 40−µ
202)1 4 30
( ) n
E e e =
γ µ
E e e( 0 4)=γ µ
n 21'
0 1 '
( ) n xy
E e e =
γ
S E e( 1' 2)=E e e( 1 1')=γ
n'Sx2' 2 ' 2
4 4 4 ' 40 20
( ) ( ) n( )
E e =E e e =
γ µ
−µ
'
0 4 ' 21
( ) n
E e e =
γ µ
of ei’s having powers greater than two, we get the MSE given by
(7)
which is minimum for the optimum value of θ given by
(8)
and the minimum mean square error of is given by
(9)
A MO RE GENER ALIZ ED CL ASS O F DO U- BLE SA MP LIN G ES TI MA TO RS
A more generalized estimator of popula- tion mean is proposed to be
(10)
where ; g(w) is a function of w
such that g(w)=1 at w=1 satisfying the follow- ing conditions:
1. Whatever be the sample chosen, w as- sumes values in the bounded closed interval I of the real line containing the point unity.
2. Within the interval I the function g(w) is continuous and bounded.
3. The first, second and third partial de- rivatives of g(w) exist and are continu- ous and bounded in I.
By expanding g(w) about the point w=1 in the third order Taylor’s series, we have
' '
0 1 1 2 3 4 4
( ) ( ) ( ) ( ) ( ) ( ) ( ) 0
E e =E e =E e =E e =E e =E e =E e =
' ' ' 2 ' '
' 0 4 0 4 4 4 4 ' 1 2 1 2 1 3 1 3
0 2 4 4 2 2 1 1 2 2
x x x xy xy x x
e e e e e e e e e e e e e e e
Y Y e e e e e
Y Y
β
S S S S
= + + − + − − + + − + − − +
YθD = + +Y e e − +e − − + + e − +e − − +
' ' ' 2 ' '
' 0 4 0 4 4 4 4 ' 1 2 1 2 1 3 1 3
0 2 4 4 2 2 1 1 2 2
x x x xy xy x x
e e e e e e e e e e e e e e e
Y Y e Y e e e e
Y Y S S S S
θ
σ σ σ
= + + − + − − + + − + − − +
' ' ' '
1 4 1 4 1 4 ' 30
( ) ( ) ( ) n
E e e =E e e =E e e =
γ µ
' ( ') / '
n N n Nn
γ
= −( ) /
n N n Nn
γ
= −( D) ( D) ( )
Bias Yθ =E Yθ − =Y − − −
30
21 21
' 2 2
( ) ( ) ( n n)
x Sxy Sx
µ
θµ µ
γ γ β
σ
= − = − − −
( )
D ( n n')[(1 2) y2 242( 40 202) 2 21 2 30] n' y2x x x
Y Y Y
MSE Yθ
γ γ ρ
Sθ µ µ θ µ βθ µ γ
Sσ σ σ
= − − + − + − +
2 2
2 2 2 2
' 4 40 20 2 21 2 30 '
2 2
( n n)[(1 ) y ( ) ] n y
x x x
Y Y Y
MSE Y
γ γ ρ
Sθ µ µ θ µ βθ µ γ
Sσ σ σ
= − − + − + − +
30 21 20
2
40 20
( )
( )
opt Y
βµ µ µ θ
=µ
−−µ
( )
min ' 2 2 ' 2( )[(1 ) ]
D n n y n y
MSE Yθ =
γ
−γ
−ρ
S − +γ
S2
2 2 30 21 2
' 2 '
20 2
( )
( )[(1 ) ]
( 1)
n n y n y
MSE Y
γ γ ρ
Sβµ µ γ
Sµ β
= − − − − +
−
Y
( ) ( ' )
Ygd =yg w +b x −x
(11)
where , and may
depend on w; , and denote the first, second and third order derivatives of g(w) at the point w=1,1 and w*, respectively.
Using the notations given in (3), we have and
Putting these values in (11) and neglecting the terms of ei’s (i=0,1,2,3,4) and ’s (i=2,4) having powers greater than two, we get
(12)
Taking expectation, we get
(13)
showing that is a biased estimator of population mean. Also, the mean square error of the estimator is given by
(14)
(14) is minimized when
(15)
and the minimum mean square error of is same as that of expression given by (9).
CO NC L UDI N G RE MAR KS
The proposed generalized classes of esti- mators and are biased and their biases are given by (6) and (13) respectively. The minimum mean square errors of the proposed generalized classes are equal and are given by (9). Therefore, the proposed generalized classes of estimators and are preferred to usual linear regression estimator, ratio esti- mator, mean per unit estimator and product estimator in the sense of lesser mean square error. In the class of estimators, there ex- ists a subclass of optimum estimators satisfy- ing (15) such that every member of the sub- class attains the same minimum mean square error given by (9). For example, for
the estimator belongs to the sub-class of estimators attaining the minimum mean square error given by (9). Further, a double sampling ratio estimator based on Singh8 is
having MSE given by
where such that
2 2 2 2 2
' '
( sd) ( n n)[ y x 2 xy] n y
MSE y = γ −γ S +Rδ S − R Sδ +γ S
2 3
( 1) ( 1)
(1) ( 1) '(1) ''(1) '''( ) ( )
2! 3!
gd
w w
Y = y g + w− g + − g + − g w +b x −x
2 3
* '
( 1) ( 1)
(1) ( 1) '(1) ''(1) '''( ) ( )
2! 3!
w w
Y = y g + w− g + − g + − g w +b x −x
* 1 ( 1)
w
= + ψ
w−
0< <ψ
1ψ
'(1)g g''(1) g'''(w*)
'
ei
' ' 2
' 4 4 4 '
0 4 4 2 2 2 0 4 0 4 2
'(1) '(1)
( )
gd
x x x x
e e e Yg g
Y Y e e e e e e e
σ σ σ σ
= + + − − + + −
' '
0 4 4 2 2 2 0 4 0 4 2
'(1) '(1)
( )
x x x x
e e e Yg g
Y Y e e e e e e e
σ σ σ σ
= + + − − + + − 42 4' 2 4 4' ''(1)4 1' 1 2 2
( 2 )
2 x xy xy x x
Yg e e e e e e e e
e e e e e e
+ + −
σ
+ − + − − +' '
2 ' 2 ' ' 1 2 1 2 1 3 1 3
4 4 4 4 4 1 1 2 2
x xy xy x x
Yg e e e e e e e e
e e e e e e
S S S S
β
+ + − + − + − − +
' 2 21 4 40 20 2
'(1) ''(1)
( ) ( )[ ( ) ]
gd n n 2
x x x xy
g Yg
Bias Y
γ γ µ µ µ β
σ σ
= − + − + −
2 30 21
' 2 21 4 40 20 2
'(1) ''(1)
( ) ( )[ ( ) ]
x x Sx Sxy
µ µ
γ γ µ µ µ β
= − + − + −
{ }
22
2 2 2 2
' ' 4 40 20 2 21 2 30
( ) ( )[(1 ) '(1) ( ) ]
x x x
Y g
γ γ γ ρ µ µ µ µ
σ σ σ
= + − − + − + −
2 2 2 2
' ' 40 20 21 30
( gd) n y ( n n)[(1 ) y ( ) ]
MSE Y =
γ
S +γ
−γ
−ρ
S +µ
−µ
+µ
−µ
' ' 4 40 20 2 21 2 30
2 '(1) 2 '(1)
( ) ( )[(1 ) ( ) ]
x x x
Yg
β
Ygγ γ γ ρ µ µ µ µ
σ σ σ
= + − − + − + −
30 21 20
2
40 20
( )
'(1) ( )
g Y
βµ µ µ
µ µ
= −
−
30 21 20
2
40 20
( )
'(1) '(1)
( ) opt
g g
Y
βµ µ µ θ
= =µ
−−µ
=x
X δ X
= σ +
( ' )
( )
x sd
x
y y x x
σ σ
= +
+
showing that the proposed class of estimators are better than the double sampling Singh8 estimator. Also, the parameters involved in the may be estimated by the corre- sponding sample values in order to get a class of estimators depending upon estimated opti- mum value.
EMP IRIC AL ST UDY
The gain in precision of the proposed esti- mator(s) versus the usual double sampling linear regression estimator is studied for thirty two populations as provided in Bhushan1 and Bhushan et al.9 The table given below pro- vides the gain in precision when the proposed estimator(s) are used over the double sam- pling linear regression estimator.
2
min '
( sd) ( D) ( n n)[( y x) ] 0
MSE y −MSE Yθ = γ −γ ρS −R Sδ + ≥
2
30 21
2
20 2
( )
( ) ( ) ( )[( ) ] 0
( 1)
βµ µ µ β
− = − − + − ≥
−
'(1)opt g
Population
Number Gain(YθD/Ygd) Population
Number Gain(YθD/Ygd)
1 2.58583 17 300.001
2 9.08774 18 232.916
3 0.224016 19 10.8319
4 1.36995 20 11.1082
5 6.46651 21 0.0658345
6 25.9388 22 5.58773
7 1.04419 23 13.1132
8 0.035816 24 4.36055
9 6.54509 25 5.7562
10 65.492 26 2.78404
11 1.09776 27 0.0496923
12 2.58583 28 0.323048
13 0.0787519 29 0.586345
14 4.43361 30 11.9976
15 68.5157 31 5.47894
16 10.3939 32 2.16561
Table 1. Gain in precision of the proposed estimators over linear regression estimator.
Unpublished thesis submitted and accepted by Lucknow University, U.P., India.
6. Murthy MN (1967). Sampling Theory and Methods. Statis- tical Publishing Society, Calcutta.
7. Cochran WG (1977). Sampling Techniques 3rd edn. John Wiley and Sons, New York.
8. Singh GN (2003). On the improvement of product method of estimation in sample surveys, J Ind Soc Agr Stat, 56, 267-275.
9. Bhushan S, Mishra P & Singh RK (2010). A class of regres- sion type estimators using mean and variance of auxiliary variable (under communication).
REFERE NC E S
1. Bhushan S (2007). Some Improved Sampling Strategies in Finite Population. Ph.D. thesis, University of Lucknow, Lucknow, U.P., India.
2. Singh D & Chaudhary FS (2002). Theory and Analysis of Sampling Survey Design. New Age Pvt. Ltd., New Delhi.
3. Mukhopadhyaya P (1998). Theory and Methods of Survey Sampling. Prentice-Hall India, New Delhi.
4. Sukhatme PV & Sukhatme BV (1997). Sampling Theory of Surveys with Applications. Piyush Publications, Delhi.
5. Kumar R (1995). Some Contributions to Survey Sampling.