RESEARCH ARTICLE
J.Natn.Sci.Foundation Sri Lanka 2020 48 (1): 27 - 36 DOI: http://dx.doi.org/10.4038/jnsfsr.v48i1.9931
A new improved difference-cum-exponential ratio type estimator in systematic sampling using two auxiliary variables
J Shabbir1*, S Masood2 and S Gupta3
1 Department of Statistics, Quaid-i-Azam University, Islamabad, Pakistan.
2 Department of Mathematics and Statistics, PMAS-Arid Agriculture University Rawalpindi, Pakistan.
3 Department of Mathematics and Statistics, The University of North Carolina at Greensboro, Greensboro, USA.
Submitted: 22 May 2018; Revised: 23 January 2019; Accepted: 27 September 2019
* Corresponding author ([email protected]; https://orcid.org/0000-0002-0035-7072) Abstract: A difference-cum-exponential ratio type estimator
was proposed for estimating the finite population mean using two auxiliary variables under systematic sampling. Expressions for the biases and mean square errors (MSEs) were derived up to first order of approximation. It was observed that the proposed estimator is more efficient than the usual sample mean estimator, traditional ratio estimator, exponential-ratio estimator and many other recently proposed difference type estimators in terms of MSEs. Four datasets were used for efficiency comparisons.
Keywords: Bias, efficiency, MSE, systematic sampling.
INTRODUCTION
The use of auxiliary information in sample survey in an appropriate way may increase the precision of estimators by taking advantages of correlation between the study variable and the auxiliary variable. The ratio, product, exponential and regression estimators have frequently been used by many researchers in different forms either at the estimation stage or at designing stage or at both stages. In daily life information on single as well as two auxiliary variables are commonly used to enhance the precision of estimators. For example: (i) let y be the yield of a particular crop based on the area of a crop (x) and the amount of water utilisation (z); (ii) let y be the electricity consumed in the households (HHDs) based on income of the HHDs (x) and size of the HHDs (z) and (iii) let y be the average grades of the undergraduate students based
on the number of study hours used by students (x) and intelligence coefficient (IQ) level of the students (z). In all these examples, we are interested to use the auxiliary information in parallel to the study variable. Madow and Madow (1944), Cochran (1946) and Gautschi (1957) were among the earlier contributors in the area of systematic sampling. Other authors who have contributed in this area include: Swain (1964), Kushwaha and Singh (1989), Banarasi et al. (1993), Singh and Singh (1998), Singh et al. (2011), Singh and Jatwa (2012), Tailor et al.
(2013), Khan and Singh (2015), Khan (2016), Pal and Singh (2017), Kocyigit and Cingi (2017), Riaz et al.
(2017), Tailor and Mishra (2018), Qureshi et al. (2018) and Mishra et al. (2018). The main objective of this study was to construct a new estimator by taking advantage of two auxiliary variables to improve the efficiency of the estimator in systematic sampling.
METHODOLOGY
Consider a finite population U={U U1, ,... ,...,2 Ui N} of Nordered identifiable units. A sample of size n units is selected by first selecting an observation from the first k units and then selecting every kth unit thereafter. Thus, we can obtain k possible samples each of size n such that nk N= . Let yij and ( , )x zij ij
( 1,2,3,..., ;i= k j=1,2,3,..., )n be the observed values of the study variable ( )y and the auxiliary variables ( , )x z , respectively for the jth unit in the ith possible
sample. Let ( )
1 1
1 k n
sys
i j ij
y y
n = =
=
∑∑
, ( )1 1
1 k n
sys
i j ij
x x
n = =
=
∑∑
and( )
1 1
1 k n
sys ij
i j
z z
n = =
=
∑∑
be the systematic sample means corresponding to population means1 1
1 k n ij,
i j
Y y
N = =
=
∑∑
1 1
1 k n
i j ij
X x
N = =
=
∑∑
and1 1
1 k n
i j ij
Z z
N = =
=
∑∑
, respectively.Let ρy∗= + −1 ( 1) ,n ρyy ρx∗= +1 (n−1)ρxx and 1 ( 1) ,
z n zz
ρ∗= + − ρ where ( )( 2 ),
( )
ij ij
yy
ij
E y Y y Y E y Y
ρ = − ′−
−
2
( )( )
( )
ij ij
xx
ij
E x X x X E x X
ρ = − ′−
− and ( )( 2 )
( )
ij ij
zz
ij
E z Z z Z E z Z
ρ = − ′−
− are
the intra-class correlation coefficients between pairs of units within the same systematic sample for y, x and ( , )y z, respectively. Let ρyx, ρyz and ρxz be the usual population correlation coefficients between ( , ),y x
( , )y z and ( , )x z , respectively. Let Cy Sy
= Y , Cx Sx
=X and Cz Sz
= Z be the coefficients of variation of y, x and ( , )y z, respectively, where
2
1 1
1 ( ) ,
1
k n
y ij
i j
S y Y
N = =
= −
−
∑∑
21 1
1 ( )
1
k n
x ij
i j
S x X
N = =
= −
−
∑∑
and 2
1 1
1 ( )
1
k n
z ij
i j
S z Z
N = =
= −
−
∑∑
be the populationstandard deviations for y, x and ( , )y z, respectively.
We also define the following error terms.
Let 0 y(sys) 1
ζ = Y − , 1 x(sys) 1
ζ = X − , 2 z(sys) 1
ζ = Z − , such that E( ) 0ζi = for ( 0,1,2)i = . Also E( )ζ02 = Φρy∗Cy2,
2 2
( )1 x x
Eζ = Φρ∗C , E( )ζ22 = Φρz∗Cz2, E(ζ ζ0 1)= ΦCyx ρ ρy x∗ ∗, ( 0 2) yz y z
Eζ ζ = ΦC ρ ρ∗ ∗ and E(ζ ζ1 2)= ΦCxz ρ ρx z∗ ∗, where
yx yx y x
C =ρ C C , Cyz=ρyzC Cy z, Cxz=ρxz x zC C , and
1 N
nN Φ = − .
Now we discuss some of the existing estimators.
The usual sample mean estimator is given by
( )
ˆ sys
Y = y ...(1)
The variance of ˆY0 is given by 2 2
ˆ0
( ) y y
Var Y = ΦY ρ∗C ...(2) Single auxiliary variable
(i) The traditional ratio estimator is given by
( ) ( )
ˆ sys
R sys
Y y X x
=
...(3)
The bias and MSE of ˆ
YR to first order of approximation are respectively given by
{
2}
( )ˆR x x yx y x
B Y ≅ ΦY ρ∗C −C ρ ρ∗ ∗ ...(4) and
{ }
2 2 2
( )ˆR y y x x 2 yx y x
MSE Y ≅ ΦY ρ∗C +ρ∗C − C ρ ρ∗ ∗ ...(5) (ii) The usual exponential ratio type estimator is given by
( ) ( )
( )
ˆRE sys exp X xsyssys Y y
X x
−
= + ...(6)
The bias and MSE of YˆRE to first order of approximation are respectively given by
3 2 1
(ˆ )
8 2
RE x x yx y x
B Y ≅ ΦY ρ∗C − C ρ ρ∗ ∗
...(7)
and
2 2 1 2
(ˆ )
RE y y 4 x x yx y x
MSE Y ≅ ΦY ρ∗C + ρ∗C −C ρ ρ∗ ∗
...(8)
(iii) The usual difference estimator is given by
( )
( ) ( )
ˆ sys sys
YD =y +d X x− , ...(9) where d is the constant.
The minimum variance of YˆD at optimum value of d
i.e.
yx y * *y x opt
x x
Y C
d XC
ρ ρ ρ
ρ∗
= , is given by
Var Y( )ˆD min ≅ ΦY2ρy y∗C2
(
1−ρ2yx)
=MSE Y( )ˆD min...(10)
unbiased linear estimator which is given by
( ) ( )
( )
1 2 3
ˆ sys sys sys
KC
ax b ax b
Y y q q q
aX b aX b
α δ
+ +
= + + + + , ...(11) where ( , ) Rα δ ∈ and q i =i( 1,2,3) are constants and a and b are the functions of known population parameters, which may be the population mean, population coefficient of variation, population coefficient of skewness and population coefficient kurtosis of the auxiliary variable x.
The minimum MSE of ˆ
YKC at optimum value of
2 3
( )
AG= qα+qδ i.e.
*
( ) *
yx y y
G opt
x x
A C
gC
ρ ρ
= − ρ , where
g aX
=aX b
+ , is given by
( )
2 2 2
ˆ min
( KC) y y 1 yx
MSE Y ≅ ΦY ρ∗C −ρ , ...(12) which is equal to the variance of the linear regression estimator.
(v) Riaz et al. (2017) suggested the following class of estimators in systematic sampling:
YˆRDS
{
t y1 (sys) t X x2(
(sys)) }
exp X x((syssys))γ X x
−
= + − + ,
where t i =i( 1,2) are constants and γ is the scalar quantity.
For γ=1, the above estimator becomes:
( )
{
1 ( ) 2 ( )}
(( ))ˆRDS sys sys exp syssys
Y t y t X x X x
X x
−
= + − + ,
...(13)
The bias of ˆ
YRDS to first order of approximation is given by
2 2
1 1 3 1 1 2
(ˆ ) ( 1)
8 2 2
RDS x x yx y x x x
B Y ≅ t − Y t Y+ Φ ρ∗C − C ρ ρ∗ ∗+ t X CΦ ρ∗
2 2
1 1 3 1 1 2
(ˆ ) ( 1)
8 2 2
RDS x x yx y x x x
B Y ≅ t − Y t Y+ Φ ρ∗C − C ρ ρ∗ ∗+ t X CΦ ρ∗
...(14)
The MSE of ˆ
YRDS is given by
2 2 2 2 2 2 2
1 1 1 1 2 1 2
(ˆRDS) ( 1) 2
MSE Y ≅ t − Y +t Y A t X B t Y C t YXB+ − − + t t D
2 2 2 2 2 2 2
1 1 1 1 2 1 2
(ˆRDS) ( 1) 2
MSE Y ≅ t − Y +t Y A t X B t Y C t YXB+ − − + t t D,
where A= Φ
(
ρy∗Cy2+ρx∗Cx2−2Cyx ρ ρy x∗ ∗)
,x x2
B= Φρ∗C , 3 2
4 x x yx y x C= Φ ρ∗C −C ρ ρ∗ ∗,
(
x x2 yx y x)
D= Φ ρ∗C −C ρ ρ∗ ∗ .
The minimum MSE of ˆ
YRDS at optimum values of ( 1,2)
t i =i i.e. 1( ) ( 2 2)
2( )
opt
B C D
t AB D B
= − +
− + and
2( ) 2
( 2 )
2 ( )
opt
Y AB CD B D
t X AB D B
− + −
= − + is given by
(
2 2 2)
min 2 2
2 4 4 4
ˆ 1
( ) 1
RDS 4
AB BC BCD B BC BD B
MSE Y Y
AB D B
+ − + + − +
≅ − − +
(
2 2 2)
min 2 2
2 4 4 4
ˆ 1
( ) 1
RDS 4
AB BC BCD B BC BD B
MSE Y Y
AB D B
+ − + + − +
≅ − − +
...(15)
Two auxiliary variables
(i) The traditional ratio-ratio estimator using two auxiliary variables is given by
( )
( ) ( )
ˆ sys
RR sys sys
X Z
Y y
x z
=
...(16)
The bias and MSE of ˆ
YRR to first order of approximation are respectively given by
{
2 2}
( )ˆRR x x z z xz x z yx y x yz y z B Y ≅ ΦY ρ∗C +ρ∗C +C ρ ρ∗ ∗−C ρ ρ∗ ∗−C ρ ρ∗ ∗
...(17)
andMSE Y( )ˆRR ≅ ΦY2
{
ρy∗Cy2+ρx x∗C2+ρz∗Cz2+2Cxz ρ ρx z∗ ∗−2Cyx ρ ρy x∗ ∗−2Cyz ρ ρy z∗ ∗}
{ }
2 2 2 2
( )ˆRR y y x x z z 2 xz x z 2 yx y x 2 yz y z
MSE Y ≅ ΦY ρ∗C +ρ∗C +ρ∗C + C ρ ρ∗ ∗− C ρ ρ∗ ∗− C ρ ρ∗ ∗
...(18) (ii) Tailor et al. (2013) suggested the ratio-product estimator which is given by
( ) ( )
( )
ˆ sys sys
RP sys
X z Y y
x Z
=
...(19)
(iii) The bias and MSE of YˆRP to first order of approximation are given by
{
2}
( )ˆRP x x xz x z yx y x yz y z B Y ≅ ΦY ρ∗C −C ρ ρ∗ ∗−C ρ ρ∗ ∗+C ρ ρ∗ ∗
...(20)
andMSE Y( )ˆRP ≅ ΦY2
{
ρy∗C2y+ρx x∗C2+ρz∗Cz2−2Cxz ρ ρx z∗ ∗−2Cyx ρ ρy x∗ ∗+2Cyz ρ ρy z∗ ∗}
{ }
2 2 2 2
( )ˆRP y y x x z z 2 xz x z 2 yx y x 2 yz y z MSE Y ≅ ΦY ρ∗C +ρ∗C +ρ∗C − C ρ ρ∗ ∗− C ρ ρ∗ ∗+ C ρ ρ∗ ∗
...(21) (iv) The exponential ratio-ratio type estimator is given by
( ) ( )
( )
( ) ( )
ˆRRE sys exp X x syssys exp Z z syssys
Y y
X x Z z
− −
= + + ...(22)
The bias and MSE of YˆRRE to first order of approximation are given by
(
2 2)
3 1 1 1
(ˆ )
8 4 2 2
RRE x x z z xz x z yx y x yz y z
B Y ≅ ΦY ρ∗C +ρ∗C + C ρ ρ∗ ∗ − C ρ ρ∗ ∗− C ρ ρ∗ ∗
(
2 2)
3 1 1 1
(ˆ )
8 4 2 2
RRE x x z z xz x z yx y x yz y z
B Y ≅ ΦY ρ∗C +ρ∗C + C ρ ρ∗ ∗− C ρ ρ∗ ∗ − C ρ ρ∗ ∗
...(23)
and
2 2 1 2 1 2 1
(ˆ )
4 4 2
RRE y y x x z z xz x z yx y x yz y z
MSE Y ≅ ΦY ρ∗C + ρ∗C + ρ∗C + C ρ ρ∗ ∗−C ρ ρ∗ ∗−C ρ ρ∗ ∗
2 2 1 2 1 2 1
(ˆ )
4 4 2
RRE y y x x z z xz x z yx y x yz y z
MSE Y ≅ ΦY ρ∗C + ρ∗C + ρ∗C + C ρ ρ∗ ∗−C ρ ρ∗ ∗−C ρ ρ∗ ∗
...(24)
(v) Tailor and Mishra (2018) suggested the exponential ratio-product type estimator, which is given by
( ) ( )
( )
( ) ( )
ˆRPE sys exp X x syssys exp z syssys Z
Y y
X x z Z
− −
= + + ...(25)
The bias and MSE of YˆRPE to first order of approximation are given by
(ˆ ) 3 2 1 2 1 1 1
8 8 4 2 2
RPE x x z z xz x z yx y x yz y z
B Y ≅ ΦY ρ∗C − ρ∗C − C ρ ρ∗ ∗− C ρ ρ∗ ∗+ C ρ ρ∗ ∗
2 2
3 1 1 1 1
(ˆ )
8 8 4 2 2
RPE x x z z xz x z yx y x yz y z
B Y ≅ ΦY ρ∗C − ρ∗C − C ρ ρ∗ ∗ − C ρ ρ∗ ∗ + C ρ ρ∗ ∗
...(26)
and
2 2 1 2 1 2 1
(ˆ )
4 4 2
RPE y y x x z z xz x z yx y x yz y z
MSE Y ≅ ΦY ρ∗C + ρ∗C + ρ∗C − C ρ ρ∗ ∗−C ρ ρ∗ ∗+C ρ ρ∗ ∗
2 2 1 2 1 2 1
(ˆ )
4 4 2
RPE y y x x z z xz x z yx y x yz y z
MSE Y ≅ ΦY ρ∗C + ρ∗C + ρ∗C − C ρ ρ∗ ∗−C ρ ρ∗ ∗+C ρ ρ∗ ∗
...(27)
(vi) The traditional difference-difference type estimator is given by
( ) ( )
( ) ( ) ( )
1 2
ˆ sys sys sys
YDD=y +d X x− +d Z z−
, ...(28) where d1 and d2 are constants.
The minimum variance of YˆDD, at optimum values
of d1 and d2 i.e.
( )
( )
*
1( ) * 1 2
y y yx yz xz
opt
x x xz
d YC
XC
ρ ρ ρ ρ
ρ ρ
= −
− and
( )
( )
*
2( ) * 1 2
y y yz yx xz
opt
z z xz
d YC
ZC
ρ ρ ρ ρ
ρ ρ
= −
− , is given by
( )
2 2 2
min . min
ˆ ˆ
( DD) y y 1 y xz ( DD) Var Y ≅ ΦY ρ∗C −R =MSE Y ,
...(29)
where
2 2
2. 2
2 1
yx yz yx yz xz
y xz
xz
R ρ ρ ρ ρ ρ
ρ
+ −
= − is the multiple
correlation coefficient of y on x and z.
(vii) Khan and Singh (2015) and Khan (2016) suggested the following estimators respectively:
1 ( ) 2
( )
( )
( )
ˆ ( )
yz sys sys
KS sys
yx
Z b z Z Y y X
X b x X Z
δ δ
+ −
= + − , where δ =i( 1,2)i are constants and byx and byz are the sample regression coefficients and
( )
( ) ( )
( )
( ) (
( )
( ) ( )
ˆ exp exp
1 1
sys sys
sys
K sys sys
f X x h Z z
Y y
X g x Z η z
− −
= + − + − ,
where −∞ < < ∞f , −∞ < < ∞h , g >0 and η >0. Note that minimum MSEs of YˆKS and YˆK are exactly equal to the minimum variance of YˆDD to first order of approximation but the estimator YˆDD is preferable because of unbiasedness.
(viii) Qureshi et al. (2018) introduced the following estimator for population mean
( )
( )
1 (
( )
( ) ( )
2
ˆ exp
1
sys
QKH sys sys sys
X Z z
Y y
x Z z
ν
α ν
−
= + − , ...(30)
where v1 and v2 are constants and α is the scalar quantity, which takes values (0, 1, 1)− + .
The bias of YˆQKH reported by Qureshi et al. (2018) is given by
( ) { }
2 2
1 2 1
(ˆ ) 1 2( 1) ,
QKH x yx yx 2 z yx yx zx zx
B Y Y C H C a H H
a
ν ρ∗ α α ρ∗ ν ρ∗
≅ − + − + − −
( ) {
}
2 2
1 2 1
(ˆ ) 1 2( 1) ,
QKH x yx yx 2 z yx yx zx zx
B Y Y C H C a H H
a
ν ρ∗ α α ρ∗ ν ρ∗
≅ − + − + − −
where Hij=ρijC Ci/ j, 1 ( 1) 1 ( 1) i
ij
j
n n ρ ρ
ρ
∗= + −
+ − , a=α α/ ,*
* (1 2)
yx yz zx
yx
xz
H H H
α ρ
ρ
∗ −
= − .
Note: It is observed that the bias expression of estimator ˆQKH
Y by Qureshi et al. (2018) is not correct. Using equation (30), the correct bias expression is:
( ) 1 1
2 2
(ˆQKH c) yx y x yz y z xz x z
B Y Y νC ρ ρ α C ρ ρ αν C ρ ρ
ν ν
∗ ∗ ∗ ∗ ∗ ∗
≅ Φ − − +
( ) 1 1
2 2
(ˆQKH c ) yx y x yz y z xz x z
B Y Y νC ρ ρ α C ρ ρ αν C ρ ρ
ν ν
∗ ∗ ∗ ∗ ∗ ∗
≅ Φ − − +
2 2 2
1 1 2 2
2 2 2
1 ( 1)
2ν ν ρx xC α α 2α ρzCz
ν ν ν
∗ ∗
+ + + − +
2 2 2
1 1 2 2
2 2 2
1 ( 1)2ν ν ρx xC α α 2α ρzCz
ν ν ν
∗ ∗
+ + + − +
...(31)
The minimum MSE of ˆ
YQKH at optimum values of ν1 and ( / )α ν2 i.e.
( ) ( )
( )
*
1 * 1 2
y y yx yz xz
opt
x x xz
v C
C
ρ ρ ρ ρ
ρ ρ
= −
− and
( )
( )
*
2 * 2
( / )
1
y y yz yx xz
opt
z z xz
C C
ρ ρ ρ ρ
α ν ρ ρ
= −
− is given by
( )
2 2 2
min .
(ˆQKH) y y 1 y xz
MSE Y ≅ ΦY ρ∗C −R
...(32) (ix) Mishra et al. (2018) suggested the following class of estimators:
0 0 1 1
ˆ ˆ ˆ ˆ
MSS RP RPE
Y =δ Y +δY +δY , ...(33)
where 2
0 i 1
i
δ
=
∑
= are constants; Y iˆ ( 0, ,i = RP RPE) are defined earlier.Mishra et al. (2018) derived the bias and minimum MSE of YˆMSSup to first order of approximation as:
( ) (
2)
2 2
1 1
(ˆ )
2 4
MSS yz y z yx y x x x yz y z
B Y ≅ ΦYδ +δ C ρ ρ∗ ∗−C ρ ρ∗ ∗ +δ +δ C ρ∗−C ρ ρ∗ ∗
( ) (
2)
2 2
1 1
(ˆ )
2 4
MSS yz y z yx y x x x yz y z
B Y ≅ ΦYδ +δ C ρ ρ∗ ∗−C ρ ρ∗ ∗ +δ +δ C ρ∗−C ρ ρ∗ ∗ ...(34)
and
( )
22 2
min 2 2
(ˆ ) 3
2
yz y z yx y x
MSS y y
x x z z xz x z
C C
MSE Y Y C
C C C
ρ ρ ρ ρ
ρ ρ ρ ρ ρ
∗ ∗ ∗ ∗
∗
∗ ∗ ∗ ∗
−
≅ Φ +
+ −
...(35)
The optimum value of 1 22 δ δ
+
is:
( )
1 22 2 2 2
yz y z yx y x
opt x x z z xz x z
C C
C C C
ρ ρ ρ ρ
δ δ
ρ ρ ρ ρ
∗ ∗ ∗ ∗
∗ ∗ ∗ ∗
+ = −
+ −
...(36)
Expressions given in equations (34), (35) and (36) are not correct.
Solving equation (33), the correct expressions of bias and MSE of estimator YˆMSS to first order of approximation are:
( )
2 2
( ) 1 1
(ˆ )
2 4
MSS c yz y z yx y x xz x z
B Y ≅ ΦYδ +δ C ρ ρ∗ ∗−C ρ ρ∗ ∗ −δ +δ C ρ ρ∗ ∗
( )
2 2
( ) 1 1
(ˆ )
2 4
MSS c yz y z yx y x xz x z
B Y ≅ ΦY δ +δ C ρ ρ∗ ∗−C ρ ρ∗ ∗ −δ +δ C ρ ρ∗ ∗
2 2
2 2
1 3
8 Cx x 8 Cz z
δ δ
δ ρ∗ ρ∗
+ + −
...(37)
and
2 2
( )
2( ) min 2 2
(ˆ )
2
yz y z yx y x
MSS c y y
x x z z xz x z
C C
MSE Y Y C
C C C
ρ ρ ρ ρ
ρ ρ ρ ρ ρ
∗ ∗ ∗ ∗
∗
∗ ∗ ∗ ∗
−
≅ Φ − + −
...(38)
The correct optimum value of 1 2 2 δ δ
+
is:
2 2
1 1
2 opt c( ) 2 opt
δ δ
δ δ
+ = − +
...(39)
Proposed estimator
Motivated by Khan (2016) and Riaz et al. (2017), the following difference-cum-exponential ratio type estimator is proposed for the population mean under systematic sampling: