
SYSTEMATIC SAMPLING

In document 2. Survey Sampling Theory and Methods (Page 196-200)

Design- and Model-Based Variance Estimation


$$\sum_{k \in s} w_k X_k = \sum_{k=1}^{N} X_k$$

by minimizing the distance function

$$\sum_{k \in s} a_k (w_k - a_k)^2 / Q_k, \quad \text{with } Q_k > 0,$$

subject to the above CE. By the same approach one may derive $t_{GR}$ as a calibration estimator by modifying $t_{RHC}$ as well.

7.5 SYSTEMATIC SAMPLING

Next we consider variance estimation in systematic sampling, where unbiased variance estimation poses a special problem. A necessary and sufficient condition for the existence of a p-unbiased estimator for a quadratic form with at least one product term $X_i X_j$ is that the corresponding pair of units $(i, j)$ has a positive inclusion probability $\pi_{ij}$. But systematic sampling is a cluster sampling where the population is divided into a number of disjoint clusters, one of which is selected with a given probability. Thus a pair of units belonging to different clusters has a zero probability of appearing together in a sample. Hence the problem of p-unbiased estimation of variance. Let us turn to it.
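The zero joint inclusion probabilities can be seen by direct enumeration. The following is a minimal sketch, assuming the clusters of an equal-probability linear systematic design are the units taken at a constant interval from each possible start; the function names and the 0-based unit labels are illustrative, not from the text.

```python
from itertools import combinations


def systematic_samples(N, n):
    """All K = N/n possible linear systematic samples (0-based unit labels)."""
    if N % n != 0:
        raise ValueError("this sketch assumes N/n is an integer")
    K = N // n
    # Assumed cluster layout: start, start + K, start + 2K, ...
    return [tuple(range(start, N, K)) for start in range(K)]


def joint_inclusion_probabilities(N, n):
    """Map each pair (i, j), i < j, to the probability of being sampled together."""
    samples = systematic_samples(N, n)
    K = len(samples)
    pi = {pair: 0.0 for pair in combinations(range(N), 2)}
    for s in samples:
        # Each sample is selected with probability 1/K.
        for pair in combinations(s, 2):
            pi[pair] += 1.0 / K
    return pi


pi = joint_inclusion_probabilities(N=12, n=4)
zero_pairs = [pair for pair, p in pi.items() if p == 0.0]
print(len(zero_pairs), "of", len(pi), "pairs can never appear together")
```

Every pair that straddles two clusters ends up with probability zero, which is exactly the condition that rules out a p-unbiased estimator of a quadratic form involving those pairs.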

Let us consider the simplest case of linear systematic sampling with equal probabilities where, in choosing a sample of size $n$ from the population of $N$ units, it is supposed that $N/n$ is an integer $K$. Then the population is divided into $K$ mutually exclusive clusters of $n$ units each and one of them is selected at random, that is, with probability $1/K$. If the $i$th cluster is selected, then one takes $\bar{y}_i$, the mean of the $n$ units of the $i$th cluster, $i = 1, \ldots, K$, as the unbiased estimator for the population mean $\bar{Y}$. Then,

$$V(\bar{y}_i) = \frac{1}{K} \sum_{i=1}^{K} (\bar{y}_i - \bar{Y})^2 = \frac{S^2}{n} \left[ 1 + (n-1)\rho \right]$$

writing $S^2 = \frac{1}{nK} \sum_{i=1}^{K} \sum_{j=1}^{n} (Y_{ij} - \bar{Y})^2$, $Y_{ij}$ = the value of $y$ for the $j$th member of the $i$th cluster, and

$$\rho = \frac{1}{Kn(n-1)S^2} \sum_{i=1}^{K} \sum_{j \neq j'} (Y_{ij} - \bar{Y})(Y_{ij'} - \bar{Y}).$$
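The identity between the design variance and the intraclass correlation can be checked numerically. The sketch below computes $V(\bar{y}_i)$, $S^2$, and $\rho$ from their definitions for a small population; the data values, the function name, and the assumption that cluster $i$ holds every $K$th unit are illustrative only.

```python
def systematic_variance_parts(y, n):
    """Return (V, S2, rho) for single-start linear systematic sampling."""
    N = len(y)
    if N % n != 0 or n < 2:
        raise ValueError("this sketch assumes N/n is an integer and n >= 2")
    K = N // n
    Ybar = sum(y) / N
    # Assumed layout: cluster i holds units i, i + K, i + 2K, ...
    clusters = [[y[i + j * K] for j in range(n)] for i in range(K)]
    means = [sum(c) / n for c in clusters]
    # Design variance: average squared deviation of the K cluster means.
    V = sum((m - Ybar) ** 2 for m in means) / K
    # S^2 with divisor nK, as defined in the text.
    S2 = sum((v - Ybar) ** 2 for v in y) / (n * K)
    # Sum of cross products over ordered pairs j != j' within each cluster.
    cross = sum(
        (c[j] - Ybar) * (c[jp] - Ybar)
        for c in clusters
        for j in range(n)
        for jp in range(n)
        if j != jp
    )
    rho = cross / (K * n * (n - 1) * S2)
    return V, S2, rho


y = [3, 7, 1, 9, 4, 8, 2, 6, 5, 10, 12, 11]  # made-up population values
n = 4
V, S2, rho = systematic_variance_parts(y, n)
print(V, S2 / n * (1 + (n - 1) * rho))  # the two expressions should agree
```

A negative $\rho$ (heterogeneous clusters) makes the systematic sample mean more precise than $S^2/n$, while a positive $\rho$ makes it less precise.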

For the reasons mentioned above one cannot have a p-unbiased estimator for $V(\bar{y}_i)$ for the sampling scheme employed as above. However, there are several approaches to bypass this problem.

One procedure is to postulate a model characterizing the nature of the $y_{ij}$ values when they are arranged in $K$ clusters as described above and then work out an estimator based on the sample, for example, $v$ such that $E_m(v)$ equals $E_m V(\bar{y}_i)$, which therefore becomes a DM approach (cf. SÄRNDAL, 1981).

Second, the $N$ elements are arranged in order, and a number $r$ is found such that $n/r$ is an integer $m$. Then $Kr = L$ clusters are formed, and an SRSWOR of $r$ clusters is chosen. Each of these $L$ clusters has $m$ units and so a required sample of size $n = mr$ is thus realized. This is distinct from the original systematic sampling. To distinguish between the two they are respectively called single-start and multiple-start systematic sampling schemes. For the latter, one may suppose to have drawn $r$ different systematic samples each of size $m$, and the sample mean of each provides an unbiased estimator for the population mean. Denoting them by $\bar{y}_1, \bar{y}_2, \ldots, \bar{y}_r$, one may use $\bar{\bar{y}} = \frac{1}{r} \sum_{i=1}^{r} \bar{y}_i$ as an unbiased estimator for $\bar{Y}$ and

$$\frac{1}{r(r-1)} \sum_{i=1}^{r} (\bar{y}_i - \bar{\bar{y}})^2$$

as an unbiased estimator for $V_p(\bar{\bar{y}})$. Two variations of this procedure are (a) to choose by SRSWOR method 2 or more clusters out of the $K$ original clusters or (b) to divide the chosen cluster into a number of subsamples, and in either case obtain several unbiased estimators for $\bar{Y}$ and from them get an unbiased estimator of the variance of the pooled mean of these unbiased estimators.
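The multiple-start scheme can be sketched in a few lines. The code below forms the $L$ clusters of $m$ units, draws $r$ of them by SRSWOR, and evaluates the pooled mean and the replicate-based variance expression exactly as written above. The function names, the 0-based labels, the seed, and the population values are assumptions made for illustration.

```python
import random


def multiple_start_sample(N, n, r, rng):
    """Draw r systematic replicates of m = n/r units each (0-based labels)."""
    if N % n != 0 or n % r != 0:
        raise ValueError("this sketch assumes N/n and n/r are integers")
    m = n // r
    L = N // m  # number of clusters, equal to K * r
    starts = sorted(rng.sample(range(L), r))  # SRSWOR of r cluster starts
    # Assumed layout: the cluster with a given start holds start, start + L, ...
    return [list(range(start, N, L)) for start in starts]


def replicate_estimates(y, replicates):
    """Pooled mean and the 1 / (r (r - 1)) sum-of-squares expression."""
    means = [sum(y[k] for k in rep) / len(rep) for rep in replicates]
    r = len(means)
    pooled = sum(means) / r
    var_est = sum((mean - pooled) ** 2 for mean in means) / (r * (r - 1))
    return pooled, var_est


rng = random.Random(2024)  # arbitrary seed, for reproducibility only
y = [float(v) for v in range(1, 25)]  # made-up population of N = 24 values
replicates = multiple_start_sample(N=24, n=6, r=3, rng=rng)
pooled, var_est = replicate_estimates(y, replicates)
print(replicates, pooled, var_est)
```

Because the replicates come from distinct clusters, the $r$ lists never share a unit and together contain exactly $n$ units.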

A third approach is to first choose a systematic sample from the population and supplement it with an additional SRSWOR or another systematic sample from the remainder of the population. A variation of this is given by SINGH and SINGH (1977), who first make a random start out of all the $N$ units arranged in a certain order, select a few successive units, and then follow up by choosing later units at a constant interval in a circular order until a required effective sample size is realized. They call it new systematic sampling, derive certain conditions on its applicability, show that $\pi_{ij} > 0$ for every $i, j$ for this scheme, and hence derive a Yates–Grundy-type variance estimator.

COCHRAN's (1977) standard text gives several estimators following the first model-based approach. GAUTSCHI (1957), TORNQVIST (1963), and KOOP (1971) applied the second approach. HEILBRON (1978) also gives model-based optimal estimators of Var (systematic sample mean) as the conditional expectations of this variance given a systematic sample under various models postulated on the observations arranged in certain orders.

ZINGER (1980) and WU (1984) follow the third approach, taking a weighted combination of the unbiased estimators of $\bar{Y}$ based on the two samples and choosing the weights keeping in mind the twin requirements of resulting efficiency and non-negativity of the variance estimators. For a review one may refer to BELLHOUSE (1988) and IACHAN (1982).

Finally, we present below a number of estimators for $V(\bar{y}_i)$ based on the single-start simple linear systematic sample as given by WOLTER (1984).

We consider first the following notations: For the $i$th ($i = 1, \ldots, K$) systematic sample supposed to have been chosen, containing $n$ units, let $Y_{ij}$ be the sample values, $j = 1, \ldots, n$. Then,

$$\bar{y}_i = \frac{1}{n} \sum_{j=1}^{n} Y_{ij}.$$

Let further

Then WOLTER (1984) proposed the following estimators for $V(\bar{y}_i)$.

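For a multiple-start systematic sample with $r$ starts, let $\bar{y}_\alpha$ denote the sample mean based on the $\alpha$th replicate and

$$\bar{y} = \frac{1}{r} \sum_{\alpha=1}^{r} \bar{y}_\alpha.$$

Then for $V(\bar{y})$ the estimator is taken as

$$v_7 = \frac{1-f}{r(r-1)} \sum_{\alpha=1}^{r} (\bar{y}_\alpha - \bar{y})^2.$$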
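A small enumeration illustrates how $v_7$ behaves under the multiple-start design. The sketch below assumes that $f$ is the sampling fraction $n/N$ (equal to $r/L$ when $r$ of $L$ clusters of $m$ units are drawn by SRSWOR); the excerpt does not define $f$, so this reading, the function names, and the data are all assumptions.

```python
from itertools import combinations


def v7(replicate_means, f):
    """(1 - f) / (r (r - 1)) times the sum of squared deviations of replicate means."""
    r = len(replicate_means)
    ybar = sum(replicate_means) / r
    return (1 - f) / (r * (r - 1)) * sum((m - ybar) ** 2 for m in replicate_means)


def enumerate_multiple_start(y, m, r):
    """Design variance of the pooled mean and the design expectation of v7."""
    N = len(y)
    if N % m != 0:
        raise ValueError("this sketch assumes N/m is an integer")
    L = N // m
    # Assumed layout: the cluster with start s holds units s, s + L, s + 2L, ...
    cluster_means = [sum(y[s::L]) / m for s in range(L)]
    f = r / L  # assumed sampling fraction, equal to n / N
    pooled, estimates = [], []
    for chosen in combinations(cluster_means, r):  # all equally likely samples
        pooled.append(sum(chosen) / r)
        estimates.append(v7(chosen, f))
    count = len(pooled)
    center = sum(pooled) / count
    true_var = sum((p - center) ** 2 for p in pooled) / count
    return true_var, sum(estimates) / count


y = [3, 7, 1, 9, 4, 8, 2, 6, 5, 10, 12, 11]  # made-up population values
true_var, mean_v7 = enumerate_multiple_start(y, m=2, r=2)
print(true_var, mean_v7)  # these agree under the stated assumptions
```

Under these assumptions the average of $v_7$ over all possible samples matches the design variance of the pooled mean, which is the standard SRSWOR result applied to cluster means.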

This is also applicable if the $i$th systematic sample is split up into $r$ random subsamples (cf. KOOP, 1971). Writing

$$\hat{\rho}_K = \frac{1}{(n-1)s^2} \sum_{j=2}^{n} (Y_{ij} - \bar{y}_i)(Y_{i,j-1} - \bar{y}_i),$$

another estimator for $V(\bar{y}_i)$ is

$$v_8 = \frac{1}{(n-1)s^2} \sum_{j=2}^{n} \left[ (Y_{ij} - \bar{y}_i)(Y_{i,j-1} - \bar{y}_i) \right].$$
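The lag-one quantity $\hat{\rho}_K$ is straightforward to compute from the ordered sample values. In the sketch below, $s^2$ is assumed to be the usual sample variance with divisor $n - 1$; the excerpt does not define $s^2$, so that choice and the function name are assumptions.

```python
def rho_hat(sample):
    """Lag-one correlation estimate from the ordered values of one systematic sample."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two sample values")
    ybar = sum(sample) / n
    # Assumed definition: s^2 = sum of squared deviations / (n - 1).
    s2 = sum((v - ybar) ** 2 for v in sample) / (n - 1)
    if s2 == 0:
        raise ValueError("sample values are constant, so s^2 = 0")
    # Products of deviations of successive sample values (j = 2, ..., n).
    lagged = sum((sample[j] - ybar) * (sample[j - 1] - ybar) for j in range(1, n))
    return lagged / ((n - 1) * s2)


trend = rho_hat([1, 2, 3, 4, 5])      # steadily increasing values
alternating = rho_hat([1, 5, 1, 5])   # values that flip around the mean
print(trend, alternating)
```

A smooth trend in the ordered values gives a positive value, while values alternating around the sample mean give a negative one.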

WOLTER (1984) examined the relative performances of these estimators, considering $B_m(v) = E_m[E_p(v) - V(\bar{y})]$ and $B_m(v)/E_m V(\bar{y}_i)$ for $v$ as $v_i$, $i = 1, \ldots, 8$, for several models usually postulated in the context of systematic sampling. He also examined how good these are in providing confidence intervals for $\bar{Y}$. His recommendations favor $v_2$ and $v_3$ and, to some extent, $v_8$.

The general varying probability systematic sampling is known as circular systematic sampling (CSS) with probabilities proportional to sizes (PPS). From MURTHY (1967) we may describe it as follows. Suppose positive integers $X_i$ ($i = 1, \ldots, N$) with a total $X$ are available as size measures and a sample of $n$ units is required to be drawn from $U = (1, \ldots, N)$. Then a number $K$ is fixed as the integer nearest to $X/n$. A random positive integer $R$ is chosen between 1 and $X$. Then, let

$$a_r = (R + rK) \bmod X, \quad r = 0, \ldots, n-1$$

and

$$C_0 = 0, \quad C_i = \sum_{j=1}^{i} X_j, \quad i = 1, \ldots, N.$$

Then a CSSPPS sample $s$ is formed of the units $i$ for which $C_{i-1} < a_r \le C_i$ for $r = 0, 1, \ldots, n-1$, and the unit $N$ if $a_r = 0$.
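The selection rule above translates directly into code. The following is a minimal sketch, assuming 1-based unit labels and taking the random start $R$ as an argument so that the rule can be traced by hand; the function name and the example sizes are illustrative, and the handling of ties when rounding $X/n$ is an assumption since the text does not specify it.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def css_pps_sample(sizes, n, R):
    """Units (1-based) selected by circular systematic PPS sampling with start R."""
    X = sum(sizes)
    N = len(sizes)
    if not 1 <= R <= X:
        raise ValueError("R must lie between 1 and X")
    # Integer nearest to X/n; Python rounds exact halves to the even integer,
    # which is an assumption here because the text leaves ties unspecified.
    K = round(X / n)
    cum = list(accumulate(sizes))  # C_1, ..., C_N (C_0 = 0 is implicit)
    selected = []
    for r in range(n):
        a = (R + r * K) % X
        if a == 0:
            unit = N  # the text assigns a_r = 0 to unit N
        else:
            # Smallest i with a <= C_i, so that C_{i-1} < a <= C_i.
            unit = bisect_left(cum, a) + 1
        selected.append(unit)
    return selected


sizes = [2, 3, 5]  # made-up size measures, X = 10
print(css_pps_sample(sizes, n=2, R=1))
print(css_pps_sample(sizes, n=2, R=10))

# In practice R would be drawn at random between 1 and X:
rng = random.Random(7)  # arbitrary seed
print(css_pps_sample(sizes, n=2, R=rng.randint(1, sum(sizes))))
```

With sizes 2, 3, 5 and $n = 2$, the interval is $K = 5$; a start of $R = 1$ gives $a_0 = 1$ and $a_1 = 6$, which fall in the ranges of units 1 and 3 respectively.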
