4.2.3 The Cost Model

The most frequently cited survey cost model is linear. For example, the cost model for a two-stage design with a total cost of C can be written as

C = C0 + C1·n1 + C2·n2 (4.5)

where C0 denotes the fixed overhead cost, C1 denotes the per-unit sampling or operation cost for the n1 primary sampling units (PSUs; e.g., housing units), and C2 denotes the per-unit variable cost for the n2 secondary sampling units (e.g., subjects).
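The linear model in (4.5) can be evaluated directly. The following is a minimal Python sketch; the function name `linear_survey_cost` and all dollar figures are hypothetical values chosen for illustration, not figures from the source.

```python
def linear_survey_cost(c0, c1, n1, c2, n2):
    """Total cost under the linear two-stage model (4.5): C = C0 + C1*n1 + C2*n2."""
    return c0 + c1 * n1 + c2 * n2


# Hypothetical inputs: 10,000 overhead, 200 per PSU for 100 PSUs,
# and 50 per subject for 1,000 subjects.
total = linear_survey_cost(c0=10_000, c1=200, n1=100, c2=50, n2=1_000)
print(total)  # 80000
```

Because C0 does not depend on n1 or n2, it shifts the total but has no effect on how a budget is split between PSUs and subjects.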

Overhead cost is the fixed cost of conducting a survey, regardless of the number of PSUs and sampled subjects. Sampling/operation cost covers items such as implementing the sample design (e.g., mapping and listing housing units), training field representatives, and deploying field staff. Variable cost is the cost that increases with the sample size at each sampling stage. Examples include hours, miles, and other expenses related to locating, contacting, and (when possible) interviewing sampled units.

As formula (4.5) shows, sampling/operation costs and variable costs are frequently considered separately. Sampling cost is sometimes included in the overhead cost C0, in which case it does not affect allocation.

Depending on the survey design, operation cost may or may not increase with the sample size. For example, operation cost increases discontinuously in a household face-to-face interview survey that requires a fixed ratio of field supervisors to field representatives. When the sample size increases enough that an extra field representative must be hired, the hire pushes the number of representatives per supervisor above the required ratio, which in turn necessitates hiring another supervisor; in this situation the operation cost increases with the sample size. On the other hand, if the increase in sample size fits into the current field representatives’ workload, the operation cost does not increase. Variable cost (C2) is the cost that by definition links to the direct cost of

Note that the cost model in (4.5) implies that all subjects, regardless of their characteristics and response propensities, have the same variable cost, C2. This assumption often does not hold. Groves (2004) pointed out evidence that interviewer efficiency increases with the number of interviews completed, so the cost per sampled subject is a decreasing function of the sample size within an interviewer's assignment. The model in (4.5) also does not account for the discontinuous increase in operation cost described in the example above.
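The discontinuous operation cost that (4.5) omits can be illustrated with a step function. The sketch below is only one possible formalization: the workload per field representative, the supervisor ratio, and the cost figures are all hypothetical assumptions, and the ceiling-based staffing rule is ours rather than the source's.

```python
import math


def operation_cost(n, cases_per_rep=50, reps_per_supervisor=5,
                   rep_cost=3_000, supervisor_cost=8_000):
    """Step-function operation cost for a sample of n cases.

    Assumed (hypothetical) staffing rule: each field representative handles
    up to `cases_per_rep` cases, and each supervisor oversees up to
    `reps_per_supervisor` representatives.
    """
    reps = math.ceil(n / cases_per_rep)
    supervisors = math.ceil(reps / reps_per_supervisor)
    return reps * rep_cost + supervisors * supervisor_cost


# 240 -> 250 cases fits the current staff's workload, so the cost is unchanged.
print(operation_cost(240), operation_cost(250))  # 23000 23000
# The 251st case requires a sixth representative and hence a second supervisor.
print(operation_cost(251))  # 34000
```

Under these assumptions the cost is flat while added cases fit the existing workload and jumps when a new representative, and with it a new supervisor, is required.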

In practice, survey costs are affected by response propensity, in addition to the sample design, operational plan, and sample size, because cases with lower response propensity are likely to incur higher cost. A cost model therefore needs to account for the differential sampling and operation costs of subjects with lower and higher response propensities, and to allow a fair comparison of the variable cost for these subjects. We focus on survey cost before nonresponse follow-up, because the adaptive sampling designs improve respondent representativeness by oversampling under-represented subjects rather than by conducting nonresponse follow-up. Conducting nonresponse follow-up increases the inferential complexity (Brick, 2013).

We propose a cost model that specifies individual-level costs and reflects the differences between subjects with lower (likely nonrespondent) and higher (likely respondent) response propensities. For example, the sampling cost incurred by a likely nonrespondent may include the use of a special frame, sampling from areas with strong minority concentrations, sub-sampling within households, and screening (extra screener households). The operation cost incurred by likely nonrespondents may include allocating field representatives to rural or inner-city areas and special training for field representatives in communication skills. The variable cost for a likely nonrespondent is higher, perhaps because contact is harder to obtain (requiring more visits) and the probability of rejection is higher (requiring a more experienced interviewer and costlier data collection protocols).

Adaptive sampling designs oversample subjects with lower response propensity, incurring higher sampling, operation, and variable costs. If the overhead cost of conducting a survey is comparable between fixed and adaptive sampling designs, we can leave it out of the cost model for our purpose. Let the total cost of subject i, excluding overhead cost, be written as

Ci = Ai + Bi (4.6)

where Ai denotes the sampling/operation cost and Bi denotes the variable cost. Formula (4.6) implies that each subject has their own sampling/operation and variable costs, depending on their characteristics. Under this model, the total cost is higher for a sample with more likely nonrespondents than for a sample with fewer.
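To make the consequence of (4.6) concrete, the following Python sketch sums Ci = Ai + Bi over two samples of equal size. The per-subject costs and the sample compositions are hypothetical numbers chosen only to illustrate the comparison.

```python
def total_cost(subjects):
    """Sum Ci = Ai + Bi over subjects; each subject is an (A_i, B_i) pair."""
    return sum(a + b for a, b in subjects)


# Hypothetical (A_i, B_i) pairs: likely nonrespondents cost more on both terms.
LIKELY_RESPONDENT = (20.0, 30.0)
LIKELY_NONRESPONDENT = (35.0, 60.0)


def make_sample(n, n_likely_nonrespondents):
    """Build a sample of n subjects with the given number of likely nonrespondents."""
    return ([LIKELY_NONRESPONDENT] * n_likely_nonrespondents
            + [LIKELY_RESPONDENT] * (n - n_likely_nonrespondents))


fixed_sample = make_sample(100, 20)     # 20 likely nonrespondents
adaptive_sample = make_sample(100, 40)  # oversamples likely nonrespondents

print(total_cost(fixed_sample))     # 5900.0
print(total_cost(adaptive_sample))  # 6800.0
```

With the same sample size, the sample containing more likely nonrespondents has the higher total cost, which is the behavior the model is meant to capture.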

We assume that (a) subject i incurs the variable cost Bi regardless of the sample design, and (b) subject i incurs the sampling/operation cost Ai regardless of the sample design. That is, if subjects i and j have the same response propensity, with subject i in the fixed sampling design and subject j in the adaptive sampling design, they incur the same total cost, Ci = Cj.

We can simplify the comparison by modeling the cost as inversely proportional to the response propensity. That is,

Ci ∝ 1/pi (4.7)

where pi is the response propensity for subject i. Formula (4.7) takes into account the differential cost between likely respondents and likely nonrespondents. This cost model is subject-specific, stochastic, and simple.
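As a rough illustration of (4.7), the sketch below compares the relative cost of two samples by summing k/pi over subjects. The proportionality constant `k`, the propensity values, and the sample compositions are hypothetical; in practice the propensities would be estimated rather than known.

```python
def relative_cost(propensities, k=1.0):
    """Relative total cost under C_i proportional to 1/p_i (formula 4.7).

    `k` is a hypothetical proportionality constant; it cancels out whenever
    two designs are compared by the ratio of their relative costs.
    """
    if any(not 0.0 < p <= 1.0 for p in propensities):
        raise ValueError("response propensities must lie in (0, 1]")
    return sum(k / p for p in propensities)


# Hypothetical propensities: 0.8 for likely respondents, 0.4 for likely nonrespondents.
fixed = [0.8] * 80 + [0.4] * 20
adaptive = [0.8] * 60 + [0.4] * 40  # oversamples low-propensity subjects

ratio = relative_cost(adaptive) / relative_cost(fixed)
print(f"adaptive / fixed relative cost: {ratio:.3f}")
```

Because only the ratio between designs is of interest here, the unknown constant of proportionality does not need to be specified.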

In document Adaptive Design to Adjust for Unit Nonresponse Using an External Micro-level Benchmark (Page 119-122)