7 The Survey and its analysis
7.8 Model based approach to estimation
7.135. The model based approach is the main alternative to the design based approach that was discussed in section 7.4.7 and appendix A. The model based approach, which is discussed in appendix B, is based on the idea that data obtained from the survey is the realization of some unknown data generating process. The latter is understood to be a set of rules that describe how the data is generated. The modeler’s task is to postulate candidate models that are viewed as approximations to the data generating process. These models are then estimated and evaluated in terms of their fit to the data.
7.136. The difference between the model based method and the sample based (that is, design based) method relates to the information that is used in the estimation process. In the sample based approach only information about the sample design and the probability of selection is used to construct the estimates. Thus any prior information possessed by the investigator is incorporated into the structure of the questionnaire and the stratification of the sample. For example, if responses are thought to vary by size of business, industry and region then that information is incorporated into the survey design by stratifying the sample on these features.
7.137. In the model based approach the prior information used relates to knowledge of the data generating process. In the case of the models set out in appendix B that knowledge is, firstly, that the means and variances of responses may differ by industry, region and business size and, secondly, that deviations from the expected employment response are independently and identically distributed. This information is used to obtain estimates of the relevant means and variances that satisfy one or more estimation principles. The estimated mean for each cell is then factored up to make statements about the population by multiplying it by the number of population units that occur in the cell.
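The factoring-up step can be illustrated with a minimal sketch. It assumes that each cell is an (industry, size) pair and that the population count for each cell is known; the function name, the cell labels and the figures are hypothetical and are not taken from the survey.

```python
from collections import defaultdict


def cell_expansion_estimate(responses, population_counts):
    """Estimate a population total from cell means.

    responses: iterable of (cell, value) pairs observed in the survey.
    population_counts: dict mapping each cell to the number of
        population units in that cell (assumed known).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cell, value in responses:
        sums[cell] += value
        counts[cell] += 1

    total = 0.0
    for cell, n_pop in population_counts.items():
        if counts[cell] == 0:
            # A cell mean cannot be estimated without respondents.
            raise ValueError(f"no respondents in cell {cell!r}")
        # Cell mean multiplied by the number of population units in the cell.
        total += n_pop * sums[cell] / counts[cell]
    return total


# Hypothetical data: two cells, three respondents.
responses = [
    (("retail", "small"), 4.0),
    (("retail", "small"), 6.0),
    (("mining", "large"), 100.0),
]
population_counts = {("retail", "small"): 200, ("mining", "large"): 10}
estimate = cell_expansion_estimate(responses, population_counts)
```

In this sketch a cell with no respondents raises an error rather than being silently skipped, since the text gives no rule for filling empty cells.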
7.138. Because the model based approach breaks the link between the weights and the probability of selection it can be used to make statements about the population even when the probability that a particular unit record is selected into the data set is not known. This means that the model based approach remains valid in the presence of non response. See Lohr (1999) for a further discussion of the model based approach to the analysis of surveys.
7.139. Among economists the model based approach is a widely used method of making statements about the population using data on unit records collected from the population.
7.140. The model based approach produces estimates of the population quantities in the following way. Let $y_j$ be the variable of interest and $x_j$ be characteristics of the respondent. The modeler specifies, estimates and tests models $y_j = f(x_j, \alpha)$ that relate the variable of interest to the observed characteristics and the parameters $\alpha$. Let $f^*(x_j, \alpha)$ be the model that is thought to provide the best fit to the data and let $\hat{\alpha}$ represent the estimated parameters of the model. Then, for each unit record in the data, the model yields a predicted value $\hat{y}_j$, which is obtained as follows,
$$\hat{y}_j = f^*(x_j, \hat{\alpha})$$
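As a concrete illustration of the fit-then-predict step, the sketch below assumes the simplest possible candidate model: a straight line in a single characteristic, estimated by ordinary least squares. The linear form, the function names and the data are assumptions made for illustration only; the models actually proposed are those in appendix B.

```python
def fit_linear(xs, ys):
    """Ordinary least squares for y = a0 + a1 * x; returns (a0, a1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a1 = sxy / sxx
    a0 = y_bar - a1 * x_bar
    return a0, a1


def predict(x, alpha_hat):
    """Predicted value for one unit record given estimated parameters."""
    a0, a1 = alpha_hat
    return a0 + a1 * x


# Hypothetical unit records lying exactly on y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

alpha_hat = fit_linear(xs, ys)                   # estimated parameters
y_hat = [predict(x, alpha_hat) for x in xs]      # one prediction per record
```

Any other model could be substituted for `fit_linear` and `predict`, provided it returns one predicted value per unit record.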
7.141. Weights $w_j$ that indicate how many businesses in the population each business in the sample stands for are obtained from a source such as the Australian Bureau of Statistics Business Registrar. The weights here do not rely on being interpreted as probabilities of selection, and this means that non response does not cause a bias in the estimates of the population quantities. With these methods there is also no requirement that the data be drawn from a random sample, although the latter feature can help in satisfying the assumptions of certain models.
7.142. The estimate of the population quantity of interest is then obtained as the weighted sum of the predicted values $\hat{y}_j$. That is,
$$Y = \sum_{j=1}^{N} w_j \hat{y}_j \tag{7.3}$$
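Equation (7.3) amounts to a single weighted sum. The sketch below uses hypothetical weights and predicted values; the function name is an assumption.

```python
def population_estimate(weights, predictions):
    """Weighted sum of predicted values, as in equation (7.3)."""
    if len(weights) != len(predictions):
        raise ValueError("weights and predictions must have equal length")
    return sum(w * y for w, y in zip(weights, predictions))


# Hypothetical weights (population businesses per unit record)
# and predicted values for three unit records.
weights = [50.0, 50.0, 10.0]
predictions = [4.0, 6.0, 100.0]
Y = population_estimate(weights, predictions)
```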
7.143. The variance of the predicted value can be obtained from the covariance matrix of the estimated parameters using the approach described in appendix B. Let $\hat{\tau}_j^2$ represent the estimated variance of $\hat{y}_j$; then the variance of $Y$ is obtained as follows,
$$\mathrm{Var}(Y) = \sum_{j=1}^{N} w_j^2 \hat{\tau}_j^2 \tag{7.4}$$
7.144. Here we have implicitly used the assumption that $y_j$ and $y_k$ are independent in order to estimate the variance. This assumption is guaranteed by design if the data is collected by a random sample.
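Under the independence assumption just stated, the variance in equation (7.4) contains no cross terms and reduces to a sum of squared weights times the per-record prediction variances. The sketch below uses hypothetical weights and variances; in practice the variances would come from the covariance matrix of the estimated parameters, as described in appendix B.

```python
import math


def variance_of_estimate(weights, pred_variances):
    """Variance of the weighted sum, as in equation (7.4).

    Assumes the unit records are independent, so covariance terms
    between records are taken to be zero.
    """
    if len(weights) != len(pred_variances):
        raise ValueError("weights and variances must have equal length")
    return sum(w ** 2 * t2 for w, t2 in zip(weights, pred_variances))


# Hypothetical weights and estimated prediction variances.
weights = [50.0, 50.0, 10.0]
tau2_hat = [0.25, 0.25, 1.0]

var_Y = variance_of_estimate(weights, tau2_hat)
se_Y = math.sqrt(var_Y)  # standard error of the population estimate
```

If the records were correlated, covariance terms of the form $2 w_j w_k \mathrm{Cov}(\hat{y}_j, \hat{y}_k)$ would have to be added, which this sketch deliberately omits.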
7.145. Two models that could be used in further work with this data are described in appendix B.
Bibliography
AAPOR (2000), Standard Definitions: Final Dispositions of Case Codes and Outcome Rates for Surveys, AAPOR, Lenexa Kansas.
Akerlof, G. & Yellen, J. (1986), Efficiency wage models of the labor market, Cambridge University Press, New York.
Bates, N. & Dixon, J. (2003), WebCATI and AAPOR response rates. Memorandum, US Bureau of the Census.
Card, D. & Krueger, A. B. (1995), Myth and Measurement: The New Economics of the Minimum Wage, Princeton University Press, Princeton, New Jersey.
Katz, L. (1986), ‘Efficiency wage theories: a partial evaluation’, NBER Macroeconomics Annual 1, 235-276.
Lohr, S. L. (1999), Sampling Design and Analysis, 1st edn, Duxbury Press, Arizona.
Manning, A. (2002), Monopsony in Motion: Imperfect Competition in Labour Markets, Princeton University Press, Princeton, New Jersey.