
7 The Survey and its analysis

7.8 Model based approach to estimation

7.135. The model based approach is the main alternative to the design based approach that was discussed in section 7.4.7 and appendix A. The model based approach, which is discussed in appendix B, is based on the idea that the data obtained from the survey is the realization of some unknown data generating process. The latter is understood to be a set of rules that describe how the data is generated. The modeler’s task is to postulate candidate models that are viewed as approximations to the data generating process. These models are then estimated and evaluated in terms of their fit to the data.

7.136. The difference between the model based method and the design based method relates to the information that is used in the estimation process. In the design based approach only information about the sample design and the probability of selection is used to construct the estimates. Thus any prior information possessed by the investigator is incorporated into the structure of the questionnaire and the stratification of the sample. For example, if responses are thought to vary by size of business, industry and region then that information is incorporated into the survey design by stratifying the sample on these features.

7.137. In the model based approach the prior information used relates to knowledge of the data generating process. In the case of the models set out in Appendix B that knowledge is, firstly, that means and variances of responses may differ by industry, region and business size and, secondly, that deviations from the expected employment response are independently and identically distributed. This information is used to obtain estimates of the relevant means and variances that satisfy one or more estimation principles. The estimated mean for each cell is then factored up to make statements about the population by multiplying it by the number of units in the population that fall in the cell.
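The factoring-up step can be illustrated with a minimal Python sketch. The function name `cell_mean_total`, the cell labels and all of the numbers below are hypothetical and are not taken from the survey; the sketch only assumes that each response is tagged with its cell and that a population count is available for every cell.

```python
from collections import defaultdict


def cell_mean_total(responses, population_counts):
    """Scale each cell's sample mean by the cell's population count and sum.

    responses: iterable of (cell, y) pairs, one per responding unit.
    population_counts: dict mapping cell -> number of units in the population.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cell, y in responses:
        sums[cell] += y
        counts[cell] += 1

    total = 0.0
    for cell, n_pop in population_counts.items():
        if counts[cell] == 0:
            # A cell mean cannot be estimated without at least one respondent.
            raise ValueError(f"no respondents in cell {cell!r}")
        total += n_pop * (sums[cell] / counts[cell])
    return total


# Hypothetical cells defined by (industry, business size).
responses = [
    (("retail", "small"), 4.0),
    (("retail", "small"), 6.0),
    (("mining", "medium"), 20.0),
]
population_counts = {("retail", "small"): 100, ("mining", "medium"): 10}

estimate = cell_mean_total(responses, population_counts)
```

Raising an error for an empty cell is a design choice of this sketch; in practice sparse cells might instead be merged with neighbouring cells.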

7.138. Because the model based approach breaks the link between the weights and the probability of selection, it can be used to make statements about the population even when the probability that a particular unit record is selected into the data set is not known. This means that the model based approach remains valid in the presence of non response. See Lohr (1999) for a further discussion of the model based approach to the analysis of surveys.

7.139. Among economists the model based approach is a widely used method of making statements about the population using data on unit records collected from the population.

7.140. The model based approach produces estimates of the population quantities in the following way. Let $y_j$ be the variable of interest and $x_j$ be characteristics of the respondent. The modeler specifies, estimates and tests models $y_j = f(x_j, \alpha)$ that relate the variable of interest to the observed characteristics and the parameters $\alpha$. Let $f^*(x_j, \alpha)$ be the model that is thought to provide the best fit to the data and $\hat{\alpha}$ represent the estimated parameters of the model. Then, for each unit record in the data the model yields a predicted value $\hat{y}_j$ which is obtained as follows,

$$\hat{y}_j = f^*(x_j, \hat{\alpha})$$
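As an illustration of the estimate-then-predict step, the following Python sketch fits a straight line by ordinary least squares and evaluates it at each observed characteristic. The linear form of the model, the single scalar characteristic and the data are assumptions made purely for the example; the models actually proposed for this survey are those in appendix B. The names `fit_linear` and `predict` are hypothetical.

```python
def fit_linear(xs, ys):
    """Ordinary least squares for y = a0 + a1 * x; returns (a0, a1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(x, alpha_hat):
    """Evaluate the fitted model at x using the estimated parameters."""
    a0, a1 = alpha_hat
    return a0 + a1 * x


# Hypothetical unit records: x is a single characteristic, y the response.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

alpha_hat = fit_linear(xs, ys)                    # estimated parameters
y_hat = [predict(x, alpha_hat) for x in xs]       # one prediction per record
```

The example data lie exactly on a line, so the predictions reproduce the responses; with real survey data the predictions would differ from the observed values.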

7.141. Weights $w_j$ that represent how many businesses in the population each business in the sample stands for are obtained from a source such as the Australian Bureau of Statistics Business Register. The weights here do not rely on being interpreted as probabilities of selection, and this means that non response does not cause a bias in the estimates of the population quantities. With these methods there is also no requirement that the data be drawn from a random sample, although the latter feature can help in satisfying the assumptions of certain models.

7.142. The estimate of the population quantity of interest is then obtained as the weighted sum of the predicted values $\hat{y}_j$. That is

$$Y = \sum_{j=1}^{N} w_j \hat{y}_j \tag{7.3}$$
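Equation (7.3) amounts to a single weighted sum, which can be sketched in Python as follows. The function name `population_total`, the weights and the predicted values are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
def population_total(weights, predictions):
    """Weighted sum of predicted values, as in equation (7.3)."""
    if len(weights) != len(predictions):
        raise ValueError("weights and predictions must have the same length")
    return sum(w * y for w, y in zip(weights, predictions))


# Hypothetical weights (population businesses represented by each record)
# and hypothetical predicted values for three unit records.
weights = [50.0, 30.0, 20.0]
predictions = [3.0, 5.0, 7.0]

total = population_total(weights, predictions)  # 150 + 150 + 140
```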

7.143. The variance of the predicted value can be obtained from the covariance matrix of the estimated parameters using the approach described in appendix B. Let $\hat{\tau}_j^2$ represent the estimated variance of $\hat{y}_j$; then the variance of $Y$ is obtained as follows,

$$\mathrm{Var}(Y) = \sum_{j=1}^{N} w_j^2 \hat{\tau}_j^2 \tag{7.4}$$
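A short Python sketch of equation (7.4) is given below. It takes the per-record variances as given, since their calculation is described in appendix B and is not reproduced here. The function name `total_variance` and all numbers are hypothetical.

```python
import math


def total_variance(weights, pred_variances):
    """Sum of squared weights times prediction variances, as in equation (7.4)."""
    if len(weights) != len(pred_variances):
        raise ValueError("weights and variances must have the same length")
    return sum(w ** 2 * t2 for w, t2 in zip(weights, pred_variances))


# Hypothetical weights and hypothetical estimated variances of the predictions.
weights = [50.0, 30.0, 20.0]
tau2_hat = [0.04, 0.09, 0.25]

variance = total_variance(weights, tau2_hat)
std_error = math.sqrt(variance)  # standard error of the estimated total
```

Because the weights enter as squares, a record with a large weight and an imprecise prediction dominates the variance of the total.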

7.144. Here we have implicitly used the assumption that $y_j$ and $y_k$ are independent in order to estimate the variance. This assumption is guaranteed by design if the data is collected by a random sample.
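The role of the independence assumption can be seen by writing out the variance of the weighted sum without it. This is the standard expansion for the variance of a linear combination, stated here in the notation of equations (7.3) and (7.4); it is offered as an illustration rather than as part of the report's derivation.

$$\mathrm{Var}\Big(\sum_{j=1}^{N} w_j \hat{y}_j\Big) = \sum_{j=1}^{N} w_j^2\,\mathrm{Var}(\hat{y}_j) + \sum_{j \neq k} w_j w_k\,\mathrm{Cov}(\hat{y}_j, \hat{y}_k)$$

If the terms are independent, every covariance in the second sum is zero, and replacing $\mathrm{Var}(\hat{y}_j)$ with its estimate $\hat{\tau}_j^2$ gives equation (7.4).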

7.145. Two models that could be used in further work with this data are described in appendix B.

Bibliography

AAPOR (2000), Standard Definitions: Final Dispositions of Case Codes and Outcome Rates for Surveys, AAPOR, Lenexa, Kansas.

Akerlof, G. & Yellen, J. (1986), Efficiency Wage Models of the Labor Market, Cambridge University Press, New York.

Bates, N. & Dixon, J. (2003), Webcati and aapor response rates. Memorandum, US Bureau of the Census.

Card, D. & Krueger, A. B. (1995), Myth and Measurement: The New Economics of the Minimum Wage, Princeton University Press, Princeton, New Jersey.

Katz, L. (1986), ‘Efficiency wage theories: a partial evaluation’, NBER Macroeconomics Annual 1, 235-276.

Lohr, S. L. (1999), Sampling: Design and Analysis, 1st edn, Duxbury Press, Arizona.

Manning, A. (2002), Monopsony in Motion: Imperfect Competition in Labour Markets, Princeton University Press, Princeton, New Jersey.

Appendix A

Statistical properties of design based

In document Minimum wages in Australia: an analysis of the impact on small and medium sized businesses (Page 135-142)