Using list frames to build and use Master Sampling Frames
5.2.2. Using agricultural censuses to build Master Sampling Frames
The process of building an MSF from an agricultural census applies when the census is conducted as a complete enumeration. The approach is very similar to that used for a population census that includes additional items on agriculture. When the census is conducted on a sample basis, the list of areas from which the sample was selected is the only available frame that covers the whole country. In these cases, only the areas sampled for the census have auxiliary information, which can be used to design samples for agricultural surveys based on a sub-sample of the sample-based census.
However, more relevant auxiliary information can be obtained and used when two-stage sample designs are applied; data from agricultural censuses can significantly improve sample design. In particular, better account can be taken of rare items or geographically concentrated activities. Master samples can be developed which can be used to select subsamples for other surveys. Census data can also be used as benchmarks for forthcoming surveys.
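One way census data can feed a two-stage design is by supplying a measure of size for the first-stage units. The following is a minimal sketch, not a prescribed method: it assumes the first-stage units are enumeration areas (EAs), that the census count of holdings per EA is used as the size measure, and that a master sample is drawn by systematic PPS and later subsampled for an individual survey. The function names and the example figures are hypothetical.

```python
import random


def pps_systematic(sizes, n, seed=None):
    """Select n units with probability proportional to size (systematic PPS).

    sizes: dict mapping unit id -> positive size measure (assumed here to be
    the census count of holdings in each enumeration area).
    A unit whose size exceeds the sampling interval can be hit more than once.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    rng = random.Random(seed)
    total = sum(sizes.values())
    step = total / n                    # sampling interval
    start = rng.random() * step         # random start in [0, step)
    targets = [start + k * step for k in range(n)]

    selected, cumulative, t = [], 0.0, 0
    for unit, size in sizes.items():
        cumulative += size
        # The unit is selected once for each target inside its size interval.
        while t < n and targets[t] < cumulative:
            selected.append(unit)
            t += 1
    return selected


def subsample(master_sample, k, seed=None):
    """Draw a simple random subsample of k units from the master sample."""
    return random.Random(seed).sample(master_sample, k)


# Hypothetical census counts of holdings per enumeration area.
census_holdings = {"EA1": 120, "EA2": 45, "EA3": 300,
                   "EA4": 80, "EA5": 210, "EA6": 60}
master = pps_systematic(census_holdings, n=2, seed=1)
survey_eas = subsample(master, k=1, seed=2)
```

In this sketch, larger EAs are more likely to enter the master sample, which is one way of taking better account of geographically concentrated activities; stratifying on census variables before selection would be another.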
Lists from an agricultural census provide excellent auxiliary information for sampling design purposes: ratio and regression estimators can be used, because there is enough information on variables at the population level to compute ratios or regressions between sample observations and population values; stratification and PPS sampling are also facilitated. The same is true of directories of households involved in agriculture from population censuses containing an agricultural module. The problem with both types of census is that the data become outdated because of the long intervals between collections. In some cases, several years may pass before the census data become available, which makes them obsolete before they are even disseminated.
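As an illustration of how a population-level census value enters estimation, the sketch below computes a simple ratio estimate of a total. It assumes an equal-probability sample, an auxiliary variable x known for every sampled unit from the census list, and a known census total X for the whole population; the function name and the figures are hypothetical.

```python
def ratio_estimate_total(sample_y, sample_x, population_x_total):
    """Ratio estimator of a population total: Y_hat = (sum(y) / sum(x)) * X.

    sample_y: survey values observed on the sampled units
    sample_x: census (auxiliary) values for the same units, in the same order
    population_x_total: census total of the auxiliary variable
    """
    if len(sample_y) != len(sample_x):
        raise ValueError("sample_y and sample_x must have the same length")
    sum_x = sum(sample_x)
    if sum_x == 0:
        raise ValueError("the auxiliary values must not sum to zero")
    ratio = sum(sample_y) / sum_x       # estimated ratio of y to x
    return ratio * population_x_total   # scale by the known census total


# Hypothetical figures: census crop area (x) and surveyed production (y).
estimate = ratio_estimate_total(
    sample_y=[4.0, 11.0, 6.0],
    sample_x=[2.0, 5.0, 3.0],
    population_x_total=1000.0,
)
```

The estimator gains precision only to the extent that the survey variable is still closely related to the census variable, which is why the obsolescence of census data matters.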
Section 5.4 provides some guidance on updating MSFs based on lists from population or agricultural censuses.
5.2.3. Using business registers of farms to build a Master Sampling Frame
As specified earlier, “a basic sample frame for agricultural statistics is a listing of the units from which the sample is to be selected at any stage of sampling”. Therefore, the quality of the frame will depend on how well it covers all population units; the goal of the statistician is to maximize coverage and, if possible, provide measurements of under-coverage.
FAO (2005) distinguishes two categories of agricultural holdings: (i) holdings in the household sector and (ii) holdings in the non-household sector.
A distinctive feature of agriculture in developing and developed countries is the respective importance of these two categories in the agriculture sector. In developed countries, the non-household sector tends to be the most important; in most developing countries, the contrary is true: the household sector is the most important sector for agriculture, with a limited number of non-household holdings. However, as economies develop, the non-household sector becomes increasingly important.
Population and agricultural censuses should provide information on both the household and non-household sectors, to include agricultural production on farms or holdings that are not associated with a household.
When the information available from censuses cannot provide accurate registers of farms in the non-household sector, other sources of information must be found to identify these units and complement the household-based holdings, if complete coverage of the agricultural sector is to be achieved.
In some developed countries, particularly those of the Nordic region, farm registers play an important part in agricultural statistics. In Benedetti et al. (2010), Anders and Britt Wallgren provide a detailed description of how administrative registers can be used to create farm registers for multiple purposes in agricultural statistics, including (i) direct tabulation to provide estimates and (ii) building sampling frames for sample surveys. More generally, a review of the use of administrative data to improve official statistics in developed countries identifies the following four areas:
• direct tabulation of statistical registers
• reduction of data collection costs
• use of improved estimators
• input for frame construction and sampling design.
The literature provides detailed explanations of the advantages, weaknesses and limitations of using farm registers in agricultural statistics, but these focus mainly on developed countries.
In most developing countries, the non-household sector is composed of several types of units, including large corporations, government-operated holdings, cooperatives, large plantations, large livestock units, etc. There is no standard method for approaching all these units and obtaining a perfect list. In practice, all relevant registers should be considered when building a master list of all holdings in the non-household sector. These may include:
• administrative registers of corporations operating agricultural holdings (business registration/licensing registers)
• land registration/cadastral records
• lists of members of agricultural cooperatives
• lists of members of farmers’ associations or special commodity boards (for coffee, cocoa, tea, etc.)
• local knowledge and information from extension agents and local authorities about large specialty-type farms.
Most of these individual lists are likely to be affected by frame imperfections. These imperfections are not limited to the case of business registers, but they could be exacerbated in this context. The major risks are:
a. Coverage errors
When analysing and integrating individual lists, care should be taken to ensure that all units of interest are included, and that only these units are included, to minimize under-coverage and over-coverage. In practice, there tends to be a limited number of units in most developing countries, and they are usually visible and well known. Nevertheless, coverage may be an important issue when using frames based on farm registers. For example, farmers’ associations generally bring together farmers that produce particular commodities, such as a “rice producers’ association”, a “banana producers’ association” or an “association of dairy producers”. As group membership is voluntary, lists from such sources are usually not exhaustive, and other sources are required to complete the frame. Their use as (partial) sampling frames is nonetheless advantageous because the associations tend to update their lists frequently; in addition, linking the actual farm to the farmer in the list is a straightforward operation. The main handicap of lists from farmers’ associations is their incompleteness, and the need to complement them with other sources. When combining lists from separate sources, care must be taken to avoid introducing duplicates into the combined list.
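The combination step can be sketched as follows. This is only an illustration under a strong assumption: that every source identifies a holding by a shared identifier (here a hypothetical registration number), which is often not the case in practice. The function name, source names and identifiers are all invented for the example.

```python
def combine_lists(sources):
    """Merge several lists into one frame with a single record per unit.

    sources: dict mapping source name -> iterable of unit identifiers
    (assumed to be shared across sources, e.g. a registration number).
    Returns a dict mapping unit id -> sorted list of sources listing it.
    """
    frame = {}
    for source_name, unit_ids in sources.items():
        for unit_id in unit_ids:
            # setdefault keeps one entry per unit, so no duplicate is added.
            frame.setdefault(unit_id, set()).add(source_name)
    return {unit_id: sorted(names) for unit_id, names in frame.items()}


# Hypothetical membership lists from two voluntary associations.
combined = combine_lists({
    "rice_association": ["F001", "F002"],
    "cooperative": ["F002", "F003"],
})
# Units listed by only one source hint at how incomplete each list is.
single_source = [u for u, names in combined.items() if len(names) == 1]
```

Recording which sources list each unit also gives a rough indication of overlap between lists, which can help in judging where additional sources are needed.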
In broad terms, the use of land records (cadastral registers) is favoured: these provide complete map coverage of the land (usually in digital form within a GIS), which facilitates identification of the piece of land on which the unit is located. However, the ancillary information is usually limited to the total area of the cadastral parcel.
b. Errors due to misclassification
Another risk concerns the accurate classification of the frame units, that is, whether the units are effectively members of the target population. This issue is related to the definition of the unit as adopted in agricultural censuses and surveys, which may differ from the definition adopted in various registers. Corporations and government institutions may have complex structures, in which different activities are undertaken by different parts of the organization with a varying degree of autonomy regarding management decisions. FAO (2005) recommends that the National Account concept of establishment should be used, where “an establishment is an economic unit engaged in one main production activity operation in a single location”. Land registry may be based on land ownership instead of the name of the holder effectively operating the holding. In addition, cadastral parcels are defined differently from agricultural land parcels, and linking the two may be difficult. Every effort should be made to ensure that the units in the registers correspond to the agricultural holding.
c. Duplication
The other risk is that of duplication, “when a population unit is represented by more than one frame unit”.¹⁴ Here, too, every effort should be made to identify and reduce duplication.
Therefore, the master list of holdings in registers should be prepared by cross-checking and triangulating the information from the various registers, to minimize these main risks and to provide an acceptable complement to the household sector frame, so that a Master Frame with good coverage can be built.
In the case of registers from farmers’ associations, the same individual may (and usually does) appear with different names in different lists. Experience shows that the matching of names from different lists is extremely difficult. Another disadvantage of lists from farmers’ associations is that they do not contain enough ancillary information to improve the sampling estimates.
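To illustrate why name matching is hard and how candidate duplicates might at least be flagged for manual review, the sketch below compares normalised names with the standard-library `difflib.SequenceMatcher`. It is a rough aid, not a record-linkage method endorsed by the source: the normalisation rules, the 0.85 threshold and the example names are all assumptions.

```python
import difflib
import unicodedata


def normalise(name):
    """Lower-case, strip accents and punctuation, and sort the name tokens."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(c for c in text if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in text.lower())
    return " ".join(sorted(cleaned.split()))


def candidate_duplicates(list_a, list_b, threshold=0.85):
    """Return (name_a, name_b, score) pairs whose similarity >= threshold.

    The threshold is an arbitrary assumption; flagged pairs still need
    clerical review before any record is treated as a duplicate.
    """
    pairs = []
    for a in list_a:
        for b in list_b:
            matcher = difflib.SequenceMatcher(None, normalise(a), normalise(b))
            score = matcher.ratio()
            if score >= threshold:
                pairs.append((a, b, round(score, 2)))
    return pairs


# Hypothetical entries from two association lists.
flagged = candidate_duplicates(
    ["Kamau, John M.", "Mary Wanjiru"],
    ["John Kamau", "Peter Otieno"],
)
```

Comparing every pair of names is quadratic in the list sizes, so for long lists some form of blocking (for example, comparing only within the same district) would be needed; that refinement is outside this sketch.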
d. Other issues
It is important to note that confidentiality is crucial in all statistical operations. Therefore, if tax records are used to obtain a list of farm operators, special care should be taken to ensure the confidentiality of individual information. In many regions, local authorities maintain records of farm operators and the land operated in their respective areas. Using these records as a source for building sampling frames requires a detailed, in-depth analysis of their quality. In particular, one must assess whether the source is up to date and complete, whether it has clear rules for identifying units, and whether it has the other desirable properties of a frame.