7 SAMPLE DESIGN AND ANALYSIS FOR THEMATIC MAP ACCURACY
8.3. Database Design
Building the reference database for the accuracy assessment of medium spatial resolution land cover products in Portugal involved a detailed review of the state of the art. This review guided us in defining a number of fundamental steps required to accommodate the information needed for future applications of the database. As a result, the database production process can be described as consisting of two main parts: 1) the survey sampling, namely the definition of the reference sample observation and the selection of the most adequate sampling design for collecting a geographical random sample; and 2) the so-called response design [3], which consists of evaluating the variable of interest at each sample observation, i.e. identifying reference land cover labels by visual analysis of high spatial resolution aerial imagery. Next we detail each of these components, providing a synopsis of the main aspects taken into account during database construction.
8.3.1. Reference sampling observation
The reference sample observation serves as the basic element of comparison between the land cover map classification and the reference, or “true”, classification. A sample observation can be a point or an area; the various forms, and their relative advantages and disadvantages for land cover map accuracy assessment, are examined in detail by [3]. When assessing large-area land cover cartography derived from pixel-based classification of medium spatial resolution satellite imagery, the sample observation is usually the minimum mapping unit of the map (i.e. the pixel), whose dimension is identical to the spatial resolution of the source imagery [9, 10]. Accordingly, each sample observation in the database was labeled with the most distinguishable land cover class at a specific geographical spot representing a nominal area of 300-by-300 m on the Earth’s surface. This square area was chosen to match the original purpose of the database, i.e. to evaluate the accuracy of national land cover products derived from 300 m spatial resolution MERIS Level 2 Full Resolution imagery.
8.3.2. Sampling design
The first step in this phase was to define the appropriate technique for geographically selecting the reference sample observations that constitute the reference sample. The set of rules used in this process is called the sampling design [11]. In a random sampling design the inclusion probabilities are known for all elements in the sample and are non-zero for all elements in the population [4]. A statistically rigorous random sampling design (e.g. simple random, stratified random, cluster, or systematic sampling) contributes to a scientifically defensible accuracy assessment of land cover products, and such designs must be used because of their objectivity [3]. Otherwise, overall and per-class accuracy estimates become entirely dependent on assumptions and, as such, will be very difficult to defend in a confrontational setting [4]. Basic sampling designs, such as simple random sampling, are frequently appropriate for land cover map accuracy assessment [12].
Often, however, it is impractical to follow such basic sampling procedures. Precise accuracy estimates for rare classes require specified sample sizes per land cover class, and simple random sampling does not ensure that all classes are adequately represented. Stratified random sampling must be applied when there is a need to ensure a minimum sample size in each stratum, so that precise accuracy estimates can be derived for all land cover classes present in the map [11]. Moreover, stratifying by mapped land cover class may yield more precise estimates than would be obtained from simple random or systematic sampling without stratification [4]. The added complexity is that if the design involves unequal probability sampling, e.g. stratified sampling with equal or optimal allocation, then the inclusion probabilities of the observations determined by the design must be known and incorporated in the estimation. If the inclusion probabilities are known and their presence is recognized, then the consistency criterion of statistical rigor is readily satisfied in practice by incorporating them in the accuracy estimation [3, 11, 13].
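As an illustration of why the inclusion probabilities matter, the following minimal Python sketch estimates overall accuracy from a stratified sample by weighting each observation by the inverse of its inclusion probability, and compares it with the naive unweighted proportion. The function names, the tuple layout, and the example figures are hypothetical and are not taken from the database described here.

```python
def stratified_overall_accuracy(strata):
    """Estimate overall accuracy under stratified random sampling.

    strata maps a stratum name to a tuple (N_h, n_h, correct_h):
    N_h       number of map units (pixels) in the stratum,
    n_h       number of sampled observations in the stratum,
    correct_h number of sampled observations whose map label is correct.
    """
    total_units = sum(N_h for N_h, _, _ in strata.values())
    estimated_correct = 0.0
    for N_h, n_h, correct_h in strata.values():
        # The inclusion probability is n_h / N_h, so each sampled
        # observation represents N_h / n_h map units.
        weight = N_h / n_h
        estimated_correct += weight * correct_h
    return estimated_correct / total_units


def naive_overall_accuracy(strata):
    """Unweighted proportion correct; ignores the inclusion probabilities."""
    correct = sum(c for _, _, c in strata.values())
    sampled = sum(n for _, n, _ in strata.values())
    return correct / sampled


# Hypothetical two-class map with equal allocation (100 observations each).
example = {
    "forest": (9000, 100, 90),
    "wetland": (1000, 100, 50),
}
weighted = stratified_overall_accuracy(example)
unweighted = naive_overall_accuracy(example)
```

With these invented figures the weighted estimate is 0.86 while the unweighted proportion is 0.70, because the rare class is heavily over-sampled relative to its share of the map.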
To produce a reference database large enough to ensure precise per-class accuracy estimates for all cover types present in the LANDEO nomenclature, it was necessary to select observations by stratified random sampling. The difficulty inherent to this approach was that no existing land cover map included the same classes and could therefore serve as strata. Thus, we decided to automatically produce a primary map from MERIS Level 2 Full Resolution imagery acquired in August 2005 to serve as strata for allocating the sample observations. This map was produced using a set of approximately thirty sample observations per class as training data for a Maximum Likelihood (ML) classifier. Overall map accuracy was assessed using an independent sample of approximately the same size as the training set and was estimated at about 63%.
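The selection step can be sketched as follows: given a strata map (here a small grid of class labels standing in for the primary map), draw a fixed number of cells at random, without replacement, from each mapped class. This is only an illustrative sketch; the function name, the grid representation, and the seed parameter are assumptions rather than the procedure actually used.

```python
import random
from collections import defaultdict


def stratified_sample(strata_map, n_per_class, seed=0):
    """Draw up to n_per_class cells per mapped class, without replacement.

    strata_map is a 2D list of class labels (a stand-in for the primary map).
    Returns a dict mapping each class label to a list of (row, col) cells.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible

    # Group the cell coordinates by their mapped class (the stratum).
    cells_by_class = defaultdict(list)
    for r, row in enumerate(strata_map):
        for c, label in enumerate(row):
            cells_by_class[label].append((r, c))

    sample = {}
    for label, cells in cells_by_class.items():
        # A class with fewer cells than requested is taken in full.
        k = min(n_per_class, len(cells))
        sample[label] = rng.sample(cells, k)
    return sample
```

Within a class every cell has the same inclusion probability, the sample size divided by the number of cells of that class, which is the quantity later needed in the estimation.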
The next step was to define the appropriate sample size to be collected per mapped land cover type. In general, the larger the sample size, the greater the confidence one can have in assessments based on that sample. Several authors have suggested methods and guidelines to estimate the appropriate sample size. The goal is to provide accuracy estimates with small variance at a high level of confidence. Over the years, many authors have used an equation based on the normal approximation to the binomial distribution to estimate the sample size required for the accuracy assessment of land cover maps (e.g. [14]). Reference [13] states that this approach is statistically sound for estimating the sample size needed to estimate the overall accuracy of a classification or the accuracy of a single land cover category.
An absolute precision of 10% in per-class accuracy estimation is sufficient to meet our requirements. This implies collecting 100 sample observations per class to provide that precision at the 95% confidence level, even if the per-class accuracy is 0.5, the least favorable case [15]. Thus, we guaranteed that 100 sample observations were collected per mapped class. Note that we can still use these data to estimate overall and per-class accuracies for upcoming maps with different cover types. However, the absolute precision of those estimates may exceed 10%, depending on the spatial distribution of those cover types, hereafter called domains. For a detailed description of how to estimate overall and per-class accuracies based on domain estimation, see [16].
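The figure of 100 observations can be checked with the usual sample size formula based on the normal approximation to the binomial distribution, n = z² p (1 - p) / d², where p is the anticipated accuracy, d the desired absolute precision (half-width of the confidence interval), and z the standard normal quantile for the chosen confidence level. The sketch below is a generic implementation of that formula, not necessarily the exact equation used in [14] or [15]; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size(expected_accuracy, half_width, confidence=0.95):
    """Sample size from the normal approximation to the binomial."""
    # Two-sided quantile, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = expected_accuracy
    return math.ceil(z ** 2 * p * (1 - p) / half_width ** 2)


def achieved_half_width(expected_accuracy, n, confidence=0.95):
    """Half-width of the confidence interval obtained with n observations."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = expected_accuracy
    return z * math.sqrt(p * (1 - p) / n)


# Least favorable case: per-class accuracy of 0.5, precision of 10%.
n_needed = required_sample_size(0.5, 0.10)  # 97
```

In the least favorable case the formula gives 97 observations, so collecting 100 per class keeps the half-width just under 10% (about 9.8%).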
8.3.3. Reference imagery data
Identifying the most typical cover type in each sample observation requires either the visual analysis of high spatial resolution remotely sensed imagery or a field check. The latter is time-consuming and costly, so we decided to use high spatial resolution aerial imagery to identify the most adequate land cover type at each sample observation.
Reference land cover labels were derived by visual analysis of orthorectified aerial images acquired in 2004, 2005 and 2006 and covering the whole Portuguese territory. These images have four spectral bands, in the blue, green, red, and infrared wavelengths, and a spatial resolution of 50 cm. The bounds of the sample observations, randomly selected for each cover type, were identified using a 300-by-300 m fishnet covering the entire Portuguese territory.
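To illustrate how a fishnet delimits a sample observation, the sketch below maps a projected coordinate to its 300-by-300 m cell and returns the cell bounds. It assumes coordinates in metres and a grid origin at (0, 0); the actual origin and coordinate reference system of the fishnet are not given in the text, and the function name is hypothetical.

```python
import math

CELL_SIZE_M = 300.0  # nominal size of a sample observation


def fishnet_cell(x, y, origin_x=0.0, origin_y=0.0, cell=CELL_SIZE_M):
    """Return ((col, row), (min_x, min_y, max_x, max_y)) for a point.

    x, y are projected coordinates in metres; the origin is an assumption.
    """
    col = math.floor((x - origin_x) / cell)
    row = math.floor((y - origin_y) / cell)
    min_x = origin_x + col * cell
    min_y = origin_y + row * cell
    return (col, row), (min_x, min_y, min_x + cell, min_y + cell)
```

At 50 cm resolution each such cell spans 600 by 600 aerial image pixels, which is the area an interpreter inspects for one observation.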
8.3.4. Image interpretation
This phase may be considered the most important and most delicate part of producing the reference sample database. At this stage, four image interpreters were guided in identifying the most reliable land cover type characterizing each sample observation. Reference [12] suggests that uncertainty in assigning a single class label from the available reference data can significantly influence the apparent accuracy of a classification. Thus, to avoid confounding classification and location errors, primary and alternate reference land cover labels must be identified and characterized by visual interpretation of aerial images for each reference sample observation. This process was blind, in that the interpreters had no knowledge of the primary map classification of each observation.
After image interpretation training, performed to reduce individual subjectivity in reference label assignment, the image interpreters collected primary and alternate reference land cover labels for the sample observations and their surrounding pixels. Each observation was examined by all image interpreters to reduce uncertainty whenever the label identification was not clear. When this happened, the interpreters critically examined the case together and jointly decided on the most adequate label. Additional auxiliary information, such as intra-annual time series of the Normalized Difference Vegetation Index for each sample observation, was used to complement the visual analysis of single-date aerial imagery, which is fragile on its own. Nominally scored interpretation and location confidence ratings (ICR and LCR, respectively) were also specified for each observation, as suggested by [10]. These ratings allow the use of a scale of error in the accuracy assessments rather than simple and perhaps overly severe interpretations of ground data [10]. Table 8.2 describes the label and auxiliary information collected for each sample observation recorded in the reference database.
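A record of this kind, and the way primary and alternate labels can soften the agreement rule, might be sketched as below. The field names, the integer rating scale, and the functions are hypothetical, since the actual fields are those listed in Table 8.2; the agreement rate shown is a plain unweighted proportion and does not account for the stratified design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReferenceObservation:
    obs_id: int
    primary_label: str
    alternate_label: Optional[str] = None  # absent when the label is clear
    icr: int = 1  # interpretation confidence rating (hypothetical scale)
    lcr: int = 1  # location confidence rating (hypothetical scale)


def is_correct(map_label, obs, use_alternate=True):
    """Agreement rule: match the primary or, optionally, the alternate label."""
    if map_label == obs.primary_label:
        return True
    return (
        use_alternate
        and obs.alternate_label is not None
        and map_label == obs.alternate_label
    )


def agreement_rate(map_labels, observations, use_alternate=True):
    """Unweighted share of observations whose map label agrees.

    map_labels maps obs_id to the class assigned by the map under assessment.
    """
    hits = sum(
        is_correct(map_labels[o.obs_id], o, use_alternate) for o in observations
    )
    return hits / len(observations)
```

Comparing the rate with and without the alternate label gives a simple sense of how much apparent disagreement stems from ambiguous reference labels rather than from map error.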