3.1 Reference data source
The reference data collection could rely only on existing expertise distributed around the world. The creation of an international expert network is therefore the key element of the validation process. The experts were selected according to the following criteria: undisputed expertise on landcover over relatively large areas, familiarity with interpreting remote sensing imagery, commitment, complementarity with the other experts, and membership of a well-known international network. Sixteen international experts from all over the world were invited to six different 5-day workshops hosted by UCL (Louvain-la-Neuve, Belgium). The experts truly committed themselves to building the GlobCover reference data set. Some of them could not join the working sessions but were familiar enough with the LCCS system and the validation process to complete their work remotely using the same tools.
the dominant landcover class within a 300 m GlobCover pixel, using all classes except the mosaic of cultivated and managed/natural vegetation. In addition, they were asked to indicate the percentage of the pixel covered by the chosen landcover. This could then be repeated for a total of three landcover classes. A check box was provided so that volunteers could indicate if more than three landcover types were present in the pixel. Volunteers were then asked to indicate the overall human impact for the pixel and to further indicate their overall confidence in their choice: unsure; less sure; quite sure; and sure. Finally, they were asked to record the satellite image date that was visible at the bottom of the screen, in addition to indicating if the imagery was high-resolution. In contrast to the human impact campaign, the presence of abandoned land and related confidence were not recorded (owing in part to the difficulty that participants had with this attribute). A total of 49 control pixels were independently assessed by three experts for quality assessment purposes (Data Citation 3).
The assessment of the accuracy of thematic maps, such as those depicting landcover obtained via remote sensing, has evolved considerably over the last four decades (e.g. Foody, 2002; Congalton and Green, 2009). It is now widely accepted that an accuracy assessment should be part of landcover mapping programmes. This is primarily because, without an accuracy assessment, each map produced is simply an untested hypothesis, one of many possible representations of the world which may or may not be fit for its intended purpose (Strahler et al., 2006). This is important as it is now very simple to produce thematic maps from remote sensing. Indeed there are, for example, numerous global landcover maps available, but they differ markedly in their representation, and without information on accuracy it is sometimes difficult to know which is the most suitable one to use in an application or how best to use the set (Giri et al., 2005; Jung et al., 2006; McCallum et al., 2006; Fritz and See, 2008). Critically, a map is not suited for scientific inference without a rigorous assessment of its quality, leaving the map as little more than a pretty picture (McRoberts, 2011).
4. DATA ANALYSIS AND RESULTS
4.1 Traditional Accuracy Assessment
The GL30 accuracy with respect to the DUSAF has already been published (Bratic et al., 2018a). In that earlier work, the approach used for computing the confusion matrix was slightly different from the one reported here. Namely, in Bratic et al. (2018a) the confusion matrix was computed using the DUSAF raster map downsampled to 30 m. The exact confusion matrix derivation adopted in this work is described in section 3.2. It is equivalent to the confusion matrix that would be derived if both reference and classified datasets had 5 m resolution. Nevertheless, the different approach did not produce significant changes in the results (up to 1%). Table 4 includes the confusion matrix normalized by column (i.e. divided by the total number of pixels in each GL30 class). In Table 4, it can be observed that the agreement (diagonal values) of class 40 (Shrubland) is the lowest, and that the highest confusion (extra-diagonal values) is between class 40 and class 20 (Forest). This confusion is also evident from traditional accuracy indexes, e.g. the Producer’s accuracy (PA) and the User’s accuracy (UA) (Congalton, 2004) shown in Table 5. Due to the similar physical properties of these two classes, the error may be caused by the classification algorithm used for producing the map. In order to better investigate and describe the features of the errors, the spatial patterns of disagreement between classes 40 and 20 are analysed. The next section focuses on this example to test and discuss the potential benefit of coupling the traditional accuracy assessment with exploratory and statistical spatial pattern analysis. Nevertheless, the same approach can be applied to the analysis of any intra-class and inter-class classification accuracy.
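As a minimal sketch of the quantities discussed above, the following Python snippet computes a column-normalized confusion matrix together with UA and PA. It assumes that rows hold the reference (DUSAF) classes and columns hold the classified (GL30) classes, which is consistent with the column normalization described in the text. The function names and the pixel counts are hypothetical and do not reproduce the values of Table 4 or Table 5.

```python
def column_normalize(matrix):
    """Divide every cell by its column total (pixels per classified class)."""
    n = len(matrix)
    col_totals = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    return [
        [matrix[r][c] / col_totals[c] if col_totals[c] else 0.0 for c in range(n)]
        for r in range(n)
    ]


def users_accuracy(matrix):
    """UA per class: diagonal count divided by the column (classified) total."""
    n = len(matrix)
    return [matrix[c][c] / sum(matrix[r][c] for r in range(n)) for c in range(n)]


def producers_accuracy(matrix):
    """PA per class: diagonal count divided by the row (reference) total."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


# Hypothetical pixel counts; class order: forest (20), shrubland (40), other.
# Rows = reference classes, columns = classified classes (assumed layout).
counts = [
    [90, 30, 5],
    [5, 20, 5],
    [5, 10, 90],
]

normalized = column_normalize(counts)
ua = users_accuracy(counts)
pa = producers_accuracy(counts)
```

With this layout, the diagonal of the column-normalized matrix equals the UA of each class, while the off-diagonal cell in the shrubland column and forest row gives the share of pixels mapped as shrubland that the reference labels as forest.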
The images selected for the base map were only moderately well georeferenced. The SPOTMaps image, for example, had an approximate location error of 10–15 m according to the specification, and the other spatial data also revealed a substantial location error. Therefore, we georeferenced them again using 14 ground control points (GCPs) distributed over the entire catchment. They were established along linear elements, such as roads, and defined by Global Positioning System (GPS) coordinates averaged over several measurements. After georeferencing by the first-order polynomial (affine) transformation, the horizontal root mean squared error (RMSE) of the final base map image equalled 9.62 m.
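The georeferencing step can be illustrated with a short sketch: a first-order polynomial (affine) transformation is fitted to GCP pairs by least squares, and the horizontal RMSE is computed from the residuals. This is not the software used for the base map. The GCP coordinates are hypothetical, the function names are invented for illustration, and the RMSE is assumed to be the square root of the mean squared horizontal residual.

```python
import math


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for k in range(col, 4):
                m[r][k] -= factor * m[col][k]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = sum(m[r][k] * x[k] for k in range(r + 1, 3))
        x[r] = (m[r][3] - s) / m[r][r]
    return x


def fit_affine(src, dst):
    """Least-squares fit of X = a*x + b*y + c and Y = d*x + e*y + f."""
    rows = [(x, y, 1.0) for x, y in src]
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for axis in (0, 1):
        rhs = [sum(r[i] * p[axis] for r, p in zip(rows, dst)) for i in range(3)]
        params.append(solve3(normal, rhs))
    return params  # [[a, b, c], [d, e, f]]


def apply_affine(params, point):
    """Transform one (x, y) image coordinate into map coordinates."""
    x, y = point
    return tuple(p[0] * x + p[1] * y + p[2] for p in params)


def horizontal_rmse(params, src, dst):
    """Square root of the mean squared horizontal residual over all GCPs."""
    total = 0.0
    for s, (u, v) in zip(src, dst):
        px, py = apply_affine(params, s)
        total += (px - u) ** 2 + (py - v) ** 2
    return math.sqrt(total / len(src))


# Hypothetical GCPs: image coordinates and GPS-derived map coordinates.
image_pts = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
map_pts = [(10.0, 20.0), (210.0, 20.0), (10.0, 220.0), (210.0, 220.0)]

affine = fit_affine(image_pts, map_pts)
rmse = horizontal_rmse(affine, image_pts, map_pts)
```

An affine model needs at least three non-collinear GCPs; using more, as with the 14 points in the text, leaves residuals from which the RMSE can be estimated.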
Classification is a fundamental process in remote sensing used to relate pixel values to landcover classes present on the surface. Over large areas, landcover classification is challenging, particularly because of the cost and difficulty of collecting representative training data that enable classifiers to be consistent and locally reliable. A novel methodology to classify large-volume Landsat data using high-quality training data derived from the 500 m MODIS landcover product is demonstrated and used to generate a 30 m landcover classification for all of North America between 20°N and 50°N. Publicly available 30 m global monthly Web-enabled Landsat Data (GWELD) products were classified. These products were generated from every available Landsat 7 ETM+ and Landsat 5 TM image for a three-year period, are defined aligned to the MODIS land products, and are consistently pre-processed (cloud-screened, saturation flagged, atmospherically corrected and normalized to nadir BRDF adjusted reflectance). The MODIS 500 m landcover product was filtered judiciously, using only good-quality pixels that did not change landcover class in 2009, 2010 or 2011, followed by automated selection of spatially corresponding 30 m GWELD temporal metric values, to define a large training data set sampled across North America. The training data were sampled so that the class proportions were the same as the North America MODIS landcover product class proportions and corresponded to 1% of the 500 m and <0.005% of the 30 m pixels. Thirty-nine GWELD temporal metrics for every 30 m North America pixel location were classified using (a) a single random forest, and (b) a locally adaptive method with a random forest classifier derived and applied locally and the classification results spatially mosaicked together. The landcover classification results appeared geographically plausible and at synoptic scale were similar to the MODIS landcover product.
Detailed visual inspection revealed that the locally adaptive random forest classifications and associated classification confidences were generally more coherent than the single random forest classification results. The level of agreement between the 30 m classifications and the MODIS landcover product derived training data was assessed by bootstrapping the random forest implementation. The locally adaptive random forest classification had higher overall agreement (95.44%, 0.9443 kappa) than the single random forest classification (93.13%, 0.9195 kappa). The paper concludes with a discussion of future research including the potential for automated global landcover classification.
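The two agreement figures reported above, overall agreement and kappa, can both be derived from a confusion matrix. The sketch below assumes the standard Cohen's kappa definition and uses a hypothetical two-class matrix. It does not reproduce the bootstrapping procedure or the values of the study.

```python
def overall_agreement(matrix):
    """Share of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def cohens_kappa(matrix):
    """Agreement corrected for chance: (po - pe) / (1 - pe)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    po = overall_agreement(matrix)
    # Chance agreement from the products of row and column marginals.
    pe = sum(
        sum(matrix[i]) * sum(matrix[r][i] for r in range(n)) for i in range(n)
    ) / total ** 2
    return (po - pe) / (1 - pe)


# Hypothetical two-class confusion matrix (counts of samples).
example = [
    [45, 5],
    [5, 45],
]

agreement = overall_agreement(example)
kappa = cohens_kappa(example)
```

Because kappa subtracts the agreement expected by chance, it is always lower than the overall agreement when the latter is below 100%, as in the pairs of values quoted in the text.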
The Forest class of GlobeLand30 presents an intermediate degree of overlap with Cropland from GLC-SHARE (0.61), a high value compared with those obtained for Grassland and Shrubs covered areas from GLC-SHARE (0.45 for both landcover classes). At first sight this value could seem too high. However, it must be noted that the overlap values were computed using only the LCCs of the EAGLE matrix. In terms of landcover, Cultivated land from GlobeLand30 could contain the LCCs "Trees" (e.g. olive trees or fruit trees), "Regular bushes" (e.g. berries) and "Regular graminoids" (e.g. rainfed crops), while for Grassland and Shrubs covered areas from GLC-SHARE just one of these LCCs is mandatory. This aspect is also evident when the overlap value for Grassland and Shrubland from GlobeLand30 is compared with Cropland from GLC-SHARE. If the proposed approach were extended to the Land Use Components of the EAGLE matrix, there would be more distinguishing features between Cropland and the natural vegetation landcover classes of GLC-SHARE, and consequently lower values of semantic overlap between these landcover classes.
LULC is one of the principal subjects of climate and environment studies. The physical condition of the surface is called landcover, while the human-altered part of the land is referred to as land use. Satellite data have been widely used for spatiotemporal comparison in LULC studies. Remote sensing spectral indices are a good method to classify LULC classes (Chen et al., 2006). Among them are the Normalized Difference Soil Index (NDSI), used to select bare soil pixels (Rogers and Kearney, 2004); the Normalized Difference Built-up Index (NDBI), used to automatically separate built-up areas (Zha et al., 2003); the Normalized Difference Water Index (NDWI), used to select water and vegetation liquid water (Gao, 1995); the Normalized Difference Vegetation Index (NDVI), which is a vegetation ratio-based index; the Enhanced Vegetation Index (EVI), which was developed for high-biomass regions; the Leaf Area Index (LAI), which measures the total area of leaves per unit ground area; and the Normalized Difference Snow Index, used to select snow-covered land. In this paper, we abbreviate the latter as DSI so that it is not confused with the NDSI soil index.
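Most of the indices listed above share the same normalized difference form, (a - b) / (a + b), and differ only in the bands used. The sketch below uses the band pairs as they are commonly defined for NDVI, NDBI, Gao's NDWI and the snow index (DSI in this paper). The function names and reflectance values are hypothetical, and the soil index NDSI is omitted because its band pair is not stated in the text.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or None when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else None


def ndvi(nir, red):
    """Vegetation index: near-infrared against red reflectance."""
    return normalized_difference(nir, red)


def ndbi(swir, nir):
    """Built-up index: shortwave infrared against near-infrared."""
    return normalized_difference(swir, nir)


def ndwi(nir, swir):
    """Gao's water index: near-infrared against shortwave infrared."""
    return normalized_difference(nir, swir)


def dsi(green, swir):
    """Snow index (abbreviated DSI in this paper): green against shortwave infrared."""
    return normalized_difference(green, swir)


# Hypothetical surface reflectances for a single vegetated pixel.
pixel = {"green": 0.08, "red": 0.10, "nir": 0.50, "swir": 0.20}

indices = {
    "NDVI": ndvi(pixel["nir"], pixel["red"]),
    "NDBI": ndbi(pixel["swir"], pixel["nir"]),
    "NDWI": ndwi(pixel["nir"], pixel["swir"]),
    "DSI": dsi(pixel["green"], pixel["swir"]),
}
```

With these band pairs, NDBI and Gao's NDWI are exact opposites of each other for any pixel, so thresholds on one imply thresholds on the other.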
This paper investigates two methods for generalising landcover polygons that constitute a planar topographic map (i.e. based on a data structure without gaps or overlaps). The generalisation of landcover objects is one of the main remaining issues identified in an internal proof-of-concept generalisation pilot by the Dutch Kadaster. In this ongoing research project, a fully automated generalisation workflow is being set up for deriving small-scale maps from TOP10NL data, taking current requirements and new technologies into account (Stoter et al., 2009; Smaalen and Stoter, 2008). In 2010 the generalisation of a 1:50k map from TOP10NL was prototyped, and it was worked out further in a successful Proof of Concept (PoC) in 2011 (see Figure 1 and Stoter et al., 2011). These results are currently being extended to other test areas and other scales (first at scale 1:100k).
Figure 5 shows a plot of overall classification accuracy and kappa coefficient against visibility; both decline as visibility drops. The classification accuracy degrades at a faster rate as visibility gets poorer. The haze becomes intolerable at visibilities less than about 11 km (i.e. 85% accuracy). For 8 km visibility (moderate haze), accuracy reduces by about 20%. A drop in accuracy of about 70% occurs between 8 and 0 km visibility. A much sharper decline can be observed for visibilities less than 4 km, with only 50% classification accuracy remaining at about 2 km visibility. The kappa coefficient plot is clearly consistent with the classification accuracy plot.
A fifteen-second global landcover dataset, GLCNMO2008 (or GLCNMO version 2), was produced by the authors in the Global Mapping Project coordinated by the International Steering Committee for Global Mapping (ISCGM). The primary source data of this landcover mapping were 23-period, 16-day composite, 7-band, 500-m MODIS data of 2008. GLCNMO2008 has 20 landcover classes, of which 14 were mapped by supervised classification. Training data for supervised classification, consisting of about 2,000 polygons, were collected globally using Google Earth and existing regional maps, with reference to this study’s original potential landcover map created from six existing global landcover products. The remaining six landcover classes were classified independently: Urban, Tree Open, Mangrove, Wetland, Snow/Ice, and Water. They were mapped by methods improved from GLCNMO version 1. The overall accuracy of GLCNMO2008 is 77.9% based on 904 validation points, and the overall accuracy weighted by the mapped area coverage is 82.6%. The GLCNMO2008 product, landcover training data, and reference regional maps are available through the internet.
Keywords: landcover, MODIS, decision tree method, Global Mapping Project
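The difference between the plain and the area-weighted overall accuracy reported in the abstract can be illustrated with a small sketch. It assumes that validation points are grouped by mapped class and that each class accuracy is weighted by the class share of the mapped area. The abstract does not spell out the exact weighting, so this formulation, the function names and all numbers are assumptions.

```python
def plain_accuracy(samples):
    """Correct validation points divided by all validation points."""
    correct = sum(c for c, _ in samples.values())
    total = sum(n for _, n in samples.values())
    return correct / total


def area_weighted_accuracy(samples, area):
    """Sum over classes of (mapped area share) x (per-class accuracy)."""
    total_area = sum(area.values())
    return sum(
        area[name] / total_area * correct / count
        for name, (correct, count) in samples.items()
    )


# Hypothetical validation results: class -> (correct points, total points).
samples = {"forest": (90, 100), "urban": (50, 100)}
# Hypothetical mapped area per class, in arbitrary units.
area = {"forest": 9.0, "urban": 1.0}

unweighted = plain_accuracy(samples)
weighted = area_weighted_accuracy(samples, area)
```

In this example the weighted figure is higher than the plain one because the class with the larger mapped area is also the more accurately mapped class.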
Abstract. The Global Change Assessment Model (GCAM) is a global integrated assessment model used to project future societal and environmental scenarios, based on economic modeling and on a detailed representation of food and energy production systems. The terrestrial module in GCAM represents agricultural activities and ecosystem dynamics at the subregional scale, and must be downscaled to be used for impact assessments in gridded models (e.g., climate models). In this study, we present the downscaling algorithm of the GCAM model, which generates gridded time series of global land use and landcover (LULC) from any GCAM scenario. The downscaling is based on a number of user-defined rules and drivers, including transition priorities (e.g., crop expansion preferentially into grasslands rather than forests) and spatial constraints (e.g., nutrient availability). The default parameterization is evaluated using historical LULC change data, and a sensitivity experiment provides insights on the most critical parameters and how their influence changes regionally and in time. Finally, a reference scenario and a climate mitigation scenario are downscaled to illustrate the gridded land use outcomes of different policies on agricultural expansion and forest management. Several features of the downscaling can be modified by providing new input data or changing the parameterization, without any edits to the code. Those features include spatial resolution as well as the number and type of land classes being downscaled, thereby providing flexibility to adapt GCAM LULC scenarios to the requirements of a wide range of models and applications. The downscaling system is version controlled and freely available.
Land Use and Land Cover Maps (LULCM) are fundamental inputs to many areas of application, e.g. climate modelling and natural resource management, among many others (Jones, 2008). These maps are usually generated through the classification of satellite imagery, and their creation is time consuming. The update cycles of LULCM are also often long, e.g. CORINE Land Cover (CLC) is updated irregularly, with three years between the two most current products. This is insufficient for environments that are subject to rapid environmental change. More recently, researchers have been investigating the use of OpenStreetMap (OSM) as a source of LULC information, both to create and validate different products (Estima and Painho, 2015; Jokar Arsanjani et al., 2015a; Martinho and Fonte, 2015), particularly as the data are updated on a daily basis. OSM is one of the most well-studied collaborative mapping projects and undoubtedly the most well-known Volunteered Geographical Information (VGI) initiative (Jokar Arsanjani et al., 2015b). The overall aim is to collect vector data provided by volunteers that enable the creation of a map at a global scale. The flexibility of use, data availability, free access to the latest information on a daily basis, the large number of contributions and users of the data, and the existence of data not traditionally available in other types of more authoritative map databases make OSM a valuable source of information for several applications, e.g. navigation (Codescu et al., 2011) and disaster response (Zook et al., 2010; Soden and Palen, 2014).
Investigation of accuracy of landcover data used in regional climate modeling
In this research, the utilization of Landsat 7 ETM+ images in regional climate modeling was investigated, and the accuracy of the Global Land Cover Characterization (GLCC) data set used in regional climate modeling was assessed for the Marmara Region by comparing these data with Landsat ETM+ derived landcover data. The Marmara Region was selected as the study area because it has faced significant landcover changes as a result of rapid industrialization and population increase, especially after the 1980s. The region occupies the northwest corner of Turkey with a surface area of 67 000 km² and represents approximately 8.6% of the Turkish national territory. It is the smallest but most densely populated of the seven geographical regions of Turkey. The region includes eleven cities, namely Istanbul, Bursa, Kocaeli, Edirne, Balikesir, Kirklareli, Tekirdag, Canakkale, Bilecik, Sakarya and Yalova, of which the first three are industrial and commercial centers of Turkey.
Land use/landcover information is the basic prerequisite for land, water and vegetation resource utilization, conservation and management. Information on land use/landcover is available today in the form of thematic maps and published statistical records. This information is inadequate and inconsistent, and does not provide an up-to-date picture of changing land use patterns, processes and their distribution in space and time. Satellite remote sensing offers an alternative, accurate and faster mode of data collection and updating the land use/landcover
Results show that global landcover maps seem to be uncertain and inconsistent. Specifically, the CCI map tended to misclassify most inland water, while it underestimated urban & built-up areas and overestimated croplands (Figure 5a). Although GlobeLand30 could detect inland water well, it was likely to underestimate urban & other infrastructure lands and overestimate grassland (Figure 5b). These issues may be the result of using coarse spatial resolution satellite images, cloud cover, or shortcomings in the reference data. The MCD1Q1 0.5 km MODIS-based global landcover climatology had a tendency to misinterpret most inland water as wetland regions and to overestimate cropland (Figure 5c), perhaps because of differences in landcover type definitions or the use of very coarse spatial resolution satellite imagery (500 m). For forest estimation, based on Google Earth views and the 2017 forest map in this research, we found that FNF maps probably misclassified some forest areas and could not accurately detect inland water, since a large number of reservoirs disappeared from the map (Figure 5d,e). These problems may be the result of using only SAR images (ALOS PALSAR or ALOS-2 PALSAR-2). Although SAR images are not blocked by clouds or cloud shadows, they often suffer from speckle, which can be reduced by noise reduction filters but still constrains classification accuracy [105–107]. In summary, global landcover maps contain large uncertainties for environmental studies.
Available online 16 June 2016
The Global Land Cover Facility (GLCF) global forest-cover and -change dataset is a multi-temporal depiction of long-term (multi-decadal), global forest dynamics at high (30-m) resolution. Based on per-pixel estimates of percentage tree cover and their associated uncertainty, the dataset currently represents binary forest cover in nominal 1990, 2000, and 2005 epochs, as well as gains and losses over time. A comprehensive accuracy assessment of the GLCF dataset was performed using a global, design-based sample of 27,988 independent, visually interpreted reference points collected through a two-stage, stratified sampling design wherein experts visually identified forest cover and change in each of the 3 epochs based on Landsat and high-resolution satellite images, vegetation index profiles, and field photos. Consistent across epochs, the overall accuracy of the static forest-cover layers was 91%, and the overall accuracy of forest-cover change was >88%, among the highest accuracies reported for recent global forest- and land-cover data products. Both commission error (CE) and omission error (OE) were low for static forest cover in each epoch and for the stable classes between epochs (CE < 3%, OE < 22%), but errors were larger for forest loss (45% ≤ CE < 62%, 47% < OE < 55%) and gain (66% ≤ CE < 85%, 61% < OE < 84%). Accuracy was lower in sparse forests and savannahs, i.e., where tree cover was at or near the 30% threshold used to discriminate forest from non-forest cover. Discrimination of forest had a low rate of commission error and slight negative bias, especially in areas with low tree cover. After adjusting global area estimates to reference data, 39.28 ± 1.34 million km² and 38.81 ± 1.34 million km² of forest were respectively
determine the best landcover map to use at a given location, and the second focussed only on correcting areas where all three landcover maps disagree. These hybrid products would generally represent the year 2005, as MODIS and GlobCover apply to that year. Although the GLC2000 is for the year 2000, the majority of spatial disagreements between the products are not about landcover change but about incorrect classifications. Thus the merging of the products is really about trying to find the best landcover representation more generally for that time period. The two hybrid products were compared with the individual global landcover products using performance metrics suggested by Pontius and Millones (2011) as well as overall accuracy as an additional relative measure. The first hybrid map outperformed the individual landcover maps based on the validation data set used, while the second hybrid map was not as good as the individual MODIS landcover product on three out of four performance measures. We offered potential explanations for this, including the use of the GLC2000 in the development of GlobCover and the continued improvement of MODIS over time. Other variations that might be tried are to: take the landcover at a given point only when MODIS agrees with one of the other landcover products (or all three agree); and use the two crowdsourced training datasets and GWR at all locations to create a single hybrid product.
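The first of the suggested variations, accepting a class only where MODIS agrees with at least one other product, can be sketched as a simple per-pixel rule. The sketch assumes that the three products have already been translated to a common legend. The function names and class labels are hypothetical, and pixels without agreement are returned as None so that another method can fill them.

```python
def modis_agreement_label(modis, glc2000, globcover):
    """Return the MODIS class if at least one other product agrees, else None."""
    if modis == glc2000 or modis == globcover:
        return modis
    return None


def all_disagree(modis, glc2000, globcover):
    """True when the three products report three different classes."""
    return len({modis, glc2000, globcover}) == 3


# Hypothetical pixels with labels from a common (harmonised) legend.
pixels = [
    ("forest", "forest", "cropland"),
    ("cropland", "grassland", "forest"),
    ("grassland", "forest", "forest"),
]

labels = [modis_agreement_label(*p) for p in pixels]
full_disagreement = [all_disagree(*p) for p in pixels]
```

The third pixel shows a case that the rule leaves open: the other two products agree with each other but not with MODIS, so no label is assigned even though the pixel is not one of full disagreement.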