4. Multi-temporal Analysis of Sentinel-1 Texture and Principal Component Analysis features
4.3 Methods
4.4.1 GLCM temporal classification
Texture features were generated with a 9 × 9 moving window in the invariant direction (the average of all four spatial arrangements). The 10 texture features were combined into a single image, and principal components were derived from this dataset with PCA. Classification with Random Forest, Neural Net and Support Vector Machine produced the overall accuracies and kappa coefficients displayed in Table 4.2.
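The direction-invariant texture computation described above can be sketched as follows. This is a minimal, dependency-free illustration and not the processing chain used in this study: the quantisation into a small number of grey levels, the function names and the three statistics shown (a subset of the 10 features) are all assumptions made for the example.

```python
import math

# Pixel offsets (row, col) for the four GLCM directions: 0, 45, 90 and 135 degrees.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]


def glcm(window, levels, offset):
    """Symmetric, normalised grey-level co-occurrence matrix for one offset."""
    dr, dc = offset
    rows, cols = len(window), len(window[0])
    counts = [[0.0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = window[r][c], window[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count each pair in both orders (symmetric)
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def invariant_glcm(window, levels):
    """Average the four directional GLCMs into one direction-invariant matrix."""
    mats = [glcm(window, levels, off) for off in OFFSETS]
    return [[sum(m[i][j] for m in mats) / len(mats) for j in range(levels)]
            for i in range(levels)]


def texture_features(p):
    """Three common GLCM statistics (hypothetical subset of the 10 features)."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    contrast = sum(p[i][j] * (i - j) ** 2 for i, j in cells)
    homogeneity = sum(p[i][j] / (1 + (i - j) ** 2) for i, j in cells)
    entropy = sum(-p[i][j] * math.log(p[i][j]) for i, j in cells if p[i][j] > 0)
    return {"contrast": contrast, "homogeneity": homogeneity, "entropy": entropy}


# Hypothetical 9 x 9 window quantised to two grey levels (a checkerboard).
window = [[(r + c) % 2 for c in range(9)] for r in range(9)]
features = texture_features(invariant_glcm(window, levels=2))
print(features)
```

In a full workflow the window would slide over every pixel of the backscatter image, and each statistic would become one band of the texture stack.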
Table 4.2: Averaged overall classification accuracy and kappa coefficients for the Random Forest (RF), Neural Net (NNET) and Support Vector Machine (SVM) classifiers. VV, TEX and PCA denote the single polarized, texture and Principal Component derived classifications.
                      VV                  TEX                 PCA
                      RF    NNET  SVM     RF    NNET  SVM     RF    NNET  SVM
Kappa coefficients    0.49  0.59  0.58    0.68  0.68  0.68    0.78  0.78  0.78
Overall accuracy      0.62  0.70  0.70    0.77  0.77  0.77    0.84  0.84  0.84
Overall accuracies for the single polarized VV images were 0.62 for RF and 0.70 for both NNET and SVM. The GLCM features resulted in an accuracy of 0.77 for all three classification algorithms. The PCA features reported the highest accuracy, 0.84, for all the algorithms. The high overall accuracies of the PCA images indicate that reducing the dimensionality of the texture data to the components containing the most variance increases classification accuracy. The GLCM feature set is relatively large and was therefore compressed with PCA. PCA texture features capture the variation of the land covers and are recommended when dealing with a large dataset.
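The idea of compressing correlated texture features into the components that carry most of the variance can be illustrated with a small sketch. It uses power iteration to find only the first principal component, which is a stand-in chosen to keep the example free of external libraries and not the PCA implementation used in this study. The sample values and function names are hypothetical.

```python
import math


def covariance_matrix(samples):
    """Sample covariance matrix; each row of samples is one pixel's features."""
    n, d = len(samples), len(samples[0])
    means = [sum(row[k] for row in samples) / n for k in range(d)]
    centred = [[row[k] - means[k] for k in range(d)] for row in samples]
    return [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(d)]
            for a in range(d)]


def first_component(cov, iterations=200):
    """Leading eigenvector and eigenvalue of cov, found by power iteration.

    Assumes the start vector is not orthogonal to the leading eigenvector.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cv = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
    eigenvalue = sum(v[a] * cv[a] for a in range(d))
    return v, eigenvalue


# Hypothetical texture values for five pixels and three features;
# the first two features are strongly correlated.
samples = [
    [1.0, 2.0, 1.0],
    [2.0, 4.1, 0.0],
    [3.0, 6.2, 1.0],
    [4.0, 7.9, 0.0],
    [5.0, 10.0, 1.0],
]
cov = covariance_matrix(samples)
component, eigenvalue = first_component(cov)
total_variance = sum(cov[k][k] for k in range(len(cov)))
explained = eigenvalue / total_variance
print(f"variance explained by the first component: {explained:.2f}")
```

When most of the variance falls on a few components, as in this toy case, the remaining components can be dropped with little loss of information.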
Similar to the pattern in the overall accuracy results, the single polarized classifications exhibited the lowest kappa coefficients. Single polarized kappa ranged from 0.49 to 0.59 for the three classifiers (Table 4.2), reflecting the limited ability to identify features from single polarized SAR images. The kappa coefficients for the GLCM texture features and PCA features were 0.68 and 0.78 respectively. The results indicate that PCA features have high potential for monitoring land cover in the Kilombero wetland, yielding higher accuracies than the single polarized Sentinel VV images or their corresponding texture features.
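For reference, overall accuracy and the kappa coefficient can both be derived from a confusion matrix, as in the sketch below. The matrix is hypothetical and is not taken from the results of this study; rows are assumed to hold the reference (ground) classes and columns the mapped classes.

```python
def overall_accuracy(cm):
    """Fraction of all samples that lie on the diagonal."""
    total = sum(map(sum, cm))
    return sum(cm[k][k] for k in range(len(cm))) / total


def kappa_coefficient(cm):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(cm)
    total = sum(map(sum, cm))
    observed = overall_accuracy(cm)
    row_totals = [sum(cm[i]) for i in range(n)]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    expected = sum(row_totals[k] * col_totals[k] for k in range(n)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical confusion matrix for bare, vegetated, built up and water.
# Rows = reference class, columns = mapped class.
confusion = [
    [20, 5, 0, 0],
    [5, 20, 0, 0],
    [0, 2, 23, 0],
    [0, 0, 0, 25],
]
print(f"overall accuracy: {overall_accuracy(confusion):.2f}")  # 0.88
print(f"kappa: {kappa_coefficient(confusion):.2f}")            # 0.84
```

Because kappa subtracts the agreement expected by chance, it is lower than the overall accuracy for the same classification, which matches the pattern in Table 4.2.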
Table 4.3: Average Sensitivity and positive predicted values for single polarized, texture features and Principal component classified images
                            VV                  GLCM                PCA
Sensitivity                 RF    NNET  SVM     RF    NNET  SVM     RF    NNET  SVM
bare                        0.45  0.52  0.51    0.61  0.62  0.61    0.66  0.66  0.66
vegetated                   0.50  0.64  0.68    0.66  0.67  0.67    0.84  0.84  0.84
built up                    0.55  0.63  0.57    0.85  0.79  0.84    0.89  0.89  0.89
water                       0.95  0.97  0.97    0.95  0.94  0.92    0.96  0.96  0.96
Positive predicted value    RF    NNET  SVM     RF    NNET  SVM     RF    NNET  SVM
bare                        0.47  0.54  0.56    0.64  0.64  0.63    0.83  0.83  0.83
vegetated                   0.47  0.56  0.54    0.63  0.65  0.63    0.74  0.74  0.74
built up                    0.59  0.79  0.82    0.87  0.89  0.88    0.94  0.94  0.94
water                       0.96  0.97  0.97    0.97  0.98  0.98    0.95  0.95  0.95
The sensitivity gives the probability that a land cover class on the ground is assigned to that class on the classified map, indicating how effectively a classifier identifies a land cover class. The sensitivity of the bare class was lowest for the single polarized images with the Random Forest classifier (Table 4.3). The three classification algorithms reported the highest sensitivity for the bare class (0.66) in the classification of the PCA features. The ability to identify vegetation was lowest for the single polarized images with the Random Forest classifier, while the highest vegetation discrimination ability was reported for the PCA images irrespective of the classifier (0.84). Texture features performed similarly in vegetation discrimination to the single polarized images classified with SVM. Discrimination of built up areas was best for the PCA features regardless of the classifier and lowest for the single polarized images. The classifiers' ability to discriminate built up areas in both the texture and PCA images was high throughout the time series. Built up areas have high backscatter due to double-bounce scattering, which increases their discrimination potential (Zhang et al. 2014). Water was easily distinguishable by all classifiers in the single polarized, texture and PCA images due to its distinctive low backscatter values.
The positive predicted value gives the probability that a pixel assigned to a land cover class on the map actually belongs to that class on the ground. The bare prediction was lowest for the single polarised images with the Random Forest classifier. A slight improvement was observed for the texture features, with the best prediction ability recorded for the PCA features. Similarly, prediction ability for the vegetated land cover class was lowest for the single polarised images and highest for the PCA features. With the PCA features, the prediction for vegetated areas was slightly lower than for bare areas. All the classifiers mapped built up areas with high precision, greater than 0.8, except for Random Forest and Neural Net on the single polarized images. Precision for water was high, between 0.9 and 1, due to its unique low backscatter characteristics. The complete tables of overall accuracies, kappa coefficients, sensitivity and positive predicted values are in Appendix A2-A11.
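The two per-class measures reported in Table 4.3 can be computed from a confusion matrix as sketched below, using the standard definitions: sensitivity is the diagonal count divided by the reference total of the class, and the positive predicted value is the diagonal count divided by the mapped total of the class. The matrix is hypothetical, with rows assumed to be reference classes and columns mapped classes.

```python
CLASSES = ["bare", "vegetated", "built up", "water"]

# Hypothetical confusion matrix: rows = reference class, columns = mapped class.
CONFUSION = [
    [20, 5, 0, 0],
    [5, 20, 0, 0],
    [0, 2, 23, 0],
    [0, 0, 0, 25],
]


def sensitivity(cm, k):
    """Share of reference samples of class k that were mapped as class k."""
    return cm[k][k] / sum(cm[k])


def positive_predicted_value(cm, k):
    """Share of samples mapped as class k that belong to class k on the ground."""
    return cm[k][k] / sum(row[k] for row in cm)


for k, name in enumerate(CLASSES):
    print(f"{name:10s} sensitivity={sensitivity(CONFUSION, k):.2f} "
          f"ppv={positive_predicted_value(CONFUSION, k):.2f}")
```

In this toy matrix the two vegetated samples absorbed from the built up class lower the positive predicted value of the vegetated class without affecting its sensitivity, which shows why the two measures can diverge for the same class.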
The classified maps appear identical (Figure 4.4 a, b, c); nonetheless, the uncertainty map (Figure 4.4d) reveals the areas where the models differ. The areas along the Kilombero River have high uncertainty, represented by high entropy values. Mixed pixels could be a cause of the high entropy values, as varied covers, including submerged vegetation, are located within this riparian zone.
Figure 4.4: Land cover classifications of selected dates based on Artificial Neural Network (ANN), Support Vector Machine (SVM) and Random Forest (RF). The corresponding entropy image indicates differences in the assignment of land use classes by the three classifiers: low values (light areas) mark pixels assigned a common class, whereas high values (green areas) mark pixels classified differently by the three algorithms.
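One way to derive such an uncertainty measure is the Shannon entropy of the class labels that the three classifiers assign to a pixel. The sketch below is an assumption about how the entropy could be computed, not a description of the exact procedure behind Figure 4.4; the natural logarithm, the function name and the pixel labels are illustrative.

```python
import math
from collections import Counter


def vote_entropy(labels):
    """Shannon entropy (natural log) of the class labels given to one pixel."""
    n = len(labels)
    return sum(-(c / n) * math.log(c / n) for c in Counter(labels).values())


# Hypothetical labels from the three classifiers (RF, ANN, SVM) for three pixels.
pixels = [
    ("water", "water", "water"),         # full agreement
    ("vegetated", "vegetated", "bare"),  # two classifiers agree
    ("bare", "vegetated", "water"),      # no agreement
]
entropies = [vote_entropy(p) for p in pixels]
for labels, h in zip(pixels, entropies):
    print(labels, f"entropy={h:.3f}")
```

With three classifiers the entropy takes only three values: zero for full agreement, an intermediate value when two agree, and log(3) when all three differ.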
SAR backscatter comprises contributions from vegetation, soil water content and surface roughness. Interpreting SAR backscatter is an ill-posed problem, as different combinations of influencing factors can give the same value. The method adopted for the selection of training sites involved ratioing backscatter values from two dates. Hence, in the selection of non-change areas, land cover changes of no significant magnitude could have occurred, and such areas were assumed to have the same land cover. The inability to decompose the total backscatter into its individual contributory elements could introduce errors in classification, since the changes recorded could be due to, for example, a change in soil moisture and not necessarily a change in the land cover class. This is a shortcoming of the method adopted, though it worked well in general. Additionally, the GLCM texture features are derived from the backscatter values and hence are also subject to influence by vegetation, soil water content and soil roughness (Kurvonen and Hallikainen 1999).
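The ratio-based selection of non-change areas, and its limitation, can be illustrated with a short sketch. The expression of the ratio in decibels, the 1 dB threshold and the backscatter values are all hypothetical and are not the parameters used in this study.

```python
import math


def log_ratio_db(sigma0_date1, sigma0_date2):
    """Ratio of two linear-scale backscatter values, expressed in decibels."""
    return 10.0 * math.log10(sigma0_date2 / sigma0_date1)


def is_non_change(sigma0_date1, sigma0_date2, threshold_db=1.0):
    """Flag a pixel as non-change when the ratio stays within the threshold.

    threshold_db is a hypothetical value chosen for illustration only.
    """
    return abs(log_ratio_db(sigma0_date1, sigma0_date2)) <= threshold_db


# Hypothetical linear backscatter values for one pixel on two dates.
stable = is_non_change(0.10, 0.11)   # small ratio: treated as non-change
changed = is_non_change(0.10, 0.20)  # large ratio: treated as change
print(stable, changed)
```

The test sees only the total backscatter, so a pixel whose backscatter doubles because of wetter soil is flagged in the same way as one whose land cover has changed, which is the limitation discussed above.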