
Chapter 3 Deep Learning for Multispectral ALS LC Classification

3.7 Implementation of the Proposed CNNs

The implementation procedures shown in Figure 3.8 were used for all eighteen classification models. First, the input datasets introduced in Section 3.3.4 were imported into the proposed networks and separated into training, validation and testing data. To predict the LC type of each pixel, every CNN was called twice for two core processes: a training process and a predict process. These procedures are described in detail in the following sections. All procedures were run on an NVIDIA Tesla P100 16GB GPU computing processor.

Figure 3.8 Workflow of model implementation

3.7.1 Programming Language and Libraries

The CNNs were built and implemented in Python 3 with the TensorFlow and Keras libraries. The libraries were imported for each CNN by the script shown in Figure 3.9.

Figure 3.9 Imported libraries

3.7.2 Data Import

The first step of the data import was to specify the input dataset and the labelled dataset for the model, together with their formats. Since this study aimed at pixel-wise classification, the input and labelled datasets were imported pixel by pixel, together with the relative position of each pixel. This step was achieved by the script shown in Figure 3.10. The second step was to find valid pixels, i.e. pixels that were not empty. This was important because the study area was not an upright rectangle, so the imported datasets contained empty pixels. Selecting valid pixels eliminated the impact of empty pixels and allowed LC classification for study areas of any shape. This step was achieved by the script shown in Figure 3.11.

Figure 3.10 Importing data pixel by pixel

(b) 2D and 3D CNNs

Figure 3.11 Selection of valid pixels
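The selection of valid pixels described above can be sketched as follows. This is a minimal illustration, not the script from Figure 3.11: it assumes the raster is held as a NumPy array, that empty pixels are marked by a no-data value of 0 in the labelled dataset (the thesis does not state the marker), and it covers only the per-pixel case without neighbourhood patches. The function and variable names are hypothetical.

```python
import numpy as np

NODATA = 0  # assumed marker for empty pixels; not stated in the thesis


def select_valid_pixels(features, labels, nodata=NODATA):
    """Return per-pixel samples, their labels, and their (row, col) positions.

    features: array of shape (rows, cols, bands)
    labels:   array of shape (rows, cols); `nodata` marks empty pixels
    """
    valid_mask = labels != nodata          # True where a pixel carries a label
    positions = np.argwhere(valid_mask)    # (n_valid, 2) array of row, col
    samples = features[valid_mask]         # (n_valid, bands) feature vectors
    targets = labels[valid_mask]           # (n_valid,) class labels
    return samples, targets, positions


# Toy 2 x 3 raster with 2 bands; two of the six pixels are empty.
features = np.arange(12, dtype=float).reshape(2, 3, 2)
labels = np.array([[1, 0, 2],
                   [0, 3, 1]])
samples, targets, positions = select_valid_pixels(features, labels)
```

Keeping `positions` alongside the samples is what would allow predictions to be written back to the correct place in a study area of any shape.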

3.7.3 Separation of Training, Validation, and Testing Data

In this step, the imported dataset was randomly separated into training, validation and testing data according to a given rate, which is discussed and determined in Section 4.2.6. To ensure that the testing data were the same in each prediction, a random state was set when splitting the entire dataset into testing data and other data. The non-testing data were then divided into training data and validation data. This step was achieved by the script shown in Figure 3.12.

Figure 3.12 Separation of Training and Testing Data
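A two-stage split of this kind might look like the sketch below. It is not the script from Figure 3.12: the function name, the seed value 42 and the `val_seed` parameter are assumptions for illustration, and the default rates follow the initial values of 70%, 10% and 20% listed in Table 3.6.

```python
import random


def split_indices(n_samples, test_rate=0.2, val_rate=0.1,
                  random_state=42, val_seed=None):
    """Split sample indices into training, validation and testing subsets.

    Stage 1 uses a fixed `random_state`, so the testing subset is identical
    on every run. Stage 2 reshuffles the remaining indices (seeded by
    `val_seed`, or freshly if it is None) before taking the validation part.
    Both rates are fractions of the whole dataset.
    """
    indices = list(range(n_samples))
    random.Random(random_state).shuffle(indices)    # reproducible shuffle

    n_test = round(n_samples * test_rate)
    n_val = round(n_samples * val_rate)

    test_idx, rest = indices[:n_test], indices[n_test:]
    random.Random(val_seed).shuffle(rest)           # second, independent shuffle
    val_idx, train_idx = rest[:n_val], rest[n_val:]
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = split_indices(1000)
```

Fixing only the first stage keeps the testing data constant while still allowing the training and validation subsets to vary between runs.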

3.7.4 Training Process

In the training process, training data and the corresponding labelled data were used to determine the parameters (i.e. weights and biases) and optimize the classifier. This process consisted of forward steps and backward steps (see Figure 3.13). A forward step generated the feature maps in each layer based on the current weights and biases. The outputs of the forward step and the given ground truth labels were then used to calculate the loss cost according to the cross-entropy loss function:

$$H(p, q) = -\sum_{x} p(x) \log q(x) \tag{3.5}$$

where H(p, q) is the cross entropy of p and q, p(x) is the actual probability of an event x, and q(x) is the predicted probability of an event x.
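As a worked example of Equation 3.5, the sketch below evaluates the cross entropy for a single pixel with three LC classes. The probabilities are made up for illustration, the natural logarithm is assumed (the base is not stated in the text), and the small `eps` guard against log(0) is an addition of this sketch.

```python
import math


def cross_entropy(p, q, eps=1e-12):
    """Cross entropy H(p, q) = -sum_x p(x) * log(q(x)), natural logarithm."""
    return -sum(px * math.log(max(qx, eps)) for px, qx in zip(p, q))


# One pixel, three classes; the true class is the second one (one-hot p).
p = [0.0, 1.0, 0.0]
q_good = [0.1, 0.8, 0.1]   # most probability on the correct class
q_poor = [0.6, 0.3, 0.1]   # most probability on a wrong class

loss_good = cross_entropy(p, q_good)   # equals -log(0.8)
loss_poor = cross_entropy(p, q_poor)   # equals -log(0.3)
```

With a one-hot p, only the predicted probability of the true class contributes, so the loss shrinks as that probability approaches 1.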

Then, a backward step was applied to calculate the gradient of the loss cost with respect to each parameter. These gradients, scaled by a learning rate that is tested and determined in Section 4.2.7, were used to update all parameters in each layer. With the updated parameters, the system could proceed to the next forward step. The cycle of a forward step and a backward step could be stopped when the loss cost of the model or the number of iterations reached a given threshold. When an early stop was set such that training ended once the difference between the loss costs of two successive iterations was less than 0.0001, all models stopped between the twentieth and the thirtieth epochs. In addition, running the same number of iterations for all models improved the comparability of the results. Therefore, thirty epochs were run for each model in this study, and the entire training process was stopped after thirty iterations of forward and backward steps. After thirty epochs, the epoch with the lowest loss cost was identified, and the parameter values at that epoch were taken as the parameter values derived from this training process.
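The two stopping rules discussed here can be compared on a recorded loss curve, as in the following sketch. The helper names and the loss values are hypothetical and serve only to show the logic; they are not results from this study.

```python
def first_early_stop_epoch(losses, min_delta=1e-4):
    """Return the 1-based epoch at which the change in loss between two
    successive epochs first drops below `min_delta`, or None if it never does."""
    for i in range(1, len(losses)):
        if abs(losses[i - 1] - losses[i]) < min_delta:
            return i + 1
    return None


def best_epoch(losses):
    """Return the 1-based epoch with the lowest loss over the whole run."""
    return min(range(len(losses)), key=losses.__getitem__) + 1


# Made-up loss curve for illustration only.
losses = [0.90, 0.50, 0.30, 0.25, 0.24995, 0.20]

stop_at = first_early_stop_epoch(losses)   # 5: the change from epoch 4 to 5 is 0.00005
best_at = best_epoch(losses)               # 6: the lowest loss occurs in the last epoch
```

In this toy curve the early stop would end training at epoch 5, whereas running a fixed number of epochs and keeping the best one finds a lower loss at epoch 6.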

The validation dataset was also used in this process to validate the current values of the hyper-parameters and to help determine the optimal value of each hyper-parameter for the model. To do so, the entire training process was repeated many times with different hyper-parameter values. Once the optimal hyper-parameter values had been determined, the final parameter values could be derived from the training process using these hyper-parameters. The training process was achieved by the script in Figure 3.14.
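Repeating the training process with different hyper-parameter values could be organized as in the sketch below, which varies one hyper-parameter at a time while holding the others at their current values. This is only one plausible reading of the procedure. The function names, the candidate values and the toy scoring function, which stands in for a full training run scored on the validation data, are all assumptions, and the optimum it returns is arbitrary rather than a result of this study.

```python
def tune_one_at_a_time(evaluate, initial, candidates):
    """Vary one hyper-parameter at a time while holding the others fixed.

    evaluate:   callable mapping a dict of hyper-parameters to a validation loss
    initial:    dict of starting values
    candidates: dict mapping each hyper-parameter name to the values to try
    """
    best = dict(initial)
    for name, values in candidates.items():
        scores = {}
        for value in values:
            trial = dict(best, **{name: value})
            scores[value] = evaluate(trial)         # one full training run per trial
        best[name] = min(scores, key=scores.get)    # keep the lowest validation loss
    return best


def toy_validation_loss(params):
    """Stand-in for 'train the CNN, then score it on the validation data'."""
    return ((params["learning_rate"] - 0.01) ** 2
            + (params["train_rate"] - 0.6) ** 2)


initial = {"learning_rate": 0.001, "train_rate": 0.7}
candidates = {
    "learning_rate": [0.0001, 0.001, 0.01, 0.1],
    "train_rate": [0.5, 0.6, 0.7],
}
best = tune_one_at_a_time(toy_validation_loss, initial, candidates)
```

Because the testing data never enter `evaluate`, they remain untouched until the predict process.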

Figure 3.13 A forward step and a backward step of training process
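The forward and backward steps in Figure 3.13 can be illustrated with a much simpler model than a CNN. The sketch below uses a single linear layer with a softmax output, trained by plain gradient descent on synthetic data. The model, the data, the learning rate of 0.1 and the choice of optimizer are assumptions made for illustration; the section does not name the optimizer used in the thesis.

```python
import numpy as np


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)    # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def train_step(W, b, X, Y, learning_rate):
    """One forward step, loss evaluation, backward step and parameter update."""
    probs = softmax(X @ W + b)                            # forward step
    loss = -np.mean(np.sum(Y * np.log(probs + 1e-12), axis=1))
    grad_logits = (probs - Y) / X.shape[0]                # backward step
    grad_W = X.T @ grad_logits
    grad_b = grad_logits.sum(axis=0)
    W = W - learning_rate * grad_W                        # gradient-descent update
    b = b - learning_rate * grad_b
    return W, b, loss


rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))        # 60 pixels with 4 made-up features
y = (X[:, 0] > 0).astype(int)       # two synthetic classes
Y = np.eye(2)[y]                    # one-hot ground truth labels

W, b = np.zeros((4, 2)), np.zeros(2)
losses = []
for epoch in range(30):             # thirty epochs, as in this study
    W, b, loss = train_step(W, b, X, Y, learning_rate=0.1)
    losses.append(loss)
```

Each pass through the loop performs one forward step with the current parameters and one backward step that updates them, which is the cycle the figure depicts.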

3.7.5 Predict Process

The predict process can be regarded as a single forward step of the training process. Its aim was to predict LC types for the testing data using the parameters determined during the training process. The hyper-parameter values used in the predict process should be the optimal values decided in the previous step. This step was achieved by the script in Figure 3.15.

Figure 3.15 Predict process
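Once the forward step has produced class scores for each pixel, turning them into an LC map might look like the sketch below. It is not the script from Figure 3.15: the function name, the class numbering from 1 and the no-data value of 0 are assumptions, and the scores are made up.

```python
import numpy as np


def predictions_to_map(probabilities, positions, shape, nodata=0):
    """Place per-pixel class predictions back at their raster positions.

    probabilities: (n_pixels, n_classes) class scores from the forward step
    positions:     (n_pixels, 2) row and column of every predicted pixel
    shape:         (rows, cols) of the output raster
    """
    class_ids = probabilities.argmax(axis=1) + 1    # classes numbered from 1
    lc_map = np.full(shape, nodata, dtype=int)      # empty pixels keep nodata
    lc_map[positions[:, 0], positions[:, 1]] = class_ids
    return lc_map


probabilities = np.array([[0.7, 0.2, 0.1],
                          [0.1, 0.1, 0.8],
                          [0.2, 0.5, 0.3]])
positions = np.array([[0, 0], [0, 2], [1, 1]])
lc_map = predictions_to_map(probabilities, positions, shape=(2, 3))
```

Pixels that were never predicted, such as empty pixels outside the study area, keep the no-data value in the output map.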

3.7.6 Involved Hyper-parameters

As listed in Table 3.6, two key hyper-parameters were involved in the implementation of each CNN. An initial value was set for each hyper-parameter for use in the control variate method.

Table 3.6 Hyper-parameters involved in the implementation of CNNs

Hyper-parameter                                  Implemented step                                       1D CNN          2D CNN          3D CNN
Rate of training, validation, and testing data   Separation of training, validation, and testing data   70%, 10%, 20%   70%, 10%, 20%   70%, 10%, 20%
Learning rate                                    Training process                                       0.001           0.001           0.001

3.8 Methods of Accuracy Assessment

In document Convolutional Neural Networks for Land-cover Classification Using Multispectral Airborne Laser Scanning Data (Page 58-63)