Random forest model - ML model building

4 ML model development

4.2 ML model building

4.2.4 Random forest model

The random forest models were built with the ranger R package. This package provides the ranger function, which accepts various hyperparameters as well as the set of input features and the target variable as arguments. The following sections describe the process by which the five random forest models were developed, as per the Model Building phase depicted in Figure 20.

4.2.4.1 Hyperparameter grid creation

The hyperparameters that were optimised during the development of the random forest models were the number of trees to grow (B), the number of variables to sample per tree (K), the size of the samples drawn per tree (L) as well as the minimum size of the leaf nodes of the trees (n). The randomForest default values for these parameters were 500, a third of the input features, two-thirds of the observations in the training data set and 5, respectively. These defaults were used as the starting points for a grid search procedure that sought to optimise the model hyperparameters.

Table 28: Random forest hyperparameter grid

Hyperparameter   Option 1                 Option 2                 Option 3                 Option 4                 Option 5
B                500                      550                      600                      650                      700
K                1/6 x No. Features       1/4 x No. Features       1/3 x No. Features       N/A                      N/A
L                1/2 x No. Observations   2/3 x No. Observations   3/4 x No. Observations   4/5 x No. Observations   N/A
n                3                        5                        7                        9                        N/A

The search space for the hyperparameter grid search algorithm is provided in Table 28. The number of trees to generate was allowed to range upward from the default value. This was done because ranger’s default value for this parameter prioritised execution time of the algorithm. By allowing this parameter to increase upward from the default, the random forest algorithm was provided with a larger search space to optimise this hyperparameter, at the cost of having longer model training times, which was an acceptable downside in this situation. In a similar fashion, the remainder of the hyperparameter grid was created to provide a hyperparameter search space that is wider than the default parameters.
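The size of the search space in Table 28 can be illustrated with a short sketch. The original work was carried out in R, so the Python below is only an illustration and not the author's code. The function name build_grid, the example feature and observation counts, and the choice to round fractional counts down are all assumptions made for the example.

```python
from fractions import Fraction
from itertools import product

# Candidate values transcribed from Table 28.
TREES = [500, 550, 600, 650, 700]                                     # B
FEATURE_FRACTIONS = [Fraction(1, 6), Fraction(1, 4), Fraction(1, 3)]  # K
SAMPLE_FRACTIONS = [Fraction(1, 2), Fraction(2, 3),
                    Fraction(3, 4), Fraction(4, 5)]                   # L
MIN_LEAF_SIZES = [3, 5, 7, 9]                                         # n


def build_grid(n_features, n_observations):
    """Expand Table 28 into a list of concrete hyperparameter configurations.

    Rounding fractional counts down (and keeping K at least 1) is an
    assumption of this sketch; the source does not state how fractions
    were converted to counts.
    """
    grid = []
    for b, k_frac, l_frac, n in product(
        TREES, FEATURE_FRACTIONS, SAMPLE_FRACTIONS, MIN_LEAF_SIZES
    ):
        grid.append({
            "B": b,
            "K": max(1, int(k_frac * n_features)),
            "L": int(l_frac * n_observations),
            "n": n,
        })
    return grid


# Hypothetical data set dimensions, chosen only to make the arithmetic clean.
grid = build_grid(n_features=24, n_observations=1200)
print(len(grid))  # 5 * 3 * 4 * 4 = 240 configurations
```

With five, three, four and four options respectively, the grid contains 240 configurations, each of which is evaluated during tuning.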

4.2.4.2 Hyperparameter tuning

A combination of grid search and K-fold cross-validation was used to tune the random forest hyperparameters. The grid search was achieved by creating four sequentially nested for-loops, each of which looped through the array of one of the hyperparameters.

The K-fold cross-validation was achieved by creating a final for-loop within the grid-search for-loop structure. The procedure within this loop worked as follows. As in the ANN hyperparameter tuning procedure, the training data set was shuffled and split into three equal-sized parts. Then, the hyperparameters selected by the outer for-loops were passed to the ranger random forest function call, along with two-thirds of the shuffled training data set. The function used this hyperparameter configuration and the two-thirds training data to build a random forest, which was then validated against the remaining third of the data set. This process was executed three times per hyperparameter configuration, which made it a threefold cross-validation process.

Finally, the average threefold cross-validation F1 score was calculated to measure the performance of every hyperparameter configuration. The best performing hyperparameter configurations were selected for each wear type. The final random forest configurations for each of the wheel wear measurements are provided in Table 29.

Table 29: Best performing random forest hyperparameter grid

Measure   B     K                    L                        n
FH        550   1/6 x No. Features   3/4 x No. Observations   9
TD        550   1/4 x No. Features   2/3 x No. Observations   9
HW        500   1/6 x No. Features   3/4 x No. Observations   5
FT        650   1/3 x No. Features   3/4 x No. Observations   5
FS        550   1/3 x No. Features   1/3 x No. Observations   7
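The grid search and threefold cross-validation described in Section 4.2.4.2 can be sketched as follows. This is a minimal Python illustration of the loop structure only, not the R code used in the study. The callable fit_and_score stands in for training a random forest on two-thirds of the data and returning its F1 score on the remaining third. It, toy_fit_and_score, the fixed shuffling seed and the small example grid are all hypothetical.

```python
import random
from statistics import mean


def three_fold_indices(n_rows, seed=0):
    """Shuffle the row indices and split them into three near-equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # fixed seed is an assumption
    return [indices[i::3] for i in range(3)]


def cross_validate(config, n_rows, fit_and_score, seed=0):
    """Average the validation score over the three train/validate splits."""
    folds = three_fold_indices(n_rows, seed)
    scores = []
    for held_out in range(3):
        validation_rows = folds[held_out]
        training_rows = [
            i for f, fold in enumerate(folds) if f != held_out for i in fold
        ]
        scores.append(fit_and_score(config, training_rows, validation_rows))
    return mean(scores)


def grid_search(trees, feature_counts, sample_sizes, leaf_sizes,
                n_rows, fit_and_score):
    """Four nested loops, one per hyperparameter, with CV innermost."""
    best_config, best_score = None, float("-inf")
    for b in trees:
        for k in feature_counts:
            for l in sample_sizes:
                for n in leaf_sizes:
                    config = {"B": b, "K": k, "L": l, "n": n}
                    score = cross_validate(config, n_rows, fit_and_score)
                    if score > best_score:  # first best is kept on ties
                        best_config, best_score = config, score
    return best_config, best_score


def toy_fit_and_score(config, training_rows, validation_rows):
    """Hypothetical stand-in for 'train the forest, return validation F1'."""
    return 1.0 - abs(config["B"] - 550) / 1000 - abs(config["n"] - 9) / 100


best_config, best_score = grid_search(
    trees=[500, 550, 600], feature_counts=[4, 6], sample_sizes=[600, 800],
    leaf_sizes=[3, 9], n_rows=12, fit_and_score=toy_fit_and_score,
)
print(best_config, best_score)
```

In practice the stand-in scorer would be replaced by an actual model fit, and the search would be repeated once per wear measurement to obtain one configuration per model.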

4.2.4.3 Final model training and evaluation

The final phase of the random forest model development process entailed training the random forest models for each of the wheel wear measurements on the entire training data set, with the hyperparameters set to those in Table 29. As with the ANN and logistic regression models, each model’s performance was evaluated based on confusion matrix statistics, ROC curves and the AUC measure.
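For readers unfamiliar with how an ROC curve and its AUC are obtained from a model's predicted scores, the following sketch shows one common construction: sweep the decision threshold over the distinct scores, record the (false positive rate, true positive rate) pairs, and integrate with the trapezoidal rule. It is a generic Python illustration with made-up labels and scores, and it makes no claim about how the curves in this chapter were produced.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by lowering the threshold step by step.

    labels are 0/1 class labels, scores are predicted scores for class 1.
    Assumes both classes are present at least once.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Made-up example: three positives and three negatives.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(labels, scores)
print(auc(points))  # 8 of the 9 positive/negative pairs are ranked correctly
```

An AUC of 1 corresponds to a curve passing through the (0,1) point, while an AUC of 0.5 corresponds to the diagonal from (0,0) to (1,1).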

i) FH model evaluation

The confusion matrix for the FH wear prognostic random forest model is provided in Table 30. From Table 30, accuracy, sensitivity, specificity, and the F1 measure were calculated. These values are provided in Table 31. An ROC curve, which is illustrated in Figure 44, was produced for the random forest model performance on the test set. The associated AUC value was 0.897.

Table 30: Confusion matrix for random forest model of FH wear prognostics

                   Predicted Class = 1   Predicted Class = 0
Actual Class = 1   4’820                 5’218

Table 31: Confusion matrix metrics for random forest model of FH wear prognostics

Metric        Value
Sensitivity   0.480
Specificity   0.998
Accuracy      0.935
F1            0.964

Figure 44: ROC curve for random forest model of FH wear prognostics

ii) TD model evaluation

Table 32: Confusion matrix for random forest model of TD wear prognostics

                   Predicted Class = 1   Predicted Class = 0
Actual Class = 1   41’398                0
Actual Class = 0   41’492                12

The confusion matrix for the TD wear prognostic random forest model is provided in Table 32. From Table 32, accuracy, sensitivity, specificity, and the F1 measure were calculated. These values are provided in Table 33. An ROC curve, which is illustrated in Figure 45, was produced for the random forest model performance on the test set. The associated AUC value was 0.913.

Table 33: Confusion matrix metrics for random forest model of TD wear prognostics

Metric        Value
Sensitivity   1
Specificity   0.0003
Accuracy      0.5
F1            0.0006

Figure 45: ROC curve for random forest model of TD wear prognostics

iii) HW model evaluation
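The values in Table 33 can be reproduced from the counts in Table 32 using the standard confusion matrix formulas, as the sketch below shows. The Python function confusion_metrics is a hypothetical helper written for this illustration. Note that the reported F1 value of 0.0006 is reproduced only when class 0 is treated as the positive class for the F1 calculation. This is an inference from the numbers and is not stated in the text. With class 1 as the positive class, the same counts give an F1 of roughly 0.666.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Standard metrics for whichever class is counted as 'positive'."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Counts transcribed from Table 32, with class 1 as the positive class.
td_class1 = confusion_metrics(tp=41398, fn=0, fp=41492, tn=12)

# The same matrix with class 0 treated as the positive class
# (an assumption used here only to compare against the reported F1).
td_class0 = confusion_metrics(tp=12, fn=41492, fp=0, tn=41398)

print(round(td_class1["sensitivity"], 4))  # 1.0
print(round(td_class1["specificity"], 4))  # 0.0003
print(round(td_class1["accuracy"], 3))     # 0.5
print(round(td_class1["f1"], 3))           # 0.666
print(round(td_class0["f1"], 4))           # 0.0006
```

The near-zero specificity shows that the model assigned almost every observation to class 1, which is why the accuracy sits at about 0.5 on this nearly balanced test set.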

The confusion matrix for the HW wear prognostic random forest model is provided in Table 34. From Table 34, accuracy, sensitivity, specificity, and the F1 measure were calculated. These values are provided in Table 35. An ROC curve, which is illustrated in Figure 46, was produced for the random forest model performance on the test set. The associated AUC value was 0.847.

Table 34: Confusion matrix for random forest model of HW wear prognostics

                   Predicted Class = 1   Predicted Class = 0
Actual Class = 1   1’397                 5’703

Table 35: Confusion matrix metrics for random forest model of HW wear prognostics

Metric        Value
Sensitivity   0.196
Specificity   0.993
Accuracy      0.925
F1            0.960

Figure 46: ROC curve for random forest model of HW wear prognostics

iv) FS model evaluation

The confusion matrix for the FS wear prognostic random forest model is provided in Table 36. From Table 36, accuracy, sensitivity, specificity, and the F1 measure were calculated. These values are provided in Table 37. An ROC curve, which is illustrated in Figure 47, was produced for the random forest model performance on the test set. The associated AUC value was 0.989.

Table 36: Confusion matrix for random forest model of FS wear prognostics

                   Predicted Class = 1   Predicted Class = 0
Actual Class = 1   52’481                3’191

Table 37: Confusion matrix metrics for random forest model of FS wear prognostics

Metric        Value
Sensitivity   0.943
Specificity   0.940
Accuracy      0.942
F1            0.914

Figure 47: ROC curve for random forest model of FS wear prognostics

v) FT model evaluation

The confusion matrix for the FT wear prognostic random forest model is provided in Table 38. From Table 38, accuracy, sensitivity, specificity, and the F1 measure were calculated. These values are provided in Table 39. An ROC curve, which is illustrated in Figure 48, was produced for the random forest model performance on the test set. The associated AUC value was 0.842.

Table 38: Confusion matrix for random forest model of FT wear prognostics

                   Predicted Class = 1   Predicted Class = 0
Actual Class = 1   1’537                 5’869

Table 39: Confusion matrix metrics for random forest model of FT wear prognostics

Metric        Value
Sensitivity   0.208
Specificity   1
Accuracy      0.929
F1            0.962

Figure 48: ROC curve for random forest model of FT wear prognostics

4.2.4.4 Random forest performance summary

The random forest model performed very well when it came to providing FH wear prognostics. The model achieved a high accuracy rate of 93.5% and a high AUC of 0.897. The model also achieved a moderately high sensitivity rate, indicating that the model was capable of separating the target variable classes. The ROC curve was indicative of a healthy ML model: it curved strongly toward the (0,1) point before bending away toward the (1,1) point.

Random forest had immense difficulty modelling the TD wear. The ROC curve exhibited strange behaviour, with a slight increase in the true positive rate near the (1,1) end of the curve. This indicates that there were two regions of the model’s decision threshold in which the true positive rate increased relative to the false positive rate, which is unexpected. This further supports the notion that the extrapolated TD data exhibited anomalous behaviour.

Random forest performed well when it came to providing HW prognostics. The model achieved an accuracy rate of 92.5% and had an AUC of 0.847, which is quite high. The ROC curve was also indicative of a healthy and well-behaved model, similar to the FH case.

Just like the ANN model, the random forest did extremely well when it came to providing FS wear prognostics. The ROC curve reached close to the (0,1) point and had an AUC of 0.989, which is extremely high. This indicates that the model was capable of separating the target variable classes. To this end, the model achieved an accuracy rate of 94.2%.

Finally, random forest performed well when it came to FT wear. The model attained an accuracy rate of 92.9% and had a high AUC of 0.842. The model had a healthy ROC curve which tended toward the (0,1) point before bending away toward the (1,1) point, indicating that the model was capable of separating the target variable classes.

4.3 Chapter summary

In this chapter, the processes followed to build the three ML models for this project are described. The first section describes how the wheel wear measurement data set was ingested and processed in R. It then describes how the data was formatted and how the outliers in the data set were dealt with. After that, it describes how domain-specific knowledge was used to engineer features that were added to the data set. The chapter concludes with a description of how the logistic regression, ANN and random forest ML models were developed, and provides the performance results of the final models for each model type and wheel wear type.

In document Implementation of machine learning techniques for railway wheel prognostics (Page 99-108)