Comparison of MARS Model and Multi-depth ANN Model Performance

Chapter 2 Data-Driven Synthesis of Compressional and Shear Travel Times for

2.6 Data-Driven Modeling

2.6.4 Comparison of MARS Model and Multi-depth ANN Model Performance

procedure to investigate the possibility of including depth dependencies to improve the prediction accuracy of the DTC and DTS logs. The 6 machine learning models built in the previous section use input data from a single depth to predict the corresponding DTC or DTS log. However, the petrophysical properties of sedimentary rocks are usually related when the rocks are located in neighboring formations. We assume that the well logs from the neighboring formations may provide useful information in the prediction of the DTC and DTS logs.

The two models implemented in this section follow the same training and testing procedure discussed in sections 2.5.3.1 and 2.5.3.2, but the train-test split is changed to take the depth dependencies into account. Instead of randomly sampling 70% of the data points as the train set, the 70% of data points from the upper formations are used for training, and the remaining 30% are used for testing. All data points from well #2 are used as a cross-well test set.
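The depth-ordered split described above can be sketched as follows. This is a minimal illustration, not the author's code: the function name, the `depths` and `rows` arguments, and the convention that a larger depth value means a deeper sample are all assumptions made for the example.

```python
def depth_ordered_split(depths, rows, train_percent=70):
    """Split samples by depth instead of at random.

    depths: measured depth of each sample (assumed: larger value = deeper).
    rows:   per-depth samples aligned with `depths` (hypothetical container).
    Returns (train, test): the shallowest `train_percent` percent of the
    samples and the remaining deeper samples.
    """
    if len(depths) != len(rows):
        raise ValueError("depths and rows must have the same length")
    # Sort sample indices from shallowest to deepest.
    order = sorted(range(len(depths)), key=lambda i: depths[i])
    # Integer arithmetic avoids floating-point rounding in the cut-off.
    n_train = len(order) * train_percent // 100
    train = [rows[i] for i in order[:n_train]]
    test = [rows[i] for i in order[n_train:]]
    return train, test


# Toy example: 10 samples at 0.5-unit depth spacing.
depths = [1000.0 + 0.5 * i for i in range(10)]
rows = [f"sample_{i}" for i in range(10)]
train, test = depth_ordered_split(depths, rows)
```

Unlike a random split, every test sample here lies deeper than every training sample, so the test set probes how well the model extrapolates to formations it has not seen.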

The two ANN models implemented in this section are trained and tested on the new train and test sets. The first ANN model is the same as the ANN model implemented in section 2.6.2 and serves as a baseline; in what follows, it is referred to as the single-depth ANN model. The second ANN model takes the depth dependency into consideration. Its input is the 8 input logs from three consecutive depths, and its output is the DTC or DTS log at the center depth. In other words, the second model uses information from three depths to predict the DTC or DTS log at a single depth. In what follows, it is referred to as the multi-depth ANN model. The structure of the model is shown in Figure 2-15.

Figure 2-15. Illustration of the multi-depth ANN model prediction process.
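One way to assemble the multi-depth input is sketched below. This is an assumption-laden illustration: the function name, the order in which the three depths are concatenated, and the choice to drop the first and last depths (which lack a neighbor on one side) are not specified in the text.

```python
def build_multi_depth_samples(logs, targets):
    """Build 3-depth input windows with the target taken at the center depth.

    logs:    list of per-depth feature lists, ordered by depth
             (8 input logs per depth in the setting described above).
    targets: DTC or DTS value at each depth, aligned with `logs`.
    Returns (features, labels), where each feature vector concatenates the
    logs of the depth above, the center depth, and the depth below.
    """
    if len(logs) != len(targets):
        raise ValueError("logs and targets must have the same length")
    features, labels = [], []
    # Assumed edge handling: skip the first and last depths.
    for center in range(1, len(logs) - 1):
        window = logs[center - 1] + logs[center] + logs[center + 1]
        features.append(window)
        labels.append(targets[center])
    return features, labels


# Toy example: 5 depths with 8 synthetic log readings each.
logs = [[float(depth * 10 + k) for k in range(8)] for depth in range(5)]
targets = [100.0 + depth for depth in range(5)]
features, labels = build_multi_depth_samples(logs, targets)
```

With 8 logs per depth, each input vector has 24 entries, and a sequence of N depths yields N - 2 samples under this edge handling.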

The two ANN models are trained with the train set from well #1, which is the upper 70% of the dataset. 5-fold cross-validation and grid search are used for hyperparameter tuning. The models’ stability is also investigated using the same bootstrap resampling method. The four metrics applied to the 6 shallow machine learning models are also applied to the two ANN models. The results are presented below.
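The combination of grid search and 5-fold cross-validation can be sketched as follows. To keep the example self-contained, a one-parameter ridge regression stands in for the ANN, and the penalty grid is invented for illustration; the actual hyperparameters tuned for the ANN models are not listed in this section.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle the sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_ridge_slope(xs, ys, penalty):
    """Stand-in model (not an ANN): y ~ w * x with an L2 penalty on w."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + penalty)


def mean_squared_error(xs, ys, slope):
    return sum((y - slope * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grid_search_cv(xs, ys, penalty_grid, k=5):
    """Return the penalty with the lowest mean validation error over k folds."""
    folds = k_fold_indices(len(xs), k)
    scores = {}
    for penalty in penalty_grid:
        fold_errors = []
        for held_out in folds:
            held = set(held_out)
            train_idx = [i for i in range(len(xs)) if i not in held]
            # Fit on the other k-1 folds, validate on the held-out fold.
            slope = fit_ridge_slope(
                [xs[i] for i in train_idx], [ys[i] for i in train_idx], penalty
            )
            fold_errors.append(
                mean_squared_error(
                    [xs[i] for i in held_out], [ys[i] for i in held_out], slope
                )
            )
        scores[penalty] = sum(fold_errors) / k
    best = min(scores, key=scores.get)
    return best, scores


# Toy data: a linear trend with a small alternating perturbation.
xs = [float(i) for i in range(1, 51)]
ys = [2.0 * x + (1.0 if i % 2 else -1.0) for i, x in enumerate(xs)]
best_penalty, cv_scores = grid_search_cv(xs, ys, [0.0, 10.0, 1000.0])
```

The same loop structure applies to any model: only the fit and scoring functions change, while the candidate grid and the fold assignment stay fixed so that all candidates are compared on identical validation data.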

Figure 2-16 compares the prediction results of the multi-depth ANN model and the MARS model on both DTC and DTS data. The first four tracks compare the predicted and actual DTC logs of the two models. The last four tracks compare the predicted and actual DTS logs.

Figure 2-16. Comparison of actual and predicted DTC and DTS logs of multi-depth ANN model and MARS model in Well #2.

Figure 2-17 and Figure 2-18 plot the corresponding relative error distributions of the single-depth and multi-depth ANN models. The distributions show that the relative error of both models is spread between 0 and 0.2. Compared to Figure 2-11, fewer data points fall in the range of 0 to 0.05, which means both ANN models perform worse than the MARS model.

Figure 2-17. Comparison of relative error distribution of DTC (Left) and DTS (Right) logs for the single-depth ANN model deployed in well #2.

Figure 2-18. Comparison of relative error distribution of DTC (Left) and DTS (Right) logs for the multi-depth ANN model deployed in well #2.
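The share of points falling in a given relative error range, as read off the distributions above, can be computed as in the sketch below. The definition of relative error as |predicted - actual| / |actual| is an assumption, since this section does not restate the formula, and the numbers are made up for illustration.

```python
def relative_errors(actual, predicted):
    """Assumed definition: |predicted - actual| / |actual| for each sample."""
    return [abs(p - a) / abs(a) for a, p in zip(actual, predicted)]


def fraction_in_range(errors, low, high):
    """Fraction of errors e satisfying low <= e < high."""
    return sum(low <= e < high for e in errors) / len(errors)


# Made-up travel-time values, for illustration only.
actual = [100.0, 80.0, 120.0, 90.0]
predicted = [102.0, 92.0, 118.0, 99.0]
errors = relative_errors(actual, predicted)

share_small = fraction_in_range(errors, 0.0, 0.05)  # errors below 5%
share_all = fraction_in_range(errors, 0.0, 0.2)     # errors below 20%
```

A model whose errors concentrate in the lowest bin has a larger `share_small`; comparing this share across models is a numerical counterpart to comparing the histograms by eye.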

The prediction performance of the single-depth and multi-depth ANN models, evaluated using the four metrics, is shown in Table A-1. The data in Table A-1 indicate that both the single-depth and multi-depth ANN models perform worse than the MARS model. Figure 2-8 and Figure 2-9 can also be used to compare the metrics of the two ANN models with those of the other machine learning models. The bootstrap results indicate that the single-depth and multi-depth models are much less stable than the other machine learning models.

The confidence intervals of the two ANN models are much larger than those of the other models. There are two likely causes. First, the models are trained only on the 70% of data points from the upper formation, so they do not learn the relationships contained in the lower 30% of data points. When exposed to a new data sample, the models may fail to predict the sonic logs. Second, the models are more complicated than the 6 shallow learning models. With more parameters, the models may become more unstable due to overfitting. The 6 shallow models do not contain as many parameters, and thus the input-output relationships they represent are relatively simple. More importantly, the relationships between the other well logs and the sonic logs are themselves simple; many researchers have applied simple empirical models to describe them. Applying complicated models to such a simple relationship is unnecessary and counterproductive.
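A confidence interval of the kind discussed above can be obtained with a percentile bootstrap, sketched below. The details are assumptions: this section does not say whether the bootstrap resamples the predictions or retrains the model on resampled training data, and the metric, number of resamples, and confidence level used here are illustrative choices.

```python
import random


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def bootstrap_interval(actual, predicted, metric, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `metric` (assumed scheme).

    Resamples (actual, predicted) pairs with replacement, recomputes the
    metric on each resample, and returns the empirical alpha/2 and
    1 - alpha/2 percentiles of the resulting values.
    """
    rng = random.Random(seed)
    n = len(actual)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([actual[i] for i in idx], [predicted[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up values, for illustration only.
actual = [100.0, 80.0, 120.0, 90.0, 110.0, 95.0]
predicted = [102.0, 92.0, 118.0, 99.0, 109.0, 90.0]
ci_low, ci_high = bootstrap_interval(actual, predicted, mean_absolute_error)
```

A wide interval means the metric changes substantially depending on which samples happen to be drawn, which is the sense in which a model is described as less stable.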

Comparing the prediction results of the two models shows that, on the test set in well #1, the multi-depth ANN model performs better than the single-depth ANN model, whereas the contrary is observed in well #2. This is due to the train-test split method used in this section. The training dataset is not randomly sampled, which means the train set, the test set in well #1, and the test set in well #2 each follow a different distribution. Even though the multi-depth model takes depth dependency into consideration, it still fails to generalize the relationship learned from the train set. When exposed to a new dataset, both the single-depth model and the multi-depth model fail.

In document Machine learning for the subsurface characterization at core, well, and reservoir scales (Page 71-77)