The HTO dataset is expected to require different values for the parameters specified by the DFSP than those used for the suicide dataset. In this section, the values for Count-Limit, IV-Thresh and DV-Thresh are determined.
The first parameter to be determined is the Count-Limit. Analysing the data to estimate this parameter provides an understanding of the minimum and maximum acceptable number of features that the DFSP can use to produce good results. The aim is to find a low count such that speed is not jeopardised while producing acceptable accuracy. Determining Count-Limit is a step towards estimating the values of IV-Thresh and DV-Thresh.
Comparisons of accuracy and shifted accuracy for the 11 risk levels using feature counts 1 to 20 are shown in Figures 7.1 to 7.6. Different values were tested for the variable par, which tunes how far the prediction is stretched away from the mean in equation 5.1. Given the difficult nature of the data, a low par was required to give more stretch to the results. In all the results of this chapter, the value of par is 0.25.
Figure 7.1 shows that the shifted accuracy for both risks 0 and 1 is almost constant across all feature counts. For risk level 0, the accuracy started at 0% with 1 feature and then increased gradually as more features were added. For risk level 1, on the other hand, it started at 63% and decreased as more features were added. Since the shifted accuracy was not affected by the increase in accuracy for risk 0 and the decrease for risk 1, some of the records that were classified as risk 1 are classified as risk 0 once more features are added. In other words, the excess features introduced information that shifted the results towards risk 0, which shows the need for a feature limit.
7.2 Determining parameters
Fig. 7.1 Accuracy and shifted accuracy for HTO patients at (a) risk level 0 and (b) risk level 1, using feature counts ranging from 1 to 20. The results are shown for the correlation (Corr) criterion.
Fig. 7.2 Accuracy and shifted accuracy for HTO patients at (a) risk level 2 and (b) risk level 3, using feature counts ranging from 1 to 20. The results are shown for the correlation (Corr) criterion.
Fig. 7.3 Accuracy and shifted accuracy for HTO patients at (a) risk level 4 and (b) risk level 5, using feature counts ranging from 1 to 20. The results are shown for the correlation (Corr) criterion.
Fig. 7.4 Accuracy and shifted accuracy for HTO patients at (a) risk level 6 and (b) risk level 7, using feature counts ranging from 1 to 20. The results are shown for the correlation (Corr) criterion.
Fig. 7.5 Accuracy and shifted accuracy for HTO patients at (a) risk level 8 and (b) risk level 9, using feature counts ranging from 1 to 20. The results are shown for the correlation (Corr) criterion.
Fig. 7.6 Accuracy and shifted accuracy for risk 10 using feature counts ranging from 1 to 20. The results are shown for the correlation (Corr) criterion.
Figure 7.2a shows that the shifted accuracy for risk 2 degrades quickly as more features are added. This is caused by the increase in risk 1 records misclassified as risk 0 when more features are added. The accuracy for both risks 2 and 3 degrades slowly as more features are added.
Figures 7.3 to 7.5 show that, from 3 features onwards, the accuracy and shifted accuracy are almost constant, with a slight degradation when the feature count exceeds 17. Finally, Figure 7.6 shows that, from a feature count of 3 onwards, the accuracy and shifted accuracy increase gradually.
To choose the number of features at which to limit the algorithm, a few points need to be noted. When the feature count exceeds 12, the accuracy starts to degrade for risks 1, 2, 3, 4, 6, 8 and 9. At the same feature count, the shifted accuracy also starts to degrade for risks 1, 2, 3, 4, 5, 6, 7, 8 and 9. The only two risks that show a big improvement with more than 12 features are risks 0 and 10, but at the cost of degrading most of the other risks.
Since the DFSP aims to give good accuracy for all risk levels, 12 features is the upper bound for the choice.
Risks ranging between 1 and 10 produced good accuracy and shifted accuracy results with only 3 features, but this was not the case for risk 0. Risk 0 started to give better results using 7 features. This makes 7 to 12 the more promising range for the feature count.
Therefore, a limit of 8 was chosen: it lies near the low end of this range while producing good accuracy results, and so keeps the percentage of misclassification low. In addition, the lower the feature count, the faster new models can be created.
After the feature count was decided, the next step was to estimate the value for DV-Thresh.
Table 7.1 compares the accuracy and shifted accuracy across all risk levels for different values of DV-Thresh. The difference in accuracy and shifted accuracy between the thresholds is minimal. The 14 dropouts at thresholds 0.01 and 0.009 are records whose most correlated answered feature has a relevance with the DV of less than 0.004. The information provided for these records was minimal and the output risks assigned were below level 3. Therefore, it was not necessary to decrease the threshold any further. DV-Thresh was then set to 0.01 so that the lowest number of patient records are dropped, the mean sample size is large and the accuracy is better.
DV-Thresh          0.04             0.03             0.02             0.01             0.009
Mean Num of feats  7.935            7.951            7.973            7.986            7.989
Mean Sample Size   6027.7           5869.9           5743.7           5566             5561.5
Dropouts           37               37               26               14               14
Risk category      ACC     S.ACC    ACC     S.ACC    ACC     S.ACC    ACC     S.ACC    ACC     S.ACC
0                  31.61%  81.39%   31.62%  81.35%   31.65%  81.36%   31.72%  81.37%   31.72%  81.37%
1                  44.63%  85.80%   44.58%  85.83%   44.58%  85.83%   44.53%  85.84%   44.53%  85.85%
2                  28.02%  77.86%   28.00%  77.85%   28.05%  77.88%   28.04%  77.86%   28.03%  77.86%
3                  22.73%  61.10%   22.66%  61.05%   22.63%  61.06%   22.69%  61.07%   22.69%  61.07%
4                  20.16%  54.20%   20.21%  54.14%   20.23%  54.22%   20.25%  54.22%   20.25%  54.22%
5                  17.51%  47.73%   17.48%  47.73%   17.43%  47.73%   17.43%  47.73%   17.43%  47.73%
6                  15.63%  44.49%   15.63%  44.49%   15.63%  44.49%   15.63%  44.49%   15.63%  44.49%
7                  16.06%  46.34%   16.06%  46.34%   16.06%  46.34%   16.06%  46.34%   16.06%  46.34%
8                  16.29%  47.05%   16.29%  47.05%   16.29%  47.05%   16.29%  47.14%   16.29%  47.14%
9                  17.56%  55.52%   17.56%  55.52%   17.56%  55.52%   17.56%  55.52%   17.56%  55.52%
10                 24.24%  42.42%   24.24%  42.42%   24.24%  42.42%   24.24%  42.42%   24.24%  42.42%
Table 7.1 Results of testing the algorithm with DV-Thresh ranging from 0.04 to 0.009 while setting the feature limit to 8: the ACC column is the percentage of predictions exactly matching the clinical judgement and the S.ACC column is the percentage that either match the judgement or are one away from it.
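The two measures defined in the caption of Table 7.1 can be illustrated with a minimal sketch. It assumes that risk levels are integers from 0 to 10 and that results are grouped by the clinically judged level; the function name `accuracy_by_risk` and its arguments are hypothetical and are not taken from the DFSP implementation.

```python
from collections import defaultdict


def accuracy_by_risk(judgements, predictions, tolerance=1):
    """Return {risk_level: (acc, shifted_acc)} as percentages.

    acc counts predictions equal to the clinical judgement;
    shifted_acc counts predictions within `tolerance` levels of it.
    Grouping by the judged level is an assumption of this sketch.
    """
    totals = defaultdict(int)
    exact = defaultdict(int)
    near = defaultdict(int)
    for judged, predicted in zip(judgements, predictions):
        totals[judged] += 1
        distance = abs(predicted - judged)
        if distance == 0:
            exact[judged] += 1
        if distance <= tolerance:
            near[judged] += 1
    return {
        level: (100.0 * exact[level] / totals[level],
                100.0 * near[level] / totals[level])
        for level in sorted(totals)
    }


# Made-up toy data, used only to show the call.
judged = [0, 0, 0, 0, 5, 5]
predicted = [0, 1, 2, 0, 4, 7]
results = accuracy_by_risk(judged, predicted)
```

By construction the shifted accuracy can never fall below the accuracy for the same level, which matches every ACC and S.ACC pair reported in the table.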
After choosing the values for Count-Limit and DV-Thresh, the final step was to determine IV-Thresh. Table 7.2 shows the results of applying different values of IV-Thresh using the chosen values for DV-Thresh and Count-Limit. IV-Thresh controls the accepted level of redundancy in the selected subset of features. The table shows that the accuracy and shifted accuracy for risks 0 and 1 improve with lower values of IV-Thresh, which makes these risks the most sensitive to redundancy. It also shows that risks 3, 4, 5 and 9 produced better accuracy and risks 2, 3, 5, 7 and 8 produced better shifted accuracy with IV-Thresh set to 0.8. Risks 2, 6 and 9 produced better shifted accuracy, and risk 8 better accuracy, with IV-Thresh set to 0.9. Finally, the accuracy of risks 2, 6, 7 and 10, along with the shifted accuracy of risk 10, was best without IV-Thresh. Most levels were affected by the use of IV-Thresh, which indicates the existence of redundant features. The value chosen for IV-Thresh was 0.8, to balance the need for a low value for risks 0 and 1 against the need for a high value for the remaining risks.
IV-Thresh          0.4              0.5              0.6              0.7              0.8              0.9
Risk category      ACC     S.ACC    ACC     S.ACC    ACC     S.ACC    ACC     S.ACC    ACC     S.ACC    ACC     S.ACC
0                  35.37%  83.22%   34.43%  83.16%   35.32%  82.97%   33.80%  82.31%   32.42%  81.55%   32.37%  81.53%
1                  45.20%  87.17%   45.77%  87.21%   45.58%  87.43%   45.46%  86.63%   44.96%  86.10%   44.86%  86.03%
2                  27.12%  77.66%   27.05%  78.10%   27.20%  78.04%   27.16%  78.42%   27.23%  78.08%   27.33%  78.02%
3                  22.17%  60.29%   22.31%  60.27%   22.62%  60.46%   22.57%  60.77%   22.93%  61.12%   22.93%  61.08%
4                  18.76%  53.49%   18.51%  53.51%   19.18%  53.78%   20.18%  54.03%   20.50%  54.10%   20.46%  54.09%
5                  15.83%  45.92%   15.86%  45.90%   15.40%  45.82%   16.70%  46.61%   17.26%  47.98%   17.15%  47.93%
6                  14.07%  42.21%   14.18%  42.16%   13.68%  42.27%   14.74%  43.44%   15.63%  44.44%   15.57%  44.49%
7                  16.86%  44.31%   16.31%  44.06%   16.68%  43.57%   15.82%  44.37%   15.57%  46.40%   15.75%  46.28%
8                  14.86%  43.62%   15.05%  43.62%   14.57%  42.29%   14.76%  45.52%   16.57%  46.86%   16.57%  46.48%
9                  15.01%  54.11%   15.01%  53.82%   14.73%  52.69%   17.85%  52.41%   17.28%  54.96%   17.56%  55.24%
10                 26.26%  43.43%   27.27%  43.43%   26.26%  43.43%   27.27%  43.43%   26.26%  44.44%   26.26%  44.44%
Table 7.2 Results of accuracy and shifted accuracy for IV-Thresh ranging between 0.4 and 0.9. All results are based on setting the DV-Thresh to 0.01 and Count-Limit to 8 features.
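The roles of the three parameters chosen in this section can be illustrated with a short sketch. It assumes a greedy correlation-based filter: features answered in a record are ranked by their relevance with the DV, those below DV-Thresh are discarded, a feature is skipped when its correlation with an already selected feature exceeds IV-Thresh, and selection stops at Count-Limit. The ranking rule, the dictionary inputs and the name `select_features` are illustrative assumptions and not the DFSP definition itself.

```python
def select_features(relevance, redundancy, answered,
                    count_limit=8, dv_thresh=0.01, iv_thresh=0.8):
    """Greedy filter sketch (hypothetical, not the DFSP itself).

    relevance:  {feature: absolute correlation with the DV}
    redundancy: {frozenset({f, g}): absolute correlation between f and g};
                missing pairs are treated as uncorrelated
    answered:   features answered in the record being predicted
    """
    # Keep only answered features relevant enough to the DV, best first.
    candidates = sorted(
        (f for f in answered if relevance.get(f, 0.0) >= dv_thresh),
        key=lambda f: relevance[f],
        reverse=True,
    )
    selected = []
    for feature in candidates:
        if len(selected) == count_limit:
            break
        # Skip a feature that is too correlated with one already kept.
        redundant = any(
            redundancy.get(frozenset((feature, kept)), 0.0) > iv_thresh
            for kept in selected
        )
        if not redundant:
            selected.append(feature)
    # An empty list means no answered feature passed DV-Thresh,
    # which this sketch treats as a dropout.
    return selected


# Made-up values, used only to show the call.
relevance = {"a": 0.50, "b": 0.45, "c": 0.20, "d": 0.005}
redundancy = {frozenset(("a", "b")): 0.95}
chosen = select_features(relevance, redundancy, ["a", "b", "c", "d"])
```

In this toy call, feature "b" is skipped because it is redundant with "a", and "d" is discarded for falling below DV-Thresh. Under the same assumptions, lowering `iv_thresh` rejects more candidates as redundant, which is the trade-off explored in Table 7.2.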
With the parameters estimated, the results of applying the DFSP algorithm to the HTO dataset were obtained. The next step was to compare them with other methods, which is presented in the following section.