Comparison between the different stratification methodologies

Chapter 4 Results

4.6.5. Comparison between the different stratification methodologies

Perhaps surprisingly, the four risk allocation methodologies broadly agree, although they are founded on very different principles. However, at the level of detail there are important differences. In particular, it is generally the case in breast cancer that the population of operable patients comprises a very well surviving group and another, thankfully a much smaller group, with especially poor survival. Nevertheless, it is the accurate discrimination and grouping of patients in the mid-surviving groups that is of most interest, since these two groups of patients are those likely to benefit most from better targeting of therapy.

The prognostic indices obtained with Cox proportional hazards, PI_Cox, and with PLANN-ARD modelling, PI_PLANN-ARD, as well as the mode, across the 10 imputed data sets, of the 6 variables found to be the most predictive, were used for the different stratification methodologies, for both the training and validation data sets.

clustering approach, 4 different risk groups were found. The clustering methodology based on learning metrics found 5 different risk groups. Unlike the regression tree methodology and the log-rank bootstrap aggregation method, the two clustering methods did not produce risk groups with distinct observed survival, as measured by Kaplan-Meier (KM) actuarial estimates, for either the training or the validation data set. Group separation is therefore much better for the regression tree and bootstrap log-rank methodologies. These two methodologies yield very similar survival, for both prognostic indices and for both training and validation data sets. However, the pairwise log-rank values and the KM curves show that the bootstrap log-rank methodology separates the risk groups better. Although survival is similar for both methods, group membership is not the same, as can be seen in Table 4.39.

Regression tree Cox (columns) by bootstrap log-rank Cox (rows), training data set

Group      1      2      3      4   Total
1        291     73      1      0     365
2          2    170      1      0     173
3          0     92     69      5     166
4          0      1      6     32      39
Total    293    336     77     37     743

Table 4.39 – Patients’ cross-tabulation between two different stratification methodologies, regression tree and bootstrap log-rank aggregation. Separate tables are given for the prognostic index obtained with Cox proportional hazards and for the prognostic index obtained with PLANN-ARD, for the training data set (743 patients) and for the validation data set (4016 patients).
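As a worked check on the training-set Cox panel of Table 4.39, the sketch below counts how many patients receive the same risk group from both methods, and how many are placed in a higher group by one method than by the other. The function name `agreement_summary` and the variable names are illustrative assumptions. The counts are copied from the table, with rows read as bootstrap log-rank groups and columns as regression tree groups.

```python
# Training data set, Cox prognostic index (Table 4.39).
# Rows: bootstrap log-rank risk groups 1-4; columns: regression tree risk groups 1-4.
CROSS_TAB = [
    [291, 73, 1, 0],
    [2, 170, 1, 0],
    [0, 92, 69, 5],
    [0, 1, 6, 32],
]


def agreement_summary(table):
    """Return (same, row_higher, col_higher, total) for a square cross-tabulation."""
    same = row_higher = col_higher = 0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if i == j:
                same += count          # both methods assign the same group
            elif i > j:
                row_higher += count    # row method assigns the higher risk group
            else:
                col_higher += count    # column method assigns the higher risk group
    total = same + row_higher + col_higher
    return same, row_higher, col_higher, total


same, bootstrap_higher, tree_higher, total = agreement_summary(CROSS_TAB)
print(f"same group: {same}/{total} ({same / total:.1%})")
print(f"bootstrap log-rank assigns higher group: {bootstrap_higher}")
print(f"regression tree assigns higher group: {tree_higher}")
```

The same function can be applied to the other panels by substituting their counts.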

For the training data set, with the exception of the 4th risk group, the bootstrap log-rank aggregation is generally more conservative in terms of patients’ risk group allocations than the regression tree method. However, this pattern is seen mainly for the prognostic index obtained with Cox proportional hazards; for the one obtained with PLANN-ARD, the risk group

Regression tree PLANN (columns) by bootstrap log-rank PLANN (rows), training data set

Group      1      2      3      4   Total
1        284     40      0      1     325
2          9    166      5      1     181
3          0     39     93     17     149
4          0      2      6     80      88
Total    293    247    104     99     743

Regression tree Cox (columns) by bootstrap log-rank Cox (rows), validation data set

Group      1      2      3      4   Total
1       1371    381      0      0    1752
2         20    900     18      1     939
3          7    628    356     64    1055
4          0     10     58    202     270
Total   1398   1919    432    267    4016

Regression tree PLANN (columns) by bootstrap log-rank PLANN (rows), validation data set

Group      1      2      3      4   Total
1       1439    325      6      1    1771
2         68    744     62     19     893
3          3    146    554    192     895
4          0      1     10    446     457
Total   1510   1216    632    658    4016

Cox proportional hazards is used as the prognostic index. Conversely, with the prognostic index obtained with PLANN-ARD, the regression tree is more conservative in terms of patients’ allocations than the bootstrap log-rank aggregation method.

Consequently, the bootstrap log-rank method clearly showed the better discrimination in survival between the most and least surviving groups, and it is more conservative than regression trees, drawing a substantial number of patients from group 2 into group 3. This effect is more pronounced when the linear survival estimator is used, in part reflecting the observation that the non-linear estimator, PLANN-ARD, is itself slightly more conservative than Cox regression with respect to these two risk groups.

[Figure 4.31: panels labelled by Regression tree Cox risk group (1 to 4) and by Bootstrap log-rank Cox risk group]

Figure 4.31 – Survival curves obtained for the patients’ cross-tabulation. They were obtained with the regression tree and bootstrap log-rank stratification methodologies, using the Cox proportional hazards prognostic index, for the validation data set.

To check the consistency of patient allocation between the two stratification methodologies, regression tree and bootstrap log-rank, for each prognostic index, Cox proportional hazards and PLANN-ARD, the survival curves for the preceding cross-tabulations are plotted in Figure 4.31 and Figure 4.32 for the validation data set. For both prognostic indices, the survival curves obtained with the bootstrap log-rank stratification methodology are more consistent than those obtained with the regression tree methodology. This finding is more evident for the Cox proportional hazards model.

[Figure 4.32: panels labelled by Regression tree PLANN-ARD risk group (1 to 4) and by Bootstrap log-rank PLANN-ARD risk group]

Figure 4.32 – Survival curves obtained for the patients’ cross-tabulation. They were obtained with the regression tree and bootstrap log-rank stratification methodologies, using the PLANN-ARD prognostic index, for the validation data set.

Comparing the KM curves for PI_Cox and PI_PLANN-ARD confirms that, for both algorithms (bootstrap log-rank aggregation and regression tree), survival is lower for the risk groups obtained using PI_Cox, for both training and validation data sets. This is because patients are allocated to higher risk groups, as can be seen in Table 4.40. The effect is more pronounced with the regression tree stratification methodology.

Regression tree: Cox (columns) by PLANN-ARD (rows), training data set

Group      1      2      3      4   Total
1        280     13      0      0     293
2         12    235      0      0     247
3          0     85     19      0     104
4          1      3     58     37      99
Total    293    336     77     37     743

Regression tree: Cox (columns) by PLANN-ARD (rows), validation data set

Group      1      2      3      4   Total
1       1329    181      0      0    1510
2         64   1152      0      0    1216
3          0    560     72      0     632
4          5     26    360    267     658
Total   1398   1919    432    267    4016

Table 4.40 – Risk groups’ cross-tabulation between different models. Patients are cross-tabulated by the risk groups obtained using the PI from Cox and the PI from PLANN-ARD, separately for the regression tree method and for the bootstrap log-rank aggregation, and separately for the training data set (743 patients) and the validation data set (4016 patients).

To check the consistency of patient allocation between the two prognostic indices, Cox proportional hazards and PLANN-ARD, under both the regression tree and bootstrap log-rank stratification methodologies, the survival curves are plotted in Figure 4.33 and Figure 4.34. For each stratification methodology, the survival curves obtained with the PLANN-ARD prognostic index are more consistent than those obtained with the Cox proportional hazards prognostic index. This finding is more evident for the regression tree stratification.

Bootstrap log-rank: Cox (columns) by PLANN-ARD (rows), training data set

Group      1      2      3      4   Total
1        322      3      0      0     325
2         43    137      1      0     181
3          0     33    116      0     149
4          0      0     49     39      88
Total    365    173    166     39     743

Bootstrap log-rank: Cox (columns) by PLANN-ARD (rows), validation data set

Group      1      2      3      4   Total
1       1709     62      0      0    1771
2         43    813     37      0     893
3          0     64    824      7     895
4          0      0    194    263     457
Total   1752    939   1055    270    4016

[Figure 4.33: panels labelled by Regression tree Cox risk group (1 to 4) and by Regression tree PLANN-ARD risk group]

Figure 4.33 – Survival curves obtained for the patients’ cross-tabulation. They were obtained with the regression tree stratification methodology for both indices, Cox proportional hazards and PLANN-ARD, for the validation data set.

[Figure 4.34: panels labelled by Bootstrap log-rank Cox risk group (1 to 4) and by Bootstrap log-rank PLANN-ARD risk group]

Figure 4.34 – Survival curves obtained for the patients’ cross-tabulation. They were obtained with the bootstrap log-rank stratification methodology for both indices, Cox proportional hazards and PLANN-ARD, for the validation data set.

Across all the risk groups, the closest agreement between the training and validation data sets was achieved by log-rank bootstrap aggregation using PLANN as the prognostic model. The next closest agreement was for the regression tree stratification methodology, also using PLANN as the prognostic model. There is therefore greater similarity in 5-year survival between the training and validation data sets when PLANN is used as the prognostic model.

4.7 - OSRE and CART rules comparison

For each rule extraction methodology, CART and OSRE, and for each model used, Cox proportional hazards modelling and PLANN-ARD, a different set of rules is obtained. These rules must be applied to new patients to determine the risk group to which they belong.

For a given model, the rules obtained with the regression tree methodology are mutually exclusive, unlike those obtained with OSRE. With OSRE, a patient can satisfy several rules, and these rules may or may not belong to the same risk group. OSRE does determine a rule hierarchy: for each patient, each rule is tested in turn and, as soon as a rule is met for that patient profile, it is not necessary to continue through the hierarchy. Even so, a patient can meet the requirements of rules from different risk groups. A conservative approach was chosen here: the patient is assigned to the higher risk group. When the OSRE rules were applied to the development data set, 22 patients were not classified for PLANN modelling and 23 patients were not classified for Cox modelling. When they were applied to the validation data set, 227 patients were not classified for PLANN modelling and 205 patients were not classified for Cox modelling. These patients were considered outliers.
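The conservative allocation described above can be sketched as follows. This is a minimal illustration and not the OSRE algorithm itself. The rule conditions, the patient fields `nodes` and `size_cm`, and the name `assign_risk_group` are all hypothetical and chosen only to show overlapping rules. A patient who meets no rule is returned as unclassified.

```python
from typing import Callable, Dict, List, Optional, Tuple

Patient = Dict[str, float]
Rule = Tuple[Callable[[Patient], bool], int]  # (condition, risk group)


def assign_risk_group(patient: Patient, rules: List[Rule]) -> Optional[int]:
    """Return the highest risk group among all rules the patient satisfies.

    Returns None when no rule is met, i.e. the patient is left unclassified.
    """
    matched = [group for condition, group in rules if condition(patient)]
    return max(matched) if matched else None


# Hypothetical overlapping rules; fields and thresholds are invented for illustration.
EXAMPLE_RULES: List[Rule] = [
    (lambda p: p["nodes"] == 0 and p["size_cm"] <= 2.0, 1),
    (lambda p: p["nodes"] <= 3, 2),
    (lambda p: p["size_cm"] > 5.0, 4),
]

patients = [
    {"nodes": 0, "size_cm": 1.5},   # meets the rules for groups 1 and 2
    {"nodes": 2, "size_cm": 6.0},   # meets the rules for groups 2 and 4
    {"nodes": 10, "size_cm": 3.0},  # meets no rule
]
for patient in patients:
    print(patient, "->", assign_risk_group(patient, EXAMPLE_RULES))
```

Taking the maximum over all matching rules is one way to express "the patient is assigned to the higher risk group". With mutually exclusive rules, as in CART, at most one rule matches and the maximum is simply that rule's group.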

The rules obtained with the two methodologies, OSRE and CART, were compared, and the methodologies were found to derive generally the same number of rules. It therefore cannot be confirmed that one methodology is more parsimonious than the other.

For both rule extraction methodologies, more rules were needed to specify patient membership using the PLANN-ARD prognostic model than using Cox modelling. More rules were similar between the two methodologies when the PLANN-ARD model was used rather than proportional hazards modelling: 6 rules versus 2 rules for the development data set and 5 rules versus 1 rule for the validation data set.

As was done for patients’ risk group membership, the consistency of the rules can also be analysed, both across stratification methodologies and across prognostic models, to establish more precisely which stratification methodology performs better in terms of rules.

The rules obtained with the OSRE and CART methodologies were compared for both Cox modelling and PLANN-ARD modelling, through the KM curves’ analysis and the statistical

Using the prognostic index obtained with Cox modelling, the CART rules are more consistent than the OSRE rules, because more KM curves are statistically different within the same rule for the OSRE methodology than for the CART methodology. For PLANN-ARD modelling this could not be corroborated: for both development and validation data there is no evidence of more statistically different KM curves for either the OSRE or the CART methodology.

The rules obtained with each stratification methodology were also compared in a different way: for each of OSRE and CART, the rules obtained using the Cox model were compared with the rules obtained using the PLANN-ARD model. The rules obtained with the CART methodology are very similar to each other, 9 for the development and 7 for the validation data set. However, for both the development and validation data sets (5 and 10 years of follow-up), there is generally more consistency in the rules obtained with PLANN-ARD than in those obtained with Cox. The rules obtained with the OSRE methodology are less similar, 7 for the development and 5 for the validation data set. Nevertheless, the Cox rules are more consistent than the PLANN-ARD rules for the development data set, while the contrary holds for the validation data set, at both 5 and 10 years of follow-up.

4.8 - Interval estimates of individual prognosis

Following the methodology explained in Chapter 2 for individual prognostic predictions with confidence intervals using the PLANN-ARD model, a survival distribution was obtained for the training data set, as can be seen in Figure 4.35. From this distribution, a mean value and the 95% confidence interval can be obtained for each patient.

The median survival estimate across all of the training data at the end of follow-up is 0.8149, and the KM estimated survival is 0.8748. Box plots of the individual survival estimates, split into the four PLANN-ARD prognostic groups obtained with the CART methodology, are shown in Figure 4.36. The mean of the individual survival estimates predicted by the PLANN-ARD model in each group can be compared with the observed group survival estimated with the Kaplan-Meier method at 5 years of follow-up, shown for each risk group in Table 4.41. Inspection of the table shows that the model predictions are generally conservative, being lower than the KM estimates for the different risk groups.

Figure 4.35 – Survival Distribution for an individual patient.

It was calculated from 1000 iterations of estimated survival for an individual patient for the training data set, where the mean survival and the 95% confidence intervals can be obtained.
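One simple way to summarise such a set of 1000 replicate estimates is sketched below. It assumes an empirical percentile interval, which may differ from the procedure set out in Chapter 2. The replicate values are simulated stand-ins drawn from a Beta distribution, not thesis data, and the name `summarize_replicates` is hypothetical.

```python
import random
import statistics


def summarize_replicates(estimates, level=0.95):
    """Return (mean, lower, upper) using an empirical percentile interval."""
    ordered = sorted(estimates)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(tail * n)]                       # about the 2.5th percentile
    upper = ordered[min(n - 1, int((1.0 - tail) * n))]   # about the 97.5th percentile
    return statistics.fmean(ordered), lower, upper


# Stand-in data: 1000 simulated survival estimates for one hypothetical patient.
rng = random.Random(0)
replicates = [rng.betavariate(40, 10) for _ in range(1000)]

mean, low, high = summarize_replicates(replicates)
print(f"mean = {mean:.3f}, 95% interval = ({low:.3f}, {high:.3f})")
```

The percentile interval makes no distributional assumption about the replicates, which suits survival probabilities bounded between 0 and 1.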

Figure 4.36 – Box plots of individual survival estimates to 5 years. These are separated into PLANN-ARD CART risk groups.

Risk    Mean predicted   Individual estimate   Individual estimate   KM         KM estimate   KM estimate
group   survival         95% low               95% high              estimate   95% low       95% high
1       0.917            0.846                 0.949                 0.983      0.958         0.99
2       0.809            0.697                 0.883                 0.903      0.851         0.93
3       0.603            0.541                 0.732                 0.798      0.695         0.857
4       0.421            0.235                 0.65                  0.566      0.448         0.646
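The gap between the observed and predicted group survival can be read off the risk-group table above (referred to in the text as Table 4.41). The short sketch below does this arithmetic, using values copied from that table. The names `TABLE` and `prediction_gaps` are illustrative only.

```python
# Risk group -> (mean predicted survival, KM estimate, KM 95% low, KM 95% high),
# copied from the risk-group table above.
TABLE = {
    1: (0.917, 0.983, 0.958, 0.990),
    2: (0.809, 0.903, 0.851, 0.930),
    3: (0.603, 0.798, 0.695, 0.857),
    4: (0.421, 0.566, 0.448, 0.646),
}


def prediction_gaps(table):
    """Return {group: KM estimate minus mean predicted survival}, to 3 decimals."""
    return {
        group: round(km - predicted, 3)
        for group, (predicted, km, _km_low, _km_high) in table.items()
    }


gaps = prediction_gaps(TABLE)
for group, gap in gaps.items():
    print(f"risk group {group}: KM - predicted = {gap:.3f}")
```

A positive gap in every group corresponds to the model predictions being lower than the KM estimates.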

4.9 - Comparison between the existing prognostic groups and the proposed ones

As previously mentioned, there are several clinical prognostic classification schemes proposed for breast cancer patients, some of which discriminate between the survival of different risk groups defined from the patient characteristics, such as the TNM staging system. The most widely used nowadays are the Nottingham prognostic index (NPI) and the consensus rules agreed by the St. Gallen group.

By cross-matching these prognostic classification schemes with the new prognostic indices obtained with Cox proportional hazards and PLANN-ARD, followed by the regression tree stratification methodology, it is possible to examine survival for patient sub-groups, using Kaplan-Meier estimated survival curves, and so uncover heterogeneity among the prognostic groups. This can be done for both training and validation data sets.

In document Prognostic modelling of breast cancer patients: a benchmark of predictive models with external validation (Page 153-163)