served incidence in the sample population. The method can also improve the discrimination and prediction rules. However, a biased dataset (one collected on the basis of unusual occurrences for one or more of the variables) could hinder the model in new populations because the observed mean for the variable coefficient could be unrealistic. Re-estimation towards the mean does not evaluate each variable independently and therefore will not identify problems in accurately estimating risk that are caused by irrelevant variables included in the prediction model.
Re-estimation towards the mean has been applied to update prediction models. This method has been successful when the original model was poorly calibrated and also had poor discriminative ability [139]. Additionally, for a model with poor discriminative ability and a large model updating dataset, extending the model improved performance further than re-estimation did [145].
In summary, this method is most successful for a model that has strong discriminative ability but requires recalibration for accurate predictions in a new target population, where the model updating dataset is a sample of that target population. If the model has poor discriminative ability, then model extension or the other model re-estimation method (Section 8.6.2) may be more appropriate.
8.7 Model Extension
The final set of single model updating methods is defined as model extension. These methods incorporate additional variables in the prediction model. A more robust model could be created because additional key variables that explain risk can be incorporated.
However, there are potential risks with model extension, including over-fitting by attempting to explain the difference between the predicted and observed risks rather than the underlying condition [144]. The model may then underperform if applied in different target populations. These concerns can be reduced by only considering variables with a significant association with the disease. Additionally, the original model should be recalibrated first; otherwise additional variables may be included merely because they compensate for a calibration deficiency in the original model and thereby improve the model goodness of fit.
To identify additional predictors to be included in a model, the hazard or odds ratios can be evaluated. If a statistically significant ratio is observed then the variable should be considered. However, a variable associated with the disease does not guarantee an improvement in model performance as “Ware and Pepe showed simple examples in which enormous odds ratios were required to meaningfully increase the AUC” [139, 146, 147]. It is important to avoid including too many additional predictors, which may not significantly improve the model performance and could result in over-fitting, so that the model is then unsuccessful in new populations.
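The relationship between an odds ratio and the AUC can be illustrated with a minimal sketch. It is not the calculation used in the cited studies: it assumes a single continuous marker that is normally distributed with equal variance in diseased and disease-free individuals, in which case the odds ratio per standard deviation is exp(d) and the AUC is Phi(d / sqrt(2)), where d is the mean difference in standard deviation units. The function names are hypothetical, and the sketch gives the AUC of the marker alone rather than the increase in AUC when the marker is added to an existing model.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def auc_from_odds_ratio(odds_ratio_per_sd):
    """AUC of one marker under the equal-variance binormal assumption."""
    separation = math.log(odds_ratio_per_sd)  # mean difference in SD units
    return STD_NORMAL.cdf(separation / math.sqrt(2))


def odds_ratio_for_auc(target_auc):
    """Odds ratio per SD needed to reach a target AUC (same assumption)."""
    separation = math.sqrt(2) * STD_NORMAL.inv_cdf(target_auc)
    return math.exp(separation)


for odds_ratio in (1.5, 2.0, 3.0, 5.0, 10.0):
    auc = auc_from_odds_ratio(odds_ratio)
    print(f"OR per SD = {odds_ratio:4.1f} -> AUC = {auc:.3f}")
```

Under these assumptions an odds ratio of 3 per standard deviation, which would usually be regarded as a strong association, corresponds to an AUC of roughly 0.78, and an AUC of 0.9 requires an odds ratio above 6.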
8.7.1 Re-estimation and Extension
To apply this method, the first stage is to recalibrate the original model. The model variables are re-estimated towards the recalibrated null model using the techniques previously presented in Section 8.6.2. The model is then extended to include all additional variables available in the model updating dataset or identified as being associated with the disease through previous testing.
As previously discussed, including all additional parameters may not improve the model performance, and could limit the model in new populations. This approach can cause over-fitting by considering too many parameters, especially if the model updating dataset is small [144]. This could create a final model that, while successful when internally validated, may perform poorly in new populations. A more selective model extension technique that identifies the appropriate variables, rather than including all available information, could create a more robust model.
Previous studies have been critical of this method when applied to clinical prediction models. While this method has been shown to be beneficial if the original model had a poor overall performance [143, 145], the updated models have been criticised. This method should be applied with caution in small datasets, because the method places a comparatively large weight on the updating data, and hence would be prone to peculiarities of the updating dataset [141]. A study updating a prostate cancer prediction model found that, while additional markers have the potential to improve discrimination, they should be selected using forward selection rather than all being included, as including all of them created an inferior prediction model when externally validated [145].
8.7.2 Selective Re-estimation and Selective Extension with Recalibration
The first stage is to recalibrate and re-estimate the original model, as presented in Section 8.6.2, before the model is extended. New variables can be included if they are shown to have an association with the condition in the model updating dataset. A second approach is to include the new variables using forward selection from the selectively re-estimated model if they offer a significant improvement to the model goodness of fit, measured by chi-squared. Incorporating the new parameters using forward selection is preferred, as previous studies have shown that incorporating new variables based on their association with the condition may not lead to improved calibration or discriminative ability [139, 146, 147].
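The forward selection step can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the cited studies: the improvement in goodness of fit is measured with a likelihood-ratio chi-squared test on one degree of freedom, the original model enters as a single re-estimated linear predictor (the hypothetical column `lp`) rather than through the selective re-estimation of Section 8.6.2, and the data, the column names `x1` and `x2`, and the 0.05 threshold are invented for the example.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def fit_loglik(rows, y, iters=25):
    """Fit a logistic model by Newton-Raphson; return the maximised log-likelihood."""
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for x, yi in zip(rows, y):
            mu = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, x))))
            for j in range(p):
                grad[j] += (yi - mu) * x[j]
                for k in range(p):
                    hess[j][k] += mu * (1.0 - mu) * x[j] * x[k]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    ll = 0.0
    for x, yi in zip(rows, y):
        eta = sum(b * v for b, v in zip(beta, x))
        ll += yi * eta - math.log1p(math.exp(eta))
    return ll


def design(data, names):
    """Design matrix with an intercept followed by the named columns."""
    n = len(next(iter(data.values())))
    return [[1.0] + [data[c][i] for c in names] for i in range(n)]


def forward_select(data, y, base, candidates, alpha=0.05):
    """Add, one at a time, the candidate with the largest significant LR statistic."""
    selected, remaining = [], list(candidates)
    current_ll = fit_loglik(design(data, base), y)
    while remaining:
        scored = []
        for name in remaining:
            ll = fit_loglik(design(data, base + selected + [name]), y)
            lr = max(0.0, 2.0 * (ll - current_ll))  # clamp rounding noise
            p_value = math.erfc(math.sqrt(lr / 2.0))  # chi-squared tail, 1 df
            scored.append((lr, p_value, name, ll))
        lr, p_value, name, ll = max(scored)
        if p_value >= alpha:
            break
        selected.append(name)
        remaining.remove(name)
        current_ll = ll
    return selected


# Invented updating dataset: (lp, x1) -> (events, non-events), repeated for x2 = 0 and 1,
# so x2 is balanced by construction and carries no information about the outcome.
cells = {(-1.0, 0.0): (2, 18), (-1.0, 1.0): (8, 12), (1.0, 0.0): (8, 12), (1.0, 1.0): (18, 2)}
data, y = {"lp": [], "x1": [], "x2": []}, []
for (lp, x1), (events, non_events) in cells.items():
    for x2 in (0.0, 1.0):
        for outcome, count in ((1, events), (0, non_events)):
            data["lp"] += [lp] * count
            data["x1"] += [x1] * count
            data["x2"] += [x2] * count
            y += [outcome] * count

selected = forward_select(data, y, base=["lp"], candidates=["x1", "x2"])
print(selected)
```

In this constructed dataset only `x1` is selected, because `x2` adds nothing to the fit once the original linear predictor is in the model. In practice an established statistical package would be used for the model fitting; the hand-written Newton-Raphson routine is included only to keep the sketch self-contained.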
This method can create a robust model with improved calibration, discrimination, and prediction rules. Including only parameters that improve the model goodness of fit limits the risk of over-fitting caused by including all additional variables. The main concern with this method is that, if the model updating dataset has an unusual participant recruitment, variables may be included in the final model that do not assist in estimating the likelihood of a disease developing. One such example would be if all diseased individuals were recruited dependent on having a different pre-existing condition; that condition is likely to show a significant association with the disease which may not be observed in real populations. Precautions can be taken by conducting a review of the dataset recruitment process and analysing the population demographic. The model should also be externally validated in comparison to the original model to assess whether a more robust model has indeed been developed.
Model extension has commonly been applied and is likely to become an increasingly popular method as genetic markers are incorporated into existing models, as has been observed in lung cancer (Section 3). One study found that when predictor effects are heterogeneous between the development and validation samples, extensive updating was required [143]. Re-estimating the existing variables and then selectively including additional variables was a successful method [143]. A separate study compared different strategies for extending a model with a new variable. The study found that when the dataset used to extend the prediction model was small, simple re-estimation methods led to the largest increase in discriminative ability of the prediction model, but as the available sample size increased, more extensive extension methods outperformed re-estimation techniques [145].
8.7.3 Re-estimation and Selective Extension without Recalibration
This method is similar to the model extension method described in Section 8.7.2; however, the preliminary work differs. The original model is not recalibrated because the model is shrunk around the mean incidence rate observed in the model updating dataset, as presented in Section 8.6.3. The model is then extended using forward selection to identify new variables that improve the model goodness of fit, as presented in Section 8.7.2.
Selective extension limits the risk of over-fitting [144] because only variables associated with the disease are included rather than all available variables. However, the method could still be affected by an unusual recruitment of diseased and disease-free individuals in the model updating dataset. Additionally, a poor original model calibration may see a large number of new variables included to rectify the original model calibration deficiency. This could result in an extended model that underperforms in new populations because the included variables explained the difference between the predicted and observed risk in the model updating dataset rather than the underlying condition. Therefore, it is important to assess how the model performs in new populations, and whether the new variables included in the extended model created a more robust model in comparison to the original model.
In summary, this method has been successful when the original model is poorly calibrated and has poor discriminative ability [143]. Indeed, with a large dataset available to perform the model updating, for a model with a poor original performance, model extension outperformed simpler model updating methods [145]. Additionally, using stepwise forward selection created a more robust model than including all available additional variables [145]. However, any peculiarities in the model updating dataset will be reflected in the new model, which could then underperform in new populations [141].
8.8 Summary of Methods to Update a Single Prediction Model
There are a number of available methods to update a single prediction model, each of which can be advantageous in different scenarios. Model recalibration only alters the model calibration. These methods would be preferable for a model with strong discrimination and prediction rules performance but poor calibration in a sample population that is reflective of the new target population in which the model will be applied.
However, for a model with a poor overall performance, more extensive updating methods may be beneficial. These can include re-estimating the model or extending it to incorporate new variables. The updated models will improve the original model calibration in an internal dataset, but the calibration, discrimination, and prediction rules should be evaluated in external datasets to determine whether a more robust model has been produced in comparison to the original model.
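The earlier statement that recalibration only alters the model calibration can be checked with a short sketch. It assumes logistic recalibration, in which the original linear predictor is transformed with a new intercept and a positive slope; the data and the intercept and slope values are hypothetical and chosen only for illustration. Because the transformation preserves the ordering of individuals, the AUC is unchanged while the mean predicted risk moves.

```python
import math


def expit(z):
    """Inverse logit: convert a linear predictor to a predicted risk."""
    return 1.0 / (1.0 + math.exp(-z))


def auc(scores, outcomes):
    """Mann-Whitney AUC: proportion of case-control pairs ranked correctly (ties count 1/2)."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def recalibrate(linear_predictor, intercept, slope):
    """Apply a recalibration intercept and slope to the original linear predictor."""
    return [intercept + slope * v for v in linear_predictor]


# Hypothetical linear predictors from an original model and observed outcomes.
lp = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
outcome = [0, 0, 1, 0, 0, 1, 1, 1]

# Hypothetical recalibration values; in practice these are estimated from the updating data.
lp_updated = recalibrate(lp, intercept=-0.3, slope=0.8)

auc_before, auc_after = auc(lp, outcome), auc(lp_updated, outcome)
mean_before = sum(expit(v) for v in lp) / len(lp)
mean_after = sum(expit(v) for v in lp_updated) / len(lp_updated)

print(f"AUC before = {auc_before:.3f}, after = {auc_after:.3f}")
print(f"Mean predicted risk before = {mean_before:.3f}, after = {mean_after:.3f}")
```

Re-estimation and extension, by contrast, change the relative weighting of the variables and can therefore change the ranking of individuals, which is why they can affect discrimination as well as calibration.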