
HL Values (All Three Methods) Plotted Against Data Set Size

[Plot not reproduced. Series: FG, FP and ALG, each for the uncoded SBP and coded SBP models. X-axis: Data Set Size (0 to 8000). Y-axis: HL value (-100 to 700).]

Graph 5

Graph 5 shows that for nearly all data set values reducing the covariate pattern for the SBP variable results in an apparent improvement in the goodness of fit for the model. This effect occurs for all three methods of calculating the Hosmer Lemeshow statistic. All three methods produce similar values for the reduced (coded SBP) model. All three tests suggest that the reduced model (coded SBP model) has a much better fit than the uncoded SBP model as a consequence of reducing the covariate groupings.

RESULTS: Study 4

HL Value (Algorithm Method) Plotted Against Data Set Size

[Plot not reproduced. Series: ALG model 3, ALG model 2, ALG model 1, ALG control. X-axis: Data Set Size (1000 to 8000). Y-axis: HL value.]

Graph 6

Graph 6 shows that only model 3 (smallest age covariate group) shows an appreciable difference from the control model. The difference is present over all data set values. Reducing the number of age groupings had little impact on the HL value except for the smallest grouping pattern.

HL Value (Fixed Percentile Method) Plotted Against Data Set Size

[Plot not reproduced. Series: FP model 3, FP model 2, FP model 1, FP control. X-axis: Data Set Size (1000 to 8000). Y-axis: HL value.]

Graph 7

Graph 7 shows that changing the age covariate pattern has a variable effect on the HL value compared to the control model when using the Fixed Percentile method.

HL Value (Fixed Group Method) Plotted Against Data Set Size

[Plot not reproduced. Series: FG model 3, FG model 2, FG model 1, FG control. X-axis: Data Set Size (0 to 8000). Y-axis: HL value.]

Graph 8

Graph 8 shows an increase in the HL value for models 1 and 2 compared to the control for the majority of the data set values. For model 3 the majority of the HL values are lower than the control.

RESULTS: Study 5

HL Values (Algorithm Method) Plotted Against Data Set Size

[Plot not reproduced. Series: ALG model 3, ALG model 2, ALG model 1, ALG control. X-axis: Data Set Size (0 to 8000). Y-axis: HL value.]

Graph 9

Graph 9 shows no clear effect after changing the age covariate pattern on the HL value for the HCISS + Age model using the Algorithm method.

HL Values (Fixed Percentiles Method) Plotted Against Data Set Size

[Plot not reproduced. Series: FP model 3, FP model 2, FP model 1, FP control model. X-axis: Data Set Size (0 to 8000). Y-axis: HL value.]

Graph 10

Graph 10 shows a marginal increase in the HL values for the three models when compared to the control model (HCISS + Age). Changing the age covariate pattern therefore had little effect on the HCISS + Age model using the fixed percentile HL method.

HL Values (Fixed Group Method) Plotted Against Data Set Size

[Plot not reproduced. Series: FG model 3, FG model 2, FG model 1, FG control model. X-axis: Data Set Size (0 to 8000). Y-axis: HL value.]

Graph 11

Graph 11 shows no clear effect on the HL values when compared to the control model (HCISS + Age). Changing the age covariate pattern therefore had little effect on the HCISS + Age model using the fixed group HL method.

RESULTS: Study 6

HL Value (Algorithm Method) Plotted Against Data Set Size

[Plot not reproduced. Series: ALG model 3, ALG model 2, ALG model 1, ALG control model. X-axis: Data Set Size (0 to 8000). Y-axis: HL value.]

Graph 12

Graph 12 shows no appreciable effect on the HL values when compared to the control model (HCISS + RTS + Age), except for model 2 at data set sizes 2000 and 3000. Changing the age covariate pattern therefore had little effect on the HCISS + RTS + Age model using the algorithm HL method.

HL Value (Fixed Percentile Method) Plotted Against Data Set Size

[Plot not reproduced. Series: FP model 3, FP model 2, FP model 1, FP control model. X-axis: Data Set Size (0 to 8000). Y-axis: HL value.]

Graph 13

Graph 13 shows that model 3 has a higher HL value over the full range of data set values when compared to the control model. The remaining two models (1 and 2) have values which fluctuate around the control model (HCISS + RTS + Age). Changing the age covariate pattern therefore had an unpredictable effect on the HCISS + RTS + Age model using the fixed percentile HL method.

HL Value (Fixed Group Method) Plotted Against Data Set Size

[Plot not reproduced. Series: FG model 3, FG model 2, FG model 1, FG control. X-axis: Data Set Size (0 to 8000). Y-axis: HL value.]

Graph 14

Graph 14 shows little effect of changing the age covariate pattern on the HL value when compared to the control model. Changing the age covariate pattern therefore had little effect on the HCISS + RTS + Age model using the fixed group HL method.

Section 5: Discussion

The results from these studies show that reducing the covariate pattern has a variable effect on the HL statistic. Study 1 demonstrated that, using the fixed percentile method, reducing the HCISS model into 6 or 12 groups resulted in model over-fit. This effect was greatest for the model with 6 groups using clinical cut-off points and least for the model with 12 random groups. The effect was more erratic with the other two methods of calculating the HL statistic. The increase in data set size could be a confounding factor because of the corresponding potential increase in the number of HCISS groups represented. An analysis of this showed that for a data set size of 1000 cases 38 HCISS groups were represented. For a data set size of 7000 only an additional 3 HCISS groups were added.
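The check described above, counting how many covariate groups are represented in a sample before and after recoding, can be sketched as follows. This is a minimal illustration only: the scores, the cut points and the function names `count_patterns` and `recode` are hypothetical and are not taken from the study data or its software.

```python
from bisect import bisect_right
from typing import List, Sequence


def count_patterns(values: Sequence[float]) -> int:
    """Number of distinct covariate values (patterns) present in a sample."""
    return len(set(values))


def recode(values: Sequence[float], cut_points: Sequence[float]) -> List[int]:
    """Map each value to the index of the band it falls in.

    cut_points must be sorted ascending; a value equal to a cut point
    is placed in the upper band.
    """
    return [bisect_right(cut_points, v) for v in values]


# Hypothetical scores and cut-offs, for illustration only.
scores = [1, 4, 9, 16, 25, 25, 41]
bands = recode(scores, cut_points=[10, 20])

print(count_patterns(scores))  # distinct patterns before recoding
print(count_patterns(bands))   # distinct patterns after recoding
```

Running the same count on samples of increasing size would show whether the number of represented groups grows with the data set, which is the confounding effect considered in the paragraph above.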

The problem of model over-fit by reducing the number of covariate groups was most marked for the model with SBP as the sole predictor variable (study 3). The HL results for this model imply significant over-fit as the HL values are close to zero for all three methods of calculating the HL test. A similar effect was seen in the model where GCS was the sole predictor variable (study 2). Coding the GCS using the triage RTS values again resulted in over prediction of the model fit. The effect was most marked for the algorithm method. The model with age as the sole predictor variable demonstrated a variable response (study 4). Only the smallest covariate pattern resulted in over prediction of the model fit; the effect was seen for all three methods of calculating the HL statistic.

In study 5, reducing the age covariate pattern in the model with two predictor variables (HCISS + Age) had little effect in terms of over prediction. The fixed percentile method resulted in slight under prediction for all three models compared to the control. The results for the fixed percentile method were more consistent than those for the algorithm and fixed group methods. The algorithm and fixed group methods produced some under and some over prediction for the three models. The model with the smallest covariate pattern (model 3) resulted in the greatest degree of over prediction compared to the control model. In study 6 the model with three predictors also produced variable results. The algorithm and fixed group methods showed no real trend, with some over prediction and some under prediction.

Hosmer et al (1988) found that small cell sizes (less than 5) can result in large values of the HL test. No previous studies have looked specifically at the effect of changing the covariate pattern on the HL statistic.
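To make the point about small cells concrete, the sketch below computes an HL-type statistic and reports the smaller expected count in each group, so that cells below 5 can be spotted. It assumes the textbook form of the statistic, with observations sorted by predicted probability and split into equal-sized groups. That grouping rule is an assumption made for illustration and is not necessarily any of the three methods (algorithm, fixed percentile, fixed group) compared in these studies. The function name `hosmer_lemeshow` is likewise hypothetical.

```python
from typing import List, Sequence, Tuple


def hosmer_lemeshow(
    y: Sequence[int], p: Sequence[float], groups: int = 10
) -> Tuple[float, List[float]]:
    """Return (HL statistic, smaller expected cell count for each group).

    y: observed outcomes coded 0/1.
    p: predicted probabilities from the fitted model.
    Assumption: groups are equal-sized slices of the data sorted by p.
    """
    if len(y) != len(p):
        raise ValueError("y and p must have the same length")
    # Stable sort on the predicted probability only.
    pairs = sorted(zip(p, y), key=lambda t: t[0])
    n = len(pairs)
    hl = 0.0
    small_cells: List[float] = []
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        n_g = len(chunk)
        observed = sum(yi for _, yi in chunk)
        expected = sum(pi for pi, _ in chunk)
        # Skip degenerate groups where the variance term would be zero.
        if expected <= 0 or expected >= n_g:
            continue
        # (O - E)^2 / (n * pbar * (1 - pbar)), with pbar = E / n.
        hl += (observed - expected) ** 2 / (expected * (1 - expected / n_g))
        small_cells.append(min(expected, n_g - expected))
    return hl, small_cells


# Hypothetical data: two groups, each with a small expected cell.
hl_value, cells = hosmer_lemeshow([0, 0, 1, 1], [0.2, 0.2, 0.8, 0.8], groups=2)
print(hl_value, [c < 5 for c in cells])
```

Because the expected count appears in the denominator, a group with a very small expected cell can contribute a large term to the sum even when the absolute difference between observed and expected counts is modest.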

Conclusions

In summary, the results of this study have shown that reducing the covariate pattern of a variable by recoding can have an appreciable impact on the HL value and may result in over prediction of the model goodness of fit. The effect depends on the variable being recoded. The effect was smaller in models with more than one predictor variable. All three methods of calculating the HL value were sensitive to changes in covariate pattern.

CHAPTER 7

A COMPARISON OF SIX GOODNESS OF FIT
