7.3 Model estimations with one latent variable
7.3.6 Model comparison
Now we want to find out which of the analyzed models can be considered the “best” model; the basic question is which covariates should be incorporated into the predictor of the structural equation, and in which form. For model comparison we use the two versions of the DIC defined in Equations (5.7) and (5.10). The difference between the two versions is that DIC1 uses the estimated latent scores z, whereas DIC2 omits the latent variables and uses the expected value of the structural equation η instead. The results of the DIC1 for all models with one latent variable are summarized in Table 7.16; the values obtained for the DIC2 can be found in Table 7.17.
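As a reading aid for the tables, the following minimal sketch shows how the tabulated quantities relate to each other. It assumes the usual decomposition of the DIC, with pD equal to the posterior mean deviance minus the deviance at the posterior means, and DIC equal to the posterior mean deviance plus pD; this is an assumption about Equations (5.7) and (5.10), although it reproduces the reported values up to rounding. The function names are hypothetical and chosen for illustration only.

```python
from statistics import fmean


def dic_from_deviances(mean_deviance, deviance_at_mean):
    """Return (pD, DIC) under the assumed decomposition.

    mean_deviance    -- posterior mean of the deviance
    deviance_at_mean -- deviance evaluated at the posterior means
    """
    p_d = mean_deviance - deviance_at_mean  # effective number of parameters
    dic = mean_deviance + p_d               # fit plus complexity penalty
    return p_d, dic


def dic_from_samples(deviance_samples, deviance_at_mean):
    """Same as above, but averages MCMC deviance draws first."""
    return dic_from_deviances(fmean(deviance_samples), deviance_at_mean)


# Row M1 of Table 7.16 (DIC1, deviance conditional on the latent scores z)
p_d1, dic1 = dic_from_deviances(188291.61, 174790.10)

# Row M1 of Table 7.17 (DIC2, deviance based on the expected value eta)
p_d2, dic2 = dic_from_deviances(248261.82, 248222.39)

print(round(p_d1, 2), round(dic1, 2))
print(round(p_d2, 2), round(dic2, 2))
```

The two versions differ only in which deviance is supplied: DIC1 uses a deviance that conditions on the estimated latent scores z, whereas DIC2 uses one based on η. The arithmetic that turns the deviances into pD and DIC is the same in both cases, and the printed values agree with the M1 rows of both tables to within 0.01.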
Let us start with the discussion of the DIC1. Model M2b, with a purely parametric predictor and interactions, and model M4b, with a combination of parametric covariates and smooth functions of metric and spatial covariates, have the lowest DIC1 values; hence these two models might be preferred to the other models. However, the behaviour of the deviance at the posterior means D(θ̄, z̄), the posterior mean deviance D̄(θ, z), and pD is rather unusual compared to generalized regression models. Using the results listed in Table 7.16 and additional DIC1 analyses based on some simulation models of Section 6.3, the following observations can be made:
• The effective number of parameters pD is about 5-20% below the number of observations for models without indirect effects, i.e. for classic factor analysis models.
• The values of the deviances D(θ̄, z̄) and D̄(θ, z) are always lowest for models without indirect effects. If indirect covariates (both parametric and nonparametric) are included in the model, the values of both deviances increase. This behaviour contrasts sharply with standard regression models where the inclusion and addition of covariates always reduces the deviance values.
• If indirect covariates are added to a model, the number of effective parameters pD decreases on a scale which does not depend on the number of parameters in the predictor of the structural equation. Again this response is different from standard regression models, where the number of effective parameters increases when further covariates are added to an analysis. In our LVM, however, the inclusion of covariates reduces the already high number of effective parameters of the LVM without indirect effects; the covariates seem to explain the fluctuations of the latent scores and therefore decrease the number of effective parameters.

Model  Predictor of indirect effects              Prior  DIC        D(θ̄,z̄)     D̄(θ,z)     pD
M1     η = 0                                      –      201793.11  174790.10  188291.61  13501.51
M2a    η = Sex+Inc+Age                            –      201635.15  175543.42  188589.28  13045.87
M2b    η = Sex+Inc+Age+Sex∗Inc+Sex∗Age+Inc+Age    –      201619.25  175552.07  188585.66  13033.59
M3a    η = f(Age)                                 RW1    201810.75  174838.21  188324.48  13486.27
M3a    η = f(Age)                                 RW2    201782.62  174821.55  188302.09  13480.53
M3a    η = f(Age)                                 P3     201800.93  174830.61  188315.77  13485.16
M3b    η = Sex+Inc+f(Age)                         P3     201649.44  175615.58  188632.51  13016.93
M4a    η = fspatial(Reg)                          –      201824.23  174919.97  188372.10  13452.13
M4b    η = Sex+Inc+f(Age)+fspatial(Reg)           P3     201632.41  175650.60  188641.50  12990.90
M5a    η = Sex∗f(Age)                             P3     201697.75  174883.72  188290.73  13407.02
M5b    η = Inc∗f(Age)                             P3     201709.42  175560.65  188635.03  13074.39
M5c    η = Inc+Sex∗Inc+Sex∗f(Age)+fspatial(Reg)   P3     201633.15  175658.22  188645.68  12987.46
Table 7.16: Results of DIC1 of estimated models based on PD1 dataset.

Model  Predictor of indirect effects              Prior  DIC        D(θ̄)       D̄(θ)       pD
M1     η = 0                                      –      248301.24  248222.39  248261.82  39.43
M2a    η = Sex+Inc+Age                            –      240672.42  240341.57  240506.99  165.43
M2b    η = Sex+Inc+Age+Sex∗Inc+Sex∗Age+Inc+Age    –      241429.44  240176.93  240803.19  626.25
M3a    η = f(Age)                                 RW1    248030.16  247763.60  247896.88  133.28
M3a    η = f(Age)                                 RW2    247945.64  247734.09  247839.87  105.77
M3a    η = f(Age)                                 P3     247946.91  247785.82  247866.36  80.55
M3b    η = Sex+Inc+f(Age)                         P3     240146.98  239891.44  240019.21  127.77
M4a    η = fspatial(Reg)                          –      247328.65  246488.62  246908.64  420.02
M4b    η = Sex+Inc+f(Age)+fspatial(Reg)           P3     239860.19  239146.67  239503.43  356.76
M5a    η = Sex∗f(Age)                             P3     246441.99  246243.52  246342.76  99.24
M5b    η = Inc∗f(Age)                             P3     240844.42  240561.51  240702.97  141.46
M5c    η = Inc+Sex∗Inc+Sex∗f(Age)+fspatial(Reg)   P3     239731.68  238997.05  239364.37  367.31
Table 7.17: Results of DIC2 of estimated models based on PD1 dataset.
• The DIC1 of a model including covariates can be lower, and hence indicate a better fitting model, than that of a model with fewer covariates because the reduction in the number of effective parameters is larger than the increase in deviance.
• The range of estimated DIC1 values is rather narrow: the lowest value is 201,619.25 and the highest one is 201,824.23, a difference of only about 205, or roughly 0.1%. The reason for this lies in the fact that the factor scores z are estimated for all models in such a way that they explain the actual response values in the best way possible, regardless of the covariates used in the predictor of the structural equation.
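The trade-off between deviance and effective parameters can be checked directly from Table 7.16. The following worked comparison of M2b with the basic model M1 assumes the usual decomposition DIC = D̄ + pD (an assumption here, although it reproduces the tabulated values up to rounding); Δ denotes the M2b value minus the M1 value.

```latex
\begin{align*}
\Delta \bar{D}(\theta, z) &= 188585.66 - 188291.61 = +294.05, \\
\Delta p_D                &= 13033.59 - 13501.51 = -467.92, \\
\Delta \mathrm{DIC}_1     &= \Delta \bar{D}(\theta, z) + \Delta p_D
                           = 294.05 - 467.92 = -173.87 .
\end{align*}
```

The result matches the difference of the tabulated DIC1 values, 201619.25 − 201793.11 = −173.86, up to rounding: the mean deviance rises by about 294, but pD falls by about 468, so the DIC1 of M2b ends up lower than that of M1.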
Looking at the values of the DIC2 in Table 7.17, the properties of the DIC2 are rather
different:
• The values of the DIC2 are clearly higher than those of the DIC1, i.e. the likelihood is much smaller. This behaviour can be expected since the DIC2 considers only the expected values of the latent scores η instead of the estimated latent scores z.
• The basic model M1 without any covariates has a much lower number of effective parameters for the DIC2 than for the DIC1; it seems the number of latent scores z does not have an influence on pD. When covariates are added to the model, the deviance values D(θ̄) and D̄(θ) decrease and the number of effective parameters pD increases; this behaviour corresponds with the DIC properties in standard regression settings.

Model  Predictor of indirect effects              Prior  Rank of DIC1  Rank of DIC2
M1     η = 0                                      –      9             12
M2a    η = Sex+Inc+Age                            –      4             4
M2b    η = Sex+Inc+Age+Sex∗Inc+Sex∗Age+Inc+Age    –      1             6
M3a    η = f(Age)                                 RW1    11            11
M3a    η = f(Age)                                 RW2    8             9
M3a    η = f(Age)                                 P3     10            10
M3b    η = Sex+Inc+f(Age)                         P3     5             3
M4a    η = fspatial(Reg)                          –      12            8
M4b    η = Sex+Inc+f(Age)+fspatial(Reg)           P3     2             2
M5a    η = Sex∗f(Age)                             P3     6             7
M5b    η = Inc∗f(Age)                             P3     7             5
M5c    η = Inc+Sex∗Inc+Sex∗f(Age)+fspatial(Reg)   P3     3             1
Table 7.18: Comparison of the order of the best fitting models recommended by DIC1 and DIC2.
• How the effective number of parameters arises is still not fully understood. As covariates are added to the model, pD typically increases by a certain amount; however, the size of the increase cannot be wholly explained by the number of added parameters. Again, the effective number of parameters seems to be related to the amount of latent variable variance which can be attributed to the indirect covariates.
• The basic model without covariates almost always has the highest DIC2 value; this is to be expected because there are no covariates present which might explain the fluctuations of the expected values of the latent scores.
• The range of DIC2 values is wider than in the case of DIC1, with a spread from 239,731.68 to 248,301.24.
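To make the two spreads concrete, and to order the models by each criterion, the DIC values from Tables 7.16 and 7.17 can be ranked mechanically. The sketch below is only an illustration: the dictionary keys and the helper names `ranks` and `spread` are hypothetical, and the numbers are copied from the two tables.

```python
# DIC values copied from Table 7.16 (DIC1) and Table 7.17 (DIC2).
# Keys combine model label and prior where a model appears more than once.
dic1 = {
    "M1": 201793.11, "M2a": 201635.15, "M2b": 201619.25,
    "M3a (RW1)": 201810.75, "M3a (RW2)": 201782.62, "M3a (P3)": 201800.93,
    "M3b": 201649.44, "M4a": 201824.23, "M4b": 201632.41,
    "M5a": 201697.75, "M5b": 201709.42, "M5c": 201633.15,
}
dic2 = {
    "M1": 248301.24, "M2a": 240672.42, "M2b": 241429.44,
    "M3a (RW1)": 248030.16, "M3a (RW2)": 247945.64, "M3a (P3)": 247946.91,
    "M3b": 240146.98, "M4a": 247328.65, "M4b": 239860.19,
    "M5a": 246441.99, "M5b": 240844.42, "M5c": 239731.68,
}


def ranks(values):
    """Map each model to its rank, where rank 1 is the lowest DIC."""
    ordered = sorted(values, key=values.get)
    return {model: pos for pos, model in enumerate(ordered, start=1)}


def spread(values):
    """Difference between the highest and the lowest DIC value."""
    return max(values.values()) - min(values.values())


rank1, rank2 = ranks(dic1), ranks(dic2)
for model in dic1:
    print(f"{model:10s} DIC1 rank {rank1[model]:2d}   DIC2 rank {rank2[model]:2d}")

print("DIC1 spread:", round(spread(dic1), 2))
print("DIC2 spread:", round(spread(dic2), 2))
```

Under these inputs the DIC1 values span roughly 205 units while the DIC2 values span roughly 8,570 units, and the resulting ranks coincide with those listed in Table 7.18.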
In order to compare the results of the two DIC versions, the ranks of the models, ordered from low to high DIC values, are shown in Table 7.18. The two DIC versions clearly favour different models; e.g. DIC1 considers model M2b to be the best model, whereas DIC2 ranks it only sixth. For the DIC1 some results seem implausible: for example, the DIC1 considers model M4a with a sole spatial effect to be the worst fitting model, although the spatial effect explains the differences in latent scores rather convincingly (see Section 7.3.4). Although the end results of the