Step 4: Viewing Output
6.2 Application Types
6.2.2 LC (FM) regression models with random effects
A unique feature of Latent GOLD Choice is that it allows you to combine random effects with latent classes. More specifically, it is possible to specify LC Choice models in which the intercept and/or some of the regression coef- ficients vary within latent classes. Lenk and DeSarbo (2000) proposed using random effects in FMs of generalized linear models and B¨ockenholt (2001) proposed using random effects in LC models for ranking data.
It has been observed that the solution of a LC Choice analysis may be strongly affected by heterogeneity in the constants. In choice-based conjoint studies, for example, it is almost always the case that respondents differ with respect to their brand preferences, irrespective of the attributes of the
offered products. A LC Choice model captures this brand heterogeneity phenomenon via Classes with different constants. However, the analyst often likes to find relatively small number of latent classes that differ in more meaningful ways with respect to attribute effects on the choices. By including random alternative-specific constants (intercepts) in the LC Choice model, for example, ηm|x,zit,F1i =β con xm + P X p=1
βxpattzmitpatt +λconxm ·F1i,
it is much more likely that one will succeed in finding such meaningful Classes (segments). The random intercept term, which may have a different effect in each latent class, will filter out (most of) the “artificial” variation in the constants.
7
Multilevel LC Choice Model
7.1
Model Components and Estimation Issues
To explain the multilevel LC model implemented in Latent GOLD Choice, we need to introduce and some new terminology. Higher-level observations will be referred to as groups and lower-level observations as cases. The records of cases belonging to the same group are connected by the Group ID variable. It should be noted that higher-level observations can also be individuals, for example, in longitudinal applications. “Cases” would then be the multiple time points within individuals and replications the multiple choices of an individual at the various time points.
The index j is used to refer to a particular group and Ij to denote the number of cases in group j. Withyjit we denote the response at replication t of case i belonging to group j, with yji the full vector of responses of case i in group j, and with yj the responses of all cases in group j. Rather than expanding the notation with new symbols, group-level quantities will be referred to using a superscriptg: Group-level classes (GClasses), group-level continuous factors (GCFactors), and group-level covariates (GCovariates) are denoted by xg, Fgj, and zjg, and group-level parameters by γg, βg, and λg.
is P(yj|zj,z g j) = Kg X xg=1 Z Fgj f(Fgj)P(xg|zjg)P(yj|zj, xg,F g j) dF g j, (25) where P(yj|zj, xg,Fgj) = Ij Y i=1 P(yji|zji, xg,Fgj).
Assuming that the model of interest may also contain CFactors, for each case i, P(yji|zji, xg,Fgj) has a structure similar to the one described in equation (22); that is, P(yji|zji, xg,F g j) = K X x=1 Z Fji f(Fji)P(x|zji, xg,F g j)P(yji|x,zji,Fji, xg,F g j) dFji. where P(yji|x,zji,Fji, xg,F g j) = Ti Y t=1
P(yjih|x,zattjit,z pred
jit ,Fji, xg,F g j)
These four equations show that a multilevel LC Choice model is a model
• forP(yj|zj,zgj), which is the marginal density of all responses in group j given all exogenous variable information in group j,
• containing GClasses (xg) and/or (at most three mutually independent) GCFactors (Fgj),
• containing GCovariates zgj affecting the group classes xg,
• assuming that the Ij observations for the cases belonging to group j are independent of one another given the GClasses and GCFactors,
• allowing the GClasses and GCFactors to affect the case-level latent classes x and/or the responses yji.
GCFactors enter in exactly the same manner in the linear term of the conditional logit model as case-level CFactors. We refer to their coefficients as λcon,gmxd, λatt,gxpd , and λpred,gmxqd. GCFactors can also be used in the model for the Classes. We will denote a GCFactor effect on the latent classes as λ0xrd,g, 0≤1≤R, where the superscript 0 refers to the model for the latent classes.
GClasses enter in the conditional logit model for the choices as βxm,xcon,gg +
PP
p=1β att,g
xp,xg ·zmjitpatt +PQq=1βxmq,xpred,gg ·zjitqpred. Inclusion of GClasses in the model
for the Classes implies that the γ parameters become GClass dependent; that is ηx|zji,xg =γxg,x0 +
PR
r=1γxg,xr·zcovjir. Note that this is similar to a LC
Regression analysis, where xg now plays the role of x, and x the role of a nominal y variable.
The remaining linear predictor is the one appearing in the multinomial logistic regression model for the GClasses. It has the form ηxg|zg
i = γ g xg,0+ PRg r=1γ g
xg,r·zg,covjr . This linear predictor is similar to the one for the Classes (in
a standard LC model), showing that GCovariates may be allowed to affect GClasses in the same way that covariates may affect Classes.
Below we will describe the most relevant special cases of this very general latent variable model,20 most of which were described in Vermunt (2002b, 2003, 2004, and 2005) and Vermunt and Magidson (2005). We then provide some expressions for the exact forms of the various linear predictors in models with GClasses, GCFactors, and GCovariates.
Model restrictions One can use the parameter constraints “Class Inde- pendent”, “No Effect”, and “Merge Effects”, implying equalλ’s (β’s) among all Classes, zero λ’s (β’s) in selected Classes, and equal λ’s (β’s) in selected Classes.
ML (PM) estimation and technical settings Similar to what was dis- cussed in the context of CFactors, with GCFactors, the marginal density P(yj|zj) described in equation (25) is approximated using Gauss-Hermite quadrature. With three GCFactors and B quadrature nodes per dimension, the approximate density equals
P(yj|zj,zgj)≈ Kg X xg=1 B X b1=1 B X b2=1 B X b3=1 P(xg|zgj)P(yj|zj, xg, Fbg1, F g b2, F g b3)P g b1P g b2P g b3.
ML (PM) estimates are found by a combination of the upward-downward variant of the EM algorithm developed by Vermunt (2003, 2004) and Newton-
20In fact, the multilevel LC model implemented in Latent GOLD Choice is so general
that many possibilities remain unexplored as of this date. It is up to Latent GOLD Choice Advanced users to further explore its possibilities.
Raphson with analytic first-order derivatives.21
The only new technical setting in multilevel LC Choice models is the same as in models with CFactors; that is, the number of quadrature nodes to be used in the numerical integration concerning the GCFactors. As explained earlier in the context of models with CFactors, the default value is 10, the minimum 2, and the maximum 50.