Fixed parameters in MNL - Stated Preference methods

Chapter 4. Literature review: consumer choices in neoclassical and behavioural economics

4.4.4. Fixed parameters in MNL

The second important restriction of MNL models is the average or fixed nature of the parameters. For MNL models that consider only the attributes of the choice alternatives, each parameter estimates the average impact of an attribute on choice probabilities. However, this average may mask important differences amongst respondents.

Three approaches to dealing with taste heterogeneity have been developed (Adamowicz, Louviere et al., 1998): a priori definition of segments, based on prior knowledge; latent class models; and the random parameters logit (RPL). These are discussed in turn.

The definition of segments has already been introduced with the MNL term that accounted for interactions between choice attributes and respondents’ characteristics. One way of creating segments is to collect information on respondents’ characteristics in the survey. This information can then be used to divide respondents into groups that the researcher, given prior literature, would expect to choose differently. For example, Burton et al. (2001) used respondents’ self-reported frequency of purchasing organically grown food to create three segments. These segments had different willingness to pay for GM food, as expected.

Defining a priori segments is difficult. On the one hand, Stigler & Becker (1977) maintain ‘that tastes neither change capriciously nor differ importantly between people.’ They suggest that changes in price and income are the only important drivers of differences in consumption, which suggests that the only respondent characteristic of interest is income. In empirical work, income has been shown to be an important factor in choosing food when food products are considered at a very disaggregated level (Jones, 1997). On the other hand, cultural worldviews (Langford, Georgiou, Bateman, Day, & Turner, 2000) and taste heterogeneity unrelated to demographics (Scarpa & Thiene, 2004) have also been important variables in choice analysis. In relation to food, research has shown clear links between personality traits and food purchases; however, these correlations explain only a portion of purchase behaviour (Bareham, 1995). Consumer segments with regard to genetically modified food are also problematic. As the review of consumer research showed, there is evidence of large differences of opinion regarding GMF. Identifying the members of different segments, however, is another matter.

One technique for identifying segments is to run a cluster analysis of the data to identify similar respondents and then estimate a separate MNL for each cluster (Adamowicz & Boxall, 2001; Richards, 2000). With this approach, the partworths generated from each MNL can be compared to determine similarities and differences across clusters.

Latent class models allow group membership to arise from the choice data themselves, rather than imposing membership exogenously (Scarpa & Thiene, 2004). In these models, membership in one or another class is defined probabilistically, with choice probabilities conditioned on class membership (Adamowicz & Boxall, 2001; Swait, 1994). The unconditional probability of choosing an alternative is thus the sum, over classes, of the probability of class membership multiplied by the probability of choice within that class. While actual membership in a class is probabilistic and results from the choice data, the number of classes in the analysis is exogenously determined. For example, Scarpa & Thiene (2004) used statistical comparisons of different latent class models to determine the appropriate number of classes. The preferred model had a weaker statistical fit but had parameters that were more explanatory and interpretable.
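The latent class structure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the names `logit_probs` and `latent_class_probs`, the two-attribute alternatives, and all numerical values are hypothetical and are not taken from any of the cited studies. In an estimated model the class shares would normally be parameters rather than fixed inputs.

```python
import math

def logit_probs(beta, alternatives):
    """Standard MNL probabilities for one choice set, given one taste vector."""
    utilities = [sum(b * x for b, x in zip(beta, alt)) for alt in alternatives]
    max_u = max(utilities)  # subtract the maximum for numerical stability
    exp_u = [math.exp(u - max_u) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]

def latent_class_probs(class_shares, class_betas, alternatives):
    """Unconditional probabilities: sum over classes of share * conditional probability."""
    probs = [0.0] * len(alternatives)
    for share, beta in zip(class_shares, class_betas):
        conditional = logit_probs(beta, alternatives)
        for i, p in enumerate(conditional):
            probs[i] += share * p
    return probs

# Hypothetical example: each alternative is described by (price, GM indicator).
alternatives = [(1.0, 0.0), (0.8, 1.0)]
class_shares = [0.6, 0.4]                   # assumed class membership probabilities
class_betas = [(-1.0, -2.0), (-3.0, 0.0)]   # assumed class-specific taste vectors
unconditional = latent_class_probs(class_shares, class_betas, alternatives)
```

With a single class whose share is one, the function returns the ordinary MNL probabilities, which is one way to see the MNL as a special case.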

The random parameters logit (RPL) is a flexible model specification that relaxes the MNL assumptions regarding taste homogeneity and IIA (Bhat, 2003; Revelt & Train, 1998; Rigby & Burton, 2003), and it is becoming the preferred model for estimating discrete choice data (McFadden, 2001a; Walker et al., 2003). The model goes by several names: random-coefficients logit, random parameters logit, error-components logit, mixed logit, mixed MNL, and logit kernel (McFadden & Train, 2000; Revelt & Train, 1998; Walker et al., 2003).

The RPL assumes that each parameter in the deterministic portion of utility that is treated as random is drawn from a distribution across the population of respondents (Rigby & Burton, 2003). This distribution can be described by a mean and variance, which are estimated for each choice attribute. The strength of this approach is that nearly any preference structure can in theory be estimated with the proper choice of distribution (Scarpa, Willis, & Acutt, n.d.), but it also raises the question of which distribution to choose (Rigby & Burton, 2004). Many applications of RPL modelling assume that parameters are normally distributed (e.g., Bonnet & Simioni, 2001; Onyango et al., 2004; Rigby & Burton, 2003), but more complex estimations examine the impact of other distributional assumptions (e.g., Rigby & Burton, 2004). In theory, any distributional assumption, including discrete or discontinuous distributions, is possible (Bhat, 2003).

RPL is similar to MNL in that the observed choices are conditional on choice attributes and the personal characteristics of the respondents. The insight of RPL is that choice probability is conditional on the values that respondents attach to the choice attributes, the estimated β’s, and that these may take different values for different respondents (McFadden & Train, 2000; Revelt & Train, 1998; Train, 2003). Where it is possible to assume a constant value for these parameters, the unconditional probability can be modelled as an MNL. Where the parameters are random in the population, RPL allows these parameters to be defined by distributions, so that the unconditional probability is given by the integral over the parameters’ entire distributions (Train, 2003):

$$
\Pr(i) = \int L_i(\beta)\, f(\beta)\, d\beta .
$$

In this equation, L_i(β) is the standard logit function, and f(β) is a density function on the coefficient vector.
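For reference, the standard logit function inside the integral can be written out as follows. The notation is an assumption made here for illustration, since the section does not define it: x_j denotes the attribute vector of alternative j, and J is the number of alternatives in the choice set.

```latex
L_i(\beta) = \frac{\exp(\beta' x_i)}{\sum_{j=1}^{J} \exp(\beta' x_j)}
```

If f(β) places all of its mass on a single value b, the integral reduces to L_i(b), which is the MNL case with fixed parameters.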

The density function, also called a mixing distribution (McFadden & Train, 2000), is described by the parameters θ, which are typically the mean and variance of β. The RPL estimates these θ, given the observed choices, attributes, and personal characteristics. The RPL cannot be estimated analytically because the integrals do not have a closed-form specification; it is therefore estimated by simulation (Revelt & Train, 1999; Train, 2003). Train (2003) provides an explanation of the simulation procedure. The researcher chooses values for θ, draws a value of β at random from the distribution that θ describes, and calculates the logit probability for that draw. This is repeated many times, and the results are averaged to find the simulated choice probabilities given the values of θ. The researcher then searches for the values of θ that maximise the simulated log-likelihood.
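The simulation steps just described can be sketched as below. This is a minimal illustration, not the procedure of any cited study: it assumes independent normal distributions for the coefficients, treats each choice as a separate cross-sectional observation, and uses hypothetical names (`simulated_probs`, `simulated_log_likelihood`) and made-up inputs. The search over θ is left out; in practice the log-likelihood function would be handed to a numerical optimiser.

```python
import math
import random

def logit_probs(beta, alternatives):
    """Standard logit probabilities for one choice set, given one draw of beta."""
    utilities = [sum(b * x for b, x in zip(beta, alt)) for alt in alternatives]
    max_u = max(utilities)  # subtract the maximum for numerical stability
    exp_u = [math.exp(u - max_u) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]

def simulated_probs(mean, sd, alternatives, n_draws=1000, seed=0):
    """Average the logit probabilities over random draws of beta ~ N(mean, sd^2)."""
    rng = random.Random(seed)  # fixed seed: the same draws are reused for every theta
    totals = [0.0] * len(alternatives)
    for _ in range(n_draws):
        beta = [rng.gauss(m, s) for m, s in zip(mean, sd)]
        for i, p in enumerate(logit_probs(beta, alternatives)):
            totals[i] += p
    return [t / n_draws for t in totals]

def simulated_log_likelihood(mean, sd, choice_sets, chosen, n_draws=1000, seed=0):
    """Sum of log simulated probabilities of the alternatives actually chosen."""
    ll = 0.0
    for alternatives, i in zip(choice_sets, chosen):
        ll += math.log(simulated_probs(mean, sd, alternatives, n_draws, seed)[i])
    return ll

# Hypothetical data: two observations of the same two-alternative choice set,
# with attributes (price, GM indicator); theta = (mean, sd) is an assumed guess.
alts = [(1.0, 0.0), (0.8, 1.0)]
ll = simulated_log_likelihood([-1.0, -0.5], [0.5, 1.0], [alts, alts], [0, 1])
```

Setting every standard deviation to zero makes each draw equal to the mean, so the simulated probabilities collapse to the ordinary MNL probabilities.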

Despite its flexibility, the use of RPL raises some issues. The first is the choice of distribution. The distribution of the parameters must be specified exogenously. Using a normal distribution or any other distribution with unbounded support can lead to extreme values for some parameters, albeit with small probabilities (Rigby & Burton, 2003). Some distributions can take both positive and negative values (Rigby & Burton, 2003), which can lead to parameter values that do not conform to prior economic theory, for example a positive coefficient on price. Because of these concerns, distributions can be truncated or censored (Rigby & Burton, 2004), or can have finite support (Bhat, 2003).
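To illustrate the sign problem with an unbounded distribution, the following sketch computes the share of a normally distributed coefficient that falls on the theoretically wrong side of zero. The function name and the example values are hypothetical; the price coefficient is used only as an assumed example of a parameter expected to be negative.

```python
from statistics import NormalDist

def share_wrong_sign(mean, sd):
    """Share of a normally distributed coefficient that is positive
    when theory expects it to be negative."""
    return 1.0 - NormalDist(mu=mean, sigma=sd).cdf(0.0)

# Assumed example: price coefficient with mean -1 and standard deviation 1.
share = share_wrong_sign(mean=-1.0, sd=1.0)
```

With these assumed values, roughly 16 percent of the implied population would have a positive price coefficient, which is the kind of outcome that motivates truncated, censored, or bounded distributions.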

A second issue, paradoxically, is the fixing of parameters. In practice, distributions are estimated for only some parameters, while other parameters are fixed. Of course, a fixed parameter could be viewed as a special case of distributional choice (a point mass at an average value), but that sort of distribution deserves special mention. For example, Revelt & Train (1999) assume a fixed coefficient for price. They make this assumption to improve the stability of the estimation, to make the calculation of willingness to pay easier, and to avoid problematic assumptions about the distribution of the price coefficient. However, a fixed coefficient for the price attribute assumes a constant marginal utility of money. This assumption may be problematic for GMF, as the WTP for GM food of different types of consumers may be related both to different responses to GM and to different marginal utilities of money (Burton et al., 2001). As a result, if an RPL is necessary for taking into account respondent heterogeneity, then it may be important to estimate distributions for all attributes. On the other hand, the possibility of fixing some parameters raises the question of whether to make the simplifying assumption that all parameters are fixed and that, consequently, an MNL is appropriate.
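A short derivation shows why a fixed price coefficient makes the calculation of willingness to pay easier. It assumes the usual definition of WTP as the negative ratio of an attribute coefficient to the price coefficient; the symbols β_k, β_p, μ_k, and σ_k are introduced here for illustration and do not appear in the cited sources as written.

```latex
\mathrm{WTP}_k = -\frac{\beta_k}{\beta_p}, \qquad
\beta_k \sim N(\mu_k, \sigma_k^2),\ \beta_p \text{ fixed}
\;\Longrightarrow\;
\mathrm{WTP}_k \sim N\!\left(-\frac{\mu_k}{\beta_p},\ \frac{\sigma_k^2}{\beta_p^2}\right)
```

Under these assumptions WTP simply inherits a rescaled version of the attribute coefficient's distribution. If β_p were also random, WTP would be a ratio of two random variables, whose distribution is generally harder to characterise.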

A third issue that parallels the other techniques for segmenting respondents is the conditioning of choices. If an RPL is conditioning the distribution of the parameters on some characteristic of the respondents, the question raised is how to condition the choices. In this respect, RPL is no different from MNL. Choice could be conditioned on membership in a cluster (Revelt & Train, 1999), on attitudes (Rigby & Burton, 2003), on demographics, or possibly on something else. Using RPL does not resolve this deeper issue.

A further issue is the open-form specification of RPL. RPL models are solved through simulation techniques, which are less accurate than closed-form GEV models that can be estimated analytically (Bhat, 2003).

RPL is one more data analysis tool, but it requires the researcher to exercise judgement, which complicates the work of assessing each model. For example, choosing a normal distribution for a taste parameter regarding, say, improving apple flavour, and a fixed parameter for price will mean two things: first, that the estimation starts from the assumption that most people have fairly similar preferences for improving the flavour of apples (about two-thirds lie within one standard deviation of the mean); and second, that the WTP of each individual is a function only of their preference for the improvement and is unrelated to income, wealth, or money preferences. To some extent, these assumptions can be tested, but this then raises questions of which assumptions to test and how to test them. Thus, while RPL does not suffer from the same restrictive assumptions as MNL, each individual estimation relies on its own set of potentially problematic assumptions.

These two weaknesses of MNL models, the IIA property and the fixed parameters, have led to research on alternative specifications for RUM-based models. This research has developed a number of alternative approaches to modelling discrete choices, as discussed above.

In document Demand for genetically modified food : theory and empirical findings (Page 138-143)