3.2 Measuring Interdependence
3.2.2 Utility Theory
The theory of utility states that consumers purchase certain (bundles of) products because they derive utility from them. However, a distinction can be made according to how the utility is modelled: deterministic or stochastic.
3.2.2.1 Deterministic utility theory
The classical theory of utility [238] states that a consumer with a limited budget allocates expenditure between different commodities so as to maximise the utility or satisfaction from consumption. Lancaster, however, criticised this theory and developed the microeconomic theory of the household [168, 169], which states that goods are purchased because they represent combinations of certain characteristics that are desired by consumers. Thus, goods themselves are not the immediate objects of preference or utility, as they are in the classical theory of utility; rather, they have associated with them characteristics (such as calories, proteins, vitamins, …) that are directly relevant to the consumer.
Therefore, the consumer’s demand for goods is derived from their demand for characteristics.
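As a compact illustration, Lancaster's argument is often written as a constrained maximisation over characteristics rather than goods. The symbols below are introduced here for illustration only and do not appear in the text: x is the vector of quantities of goods, z the vector of characteristics, B a matrix giving the amount of each characteristic contained in one unit of each good, p the price vector and k the budget.

```latex
\max_{x \ge 0} \; U(z)
\quad \text{subject to} \quad
z = B x, \qquad p^{\top} x \le k
```

Utility is defined on z, so the demand for the goods x arises only through the characteristics they deliver.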
In this context, the substitutability of one product for another increases as the (perceived) attributes of a set of products become increasingly similar [286]. Product complements, on the other hand, are products that are used in conjunction with one another to satisfy some particular need, i.e. together they provide the set of characteristics that the consumer needs in order to obtain utility.
3.2.2.2 Stochastic utility theory
Whereas the classical theory of utility views consumer choice as deterministic, recent developments in marketing research treat consumer choice decisions as stochastic. The latter are therefore referred to as stochastic (or random) utility models instead of deterministic models. Under deterministic choice models, such as those from Lancaster [169] and Luce [179], the consumer is assumed always to assign the same utility to the same choice alternative. The stochastic choice model, however, assumes that the individual draws at random a member of a set of utility functions for each choice occasion. Consequently, utility levels for different alternatives are distributed around mean levels of utility, which depend on the alternatives' attributes (which mostly consist of a constant plus marketing mix effects) [84]. Because of its flexibility and strong theoretical foundations, the random utility framework has been used as the basic underlying framework for the study of brand choice by consumers, in the form of the brand choice models that have appeared since the early 1980s [e.g. 117, 132, 143, 183]. It would take us too far to discuss every development in the field of brand choice models. Therefore, in the next paragraphs, only the most recent developments will be highlighted.
The classical brand choice model
This random utility theory assumes that the consumer is a rational decision-maker who aims to maximize the utility from purchasing a (bundle of) product(s). This means that, from a set of alternative products, the consumer will pick the product that yields the highest utility.
Usually, this utility is determined by the sensitivity of the household towards a number of product-specific features, such as price, promotion and display. To illustrate this, consider an individual i facing a choice set of J different (substitute) brands within a certain product category. Then, at shopping occasion t, the utility (U) that this individual derives from buying brand j can be expressed as:
$$U_{ijt} = V_{ijt} + \varepsilon_{ijt} \qquad (3.2)$$
i = 1, ..., H (number of households); j = 1, ..., J (number of alternative brands); t = 1, ..., T (shopping visit)
The utility for household i from brand j at shopping visit t is thus composed of a deterministic component ($V_{ijt}$) and an error term $\varepsilon_{ijt}$ representing, for example, the value of a sub-utility of unobservable attributes and socioeconomic characteristics. Mostly, $\varepsilon_{ijt}$ is assumed to be independently and identically distributed (IID) over alternatives and consumers. The deterministic component, in turn, consists of two parts:
$$V_{ijt} = \alpha_{ij} + X_{ijt}\,\beta_i \qquad (3.3)$$
Firstly, there is the intrinsic preference ($\alpha_{ij}$) of individual i towards brand j.
After all, it is believed that the consumer possesses an intrinsic preference towards a brand that can be represented by a constant term (i.e. brand-specific intercepts). Secondly, in addition to the intrinsic preference, the deterministic component is influenced by the sensitivity of the consumer with regard to the different marketing-mix variables, such as the price, promotion and display of the brand. This sensitivity is reflected by the c x 1 vector $\beta_i$, which differs across consumers but is mostly assumed to be constant over time and across brand alternatives. $X_{ijt}$ is a 1 x c vector of explanatory variables that includes the price, promotion and display of brand j at purchase occasion t. In a hierarchical framework, these sensitivities will in turn depend on the socio-demographic and/or lifestyle characteristics of the household/consumer (e.g. see [183]).
It is important to note that the utilities ($U_{ijt}$) cannot be observed directly.
They are therefore referred to as latent utilities, but they can be inferred mathematically from the choices made by the panellist. The link between the observed behaviour ($I_{it}$) and the latent utility of any product k can be represented as follows:
$$I_{it} = j \quad \text{where} \quad U_{ijt} = \max_{k} U_{ikt} \qquad (3.4)$$
In other words, the brand j chosen by panellist i on choice occasion t is the one that represents the highest utility among all J brands in the category being studied. The purchase process is therefore characterized by a discrete choice (i.e. the consumer either purchases a product or not) and the alternatives are indivisible. This differs from the classical economic view of utility maximization [168, 169, 284], in which the consumer buys proportions of different products, i.e. alternatives are divisible. Models of the latter kind are called continuous choice models, but since they are of minor importance in the literature, they will not be treated further in this overview.
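The choice rule described above can be illustrated with a minimal simulation sketch. It assumes standard normal IID error terms, and the brand intercepts, covariate values and coefficients are hypothetical numbers chosen only for illustration; the function names are likewise not taken from the text.

```python
import random

def deterministic_utility(alpha, x, beta):
    """V_ijt = alpha_ij + X_ijt . beta_i (equation 3.3)."""
    return alpha + sum(x_k * b_k for x_k, b_k in zip(x, beta))

def choose_brand(alphas, xs, beta, rng):
    """Return the index j of the brand with the highest latent utility (3.4)."""
    utilities = []
    for alpha, x in zip(alphas, xs):
        v = deterministic_utility(alpha, x, beta)
        eps = rng.gauss(0.0, 1.0)   # assumption: IID standard normal errors
        utilities.append(v + eps)   # U_ijt = V_ijt + eps_ijt (equation 3.2)
    return max(range(len(utilities)), key=utilities.__getitem__)

# Hypothetical household facing three brands;
# covariates are (price, promotion, display).
alphas = [0.5, 0.0, -0.2]
xs = [(2.0, 0, 0), (1.5, 1, 0), (1.8, 0, 1)]
beta = (-1.0, 0.8, 0.4)

rng = random.Random(42)
choices = [choose_brand(alphas, xs, beta, rng) for _ in range(1000)]
shares = [choices.count(j) / len(choices) for j in range(len(alphas))]
print(shares)  # simulated choice shares of the three brands
```

Only the chosen index is observable in practice; the utilities inside `choose_brand` play the role of the latent utilities that have to be inferred from the observed choices.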
Variants of the classical brand choice model
The classical brand choice model has been extended in a number of ways, as discussed in subsequent paragraphs.
Single versus multiple category choice context
The objective of single category choice models is to study how consumers choose between different competing brands within a single product category [117, 261], whereas the objective of the multi category choice model is to study the purchase behaviour of households in several product categories simultaneously [14, 183, 232, 242]. In the latter framework, it is assumed that a consumer chooses a product from a particular category in the context of a larger choice task [63]. Popular models to study purchases of one product within a single category include the multinomial logit and probit models. The simultaneous purchase of multiple products is often studied by multivariate logit and probit choice models. In this context, the contribution by Manchanda et al. [183] is worth noting, since they developed a model for multicategory purchase incidence decisions within a random utility framework that allows for the simultaneous, interdependent choice of items.
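For the single category case, the multinomial logit model follows from the random utility framework when the error terms are assumed to be IID type I extreme value (Gumbel) distributed, which gives closed-form choice probabilities. The sketch below illustrates this under that assumption; the utility values are hypothetical and the function name is not taken from the text.

```python
import math

def mnl_probabilities(v):
    """Multinomial logit choice probabilities for deterministic utilities v.

    P_j = exp(V_j) / sum_k exp(V_k)
    """
    v_max = max(v)  # subtract the maximum for numerical stability
    exp_v = [math.exp(v_j - v_max) for v_j in v]
    total = sum(exp_v)
    return [e / total for e in exp_v]

# Hypothetical deterministic utilities of three brands for one household
probs = mnl_probabilities([-1.5, -0.7, -1.6])
print([round(p, 3) for p in probs])
```

The brand with the highest deterministic utility receives the highest choice probability, but every brand retains a positive probability because of the random component.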
Dealing with unobserved heterogeneity
A second difference between brand choice models relates to how the model deals with unobserved heterogeneity. Unobserved heterogeneity across households has been widely recognized as a critical research issue in choice modelling [19, 91, 291]. It refers to the fact that the households being studied are assumed to be heterogeneous in nature, which implies that households react differently to the same marketing-mix variables and often have different base preferences for products. However, it is not known a priori how many customer segments there are or how large they are, i.e. they are hidden (unobserved) in the data. Unobserved heterogeneity can be dealt with in basically three ways, which relate to the level of aggregation of the choice model.
In aggregate level choice models [117], only one response function is estimated for the entire sample of households. In order to allow for unobserved heterogeneity, the parameters of the aggregate response function are defined as stochastic variables following some distribution across the population of study. Subsequently, choice predictions of individuals outside the sample are made by using the aggregate level parameters. Therefore, this approach does not fully capture the individual customer differences in the sample.
In group level choice models [291], one response function is estimated per customer segment. These customer segments may be defined a priori (i.e. user defined) or post hoc (either by a traditional clustering approach or by a simultaneous approach, i.e. latent class cluster models, where the segments and the response functions within each segment are calculated simultaneously).
Subsequently, choice predictions for individuals outside the sample are made conditional on their membership probabilities to one or more of the identified customer segments.
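A prediction that is conditional on segment membership can be sketched as a membership-weighted mixture of the segment-level choice probabilities. The example assumes, purely for illustration, that each segment's response function is a multinomial logit; the segment utilities, membership probabilities and function names are hypothetical.

```python
import math

def logit_probabilities(v):
    """Segment-level logit choice probabilities for utilities v."""
    v_max = max(v)  # subtract the maximum for numerical stability
    exp_v = [math.exp(v_j - v_max) for v_j in v]
    total = sum(exp_v)
    return [e / total for e in exp_v]

def predict_choice(membership, segment_utilities):
    """Mix segment-level choice probabilities by membership probabilities.

    P(j) = sum_s membership[s] * P_s(j)
    """
    n_brands = len(segment_utilities[0])
    mixed = [0.0] * n_brands
    for pi_s, v_s in zip(membership, segment_utilities):
        for j, p_sj in enumerate(logit_probabilities(v_s)):
            mixed[j] += pi_s * p_sj
    return mixed

# Two hypothetical segments with different utilities for three brands
segment_utilities = [
    [-3.0, -1.0, -2.0],  # segment 1
    [1.0, 0.2, 0.5],     # segment 2
]
membership = [0.7, 0.3]  # assumed membership probabilities of one individual

probs = predict_choice(membership, segment_utilities)
print([round(p, 3) for p in probs])
```

An individual who belongs to a single segment with certainty simply inherits that segment's choice probabilities.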
Finally, in individual choice models, one response function is estimated for each individual household in the sample and thus a set of response parameters is estimated for each individual household. As a result of the large number of parameters that have to be estimated, this approach requires a large number of observations per individual in order not to jeopardize degrees of freedom and parameter stability. Therefore, in practice, individual choice models are not used very frequently. Yet, from a theoretical point of view, individual response models allow for maximal flexibility in modelling individual consumer choice behaviour. Furthermore, they enable predictions of choice behaviour at the level of the individual.
Besides the level of aggregation of the response function, a further distinction can be made with regard to the type of heterogeneity (see [91]) and its impact on the definition and estimation of the utility equation.
Response heterogeneity: individuals have different intrinsic preferences for some products or categories. This type of heterogeneity is therefore typically reflected in the intercept term ($\alpha_{ij}$) of the utility function.
Structural heterogeneity: individuals may respond in a different way to the same attribute values because they assign a different (utility) value to different product attributes according to their individual needs. This type of heterogeneity is typically reflected in the $\beta_i$ parameters of the utility function.
The specification of both response heterogeneity and structural heterogeneity gives rise to different estimation procedures for the preference and response coefficients. One approach is called the fixed effects approach, wherein a set of parameters is estimated for each household separately [77, 230] and no particular probability distribution of heterogeneity must be specified. However, the fixed effects model involves estimating a large number of parameters and requires long purchase histories for each household. Therefore, in the case of insufficient observations per panellist, the fixed effects model has proven to produce biased and inconsistent estimates not only for the fixed term, but also for the effects of marketing mix variables [141].
A more tractable approach to estimating the model parameters is therefore to assume that these parameters vary across households according to some probability distribution. This is referred to in the literature as the random effects model [134, 141]. Different implementations of the random effects model have been proposed in the literature, according to whether (a) the parameters are assumed to follow a predefined probability distribution across the households [78, 114], (b) no specific parametric distribution is imposed and the distribution is instead estimated empirically from the underlying data [50], or (c) the coefficients for each household are made dependent on a further set of variables, such as socio-demographic variables [14]. Research has shown that the non-parametric (distribution-free) approach produces a better fit of the brand choice model. However, some authors argue that estimating the model is computationally difficult [50], although different papers reach different conclusions [292].
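The three variants of the random effects model described above can be written schematically as follows. The notation is introduced here for illustration and is not taken from the cited papers: $\bar{\beta}$ and $\Sigma$ are a population mean and covariance, $b_s$ and $\pi_s$ are support points and their probabilities, and $d_i$ is a vector of household variables with coefficient matrix $\Delta$. The normal distribution in the first line and the finite-support form in the second are common choices assumed for this sketch, not necessarily the ones used in [78, 114] or [50].

```latex
\begin{align*}
\text{parametric:} \quad & \beta_i \sim N(\bar{\beta}, \Sigma) \\
\text{distribution-free:} \quad & \Pr(\beta_i = b_s) = \pi_s, \qquad s = 1, \dots, S, \qquad \sum_{s=1}^{S} \pi_s = 1 \\
\text{dependent on household variables:} \quad & \beta_i = \Delta d_i + \eta_i, \qquad \eta_i \sim N(0, \Sigma)
\end{align*}
```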
Perceptual heterogeneity: individuals may differ in their perceptions, familiarity and/or recall of the underlying attributes utilized in their decision processes. This may be reflected via different values of the $X_{ijt}$ observations in the utility function.
Form heterogeneity: means that households may differ according to how their utility is constituted. For instance, the utility function was previously conceived as a linear function. However, it is possible that some customers use another form of utility function, e.g., non-linear. Moreover, some customers may value the product attributes in a compensatory way whereas others do not. This type of heterogeneity is typically reflected in the functional form of the utility function or of the choice model as a whole.
Distributional heterogeneity: individuals may possess higher or lower variance or shape parameters in the utility equation. This implies that the parameters of the error distribution of $\varepsilon_{ijt}$ may be different for different households. Furthermore, distributional heterogeneity may be reflected in the type of the error distribution being used (logistic, normal, …).
Time heterogeneity: consumers may differ in their reaction to their past purchase experiences and behaviours. For instance, some customers tend to be very loyal towards a product whereas others may possess more volatile utility functions whose structure changes rapidly over time. This type of heterogeneity may be reflected in almost any aspect of the utility function. For instance, it may affect the constant ($\alpha_{ij}$) of the utility function (see formula 3.3) due to habit formation or variety seeking behaviour (see next paragraph), but it may also change the importance that consumers attach to the different elements of the marketing mix (i.e. the beta-coefficients in formula 3.3).
Zero order versus higher order effects
Finally, most discrete choice models assume that the consumer wants to maximize the utility on each purchase occasion and therefore ignore that the utility of a brand on a particular purchase occasion may be affected by the choice(s) made by the consumer on previous shopping occasions, i.e. state dependence effects or purchase event feedback. State dependence refers to the idea that, for some customers, the probability of purchasing a particular brand increases (decreases) when the same brand has been chosen on previous purchase occasions, i.e. positive (negative) state dependence. Positive state dependence (or habit formation) may result from a routine behaviour [140] of the customer to buy the same brand repeatedly over time. In contrast, negative state dependence may result from a variety seeking behaviour [187] of the customer to try and purchase different brands over time.
State dependence effects can be dealt with in a number of ways. Mostly, a measure of brand loyalty is introduced, such as the most recent purchase [145] or an exponentially weighted sum of all past purchases [117]. Dynamic discrete choice models follow another approach, which in general requires the consumer to solve a dynamic optimisation problem [79]. Furthermore, researchers have investigated whether state dependence effects differ across product categories and whether there exists a relationship between marketing mix […] from unobserved heterogeneity [1]. Since this is not the focus of this dissertation, we will not elaborate on it, but the interested reader is referred to an excellent discussion of state dependence by Seetharaman, Ainslie and Chintagunta [242].
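The two loyalty measures mentioned above can be sketched with a single exponential smoothing recursion. This is a minimal illustration under stated assumptions: the recursive form, the smoothing constant, the initial value and the function name are chosen here for the example and are not claimed to be the exact specification used in [117] or [145].

```python
def loyalty_path(purchases, brand, smoothing=0.8, initial=0.0):
    """Exponentially smoothed loyalty of one household towards `brand`.

    loyalty[t] summarises the purchases made before occasion t:
    loyalty[t] = smoothing * loyalty[t-1]
                 + (1 - smoothing) * 1{brand was bought at occasion t-1}
    """
    loyalty = [initial]
    for bought in purchases[:-1]:
        indicator = 1.0 if bought == brand else 0.0
        loyalty.append(smoothing * loyalty[-1] + (1.0 - smoothing) * indicator)
    return loyalty

# Hypothetical purchase history of one household over four shopping visits
history = ["A", "A", "B", "A"]

print(loyalty_path(history, "A", smoothing=0.5))  # [0.0, 0.5, 0.75, 0.375]
print(loyalty_path(history, "A", smoothing=0.0))  # [0.0, 1.0, 1.0, 0.0]
```

With a smoothing constant of zero the measure collapses to an indicator of the most recent purchase, whereas larger values give more weight to the longer purchase history. The resulting loyalty value can then enter the deterministic component of the utility as an additional explanatory variable.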