We seek optimal assortments for a retail category over a specified future planning horizon. For concreteness, we assume that the goal is to maximize revenue, since this was the primary concern in our three applications, although it is straightforward to adapt our approach to maximizing other functions, such as unit sales or dollar gross margin. We define the problem parameters in Table 3.1.
The parameters $K$ and $L$ would be specified by the retailer. For expositional simplicity, $K$ does not vary by store, but it would be easy to modify our process to allow $K$ to vary by store, and in fact we do this in our computational work. $L$ would be chosen to lie between 1 and $m$ to trade off the greater revenue that comes with larger $L$ against the administrative simplicity that comes with smaller $L$. As will be seen, our solution approach makes it easy to solve this problem for all possible values of $K$ and $L$, thus providing the retailer with rich sensitivity analysis to guide their choice of these parameters.
Table 3.1 Definition of Problem Parameters

Notation                   Description
$N = \{1, 2, \dots, n\}$   Index set of all possible SKUs a retailer could carry in this category
$M = \{1, 2, \dots, m\}$   Index set of all stores
$K$                        Maximum number of SKUs per assortment
$L$                        Maximum number of different assortments
$D_i^s$                    Number of customers at store $s$ for whom SKU $i$ is their most preferred product
$p_i$                      Price of SKU $i \in N$

We define an assortment to be a set $S \subseteq N$ with $|S| \le K$ and let $a(s)$ denote the index of the assortment assigned to store $s$. A solution to the assortment optimization problem is completely defined by the portfolio of assortments $S_l$, $l = 1, 2, \dots, L$, and $a(s) \in \{1, 2, \dots, L\}$ for all $s \in M$.
Table 3.2 Description of Mathematical Notation

Notation        Description
$A$             Number of attributes
$a$             Attribute type index
$N_a$           Number of levels of attribute $a$, $a = 1, 2, \dots, A$
$f^s_{au}$      Fraction of customers at store $s$ who prefer level $u$ of attribute $a$, $u = 1, 2, \dots, N_a$, $a = 1, 2, \dots, A$
$\pi^s_{auv}$   Probability that a customer at store $s$ whose first choice on attribute $a$ is $u$ is willing to substitute to $v$, defined for all $a$, $u$, and $v$

Our demand model assumes that a consumer shopping this category in store $s$ has a most preferred SKU $i \in N$, but might be willing to substitute to other SKUs if $i \notin S_{a(s)}$. We view a SKU as a collection of attribute levels, use historical sales data to estimate the demand share of each attribute level, and finally estimate the demand share of any SKU as the product of the demand shares of its attribute levels. We introduce some notation to formalize this approach in Table 3.2. By definition, $\pi^s_{avv} = 1$. Moreover, $\pi^s_{auv}$ can be 0 if attribute level $v$ is not a feasible substitute for $u$.
For example, size is an attribute of a tire and a 14 inch tire is not a feasible substitute for a customer with a 15 inch wheel.
The fraction of customers at store $s$ who most prefer SKU $i$ with attribute levels $i_1, i_2, \dots, i_A$ is defined to be

$$
f_i^s = \prod_{a=1}^{A} f^s_{a i_a}.
$$

If a customer's most preferred SKU $i$ with attributes $i_1, i_2, \dots, i_A$ is not in the assortment, they are willing to substitute to SKU $j$ with attributes $j_1, j_2, \dots, j_A$ with probability

$$
\pi^s_{ij} = \prod_{a=1}^{A} \pi^s_{a i_a j_a}.
$$

If a customer with most preferred SKU $i$ finds $i \in S_{a(s)}$ when they shop the store, then we assume they buy it. Otherwise, they buy the best substitute for $i$ in $S = S_{a(s)}$, defined to be

$$
j(i, S) = \arg\max_{j \in S} \prod_{a=1}^{A} \pi^s_{a i_a j_a}.
$$
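As a minimal sketch of these definitions for a single store, the following Python computes a SKU's demand share and a pairwise substitution probability as products over attributes, and then picks the best substitute from an assortment. The attribute names, the numeric values, and the helper names `sku_share`, `sub_prob`, and `best_substitute` are illustrative assumptions and are not taken from our applications.

```python
from math import prod

# Hypothetical single-store data with two attributes; a SKU is a tuple of levels.
# shares[a][u] plays the role of f^s_{au}; subs[a][(u, v)] plays the role of pi^s_{auv}.
shares = [
    {"small": 0.6, "large": 0.4},   # attribute 0: size
    {"red": 0.5, "blue": 0.5},      # attribute 1: color
]
subs = [
    {("small", "large"): 0.0, ("large", "small"): 0.0},  # sizes are not substitutable
    {("red", "blue"): 0.3, ("blue", "red"): 0.8},
]

def sku_share(i):
    """f_i^s: product of the attribute-level shares of SKU i."""
    return prod(shares[a][level] for a, level in enumerate(i))

def sub_prob(i, j):
    """pi^s_{ij}: product over attributes of level-to-level substitution probabilities."""
    return prod(
        1.0 if u == v else subs[a].get((u, v), 0.0)  # pi_{avv} = 1 by definition
        for a, (u, v) in enumerate(zip(i, j))
    )

def best_substitute(i, assortment):
    """j(i, S): the SKU in the assortment with the highest substitution probability from i."""
    return max(assortment, key=lambda j: sub_prob(i, j))

preferred = ("small", "red")
assortment = [("small", "blue"), ("large", "red")]
print(sku_share(preferred), best_substitute(preferred, assortment))
```

In this toy data a customer who prefers a small red item will not accept a large one, so the best substitute is the small blue item even though the large red item matches on color.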
In using store sales data to estimate the parameters of our model, we first estimate $D^s$, the total unit demand in store $s$, as total unit sales divided by the share of demand captured, as defined by demand shares and substitution probabilities, and then set $D_i^s = f_i^s D^s$.
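One way to make this step concrete, assuming for illustration that the share of demand captured by the store's current assortment $S$ is the first-choice share of the SKUs carried plus the substituted share of the SKUs not carried (an interpretation of "share of demand captured", not a formula stated explicitly here):

$$
D^s = \frac{\text{total unit sales at store } s}{\sum_{i \in S} f_i^s + \sum_{i \notin S} f_i^s \, \pi^s_{i\,j(i,S)}}, \qquad D_i^s = f_i^s D^s .
$$

With hypothetical figures, a store that sold 900 units with an assortment capturing a share of 0.75 of demand would have $D^s = 900 / 0.75 = 1200$, and a SKU with $f_i^s = 0.05$ would have $D_i^s = 0.05 \times 1200 = 60$.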
The revenue earned by store $s$ using assortment $S_{a(s)}$ can then be written as

$$
R_s(S_{a(s)}) = \sum_{i \in S_{a(s)}} p_i D_i^s + \sum_{i \notin S_{a(s)}} D_i^s \, \pi^s_{i\,j(i, S_{a(s)})} \, p_{j(i, S_{a(s)})}.
$$

The first term in this expression is the revenue from customers whose most preferred SKU is contained in the assortment and the second term is the expected substitution revenue from customers whose most preferred SKU was not in the assortment.
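The two terms of the revenue expression can be sketched in Python as follows. The function name `store_revenue`, the SKU labels, and all numbers are hypothetical, and the substitution probabilities are supplied directly as a table rather than derived from attribute levels.

```python
def store_revenue(assortment, demand, price, sub_prob):
    """Revenue of one store for a given assortment.

    demand[i]      : D_i^s, customers whose most preferred SKU is i
    price[i]       : p_i
    sub_prob[i][j] : pi^s_{ij}, probability of substituting from i to j
    """
    total = 0.0
    for i, d in demand.items():
        if i in assortment:
            # First term: the most preferred SKU is carried, so it is bought.
            total += price[i] * d
        elif assortment:
            # Second term: expected revenue from the best available substitute.
            j = max(assortment, key=lambda k: sub_prob[i][k])
            total += d * sub_prob[i][j] * price[j]
    return total

# Hypothetical data for one store and three SKUs.
demand = {"A": 100, "B": 50, "C": 30}
price = {"A": 10.0, "B": 12.0, "C": 8.0}
sub_prob = {
    "A": {"A": 1.0, "B": 0.5, "C": 0.2},
    "B": {"A": 0.4, "B": 1.0, "C": 0.1},
    "C": {"A": 0.5, "B": 0.0, "C": 1.0},
}
print(store_revenue({"A"}, demand, price, sub_prob))
```

Carrying only SKU A in this toy example earns 1000 from first-choice customers, plus 200 and 150 in expected substitution revenue from customers who prefer B and C.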
The assortment optimization problem is to choose $S_l$ with $|S_l| \le K$, $l = 1, 2, \dots, L$, and $a(s)$ for all $s \in M$ to maximize $\sum_{s \in M} R_s(S_{a(s)})$.
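To illustrate the structure of the problem only, the sketch below solves a tiny instance by exhaustive enumeration. This is not our solution approach and is practical only for a handful of SKUs. It relies on one observation that follows from the objective being a sum over stores: for a fixed portfolio, each store can simply be assigned the assortment that gives it the most revenue. The function name `best_portfolio`, the toy data, and the toy revenue function (which ignores substitution for brevity) are assumptions made for this example.

```python
from itertools import combinations

def best_portfolio(skus, stores, K, L, revenue):
    """Exhaustive search over portfolios of at most L assortments of at most K SKUs.

    revenue(s, S) must return the revenue of store s under assortment S (a frozenset).
    Returns (total revenue, portfolio, assignment of stores to portfolio indices).
    """
    # All non-empty candidate assortments with at most K SKUs.
    candidates = [frozenset(c) for k in range(1, K + 1)
                  for c in combinations(skus, k)]
    best = (float("-inf"), None, None)
    for size in range(1, L + 1):
        for portfolio in combinations(candidates, size):
            # For a fixed portfolio, each store takes its best assortment.
            assign = {
                s: max(range(size), key=lambda l: revenue(s, portfolio[l]))
                for s in stores
            }
            value = sum(revenue(s, portfolio[assign[s]]) for s in stores)
            if value > best[0]:
                best = (value, portfolio, assign)
    return best

# Hypothetical two-store instance; substitution is ignored in this toy revenue.
demand = {"s1": {"A": 10, "B": 1, "C": 0}, "s2": {"A": 1, "B": 10, "C": 0}}
price = {"A": 2, "B": 3, "C": 1}

def toy_revenue(s, S):
    return sum(price[i] * demand[s][i] for i in S)

value_one, sets_one, assign_one = best_portfolio(
    ["A", "B", "C"], ["s1", "s2"], K=1, L=1, revenue=toy_revenue)
value_two, sets_two, assign_two = best_portfolio(
    ["A", "B", "C"], ["s1", "s2"], K=1, L=2, revenue=toy_revenue)
print(value_one, value_two)
```

In this instance a single shared assortment earns 33, while allowing two assortments lets each store carry its own best seller and earns 50, which is the kind of trade-off in $L$ that the sensitivity analysis is meant to expose.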
Our demand model most closely resembles the locational choice model, but with three important differences: (1) in the locational choice model, the probability of purchasing the closest substitute from the assortment is either 0 or 1 (depending on the no-purchase utility), while we allow general substitution probabilities, (2) consumers are assumed to be distributed in a continuous space in the locational choice model, while we allow customer locations to be restricted to discrete locations in the attribute space, and (3) all attribute levels in the locational choice model have a numeric value which allows the calculation of distance between products and identification of the nearest product to a given ideal point, whereas the nearest product to an ideal point in our model is identified via the substitution probabilities, which can be thought of as inducing a distance metric for attributes that can’t be located in a space. These three enhancements were needed to make the locational choice model operational in the applications we consider.
We allow substitution probabilities to vary by store because we found in our applications that they did in fact vary by store. For example, we will see in Section 3.4 that the willingness of a consumer to substitute to a higher priced product varies by store and is correlated with median income in the zip code in which the store is located.
Our demand model implies that a consumer's preferences for the various attributes are independent, which may not be true. For example, a college student shopping for twin size bed sheets might have a different color preference than a suburban homemaker shopping for queen size sheets, so the color and size attributes for sheets would interact.
Our defense of this assumption is three-fold: (1) All prior publications we are aware of that use attributes in demand estimation make a similar assumption. Fader and Hardie (1996) is typical of the approach followed in the literature. They assume that the utility for a SKU is a linear function of its attributes and then use this utility in a Multinomial Logit model to determine SKU demand shares. They state that “In both the marketing and economics literature, it is common to assume an additive utility function” and note that this implies no interaction between attributes. (2) In our applications, we check the accuracy of this approximation by comparing demand estimates with actual sales for the SKUs currently carried and find that forecasts based on this model are accurate compared to previously published research. (3) If there is significant interaction between attributes, we demonstrate ways to modify our demand model to take this into account. In the snack cakes application (Section 3.4.1), the attributes are flavor, package size (single serve or family size) and brand. Package size and brand interact since one brand is stronger in single serve and another in family size. We deal with this by combining brand and size into a new attribute, brand-size. In the tire application (Section 3.4.2), the attributes are size, brand (4 brands) and mileage warranty (low, medium, high). Brand and mileage warranty interact because a given brand does not offer all warranty levels, and so we combine brand and warranty level to create the attribute brand-warranty. In the tires example, there is also an interaction between size and brand-warranty. A tire with a given size attribute level fits a defined set of car models of a certain age and value. The six brand-warranty levels correspond to different price points and quality. There is a clear interaction between brand-warranty shares across price points and the age and value of the cars that a given tire size fits. We show how to deal with this by partitioning the sizes into a finite number of homogeneous segments (which are latent) and allowing the brand-warranty shares to be conditional on segment membership.