The Chosen Forecasting Methods - Forecasting the decline of superseded technologies : a compari

An important consideration in selecting a forecasting approach is that managers appear sceptical of predictions from methods they do not understand, regardless of how well a method performs (for example, see Taylor & Thomas, 1982). The importance managers place on understanding underpins much of good forecasting practice (Armstrong, 2001e, p. 369). Marketing science diffusion models are the most common approach in diffusion forecasting. For this reason, and because their basic forms are intrinsically easy to understand, they were chosen as part of this investigation. The diffusion models selected are discussed next. This is followed by an explanation of why analogous series forecasting should be tested. The use of expert judgment is then described. Finally, model parameter estimation and forecast error measures are discussed.

6.2.1. Criteria for the choice of diffusion models

Only foundational diffusion models (internal, external, and mixed influence) were considered (for examples see Mahajan & Peterson, 1985). The rationale for this is based on the lack of a general theory on technology decline, and the focus on models that are simple, less reliant on externally defined market determinants, and suitable for naïve data fitting methods. The choice of functional forms for this study was guided by two factors:

• the intention to set a yardstick of simple foundational models in decline studies, in a similar approach to that of Hardie et al. (1998), to support Studies Two and Three;

• the desire to understand the requirements of those models with respect to the often short data series (both in terms of completion of the decline process and in terms of the sparsity of data points within the estimation area) as observed (Christodoulos, Michalakelis, & Varoutas, 2010, 2011; Sultan et al., 1990).

The most commonly used models were screened against the following criteria:

• that they represent cumulative diffusion as a function of time;

• that they have no more than three (adjustable) parameters, to ensure they are suitable for the short data series available for this study, and can easily be translated into a form with a market ceiling set to a normalised level of unity;

• that they are representative of models used extensively in diffusion research;

• that they are not a reformulation of another model already included in the study.

6.2.2. Diffusion models selected for use in the three studies

Arguably the three most commonly used diffusion models are the Pearl logistic (Pearl & Reed, 1923), the Gompertz (Gompertz, 1825) and the Bass (Bass, 1969). They are parsimonious in parameter count and are potentially attractive for use in this study because they are simple to understand and are not tied too closely through their parameters to concepts related to growth, which might make them incorrectly specified for this new application. However, despite being the most tested and used models, their overall diffusion forecasting performance is reported as inconsistent (Meade & Islam, 2001; M. R. Young & Ord, 1989). Despite this inconsistent evidence of performance, simple models remain more appealing than complex models. In an investigation of early trial of various products by consumers, Hardie et al. (1998) found that, among eight relatively simple functional forms, the simpler models outperformed the more complex ones in predicting product trial, with both 13 and 26 weeks of estimation data. One of the poorer performing models was the Bass model, despite it being so extensively used that it is considered an empirically generalised model (Mahajan et al., 1995a). The Bass model has, however, a reputation for being unstable when applied to short data series (Heeler & Hustad, 1980; Mahajan et al., 1990).

Beyond the Bass, the Pearl logistic and the Gompertz models, a variety of other functional forms are also somewhat popular. However, many of them, as indicated in Chapter Five in the section on diffusion models, are effectively the same function in a different form. Other models do not easily provide a solution giving universal and logical parameter values, or do not describe an S-curve but assume exponential growth. Further, some are over-

the more popular secondary models notable examples include, in no particular order: Von Bertalanffy (Von Bertalanffy, 1957), Richards (Richards, 1959), the family of flexible logistic (FLOG) models (Bewley & Fiebig, 1988) and the log-logistic model (Tanner, 1978). The foundational and other extended models are extensively reviewed in Mahajan et al. (1995a) and Meade and Islam (2010b). The model formulations are presented in Table 2.

Table 2. Models Selected for Testing

The formulations of the models for this study have been selected because their saturation value L, the market size, need not be known or considered, given share data normalised to a peak share of 100 percent. Setting L as fixed is supported by the principle that a fixed saturation level model is difficult to beat by a model that adjusts for market growth over time (Meade & Islam, 2001).

| Name (typically called) | Function | Comments |
|---|---|---|
| Pearl logistic (often called the simple logistic), in the form used by Meade and Islam (2001); see Equation (9) | $Y_t = \dfrac{L}{1 + a e^{-bt}}$ | Where L and a are positive constants, with L representing the upper limit asymptote, a representing the number of times the Y(t=0) value needs to grow to reach L, and b representing the rate and direction of the growth, with a positive value indicating growth and a negative value decline. |
| Gompertz, in a three-parameter cumulative function form (Sood et al., 2012); see Equation (13) | $Y_t = L e^{-a e^{-bt}}$ | Where L is the upper asymptote, and a and b are positive, with a setting the displacement along the x axis and b setting the growth rate. |
| Bass, in cumulative density form (Meade & Islam, 2010a); see Equation (26) | $Y(t) = L \, \dfrac{1 - e^{-(p+q)t}}{1 + \frac{q}{p} e^{-(p+q)t}}$ | Where L is the upper asymptote, p is the coefficient of innovation and q is the coefficient of imitation. When q is greater than p, the point of inflection occurs at $t_{inflection} = \frac{1}{p+q} \ln\left(\frac{q}{p}\right)$, $N_{inflection} = \frac{L(q-p)}{2q}$. When p is greater than or equal to q, the point of inflection occurs at zero or a negative value of t. |
| log-logistic (Tanner, 1978); see Equation (28) | $Y(t) = \dfrac{L}{1 + a e^{-b \ln(t)}}$ | An asymmetric function formed by inserting the natural log (ln) of time in place of time (t) in the simple logistic. |
| A model representing a straight line | $Y_t = aX + bt$ | Where a = the rate of descent and b = the |
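The functional forms in Table 2 can be written down directly as code. The following is a minimal sketch, not the implementation used in the study: the function names are illustrative, the ceiling L defaults to 1 to reflect share data normalised to unity, and the Bass functions assume the usual convention that p is the coefficient of innovation and q the coefficient of imitation, with the standard inflection point at ln(q/p)/(p+q).

```python
import math

def pearl_logistic(t, a, b, L=1.0):
    """Pearl (simple) logistic: Y_t = L / (1 + a*exp(-b*t)).
    A negative b gives a declining curve."""
    return L / (1.0 + a * math.exp(-b * t))

def gompertz(t, a, b, L=1.0):
    """Gompertz: Y_t = L * exp(-a*exp(-b*t))."""
    return L * math.exp(-a * math.exp(-b * t))

def bass(t, p, q, L=1.0):
    """Bass cumulative form:
    Y(t) = L * (1 - e^{-(p+q)t}) / (1 + (q/p) * e^{-(p+q)t})."""
    decay = math.exp(-(p + q) * t)
    return L * (1.0 - decay) / (1.0 + (q / p) * decay)

def log_logistic(t, a, b, L=1.0):
    """Log-logistic: the simple logistic with ln(t) in place of t (needs t > 0)."""
    return L / (1.0 + a * math.exp(-b * math.log(t)))

def bass_inflection(p, q, L=1.0):
    """Return (t, Y) at the Bass inflection point; t is positive only when q > p."""
    t_star = math.log(q / p) / (p + q)
    return t_star, L * (q - p) / (2.0 * q)
```

For example, `bass_inflection(p=0.03, q=0.38)` returns the time and cumulative level of the inflection, and evaluating `bass` at that time reproduces the same level, which is a quick consistency check on the two formulas.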

There are many other potential approaches to forecasting beyond diffusion models. The two primary groups are the use of analogies and judgmental methods. In the next section, the rationale for deciding on an analogical method is described.

6.2.3. The rationale for the analogical approach chosen

In the current study, a simple method that overcomes the challenge of choosing the best analogy was sought. This problem is described in depth in Chapter Five in the section on analogies in forecasting. With the focus on simple methods, an analogous series approach was chosen (see M. J. Wright & Stern, 2015). Wright and Stern had used this method on similar nonlinear series to predict early trial sales of new products. Their method appeals for being both simple and tested. In the current study, a simple average of all series became the analogous series predictor.
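The simple-average predictor described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the procedure used in the study: the function name is hypothetical, the series are assumed to be already normalised and aligned so that index 0 marks the same stage of decline in each, and shorter series simply stop contributing once they run out of observations.

```python
def analogous_series_forecast(series_list):
    """Average several analogous series period by period.

    series_list: list of lists of share values, aligned on a common start
    (an assumption of this sketch). Returns one averaged series whose length
    equals that of the longest input series.
    """
    horizon = max(len(series) for series in series_list)
    forecast = []
    for period in range(horizon):
        # Use only the series that still have an observation in this period.
        values = [s[period] for s in series_list if period < len(s)]
        forecast.append(sum(values) / len(values))
    return forecast
```

Called on `[[1.0, 0.75, 0.5], [1.0, 0.25]]`, the function averages the first two periods across both series and carries the third period from the only series that reaches it.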

6.2.4. The rationale for the judgmental approach chosen

For the third approach the investigation turns to expert judgment, an approach which has an inconsistent history in the literature but which when implemented as structured judgmental forecasting (Green & Armstrong, 2007; Rowe & Wright, 2001), shows some promise. In the proposed approach, experts would be canvassed via a panel and asked to judge the future data series outcomes directly. The literature on these three methods was covered in Chapter Five.

6.2.5. Model Parameter Estimation Methods

Having chosen a model and with empirical data available, it is possible to estimate the parameters’ values that provide the best fit of the model to that empirical data. This procedure is parameter estimation, and results in a model that estimates the data to which it is fitted. There are several approaches to parameter estimation proposed in the marketing literature (Mahajan, Mason, & Srinivasan, 1985; Mahajan & Sharma, 1986). The most common methods and the context in which they are used are presented in Table 3.

The methods used in marketing studies tend to be dominated by model based estimation techniques, primarily fitting to historical data, and have generally been limited to Ordinary Least Squares Estimation (OLS) (M. R. Young & Ord, 1989), Maximum Likelihood Estimation (MLE) (Schmittlein & Mahajan, 1982), and Nonlinear Least Squares Estimation (NLLS) (Mahajan et al., 1985). Today these three methods are easy to apply given the availability of computer programmes that implement the complex iterative routines. The amount of data available and the functional form being assumed (and used) determine the procedure that should be used.

Table 3. Common Methods of Parameter Estimation and Their Application

The following section provides some guidance on the selection of a model based iterative technique for estimating model parameters.

Early diffusion research used Ordinary Least Squares (OLS), also known as Linear Least Squares, as the estimation technique (Fisher & Pry, 1971; Griliches, 1957; Mansfield, 1963). OLS relies on transforming the data to a linear form via a log-linear transform and applying a linear regression model to estimate the parameters. OLS minimises the squared differences between the transformed data and the prediction from a linear function that is fitted to estimate that data. This log-linear transformation method is still quite common despite the ease with which modern software can fit non-linear functions to S-curve growth, and is favoured by statisticians because it frequently results in a close to normal distribution of errors, and thus supports the use of statistical tests relying on normality. OLS has the following practical advantages:

• Transforming helps compare differently scaled time series;

• Any significant deviation from the transformed function is easier to see and to locate, by measuring the deviation of the target data from a linear logistic model;

• A simple linear regression can be applied to fit the transformed data;

• It tends to minimise the effects of stochastic variation in the original data.
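As an illustration of the log-linear transform, the simple logistic with a fixed ceiling L rearranges to ln(L/Y - 1) = ln(a) - bt, which is a straight line in t. The sketch below fits that line by ordinary least squares using only the standard library. The function name is hypothetical, L is assumed fixed at 1 for normalised share data, and every observation must lie strictly between 0 and L for the transform to be defined.

```python
import math

def fit_logistic_ols(times, shares, L=1.0):
    """Estimate (a, b) of Y = L / (1 + a*exp(-b*t)) by OLS on the
    transformed data z = ln(L/Y - 1) = ln(a) - b*t."""
    z = [math.log(L / y - 1.0) for y in shares]
    n = len(times)
    t_mean = sum(times) / n
    z_mean = sum(z) / n
    # Closed-form simple linear regression of z on t.
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxz = sum((t - t_mean) * (zi - z_mean) for t, zi in zip(times, z))
    slope = sxz / sxx
    intercept = z_mean - slope * t_mean
    # Map the line back to the model parameters: intercept = ln(a), slope = -b.
    return math.exp(intercept), -slope
```

A negative estimate of b indicates a declining series. Because the regression minimises errors in the transformed scale rather than in the original shares, the estimates generally differ from those of a direct nonlinear fit when the data are noisy.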

| Concept | Method | Used when / requires |
|---|---|---|
| Judgment or observation based | Informed guess | Little data exists, or as a starting estimate in models |
| Judgment or observation based | Expert judgment from direct observation of parameter values | Requires current observations and market expert availability |
| Combined judgment and model based estimation | Estimation by experts with judgmental forecasts (bootstrapping with time-series model forecasts) | As a reference point to test alternative situations, to evaluate published parameters, or to modify published parameters in the light of observations |
| Model based estimation | By formal estimation from historical data | Parameter values from the literature are already available, or direct estimation from data by fitting iteratively is possible |
| Model based estimation | By using an analogue series to parameterise the model | Using data from a similar situation; the validity of the analogue selected is critical |

If we can assume that the fit errors are distributed normally, then the OLS estimator is also the maximum likelihood estimator. Most statisticians would view OLS as only useful for fitting linear regression models, because parameter estimates are unstable when few data points exist, a standard error of the OLS model parameters cannot be derived, and there is a time interval bias in estimating parameters for some models (Putsis, 1996).
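The equivalence between least squares and maximum likelihood under normal errors can be seen from the Gaussian log-likelihood. As a sketch, assume the residuals between the observations $y_t$ and the fitted model $f(t;\theta)$ are independent and normally distributed with zero mean and variance $\sigma^2$ (these symbols are notation introduced here for illustration):

$$
\ln \mathcal{L}(\theta, \sigma^2) = -\frac{n}{2}\ln\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\sum_{t=1}^{n}\left(y_t - f(t;\theta)\right)^2
$$

For any fixed $\sigma^2$, the first term does not depend on $\theta$, so maximising the log-likelihood over $\theta$ is the same as minimising the sum of squared errors $\sum_{t}\left(y_t - f(t;\theta)\right)^2$.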

Maximum Likelihood Estimation (MLE) is frequently used to estimate parameters and make inferences in statistics. Schmittlein and Mahajan (1982), who proposed MLE in Bass model studies, demonstrated how under a wide range of conditions MLE is consistent, asymptotically normal, and asymptotically efficient. MLE has the following characteristics: sufficiency (complete information about the parameter of interest contained in its MLE estimator); consistency (true parameter values that generated the data recovered asymptotically, i.e. for data of sufficiently large samples); efficiency (lowest-possible variance of parameter estimates achieved asymptotically); and parameterization invariance (same MLE solution obtained independent of the parameterisation used). Further, many of the inference methods in statistics are developed based on MLE. For example, MLE is a prerequisite for the chi-square test and many model selection criteria, such as the Akaike Information Criterion (AIC) (Akaike, 1973). However, MLE requires assumptions on distribution not supported in diffusion models, and a solution appears readily available for the Bass model only.

Srinivasan and Mason (1986a) observed that MLE seriously underestimated the standard errors of parameters and introduced Non-Linear Least Squares (NLLS) estimation to diffusion research. Putsis and Srinivasan (2000, p. 269) observe that the NLLS approach, when used in a noncumulative context, “…will do well in most settings and may be preferred to MLE”; subsequently NLLS has become the standard estimation technique in diffusion research. Neither MLE nor NLLS suffers from the time-interval bias problem that exists in OLS, and both can provide standard errors for the parameters.

Most of the research into estimators is directed at the Bass model and compares the OLS, MLE, and NLLS approaches. Putsis and Srinivasan (2000) warn that if nonlinear models have covariates, it is not clear which of MLE or NLLS is preferred. Van den Bulte and Lilien (1997) demonstrated that although NLLS is favoured over OLS and MLE, it is not exempt from bias in parameter estimation (referring to how it understates “L” and “p” and overstates “q” in Bass model parameter estimation). Dekimpe, Parker, and Sarvary (1998) observed that problems primarily occur when estimation is done without placing external constraints on the parameter ranges. When constraints were used, the problem was largely eliminated (Dekimpe et al., 1998). NLLS requires a ‘good’ initial estimate of parameter values; otherwise, it might converge on a local, rather than a global, optimum. NLLS is implemented in most simple fitting software and has many of the advantages of MLE. A multi-start algorithm based on NLLS, built into Microsoft Excel’s Solver add-in, became the parameter estimation tool used to minimise the sum of the squared errors in forecasts.
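The idea of bounded, multi-start least squares can be sketched as follows. This is an illustration only and is not the routine inside Excel's Solver: a simple compass (pattern) search stands in for the local optimiser, the simple logistic with L fixed at 1 stands in for the model, and all function names, bounds, and default settings are assumptions made for the example.

```python
import math
import random

def logistic(t, a, b, L=1.0):
    """Simple logistic with ceiling L (assumed fixed at 1 for share data)."""
    return L / (1.0 + a * math.exp(-b * t))

def sse(params, times, shares):
    """Sum of squared errors between the observed shares and the model."""
    a, b = params
    return sum((y - logistic(t, a, b)) ** 2 for t, y in zip(times, shares))

def compass_search(start, bounds, times, shares, step=0.5, tol=1e-8):
    """Local minimisation of the SSE inside box constraints.

    Tries a step up and down along each parameter, keeps any improvement,
    and halves the step when no move helps.
    """
    best = list(start)
    best_sse = sse(best, times, shares)
    while step > tol:
        improved = False
        for i, (lo, hi) in enumerate(bounds):
            for direction in (1.0, -1.0):
                trial = list(best)
                # Clip the trial value so it never leaves the allowed range.
                trial[i] = min(hi, max(lo, best[i] + direction * step))
                trial_sse = sse(trial, times, shares)
                if trial_sse < best_sse:
                    best, best_sse, improved = trial, trial_sse, True
        if not improved:
            step /= 2.0
    return best, best_sse

def multistart_nlls(times, shares, bounds, n_starts=10, seed=0):
    """Run the local search from several random starts; keep the lowest SSE."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_starts):
        start = [rng.uniform(lo, hi) for lo, hi in bounds]
        results.append(compass_search(start, bounds, times, shares))
    return min(results, key=lambda result: result[1])
```

A call such as `multistart_nlls(times, shares, bounds=[(0.01, 5.0), (-2.0, 2.0)])` returns the best parameter pair and its sum of squared errors. Restarting from several points reduces the chance of reporting a local optimum, and the bounds play the role of the external constraints discussed above.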

In document Forecasting the decline of superseded technologies : a comparison of alternative methods to forecast the decline phase of technologies : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Marketing at Ma (Page 114-120)