General Issues

5.2.2. Problems in achieving consistently accurate forecasts

There are many reasons why forecasters are unable to provide perfect accuracy. Martino (1993) describes how forecasts are subject to four key problems:

• There is a lack of inherent necessity; that is, historical outcomes are not completely determined by observable physical factors. Some things just happen by chance, or are influenced by chance happenings. Consequently, each outcome is unique even when the observed situations were apparently identical, and forecasting in such situations is challenging;

• Because each event is historically unique, understanding the dissimilarities between cases is important. An analogy is strengthened if several historical cases with parallel outcomes can be compared with the situation of interest and found to be identical to it at their starting points;

• Historically conditioned awareness means that when people are aware of what happened last time, they tend to act on that knowledge to achieve a different outcome “this time around”. This violates the forecasting principle that people will act in the same way every time;

• Any causal analogy assumes not only that the observed parameters are similar but also that cause and effect will play out in the same way.

These four problems remind us of two things. First, one should control what is controllable. Second, one should be aware that in forecasting some future conditions are uncontrollable and unpredictable.

In addition, forecasters must meet the needs of their audiences. For example, managers are cautious of forecasts produced by models they do not understand (Taylor & Thomas, 1982). Likewise, forecasting practitioners appear to rate sensibility (believability) above accuracy in forecasts (Huss, 1987), while academics rate accuracy higher than believability (Carbone & Armstrong, 1982). Yokum and Armstrong (1995) observed that managers consider flexibility, ease of use, and ease of implementation nearly as important as prediction accuracy in choosing a forecast approach. Mahajan, Muller, and Bass (1995b) observe that model selection involves a trade-off between fit and parsimony. This observation has support in model validity theory, through the concept of allowable degrees of freedom, which, when breached, leads to model overfitting (Babyak, 2004; Everitt & Skrondal, 2002). The principle that, given two explanations and all other things being equal, the simpler explanation is the best has become generally accepted by scientists developing theories and models (Hawking, 2003, p. 371). There is also empirical support for the principle of parsimony in the form of a comprehensive review of experiments that compared the accuracy of forecasts from simple methods and models with that of forecasts from more complex ones. Forecasts from the simpler methods were as accurate or more accurate, and often substantially so (Green & Armstrong, 2015). The approach taken in the research described in this thesis, then, has been to use simple explainable methods that can be assessed using simple explainable measures.

Recently, a principle of conservatism has been proposed, the so-called golden rule of forecasting (Armstrong, Green, & Graefe, 2015). This principle is broadly based: to be conservative means “adhering to cumulative knowledge about the situation and about forecasting methods” (Armstrong et al., 2015, p. 1781). As a useful illustration, one might think that forecasting a small change is conservative. However, if the indicators point to a large change, then assuming no change, which is a common approach, is not conservative (Batchelor & Dua, 1992). Comprehensive reviews of evidence on forecasting principles can be found in Armstrong (1985) and Armstrong (2001d). The next section introduces selected differences between the notions of diffusion (growth) curves and of discontinuance (decline) curves.

5.2.3. The suitability of diffusion models for discontinuance curves

As mentioned in Chapters Two and Three, when discontinuance rates in a market exceed adoption rates, the incumbent technology declines along an S-curve. Rogers’ theory implicitly specifies this as a binary process involving only one new and one old technology. On the surface, decline (cumulative discontinuance of use) curves look to be the inverse of diffusion curves. Fisher and Pry demonstrated this with their logarithmic translation of the data and application of a linearised logistic function to both diffusing and substituted technologies. However, this observation is valid at a theoretical level only if the one-to-one binary substitution process described by Rogers (1962) holds true.
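The mirror-image relationship can be made explicit with a short derivation. As a sketch only, assume the new technology's market share f(t) follows a logistic with growth rate a and midpoint t_0 (the same symbols as Equation (35) later in this chapter, with the ceiling normalised to 1, which is an assumption made here for illustration), and that the old technology holds the entire remainder of the market, as the binary substitution process requires:

```latex
\begin{align}
f(t) &= \frac{1}{1 + e^{-a(t - t_0)}} \\
1 - f(t) &= \frac{1}{1 + e^{a(t - t_0)}} \\
\ln\frac{f(t)}{1 - f(t)} &= a(t - t_0)
\end{align}
```

Under these assumptions the declining share is the same logistic with the sign of a reversed, and the log-odds of the share is a straight line in time. If a third technology enters, or the market size changes, the remainder is no longer 1 - f(t) and the mirror image need not hold.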

Thus, models built on an understanding of diffusion determinants might well be incorrectly specified, given potentially different determinants of decline, over diffusion. Such incorrect

sales and how these are echoed in market share, decline is affected by a range of dis-adoption and abandonment mechanisms, and therefore not directly by sales. Consequently, there is little to link the market diffusion defined parameters of the more refined marketing science diffusion models to the decline process. Fortunately, a naïve model fitted to data for declining technology will always reflect the impact of those determinants of decline even if the individual level discontinuance mechanisms driving decline are not understood, and the theoretical backing for the parameters used is not confirmed as suitable for a decline situation.

The lack of a strong assessment of the decline mechanisms from a modelling perspective precludes considering causal models for decline. Technology decline modelling research is relatively rare (for exceptions see Newell et al., 2014; Norton & Bass, 1992). Importantly, when attempts to model decline do take place, they are predominantly undertaken from a growth or diffusion perspective and forecast from growth onwards into a decline phase. This is an unnatural process, as the modelling does not link with how a manager would commission a forecast. Typically, such studies include the substituted component in the diffusion stream in order to show cumulative growth of the new technology, such that any decline prediction is a by-product of the single diffusing technology and any parallel market size expansion (Fisher & Pry, 1971; Marchetti & Nakicenovic, 1979; Norton & Bass, 1987, 1992). To see whether decline forecasting had been attempted, a systematic search of Google Scholar, Scopus and Web of Science was undertaken in March 2017 and again in April 2018, examining sources 10 pages deep in each database (nominally 20 entries per page). This search found no studies directly investigating the forecasting of technology decline. There is substantial literature in the area of human capability to undertake decline curve estimation (Best, Smith, & Stubbs, 2007; Ebersbach, Lehner, Resing, & Wilkening, 2008; Wagenaar & Timmers, 1979) and in the area of resource depletion, in particular oil reserves and oil extraction decline (Fetkovich, Fetkovich, & Fetkovich, 1996; Höök, 2009). There was also work on species decline (Duffy et al., 2009). There is perhaps an assumption that the work on decline forecasting has already been done, because substitution research implicitly includes the substituted technology.
There are some studies that attempt the modelling of decline curves; their number is, however, limited, as is their scope, which is restricted to demonstrating the potential to model (Marchetti & Nakicenovic, 1979; Norton & Bass, 1992). In the context of such limited knowledge, there is a good case for using the accumulated data for the decline of a technology for forecasting rather than working with the growth data of the new technology.

5.2.4. Heteroscedastic forecast errors

The S-curve trajectory of typical diffusion means forecast residual errors can be heteroscedastic. That is, they are relatively small at the beginning, become large in the middle, and then reduce again at the end, as a simple artefact of the rate of change in diffusion over time. Moreover, if the errors are measured as percentage errors, this heteroscedasticity is further expanded by the S-curve change in the magnitude of the actual values over time.
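The effect can be illustrated with a minimal sketch. It assumes a noise-free logistic series and a forecast model whose only fault is a midpoint that is half a period late; all parameter values (M, A, T0, SHIFT) are hypothetical and chosen only for illustration, not taken from the thesis data.

```python
import math


def logistic(t, m, a, t0):
    """Logistic S-curve: Y(t) = M / (1 + exp(-a * (t - t0)))."""
    return m / (1.0 + math.exp(-a * (t - t0)))


# Hypothetical parameters (assumptions for illustration only).
M, A, T0 = 100.0, 0.8, 10.0   # ceiling, growth rate, midpoint
SHIFT = 0.5                   # the model's midpoint is half a period late

times = list(range(0, 21))
actual = [logistic(t, M, A, T0) for t in times]
forecast = [logistic(t, M, A, T0 + SHIFT) for t in times]

# Absolute error follows the slope of the curve: small, large, small.
abs_err = [abs(f - y) for f, y in zip(forecast, actual)]

# Percentage error also depends on the level of the actual values.
pct_err = [100.0 * e / y for e, y in zip(abs_err, actual)]

for t in (0, 5, 10, 15, 20):
    print(f"t={t:2d}  actual={actual[t]:8.3f}  "
          f"abs_err={abs_err[t]:6.3f}  pct_err={pct_err[t]:6.2f}%")
```

In this sketch the same timing error produces its largest absolute error at the midpoint of the curve, while the percentage errors are largest where the actual values are smallest. For a growth curve that is the early phase; for a decline curve the small values, and so the large percentage errors, would fall at the end of the series.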

As a typical S-curve diffusion time series lengthens, the S-curve pattern becomes more discernible, allowing better model calibration and thus a better representation of the underlying trend. In fact, the risks of not recalibrating the model as new data become available are high in diffusion studies. This is because the rapid change in the rate of diffusion, observed in the region from take-off until growth slows towards a ceiling, means small inaccuracies in the model specification will generate substantial forecast errors in this region of the curve. Hence, the potential gains from updating the calibration of the model each time period, and so obtaining an improved prediction, are arguably higher than in any other situation where constant updating might be possible, because the level of the data changes rapidly in the middle portion of diffusion curves, where data points are by definition at their scarcest.

5.2.5. The length of diffusion data series

Diffusion data are typically logged annually or quarterly and, very rarely, monthly. This means there are often few data points available in typical diffusion data series. Emmanouilides (2006) sought a wide range of diffusion time series with at least seven observations. In the 926 series from 50 countries covering household appliances, home electronics, and telecommunication equipment, all introduced after the year 1950, Emmanouilides found that 39 percent of his series had only his minimum of seven observations, 54 percent had between eight and 19 observations, and only eight percent had 20 or more observations. The diffusion literature rarely mentions this lack of length in diffusion series. Such scarcity of data points means that there will often be insufficient data for other than simple models, particularly when attempting to forecast rapidly diffusing technology or when only early-phase data are available. Often in such situations, choosing an S-curve model will be difficult to justify on the data alone and models will struggle to fit the data well.

The literature describes several ways of dealing with insufficient data. Sometimes the problem is insufficient target variable data to support models, but elsewhere there might be similar data that can be used as a surrogate (analogous data). Sometimes the data are not adequate for any form of model even when analogous data are sought. In such cases, judgment might need to be applied to augment the data points, augment the models, or predict future data points directly. For example, researchers recognise that knowledge of the intentions of consumers and the opinions of experts can improve model choice and model specification when there is only limited information (Armstrong & Collopy, 1998; Lawrence, Goodwin, O'Connor, & Önkal, 2006). Moreover, even when the S-curve pattern is emerging (becoming recognisable), expert judgment might still be valuable, through some form of bootstrapping (Armstrong, 2001c) and by providing input to forecasting models via some form of forecast adjustment. The potential benefits of adding judgmental input to forecasts might be high. Armstrong and Brodie (1999) and Armstrong et al. (1987) noted the benefits of judgment with limited data when they observed that a progression from judgmental methodologies to quantitative models improves forecasts as more data become available.

Debecker and Modis (1994) note that the quality of the early data and the portion of the growth curve covered by that data substantially affect prediction accuracy from diffusion models. They undertook over 35,000 fits to artificial data series and were able to establish the uncertainties, confidence intervals, and systematic bias of the S-curve in forecasts for different densities of data. They advocated two simple rules. First, if there is access to a minimum of 20 percent of the full range of the likely diffusion curve, then it is possible to deliver an adequate forecast through fitting using Equation (35).

$$Y(t) = \frac{M}{1 + e^{-a(t - t_0)}} \tag{35}$$

Second, if diffusion has passed 50 percent, then a forecast of the ceiling value should be within 20 percent of the actual value, with a 95 percent confidence level. Additionally, they demonstrated, consistent with previous researchers (e.g. Fisher & Pry, 1971), that the extremities of a curve, in their case below 5 percent and above 95 percent, could not be expected to fit well. This supports the Fisher and Pry recommendation to trim early and late data, an action that limits the shortest series that can be used. If data series are short, then testing models with a rolling origin forecast becomes problematic, as described earlier in section 5.2.1.
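How trimming and series length interact with rolling origin testing can be sketched in a few lines. The sketch below is not the procedure used by Debecker and Modis or in this thesis. It assumes the ceiling M is known in advance (a strong simplification), fits the remaining two parameters of Equation (35) by ordinary least squares on the Fisher and Pry log-odds scale, and uses hypothetical function names, parameter values, and a noise-free artificial series.

```python
import math


def logistic(t, m, a, t0):
    """Equation (35): Y(t) = M / (1 + exp(-a * (t - t0)))."""
    return m / (1.0 + math.exp(-a * (t - t0)))


def fit_logistic_known_ceiling(times, values, m, lo=0.05, hi=0.95):
    """Estimate a and t0 by least squares on the log-odds scale.

    Assumes the ceiling m is known. Observations below `lo` or above
    `hi` of the ceiling are trimmed before fitting.
    """
    pts = [(t, math.log((y / m) / (1.0 - y / m)))
           for t, y in zip(times, values) if lo <= y / m <= hi]
    if len(pts) < 2:
        raise ValueError("fewer than two usable points after trimming")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_z = sum(z for _, z in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxz = sum((t - mean_t) * (z - mean_z) for t, z in pts)
    a = sxz / sxx                # slope of the log-odds line
    t0 = mean_t - mean_z / a     # time at which the log-odds is zero
    return a, t0


def rolling_origin_errors(times, values, m, min_train, horizon=1):
    """Refit at each origin; return absolute errors `horizon` steps ahead."""
    errors = []
    for origin in range(min_train, len(times) - horizon + 1):
        a, t0 = fit_logistic_known_ceiling(times[:origin], values[:origin], m)
        target = origin + horizon - 1
        errors.append(abs(logistic(times[target], m, a, t0) - values[target]))
    return errors


# Hypothetical noise-free annual series of 12 observations.
M = 100.0
times = list(range(12))
values = [logistic(t, M, 0.8, 6.0) for t in times]

errors_long = rolling_origin_errors(times, values, M, min_train=5)
errors_short = rolling_origin_errors(times[:7], values[:7], M, min_train=5)
print(len(errors_long), len(errors_short))
```

With 12 observations and a minimum of five training points there are seven one-step-ahead test origins; with a seven-observation series, the minimum length in the Emmanouilides sample, only two remain. Starting the test any earlier fails in this example because trimming leaves fewer than two usable points in the training window.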

5.2.6. Identifying the start of a decline pattern is a challenge

The completion of an S-curve trajectory is what defines successful diffusion. However, there are inherent problems with universally attributing this S-curve pattern to early data, because until the S-curve is well established, the use of an S-curve model is not fully supported. That is, early diffusion data give little indication of the later S-curve growth phase trajectory that regularly follows technology introduction. This lack of indication arises from two factors. The first is the low number of initial adopters and their limited ability to communicate the benefits of adoption, resulting in low levels of penetration. The second is the issue of noise in the diffusion data: in a large market, small market variations often swamp the low levels of data for the diffusing technology (Davies, 1979; Girifalco, 1991; Modis & Debecker, 1992). This issue is covered in detail in Section 6.3. The same problem occurs with decline when only a small portion of the decline data has emerged and there is no established trend. The forecaster is faced with three questions. First, has the decline process really started? The answer is critical to applying any further forecasting assumptions. Second, will this decline follow an S-curve? This question could be answered by watching the data progress until the shape of the series is evident. Third, what does the first half of the decline curve say about the second half? These questions form part of the tests for the analogy assumptions of S-curves, and are critical to the use of marketing science diffusion models. If the two halves are symmetric, then a symmetric S-curve model can be deployed. If not, then a suitable asymmetric model is required. The problems outlined in the three questions manifest themselves in the fitting of models to early data, which are potentially not representative of the trend to come, and also in the choice of models with respect to their points of inflection.
Diffusion in the early stage is slow; this slow growth region is extremely variable in length and can go on for weeks, months, or years (Dattee, 2007; Parker, 1994). Decline is similar to diffusion in this respect, and during this time the falls in the data are relatively small in magnitude and could be subject to random or seasonal fluctuations. Accordingly, judgment is required to identify the emergence of an S-curve, and a rule of some kind is needed to attribute the S-curve growth analogy to the data. Once that pattern is accepted, the common observation of S-curve diffusion trajectories allows us to accept the use of models based on this shape.

In document Forecasting the decline of superseded technologies : a comparison of alternative methods to forecast the decline phase of technologies : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Marketing at Ma (Page 82-88)