Judgmental Forecasting Issues
5.5.7. Judgmental forecasting accuracy research is mostly econometric
In studies of judgmental forecasting of time series, complexities such as seasonality and large spikes in a trend have been associated with poorer accuracy (Lawrence et al., 1985; Sanders, 1992). Participants often appear to read noise as part of the trend, which leads them to overreact, particularly to recent noise (Klayman, 1988). Furthermore, many of the studies were conducted with high levels of induced noise; see, for example, Sanders (1992) and Best, Smith, Frey, and Stubbs (1998), discussed in section 5.5.10. Many studies, particularly in the econometric literature, use data that combine complex seasonality with low underlying trends. This complexity, and the noise relative to trend level, adds cognitive load and so has the potential to reduce the accuracy and consistency of judgmental predictions. However, Lawrence et al. (1985) found that judgmental extrapolations appeared to be as accurate as their extrapolative statistical model counterparts, although the studies they reported were primarily of econometric data (Ashley, 1988; Ashton, 1984; L. D. Brown & Rozeff, 1978). In summary, much of the judgmental forecasting literature, including that in econometrics (see, for example, Camerer, 1981), where typical contexts are linear with low trend, investigates series that often have low signal-to-noise ratios. This is in stark contrast to the highly trended, highly correlated, non-linear nature of decline curves, which makes much of this literature of limited direct relevance, given the difference in the cognitive task.
The M1 competition (Makridakis et al., 1982) demonstrated that in econometric environments, that is, where data series with low trends and moderate variance are common, simple methods work better. Deseasonalised single-level exponential smoothing was superior to all other methods overall (Makridakis et al., 1982). Using the same M1 data, Lawrence et al. (1985) concluded that judgmental forecasts were at least as accurate as, and sometimes more accurate than, statistical methods. Importantly, the standard deviation of their judgmental estimates was lower than that of the statistical methods, which suggests that judgmental forecasts may have been more consistent. There is further evidence supporting judgment in forecasting (Goodwin & Wright, 1993; Lawrence et al., 1985). Using the M2 forecasting competition data, Makridakis et al. (1993) demonstrated that judgmental forecasting could be at least as accurate as statistical forecasting, and that in some situations it was the best method. Other studies using the same or similar data have found that the accuracy demonstrated by Makridakis et al. (1982) was seemingly related to the type of data being forecast (Carbone & Gorr, 1985; Sanders & Ritzman, 1992).
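For readers unfamiliar with the simple method referred to above, the following is a minimal sketch of single-level (simple) exponential smoothing. It is illustrative only: the function name, the smoothing constant `alpha` and the example series are assumptions made here, not values from the M1 competition, and the deseasonalising step used in that competition is omitted.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the one-step-ahead forecast from simple exponential smoothing.

    The level starts at the first observation and is updated as
    level = alpha * observation + (1 - alpha) * previous level.
    `alpha` is a hypothetical smoothing constant chosen by the user.
    """
    if not series:
        raise ValueError("series must contain at least one observation")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in the interval (0, 1]")
    level = series[0]
    for observation in series[1:]:
        level = alpha * observation + (1.0 - alpha) * level
    # With no trend or seasonal term, the final level is the forecast
    # for every future period.
    return level


if __name__ == "__main__":
    # Hypothetical low-trend, moderately noisy series.
    demand = [102.0, 98.0, 101.0, 97.0, 103.0, 99.0]
    print(simple_exponential_smoothing(demand, alpha=0.5))
```

Because the method carries only a level, its forecast is flat, which suits the low-trend series described here but not strongly trended data.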
5.5.8. Judgmental forecasts are generally underestimates
When participants are asked to extrapolate data, accuracy is frequently low and underestimation is common (Bailey & Gupta, 1999; Keren, 1983; Timmers & Wagenaar, 1977). The underestimation appears across many formulations of the forecasting task (Mullet & Cheminat, 1995). Generally, it does not appear to be driven by the method used to present the data or by the wording of the task description (Lawrence et al., 1985, 1986; Lawrence & Makridakis, 1989), nor is it reduced by an awareness of humans’ propensity to underestimate (Andreassen & Kraus, 1990). The expertise of the judge also seems not to affect outcomes (Sanders & Ritzman, 1992; Wagenaar & Sagaria, 1975).
Some authors (Eggleton, 1982; O'Connor, Remus, & Griggs, 1993) believe there are consistent biases in the judgmental forecasting of trends: forecasters tend to dampen both rising and falling trends, with falling trends dampened more than rising trends, although Lawrence and Makridakis (1989) found the dampening to be even. Lawrence and Makridakis (1989) also observed that, on falling series, forecasters were less confident in their forecasts and widened the bounds of their estimates. This finding was confirmed by O'Connor, Remus, and Griggs (1997). Forecasting from noisy series is associated with this phenomenon (for explanations, see Andreassen & Kraus, 1990; Keren, 1983, 1984; Lawrence & Makridakis, 1989; O'Connor et al., 1993). In practice, this means that forecasts lie below upward trend lines but above downward ones. That is, forecasters tend to underestimate the steepness of trends in data series.
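The effect of trend damping can be illustrated with a short sketch. It is a stylised illustration, not a model proposed by the authors cited above: the `damping` factor, the function names and the example series are all assumptions introduced here. A damping factor below 1 shrinks the fitted slope, so the forecast falls below a rising trend line and sits above a falling one.

```python
def linear_trend(series):
    """Least-squares slope and intercept of the series against t = 0..n-1."""
    n = len(series)
    if n < 2:
        raise ValueError("at least two observations are required")
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean


def damped_forecast(series, horizon, damping):
    """Extrapolate the fitted trend with its slope scaled by `damping`.

    damping = 1.0 reproduces the fitted trend line; a hypothetical value
    below 1.0 mimics a forecaster who underestimates the trend's steepness.
    """
    slope, intercept = linear_trend(series)
    last_fitted = intercept + slope * (len(series) - 1)
    return [last_fitted + damping * slope * h for h in range(1, horizon + 1)]


if __name__ == "__main__":
    rising = [10.0, 12.0, 14.0, 16.0, 18.0]   # hypothetical rising series
    falling = [18.0, 16.0, 14.0, 12.0, 10.0]  # hypothetical falling series
    print(damped_forecast(rising, 3, 1.0))    # [20.0, 22.0, 24.0]
    print(damped_forecast(rising, 3, 0.5))    # [19.0, 20.0, 21.0]
    print(damped_forecast(falling, 3, 1.0))   # [8.0, 6.0, 4.0]
    print(damped_forecast(falling, 3, 0.5))   # [9.0, 8.0, 7.0]
```

In both cases the damped forecast lies on the flatter side of the trend line, and the gap widens with the forecast horizon.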
5.5.9. Exponential trends are grossly underestimated by humans
Phenomena displaying exponential-type growth or decline have important impacts on human life (Wagenaar & Timmers, 1979). Beyond diffusion growth models and analogous series models, judgmental methods could be used to investigate these phenomena; however, this has not often been done. More generally, Lawrence et al. (2006) provide a useful review of judgmental forecasting and the sources of bias, and argue that even experts with experience of growth processes did not do significantly better than amateurs.
More specifically, the ability of humans to forecast judgmentally in exponential growth and decline situations has been investigated in the psychology literature (Best, 2008; Best et al., 2007; Timmers & Wagenaar, 1977; Wagenaar & Timmers, 1978a, 1978b, 1979) and in the financial literature (McKenzie & Liersch, 2011). In exponential growth prediction, judgmental extrapolation has been far from successful. Uniformly, humans substantially underestimate the rate of growth of exponential series when asked to extrapolate from early data (Timmers & Wagenaar, 1977; Wagenaar & Sagaria, 1975), and providing more data points does not help (Wagenaar & Timmers, 1978a, 1978b). Keren (1984) found that people faced with historical exponential data were not able to predict an exponential trajectory, although those with experience of exponential growth did better. All these authors reason that the underlying extrapolation model used by people does not allow for such fast-moving changes. In contrast, Bailey and Gupta (1999), while investigating learning curves, which have a declining exponential form, observed that human forecast accuracy was statistically superior to that of fitted curve models when few data points were available.
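The scale of the underestimation that follows from an extrapolation model unable to accommodate fast-moving change can be sketched as follows. The linear extrapolator is used purely as an illustrative stand-in and is an assumption made here, not the model identified by the cited authors; the function names, growth rate and series length are likewise hypothetical.

```python
def exponential_series(start, growth_rate, n):
    """First n values of a series growing by `growth_rate` per period."""
    return [start * (1.0 + growth_rate) ** t for t in range(n)]


def linear_extrapolation(series, horizon):
    """Project the most recent period-to-period change forward linearly."""
    step = series[-1] - series[-2]
    return series[-1] + step * horizon


def underestimation_ratio(start, growth_rate, n_observed, horizon):
    """Linear forecast divided by the true exponential value at the horizon.

    A ratio below 1 means the linear extrapolation underestimates.
    """
    observed = exponential_series(start, growth_rate, n_observed)
    true_value = start * (1.0 + growth_rate) ** (n_observed - 1 + horizon)
    return linear_extrapolation(observed, horizon) / true_value


if __name__ == "__main__":
    # Hypothetical series that doubles each period: 1, 2, 4, 8.
    for horizon in (1, 3, 5):
        print(horizon, underestimation_ratio(1.0, 1.0, 4, horizon))
```

For this doubling series the linear forecast reaches 75% of the true value one period ahead but only about 31% three periods ahead, so the shortfall grows quickly with the horizon.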
The shape of the trend is important: exponential trends are forecast much more poorly than linear trends (Wagenaar & Timmers, 1978a, 1978b), and Best (2008) noted that falling trends were forecast more accurately than growing trends. Asymptotic exponential decline has also been observed to be predicted marginally better than exponential growth (Timmers & Wagenaar, 1977; Wagenaar & Sagaria, 1975). This is important because the last half of a decline curve follows a similar trajectory to a declining asymptotic exponential.
Best (2008) hypothesised, based on her findings, that the underestimation bias found by previous researchers such as Wagenaar and Timmers (1978b) might depend on the expertise and experience of the forecaster. She found that when experienced forecasters had information about the time series presented in a graph, they tended to overestimate substantially, while inexperienced forecasters tended to underestimate slightly. Lawrence and Makridakis (1989) observed that forecasters were less confident when forecasting falling curves than growing curves, and widened the bounds of their estimates accordingly. This finding was confirmed by O'Connor et al. (1997).