Each quarterly survey since February 1998 asks for forecasts at three future points in time, as in the example in Fig. 1: the fourth quarter (Q4) of the current year; the fourth quarter of the following year; and the corresponding quarter two years ahead. (In the early “inflation-only” surveys, only the first two questions appeared.) This structure eventually delivers nine successive forecasts of a given Q4 outcome, which form a sequence of “fixed-event” forecasts, with the date of the forecast preceding the date of the outcome by 8, 7,…, 1, 0 quarters. Given that the survey goes out early in the quarter, when no data on current-quarter inflation and GDP growth are available, we treat these as h-step-ahead forecasts with horizon h equal to 9, 8,…, 2, 1 quarters successively. In the more conventional time-series framework of constant-horizon forecasts, the third question delivers a quarterly series of nine-quarter-ahead forecasts, but the first two questions give only one observation per year at each intermediate horizon: h = 4 and 8 in February, h = 3 and 7 in May, and so on. The end-year targets of the first two questions are evidently more familiar to forecasters, since there are usually a few respondents who answer those two questions but not the third. Despite this, in May 2006 all three questions were switched to a constant-horizon format, focusing on the corresponding quarter one, two and three years ahead.
However, the recent literature on testing for rationality allowing for asymmetric loss also suggests that forecast uncertainty will affect the relationship between the conditional mean and the point forecasts. If the histograms accurately reflect the individuals’ true (subjective) beliefs, then individuals will report point forecasts that under- or over-predict relative to their mean forecast if they have asymmetric loss functions, and the extent to which they do so will depend on their perceived forecast uncertainty. Our results are generally not supportive of the asymmetric loss explanation, for two reasons. Firstly, although we find a role for the forecast standard deviation in our panel regressions, we also find that last period’s actual value helps predict the difference between the mean and point forecast. The significance of the lagged actual value is inconsistent with rational behaviour by the forecaster for the class of loss functions we have allowed. Secondly, we find that point forecasts are more accurate (assuming a squared-error loss function) than the histogram means. If the respondents have asymmetric loss functions, then the expected squared error of a point forecast must exceed the expected squared error of the conditional mean. Our relative forecast accuracy findings count against the asymmetric loss explanation. The discrepancies between the point forecasts and histograms have been viewed as evidence ‘that point predictions may have a systematic, favourable bias’ (Engelberg et al. (2007)), but this appears to be unwarranted. Rather, our findings indicate that first moments derived from survey respondents’ histograms have a tendency toward pessimism relative to the outcomes. The relative accuracy of the point forecasts judged by squared-error loss would also tend to lend support to our maintained assumption that the point forecasts are estimates of the mean.
rect measures of uncertainty that can serve as the gold standard for evaluating other ways of measuring or proxying forecast uncertainty. Chief amongst these are measures based on the dispersion of individuals’ point forecasts (‘disagreement’ or ‘consensus’): see Zarnowitz and Lambros (1987), with more recent contributions including Giordani and Söderlind (2003), Boero, Smith and Wallis (2008), D’Amico and Orphanides (2008) and Rich and Tracy (2010). In this paper we analyze the properties of the histogram-based measures of uncertainty, and in particular the relationship between the histogram measure as an ex ante measure of forecast uncertainty and ex post or realized uncertainty. Our interest in the term structure of forecast uncertainty (i.e., how forecast uncertainty changes with the horizon) is motivated in part by a recent paper by Patton and Timmermann (2011), who adopt a fixed-event framework to consider how realized forecast uncertainty varies over the forecast horizon. Their approach is silent about ex ante forecast uncertainty and its relationship to ex post uncertainty. We use actual survey forecast errors to measure ex post uncertainty, as they do, but also use the SPF histograms to measure ex ante uncertainty.
In the recent literature it has been suggested that symmetric loss ought to be regarded as just a special case of more general loss functions, and that the presumption of symmetric loss might be misplaced for macro-forecasters. If forecasters have asymmetric loss functions, then we would expect their point forecasts to be biased, and the bias should depend on the conditional forecast uncertainty. However, the small samples of forecasts we have by individual for a given horizon might explain why we often fail to reject the null of unbiasedness (irrespective of whether loss is really asymmetric). Our preferred test analyses the difference between the point predictions and (estimates of) the conditional means; it appears to have greater power, and is consistent with the asymmetry story in that the difference is zero-mean once we allow for forecast uncertainty. However, the outcomes of the forecast encompassing tests provide telling evidence against asymmetric loss: we tend to reject the null that the mean forecasts encompass the point predictions, under squared-error loss, but not vice versa.
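The link between asymmetric loss, forecast uncertainty, and bias can be made concrete with a stylized sketch. Under linex loss L(e) = exp(a·e) − a·e − 1, with e = y − f and a Gaussian predictive density N(mu, s2), the optimal point forecast is f* = mu + a·s2/2, so the bias relative to the conditional mean scales with the forecast variance. This is an illustrative textbook case, not the paper's own test; the parameter values below are arbitrary.

```python
# Sketch: under linex loss, the risk-minimizing forecast shifts away from the
# conditional mean by a*s2/2, so bias depends on forecast uncertainty.
import math, random

def linex_loss(f, y, a):
    e = y - f
    return math.exp(a * e) - a * e - 1

random.seed(0)
mu, sd, a = 2.0, 1.0, 0.8                       # illustrative values
draws = [random.gauss(mu, sd) for _ in range(200_000)]

f_opt = mu + a * sd ** 2 / 2                    # theoretical optimum: 2.4
risk_mean = sum(linex_loss(mu, y, a) for y in draws) / len(draws)
risk_opt  = sum(linex_loss(f_opt, y, a) for y in draws) / len(draws)
print(risk_opt < risk_mean)                     # shifted forecast beats the mean
```

Under squared-error evaluation the ordering reverses: the mean minimizes expected squared error, which is exactly why the relative-accuracy comparison in the text bears on the asymmetric-loss hypothesis.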
performance in panel data models. The panel considered in this paper features a large cross-sectional dimension (N) but a short time series (T). It is modeled by a dynamic linear model with common and heterogeneous coefficients and cross-sectional heteroskedasticity. Because T is short, traditional methods have difficulty disentangling the heterogeneous parameters from the shocks, which contaminates the estimates of the heterogeneous parameters. To tackle this problem, the methods developed in this dissertation assume that there is an underlying distribution of the heterogeneous parameters and pool information from the whole cross-section via this distribution. Chapter 2, coauthored with Hyungsik Roger Moon and Frank Schorfheide, constructs point forecasts using an empirical Bayes method that builds on Tweedie's formula to obtain the posterior mean of the heterogeneous coefficients under a correlated random effects distribution. We show that the risk of a predictor based on a non-parametric estimate of the Tweedie correction is asymptotically equivalent to the risk of a predictor that treats the correlated-random-effects distribution as known (ratio-optimality). Our empirical Bayes predictor performs well compared to various competitors in a Monte Carlo study. In an empirical application, we use the predictor to forecast revenues for a large panel of bank holding companies and compare forecasts that condition on actual and severely adverse macroeconomic conditions. In Chapter 3, I focus on density forecasts and use a full Bayes approach, where the distribution of the heterogeneous coefficients is modeled nonparametrically, allowing for correlation between the heterogeneous parameters and the initial conditions as well as individual-specific regressors. I develop a simulation-based posterior sampling algorithm specifically addressing the nonparametric density estimation of unobserved heterogeneous parameters.
I prove that both the estimated common parameters and the estimated distribution of the heterogeneous parameters achieve posterior consistency, and that the density forecasts asymptotically converge to the oracle forecast. Monte Carlo simulations and an application to young firm dynamics demonstrate improvements in density forecasts relative to alternative approaches.
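Tweedie's formula, on which the Chapter 2 predictor builds, states that for y | theta ~ N(theta, s2) the posterior mean is E[theta | y] = y + s2 · d/dy log m(y), where m is the marginal density of y. The sketch below is a stylized check, not the dissertation's estimator: it takes a normal prior so the formula can be verified against the familiar closed-form shrinkage estimator, whereas the empirical Bayes method would estimate the marginal score nonparametrically from the cross-section. All parameter values are assumed for illustration.

```python
# Tweedie's formula in a conjugate normal setting, checked against the
# closed-form posterior mean (shrinkage toward the prior mean).
import math

mu0, tau2, s2 = 1.0, 2.0, 0.5          # prior mean/variance, noise variance

def log_marginal(y):
    v = tau2 + s2                       # marginal of y is N(mu0, tau2 + s2)
    return -0.5 * math.log(2 * math.pi * v) - (y - mu0) ** 2 / (2 * v)

def tweedie_posterior_mean(y, eps=1e-6):
    # numerical derivative of the log marginal (the "Tweedie correction")
    score = (log_marginal(y + eps) - log_marginal(y - eps)) / (2 * eps)
    return y + s2 * score

y = 3.0
shrinkage = mu0 + tau2 / (tau2 + s2) * (y - mu0)   # closed-form posterior mean
print(tweedie_posterior_mean(y), shrinkage)        # the two should agree
```

The practical appeal is visible even in this toy: the correction needs only the marginal density of the observables, which is estimable from the panel, rather than the unobserved prior itself.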
Probability forecasts are unique insofar as they not only provide a prediction of the location/class of the observation but also give a measure of the uncertainty in that prediction. Sharpness rewards models for location/class accuracy but gives no real indication of the correctness of the probability estimates. Calibration, also known as reliability in meteorology or empirical validity in statistics, refers to the ability of a model to make good probabilistic predictions. A model is said to be well calibrated if, among the events to which it assigns a probability of P%, the long-run proportion that actually occurs turns out to be P%. Intuitively, this is a desirable characteristic of any probabilistic forecast; in fact, it could be argued that probability forecasts that are not well calibrated are of no more use than point forecasts, because the probabilistic aspect of the prediction is incorrect.
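The calibration property described above can be checked empirically by binning forecasts and comparing each bin's average stated probability with the observed event frequency. A minimal sketch, with invented forecasts and outcomes rather than data from any particular model:

```python
# Group probability forecasts into bins and compare each bin's average
# forecast probability with the observed frequency of the event.

def calibration_table(probs, outcomes, n_bins=5):
    """Return (avg_forecast_prob, observed_frequency, count) per nonempty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    table = []
    for cell in bins:
        if cell:
            avg_p = sum(p for p, _ in cell) / len(cell)
            freq = sum(y for _, y in cell) / len(cell)
            table.append((avg_p, freq, len(cell)))
    return table

# A perfectly calibrated toy forecaster: events occur at the stated rates.
probs = [0.1] * 10 + [0.9] * 10
outcomes = [1] + [0] * 9 + [1] * 9 + [0]
for avg_p, freq, n in calibration_table(probs, outcomes):
    print(f"forecast {avg_p:.2f} -> observed {freq:.2f} (n={n})")
```

For a well-calibrated model the two columns track each other; systematic gaps between them are exactly the failure the passage warns about.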
It is unclear whether the point forecasts should be interpreted as the means, modes or even medians of the probability distributions. Engelberg et al. (2006) calculate non-parametric bounds on these measures of central tendency for the histograms. We focus on the mean, as their results are similar across the three measures. Before calculating bounds, we briefly consider other approaches that have been adopted to calculate means from histograms. One approach in the literature is to assume that the probability mass is uniform within a bin (see e.g., Diebold et al. (1999b), who make this assumption in the context of calculating probability integral transforms). Another is to fit normal distributions to the histograms (see Giordani and Söderlind (2003, p. 1044)). If the distribution underlying the histogram is approximately ‘bell-shaped’ then the uniformity assumption will tend to overstate the dispersion of the distribution, because there will be more mass close to the mean. This problem will be accentuated when there is a large difference in the probability mass attached to adjacent bins, where it might be thought desirable to attach higher probabilities to points near the boundary with the high-probability bin. In the same spirit of fitting a parametric distribution to the histogram, Engelberg et al. (2006) argue in favour of the unimodal generalized beta distribution. Using the assumption of uniform mass within a bin, and approximating the histogram by a normal density, results in the correlations between the annual growth point forecasts and the histogram means recorded in table 1. The crude correlations in columns 3 and 4 are all high, between 0.92 and 0.96, and the results indicate little difference between the two methods of calculating histogram means that we consider. Nevertheless, it is
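The uniform-within-bin calculation discussed above can be sketched directly: the implied mean uses bin midpoints, and the implied variance adds the within-bin uniform term w²/12 for each bin of width w, which is the source of the overstated dispersion for bell-shaped distributions. The bin edges and probabilities below are made up for illustration, not taken from any survey response.

```python
# Moments of a histogram forecast under the uniform-within-bin assumption.

def histogram_moments(edges, probs):
    mids = [(lo + hi) / 2 for lo, hi in zip(edges[:-1], edges[1:])]
    mean = sum(p * m for p, m in zip(probs, mids))
    # between-bin variance plus the uniform within-bin contribution w**2/12
    var = sum(p * ((m - mean) ** 2 + (hi - lo) ** 2 / 12)
              for p, m, lo, hi in zip(probs, mids, edges[:-1], edges[1:]))
    return mean, var

edges = [0.0, 1.0, 2.0, 3.0, 4.0]   # e.g. inflation bins, percentage points
probs = [0.1, 0.4, 0.4, 0.1]        # a respondent's histogram (illustrative)
mean, var = histogram_moments(edges, probs)
print(round(mean, 3), round(var, 3))
```

Fitting a normal or generalized beta density instead would typically shave the within-bin term, since a unimodal fit concentrates mass toward the distribution's centre rather than spreading it evenly across each bin.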
Figure 1 clearly demonstrates that a practical forecast based on annual data should incorporate spatial effects, or make use of less common methods such as neural networks. Turning to the other extreme, monthly data, time series models or methods based on large data sets can improve forecasting accuracy. The same holds for quarterly data. However, forecasting accuracy with spatial effects and monthly data has not yet been tested. Additionally, studies with monthly or quarterly data do not evaluate their models for longer horizons. Since most of the studies focus on point forecasts, the literature on different scenario analyses is scarce. Based on this state-of-the-art discussion, we now turn to the agenda for future research.
By using the model averaging approach, one can obtain better accuracy than any single method, when the forecasts combined draw on different methods that capture different information, different specifications or different assumptions. Because the underlying data generating process is often unknown, combined forecasts are more robust toward model mis-specification and are more likely to produce accurate point forecasts. In the demographic literature, there has been little interest in combining forecasts from different models. Nonetheless, some notable exceptions include Smith and Shahidullah (1995), Ahlburg (1998, 2001) and Sanderson (1998), whose pioneering work, particularly in the context of census tract forecasting, has done much to awaken others, including the present author. The contribution of this article is to apply the notion of model averaging to the problem of forecasting age-specific life expectancies.
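The robustness argument above has a simple mechanical core: when component models err in different directions, averaging cancels part of the error. A toy sketch with invented series (not life-expectancy data), using an equal-weight combination:

```python
# Two individually biased forecasts whose simple average has lower squared
# error than either component. All numbers are invented for illustration.

actuals = [2.0, 2.5, 3.0, 2.8, 2.2]
model_a = [2.4, 2.9, 3.4, 3.2, 2.6]   # systematically 0.4 too high
model_b = [1.6, 2.1, 2.6, 2.4, 1.8]   # systematically 0.4 too low

def mse(forecasts, actuals):
    return sum((f - a) ** 2 for f, a in zip(forecasts, actuals)) / len(actuals)

combined = [(a + b) / 2 for a, b in zip(model_a, model_b)]
print(mse(model_a, actuals), mse(model_b, actuals), mse(combined, actuals))
```

The cancellation here is exact only because the biases are mirror images; in practice the gain depends on how uncorrelated the component errors are, which is why combining methodologically different models tends to help most.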
Apart from point forecasts, we investigated the ability of the models to provide interval forecasts. For all considered models interval forecasts were determined analytically; for details on the calculation of the conditional prediction error variance and interval forecasts we refer to . Afterwards, following , we evaluated the quality of the interval forecasts by comparing the nominal coverage of the models to the true coverage. Thus, for each of the models we calculated confidence intervals (CIs) and determined the actual percentage of exceedances of the 50%, 90% and 99% two-sided day-ahead CIs of the models by the actual market clearing price (MCP). If the model-implied interval forecasts were accurate then the percentages of exceedances should be approximately 50%, 10% and 1%, respectively. Note that for each “month”, 840 hourly values were determined and compared to the MCP.
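The exceedance count described above reduces to tallying how often the realized price falls outside the forecast band. A minimal sketch, with illustrative stand-ins for the day-ahead interval bounds and market clearing prices:

```python
# Empirical exceedance rate of two-sided interval forecasts: for a nominal
# 90% interval, roughly 10% of outcomes should fall outside the bounds.

def exceedance_rate(lowers, uppers, actuals):
    outside = sum(1 for lo, hi, y in zip(lowers, uppers, actuals)
                  if y < lo or y > hi)
    return outside / len(actuals)

lowers  = [10, 12, 11, 13, 10, 12, 11, 10, 12, 11]   # lower CI bounds (toy)
uppers  = [30, 32, 31, 33, 30, 32, 31, 30, 32, 31]   # upper CI bounds (toy)
actuals = [20, 35, 25, 22, 28, 21, 26, 24, 23, 27]   # one price above its band
print(exceedance_rate(lowers, uppers, actuals))       # fraction outside
```

Comparing this empirical rate to the nominal rate (here, 1 − 0.90 = 0.10) per month is exactly the coverage check the passage performs over its 840 hourly values.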
I plot a figure like this for the forecasts derived from each data vintage. Unfortunately, it is not possible to show all these figures in this paper. However, screening over all the forecasts for the different historical data vintages reveals some notable observations. Structural models and the Bayesian VAR are well suited to forecasting during normal times. Given small or average exogenous shocks, the models give a good view of how the economy will return to steady state. In contrast, large recessions or booms and the respective turning points are impossible to forecast with these models. Figure 2 plots the forecast errors (outcome minus forecast) of all models on the horizontal axis and the corresponding realized output growth rate on the vertical axis. A clear positive relation is visible. When output growth is highly negative the models are not able to forecast such a sharp downturn and thus the forecast error is negative. The models require large exogenous shocks to capture large deviations from the balanced growth path and the steady-state inflation and interest rate. This is due to the weak internal propagation mechanism of the models. Therefore, for a given shock all models including the Bayesian VAR predict a quick return to the steady-state growth rate. While the point forecasts cannot predict a recession, the possibility that a large deviation from steady state occurs is captured by the density forecasts. Once the turning point of a recession has been reached, all models predict the economic recovery back to the balanced growth path well.
and Marrocu (2002a, 2002b) confirm this result in various applications with actual data, and show that when the nonlinear models are evaluated on interval and density forecasts, they can exhibit accuracy gains which remain concealed if the evaluation is based only on the MSFE metric. Some gains of the SETAR models have also been found, even in terms of MSFEs, when forecast accuracy is evaluated conditional upon a specific regime (Tiao and Tsay, 1994, Clements and Smith, 2001, and Boero and Marrocu, 2002a). An interesting result, common to these studies, suggests that SETAR models can produce point forecasts that are superior to those obtained from a linear model when the forecast observations belong to the regime with fewer observations.
This study compares the accuracy of USDA’s fiscal year export value forecasts for FY 2001-04 with forecasts based on trends in each commodity’s monthly exports. USDA’s forecasts are published quarterly in the Outlook for U.S. Agricultural Trade. The trend forecasts were produced with ARIMA models utilizing the monthly data available at the time each USDA forecast was published. The models were specified and estimated with the Tramo/Seats software developed by the Bank of Spain. This software was incorporated by Eurostat into a software package, Demetra, which served as this study’s interface to Tramo/Seats.
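The study's trend forecasts come from ARIMA models fitted in Tramo/Seats; as a much simpler stand-in for the same idea, the sketch below fits a linear trend to monthly export values by least squares and extrapolates it ahead. The series and step counts are invented for illustration and do not reproduce the study's models.

```python
# Least-squares linear trend fit and extrapolation (a simplified stand-in
# for the ARIMA trend forecasts described in the text).

def linear_trend_forecast(series, steps_ahead):
    n = len(series)
    xbar = (n - 1) / 2
    ybar = sum(series) / n
    slope = (sum((i - xbar) * (y - ybar) for i, y in enumerate(series))
             / sum((i - xbar) ** 2 for i in range(n)))
    intercept = ybar - slope * xbar
    return intercept + slope * (n - 1 + steps_ahead)

monthly_exports = [4.0, 4.2, 4.4, 4.6, 4.8, 5.0]   # $ billion, toy data
print(linear_trend_forecast(monthly_exports, 3))    # three months ahead
```

An ARIMA model generalizes this by differencing to remove the trend and modeling the remaining autocorrelation, which matters for monthly trade data with seasonality, but the extrapolation logic is the same.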
tuned using a coefficient or scaling factor, with the size of the coefficient dependent on the age of the forecasts with respect to the initialisation time. More details can be found in Ebisuzaki and Kalnay (1991). Perturbations are also applied to a number of surface parameters including sea surface temperature, the temperatures of the top two soil layers, surface moisture, vegetation fraction, leaf area index, soil thermal coefficient, roughness length over land, fluxes over the sea, albedo and snow depth. The perturbation strategy follows the method outlined in Bouttier et al. (2015).
(mean) and probability forecasts are studied from two candidate models: a univariate model, which expresses conflict numbers at time t as a function of conflict numbers at time t-1, and a vector autoregression model, which expresses conflict numbers at time t as dependent on conflict numbers at t-1 and wheat prices at time t-1. Point forecasting results indicate the VAR model performs better in terms of root mean squared forecast errors and forecast encompassing. Both models offer well-calibrated probability forecasts over our post-fit period. Clearly, knowledge of wheat prices in month t is helpful in forecasting conflict numbers for month t+1; a result that coheres well with ‘fit’ results found on earlier data via a machine learning algorithm.
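The forecast encompassing comparison mentioned above can be sketched as a single regression: with forecasts f1 and f2 of outcome y, estimate lambda in y − f1 = lambda·(f2 − f1) + u. A lambda near 0 says f1 encompasses f2 (f2 adds nothing); near 1 says the reverse. The series below are invented for illustration, not the conflict and wheat-price data.

```python
# One-regressor least-squares estimate of the encompassing weight lambda.

def encompassing_weight(actuals, f1, f2):
    d = [b - a for a, b in zip(f1, f2)]          # f2 - f1
    e = [y - a for y, a in zip(actuals, f1)]     # forecast error of model 1
    return sum(ei * di for ei, di in zip(e, d)) / sum(di * di for di in d)

actuals = [1.0, 2.0, 3.0, 4.0]
f1 = [0.5, 1.5, 2.5, 3.5]    # consistently 0.5 too low
f2 = [1.0, 2.0, 3.0, 4.0]    # exactly right (toy case)
print(encompassing_weight(actuals, f1, f2))      # weight on f2
```

In applied work one would also test whether the estimated lambda differs significantly from 0 or 1; here the toy case is constructed so that all weight goes to the second forecast.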
Our earlier analysis suggests that firms with better CGQ have more informative disclosure policies. Inconsistent with our predictions, we find that firms with higher CGQ are associated with greater bias (more “optimism”) in analyst forecasts and lower forecast accuracy. Nevertheless, these firms are associated with greater analyst following, suggesting that even with more analysts covering them, analysts find it difficult to predict earnings for such firms. The results for the disagreement model are inconclusive using Ordinary Least Squares estimation methods. However, using Seemingly Unrelated Regression techniques (see the robustness section for details) we find that disagreement amongst analysts is lower for firms with better CGQ. For the forecast revision model, the CGQ measure is unrelated to the volatility of forecast revisions, possibly due to the relatively small sample size for this particular model.
The 1200 UTC cycle on the 15th of August 2017 produced the first set of forecasts at ECMWF used operationally by Met Éireann. This followed the porting work carried out in 2016 and 2017 and an upgrade to the Irish RMDCN connection to ECMWF. The operational suite is monitored and running successfully under the “Framework for Member State time-critical applications - option 2” (ECMWF, 2015) using the ecFlow scheduler.
Seasonal forecasting methods in hydrology can be broadly divided into two categories: statistical methods, which use a statistical relationship between a predictor and a predictand (e.g. Jenicek et al., 2016, and references therein), and dynamical methods, which use seasonal meteorological forecasts as input to a hydrological model. More recently, mixed approaches have been investigated to take advantage of initial land surface conditions, seasonal predictions of atmospheric variables and the predictability information contained in large-scale climate features (see Robertson et al., 2013; Yuan et al., 2015, and references therein). Ensemble Streamflow Prediction (ESP; Day, 1985) is a dynamical method that is widely used to forecast low flows and reservoir inflows at long lead times (Faber and Stedinger, 2001; Nicolle et al., 2014; Demirel et al., 2015). It consists in using historical weather data as input to a hydrological model whose states were initialized for the time of the forecast. The ESP method is also used along with the reverse ESP method to determine the relative impacts of meteorological forcings and hydrological initial conditions on the skill of streamflow predictions (Wood and Lettenmaier, 2008; Shukla et al., 2013; Yossef et al., 2013). An alternative dynamical method consists in using seasonal forecasts from regional climate models (RCMs) (Wood et al., 2005). This approach yields better results when seasonal predictability is enhanced by meteorological forcings. Climate model outputs may also be more suitable for capturing the specific climate conditions at the time of the forecast, whereas ESP-based methods will be limited to the range of past observations and challenged by climate non-stationarity.
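The ESP procedure described above can be sketched schematically: a hydrological model is initialized at a common state for the forecast date and run forward once per historical year of weather forcing, producing one streamflow trace per year. The toy linear-reservoir model, parameter values, and precipitation series below are all illustrative assumptions, not any operational configuration.

```python
# Schematic Ensemble Streamflow Prediction: one model run per historical
# forcing year, all starting from the same initial storage state.

def linear_reservoir(storage, precip_series, k=0.3):
    """Toy hydrological model: each step, outflow = k * current storage."""
    flows = []
    for p in precip_series:
        storage += p            # add this period's precipitation (mm, toy)
        q = k * storage
        storage -= q
        flows.append(q)
    return flows

# Historical precipitation for three past years (invented values)
historical_forcings = {
    1995: [10, 0, 5, 20],
    1996: [2, 2, 2, 2],
    1997: [30, 10, 0, 0],
}

initial_storage = 50.0          # model state at the forecast date
ensemble = {year: linear_reservoir(initial_storage, p)
            for year, p in historical_forcings.items()}
for year, flows in ensemble.items():
    print(year, [round(q, 1) for q in flows])
```

The spread across the ensemble members then reflects meteorological uncertainty given the known initial conditions, which is precisely what the ESP-versus-reverse-ESP comparison exploits.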
Despite the preference of hydrological forecasters for consistency, one should not ignore the advantages of inconsistency. Inconsistency discourages the forecaster from relying on the latest forecast, and instead encourages seeking out alternative information in an ensemble system in addition to the forecasted hydrograph values, as well as considering previous forecasts or information from other models. Persson and Grazzini (2007) argue that a consistent forecast may lull forecasters into a false sense of confidence in the reliability of their model, which exacerbates difficulties in decision making when sudden surprising forecasts arise. In the same way, a gradually changing forecast may contribute to greater confidence than an abruptly changing one (Lashley et al., 2008), and thus the magnitude of inconsistency is of particular importance. Inconsistency can thus be an asset if it alerts forecasters to possible forecast problems and highlights alternative developments (see full details in Persson and Grazzini, 2007).