Each quarterly survey since February 1998 asks for forecasts at three future points in time, as in the example in Fig. 1: the fourth quarter (Q4) of the current year; the fourth quarter of the following year; and the corresponding quarter two years ahead. (In the early “inflation-only” surveys, only the first two questions appeared.) This structure eventually delivers nine successive forecasts of a given Q4 outcome, which form a sequence of “fixed-event” forecasts, with the date of the forecast preceding the date of the outcome by 8, 7, …, 1, 0 quarters. Given that the survey goes out early in the quarter, when no data on current-quarter inflation and GDP growth are available, we treat these as h-step-ahead forecasts with horizon h equal to 9, 8, …, 2, 1 quarters successively. In the more conventional time-series framework of constant-horizon forecasts, the third question delivers a quarterly series of nine-quarter-ahead forecasts, but the first two questions give only one observation per year at intermediate horizons: h = 4 and 8 in February, h = 3 and 7 in May, and so on. This focus on end-year targets is clearly more familiar to forecasters, since there are usually a few respondents who answer the first two questions but not the third. Despite this, in May 2006 all three questions were switched to a constant-horizon format, focusing on the corresponding quarter one, two and three years ahead.
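The horizon bookkeeping implied by this fixed-event design can be made concrete in a short sketch. This is our own illustration of the mapping described above (the function name and layout are ours, not the survey's):

```python
# Sketch (not from the paper): horizons implied by the fixed-event survey design.
# A survey sent in quarter q of year y asks about Q4 of year y, Q4 of year y+1,
# and the corresponding quarter (q) of year y+2.

def horizons(survey_year, survey_quarter):
    """Quarters ahead (h) for the three fixed-event questions, counting the
    current quarter as h = 1 because no current-quarter data exist yet."""
    q, y = survey_quarter, survey_year
    targets = [(y, 4), (y + 1, 4), (y + 2, q)]   # target (year, quarter) triples
    return [4 * (ty - y) + (tq - q) + 1 for ty, tq in targets]

# A February (Q1) survey targets horizons 4, 8 and 9; a May (Q2) survey 3, 7 and 9,
# matching the pattern in the text.
print(horizons(1998, 1))  # -> [4, 8, 9]
print(horizons(1998, 2))  # -> [3, 7, 9]
```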


However, the recent literature on testing for rationality allowing for asymmetric loss also suggests that forecast uncertainty will affect the relationship between the conditional mean and the point forecasts. If the histograms accurately reflect the individuals’ true (subjective) beliefs, then individuals will report point forecasts that under- or over-predict relative to their mean forecast if they have asymmetric loss functions, and the extent to which they do so will depend on their perceived forecast uncertainty. Our results are generally not supportive of the asymmetric loss explanation, for two reasons. Firstly, although we find a role for the forecast standard deviation in our panel regressions, we also find that last period’s actual value helps predict the difference between the mean and point forecast. The significance of the lagged actual value is inconsistent with rational behaviour by the forecaster for the class of loss functions we have allowed. Secondly, we find that point forecasts are more accurate (assuming a squared-error loss function) than the histogram means. If the respondents have asymmetric loss functions, then the expected squared error of a point forecast must exceed the expected squared error of the conditional mean. Our relative forecast accuracy findings count against the asymmetric loss explanation. The discrepancies between the point forecasts and histograms have been viewed as evidence ‘that point predictions may have a systematic, favourable bias’ (Engelberg et al. (2007)), but this appears to be unwarranted. Rather, our findings indicate that first moments derived from survey respondents’ histograms have a tendency toward pessimism relative to the outcomes. The relative accuracy of the point forecasts judged by squared-error loss would also tend to lend support to our maintained assumption that the point forecasts are estimates of the mean.
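The mechanism linking asymmetric loss, uncertainty and the point-forecast/mean gap can be made concrete with the standard "linex" loss example. This is an illustrative sketch with hypothetical numbers, not the paper's model or estimates:

```python
import numpy as np

# Illustrative sketch (not the paper's model): under the asymmetric linex loss
# L(e) = exp(a*e) - a*e - 1, with error e = outcome - forecast, a forecaster
# whose beliefs are N(mu, sigma^2) optimally reports
#   f* = mu + (a/2) * sigma^2,
# so the gap between the point forecast and the mean grows with uncertainty.

rng = np.random.default_rng(0)
mu, sigma, a = 2.0, 1.5, -0.8             # a < 0 makes over-prediction cheaper
draws = rng.normal(mu, sigma, 100_000)    # simulated subjective beliefs

def expected_linex_loss(f):
    e = draws - f
    return np.mean(np.exp(a * e) - a * e - 1)

# Minimise the simulated expected loss on a grid and compare to the formula
grid = np.linspace(mu - 3.0, mu + 3.0, 1201)
f_numeric = grid[np.argmin([expected_linex_loss(f) for f in grid])]
f_theory = mu + 0.5 * a * sigma**2        # = 1.1, i.e. 0.9 below the mean

print(f_theory, f_numeric)
```

Doubling the perceived variance doubles the gap between the reported point forecast and the mean, which is exactly why a role for the forecast standard deviation (but not the lagged actual) is what rationality under asymmetric loss would predict.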


rect measures of uncertainty that can serve as the gold standard for evaluating other ways of measuring or proxying forecast uncertainty. Chief amongst these are measures based on the dispersion of individuals’ point forecasts (‘disagreement’ or ‘consensus’): see Zarnowitz and Lambros (1987), with more recent contributions including Giordani and Söderlind (2003), Boero, Smith and Wallis (2008), D’Amico and Orphanides (2008) and Rich and Tracy (2010). In this paper we analyze the properties of the histogram-based measures of uncertainty, and in particular the relationship between the histogram measure as an ex ante measure of forecast uncertainty and ex post or realized uncertainty. Our interest in the term structure of forecast uncertainty (i.e., how forecast uncertainty changes with the horizon) is motivated in part by a recent paper by Patton and Timmermann (2011), who adopt a fixed-event framework to consider how realized forecast uncertainty varies over the forecast horizon. Their approach is silent about ex ante forecast uncertainty and its relationship to ex post uncertainty. We use actual survey forecast errors to measure ex post uncertainty, as they do, but also use the SPF histograms to measure ex ante uncertainty.
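The "term structure" idea, and the ex ante versus ex post distinction, can be illustrated with a simulated AR(1) rather than the SPF data (all numbers below are our own toy choices):

```python
import numpy as np

# Sketch (simulated AR(1), our illustration): the term structure of forecast
# uncertainty. For y_t = phi*y_{t-1} + e_t, the ex ante h-step forecast-error
# variance is sigma^2 * (1 - phi^(2h)) / (1 - phi^2); realized (ex post)
# squared errors should line up with it, horizon by horizon.

rng = np.random.default_rng(9)
phi, sigma, T, H = 0.7, 1.0, 50_000, 6

y = np.zeros(T)
for t in range(1, T):
    y[t] = phi * y[t - 1] + rng.normal(0, sigma)

ex_ante = [sigma**2 * (1 - phi ** (2 * h)) / (1 - phi**2) for h in range(1, H + 1)]
# h-step optimal forecast of y_{t+h} is phi^h * y_t; square the realized errors
ex_post = [float(np.mean((y[h:] - phi**h * y[:-h]) ** 2)) for h in range(1, H + 1)]

for h in range(H):
    print(h + 1, round(ex_ante[h], 3), round(ex_post[h], 3))
```

Both measures rise with the horizon and flatten out toward the unconditional variance; with survey data the ex ante column comes from the histograms and the ex post column from actual forecast errors.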


In the recent literature it has been suggested that symmetric loss ought to be regarded as just a special case of more general loss functions, and that the presumption of symmetric loss might be misplaced for macro-forecasters. If forecasters have asymmetric loss functions, then we would expect their point forecasts to be biased, and the bias should depend on the conditional forecast uncertainty. However, the small samples of forecasts we have by individual for a given horizon might explain why we often fail to reject the null of unbiasedness (irrespective of whether loss is really asymmetric). Our preferred test analyses the difference between the point predictions and (estimates of) the conditional means; it appears to have greater power, and is consistent with the asymmetry story in that the difference has zero mean once we allow for forecast uncertainty. However, the outcomes of the forecast encompassing tests provide telling evidence against asymmetric loss: we tend to reject the null that the mean forecasts encompass the point predictions, under squared-error loss, but not vice versa.
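A forecast-encompassing regression of the general kind referred to here can be sketched as follows. This is our illustration with simulated data, not the paper's exact test or sample:

```python
import numpy as np

# Sketch of a forecast-encompassing regression (our illustration): regress the
# outcome on two competing forecasts,
#   y_t = (1 - lam) * f1_t + lam * f2_t + e_t,
# equivalently (y_t - f1_t) = lam * (f2_t - f1_t) + e_t. lam = 0 means
# forecast 1 "encompasses" forecast 2: the second adds nothing under
# squared-error loss.

rng = np.random.default_rng(1)
n = 500
signal = rng.normal(0, 1, n)
y = signal + rng.normal(0, 0.5, n)       # outcome = signal + unforecastable noise
f1 = signal                              # optimal forecast given the signal
f2 = signal + rng.normal(0, 0.7, n)      # garbled version of the same signal

x = f2 - f1
lam = np.sum(x * (y - f1)) / np.sum(x * x)       # OLS slope (no intercept)
resid = (y - f1) - lam * x
se = np.sqrt(np.sum(resid**2) / (n - 1) / np.sum(x * x))
print(lam, lam / se)   # lam statistically near zero: f1 encompasses f2 here
```

Running the same regression with the roles of the two forecasts swapped gives the "vice versa" direction of the test in the text.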


performance in panel data models. The panel considered in this paper features a large cross-sectional dimension (N) but a short time series (T). It is modeled by a dynamic linear model with common and heterogeneous coefficients and cross-sectional heteroskedasticity. Due to the short T, traditional methods have difficulty disentangling the heterogeneous parameters from the shocks, which contaminates the estimates of the heterogeneous parameters. To tackle this problem, the methods developed in this dissertation assume that there is an underlying distribution of the heterogeneous parameters and pool the information from the whole cross-section together via this distribution. Chapter 2, coauthored with Hyungsik Roger Moon and Frank Schorfheide, constructs point forecasts using an empirical Bayes method that builds on Tweedie's formula to obtain the posterior mean of the heterogeneous coefficients under a correlated random effects distribution. We show that the risk of a predictor based on a non-parametric estimate of the Tweedie correction is asymptotically equivalent to the risk of a predictor that treats the correlated-random-effects distribution as known (ratio-optimality). Our empirical Bayes predictor performs well compared to various competitors in a Monte Carlo study. In an empirical application, we use the predictor to forecast revenues for a large panel of bank holding companies and compare forecasts that condition on actual and severely adverse macroeconomic conditions. In Chapter 3, I focus on density forecasts and use a full Bayes approach, where the distribution of the heterogeneous coefficients is modeled nonparametrically, allowing for correlation between heterogeneous parameters and initial conditions as well as individual-specific regressors. I develop a simulation-based posterior sampling algorithm specifically addressing the nonparametric density estimation of unobserved heterogeneous parameters. I prove that both the estimated common parameters and the estimated distribution of the heterogeneous parameters achieve posterior consistency, and that the density forecasts asymptotically converge to the oracle forecast. Monte Carlo simulations and an application to young firm dynamics demonstrate improvements in density forecasts relative to alternative approaches.
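The general Tweedie-type correction behind the Chapter 2 approach can be sketched in a few lines. This is a minimal illustration of the textbook formula with a kernel density estimate and simulated data, not the dissertation's estimator or its ratio-optimality construction:

```python
import numpy as np

# Minimal sketch of a Tweedie-type posterior-mean correction (our illustration
# of the general formula): with y_i = theta_i + eps_i, eps_i ~ N(0, sigma^2),
#   E[theta | y] = y + sigma^2 * d/dy log p(y),
# where p is the MARGINAL density of y, here estimated by a Gaussian kernel.

rng = np.random.default_rng(2)
n, sigma = 2000, 1.0
theta = rng.normal(3.0, 0.5, n)         # heterogeneous "true" coefficients
y = theta + rng.normal(0, sigma, n)     # noisy cross-sectional estimates

def tweedie_posterior_mean(y, sigma, bandwidth=0.4):
    """Empirical-Bayes shrinkage via a kernel estimate of the marginal score."""
    diffs = y[:, None] - y[None, :]
    k = np.exp(-0.5 * (diffs / bandwidth) ** 2)       # Gaussian kernel weights
    p = k.mean(axis=1)                                # density, up to a scale
    dp = (-diffs / bandwidth**2 * k).mean(axis=1)     # derivative, same scale
    return y + sigma**2 * dp / p                      # scale cancels in dp/p

post = tweedie_posterior_mean(y, sigma)
mse_raw = np.mean((y - theta) ** 2)
mse_eb = np.mean((post - theta) ** 2)
print(mse_raw, mse_eb)   # shrinkage should reduce risk when noise dominates
```

The appeal of the formula is that only the marginal density of the observed estimates is needed; the prior (the correlated-random-effects distribution in the dissertation) never has to be estimated directly.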


Probability forecasts are unique insofar as they not only provide a prediction of the location/class of the observation but also give a measure of the uncertainty in that prediction. Sharpness rewards models for location/class accuracy but gives no real indication of the correctness of probability estimates. Calibration, also known as reliability in meteorology or empirical validity in statistics [5], refers to the ability of a model to make good probabilistic predictions. A model is said to be well calibrated if, among the events to which the model assigns a probability of P%, the long-run proportion that actually occur turns out to be P%. Intuitively, this is a desirable characteristic of any probabilistic forecast; in fact, it could be argued that probability forecasts that are not well calibrated are of no more use than point forecasts, because the probabilistic aspect of the prediction is incorrect.
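The calibration definition above translates directly into a binning check. The sketch below uses simulated data for a forecaster that is well calibrated by construction (all names and numbers are our own):

```python
import numpy as np

# Sketch of a calibration (reliability) check on probability forecasts: group
# forecasts by their stated probability and compare each group's long-run
# event frequency to the stated value.

rng = np.random.default_rng(3)
stated = rng.choice([0.1, 0.3, 0.5, 0.7, 0.9], size=20_000)
# A well-calibrated forecaster: events really occur with the stated probability
outcomes = rng.random(stated.size) < stated

for p in [0.1, 0.3, 0.5, 0.7, 0.9]:
    freq = outcomes[stated == p].mean()
    print(f"stated {p:.0%}  observed {freq:.1%}")
```

For a miscalibrated model the observed column drifts away from the stated column, which is exactly the failure the text argues makes the probabilistic part of the forecast uninformative.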


It is unclear whether the point forecasts should be interpreted as the means, modes or even medians of the probability distributions. Engelberg et al. (2006) calculate non-parametric bounds on these measures of central tendency for the histograms. We focus on the mean, as their results are similar across the three measures. Before calculating bounds, we briefly consider other approaches that have been adopted to calculate means from histograms. One approach in the literature is to assume that the probability mass is uniform within a bin (see e.g., Diebold et al. (1999b), who make this assumption in the context of calculating probability integral transforms). Another is to fit normal distributions to the histograms (see Giordani and Söderlind (2003, p. 1044)). If the distribution underlying the histogram is approximately ‘bell-shaped’ then the uniformity assumption will tend to overstate the dispersion of the distribution, because there will be more mass close to the mean. This problem will be accentuated when there is a large difference in the probability mass attached to adjacent bins, where it might be thought desirable to attach higher probabilities to points near the boundary with the high-probability bin. In the same spirit of fitting a parametric distribution to the histogram, Engelberg et al. (2006) argue in favour of the unimodal generalized beta distribution. Using the assumption of uniform mass within a bin, and approximating the histogram by a normal density, results in the correlations between the annual growth point forecasts and the histogram means recorded in Table 1. The crude correlations in columns 3 and 4 are all high, between 0.92 and 0.96, and the results indicate little difference between the two methods of calculating histogram means that we consider. Nevertheless, it is
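The two calculations discussed above can be sketched on toy numbers (these are our own hypothetical bins and probabilities, not SPF data, and the normal fit is a simplified quantile-matching version):

```python
import numpy as np
from statistics import NormalDist

# Sketch of the two approaches in the text: (i) assume the probability mass is
# uniform within each bin; (ii) fit a normal distribution by matching its CDF
# to the cumulative probabilities at the interior bin edges.

edges = np.array([0.0, 1.0, 2.0, 3.0, 4.0])   # hypothetical bin boundaries (%)
probs = np.array([0.05, 0.45, 0.45, 0.05])    # reported bin probabilities

# (i) uniform mass within bins: midpoint mean plus within-bin variance w^2/12
mids = 0.5 * (edges[:-1] + edges[1:])
mean_u = float(np.sum(probs * mids))
var_u = float(np.sum(probs * (np.diff(edges) ** 2 / 12 + (mids - mean_u) ** 2)))

# (ii) normal fit: at each interior edge x_i, Phi((x_i - mu)/sd) should equal
# the cumulative probability c_i, i.e. x_i = mu + sd * Phi^{-1}(c_i)
cum = np.cumsum(probs)[:-1]
z = np.array([NormalDist().inv_cdf(c) for c in cum])
sd_n, mean_n = np.polyfit(z, edges[1:-1], 1)   # slope = sd, intercept = mu

print(mean_u, var_u)       # uniform-bin mean and variance
print(mean_n, sd_n ** 2)   # fitted-normal mean and (smaller) variance
```

For this bell-shaped example the two means agree but the uniform-bin variance exceeds the fitted-normal variance, which is the overstatement of dispersion the paragraph describes.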


By using the model averaging approach, one can obtain better accuracy than any single method alone, when the forecasts combined use different methods that capture different information, different specifications or different assumptions. Because the underlying data generating process is often unknown, combined forecasts are more robust to model mis-specification and are more likely to produce accurate point forecasts. In the demographic literature, there has been little interest in combining forecasts from different models. Nonetheless, some notable exceptions include Smith and Shahidullah (1995), Ahlburg (1998, 2001) and Sanderson (1998), whose pioneering work, particularly in the context of census tract forecasting, has done much to awaken others, including the present author. The contribution of this article is to apply the notion of model averaging to the problem of forecasting age-specific life expectancies.
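The claim that combining forecasts carrying different information beats either forecast alone is easy to demonstrate on simulated data (the sketch below is our own illustration with an equal-weight average, not the article's averaging scheme):

```python
import numpy as np

# Sketch (simulated data): an equal-weight combination of two forecasts that
# capture different components of the outcome can beat either forecast alone
# under squared-error loss.

rng = np.random.default_rng(4)
n = 5000
a, b = rng.normal(0, 1, n), rng.normal(0, 1, n)
y = a + b                              # outcome driven by two components
f1 = a + rng.normal(0, 0.3, n)         # model 1 sees component a (noisily)
f2 = b + rng.normal(0, 0.3, n)         # model 2 sees component b (noisily)
combo = 0.5 * (f1 + f2)                # equal-weight model average

def rmse(f):
    return np.sqrt(np.mean((y - f) ** 2))

print(rmse(f1), rmse(f2), rmse(combo))   # the combination has the lowest RMSE
```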


Apart from point forecasts, we investigated the ability of the models to provide interval forecasts. For all considered models, interval forecasts were determined analytically; for details on the calculation of the conditional prediction error variance and interval forecasts we refer to [5][6]. Afterwards, following [2], we evaluated the quality of the interval forecasts by comparing the nominal coverage of the models to the true coverage. Thus, for each of the models we calculated confidence intervals (CIs) and determined the actual percentage of exceedances of the 50%, 90% and 99% two-sided day-ahead CIs of the models by the actual market clearing price (MCP). If the model-implied interval forecasts were accurate, then the percentage of exceedances should be approximately 50%, 10% and 1%, respectively. Note that for each “month”, 840 hourly values were determined and compared to the MCP.
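The exceedance calculation can be sketched on simulated prices (the numbers and the Gaussian forecast distribution below are our own stand-ins, not the study's models or MCP data):

```python
import numpy as np
from statistics import NormalDist

# Sketch of the exceedance check described above: count how often outcomes
# fall outside the central 50%, 90% and 99% two-sided intervals; for a
# well-specified model the rates should be roughly 50%, 10% and 1%.

rng = np.random.default_rng(5)
n = 840                                   # one "month" of hourly values
mean_f, sd_f = 30.0, 5.0                  # hypothetical forecast distribution
actual = rng.normal(mean_f, sd_f, n)      # here the model happens to be right

exceedance = {}
for nominal in [0.50, 0.90, 0.99]:
    z = NormalDist().inv_cdf(0.5 + nominal / 2)   # central-interval quantile
    lo, hi = mean_f - z * sd_f, mean_f + z * sd_f
    exceedance[nominal] = float(np.mean((actual < lo) | (actual > hi)))

print(exceedance)   # e.g. close to {0.5: 0.5, 0.9: 0.1, 0.99: 0.01}
```

A model whose intervals are too narrow shows exceedance rates well above the nominal values, and vice versa.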

The forecasting sample covers the period 03/01/00–10/07/02; the models are specified and estimated over the first estimation period, 03/01/1990–30/12/1999, and the first set of 1- to 5-step-ahead forecasts (h = 1, 2, …, 5) computed. The models are then estimated recursively, keeping the same specification but extending the sample by one observation each time. In this way 638 point forecasts are obtained for each forecast horizon. These forecasts can be considered genuine forecasts, as in the specification stage we completely ignore the information embodied in the forecasting period. The computation of multi-step-ahead forecasts from nonlinear models involves the solution of complex analytical calculations and the use of numerical integration techniques, or alternatively, the use of simulation methods. In this study the forecasts are obtained by applying the Monte Carlo method with regime-specific error variances, so that each point forecast is obtained as the average over 500 replications (see Clements and Smith, 1997, 1999).
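The Monte Carlo multi-step method can be sketched with a toy two-regime threshold AR(1); the model, coefficients and threshold below are our own illustration of the general approach, not the study's specification:

```python
import numpy as np

# Sketch of simulation-based multi-step forecasting from a nonlinear model:
# simulate many future paths from a toy two-regime threshold AR(1) with
# regime-specific error variances, and average them at each horizon.

rng = np.random.default_rng(6)

def simulate_path(y0, horizon):
    """One simulated future path; the shock variance depends on the regime."""
    y, path = y0, []
    for _ in range(horizon):
        if y <= 0.0:
            y = 0.8 * y + rng.normal(0.0, 1.0)   # lower regime: volatile
        else:
            y = 0.4 * y + rng.normal(0.0, 0.5)   # upper regime: calmer
        path.append(y)
    return path

def point_forecasts(y0, horizon=5, replications=500):
    """h = 1..horizon point forecasts as averages over simulated paths."""
    paths = np.array([simulate_path(y0, horizon) for _ in range(replications)])
    return paths.mean(axis=0)

f = point_forecasts(y0=1.0)
print(np.round(f, 2))   # one point forecast per horizon h = 1, ..., 5
```

Simulation is needed because, beyond one step ahead, the conditional mean of a nonlinear model involves integrating over future regime switches, which rarely has a closed form.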


I plot a figure like this for the forecasts derived from each data vintage. Unfortunately, it is not possible to show all these figures in this paper. However, screening over all the forecasts for the different historical data vintages reveals some notable observations. Structural models and the Bayesian VAR are suited to forecasting during normal times. Given small or average exogenous shocks, the models give a good view of how the economy will return to steady state. In contrast, large recessions or booms and the respective turning points are impossible to forecast with these models. Figure 2 plots the forecast errors (outcome minus forecast) of all models on the horizontal axis and the corresponding realized output growth rate on the vertical axis. A clear positive relation is visible. When output growth is highly negative, the models are not able to forecast such a sharp downturn and thus the forecast error is negative. The models require large exogenous shocks to capture large deviations from the balanced growth path and the steady-state inflation and interest rate. This is due to the weak internal propagation mechanism of the models. Therefore, for a given shock, all models including the Bayesian VAR predict a quick return to the steady-state growth rate. While the point forecasts cannot predict a recession, the possibility that a large deviation from steady state occurs is captured by the density forecasts. Once the turning point of a recession has been reached, all models predict the economic recovery back to the balanced growth path well.


In March 2018 we held a workshop to exchange knowledge and enhance communication between users and developers of NWP at Met Éireann. Since then we have held monthly meetings which involve discussions on successful and unsuccessful HARMONIE-AROME forecasts, known issues in the model, model evaluation and verification, what is required to make a model operational, physical parametrizations, the IREPS ensemble and many other topics. Feedback from NWP users (i.e. forecasters at Met Éireann) has been very positive and overall the meetings have vastly improved the flow of information and feedback between the two groups.


We plan to upgrade from our current operational cycle 37h1.1 to cycle 40h1.2 of HARMONIE-AROME. As part of this upgrade, we intend to increase the domain size and to implement upper-air data assimilation (3D-Var) using conventional observations with a 3-hour cycle. Extensive testing of 40h1.2 was carried out in 2017 and results were compared with past operational cycle 37h1.1 forecasts. Although cycle 40h1.2 shows a general improvement in verification scores, especially for 10 m wind speeds, there was a significant degradation in the 2 m temperature forecasts and this has delayed the implementation of our HARMONIE-AROME upgrade. We have observed a consistent cold bias of up to 1 °C using the default HARMONIE-AROME settings; see Fig. 2. A number of changes to the physics parametrizations have been investigated; see Fig. 3. Switching off HARATU showed an improvement in the temperature bias, but this degraded the quality of 10 m wind forecasts. We have also experimented with lowering the default value of the heat capacity of vegetation in SURFEX (PCV in the code), and this has improved the temperature biases. However, the physical basis for any such changes needs to be investigated further. This work will continue in 2018.

Table 5 ranks the scores in ascending order, and negative scores are more common than positive scores. The scores range from 8 to −9. Soybeans, soybean meal, and cotton have the best scores, while a number of high-value products and rice have the worst. Interestingly, the accuracy of USDA’s estimates for Horticultural products in total is lower than for virtually all the forecasts of the components of the total. This isn’t the case for any of the other aggregates: Grains and feeds, Oilseeds and products, and Livestock and products.

(mean) and probability forecasts are studied from two candidate models: a univariate model, which expresses conflict numbers at time t as a function of conflict numbers at time t−1, and a vector autoregression model, which expresses conflict numbers at time t as dependent on conflict numbers at t−1 and wheat prices at time t−1. Point forecasting results indicate the VAR model performs better in terms of root mean squared forecast errors and forecast encompassing. Both models offer well-calibrated probability forecasts over our post-fit period. Clearly, knowledge of wheat prices in month t is helpful in forecasting conflict numbers for month t+1; a result that coheres well with ‘fit’ results found on earlier data via a machine learning algorithm.
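The univariate-versus-VAR comparison can be sketched on simulated series; the coefficients, sample split and data below are our own hypothetical choices, not the study's estimates:

```python
import numpy as np

# Sketch with simulated series: compare a univariate AR(1) for conflict numbers
# with an equation that also conditions on last month's wheat price, judging
# both by out-of-sample root mean squared forecast error (RMSFE).

rng = np.random.default_rng(7)
T, split = 2000, 1500
wheat = np.zeros(T)
conflict = np.zeros(T)
for t in range(1, T):
    wheat[t] = 0.7 * wheat[t - 1] + rng.normal(0, 1)
    conflict[t] = 0.5 * conflict[t - 1] + 0.6 * wheat[t - 1] + rng.normal(0, 1)

y_fit = conflict[1:split]
X_uni = np.column_stack([np.ones(split - 1), conflict[: split - 1]])
X_var = np.column_stack([X_uni, wheat[: split - 1]])
b_uni = np.linalg.lstsq(X_uni, y_fit, rcond=None)[0]
b_var = np.linalg.lstsq(X_var, y_fit, rcond=None)[0]

idx = np.arange(split, T)                     # out-of-sample months
f_uni = b_uni[0] + b_uni[1] * conflict[idx - 1]
f_var = b_var[0] + b_var[1] * conflict[idx - 1] + b_var[2] * wheat[idx - 1]

def rmsfe(f):
    return np.sqrt(np.mean((conflict[idx] - f) ** 2))

print(rmsfe(f_uni), rmsfe(f_var))   # the wheat-augmented model should be lower
```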


Abstract. Meteorological centres make sustained efforts to provide seasonal forecasts that are increasingly skilful, which has the potential to benefit streamflow forecasting. Seasonal streamflow forecasts can help to take anticipatory measures for a range of applications, such as water supply or hydropower reservoir operation and drought risk management. This study assesses the skill of seasonal precipitation and streamflow forecasts in France to provide insights into the way bias correcting precipitation forecasts can improve the skill of streamflow forecasts at extended lead times. We apply eight variants of bias correction approaches to the precipitation forecasts prior to generating the streamflow forecasts. The approaches are based on the linear scaling and the distribution mapping methods. A daily hydrological model is applied at the catchment scale to transform precipitation into streamflow. We then evaluate the skill of raw (without bias correction) and bias-corrected precipitation and streamflow ensemble forecasts in 16 catchments in France. The skill of the ensemble forecasts is assessed in reliability, sharpness, accuracy and overall performance. A reference prediction system, based on historical observed precipitation and catchment initial conditions at the time of forecast (i.e. the ESP method), is used as benchmark in the computation of the skill. The results show that, in most catchments, raw seasonal precipitation and streamflow forecasts are often more skilful than the conventional ESP method in terms of sharpness. However, they are not significantly better in terms of reliability. Forecast skill is generally improved when applying bias correction. Two bias correction methods show the best performance for the studied catchments, each method being
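The simplest of the bias-correction families named in the abstract, linear scaling, can be sketched as follows (toy gamma-distributed data and a multiplicative factor; the study's eight variants differ in detail):

```python
import numpy as np

# Sketch of linear scaling bias correction: forecast precipitation is
# multiplied by a factor so that its mean over a calibration period matches
# the observed mean. Distribution mapping would instead match the full
# empirical CDF, quantile by quantile.

rng = np.random.default_rng(8)
obs = rng.gamma(shape=2.0, scale=1.5, size=600)          # "observed" precip
fcst = 1.3 * rng.gamma(shape=2.0, scale=1.5, size=600)   # wet-biased forecast

factor = obs.mean() / fcst.mean()    # multiplicative linear-scaling factor
corrected = factor * fcst

print(fcst.mean(), corrected.mean(), obs.mean())
```

By construction the corrected forecasts reproduce the observed mean over the calibration period; whether that improves streamflow skill downstream is what the study evaluates.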


It is important for expert users to find robust ways to identify inconsistency and express it numerically, in order to aid their decision making, clarify system limitations and assess the performance of different forecast systems. Examples of evaluation measures include regression, root mean squared error and bias-based approaches (Nordhaus, 1987; Clements, 1997; Clements and Taylor, 2001; Mills and Pepper, 1999; Bakhshi et al., 2005) and pseudo-maximum likelihood estimators (Clements and Taylor, 2001). In weather forecasting, a latitude-weighted root mean squared error (Zsoter et al., 2009), the Ruth–Glahn forecast convergence score (Ruth et al., 2009) and the Convergence Index (Ehret, 2010) have also been used. Pappenberger et al. (2011b) have applied the latter to probabilistic hydro-meteorological forecasts. The number of different ways in which it is possible to quantify inconsistency introduces its own level of uncertainty to the evaluation, but it remains essential to quantify it in some (or many) numerical ways and to understand these relationships (similar to other skill scores; see Cloke and Pappenberger, 2008).


It is impossible to find the perfect model for the prediction of economic indicators because of uncertainty. Uncertainty plays an important role in many areas of economic behaviour, which is why it is also inherent to forecasting (Boero et al., 2008; Laurent & Kozluk, 2012). Bratu (2012) states some important strategies that can be used in practice to improve the accuracy of forecasts. One of these strategies is building combined forecasts in different variants: predictions based on linear combinations whose coefficients are determined using the previous forecasts, predictions based on the correlation matrix, and the use of regression models for large databases of predicted and effective values. On the other hand, one can apply the historical errors method, which uses the value of an accuracy indicator calculated for a previous period.

A number of studies have investigated the performance of ensemble forecasting systems, e.g. Alfieri et al. (2014) for the European Flood Awareness System, and Bennett et al. (2014), Olsson and Lindström (2008), Renner et al. (2009) and Roulin and Vannitsem (2005), for several catchments varying in size and other characteristics. These studies demonstrated a deterioration of performance with increasing lead time. However, most studies focused either on flood forecasts (e.g. Alfieri et al., 2014; Bürger et al., 2009; Komma et al., 2007; Olsson and Lindström, 2008; Roulin and Vannitsem, 2005; Thielen et al., 2009; Zappa et al., 2011) or low-streamflow forecasts (Demirel et al., 2013a; Fundel et al., 2013). Studies on non-specific ensemble streamflow-forecasting systems (Bennett et al., 2014; Demargne et al., 2010; Renner et al., 2009; Verkade et al., 2013) did not evaluate the performance for different streamflow categories (i.e. for low-streamflow and high-streamflow events). Moreover, previous studies did not assess the effects of runoff processes, such as snowmelt and extreme rainfall events, on the performance of ensemble forecasts. The only study we found that bears on this is that by Roulin and Vannitsem (2005), who concluded that their high-streamflow-forecasting system is more skilful for the winter period than for the summer period.


Our earlier analysis suggests that firms with better CGQ have more informative disclosure policies. Inconsistent with our predictions, we find that firms with higher CGQ are associated with greater bias (more “optimism”) in analyst forecasts and lower forecast accuracy. Nevertheless, these firms are associated with greater analyst following, suggesting that even with greater analyst following, analysts find it difficult to predict earnings for such firms. The results for the disagreement model are inconclusive using Ordinary Least Squares estimation methods. However, using Seemingly Unrelated Regression techniques (see the robustness section for details), we find that the disagreement amongst analysts is lower for firms with better CGQ. For the forecast revision model, the CGQ measure is unrelated to the volatility of forecast revisions, possibly due to the relatively small sample size for this particular model.
