EUROPEAN
ECONOMY
EUROPEAN COMMISSION
DIRECTORATE-GENERAL FOR ECONOMIC AND FINANCIAL AFFAIRS
ECONOMIC PAPERS
ISSN 1725-3187
http://europa.eu.int/comm/economy_finance
N° 219 December 2004
A sorted leading indicators dynamic (SLID) factor model for short-run euro-area GDP forecasting
by
Daniel Grenouilleau
Economic Papers
are written by the Staff of the Directorate-General for Economic
and Financial Affairs, or by experts working in association with them. The “Papers”
are intended to increase awareness of the technical work being done by the staff and
to seek comments and suggestions for further analyses. Views expressed represent
exclusively the positions of the author and do not necessarily correspond to those of
the European Commission. Comments and enquiries should be addressed to the:
European Commission
Directorate-General for Economic and Financial Affairs
Publications
BU1 - -1/180
B - 1049 Brussels, Belgium
ECFIN/REP
50402/04-EN
ISBN
92-894-8366-0
KC-AI-04-219-EN-C
©European Communities, 2004
TABLE OF CONTENTS
ABSTRACT
Non technical summary
1. The model
1.1. INTUITIVE APPROACH
1.1.1. Leading indicators equations revisited
1.1.2. Advantages and shortcomings of business cycle factor models
1.1.3. Introduction to a sorted leading indicators dynamic factor model to forecast euro-area GDP
1.2. FORMAL DESCRIPTION
1.2.1. The dynamic factor model
1.2.2. Forecast computation methods
1.3. FACTORS SELECTION
1.4. COMMON FACTORS AND FORECAST INTERPRETATION
1.4.1. Factors reading
1.4.2. Forecasts interpretation
2. Data selection and processing
2.1. DESCRIPTION OF THE DATABASE
2.2. SUMMARY OF THE DATA TREATMENT PROCEDURE
2.3. THE ISSUES OF DATA LAGGING AND SORTING
2.4. THE ISSUE OF THE DATA SET COMPOSITION
2.5. THE ISSUE OF THE ADEQUACY OF GDP AS AN INDICATOR OF THE UNOBSERVED BUSINESS CYCLE
3. Forecast performances
3.1. THE REAL-TIME OUT-OF-SAMPLE DESIGN
3.2. BASELINE AND BENCHMARK MODELS, TESTS TO BE CONDUCTED
3.2.1. Calibrating the SLID factor model
3.2.2. Benchmark forecast models
3.2.3. Some statistics of forecast accuracy
3.3. OUT-OF-SAMPLE RESULTS
3.3.1. Factors and forecasts interpretation
3.3.2. Comparison between various computation methods
3.3.3. Comparison between factors combinations
3.3.4. Comparison with benchmark models
3.4. A REMARK WITH A VIEW TO PUBLISHING THE RESULTS
4. A few remarks about model robustness
4.1. FACTOR SELECTION
4.2. SIGNAL/NOISE RATIO
4.3. TIME WINDOW WIDTH
4.4. DATABASE COMPOSITION
4.5. MAXIMUM LAG OF THE SERIES IN THE DATA SET
5. Conclusion
6. References
7. ANNEX
7.1. BAYES INFORMATION CRITERION (BIC)
7.2. OUT-OF-SAMPLE FORECAST ACCURACY STATISTICS
7.2.1. SEM for Q and Q+1 forecasts, EM algorithm for Q+2 and Q+3
7.2.2. SEM for Q, Q+1 and Q+2 forecasts, EM algorithm for Q+3
A SORTED LEADING INDICATORS DYNAMIC (SLID) FACTOR MODEL FOR SHORT-RUN EURO-AREA GDP FORECASTING #
Daniel GRENOUILLEAU∗
ABSTRACT
This paper introduces a statistical model for short-term GDP forecasting based on approximate dynamic factors (Stock and Watson methodology), extracted from a very large number of leading indicators sorted according to their correlations at various lags to euro-area GDP (Sorted Leading Indicators Dynamic factor model).
A very large data set of 2000 series covering a variety of leading indicators for all countries of the euro area and the euro area itself was built for the model in order to meet consistency requirements. All series are introduced at different lags, thus several times, in the data set in order to extract dynamic factors (stacked version of the Stock and Watson model). Factor estimation is further enhanced by data filtering: indicators which contain little information about the business cycle at a given lag are dropped from the data set at this lag. The criterion for the information content of the series is their cross-correlation with GDP. Finally, forecasts are not based on a regression using fitted factors, but on a more parsimonious framework: the recursive projection of GDP on the first few factors by descending order of eigenvalues (augmented EM algorithm), since it is assumed that only the first eigenvectors, corresponding to the larger shares of the data variance, should account for common shocks and provide consistent estimates of the factors.
There is a broad consensus that “true” (latent) factors can be consistently estimated with an approximate factor model provided that the cross-section dimension of the data is far greater than the number of observations. Our real-time out-of-sample experiment showed that it is indeed efficient for short-term forecasting to use a very large number of series and the same series introduced at various lags. The short-term forecast accuracy is nevertheless enhanced by data filtering, as it seems that an appropriate level of filtering removes subsets of series with very low correlation to the reference variable but high idiosyncratic cross-correlation that might bias factor estimation with an approximate factor model. We assume here that the greater the accuracy of factor-based forecasts is, the closer the estimated factors are to the “true” factors. The optimal threshold of filtering was estimated empirically and seems data set specific.
The augmented EM algorithm, based on the first three factors by descending order of eigenvalues, produced the best forecast performances. Three factors (for thousands of input series) were retained according to an adapted Bayes Information Criterion. According to the fifteen-quarter out-of-sample RMSE or Diebold-Mariano statistics, the Sorted Leading Indicators Dynamic (SLID) factor model with an augmented EM algorithm outperforms, where available, naïve or autoregressive (AR) models, CEPR Eurocoin, the DG ECFIN GDP indicator and OECD indicator equations for coincident, one-quarter-ahead and two-quarter-ahead forecasts.
# Revised version 15/05/2006. A work-in-progress version of this paper (Grenouilleau [2003]) was presented at the Eurostat and DG ECFIN 4th Colloquium on Modern Tools for Business Cycle Analysis, Luxembourg, 20-22/10/2003. A previous version was presented at the Warsaw CIRET conference, 15-17/09/2004.
∗Acknowledgements: the author thanks M. McCarthy, F. Keereman, G. Lejeune, C. Viguié and C. Gayer for useful discussion. Data collection benefited from the advice of country desk officers in Directorate B of DG ECFIN. Special thanks to D. Koszerek and M. Diron (for comments on previous work-in-progress) and C. Viguié (for seasonal adjustment programs). Shortcomings and errors are the responsibility of the author alone.
D. Grenouilleau is an economist at the European Commission, Directorate-General for Economic and Financial Affairs (at the time of drafting in the Unit Forecasts and economic situation). Please send comments to [email protected].
NON TECHNICAL SUMMARY
While a few leading indicators can be easily fitted to “explain” an endogenous variable such as GDP, the forecast performance of leading indicators equations deteriorates significantly outside the sample. Such equations are not very robust to the time sample that is used. First, leading indicators can reflect well only specific shocks: if shocks of a different nature occur in the subsequent quarters, the indicators have to be reselected. Secondly, the fit of OLS or VAR equations based on a relatively small sample is unlikely to provide a robust tool for the selection of the relevant regressors, since leading indicators are generally noisy and are often strongly correlated with each other.
As the ability of leading indicators to provide relevant short-term information about low-predictability time series remains unchallenged, the difficulty is rooted in the decision rule for the selection of the relevant indicators and in the econometric tools used to produce forecasts. A factor model can provide a new response to those issues. In the Sorted Leading Indicators Dynamic (SLID) factor model, there can be almost no ex ante selection from among all potential leading indicators. It is possible to distil a signal about the business cycle in the near future from the pattern common to all leading economic series, cleansed of noise and of the patterns specific to individual series. Short-term forecasts about the business cycle can be directly derived from an efficiently extracted signal.
Let us assume that all coincident or leading indicators have two components: a common component corresponding to the general economic situation (business cycle) and an idiosyncratic component that is specific to each indicator. Following the methodology of Stock and Watson [1998], it is possible to extract from the data common factors that summarise the unobserved component common to all series. This approach allows us to distil into a few factors the common signal pertaining to the business cycle that each series carries, and to leave out the idiosyncratic information or noise of each series. The business cycle common signal can be a good proxy for euro area GDP, if factor extraction is performed on a variety of series from countries of the euro area. In this framework, factor estimation using principal component analysis yields consistent estimates of the “true” (latent) common factors driving the business cycle, provided that the number of leading indicators is very large. In order to meet those requirements, about two thousand time series covering various data from the twelve countries of the euro area were collected. The selection criterion was judgmental: the series are assumed to contain information about the current and/or future economic situation and are, therefore, considered to be possible leading indicators for the euro-area economy.
In contrast to conventional forecast models based on regression on a few leading indicators, series are not selected according to their contribution to goodness of fit for a given time period. Instead, all relevant series can potentially be included in the model. Results should therefore be more robust over time even if economic shocks of a different nature occur. Series are introduced in the data set at several leads and lags, so that no particular assumption is made about the “true” lead of the series vis-à-vis the reference variable (GDP). The time span of the sample can be limited to 7 years, as this method is in principle asymptotically consistent with a large number of series, whereas a large number of temporal observations is not needed.
The SLID factor model also exhibits a few additional specific features vis-à-vis standard factor models. Dynamics are introduced ex ante with the use of lagged series and not ex post. In the latter case, the extraction of factors is static and dynamics are obtained with the use of lagged factors in an appended VAR system. With the extraction of dynamic factors, the computation of GDP forecasts can be based on a simple recursive coincident projection of GDP on the first few factors. GDP forecasts can be consistently enhanced at each recursion, with the extraction of factors based on the estimate previously obtained. The number of factors is set according to an adapted version of the Bayes Information Criterion (BIC). Factors are chosen from principal components by descending order of eigenvalues, as we assume that, if the factor estimation is consistent, factors should account for the largest share of common variance in the data and, hence, correspond to the highest eigenvalues. The last original feature is that the series are sorted ex ante according to their cross-correlations with euro-area GDP at various lags. A leading indicator with low correlation with GDP at a given lag is removed from the final subset. Conversely, some indicators can be introduced several times in the subset at different lags, if the cross-correlation at each corresponding lag is high enough. It is argued that factor estimation is enhanced when performed on the filtered subset, since an appropriate level of filtering removes series with very low correlation with the reference variable but high idiosyncratic cross-correlation that might bias factor estimation with an approximate factor model. The optimal threshold of filtering was estimated empirically and seems specific to each data set.
An empirical simulation using about 2000 leading indicators, and taking into account their real-time availability and correlation, shows that GDP forecast accuracy is improved by removing from the data set series whose cross-correlation with the reference variable is below a relatively low threshold. Anecdotal evidence that filtering allows a more consistent estimation of factors may also be found in the fact that the first factor's correlation with GDP is considerably improved with such mild filtering, whereas, with no filtering, the first factor is more correlated with a specific type of data than with the business cycle. The results also show that, in this framework, factor estimates obtained through principal component analysis can be used directly to perform short-term forecasts with a simple scheme based on recursive projection of GDP on the first three factors. The influence of factors on the forecasts and their economic interpretations are easier to explain qualitatively than with standard factor models. The SLID factor model outperforms other benchmark models: naïve or autoregressive (AR) forecast models, CEPR Eurocoin, the DG ECFIN GDP indicator and OECD indicator equations for coincident, one-quarter-ahead and two-quarter-ahead forecasts, where available.
GDP is one possible measure amongst others of business cycle conditions. This series is itself an imperfect estimate of business cycle conditions (due for example to seasonal adjustment or measurement errors). The discrepancy between measured GDP and the factor model coincident forecast should therefore not be viewed as entirely problematic, since the former might not necessarily be a better reflection of business cycle conditions than the latter.
1. THE MODEL
In short-term economic analysis, a variety of time series about output or prices are used to characterise the business cycle. Many such series are available. A standard practice is to use some of these series that lead the business cycle, or are available in a more timely manner than GDP, to estimate coincident GDP or forecast GDP over a short horizon with few-variable systems estimated using single equation OLS or VAR.
The approximate factor model introduced in this paper can be seen as a mere generalisation of the leading indicators equation technique to a very large number of leading indicators in a non parametric framework. Given the very large number of potential leading indicators that can be selected judgementally, standard regression methods cannot be used. Instead, principal components analysis allows the extraction of the common patterns in all series, which are used later on to estimate coincident and future values of GDP.
In more technical terms, the patterns of about two thousand economic time series, including GDP, are assumed to reflect unobserved common factors (which underlie the business cycle). These unobserved dynamic factors can be consistently estimated using principal component analysis. Coincident, one-quarter-ahead and two-quarter-ahead quarter-on-quarter GDP growth can be estimated from the common factors through several methods of projection.
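As an illustration of the estimation step just described, the following minimal sketch extracts the first few principal components from a panel of standardised series. It is not the paper's implementation: the function name `extract_factors`, the normalisation of the factors and the random placeholder panel are assumptions made for the example only.

```python
import numpy as np

def extract_factors(panel, n_factors):
    """Estimate common factors from a T x N panel by principal components."""
    # Standardise each series so that no indicator dominates through its scale.
    z = (panel - panel.mean(axis=0)) / panel.std(axis=0)
    t_obs, n_series = z.shape
    # With N much larger than T it is cheaper to diagonalise the T x T matrix.
    eigvals, eigvecs = np.linalg.eigh(z @ z.T / n_series)
    # Keep the components associated with the largest eigenvalues.
    order = np.argsort(eigvals)[::-1][:n_factors]
    factors = eigvecs[:, order] * np.sqrt(t_obs)   # normalised so F'F / T = I
    loadings = z.T @ factors / t_obs               # N x n_factors loadings
    return factors, loadings, eigvals[order]

# Hypothetical placeholder data: 28 quarters (7 years) and 500 series.
rng = np.random.default_rng(0)
panel = rng.standard_normal((28, 500))
factors, loadings, eigvals = extract_factors(panel, 3)
```

Working with the T x T matrix rather than the N x N covariance matrix is a computational convenience when the cross-section is far larger than the number of observations; both yield the same principal components.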
This part of the paper is organised as follows: first, an intuitive approach explains the differences between leading indicator equations and factor models and the underlying assumptions behind the use of a factor model for euro-area GDP. Secondly, the SLID factor model is described in more technical terms, together with the algorithm used to generate the forecasts.
1.1. INTUITIVE APPROACH
1.1.1. Leading indicators equations revisited
The technique of leading indicators equations1, which is grounded on the extraction of the leading information that selected indicators contain vis-à-vis GDP, produces forecasts which are not very reliable, even in the short run2. The primary reason lies in the fact that, even though leading indicators do contain information about the near future, this information is mixed with a lot of noise. Typically, OLS equations or VAR systems based on leading indicators are unstable as they include not only information about previous specific shocks but also a great deal of noise. A second reason is that leading indicators in low-dimension systems necessarily provide shock-specific information.
The forecaster is faced with serious methodological and econometric problems related to model uncertainty. Stock and Watson [1998] recall that, “with a very large number of predictor variables, it is computationally infeasible to enumerate and to estimate all possible models up to a given order”. What, then, is the correct procedure for picking the best indicators available? Selecting indicators based on their statistical contributions to explaining variance in the dependent variable is a dead end, as an infinite number of indicator combinations produces an infinite range of fits (including a perfect fit). If they are selected according to out-of-sample forecast performances, there is a risk that the equation would be valid in the future only if shocks of the same nature occur, because leading indicators are shock-specific. If they are selected according to economic judgement, forecast performances will be very poor because of the noise in the data. Last but not least, such noisy and correlated data involve a high risk of spurious fit and low robustness, as the number of indicators is generally borderline compared to the number of observations. A subtle trade-off can be found, but even the best-performing models3 based on this methodology are prone to produce forecast ranges with low confidence levels at most forecast horizons.
1 Sometimes called “bridge” equations.
2 Banerjee et al. [2003] report e.g. that different indicators need to be used in each period in order for indicators
According to common knowledge, leading indicators are supposed to contain information that should be useful to forecast the near future; the issue is how to extract this information. Most difficulties can be (partially) solved with factor models insofar as they cleanse the input data of noise, provide a more robust econometric method for very large data sets, and circumvent the dead end of model selection.
1.1.2. Advantages and shortcomings of business cycle factor models
Given a set of business cycle indicators, it is customary to distinguish between two components in each indicator. The first component is common to all series: it reflects the business cycle. The other component is idiosyncratic or specific to each particular series. Short-term forecasters are only interested in the information provided about the business cycle, not in the specific (or idiosyncratic) patterns. Common factors, which are “a small set of driving variables responsible for variation in macro time series”4, account for the first components of the series. They can account for the business cycle of the euro area, assuming commonality exists between the various cycles of the member countries of the euro area.
◊
A pervasive framework for business cycle modelling
Factor analysis has recently become a popular subject in econometrics, judging by the rapidly increasing number of articles in the literature over the last five years. It is not exactly new, though, either in econometric research or in empirical forecasting models. Sargent and Sims [1977] and Geweke [1977] introduced a dynamic index model in which observed time series are supposed to be generated by a small number of indexes with distributed lags. Factors are estimated through maximum likelihood methods. Stock and Watson [1988, 1989] developed coincident and leading single indexes of economic indicators, computed with a Kalman filter, to summarise the state of the economy. Quah and Sargent [1993] extended the dynamic factor models' original framework to cases in which the number of cross-sectional time series is of the same magnitude as the time dimension, with a quasi-maximum likelihood method involving the Expectation-Maximisation (EM) algorithm. Chamberlain and Rothschild [1983] generalised results of the arbitrage pricing theory to the case of approximate factor structure, in which idiosyncratic components of the series need not be uncorrelated, and the idiosyncratic covariance matrix need not be diagonal. In this framework, factors can be consistently estimated with a simple quasi-maximum likelihood method: principal component analysis. Also based on an approximate factor structure, Stock and Watson [1998, 2002a, 2002b] showed that a consistent estimation of a dynamic factor model can be obtained through standard principal component analysis, as both the number of series and the number of observations go to infinity5. The dynamic factors obtained can be combined to forecast macroeconomic time series such as GDP or inflation. The same authors also suggest implementing the EM algorithm to extract factors from an unbalanced panel (i.e. comprising series with missing observations for the relevant time sample). Angelini, Henry and Mestre [2001b] estimated such an approximate factor model to forecast inflation in the euro area.
3 Such as the OECD 26 indicator equations, cf. Sédillot & Pain (2003).
4 Stock & Watson [2002a], p.147.
◊
Main advantages
Factor models offer a number of advantages compared to leading indicator equations. The use of a very large number of series is likely to reduce model selection uncertainty, i.e. the risk of picking the wrong series and omitting the relevant series. The signal extracted with factor analysis does not rely on a few specific series, but is rather the result of the general pattern, more precisely the common shocks, displayed by all series.
Many macroeconomic series contain leading information about the economic situation, or more specifically about GDP. The problem is that this signal is mixed with much noise. Factor analysis is a powerful tool for disentangling noise from signal, because the noise in the various series tends to be idiosyncratic, that is, broadly speaking, uncorrelated across series.
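The argument that idiosyncratic noise washes out across a large cross-section can be illustrated with a small simulation. This is a hedged sketch on synthetic data, not a result from the paper: the single common factor, the loading range, the noise level and the function name `first_pc_correlation` are all assumptions chosen for the illustration.

```python
import numpy as np

def first_pc_correlation(n_series, t_obs=28, noise_sd=3.0, seed=0):
    """Absolute correlation between the first principal component and a simulated factor."""
    rng = np.random.default_rng(seed)
    factor = rng.standard_normal(t_obs)                # simulated common signal
    loadings = rng.uniform(0.5, 1.5, n_series)         # assumed loading range
    noise = noise_sd * rng.standard_normal((t_obs, n_series))
    panel = np.outer(factor, loadings) + noise         # signal buried in noise
    z = (panel - panel.mean(axis=0)) / panel.std(axis=0)
    # The first left singular vector is the first principal component,
    # up to sign and scale.
    u, _, _ = np.linalg.svd(z, full_matrices=False)
    return abs(np.corrcoef(u[:, 0], factor)[0, 1])

corr_small = first_pc_correlation(5)      # a handful of indicators
corr_large = first_pc_correlation(2000)   # a very large cross-section
print(corr_small, corr_large)
```

Each individual series here is dominated by noise, yet with a couple of thousand series the first principal component should track the simulated factor closely, whereas with only five series it generally does not.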
The estimation of approximate factor models through principal component analysis does not require long samples. This technique does not require the estimation of parameters, in the sense that the model is statistically non parametric. In contrast to standard low-dimension systems, consistency is theoretically achieved when the number of variables is (far) greater than the number of observations. Thus, estimation may be less subject to structural break problems.
Forecasts derived from factor indexes are more easily accountable (with loadings monitoring) than those derived from VAR systems. This feature is particularly important for publicly disseminated forecasts.
Last but not least, it is possible to use data at country level as well as at sectoral level, and thus to build on the statistical strengths of every country, whereas euro area data are for the time being relatively scarce and can sometimes be subject to aggregation problems.
◊
Some shortcomings
Most factor models developed in the econometric literature have, however, been subject to two types of problems: first, a problem of estimating and identifying the common factors consistently even with large databases and, secondly, a problem of deriving robust forecasts from the factors. First, the theoretical consistency property of pseudo maximum likelihood methods6 does not seem to appear systematically in empirical implementations with real data7. In other words, there is no guarantee with real data (which might exhibit peculiar cross-correlation features) that the first few principal components are “true” (or latent) common factors. Bai [2003] showed that consistency is obtained for large N but fixed T when the idiosyncratic errors are serially uncorrelated and homoskedastic. There have been very few explorations of the conditions under which consistency is achieved with real data. Boivin & Ng [2003] argued that the composition of data can affect factor estimation and forecasting, for instance where idiosyncratic terms are cross-correlated in a particular way. Breitung and Kretschmer [2004] introduced a new information criterion to distinguish latent (or “structural”) factors from static factors derived from principal component analysis.
5 By contrast to maximum likelihood factor analysis used for example in Doz and Lenglart [1999], standard principal component analysis remains appropriate when the data matrix is not full rank.
6 Using principal component analysis. The issue is obviously different when common factors are extracted through maximum likelihood methods performed on a non-singular data matrix.
Apart from common factor extraction, another problem is to link common factors to the dependent macroeconomic variable of interest. Which common factors should be selected to predict GDP? Which bridging system can best ensure robustness? Which information criteria8 should inform the choice of the number of factors? Some indeterminacy also remains in the choice of lags or more remote eigenvectors (by descending order of eigenvalues), since some factors associated with high eigenvalues might be poorly correlated with the variable of interest. Most of the literature has conservatively used OLS or VAR systems to forecast the variable of interest with common factors. However, these systems do not usually exploit the cross-correlation information conveyed by the data matrix at the stage of factor extraction. The artificial9 link between factors and dependent variable, which is created at the stage of forecast computation, might be detrimental to the model's robustness. As a matter of fact, the forecast performances of factor models are rarely systematically superior to those of standard stochastic models.
1.1.3. Introduction to a sorted leading indicators dynamic factor model to forecast euro-area GDP
The remainder of this paper introduces a dynamic factor model designed to produce coincident and up to two-quarter-ahead forecasts of euro-area GDP, which can be revised on a monthly basis.
◊
Underlying assumptions of the factor model
The model is based on a set of assumptions, which can be usefully summarised in order to identify potential shortcomings.
A1: The relevant information about the present/future pattern of GDP is contained in a variety of leading economic indicators at several lags.
A2: Series that do not contain sufficient information are dropped from the data set. These series are identified as exhibiting too low a signal-noise ratio with respect to the euro area business cycle (GDP for simplification). The cross-correlation of an indicator with GDP growth is a relevant information criterion for determining the indicator's signal-noise ratio at various lags10.
A3: Dynamic common factors extracted from a pool of series displaying a given lead of at least h quarters to GDP (considered as reflecting the economic situation) summarise available information about the economic situation at a horizon h. Factors are assumed to be cleansed of idiosyncratic shocks or noise in the original series.
A4: In the case of series pertaining to different economic sectors and various countries of the euro area, A3 produces common factors summarising the current and near-term economic situation of the euro area. It means that the economic situation of the euro area is determined by the common economic dynamics in the various sectors and countries of the euro area, after eliminating idiosyncratic country-wise or sector-wise disturbances.
8 Cf. Stock and Watson [1998], Bai & Ng [2002], Connor & Korajczyk [1993].
9 In the sense that no economic meaning can inform econometric choices.
10 As series are incorporated in the model in difference or log difference, the cross-correlation test is performed on the
A5: Forecasts of GDP are solely based on the first common factors by descending order of eigenvalues and the database cross-sectional information, and not on OLS equations or VAR systems of various factors including GDP as the dependent variable.
Some limitations of the model might a priori stem from these assumptions. A4 is a stronger assumption than it might first appear. What is common across a variety of economic time series about different sectors and countries of the euro area? Recent research11 suggests that national data do contain reliable information about the euro area business cycle. However, it might be more difficult at certain points in time to extract relevant information about the euro area from national data, for example when not all countries/sectors experience a recession at the same time. Diverging national economic situations and/or economic policies might explain why factor extraction might give less reliable results. Conversely, in the twelve countries that moved in 1999 to a common monetary policy, budgetary policies are constrained by common norms and economic integration is likely to be fostered by EMU. It can be hypothesised that common factor extraction should provide better results since the launch of EMU, as economic integration is becoming deeper through the use of a common currency12. The business cycle amplitude might also differ considerably across sectors in a given country. This issue (connected to the problem of the data set composition) will be tackled in more depth later in this paper.
Nor should A2 be considered a neutral criterion. The selected series are not likely to be the same as those resulting from a different criterion for the assessment of signal-noise ratio at various lags. Potential improvements of the model might be explored with respect to this criterion. Out-of-sample experiments nevertheless suggest that forecasts seem to be robust to this rough criterion.
With A5, the model departs from a common practice in short-term forecasting, which is to use AR terms of the dependent variable13 and to select various factors (according to an information criterion based on the OLS regression of GDP) irrespective of the share of the data variance they explain. In our framework, all series are introduced at various lags in the data set and this feature allows the extraction of some of the relevant autoregressive information in the data. Only the first few factors by decreasing order of eigenvalues are selected. This is motivated by the fact that a remote factor by descending order of eigenvalues explains a smaller share of the data variance and is thus unlikely to be a common factor, in the sense that it would account for shocks that are common to all the data. The latter assumption also encompasses the strong hypothesis that it is possible to produce accurate forecasts of GDP solely based on common factors that summarise the economic situation and not on GDP idiosyncratic estimation. In fact, GDP itself is an artificial aggregate that is correlated with business cycle common factors but also contains an orthogonal idiosyncratic component (e.g. noise or seasonal component). The larger the share of the idiosyncratic component variance compared to the variance of its common (business cycle) component, the less accurate the forecasts will be.
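To make the idea behind A5 more concrete, the sketch below estimates the not-yet-published quarters of GDP by repeatedly projecting GDP on the first principal components of a panel that includes GDP itself. It is only a generic EM-style fill-in written under stated assumptions: it is not the augmented EM algorithm of this paper, and the function name `em_gdp_projection`, the number of iterations and the simulated data are hypothetical.

```python
import numpy as np

def em_gdp_projection(indicators, gdp, n_factors=3, n_iter=50):
    """Fill NaN entries of gdp with its projection on the first principal components.

    indicators: complete T x N panel; gdp: length-T array, NaN where unpublished.
    """
    missing = np.isnan(gdp)
    gdp_filled = np.where(missing, np.nanmean(gdp), gdp)   # crude starting guess
    x = (indicators - indicators.mean(axis=0)) / indicators.std(axis=0)
    for _ in range(n_iter):
        mu, sd = gdp_filled.mean(), gdp_filled.std()
        # GDP enters the panel as the last column, standardised like the others.
        z = np.column_stack([x, (gdp_filled - mu) / sd])
        u, s, vt = np.linalg.svd(z, full_matrices=False)
        factors = u[:, :n_factors] * s[:n_factors]
        # Common component of GDP: its projection on the first factors only.
        gdp_common = factors @ vt[:n_factors, -1]
        # Observed quarters are kept; missing ones are updated at each recursion.
        gdp_filled = np.where(missing, mu + sd * gdp_common, gdp)
    return gdp_filled

# Simulated example (assumption): one common factor drives indicators and GDP.
rng = np.random.default_rng(1)
t_obs, n_series = 40, 300
factor = rng.standard_normal(t_obs)
indicators = (np.outer(factor, rng.uniform(0.5, 1.5, n_series))
              + rng.standard_normal((t_obs, n_series)))
true_gdp = 0.5 + 0.6 * factor + 0.1 * rng.standard_normal(t_obs)
gdp = true_gdp.copy()
gdp[-2:] = np.nan                      # the two latest quarters are unpublished
estimate = em_gdp_projection(indicators, gdp, n_factors=1)
```

No regression of GDP on fitted factors is run here: the estimate comes only from GDP's own loadings on the leading components, which is the spirit of A5, although the details of the paper's recursion may differ.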
◊
Original features compared to standard factor models
Compared to standard approximate factor models, three features are introduced to try to resolve the indeterminacy of the link between factors and the dependent variable and to enhance factor consistency.
11 Altissimo et al. [2001a]
12 This could remain true even if economic integration goes together with increasing regional specialisation, as common factors are in theory robust to idiosyncratic shocks.
13 Forni et al. [2002] also advocate the use of univariate or low dimensional vector time series methods to forecast the
First, series are introduced in the data set at various lags. The rationale is that leading indicators are expected to contain information about unobserved horizons in the present/future. The introduction of series at several lags is the feature that generally improves most forecast accuracy compared to the standard framework. In comparison with leading indicators equations, there is no ex ante assumption about the optimal lag of a leading indicator in the system (such a strong assumption may not be very robust). Where these leading indicators are lagged, the information becomes coincident and can be used to extract more efficiently the coincident information of other series. This possibility was already put forward by Stock and Watson [1998] in the “stacked version” of their model, but has not yet been implemented to our knowledge. Contrary to standard factor models in which dynamics are introduced at the level of lagged indexes or factors, the dynamic structure of the SLID factor model is encompassed in the original structure of the data set. In other words, the factors themselves are dynamic, not just the link between factors and the forecasted variable. For example, where series are lagged by up to h quarters in the data set, the last values of the factors based on the former series can be used to forecast GDP at a maximum horizon of h quarters with no need to use lagged factors in an appended equation or VAR.
Secondly, the sorting and selection of series according to a signal-noise ratio can improve forecast accuracy and provide more consistent estimates of the common factors. Series that do not reach a minimum threshold of the signal-noise ratio at a given lag are dropped from the data set at that lag. The cross-correlation of the input series with GDP (the reference variable) serves as the signal-noise criterion.
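The screening step described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function name `select_by_correlation`, the use of the absolute value of the contemporaneous correlation, and the default threshold (0.2, a value mentioned later in the paper for one experiment) are assumptions made for the example.

```python
import numpy as np

def select_by_correlation(X, gdp, threshold=0.2):
    """Return indices of the columns of X whose |correlation| with gdp
    reaches the threshold, together with all correlations.

    X   : (T, N) array, one column per series at a given lag (assumed layout)
    gdp : length-T reference series aligned with the rows of X
    """
    X = np.asarray(X, dtype=float)
    g = np.asarray(gdp, dtype=float)
    Xc = X - X.mean(axis=0)              # centre each series
    gc = g - g.mean()                    # centre the reference series
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (gc ** 2).sum())
    corr = np.zeros(X.shape[1])
    nonconstant = denom > 0              # constant series keep correlation 0
    corr[nonconstant] = (Xc[:, nonconstant].T @ gc) / denom[nonconstant]
    keep = np.flatnonzero(np.abs(corr) >= threshold)
    return keep, corr
```

Applying the function separately to each lagged copy of a series would reproduce the idea that a series can be retained at one lag and dropped at another.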
Finally, forecasts are derived from factor-based methods that do not require the use of appended OLS equations/VAR systems to bridge factor indexes and the forecasted variable of interest. One of the tested methods is based on the implementation of an augmented EM algorithm deriving the forecasted values of GDP as the estimates of missing values. The algorithm is performed recursively with re-computation of the signal-noise ratio and selection of series at each stage based on the new GDP forecast until convergence. This may also allow for a more robust exploitation of the dynamic cross-correlation structure of the data.
Compared to other standard factor models, a very large data base was built in order to test for consistency. Survey data (EC Business and consumer survey and other relevant national surveys) were introduced with a breakdown by countries and questions. The data set also covers country specific data: series that are neither available for each country, nor constructed on a harmonised basis.
1.2. FORMAL DESCRIPTION
1.2.1. The dynamic factor model
The general framework of the model is factor analysis, the objective of which is to extract common, or summary, information from a large number of series. Following Reichlin [2002], each variable in the model is “represented as a sum of a component which is common to all the variables in the economy and an orthogonal idiosyncratic” component (residual). If most variables display co-movements, a few factors will account for a large share of the data variance. Considering economic time series, the patterns of which usually reflect business cycle fluctuations, it can be expected that a few macroeconomic shocks reflected in the common factors will account for a substantial share of the data variance.
Factor analysis can be used with macroeconomic series in order to extract a “diffusion” index correlated with GDP (our variable of interest), industrial production or inflation. Insofar as this index contains leading information about the variable of interest (or is available on a more timely basis), it can be used to produce forecasts.
The formal specification of the model draws heavily on the factor model framework developed in Stock and Watson [1998] and implemented for example in Angelini et al. [2001b]. We refer to these papers for clear and detailed presentations of the theoretical baseline model and focus here on differences.
Series are stacked at various leads14 or lags into the data matrix:
$$X = \left( X_{-1} \;\; X_{0} \;\; X_{+1} \;\; X_{+2} \;\; X_{+3} \;\; X_{+4} \right), \qquad (1)$$
where X is the (T,N) matrix of T rows (observations) and N columns (variables)15. Series assumed to contain leading information at a horizon of i quarters are lagged (shifted by i quarters towards the future) in the matrix $X_{+i}$. Series can be introduced six times if they are assumed to contain relevant information16 at all leads and lags.
Let us consider a forecast for the horizon h (by convention h=0 for a coincident forecast). The data matrix X is trimmed in the time dimension in order to retain in $X^h$ a time window of T quarters, from quarter Q+h-T+1 to quarter Q+h, the last observation of which corresponds to the most remote quarter Q+h to be forecasted. All the series that display a missing observation (at a given lead or lag) over this time span are removed from the set $X^h$ (at the relevant lead or lag), leaving $N_h$ variables (columns).
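The stacking and trimming just described can be illustrated with a short NumPy sketch. The function names (`shift_rows`, `stack_shifted`, `window_for_horizon`), the use of NaN for unavailable cells and the row-index convention are assumptions introduced for the example; only the set of shifts (-1 to +4) comes from equation (1).

```python
import numpy as np

def shift_rows(X, i):
    """Shift every column by i rows: i > 0 moves values towards the future
    (a lag), i < 0 towards the past (a lead). Vacated cells become NaN."""
    out = np.full_like(X, np.nan)
    if i > 0:
        out[i:] = X[:-i]
    elif i < 0:
        out[:i] = X[-i:]
    else:
        out[:] = X
    return out

def stack_shifted(X, shifts=(-1, 0, 1, 2, 3, 4)):
    """Build the stacked matrix of equation (1). Rows are appended so that
    lagged series can reach quarters beyond the last observed one."""
    X = np.asarray(X, dtype=float)
    extra = max(max(shifts), 0)
    padded = np.vstack([X, np.full((extra, X.shape[1]), np.nan)])
    return np.hstack([shift_rows(padded, i) for i in shifts])

def window_for_horizon(stacked, last_row, t_window):
    """Keep the t_window rows ending at last_row (the quarter to forecast)
    and drop every column with a missing value in that window."""
    win = stacked[last_row - t_window + 1: last_row + 1]
    complete = ~np.isnan(win).any(axis=0)
    return win[:, complete], np.flatnonzero(complete)
```

With T observed quarters indexed 0 to T-1, a forecast at horizon h would use `last_row = T - 1 + h`; only the copies lagged by at least h quarters then survive the trimming.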
The data are assumed to be generated by a k-factor structure:
$$X^h = F_k^h \cdot \Lambda_k^h + U^h, \qquad (2)$$
where $F_k^h$ contains k common factors (columns) and $\Lambda_k^h$ is the corresponding loading matrix ($N_h$ columns, k rows).
The extraction of approximate factors is based on standard principal component analysis as in Stock and Watson [1998]. The latter paper shows that under some non parametric assumptions, a quasi maximum likelihood estimator for k dynamic factors is consistently obtained by choosing them as the first k principal components of X, corresponding to the eigenvectors for the k largest eigenvalues of the cross-moments matrix XX' (provided the series are standardised). An important condition for obtaining consistent estimates of common factors is that the number of series should be (far) greater than the number of observations, or:
in Stock and Watson [1998]: $\ln(N_h)/\ln(T) \to \rho > 2$ when $T, N_h \to \infty$;
in Connor and Korajczyk [1988]: $N_h \to \infty$ when $T$ is fixed.
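As an illustration of the principal component step, the following sketch extracts k factors from the eigenvectors of XX' after standardising the series. The function name `extract_factors` and the normalisation F'F/T = I are choices made for the example, and the sketch assumes a complete data window with no constant columns.

```python
import numpy as np

def extract_factors(X, k):
    """Principal-component estimate of k factors from a (T, N) matrix.

    Returns F (T, k), the loadings Lam (k, N) and the k largest
    eigenvalues of Z Z', where Z is the standardised data matrix.
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)      # standardise each series
    T = Z.shape[0]
    eigval, eigvec = np.linalg.eigh(Z @ Z.T)      # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:k]          # indices of the k largest
    F = eigvec[:, order] * np.sqrt(T)             # scaled so that F'F / T = I
    Lam = F.T @ Z / T                             # loadings of each series
    return F, Lam, eigval[order]
```

Working with the T x T matrix XX' rather than the N x N matrix X'X is convenient here because the number of series is much larger than the number of observations.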
14 Series introduced with a lead in the data base are rarely used as their last values available (needed to compute
forecast) become attached to a previous observation and no values are available for the last observation and the following observation. In this case, a forecast can only be computed with these series for the previous observation.
15 Note that the splitting of the matrix suits the objective of forecasts performed up to the theoretical maximum horizon
of h=4 quarters. Further stacking might improve the model forecast performance at remote horizons for some variables of interest, for which long leading indicators are available.
Here, common factors summarise the shocks17 characterising the business cycle. The dynamics in this model are introduced at the level of the series’ leads and lags18, and not at the level of an equation with lagged terms or a VAR system connecting GDP to a certain number of factors19.
1.2.2. Forecast computation methods
In the remainder of the paper, “Q”, “Q+1” and “Q+2” forecasts refer respectively to “coincident”, one-quarter-ahead and two-quarter-ahead forecasts. Coincident GDP forecasts are produced less than three months before the release of the Eurostat GDP flash estimate20, Q+1 GDP forecasts between three and six months ahead of the flash estimate release, and Q+2 GDP forecasts between six and nine months ahead.
Principal component analysis allows the estimation of a few unobserved common factors as driving the euro-area business cycle and GDP. Beyond the first factors, the other principal components (eigenvectors) are not latent factors in the sense that they reflect only idiosyncratic shocks and do not provide relevant information with respect to the euro area business cycle. A common issue for all methods, which is the determination of the optimal number/selection of factors, will be treated in the next section.
If coincident and future values of the first factors can be estimated, it is possible to derive forecasts at the same horizon for the common component of euro-area GDP (neglecting the idiosyncratic component of GDP in the model). The general idea is that factors can be estimated without the euro-area GDP series in the data matrix. They nevertheless provide information about the business cycle that is shared with GDP. Coincident or forecasted values of GDP can thus be derived from coincident and leading values of the factors. More precisely, the time displacement of the variables leading GDP by at least i quarters allows one to derive i-quarter ahead forecasts of GDP in the same way that coincident variables21 allow us to produce a forecast of coincident GDP.
Three different methods are presented in the following developments. For the sake of simplification, the case of the GDP coincident forecast is described first. The problem of the estimation of one-quarter ahead and two-quarter ahead forecasts is an extension of the same problem. Generally speaking, it should be more efficient to use the estimate for the previous quarter insofar as it is more reliable than other alternatives (i.e. contains more information than the estimate for the same quarter, which could be jointly computed with the estimate for the next quarter22). A simple solution is therefore a recursive computation of forecasts for quarter q+i (i>0) based on forecast(s) obtained for quarter(s) q+i-j (j<i)23.
17 The assumption is that series leading GDP by h quarters (which are displaced in the database by h quarters into the
past) would normally reflect shocks to the business cycle h quarters ahead.
18 For a relatively similar treatment of dynamics at the level of factor extraction but in the domain of frequencies, see
Forni et al. [2002].
19 See for example: Stock and Watson [1998, 1999], Camba-Mendez et al. [1999], Angelini et al. [2001a].
20 Let us recall that Eurostat GDP flash estimate is released about 45 days after the end of the quarter estimated, hence,
the use of the term coincident for information available shortly before its release.
21 Available with greater timeliness than GDP.
22 In the present case, the data matrix includes about half the amount of variables for the one-quarter ahead forecast
compared to the coincident forecast, or for the two-quarter ahead compared to the one-quarter ahead. It is straightforward to conclude that a coincident forecast computed with half as many variables is likely to be significantly less reliable ceteris paribus.
23 For example, provided that the coincident value of GDP is considered to be correctly estimated, the one-quarter
Projection of the GDP series on the first few factors estimated without the last observation of GDP (P method)
The simplest method is a projection of GDP on the first few factors derived from principal component analysis performed on the database without euro-area GDP. Coefficients of the linear combination of the factors are equal to the loadings of GDP in the principal component analysis performed excluding the last observation24. The performance of this type of projection should be close to that of an OLS regression (under the assumption of factors' orthogonality). However, spurious results may be obtained with OLS due to the small size of the sample (28 observations). This method does not allow the exploitation of the cross-section information provided by the last observation in the database.
Factors are used coincidentally and selected in the empirical implementation by descending order of eigenvalues25. An alternative could be to project GDP on factors which are possibly lagged. This possibility was not explored as the emphasis was put on a simple and parsimonious framework.
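A minimal sketch of the P method follows, under simplifying assumptions that are not taken from the paper: factors are extracted once from the indicator matrix (GDP excluded) over the full window, and the GDP loadings are computed as the average cross-product between standardised GDP and the factors over the quarters where GDP is observed. The names `pca_factors` and `p_method_forecast` are hypothetical.

```python
import numpy as np

def pca_factors(X, k):
    """First k principal-component factors of X, scaled so that F'F / T = I."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :k] * np.sqrt(Z.shape[0])

def p_method_forecast(X, gdp, k=3):
    """Project GDP on factors estimated without GDP.

    X   : (T, N) indicators, complete over all T rows
    gdp : length-T series whose missing quarters (NaN) are to be forecast
    """
    F = pca_factors(np.asarray(X, dtype=float), k)
    g = np.asarray(gdp, dtype=float)
    obs = ~np.isnan(g)
    mu, sd = g[obs].mean(), g[obs].std()
    # Loadings of standardised GDP on the factors, observed quarters only
    # (an approximation of the loadings described in the text).
    lam = F[obs].T @ ((g[obs] - mu) / sd) / obs.sum()
    return mu + sd * (F[~obs] @ lam)      # back to GDP units
```

Because the loadings are not re-estimated with the last cross-section, this estimate ignores the information in the final row of the database, which is the limitation noted above.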
EM algorithm: forecasts derived as missing values in the data set (EM26 method)
Broadly speaking, the idea is to calibrate GDP missing values (forecasts) with end of sample cross-section information based on the EM algorithm. The EM algorithm in econometrics is designed to solve attrition problems, e.g. to estimate missing values usually at the beginning of truncated time series. The first estimation step of this algorithm is identical to the P method. It consists of the estimation on one hand of the coincident or leading (corresponding to the future) values with factors derived from the data set excluding GDP and on the other hand of GDP loadings computed with the data set excluding the observation(s) for which GDP is missing. In a second maximisation step, the factor extraction through principal component analysis is performed on the full data matrix, and the GDP series augmented by the estimate obtained in the first step. This step is repeated until convergence is achieved.
Convergence was in general achieved rapidly in our out-of-sample exercise, albeit less so in the case of specific quarters (2001Q4) plausibly due to breaks in survey series. Even if the initial estimates (derived from the first step) were very poor, the first iteration generally corrects for most of the error and the next iterations converge to an optimum which is in principle independent from the initial value.
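The iteration can be sketched as an iterative principal-component imputation of the missing GDP values. This is an illustrative simplification: the function name `em_forecast`, the crude starting value (the mean of observed GDP rather than the P-method estimate), the tolerance and the iteration cap are all assumptions for the example.

```python
import numpy as np

def em_forecast(X, gdp, k=3, tol=1e-8, max_iter=500):
    """Estimate missing GDP values as the rank-k common component of the
    augmented matrix [X, gdp], iterating until the estimates stabilise.

    X   : (T, N) complete indicator matrix
    gdp : length-T series with NaN for the quarters to forecast
    Assumes k is smaller than the number of columns of [X, gdp].
    """
    X = np.asarray(X, dtype=float)
    g = np.asarray(gdp, dtype=float).copy()
    miss = np.isnan(g)
    g[miss] = g[~miss].mean()                 # assumed starting value
    for n_iter in range(1, max_iter + 1):
        D = np.column_stack([X, g])           # GDP is the last column
        mu, sd = D.mean(axis=0), D.std(axis=0)
        U, s, Vt = np.linalg.svd((D - mu) / sd, full_matrices=False)
        # Common component of GDP at the missing quarters, in GDP units.
        new = mu[-1] + sd[-1] * ((U[miss, :k] * s[:k]) @ Vt[:k, -1])
        converged = np.max(np.abs(new - g[miss])) < tol
        g[miss] = new
        if converged:
            break
    return g[miss], n_iter
```

Starting from a deliberately poor value, as here, is one way to check on a given data set that the limit does not depend on the initial estimate.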
Augmented EM algorithm: Selection27, Estimation, Maximisation (SEM method)
In the framework of the EM method (as well as in the case of other methods indeed), a second stage can be usefully performed. Based on the forecast obtained in the first stage (with the EM
24 As no coincident value of GDP is available, the principal component analysis cannot be performed with the last
(coincident) observation.
25 The loading of GDP (which corresponds to the cross-correlation of GDP with the factor) is generally lower for the
factors following the first factor, as they are orthogonal to the first factor, which is itself highly correlated to GDP. Another optimisation strategy would be to select factors according to their respective GDP loading. However, some tests showed that it was more robust to select factors corresponding to a high eigenvalue than to a high GDP loading. Other more complex schemes combining decision rules for loadings and eigenvalues could also be envisaged, but this did not seem fruitful insofar as with small sample and out-of-sample spans the risk is high that results are not robust. Note that such schemes would not be consistent with the underlying assumption of the model, that the latent factors should be associated to the highest eigenvalues (see previous section).
26 EM for Estimation, Maximisation. See detailed description, for example, in Stock and Watson [1998].
27 The selection step is based on the ex ante signal-noise screening of the series.
algorithm28), the correlation test can be computed again with the coincident forecast added to the series of GDP. The correlation computation takes into account this time one additional observation of each series and GDP (corresponding to the forecasted value of GDP). A new selection of series thus takes place and the EM algorithm can be performed again with the new series’ selection. All these steps can be iterated until convergence in this second stage. These iterations enhanced forecast accuracy for Q and Q+1 forecast. Convergence was generally achieved in less than 10 iterations of the second stage with three factors (it was slower with a higher number of factors).
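The two-stage loop (selection, then EM, repeated) might be sketched as follows. The helper names, the absolute-correlation threshold of 0.2 and the stopping rule (stop when the set of selected series no longer changes) are assumptions for illustration; the paper only states that the steps are iterated until convergence. The sketch assumes no constant series.

```python
import numpy as np

def abs_corr(X, g):
    """Absolute correlation of each column of X with the series g."""
    Xc, gc = X - X.mean(axis=0), g - g.mean()
    return np.abs(Xc.T @ gc) / np.sqrt((Xc ** 2).sum(axis=0) * (gc ** 2).sum())

def em_fill(X, g, k, tol=1e-8, max_iter=500):
    """Fill NaN entries of g with the rank-k common component of [X, g]."""
    g = g.copy()
    miss = np.isnan(g)
    g[miss] = g[~miss].mean()                    # assumed starting value
    for _ in range(max_iter):
        D = np.column_stack([X, g])
        mu, sd = D.mean(axis=0), D.std(axis=0)
        U, s, Vt = np.linalg.svd((D - mu) / sd, full_matrices=False)
        new = mu[-1] + sd[-1] * ((U[miss, :k] * s[:k]) @ Vt[:k, -1])
        done = np.max(np.abs(new - g[miss])) < tol
        g[miss] = new
        if done:
            break
    return g

def sem_forecast(X, gdp, k=3, threshold=0.2, max_outer=20):
    """Alternate series selection and EM estimation until the selection is stable."""
    X = np.asarray(X, dtype=float)
    g = np.asarray(gdp, dtype=float)
    miss = np.isnan(g)
    # Ex ante selection: only quarters where GDP is observed.
    keep = abs_corr(X[~miss], g[~miss]) >= threshold
    for _ in range(max_outer):
        filled = em_fill(X[:, keep], g, k)
        # Re-compute the correlations with the forecast added to GDP.
        new_keep = abs_corr(X, filled) >= threshold
        if np.array_equal(new_keep, keep):
            break
        keep = new_keep
    return filled[miss], np.flatnonzero(keep)
```

The second return value lists the series retained at the final stage, which makes it easy to inspect how the selection changes once the forecast is included in the correlation test.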
Recursive estimation for Q+1 and Q+2 forecasts
For Q+1 and Q+2 forecasts, the best option seemed to be the recursive use of forecasts obtained previously for the coincident and, if need be, one-quarter ahead horizons (i.e. in the case of Q+2 forecasts)29, both considered as observed values. Forecast errors on previous estimates can be expected to spill over into forecasts for more remote quarters. In practice, it was observed that the algorithm converged to a recursive q+i forecast value that is not considerably correlated to the q+i-j forecasts (with j>0) used.
For forecasts at remote horizons, the SEM method deteriorated the accuracy of forecasts obtained with EM in the first stage, where the accuracy of the latter forecasts was not sufficient to yield a more efficient selection of series. The explanation is straightforward: if the forecast is accurate, the estimation of the series recent correlation with GDP is improved where the forecast is used, otherwise a bias is introduced in the estimation of the correlations and series’ selection. In our out-of-sample experiment, such bias seemed to appear with Q+2 forecasts, that is with a RMSE higher than 0.20 percentage points for the forecast obtained at the end of the first stage (EM algorithm).
1.3. FACTORS SELECTION
One of the major issues with factor models is which and how many factors to select. Following Connor & Korajczyk [1993] and Stock and Watson [1998], the following standard information criterion (adapted BIC) was used in order to assess the efficient number of factors:
$$IC = \min_{\Lambda} \, \ln\!\left( \frac{1}{NT} \sum_{i=1}^{N} \sum_{t=1}^{T} \left( GDP_t - \Lambda F_t \right)^2 \right) + K \, \frac{\ln(T)}{T},$$
where K is the number of factors and Λ corresponds to the linear combination of some estimated factors that minimises the least-square criterion function.
Without filtering30, the adapted BIC suggests that the first three factors by descending order of eigenvalues provide the most efficient common factors' combination (based on the average over the out-of-sample time span, which is the lower figure in each column). The out-of-sample RMSE for Q to Q+3 forecasts is indeed lower with three factors.
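For illustration, a criterion of this kind can be evaluated as below. The sketch uses a single-series form of the adapted BIC, ln(SSR/T) + K ln(T)/T, with the least-squares fit of centred GDP on the first K factors; the function name `adapted_bic` and this exact form are assumptions, not a transcription of the computation used in the paper.

```python
import numpy as np

def adapted_bic(F, gdp, k_max=None):
    """Evaluate ln(SSR_K / T) + K ln(T) / T for K = 1..k_max, where SSR_K is
    the residual sum of squares of centred GDP regressed on the first K
    columns of F (factors assumed sorted by descending eigenvalue).

    Returns the K with the smallest criterion and the vector of values.
    """
    F = np.asarray(F, dtype=float)
    g = np.asarray(gdp, dtype=float)
    g = g - g.mean()
    T = len(g)
    k_max = F.shape[1] if k_max is None else k_max
    ic = np.empty(k_max)
    for K in range(1, k_max + 1):
        coef, *_ = np.linalg.lstsq(F[:, :K], g, rcond=None)
        resid = g - F[:, :K] @ coef
        ic[K - 1] = np.log(resid @ resid / T) + K * np.log(T) / T
    return int(np.argmin(ic)) + 1, ic
```

Since the criterion rests on a simple projection of GDP on the factors, it measures the information content of the factors in the sense of the P method only.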
However, the result is not the same where series are filtered. For example, with a correlation threshold of 0.2, five factors for Q and Q+1 forecasts are suggested by the criterion. This
28 As well as with any other method.
29 Other options could be envisaged. For example, one could use previous forecasts as starting values, but re-estimate
with the EM algorithm all forecasts i.e. estimate GDP loadings and factors with the most leading series for all forecasted quarters. In the second stage, one could perform the selection step with previous forecasts and run the EM algorithm again on all forecasted quarters.
prescription is at odds with out-of-sample results based on the SEM algorithm, which are considerably worse with five factors than with three.
Due to the specificity of this factor model, the use of a standard criterion for factor selection does not seem reliable where series are filtered. Several reasons might explain this. The criterion was not calibrated with filtered series and the penalty term is likely to be inappropriately scaled. Moreover, the criterion is based on OLS, i.e. a simple projection of GDP on the factors in order to assess the information content of the factors. As such, the criterion can be valid in the framework of this model with the projection method, but not with the EM or SEM methods that extract information more efficiently from the sample. Indeed, it can be checked that the prescriptions of the BIC are generally in line with out-of-sample results with the projection method.
The number of factors derived by the BIC was also compared to the monitoring of the eigenvalues' decay31, which suggested between 2 and 4 factors depending on the quarter considered. But the Bai and Ng [2002] criterion did not provide consistent results with our very large database. Further research about an alternative information criterion for this type of model would obviously be useful.
1.4. COMMON FACTORS AND FORECAST INTERPRETATION
1.4.1. Factors reading
Factor extraction is sometimes considered to be a “black-box” tool. However, the interpretation of factors is simple in principle and is laborious only because of the size of the database. Forecasts can be explained more easily than with VAR systems, for example.
Let us reconsider the basic factor equation (2):
$$X^h = F_k^h \cdot \Lambda_k^h + U^h$$
Each series is expressed as a linear combination of the factors, and the loadings are the “coordinates” of the series in the factor basis. There is a direct link between each input series on the one hand, and the variables' loadings and the factors on the other. Unfortunately, it is not possible to retrieve factors from basic projection operations on the variables. The information supplied by the common factors has to be retrieved through an indirect examination of the correlations between the series and the factors (correlations correspond to the loadings).
1.4.2. Forecasts interpretation
Interpreting and explaining a forecast is no more difficult. The last observation of GDP (the forecast) can be obtained as the weighted sum of the factors' last observations, each weight corresponding to a loading. As the algorithm is usually performed with the first three factors in our out-of-sample test, it is sufficient to concentrate on the first three loadings. Thus, any forecast can be directly read as a combination of the first three factors, and the problem of interpreting forecasts reduces to that of explaining the factors.
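Written out, the reading of a forecast described above takes the following form. The notation is introduced here for illustration only: $\hat{\lambda}_j$ denotes the loading of GDP on factor j, $\hat{F}_{j,Q+h}$ the last observation of factor j, and $\mu$ and $\sigma$ the mean and standard deviation used to standardise GDP, assuming GDP enters the factor analysis in standardised form like the other series.

$$\widehat{GDP}_{Q+h} = \mu + \sigma \sum_{j=1}^{3} \hat{\lambda}_j \, \hat{F}_{j,Q+h}$$

Each term $\hat{\lambda}_j \hat{F}_{j,Q+h}$ can then be read as the contribution of factor j to the forecast, in standardised units.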
Compared to standard OLS regressions, it is not possible to make a quantitative assessment of the original series’ impact on the forecasts due to the factors interpretation step. It is nevertheless possible to connect (through the factors) the main groups of input series to the forecasts and make a qualitative assessment of their impact on the forecasts, because series of the same group (type) tend to be highly cross-correlated with each other.
For each of the first few factors, variables can be sorted according to their correlations (loadings) with the factors32. The signs and magnitudes of the last observations of the best correlated variables have the highest explanatory power for the forecast. Given that series are correlated by blocks in the data set (each block generally corresponding to one type of data), it is sufficient to examine a few series of each block to form a qualitative assessment of its influence on the forecasts. In practice, one can examine the last observations of the series in each block that are best correlated with the first few factors in order to draw up an explanatory table such as the following one (fictitious example):
Data categories / Forecast horizons        Q      Q+1    Q+2
Financial sector                           ↗      ↗      ↗
Short term employment, vacancies           ↗      ↗      ↗
Interest rates, credit                     ↗      ↗      ↗
Construction                               ↗      ↗      ↗
Industry                                   →↗     →      ↘
Producer prices                            →      ↗      ↘
Consumer confidence                        →↗     →
Services                                   →      →
Unemployment                               →      →↘     →
Export prices, effective exchange rate     →      →↘     →
Consumer prices                            ↘      ↘      ↘

Legend (impact on growth forecast): ↑ very positive, ↗ positive, → neutral, ↘ negative, ↓ very negative.
32 Attention must be paid to the sign of the correlation between the series and the factor, and between the factor and
GDP. For example, if both signs are negative, a large positive last observation for the series should translate into a negative reading for the factor and a large positive reading of the forecast if the variable has a strong correlation with the factor.
2. DATA SELECTION AND PROCESSING
In this part of the paper, the large database that was constructed to perform the GDP forecasts is presented in more detail, together with a precise description of the data processing. Some issues regarding the data treatment performed in the framework of the SLID factor model (which departs from standard factor models) are also raised.
2.1. DESCRIPTION OF THE DATABASE33
France, Spain, Italy and Germany are the countries most represented in the database (the latter being slightly underrepresented compared to its share in euro-area GDP due to data availability constraints). The number of series for each of these four countries is approximately equal to the number of euro area series. At this stage of the work, data from non euro area countries have not been included (apart from very few exceptions). Business cycles in non EMU European countries (especially UK, SE, and DK) and also countries with which the euro area has significant trade flows (e.g. US) are not disconnected from those of euro area countries. Therefore, non-euro-area countries' series might contain relevant leading information for this type of model34. However, they are also likely to contain more noise, which might prove difficult to cleanse because of potentially less stable leads and lags to euro-area GDP. For this reason, an extensive incorporation of such data would require additional tests.
Survey data account for about half of the database, followed by output data (industry, construction) and international trade data. Sales data (retail and wholesale) represent a smaller share in the database (for availability reasons). Most of the (survey) series are available from 1990 onwards, except for services. However, as explained in the last section, the most useful series (because of their timeliness and correlation with GDP) are often shorter. Reuters PMI series were also introduced even if services series were not found to be helpful in the out-of-sample exercise due to their short length.
The core component of the database is the monthly and quarterly Business and consumer survey of the European Commission [2003]. Available balances were broken down to the level of each question of the five surveys (industry, retail trade, construction, service, and consumer) and to the twelve countries, as well as the aggregates for the euro area. The euro area aggregates are not redundant insofar as balances are not available before 1995 for some countries (for example those joining the EU in 1995). The euro area balances incorporate all the available information in long series, thanks to a changing weighting scheme over time. On the one hand, balances capture some information pertaining to these truncated series (before 1995). On the other hand, such data are relatively noisy in part because of the aggregation process, but factor analysis is likely to partially correct for this unwelcome feature. The implementation of factor analysis showed that the broken-down balances contain more information (especially leading) than the aggregates. In particular,
33 For the full list of about 2000 series, please directly refer to the author.
34 The euro area business cycle is not derived from an aggregation of euro area countries but from common patterns
displayed by national series. If there are also commonalities with extra euro area countries, data from the latter countries could be used in accordance with the model design.
factor analysis allowed the disentangling of shocks of different sorts within a given balance35. Some selected country-wise survey results were added when they provide additional information, for example, IFO industrial orders survey, INSEE service survey36 or the Bank of France credit survey. The second largest component is trade data in value including exports, imports and trade balance data broken down by countries and broad economic categories. The use of volumes would be preferable but trade deflators are available only with a considerable lag (and are affected by considerable noise), which makes the data obsolete for use in short-term forecasts. Empirical tests have suggested that these data tend to have a significant impact on factor estimates for remote forecast horizons. The reason lies in the high correlation of some exports data with euro-area GDP. Imports are also included: as exports and imports deflators do not respond in the same way to nominal exchange rate fluctuations, the common component of both groups of series is likely to be closer to the common component of volumes. Only extra euro-area trade data were used in the out-of-sample exercise. Intra euro area trade data could also be tested37.
Industry and retail output data were broken down by sector and type of goods, and selected on a case by case basis when relevant. Some degree of judgement has to be exercised both on economic and statistical grounds. First, series about a sector, which is economically not relevant for a given country economy should not be too numerous, insofar as they increase significantly the data variance. Secondly, if a variety of series about a given country display different profiles and correlations to GDP, selecting all of them is likely to put considerable weight on this data compared to other countries. Data about housing starts were also selected when available. To illustrate again the advantage of disaggregating, car sales are notoriously poorly correlated with the overall economic activity because of large stochastic fluctuations, whereas retail (other than car) sales, which are also quite volatile, are nevertheless more reliable indicators for household consumption. Nominal data included cover industrial and retail prices, monetary aggregates, stock markets indices, nominal exchange rates38. Interest rates are poorly correlated with economic activity over the nineties in Europe39. Generally speaking, although financial data are abundant, they were used parsimoniously because cross-correlation to real activity is unstable.
Detailed data about the labour market, where available, together with business start-ups, bankruptcies, and staffing companies' turnover40 were included. It is noteworthy that the details in labour market statistics are not homogeneous across countries. For example, Spain, France or Belgium publish detailed and timely available data about labour markets.
35 A further breakdown to sectors might enable forecasters to uncover even more information, for example for the
extraction of an index from survey data only.
36 But some balances in services surveys conducted in France, and to a lesser extent Reuters' services survey, exhibit
outstanding cross-correlation with euro-area GDP.
37 The main advantage of intra euro area exports is that such data in value terms are less affected by exchange rate
fluctuations. However, intra trade data tend to dominate the other groups of data where they are used in principal component analysis because of their higher correlation with GDP.
38 Nominal effective exchange rates (NEER) display a better correlation with GDP than real effective exchange rates (REER), which is why the latter were not taken into account. A statistical explanation could be that estimates of REER are rarely accurate, especially when they are available in a timely manner, since international trade deflators are difficult to obtain at a high frequency for all countries.
39 This is most likely the result of significant monetary policy shocks throughout the nineties, and also because of the
channels of transmission of monetary policy in continental Europe.
40 Such very timely available data in the case of France and Belgium exhibit outstanding correlation with euro-area
GDP. Temporary work statistics in France include timely available but shorter survey series about staffing companies' order books.
Financial data were enriched with implied trading volatility on CAC40, DAX20, Eurostoxx50 and AEX. The ratio of put contracts/call contracts (open interest volumes) on these indexes can also be used in level terms (these series are stationary). The lead of this type of data can extend to two to three quarters and their cross-correlation with GDP is much higher than that of stock indexes. Yield curve spreads were also introduced.
Finally, the use of several series which start in 1999 was not envisaged at the beginning of our work. However, some experiments suggest that the model can be also used with shorter time series of six years span, which means that in the course of 2005 many more series relating to the euro area can be usefully added to the database. Corporate bond spreads series are among such series.
2.2. SUMMARY OF THE DATA TREATMENT PROCEDURE
(a) A very large set of data covering the twelve countries of the euro area and euro area aggregates was built. The criterion of selection is judgmental at this step: the selected series are assumed to contain information about the current or future economic situation and therefore to be possible leading indicators for the euro area economy. The emphasis is on good quality data in terms of accuracy and information content about the business cycle. A selection is made when many series are available for the same indicator. Not all the data are used at the final stage of factor extraction: only the subset that contains the most relevant information at a given point in time is used. This subset is thus bound to change over time.
(b) All series undergo a homogeneous transformation: they are seasonally adjusted with Tramo Seats (except where only adjusted data41 are available), converted to quarterly frequency, transformed into first difference42 (simple difference if data can be negative, otherwise difference in logarithm) in order to be made stationary. The real time use of the model required a special treatment of monthly data. When only one or two monthly values are available in a given quarter, a substantial loss of information would ensue if the series could not be used. Quarterly averages were thus replaced by rolling three-month moving averages. In other words, it is as if the quarterly average series is lagged by one or two months in time. Since factor analysis is performed on the whole series displaced in time, only the relevant (common) coincident information is extracted.
(c) Series are stacked at several leads and lags43.
(d) Indicators with low cross-correlation with GDP at a given lag44 are dropped at this lag.
(e) Indicators for which no data are available for the quarter needed to compute forecasts are also dropped45 (for example, a coincident forecast for 2003Q2 is computed with 2003Q2 values of series introduced coincidently, 2003Q1 values of series introduced with one quarter lag and 2002Q4 values of series introduced with two quarters lag, etc.).
41 This refers in particular to the data from DataStream. In other words, whenever non-adjusted series are available,
these series are preferred and are seasonally adjusted using Tramo-Seats.
42 Except for a very few series, obviously stationary, that are introduced in level terms (namely ratios of put contracts
open interest on call contracts open interest).
43 The fact that series are stacked at several leads and lags in the data set allows the extraction of the common
coincident component and the putting aside of the idiosyncratic lagged components.
44 This threshold was determined empirically in the out-of-sample exercise.
45 Series introduced with a lead will be dropped except during the 15 days preceding the release of Eurostat Flash
(f) The forecast computation is derived from an optimisation scheme based on the first few principal components of the selected series.
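Step (b) can be illustrated with a minimal sketch in Python. The function names (`rolling_quarterly`, `make_stationary`) and the rule used to decide whether a series "can be negative" (any non-positive observation in the sample) are assumptions made for illustration only. Seasonal adjustment with Tramo-Seats is taken as already done and is not reproduced here.

```python
import math


def rolling_quarterly(monthly):
    """Average three-month windows that end at the latest available month.

    With complete quarters this equals the usual quarterly average. With only
    one or two months of the current quarter, the windows are shifted so that
    the last one still ends at the most recent observation.
    """
    n = len(monthly)
    start = n % 3  # drop the oldest 0-2 months so that every window is complete
    return [sum(monthly[i:i + 3]) / 3 for i in range(start, n, 3)]


def make_stationary(series):
    """First difference: in logarithms if all values are positive, simple otherwise."""
    # Assumption: a series "can be negative" if any observation is <= 0.
    if any(x <= 0 for x in series):
        return [b - a for a, b in zip(series, series[1:])]
    return [math.log(b) - math.log(a) for a, b in zip(series, series[1:])]


# Hypothetical monthly index observed from January to August:
# the third quarter has only two monthly values so far.
monthly = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0, 106.0, 107.0]
quarterly = rolling_quarterly(monthly)  # windows March-May and June-August
growth = make_stationary(quarterly)     # log-difference, since all values are positive
```

In this example the last window covers June to August, so the July and August values enter the data set instead of being discarded until September is released.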
2.3. THE ISSUES OF DATA LAGGING AND SORTING
The introduction of series with leads or lags in the data set and their sorting is a possibility that was not, to our knowledge, empirically tested in the framework of the Stock and Watson [1998] methodology. However, data lagging seems to be used in the Forni, Hallin, Lippi and Reichlin [2001, 2002] dynamic factor model46 in the frequency domain.
Static factors (with no lag in the input variables) cannot as such capture dynamic cross-section covariance information in the data set, as the main factors would generally correspond to the main categories of data47 (output, prices, interest rates, etc.). This does not allow an optimal extraction of the leading information purveyed by the data. Factors extracted from coincident series have to be introduced using lags in an appended OLS equation to retrieve part of this information, with the risk of adding spurious correlation or noise. The introduction of the input series at several lags allows a more robust extraction of this leading information.
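The stacking of input series at several lags, together with the dropping of series that have no observation for the quarter needed, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: the names `stack_lags` and `available_at`, the dictionary layout and the sample values are hypothetical, and a value not yet released is represented by `None`.

```python
def stack_lags(panel, lags):
    """Build one column per (series, lag) pair.

    panel maps a series name to a list of quarterly values aligned on the same
    dates. Entry t of column (name, lag) holds the value observed at t - lag;
    a negative lag is a lead. Positions outside the sample are None.
    """
    stacked = {}
    for name, values in panel.items():
        n = len(values)
        for lag in lags:
            stacked[(name, lag)] = [
                values[t - lag] if 0 <= t - lag < n else None for t in range(n)
            ]
    return stacked


def available_at(stacked, t):
    """Keep only the columns that have an observation for target quarter t."""
    return {key: col[t] for key, col in stacked.items() if col[t] is not None}


# Hypothetical panel: "orders" has not yet been released for the last quarter.
panel = {
    "survey": [0.1, 0.3, 0.2, 0.4],
    "orders": [1.0, 1.2, 0.9, None],
}
stacked = stack_lags(panel, lags=(0, 1, 2))
usable = available_at(stacked, t=3)
```

For the last quarter, `usable` keeps the survey at lags 0, 1 and 2 but keeps the orders series only at lags 1 and 2, since its coincident value is missing.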
This issue of sorting/selecting is more difficult to justify theoretically. The standard model is assumed to produce consistent factors under the main condition that the number of series is far greater than the number of observations. Recent research focuses on conditions in which consistency is not achieved48. Real data features (especially cross-section correlation in idiosyncratic components) can cause trouble. Intuitively, the risk is that one of the first few factors might capture part of this idiosyncratic variance of a particular type of data. The sorting and selection of data based on a signal-noise ratio provides a potentially useful, but certainly not sufficient, tool for avoiding this unwelcome phenomenon. The selection of data based on cross-correlations with GDP might naturally be detrimental to forecast accuracy if this criterion is too rough. Empirical implementations suggested that our information criterion based on a correlation level seemed efficient, but it could certainly be enhanced.
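A selection rule based on cross-correlation with GDP could look like the sketch below. The threshold of 0.9, the function names and the toy data are assumptions chosen for illustration; the paper only states that its threshold was determined empirically in the out-of-sample exercise, and ranking by absolute correlation is one plausible reading of "sorting", not a documented detail.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cross_correlation(indicator, gdp, lag):
    """Correlation between gdp[t] and indicator[t - lag] over the overlapping sample."""
    pairs = [
        (indicator[t - lag], gdp[t])
        for t in range(len(gdp))
        if 0 <= t - lag < len(indicator)
    ]
    if len(pairs) < 3:
        return 0.0
    xs, ys = zip(*pairs)
    return pearson(xs, ys)


def select_and_sort(panel, gdp, lags, threshold):
    """Keep (name, lag) pairs with |correlation| >= threshold, strongest first."""
    kept = []
    for name, values in panel.items():
        for lag in lags:
            r = cross_correlation(values, gdp, lag)
            if abs(r) >= threshold:
                kept.append((name, lag, r))
    kept.sort(key=lambda item: abs(item[2]), reverse=True)
    return kept


# Toy data: "lead" anticipates GDP growth by one quarter, "noise" does not.
gdp = [1.0, 2.0, 1.5, 3.0, 2.5, 4.0]
panel = {
    "lead": [2.0, 1.5, 3.0, 2.5, 4.0, 0.0],
    "noise": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
}
kept = select_and_sort(panel, gdp, lags=(0, 1), threshold=0.9)
```

Here only the leading series at lag 1 passes the threshold. The same series is dropped at lag 0, which mirrors the rule that an indicator is dropped at a given lag rather than from the data set altogether.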
2.4. THE ISSUE OF THE DATA SET COMPOSITION
Forecasting with factor models should normally not be subject to model uncertainty, in the sense that all potentially relevant variables can be incorporated into the model. However, specification error might occur as with any model: factor extraction from a basket of indicators that are irrelevant for the variable of interest will produce biased forecasts. The problem might be amplified by the use of approximate factor extraction through principal component analysis, which has different properties from those of the maximum likelihood method49. As such, there is no guarantee that the eigenvectors obtained effectively correspond to the (latent) common factors sought, even when N is very large, if a subset of series in the database exhibits peculiar cross-correlation features.
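Approximate factor extraction through principal component analysis can be illustrated with a small sketch that computes the first principal component of standardized series by power iteration on their correlation matrix. This is a generic textbook technique shown under stated assumptions, not the optimisation scheme of the paper; the function names and the toy data are hypothetical, and the series are assumed to be non-constant.

```python
import math


def standardize(col):
    """Centre a series and scale it to unit variance (assumes it is not constant)."""
    n = len(col)
    mean = sum(col) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
    return [(v - mean) / sd for v in col]


def first_principal_component(columns, iterations=200):
    """Return (loadings, factor) for the leading eigenvector of the correlation matrix."""
    z = [standardize(c) for c in columns]
    k, n = len(z), len(z[0])
    corr = [
        [sum(z[i][t] * z[j][t] for t in range(n)) / n for j in range(k)]
        for i in range(k)
    ]
    # Power iteration; the uniform start vector is assumed not to be
    # orthogonal to the leading eigenvector.
    w = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w_new = [sum(corr[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in w_new))
        w = [v / norm for v in w_new]
    factor = [sum(w[i] * z[i][t] for i in range(k)) for t in range(n)]
    return w, factor


# Toy data: two series sharing a common component and one unrelated series.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 4.0, 6.0, 8.0, 10.0]
c = [1.0, -1.0, 0.0, -1.0, 1.0]
loadings, factor = first_principal_component([a, b, c])
```

In this toy case the first component loads equally on the two related series and not on the third. With real data, a group of series with strongly correlated idiosyncratic components can capture one of the first eigenvectors in the same way, which is the risk described above.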
46 The Eurocoin indicator published by the CEPR is derived from this methodology.
47 See Angelini et al. [2001b].
48 Boivin & Ng [2003].