7. Guidelines on key reporting practices
7.2 Presentation of series breaks
252. A time series is a set of regular time-ordered observations of a quantitative characteristic of an individual or collective phenomenon taken at successive periods / points of time. Normally, though not always, the observations are made at constant intervals (months, quarters, years, etc.). The continuity of a time series not only implies that the observations are continuous over time but also that the same definitions, classifications, processes, etc., have been applied in the collection and compilation of each observation.
253. The application of inconsistent definitions and classifications, etc., for each observation over time, in theory, constitutes a measurement break in a time series (referred to as a “series break” hereafter). There are two different reasons for a time series break:
• measurement changes such as changes to classifications, estimation methodology, survey scope, etc.;
• real world changes induced by real events such as government policy (e.g. the introduction of a new tax), war and natural disaster, etc.
It is necessary to distinguish these two types of time series break. In this Handbook the term “series break” refers only to a time series break induced by a measurement change.
254. However, in reality not all changes to concepts, etc., constitute a series break that has a significant impact on the use of the time series. Statistical agencies responsible for the collection of data frequently change the questionnaires, registers and concepts used in their monthly, quarterly and annual collections, and many of these changes have no appreciable impact on the continuity of the series. Changes to annual collections, in particular, are a fact of life.
255. Statistical agencies, analysts and government agencies use time series data for economic and social research, and business cycle analysis to interpret current economic events. Statistical agencies require long time series to carry out seasonal adjustment and calendar effect correction. Time series are also fed into models to produce projections and forecasts about future economic and social conditions. For these reasons, users in national agencies and international organisations attach very high importance to time series continuity. In fact, such continuity within a series is often considered of greater importance than comparability between countries.
256. However, the uses of time series statistics outlined above are frequently hampered by series breaks or by the shortness of the series. The main causes of series breaks are similar to some of the reasons for revising data described in Section 7.1.1 above, such as:
• changes to the base year, which may coincide with updating of the weighting system, which in turn may involve changes in the sample of respondents and the sample of products; and
• the implementation of changes in concepts, definitions and classifications, methodology, sampling, estimation.
257. To a large extent, these factors derive from within the statistical agency responsible for the initial compilation of the data and are usually intentional (US Bureau of Labor Statistics 1996). However, some changes stem from external influences that may be outside the control of the statistical agency, in particular, where the data are derived from administrative sources. These include changes in laws or administrative procedures, changes in the organisational structure of business through mergers, etc.
7.2.2 Approaches to minimising the impact of time series breaks
258. National statistical agencies normally attempt to minimise the frequency of series breaks, and when they occur, use a number of approaches to reconstruct series based on the new concepts, classification, etc.
• The most commonly used approach involves the compilation of the series using both the old and new methods, classifications, etc., for a specified period around the time of implementation. However, the high cost of compiling dual series severely restricts their availability and length. The availability of dual information enables the impact of the change to be measured objectively and, perhaps, a concordance to be established between the new and old series at the time of the series break. The concordance “coefficient” so calculated may be used to splice or link the series across the break. Caution is required in applying such a coefficient to the historic time series, as it is only really applicable over the period for which dual series were compiled. It may not reflect the economic or social reality of the entire historical series (BEA 1993). The difficulty lies in determining when, or how far back, the conversion coefficient ceases to be accurate.
• Alternatively, agencies may refer back to highly disaggregated data (or even unit record information) and recompile the series based on the new methodology, etc. In practice however this approach is also very labour-intensive and may only be possible for key highly aggregated series (OECD 2000). Finally, historical estimates may be made on the basis of a related indicator that exhibits the same or similar changes over time as the series where the series break occurred.
7.2.3 Recommended practices for the presentation and reporting of information about series breaks
259. Recommended practice with regard to time series breaks entails:
• The compiling agency taking all possible steps to avoid and minimise changes to questionnaires, definitions and classifications used to collect and compile data. Methodologies should be developed to reduce the frequency of revisions.
However, there comes a time when the time series may be disrupted even when outdated classifications, concepts and questionnaires are maintained. In such instances a complete break in series may be preferred to series that continue to be collected on the basis of outmoded classifications and concepts that do not approximate reality. There is clearly a tradeoff between costs imposed by breaking a time series on one hand and the benefits from improving the relevance of the time series on the other (BEA 1993).
• Where significant breaks in a time series are unavoidable, users should be given warning well in advance of the implementation of the series break, outlining the timing of implementation and giving a detailed explanation of the reason(s) for the change. “In advance” is taken to mean not just the time of implementation but sufficient time to enable users to implement modifications to their systems, programmes or databases and to seek further clarification if necessary. A common practice adopted by many statistical agencies is to issue a detailed discussion paper many months in advance of the change.
DATA AND METADATA REPORTING AND PRESENTATION HANDBOOK - ISBN 92-64-03032-8 - © OECD 2007 106
• Actual breaks in the series should be clearly identified in both the statistical table and any accompanying graphs. A variety of methods are commonly used by national agencies and international organisations to highlight in tables that a series break has actually occurred. These include the insertion of a line in the table at the break point, inclusion of a footnote, or tabular presentation as an entirely new series. Whichever method is adopted, the main point is that the break is completely clear to users. Consideration will also need to be given to the identification of series breaks (together with appropriate explanatory information) in data disseminated electronically such as via on-line databases, etc.
The following information drawn directly from Eurostat guidelines should also be provided (Eurostat 2003c, p. 16):
o the reference period of the survey where the break occurred;
o whether or not the difference reported is one-off with limited implications for the time series and / or if the reported change led to harmonisation with any standards;
o a precise outline of the difference in concepts and methods of measurement before and after the series break;
o a description of the cause(s) of the difference, e.g. changes in classification, in statistical methodology, statistical population, methods of data transformation, concepts, administrative procedures with regard to data from administrative sources;
o an assessment of the magnitude of the effect of the change, where possible, with a quantitative measure.
Links and references to more detailed information should also be provided.
• Points in line graphs should not be joined across discontinuities in data. The reason for the series break should be explained in a footnote accompanying the graph with appropriate links or references to more detailed explanations of the causes of the breaks.
• When methodological changes are introduced, an attempt should be made to revise the historical series as far back as data and available resources permit. Ideally, such backcasting should extend back 2-3 years to reflect the new methodology, etc.
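Where dual series exist for an overlap period, one way of carrying the new basis back is the concordance coefficient described in para. 258. The following is a minimal sketch only, assuming a multiplicative (ratio) link averaged over the overlap; the Handbook does not prescribe a formula, and the function names and figures are hypothetical. As noted in para. 258, the coefficient is only really applicable over the period for which dual series were compiled, so the further back it is applied the less reliable the result.

```python
from statistics import mean


def linking_coefficient(old_overlap, new_overlap):
    """Average ratio of new-basis to old-basis values over the overlap period.

    Assumption: a ratio link is appropriate; old-basis values must be non-zero.
    """
    if len(old_overlap) != len(new_overlap) or not old_overlap:
        raise ValueError("overlap values must be non-empty and of equal length")
    return mean(n / o for o, n in zip(old_overlap, new_overlap))


def splice(old_series, new_series, overlap):
    """Rescale old-basis observations and join them to the new-basis series.

    old_series, new_series: dicts mapping period label -> value.
    overlap: period labels compiled on both the old and the new basis.
    Returns the linked series and the coefficient used.
    """
    coeff = linking_coefficient(
        [old_series[p] for p in overlap],
        [new_series[p] for p in overlap],
    )
    # Only periods not available on the new basis are backcast.
    backcast = {p: v * coeff for p, v in old_series.items() if p not in new_series}
    return {**backcast, **new_series}, coeff


if __name__ == "__main__":
    # Illustrative figures only, not drawn from any real survey.
    old = {"2001": 100.0, "2002": 104.0, "2003": 110.0}
    new = {"2003": 121.0, "2004": 126.0}
    linked, coeff = splice(old, new, overlap=["2003"])
    for period, value in linked.items():
        print(period, round(value, 1))
```

Returning the coefficient alongside the linked series allows it to be reported to users as the quantitative measure of the effect of the change.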
260. An example of recommended practice of the systematic presentation of metadata on changes over time is provided by Statistics Canada’s Integrated Metadatabase (IMDB) referred to in para. 174 above. That NSI’s corporate metadata repository incorporates the time dimension in the metadata model used which allows users to systematically view changes to variable definitions, classifications, methodology, etc., both in summary form and with links to very detailed information regarding each change. An example of a summary of changes over time to the Canadian Labour Force Survey is provided below in Figure 11.
Figure 11: Example of provision of metadata on changes over time: Statistics Canada’s Integrated Metadatabase (IMDB)
Source: Statistics Canada, Integrated Metadatabase (IMDB), available at
http://www.statcan.ca/cgi-bin/imdb/p2SV.pl?Function=getMainChange&SurvId=3701&SurvVer=0&InstaId=13986&SDDS=3701&lang=en&db=IMDB&dbg=f&adm=8&dis=2
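For data disseminated electronically, the items of information listed in para. 259 could be held in a machine-readable record attached to the series. The sketch below is illustrative only: the field names are hypothetical and are not taken from Eurostat, Statistics Canada or any other agency's system. It also assumes that the reference period of a break is the first period compiled on the new basis. The helper splits a series into segments at each break so that, for example, a charting tool draws each segment separately and does not join points across the discontinuity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SeriesBreak:
    """Information to accompany a series break (hypothetical field names)."""
    reference_period: str            # period in which the break occurred
    one_off: bool                    # limited implications for the time series?
    harmonised_with_standard: bool   # did the change lead to harmonisation?
    difference: str                  # concepts / methods before and after
    cause: str                       # e.g. change in classification
    magnitude: Optional[float] = None  # quantitative measure, where possible
    more_info: str = ""              # link or reference to further detail


def split_at_breaks(observations, breaks):
    """Split (period, value) pairs into segments that do not span a break.

    A new segment starts at each break's reference period (assumed to be the
    first period on the new basis).
    """
    starts = {b.reference_period for b in breaks}
    segments = []
    for period, value in observations:
        if not segments or period in starts:
            segments.append([])
        segments[-1].append((period, value))
    return segments


if __name__ == "__main__":
    # Illustrative figures only.
    brk = SeriesBreak(
        reference_period="2003",
        one_off=True,
        harmonised_with_standard=False,
        difference="Survey scope extended",
        cause="Change in statistical population",
        magnitude=0.10,
    )
    obs = [("2001", 100.0), ("2002", 104.0), ("2003", 121.0), ("2004", 126.0)]
    for segment in split_at_breaks(obs, [brk]):
        print(segment)
```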