
3.3 Cross-country and matched datasets in Europe – overview

3.3.2 Examples of cross-country (and) matched datasets in Europe

3.3.2.2 The Competitiveness Research Network (CompNet)

CompNet is a network set up by the European Central Bank (ECB) in March 2012 that includes all national central banks within the EU. International organisations also participate. In addition, international scholars specialising in competitiveness issues support the Network.

CompNet is meant to improve the existing frameworks and indicators of competitiveness in all dimensions (macro, micro and cross-border). Additionally, the Network is trying to establish a better connection between identified competitiveness drivers and resulting outcomes (trade, aggregate productivity, employment, growth and welfare), also by building a bridge between micro and macro analysis, in order to support the design of adequate policies.

On the micro level, the research conducted within the Network has confirmed the importance of firm-level factors (such as size, ownership and technological capacity) in understanding the drivers of aggregate performance. It has also developed a centralised project to compute cross-country homogeneous indicators of labour and total factor productivity, and to analyse the role of resource reallocation in increasing aggregate productivity.

CompNet is organised in three work streams related to:

1. Aggregate measures of competitiveness;

2. Firm-level studies;

3. Global value chains (GVCs).

One of the main policy questions addressed by CompNet is how aggregate productivity can be enhanced. As discussed earlier, a thorough analysis of competitiveness in different countries is best done using firm-level data because firms are very heterogeneous. Information on firm-level drivers of competitiveness is therefore lost when working with country- or sector-level aggregates. However, because of confidentiality restrictions, the necessary firm-level datasets are not readily available across countries. Nevertheless, in many European countries the micro-level data can be accessed from within the respective countries. Exploiting this fact, CompNet has opted to employ the Distributed Micro-data Approach (DMD) (see section 3.1.6) in order to compute different indicators of competitiveness at the micro level.
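The logic of a distributed micro-data approach can be illustrated with a minimal sketch. It is not CompNet's actual code: the function name `common_algorithm`, the record fields and the sector codes are hypothetical, and labour productivity is simplified to value added per employee. The point is only that the same routine runs on confidential firm-level records inside each country, and that nothing but sector-level aggregates leaves the country.

```python
from statistics import mean, median


def common_algorithm(firm_records):
    """Run inside one country; return only sector-level aggregates."""
    by_sector = {}
    for rec in firm_records:
        # Simplified labour productivity: value added per employee
        productivity = rec["value_added"] / rec["employees"]
        by_sector.setdefault(rec["sector"], []).append(productivity)
    return {
        sector: {
            "n_firms": len(values),
            "mean_lp": mean(values),
            "median_lp": median(values),
        }
        for sector, values in by_sector.items()
    }


# Hypothetical confidential micro-data held by two country teams
country_a = [
    {"sector": "C10", "value_added": 500.0, "employees": 10},
    {"sector": "C10", "value_added": 900.0, "employees": 30},
    {"sector": "G47", "value_added": 200.0, "employees": 5},
]
country_b = [
    {"sector": "C10", "value_added": 1200.0, "employees": 20},
]

# Only the aggregated output is pooled centrally
indicators = {
    "A": common_algorithm(country_a),
    "B": common_algorithm(country_b),
}
```

A real implementation would also need disclosure checks (for example, suppressing cells with very few firms), which this sketch leaves out.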

As such, CompNet has created an active network of country teams that independently run a common algorithm to compute a large number of competitiveness indicators. The CompNet firm-level indicator database is superior to others available because of: (i) coverage (58 2-digit, NACE Rev. 2, manufacturing and non-manufacturing sectors in 13 EU countries); (ii) time horizon (2002-2010), since it includes the recent boom-bust cycle; and (iii) cross-country comparability. The first round of the so-called Do-File exercise has been completed and the second round is underway. Research output of the network can be accessed via:

3.3.2.3 Combined ﬁrm data in Germany (KombiFiD)

The German KombiFiD project was a feasibility study to assess the potential, the obstacles and the benefits of matching official micro-level data from different institutions in Germany, also with regard to a future replication of such an effort on a larger scale or in different contexts. A unique business micro-dataset (also called KombiFiD) was created. This effort and the resulting dataset were expected to provide “enhanced information background for entrepreneurial decision-making” and to reduce the “respondent burden for businesses in official surveys and notification procedures” (see http://fdz.iab.de/en/FDZ_Projects/kombifid.aspx). Matching data on firms from different sources was also expected to yield additional information, eg for scientific research or for policymakers, by combining information formerly only available separately. The project started in January 2008 and finished at the end of 2010, with the dataset for researchers released in early 2011 (see Biewen et al, 2012, for an overview).

The micro-data involved includes both survey and process-generated data. In particular, several Federal Statistical Office datasets were used, such as the Business Register, the Cost Structure Survey, different tax statistics and the Structure of Earnings Survey. From the Federal Employment Office, the Establishment History Panel (BHP) was added to the study, and the Deutsche Bundesbank provided its firm-level database on ‘Foreign Direct Investment Stock Statistics and Financial Statements’. For a complete list of datasets and for more detailed information on these datasets see http://fdz.iab.de/en/FDZ_Projects/kombifid.aspx.

A major challenge of the KombiFiD project was that German legislation (ie the Federal Data Protection Act) in principle does not allow the linking of the micro-level data of businesses or individuals without the explicit written consent of the affected firms or individuals. Thus, although the technical process of matching the data (ie linking the information contained in the different datasets by using common identifiers) has been quite straightforward, the requirement to obtain consent of the firms involved generated a high level of complexity. As it was not possible to include all businesses in Germany, a sample of 54,960 firms was selected. For a detailed description of the selection of the sample see Gruhl et al (2012, p. 7f).

These firms were asked for their consent to matching the available information in the respective databases. From that sample, nearly 31,000 firms responded, and 16,571 responses were positive, corresponding to an acceptance rate of 30.7 percent (see Vogel and Wagner, 2012, p. 3). The information from the different datasets on these firms was then matched using the available common identifiers, and the result constitutes the KombiFiD dataset.

Technically, the linking of the information from the different datasets was realised via common identifiers jointly available across the different sources and via record linkage techniques. The basic dataset for linking data from the Statistical Offices and the Federal Employment Office is the Business Register, which has been constructed since the 1990s in Germany (and in other European countries due to EU legislation; see footnote 68). The Business Register contains several firm identifiers: a unique Business Register ID, the establishment numbers of all corresponding establishments and tax numbers (see Gruhl et al, 2012, pp. 10-15, for a detailed assessment of this matching process).
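Linking through common identifiers can be sketched as a simple keyed join, with the Business Register acting as the bridge between sources. The sketch below is illustrative only: the field names (`br_id`, `tax_number`, `establishment_numbers`), the toy values and the helper `link_firm` are assumptions and do not reproduce the actual KombiFiD procedure.

```python
# Hypothetical Business Register: one entry per firm with its identifiers
business_register = [
    {"br_id": "BR1", "tax_number": "T100", "establishment_numbers": ["E1", "E2"]},
    {"br_id": "BR2", "tax_number": "T200", "establishment_numbers": ["E3"]},
]

# Hypothetical source datasets, each keyed by a different identifier
tax_statistics = {"T100": {"turnover": 5.0}, "T200": {"turnover": 2.5}}
establishment_panel = {
    "E1": {"employees": 40},
    "E2": {"employees": 15},
    "E3": {"employees": 8},
}


def link_firm(entry):
    """Combine tax and establishment data for one Business Register entry."""
    tax = tax_statistics.get(entry["tax_number"], {})
    # A firm can have several establishments; sum their employment
    employees = sum(
        establishment_panel[number]["employees"]
        for number in entry["establishment_numbers"]
        if number in establishment_panel
    )
    return {
        "br_id": entry["br_id"],
        "turnover": tax.get("turnover"),
        "employees": employees,
    }


linked = [link_firm(entry) for entry in business_register]
```

Because every source shares at least one identifier with the register, the match is deterministic and no similarity scoring is needed.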

Matching data from the Deutsche Bundesbank was less straightforward. As no common identifiers are available between the datasets described above and the data to be used from the Bundesbank, record linkage techniques based on the firms’ names and addresses were used (see Koch and Neugebauer, 2014, for a more thorough description).
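Record linkage on names and addresses typically relies on string normalisation and similarity scoring. The following sketch uses Python's standard `difflib.SequenceMatcher` as a stand-in; it is not the method described by Koch and Neugebauer (2014), and the equal weighting, the 0.85 threshold, the helper names and the example firms are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalise(text):
    """Lower-case and strip punctuation so spelling variants compare closely."""
    cleaned = text.lower().replace(".", " ").replace(",", " ")
    return " ".join(cleaned.split())


def similarity(a, b):
    """Similarity in [0, 1] between two normalised strings."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def best_match(record, candidates, threshold=0.85):
    """Return the most similar candidate, or None if below the threshold."""
    best, best_score = None, 0.0
    for cand in candidates:
        # Assumed scoring rule: equal weight on name and address
        score = 0.5 * similarity(record["name"], cand["name"]) \
            + 0.5 * similarity(record["address"], cand["address"])
        if score > best_score:
            best, best_score = cand, score
    return best if best_score >= threshold else None


# Hypothetical register entries and a record without a common identifier
register = [
    {"br_id": "BR1", "name": "Muster Maschinenbau G.m.b.H.",
     "address": "Hauptstrasse 12, 60311 Frankfurt"},
    {"br_id": "BR2", "name": "Beispiel Handel AG",
     "address": "Bahnhofstr. 3, 10115 Berlin"},
]
unlinked_record = {"name": "Muster Maschinenbau GmbH",
                   "address": "Hauptstr. 12, 60311 Frankfurt"}

match = best_match(unlinked_record, register)
```

In practice the threshold trades off false matches against missed matches, and borderline cases are usually reviewed manually.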

The resulting KombiFiD dataset contains all the information from its constituent datasets for the firms which agreed to the matching of their data. A detailed description and lists of variables are available in Gruhl et al (2012, pp. 21-85). The data is accessible to external researchers in a weakly anonymised version (see footnote 69).
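Weak anonymisation of this kind, in which regional and sectoral identifiers are only released in aggregate form, can be illustrated as a coarsening of codes. The sketch is hypothetical: the aggregation levels actually used for KombiFiD are not stated here, so the 2-digit sector level, the one-digit region level and the field names are assumptions.

```python
def weakly_anonymise(firm):
    """Return a copy of the record with coarsened sector and region codes."""
    released = dict(firm)
    # Assumed rule: keep only the first two digits of a 4-digit sector code
    released["sector"] = firm["sector"][:2]
    # Assumed rule: keep only the first digit of a 5-digit regional code
    released["region"] = firm["region"][:1]
    return released


# Hypothetical matched record before release to external researchers
original = {"sector": "1071", "region": "60311", "employees": 55}
released = weakly_anonymise(original)
```

The other variables are passed through unchanged, which is why the data remains useful for analysis while individual firms become harder to identify.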

In general, a broad range of issues can be examined using the KombiFiD data. With regard to the analysis of competitiveness, the dataset contains a comprehensive set of variables from the different sources, allowing evidence to be generated on, inter alia, growth, productivity, trade or employment. Up to now, however, the dataset has been only sparsely used in economic and statistical analyses. Exceptions are the papers by Wagner (2012 and 2012a) and Vogel and Wagner (2012), of which only Wagner (2012a) goes beyond methodological aspects. This relatively scant utilisation of the potentially very rich KombiFiD data can first be attributed to the fact that the data has been made available to the public only quite recently.

It may, however, also be attributed to the fact that the data has some major drawbacks. First and foremost, the use of the KombiFiD data was restricted until 31 December 2014, which made serious utilisation of the data very difficult. To our knowledge, the data has to be erased completely from the servers of the data providers after that date, making research projects or even working papers nearly impossible because results cannot be verified afterwards. Another serious drawback of the data itself is that no information is available about the firms from the original sample that refused consent for their data to be matched for the project. As a result, nothing is known about potential selection bias, which makes thorough analyses hard to carry out.

68. Council Regulation No. 2186/93.

69. This type of anonymisation means that some variables, eg regional and sectoral identifiers, are only available in an aggregate form.

Wagner (2012) and Vogel and Wagner (2012) performed tests on the quality of the KombiFiD sample for the manufacturing and the service industries on the basis of data from the Statistical Offices. They conclude that the quality of the KombiFiD sample can be regarded as high only for the former West Germany, whereas for the former East Germany an assessment of quality is not possible because of the small sample size.

Ultimately, the KombiFiD project was a huge and ambitious effort with very meaningful objectives, ie creating a ‘new’ dataset building on existing information and thus sparing firms from participating in further surveys. The expectation was also to evaluate the future potential of similar projects.

The expectations have only partially been met, and the main drawbacks can be traced back to existing legal regulations preventing deeper cooperation or even exchange of data between data providers. A relatively large sample was drawn and, even taking into account the need to obtain consent from the selected firms, the response rate was relatively high and the acceptance rate exceeded 30 percent. Nevertheless, strict regulations prevent reasonable use of the data: first, the time window for using the data is limited and, second, the nature of any potential selection bias is unknown.

In summary, the KombiFiD project generated much new knowledge on the technical aspects of data matching, experience with regard to firm behaviour and practical knowledge about cooperation between different data-providing institutions. Hopefully, future projects will be set up in order to proceed in this promising direction.

In document Mapping competitiveness with European data Bruegel Blueprint 23, 6 March 2015 (Page 108-112)