Dynamic Panel Data (DPD) Estimators - Table 6:5 SME definition by EU (Commission Recommendation

Table 6:5 SME definition by EU (Commission Recommendations, 2003)

6.3.1.4 Dynamic Panel Data (DPD) Estimators

An alternative is provided by a series of papers by Harris and Moffat. They argue that estimating $\omega_{it}$ without other covariates may yield biased estimates because of an omitted variable problem. They therefore include all variables that may cause changes in productivity and estimate the logarithmic Cobb-Douglas productivity equation with the system GMM estimator.


Given that the primary goal of this study is to investigate the effect of SBRR, other variables, which were defined in the previous section (6.3.1.3), are also included. Briefly, SBRR refers to the Small Business Rates Relief and its lags for each firm i in period t. The variables in vector X were defined in Section 6.3.1.3. In short, $\rho$ captures the initial effects of receiving any relief or the uplift in relief, irrespective of level. Other variables are firm age (a), the regions (r) and broad sectors (s), immediate foreign ownership (IO), ultimate foreign ownership (FO) and a high growth firm dummy (HGF). To at least partly control for competition, the model includes Marshall specialisation (PS), Jacobs diversity (PD) and the Herfindahl-Hirschman Index (HHI).

Hence, a logarithmic variable, $X_{it}$, is introduced:

$$X_{it} = \Big( \sum_{j=0}^{4} SBRR_{t-j},\ \rho_{it},\ a_{it},\ r_{it},\ s_{it},\ PS_{it},\ PD_{it},\ HHI_{it},\ R\&D_{it},\ HGF_{it},\ FO_{it},\ IO_{it} \Big)$$

Then, the model becomes:

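$$\ln GVA_{it} = \beta_L \ln L_{it} + \beta_k \ln K_{it} + \partial_K \ln K_{i(t-1)} + \partial_M \ln M_{i(t-1)} + \partial_L \ln L_{i(t-1)} + \partial_X \ln X_{it} + z_{it}\mu + \xi_{it} + e_{it}$$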

To at least partly deal with unobserved heterogeneity, it is common to apply the within (demeaning) transformation, as in one-way fixed effects models, or to take first differences if the second dimension of the panel is a time series, as carried out by Harris and Moffat (2016) and Harris et al. (2015). Given the large observed time and relatively small number of firms, the lag feature was exploited within this analysis by converting the productivity function to a dynamic form, that is, by controlling for the lagged dependent variable and estimating the equation with the system GMM estimator:

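$$\ln GVA_{it} = \beta_{GVA} \ln GVA_{i(t-1)} + \beta_L \ln L_{it} + \partial_L \ln L_{i(t-1)} + \beta_k \ln K_{it} + \partial_K \ln K_{i(t-1)} + \partial_M \ln M_{i(t-1)} + \partial_X \ln X_{it} + z_{it}\mu + \xi_{it} + e_{it}$$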

A key feature of DPD models is their use of first differencing to remove unobserved heterogeneity. Nickell (1981) shows that the demeaning process creates a correlation between the regressor and the error because it subtracts the individual’s mean value of the outcome variable and of each independent variable from the respective variable. This correlation biases the coefficient on the lagged dependent variable. In other words, the mean of the lagged dependent variable contains the observations on y from period 0 through the penultimate period, and the mean error, which is subtracted from each error term, contains contemporaneous values of the error. One solution is to take first differences of the original model.
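The bias described by Nickell (1981) can be illustrated with a small simulation. The sketch below is not part of the study: the data are simulated, all names and parameter values (500 firms, 6 periods, a true coefficient of 0.5) are assumptions chosen for illustration, and the model is reduced to a pure autoregression with a firm effect.

```python
import numpy as np

def simulate_panel(n_firms=500, n_periods=6, rho=0.5, seed=0):
    """Simulate y_it = rho * y_i(t-1) + alpha_i + e_it (hypothetical data)."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=n_firms)                 # firm fixed effects
    y = np.empty((n_firms, n_periods + 1))
    y[:, 0] = alpha / (1.0 - rho) + rng.normal(size=n_firms)
    for t in range(1, n_periods + 1):
        y[:, t] = rho * y[:, t - 1] + alpha + rng.normal(size=n_firms)
    return y

def within_estimate(y):
    """Within (demeaned) OLS estimate of the coefficient on the lagged outcome."""
    y_lag, y_cur = y[:, :-1], y[:, 1:]
    # Demeaning uses each firm's own time average, which mixes all periods
    y_lag_dm = y_lag - y_lag.mean(axis=1, keepdims=True)
    y_cur_dm = y_cur - y_cur.mean(axis=1, keepdims=True)
    return float((y_lag_dm * y_cur_dm).sum() / (y_lag_dm ** 2).sum())

y = simulate_panel()
rho_within = within_estimate(y)
print(f"true rho = 0.5, within estimate = {rho_within:.3f}")
```

With a short panel the within estimate should fall well below the true value, and adding firms does not remove the gap because the bias depends on the number of periods rather than the number of firms.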

6.3.1.4.1 Anderson–Hsiao (AH) estimator

However, there is still a correlation between the differenced lagged dependent variable and the disturbance process, which is now a first-order moving average, because the disturbance contains the lagged error term. At this point, the Anderson–Hsiao estimator can be used: it removes the individual fixed effects by differencing and then applies an instrumental variables estimator, constructing instruments for the lagged dependent variable from its second or third lags. Anderson and Hsiao’s (1982) approach is based on differencing the original equation,

$$\ln GVA_{it} = \beta_{GVA} \ln GVA_{i(t-1)} + \beta_j J_{it} + \alpha_i + e_{it},$$

where $J$ is all independent variables in the model and $\alpha_i$ is the individual specific fixed effect. This model in AH’s framework becomes:

$$\ln GVA_{it} - \ln GVA_{i(t-1)} = \beta_{GVA} \left( \ln GVA_{i(t-1)} - \ln GVA_{i(t-2)} \right) + \beta_j \left( J_{it} - J_{i(t-1)} \right) + e_{it} - e_{i(t-1)}$$

This cancels the individual effects assumed to correlate with the exogenous variables, but the difference of the lagged endogenous variable is correlated with the error term ($e_{it} - e_{i(t-1)}$).

AH suggest instrumenting $(\ln GVA_{i(t-1)} - \ln GVA_{i(t-2)})$ with the lagged difference $(\ln GVA_{i(t-2)} - \ln GVA_{i(t-3)})$ or with the level instrument $\ln GVA_{i(t-2)}$, because these should not be correlated with the differenced error term: $E(y_{i(t-2)} \Delta e_{it}) = 0$ and $E(\Delta y_{i(t-2)} \Delta e_{it}) = 0$, where $y$ denotes $\ln GVA$ and $\Delta$ the first difference.

Later, Holtz-Eakin et al. and Arellano (1989) found the level instrument ($\ln GVA_{i(t-2)}$) to be superior because it had both a smaller variance and no points of singularity. Furthermore, when level instruments are used, one fewer period is lost because of lags.
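A minimal sketch of the level-instrument version of the AH estimator follows. It assumes a stripped-down model with no regressors other than the lagged outcome, and the simulated data, function names and parameter values are all hypothetical rather than taken from the study.

```python
import numpy as np

def simulate_panel(n_firms=5000, n_periods=6, rho=0.5, seed=0):
    """Simulate y_it = rho * y_i(t-1) + alpha_i + e_it (hypothetical data)."""
    rng = np.random.default_rng(seed)
    alpha = 0.5 * rng.normal(size=n_firms)           # firm fixed effects
    y = np.empty((n_firms, n_periods + 1))
    y[:, 0] = alpha / (1.0 - rho) + rng.normal(size=n_firms)
    for t in range(1, n_periods + 1):
        y[:, t] = rho * y[:, t - 1] + alpha + rng.normal(size=n_firms)
    return y

def anderson_hsiao(y):
    """Simple IV estimate: instrument the lagged difference with y_i(t-2)."""
    dy = np.diff(y, axis=1)      # first differences remove alpha_i
    dy_cur = dy[:, 1:]           # differenced outcome, periods 2..T
    dy_lag = dy[:, :-1]          # differenced lagged outcome (endogenous)
    z = y[:, :-2]                # twice-lagged level, the instrument
    # Just-identified IV: ratio of instrument-outcome to instrument-regressor moments
    return float((z * dy_cur).sum() / (z * dy_lag).sum())

y = simulate_panel()
rho_ah = anderson_hsiao(y)
print(f"true rho = 0.5, Anderson-Hsiao estimate = {rho_ah:.3f}")
```

Because only one instrument is used per equation, the estimate can be noisy in small samples, which is part of the motivation for the GMM estimators discussed next.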

6.3.1.4.2 Arellano–Bond (AB) or Difference Estimator

In empirical work on productivity since Wooldridge’s (2009) one-step estimation was introduced, the Generalised Method of Moments, first suggested by Holtz-Eakin et al. and popularised by Arellano and Bond (1991), has become increasingly popular. The main idea behind the estimator is that the instrumental variables approach does not use all available information, so by including more information, more efficient estimates may be found. They separate $J_{it}$ into two parts, $J_{2it}$ and $J_{3it}$, where $J_{2it}$ consists of strictly exogenous regressors and $J_{3it}$ consists of predetermined regressors (which could include lags of $\ln GVA$ as well) and endogenous regressors possibly correlated with the unobserved individual effect. First differencing, as performed in the AH estimator, also removes the individual effects and the associated omitted variable bias:

$$\ln GVA_{it} - \ln GVA_{i(t-1)} = \beta_j \left( J_{2it} - J_{2i(t-1)} \right) + \beta_j \left( J_{3it} - J_{3i(t-1)} \right) + e_{it} - e_{i(t-1)}$$

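In standard 2SLS, and hence in the AH estimator, the first observation is lost when the twice-lagged level is used in the instrument matrix:

$$\begin{pmatrix} . \\ \ln GVA_{i1} \\ \vdots \\ \ln GVA_{i(t-n)} \end{pmatrix},$$

where n is all lagged periods and a dot denotes a missing value.

If the third lag is also included as an instrument, two observations are lost:

$$\begin{pmatrix} . & . \\ \ln GVA_{i1} & . \\ \ln GVA_{i2} & \ln GVA_{i1} \\ \vdots & \vdots \\ \ln GVA_{i(t-2)} & \ln GVA_{i(t-3)} \end{pmatrix}$$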

To reduce the loss of degrees of freedom, AB constructs a set of instruments from the second lag of 𝑙𝑛𝐺𝑉𝐴 , one instrument pertaining to each time period:

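$$\begin{pmatrix} 0 & 0 & \cdots & 0 \\ \ln GVA_{i1} & 0 & \cdots & 0 \\ 0 & \ln GVA_{i2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \ln GVA_{i(t-2)} \end{pmatrix}$$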

The columns of this instrument matrix are orthogonal to the transformed errors because the resulting moment conditions correspond to the expectation $E\left[ \ln GVA_{i(t-2)} \left( e_{it} - e_{i(t-1)} \right) \right] = 0$.

This solution implies that all available lags can be used as instruments. For endogenous variables, lags two and deeper are valid. For predetermined variables that are not strictly exogenous, the first lag is also valid, because such variables are only correlated with errors dated t-2 or earlier. Therefore, the instrument matrix becomes:

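$$\begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 & \cdots \\ \ln GVA_{i1} & 0 & 0 & 0 & 0 & 0 & \cdots \\ 0 & \ln GVA_{i2} & \ln GVA_{i1} & 0 & 0 & 0 & \cdots \\ 0 & 0 & 0 & \ln GVA_{i3} & \ln GVA_{i2} & \ln GVA_{i1} & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$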
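The block structure of this instrument matrix can be sketched in code. The function below is an illustrative assumption rather than the routine used in the study: it builds the matrix for a single firm from the levels of the outcome observed in periods 1 to T, with one row per differenced equation and a separate group of columns for each period.

```python
import numpy as np

def ab_instrument_matrix(y_levels):
    """Instrument block for one firm, using all lags dated t-2 or earlier.

    y_levels holds y_1, ..., y_T. Row k corresponds to period t = k + 2,
    so the first row (t = 2) has no valid instrument and stays at zero.
    """
    y = np.asarray(y_levels, dtype=float)
    T = y.size
    n_cols = (T - 2) * (T - 1) // 2        # 1 + 2 + ... + (T - 2) columns
    Z = np.zeros((T - 1, n_cols))
    col = 0
    for row in range(1, T - 1):
        lags = y[:row][::-1]               # y_(t-2), ..., y_1, most recent first
        Z[row, col:col + row] = lags
        col += row                          # each period gets its own columns
    return Z

# Hypothetical levels 1..5 make the position of each lag easy to read off
Z = ab_instrument_matrix([1.0, 2.0, 3.0, 4.0, 5.0])
print(Z)
```

Zeros replace missing values so that no row is dropped, which is why this layout loses fewer observations than the 2SLS instrument matrices shown earlier. The number of columns grows quadratically with the number of periods, so applied work often restricts the lag depth.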

6.3.1.4.3 Arellano–Bover and Blundell and Bond (ABBB) or system estimator

Later, Arellano and Bover (1995) and Blundell and Bond (1998) showed that lagged levels can be weak instruments for first-differenced variables, particularly if the variables follow a random walk, so they provide a modification which includes both lagged levels and lagged differences. The chief drawback of the system GMM estimator is that it requires additional restrictions on the initial conditions of the process generating the dependent variable.
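The two sets of moment conditions combined by the system estimator can be written out as follows. This is a sketch for the lagged dependent variable only, and it assumes that the firm effect is denoted $\alpha_i$, as in the Anderson–Hsiao subsection.

```latex
% Differenced equation: lagged levels serve as instruments
E\left[ \ln GVA_{i(t-s)} \left( e_{it} - e_{i(t-1)} \right) \right] = 0, \qquad s \geq 2

% Levels equation (added by the system estimator): lagged differences serve as instruments
E\left[ \left( \ln GVA_{i(t-1)} - \ln GVA_{i(t-2)} \right) \left( \alpha_i + e_{it} \right) \right] = 0
```

The second condition holds only if changes in the outcome are uncorrelated with the firm effect, which is where the restriction on the initial conditions enters.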

6.3.1.4.4 Instrumental variables in general

The motivation to use instrumental variables is to isolate “as good as random” variation in the treatment variable, so that selection and unobservable problems could be solved. Revisiting the final equation of the dynamic model:

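$$\ln GVA_{it} = \beta_{GVA} \ln GVA_{i(t-1)} + \beta_L \ln L_{it} + \beta_k \ln K_{it} + \partial_K \ln K_{i(t-1)} + \partial_M \ln M_{i(t-1)} + \partial_L \ln L_{i(t-1)} + \partial_X \ln X_{it} + z_{it}\mu + \xi_{it} + e_{it}$$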

Given that capital was measured, it is directly related to investment:

$$\beta_k \ln K_{it} = \beta_1 \ln I_{it} + e^{I}_{it} \;\rightarrow\; \beta_k \ln K_{it} = \beta_1 \ln I_{it} + \beta_2 \ln X_{2it} + e^{I}_{it}$$

The main concern is that $E(e_{it} e^{I}_{it} \mid \beta_k \ln I_{it}) \neq 0$, so the path diagram of the endogeneity problem is that $\beta_k \ln K_{it}$ affects $\ln GVA_{it}$ but the error term ($e_{it}$) influences both $\beta_k \ln K_{it}$ and $\ln GVA_{it}$.

The typical instrumental variable approach is to find variables that belong to the second equation but not the first one. Therefore, the following assumptions must hold: a valid, nontrivial first-stage coefficient ($\beta_2$) for observables that belong in the participation equation but not the outcome equation ($\ln X_{2it}$), and a valid exclusion restriction ($E[\beta_2 \ln X_{2it} \cdot e^{I}_{it} \mid \ln I_{it}] = 0$).
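The logic of the two assumptions can be illustrated with a generic two-stage least squares sketch on simulated data. Everything in it is hypothetical: `x` stands in for the endogenous input, `z` for a variable that appears only in the second equation, `common` for an unobserved shock that enters both error terms, and the coefficient values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
z = rng.normal(size=n)                         # excluded instrument (assumed valid)
common = rng.normal(size=n)                    # unobserved shock in both equations
x = 0.8 * z + common + rng.normal(size=n)      # first stage: nontrivial coefficient on z
y = 0.3 * x + common + rng.normal(size=n)      # outcome equation, true coefficient 0.3

def ols(X, target):
    """Least squares coefficients of target on the columns of X."""
    return np.linalg.lstsq(X, target, rcond=None)[0]

const = np.ones(n)
beta_ols = ols(np.column_stack([const, x]), y)[1]

# Stage 1: project the endogenous regressor on the instrument
Z = np.column_stack([const, z])
x_hat = Z @ ols(Z, x)
# Stage 2: regress the outcome on the fitted values
beta_iv = ols(np.column_stack([const, x_hat]), y)[1]

print(f"OLS = {beta_ols:.3f}, 2SLS = {beta_iv:.3f}, true = 0.300")
```

OLS absorbs the shared shock into the coefficient, whereas the second stage uses only the variation in `x` that is driven by `z`. Standard errors from a manual second stage are not valid and would need the usual 2SLS correction.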

6.3.1.4.5 Sensitivity Analysis - Different Transformations, Steps and Ways

The one-way, one-step system GMM estimator will be compared with the two-way, two-step system GMM estimator and the one-way, one-step difference estimator. As discussed at the beginning of this section, some of these are expected to reduce the number of observations and possibly to yield weaker relationships. For instance, the difference GMM approach transforms the data to remove the fixed effects and thereby reduce the inherent endogeneity. More specifically, it uses the first difference transformation. This seems a better choice than the within transformation (discussed in Appendix 10.2.2.1), as that transformation is likely to make each observation in the transformed data endogenous for a firm. The one shortcoming of the first difference transformation is that it magnifies gaps in unbalanced panels. If a value of a variable is missing in some period, then both the difference for that period and the difference for the following period will be missing in the transformed data. This motivates an alternative transformation: the forward orthogonal deviations (FOD) transformation, proposed by Arellano and Bover (1995).
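The difference between the two transformations in a panel with gaps can be sketched as follows. The FOD transformation subtracts from each observation the mean of all future available observations and rescales the result by the square root of n/(n+1), where n is the number of those future observations. The series and function names are hypothetical, and the transformed value is stored in the period it is computed for, which is one of several possible conventions.

```python
import numpy as np

def first_difference(x):
    """First differences of one firm's series; NaN marks a missing value."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.size, np.nan)
    out[1:] = x[1:] - x[:-1]               # NaN propagates to both neighbours
    return out

def forward_orthogonal_deviations(x):
    """FOD transform: deviation from the mean of all future available values."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.size, np.nan)
    for t in range(x.size):
        if np.isnan(x[t]):
            continue
        future = x[t + 1:]
        future = future[~np.isnan(future)]
        if future.size == 0:
            continue                        # nothing left to deviate from
        scale = np.sqrt(future.size / (future.size + 1))
        out[t] = scale * (x[t] - future.mean())
    return out

series = np.array([1.0, 2.0, np.nan, 4.0, 5.0])   # one gap in period 3
fd = first_difference(series)
fod = forward_orthogonal_deviations(series)
n_fd = int(np.isfinite(fd).sum())
n_fod = int(np.isfinite(fod).sum())
print(f"usable observations: first differences = {n_fd}, FOD = {n_fod}")
```

In this example the single gap removes two first differences but only one FOD observation, which is the reason the transformation is attractive for unbalanced panels.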

6.3.2 Survival Analysis

Up until now, Section 6.3 has focused on productivity. It defined the central issues in estimation (Subsection 6.3.1.1) and introduced two ways to estimate factors affecting productivity (6.3.1.2, 6.3.1.3 and 6.3.1.4). The other part of Section 6.3 is devoted to survival analysis, which was defined, and whose fundamental mechanisms were explained, in the Methodology Review Chapter (Section 5.2). That chapter concluded that the Cox proportional hazards (CPH) model is the preferred approach for this study. This advancement in survival analysis made it possible to reduce the number of limiting assumptions and to correct for various estimation flaws. It became one of the most frequently applied techniques in survival analysis because of its semi-parametric nature and theoretical foundations. The first part of this section will focus on implementation and on the tests that were employed to assess the model.

The Methodology Review Chapter (Section 5.3.1.2) proposed supplementing the more standard Cox regression with survival trees (ST). It also introduced the key concepts of ST. Section 7.3.2 further supplements this by extending the description of ST to left-truncated and right-censored data, which were found to be preferable for this analysis owing to the data structure and their ability to accommodate longitudinal data.

In document Policy evaluation with advanced analytics: non-domestic property tax reliefs (Page 175-180)