CHAPTER 1
2.4 A MONTE CARLO APPROACH
2.4.3 Simulation Results
Unbiasedness
For the unbiasedness of FE and PD, the mean absolute error (MAE) and the relative mean absolute error (RMAE) in parameter estimation are shown in Tables 2.4 to 2.8. Since FE and PD are both unbiased and consistent estimators, their MAEs are small in most cases, as expected: less than 0.01 given that the true value 𝛽0 equals 1.
For the homoscedastic case, in the panel setting (Table 2.4), the largest errors (0.01893 for FE and 0.01961 for PD) occur when the data is almost balanced and the within-cluster correlations of both regressors and errors are zero. In the clustered setting (Table 2.5), the largest errors (0.03925 for FE and 0.02807 for PD) appear in the same situation, except that the correlation of the errors is very high. The relative mean absolute errors are calculated as the ratios of the MAE of PD to that of FE and are summarized in Table 2.6. Most values are around 1, indicating similar results for FE and PD. For the heteroscedastic case, the results are similar. There seems to be no specific setting in which FE or PD outperforms the other. Given the small values of the MAEs, the differences are trivial.
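The two accuracy measures can be illustrated with a minimal sketch. It assumes that the MAE is the average of |estimate − 𝛽0| over the Monte Carlo replications, which the text does not state explicitly; the RMAE follows the text as the ratio of the PD value to the FE value. The function names and the four estimates per estimator are hypothetical and are not taken from the tables.

```python
from statistics import fmean


def mean_absolute_error(estimates, true_value):
    """Average absolute deviation of the replication estimates from the true value.

    Assumption: this is the MAE definition used in the simulation.
    """
    return fmean(abs(b - true_value) for b in estimates)


def relative_mae(estimates_pd, estimates_fe, true_value):
    """Ratio of the PD MAE to the FE MAE; values near 1 mean similar accuracy."""
    return (mean_absolute_error(estimates_pd, true_value)
            / mean_absolute_error(estimates_fe, true_value))


# Hypothetical slope estimates from four replications (illustrative only).
beta_true = 1.0
fe_hat = [1.02, 0.97, 1.01, 0.99]
pd_hat = [1.01, 0.98, 1.02, 0.99]

mae_fe = mean_absolute_error(fe_hat, beta_true)
mae_pd = mean_absolute_error(pd_hat, beta_true)
rmae = relative_mae(pd_hat, fe_hat, beta_true)

print(f"MAE(FE) = {mae_fe:.4f}, MAE(PD) = {mae_pd:.4f}, RMAE = {rmae:.3f}")
```

An RMAE below 1 would mean that PD is closer to the true value on average, and a value above 1 would favor FE.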
Efficiency
To compare the estimation efficiency of FE and PD, the change rates from FE to PD are calculated for both default i.i.d. standard errors and CRVE as
\[
\text{change rate}\,(\%) = \frac{se_{PD} - se_{FE}}{se_{FE}} \times 100.
\]
Note that a negative change rate indicates an efficiency improvement of PD over FE, while a positive one indicates an efficiency loss. The results for the homoscedastic and heteroscedastic cases are reported in Tables 2.9-2.10 and Table 2.11, respectively.
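As a small illustration of the sign convention, the change rate can be computed as follows. The function name and the standard error values are hypothetical and serve only to show a gain and a loss.

```python
def change_rate(se_pd, se_fe):
    """Percentage change in the standard error when moving from FE to PD."""
    return (se_pd - se_fe) / se_fe * 100.0


# Hypothetical standard errors (illustrative only).
gain = change_rate(se_pd=0.095, se_fe=0.100)  # PD smaller: negative rate, efficiency gain
loss = change_rate(se_pd=0.120, se_fe=0.100)  # PD larger: positive rate, efficiency loss

print(f"gain case: {gain:.1f}%, loss case: {loss:.1f}%")
```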
Homoscedastic Case
For i.i.d. standard errors, FE and PD are very similar in the clustered setting. In contrast, in the panel setting, the differences between FE and PD vary with 𝝆𝝁 and the unbalanced level 𝜏.
First, when there is no correlation among within-cluster errors (𝝆𝝁 = 0), PD always shows an efficiency improvement over FE: more than 5% when the data is moderately unbalanced or almost balanced (𝜏 = 0.51 or 0.91), and around 1.2% when the data is severely unbalanced (𝜏 = 0.19). Second, when the correlation among within-cluster errors is 0.5, PD exhibits an efficiency loss relative to FE of around 15% when the data is moderately unbalanced, and of around 3%-5% when the data is either severely unbalanced or almost balanced. Third, when within-cluster errors are highly correlated (𝝆𝝁 = 0.9), the pattern is mixed: PD improves on FE by 1%-2% when the data is severely or moderately unbalanced, but is worse than FE by less than 1% when the data is almost balanced. While the within-cluster correlation of the regressors 𝝆𝒙 makes no difference to any of these comparisons, the number of clusters 𝐺 does influence the magnitude of the differences to some extent. As 𝐺 increases, all differences tend to shrink by a small amount (0%-3%).
In the clustered setting, for small 𝐺 (less than 10), PD has a smaller CRVE than FE. Given 𝐺 = 2, this improvement peaks at 94% when the data is severely unbalanced, and is around 76% when the data is moderately unbalanced and 22% when it is almost balanced. This advantage shrinks very quickly as 𝐺 increases. At 𝐺 = 6, the improvements that remain are only around 1.6%, 4.5% and 3% at the three unbalanced levels. When 𝐺 is larger than 10, an efficiency loss sets in and grows. As 𝐺 grows further, the increase in the loss slows down and the CRVEs for FE and PD converge to their own limits. In this process, neither 𝝆𝝁 nor 𝝆𝒙 is significantly relevant. In the panel setting, the situation is very similar except that 𝝆𝝁 matters. When 𝝆𝝁 = 0.5, the advantage of PD over FE disappears faster for moderately unbalanced or almost balanced data, with an efficiency loss of 1%-2% already setting in at 𝐺 = 6. Generally, in these two cases (𝝆𝝁 = 0.5 with 𝜏 = 0.51 and 𝝆𝝁 = 0.5 with 𝜏 = 0.91), PD is less advantageous in the panel setting than in the clustered one in three respects: a smaller efficiency gain for small 𝐺, a faster decline in the efficiency gain as 𝐺 increases, and a larger efficiency loss for large 𝐺. When the data is severely unbalanced, 𝝆𝝁 = 0.5 also makes the efficiency loss larger than in the clustered setting.
Figures 2.1 and 2.2 show how the change rates vary as the number of clusters increases in different data settings, given 𝝆𝒙 = 0.5. For CRVE in all cases, PD does exhibit a faster convergence rate than FE; however, as the number of clusters increases, FE converges to a lower level than PD.
Heteroscedastic Case
Under the heteroscedastic assumption, the comparison using CRVE appears to be the same as in the homoscedastic case, except that a higher within-cluster correlation of the regressors favors PD over FE. For the default i.i.d. standard errors, PD and FE lead to similar results, except that when the within-cluster correlation of the regressors is high and the number of clusters is less than 10, PD generates values that are smaller by 1%-6%.
Rejection Rate
The rejection rate matters because it validates the use of the standard errors for inference. The rejection rates for both i.i.d. standard errors and CRVE in the panel and clustered settings are shown in Tables 2.11-2.14 for the homoscedastic case and in Tables 2.15-2.16 for the heteroscedastic case.
With homoscedastic errors, for both FE and PD, the i.i.d. standard errors have rejection rates very close to the nominal level of 0.05 (Tables 2.11-2.12). This is because, after mean-differencing, the within-cluster correlations in both regressors and errors are eliminated. The transformed data is essentially i.i.d., so the default standard errors are valid. Under heteroscedasticity, however, the i.i.d. standard errors always over-reject. In contrast, the CRVE over-rejects in both the homoscedastic and heteroscedastic cases, and the over-rejection becomes less severe as the number of clusters 𝐺 grows, similar to the findings of Cameron et al. (2008). In the heteroscedastic case, faster convergence is found to be associated with a very high within-cluster correlation of the regressors.
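For illustration, an empirical rejection rate could be computed along the following lines. This is only a sketch under stated assumptions: it tests the true null 𝛽 = 𝛽0 in each replication with a two-sided test and a standard normal critical value, whereas the simulation may have used a different reference distribution (for example a t distribution). The function name and the inputs are hypothetical.

```python
from statistics import NormalDist


def rejection_rate(estimates, std_errors, true_value, alpha=0.05):
    """Share of replications in which the true null hypothesis is rejected.

    Assumption: two-sided test with a standard normal critical value.
    """
    crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    rejections = sum(
        abs((b - true_value) / se) > crit
        for b, se in zip(estimates, std_errors)
    )
    return rejections / len(estimates)


# Hypothetical estimates and standard errors from four replications.
beta_hat = [1.00, 1.05, 0.90, 1.01]
se_hat = [0.02, 0.02, 0.04, 0.02]

rate = rejection_rate(beta_hat, se_hat, true_value=1.0)
print(f"empirical rejection rate: {rate:.2f}")
```

With valid standard errors and many replications, this rate should be close to the nominal level of 0.05; a clearly larger value corresponds to the over-rejection described above.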