Chapter 4 Research Design, Methodology and Data Description
4.4 The Models
4.4.2 Fixed Effects Least-Squares Dummy Variable (LSDV) Model
The second model is the fixed effects (regression) model. Its main advantage over the
POLS model is its ability to overcome heterogeneity, which is the main disadvantage
of the POLS model. This model assumes that the differences across the units can be
expressed as constant terms (Greene, 2011). It can be expressed as follows:
Where:
εit = error term that captures the differences across time and companies
The model allows the intercept to vary between cross-sectional units: each unit has a
fixed and unique intercept, and the differences in the intercepts reflect the unobserved
differences between the cross-sectional units.
The fixed effects model is also known as the Least Squares Dummy Variable (LSDV) model
(Gujarati and Porter, 2009). This model is “appropriate in situations where the individual specific-intercept may be correlated with one or more regressors. A disadvantage of LSDV is that it consumes a lot of degrees of freedom when the number of cross-sectional units, N, is very large, in which case we have to introduce N dummies (but suppress the common intercept term).” (Gujarati and Porter, 2009:613). The model can be expressed as follows:
$Y_{it} = \alpha_i + \beta_1 X_{it} + \mu_{it}$
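Written out with the N dummies mentioned in the quotation above (and with the common intercept suppressed), the same model can be sketched as follows. The dummy symbols $D_{ji}$ are notation introduced here for illustration and are not taken from the cited sources.

```latex
Y_{it} = \alpha_1 D_{1i} + \alpha_2 D_{2i} + \dots + \alpha_N D_{Ni}
         + \beta_1 X_{it} + \mu_{it},
\qquad
D_{ji} =
\begin{cases}
1 & \text{if } j = i \\
0 & \text{otherwise}
\end{cases}
```

For any observation on unit i, only the dummy for that unit equals 1, so the sum of dummy terms collapses to the unit-specific intercept $\alpha_i$.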
This model is widely used because it is simple to apply and to understand. The
‘within effect model’ is used in situations where many groups exist in the panel data (Gujarati and Porter, 2009). The within effect model is particularly useful where
many dummies would otherwise be needed. The variables are transformed using the group means to
avoid the use of dummies, which leaves larger degrees of freedom for error. A related
estimator is the ‘between effects’ model, which uses group means of the variables; the analysis is performed on groups or subjects rather than on individual observations. The LSDV
model is made up of the POLS model with the addition of dummy variables; these are
used to represent each unit, and a dummy variable is coded either 0 or 1 (Gujarati and
in this study, with 1 indicating that the answer to the question is yes. For example,
is the CEO female? If the CEO is female the dummy variable is coded 1, while if the
CEO is male the dummy variable is coded 0. One issue with the use of dummy variables
is perfect collinearity; avoiding the dummy variable trap is vital to
ensure the validity of the model. There are several ways to address this issue (Gujarati and
Porter, 2009):
1. POLS that drops one dummy variable (LSDV1).
2. POLS that includes all dummies but drops the intercept, which produces an
incorrect R squared (LSDV2).
3. POLS that includes all dummies and the intercept, with the restriction that
the sum of the parameters of all the dummies is zero (LSDV3).
4. Clustering of errors (discussed in Section 4.4.4).
POLS that drops a dummy variable (LSDV1) is the most frequently used approach, as it
produces the correct statistics (Greene, 2011).
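The dummy variable trap and the LSDV1 remedy described above can be illustrated with a minimal sketch. The panel below is simulated, and every name and number in it (N, T, the intercepts, the slope of 0.5) is an assumption made for illustration rather than data from this study. The sketch also checks that the within transformation gives the same slope as LSDV1 without creating any dummies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical balanced panel: N companies observed over T years.
N, T = 4, 6
unit = np.repeat(np.arange(N), T)           # company index for each row
alpha = np.array([1.0, 2.0, 3.0, 4.0])      # assumed company-specific intercepts
x = rng.normal(size=N * T) + unit           # regressor correlated with the intercepts
y = alpha[unit] + 0.5 * x + rng.normal(scale=0.1, size=N * T)

# One 0/1 dummy per company, plus a common intercept column.
D = (unit[:, None] == np.arange(N)).astype(float)
const = np.ones((N * T, 1))

# Dummy variable trap: the intercept equals the sum of all N dummies,
# so the design matrix has N + 2 columns but rank N + 1.
X_trap = np.column_stack([const, D, x])
rank_trap = np.linalg.matrix_rank(X_trap)

# LSDV1: keep the intercept and drop the first company's dummy.
X_lsdv1 = np.column_stack([const, D[:, 1:], x])
beta_lsdv1 = np.linalg.lstsq(X_lsdv1, y, rcond=None)[0][-1]


def demean(values, groups):
    """Subtract each group's mean from its own observations."""
    means = np.bincount(groups, weights=values) / np.bincount(groups)
    return values - means[groups]


# Within transformation: regress demeaned y on demeaned x, no dummies needed.
x_w, y_w = demean(x, unit), demean(y, unit)
beta_within = (x_w @ y_w) / (x_w @ x_w)

print("columns / rank with all dummies and intercept:", X_trap.shape[1], rank_trap)
print("slope from LSDV1:", beta_lsdv1)
print("slope from within transformation:", beta_within)
```

The two slope estimates agree up to floating-point error, which is why the within form is a convenient substitute when the number of units makes explicit dummies impractical.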
However, there are several disadvantages of the LSDV Model (Gujarati and Porter, 2009):
1. If there are too many dummy variables in the model, the degrees of freedom will
decrease, leaving too few observations for a meaningful statistical analysis.
Models that contain many dummy variables also increase the risk of
multicollinearity, which makes precise estimation of the parameters difficult.
In this research, dummy variables are created for CEO
personal characteristics, for example, gender. However, dummy variables will be
kept to the key ones to avoid the dummy variable trap.
2. LSDV may be unable to identify the impact of time-invariant variables in
For example, Graham et al. (2013) consider the impact of the gender of the CEO,
and find that this does not change over time for an individual subject. The LSDV
model may not be able to identify the impact of this time-invariant variable on
leverage.
3. The error term µ𝑖𝑡 requires careful consideration, as there are several possibilities. The following options are available:
   1. Assume that the error variance is the same for all cross-sectional units, or assume
   that the error variance is heteroscedastic.
   2. Assume that there is no autocorrelation over time.
   3. Assume that there is no correlation between the error term of company 1
   and the error term of company 2.
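The second disadvantage listed above, concerning time-invariant variables, can be seen in a small sketch. The three CEOs, four years and gender coding below are made-up values used only for illustration. Because gender is constant for each CEO, the within transformation reduces it to zeros, and in dummy-variable form it is an exact linear combination of the CEO dummies, so its coefficient cannot be estimated separately.

```python
import numpy as np

# Hypothetical panel: 3 CEOs, each observed for 4 years.
ceo = np.repeat(np.arange(3), 4)
female = np.array([1.0, 0.0, 1.0])[ceo]     # gender dummy, constant within each CEO


def demean(values, groups):
    """Subtract each group's mean from its own observations."""
    means = np.bincount(groups, weights=values) / np.bincount(groups)
    return values - means[groups]


# Within transformation: a time-invariant variable has no within-CEO variation.
female_within = demean(female, ceo)

# LSDV form: the gender column is the sum of the dummies of the female CEOs,
# so adding it to the CEO dummies does not raise the rank of the design matrix.
D = (ceo[:, None] == np.arange(3)).astype(float)
X = np.column_stack([D, female])
rank = np.linalg.matrix_rank(X)

print("demeaned gender dummy:", female_within)
print("columns / rank of [CEO dummies, gender]:", X.shape[1], rank)
```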
In addition to using POLS in one previous study, Bevan and Danbolt (2002) use a second
method, the fixed effects model (FEM). Bevan and Danbolt (2002) find significant
differences between these two methods in their study of UK companies’ capital structure, a finding that contradicts many of the theories on the determinants of capital structure.
FEM is able to overcome the heterogeneity that is associated with the POLS method, and
Bevan and Danbolt (2002) highlight the importance of controlling for the fixed effects in
their study. In another UK-based study, Michaelas et al. (1999) use FEM with the
inclusion of dummy variables. The UK data covers ten industries; therefore, the industries
studied cannot be classified as a small sample of a larger population, indicating that the REM
would be inappropriate in this case. Michaelas et al. (1999) use seven time and nine
industry dummy variables, and find
that time- and industry-specific effects have an impact on the capital structure of small
companies in the UK. In this study, the method will follow previous studies in the capital