CHAPTER 3: RESEARCH METHODOLOGY
3.4. Quantitative Techniques of Firm Failure
3.4.1 Introduction to Methodological Approaches
The firm failure literature comprises two types of research. The first type analyses the characteristics of the firm failure process, while the second identifies the determinants of firms’ failure, often in the context of firm failure prediction. Both areas use quantitative techniques and, to some extent, similar methodologies have been applied to each. Because small improvements in prediction accuracy matter in the failure prediction literature, that area provides a more rigorous review and application of the quantitative techniques used in the wider firm failure literature.
A number of techniques have been used in the literature to develop prediction models. Kumar and Ravi (2007) classified these models into two broad areas: statistical techniques and intelligence/computing techniques. In practice, this classification excludes market-based modelling approaches, which are not applicable to SMEs because a large proportion of these firms are not listed on any stock exchange. Statistical techniques are applicable to SMEs, and they appear both in the failure prediction area and in the quantitative firm failure process area. These statistical techniques are therefore the main focus of this analysis. Multiple Discriminant Analysis (MDA) and Logistic Regression (logit) have been the most popular applications, with panel data techniques appearing increasingly in the recent literature. Other techniques, such as factor analysis, have also been used in the firm failure determinants literature (see for example Gaskill et al., 1993; Modina and Pietrovito, 2014), but with relatively limited application as standalone techniques in the firm failure prediction area. Nevertheless, these techniques are a useful tool for reducing an initially large number of potential independent variables to a more manageable level.
Multiple Discriminant Analysis (MDA) has been used extensively, starting with Altman’s (1968) Z-score model. Many other authors followed the same approach (see for example Deakin, 1972; Edmister, 1972; Blum, 1974; Taffler, 1982). The first Z-score model has the following form (Altman, 1968):
$$Z = V_1 X_1 + V_2 X_2 + \cdots + V_n X_n$$
The Z-score was based on the general linear form of Multiple Discriminant Analysis (MDA), which according to Lachenbruch (1975) is:
$$D_i = d_0 + d_1 X_{i1} + d_2 X_{i2} + \cdots + d_n X_{in}$$
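As a minimal illustration of the linear discriminant form above, the following Python sketch computes a discriminant score for a single firm and assigns it to a group using a cut-off. It reads the first term as an intercept, the remaining coefficients as discriminant weights, and each X as the firm's value on one variable. The coefficient values, ratio values, cut-off and the function names `discriminant_score` and `classify` are hypothetical and chosen only for illustration; they are not taken from Altman (1968) or from any estimated model.

```python
def discriminant_score(intercept, coefficients, ratios):
    """Return D_i = d_0 + d_1*X_i1 + ... + d_n*X_in for one firm."""
    if len(coefficients) != len(ratios):
        raise ValueError("one coefficient is needed per ratio")
    return intercept + sum(d * x for d, x in zip(coefficients, ratios))


def classify(score, cutoff=0.0):
    """Assign a group label; scores below the (assumed) cut-off count as failure."""
    return "failure" if score < cutoff else "non-failure"


# Hypothetical coefficients d_0..d_3 and ratio values X_i1..X_i3 (illustrative only).
d0 = -0.5
d = [2.0, 0.5, 1.5]
firm_ratios = [0.25, 0.4, 0.2]

score = discriminant_score(d0, d, firm_ratios)
print(round(score, 2), classify(score))
```

In an applied setting the coefficients would be estimated from a sample of failed and non-failed firms, and the cut-off would be chosen from the distribution of scores in the two groups.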
MDA as a modelling technique has a number of advantages that have made it appealing to researchers. It can distinguish between two population outcomes (failure and non-failure) by using linear combinations of variables (Taffler, 1983). It is also relatively easy to use, which makes it an appealing solution for academics and risk professionals, and it can work with relatively small samples. In addition, the failure prediction accuracy reported in the literature is adequate. Indeed, Balcaen and Ooghe (2006) argued that MDA was the most popular application in the failure prediction literature.
However, MDA also has a number of disadvantages. Its assumption that the variables follow a multivariate normal distribution is quite strong and is frequently violated in applied failure research (Deakin, 1976; Taffler and Tisshaw, 1977), which increases the risk of producing biased error estimates (Eisenbeis, 1977). Lachenbruch (1975) nevertheless demonstrated that MDA estimates can be robust in practice even when these assumptions are violated. A further limitation is that MDA discriminates between binary outcomes, which are usually two extreme business cases such as failure and non-failure. MDA examples in the literature have also been applied to cross-sectional data specifications only. Within the firm failure literature, logistic regression applications largely replaced MDA because of their enhanced performance once larger datasets became available.
Ohlson (1980) provided the first empirical evidence based on logistic regression (logit), using a pooled data sample to predict firm failure. Logit applications need larger data samples than, for example, multiple discriminant analysis, something that was gradually becoming possible due to advances in technology and data availability. The advantage of the logit method is that it gives the probability of an outcome (such as failure) without strong assumptions about the prior probabilities of failure or strict assumptions about the distribution of the predictors. Typically, logit models (in their basic form) assume a binary outcome for the dependent variable, although multinomial specifications are also possible. In any case, the outcomes of the dependent variable should be discrete and non-overlapping.
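To make the probability interpretation of the logit model concrete, the sketch below applies the standard logistic transform to a linear index of predictors, so that the output always lies between 0 and 1. The coefficients, the two ratio values and the function name `failure_probability` are hypothetical assumptions for illustration and are not estimates from Ohlson (1980) or any other study.

```python
import math


def failure_probability(intercept, coefficients, ratios):
    """Logistic transform of a linear index: P = 1 / (1 + exp(-(b_0 + sum b_j*x_j)))."""
    index = intercept + sum(b * x for b, x in zip(coefficients, ratios))
    return 1.0 / (1.0 + math.exp(-index))


# Hypothetical coefficients and ratio values (illustrative only).
b0 = -3.0
b = [3.0, -1.0]
firm = [0.8, 0.4]  # assumed to be, e.g., a leverage ratio and a liquidity ratio

p = failure_probability(b0, b, firm)
print(f"estimated probability of failure: {p:.3f}")
```

A binary classification can then be obtained by comparing the estimated probability with a chosen threshold, which is itself a modelling decision rather than part of the logit specification.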
On the other hand, logit applications are more sensitive to multicollinearity. This can pose challenges when financial ratios are the main independent variables, because they tend to be correlated with each other to some degree. In addition, the larger datasets that logit applications require tend to raise sample selection issues related to the inclusion of a disproportionate number of failed firms in the sample. This issue is referred to as “choice-based sample bias” or “oversampling” (Zmijewski, 1984). Oversampling affects both logit and probit coefficients (Dietrich, 1984). However, the evidence suggests that these characteristics do not materially reduce the performance of logit models in the context of firm failure prediction (Balcaen and Ooghe, 2006; Zmijewski, 1984). Moreover, the sampling bias is relevant to studies that deal with failure prediction rather than to studies that analyse the determinants of firms’ failure, so it can be argued that it is not a limitation in the latter. The overall capability of logit models is reflected in their continued popularity, as they are still used extensively in research studies. Most applications of logit techniques have been associated with cross-sectional data structures, and this continues to be the case to a large extent.
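The multicollinearity concern raised above can be screened for before estimation. The sketch below is one simple way to do so, assuming only two candidate ratios: it computes their Pearson correlation and the corresponding variance inflation factor, which for exactly two predictors equals 1 / (1 - r^2). The ratio names, the five data points and the function names are hypothetical and serve only as an illustration.

```python
import math


def pearson_correlation(xs, ys):
    """Sample Pearson correlation between two equally long lists of values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(r):
    """Variance inflation factor when the model has exactly two predictors."""
    return 1.0 / (1.0 - r ** 2)


# Hypothetical values of two liquidity ratios for five firms (illustrative only).
current_ratio = [1.0, 2.0, 3.0, 4.0, 5.0]
quick_ratio = [0.8, 1.9, 2.7, 4.2, 4.9]

r = pearson_correlation(current_ratio, quick_ratio)
vif = vif_two_predictors(r)
print(f"r = {r:.3f}, VIF = {vif:.1f}")
```

With more than two predictors, the variance inflation factor for each variable would instead be based on the R-squared from regressing that variable on all the others; the two-variable case is used here only to keep the sketch short.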