Research Methodology and Data

3.6 Analytical Procedures

This section presents the analytical strategies used in Chapters 4, 5 and 6. It first introduces the nature of the data employed in the study, i.e., panel data, and discusses the advantages and disadvantages of using such data. More importantly, it provides a detailed explanation of the panel data estimation method and highlights its distinctive features. To give a clear picture of the estimation method, it is also important to explain the endogeneity problem, since this is the main concern of the study, as mentioned in the preceding section.

This section also sets out the statistical methods used in this study. Generally, these methods fall into two main categories, parametric and non-parametric estimation, and the choice between them depends on the nature and characteristics of the data. According to Gujarati (2003), four assumptions should be met before using parametric tests: normality, linearity, homoscedasticity and independence of error terms. Parametric tests are generally more appropriate and generate more accurate estimates if all these assumptions are met and all variables used in the analysis are measured on at least an interval scale (see, for example, Judge et al., 1985).

Nevertheless, if one or more of these assumptions is violated, parametric methods can be misleading and non-parametric tests may be more effective (Balian 1982; Greene 2008). These assumptions are explained as follows. (1) Normality: this assumption requires that the sample data be normally distributed. Two common checks are used to examine the normality of the variables of this study, namely skewness and kurtosis. According to Haniffa & Hudaib (2006), data are considered normally distributed if the skewness value is within ±1.96 and the kurtosis value is within ±3. (2) Linearity: this assumption requires that the model be linear in its parameters. In other words, the relationship between the explanatory variables (X) and the dependent variable (Y) should be linear. Where this assumption is violated, using parametric methods will result in biased estimates (Ayyangar 2007). (3) Homoscedasticity: under this assumption, the variance (or standard deviation) of the dependent variable within the groups must be equal, or homogeneous. If it is not, the problem of heteroscedasticity arises, which leads to biased standard errors and inefficient estimates. (4) Independence of error terms: this assumption requires that the error terms be independent of each other, so that no serial correlation exists. In other words, parametric models require that the error terms are uncorrelated across observations. If they are not, autocorrelation is present.
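The skewness and kurtosis screen described above can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the function names, the sample values and the choice of population moments with excess kurtosis (kurtosis minus 3) are assumptions made for illustration, and only the ±1.96 and ±3 thresholds come from the text.

```python
def central_moment(values, order):
    """Population central moment of the given order."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** order for v in values) / n


def skewness(values):
    """Third standardised moment (0 for a symmetric sample)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Fourth standardised moment minus 3 (0 for a normal distribution).

    Assumption: the +/-3 band is applied to excess kurtosis here.
    """
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


def passes_screen(values, skew_limit=1.96, kurt_limit=3.0):
    """Rule-of-thumb screen: both statistics must lie inside their bands."""
    return (abs(skewness(values)) <= skew_limit
            and abs(excess_kurtosis(values)) <= kurt_limit)


# Hypothetical samples: one symmetric, one dominated by a single large value.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
right_skewed = [1.0] * 9 + [100.0]

for name, sample in [("symmetric", symmetric), ("right_skewed", right_skewed)]:
    print(name, round(skewness(sample), 3),
          round(excess_kurtosis(sample), 3), passes_screen(sample))
```

A screen of this kind is only a rule of thumb, which is why it is usually complemented by formal tests and plots.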

As this study’s sample includes firms of different sizes, its estimation results may be affected by heteroscedasticity. Hence, an important question is how to mitigate this issue. Several methods are suggested in the literature for treating heteroscedasticity. One commonly used method is deflation of the data by some measure of size (Maddala 1992), in which both dependent and independent variables are deflated by size in order to control for the size or scale effect. All variables used in this study are scaled by market capitalisation in order to control for the scale effect and to mitigate the heteroscedasticity problem (Brav 2009; Carpenter et al., 1994). The checks discussed above were carried out to examine the data of this study against the assumptions of the OLS regression model. The results of these tests illustrate that the data do meet the required conditions for parametric tests, and show that parametric methods are an acceptable approach to estimating the models created in this study, given the nature and characteristics of the data. The results for skewness and kurtosis (as will be demonstrated in the fourth and fifth chapters) indicate that most of the variables are normally distributed.

Other checks were applied to verify these findings. Although the Shapiro-Wilk test provides some evidence that the data are normally distributed (i.e. values are significantly less than 1), the Kolmogorov-Smirnov test and the quantile plot indicate that the assumption of normality is not met. With respect to the assumption of homoscedasticity, the widely used Breusch-Pagan and White tests were employed to detect heteroscedasticity. The findings of both tests indicate that heteroscedasticity exists. Finally, the Durbin-Watson test was used, since it is the most common technique for detecting autocorrelation. The results of this test showed that the assumption of independence of the error terms was not met.
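As a rough illustration of what two of these diagnostics compute, the sketch below implements the Durbin-Watson statistic and Koenker's studentised variant of the Breusch-Pagan statistic with NumPy. It is a sketch under stated assumptions rather than the study's own procedure: the simulated data, the seed and all function names are hypothetical, the White test is not shown, and the Durbin-Watson example treats a single time series rather than a panel.

```python
import numpy as np


def ols_residuals(X, y):
    """Residuals from an OLS fit of y on X (X must include a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


def breusch_pagan_lm(X, y):
    """Koenker's studentised LM statistic: n * R^2 from regressing the
    squared OLS residuals on the original regressors."""
    e2 = ols_residuals(X, y) ** 2
    aux = ols_residuals(X, e2)
    r2 = 1.0 - (aux @ aux) / np.sum((e2 - e2.mean()) ** 2)
    return len(y) * r2


def durbin_watson(resid):
    """Sum of squared first differences over the sum of squared residuals."""
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)


rng = np.random.default_rng(0)  # hypothetical simulated data
n = 500

# Error spread grows with x, so the variance is not constant.
x = rng.uniform(1.0, 10.0, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(size=n) * x
lm_stat = breusch_pagan_lm(X, y)

# AR(1) series with coefficient 0.8: strongly positively autocorrelated.
u = np.zeros(n)
shocks = rng.normal(size=n)
for t in range(1, n):
    u[t] = 0.8 * u[t - 1] + shocks[t]
dw_stat = durbin_watson(u)

print("Breusch-Pagan LM:", lm_stat)
print("Durbin-Watson:", dw_stat)
```

Under homoscedasticity the LM statistic is approximately chi-square distributed with one degree of freedom in this single-regressor example, so values above 3.84 reject at the 5% level. A Durbin-Watson value near 2 is consistent with no first-order autocorrelation, while values toward 0 point to positive autocorrelation.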

Along with the other assumptions, normality of the error terms is required for the statistical tests to be valid (Ayyangar 2007). In particular, OLS estimators become inefficient if the normality assumption is violated (Greene 2008), and the estimated standard errors will be biased and inconsistent (Baltagi 2001; Greene 2008). Two alternative statistical solutions are suggested to overcome the problem of non-normality: first, transforming the data so that they suit parametric procedures by normalising them artificially; or, second, employing other estimation methods that are robust to the non-normality of variables (Dinga 2011).

Statistically, data transformation helps to overcome the problems of non-normality and outliers by artificially making the data normally distributed. Although this technique can affect the output of the analysis by changing the fundamental nature of the information, which complicates interpretation (Osborne 2005), it has been found to be a valuable statistical method for improving the normality of data. Therefore, consistent with previous studies of the ownership-liquidity relationship (Brockman et al., 2009; Heflin & Shaw 2000), this study uses the natural logarithm of all the study’s variables. Moreover, in order to check the consistency of the results, estimation methods appropriate for non-normally distributed variables were also used (e.g. robust regressions; cluster and Huber-White sandwich estimators).
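The effect of a natural-log transformation on a right-skewed variable can be seen in a small Python sketch. The values below are hypothetical and chosen so that they are evenly spaced on the log scale; they are not taken from the study's data.

```python
import math


def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical, strictly positive values spanning several orders of magnitude.
raw = [1.0, 10.0, 100.0, 1000.0, 10000.0]

# The natural log is only defined for strictly positive values.
logged = [math.log(v) for v in raw]

skew_raw = skewness(raw)
skew_logged = skewness(logged)
print("skewness before:", skew_raw)
print("skewness after: ", skew_logged)
```

Because the transformed values are on a different scale, coefficients estimated on logged variables are interpreted in relative rather than absolute terms, which is the interpretive cost noted above.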

The most common of these alternative robust estimators is the Huber-White sandwich estimator, which was developed by Huber (1967), Eicker (1967) and White (1980). This approach produces robust standard errors that can deal with some violations of the equal-variance assumption, so the resulting standard errors are consistent even if the residuals are not homoscedastic. Arellano (1987) extended the Huber-White work and proposed a cluster-robust estimator that relaxes the assumption of independently distributed residuals and therefore controls for autocorrelation as well as heteroscedasticity (Hoechle 2007). Cluster-robust estimation thus allows for violation of the independent-errors assumption, and produces consistent standard errors when the residuals are correlated within groups (Greene 2007; Hoechle 2007). Furthermore, in panel analyses, where cross-section individuals are followed over time, cluster-robust estimation is appropriate since it corrects for heteroscedasticity in the cross section and for general forms of serial correlation over time (Vogelsang 2008). As a consequence, cluster-robust estimation is used in the primary analysis of this study, since it accounts for the problems of autocorrelation and heteroscedasticity.
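The two sandwich estimators discussed above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the estimation code of the study: the simulated firm-year panel, the seed and the function names are hypothetical, and no small-sample correction is applied (statistical packages usually apply one, so their reported values would differ slightly).

```python
import numpy as np


def ols(X, y):
    """Return coefficients, residuals and (X'X)^-1, the 'bread'."""
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    return beta, y - X @ beta, bread


def white_se(X, y):
    """Huber-White (HC0) standard errors.

    The 'meat' is the sum over observations of e_i^2 * x_i x_i'.
    """
    _, e, bread = ols(X, y)
    meat = (X * e[:, None] ** 2).T @ X
    return np.sqrt(np.diag(bread @ meat @ bread))


def cluster_se(X, y, groups):
    """Cluster-robust standard errors.

    The 'meat' is the sum over groups of the outer product of X_g' e_g,
    which allows arbitrary correlation of residuals within a group.
    """
    _, e, bread = ols(X, y)
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(groups):
        mask = groups == g
        score = X[mask].T @ e[mask]
        meat += np.outer(score, score)
    return np.sqrt(np.diag(bread @ meat @ bread))


# Hypothetical panel: 50 firms observed for 10 years each.
rng = np.random.default_rng(42)
n_firms, n_years = 50, 10
firm = np.repeat(np.arange(n_firms), n_years)
x = np.repeat(rng.normal(size=n_firms), n_years)           # firm-level regressor
firm_shock = np.repeat(rng.normal(size=n_firms), n_years)  # shared within firm
y = 1.0 + 0.5 * x + firm_shock + rng.normal(size=n_firms * n_years)
X = np.column_stack([np.ones_like(x), x])

print("Huber-White SE:  ", white_se(X, y))
print("Cluster (firm) SE:", cluster_se(X, y, firm))
```

When every observation forms its own cluster, the two functions return the same values, which shows that the cluster-robust estimator generalises the Huber-White one. In this simulated panel the residuals share a firm-level component, so clustering by firm produces a larger standard error for the slope than the Huber-White formula does.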