Discriminant, content and construct validity

3.3.7.2 Discriminant, content and construct validity

The effectiveness of DA and the resulting discriminant models requires tests for discriminant, content and construct validity (Podsakoff and Organ, 1986). In this case, the classification and the use of bank financial and nonfinancial variables provide distinctive dimensionality, which means that the issue of discriminant validity is well settled. Regarding content and construct validity, the characteristics of bank financial variables are drawn from relevant literature that adequately provides multidimensional perspectives (e.g., asset quality, capital adequacy, credit risk, liquidity and profitability categories). In addition, these financial variables provide adequate coverage of the important contents and therefore a good basis for content validity (Nunnally, 1994). Because many related studies in the banking literature have examined bank financial and nonfinancial variables empirically, these variables provide adequate evidence of construct validity (Dince and Fortson, 1972; Sinkey, 1975).

3.3.8 Logistic regression

LR is a well-established multivariate statistical technique used to predict binomial or multinomial outcomes. The initial model formulation of LR was designed for binary classification problems (Crama et al., 1988). LR is used to examine the relationship between a binary or ordinal response probability and one or more independent variables. That is, LR is used when the dependent variable is a dichotomy (two categories) and the independent variables are of any type (Cox and Snell, 1989; Hosmer and Lemeshow, 2000).

LR is an extension of ordinary multivariate linear regression. The main difference between logistic and linear regression is that the dependent variable in LR is binary or dichotomous. Consequently, the chosen parametric models and the assumptions attributed to each technique are different. Once these differences are accounted for, the methods used in an LR analysis follow the same general principles used in linear regression (Hosmer and Lemeshow, 2000). LR is different from other classification techniques in that it thoroughly analyses a major subset of variable combinations to explain the positive and negative nature of the observations (e.g., to describe high-FSR or low-FSR banks, solvent or insolvent banks; Hammer et al., 2012).

The coefficient generated by LR for each independent variable explains the contribution of that variable to variations in the dependent variable. However, the dependent variable can take only two values: 0 or 1. The nature of the dependent variable is the main difference between linear and logistic regression. In linear regression, the model predicts a numerical value of the dependent variable from the relevant independent variables and coefficients. In logistic regression, the model predicts the probability ($p$) that the dependent variable is 1 rather than 0 (i.e., that the event/person fits in one group rather than the other).

Consequently, the log transformation of the odds $p / (1 - p)$ is employed to normalise the distribution and thus create a link with the linear regression equation. This transformation is known as the logit of $p$, or logit($p$). Logit($p$) is the log (to base e) of the odds that the dependent variable is 1 and can be defined as follows:

$$\operatorname{logit}(p) = \log\left[\frac{p}{1 - p}\right] = \ln\left[\frac{p}{1 - p}\right] \tag{11}$$

where $p$ ranges from 0 to 1 and the logit($p$) scale ranges from negative infinity to positive infinity. LR uses binomial probability theory to develop a logit model that is derived from linear regression. The logit model is described in the following equation (Abdou et al., 2008):

$$\ln\left[\frac{p}{1 - p}\right] = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n \tag{12}$$

where $p$ is the probability of the outcome of interest, $\beta_0$ is the constant of the equation and $\beta_i$ is the coefficient in the linear combination of independent variables, $X_i$, for $i$ = 1 to $n$. LR finds a best-fit equation using the maximum likelihood method instead of the least-squares method used for linear regression (Freund et al., 2006). The maximum likelihood method chooses the regression coefficients that maximise the probability of observing the actual category memberships in the sample. Consequently, the following nonlinear function is used to express the relationship between the independent variables and the binary dependent variable (Canbas et al., 2005; Premachandra et al., 2009):

$$P(Z) = \frac{1}{1 + e^{-Z}} \tag{13}$$

where $P(Z)$ is a cumulative probability function that takes values between 0 and 1 and gives the probability $p$; and

$$Z = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n \tag{14}$$
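The relationship between the logit and the cumulative probability function can be illustrated with a minimal Python sketch. The function names (`logit`, `logistic`, `predicted_probability`) and the numerical values are illustrative assumptions rather than part of the thesis; the sketch only shows that the logistic function inverts the logit and maps any linear combination of independent variables to a probability between 0 and 1.

```python
import math


def logit(p):
    """Log-odds of a probability p in the open interval (0, 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def logistic(z):
    """Inverse of the logit: maps any real z to a probability."""
    # Branch on the sign of z so that exp() never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predicted_probability(intercept, coefficients, x):
    """Probability that the outcome is 1 for one observation x.

    intercept and coefficients are hypothetical fitted values;
    x holds the observation's independent variables.
    """
    z = intercept + sum(b * xi for b, xi in zip(coefficients, x))
    return logistic(z)


# Even odds (p = 0.5) correspond to a logit of zero.
print(logit(0.5))
# Applying the logistic function to a logit recovers the probability.
print(logistic(logit(0.8)))
# Hypothetical coefficients: a linear predictor of zero gives p = 0.5.
print(predicted_probability(0.0, [1.0, -1.0], [2.0, 2.0]))
```

Because the logit is unbounded while the probability is confined to the unit interval, the linear predictor can take any real value without producing an invalid probability.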

Thus the objective of LR is to predict banks’ FSR group memberships correctly for individual observations using the most parsimonious model. A model is developed based on the inclusion of all independent variables that are valid in predicting the dependent variable, namely, banks’ FSR group memberships.
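As a rough illustration of maximum likelihood estimation and group-membership prediction, the sketch below fits a one-variable logistic model by gradient ascent on the log-likelihood. Everything in it is an assumption made for illustration: the data are synthetic, the predictor name `ratios` and the 0.5 cutoff are hypothetical, and gradient ascent stands in for whatever optimiser a statistical package would actually use. It is not the estimation procedure of the thesis.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow in exp()."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def log_likelihood(beta, X, y):
    """Bernoulli log-likelihood of labels y given rows X and coefficients beta."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(b * v for b, v in zip(beta, xi)))
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total


def fit_logistic(X, y, learning_rate=0.1, n_iter=20000):
    """Maximise the log-likelihood by plain gradient ascent."""
    beta = [0.0] * len(X[0])
    n = len(X)
    for _ in range(n_iter):
        grad = [0.0] * len(beta)
        for xi, yi in zip(X, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, xi)))
            for j, v in enumerate(xi):
                grad[j] += (yi - p) * v  # score contribution of one observation
        beta = [b + learning_rate * g / n for b, g in zip(beta, grad)]
    return beta


# Synthetic, non-separable data: a hypothetical financial ratio and a
# binary group label (1 = high-FSR group, 0 = low-FSR group).
ratios = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
labels = [0, 0, 1, 0, 1, 0, 1, 1]
X = [[1.0, r] for r in ratios]  # leading 1.0 is the constant term

beta = fit_logistic(X, labels)


def classify(ratio, cutoff=0.5):
    """Assign group 1 when the predicted probability reaches the cutoff."""
    return 1 if sigmoid(beta[0] + beta[1] * ratio) >= cutoff else 0


print("coefficients:", beta)
print("log-likelihood:", log_likelihood(beta, X, labels))
print("predicted groups:", [classify(r) for r in ratios])
```

The cutoff of 0.5 is only a convention; when the two kinds of misclassification carry different costs, a different cutoff may be more appropriate.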

In the finance literature, LR is widely used to predict corporate and bank failure (Boyacioglu et al., 2009; Brezigar-Masten and Masten, 2012; Canbas et al., 2005; Doğanay et al., 2006; Hua et al., 2007; Jones and Hensher, 2004; Kick and Koetter, 2007; Kolari et al., 2002; Lanine and Vennet, 2006; Li et al., 2010; Ioannidis et al., 2010; Martin, 1977; Ohlson, 1980; Premachandra et al., 2009; Zhao et al., 2009); credit ratings (Chaveesuk et al., 1999; Ederington, 1985; Kim et al., 1993; Kim and Ahn, 2012; Maher and Sen, 1997; Oelerich and Poddig, 2006; Tsai and Chen, 2010) and credit scoring models (Abdou, 2009a; Abdou et al., 2008; Akkoc, 2012; Desai et al., 1996; Joanes, 1993; Laitinen, 1999; Lee et al., 2002, 2006; Lee and Chen, 2005; Ruo-wei and Chun-yang, 2007; West, 2000; Westgaard and Wijst, 2001; Wiginton, 1980). Finally, the LR model has been employed by Öğüt et al. (2012), Hammer et al. (2012), Poon et al. (1999), and Bellotti et al. (2011a, 2011b) to predict BFSRs.

3.4 Conclusion

This chapter presents and justifies the research method used in this thesis to fulfill the research objectives. The study follows positivism as its research philosophy because it depends entirely on the application of various statistical techniques to a large set of quantitative data to test designated hypotheses.

This chapter starts by explaining the data collection process via the Bankscope database. The researcher divided the dataset into three samples: entire dataset, subsample1 and subsample2.

This is followed by a description of the numerical rating of the dependent variable (i.e., bank FSR issued by CI) and the categorisation of the FSRs into four quartiles (i.e., high FSR, near-high FSR, low FSR and near-low FSR). In addition, bank financial performance variables are elucidated thoroughly (i.e., asset quality, capital adequacy, credit risk, liquidity and profitability categories) along with the designated proxies that belong to each category and the expected sign associated with each proxy for bank FSR. Finally, a list of control variables is introduced to control for bank financial performance variables (i.e., country effect, size effect, time effect and SR).

The ultimate goal of this thesis is to enhance the performance of banks in the Middle East region by identifying the main financial and nonfinancial variables associated with high and near-high FSRs using publicly available data. Consequently, the ML technique is introduced to achieve this goal. This thesis is intended to provide the banking sector in the Middle East region with a wide range of bank FSR group membership modeling techniques (i.e., CHAID, CART, MLP neural networks, DA and LR) and to evaluate the predictive capability of these models using various evaluation criteria (i.e., ACC, EMC and gains charts). The key challenge is to build bank FSR group membership models that increase classification and prediction accuracy and reduce misclassification costs. The following two chapters introduce and interpret the empirical results.

CHAPTER 4: MULTINOMIAL LOGIT (ML) RESULTS

4.1 Introduction

In this chapter, the researcher reports the results that identify the main bank performance measures (i.e., financial and nonfinancial variables) associated with high and near-high FSRs versus low and near-low FSRs in the Middle East region. The results of the ML technique are presented in the following order: (1) descriptive statistics, (2) results obtained from the various models (i.e., asset quality, capital adequacy, credit risk, liquidity, profitability) and (3) results obtained from all financial category models (with and without dummies).

In document The impact of financial and non financial measures on banks’ financial strength ratings : The case of the Middle East (Page 119-124)