DATA AND METHODOLOGY
4.4 ESTIMATION MODELS
4.4.2 Multiple Correspondence Analysis (MCA) approach
To address the third objective, MCA was used to assess how much over-indebtedness (db.sta) contributes to the South African Multidimensional Poverty Index (SAMPI), and hence whether it merits inclusion as an additional indicator in the index.
Principal Component Analysis (PCA), first proposed by Pearson (1901) and later developed by Hotelling (1933), is a natural choice as a data exploration technique. It seeks a small number of principal components (PCs) that retain most of the information in the original set of observed indicators. The technique also provides weights or contributions for each indicator based on the covariance matrix.
A major weakness of PCA is that it was developed for quantitative variables, whereas MCA, a counterpart of PCA for categorical data, is appropriate when variables are categorical or binary (Alkire et al., 2015).
Moreover, quantitative variables can always be transformed into categorical ones. Asselin and Anh (2008) deal specifically with the use of MCA in multidimensional poverty analysis.
Introduced by Benzecri (1973), MCA is a method for finding the interrelationships between variables and the strength of the association between them (Greenacre, 2007). The data are first coded as an indicator (disjunctive) matrix whose elements are 0s and 1s. Observations with missing values are omitted from the analysis. In this study, MCA performs a correspondence analysis on the Burt matrix, which is obtained by multiplying the transpose of the disjunctive matrix by the matrix itself. Applying MCA to the Burt matrix performs better than applying it to the indicator matrix (Greenacre, 2007).
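As an illustration of the coding step described above, the following minimal Python sketch builds the 0/1 indicator matrix and the corresponding Burt matrix for a toy dataset. The function names, the second indicator ("water") and the four example households are hypothetical and are not taken from the study's data; only db.sta and the d/nd labels follow the text.

```python
def indicator_matrix(records, variables):
    """Build the 0/1 indicator (disjunctive) matrix Z.

    records: list of dicts mapping variable name -> category label.
    Rows with a missing value (None) are dropped, as described in the text.
    Returns (Z, columns), where columns lists the (variable, category) pairs.
    """
    complete = [r for r in records
                if all(r.get(v) is not None for v in variables)]
    # One column per (variable, category) pair, in a stable sorted order.
    columns = [(v, c) for v in variables
               for c in sorted({r[v] for r in complete})]
    Z = [[1 if r[v] == c else 0 for (v, c) in columns] for r in complete]
    return Z, columns


def burt_matrix(Z):
    """Burt matrix B = Z^T Z: all pairwise cross-tabulations of categories."""
    n_cols = len(Z[0])
    return [[sum(row[a] * row[b] for row in Z) for b in range(n_cols)]
            for a in range(n_cols)]


# Hypothetical toy data: "d" = deprived, "nd" = not deprived.
# The last household has a missing value and is therefore omitted.
households = [
    {"db.sta": "d", "water": "nd"},
    {"db.sta": "nd", "water": "nd"},
    {"db.sta": "d", "water": "d"},
    {"db.sta": None, "water": "d"},
]
Z, columns = indicator_matrix(households, ["db.sta", "water"])
B = burt_matrix(Z)
```

The diagonal of B holds the number of households in each category, and the off-diagonal cells hold the joint counts for each pair of categories.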
There are four major advantages of using MCA over PCA. (1) MCA makes fewer assumptions than PCA regarding the distribution of the indicator variables and quantifies each indicator in a non-linear way, and is thus free from the linearity hypothesis. (2) MCA addresses marginalisation bias by giving more importance to indicators with fewer observations, thus focusing on the highly impoverished (Asselin & Anh, 2008). (3) MCA has a reciprocal bi-additive character in that it can be applied to row-profiles (observations) or column-profiles (categories). (4) While PCA studies the linear relationships between the variables, MCA studies the more complex non-linear relationships between them (Asselin & Anh, 2008). In a comparative study of multidimensional poverty using PCA, MCA and the fuzzy approach, Njong and Ningaye (2008) concluded that MCA is more sensitive to deprivation and is therefore to be preferred to PCA.
It should be noted that, despite its strengths, MCA has some weaknesses. First, its results are data driven, which makes intertemporal and cross-country comparisons difficult; for instance, the weights obtained may vary from one period to the next (Alkire et al., 2015). Second, MCA artificially inflates the chi-squared distances between profiles and thereby underestimates the percentage of variance explained by the first dimension. Introducing scale adjustments to the MCA solution corrects this problem (Gower, 2006). Third, the choice of factorial axes (dimensions) could imply a high information loss (Asselin & Anh, 2008). This is remedied by consistency requirements that results obtained through MCA should satisfy, as discussed later in this section.
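The text does not specify which scale adjustment is applied. Purely as an illustration, the sketch below uses one commonly cited correction, usually attributed to Benzecri, in which each eigenvalue above 1/Q from the indicator-matrix analysis is rescaled, Q being the number of variables. Treating this as the adjustment intended here is an assumption, and the eigenvalues shown are hypothetical.

```python
def adjusted_inertias(indicator_eigenvalues, n_variables):
    """Rescale MCA principal inertias (Benzecri-style adjustment, assumed).

    indicator_eigenvalues: eigenvalues from the MCA of the indicator matrix
    (equal to the square roots of the Burt-matrix eigenvalues).
    Only eigenvalues above 1/Q are retained, where Q = n_variables.
    """
    Q = n_variables
    threshold = 1.0 / Q
    return [(Q / (Q - 1)) ** 2 * (lam - threshold) ** 2
            for lam in indicator_eigenvalues if lam > threshold]


# Hypothetical eigenvalues for Q = 4 variables (illustration only).
raw = [0.5, 0.3, 0.2]
adjusted = adjusted_inertias(raw, n_variables=4)

# Share of the first dimension before and after the adjustment,
# computed here relative to the listed eigenvalues only.
raw_share = raw[0] / sum(raw)
adjusted_share = adjusted[0] / sum(adjusted)
```

In this toy case the share attributed to the first dimension rises after the adjustment, which is the direction of correction described above.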
In the analysis, the indicator variables are encoded as either 0 or 1. The squared distance between two households i and i′, taken over the categories d = 1, ..., D, is calculated as:
\[ d^{2}_{i,i'} = C \sum_{d=1}^{D} \frac{(x_{i,d} - x_{i',d})^{2}}{I_{d}} \qquad (1) \]
where C is a constant and I_d is the number of households that fall in category d.
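A minimal Python sketch of Equation 1 is given below. The constant C is set to 1 as a placeholder, and the three example households and four categories are hypothetical; the function name is likewise illustrative.

```python
def household_distance_sq(x_i, x_j, category_counts, C=1.0):
    """Squared distance between two households, following Equation 1.

    x_i, x_j: 0/1 rows of the indicator matrix for the two households.
    category_counts: I_d, the number of households in each category d.
    C: the constant in Equation 1 (1.0 is a placeholder assumption).
    """
    return C * sum((a - b) ** 2 / n_d
                   for a, b, n_d in zip(x_i, x_j, category_counts))


# Hypothetical indicator matrix: 3 households, 4 categories.
Z = [
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
# I_d: column sums of Z, i.e. households per category.
counts = [sum(row[d] for row in Z) for d in range(len(Z[0]))]

d_01 = household_distance_sq(Z[0], Z[1], counts)
d_12 = household_distance_sq(Z[1], Z[2], counts)
```

Because each squared difference is divided by I_d, a disagreement on a rarely observed category adds more to the distance than a disagreement on a common one.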
The distance between two categories d and nd is calculated from the number of households that fall within d and nd, and is given by
\[ d^{2}_{d,nd} = C' \frac{1}{I_{d} I_{nd}} \sum_{i=1}^{I} (x_{i,d} - x_{i,nd})^{2} \qquad (2) \]
where C′ is a constant, and I_d and I_nd are the number of households that fall within categories d and nd, respectively².
Following Ezzrari and Verme (2012), there are two ways of analysing poverty on a multidimensional dataset: (i) vertical analysis, in which the individual households are compared against one another; and (ii) horizontal analysis, in which the dimensions (indicator variables) are compared against one another. Extending the work of Ezzrari and Verme (2012), who follow the horizontal approach only, this chapter uses both Equations 1 and 2 to understand the nature of poverty in South Africa.
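The horizontal comparison in Equation 2 can be sketched in Python as follows. The constant C′ is set to 1 as a placeholder, and the indicator matrix and function name are hypothetical illustrations rather than the study's data or code.

```python
def category_distance_sq(Z, col_a, col_b, C_prime=1.0):
    """Squared distance between two categories, following Equation 2.

    Z: 0/1 indicator matrix (rows = households, columns = categories).
    col_a, col_b: column positions of the two categories being compared.
    C_prime: the constant C' in Equation 2 (1.0 is a placeholder assumption).
    """
    count_a = sum(row[col_a] for row in Z)   # I_d
    count_b = sum(row[col_b] for row in Z)   # I_nd
    # Number of households that fall in exactly one of the two categories.
    mismatches = sum((row[col_a] - row[col_b]) ** 2 for row in Z)
    return C_prime * mismatches / (count_a * count_b)


# Hypothetical indicator matrix: 3 households, 4 categories.
Z = [
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
dist_0_1 = category_distance_sq(Z, 0, 1)
dist_0_2 = category_distance_sq(Z, 0, 2)
```

Two categories are close when largely the same households fall in both, and far apart when they are held by different households.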
² Equations 1 and 2 are adopted from Exploratory Multivariate Analysis by Example Using R, Francois Husson, Sebastien Le and Jerome Pages (2011), Boca Raton, Fla: CRC Press.
In addition, to understand how over-indebtedness (db.sta) affects multidimensional poverty, the question of the impact, or weight, of the various indicator variables that influence the Composite Poverty Indicator (CPI) is addressed. While some researchers use the subjective approach of assigning equal weight to all indicator variables, this study uses the objective approach of taking the weights from MCA to evaluate the importance of each variable to the CPI. In this regard, the CPI is defined as a latent multidimensional combination of deprivation and non-deprivation across 12 indicator variables. It is latent because poverty here is measured through the observed proxies of d and nd. The functional form of the CPI is
\[ CPI_{i} = \frac{1}{K} \sum_{k=1}^{K} \sum_{j_{k}=1}^{J_{k}} W^{k}_{j_{k}} I^{k}_{i,j_{k}} \qquad (3) \]
with
\[ W^{k}_{j_{k}} = \frac{s^{k}}{\sqrt{\lambda}} \qquad (4) \]
where k = 1, 2, ..., K (K = 12) indexes the indicator variables; j_k = 1, ..., J_k (J_k = 2) indexes the modalities of each variable; I (0/1) is the indicator of the modality of each variable; and W is the weight, that is, the factor score s on the first dimension of MCA normalised using the eigenvalue λ of that dimension, as in Equation 4. The first dimension accounts for the largest share of the total variance, and each subsequent dimension accounts for a decreasing share.
Asselin (2009) discusses the consistency requirements (axioms) that a multidimensional poverty analysis done through MCA should satisfy as robustness checks. The monotonicity axiom states that if a household's poverty situation improves for any given indicator, its overall poverty value should decrease. This requirement has two elements. First Axis Ordering Consistency for an indicator (FAOC-I) states that there must be ordinal consistency between the ordering of an indicator's categories and the ordering of the weights across those categories, in either increasing or decreasing order. Global First Axis Ordering Consistency (FAOC-G) states that the ordering of weights should be consistent across all indicators, as evidenced by either a decreasing or an increasing trend for all of them. A binary variable always meets the FAOC-I requirement. The CPI is a by-product of MCA and deals with the multidimensionality of poverty at the household level, while aggregation at the societal level can follow the counting procedures applied in the AF methodology.
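To make Equations 3 and 4 and the FAOC-G requirement concrete, a minimal Python sketch follows. The first-axis scores, the eigenvalue, the second indicator ("water") and the function names are all hypothetical. The sketch also assumes the axis is oriented so that deprived categories carry the larger weights, which the text does not state.

```python
import math


def normalised_weight(score, eigenvalue):
    """Equation 4: first-axis factor score divided by sqrt(eigenvalue)."""
    return score / math.sqrt(eigenvalue)


def composite_poverty_indicator(household, weights):
    """Equation 3: mean, over the K indicators, of the weight attached to
    the category (modality) in which the household falls."""
    K = len(weights)
    return sum(weights[k][household[k]] for k in weights) / K


def satisfies_faoc_g(weights):
    """FAOC-G for binary indicators: the deprived ("d") weight lies on the
    same side of the non-deprived ("nd") weight for every indicator."""
    directions = {w["d"] > w["nd"] for w in weights.values()}
    return len(directions) == 1


# Hypothetical first-axis scores and eigenvalue (illustration only).
eigenvalue = 0.25
scores = {
    "db.sta": {"d": 0.6, "nd": -0.3},
    "water": {"d": 0.9, "nd": -0.1},
}
weights = {k: {c: normalised_weight(s, eigenvalue) for c, s in cats.items()}
           for k, cats in scores.items()}

cpi_deprived = composite_poverty_indicator(
    {"db.sta": "d", "water": "d"}, weights)
cpi_not_deprived = composite_poverty_indicator(
    {"db.sta": "nd", "water": "nd"}, weights)
```

Under these assumed weights, a household deprived in both indicators receives a higher CPI than one deprived in neither, in line with the monotonicity axiom.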
4.5 SUMMARY
This chapter provided a detailed description of this study's research methodology. Through a comparison with the literature, different methods of measuring over-indebtedness were discussed. These will be implemented in the next chapter to determine the extent of over-indebtedness in South Africa, in pursuance of the first objective of the study. The distribution of over-indebtedness across household characteristics will also be analysed using the National Credit Regulator measure.
The GAM will be applied in chapter six to estimate the threshold effects of household debt on multidimensional poverty. The model is flexible enough to capture very complex curves through the data points and is thus seen as the most appropriate for objective two of the study. With regard to the third objective, MCA will explore the quality of the contribution that over-indebtedness status makes to SAMPI. Other exploratory analysis techniques are geared towards continuous variables and are thus not compatible with the dataset used in this study. A significant contribution of over-indebtedness to the poverty index will therefore justify its inclusion in the South African Multidimensional Poverty Index that will be constructed in chapter eight.