
RESEARCH DESIGN AND METHODOLOGY

4.4. DATA PREPARATION

4.4.4. Step 4: Data Analysis

The previous section explained how the sample plan was constructed and culminated in describing the sample. This section describes the visual inspection and the tests of association performed to determine whether the relationships between the variables are statistically significant; finally, the analyses carried out to predict debt uptake are discussed. These steps are indicated in Figure 4.20.

Figure 4.20: Data preparation process: Step 4

The statistical techniques used in this study are classified as non-parametric, as these techniques are ideal when using data that is measured nominally. Non-parametric techniques are less strict about the assumptions regarding the underlying population distribution (Pallant, 2013:221). These tests do not require the variables to be normally distributed and are also referred to as “distribution-free” tests (Newton & Rudestam, 1999:181).

Statistical analysis is mainly concerned with descriptive statistics and inferential statistics. On the one hand, descriptive statistics has to do with the description of the sample information (section 4.4.3.6); on the other hand, inferential statistics is related to generalising the sample information to the population (Newton & Rudestam, 1999:54).

Step 1: Evaluate suitability of AMPS dataset (section 4.4.1)
• Determine whether AMPS survey incorporates required variables
• Examine the reliability and validity of AMPS data collection process
• Determine the integrity of the individual datasets

Step 2: Prepare individual datasets for current study (section 4.4.2)
• Extract relevant variables from individual datasets
• Ensure comparability of datasets over period of analysis (coding / re-coding)
• Ensure data integrity of extracted and coded/re-coded individual datasets

Step 3: Construct sample plan based on combined dataset (section 4.4.3)
• Combine the datasets
• Visual inspection of combined dataset (frequency distributions)
• Ensure data integrity of combined dataset
• Compile sample
• Ensure reliability and validity of realised sample (hierarchical cluster analysis)
• Description of the sample

Step 4: Inferential statistical analysis (section 4.4.4)
• Visual inspection (cross tabulations)
• Correlation to determine statistical significance (chi-square tests)
• Determine drivers of debt uptake (Cox proportional-hazards regression models)

4.4.4.1. Visual inspection of data

Visual inspection of the sample was performed to determine whether there appears to be a relationship between age and the uptake of each of the six debt products.

In order to carry out the visual inspection, cross tabulations were employed. Cross tabulations were carried out for debt product uptake by age for each of the debt products, namely: credit cards, home loans, overdraft, student loans, vehicle finance and other loans. The results of the cross tabulations are presented in section 5.3.
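The cross tabulation described above can be illustrated with a minimal Python sketch. It is not the SPSS procedure used in this study: the respondents, the age groups and the function name `cross_tabulate` are hypothetical and serve only to show how uptake counts are arranged by age.

```python
from collections import Counter


def cross_tabulate(rows, row_key, col_key):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter((r[row_key], r[col_key]) for r in rows)
    row_labels = sorted({r[row_key] for r in rows})
    col_labels = sorted({r[col_key] for r in rows})
    # Counter returns 0 for combinations that never occur.
    return {rl: {cl: counts[(rl, cl)] for cl in col_labels} for rl in row_labels}


# Hypothetical respondents: age group and credit card uptake (1 = taken up, 0 = not).
respondents = [
    {"age_group": "16-24", "credit_card": 0},
    {"age_group": "16-24", "credit_card": 0},
    {"age_group": "16-24", "credit_card": 1},
    {"age_group": "25-34", "credit_card": 1},
    {"age_group": "25-34", "credit_card": 1},
    {"age_group": "25-34", "credit_card": 0},
    {"age_group": "35-49", "credit_card": 1},
    {"age_group": "35-49", "credit_card": 1},
]

table = cross_tabulate(respondents, "age_group", "credit_card")
for age_group, row in table.items():
    share = row[1] / sum(row.values())
    print(f"{age_group}: no uptake={row[0]}, uptake={row[1]}, uptake share={share:.2f}")
```

Reading the uptake share down the age groups gives the kind of visual impression of a relationship that the cross tabulations are intended to provide.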

4.4.4.2. Chi-Square test for independence

The chi-square test for independence is employed when exploring the relationship between two variables that have two or more categories in each (Pallant, 2013:225). In this study, the chi-square test was performed to determine the association between age and debt product uptake and the following research hypothesis was formulated:

$H_0$: There is no statistically significant relationship between age and debt product uptake.

$H_A$: There is a statistically significant relationship between age and debt product uptake.

The aim of the chi-square test for independence is to determine whether there is a statistically significant relationship between each of the debt products and age. There are therefore six sub-hypotheses, relating to credit cards, home loans, overdraft, student loans, vehicle finance and other loans. These are analysed in section 5.3. The chi-square test for independence assumes that the lowest expected frequency in any cell is at least five. If this assumption is violated, it is suggested that Fisher’s Exact Probability test be conducted instead (Pallant, 2013:225). This test can be generated automatically in the SPSS output when performing the chi-square test. Fisher’s exact test is useful, when dealing with small sample sizes, in determining whether two dichotomous variables are significantly associated (Leedy & Ormrod, 2013:301). For a relationship to be statistically significant, the asymptotic significance must be below 0.05.
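The decision rule described above (check the expected frequencies, then fall back on Fisher’s exact test if any is below five) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the SPSS output used in this study: the 2x2 table is hypothetical, the function names are chosen for illustration, the statistic is the uncorrected Pearson chi-square, and converting that statistic to a p-value (which requires the chi-square distribution) is not shown.

```python
from math import comb


def expected_frequencies(observed):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


def chi_square_statistic(observed):
    """Pearson chi-square statistic (no continuity correction)."""
    expected = expected_frequencies(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact p-value for a 2x2 table with fixed margins."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical 2x2 table: rows = two age groups, columns = (uptake, no uptake).
observed = [[3, 1], [1, 3]]
min_expected = min(min(row) for row in expected_frequencies(observed))
statistic = chi_square_statistic(observed)

if min_expected < 5:
    p_value = fisher_exact_two_sided(observed)
    print(f"Lowest expected frequency {min_expected:.1f} < 5: Fisher p = {p_value:.4f}")
else:
    print(f"Chi-square statistic = {statistic:.3f}")
```

Fisher’s exact test as sketched here applies to 2x2 tables only; for larger tables, such as several age groups by uptake, statistical software would normally be relied on.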

4.4.4.3. Cox proportional-hazards regression model

The study made use of the Cox proportional-hazards regression model to explore the relationship between the uptake of debt and a number of explanatory variables.

The Cox proportional-hazards regression model forms part of survival or event history techniques (Adams, 1996:271). Survival analysis quantitatively determines the effect that a group of variables has on the time to an event occurring (Ansell, Harrison & Archibald, 2007:395). As mentioned by Ansell et al. (2007:395), survival analysis has been widely used in areas such as medicine and industry, but it can also be useful in a study of this nature to determine which of the independent variables have a predictive effect on the hazard taking place; that is, on the dependent variable (the uptake of the particular debt product). The dependent and independent variables that are components of this model are listed in section 4.4.1.1. Before discussing how the model was applied in this study, background information on the model is provided, including the assumptions and advantages associated with using this type of model.

The Cox regression model was presented by Sir David Cox in 1972 in a paper entitled “Regression models and life tables”. In this paper, he introduced the proportional hazards model (Mills, 2011:86). The model is called semi-parametric because it is flexible and does not oblige the researcher to choose a specific probability distribution beforehand (Mills, 2011:90). The model makes no assumption regarding the shape of the hazard function (Mills, 2011:12), and a distinct advantage is that a parameter estimate (β) can still be produced even though the baseline hazard is not specified (Mills, 2011:91). Another facet of this model, which makes it applicable to this study, is that it can accommodate censored subjects (Mills, 2011:88). The censored subjects in this study are the “0” values; that is, respondents who have not taken up the particular debt product.

Age of the respondents acted as the identifier and the independent variables were analysed in order to determine whether they are predictors for the uptake of the dependent variables. Thus, survival analysis was used to determine the impact of the independent variables on the time to the occurrence of an event, which is the take-up of the debt product. The proportional hazards model makes the assumption that the time to an event taking place is described by the hazard function. The formula for the proportional hazards model is given as follows (Mills, 2011:87):

$$h_i(t) = h_0(t)\exp(\beta_1 x_{i1} + \cdots + \beta_k x_{ik})$$

Here, the hazard for an individual $i$ at time $t$ is the product of two factors, namely the baseline hazard and an exponential function that describes the effect of the covariates (SPSS Inc., 2015). The $x$ terms represent the covariates and $h_0(t)$ is the unspecified baseline hazard function. The baseline hazard function may be interpreted as the hazard function for which all the covariates have a zero value (Mills, 2011:87) and is thus independent of the covariates.
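To illustrate how the coefficients in the formula above can be estimated without specifying the baseline hazard, the following Python sketch evaluates the Cox partial log-likelihood for a single covariate. It is a minimal illustration under stated assumptions, not the SPSS procedure used in this study: the data are hypothetical, event ages are assumed to be untied, and the function name `partial_log_likelihood` and the coarse grid search are chosen for illustration only.

```python
from math import exp, log


def partial_log_likelihood(beta, times, events, x):
    """Cox partial log-likelihood for one covariate, assuming no tied event times."""
    total = 0.0
    for i, (t_i, event_i) in enumerate(zip(times, events)):
        if not event_i:
            # No uptake (coded 0): no term of its own, but the respondent
            # still appears in the risk sets of earlier events below.
            continue
        # Risk set: everyone still "at risk" at age t_i.
        risk_set = [j for j, t_j in enumerate(times) if t_j >= t_i]
        total += beta * x[i] - log(sum(exp(beta * x[j]) for j in risk_set))
    return total


# Hypothetical data: age at uptake (or age at interview when there is no uptake),
# uptake indicator (1 = taken up, 0 = not) and one binary covariate.
ages = [25, 30, 35, 40]
uptake = [1, 1, 0, 1]
covariate = [1, 0, 1, 0]

# Coarse grid search for the coefficient that maximises the partial likelihood.
betas = [k / 100 for k in range(-200, 201)]
best_beta = max(betas, key=lambda b: partial_log_likelihood(b, ages, uptake, covariate))
print(f"beta maximising the partial likelihood on the grid: {best_beta:.2f}")
```

The baseline hazard $h_0(t)$ never appears in the function, because it cancels within each risk set; this is why a coefficient can be estimated while the baseline hazard is left unspecified.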

For this study, the elements of the equation can be explained as follows with regard to credit card uptake:

i = credit card

t = age

x = the independent variables/covariates (life stage, number of financial assets, housing assets, currently living with parents, children/dependents up to 12 years of age, children/dependents 13 years plus, level of education, household income, personal income, marital status, family size, work status, self-employed, occupation, changed jobs in the past 12 months, got married in the past 12 months, moved in the past 12 months, spent money on education in the past 12 months)

$h_0(\text{age})$ = the baseline hazard, which is the probability of taking up credit card debt

The shape of the hazard function over time is determined by the baseline hazard, while the covariates determine the overall magnitude of the function. The baseline hazard is time dependent, whereas the covariate effect remains the same at all time points (SPSS Inc., 2015). The Cox proportional-hazards regression model makes the following assumption: the hazard for an individual is a fixed proportion of the hazard for any other individual (Mills, 2011:88). The ratio of the hazards of any two individuals at any time is therefore the ratio of their covariate effects (SPSS Inc., 2015) and remains constant over time (Mills, 2011:88). Thus, the hazards should be parallel to each other (Mills, 2011:88; Kembo, 2009:49).
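The proportionality assumption can be made explicit with a short derivation. For two individuals $i$ and $j$ with covariate values $x_{i1}, \dots, x_{ik}$ and $x_{j1}, \dots, x_{jk}$ (the index $j$ is introduced here for illustration), the baseline hazard cancels out of the ratio of their hazards:

$$\frac{h_i(t)}{h_j(t)} = \frac{h_0(t)\exp(\beta_1 x_{i1} + \cdots + \beta_k x_{ik})}{h_0(t)\exp(\beta_1 x_{j1} + \cdots + \beta_k x_{jk})} = \exp\{\beta_1 (x_{i1} - x_{j1}) + \cdots + \beta_k (x_{ik} - x_{jk})\}$$

The right-hand side does not depend on $t$, so the ratio is constant over time. Taking logarithms gives $\log h_i(t) - \log h_j(t) = \beta_1 (x_{i1} - x_{j1}) + \cdots + \beta_k (x_{ik} - x_{jk})$, which is the sense in which the hazards are parallel: on a logarithmic scale they differ by the same constant at every age.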

Survival analysis may be useful for policy decision makers and stakeholders, as it will assist them in gaining insight into when households are most likely to take on a particular debt product. For the purposes of this study, when assessing the results of the Cox regression analyses, the statistical significance of the relationship and the beta coefficient are of importance. Statistical significance indicates that a relationship found in the sample is unlikely to be the result of chance factors (Neuman, 2006:371). Significance was assessed at the following levels: p<.5, p<.05 and p<.01. The beta coefficient (β) represents the amount of change in the dependent variable resulting from a one-unit change in the independent variable, with all other independent variables held constant (Newton & Rudestam, 1999:266). The beta coefficient may be positive or negative, with a positive value denoting an increase in the hazard of the event occurring and a negative value denoting a decrease. The larger the absolute value of the beta coefficient, the greater the predictive effect of the independent variable on the dependent variable. Thus, when the coefficient is greater than 0, the hazard of experiencing the event increases; when the coefficient is less than 0, the hazard decreases (Mills, 2011:96). In line with other regression models, the outcome will be influenced by the variables that have been chosen to be included in or excluded from the analysis (Ansell et al., 2007:400).

In order to carry out the analysis, the Survival command in SPSS 23.0 was used to construct the Cox regression, using the combined dataset, which was filtered as described in section 4.4.3.4 to obtain the sample for the analyses. The commands shown in Figure 4.21 for credit cards were carried out for each of the other five debt products, namely: home loans, overdraft, student loans, vehicle finance and other loans. For a description of the covariates, refer to section 4.4.2.1.

Figure 4.21: Cox Regression

Source: Author’s own.

The results of the Cox regressions are discussed in section 5.3 and section 5.4.
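The interpretation of the sign and size of the beta coefficient described in this section can be illustrated with a short Python sketch. In the proportional hazards formula, a one-unit increase in a covariate multiplies the hazard by exp(β). The coefficient values below are hypothetical and are not results of this study; the function names are chosen for illustration.

```python
from math import exp


def hazard_ratio(beta):
    """Multiplicative change in the hazard for a one-unit increase in a covariate."""
    return exp(beta)


def percent_change_in_hazard(beta):
    """The same effect expressed as a percentage change in the hazard."""
    return (exp(beta) - 1.0) * 100.0


# Hypothetical coefficients, chosen only to show the three possible cases.
hypothetical_betas = {
    "positive coefficient": 0.5,
    "zero coefficient": 0.0,
    "negative coefficient": -0.5,
}

for label, beta in hypothetical_betas.items():
    print(
        f"{label}: beta={beta:+.2f}, "
        f"hazard ratio={hazard_ratio(beta):.3f}, "
        f"change in hazard={percent_change_in_hazard(beta):+.1f}%"
    )
```

A coefficient above zero gives a hazard ratio above one (uptake becomes more likely at any given age), a coefficient below zero gives a ratio below one, and a coefficient of zero leaves the hazard unchanged.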

4.5. ETHICAL CONSIDERATIONS

The Bureau of Market Research has obtained permission from the South African Audience and Research Foundation (SAARF) to use the secondary data for its various research projects, and the data has been used extensively for this purpose. SAARF complies with its ethical code and obtains consent from all survey participants prior to performing the at-home interviews. SAARF is a member of the South African Market Research Association (SAMRA). SAMRA is a professional association and has a code of conduct to which all its members subscribe.

Ethical clearance for the purposes of this study, utilising secondary data, was granted by the Ethics Review Committee of the School of Accounting Sciences.