Data Management and Statistical Analysis - Perceived discrimination, coping options and their relationship to mental health and psychological distress in homeless adults

All surveys were checked for data collection errors or inconsistencies. Prior to data entry, a coding scheme was developed that provided instructions for each of the questions included in the survey. All the raw data was entered into the IBM SPSS Statistics 21 package and checked for input errors. Following this, the data was recoded where necessary, and subscale and total scores were calculated according to the instructions for each measure. Questionnaires and consent forms were stored securely and all electronic information was stored on an encrypted drive operated by the University of Leeds.

Missing responses, arising either from participants choosing not to answer questions or from collection errors, were coded accordingly, and the results presented are adjusted for missing data. With the exception of Outten et al.’s (2009) coping options scale, where 4.8% of the data was missing (equivalent to nine participants), all measures had less than 2% missing data. The amount of missing data was highest for the coping options scale because some participants chose not to answer the questions: they did not perceive any discrimination towards them and therefore did not consider the questions relevant. Tabachnick and Fidell (2007) state that missing data of less than 5% does not cause concern and that any procedure to handle missing values will yield similar results. For the current research, missing data was handled by pairwise deletion in all analyses except the mediation analyses, where listwise deletion is considered most appropriate (Hayes, 2013).
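The difference between the two deletion rules described above can be illustrated with a minimal sketch. The variable names and response values below are hypothetical and are not taken from the study data; the analyses themselves were run in SPSS, not Python.

```python
from itertools import combinations

# Hypothetical scale scores; None marks a missing response.
responses = [
    {"discrimination": 3.2, "coping": 2.5, "distress": 14.0},
    {"discrimination": 1.0, "coping": None, "distress": 9.0},
    {"discrimination": 2.4, "coping": 3.1, "distress": None},
    {"discrimination": 4.0, "coping": 1.8, "distress": 20.0},
]


def listwise(rows, variables):
    """Keep only cases with a value on every variable in the analysis."""
    return [r for r in rows if all(r[v] is not None for v in variables)]


def pairwise(rows, var_a, var_b):
    """Keep cases with a value on both variables of a single pair."""
    return [r for r in rows if r[var_a] is not None and r[var_b] is not None]


def percent_missing(rows, variable):
    """Percentage of cases with no value on the given variable."""
    missing = sum(1 for r in rows if r[variable] is None)
    return 100.0 * missing / len(rows)


variables = ["discrimination", "coping", "distress"]

# Pairwise deletion: each pair of variables uses every case available for it.
pair_counts = {
    (a, b): len(pairwise(responses, a, b))
    for a, b in combinations(variables, 2)
}

# Listwise deletion: one common set of complete cases for the whole model.
listwise_count = len(listwise(responses, variables))
```

In this toy example, pairwise deletion retains three cases for two of the variable pairs, whereas listwise deletion retains only the two fully complete cases. Listwise deletion keeps every path in a mediation model estimated on the same cases.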

2.6.1 Internal reliability of survey measures.

Cronbach’s alphas were calculated for each measure. Scores of 0.7 or greater were considered to be good. All scales had Cronbach’s alphas in excess of 0.7 except the individual problem-focused subscale of the coping measure (α = .53) and the individual mobility scale (α = .59). Removing items from these scales did not improve their reliability and therefore no amendments were made. Blanz et al. (1998) reported an acceptable Cronbach’s alpha for the individual mobility scale (α = .86); the reason for the lower alpha in the current sample is unknown. Outten et al. (2009) reported a similarly low, albeit higher, Cronbach’s alpha for the individual problem-focused scale (α = .66). Where scales or subscales had only two items, the correlation coefficient is also reported. Table 26 in Appendix 1 provides the Cronbach’s alphas for all scales and subscales and the correlation coefficients where relevant.
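For readers unfamiliar with the statistic, the following is a minimal sketch of the standard Cronbach's alpha formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name and the input layout are illustrative assumptions; the study's values were produced by SPSS.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: one list of scores per item, each with one entry per respondent
    (complete cases only; all lists must have the same length).
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Total scale score for each respondent.
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))
```

A call such as `cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4]])` gives an alpha of 1 (up to floating-point rounding) because the two items are perfectly consistent. Values below the 0.7 threshold used here indicate weaker internal consistency.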

2.6.2 Normality.

Table 27 in Appendix 1 presents the numerical normality data for each measure. Normality was assessed using values for skew and kurtosis, the Kolmogorov-Smirnov statistic, and the shape of the histograms and normal Q-Q plots. Two criteria were considered in determining whether parametric or non-parametric statistics would be most appropriate. The first was a rule of thumb that skew and kurtosis should fall between plus and minus one. The second was that the Z scores for skew and kurtosis (the score divided by its standard error) should be less than 3.29 in absolute value, a cut-off chosen on the basis of sample size (Fife-Schaw, n.d.). Scales and subscales were considered to be normally distributed if they met both criteria, marginally normally distributed if they met one criterion, and non-normally distributed if they met neither. On this basis, 11 scales or subscales were considered normally distributed, six were in the marginal range and eight did not meet either criterion. As a result of the majority of the data meeting both criteria, the overall data was considered not to violate the assumptions of normality, and it was therefore considered appropriate to use parametric statistics.

To further ensure that the use of parametric statistics was appropriate, two additional steps were employed. First, preliminary analyses were conducted to ensure that the data did not violate the assumptions of normality, linearity, multicollinearity and homoscedasticity for multiple regressions. Second, bias-corrected bootstrap analyses based on 5,000 bootstrap draws were used to examine mediated models. This analysis does not rely on the assumption that the data is normally distributed (Hayes, 2013).
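The two-criterion classification described above can be sketched as a small function. This is an illustration only: the function name is hypothetical, and treating the plus-or-minus-one bounds as strict inequalities is an assumption, since the text does not say whether the bounds are inclusive.

```python
def classify_normality(skew, se_skew, kurtosis, se_kurtosis, z_cutoff=3.29):
    """Classify a scale as 'normal', 'marginal' or 'non-normal'.

    Criterion 1: skew and kurtosis both lie between -1 and +1
                 (strict bounds are an assumption).
    Criterion 2: the Z scores (statistic / standard error) for skew and
                 kurtosis are both below the cut-off in absolute value.
    """
    within_unit_range = abs(skew) < 1 and abs(kurtosis) < 1
    z_scores_acceptable = (
        abs(skew / se_skew) < z_cutoff
        and abs(kurtosis / se_kurtosis) < z_cutoff
    )
    criteria_met = int(within_unit_range) + int(z_scores_acceptable)
    return {2: "normal", 1: "marginal", 0: "non-normal"}[criteria_met]
```

For instance, a scale with skew 1.2 (standard error 0.5) and kurtosis 0.5 (standard error 0.9) fails the first criterion but passes the second, so it would be classified as marginal.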

2.6.3 Analytic approach.

The IBM SPSS Statistics 21 package was used to complete all analyses. Pearson product-moment correlation coefficients were calculated for descriptive purposes. Hierarchical multiple regressions were used to assess for significant direct relationships between variables while controlling for covariates (gender and age in the present research). Standardized regression coefficients were reported for multiple regressions; these are the expected differences in the dependent variable, in standard deviations, between two cases that differ by one standard deviation on the predictor variable (Hayes, 2013). In contrast, unstandardized regression coefficients report the expected difference in the dependent variable, in the variable’s original units, between two cases that differ by one unit on the predictor variable. As recommended by Hayes (2013), unstandardized coefficients were reported for all indirect effects, as he considers standardized results to be less meaningful when testing for mediation. The coefficient of determination (R²) was also reported, which is the percentage of variation in the dependent variable explained by the predictor variables (Hinton, 2004).
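The relationship between the unstandardized coefficient, the standardized coefficient and R² can be shown with a deliberately simplified sketch. It assumes a single predictor and no covariates, unlike the hierarchical multiple regressions used in the research, and the function name is hypothetical.

```python
from statistics import mean, stdev


def simple_regression(x, y):
    """Return (b, beta, r_squared) for an OLS regression of y on one predictor x."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    # Unstandardized slope: change in y (original units) per one-unit change in x.
    b = sxy / sxx
    intercept = mean_y - b * mean_x

    # Standardized slope: change in y (in SDs) per one-SD change in x.
    beta = b * stdev(x) / stdev(y)

    # Coefficient of determination: share of variation in y explained by x.
    ss_residual = sum((yi - (intercept + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_residual / ss_total
    return b, beta, r_squared
```

With one predictor the standardized coefficient equals the Pearson correlation, so its square equals R². With several predictors, each coefficient is rescaled in the same way (by the ratio of that predictor's standard deviation to the outcome's), but this equality no longer holds.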

To examine for indirect effects (i.e. mediated effects), Hayes’ (2013) PROCESS macro was used. This macro is an add-on to SPSS which uses path analysis (a statistical method of testing for cause and effect relationships) to test for moderation and mediation. Only the mediation function was used for the current research, which examines whether variables are significantly associated with one another through their relationship with other variables.

For mediation models, multiple mediator variables can be specified, where the predictor variable is modelled as influencing the dependent variable directly as well as indirectly through two or more mediators that operate in parallel. The macro uses ordinary least squares path analysis to generate unstandardized model coefficients and confidence intervals for three types of effect: the direct effect (the effect of the predictor variable on the outcome variable, e.g. for RQ2 the relationship between perceived discrimination and psychological outcomes); the total indirect effect (the mediated effect of the predictor variable on the outcome variable through all mediators, e.g. for RQ2 the relationship between perceived discrimination and psychological outcomes mediated by all homeless social identity components); and the specific indirect effects (the mediated effect of the predictor variable on the outcome variable through each mediator individually, e.g. for RQ2 the relationship between perceived discrimination and psychological outcomes mediated by each identity component separately). Bias-corrected (BC) bootstrapping was used to assess whether indirect effects were significant. Bootstrap confidence intervals are constructed “by taking a random sample with replacement of size n from the sample, estimating each specific indirect effect...in the resulting data, and repeating this resampling and estimation many times.” (Hayes, 2013, p. 139). By estimating each specific indirect effect thousands of times, the endpoints of the confidence interval can be calculated. If the confidence interval does not include zero, the indirect effect is considered to be significantly different from zero. For the current research, 5,000 bootstrap samples were used with a 95% confidence interval. Hayes (2013) notes that BC bootstrapping is the preferred approach to determining significance, as it is more powerful than the normal theory approach and, as mentioned above, does not rely on the assumption of normally distributed data.
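The resampling procedure quoted above can be sketched as follows. This is not the PROCESS macro itself: it is a simplified illustration that assumes a single mediator and no covariates, and all function and parameter names are hypothetical. The bias correction shown is the standard one, in which the percentile cut-offs are shifted according to the proportion of bootstrap estimates that fall below the original estimate.

```python
import random
from statistics import NormalDist, mean


def indirect_effect(x, m, y):
    """Indirect effect a*b: a from M ~ X, b from Y ~ X + M (both OLS)."""
    mx, mm, my = mean(x), mean(m), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    smm = sum((mi - mm) ** 2 for mi in m)
    sxm = sum((xi - mx) * (mi - mm) for xi, mi in zip(x, m))
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    smy = sum((mi - mm) * (yi - my) for mi, yi in zip(m, y))
    a = sxm / sxx                                            # path X -> M
    b = (smy * sxx - sxy * sxm) / (sxx * smm - sxm ** 2)     # path M -> Y given X
    return a * b


def bc_bootstrap_ci(x, m, y, draws=5000, level=0.95, seed=0):
    """Return (estimate, lower, upper) using a bias-corrected bootstrap CI."""
    rng = random.Random(seed)
    n = len(x)
    estimate = indirect_effect(x, m, y)

    boot = []
    for _ in range(draws):
        # Resample n cases with replacement and re-estimate the indirect effect.
        idx = [rng.randrange(n) for _ in range(n)]
        boot.append(indirect_effect([x[i] for i in idx],
                                    [m[i] for i in idx],
                                    [y[i] for i in idx]))
    boot.sort()

    normal = NormalDist()
    # Bias correction: how far the bootstrap distribution sits from the estimate.
    below = sum(1 for value in boot if value < estimate) / draws
    below = min(max(below, 1 / (draws + 1)), draws / (draws + 1))  # avoid 0 and 1
    z0 = normal.inv_cdf(below)

    alpha = 1 - level
    lower_p = normal.cdf(2 * z0 + normal.inv_cdf(alpha / 2))
    upper_p = normal.cdf(2 * z0 + normal.inv_cdf(1 - alpha / 2))
    lower = boot[min(draws - 1, max(0, int(lower_p * draws)))]
    upper = boot[min(draws - 1, max(0, int(upper_p * draws)))]
    return estimate, lower, upper
```

If the returned interval excludes zero, the indirect effect would be treated as significant under the rule described above. A model with several parallel mediators repeats the same resampling logic for each specific indirect effect and for their sum.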

While the research involves multiple analyses, the results of these analyses were used only to inform which variables were included in the six models that were tested to answer the five research questions and to determine the overall percentage of variance accounted for. The PROCESS macro used in the research can estimate the direct and indirect effects of multiple variables in each model simultaneously, so the main results are informed by six larger analyses. The p value was therefore not adjusted to account for multiple analyses.

3. Results
