Quantitative data analysis

CHAPTER 4 METHODOLOGY & METHODS

4.6 Data collection tools

4.6.10 Data analysis

4.6.10.1 Quantitative data analysis

Most of the structured questions in the quantitative study had multiple-choice responses. Cross tabulations were generated between various explanatory variables (e.g. demographic variables, occupation and socio-economic characteristics) and the outcome variables. The main outcome variables for this study were self-reported health status (physical health), mental health (had a mental health problem in the last month abroad), perceived health risks at work, accidents at work and visits to the doctor (utilisation of health care services). Health care utilisation was captured by whether or not respondents had visited a doctor in the last twelve months. A code book consisting of the original coding used in the questionnaire and any re-coding for the analysis of each of the explanatory and outcome variables is presented in Appendix 4 (see Tables 4.4 to 4.6). Chi-square tests with continuity correction were applied to investigate the association between variables in 2 by 2 tables, and the Pearson Chi-square test was applied to other forms of tables (e.g. 3 by 2 tables). A Chi-square test for trend was applied to investigate the association between ordinal and categorical variables or binary outcome variables (Field, 2005).
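The choice between the continuity-corrected and the Pearson Chi-square test can be sketched as follows. This is a minimal illustration in plain Python rather than the SPSS procedure used in the study; the function names and the counts are hypothetical, and the p-value helper only covers one and two degrees of freedom, where closed forms exist.

```python
import math

def chi_square(table, yates=None):
    """Return (statistic, degrees_of_freedom) for an r x c table of counts.

    By default the continuity (Yates) correction is applied only to
    2 x 2 tables; other tables get the plain Pearson statistic.
    """
    rows, cols = len(table), len(table[0])
    if yates is None:
        yates = (rows == 2 and cols == 2)
    row_tot = [sum(r) for r in table]
    col_tot = [sum(r[j] for r in table) for j in range(cols)]
    n = sum(row_tot)
    stat = 0.0
    for i in range(rows):
        for j in range(cols):
            expected = row_tot[i] * col_tot[j] / n
            diff = abs(table[i][j] - expected)
            if yates:
                # Shrink each deviation by 0.5, never below zero.
                diff = max(0.0, diff - 0.5)
            stat += diff ** 2 / expected
    return stat, (rows - 1) * (cols - 1)

def p_value(stat, df):
    """Upper-tail p-value of the chi-square distribution for df = 1 or 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("this sketch only handles df = 1 or df = 2")

# Hypothetical counts, not taken from the study.
two_by_two = [[10, 20], [30, 40]]                 # continuity correction applied
three_by_two = [[10, 20], [20, 20], [30, 10]]     # Pearson statistic

stat_2x2, df_2x2 = chi_square(two_by_two)
stat_3x2, df_3x2 = chi_square(three_by_two)
p_2x2 = p_value(stat_2x2, df_2x2)
p_3x2 = p_value(stat_3x2, df_3x2)
```

The Chi-square test for trend is not covered by this sketch.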

Multiple logistic regression using the enter method was used to investigate the associations between the major independent predictors and the dichotomous outcome variables. The enter method was chosen to keep variable selection under the researcher's control (ibid.). Outcome variables were dichotomised with the coding running in the direction of worsening health status, e.g. very good/good/fair coded as 0 and poor/very poor coded as 1 for self-reported health status (see Table 4.5). Similarly, the explanatory variables were coded in the same direction to keep them consistent with the outcome variables, e.g. very good/good/fair as 0 and poor/very poor as 1 for self-rated work environment (see Table 4.4).

The researcher used multiple logistic regression because cross tabulations only provide a simple association between outcome variables and independent variables (predictors), whereas regression analysis is an accepted statistical method for assessing the association between independent variables (risk factors) and outcome variables while statistically adjusting for the potential confounding effects of other covariates (Lee, 1986). Furthermore, logistic regression is the most popular technique available for modelling dichotomous dependent variables (for example, Hosmer & Lemeshow, 2000; LaValley, 2008).
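To illustrate what adjusting for another covariate means in practice, the following is a minimal sketch of a logistic regression fitted by Newton-Raphson in plain Python. It is not the SPSS procedure used in the study: the function names, the two binary predictors and all counts are hypothetical, and no standard errors or significance tests are computed.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_logistic(X, y, iterations=25):
    """Return [intercept, b1, b2, ...] for a logistic model of y on X.

    All predictors are entered together, as in the enter method.
    """
    rows = [[1.0] + list(r) for r in X]   # prepend the intercept column
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for r, yi in zip(rows, y):
            p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, r))))
            for i in range(k):
                grad[i] += (yi - p) * r[i]
                for j in range(k):
                    hess[i][j] += p * (1 - p) * r[i] * r[j]
        step = solve(hess, grad)          # Newton step: H^-1 * gradient
        beta = [b + s for b, s in zip(beta, step)]
    return beta

# Hypothetical counts: (exposure, covariate) -> (cases, non-cases).
counts = {(0, 0): (1, 7), (1, 0): (2, 4), (0, 1): (3, 5), (1, 1): (6, 4)}
X, y = [], []
for (exposure, covariate), (cases, non_cases) in counts.items():
    X += [[exposure, covariate]] * (cases + non_cases)
    y += [1] * cases + [0] * non_cases

beta = fit_logistic(X, y)
# exp(coefficient) is the odds ratio for each predictor, adjusted for the other.
adjusted_odds_ratios = [math.exp(b) for b in beta[1:]]
```

With a single binary predictor the fitted odds ratio reduces to the cross-product ratio of the 2 by 2 table, which is the simple association a cross tabulation gives; adding the second predictor is what yields the adjusted estimate.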

Ordinal regression analysis was not appropriate for this study because some categories of the outcome variables (e.g. self-reported health status and mental health problems) contained small numbers of cases, and such small numbers do not strengthen the analysis. Hence a decision was made to use simple logistic regression analysis, combining the outcome categories into two groups (Manor et al., 2000; Petrie & Sabin, 2009).

For the analysis, independent variables were added in blocks: demographic, socio-economic, type of job (which correlates highly with country), country of work, health and lifestyle characteristics. The researcher used SPSS to select the parsimonious model using the enter method. A parsimonious (simpler) model (Field, 2005) was then selected that included only those explanatory variables found to be statistically significant (p<0.05) in the preliminary analysis. This process should, in theory, lead to a model with stronger associations (i.e. we can be more certain that the findings are significant), although it is noted that such a model may explain a slightly smaller part of the associations found. The Nagelkerke R-square statistic was used to measure the variance in the data explained by the models, i.e. how well the model fits the data (Kinner & Gray, 2010). For the purpose of analysis, the outcome variable self-reported health status, originally consisting of five categories, was dichotomised (see code book in Appendix 4, Table 4.5): those reporting poor or very poor health (“poor general health”) were recoded as 1, and those reporting fair or good health were recoded as 0; no respondents reported their health as “very good”.
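For reference, the Nagelkerke R-square can be computed from the log-likelihoods of the intercept-only model and the fitted model. The sketch below uses the standard definition (the Cox and Snell R-square rescaled by its maximum attainable value); the function and argument names are hypothetical, and the log-likelihoods would in practice come from the statistical package.

```python
import math

def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke pseudo R-square.

    ll_null  -- log-likelihood of the intercept-only model
    ll_model -- log-likelihood of the fitted model
    n        -- number of observations
    """
    # Cox and Snell R-square: 1 - (L_null / L_model) ** (2 / n)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Its upper bound, reached when the model predicts perfectly (ll_model = 0)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell
```

The value is 0 when the predictors add nothing over the intercept-only model and 1 when the model predicts every outcome perfectly. It is a pseudo R-square, so it is only an approximate analogue of the proportion of variance explained in linear regression.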

The other outcome variable, mental health problems (i.e. reported feelings of nervousness, hopelessness, restlessness, depression, worthlessness, or that everything was an effort, in the last month abroad), originally consisting of six categories (all of the time, most of the time, some of the time, a little of the time, none of the time and don’t know), was grouped into two categories. Category 1 consisted of the responses “all of the time/most of the time/some of the time/a little of the time”. Category 2 consisted of “none of the time” (no cases reported “don’t know”). In the analysis, respondents in category 1 were recoded as 1 and those in category 2 were recoded as 0 (see code book, Appendix 4, Table 4.6).

The outcome variable perceived health risks at work consisted of three categories. The “don’t know” group was combined with the “no” group because it contained very few cases (only three responses), so it was not appropriate to treat it as a separate group in this analysis. In the analysis, respondents who perceived health risks at work were recoded as 1, while those who did not perceive health risks and those who reported “don’t know” were recoded as 0. The next outcome variable, accidents at work (have you experienced a work-related accident abroad?), consisted of two categories. In the analysis, respondents who reported “yes” were recoded as 1 whilst those who reported “no” were recoded as 0 (see Appendix 4, Table 4.5).

Similarly, another outcome variable, health care utilisation or doctor visits (how many times in the last 12 months have you visited a doctor in your host country?), was categorised into two groups. In the analysis, respondents who had not visited a doctor were recoded as 1 whilst those who had visited a doctor were recoded as 0. The explanatory variables health insurance and doctor registration, originally consisting of three categories, were collapsed into two groups. Respondents who reported “no/don’t know” were recoded as 1 and those who reported “yes” were recoded as 0. The “don’t know” category was combined with the “no” category because it was assumed most likely that those who did not know whether they had health insurance (or registration with a doctor) did not have any.
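The recoding rules described in this section can be summarised as simple lookup tables. The sketch below is only an illustration in Python: the variable and function names are hypothetical, the response labels are written in lower case for convenience, and the authoritative coding remains the code book in Appendix 4.

```python
# Each mapping sends an original response to its dichotomised code,
# with 1 always marking the worse or at-risk category.
SELF_REPORTED_HEALTH = {
    "very good": 0, "good": 0, "fair": 0,
    "poor": 1, "very poor": 1,
}
MENTAL_HEALTH_PROBLEM = {
    "all of the time": 1, "most of the time": 1,
    "some of the time": 1, "a little of the time": 1,
    "none of the time": 0,
    # "don't know" is omitted: no respondent chose it.
}
PERCEIVED_HEALTH_RISK = {"yes": 1, "no": 0, "don't know": 0}
ACCIDENT_AT_WORK = {"yes": 1, "no": 0}
# Used for both health insurance and doctor registration.
INSURANCE_OR_REGISTRATION = {"yes": 0, "no": 1, "don't know": 1}

def recode(response, mapping):
    """Look up the dichotomised code for a questionnaire response."""
    return mapping[response.strip().lower()]

def recode_doctor_visits(visits_last_12_months):
    """Code 1 for no doctor visit in the last 12 months, 0 otherwise."""
    return 1 if visits_last_12_months == 0 else 0
```

Note that “don’t know” is coded 0 for perceived health risks at work but 1 for health insurance and doctor registration, following the different justifications given in the text.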

In document Health status and health risks of male Nepalese migrants in the Middle East and Malaysia. (Page 86-89)
