3.3 Stages of the study

3.3.5 Stage 5 data analysis

The data analysis associated with the validation of the questionnaire is described in chapter four. The analysis of the mixed methods data collected for the case comparison aspect of this study was guided by a seven-stage process described by Onwuegbuzie and Teddlie (2003). This process was as follows: a) data reduction, b) data display, c) data transformation, d) data correlation, e) data consolidation, f) data comparison and g) data integration.

3.3.5.1 Quantitative data reduction and display

Quantitative data from the questionnaire were analysed using SPSS version 16. Data were displayed using graphs and tables. The critical P value was set at 0.05.

Numerical data were described in terms of means, standard deviations (SD), standard error of the mean (SE) and 95% confidence intervals (CI). Numerical data were assessed for normal distribution. The sample size above which non-normality and unequal variances can be ignored varies, but Bland (2000) suggested 50 or more in each group. Ruxton (2006) suggested using the unequal variance t-test unless the sample sizes were identical. In this study, where Levene’s test was significant, indicating unequal variances, an unequal variance t-test was used to analyse the mean differences between constructs, and where the group size was < 50 a Mann-Whitney U test was performed to confirm the findings. Where Levene’s test was not significant, indicating equal variances, an equal variance t-test was used in the analysis.
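The decision rule above can be illustrated with a short sketch. The analysis in this study was run in SPSS; the Python below is only an illustration under stated assumptions, and the function names (`choose_t_test`, `welch_t`, `needs_rank_check`) are hypothetical. It computes the unequal variance (Welch) t statistic and its Welch-Satterthwaite degrees of freedom, takes the Levene's test p value as an input rather than computing it, and assumes the rank-based check applies whenever either group has fewer than 50 members.

```python
import math
import statistics


def choose_t_test(levene_p, alpha=0.05):
    """Pick the t-test variant from the Levene's test p value (assumed input)."""
    if levene_p < alpha:
        return "unequal variance t-test"
    return "equal variance t-test"


def welch_t(a, b):
    """Return (t, df) for the unequal variance (Welch) t-test on two samples."""
    n1, n2 = len(a), len(b)
    m1, m2 = statistics.mean(a), statistics.mean(b)
    # Sample variances (n - 1 denominator), each divided by its group size.
    q1 = statistics.variance(a) / n1
    q2 = statistics.variance(b) / n2
    t = (m1 - m2) / math.sqrt(q1 + q2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (q1 + q2) ** 2 / (q1 ** 2 / (n1 - 1) + q2 ** 2 / (n2 - 1))
    return t, df


def needs_rank_check(a, b, threshold=50):
    """True if either group is smaller than the threshold (assumed reading of the rule)."""
    return min(len(a), len(b)) < threshold


if __name__ == "__main__":
    group_a = [1, 2, 3, 4, 5]    # hypothetical scores, not study data
    group_b = [2, 4, 6, 8, 10]   # hypothetical scores, not study data
    print(choose_t_test(levene_p=0.01))
    print(welch_t(group_a, group_b))
    print(needs_rank_check(group_a, group_b))
```

The p value for the t statistic would then be read from a t distribution with the returned degrees of freedom, which a statistics package normally does automatically.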

Categorical data were described in terms of frequencies, analysed using the Pearson Chi square test and displayed using 2 x 2 and 2 x 3 tables. The chi square statistic, p value and odds ratios were calculated with the 95% CI. Relative risk was also calculated. Relative risk is the ratio of the risk of an outcome (here, an impairment) in one group to the risk in the comparison group. A relative risk of 1 indicates that there is no difference between the groups, while a relative risk of 2 indicates that the group with the condition of interest is twice as likely to have the impairment. It was decided to present data in relation to ‘relative risk’ as well as odds ratios, whilst acknowledging that odds ratios tend to exaggerate the probability of an impairment if the prevalence of the outcome being investigated is above 10% (Grimes and Schulz 2008). Where numbers in any cell were below 5, a Fisher’s Exact test was used for comparing proportions. Calculations for the Fisher’s Exact test are one sided, but where the marginal totals are different it is important to get a two sided test (Bland 2000). To calculate a two-sided probability it is recommended to double the one sided probability (Armitage and Berry 1994).
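As an illustration of the quantities named above, the following sketch computes the relative risk, the odds ratio with 95% CIs, the one-sided Fisher's Exact probability and the doubled two-sided probability for a single 2 x 2 table. It is a minimal example using only the Python standard library, not the SPSS procedure used in the study; the function name `analyse_2x2`, the cell labels and the counts are hypothetical, and the CIs use the usual large-sample log-scale standard errors, which is an assumption about method rather than something stated in the text.

```python
import math


def analyse_2x2(a, b, c, d, z=1.96):
    """Summarise a 2 x 2 table.

    a, b: group 1 with / without the outcome
    c, d: group 2 with / without the outcome
    All four cells must be non-zero for the ratios and their CIs.
    """
    rr = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)

    # Large-sample standard errors on the log scale.
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    rr_ci = (rr * math.exp(-z * se_log_rr), rr * math.exp(z * se_log_rr))
    or_ci = (odds_ratio * math.exp(-z * se_log_or),
             odds_ratio * math.exp(z * se_log_or))

    # Fisher's Exact test: hypergeometric probabilities with margins fixed.
    row1, row2, col1 = a + b, c + d, a + c
    total = math.comb(row1 + row2, col1)

    def prob(x):
        return math.comb(row1, x) * math.comb(row2, col1 - x) / total

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    upper_tail = sum(prob(x) for x in range(a, highest + 1))
    lower_tail = sum(prob(x) for x in range(lowest, a + 1))
    one_sided = min(upper_tail, lower_tail)

    return {
        "relative_risk": rr,
        "rr_ci": rr_ci,
        "odds_ratio": odds_ratio,
        "or_ci": or_ci,
        "fisher_one_sided": one_sided,
        # Doubling the one-sided probability, capped at 1.
        "fisher_doubled": min(1.0, 2 * one_sided),
    }


if __name__ == "__main__":
    # Hypothetical counts, not study data: 3 of 4 versus 1 of 4 with the outcome.
    for name, value in analyse_2x2(3, 1, 1, 3).items():
        print(name, value)
```

With these hypothetical counts the outcome is common in both groups, so the odds ratio (9) is much larger than the relative risk (3), which is the exaggeration noted above.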

Multiple linear regression analysis was used to examine the relationship between a single outcome variable and two or more explanatory variables. Each regression model was described and its validity was reported in terms of variance inflation factors (VIF), the average VIF and tolerances. Variance inflation factors close to 1, an average VIF not substantially greater than 1 and tolerances well above 0.2 would indicate that collinearity was not a problem and that there was no biasing of the regression model. These are reported in the results (Chapter 5). The R² represented the proportion (as a percentage) of the variability explained by the model (Field 2005).
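The collinearity diagnostics can be sketched as follows. For each explanatory variable, tolerance is 1 minus the R² obtained when that variable is regressed on the other explanatory variables, and the VIF is the reciprocal of the tolerance. The example below is a hedged illustration only: the function name `collinearity_diagnostics`, the variable names and the R² values are hypothetical, and the R² values are supplied as inputs rather than estimated from data.

```python
def collinearity_diagnostics(r_squared_by_predictor, tolerance_floor=0.2):
    """Compute tolerance, VIF and the average VIF.

    r_squared_by_predictor maps each explanatory variable to the R-squared
    from regressing it on all the other explanatory variables.
    """
    rows = {}
    for name, r_squared in r_squared_by_predictor.items():
        tolerance = 1.0 - r_squared
        rows[name] = {
            "tolerance": tolerance,
            "vif": 1.0 / tolerance,
            # Flag tolerances at or below the 0.2 threshold quoted in the text.
            "tolerance_ok": tolerance > tolerance_floor,
        }
    mean_vif = sum(row["vif"] for row in rows.values()) / len(rows)
    return rows, mean_vif


if __name__ == "__main__":
    # Hypothetical auxiliary R-squared values, not study results.
    rows, mean_vif = collinearity_diagnostics({"age": 0.2, "pain_score": 0.5})
    for name, row in rows.items():
        print(name, row)
    print("average VIF:", mean_vif)
```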

Logistic regression was used to examine the relationship between a binary outcome and a number of explanatory variables. An analysis of each model was described, in which the Hosmer and Lemeshow and Wald statistics were reported. The Hosmer and Lemeshow statistic has a chi-square distribution and indicates how well the model fits the observed data (a non-significant result indicates good prediction). The Wald statistic tests whether each independent variable contributes significantly to the regression model. Where the Wald statistic was found to be non-significant, the variable concerned would need to be dropped from the model. To limit type II errors (false negatives) in the model it has been suggested that the ratio of sample size to explanatory variables be set at a minimum of 10 to 1, with a sample size of about 100 (Tabachnick and Fidell 1996; 2001; Petrie and Sabin 2005; Field 2005).
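A minimal sketch of the Wald test and the sample size guideline is given below. It assumes the Wald statistic is reported as the squared ratio of a coefficient to its standard error and referred to a chi-square distribution with one degree of freedom, and it reads the guideline as requiring both at least 10 cases per explanatory variable and at least 100 cases overall; both readings are assumptions. The function names and input values are hypothetical.

```python
import math


def wald_test(coefficient, standard_error):
    """Return (Wald statistic, p value) for one logistic regression coefficient."""
    wald = (coefficient / standard_error) ** 2
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(wald / 2))
    return wald, p_value


def meets_sample_size_rule(n_cases, n_predictors, ratio=10, minimum_n=100):
    """Check the 10-to-1 ratio and the overall sample size of about 100."""
    return n_cases >= minimum_n and n_cases >= ratio * n_predictors


if __name__ == "__main__":
    # Hypothetical coefficient and standard error, not study results.
    wald, p_value = wald_test(coefficient=0.85, standard_error=0.40)
    print("Wald:", wald, "p:", p_value, "retain:", p_value < 0.05)
    print(meets_sample_size_rule(n_cases=120, n_predictors=8))
```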

3.3.5.2 Qualitative data reduction and display

Qualitative data from the questionnaire were loaded into Word documents and read through a number of times in order to familiarise the researcher with the data. It was anticipated that this familiarisation with the text would enable patterns to be recognised within the data. The qualitative data on engagement in physical activity were collected from both patients with JHS and healthy volunteers. They were analysed using content analysis, in which themes and patterns were identified. The themes and subthemes related to the type of physical activity, while the patterns incorporated the frequency and duration of physical activity (see appendix 20). This method has previously been described (Patton 2002 p 452).

The qualitative data relating to the question ‘Can you recall an event that triggered the onset of your aches and pains?’ were collected only from patients with JHS. The data were analysed after reading and re-reading the text and dividing the text into meaning units. These were categorised and coded into meanings relevant to the question and the views of patients with JHS. These then formed themes and subthemes. This method has previously been described (Patton 2002 p 454) using the terms ‘indigenous concepts and practices’. The use of meaning units was thought to best fit this analysis, as it was assumed that patients with JHS had an innate understanding of their aches and pains. The inductive coding, categories and themes can be viewed in appendix 17.

The qualitative data for the next part of the study came from the open ended question at the end of the questionnaire, which was: ‘Is there any other information you wish to add?’ The qualitative data analysis in this section was based on the codes and coding methods described by Miles and Huberman (1994 p 55). This method was chosen to enable coding of a breadth of information. The codes were grouped into broader categories that most accurately reflected the context. A thematic framework emerged in which there were themes and subthemes. Expert advice was sought from supervisors throughout the process. In addition, advice was sought from a field expert not involved in the study in order to validate the process and findings. Further details of the analysis may be viewed in appendix 18.

3.3.5.3 Data transformation

Transformation of the data may involve either quantitising or qualitising data. Quantitising is a process in which qualitative data are transformed into quantitative data (Sandelowski 2000). In this study the number of patients with JHS who reported functional difficulties both as an adult and as a child, and who reported on events that triggered their pain, was described and compared using descriptive statistics. Qualitising is a process by which quantitative data are transformed into qualitative data (Sandelowski 2000). This involves extracting information using another dimension and can also be used to confirm interpretations. This latter type of transformation was not used in this study.

3.3.5.4 Data correlation and consolidation

Where additional information was considered useful in explaining a phenomenon, qualitative data were quantitised and comparisons were made between the proportions. In particular this related to the themes generated by patients with JHS who reported events triggering the onset of their aches and pains. In this case the proportions of patients with JHS who reported functional difficulties both as a child and as an adult were compared with those who reported no functional difficulties. These data were described in relation to a theme and analysed using the Fisher’s Exact test.
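The quantitising step can be sketched as a simple tally. The example below assumes each participant has already been coded with two yes/no values (whether they reported functional difficulties both as a child and as an adult, and whether a given theme appeared in their response) and counts them into a 2 x 2 table whose proportions can then be compared. The function name `quantitise`, the record layout and the records themselves are hypothetical and are not study data.

```python
from collections import Counter


def quantitise(records):
    """Tally coded responses into a 2 x 2 table.

    records: iterable of (functional_difficulties, theme_reported) booleans.
    Rows: functional difficulties yes / no. Columns: theme reported yes / no.
    """
    counts = Counter(records)
    return [
        [counts[(True, True)], counts[(True, False)]],
        [counts[(False, True)], counts[(False, False)]],
    ]


def row_proportions(table):
    """Proportion of each row in which the theme was reported."""
    return [row[0] / sum(row) if sum(row) else None for row in table]


if __name__ == "__main__":
    # Hypothetical coded responses, not study data.
    coded = [
        (True, True), (True, True), (True, False),
        (False, True), (False, False), (False, False),
    ]
    table = quantitise(coded)
    print(table)
    print(row_proportions(table))
```

The resulting cell counts are the input a Fisher's Exact test would take.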

3.3.5.5 Comparison and integration

The researcher anticipated that the comparison and integration of some of the qualitative and quantitative data would increase understanding. In this instance qualitative and quantitative data were integrated. In particular this related to the themes generated by patients with JHS when they described aspects of their condition. Multiple regression analysis was carried out to examine the relationship between a single outcome variable generated from the quantitative data (the physical component summary score of the SF-12) and explanatory variables generated from qualitative data (see section 5.9.5). The rationale for this analysis and its discussion are reported in section 6.5.

In document Exploring the multi-factorial manifestations of joint hypermobility syndrome and the impact on quality of life. (Page 66-69)