4.5.7 Data Analysis
The responses to the questionnaire-based survey were captured in a Microsoft Excel® spreadsheet by the researcher and coded for analysis using statistical software (SPSS and STATA10).
Initial data analysis was performed using descriptive statistics, with the aid of graphical representation where relevant. The main objective of descriptive statistics is to summarize and represent the observations obtained for the variables (Terre Blanche & Durrheim, 1999). The study demographics were tallied and compared with current South African demographics to determine generalizability. Most of the data were expressed numerically as a percentage of the total population involved.
The data obtained from demographic observations were depicted using frequency distributions, for example bar charts, line curves and pie charts, where necessary. A frequency distribution is a graphical representation of the pattern of values obtained in a particular observation (Antonius, 2003). Once the observations had been converted into diagrams, the measures of central tendency (mode, median and mean) were readily determined and reported by the researcher.
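The descriptive step described above can be illustrated with a minimal Python sketch. The coded responses below are hypothetical and are not data from the study, and the function name `frequency_distribution` is an assumption made for illustration; the study itself used Excel, SPSS and STATA.

```python
from collections import Counter
from statistics import mean, median, multimode

# Hypothetical coded responses (for example, age-group codes 1 to 4).
age_group_codes = [1, 2, 2, 3, 3, 3, 4, 2, 3, 1]


def frequency_distribution(values):
    """Return {value: (count, percentage of all observations)}."""
    counts = Counter(values)
    total = len(values)
    return {v: (n, 100.0 * n / total) for v, n in sorted(counts.items())}


freq = frequency_distribution(age_group_codes)
for value, (count, pct) in freq.items():
    print(f"group {value}: n={count} ({pct:.1f}%)")

# Measures of central tendency for the same observations.
modes = multimode(age_group_codes)   # list, because a sample can have several modes
middle = median(age_group_codes)
average = mean(age_group_codes)
print("mode:", modes, "median:", middle, "mean:", average)
```

The counts and percentages are the values that a bar chart or pie chart of the same variable would display.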
To measure bivariate or multivariate relationships between the independent variables (age, gender, and so forth) and the dependent variable, willingness to pay (WTP), and to determine the significance of these relationships, additional statistical techniques such as box plots, cross-tabulation, chi-square tests, analysis of variance (ANOVA) and linear regression were employed under the guidance of a statistician (please refer to the acknowledgments).
4.5.7.1. Box plots or Whisker diagrams
Box plots or whisker diagrams were used to further describe the data. A box plot is a standardized visual description of the most important aspects of an observed variable, based on the five-number summary: minimum (lower extreme), first quartile, median, third quartile and maximum (upper extreme), as shown in Figure 4.1 (Keller & Warrack, 2003).
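As an illustration of the five-number summary that a box plot displays, the following Python sketch computes it for a small hypothetical sample. The data and the function name `five_number_summary` are assumptions. Quartile conventions differ between statistical packages, so the "inclusive" method used here will not necessarily reproduce the values reported by SPSS or STATA.

```python
from statistics import quantiles

# Hypothetical WTP amounts; not data from the study.
wtp_amounts = [10, 20, 30, 40, 50, 60, 70, 80, 90]


def five_number_summary(values):
    """Return the five-number summary plus the inter-quartile range."""
    ordered = sorted(values)
    # n=4 yields the three quartile cut points (Q1, median, Q3).
    q1, med, q3 = quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "q1": q1,
        "median": med,
        "q3": q3,
        "max": ordered[-1],
        "iqr": q3 - q1,
    }


summary = five_number_summary(wtp_amounts)
print(summary)
```

The box spans the first to the third quartile (the inter-quartile range), and the whiskers extend toward the two extremes.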
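Figure 4.1: Illustrating a box plot or a whisker diagram (labelled, from top to bottom: upper extreme, upper quartile, median, lower quartile and lower extreme, with the inter-quartile range (IQR) between the quartiles)
4.5.7.2. Cross-tabulation and chi-square test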
Cross-tabulation and chi-square tests were used to measure the existence or absence of relationships between variables and the significance thereof. A cross-tabulation is a joint frequency distribution based on two or more categorical variables. It is usually presented in a matrix, called a contingency table. The joint frequency distribution can be analyzed with the chi-square statistic to determine the significance of the relationship between the variables (Keller & Warrack, 2003; Gujarati, 2003).
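The chi-square calculation on a contingency table can be sketched as follows. The 2 x 2 table is hypothetical (for example, gender by willing or unwilling to pay) and the function name `chi_square` is an assumption. The sketch computes the uncorrected Pearson statistic only; it does not apply a continuity correction and does not compute a p-value.

```python
# Hypothetical contingency table: rows = gender, columns = willing / not willing.
observed = [
    [30, 20],
    [20, 30],
]


def chi_square(table):
    """Return (Pearson chi-square statistic, degrees of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


stat, dof = chi_square(observed)
print(f"chi-square = {stat:.3f}, df = {dof}")
```

For one degree of freedom the 5% critical value is 3.841, so a statistic above that value would indicate a significant association at that level.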
4.5.7.3. Analysis of Variance (ANOVA)
Analysis of Variance (ANOVA) is a collection of statistical models and their associated procedures used to compare the means of more than two groups of observations; it is sometimes referred to as the test of significant difference between means (Terre Blanche & Durrheim, 1999). It can be used when there are several independent variables and can help identify the way in which these independent variables interact, as well as the effect these interactions have on the dependent variable in a regression analysis (Terre Blanche & Durrheim, 1999).
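The comparison of group means that underlies ANOVA can be illustrated with a minimal one-way sketch. The three groups below are hypothetical WTP amounts for three age categories, and the function name `one_way_anova` is an assumption. The sketch covers a single factor only and does not model interactions between several independent variables.

```python
# Hypothetical WTP amounts for three age groups; not data from the study.
groups = [
    [10, 12, 14],
    [20, 22, 24],
    [30, 32, 34],
]


def one_way_anova(samples):
    """Return (F statistic, df between groups, df within groups)."""
    all_values = [x for g in samples for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in samples]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(samples, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(samples, group_means) for x in g
    )
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_between, df_within = one_way_anova(groups)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F statistic means that the variation between group means is large relative to the variation within groups.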
4.5.7.4. Correlations
Correlation techniques indicate the statistical relationship between two or more random variables or observed data values (Gujarati, 2003). These techniques are used to measure the strength of the relationship between two variables, which can be strong or weak and positive or negative (Terre Blanche & Durrheim, 1999). Mathematically, the correlation is expressed as a correlation coefficient that ranges from -1 (perfect negative linear association), through 0 (no linear association), to +1 (perfect positive linear association). A simple correlation coefficient is a statistical measure of the degree of linear association between two variables (Gujarati, 2003). Correlation and regression are similar in that both assess relationships between variables. In this study the relationship between variables was assessed using the Pearson correlation coefficient, and the significance of the relationships was also determined.
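A minimal sketch of the Pearson correlation coefficient and a common significance test for it is given below. The paired values are hypothetical, the function names are assumptions, and the t-statistic shown is one standard way of testing whether a correlation differs from zero; the source does not state which test the statistical software applied.

```python
from math import sqrt

# Hypothetical paired observations; not data from the study.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic for H0: no linear association, with n - 2 degrees of freedom.

    Undefined when r is exactly +1 or -1.
    """
    return r * sqrt((n - 2) / (1 - r ** 2))


r = pearson_r(x, y)
t = correlation_t_statistic(r, len(x))
print(f"r = {r:.3f}, t = {t:.3f}, df = {len(x) - 2}")
```

The t statistic is compared with the critical value of the t distribution for n - 2 degrees of freedom to judge significance.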
4.5.7.5. Regression Analysis
Regression analysis is a statistical tool for investigating and quantifying the relationship between variables, that is, it expresses the dependent variable as a function of the independent or explanatory variables by means of a mathematical equation (Studenmund, 2000).
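The idea of expressing a dependent variable as a function of an explanatory variable can be sketched with ordinary least squares for a single regressor. The data and the function name `simple_ols` are hypothetical, and the sketch assumes a straight-line relationship; it is an illustration rather than the estimation procedure of the statistical software used in the study.

```python
# Hypothetical data: household income (x) and stated WTP (y); not study data.
income = [1, 2, 3, 4, 5]
wtp = [3, 5, 7, 9, 11]


def simple_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = simple_ols(income, wtp)
print(f"WTP = {intercept:.2f} + {slope:.2f} * income")
```

The sign of the slope corresponds to the expected sign attached to an explanatory variable: a positive slope means WTP rises as the variable increases.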
In this study the dependent variable, the respondents’ willingness to pay (WTP) for pharmacist-provided services, was expressed as a function of multivariate determinants such as age, gender and income, as explained in the following section. The explanatory variables were carefully chosen with reference to the literature (Donald, 2000; Shu Chuen Li, 2003), and the individual relationship of each with WTP was simultaneously investigated and defined on the assumption that other influential factors and independent variables were held constant. Table 4.5 below indicates the independent variables that were regressed against the dependent variable WTP.
Table 4.5: Explaining and defining the study’s independent variables

| Independent variable | Symbol and expected sign | Explanation |
|---|---|---|
| Geographical location | Geo-Loc (+) | The participants’ provincial location in South Africa, which included: Gauteng, North West, Limpopo, Mpumalanga, Western Cape, Eastern Cape, Northern Cape, Free State and KwaZulu-Natal. The objective was to ascertain whether a participant’s location had a particular effect on WTP. Some provinces make a greater contribution to the country’s GDP and hence have more economic muscle, possibly influencing participants’ WTP. |
| Gender | Gender (+) | The objective was to ascertain whether the gender of participants had an effect on WTP. |
| Age | Age (+) | The participants were categorized into age groups. It is expected that different age groups have a different appreciation for pharmaceutical care; hence a possible disparity in WTP across age categories. |
| Race | Race (+) | Consisted of the racial groups representing the South African nation, that is, Blacks, Whites, Asians and Coloureds. Owing to the skewed distribution of economic ability amongst racial groups in SA, race was expected to have an effect on WTP. |
| Level of education | Education (+) | It was expected that the greater the education level of a participant, the greater the appreciation of healthcare, the better the economic power and possibly the greater the willingness to pay. |
| Total household income | Income (+) | High income earners were expected to be willing to pay more, possibly owing to their ability to pay. |
| Employment status | Employment (+) | An employed participant should be economically able to pay more than an unemployed participant. |
| Chronic diseases | Cronic (+) | Participants with a chronic disease were expected to be willing to pay more. |
| Co-morbidities | Co-morbid (+) | The more chronic diseases a participant has, that is, the greater the number of co-morbidities, possibly the greater the expected willingness to pay. |
| Medical aid status | Medical aid (-) | Participants with medical aid are expected to be reluctant to pay, as they expect their medical aid to cover the proposed service on their behalf. |
| Level of satisfaction | Satisfaction (+) | The greater the level of satisfaction, the greater the likelihood of willingness to pay. |
As discussed above, to determine the effect of each individual variable on the dependent variable, the other variables were held constant. In reality, however, these variables may occur together and hence interfere with one another, producing a combined effect. For purposes of inference, a second regression including all the independent variables was therefore performed.
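One possible way of writing the two regression stages is sketched below. The notation is an assumption and does not appear in the source: X_{ki} denotes the value of the k-th independent variable of Table 4.5 for respondent i, and u and \varepsilon are error terms. The first line reads the individual analyses as one regression per variable, which is only one interpretation of the description above; the second line is the regression containing all eleven variables together.

```latex
% Assumed notation; symbols are illustrative and not taken from the source.
% Individual regressions, one for each variable k = 1, ..., 11:
\mathrm{WTP}_i = \alpha_k + \beta_k X_{ki} + u_{ki}

% Second regression, all eleven independent variables together:
\mathrm{WTP}_i = \beta_0 + \sum_{k=1}^{11} \beta_k X_{ki} + \varepsilon_i
```

Under this reading, the expected sign listed for each variable in Table 4.5 is the expected sign of the corresponding coefficient \beta_k.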