TABLE 2.6 TYPES OF VALIDITY
2.11 DATA ANALYSIS
The purpose of data analysis is to make sense of the data obtained so that accurate conclusions and recommendations can be drawn; in other words, to change data into information. Data analysis is not an end in itself: it produces information that helps to address the problem at hand.
The selection of a data analysis strategy should be based on the earlier steps of the marketing research process, namely, the characteristics of the data and properties of statistical techniques (Malhotra, 2006: 440). Below are a number of methods of data analysis used in the current study.
2.11.1 Descriptive statistics
According to Aaker et al (2007: 438), descriptive statistics help to summarise the information presented in frequency tables. Descriptive statistics consist of measures of central tendency (mean, median and mode), measures of dispersion (range, standard deviation and frequency distribution), and measures of position (quartiles, deciles and percentiles). One of the uses of descriptive research is to generalise and relate the findings gathered from the sample to other situations. Descriptive statistics were used, for example, to report on the general characteristics of the respondent organisations and their most senior managers (refer to section 6.2).
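The measures listed above can be illustrated with a minimal sketch using Python's standard `statistics` module. The data values and the variable name `employees` are hypothetical and do not come from the study; they only show how each measure is obtained.

```python
import statistics

# Hypothetical data (not from the study): number of employees reported
# by ten respondent organisations.
employees = [12, 15, 15, 20, 22, 25, 30, 34, 40, 120]

# Measures of central tendency
mean = statistics.mean(employees)
median = statistics.median(employees)
mode = statistics.mode(employees)

# Measures of dispersion
value_range = max(employees) - min(employees)
std_dev = statistics.stdev(employees)  # sample standard deviation (n - 1)

# Measures of position: the three quartile cut points
q1, q2, q3 = statistics.quantiles(employees, n=4)

print(f"mean={mean:.1f} median={median} mode={mode}")
print(f"range={value_range} sd={std_dev:.1f}")
print(f"quartiles: Q1={q1} Q2={q2} Q3={q3}")
```

In this hypothetical sample the single large organisation pulls the mean well above the median, which is why both measures are usually reported together.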
According to Leedy and Ormrod (2013: 186), box-and-whisker plots are typically used in descriptive research to relate two quantitative variables. In the current study the researcher made use of box-and-whisker plots to allow a visual inspection of the relationship between two CRM variables (refer to section 6.8). Pallant (2013: 77) recommends the generation of a box-and-whisker plot before calculating correlations. The box-and-whisker plot will provide the researcher with an indication of whether the variables are related in a linear or curvilinear fashion. For correlation analyses, only linear relationships are suitable (Pallant, 2013: 77).
Box-and-whisker plots are also useful when the researcher wishes to compare the distribution of scores on variables. The researcher uses these plots to explore the distribution of one continuous variable or alternatively, the researcher can ask for scores to be broken down for different groups (Pallant, 2013: 81). For example, in the current study, box-and-whisker plots were used to show the relationship between the gender composition of managers and perceived business performance.
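A box-and-whisker plot is drawn from a five-number summary (minimum, first quartile, median, third quartile and maximum). The sketch below computes that summary separately for two groups, which is the kind of group comparison described above. The group labels and scores are hypothetical, and the "inclusive" quartile method is an assumption, as statistical packages differ in how they interpolate quartiles.

```python
import statistics

def five_number_summary(scores):
    """Return the five values from which a box-and-whisker plot is drawn."""
    ordered = sorted(scores)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
    }

# Hypothetical perceived-performance scores (not from the study),
# broken down by the gender composition of management.
groups = {
    "mostly male": [3, 4, 4, 5, 5, 6, 7],
    "mostly female": [2, 4, 5, 5, 6, 6, 7],
}

summaries = {name: five_number_summary(s) for name, s in groups.items()}
for name, summary in summaries.items():
    print(name, summary)
```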
2.11.2 Inferential statistics
Inferential statistics are techniques that use descriptive statistics from a sample to make inferences about population parameters. Inferential statistics include techniques that help to explore relationships between variables and probe complex questions (Somekh & Lewin, 2011: 231). Brotherton (2010: 195) contends that inferential statistics provide the researcher with an objective basis for making claims, as opposed to making mere speculations.
Somekh and Lewin (2011: 232) report that a statistical procedure entails a three-stage process. First, a range of descriptive techniques precedes the inferential statistics; these are used to ensure that the data are reliable, valid and meet the criteria required for statistical analysis. Second, the appropriate statistical tests are computed. Third, the appropriate test of significance is conducted to examine the probability of achieving the test result. It is the test of statistical significance that enables the researcher to express confidence in the result achieved.
Inferential statistical analysis used in the current study included factor analysis and Chi-square tests.
2.11.3 Factor analysis
Factor analysis is a multivariate statistical procedure with many uses. Williams, Onsman and Brown (2010: 2) note three uses of factor analysis, namely to:
reduce a large number of variables into a smaller number of subsets or factors;
establish underlying dimensions between measured variables and latent constructs, allowing the formation and refinement of the theory; and
provide construct validity evidence for self-reporting scales.
Williams et al (2010: 2) distinguish between two major classes of factor analysis: Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA). In EFA, the researcher has no expectations of the number of inherent factors, as the analysis is exploratory in nature. It allows the researcher to explore the main dimensions to generate a theory from a large set of items. On the other hand, the researcher uses CFA to test a proposed theory or model. Thus, CFA may be seen as a form of structural equation modelling.
EFA was performed in the current study to reduce the data into a smaller number of factors, specifically to identify CRM dimensions. Principal components analysis was used as the extraction method and the oblique technique as the rotation strategy. The factorability of the dataset of interest was first assessed. Since the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy (Kaiser, 1970: 401-415) was 0.932 (substantially greater than the 0.70 threshold value) and Bartlett’s test of sphericity (Bartlett, 1954: 296-298) was significant (p < 0.001), the dataset was judged suitable for factor analysis. The results of the factor analysis are presented in Chapter 6.
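For reference, the two factorability checks are commonly written as follows. The notation is an assumption introduced here and is not taken from the study: $r_{ij}$ is the correlation between variables $i$ and $j$, $a_{ij}$ is their partial correlation after controlling for all other variables, $\mathbf{R}$ is the correlation matrix, $n$ is the sample size and $k$ is the number of variables.

```latex
% Kaiser-Meyer-Olkin measure of sampling adequacy
\mathrm{KMO} = \frac{\sum_{i \neq j} r_{ij}^{2}}
                    {\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} a_{ij}^{2}}

% Bartlett's test of sphericity (approximately chi-square distributed)
\chi^{2} = -\left( n - 1 - \frac{2k + 5}{6} \right) \ln \lvert \mathbf{R} \rvert ,
\qquad df = \frac{k(k - 1)}{2}
```

KMO approaches 1 when the partial correlations are small relative to the simple correlations, which indicates that the variables share enough common variance to be factored. A significant Bartlett statistic indicates that the correlation matrix differs from an identity matrix.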
Cattell’s scree test (Cattell, 1966: 245-276) was used to determine the number of factors. According to Malhotra (2006: 617), “(a) scree plot is a plot of eigen values against the number of factors”. The shape of the plot is used to determine the number of factors. The scree plot which was generated for this study can be found in Annexure 4.
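The quantities that a scree plot displays can be tabulated as in the sketch below. The eigenvalues are hypothetical and are not the study's results, and the text bars are only a crude stand-in for the plot itself. Judging where the curve levels off remains a visual decision.

```python
# Hypothetical eigenvalues for six standardised variables (illustrative only).
# For a correlation matrix, the eigenvalues sum to the number of variables.
eigenvalues = [2.9, 1.4, 0.7, 0.5, 0.3, 0.2]

def scree_table(values):
    """Return (factor number, eigenvalue, share of variance, cumulative share)."""
    ordered = sorted(values, reverse=True)
    total = sum(ordered)
    rows, cumulative = [], 0.0
    for number, value in enumerate(ordered, start=1):
        share = value / total
        cumulative += share
        rows.append((number, value, share, cumulative))
    return rows

rows = scree_table(eigenvalues)
for number, value, share, cumulative in rows:
    bar = "#" * round(value * 10)  # crude text rendering of the scree plot
    print(f"{number:>2} {value:5.2f} {share:6.1%} {cumulative:6.1%} {bar}")
```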
One of the post-hoc tests, namely Cramer’s V, was used to explore the strength of relationships based on frequency distributions. Similar to correlation coefficients, Cramer’s V produces a measure of strength between 0.0 and 1.0. The closer to 1.0, the stronger the association (Somekh & Lewin, 2011: 238).
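Cramer’s V is usually computed from the Chi-square statistic of a cross-tabulation. In the notation assumed here (not taken from the study), $n$ is the total number of observations, $r$ is the number of rows and $c$ is the number of columns of the table:

```latex
V = \sqrt{\frac{\chi^{2}}{n \cdot \min(r - 1,\; c - 1)}}
```

As a worked example with hypothetical figures, a 2 x 2 table with $n = 100$ and $\chi^{2} = 4.0$ gives $V = \sqrt{4.0 / (100 \cdot 1)} = 0.20$, which would indicate a fairly weak association.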
2.11.4 Chi-square test
The Chi-square test is extensively used as a technique for determining whether a statistically significant relationship exists between two categorical (nominal or ordinal) variables (Parasuraman et al, 2007: 412). It can also be used to analyse associations between variables (e.g. a cross-tabulation of management system and business type). The Chi-square test serves as a means of formally checking the relationship between such variables. It is important to note that when using the Chi-square (χ²) statistic in studies where the sample size (n) exceeds 250 and more than 30 variables are observed, significant p-values can be expected (Hair et al, 2000: 253). A p-value is regarded as significant when it is smaller than or equal to 0.05.
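The mechanics of the test can be sketched as follows. The cross-tabulation is hypothetical and is not taken from the study, no continuity correction is applied, and the closed-form p-value used here holds only for one degree of freedom (a 2 x 2 table). For larger tables a statistical package would normally be used, for example `scipy.stats.chi2_contingency`.

```python
import math

def chi_square_independence(observed):
    """Return the Pearson Chi-square statistic and degrees of freedom."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if the two variables were independent
            expected = row_totals[i] * col_totals[j] / n
            statistic += (count - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df

# Hypothetical 2 x 2 cross-tabulation (not from the study):
# rows = management system, columns = business type.
observed = [[30, 20],
            [20, 30]]

statistic, df = chi_square_independence(observed)

# Valid for df == 1 only: p = erfc(sqrt(chi2 / 2))
p_value = math.erfc(math.sqrt(statistic / 2))
significant = p_value <= 0.05

print(f"chi2={statistic:.2f} df={df} p={p_value:.4f} significant={significant}")
```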
The results of the statistical tests typically included the statistical test used, the actual result, the degrees of freedom and the probability of achieving the result assuming the null hypothesis was true. When evaluating the significance of results, the researcher presents the findings in terms of significance levels. The significance levels that are generally used in the social sciences are α=.05, α=.01 and α=.001 and are based on the normal distribution curve (Somekh & Lewin, 2011: 232). In this study the α=.05 level is used for significance testing.
According to Somekh and Lewin (2011: 232), degrees of freedom (df) refer to the number of items in a set which can vary; the calculation differs according to the respective statistical technique.