4.5 Data analysis techniques
4.5.1 Quantitative data
The quantitative data were analysed using the Statistical Package for the Social Sciences (SPSS) version 14. A range of statistical procedures was adopted to explore the research questions posed and to test the hypotheses. Descriptive analysis was undertaken first to explore the results, followed by in-depth analysis to test the hypotheses. Statistical textbooks were consulted to identify suitable analytical techniques. These texts identify appropriate statistics for different types of research questions and research hypotheses (see Leech et al., 2005; Nardi, 2006; Carver & Nash, 2005; Colman et al., 2006). Nardi (2006) in particular suggests a statistical decision tree¹⁵, which was very helpful in deciding which statistical methods would be most appropriate for this study. Most of the data collected used a 5-point Likert scale. Where responses to several Likert items are summed and averaged, they are treated as interval data measuring a latent variable¹⁶.
¹⁴ See Appendix 1.1, 2.1.
¹⁵
a) One Sample T-Test
This t-test is adopted to determine the significance of the difference between the mean of a sample of scores and a specified value. In this study, 3 (the midpoint of the Likert scale) was used as the test value. Three represents a neutral point, for example between agree and disagree; therefore, if the mean value falls below the test value, this suggests that respondents did not agree with that particular item or question. This test is adopted in investigating research questions one, two and three, and in testing hypotheses one and two. The t-test analysis offers insights into five questions:
1) Do respondents agree with the statements representing optimism, innovativeness, ease of use and usefulness in their readiness to accept data mining?
2) Do adopters agree that technological, organisational, human resource, and external issues are important to the decision to employ data mining?
3) Do non-adopters agree that technological, organisational and human resources issues offered reasons for not utilising data mining?
4) Do respondents agree with the statements reflecting the impact data mining could have on the accounting information system and the decision making process?
5) Do respondents agree that the ability to utilise data mining is important in the process of assessing the performance of AIS?
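The calculation behind this test can be sketched in a few lines. The study itself ran the test in SPSS; the Python below is only an illustration, and the function name `one_sample_t` and the sample scores are hypothetical. It computes the t statistic and degrees of freedom for a set of scores against the test value of 3, and leaves the significance lookup to a statistics package.

```python
import math
import statistics


def one_sample_t(scores, test_value=3.0):
    """Return (t, degrees of freedom) for H0: population mean == test_value."""
    n = len(scores)
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)           # sample standard deviation (n - 1)
    standard_error = sd / math.sqrt(n)
    t = (mean - test_value) / standard_error
    return t, n - 1


# Hypothetical averaged Likert scores for one construct (not the study's data).
scores = [3.5, 4.0, 3.0, 4.5, 3.5, 4.0, 2.5, 4.0]
t, df = one_sample_t(scores)
print(f"t = {t:.3f}, df = {df}")
```

A positive t indicates a sample mean above the neutral point of 3, and a negative t a mean below it; whether the difference is significant still depends on the p-value for the given degrees of freedom.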
¹⁶ Although such treatment can be seen as controversial (see Jamieson, 2004), it has become common practice to assume that Likert-scale categories constitute interval-level measurement.
b) Independent Samples T-Test
This type of t-test¹⁷ is used to test whether there is a significant difference between two groups of respondents by comparing the means of the two groups. For example, was there a difference between respondents who were mailed questionnaires and those who received a hand-delivered questionnaire? Are there gender differences in readiness toward data mining? In this study these tests are used to test the validity of the survey instrument and to test hypothesis four.
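As a rough illustration of the comparison, the sketch below computes the pooled-variance (equal variances assumed) form of the independent samples t statistic. This is an assumption made for the example: the text does not say which variant was read from the SPSS output, and the function name `independent_t` and both groups of scores are hypothetical.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Return (t, degrees of freedom) using the pooled-variance formula."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.fmean(group_a), statistics.fmean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Hypothetical readiness scores by delivery method (groups may differ in size).
mailed = [3.0, 3.5, 4.0, 3.5, 4.0]
hand_delivered = [3.5, 4.0, 4.5, 4.0]
t, df = independent_t(mailed, hand_delivered)
print(f"t = {t:.3f}, df = {df}")
```

The sign of t only reflects the order in which the groups are passed; swapping them flips the sign without changing the magnitude.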
c) One-way Analysis of Variance (ANOVA)
ANOVA¹⁸ is undertaken to assess whether there is a significant difference among several independent group means. In general, the test is similar to the t-test but is designed to determine the significance of the differences among three or more (rather than only two) group means. In this study, ANOVA is mainly intended to answer the following questions:
1) Is the readiness toward data mining significantly different among different levels of education, different groups of adopters, different job functions, different levels of knowledge about data mining, and working experience in AIS¹⁹?
2) Is there any difference between respondents who have more knowledge about data mining and those who have limited knowledge of it in their perception of the impact of data mining on AIS and decision making?
¹⁷ The test is used to test for a significant difference between the means of two independent or unrelated samples of scores. It can be used with groups of unequal size (Colman et al., 2006).
¹⁸ It is done by partitioning the total variance in the dependent variable into effects due to different levels of the independent variable (Colman et al., 2006).
¹⁹ As also suggested by Francis (2004) and Nardi (2006), when there are more than two categories, as with the level of education in this study, analysis of variance should be used. The technique asks whether the differences within a category are larger or smaller than those between the categories, here the four levels of education.
ANOVA will be adopted to test hypotheses five, six, seven, eight and nine, examining the differences in mean readiness among those categories of independent groups. The test will assess whether readiness toward data mining technology differs between the levels of each independent variable.
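The partitioning of variance that ANOVA relies on can be sketched as follows. This is an illustrative Python version, not the SPSS procedure used in the study; the function name `one_way_anova`, the group labels and the readiness scores are all hypothetical. It returns the F ratio together with the between-groups and within-groups degrees of freedom.

```python
import statistics


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of independent groups."""
    all_scores = [score for group in groups for score in group]
    grand_mean = statistics.fmean(all_scores)
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (statistics.fmean(group) - grand_mean) ** 2 for group in groups
    )
    # Variation of the individual scores around their own group mean.
    ss_within = sum(
        (score - statistics.fmean(group)) ** 2 for group in groups for score in group
    )
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical readiness scores for four levels of education.
readiness_by_education = [
    [3.0, 3.5, 3.0, 4.0],   # level 1
    [3.5, 4.0, 3.5],        # level 2
    [4.0, 4.5, 4.0, 3.5],   # level 3
    [4.5, 4.0, 5.0],        # level 4
]
f_ratio, df_b, df_w = one_way_anova(readiness_by_education)
print(f"F({df_b}, {df_w}) = {f_ratio:.3f}")
```

A large F ratio means the differences between group means are large relative to the spread within groups; the significance of a given F still has to be read from the F distribution for the two degrees of freedom.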
d) Association Analysis (Correlation and Cross Tabulation)
Measures of association via correlation indicate the strength and the direction of the relationship between a pair of variables. There are two types of measure: measures of linear correlation using interval variables and measures of rank correlation using ordinal variables (Bryman & Cramer, 1994).
A linear correlation analysis was adopted to explore the relationships between statements representing the ability to utilise data mining in the performance of Accounting Information Systems. As the variables are interval, the Pearson product moment correlation is adopted. This is the most well-known approach to expressing effect sizes in terms of strength of association (Leech et al., 2005). ‘Using Pearson r, effect size are always less than 1.0, varying between -1.0 and +1.0 with 0 representing no effect and +1 or -1 the maximum effect’ (Leech et al., 2005, p.55). Pearson r is normally used for associational questions or hypotheses in which both variables under study are normal/scale in measurement. For the purpose of this study, the strength of a relationship (effect size) is interpreted as follows: .10 to .30 small, .30 to .50 medium, .50 to .70 large and > .70 very large²⁰.
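A minimal sketch of the coefficient and of the interpretation bands listed above is given here in Python, for illustration only; the study's own figures came from SPSS. The function names and the data are hypothetical. Two choices in the code are assumptions: a value that falls exactly on a shared boundary (for example .30) is assigned to the higher band, and values below .10, which the bands do not cover, are reported as such.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def strength_label(r):
    """Map |r| onto the bands used in this study (boundary handling assumed)."""
    size = abs(r)
    if size > 0.70:
        return "very large"
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "below the smallest band"


# Hypothetical averaged scores for two statements (not the study's data).
ability = [3.0, 3.5, 4.0, 4.5, 2.5, 4.0]
performance = [3.5, 3.0, 4.5, 4.0, 3.0, 4.0]
r = pearson_r(ability, performance)
print(f"r = {r:.3f} ({strength_label(r)})")
```

The label depends only on the magnitude of r; the sign separately gives the direction of the relationship.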
For ordinal variables, a second type of correlation analysis is appropriate. Spearman’s rank order correlation coefficient, or rho, was adopted for investigating the correlation between ordinal variables (Colman et al., 2006). It will be adopted for measuring the correlation of data mining knowledge with the intention to adopt data mining and with the data mining terminology used.
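Spearman’s rho can be illustrated as the Pearson correlation applied to ranks rather than to raw scores. The Python sketch below is illustrative only (the study used SPSS); the function names and the ordinal codes are hypothetical, and tied values are given the mean of the ranks they occupy, which is the usual convention.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1   # mean of the 1-based positions
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two sets of ranks."""
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    mean_x = sum(rank_x) / len(rank_x)
    mean_y = sum(rank_y) / len(rank_y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    sxx = sum((a - mean_x) ** 2 for a in rank_x)
    syy = sum((b - mean_y) ** 2 for b in rank_y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical ordinal codes: knowledge level and intention to adopt.
knowledge = [1, 2, 2, 3, 4, 5, 5]
intention = [2, 1, 3, 3, 4, 4, 5]
print(f"rho = {spearman_rho(knowledge, intention):.3f}")
```

Because only the ordering of the values enters the calculation, any monotonic recoding of the categories leaves rho unchanged.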
²⁰ Leech et al. (2005) offer a discussion on interpreting effect sizes, which mostly refers to Cohen’s (1988) work.
Unlike the Pearson r (a parametric test), Spearman’s rank order correlation is a non-parametric method. It is used because normality cannot always be assumed in the data and because some data do not lend themselves to computing a mean (in this case the variables are ordinal). Non-parametric methods are advised in this situation (Carver & Nash, 2005). In the cross tabulation procedure, Gamma was also used to measure the strength of association; it also indicates the direction of association between two ordinal variables (Babbie et al., 2003).
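For the Gamma statistic, a minimal sketch is given below, assuming the Goodman and Kruskal definition based on concordant and discordant pairs, with tied pairs ignored. It is written in Python for illustration and is not the SPSS cross tabulation output; the function name and the example codes are hypothetical.

```python
from itertools import combinations


def goodman_kruskal_gamma(x, y):
    """Gamma = (concordant - discordant) / (concordant + discordant)."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:        # both variables order the pair the same way
            concordant += 1
        elif product < 0:      # the variables order the pair in opposite ways
            discordant += 1
        # product == 0 means a tie on at least one variable; the pair is skipped
    if concordant + discordant == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical ordinal codes for two cross-tabulated variables.
knowledge = [1, 1, 2, 2, 3, 3, 3]
intention = [1, 2, 2, 3, 2, 3, 3]
print(f"gamma = {goodman_kruskal_gamma(knowledge, intention):.3f}")
```

Gamma runs from -1 to +1, so it conveys both strength and direction; because ties are dropped, it tends to be larger in magnitude than rank correlations that count them.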
For this study the correlation analysis is intended to answer the following questions, which relate to hypotheses three and ten respectively:
1) Is knowledge about data mining possessed by respondents correlated with the intention to utilise that technology?
2) Is there a correlation between the ability to utilise data mining and the performance of AIS?