
4.13 Data analysis

4.13.1 Univariate statistics

Univariate analysis means that the researcher statistically analyses only one variable at a time. In this study, univariate statistics were used to measure the distribution, central tendency and dispersion of the data included in the analysis.

       

4. Research Methodology 135

4.13.1.1 Measures of central tendency

Frequency distributions are useful for examining the different values of a variable. Frequency distribution tables are easy to read and provide a great deal of basic information (Hair et al. 2010). The mean, median and mode are measures of central tendency: they locate the centre of the distribution. The mean is the average value across the observations in the distribution and is the most commonly used measure of central tendency. The median is the middle value of the distribution when the values are ordered in either ascending or descending sequence. The mode is the value that appears in the distribution most often.
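As an illustration only, the three measures can be computed with Python's standard `statistics` module. This is a minimal sketch: the `responses` list is hypothetical data on a five-point scale, not data from this study, and it does not represent the software used in the analysis.

```python
import statistics

# Hypothetical responses on a 5-point scale (illustrative data only).
responses = [3, 4, 4, 5, 2, 4, 3, 5, 4, 1]

mean = statistics.mean(responses)      # average across all observations
median = statistics.median(responses)  # middle value of the ordered distribution
mode = statistics.mode(responses)      # value that appears most often

print(f"mean={mean}, median={median}, mode={mode}")
```

With an even number of observations, as here, the median is the average of the two middle values of the ordered list.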

4.13.1.2 Measures of dispersion

Measures of central tendency often do not tell the whole story about a distribution of responses (Hair et al. 2010). Measures of dispersion describe how close the rest of the values in the distribution fall to the mean or to another measure of central tendency. Two measures of dispersion that describe the variability in a distribution of numbers are the range and the standard deviation (Hair et al. 2010). The range defines the spread of the data: it is the distance between the smallest and largest values of the variable. The difference between a particular response and the distribution mean is called a deviation, and the standard deviation describes the average distance of the distribution values from the mean.
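The range, the deviations and the standard deviation described above can be sketched as follows. The `values` list is hypothetical and chosen only so that the arithmetic is easy to follow. Both the sample and the population standard deviation are shown, because the text does not state which form was used.

```python
import statistics

# Hypothetical observations (illustrative data only).
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: distance between the smallest and largest values.
value_range = max(values) - min(values)

# Deviation: difference between each response and the distribution mean.
mean = statistics.mean(values)
deviations = [v - mean for v in values]

# Standard deviation: square root of the averaged squared deviations.
sample_sd = statistics.stdev(values)       # divides by n - 1
population_sd = statistics.pstdev(values)  # divides by n

print(f"range={value_range}, mean={mean}")
print(f"sample SD={sample_sd:.3f}, population SD={population_sd:.3f}")
```

The deviations always sum to zero, which is why they are squared before averaging.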

4.13.1.3 Bivariate statistical tests

In many instances researchers test hypotheses that compare the characteristics of two groups or two variables by examining the relationships between them. A relationship between variables means that variation in one variable coincides with variation in another variable (Greener 2008).

Bivariate data analysis is analysis and hypothesis testing in which two variables are investigated simultaneously. This may be done using tests of differences or measures of association between two variables at a time (Cooper and Schindler 2010). According to Hair et al. (2010), there are three types of bivariate hypothesis tests: Chi-square, the t-test and analysis of variance. In this study, we used cross-tabulation and the Chi-square test to examine the effect of each independent variable on SMME growth.

4.13.1.4 Cross-tabulation

A cross-tabulation is set up as a frequency table that includes column percentages and shows both variables against the chosen categories. If one variable is suspected of being the independent variable, it is shown as the column variable, not the row variable. Such tables are used to look for patterns of association in the data (Greener 2008). A cross-tabulation is simply a more advanced method of presenting frequency data: it presents the frequencies in a matrix (Mutezo 2005).

Cross-tabulation is useful for examining relationships and reporting the findings for two variables. The purpose of cross-tabulation is to determine if differences exist between subgroups of the total sample (Hair et al. 2010).

Cross-tabulation is one of the simplest methods for describing sets of relationships. A cross-tabulation is a frequency distribution of responses on two or more sets of variables. One purpose of cross-tabulations is to study relationships among variables. Researchers can then use the Chi-square test to determine whether the responses observed in a survey follow the expected pattern.
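A cross-tabulation with column percentages can be sketched in a few lines of Python. The records, the variable names ("trained", "untrained") and the outcome labels are hypothetical and serve only to show the layout, with the suspected independent variable placed in the columns.

```python
from collections import Counter

# Hypothetical survey records: (independent variable, outcome).
# Illustrative data only, not taken from the study.
records = [
    ("trained", "growth"), ("trained", "growth"), ("trained", "no growth"),
    ("untrained", "growth"), ("untrained", "no growth"), ("untrained", "no growth"),
    ("trained", "growth"), ("untrained", "no growth"),
]

counts = Counter(records)                       # frequency of each cell
columns = sorted({col for col, _ in records})   # independent variable as columns
rows = sorted({row for _, row in records})      # outcome categories as rows

# Column totals are the base for the column percentages.
column_totals = {c: sum(counts[(c, r)] for r in rows) for c in columns}
table = {
    r: {c: 100 * counts[(c, r)] / column_totals[c] for c in columns}
    for r in rows
}

for r in rows:
    cells = "  ".join(
        f"{c}: {counts[(c, r)]} ({table[r][c]:.0f}%)" for c in columns
    )
    print(f"{r:<10} {cells}")
```

Reading down a column shows how the outcome is distributed within one category of the independent variable, which makes differences between subgroups easy to compare.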

4.13.1.5 Chi-square analysis

Chi-square (χ²) analysis enables researchers to test for statistical significance between the frequency distributions of two or more nominally scaled variables in a cross-tabulation table to determine if there is any association between the variables (Hair et al. 2010). The Chi-square test evaluates the relationship between two variables. Instead of measuring numerical scores, each individual is simply classified into a category for each of the two variables (Gravetter and Forzano 2012).

Chi-square analysis compares the observed frequencies of the responses with the expected frequencies. The Chi-square statistic tests whether or not the observed data are distributed the way the researcher would expect them to be, given the assumption that the variables are not related. The expected cell count is a theoretical value, while the observed cell count is the actual cell count based on the study. The Chi-square statistic answers questions about relationships between nominally scaled data that cannot be analysed with other types of statistical analysis, such as ANOVA or t-tests.
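The comparison of observed and expected cell counts can be illustrated with a small worked example. This is a sketch under stated assumptions: the 2 × 2 `observed` table is hypothetical, and the calculation is the standard Pearson Chi-square for a test of independence, written out by hand rather than taken from the statistical package used in the study.

```python
# Hypothetical 2 x 2 cross-tabulation of observed counts (illustrative only).
# Rows: outcome categories; columns: categories of the independent variable.
observed = [
    [30, 10],
    [20, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count per cell if the two variables were not related:
# (row total * column total) / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square: sum over all cells of (observed - expected)^2 / expected.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(f"expected={expected}")
print(f"chi-square={chi_square:.3f}, df={df}")
```

For one degree of freedom the critical value at the .05 level is 3.841, so a statistic above that value would lead to rejecting the assumption that the two variables are unrelated.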

4.13.1.6 Analysis of variance (ANOVA)

Researchers use analysis of variance (ANOVA) to determine whether there is a statistically significant difference between three or more means. ANOVA is used to test for significant differences in situations where more than two groups are compared (Leroy 2012). It is applied in between-subjects research studies that use two or more separate samples to compare two or more separate treatment conditions or populations.

The test statistic produced by ANOVA is the F statistic, and a p value is associated with the F. If the p value is less than .05, researchers conclude that the ANOVA is statistically significant and therefore that at least one of the three or more group means differs from the others (Vanderstoep and Johnston 2009).

The total variance in a set of responses to a question is made up of between-group and within-group variance. The between-group variance measures how much the sample means of the groups differ from one another. In contrast, the within-group variance measures how much the responses within each group differ from one another. The F ratio is the ratio of these two components of total variance (Hair et al. 2010). The larger the variance between groups relative to the variance within groups, the larger the F ratio. Since the total variance in a data set is divisible into between-group and within-group components, if more variance is explained by differences between groups than by differences within groups, then the independent variable probably has a significant impact on the dependent variable (Hair et al. 2010). Thus, the larger the F ratio, the more likely it is that the differences between the groups are significant and that the null hypothesis will be rejected.
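The partition of variance into between-group and within-group components can be sketched as a hand calculation of a one-way ANOVA F ratio. The three groups below are hypothetical data chosen to keep the arithmetic simple. Each component is divided by its degrees of freedom to give a mean square, which is the standard form of the calculation.

```python
# Hypothetical responses for three groups (illustrative data only).
groups = {
    "A": [4, 5, 6],
    "B": [6, 7, 8],
    "C": [8, 9, 10],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = sum(all_values) / len(all_values)
group_means = {name: sum(v) / len(v) for name, v in groups.items()}

# Between-group sum of squares: how far each group mean is from the grand mean.
ss_between = sum(
    len(v) * (group_means[name] - grand_mean) ** 2 for name, v in groups.items()
)

# Within-group sum of squares: how far each response is from its own group mean.
ss_within = sum(
    (x - group_means[name]) ** 2 for name, v in groups.items() for x in v
)

# Mean squares: each sum of squares divided by its degrees of freedom.
df_between = len(groups) - 1
df_within = len(all_values) - len(groups)
ms_between = ss_between / df_between
ms_within = ss_within / df_within

f_ratio = ms_between / ms_within
print(f"SS between={ss_between}, SS within={ss_within}, F={f_ratio}")
```

Here the group means (5, 7 and 9) are far apart relative to the spread inside each group, so the between-group mean square is much larger than the within-group mean square and the F ratio is large. The p value would then be read from the F distribution with 2 and 6 degrees of freedom.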