3.4 Data analysis
3.4.7 First round of coding: Quantitative analysis
As stated earlier, to answer Research Question 2 I carried out both quantitative and qualitative analyses. The mixed-data analysis process is shown in Figure 3.4. First, to analyse the qualitative oral data quantitatively, the qualitative codes were ‘quantitized’, that is, transformed into numerical data (Sandelowski, 2011). As before, the level of measurement of the variables determined the choice of statistical tests. Because the variables here were categorical, I used tests appropriate for variables measured at the nominal level (Connor-Linton, 2010). The statistical analysis was carried out in Microsoft Excel, where I entered the formulas for the relevant statistical tests manually. Specifically, I computed descriptive statistics and ran chi-square tests for goodness of fit and chi-square tests for independence.
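To illustrate what quantitizing involves, the following minimal sketch turns coded episodes into a two-way table of counts. It is written in Python rather than Excel purely for illustration; the episode data and the helper name `cross_tabulate` are hypothetical and are not taken from the study.

```python
from collections import Counter

# Hypothetical coded episodes: one (CF type, uptake type) pair per CF episode.
episodes = [
    ("prompt", "repair"),
    ("prompt", "repair"),
    ("reformulation", "no uptake"),
    ("reformulation", "repair"),
    ("prompt", "needs-repair"),
    ("reformulation", "no uptake"),
    ("reformulation", "needs-repair"),
]

cf_types = ["prompt", "reformulation"]
uptake_types = ["repair", "needs-repair", "no uptake"]


def cross_tabulate(pairs, row_labels, col_labels):
    """Count how often each (row, column) code pair occurs."""
    counts = Counter(pairs)
    # Counter returns 0 for pairs that never occur, so empty cells are kept.
    return [[counts[(r, c)] for c in col_labels] for r in row_labels]


table = cross_tabulate(episodes, cf_types, uptake_types)
for label, row in zip(cf_types, table):
    print(label, row)
```

Each row of `table` holds the counts for one CF type across the three uptake types, which is the form of data that the nominal-level tests operate on.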
[Figure 3.4. Mixed-data analysis. Qualitative oral data were transcribed and coded with predetermined and emergent codes for error types (grammatical, lexical, pronunciation, unsolicited use of L1), CF types (prompts, reformulations), and uptake types (repair, needs-repair, no uptake). QUAN strand: quantitizing; descriptive statistics; chi-square tests for goodness of fit, with post-hoc pairwise binomial tests; chi-square tests for independence, with post-hoc pairwise comparisons; testing of the null hypotheses. QUAL strand: open coding; developed themes; established relationships; examined patterns; description; conceptualisation. Final step: interpretation and connection of QUAN and QUAL.]
The first step in the quantitative analysis of the oral data was to compute descriptive statistics. These were calculated for all elements of CF episodes to give a general picture of the frequency and distribution of each variable, namely types of error, CF, and uptake, across the sample. Descriptive statistics served as a building block, since they summarised the data sample as a whole (Salkind, 2010).
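As a rough illustration of this step, frequencies and percentages for each nominal variable could be tabulated as follows. This is a sketch in Python with hypothetical episode codes; `frequency_table` is an illustrative helper, not part of the original Excel workflow.

```python
from collections import Counter

# Hypothetical coded CF episodes: (error type, CF type, uptake type).
episodes = [
    ("grammatical", "reformulation", "repair"),
    ("grammatical", "prompt", "repair"),
    ("lexical", "reformulation", "no uptake"),
    ("pronunciation", "reformulation", "needs-repair"),
    ("grammatical", "reformulation", "no uptake"),
    ("lexical", "prompt", "repair"),
]


def frequency_table(values):
    """Return {category: (count, percentage)} for a list of nominal codes."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: (n, round(100 * n / total, 1)) for cat, n in counts.items()}


error_freq = frequency_table([e[0] for e in episodes])
cf_freq = frequency_table([e[1] for e in episodes])
uptake_freq = frequency_table([e[2] for e in episodes])

for name, freq in [("Error", error_freq), ("CF", cf_freq), ("Uptake", uptake_freq)]:
    print(name, freq)
```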
Next, I performed chi-square tests for goodness of fit to test whether the observed distribution of each variable differed significantly from an equal distribution. The assumptions of the test were met by the current sample: for each chi-square test for goodness of fit there was one categorical variable, the observations were independent, and the expected frequency in each category was at least five (Pallant, 2011). For each variable, I tested the null hypothesis $H_0: O_i = E_i$, i.e. the values were equally distributed across the categories of the variable, against the alternative hypothesis $H_a: O_i \neq E_i$, i.e. the values were not equally distributed across the categories. With an alpha level (α) of .05, the p value of each test was used to assess statistical significance. A statistically significant result indicated that the observed distribution was unlikely to have occurred in this sample by chance alone. Therefore, if p < α, the null hypothesis was rejected in favour of the alternative hypothesis; conversely, if p > α, the null hypothesis was not rejected (Rumsey, 2010).
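The goodness-of-fit calculation can be sketched as follows, using the standard statistic $\chi^2 = \sum_i (O_i - E_i)^2 / E_i$ with $k - 1$ degrees of freedom. This is an illustrative Python version under stated assumptions, not the Excel formulas used in the study: the observed counts are hypothetical, and `chi2_sf` is a helper written here that computes the upper-tail probability from the known closed forms for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: two-sided normal tail plus a finite correction sum.
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2.0))
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    pdf = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return tail + 2.0 * pdf * total


def chi2_goodness_of_fit(observed):
    """Test observed counts against equal expected frequencies."""
    n = sum(observed)
    k = len(observed)
    expected = n / k  # H0: every category is equally frequent
    stat = sum((o - expected) ** 2 / expected for o in observed)
    df = k - 1
    return stat, df, chi2_sf(stat, df)


# Hypothetical counts for four error types (not data from the study).
observed = [60, 25, 20, 15]
stat, df, p = chi2_goodness_of_fit(observed)
reject_null = p < 0.05
print(f"chi2({df}) = {stat:.3f}, p = {p:.6f}, reject H0: {reject_null}")
```

With equal expected frequencies of 30 per category, the assumption of at least five expected cases per category is satisfied in this example.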
In addition, I performed post-hoc pairwise binomial tests after the chi-square tests to determine which categories differed significantly from one another. I applied the Bonferroni correction to control the Type I error rate across multiple comparisons. The significance level for each post-hoc test was therefore adjusted by dividing α by the number of tests performed for the categories in question (Pallant, 2011). For example, if six post-hoc tests were performed, the adjusted significance level would be .05/6 ≈ .008, rather than .05.
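A minimal sketch of this post-hoc procedure is given below. It assumes that each pairwise test is an exact two-sided binomial test of whether two categories are equally frequent (p = 0.5), which is one common way to run such comparisons; the counts are hypothetical and the function names are illustrative only.

```python
import math
from itertools import combinations


def binomial_two_sided_p(a, b):
    """Exact two-sided binomial test that counts a and b are equally likely."""
    n = a + b
    k = min(a, b)
    # P(X <= k) under Binomial(n, 0.5); doubled because the null is symmetric.
    lower_tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * lower_tail)


def pairwise_binomial(counts, alpha=0.05):
    """Compare every pair of categories with a Bonferroni-adjusted alpha."""
    pairs = list(combinations(counts, 2))
    adjusted_alpha = alpha / len(pairs)  # Bonferroni: alpha / number of tests
    results = {}
    for first, second in pairs:
        p = binomial_two_sided_p(counts[first], counts[second])
        results[(first, second)] = (p, p < adjusted_alpha)
    return adjusted_alpha, results


# Hypothetical error-type counts (not data from the study).
counts = {
    "grammatical": 60,
    "lexical": 25,
    "pronunciation": 20,
    "unsolicited use of L1": 15,
}
adjusted_alpha, results = pairwise_binomial(counts)
print(f"adjusted alpha = {adjusted_alpha:.4f}")
for pair, (p, significant) in results.items():
    print(pair, f"p = {p:.4f}", "significant" if significant else "n.s.")
```

Four categories yield six pairwise tests, so the adjusted significance level in this example matches the .008 threshold described above.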
Furthermore, I explored the relations between the components of CF episodes and, specifically, the success of CF types in terms of uptake. Chi-square tests for independence were performed on two-way contingency tables to test the relations between errors and CF, and between CF and uptake (Connor-Linton, 2010). The assumptions of the chi-square test for independence were met by the current sample: there were two categorical variables (error types and CF types, or CF types and uptake types), the observations were independent, and the sampling was cross-sectional (Pallant, 2011). The null hypothesis ($H_0$: no association between the classifications) stated that there was no relationship between the variables. It was tested against the alternative hypothesis ($H_a$: an association between the classifications), which stated that there was a relationship between the variables. Once again, with an alpha level (α) of .05, the p value of the chi-square test indicated whether the association was statistically significant (Rumsey, 2010).
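The test for independence can be sketched in the same way. Expected frequencies are computed as (row total × column total) / N, and the statistic has (r − 1)(c − 1) degrees of freedom. The contingency table below is hypothetical, and `chi2_sf` is an illustrative helper for the upper-tail probability, not a reproduction of the Excel formulas used in the study.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: two-sided normal tail plus a finite correction sum.
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2.0))
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    pdf = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return tail + 2.0 * pdf * total


def chi2_independence(table):
    """Chi-square test for independence on a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected under H0
            stat += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, chi2_sf(stat, df)


# Hypothetical table: CF type (rows) by uptake type (columns).
# Columns: repair, needs-repair, no uptake.
table = [
    [30, 10, 10],  # prompts
    [20, 15, 35],  # reformulations
]
stat, df, p = chi2_independence(table)
print(f"chi2({df}) = {stat:.3f}, p = {p:.6f}, reject H0: {p < 0.05}")
```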
I also performed post-hoc pairwise comparisons after the overall chi-square tests to determine which categories differed significantly. As before, I applied the Bonferroni correction to control the Type I error rate, adjusting the significance level for each post-hoc test according to the number of tests performed for the categories in question (Pallant, 2011).
The quantitative analysis of the oral classroom data, which tested the distribution of the different elements of CF episodes and the relations between them, was followed by qualitative analysis. Adopting an explanatory sequential design, I followed up with qualitative analysis in order to interpret and explain the quantitative outcomes (Creswell & Creswell, 2018).