4.3 Inferential statistics
4.3.3 Semantic categories: epistemic commitment
The difference between the Egyptian L2 and the British L1 corpora is explored in terms of the distribution of the epistemic commitment devices across semantic categories. In the following section, writer groups are differentiated according to their use of three different semantic categories: certainty, probability and possibility.
An overview of the raw data indicates that the two corpora together contain 2122 epistemic devices (EDs). When these data were normalised per 1000 words, the two writer groups together used 21.07 EDs per 1000 words; about sixty percent of these devices (60.22%) were used by the British L1 writers, while the rest (39.82%) were used by the Egyptian L2 writers (see Figure 21). EDs in the L2 and L1 corpora total 8.39 and 12.69 (per 1000 words), respectively. The L1 writers thus use roughly one and a half times as many EDs as the L2 writers.
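The arithmetic behind these figures can be sketched as follows. This is a minimal illustration rather than the procedure actually used in the study: the corpus sizes are not stated in this section, so the `corpus_words` parameter of the hypothetical helper `per_thousand` is an assumption, and the shares are derived only from the two normalised rates reported above.

```python
def per_thousand(raw_count: int, corpus_words: int) -> float:
    """Normalised frequency: occurrences per 1000 running words."""
    return raw_count / corpus_words * 1000


# Normalised ED rates as reported in the text (per 1000 words).
rates = {"L2": 8.39, "L1": 12.69}

combined_rate = sum(rates.values())
# Each group's share of the combined normalised rate, in percent.
shares = {group: rate / combined_rate * 100 for group, rate in rates.items()}
# How many times as many EDs the L1 writers use, relative to the L2 writers.
l1_to_l2_ratio = rates["L1"] / rates["L2"]

for group, share in shares.items():
    print(f"{group}: {share:.1f}% of the combined rate")
print(f"L1/L2 ratio: {l1_to_l2_ratio:.2f}")
```

Working from the normalised rates rather than the raw counts keeps the comparison independent of the different sizes of the two corpora.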
Figure 21: Percentages of semantic categories between L2 and L1 writers
The pie chart in Figure 22 reveals that possibility devices are the most frequently used markers, constituting more than half of all devices used (54%; 11.40 per 1000 words); probability devices come second (33%), with 6.92 devices per 1000 words, followed by certainty markers (13%), with 2.76 per 1000 words.
Figure 22: Percentages of total epistemic commitment devices
It is clear from the data in Figure 23 that, in all semantic categories, the devices appear more often in the L1 corpus than in the L2 corpus. There is only a slight difference between the L2 and L1 percentages for certainty devices (49.05% and 50.95%, respectively), but a substantial difference for possibility devices (44.83% and 55.17%) and a remarkable difference for probability devices (27.81% and 72.19%).
Figure 23: Percentages of levels of epistemic commitment between the two corpora
As Table 26 reveals, the L2 writers use 1.35, 1.92 and 5.11 devices per 1000 words for certainty, probability and possibility, respectively, while the L1 writers use 1.41, 4.99 and 6.29 per 1000 words. The L1 writers also used a wider range of EDs (3 more certainty, 9 more probability and 3 more possibility devices) than the L2 writers, which was expected given the text analysis of boosters and hedges in the previous sections.
Table 26: Semantic categories: Raw number and per 1000 words
Writer Group    Certainty     Probability   Possibility
L2 writers      121 (1.35)    172 (1.92)    457 (5.11)
L1 writers      152 (1.41)    540 (4.99)    680 (6.29)
Total           273 (2.76)    712 (6.92)    1137 (11.40)
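As a cross-check on Table 26, the totals row and the category percentages shown in Figure 22 can be recomputed from the cell values. The sketch below is illustrative only; it assumes that the percentages in Figure 22 are based on the normalised (per 1000 words) totals, which is an inference from the numbers and is not stated explicitly in the text.

```python
# Raw counts from Table 26.
raw = {
    "L2": {"certainty": 121, "probability": 172, "possibility": 457},
    "L1": {"certainty": 152, "probability": 540, "possibility": 680},
}
# Normalised totals (per 1000 words) from the Total row of Table 26.
normalised_totals = {"certainty": 2.76, "probability": 6.92, "possibility": 11.40}

categories = ["certainty", "probability", "possibility"]

# Column sums of the raw counts, and the overall number of EDs.
category_totals = {c: sum(raw[group][c] for group in raw) for c in categories}
grand_total = sum(category_totals.values())

# Share of each category in the combined normalised rate, in whole percent
# (assumed basis of the Figure 22 percentages).
combined_rate = sum(normalised_totals.values())
category_share = {
    c: round(normalised_totals[c] / combined_rate * 100) for c in categories
}

print(category_totals, grand_total)
print(category_share)
```

The raw column sums reproduce the totals row (273, 712 and 1137, giving 2122 EDs overall), and the rounded shares match the 13%, 33% and 54% reported for Figure 22.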
Although standardising the frequencies gives more valid data for comparison between the two writer groups, the raw counts are presented below to show the most frequent certainty, probability and possibility devices with greater numerical clarity.
Table 27 shows the 10 most frequent certainty devices in the L2 and L1 corpora. These items account for around 96% of all certainty devices in the L2 corpus and about 92% in the L1 corpus. The table shows that the fact that and clearly are the most frequent items in both corpora, together accounting for more than 50% of all certainty devices used. Even though the number of markers used differs, the two groups show considerable similarities in their use of the devices listed among the 10 most frequent. Eight devices, i.e. ‘the fact that, clearly, actually, always, indeed, prove, must and never’, are common to both writer groups, although the frequency of each differs between the groups. For instance, ‘prove’ was used 10 times in the L2 corpus but only 3 times in the L1 corpus.
Table 27: Raw No. of certainty devices between L2 and L1 writers
Rank   Certainty device   L2 writers   % of all certainty devices   L1 writers   % of all certainty devices
1      the fact that      43           35.54                        32           21.05
2      clearly            32           26.45                        30           19.74
3      actually           11           9.09                         9            5.92
4      always             6            4.96                         11           7.24
5      indeed             8            6.61                         7            4.61
6      certainly          0            0.00                         11           7.24
7      certain            0            0.00                         11           7.24
8      prove              10           8.26                         3            1.97
9      must               4            3.31                         7            4.61
10     never              2            1.65                         9            5.92
Total                     121                                       152
The most remarkable aspect of the data is that adverbs are the preferred devices for expressing certainty in both writer groups. As can be seen from the table, the five most frequent devices after the fact that in the L1 data are the adverbs ‘clearly, actually, always, indeed and certainly’, which constitute more than 40% of the total certainty items. Similarly, the L2 figures show that the second, third, fourth and fifth listed devices, ‘clearly, actually, always and indeed’, constitute more than 45% of the total certainty devices.
Turning now to the probability devices, Table 28 reveals that the top 10 items account for about 99% of all probability devices in the L2 corpus and about 92% in the L1 corpus. The modal verb would is the most frequent probability device in both corpora. Yet the figures show that the epistemic lexical verbs ‘indicate, seem, suggest and appear’ are preferred by both writer groups, as they constitute around 60% of the probability devices in the L2 texts and about 46% in the L1 texts. There are, however, differences in the raw frequencies: epistemic would occurs about three times as often, and indicate about twice as often, in the L1 data as in the L2 corpus. Also, appear occurs often in the L1 texts (12.04% of the probability devices) but only rarely in the L2 scripts (4.04%). Together, ‘probably and generally’ occur 73 times in the L1 texts, but neither occurs in the L2 corpus.
Table 28: Raw No. of probability devices between L2 and L1 writers
Rank   Probability device   L2 writers   % of all probability devices   L1 writers   % of all probability devices
1      would                38           22.09                          121          22.41
2      indicate             38           22.09                          84           15.56
3      seem                 20           11.63                          51           9.44
4      suggest              23           13.37                          42           7.78
5      appear               7            4.04                           65           12.04
6      generally            0            0.00                           41           7.79
7      likely               10           5.81                           25           4.63
8      in general           22           12.79                          8            1.48
9      probably             0            0.00                           31           5.74
10     often                9            5.23                           22           4.07
Total                       172                                         540
Moving to the possibility devices, which constitute more than half of all epistemic items (54%), it is clear from the data in Table 29 that both writer groups used ‘may’ more often than any other possibility device, with 29.32% in the L2 corpus and 47.21% in the L1 corpus; in raw terms, however, it occurs more than twice as often in the L1 texts as in the L2 texts. By contrast, the L2 writers used ‘might’ three times as often as the L1 writers and used ‘can’ and ‘possible’ considerably more frequently. Another similarity between the two corpora is that both writer groups prefer modal verbs when expressing possibility: the top four possibility devices, ‘may, could, might and can’, constitute more than 90% of all possibility devices in each corpus (see Table 29).
Table 29: Raw No. of possibility devices between L2 and L1 writers
Rank   Possibility device   L2 writers   % of all possibility devices   L1 writers   % of all possibility devices
1      may                  134          29.32                          321          47.21
2      could                86           18.82                          226          33.23
3      might                116          25.38                          38           5.59
4      can                  82           17.94                          53           7.79
5      possible             27           5.91                           4            0.59
6      sometimes            7            1.53                           4            0.59
7      perhaps              5            1.09                           22           3.24
8      possibility          0            0                              5            0.74
9      possibly             0            0                              4            0.59
10     tentatively          0            0                              3            0.44
Total                       457                                         680
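The within-category percentages and the claim about the four modal verbs can be checked directly against the raw counts in Table 29. The following is a minimal sketch for illustration; the variable names are hypothetical and the counts are simply those listed in the table.

```python
# Raw counts from Table 29 as (L2 count, L1 count) per device.
possibility = {
    "may": (134, 321), "could": (86, 226), "might": (116, 38), "can": (82, 53),
    "possible": (27, 4), "sometimes": (7, 4), "perhaps": (5, 22),
    "possibility": (0, 5), "possibly": (0, 4), "tentatively": (0, 3),
}

# Column sums: total possibility devices per writer group.
l2_total = sum(l2 for l2, _ in possibility.values())
l1_total = sum(l1 for _, l1 in possibility.values())

# Percentage of one device within its writer group's possibility devices.
may_pct_l2 = round(possibility["may"][0] / l2_total * 100, 2)
may_pct_l1 = round(possibility["may"][1] / l1_total * 100, 2)

# Combined share of the four modal verbs in each corpus, in percent.
modals = ["may", "could", "might", "can"]
l2_modal_share = sum(possibility[m][0] for m in modals) / l2_total * 100
l1_modal_share = sum(possibility[m][1] for m in modals) / l1_total * 100

print(l2_total, l1_total, may_pct_l2, may_pct_l1)
print(f"{l2_modal_share:.1f}% {l1_modal_share:.1f}%")
```

The column sums, 457 and 680, agree with the possibility totals in Table 26, and the four modal verbs exceed 90% of the possibility devices in each corpus.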
Taken together, these results suggest that the L1 corpus contains considerably more certainty, probability and possibility devices than the L2 corpus, accounting for about 60% of the total devices while the L2 scripts account for the other 40%. The possibility markers are the most frequent devices in the two corpora (around 54%), whereas the certainty devices are the least used (13%). There are considerable similarities between the two writer groups, as they largely share the top-ten list for each category of epistemic meaning. However, there is a remarkable difference in the frequencies of individual devices: the L1 writers used some devices about three times as frequently as the L2 writers, e.g. ‘would’ among the probability devices, and others more than twice as often, e.g. ‘may and could’ in the possibility category. Finally, when the parts of speech used to realise epistemic modality are considered, further similarities become apparent. Both writer groups preferred epistemic modal verbs when expressing possibility and epistemic lexical verbs when expressing probability, while adverbs were their favoured devices for expressing certainty.