Chapter 5: Quantitative analysis of MNZE rhoticity
5.2 Modelling the MNZE data
5.2.7 Model 1
5.2.7.2 Model 1 results
Unsurprisingly, Model 1 identified the following phonological context as a highly significant factor influencing /r/ articulation. The default condition in the model was a following vowel, and tokens of /r/ with a following consonant were estimated to have a much lower likelihood (log odds) of articulation than tokens with a following vowel. Figure 5.1 shows the log odds of /r/ articulation according to whether a vowel or a consonant follows the /r/.
Figure 5.1: Log odds of /r/ articulation according to following context
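Because the figures in this section report log odds rather than probabilities, it may help to see how the two scales relate. The following is a minimal Python sketch of the standard logit and inverse-logit transformations. The function names and the example values are illustrative assumptions; they are not estimates from Model 1.

```python
import math


def log_odds_to_probability(log_odds: float) -> float:
    """Inverse logit: convert a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def probability_to_log_odds(p: float) -> float:
    """Logit: convert a probability strictly between 0 and 1 to log odds."""
    return math.log(p / (1.0 - p))


# Illustrative values only (NOT Model 1 estimates).
for value in (-2.0, 0.0, 2.0):
    prob = log_odds_to_probability(value)
    print(f"log odds {value:+.1f} -> probability {prob:.3f}")
```

A log-odds value of 0 corresponds to a probability of 0.5; positive values indicate that articulation is more likely than not, and negative values that it is less likely than not.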
The non-directional hypothesis in relation to the influence of the vowel phoneme preceding /r/ was that:
H1: there will be a difference in the likelihood of /r/ articulation for different preceding vowel contexts
Model 1 returned different coefficients for each of the 8 different vowel contexts. However, only a preceding NURSE vowel was estimated to have a significant effect on the intercept. The default condition was the START vowel. For categorical variables in regression models it is considered best to avoid using as the default condition a category level which has an extreme coefficient value (cf. Starkweather 2010b). The START vowel was not expected to have an extreme value. In order to assess whether this was indeed the case, I also obtained estimates for each vowel using 3 different default conditions. This approach also provided a reliable indication of the relative ordering of the different vowels in relation to their effects on /r/ articulation. Regardless of which of the three vowels was used as the default, the only vowel identified as exhibiting a significant difference was NURSE. Figure 5.2 shows the log odds of /r/ articulation in the different preceding vowel contexts as predicted by Model 1.
Figure 5.2: Log odds of /r/ articulation according to preceding vowel context
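The effect of changing the default condition can be sketched as follows. Under treatment (dummy) coding, each coefficient is the difference between a level and the default level, so changing the default changes the coefficients but not the relative ordering of the levels. This is a simplified Python illustration. The log-odds values and the vowel labels other than START and NURSE are hypothetical, and the sketch ignores the other predictors and the random effects in Model 1.

```python
# Hypothetical log odds per preceding vowel (NOT Model 1 estimates).
cell_log_odds = {"START": 1.0, "NURSE": 2.5, "NORTH": 0.8, "NEAR": 1.2}


def treatment_coefficients(cells: dict, reference: str) -> dict:
    """Return each level's coefficient as its difference from the reference."""
    base = cells[reference]
    return {lvl: val - base for lvl, val in cells.items() if lvl != reference}


def ranking(cells: dict, reference: str) -> list:
    """Order all levels by fitted log odds (intercept plus coefficient)."""
    intercept = cells[reference]
    fitted = {reference: intercept}
    for lvl, coef in treatment_coefficients(cells, reference).items():
        fitted[lvl] = intercept + coef
    return sorted(fitted, key=fitted.get)


for ref in ("START", "NORTH", "NEAR"):
    print(ref, treatment_coefficients(cell_log_odds, ref), ranking(cell_log_odds, ref))
```

In this toy example the ordering of the vowels is identical whichever level serves as the default, which is why refitting with several defaults is a useful check on the relative ordering.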
In relation to word frequency I hypothesised that:
H2: the likelihood of /r/ articulation for different word forms will be influenced by the frequency of the word form
Word frequency was not identified as having a significant effect on the intercept in Model 1, and the model achieved a better fit with word frequency removed. It is probable that reliable estimates cannot be obtained for word frequency due to the very low frequency values and the small differences between the frequency values of individual word forms. Many word forms occurred only once or twice, and the total sample of words is small compared to a corpus such as the BNC. The influence of word frequency on the /r/ tokens when all phonological contexts are included is therefore inconclusive. I explore word frequency further in subsequent models.
Based on my fieldwork observations I hypothesised that there would be age-related variation in /r/ articulation as follows:
H3i: teenagers will articulate /r/ more than adults in pre-consonantal contexts
and:
H3ii: teenagers will articulate /r/ less than adults in pre-vocalic contexts
Model 1 appeared to confirm the hypotheses with respect to age. The model identified a significant effect for age. Teenagers were estimated to be less likely than adults to articulate /r/ in the default condition (/r/ followed by a vowel). For the pre-consonantal tokens, there was a reduced likelihood of articulation across speakers regardless of age, but the model identified an interaction between age and following context: teenagers had a greater likelihood of articulating pre-consonantal /r/ than adults. This is shown in Figure 5.3.
Figure 5.3: Log-odds of /r/ articulation of pre-vocalic and pre-consonantal /r/ by age
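The way a main effect and an interaction combine to produce effects in opposite directions can be sketched as follows. This is a minimal Python illustration of a linear predictor with an age by following-context interaction. All coefficient values and names are hypothetical and were chosen only to reproduce the qualitative pattern described above; they are not the Model 1 estimates.

```python
# Hypothetical coefficients on the log-odds scale (NOT Model 1 estimates).
coef = {
    "intercept": 2.0,        # adult, following vowel (default condition)
    "teen": -0.5,            # main effect of age in the default context
    "consonant": -5.0,       # main effect of a following consonant
    "teen:consonant": 1.2,   # interaction between age and following context
}


def linear_predictor(teen: int, consonant: int, c: dict) -> float:
    """Log odds of /r/ articulation for one age/context combination (0 or 1 each)."""
    return (
        c["intercept"]
        + c["teen"] * teen
        + c["consonant"] * consonant
        + c["teen:consonant"] * teen * consonant
    )


def age_difference(consonant: int, c: dict) -> float:
    """Teenager minus adult log odds within one following context."""
    return linear_predictor(1, consonant, c) - linear_predictor(0, consonant, c)


print("pre-vocalic age difference:", age_difference(0, coef))
print("pre-consonantal age difference:", age_difference(1, coef))
```

With these assumed values the age difference is negative before a vowel and positive before a consonant, because the interaction term outweighs the main effect of age in the pre-consonantal context.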
The differences between adults’ and teenagers’ log odds of /r/ articulation are not large, but there are effects for both pre-vocalic and pre-consonantal tokens. The effects for the two following contexts are in opposite directions: the younger age group is less likely to articulate pre-vocalic /r/ but more likely to articulate pre-consonantal /r/. Together these observations tentatively indicate possible changes in MNZE rhoticity. If the changes are recent, the differences between age groups may not yet be considerable.
In relation to region as an explanatory variable, my fieldwork observations had led me to the hypothesis that:
H4: Region N speakers would be more likely to articulate pre-consonantal /r/s than region C speakers
I did not have any prior expectations of regional differences in relation to pre-vocalic /r/.
Model 1 indicated a slight regional difference for the default pre-vocalic /r/ tokens, with speakers in region C having a greater likelihood of articulation than speakers in region N. However, this effect is mild and may not be linguistically significant.
In relation to the hypothesised effect for the pre-consonantal tokens, Model 1 confirmed H4. There was an interaction such that, while pre-consonantal tokens were less likely to be articulated overall, region N speakers had an increased likelihood of articulation for those pre-consonantal tokens. The regional effects are displayed in Figure 5.4.
The results from Model 1 indicate that the regional differences are worth investigating further. It is not clear that there is any considerable difference between the two regions in relation to the pre-vocalic tokens, while the regional effect for pre-consonantal /r/ is stronger. While pre-consonantal tokens are rarely articulated in either region, the model predicts that speakers in region N are more likely to articulate them.
Figure 5.4: Log-odds of /r/ articulation of pre-vocalic and pre-consonantal /r/ by region
Most studies of sociolinguistic variation identify differences in variant use associated with gender. Since females often lead in the use of innovative variants which are below the level of awareness I hypothesised that:
H5: female speakers would be more likely to articulate pre-consonantal /r/s than males
In relation to the pre-vocalic tokens I did not have any prior hypothesis regarding the direction of a gender difference and the models were therefore an exploratory analysis of gender differences within this phonological context.
According to Model 1 there was no significant main effect associated with gender and the model was a better fit when gender was removed. This finding was unexpected and led me to explore gender differences further in subsequent models of pre-consonantal and pre-vocalic /r/, which are discussed below.
Maori phonology has fuller vowels than NZE and allows VV sequences. I therefore hypothesised that this could have consequences for speakers’ articulation of linking /r/s. Linking /r/s may be less likely to be articulated by speakers who are more integrated into Maori culture. Non-Maori speakers might be more likely to pronounce an orthographic /r/ in situations of vowel hiatus.
In relation to pre-consonantal /r/, my fieldwork observations and others’ recent research findings (e.g. Kennedy 2006) indicated that articulating pre-consonantal /r/s, especially in the context of a preceding NURSE vowel, could be related to ethnicity. I therefore hypothesised that:
H6i: speakers who are more integrated into Maori culture would be less likely to articulate non-final pre-vocalic /r/s (i.e. sandhi /r/s) than speakers who are less integrated into Maori culture
H6ii: speakers who are more integrated into Maori culture would be more likely to articulate pre-consonantal /r/s than speakers who are less integrated into Maori culture
Model 1 appeared to confirm hypotheses H6i and H6ii. There is a predicted decrease in the likelihood of articulation of pre-vocalic /r/ for each 1-unit increase in speakers’ MCI scores. For pre-consonantal tokens, however, an increase in MCI score is associated with an increase in the likelihood of articulation. The estimated log odds of articulation for scores of 0, 5 and 10 are shown in Figure 5.5.
Figure 5.5: Log odds of /r/ articulation for 3 different MCI score values
The difference in articulation associated with MCI scores does not appear as great for pre-consonantal tokens as it does for pre-vocalic tokens. It may not be significant, and it is worth investigating further.
Model 1 treats MCI as a continuous variable and estimates coefficients for each 1-unit increase in the MCI scores. However, it is not clear that each 1-unit increase in MCI is equally meaningful. It therefore seemed worthwhile to also explore a model in which MCI is treated as a categorical variable. In this treatment, the MCI scores become 11 distinct and ordered category levels, corresponding to the 11 discrete scores obtained by the speakers. Table 5.6 shows the 11 category levels, which are labelled according to their score values (i.e. level A0 = a score of 0, etc.).
Table 5.6: MCI as a category variable with 11 levels
MCI category        A0  B1  C2  D3  E4  F5  G6  H7  I9  J10  K12
Number of speakers   6   5  14   9   4   3   2   6   1    1    1
A model in which MCI is treated as a categorical variable returns a coefficient for each category which represents how much that category deviates from the default condition. As with the preceding vowel contexts, there was an issue concerning which level to treat as a default. The lowest possible score value is 0 (category A0, with 6 speakers). As described in chapter 3, a zero score indicates the lowest possible degree of integration into Maori culture.
While it may not be appropriate to employ an extreme value as a default condition for modelling many variables (cf. Starkweather 2010b), it seemed appropriate in this case. This category level provided a meaningful contrast condition because it represented speakers who asserted that they had no affiliation with Maori culture. The coefficients for each of the remaining 10 category levels could be compared in order to evaluate how the speaker groups associated with the other 10 scores differed from these apparently “non-integrated” speakers.
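The categorical treatment of MCI can be sketched as follows. With A0 as the default level, each of the other 10 levels receives its own indicator (dummy) variable, and the coefficient on that indicator expresses the level's deviation from A0. This is a minimal Python illustration using the level labels and speaker counts from Table 5.6. The function names and the `MCI_` prefix are assumptions for illustration and do not reproduce the software actually used to fit the model.

```python
# Level labels and speaker counts as given in Table 5.6.
MCI_LEVELS = ["A0", "B1", "C2", "D3", "E4", "F5", "G6", "H7", "I9", "J10", "K12"]
SPEAKERS_PER_LEVEL = [6, 5, 14, 9, 4, 3, 2, 6, 1, 1, 1]


def score_to_level(score: int) -> str:
    """Map an observed MCI score to its category label (e.g. 4 -> 'E4')."""
    by_score = {int(label[1:]): label for label in MCI_LEVELS}
    return by_score[score]


def dummy_code(level: str, reference: str = "A0") -> dict:
    """Treatment coding: one 0/1 indicator per non-reference level.

    The reference level is represented by all indicators being 0.
    """
    if level not in MCI_LEVELS:
        raise ValueError(f"unknown MCI level: {level}")
    return {f"MCI_{lvl}": int(lvl == level) for lvl in MCI_LEVELS if lvl != reference}


print(dummy_code(score_to_level(0)))  # all indicators are 0 for the default A0
print(dummy_code(score_to_level(4)))  # only MCI_E4 is 1
```

Unlike the continuous treatment, this coding places no constraint on how the coefficients change from one score level to the next, which is what allows the consistency of the trend to be examined.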
If the findings for the continuous MCI variable were correct, the expectation for the categorical MCI variable would be that:
(i) for pre-vocalic /r/ the coefficients for the individual score levels would decrease consistently as each score level increased
(ii) for pre-consonantal /r/ the coefficients for each score level would increase consistently as each score level increased
Table 5.7 shows the MCI score coefficients for articulating /r/ in the default context of a following vowel.
Table 5.7: Score level estimates for pre-vocalic /r/
Score Level               Coefficient  Std. Error  z-score  Pr(>|z|)
Intercept (default = A0)  2.93786      0.37642     -1.805   0.071008
For the pre-vocalic tokens of /r/, all score level coefficients were negative relative to the default condition of A0. Overall, the estimates became increasingly negative as scores increased, though not in a strictly linear manner. Score level J10 was an exception to this trend. Score levels I9, J10 and K12 each have only 1 speaker, and the estimates for these 3 score levels are to be viewed cautiously. Five score levels, which show an increasingly negative effect as their values increase, were identified as significant in the model: E4, F5, G6, H7 and K12.
Model 1 with MCI treated as a categorical variable also identified interactions between the following context and MCI scores. Table 5.8 shows the estimates for the /r/ tokens in the pre-consonantal context. All score levels except for E4 have a positive effect on the intercept. The E4 speaker group has an unusually large negative value for the pre-consonantal tokens as well as a large standard error. This unusual result seems to be due to insufficient pre-consonantal data for the 4 speakers within this group. If there are insufficient tokens available, the model cannot produce reliable estimates. None of the 4 speakers within the E4 group articulated any pre-consonantal tokens of /r/, and the model is therefore unable to make predictions of rates of /r/ use in relation to an MCI score of 4. Positive effects on articulation are identified as statistically significant for all score levels except C2 and E4.
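The difficulty with a group that never articulates a token can be sketched as follows. When the observed count of articulated tokens is zero, the empirical log odds are negative infinity, so a fitted coefficient tends towards a very large negative value with a very large standard error. This is a minimal Python illustration. The token counts are hypothetical, and the 0.5 pseudo-count shown is a common continuity correction included only for comparison; it is not a procedure used in this study.

```python
import math


def empirical_log_odds(articulated: int, total: int) -> float:
    """Observed log odds of articulation; infinite when a cell count is zero."""
    if total <= 0 or not 0 <= articulated <= total:
        raise ValueError("counts must satisfy 0 <= articulated <= total, total > 0")
    if articulated == 0:
        return -math.inf
    if articulated == total:
        return math.inf
    p = articulated / total
    return math.log(p / (1.0 - p))


def smoothed_log_odds(articulated: int, total: int, pseudo: float = 0.5) -> float:
    """Log odds after adding a small pseudo-count to both outcomes."""
    return math.log((articulated + pseudo) / (total - articulated + pseudo))


# Hypothetical counts (NOT the study's data): 0 of 40 tokens articulated.
print(empirical_log_odds(0, 40))
print(smoothed_log_odds(0, 40))
```

Even with the correction, the estimate remains strongly negative and depends heavily on the arbitrary pseudo-count, which is consistent with treating the E4 estimate as uninformative.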
Table 5.8: Score level estimates for pre-consonantal /r/
Score Level  Coefficient  Std. Error  z-score  Pr(>|z|)
Intercept
Figure 5.6 shows the differences in the log-odds of articulation for the 11 score levels.
Figure 5.6: Differences in the log odds of /r/ articulation for 11 MCI score levels
It is difficult to ascertain whether MCI is a significant factor in relation to /r/ articulation based on these results. There is certainly an indication that greater Maori cultural integration may have a disfavouring influence on pre-vocalic /r/ articulation and a favouring effect on pre-consonantal /r/ articulation. However, the estimates for the 3 highest scores are unreliable since they are based on only 1 speaker per score category. Nevertheless, the observation of tentative general trends, again in opposite directions for pre-vocalic versus pre-consonantal /r/, suggested that it was worthwhile to continue to probe the relevance of this sociocultural factor in subsequent models.
The contribution of the random effects for speaker and for word forms in Model 1 can be evaluated by considering the model’s estimates of the variance attributed to each random effect. The total variance associated with the random effects is:
word variance: 0.60 + speaker variance: 0.22 = total random effect variance: 0.82
The proportion of variance attributable to each random effect is that effect’s variance divided by the total random effect variance. Thus the speaker proportion is 0.22 / 0.82 = 0.27 (to two decimal places). Approximately 27% of the random effect variance can be attributed to interspeaker variation, and approximately 73% is attributable to word form differences. Both speaker and word contribute significantly to the model. ANOVA comparisons confirm that a model with either one of the random effects included is a better fit to the data than a model which does not include any random effects. There is no significant difference between two models which differ only with respect to which one of the random effects is included.
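The variance proportions can be checked with a short calculation. This is a minimal Python sketch using the rounded variances reported above; the function and variable names are illustrative assumptions, and proportions computed from rounded variances may differ slightly from those computed from the unrounded model estimates.

```python
def variance_shares(variances: dict) -> dict:
    """Return each random effect's share of the total random-effect variance."""
    total = sum(variances.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {name: value / total for name, value in variances.items()}


# Rounded variances as reported in the text.
random_effect_variances = {"word": 0.60, "speaker": 0.22}

shares = variance_shares(random_effect_variances)
for name, share in shares.items():
    print(f"{name}: {share:.1%} of the random effect variance")
```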
The variance attributable to differences in word forms seems considerable. The model provides intercepts for each individual word form. I explore word form intercepts in relation to specific phonological contexts in the following two sections. Speaker intercepts indicate individual speaker differences in relation to the model’s baseline intercept value. I explore the individual speaker differences in chapter 6.