Chapter 5: Quantitative analysis of MNZE rhoticity
5.2 Modelling the MNZE data
5.2.7 Model 1
5.2.7.4 Model PreV
A model fit to the pre-vocalic non-phrase-final tokens (i.e. linking /r/) initially included the linguistic variables preceding vowel and word frequency; the social variables age, region, MCI and gender; and interactions between region and MCI, between region and gender, between MCI and gender, and between region, MCI and gender. It also included random effects for speaker and word form.
Neither the preceding vowel nor the word frequencies were estimated to be significant effects in the model and both were dropped. Gender is not significant but is retained due to a significant interaction between region and gender. The best fitting model (Model PreV) was one which included the social factors: region, age, MCI and gender and an interaction
between region and gender. The random effects for speaker and for word form were retained.
Estimated coefficients for Model PreV are shown in table 5.9. I discuss each of the significant effects in turn.
Table 5.9: Estimates for best-fitting Model PreV
                        Estimate   Std. Error   z-score   Pr(>|z|)
Intercept / baseline     3.21631   0.37828       8.502    <0.0001
Age Young               -1.45595   0.31751      -4.586    <0.0001
Region N                -0.86313   0.24016      -3.594    0.000326
MCI                     -0.19533   0.03413      -5.724    <0.0001
Gender M                -0.35065   0.24670      -1.421    0.155208
Region N:Gender M        0.81875   0.38051       2.152    0.031418
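As a cross-check on table 5.9, the z-scores and p-values can be recomputed from the estimates and standard errors. The sketch below assumes that the reported z-scores are Wald statistics (estimate divided by standard error) and that the p-values are two-sided under a standard normal distribution; the function names are illustrative and are not taken from the original analysis.

```python
import math

# (estimate, standard error) pairs copied from Table 5.9
TABLE_5_9 = {
    "Intercept / baseline": (3.21631, 0.37828),
    "Age Young": (-1.45595, 0.31751),
    "Region N": (-0.86313, 0.24016),
    "MCI": (-0.19533, 0.03413),
    "Gender M": (-0.35065, 0.24670),
    "Region N:Gender M": (0.81875, 0.38051),
}


def wald_z(estimate, std_error):
    """Wald z-score: the estimate divided by its standard error (assumed method)."""
    return estimate / std_error


def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


for term, (estimate, std_error) in TABLE_5_9.items():
    z = wald_z(estimate, std_error)
    print(f"{term:22s} z = {z:7.3f}   p = {two_sided_p(z):.6f}")
```

Under these assumptions the recomputed values agree with the table to the reported precision, including the non-significant gender main effect and the significant region by gender interaction.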
Model PreV confirmed that the teenagers are less likely than the 6 female northern region adults to articulate linking tokens of /r/ as shown in figure 5.7.
Figure 5.7: Log odds of pre-vocalic /r/ articulation by age
The 6 female adult speakers in region N are significantly more likely to articulate pre-vocalic /r/ than the teenagers. However, more of the adult population would need to be sampled in order to evaluate how widespread this effect may be, especially since the adults are all female town N speakers. While the age differences identified here must be treated as only
suggestive, this is certainly an interesting finding which is worth pursuing in future research.
The effect of region on the likelihood of /r/ articulation is also slightly more robust when the pre-vocalic tokens are modelled separately. The regional difference in the articulation of the pre-vocalic tokens is shown in figure 5.8.
Figure 5.8: Log odds of pre-vocalic /r/ articulation by region
The regional difference in linking /r/ has particular relevance for this thesis with its focus on regional differences in rhoticity. While regional differences in pre-consonantal /r/ articulation had been hypothesised on the basis of fieldwork observations, regional linking /r/ differences were not. It is not clear that the difference between the 2 regions is of great significance but it could be indicative of a relatively recent change in the sandhi dimension of NZE rhoticity.
The model's estimates in relation to gender lend some tentative support to the possibility that there is a change underway.
Model 1 did not identify any gender effects for rhoticity. In Model PreV gender is not significant as a main effect, but its interaction with region is (see table 5.9). The positive interaction coefficient predicts that males in region N will articulate more linking /r/s than the region and gender main effects alone would predict (the model's default values are region = C and gender = F). It is not clear what interactions, if any, actually hold between region and gender. Attempts to include more complex interactions in the models produced overfitted models. The model tentatively indicates that females may be slightly less likely to articulate linking /r/s than males (though perhaps only in town N). If this is the case, then this finding would lend support to the idea that there is a change in progress involving a reduction in sandhi /r/ articulation, since females are often more closely associated with innovative non-salient changes than males. I explore this hypothesis more directly in subsequent models. Figure 5.9 displays the potential gender difference predicted by Model PreV.
Figure 5.9: Log odds of pre-vocalic /r/ articulation by gender (the data points for region C have been jittered slightly to improve the visualisation of the data, but there is no actual difference in the values).
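The way the region and gender coefficients combine can be made explicit with a little arithmetic. The sketch below assumes treatment (dummy) coding with region C and gender F as the default levels, as the text describes, and holds age and MCI constant; it reports each cell as a log-odds offset from the default cell. The function names are illustrative only.

```python
# Fixed-effect coefficients (log odds) copied from Table 5.9
REGION_N = -0.86313
GENDER_M = -0.35065
REGION_N_GENDER_M = 0.81875


def offset_from_default(region, gender):
    """Log-odds offset of a region/gender cell from the default (region C, gender F)."""
    offset = 0.0
    if region == "N":
        offset += REGION_N
    if gender == "M":
        offset += GENDER_M
    if region == "N" and gender == "M":
        offset += REGION_N_GENDER_M  # interaction applies only to region N males
    return offset


def gender_gap(region):
    """Male minus female log odds within one region."""
    return offset_from_default(region, "M") - offset_from_default(region, "F")


for region in ("C", "N"):
    for gender in ("F", "M"):
        print(f"region {region}, gender {gender}: {offset_from_default(region, gender):+.5f}")
    print(f"  male minus female in region {region}: {gender_gap(region):+.5f}")
```

On these coefficients the male minus female gap in region N is -0.35065 + 0.81875 = +0.46810 log odds. The region C gap is simply the gender main effect (-0.35065), which is not significant in table 5.9 and should not be over-interpreted.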
Model PreV confirms that MCI is a relevant factor for the variation in pre-vocalic /r/.
When treated as a continuous variable MCI has a significant disfavouring effect. Articulation is predicted to decrease considerably in line with increasing scores (see figure 5.10).
Figure 5.10: Estimated log odds of pre-vocalic /r/ articulation for 3 MCI scores
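The size of the continuous MCI effect can be illustrated directly from the coefficient in table 5.9. The sketch below assumes a strictly linear effect on the log-odds scale, which is what treating MCI as a continuous predictor implies; the scores used are illustrative and are not necessarily the 3 scores plotted in figure 5.10.

```python
import math

MCI_SLOPE = -0.19533  # MCI coefficient (log odds per score point) from Table 5.9


def log_odds_shift(score):
    """Change in log odds relative to a score of 0, assuming a linear MCI effect."""
    return MCI_SLOPE * score


def odds_ratio(score):
    """Multiplicative change in the odds of articulation relative to a score of 0."""
    return math.exp(log_odds_shift(score))


# Illustrative scores only (assumption): lowest score, the E4 level, highest score.
for score in (0, 4, 12):
    print(f"MCI {score:2d}: shift = {log_odds_shift(score):+.3f}, "
          f"odds ratio = {odds_ratio(score):.3f}")
```

On this reading each additional MCI point multiplies the odds of articulating linking /r/ by roughly 0.82, so the predicted decrease compounds across the score range.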
In order to explore whether the MCI effect holds across all score levels I compared coefficients for the 11 score levels in 5 conditions. In each condition a different score category was used as a default condition for the model's estimates. The 5 conditions are as shown in table 5.10.
Table 5.10: 5 MCI category conditions
Default condition   Category description
A0                  6 speakers who scored 0
B1                  5 speakers who each scored 1
C2                  14 speakers who each scored 2
D3                  9 speakers who each scored 3
E4                  4 speakers who each scored 4
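Changing the default condition re-expresses the same category effects as differences from a new reference level. The sketch below illustrates that idea under treatment coding. The log-odds values are HYPOTHETICAL, invented purely for illustration, and are not estimates from Model PreV; in practice the model would be refitted under each default so that standard errors and p-values are obtained for each contrast.

```python
# HYPOTHETICAL log odds per MCI category, invented purely for illustration;
# they are not estimates from the thesis.
HYPOTHETICAL_LOG_ODDS = {
    "A0": 2.0, "B1": 1.5, "C2": 1.0, "D3": 1.75, "E4": 0.5, "F5": -0.5,
}


def contrasts_against(default, level_log_odds):
    """Treatment-coded coefficients: each level's log odds minus the default level's."""
    baseline = level_log_odds[default]
    return {
        level: value - baseline
        for level, value in level_log_odds.items()
        if level != default
    }


# Show how the signs of the coefficients change with the choice of default.
for default in ("A0", "B1", "C2", "D3", "E4"):
    coefs = contrasts_against(default, HYPOTHETICAL_LOG_ODDS)
    signs = ", ".join(f"{level} {'+' if c > 0 else '-'}" for level, c in coefs.items())
    print(f"default {default}: {signs}")
```

The sign of a coefficient therefore depends on which category serves as the default, which is why the comparison across the 5 conditions is informative about where the categories sit relative to one another.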
A comparison of the coefficients in the 5 different conditions confirms that higher MCI scores have a disfavouring influence on linking /r/ articulation. However, the trend is not a straightforward linear pattern. The score levels fall into 2 groups: a group of lower score
levels and a group of higher score levels. 1 score level was an exception to this division (see table 5.11).
Table 5.11: Division of MCI scores based on model estimates in 5 conditions
Groups Category levels
Lower scoring group: A0, B1, C2, D3
Higher scoring group: E4, F5, G6, H7, I9, K12
Exception: J10
The division of the MCI scores as in table 5.11 was motivated by the observations described in 9-13:
9. When the default condition is any 1 of the scores in the lower scoring group, the coefficient for each score in the higher scoring group is negative. The same 4 scores (F5, G6, H7 and K12) are consistently identified by the model as significant.
10. When the default condition is any 1 of the scores in the lower scoring group the coefficients for that group change between positive and negative values depending on which of the 4 scores is the default. For example, D3 only has a negative coefficient when contrasted with A0, while B2 is consistently negative when contrasted with any of its group members.
11. When the default condition is E4 (the first member of the higher scoring group), the coefficients for all scores in the lower scoring group are positive.
12. As the lowest score in the higher scoring group, E4 seems to represent a demarcation point between the 2 groups. When E4 is the default, all score levels below E4 retain positive coefficients and all score levels higher than E4 retain negative coefficients.
Furthermore, G6, which is identified as having a significant negative effect when contrasted with any lower level, ceases to be significant when contrasted with E4, while F5, H7 and K12 still are.
13. The category level J10 is an exception. This score involves only 1 speaker and the coefficient for this speaker is consistently positive across all contrast conditions, including E4.
Figure 5.11 shows the log odds for different MCI scores when MCI is treated as a categorical variable and the default condition is E4 (footnote 10).
Figure 5.11: Log odds of pre-vocalic /r/ articulation for 11 MCI scores (scores I9, J10 and K12 have only 1 speaker each).
It is interesting that all of the statistically significant effects are for scores above E4, i.e.
scores of F5, G6, H7 and K12. The highest scoring individual also has the strongest disfavouring effect.
K12 has a particularly low coefficient. In fact both of the 2 speakers with the 2 highest scores show relatively extreme behaviour, K12 in the expected direction for their score level and J10 in the opposite direction. These 2 speakers can be considered outliers.
On the questionnaire, a score of 4 could be achieved without any genuine involvement in Maori culture (e.g. simply knowing some Maori greetings). 38 speakers scored 4 or lower on the questionnaire and only 14 speakers achieved a score of 5 or more. It seems then that the MCI scores have captured a socio-cultural effect that is relevant to variation in sandhi /r/. The MCI scores are not a wholly reliable predictor of articulation, however, since individual speaker variation is also apparent. It will be interesting to compare the MCI findings with findings for a model of MCI influence on the pre-consonantal tokens.

10 The PreV model calculations for E4 are not significantly divergent from those for other speakers. This confirms that the unusual result for E4 in table 5.7 relates specifically to the absence of articulated pre-consonantal tokens for these speakers.
Model PreV does not identify any preceding vowel effect for the articulation of pre-vocalic /r/. However, it is worth considering whether any individual word effects are apparent. The mixed effects model provides individual intercepts for each of the different word forms in which the non-phrase final pre-vocalic /r/ tokens occurred. The intercepts represent the estimated adjustment to the baseline intercept value for each word form irrespective of the patterns identified in the model. In total, there are 163 different word forms in which linking /r/ tokens occurred. The great majority of /r/ tokens are preceded by a lettER vowel. It is probable that this uneven distribution is the primary reason why the model was unable to identify any predictive patterns for the preceding vowel context.
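The role of the word-form intercepts can be sketched as follows. The example assumes a logistic mixed model in which each token's log odds are the sum of the fixed part, a speaker adjustment and a word-form adjustment. Only the baseline intercept comes from table 5.9; the word labels and adjustment values are HYPOTHETICAL placeholders, since the fitted adjustments are not reproduced here.

```python
import math

INTERCEPT = 3.21631  # baseline log odds from Table 5.9


def articulation_probability(fixed_log_odds, speaker_adjustment=0.0, word_adjustment=0.0):
    """Inverse logit of the fixed part plus speaker and word-form intercept adjustments."""
    eta = fixed_log_odds + speaker_adjustment + word_adjustment
    return 1.0 / (1.0 + math.exp(-eta))


# HYPOTHETICAL word-form adjustments (log odds), for illustration only.
hypothetical_word_adjustments = {"word_a": 0.4, "word_b": 0.0, "word_c": -0.6}

for word, adjustment in hypothetical_word_adjustments.items():
    p = articulation_probability(INTERCEPT, word_adjustment=adjustment)
    print(f"{word}: adjustment {adjustment:+.1f} -> probability {p:.3f}")
```

A positive adjustment raises the predicted probability of articulation for tokens of that word form and a negative adjustment lowers it, independently of the fixed effects.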