
CHAPTER 2: ACOUSTIC ANALYSES OF PITCH RANGE

2.4 New methods in the analysis of pitch range

2.4.1 Ladd (1985-2003)

Over a series of investigations, Ladd and his colleagues developed a widely accepted, integrated model for pitch analysis (Ladd et al., 1985; Ladd, 1996; Shriberg et al., 1996; Ladd and Schepman, 2003). The first point to note is that Ladd (1996) adopted two kinds of approach to pitch analysis: the initializing approach and the normalizing approach.

The initializing approach concerns the relational, or syntagmatic, character of pitch, which can be interpreted only in relation to other parts of the utterance. By its very nature, pitch involves a movement or a change in the intonation contour. This kind of approach can be used to describe local modifications of pitch in relation to what immediately precedes (i.e. the starting point of an utterance). Ladd stated that ‘the only thing such a model requires in order to derive actual F0 values is an initial state for each utterance. It does not even need to refer to characteristics of a speaker’s range; all it needs is a starting point’ (Ladd, 1996: 253). Thus, the initializing approach rests on the assumption that a low rise can be distinguished from a high rise with respect to the starting point, that is, the initial F0 value.
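The initializing idea can be illustrated with a minimal sketch in which every F0 value is derived from a starting point and a sequence of relative pitch steps. The use of semitones as the unit of relative change, the function names and the numerical values are assumptions made here for illustration only; they are not part of Ladd's description.

```python
import math


def contour_from_start(start_hz, steps_semitones):
    """Derive F0 values (Hz) from an initial state and relative steps.

    Each step is expressed in semitones relative to the preceding value,
    so no speaker-specific range information is needed (assumption:
    semitones are used as the relative unit).
    """
    contour = [start_hz]
    for step in steps_semitones:
        contour.append(contour[-1] * 2 ** (step / 12))
    return contour


def semitones_from_start(f0_hz, start_hz):
    """Express an F0 value as a distance in semitones from the starting point."""
    return 12 * math.log2(f0_hz / start_hz)


# Hypothetical example: two rises from the same starting point (100 Hz).
low_rise = contour_from_start(100.0, [3])    # small excursion
high_rise = contour_from_start(100.0, [12])  # one-octave excursion
```

In this sketch the two rises differ only in their size relative to the shared initial F0 value, which is the kind of distinction the initializing approach relies on.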

The normalizing approach is based on ‘speaker-specific reference points, such as upper and lower F0 values’ (Ladd, 1996: 256). Ladd developed what amounts to a phonology of pitch by giving a quantitative definition of pitch scaling. He did so in a model based on three targets: H (high), M (mid) and L (low) pitch. The idea is that these labels (based to some extent on the Autosegmental-Metrical system) can describe pitch movements in any language (see also the recent detailed analysis of tonal autosegments and features of pitch in Hayes, 2009: 291-312). Indeed, by combining H, M and L targets, it is possible to describe pitch movements without making any reference to F0 values. This holds only at an abstract level; at a practical level, a speaker’s overall speaking range is decisive for the identification of targets. For instance, it has been argued that

‘the actual F0 values corresponding to H tone and M tone will depend on whether they are spoken by a man or a woman, by a person with a monotonous voice or a person with a lively voice; that is, the acoustic realization of the pitch scale depends crucially on the speaker and the paralinguistic context’ (Ladd, 1996: 270).
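The dependence of actual F0 values on the speaker can be illustrated with a minimal sketch that expresses a target as a proportional position between a speaker's lower and upper reference points. The linear scale in Hz, the function names and the reference values below are assumptions chosen for illustration; they are not taken from Ladd (1996).

```python
def position_in_range(f0_hz, floor_hz, ceiling_hz):
    """Return the position of an F0 value within a speaker's range.

    0.0 corresponds to the speaker's lower reference point and 1.0 to the
    upper one (assumption: a linear scale in Hz).
    """
    if ceiling_hz <= floor_hz:
        raise ValueError("ceiling_hz must be greater than floor_hz")
    return (f0_hz - floor_hz) / (ceiling_hz - floor_hz)


def f0_from_position(position, floor_hz, ceiling_hz):
    """Map a range-relative position back to an F0 value in Hz."""
    return floor_hz + position * (ceiling_hz - floor_hz)


# Hypothetical speakers: the same abstract target (mid-range) is realized
# with different F0 values depending on the speaker's reference points.
speaker_a = f0_from_position(0.5, floor_hz=100.0, ceiling_hz=200.0)
speaker_b = f0_from_position(0.5, floor_hz=200.0, ceiling_hz=400.0)
```

Under these assumptions, two different F0 values count as the same target once each is related to its own speaker's reference points.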

To build a normalizing model of the phonetics of pitch, Ladd proposed a formulation capable of capturing the quantitative properties of any speaker-specific scale. In line with Liberman and Pierrehumbert (1984), Ladd observed that the utterance-final low of the speaking range could be taken as a reference frequency (Fr). This rests on the premise that, within an F0 contour, peaks rise and fall dramatically, creating valleys between them, whereas low values remain nearly constant (Liberman and Pierrehumbert, 1984). In addition, Ladd claimed that ‘the bottom of the speaking range is a fairly constant feature of an individual’s voice’ (1996: 267). The F0 of a specific point within the utterance is calculated with the following formula:

(1) F0 = Fr · T · r

where Fr is the zero level or reference frequency, T is an invariant abstract pitch value and r is a range multiplier whose value is 1 for normal range (for further mathematical detail on the application of this formula, see Ladd, 1996: 267). Despite the adequacy of the model described above, Ladd himself admitted that this model ‘does not work, in the regularities that have been observed in range-modifications and range-comparisons’ (Ladd, 1996: 269).
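As a worked illustration, formula (1) can be written as a short function. The sketch follows the formula as it is printed above (a product of Fr, T and r); the function name and the numerical values are hypothetical and serve only to show how the range multiplier rescales a target.

```python
def f0_from_formula(fr_hz, t, r=1.0):
    """Compute F0 (Hz) as Fr * T * r, following formula (1) as printed.

    fr_hz: reference frequency Fr (the speaker's utterance-final low)
    t:     invariant abstract pitch value T
    r:     range multiplier (1.0 for normal range)
    """
    return fr_hz * t * r


# Hypothetical values: Fr = 80 Hz and T = 1.5.
normal = f0_from_formula(80.0, 1.5)           # 120.0 Hz in normal range
raised = f0_from_formula(80.0, 1.5, r=1.25)   # 150.0 Hz with r = 1.25
```

With r = 1 the target depends only on the speaker's reference frequency and the abstract pitch value; any other value of r scales the target while Fr and T stay the same.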

Ladd also studied in detail other phenomena related to pitch range variation within and across speakers, such as the ‘segmental anchoring of F0’ and the prediction of F0 targets when ‘speaking up’. Segmental anchoring of F0 concerns tonal alignment and, in particular, the temporal coordination between F0 and phonetic segments. Local F0 maxima and minima are claimed to be aligned in predictable ways along the segmental string (Atterer and Ladd, 2004). Thus, F0 rises and falls are aligned with specific landmarks in the segmental string, called ‘anchor points’ (Arvaniti et al., 1998). Since the association between tonal and segmental elements has an effect on the alignment of pitch targets, this has implications for the phonological description of intonation (Ladd et al., 1999; Atterer and Ladd, 2004). Ladd and his colleagues showed that F0 is related to the segmental structure by means of the ‘segmental anchoring’ phenomenon. In addition, F0 variation is also influenced by other factors, such as speakers’ emotional states and pragmatic intentions. The prediction of F0 when ‘speaking up’ concerns the F0 movements that occur when speakers deliberately raise their voices in noisy or highly emotional contexts (Shriberg et al., 1996). The underlying idea is that there exists ‘a raising function by which to relate F0 targets in the raised mode to corresponding targets in the same sentences spoken in the normal mood’ (Shriberg et al., 1996: 1). The analysis of speech data from 15 native speakers of Dutch (7 males and 8 females) showed that subjects produced higher pitch levels when speaking over a (simulated) noisy telephone channel than in normal face-to-face communication.

It is still an open question whether the variability of pitch range should be described as gradient or categorical. Ladd has argued that, contrary to the view of Bolinger (1989) and Pierrehumbert (1994), who consider pitch as gradient, ‘it is theoretically coherent to recognize the existence of factors that are categorical and linguistic’ (Ladd, 1996: 282), such as the variability of pitch range. Even though F0 variation is clearly continuous along a pitch contour, Ladd and Morton (1997) found some evidence that pitch movements were interpreted categorically by the participants in their study. These considerations on the categorical nature of pitch by Ladd and Morton were further supported by Kohler (2004). The categorical essence of pitch is based on the fact that a speaker who has no knowledge of a specific language is nevertheless able to perceive F0 changes in the contour. This means that pitch movements are essentially categorical in nature.