An outline of a phonetic implementation model for sign language

The sign language analogue of Boersma’s production model is presented on the right side of Figure 2.5.

Figure 2.5

A sign language production model

In Figure 2.5 we see the same overall scheme as in the speech model in Figure 2.4; only the representations are present. The perceptual specification (the ‘underlying form’) can be articulated in different ways. These different articulations have automatic optical results, which can in turn be categorized to a perceptual output. The optical result is ‘automatic’ in the sense that there are no mental processes involved in the reflection of light waves from the articulators to the eye: this process is predictable on the basis of physical laws, even though this is a complex process and even though the optical input will differ for perceivers looking from different angles and under different light conditions (cf. footnote 36 below).

At first sight, a major difference between the two modalities should exist since we can directly perceive the articulators by vision but not by audition. This difference is enhanced by the fact that the sign language articulators are external to the body: they are less hidden from view than most of the spoken language articulators. If this is so, then why would there be a difference between the perceptual specification and the articulatory output?

In part this difference between vision and audition is only apparent. Indeed, the sign language articulators are more external to the body than the vocal articulators, and in principle they can be seen. However, the perception of the arms and hands is in no sense ‘direct’ (see Dretske 1990 for a discussion on this term in visual perception research). We can only see part of the arms and hands at a time, seeing the sides of the articulator that are closest to us. Depending on the state of the shoulder, elbow, and wrist, the wrist could be hidden from view, for example, and we unconsciously infer that it is still there, and possibly what position it is in.

More pertinent to sign language forms, I argue that there is indeed a difference between perceptual and articulatory representations in sign language (just as in

spoken language) because it is only a small part of the articulator that has perceptual relevance. This is also the position implicit in most sign language work, which has claimed that various properties of the hand are phonologically relevant, whereas the arm serves only as an aid in the realization of some of those properties of the hand.32 For example, take the SLN sign SAY, as it appears on one of the SLN dictionary CD- ROMS (NSDSK 1996), illustrated in Figure 2.6.33

Figure 2.6

The dictionary form of the sign SAY

A simple phonological analysis of this sign would say that an extended index finger whose palm side faces the body moves from the chin forwards. Further phonological decompositions can be added to analyze the extended index finger for example or the way the movement is represented, but none of these would include information about the state of the wrist, elbow or shoulder joint. These joints work together to articulate the location ‘chin’ for this particular ‘handshape’ (extended index finger), but a description of their position and movement is not included in the phonological description.

In this thesis, I will assume that the relevant level of articulation included in the model is the state of the upper extremity joints over time.34_{Whether or not this is} actually the level that the motor control system controls is not relevant at this point. It may be the case that at some level it is rather muscle groups or individual muscles

32_{The arm has been referred to in phonological models in two ways. Firstly, in all models of all sign} languages different locations on one arm (such as the upper arm, the elbow, the inside of the elbow, etc.) can serve as a place of articulation. In this case, the arm functions like the upper body and the head, which are typical places of articulation in sign language. Secondly, in a very limited set of signs the forearm can serve as part of the active articulator together with the hand. For example, Stokoe (1960) included a category ‘forearm prominent’ in his phonological analysis of ASL.

33_{The two video frames are reproduced with permission of the Dutch Foundation for the Deaf and Hard-} of-hearing Child (NSDSK) and BSL Computer Consultancy.

34_{The ‘upper extremity’ is the anatomical term for the arm and hand together; similarly, the leg and foot} together are referred to as the ‘lower extremity’. See Appendix A for a discussion of some of the anatomical and physiological terminology pertinent to sign language.

that need to be activated. The level of joints is relevant to the model because it is differences in joint states that have an optical effect that can be perceived.35

For the above realization of the sign SAY, the different representations in the model would be as in (2.1). The arrows in the articulatory description indicate movement from beginning to end position. The different joints and their reference positions are described in Appendix A.

(2.1) Representations of one realization of SAY • Perceptual specification

handshape: extended index finger orientation: palm side facing the location location: at the chin

movement: straight ahead • Articulatory output

shoulder: rotated inwards 45 degrees, extended 20 degrees elbow: flexed 135 degrees à flexed 100 degrees

forearm: supinated 90 degrees wrist: flexed 40 degrees à extended index MCP: flexed 15 degrees index PIP: extended

index DIP: extended

MCP, PIP & DIP of middle, ring & pinkie: flexed 90 degrees thumb CMC: flexed 20 degrees

thumb MCP: flexed 45 degrees thumb IP: extended

• Optical output —

• Perceptual output

handshape: extended index finger orientation: palm side facing the location location: at the chin

movement: straight ahead

Because of its extreme complexity, the optical output is left unspecified here; since it is the automatic result of the articulatory output, I equate it with all of the

35_{It could be argued that, especially in the face, muscle tension can also be perceived, but I claim that for} the manual component of signs muscle tension always has an effect on joint states or on movement speed as well, and that these are the primary visible correlates of muscle tension. This will become relevant in Chapter 3, where I argue that tenseness is phonologically specified for the whole sign, and not for handshape or movement individually.

properties of the articulation that can be perceived.36_{Thus, it includes information} on many properties that the perceptual output has abstracted away from, such as the orientation of the upper arm, the location of the elbow, the position of the thumb, etc. Also note that the categories present in the perceptual output cannot be directly mapped onto articulatory categories. For example, the perceived movement category in this sign is ‘forward movement’, and not ‘elbow extension’. Whether elbow extension actually leads to an optic effect that can perceived as forward movement depends on the state of the shoulder. If the shoulder is hyperextended, then the direction of movement of the hand will be downward rather than forward.

There is no principled reason why articulatory categories themselves cannot be perceptual categories. In fact, the visibility of the sign language articulators would lead us to expect that sign language makes frequent use of phonological distinctions such as ‘wrist bent 90 degrees’, in which the articulatory and perceptual categories would coincide. As was already noted above, in most phonological descriptions to date the focus has been on the hand. It is not surprising then that the features describing handshape have often directly referred to the states of various joints of the fingers. For example, the perceptual description of the handshape is ‘extended index finger’, which could be seen as a direct paraphrase of ‘all index finger joints extended’. I will argue in the coming chapters that even for handshape, the overlap between perception and articulation is only apparent. Data from variation in the production of SLN are adduced in later chapters which lead to the conclusion that perceptual features of handshape abstract away from the states of the different joints of the hand, and furthermore, that finger joints can be used to articulate features of orientation and location.

The source of phonetic variation in this model is the production grammar: on the basis of one fixed perceptual specification it can generate many different articulatory representations. In the above example of SAY, the illustration in Figure 2.6 constituted one single realization of the sign. Again, this is due to the nature of sign language articulation and perception: we can see the articulators on an illustration, so that we get quite some (but not all) information about perceptual and articulatory details of the sign at one point in time. With the exception of the lips, lower jaw and sometimes the tongue tip, the configuration of most spoken language articulators is invisible even in a photograph, let alone in an IPA transcription, and we cannot hear the state of the articulators either. The sign SAY can have an infinite number of different articulations, though, most of which are correctly recognized as SAY and some of which may not be correctly recognized. As an example, let us focus on one aspect of its perceptual specification, viz. the handshape.

The perceptual specification says ‘extended index finger’, while nothing is said about the other fingers. For sake of ease, I will ignore the position of the thumb here,

36_{Note that very different optical inputs will result from each articulation for different perceivers, since} the angle with which they view the signer will necessarily differ for all perceivers. I will assume for now that the low level visual processes that abstract away from these highly complex and variable optical inputs at this point are irrelevant to the model and common to all visual perception processes. Ultimately, it is the task of the perception grammar to achieve such normalization processes, which resemble speaker normalization that will also have to be accounted for by the perception grammar.

and focus on the position of the other three fingers. In the literature on handshape, one of the earliest generalizations was that only one set of fingers can be specified for shape (the selected fingers), the shape of the non-selected fingers being predictable (Mandel 1981). This predictable state of the unselected fingers was spelled out as ‘either all closed or all extended’ (for example, see Mandel 1981), and its value was argued to depend on perceptual clarity: if one finger is selected, both its selection and its shape (extended, curved, clawed, etc.) will be easiest to perceive if all the other fingers are invisible.37_{For articulation, however, it will be easiest to} minimize the amount of flexion for each finger, the easiest form being one in which the fingers are in rest position. The actual rest position of the fingers will depend on the state of the wrist, among other things, but I will assume for now that it in rest position all finger joints are bent at about 30 degrees. Full closure of the fingers (such that they are folded into the palm) would mean approximately 90 degree flexion at all joints of the middle, ring, and pinkie fingers. Some of these possible articulations are included in the scheme in (2.2). The following conventions of the Functional Phonology model are adopted here and in Chapter 5: perceptual specifications are given between vertical lines, articulatory outputs are given between square brackets, and perceptual outputs are given between slanted lines. (In Chapters 3 and 4, conventional square brackets are used for the phonological perceptual features.)

(2.2) Different articulations of one realization of SAY

| extended index |

↓

PRODUCTION GRAMMAR

↓

[ index extended, other [ index extended, [ index extended, joints flexed 90 deg ] other joints extended ] other joints relaxed]

⇓

[optical output]

↓

PERCEPTION GRAMMAR

↓

/ extended index / / four fingers extended / / extended index /

There are three possible articulations of the fingers in this case, two of which are categorized as a perceptual output that matches the perceptual specification. One of these two (the leftmost) matches the exact form in Figure 2.6. The three different

37_{In signs with a so-called ‘aperture’ specification, the selected finger(s) are always flexed to a certain} extent, in combination with an opposed thumb. In these cases, the non-selected fingers (typically the middle, ring and pinkie fingers) are often extended, but not always. For ASL, Mandel 1981 argues for a phonological distinction between extended (as in FINDASL) and non-extended (as in BIRDASL) non-selected

fingers in such ‘aperture’ signs. Van der Kooij (1998) argues that in SLN there is no such phonological distinction.

forms are illustrated in Figure 2.7, as seen from the perspective of an addressee facing the signer.

i. extended index ii. all four fingers extended iii. index extended, proximal phalanges of other fingers visible

Figure 2.7

Three articulations of | extended index |

The choice between these different possibilities is carried out by two sets of optimality-theoretic constraints: one set of constraints evaluates the articulatory effort involved in the articulatory outputs, while a set of faithfulness constraints evaluates the extent to which the perceptual output matches the perceptual specification. As in other OT analyses, the respective ranking of the constraints (in columns) in the two sets determines the winning output; the constraints are ranked from highest (left) to lowest (right). Constraint violations are indicated by asterisks, while shaded cells indicate that a constraint is not active in deciding the status of the candidate listed in that row. ‘Fatal’ constraint violations are marked with an exclamation mark, and a pointing finger in the leftmost column indicates the winning candidate.38_{In order to have the winning output match the form in Figure} 2.6, a constraint ranking is proposed in (2.4); the constraints in the tableau are defined in (2.3). In this and further tableaux, each row lists one specific pair of articulatory output and a matching perceptual output, because they are complementary aspects of phonetic form which the constraint set evaluates.

38_{For readers who are not familiar with Optimality Theory, introductions include Prince & Smolensky} (1993), Archangeli (1997), and Kager (1999).

(2.3) Constraint definitions

FAITH(selected fingers): the value in the perceptual output for selected fingers should be identical to that in the perceptual specification FAITH(unselected fingers): the value in the perceptual output for unselected

fingers should be identical to that in the perceptual specification *ART(unselected fingers: closed): do not make a gesture that leads to closed

unselected fingers

*ART(unselected fingers: extended): do not make a gesture that leads to fully extended unselected fingers

(2.4) Tentative constraint ranking for the dictionary articulation of SAY | extended index finger,

unselected fingers fully closed |

FAITH (selected fingers) FAITH (unselected fingers) *ART (unselected fingers: closed) *ART (unselected fingers: extended)

?

[index ext., other finger joints fl. 90 deg.], /ext. index/

= Fig. 2.7i

*

[index ext., other finger

joints extended], /four fingers extended/

= Fig. 2.7ii

*!

*

[index extended, other joints relaxed], /ext. index/

= Fig. 2.7iii

*!

In this set of ranked constraints, the faithfulness constraints comparing the realization of the unselected fingers outrank the articulatory constraints that aim to minimize the amount of movement. The winning candidate, then, is the top one, in which articulatory effort has been put into making a maximum distinction between the shape of the selected finger and that of the unselected fingers. Variation can be the result of a different constraint ranking. In careful articulations as in Figure 2.7, which is taken from an SLN dictionary, FAITH constraints will assume a relatively high rank, in order to minimize perceptual confusion. In less careful articulations, FAITH constraints will be ranked relatively low, leading to a winning candidate that minimizes articulatory effort while still leading to the correct perception of the selected finger (the index) and its shape (extended). This is illustrated in (2.5).

(2.5) Constraint ranking for a less careful articulation of SAY | extended index finger | FAITH

(selected fingers) *ART (unselected fingers: closed) *ART (unselected fingers: extended) FAITH (unselected fingers) [index ext., other

finger joints fl. 90 deg.], /ext. index/ = Fig. 2.7i

*!

[index ext., other

finger joints extended], /four fingers extended/ = Fig. 2.7ii

*!

*

?

[index extended, other finger joints relaxed], /ext. index/

= Fig. 2.7iii

*

The tentative nature of the above example will be clear. In order to carry out serious analyses in this model that compare different phonetic variants of the same phonological form, we need a great deal of knowledge that is currently missing, including the following.

(2.6) Requirements for a functional phonology analysis

• a set of perceptual features that can serve as perceptual specifications • a description of the production grammar: which sets of joint

configurations are used to articulate a given perceptual specification? • a description of the perception grammar: how are all perceivable events

in the signal categorized as a perceptual output?

• a set of articulatory constraints: which level of articulation do they refer to, and how fine-grained are they?

• a set of faithfulness constraints that evaluate the match between perceptual specification and perceptual output

This is obviously a tremendous task. In this thesis, my main aim is to establish what the set of perceptual features is (Chapter 3), and what different phonetic manifestations of these features may be, both in perceptual and articulatory terms (Chapter 4). A set of features and feature values will be proposed in the next chapter. This set corresponds to a part of what is traditionally conceived of as a phonological model: it is a list of features, but does not include their hierarchical organization as in feature geometries (see Clements 1985, McCarthy 1988, van der Hulst 2000). Some data from articulatory variation will be presented as early as in Chapter 3, in order to motivate some of the choices for features. In Chapter 4 I will describe several studies that were performed to obtain an overview of different articulatory realizations of different types of movement, viz. changes in handshape,

orientation, and location. In Chapter 5 I will show how these variable articulations can be accounted for using the phonetic implementation model outlined above. The goal will not be to provide a very detailed formalization of all the variation found, as this would involve too many arbitrary choices in establishing the different components of the model. Rather, the goal is to illustrate the nature of the variation: in line with the model proposed by Boersma (1998), I claim that phonetic variation of lexical items consists of articulatory variants of a single perceptual specification.

3.1 Introduction

This chapter proposes a set of phonological features for SLN. This set has partly been developed in work together with Harry van der Hulst and Els van der Kooij; I will loosely refer to this collaboration as the ‘Leiden model’ of sign language

In document Phonetic implementation of phonological categories in Sign Language of the Netherlands (Page 50-62)