
We are investigating the very complex interpretation relation that obtains between certain structured kinds of sounds and certain structured kinds of meanings; our eventual goal is to define it in a generative fashion. At the very least, we must have some notion of identity that tells us whether two signs sound the same and/or mean the same. The key idea is that we actually have access to more information, namely, whether two utterances are partially similar in form and/or meaning. To use Bloomfield’s original examples:

A needy stranger at the door says I’m hungry. A child who has eaten and merely wants to put off going to bed says I’m hungry. Linguistics considers only those vocal features which are alike in the two utterances ... Similarly, Put the book away and The book is interesting are partly alike (the book).

That the same utterance can carry different meanings at different times is a fact we shall not explore until we introduce disambiguation in Chapter 6; the only burden we now place on the theory of meanings is that it be capable of (i) distinguishing meaningful from meaningless and (ii) determining whether the meanings of two utterances share some aspect. Our expectations of the observational theory of sound are similarly modest: we assume we are capable of (i′) distinguishing pauses from speech and (ii′) determining whether the sounds of two utterances share some aspect. We should emphasize at the outset that the theory developed on this basis does not rely on our ability to exercise these capabilities to the extreme. We have not formally defined what constitutes a pause or silence, though it is evident that observationally such phenomena correspond to very low acoustic energy when integrated over a period of noticeable duration, say 20 milliseconds. But it is not necessary to be able to decide whether a 19.2 millisecond stretch that contains exactly 1.001 times the physiological minimum of audible sound energy constitutes a pause or not. If this stretch is indeed a pause we can always produce another instance, one that will have a

3.1 Phonemes 25

significantly larger duration, say 2000 milliseconds, and containing only one-tenth of the previous energy. This will show quite unambiguously that we had two utterances in the first place. If it was not a pause, but rather a functional part of sound formation such as a stop closure, the new ‘utterances’ with the artificially interposed pause will be deemed ill-formed by native speakers of the language. Similarly, we need not worry a great deal whether Colorless green ideas sleep furiously is meaningful, or what exactly it means. The techniques described here are robust enough to perform well on the basis of ordinary data without requiring us to make ad hoc decisions in the edge cases. This robustness comes from the fact that when viewed as a probabilistic ensemble, the edge cases have very little weight (see Chapter 8 for further discussion).
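As an illustration only, the observational notion of a pause (very low acoustic energy integrated over a noticeable duration) might be sketched in Python as follows. The function name `find_pauses`, the fixed 20 ms window, and the numeric energy threshold are all assumptions made for this sketch; the theory itself deliberately avoids committing to an exact cutoff.

```python
def find_pauses(samples, sample_rate, window_ms=20.0, energy_threshold=1e-4):
    """Return (start_s, end_s) spans in which every window has low mean energy.

    window_ms and energy_threshold are hypothetical values chosen for
    illustration; they are not prescribed by the text.
    """
    win = max(1, int(sample_rate * window_ms / 1000))  # samples per window
    n_windows = len(samples) // win                    # ignore a ragged tail
    pauses, start = [], None
    for k in range(n_windows):
        i = k * win
        # Mean squared amplitude stands in for "integrated acoustic energy".
        energy = sum(x * x for x in samples[i:i + win]) / win
        if energy < energy_threshold:
            if start is None:
                start = i                              # a quiet run begins
        elif start is not None:
            pauses.append((start / sample_rate, i / sample_rate))
            start = None                               # the quiet run ends
    if start is not None:                              # quiet until the end
        pauses.append((start / sample_rate, n_windows * win / sample_rate))
    return pauses


# Toy signal at 1000 Hz: 100 ms of sound, 200 ms of silence, 100 ms of sound.
signal = [1.0] * 100 + [0.0] * 200 + [1.0] * 100
print(find_pauses(signal, 1000))  # → [(0.1, 0.3)]
```

A borderline stretch near the threshold may be classified either way by such a detector, which is exactly the kind of edge case the argument above says need not be decided.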

The domain of the interpretation relation $I$ is the set of forms $F$, and the codomain is the set of meanings $M$, so we have $I \subset F \times M$. In addition, we have two overlap relations, $O_F \subset F \times F$ and $O_M \subset M \times M$, that determine partial similarity of form and meaning respectively. $O_F$ is traditionally divided into segmental and suprasegmental overlaps. We will discuss mostly segmental overlap here and defer suprasegmentals such as tone and stress to Section 3.3 and Section 4.1, respectively. Since speech happens in time, we can define two forms $\alpha$ and $\beta$ as segmentally overlapping if their temporal supports as intervals on the real line can be made to overlap, as in the "the book" example above. In the segmental domain at least, we therefore have a better notion than mere overlap: we have a partial ordering defined by the usual notion of interval containment. In addition to $O_F$, we will therefore use sub- and superset relations (denoted by $\subset_F$, $\supset_F$) as well as intersection, union, and complementation operations in the expected fashion, and we have

$$\alpha \cap_F \beta \neq \emptyset \Rightarrow \alpha \, O_F \, \beta \qquad (3.1)$$

In the domain of $I$, we find obviously complex forms such as a full epic poem and some that are atomic in the sense that

$$\forall x \subset_F \alpha : x \notin \mathrm{dom}(I) \qquad (3.2)$$

These are called minimum forms. A form that can stand alone as an utterance is a free form; the rest (e.g. forms like ity or al as in electricity, electrical), which cannot normally appear between pauses, are called bound forms.

Typically, utterances are full phrases or sentences, but when circumstances are right, e.g. because a preceding question sets up the appropriate context, forms much smaller than sentences can stand alone as complete utterances. Bloomfield (1926) defines a word as a minimum free form. For example, electrical is a word because it is a free form (it can appear e.g. as an answer to the question What kind of engine is in this car?) and it cannot be decomposed further into free forms (electric would be free but al is bound). We will have reason to revise this definition in Chapter 4, but for now we can provisionally adopt it because in defining phonemes it is sufficient to restrict ourselves to free forms.

For the rest of this section, we will only consider the set of words $W \subset F$, and we are in the happy position of being able to ignore the meanings of words entirely.

We may know that forms such as city and velocity have nothing in common as far as their meanings are concerned and that we cannot reasonably analyze the latter as containing the former, but we also know that the two rhyme, and as far as their forms are concerned velocity = velo.city. Similarly, velo and kilo share the form lo so we can isolate ve, ki, and lo as more elementary forms.

In general, if $p \, O \, q$, we have a nonempty $u$ such that $p = aub$, $q = cud$.

$a, b, c, d, u$ will be called word fragments obtained from comparing $p$ and $q$, and we say $p$ is a subword of $q$, denoted $p \prec q$, if $a = b = \lambda$ (the empty string). We denote by $\tilde{W}$ the smallest set containing $W$ and closed under the operation of taking fragments; $\tilde{W}$ contains all and only those fragments that can be obtained from $W$ in finitely many steps.
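The closure under fragment-taking can be sketched in Python under a strong simplifying assumption: forms are modeled as plain strings, and two forms count as overlapping whenever they share a nonempty substring. In the text the overlap judgments come from an informant, not from string matching, so the function names `common_substrings`, `fragments`, and `fragment_closure` and the string model itself are illustrative assumptions only.

```python
from itertools import combinations


def common_substrings(p, q):
    """All nonempty u such that p = a+u+b and q = c+u+d for some a, b, c, d."""
    def subs(w):
        return {w[i:j] for i in range(len(w)) for j in range(i + 1, len(w) + 1)}
    return subs(p) & subs(q)


def fragments(p, q):
    """Nonempty fragments a, b, c, d, u from every way of comparing p and q."""
    out = set()
    for u in common_substrings(p, q):
        for w in (p, q):
            start = w.find(u)
            while start != -1:                 # every occurrence of u in w
                out.update({w[:start], u, w[start + len(u):]})
                start = w.find(u, start + 1)
    out.discard("")                            # the empty string is no fragment
    return out


def fragment_closure(words):
    """Smallest superset of `words` closed under taking fragments."""
    closure = set(words)
    while True:
        new = set()
        for p, q in combinations(closure, 2):
            new |= fragments(p, q)
        if new <= closure:                     # nothing new: fixed point reached
            return closure
        closure |= new


closure = fragment_closure({"velo", "kilo"})
print({"ve", "ki", "lo"} <= closure)  # → True
```

Because every fragment is a substring of some original word, the loop must terminate. Under this crude string model the closure quickly grows toward all substrings of the input words, which is coarser than what informant judgments of partial similarity would yield.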

By successively comparing forms and fragments, we can rapidly extract a set of short fragments $P$ that is sufficiently large for each $w \in \tilde{W}$ to be a concatenation of elements of $P$ and sufficiently small that no two elements of it overlap. A phonemic alphabet $P$ is therefore defined by (i) $\tilde{W} \subset P^*$ and (ii) $\forall p, q \in P : p \, O_F \, q \Rightarrow p = q$. To forestall confusion, we emphasize here that $P$ consists of mental rather than physical units, as should be evident from the fact that the method of obtaining them relies on human oracles rather than on some physical definition of (partial) similarity. The issue of relating these mental units to physical observables will be taken up in Chapters 8 and 9.
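Conditions (i) and (ii) can be checked mechanically once forms are modeled as strings. The following is a minimal sketch under that assumption, with overlap again approximated by "shares a nonempty substring" (equivalently, shares at least one character); the names `is_concatenation`, `overlaps`, and `is_phonemic_alphabet` are hypothetical.

```python
from itertools import combinations


def is_concatenation(w, alphabet):
    """True if w can be segmented into elements of alphabet (condition (i))."""
    reachable = [True] + [False] * len(w)      # reachable[k]: w[:k] is segmentable
    for end in range(1, len(w) + 1):
        reachable[end] = any(
            p and len(p) <= end
            and reachable[end - len(p)]
            and w[end - len(p):end] == p
            for p in alphabet
        )
    return reachable[len(w)]


def overlaps(p, q):
    """String stand-in for O_F: p and q share a nonempty substring."""
    return bool(set(p) & set(q))


def is_phonemic_alphabet(alphabet, forms):
    """Check (i) every form is in P* and (ii) distinct elements never overlap."""
    covers = all(is_concatenation(w, alphabet) for w in forms)
    disjoint = all(not overlaps(p, q) for p, q in combinations(alphabet, 2))
    return covers and disjoint


print(is_phonemic_alphabet(set("veklio"), {"velo", "kilo", "lo", "l"}))  # → True
print(is_phonemic_alphabet({"ve", "lo", "l"}, {"velo"}))                 # → False
```

The second call fails condition (ii): lo and l overlap without being identical.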

We emphasize here that the procedure for finding $P$ does not depend on the existence of an alphabetic writing system. All it requires is an informant (oracle) who can render judgments about partial similarity, and in practice this person can just as well be illiterate. Although the number of unmapped languages is shrinking, to this day the procedure is routinely carried out whenever a new language is encountered. In some sense (to be made more precise in Section 7.3), informant judgments provide more information than is available to the language learner: the linguist’s discovery procedure is driven both by the positive (grammatical) and negative (ungrammatical) data, while it is generally assumed that infants learning the language only have positive data at their disposal, an assumption made all the more plausible by the wealth of language acquisition research indicating that children ignore explicit corrections offered by adults.

For an arbitrary set $W$ endowed with an arbitrary overlap relation $O_F$, there is no guarantee that a phonemic alphabet exists; for example, if $W$ is the set of intervals $[0, 2^{-n}]$ with overlap defined in the standard manner, $P = \{[2^{-(i+1)}, 2^{-i}] \mid i \geq 0\}$ will enjoy (ii) but not (i). In actual word inventories $W$ and their extensions $\tilde{W}$, we never see the phenomenon of an infinite descending chain of words or fragments $w_1, w_2, \ldots$ such that each $w_{i+1}$ is a proper part of $w_i$, nor can we find a large number (say $> 2^8$) of words or fragments such that no two of them overlap. We call such statements of contingent facts about the real world postulates to distinguish them from ordinary axioms, which are not generally viewed as subject to falsification.

Postulate 3.1.1 Foundation. Any sequence of words and word fragments $w_1, w_2, \ldots$ such that each $w_{i+1} \prec w_i$, $w_{i+1} \neq w_i$, terminates after a finite number of steps.


Postulate 3.1.2 Dependence. Any set of words or word fragments $w_1, w_2, \ldots, w_m$ contains two different but overlapping words or fragments for any $m > 2^8$.

From these two postulates both the existence and uniqueness of phonemic alphabets follow. Foundation guarantees that every $w \in W$ contains at least one atom under $\prec$, and dependence guarantees that the set $P$ of atoms is finite. Since different atoms cannot overlap, all that remains to be seen is that every word of $\tilde{W}$ is indeed expressible as a concatenation of atoms. Suppose indirectly that $q$ is a word or fragment that could not be expressed this way: either $q$ itself is atomic or we can find a fragment $q_1$ in it that is not expressible. Repeating the same procedure for $q_1$, we obtain $q_2, \ldots, q_n$. Because of Postulate 3.1.1, the procedure terminates in an atomic $q_n$. But by the definition of $P$, $q_n$ is a member of it, a contradiction that proves the indirect hypothesis false.
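The argument above can be mirrored on a finite, string-based toy model: atoms are the elements of a fragment set that contain no other element as a proper subword, and every form is then read off as a concatenation of atoms. This is only a sketch; the names `atoms` and `decompose` are hypothetical, and the subword relation is approximated by Python's substring test.

```python
def atoms(fragment_set):
    """Elements with no proper subword inside fragment_set (minimal under the subword order)."""
    return {
        w for w in fragment_set
        if w and not any(x and x != w and x in w for x in fragment_set)
    }


def decompose(w, atom_set):
    """Segment w left to right into atoms; return None if that is impossible.

    When no two distinct atoms overlap, at most one atom can match at each
    position, so the left-to-right scan is deterministic.
    """
    parts, i = [], 0
    while i < len(w):
        match = next((a for a in atom_set if a and w.startswith(a, i)), None)
        if match is None:
            return None
        parts.append(match)
        i += len(match)
    return parts


closure = {"velo", "kilo", "ve", "ki", "lo"}
P = atoms(closure)
print(sorted(P))             # → ['ki', 'lo', 've']
print(decompose("velo", P))  # → ['ve', 'lo']
```

On a finite set every descending chain of proper subwords ends trivially, so Foundation holds automatically here; the postulate carries real content only for the open-ended inventories of actual languages.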

Discussion. Nearly every communication system that we know of is built on a finite inventory of discrete symbols. There is no law of nature that would forbid a language to use measure predicates such as tall that take different vowel lengths in proportion to the tallness of the object described. In such a hypothetical language, we could say It was taaaaaaall to express the fact that something was seven times as tall as some standard of comparison, and It was taaall to express that it was only three times as tall. The closest thing we find to this is in Arabic/Persian calligraphy, where joining elements are sometimes sized in accordance with the importance of a word, or in Web2.0-style tag clouds, where font size grows with frequency. Yet even though analog signals like these are always available, we find that in actual languages they are used only to convey a discrete set of possible values (see Chapter 9), and no communication system (including calligraphic text and tag clouds) makes their use obligatory.

Postulates 3.1.1 and 3.1.2 go some way toward explaining why discretization of continuous signals must take place. We can speculate that foundation is necessitated by limitations of perception (it is hard to see how a chain could descend below every perceptual threshold), and dependence is caused by limitations of memory (it is hard to see how an infinite number of totally disjoint atomic units could be kept in mind). No matter how valid these explanations turn out to be, the postulates have a clear value in helping us to distinguish linguistic systems from nonlinguistic ones. For example, the dance of bees, where the direction and size of figure-8 movements is directly related to the direction and distance from the hive to where food can be collected (von Frisch 1967), must be deemed nonlinguistic, while the genetic code, where information about the composition of proteins is conveyed by DNA/RNA strings, can at least provisionally be accepted as linguistic.

Following the tradition of Chomsky (1965), memory limitations are often grouped together with mispronunciations, lapses, hesitations, coughing, and other minor errors as performance factors, while more abstract and structural properties are treated as competence factors. Although few doubt that some form of the competence vs. performance distinction is valuable, at least as a means of keeping the noise out of the data, there has been a great deal of debate about where the line between the two should be drawn. Given the orthodox view that limitations of memory and perception are matters of performance, it is surprising that such a deeply structural property as the existence of phonemic alphabets can be derived from postulates rooted in these limitations.