Copying in Natural Languages, Context Freeness, and Queue Grammars

(1)COPYING IN N A T U R A L L A N G U A G E S , CONTEXT-FREENESS, A N D QUEUE G R A M M A R S A l e x i s Manaster-Ramer U n i v e r s i t y of Michigan 2236 Fuller Road #108 A n n Arbor, MI 48105. ABSTRACT. Pullum (1984b) likewise heaped scorn on my argument that English reshmuplicative constructions show non-CFness, but he accepted (1984a; 1984b) Culy's (1985) argument about noun reduplication in Bambara and Shieber's (1985) one about Swiss German cross-serial constructions of causative and perception verbs and their objects. Gazdar and Pullum (1985) also cite these two, as well as an argument by Carlson (1983) about verb phrase reduplication in Engenni. They also refer to my discovery of the X o r n o X ... construction in English I and mention that "Alexis ManasterRamer ... in unpublished lectures finds reduplication constructions that appear to have no length bound in Polish, Turkish, and a number of other languages". While they do not refer to my 1983 reshmuplication argument, which they presumably still reject, the Turkish construction they allude to was cited in my 1983 paper and is similar to the English reshmuplication in form as well as function (see below).. The documentation of (unbounded-len~h) copying and cross-serial constructions in a few languages in the recent literature is usually taken to mean that natural languages are slightly context-sensitive. However, this ignores those copying constructions which, while productive, cannot be easily shown to apply to infinite sublanguages. To allow such finite copying constructions to be taken into account in formal modeling, it is necessary to recognize that natural languages cannot be realistically represented by formal languages of the usual sort. Rather, they must be modeled as families of formal languages or as formal languages with indefinite vocabularies. Once this is done, we see copying as a truly pervasive and fundamental process in human language. Furthermore, the absence of mirror-image constructions in human languages means that it is not enough to extend Context-free Grammars in the direction of context-sensitivity. Instead, a class of grammars must be found which handles (context-sensitive) copying but not (context-free) mirror images. This suggests that human linguistic processes use queues rather than stacks, making imperative the development of a hierarchy of Queue Grammars as a counterweight to the Chomsky Grammars. A simple class of Context-free Queue Grammars is introduced and discussed.. In any case, the acceptance of even one case of nonCFness in one natural language by the only active advocates of the CF position would seem to suffice to remove the issue from the agenda. Any additional arguments, such as Kac (to appear), Kac, Manaster-Ramer, and Rounds (to appear), and Manaster-Ramer (to appear a; to appear b) may appear to be no more than flogging of dead horses. However, as I argued in Manaster-Ramer (1983) and as recent work (ManasterRamer, to appear a; Rounds, Manaster-Ramer, and Friedman, to appear) shows ever more clearly, this conception of the issue (viz., Is there one natural languages that is weakly noncontext-free?) makes very little difference and not much sense.. Introduction The claim that at least some human languages cannot be described by a Context-free Grammar no matter how large or complex has had an interesting career. In the late 1960's it might have seemed, given the arguments of Bar-Hillel and Shamir (1960) about r e s p e c t i v e l y coordinations in English, Postal (1964) about reduplication-cum-incorporation of object noun stems in Mohawk, and Chomsky (1963) about English comparative deletion, that this claim was firmly established.. First of all, if non-CFness is so hard to find, then it is presumably linguistically marginal. Second, weak generative arguments cannot be made to work for natural languages, because of their high degree of structural ambiguity and the great difficulty in excluding every conceivable interpretation on which an apparently ungrammatical string might turn o u t - o n reflection--to be in the language. Third, weak generative capacity is in any case not a very interesting property of a formal grammar, especially from a linguistic point of view, since linguistic models are judged by other criteria (e.g., natural languages might well be regular without this making CFGs any the more attractive as models for them). Fourth, results about the place of natural languages in the Chomsky Hierarchy seem to be should be considered in light of the fact that there is no reason to take the Chomsky Hierarchy as the appropriate formal space in which to look for them. Fifth, models of natural languages that are actually in use in theoretical, computational, and descriptive linguistics are - a n d always have been--only remotely related to the Chomsky Grammars, which means that results about the latter may be of little relevance to linguistic models.. Potentially serious--and at any rate embarrassing-problems with both the formal and the linguistic aspects of these arguments kept popping up, however (Daly, 1974; Levelt, 1974), and the partial fixes provided by Brandt Corstius (as reported in Levelt, 1974) for the r e s p e c t i v e l y arguments and by Langendoen (1977) for that as well as the Mohawk argument did not deter Pullum and Gazdar (1982) from claiming that "it seems reasonable to assume that the natural languages are a proper subset of the infinitecardinality CFL's, until such time as they are validly shown not to be". Two new arguments, Higginbotham's (1984) one involving s u c h that relativization and Postal and Langendoen's (1984) one about sluicing were dismissed on grounds of descriptive inadequacy by Pullum (1984a), who, however, suggested that the Langendoen and Postal (1984) argument about the doubling relativization construction may be correct (all these arguments deal with English).. 85.

(2) whole m a y be strictly s p e a k i n g regular. Note t h a t this a p p r o a c h holds t h a t it is impossible to d e t e r m i n e with a n y confidence t h a t a particular string qua string is u n g r a m m a t i c a l , b u t t h a t it m a y be possible to tell one construction from a n o t h e r , and t h a t the l a t t e r - - a n d not the f o r m e r - - i s the real b a s i s of all linguistic work, theoretical, computational, and descriptive.. A s I a r g u e d in 1983, we should go beyond piecemeal d e b u n k i n g of invalid a r g u m e n t s a g a i n s t CFGs and by t h e s a m e token it s e e m s to m e t h a t we m u s t go beyond piecemeal r e s t a t e m e n t s of s u c h a r g u m e n t s . R a t h e r , we should focus on g e n e r a l i s s u e s a n d ones t h a t h a v e implications for the modeling of h u m a n l a n g u a g e s . One s u c h issue is, it s e e m s to m e , the kind of context-sensitivity found in n a t u r a l l a n g u a g e s . It a p p e a r s t h a t the c o u n t e r e x a m p l e s to contextf r e e n e s s are all r a t h e r similar. Specifically, t h e y all s e e m to involve s o m e kind of cross-serial dependency, i.e., a dependency b e t w e e n the n t h e l e m e n t s of two or more substrings. T h i s - - u n l i k e the s t a t e m e n t t h a t n a t u r a l l a n g u a g e s a r e n o n c o n t e x t - f r e e - - m i g h t m e a n s o m e t h i n g if we k n e w w h a t kinds of models were appropriate for cross-serial dependencies. Given t h a t not e v e r y kind of context-sensitive construction is found in h u m a n l a n g u a g e s , it should be clear t h a t there is n o t h i n g to be gained by invoking the dubious slogan of context-sensitivity.. Finite C o p y i n g T h e c o u n t e r e x a m p l e s to c o n t e x t - f r e e n e s s in the literature h a v e all been claimed to crucially involve e x p r e s s i o n s of u n b o u n d e d length. This s e e m e d n e c e s s a r y in view of t h e fact t h a t a n upper bound on length would imply finiteness of the s u b s e t of s t r i n g s involved, which would as a r e s u l t be of no formal l a n g u a g e theoretic interest. However, it is often difficult to m a k e a case for u n b o u n d e d length, and the m a i n r e s u l t h a s been t h a t , even t h o u g h e v e r y linguist knows a b o u t reduplication, it s e e m e d nearly impossible to find an i n s t a n c e of reduplication t h a t could be u s e d to m a k e a formal a r g u m e n t a g a i n s t C F G s , even t h o u g h no one would ever u s e a CFG to describe reduplication.. A n o t h e r r e l e v a n t question is the centrality or peripherality of t h e s e constructions in n a t u r a l l a n g u a g e s . The r e l e v a n t literature m a k e s it a p p e a r t h a t t h e y are s o m e w h a t m a r g i n a l at best. This would explain the tortured history of t h e a t t e m p t s to show t h a t t h e y exist at all. However, this a p p e a r s to be wrong, at least w h e n we consider copying constructions. T h e r e q u i r e m e n t of full or n e a r identity of two or more s u b p a r t s of a sentence (or a discourse) is a v e r y widespread p h e n o m e n o n . In this paper, I will focus on the copying constructions precisely because t h e y a r e so c o m m o n in h u m a n l a n g u a g e s .. For, in addition to reduplications t h a t c a n apply to u n b o u n d e d l y long e x p r e s s i o n s , t h e r e is a m u c h better known class of reduplications exemplified by Indonesian pluralization of nouns. Here it is difficult to s h o w t h a t the reduplicated forms are infinite in n u m b e r , b e c a u s e compound n o u n s are not pluralized in the s a m e w a y , a n d ignoring compounding, it would s e e m t h a t the n u m b e r of fiouns is finite. However, this n u m b e r is v e r y large a n d m o r e o v e r it is probably not well defined. The class of n o u n s t e m s is open, a n d c a n be enriched by borrowing from foreign l a n g u a g e s a n d neologisms, and all of t h e s e s p o n t a n e o u s l y pluralize by reduplication.. In addition to such questions, which a p p e a r to focus on t h e linguistic side of things, t h e r e a r e also the more m a t h e m a t i c a l and conceptual problems involved in the whole e n t e r p r i s e of modeling h u m a n l a n g u a g e s in formal t e r m s . My own belief is t h a t both kinds of i s s u e s m u s t be solved in t a n d e m , since we c a n n o t know w h a t kind of formal models we w a n t until we know w h a t we are going to model, and we c a n n o t know w h a t h u m a n l a n g u a g e s are or a r e not like until we know hot, to r e p r e s e n t t h e m and w h a t to compare t h e m to. This p a p e r is intended as a contribution to this kind of work.. Rounds, M a n a s t e r - R a m e r , a n d F r i e d m a n (to appear) a r g u e t h a t facts like this m e a n t h a t a n a t u r a l l a n g u a g e should n o t be modeled as a formal l a n g u a g e b u t r a t h e r a s a family of l a n g u a g e s , each of which m a y be t a k e n a s a n a p p r o x i m a t i o n to a n ideal l a n g u a g e . In t h e case before us, we could a r g u e t h a t e a c h of t h e a p p r o x i m a t i o n s h a s only a finite n u m b e r of n o u n s , for e x a m p l e , b u t a different n u m b e r in different a p p r o x i m a t i o n s . This idea, related to the work of Yuri Gurevich on finite d y n a m i c models of c o m p u t a t i o n , allows u s to s t a t e t h e a r g u m e n t t h a t the existence of a n open class of reduplications is sufficient to show t h e i n a d e q u a c y of C F G s for t h a t f a m i l y of a p p r o x i m a t i o n s . T h e b a s i s of the a r g u m e n t is the observation t h a t while each of the a p p r o x i m a t e l a n g u a g e s could in principle h a v e a CFG, each such CFG would differ from the n e x t not only in the addition of a n e w lexical item b u t also in t h e addition of a n e w reduplication rule (for t h a t particular item).. Copying Dependencies The e x a m p l e s of copying (and other) c o n s t r u c t i o n s which h a v e figured in the g r e a t context-freeness debate h a v e all involved a t t e m p t s to show t h a t a whole (natural) l a n g u a g e is n o n c o n t e x t free. Now, while it is often e a s y to find a noncontext-free s u b s e t of s u c h a l a n g u a g e , it is not a l w a y s possible to isolate t h a t s u b s e t formally from t h e r e s t of t h e l a n g u a g e in such a w a y as to show t h a t t h e l a n g u a g e as a whole is noncontext-free. T h e r e is so m u c h a m b i g u i t y in n a t u r a l l a n g u a g e s t h a t it is strictly s p e a k i n g impossible to isolate any construction at the level of s t r i n g s , t h u s invalidating all a r g u m e n t s a g a i n s t C F G s or even Regular G r a m m a r s t h a t refer to w e a k generative capacity. However, t h e a r g u m e n t s can be reconstructed by m a k i n g u s e of the notion of classificatory capacity of formal g r a m m a r s , introduced in M a n a s t e r - R a m e r (to a p p e a r a) a n d M a n a s t e r R a m e r a n d Rounds (to appear). The classificatory capacity is the set of l a n g u a g e s g e n e r a t e d by the v a r i o u s s u b g r a m m a r s of a g r a m m a r , and if we a r e willing to a s s u m e t h a t linguists can tell which s e n t e n c e s in a l a n g u a g e exemplify the s a m e or different syntactic p a t t e r n s , t h e n we can u s u a l l y simply d e m o n s t r a t e that, e.g., no CFG can h a v e a s u b g r a m m a r g e n e r a t i n g all and only the s e n t e n c e s of s o m e particular construction if t h a t construction involves reduplication. This will s h o t ' the i n a d e q u a c y of CFGs, even if the s t r i n g s e t as a. To c a p t u r e w h a t is really going on, we require a g r a m m a r t h a t is the s a m e for each a p p r o x i m a t i o n modulo the lexicon. This g r a m m a r in a s e n s e g e n e r a t e s the infinite ideal, b u t actually each actual a p p r o x i m a t e g r a m m a r only h a s a finite lexicon and hence actually only g e n e r a t e s a finite n u m b e r of reduplications. In order to model t h e flexibility of the n a t u r a l l a n g u a g e vocabulary, we a s s u m e t h a t each m e m b e r of the f a m i l y h a s the s a m e g r a m m a r modulo the t e r m i n a l v o c a b u l a r y a n d the rules which i n s e r t t e r m i n a l s . A n o t h e r w a y of s t a t i n g this is t h a t the lexicon of I n d o n e s i a n is finite b u t of an indefinite size ( w h a t Gurevich calls " u n c o u n t a b l y finite"). A C F G would still h a v e to contain a s e p a r a t e rule for t h e plural of e v e r y n o u n and henc, would h a v e to be of an indefinite size. T h u s , with. 86.

(3) addition of a new noun, the g r a m m a r would h a v e to add a n e w rule. However, this would m e a n t h a t the g r a m m a r at a n y given time can only form the plurals of n o u n s t h a t h a v e already been learned. Since s p e a k e r s of the l a n g u a g e know in a d v a n c e how to pluralize u n f a m i l i a r nouns, this c a n n o t be true. R a t h e r the g r a m m a r at a n y given time m u s t be able to form plurals of n o u n s t h a t h a v e not y e t been learned. This in t u r n m e a n s t h a t a n indefinite n u m b e r of plurals can be formed by a g r a m m a r of a d e t e r m i n a t e finite size. Hence, in effect, the n u m b e r of rules for plural formation m u s t be smaller t h a n the n u m b e r of plural f o r m s t h a t can be generated, and this in t u r n m e a n s t h a t there is no CFG of Indonesian.. T h e s e are clause-level constructions, b u t we also find ones restricted to the p h r a s e level. (He) deliberates, deliberates, deliberates (all d a y long). (He worked slowly) t h e o r e m by theorem. (They form) a church within a church. (He debunks) t h e o r y after theory. Also r e l e v a n t are c a s e s w h e r e a copying dependency e x t e n d s across s e n t e n c e boundaries, as in discourses like:. This brings up a crucial issue, of which we are all p r e s u m a b l y a w a r e b u t which is u s u a l l y lost sight of in practice, n a m e l y , t h a t the w a y a m a t h e m a t i c a l model (in this case, formal l a n g u a g e theory) is applied to a physical or m e n t a l domain (in this case, n a t u r a l language) is a m a t t e r of utility and not itself subject to proof or disproof. F o r m a l l a n g u a g e theory deals with sets of s t r i n g s over well-defined finite vocabularies (also often called alphabets) such as the h a c k n e y e d {a, b}. It h a s been all too e a s y to fall into the trap of e q u a t i n g the formal l a n g u a g e theoretic notion of vocabulary (alphabet) with the linguistic notion of v o c a b u l a r y and likewise to confuse the formal l a n g u a g e theoretic notion of a string (word) over the v o c a b u l a r y (alphabet) with the linguistic notion of sentence.. A: She is fat. B: She is fat, m y foot. It is i n t e r e s t i n g t h a t s e v e r a l of t h e s e t y p e s are productive even t h o u g h they a p p e a r to be b a s e d on w h a t originally m u s t h a v e been m o r e restricted, idiomatic expressions. T h e p a t t e r n a X within a X, for example, is surely derived from the single e x a m p l e a state within a state, y e t h a s become quite productive. M a n y of t h e s e p a t t e r n s h a v e a n a l o g u e s in other languages. For example, the X after X construction a p p e a r s to involve quantification and this m a y be related to the fact that, for e x a m p l e , B a m b a r a u s e s reduplication to m e a n ' w h a t e v e r ' and S a n s k r i t to m e a n ' e v e r y ' (P~nini 8.1.4). English r e s h m u p l i c a t i o n h a s close a n a l o g u e s in m a n y l a n g u a g e s , including the whole D r a v i d i a n and Turkic l a n g u a g e families. Tamil kiduplication (e.g. pustakam kistakarn) and T u r k i s h meduplication (e.g., kitap mitap) are i n s t a n c e s of this, though the s e m a n t i c r a n g e is s o m e w h a t different. In both of these, the s e n s e is more like t h a t of English books and things, books and such, i.e., a combination of deprecation and e t c e t e r a n e s s r a t h e r t h a n the purely derisive function of English books shmoohs. The English X or no X ... p a t t e r n is v e r y similar to a Polish construction consisting of the form X (nominative) X (instrumental) ... in its r a n g e of applications. The repetition of a verb or verbal p h r a s e to deprecate excessive repetition or i n t e n s i t y of a n action s e e m s to be found in m a n y l a n g u a g e s as well.. However, the f u n d a m e n t a l fact about all known n a t u r a l l a n g u a g e s is the o p e n n e s s of a t least s o m e classes of words (e.g., n o u n s b u t p e r h a p s not prepositions or, in s o m e l a n g u a g e s , verbs), which can acquire n e w m e m b e r s t h r o u g h borrowing or t h r o u g h various processes of n e w formation, m a n y of t h e m a p p a r e n t l y not rule-governed, and which can also lose m e m b e r s , as words are forgotten. T h u s , the welldefined finite vocabularies of formal l a n g u a g e theory are not a v e r y good model of the vocabularies of n a t u r a l l a n g u a g e s . W h e t h e r we decide to introduce the notion of families of l a n g u a g e s or t h a t of u n c o u n t a b l y finite sets or w h e t h e r we r a t h e r choose to s a y t h a t the v o c a b u l a r y of a n a t u r a l l a n g u a g e is really infinite (being the set of all s t r i n g s over the s o u n d s or letters of the l a n g u a g e t h a t could conceivably be or become lexical i t e m s in it), we end up h a v i n g to conclude t h a t a n y l a n g u a g e which productively reduplicates some open word class to form s o m e g r a m m a t i c a l category c a n n o t h a v e a CFG.. I h a v e not tried here to s u r v e y the u s e s to which copying constructions a r e p u t in different l a n g u a g e s or even to d o c u m e n t fully their wide incidence, t h o u g h the e x a m p l e s cited should give s o m e indication of both. It does a p p e a r t h a t copying constructions are e x t r e m e l y c o m m o n and pervasive, and this in t u r n s u g g e s t s t h a t t h e y are central to m a n ' s linguistic faculties. W h e n we consider such additional facts as the frequency of copying in child l a n g u a g e , we m a y be t e m p t e d to take copying as one of the basic linguistic operations.. Copying in English It should now be noted t h a t reduplications (and reiterations generally) are e x t r e m e l y c o m m o n in n a t u r a l l a n g u a g e s . J u s t how c o m m o n follows from an inspection of the bewildering v a r i e t y of such constructions t h a t are found in English. All the e x a m p l e s cited here are productive t h o u g h t h e y m a y be of bounded length.. C o p i e s v s . m i r r o r images Linguistics shminguistics. The existence a n d the centrality of copying constructions poses interesting questions t h a t go beyond the i n a d e q u a c y of CFGs. For e x a m p l e , w h y should n a t u r a l l a n g u a g e s h a v e reduplications w h e n they lack m i r r o r - i m a g e constructions, which are context-free? This a s y m m e t r y (first noted in M a n a s t e r - R a m e r a n d Kac, 1985, and Rounds, M a n a s t e r R a m e r , and F r i e d m a n op. cit.) a r g u e s t h a t it is not e n o u g h to m a k e a small concession to context-sensitivity, as the s a y i n g goes. R a t h e r t h a n grudgingly c l a m b e r i n g up the C h o m s k y H i e r a r c h y towards Context-sensitive G r a m m a r s , we should consider going back down to R e g u l a r G r a m m a r s and striking. Linguistics or no linguistics, (I a m going home). A dog is a dog is a dog. Philosophize while the philosophizing is good! Moral is as moral does. Is s h e beautiful or is she beautiful?. 87.

(4) f u r t h e r problems with t h e s e g r a m m a r s . One w a y in which t h e y fail is t h a t t h e y a p p a r e n t l y c a n only g e n e r a t e two copies--or two cross-serially d e p e n d e n t s u b s t r i n g s - - w h e r e a s n a t u r a l l a n g u a g e s s e e m to allow m o r e (as in Grammar is grammar is grammar). This is similar to t h e limitation of Head G r a m m a r s a n d Tree Adjoining G r a m m a r s to g e n e r a t i n g no more t h a n four copies ( M a n a s t e r - R a m e r to a p p e a r a). However, a m o r e general class of Q u e u e G r a m m a r s a p p e a r s to be within r e a c h which will g e n e r a t e a n a r b i t r a r y n u m b e r of copies.. out in a different direction. The s i m p l e s t alternative proposal is a class of g r a m m a r s which intuitively h a v e the s a m e relation to q u e u e s t h a t C F G s h a v e to stacks. The idea, ~vhich I owe to Michael Kac, would be t h a t h u m a n linguistic processes m a k e little if a n y u s e of stacks and employ q u e u e s instead. Queue. Grammars. This s u g g e s t s t h a t C F G s a r e not j u s t i n a d e q u a t e as models of n a t u r a l l a n g u a g e s b u t inadequate in a particularly d a m a g i n g way. T h e y are not even the right point of departure, since t h e y not only u n d e r g e n e r a t e b u t also overgenerate. This leads to the idea of a h i e r a r c h y of g r a m m a r s whose relation to q u e u e s is like t h a t of the C h o m s k y G r a m m a r s to stacks. A q u e u e - b a s e d analogue to CFG is being developed, u n d e r t h e n a m e of C o n t e x t - f r e e Q u e u e G r a m m a r . The c u r r e n t version is allowed rules of the following form:. P e r h a p s m o r e serious is the fact t h a t C F Q G s a p p a r e n t l y c a n only g e n e r a t e copying constructions a t the cost of profligacy (as defined in Rounds, M a n a s t e r - R a m e r , and F r i e d m a n , to appear). The repair of this defect is less obvious, b u t it a p p e a r s t h a t the f u n d a m e n t a l idea of b a s i n g models of n a t u r a l l a n g u a g e s on q u e u e s r a t h e r t h a n stacks is not u n d e r m i n e d . R a t h e r , w h a t is a t i s s u e is t h e w a y in which information is entered into and retrieved from the queue. The CFQGs s u g g e s t a piecemeal process b u t the considerations cited here s e e m to a r g u e for a global one. A n u m b e r of f o r m a l i s m s with t h e s e properties are being explored.. A->a A - - > aB A -- >. aB...b. O n t h e other h a n d , it m a y be t h a t s o m e t h i n g m u c h like t h e simple C F Q G is a n a t u r a l w a y of c a p t u r i n g cross-serial dependencies in c a s e s other t h a n copying. To see e x a c t l y w h a t is involved, consider t h e difference b e t w e e n copying a n d other cross-serial dependencies. This difference h a s little to do with t h e form of t h e strings. R a t h e r , in t h e case of other cross-serial dependencies, t h e r e is a s y n t a c t i c a n d s e m a n t i c relation b e t w e e n t h e n t h e l e m e n t s of two or m o r e s t r u c t u r e s . For e x a m p l e , in ~ respectively construction involving a conjoined subject arid a conjoined predicate, e a c h conjunct of t h e former is s e m a n t i c a l l y combined with t h e corresponding conjunct of the latter. In t h e case of copying constructions, there is n o t h i n g analogous. The c o r r e s p o n d i n g p a r t s of the two copies do not b e a r a n y relations to e a c h other. T h u s it m a k e s s o m e s e n s e to build up the c o r r e s p o n d i n g p a r t s of cross-serial construction in a piecemeal fashion, b u t this a p p e a r s to be inapplicable in the c a s e of c o p y i n g constructions.. A - - > a...b A - - > ...B. W h a t e v e r a p p e a r s to the r i g h t of the three dots is p u t a t the end of the s t r i n g being rewritten. Otherwise, all definitions are a s in a corresponding restricted CFG. T h u s , the g r a m m a r S - > aS...a S - > bS...b S - - > a...a. S - - > b...b will g e n e r a t e t h e copying l a n g u a g e over {a,b} excluding the null s t r i n g a n d define derivations like the following: S ->. aSa ->. S ->. bSb - - > b a S b a - > b a a S b a a - - > b a a b S b a a b. In view of all t h e s e limitations, t h e C F Q G s m i g h t s e e m to be a n o n - s t a r t e r . However, their i m p o r t a n c e lies in t h e fact t h a t t h e y a r e the first step in reorienting our notions of t h e formal space for models of n a t u r a l l a n g u a g e . A n y real s u c c e s s in the theoretical models of h u m a n l a n g u a g e d e p e n d s on the d e v e l o p m e n t of appropriate m a t h e m a t i c a l concepts and on closing t h e g a p b e t w e e n formal l a n g u a g e a n d n a t u r a l l a n g u a g e theory. One of the first steps in this direction m u s t involve b r e a k i n g the spell of C F G s a n d t h e C h o m s k y Hierarchy. T h e C F Q G s s e e m to be c u t o u t for this task. Moreover, the idea t h a t queues r a t h e r t h a n s t a c k s a r e involved in h u m a n l a n g u a g e a p p e a r s to be correct, a n d this more general r e s u l t is i n d e p e n d e n t of t h e limitations of CFQGs. However, given m y stated goals for formal models, it is n e c e s s a r y to develop models s u c h a s C F Q G s before proceeding to m o r e complex ones precisely in order to develop a n appropriate notion of formal space within which we will h a v e to work.. abSab - - > a b a a b a. On the other h a n d , I conjecture t h a t the corresponding. xmi(x) l a n g u a g e c a n n o t be g e n e r a t e d by s u c h a g r a m m a r . E v e n a t this e a r l y s t a g e of inquiry into t h e s e f o r m a l i s m s , then, we h a v e s o m e tangible promise of being able to explain w h y n a t u r a l l a n g u a g e s should h a v e reduplications b u t not m i r r o r - i m a g e constructions. Various xh(x) constructions such as the respectively ones a n d the cross-serial verb constructions c a n be handled in t h e s a m e w a y as reduplications. While the idea of taking q u e u e s as opposed to stacks as the principal nonfinite-state resource available to h u m a n linguistic processes would explain the prevalence of copying a n d the absence of mirror i m a g e s , it does not explain the coexistence of center-embedded constructions with cross-serial ones or the relative scarcity of cross-serial constructions other t h a n copying ones.. The other m a i n point addressed in this paper, the need to model h u m a n l a n g u a g e s as families of f o r m a l l a n g u a g e s or as formal l a n g u a g e s with indefinite t e r m i n a l vocabularies, is intended in the s a m e spirit. The allure of identifying formal l a n g u a g e theoretic cor~cepts with linguistic ones in the s i m p l e s t possible w a y is h a r d to overcome, b u t it m u s t be if. For this r e a s o n , if for no other, t h e C F Q G s could not be an adequate model of n a t u r a l l a n g u a g e . In fact, there are. 88.

(5) we are to get any meaningful results about natural languages through the formal route. It will, again, be necessary to do more work on these concepts, but it is beginning to look as though we have found the right direction.. Postal, Paul M., and D. Terence Langendoen. 1984. English and the Class of Context-free Languages. CL, 10:177-181.. REFERENCES. Pullum, Geoffrey K., and Gerald Gazdar. 1982. Natural Languages and Context-free Languages. Linguistics and Philosophy, 4: 471-504.. Carlson, Greg N. 1983. Marking Constituents. L i n g u i s t i c C a t e g o r i e s (Frank Heny and Barry Richards, eds.), 1: Categories, 69-98. Dordrecht: Reidel.. Pullum, Geoffrey K. 1984a. On Two Recent Attempts to Show t h a t English is not a CFL. CL, 10: 182-186.. Chomsky, Noam. 1963. Formal Properties of Grammars. H a n d b o o k of M a t h e m a t i c a l P s y c h o l o g y (R. Duncan Luce at al., eds.), 2: 323-418. New York: Wiley.. Pullum, Geoffrey K. 1984b. Syntactic and Semantic Parsability. Proceedings of COLING84, 112-122. Stanford, CA: ACL.. Culy, Christopher. Vocabulary of Bambara. 345-351.. Rounds, William C., Alexis Manaster-Ramer, and Joyce Friedman. To appear. Finding Natural Languages a Home in Formal Language Theory. M a t h e m a t i c s of Language (Alexis Manaster-Ramer, ed.). Amsterdam: John Benjamins.. 1985.. The. Complexity. of. the. Linguistics and Philosophy, 8:. Daly, R. T. 1974. A p p l i c a t i o n s of t h e Mathematical T h e o r y of Linguistics. The Hague: Mouton.. Shieber, Stuart M. 1985. Evidence against the Contextfreeness of Natural Language. Linguistics and P h i l o s o p h y , 8: 333-343.. Gazdar, Gerald, and Geoffrey K. Pullum. 1985. Computationally Relevant Properties of Natural Languages and Their Grammars. New G e n e r a t i o n C o m p u t i n g , 3: 273306. Higginbotham, James. 1984. English is not a Contextfree Language. L i n g u i s t i c I n q u i r y , 15: 225-234. Kac, Michael B. To appear. Surface Transitivity and Context-freeness. Kac, Michael B., Alexis Manaster-Ramer, and William C. Rounds. To appear. Simultaneous-distributive Coordination and Context-freeness. Computational. Linguistics. Langendoen, D. Terence. 1977. On the Inadequacy of Type-3 and Type-2 Grammars for Human Languages. Studies in Descriptive and Historical Linguistics: F e s t s c h r i f t for Winfred P. L e h m a n n (Paul Hopper, ed.), 159-171. Amsterdam: Benjamins. Langendoen, D. Terence, and Paul M. Postal. 1984. Comments on Pullum's Criticisms. CL, 8: 187-188. Levelt, W. J. M. 1974. Formal Grammars in Linguistics and P s y c h o l i n g u i s t i c s . The Hague: Mouton. Manaster-Ramer, Alexis. 1983. The Soft Formal Underbelly of Theoretical Syntax. CLS, 19: 256-262. Manaster-Ramer, Alexis. To appear a. Dutch as a Formal Language. Linguistics and P h i l o s o p h y . Manaster-Ramer, Alexis. To appear b. Subject-verb Agreement in Respective Coordinations in English. Manaster-Ramer, Alexis, and Michael B. Kac. 1985. Formal Languages and Linguistic Universals. Paper read at the Milwaukee Symposium on Typology and Universals. Postal, Paul M. 1964. Limitations of Phrase Structure Grammars. The S t r u c t u r e of L a n g u a g e : R e a d i n g s in t h e P h i l o s o p h y of L a n g u a g e (Jerry A. Fodor and Jerrold J. Katz, eds.), 137-151. Englewood Cliffs, NJ: Prentice-Hall.. 89.

(6)

New documents

Based on our results and the absence of a suitable indicator for thyroid hormone replacement in SCH, HbA1c levels may represent a convenient indicator for initiating treatment in

We show for the first time that prior exposure to higher, relative to lower, intensities of blue-enriched light speeds response times to left, but not right, hemifield visual stimuli,

4.2 Extrinsic Evaluation: Inflection Generation Inflection generation, also called morphology generation, is the NLP task of generating an inflected word from its lemma paired with its

By explicit construction of the spectral curves, we have proved the existence of a charge seven monopole with icosahedral symmetry and a charge five monopole with octahedral symmetry..

Impact: In this study which encompassed multiple number of breast cancer patients encountered within 30 years, HER2 tumors had the worst survival rates Interestingly, Luminal A subgroup

in which sentence-level quality predictions are used to select the best translation between the.. raw output or the correction produced by a factored phrase-based APE

The registered provider, or a person nominated by the registered provider, shall carry out an unannounced visit to the designated centre at least once every six months or more

The change in system trust for the misleading translation M = −0.21,SD = 1.50 did not have a significant effect on user trust compared with the control; t55 = 1.36, p = 0.18, so we

1Chongqing Center Disease Control and Prevention, No 8 Changjiang 2 Road Yuzhong District, Chongqing, China; 2Chongqing Yuzhong District Center Disease Control and Prevention, No 254

This section sets out the actions that must be taken by the provider or person in charge to ensure compliance with the Health Act 2007 Care and Support of Residents in Designated