CONTROLLED TRANSFORMATIONAL SENTENCE GENERATION
Madeleine Bates
Bolt Beranek and Newman, Inc.
Robert Ingria
Department of Linguistics, MIT
1. INTRODUCTION
This paper describes a sentence generator that was built primarily to focus on syntactic form and syntactic relationships. Our main goal was to produce a tutorial system for the English language; the intended users of the system are people with language-delaying handicaps such as deafness, and people learning English as a foreign language. For these populations, extensive exposure to standard English constructions (negatives, questions, relativization, etc.) and their interactions is necessary. The purpose of the generator was to serve as a powerful resource for tutorial programs that need examples of particular constructions and/or related sentences to embed in exercises or examples for the student. The focus of the generator is thus not so much on what to express as on how to express it in acceptable English. This is quite different from the focus of most other language generation systems. Nonetheless, our system could be interfaced to a more goal-directed semantic component.
The mechanism of transformational grammar was chosen because it offered both a way to exercise tight control over the surface syntactic form of a sentence and a good model for the production of groups of sentences that are syntactically related (e.g. the active and passive forms of a transitive sentence). By controlling (at a very high level) the rules that are applied and by examining the detailed syntactic relationships in the tree structures at each end of the derivation, the tutorial part of the system accesses a great deal of information about the syntax of the sentences that are produced by the generator; this knowledge is used to give explanations and hints to the user in the context of the particular exercise that the student is attempting.
The transformational generator is composed of three major parts: a base component that produces base trees, a transformer that applies transformational rules to the trees to derive a surface tree, and a set of mechanisms to control the operation of the first two components. We will discuss each of the components of this system separately.
2. THE BASE COMPONENT
The base component is a set of functions that implicitly embody context free rules for creating a tree structure (phrase marker) in the X-bar framework (as discussed by Chomsky (1970), Jackendoff (1974), Bresnan (1975) and others). In this system, the major syntactic categories (N(oun), V(erb), A(djective) and P(reposition)) are treated as complex symbols which are decomposable into the features [±N] and [±V]. This yields the following cross-classification of these categories:
This work was sponsored by BEH grant #G007904514.
        [+N]   [-N]
[+V]     A      V
[-V]     N      P
Figure 1. Features in the X-bar System
The feature "N" marks a given category as "nounlike" (and thus corresponds to the traditional grammatical notion of "substantive") while "V" marks a category as "verblike." Nouns and Adjectives are [+N] because they share certain properties (e.g. Adjectives can be used in nominal contexts; in highly inflected languages, Adjectives and Nouns typically share the same inflectional paradigms, etc.). Adjectives and Verbs are [+V] because they share (among other things) various morphological traits (e.g. certain verbal forms, such as participles, have adjectival properties). Verbs and Prepositions are [-N] because they display common complement selection attributes (e.g. they both regularly take Nominal complements that bear Accusative Case). (For further discussion of the issue of feature decomposition, and for some alternative proposals, see Jackendoff (1978) and George (1980a, Section 2; 1980b, Section 2).)
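The cross-classification can be made concrete with a small lookup table. The following Python sketch is illustrative only and is not taken from the system described here; the names `CATEGORY_FEATURES` and `natural_class` are hypothetical. It shows how a single feature value picks out a natural class of major categories.

```python
# Illustrative sketch; the table and function names are assumptions,
# not part of the original Interlisp system.
CATEGORY_FEATURES = {
    "N": {"N": "+", "V": "-"},  # Noun
    "A": {"N": "+", "V": "+"},  # Adjective
    "V": {"N": "-", "V": "+"},  # Verb
    "P": {"N": "-", "V": "-"},  # Preposition
}


def natural_class(feature, value):
    """Return the major categories that carry the given feature value."""
    return sorted(
        cat for cat, feats in CATEGORY_FEATURES.items() if feats[feature] == value
    )


print(natural_class("N", "+"))  # the "nounlike" categories
print(natural_class("V", "+"))  # the "verblike" categories
print(natural_class("N", "-"))  # the categories that take Accusative complements
```

A rule stated over `[-N]`, for example, covers Verbs and Prepositions together without naming either category.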
In addition, each syntactic category contains a specification of its rank (given in terms of number of bars, hence the term "X-bar" system). For instance, a Noun (N) is of rank 0 and is marked with no bars whereas the Noun Phrase which it heads is of the same category but different (higher) rank. Intermediate structures are also permitted; for instance, V' (read "V bar") is that portion of the Verb Phrase which consists of a Verb and its complements (e.g. direct and indirect objects, clausal complements, prepositional phrases, etc.) while V'' (read "V double bar") includes V' as well as Auxiliary elements. For our purposes, we have adopted a uniform two-level structure across categories; that is, each category X is taken to have X'' as its highest rank, so that Noun Phrase (NP) in our system is N'', Verb Phrase is V'', etc. Minor categories (such as DET(erminer), AUX(iliary), NEG(ative), etc.) stand outside this system, as do S(entence) and S' (a sort of super sentence, which contains S and clause introducing elements (or "subordinating conjunctions") such as that). These categories are not decomposable into the features [±N] and [±V], and, except for S and S', they do not have different ranks. (It should be noted that the adoption of a uniform two-level hypothesis and the placing of S and S' outside of the normal X-bar system are not uncontroversial; see e.g. Jackendoff (1978) and George (1980a, Section 2; 1980b, Section 2). However, these assumptions are found in many variants of the X-bar framework and are adequate for our purposes.)
An example of the internal structure of the P'' corresponding to the phrase "to the sad boys" is given below:
P'' [-V -N]
  P' [-V -N]
    P [-V -N]
      to
    N'' [+N -V PER.3 +DEF NU.PL +HUMAN GENDER.MALE]
      DET [+DEF]
        the
      A'' [+N +V]
        A' [+N +V]
          A [+N +V]
            sad
      N' [+N -V PER.3 +DEF NU.PL +HUMAN GENDER.MALE]
        N [+N -V PER.3 +DEF NU.PL +HUMAN GENDER.MALE]
          boy
Figure 2. Part of a Sample Base Structure
This system of cross-classification by features and by rank permits the creation of transformations which can refer to a specific rank or feature without referring to a specific major category. (See Bresnan (1975) for further discussion of this point.) For example, the transformation which fronts WH-words to form WH-Questions treats any X'' category as its target and, hence, can be used to question any of the major categories (e.g. A'': "how big is it?"; N'': "what did they do?" "which men left?"; P'': "to whom did you give it?"). Similarly, the transformation which marks Accusative Case on pronouns applies only to those N''s which follow a [-N] category; i.e. only to those N''s which are the objects of Verbs or Prepositions. This allows us to create extremely versatile transformations which apply in a variety of contexts, and frees us from the necessity of creating several transformations, each of which essentially replicates the Structural Description and Structural Change of the others, differing only in the category of the affected term.
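To illustrate how a rule can be stated over features and rank rather than over category names, the following Python sketch marks Accusative Case on any [+N -V] node of rank 2 that immediately follows a [-N] sister. It is a simplified illustration, not the original rule: the `Node` class, the `CASE` feature name, and the "immediately following sister" reading of "follow" are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    feats: dict                 # e.g. {"N": "+", "V": "-"}; minor categories omit N and V
    rank: int = 0               # number of bars
    children: list = field(default_factory=list)
    word: str = ""


def mark_accusative(node):
    """Mark rank-2 [+N -V] nodes that directly follow a [-N] sister (assumed reading)."""
    for left, right in zip(node.children, node.children[1:]):
        is_nominal_phrase = (
            right.rank == 2
            and right.feats.get("N") == "+"
            and right.feats.get("V") == "-"
        )
        if is_nominal_phrase and left.feats.get("N") == "-":
            right.feats["CASE"] = "ACC"  # hypothetical feature name
    for child in node.children:
        mark_accusative(child)


# Object of a Preposition: P' dominating P and N''.
pronoun = Node({"N": "+", "V": "-"}, rank=2, word="they")
p_bar = Node({"N": "-", "V": "-"}, rank=1,
             children=[Node({"N": "-", "V": "-"}, word="to"), pronoun])

# Subject position: N'' precedes V'' and follows nothing.
subject = Node({"N": "+", "V": "-"}, rank=2, word="they")
sentence = Node({}, children=[subject, Node({"N": "-", "V": "+"}, rank=2)])

mark_accusative(p_bar)
mark_accusative(sentence)
print(pronoun.feats.get("CASE"), subject.feats.get("CASE"))
```

Because the test is on `[-N]`, the same function covers objects of Verbs and objects of Prepositions with no category-specific branches.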
A set of constraints (discussed further below) is the input to the base component and determines the type of base structure which is produced. A base structure has both the usual features on the nodes (category features such as [+N] and [-V], and selectional features such as [+PROPER]) and some additional diacritic features (such as [-C], for case marking) which are used to govern the application of certain transformations.
Lexical insertion is an integral part of the construction of the tree by the base component. It is not essential that words be chosen for the sentence at this time, but it is convenient because additional features in the structure (such as [+HUMAN], [+MALE]) are needed to guide some transformations (for instance, the insertion of the correct form of pronouns).
In our current system, the choice of words to be inserted in the base structure is controlled by a dictionary and a semantic network which embodies a limited number of semantic class relationships and case restrictions to prohibit the production of utterances like "The answer saw the angry cookie." The network nodes are chosen at random for each sentence that is generated, but a more powerful semantic component could be used to convey particular "messages," provided only that it could find lexical items to be inserted in the small number of positions required by the base constraints.
3. THE TRANSFORMATIONAL COMPONENT
Each transformational rule has a Structural Description, Structural Change, and (optional) Condition; however, rules are not marked as optional or obligatory, as they were in traditional transformational theory (e.g. Chomsky (1955)). In that theory, obligatory transformations whose structural descriptions were met would apply necessarily; optional transformations would apply at random. Moreover, various versions of transformational grammar have employed transformations as "filters" on possible derivations. In older work (e.g. the so-called "Standard Theory" (ST) of Chomsky (1965)) derivations in which a transformation required in a given syntactic configuration failed to apply would block, causing the result to be ruled out as ungrammatical (op. cit., p. 138).
In more recent theories (e.g. the "Extended Standard Theory" (EST) of Chomsky (1977) and Chomsky and Lasnik (1977)) all transformations are optional, freely ordered and may apply at random. Those derivations in which a transformation misapplies are ruled out by independent conditions on the structures produced by the operation of the transformational component (Chomsky (1977, p. 76)). These frameworks adopt a "generate and test" approach, wherein the misapplication of transformations during the course of a derivation (e.g. the failure of a required transformation to apply (ST, EST) or the application of a transformation in a prohibited syntactic configuration (EST)) will result in a rejection of this possible derivation. The application of different optional transformations results in the production of a variety of surface forms.
There are two reasons why we do not use this generate and test approach. The first is that it is computationally inefficient to allow the transformations to apply at random and to check the result to make sure that it is grammatical. More importantly, we view the transformations as tools to be used by a process outside the sentence generator itself. That is, an external process determines what the surface syntactic form of a given base structure should be; the transformations are not independent entities which make this decision on their own. For example, a focus mechanism should be able to select or prohibit passive sentences, a dialogue mechanism should be able to cause agent-deletion, and so on. In our application, tutorial programs select the characteristics of the sentences to be produced on the basis of the syntactic rule or rules being exercised in the particular tutorial.
The Structural Change of each transformation consists of one or more functions, analogous to the transformational elementaries of traditional transformational theory (Chomsky (1955, pp. 402-407, Section 93.1)). We have not adopted the restriction on the Structural
Change of transformations proposed by more
recent work in generative grammar (e.g. Chomsky (1980, p. 4)) which prohibits "compounding of elementaries"; i.e. which limits the Structural
Change of a transformation to a single
operation. This would require breaking up many
transformations into several transformations,
each of which would have to apply in the
derivation of a particular syntactic
construction rather than having one
transformation that performs the required
operations. Inasmuch as we are interested in
utilizing the generative capacity of
transformational grammar to produce specific
constructions, this break up of more general,
overarching transformations into smaller, more specific operations is undesirable.
The operations that are performed by the rules are a combination of classic transformational operations (substitution, adjunction, deletion,
insertion of non-lexical elements such as
"there" and "do") and operations that linguists
sometimes relegate to the base or post-
transformational processes (insertion of
pronouns, morphing of inflected forms). By
making these operations rule-specific, many
related forms can be produced from the same
base tree and the control mechanisms outside
the generator itself can specify which forms
are to be produced. (Figure 3 shows some of
the transformations currently in the system.)
SUBJECT-AUX-INVERSION
SD: (S' (FEATS (TRANS.1)) COMP (FEATS (WH.+))
     1                    2
        (S N'' TNS (OPT NODE (FEATS (M.+)))))
         3 4   5   6
SC: (DeleteNode 6)
    (DeleteNode 5)
    (LChomsky 2 6)
    (LChomsky 2 5)
Condition: [NOT (EQ (QUOTE +)
                    (FeatureValue (QUOTE WH) (RootFeats 4]

RELATIVE-PRONOUN-SPELL-OUT [REPEATABLE]
SD: (S' XX (N'' N'' (S' (COMP X (N''
     1 2 3 4 5 6
        (FEATS (WH . +)) WH)))))
                         7
SC: (DeleteSons 6)
    (LSon 6 (if (EQ '+ (GetFeat 6 'HUMAN))
                then 'who
                else 'which))
Figure 3. Sample Transformations
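The three-part shape of a rule (Structural Description, Structural Change, optional Condition) can be sketched in Python as follows. This is a loose illustration and not the Interlisp implementation: clauses are plain dictionaries rather than trees, and the class `Transformation`, the helper `make_clause`, and the functions `sai_match`, `sai_condition` and `sai_change` are hypothetical names. The example rule is only modelled on SUBJECT-AUX-INVERSION: it moves tense and an optional modal into COMP unless the subject is itself a WH phrase.

```python
from dataclasses import dataclass
from typing import Callable, Optional


def always(clause, bindings):
    return True


@dataclass
class Transformation:
    name: str
    match: Callable[[dict], Optional[dict]]   # Structural Description
    change: Callable[[dict, dict], None]      # Structural Change
    condition: Callable[[dict, dict], bool] = always

    def try_apply(self, clause):
        """Apply the change only if the description matches and the condition holds."""
        bindings = self.match(clause)
        if bindings is None or not self.condition(clause, bindings):
            return False
        self.change(clause, bindings)
        return True


def sai_match(clause):
    comp, s = clause["COMP"], clause["S"]
    if comp["feats"].get("WH") == "+" and s.get("TNS") is not None:
        return {"comp": comp, "s": s}
    return None


def sai_condition(clause, b):
    # Blocked when the subject itself carries [WH +].
    return b["s"]["subject"]["feats"].get("WH") != "+"


def sai_change(clause, b):
    s, comp = b["s"], b["comp"]
    modal = s.pop("modal", None)
    comp["items"].append(s.pop("TNS"))
    if modal is not None:
        comp["items"].append(modal)


SAI = Transformation("SUBJECT-AUX-INVERSION", sai_match, sai_change, sai_condition)


def make_clause(subject_wh, modal=None):
    s = {"subject": {"feats": {"WH": "+" if subject_wh else "-"}}, "TNS": "PRES"}
    if modal:
        s["modal"] = modal
    return {"COMP": {"feats": {"WH": "+"}, "items": []}, "S": s}


question = make_clause(subject_wh=False, modal="can")
print(SAI.try_apply(question), question["COMP"]["items"])
```

Keeping the Condition separate from the Structural Description mirrors the layout of Figure 3, where the match is purely structural and the Condition inspects feature values.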
Those transformations which affect the syntactic form of sentences are applied cyclically (see Chomsky (1965, p. 143) for more details). Thus transformations apply from the "bottom up" during the course of a derivation, applying first in the most embedded clause and then working upwards until the [...] the transformations are strictly (and extrinsically) ordered. In addition to the cyclic syntactic transformations there exists a set of post-cyclic transformations, which apply after all the cyclic syntactic transformations have applied. These post-cyclic transformations, whose domain of operation ranges over the entire syntactic tree, supply the correct morphological forms of all lexical and grammatical items. This includes giving the correct plural forms of nouns, the inflected forms of verbs, the proper forms of pronouns (e.g. "he," "she" and "they" in subject position and "him," "her," and "them" in object position), etc. While it has been relatively rare in recent transformational analyses to utilize transformations to effect this type of morphological "spell-out," this mechanism was first proposed in the earliest work in generative grammar (Chomsky (1955)). Moreover, recent work by George (1980a; 1980b) and Ingria (in preparation) suggests that this is indeed the correct way of handling such morphological processes.
The transformations as a whole are divided up into "families" of related transformations. For example, there is a family of transformations which apply in the derivation of questions (all beginning with the prefix WH-); there is a family of morphing transformations (similarly beginning with the flagged mnemonic prefix MORPH-). These "families" of transformations provide detailed control over the generation process. For example, all transformations of the WH- family will apply to a single syntactic position that may be questioned (e.g. subject, direct object, object of preposition, etc.), resulting in questions of the form "Who died" and "To whom did she speak." This familial characterization of transformations is similar to the classical transformational approach (Chomsky (1955, p. 381, Section 90.1)) wherein families of transformations were first postulated, because of the condition imposed within that framework that each transformation must be a single-valued mapping.
Our current sentence generator produces declarative sentences, passive sentences (with and without agent deletion), dative movement sentences, yes-no questions and wh-questions (including multiple-wh questions such as "Who gave what to whom?"), there-insertion sentences, negated sentences (including both contracted and emphatic forms), relative clauses, finite and infinitival complements (e.g., "The teacher wanted Kathy to hurry."), imperative sentences, various complex auxiliaries (progressive, perfective, and modals), predicate adjectives, and predicate nominals. Although not all of these constructions are handled in complete generality, the generator produces a very large and natural subset of English. It is important to note that the interactions among all these transformations have been taken into account, so that any meaningful combination of them will produce a meaningful, grammatical sentence. (Appendix A lists some of the sentences which have been produced by the interaction of various transformations.)
In our application, there is a need to generate ungrammatical utterances occasionally (for [...] ability to judge the grammaticality of utterances). To this end, we have developed an additional set of transformations that can be used to generate utterances which mimic the ungrammatical forms found in the writing of the language delayed populations for which this system is intended. For example, deaf and hearing-impaired children often have difficulty with negative sentences, and replace the not of Standard English negation with no and/or place the negative element in positions in which it does not occur in Standard English (e.g. "The mouse is no a big animal," "The girl no has gone," "Dogs not can build trees"). The fact that these ungrammatical forms may be modelled with transformations is highly significant, and lends support to the claim (Chapman (1974), Fromkin (1973)) that ungrammatical utterances are rule-driven.
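The cyclic regime described earlier in this section (most embedded clause first, a fixed rule order within each cycle, then post-cyclic rules over the whole tree) can be sketched in Python as follows. The sketch is an illustration under simplifying assumptions: clauses are dictionaries with an `embedded` list, rules are functions returning whether they applied, and the names `clauses_bottom_up`, `run_cycle`, `passive` and `morph_verbs` are hypothetical.

```python
def clauses_bottom_up(clause):
    """Yield embedded clauses before the clause that contains them."""
    for sub in clause.get("embedded", []):
        yield from clauses_bottom_up(sub)
    yield clause


def run_cycle(root, cyclic_rules, post_cyclic_rules):
    """Apply cyclic rules clause by clause, then post-cyclic rules to the root."""
    log = []
    for clause in clauses_bottom_up(root):
        for rule in cyclic_rules:           # list order is the (extrinsic) rule order
            if rule(clause):
                log.append((rule.__name__, clause["id"]))
    for rule in post_cyclic_rules:          # domain is the entire tree
        if rule(root):
            log.append((rule.__name__, root["id"]))
    return log


# Toy rules: each reports whether it applied to the clause it was given.
def passive(clause):
    return clause.get("passive", False)


def morph_verbs(root):
    return True


tree = {"id": "main", "passive": True,
        "embedded": [{"id": "relative", "passive": True}]}
log = run_cycle(tree, [passive], [morph_verbs])
print(log)
```

The log makes the ordering visible: the rule fires in the embedded clause before the main clause, and the morphological rule runs once at the end.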
4. HIGHER LEVELS OF CONTROL
In order to manage the creation of the base trees and the application of the transformational rules, we have developed several layers of control mechanisms. The first of these is a set of constraints that directs the operation of the base component and indicates which transformations to try. A transformational constraint merely turns a particular transformation on or off. The fact that a transformation is turned on does not guarantee that it will apply; it merely indicates that the Structural Description and Condition of that transformation are to be tried. Base constraints can have either atomic indicators or a list of constraints as their values. For example, the direct object constraint (DIROBJ (PER 3) (NU PL) ...) specifies all the base constraints necessary to produce the N'' subtree for the direct object position in the base structure.
There are a number of dependencies which exist among constraints. For example, if the transformational constraint for the passive transformation is turned on, then the base component must be instructed to produce a direct object and to choose a main verb that may be passivized; if the base constraint for a direct object is turned off, then the base constraint for an indirect object must be turned off as well. A data base of implications controls the application of constraints so that whenever a constraint is set (or turned off), the base and/or transformational constraints that its value implies are also set.
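One way to picture such a data base of implications is as a table from (constraint, value) pairs to the further settings they force. The Python sketch below is an assumption-laden illustration, not the original mechanism: apart from DIROBJ, the constraint names (`PASSIVE`, `INDOBJ`, `PASSIVIZABLE-VERB`) and the `+`/`-` value encoding are hypothetical, as is the decision to raise an error on a conflict.

```python
# Hypothetical implication table: setting the key forces the listed settings.
IMPLICATIONS = {
    ("PASSIVE", "+"): [("DIROBJ", "+"), ("PASSIVIZABLE-VERB", "+")],
    ("DIROBJ", "-"): [("INDOBJ", "-")],
}


def set_constraint(constraints, name, value):
    """Return a copy of constraints with name=value and everything it implies."""
    result = dict(constraints)
    pending = [(name, value)]
    while pending:
        key, val = pending.pop()
        if key in result:
            if result[key] != val:
                raise ValueError(f"{key}={val} conflicts with {key}={result[key]}")
            continue
        result[key] = val
        pending.extend(IMPLICATIONS.get((key, val), []))
    return result


print(set_constraint({}, "PASSIVE", "+"))
print(set_constraint({}, "DIROBJ", "-"))
```

Working on a copy means that a rejected setting leaves the caller's constraints untouched, which suits an interactive setting where a user may try incompatible choices.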
The notion of a particular syntactic construction transcends the distinction between base and transformational constraints. The "natural" specification of a syntactic construction such as passive or relative clause should be made without requiring detailed knowledge of the constraints or their implications. In addition, one might want to request, say, a relative clause on the subject, without specifying whether the target of relativization is to be the subject or object of the embedded clause.
We have developed a data base of structures called synspecs (for "syntactic specifications") which embody, at a very high level, the notion of a syntactic construction. These constructions cannot be identified with a single constraint or its implied constraints. (Implications specify necessary dependencies; synspecs specify possible but not necessary choices on the part of the system designers about what combinations of constraints should be invoked under a general name.) A synspec can contain an element of choice. The choice can be made by any user-defined function, though in our practice most of the choices are made at random. One example of this is a synspec called wh-question which decides which of the synspecs that actually set up the constraints for a wh-question (question-on-subject, question-on-object, question-on-dative, etc.) should be used. The synspecs also provide convenient hooks on which to hang other information associated with a syntactic construction: sentences exemplifying the construction, a description of the construction for purposes of documentation, etc. Figure 4 shows how several of the synspecs look when printed for the user.
wh-question
  Compute: (PickOne '(question-on-subject question-on-object question-on-dative))
  Description: (This SynSpec will create any one of the questions with WH-words.)
second-person-imperative
  BaseConstraints: ((IMPERATIVE . 2) (TNS))
  TransConstraints: ((REQUEST-VOCATIVE-DELETION . +)
                     (REQUEST-EXCLAMATION-INSERTION . +)
                     (REQUEST-YOU-DELETION . +))
  Examples: ("Open the door!")
Figure 4. Sample SynSpecs
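The two synspecs of Figure 4 can be mirrored in a small Python table to show how a Compute entry defers to another synspec, which in turn supplies the constraints. This is a sketch only: the dictionary layout, the `expand` function, and the constraints listed for the three `question-on-*` synspecs (`WH-TARGET`, `WH-FRONTING`) are hypothetical, and the empty `(TNS)` entry is represented here as `None` by assumption.

```python
import random

SYNSPECS = {
    "wh-question": {
        # Plays the role of (PickOne '(...)): choose one concrete synspec.
        "compute": lambda rng: rng.choice(
            ["question-on-subject", "question-on-object", "question-on-dative"]
        ),
        "description": "Creates any one of the questions with WH-words.",
    },
    # Hypothetical constraint settings for the concrete question synspecs.
    "question-on-subject": {"base": {"WH-TARGET": "SUBJ"}, "trans": {"WH-FRONTING": "+"}},
    "question-on-object": {"base": {"WH-TARGET": "DIROBJ"}, "trans": {"WH-FRONTING": "+"}},
    "question-on-dative": {"base": {"WH-TARGET": "INDOBJ"}, "trans": {"WH-FRONTING": "+"}},
    "second-person-imperative": {
        "base": {"IMPERATIVE": 2, "TNS": None},
        "trans": {
            "REQUEST-VOCATIVE-DELETION": "+",
            "REQUEST-EXCLAMATION-INSERTION": "+",
            "REQUEST-YOU-DELETION": "+",
        },
        "examples": ["Open the door!"],
    },
}


def expand(name, rng=random):
    """Follow Compute entries until a synspec that sets constraints is reached."""
    spec = SYNSPECS[name]
    while "compute" in spec:
        name = spec["compute"](rng)
        spec = SYNSPECS[name]
    return name, dict(spec.get("base", {})), dict(spec.get("trans", {}))


print(expand("second-person-imperative"))
print(expand("wh-question", random.Random(0))[0])
```

Passing the random source as a parameter leaves room for the user-defined choice functions mentioned in the text, since any object with a `choice` method can stand in for it.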
Synspecs are invoked through a simple mechanism that is available to the tutorial component of the system. Each tutorial specifies the range of constructions relevant to its topic and chooses among them for each sentence that is to be generated. (To produce related sentences, the generator is restarted at the transformational component (using the previous base tree) after the synspecs specifying the relationship have been processed.)
Just as constraints have implications, so do synspecs. The relationships that hold among synspecs include exclusion (e.g. transitive-sentence excludes predicate-nominal-sentence), requirement (e.g. extraposed-relative requires relative-clause-on-subject or relative-clause-on-object), and permission (e.g. predicate-adverb-sentence allows there-insertion). A mechanism similar to the implications for constraints refines a set of candidate synspecs so that the user (or the tutorials) can make choices which are consistent. Thus the user does not have to know, understand, or remember which combinations of choices are allowed.
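The refinement of a candidate set can be sketched in Python using the exclusion and requirement examples just given. The sketch is illustrative rather than a description of the actual mechanism: the table and function names are hypothetical, exclusion is assumed to be symmetric, and the permission relation is left out because its exact semantics are not spelled out here.

```python
# Relations taken from the examples in the text; the encoding is an assumption.
EXCLUDES = {"transitive-sentence": {"predicate-nominal-sentence"}}
REQUIRES_ONE_OF = {
    "extraposed-relative": {"relative-clause-on-subject", "relative-clause-on-object"},
}


def refine(chosen, candidates):
    """Return the candidates that remain consistent with the synspecs already chosen."""
    chosen = set(chosen)
    allowed = set()
    for cand in candidates:
        excluded = any(
            cand in EXCLUDES.get(c, ()) or c in EXCLUDES.get(cand, ())
            for c in chosen
        )
        needs = REQUIRES_ONE_OF.get(cand)
        unmet = needs is not None and not (needs & chosen)
        if not excluded and not unmet:
            allowed.add(cand)
    return allowed


options = {"predicate-nominal-sentence", "extraposed-relative", "relative-clause-on-subject"}
print(refine({"transitive-sentence"}, options))
print(refine({"transitive-sentence", "relative-clause-on-subject"}, {"extraposed-relative"}))
```

Offering only the refined set to the user is what removes the need to remember which combinations are allowed.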
Once some constraints have been set (either directly or through synspecs), a command can be
given to generate a sentence. The generator
first assigns values to the constraints that the user did not specify; the values chosen are guaranteed to be compatible with the previous choices, and the implications of these choices ensure that contradictory specifications cannot
be made. Once all constraints have been set, a
base tree is generated and saved before the
transformations are applied. Because the base
structure has been saved, the transformational
constraints can be reset and the generator
called to start at the transformational
component, producing a different surface
sentence from the same base tree. As many
sentences as are wanted can be produced in this way.
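The reuse of a saved base tree can be sketched as follows. This Python fragment is a toy illustration only: `generate_family` and `toy_transform` are hypothetical names, the "base tree" is a flat dictionary, and the constraint names `PASSIVE` and `AGENT-DELETION` are assumed for the example. The point it shows is that each derivation starts from a fresh copy of the same base structure and differs only in its transformational constraints.

```python
import copy


def generate_family(base_tree, constraint_sets, transform):
    """Derive one surface form per set of transformational constraints."""
    surfaces = []
    for trans_constraints in constraint_sets:
        tree = copy.deepcopy(base_tree)   # the saved base tree is never modified
        surfaces.append(transform(tree, trans_constraints))
    return surfaces


def toy_transform(tree, constraints):
    """Stand-in for the transformational component (hypothetical)."""
    if constraints.get("PASSIVE") == "+":
        words = [tree["theme"], "was", tree["participle"]]
        if constraints.get("AGENT-DELETION") != "+":
            words += ["by", tree["agent"]]
    else:
        words = [tree["agent"], tree["verb"], tree["theme"]]
    return " ".join(words)


base = {"agent": "the cook", "verb": "ate", "participle": "eaten", "theme": "the cookie"}
family = generate_family(
    base,
    [{}, {"PASSIVE": "+"}, {"PASSIVE": "+", "AGENT-DELETION": "+"}],
    toy_transform,
)
print(family)
```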
5. DEVELOPMENT TOOLS
As one side effect of the development of the generative system, we have built a debugging environment called the syntactic playground in
which a user can develop and test various
components of the generator. This environment
has become more important than the tutorials in testing syntactic hypotheses and exploring the
power of the language generator. In it,
dictionary entries, transformations,
implications and synspecs can be created,
edited, and saved using interactive routines
that ensure the correct format of those data
types. It is also possible here to give
commands to activate synspecs; this operation
uses exactly the same interface as programs
(e.g. tutorials) that use the generator.
Commands exist in the playground to set base
constraints to specific values and to turn
individual transformations on and off without
activating the implications of those
operations. This allows the system programmer
or linguist to have complete control over all aspects of the generation process.
Because the full power of the Interlisp system
is available to the playground user, the base
tree can be edited directly, as can any version
of the tree during the derivation process.
Transformations can also be "broken" like
functions, so that when a transformation is
about to be tried, the generator goes into a "break" and conducts an interactive dialogue with the user, who can control the matching of the Structural Description, examine the result of the match, allow (or not) the application of the Structural Change, edit the transformation
and try it again, and perform many of the
operations that are available in the general
playground. In addition to the
transformational break package there is a trace option which, if used, prints the constraints
selected by the system, the words, and the
transformations that are tried as they apply or
fail. The playground has proved to be a
powerful tool for exploring the interaction of
various rules and the efficacy of the whole
generation package.
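A rough Python sketch of the break and trace facilities is given below. It is an illustration under stated assumptions, not the Interlisp break package: the `Transformation` class, the `derive` function, the `on_break` callback (which stands in for the interactive dialogue), and the list-of-constituents "tree" are all hypothetical names and simplifications.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Transformation:
    name: str
    matches: Callable[[list], bool]  # stands in for the Structural Description
    change: Callable[[list], list]   # stands in for the Structural Change
    broken: bool = False             # True: consult the user before applying


def derive(tree, transformations, on_break=None, trace=None):
    """Try each transformation in order, recording what happened to each."""
    for t in transformations:
        if not t.matches(tree):
            status = "failed"
        elif t.broken and on_break is not None and not on_break(t, tree):
            status = "blocked"       # the user vetoed the Structural Change
        else:
            tree = t.change(tree)
            status = "applied"
        if trace is not None:
            trace.append((t.name, status))
    return tree


# Toy rules over a list of constituents; real rules would operate on trees.
passive_rule = Transformation(
    name="PASSIVE",
    matches=lambda t: len(t) == 3,
    change=lambda t: [t[2], "was " + t[1] + " by", t[0]],
    broken=True,
)
there_rule = Transformation(
    name="THERE-INSERTION",
    matches=lambda t: t[1] == "were",
    change=lambda t: ["there"] + t,
)

base = ["the bullies", "chased", "the girl"]
trace = []
allowed = derive(base, [passive_rule, there_rule],
                 on_break=lambda t, tree: True, trace=trace)
vetoed = derive(base, [passive_rule], on_break=lambda t, tree: False)
```

In an interactive setting `on_break` would prompt the user rather than return a fixed answer; here a constant callback keeps the example self-contained.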
6. CONCLUSION
This is the most syntactically powerful
generator that we know of. It produces sets of
related sentences while maintaining detailed
knowledge of the choices that have been made
and the structure(s) that have been produced.
Because the notion of "syntactic construction" is embodied in an appropriately high level of
syntactic specification, the generator can be
externally controlled. It is fast, efficient,
and very easy to modify and maintain; it has
been implemented in both Interlisp on a
DECSystem-20 and UCSD Pascal on the Cromemco
and Apple computers. It forms the core of a
set of tutorial programs for English now being
used by deaf children in a classroom setting,
and thus is one of the first applications of
computational linguistics to be used in an
actual educational environment.
References
Bresnan, Joan (1975) "Transformations and
Categories in Syntax," in R. Butts and
J. Hintikka, eds. Proceedings of the Fifth
International Congress of Logic, Methodology
and Philosophy of Science, University of
Western Ontario, London, Ontario.
Chapman, Robin S. (1974) The Interpretation of
Deviant Sentences in English: A
Transformational Approach, Janua Linguarum,
Series Minor, Volume 189, Mouton, The Hague.
Chomsky, Noam (1955) The Logical Structure of
Linguistic Theory, unpublished manuscript,
microfilmed, MIT Libraries, partially published by Plenum Press, New York, 1975.
Chomsky, Noam (1965) Aspects of the Theory of
Syntax, MIT Press, Cambridge, Massachusetts.
Chomsky, Noam (1970) "Remarks on
Nominalization", in R. A. Jacobs and P. S.
Rosenbaum, eds., Readings in English
Transformational Grammar, Ginn and Co.,
Waltham, Mass.
Chomsky, Noam (1973) "Conditions on
Transformations", in S. R. Anderson and
P. Kiparsky, eds., A Festschrift for Morris
Halle, Holt, Rinehart and Winston, New York.
Chomsky, Noam (1977) "On WH-Movement", in
P. Culicover, T. Wasow and A. Akmajian, eds.
Formal Syntax, Academic Press, Inc., New York.
Chomsky, Noam (1980) "On Binding," Linguistic Inquiry 11.
Chomsky, Noam and Howard Lasnik (1977) "Filters and Control", Linguistic Inquiry 8.
Fromkin, Victoria A. (1973) Speech Errors as
Linguistic Evidence, Janua Linguarum, Series
major, Volume 77, Mouton, The Hague.
George, Leland M. (1980a) Analogical
Generalization in Natural Language Syntax,
unpublished Doctoral Dissertation, MIT.
George, Leland M. (1980b) Analogical
Generalizations of Natural Language Syntax,
unpublished manuscript, MIT.
Ingria, Robert (in preparation) Sentential
Complementation in Modern Greek, Doctoral
Dissertation, MIT.
Jackendoff, Ray S. (1974) "Introduction to the
X-bar Convention", distributed by Indiana
University Linguistics Club, Bloomington.
Jackendoff, Ray S. (1978) X-bar Syntax: A Study of Phrase Structure, Linguistic Inquiry Monograph
Appendix A: Sample Sentences
1. Transitive Sentences
   1. The bullies chased the girl.
   2. What did the bullies do to the girl?
   3. They chased her.
   4. Who chased the girl?
   5. The bullies chased her.
   6. Who did they chase?
   7. Whom did they chase?
   8. They chased the girl.
   9. How many bullies chased the girl?
   10. Eight bullies chased the girl.
   11. How many bullies chased her?
   12. Eight bullies chased her.
   13. Who got chased?
   14. The girl got chased.
   15. She was chased by the bullies.
   16. The girl was being chased by the bullies.
2. Intransitive Sentences
   1. What did the girl do?
   2. She cried.
   3. Who cried?
   4. The girl cried.
3. Indirect Discourse
   1. Dan said that the girl is sad.
   2. Dan said that she is sad.
   3. Who said that the girl is sad?
4. Transitive Sentence with Indirect Object
   1. The generous boy gave a doll to the girl.
   2. The generous boy gave the girl a doll.
   3. The girl was given a doll.
   4. A doll was given to the girl.
   5. Who gave the girl a doll?
   6. Who gave what to whom?
   7. What did the generous boy give the girl?
   8. He gave her a doll.
   9. What did the generous boy give to the girl?
   10. He gave a doll to her.
   11. Who gave a doll to the girl?
   12. Who gave the girl a doll?
   13. Which boy gave the girl a doll?
   14. The generous boy gave her a doll.
   15. Which boy gave a doll to the girl?
   16. The generous boy gave it to her.
   17. How many dolls did the generous boy give the girl?
   18. He gave her one doll.
5. Comparative Sentences
   1. The soldier was better.
   2. The gentleman will be more unhappy.
   3. Alicia is hungrier than Jake.
   4. The children were angrier than Andy.
6. Superlative Sentences
   1. A policeman caught the nicest butterflies.
   2. A sheepdog was the sickest pet.
   3. The fire chief looks most generous.
   4. The smartest man swore.
   5. The oldest bulldog broke the dolls.
7. Sentences with Infinitives
   1. The teacher wanted Kathy to hurry.
   2. The gentleman promised the lady to close the door.
   3. The girls were hard to ridicule.
8. Relative Clauses
   1. Whoever embraced the kids will embrace the ladies.
   2. The girl who was intelligent cheated the adults.
   3. The woman who greased the tricycle mumbled.
   4. The teacher who lost the bulldogs swears.
9. Negative Sentences
   1. Kim won't help.
   2. Claire didn't help.
   3. The children won't shout.
   4. Do not slap the poodles.
   5. Do not cry.
10. Varieties of Quantifiers
   1. No toy breaks.
   2. Some excited boys kissed the women.
   3. Some hungry people eat.
   4. Two men cried.
   5. Every new toy broke.
   6. Not every man slips.
   7. The boy won't give the dogs any oranges.
   8. The girl doesn't see any cats.
   9. The old men didn't tell the boys anything.
   10. The girl didn't love anybody.
11. Varieties of Pronouns
   1. Bette is the sad one.
   2. Gloria is the happy one.
   3. Kevin is the saddest.
   4. Kathy is the most cheerful.
   5. Varda liked the sweet apple.
   6. Varda liked the sweet one.
12. THERE Sentences
   1. There were some toys in the dirt.
   2. There were no toys in the dirt.
   3. There weren't any toys in the