Principle Based Parsing Without Overgeneration

(1)

P R I N C I P L E - B A S E D

P A R S I N G W I T H O U T

O V E R G E N E R A T I O N

1 D e k a n g

L i n

D e p a r t m e n t of Computing Science, University of M a n i t o b a Winnipeg, Manitoba, Canada, l~3T 2N2

E-mail: [email protected]

A b s t r a c t

Overgeneration is the main source of computational complexity in previous principle-based parsers. This paper presents a message passing algorithm for principle-based parsing t h a t avoids the overgeneration problem. This algorithm has been implemented in C + + and successfully tested with example sentences from (van Riemsdijk and Williams, 1986).

1. I n t r o d u c t i o n

Unlike rule-based g r a m m a r s t h a t use a large number of rules to

describe

patterns in a language, Government-Binding (GB) T h e o r y (Chomsky, 1981; Haegeman, 1991; van Riemsdijk and Williams,

1986)

ezplains

these patterns in terms of more

foundmental and universal principles.

A key issue in building a principle-based parser is how to procedurally interpret the principles. Since GB principles are constraints over syntactic structures, one way to implement the principles is to

1. generate candidate structures of the sentence that satisfy X-bar theory and subcategoriza- tion frames of the words in the sentence. 2. filter out structures t h a t violates any one of

the principles.

3. the remaining structures are accepted as parse trees of the sentence.

This implementation of GB theory is very ineffi- cient, since there are a large number of structures being generated and then filtered out. T h e problem of producing too m a n y illicit structures is called

overgenera~ion

and has been recognized as the cul- prit of c o m p u t a t i o n a l difficulties in principle-based parsing (Berwick, 1991). Many methods have been proposed to alleviate the overgeneration problem by detecting illicit structures as early as possible, such as optimal ordering of principles (Fong, 1991), coroutining (Doff, 1991; Johnson, 1991).

] T h e a u t h o r wishes to t h a n k t h e a n o n y m o u s referees for t h e i r h e l p f u l c o m m e n t s a n d s u g g e s t i o n s . T h i s r e s e a r c h was s u p p o r t e d b y N a t u r a l Sciences a n d E n g i n e e r i n g R e s e a r c h C o u n c i l of C a n a d a g r a n t O G P 1 2 1 3 3 8 .

This paper presents a principle-based parser t h a t avoids the overgeneration problem by applying principles to descriptions of the structures, instead of the structures themselves. A structure for the input sentence is only constructed after its description has been found to satisfy all the principles. T h e structure can then be retrieved in time linear to its size and is guaranteed to be consistent with the principles.

Since the descriptions of structures are constant- sized attribute vectors, checking whether a struc- tural description satisfy a principle takes constant a m o u n t of time. This compares favorably to ap- proaches where constraint satisfaction involves tree traversal.

T h e next section presents a general framework for parsing by message passing. Section 3 shows how linguistic notions, such as dominance and government, can be translated into relationships between descriptions of structures. Section 4 describes interpretation of GB principles. Familiarity with GB theory is assumed in the presentation. Section 5 sketches an object-oriented implementation of the parser. Section 6 discusses complexity issues and related work.

2. P a r s i n g b y M e s s a g e P a s s i n g

T h e message passing algorithm presented here is an extension to a message passing algorithm for context-free g r a m m a r s (Lin and Goebel, 1993).

We encode the g r a m m a r , as well as the parser, in a network (Figure 1). T h e nodes in the net- works represent syntactic categories. T h e links in the network represent dominance and subsumption relationships between the categories:

• T h e r e is a dominance link from node A to B if B can be i m m e d i a t e l y d o m i n a t e d by A. T h e dominance links can be further classified according to the type of dominance relationship.

• There is a specialization link from A to B if A subsumes B.

(2)

with each other by passing messages in the reverse direction of the links in the network.

/x•!':"

... ~ ... / . . . \ .... t . . . "

/\°....

x

PSpec B / i VI~. " d ' ' ' - ~ , _~ \ %

1

i I

i k

" \ "

"..

.3.

• F S ~ N ~ ] AUX" Have%e

iv(

//--'

., : \ $ :

",,,i

\

Xi

ASpec .. A'bar % ~ D~et "

N

... ~ --" 0 barrier adjunct-dominance specialization link

~--~ . , l l . l l * . l l |

head dominance specifier~ominance complement-dominance

Figure 1: A Network Representation of G r a m m a r

T h e messages contains items. An i t e m is a triplet t h a t describes a structure:

< s u r f a c e - s t r i n g , a t t r i b u t e - v a l u e s , s o u r c e s > ,

where

s u r f a c e - s t r i n g is an integer interval [i, j] denoting the i'th to j ' t h word in the input sentence. a t t r i b u t e - v a l u e s specify syntactic features, such as

c a t , p l u , c a s e , o f the root node of the structure described by the item.

sources c o m p o n e n t is the set of items t h a t describe the i m m e d i a t e sub-structures. Therefore, by tracing the sources of an item, a complete structure can be retrieved.

T h e location of the item in the network deter- mines the syntactic category of the structure.

For example,

[NP

the ice-cream] in the sentence "the ice-cream was eaten" is represented by an item i4 at NP node (see Figure 2):

< [ 0 , 1 ] , ( ( c a t n ) - p l u ( n f o r t a n o r m ) -cm +theta), {ix, 23}>

A n item represents the root node of a structure and contains enough information such that the internal nodes of the structure are irrelevant.

T h e message passing process is initiated by send- ing initial items externally to lexical nodes (e.g., N, P, . . . ) . T h e initial items represent the words in the sentence. T h e a t t r i b u t e values of these items are obtained from the lexicon.

In case of lexical ambiguity, each possibility is represented by an item. For e x a m p l e , suppose the input sentence is

"I

saw a m a n , " then the word "saw" is represented by the following two items sent to nodes N and V : N P 2 respectively:

<[I,I], ((cat n) -plu (nform norm)), {}> <[i,I], ((cat v) (cform fin) -pas

(tense past)), {}>

W h e n a node receives an item, it a t t e m p t s to combine the i t e m with items f r o m other nodes to f o r m new items. T w o items

< [ i l j x ] , A~, S I > and <[i2,j~], A2, $2> can be combined if

1. their surface strings are adjacent to each other: i2 = j x + l .

2. their a t t r i b u t e values A1 and As are unifiable. 3. their sources are disjoint: Sx N $2 = @. T h e result of the c o m b i n a t i o n is a new item:

<[ix~j2], unify(A1, A2), $113 $2>.

T h e new items represent larger parse trees resulted f r o m combining smaller ones. T h e y are then prop- agated further to other nodes.

T h e principles in G B theory are i m p l e m e n t e d as a set of constraints t h a t m u s t be satisfied dur- ing the p r o p a g a t i o n and c o m b i n a t i o n of items. T h e constraints are a t t a c h e d to nodes and links in the network. Different nodes and links m a y have different constraints. T h e items received or created by a node m u s t satisfy the constraints at the node.

T h e constraints a t t a c h e d to the links serve as filters. A link only allows items t h a t satisfy its constraints to pass through. For example, the link from V:NP to NP in Figure 1 has a constraint t h a t any i t e m passing t h r o u g h it m u s t be unifiable with (case acc). T h u s items representing NPs with n o m i n a t i v e case, such as "he", will not be able to pass through the link.

By default, the a t t r i b u t e s of an i t e m percolate with the item as it is sent across a link. However, the links in the network m a y block the percolation of certain attributes.

T h e sentence is successfully parsed if an item is found at IP or C P node whose surface string is the input sentence. A parse tree of the sentence can be retrieved by tracing the sources of the item.

A n e x a m p l e

T h e message passing process for analyzing the sentence

[image:2.612.64.296.92.309.2]

(3)

IP i12 @

(~) ~ b a r ~.

(~

i9/ V~ bar i [ / ~ i ,

• / ] NP. i4. Aux Have Be

NP i4

\

Nbar i3

Det il N i2

The i c e - c r e a m

~ I P ~ _{Ibar i}t l

/ \

I i6 V P il0

i9 Vbar

/.

18

v,

Be i5 V:NP i7

w a s e a t e n

& The message passing process b. The parse tree retrieved 11 : < [ 0 , 0 ] ((cat d)), {}>

12 = < [ 1 , 1 ] ((cat n) -plu (nform norm) + t h e t a ) , { } > 13 = < [ 1 , 1 ] ((cat n) -plu (nform norm) +theta),{i2}>

14 = < [ 0 , 1 ] ((cat n) -plu (nform norm) -cm +theta), {il, i3}>

15 = < [ 2 , 2 ] ((cat i) -plu (per 1 3) (cform fin) +be +ca +govern (tense p a s t ) ) , {}> 16 = < [ 2 , 2 ] ((cat i) -plu (per 1 3) (cform fin) +be +ca +govern (tense p a s t ) ) , {i5}> 17 = < [ 3 , 3 ] ((cat v) +pas), {}>

18 ----<[3,3] ((cat v) +pas +nppg -npbarrier (np-atts NNORM)), {i7}> 19 = < [ 3 , 3 ] ((cat v) +pas +nppg -npbarrier (rip-arts NNORH)), {is}> 110=<[3,3] ((cat v) +pas +nppg -npbarrier (rip-arts NNORM)), {i9}>

111=<[2,3] ((cat ±) +pas +nppg -npbarrier (np-atts NNORH) (per 1 3) (cform fin) +ca +govern (tense p a s t ) ) ) , {i6, ilo}>

i12~-<[0,3], ((cat i) +pas (per 1 3) (cform fin) +ca +govern (tense p a s t ) ) , {i4, ill}>

Figure 2: Parsing the sentence "The ice-cream was eaten"

(1) T h e ice-cream was eaten

is illustrated in Figure 2.a. In order not to convolute the figure, we have only shown the items t h a t are involved in the parse tree of the sentence and their p r o p a g a t i o n paths.

T h e parsing process is described as follows: 1. T h e i t e m il is created by looking up the lexi-

con for the word "the" and is sent to the node Det, which sends a copy of il to NP.

2. T h e item i2 is sent to N, which propagates it to Nbar. T h e a t t r i b u t e values ofi2 are percolated to i3. T h e source c o m p o n e n t eli3 is {i2}. I t e m i3 is then sent to NP node.

3. W h e n NP receives i3 f r o m Nbar, i3 is combined with il f r o m Det to form a new item i4. One of the constraints at NP node is:

if (nform norm) then -cm,

which m e a n s t h a t n o r m a l NPs need to be case- m a r k e d . Therefore, i4 acquires -cm. I t e m i4 is then sent to nodes t h a t have links to NP.

4. T h e word "was" is represented by item i5, which is sent to I b a r via I.

5. T h e word "eaten" can be either the past par- ticiple or the passive voice of "eat". T h e second possibility is represented by the item i7. T h e word belongs to the s u b c a t e g o r y V : N P which takes an NP as the c o m p l e m e n t . There- fore, the i t e m i7 is sent to node V:NP. 6. Since i7 has the a t t r i b u t e +pas (passive voice),

an n p - m o v e m e n t is generated at V:NP. T h e m o v e m e n t is represented by the a t t r i b u t e s nppg, n p b a r r i e r , and n p - a t t s . T h e first two a t t r i b u t e s are used to m a k e sure t h a t the m o v e m e n t is consistent with G B principles. T h e value of n p - a t t s is an a t t r i b u t e vector, which m u s t be unifiable with the antecedent of this n p - m o v e m e n t , l~N0aM is a s h o r t h a n d for

( c a t n) (nform norm)•

[image:3.612.94.533.53.441.2]

(4)

i6 f r o m I to f o r m i11.

8. W h e n IP receives i11, it is combined with i4 from NP to f o r m i12. Since ill contains an np- m o v e m e n t whose n p - a t t s a t t r i b u t e is unifiable with i4, i4 is identified as the antecedent of np- m o v e m e n t . T h e n p - m o v e m e n t a t t r i b u t e s in i12 are cleared.

T h e sources of i12 are i4 f r o m NP and ill from Ibar. Therefore, the top-level of parse tree consists of an NP and I b a r node d o m i n a t e d by IP node. T h e complete parse tree (Figure 2.b) is obtained by re- cursively tracing the origins of i4 and ill f r o m NP and I b a r respectively. T h e trace after "eaten" is in- dicated by the n p - m o v e m e n t attributes of i7, even though the tree does not include a node representing the trace.

3 . M o d e l i n g L i n g u i s t i c s D e v i c e s

GB principles are stated in t e r m s of linguistic con- cepts such as barrier, government and m o v e m e n t , which are relationships between nodes in syntactic structures. Since we interpret the principles with descriptions of the structures, instead of the structures themselves, we m u s t be able to model these notions with the descriptions.

D o m i n a n c e a n d m - c o m m a n d :

D o m i n a n c e and m - c o m m a n d are relationships between nodes in syntactic structures. Since an item represent a node in a syntactic structure, relationships between the nodes can be represented by relationships between items:

d o m i n a n c e : An i t e m d o m i n a t e s its direct and in- direct sources. For example, in Figure 2, i4 d o m i n a t e s il, i2, and iz.

m - c o m m a n d : T h e head daughter of an item representing a m a x i m a l category m - c o m m a n d s non- head daughters of the item and their sources.

B a r r i e r

C h o m s k y (1986) proposed the notion of barrier to unify the t r e a t m e n t of government and subjacency. In C h o m s k y ' s proposal, barrierhood is a p r o p e r t y of m a x i m a l nodes (nodes representing m a x i m a l categories). However, not every m a x i m a l node is a barrier. T h e barrierhood of a node also depends on its context, in t e r m s of L - m a r k i n g and inheritance.

Instead of m a k i n g barrierhood a p r o p e r t y of the nodes in syntactic structures, we define it to be a p r o p e r t y of links in the g r a m m a r network. T h a t

is, certain links in the g r a m m a r network are classified as barriers. In Figure 1, barrier links have a black ink-spot on t h e m . Barrierhood is a p r o p e r t y of these links, independent of the context. T h i s definition of barrier is simpler t h a n C h o m s k y ' s since it is context-free. In our e x p e r i m e n t s so far, this simpler definition has been found to be adequate.

G o v e r n m e n t

Once the notion of barrier has been defined, the gov- e r n m e n t relationship between two nodes in a structure can be defined as follows:

g o v e r n m e n t : A governs B if A is the m i n i m a l governor t h a t m - c o m m a n d s B via a sequence of non-barrier links, where governors are N, V, P, A, and tensed I.

I t e m s representing governors are assigned

+govern attribute. T h i s a t t r i b u t e percolates across head dominance links. If an i t e m has +govern attribute, then non-head sources of the i t e m and their sources are governed by the head of the i t e m if there are paths between t h e m and the i t e m satisfying the conditions:

1. there is no barrier on the path.

2. there is no other i t e m with +govern a t t r i b u t e on the p a t h ( m i n i m a l i t y condition (Chomsky, 1986, p.10)).

M o v e m e n t :3

Movement is a m a j o r source of complexity in principle-based parsing. Directly modeling Move-c~ would obviously generate a large n u m b e r of invalid movements. Fortunately, m o v e m e n t s m u s t also satisfy:

c - c o m m a n d c o n d i t i o n : A m o v e d element m u s t c- c o m m a n d its trace (Radford, 1988, p.564), where A c - c o m m a n d B if A does not domi- nate B b u t the parent of A d o m i n a t e s B. T h e c - c o m m a n d condition implies t h a t a m o v e m e n t consists of a sequence of moves in the reverse direction of dominance links, except the last one. There- fore, we can model a m o v e m e n t with a set of attribute values. If an i t e m contains these a t t r i b u t e values, it means t h a t there is a m o v e m e n t out of the structure represented by the item. For example, in Figure 2.b, i t e m i10 contains m o v e m e n t attributes: nppg, n p b a r r ± e r and n p - a t t s . This indicates t h a t there is an n p - m o v e m e n t out of the VP whose root node is il0.

(5)

T h e m o v e m e n t a t t r i b u t e s are generated at the parent node of the initial trace. For example, V:NP is a node representing n o r m a l transitive verbs which take an NP as c o m p l e m e n t . W h e n V:NP receives an i t e m representing the passive sense of the word

eaten,

V : N P creates another i t e m

< [i,i] , ((cat v) -npbarrier +nppg (np-atts (cat n))), { } >

This i t e m will not be combined with any item f r o m NP node because the NP c o m p l e m e n t is assumed to be an np-trace. T h e i t e m is then sent to nodes d o m i n a t i n g V:NP. As the i t e m p r o p a g a t e s further, the a t t r i b u t e s is carried with it, simulating the effect of m o v e m e n t . T h e n p - m o v e m e n t land at I P node when the I P node combines an i t e m f r o m subject NP and a n o t h e r i t e m f r o m I b a r with n p - m o v e m e n t attributes. A precondition on the landing is t h a t the a t t r i b u t e s of the former can be unified with the value of n p - a t t s of the latter. W h - m o v e m e n t s are dealt with by a t t r i b u t e s whpg, whbarrier, w h - a t t s . This t r e a t m e n t of m o v e m e n t requires t h a t the parent node of a initial trace be able to determine the type of m o v e m e n t . W h e n a m o v e m e n t is generated, the type of the m o v e m e n t depends on the ca (case assigner) a t t r i b u t e of the item:

c a +

m o v e m e n t e x a m p l e s

wh active V, P, finite IP

np A, passive V, non-finite IP

For example, when IP node receives an i t e m f r o m Ibar, IP a t t e m p t s to combine it with another i t e m from subject NP. If the subject is not found, then the IP node generates a m o v e m e n t . If the i t e m represent a finite clause, then it has attributes +ca (cform f i n ) and the m o v e m e n t is of type wh. Oth- erwise, the m o v e m e n t is of type np.

4. I n t e r p r e t a t i o n o f P r i n c i p l e s

We now describe how the principles of GB theory are i m p l e m e n t e d .

~

- b a r T h e o r y : ~N~

• Every syntactic category is a projection of a ]

lexical head. /

• T h e r e two levels of projection of lexical I heads. Only the bar-2 projections can b e )

c o m p l e m e n t s and adjuncts, j /

T h e first condition requires t h a t every non-lexical category have a head. This is guaranteed by a constraint in i t e m combination: one of the sources of the two items being combined m u s t be from the head daughter.

T h e second condition is i m p l e m e n t e d by the structure of the g r a m m a r network• T h e combina- tions of items represent constructions of larger parse trees f r o m smaller ones. Since the structure of the g r a m m a r network satisfies the constraint, the parse trees constructed by i t e m c o m b i n a t i o n also satisfy the X-bar theory.

C a s e F i l t e r : Every lexical NP m u s t be case-~ arked, where A case-marks B iff A is a case as- I ~ i g n e r and A governs B ( H a e g e m a n , 1991, p.156)fl

T h e case filter is i m p l e m e n t e d as follows: 1. Case assigners (P, active V, tensed I) have +ca

a t t r i b u t e . Governors t h a t are not case assigners (N, A, passive V) have - c a attribute• 2. Every i t e m at NP node is assigned an at-

t r i b u t e value -cm, which m e a n s t h a t the item needs to be case-marked. T h e -cm a t t r i b u t e then p r o p a g a t e s with the item. T h i s i t e m is said to be the origin of the -era a t t r i b u t e . 3. Barrier links do not allow any i t e m with -cm

to pass through, because, once the i t e m goes beyond the barrier, the origin o f - c m will not be governed, let alone case-marked.

4. Since each node has at m o s t one governor, if' the governor is not a case assigner, the node will not be case-marked. Therefore, a case- filter violation is detected if +govern -era - c a co-occur in an item.

5. If +govern +ca -cm co-occur in an item, then the head daughter of the i t e m governs and case-marks the origin of -cm. T h e case-filter condition on the origin of -era is met. T h e -era a t t r i b u t e is cleared.

For example, consider the following sentences: (2) a. I believe J o h n to have left.

b. *It was believed J o h n to have left. c. I would hope for J o h n to leave• d. *I would hope J o h n to leave.

T h e word "believe" belongs to a s u b c a t e g o r y of verb (V:IP) t h a t takes an IP as the c o m p l e m e n t . Since there is no barrier between V : I P and the subject of IP, words like "believe" can govern into the IP c o m p l e m e n t and case-mark its subject (known as

(6)

*g ... V : I P ~.. -pas / ~ ' I P

believe / ~ \ NP -crn Ibar John

to have left

a. Case-filter satisfied at V:IP

~

:CP ~ C P . ~

+govern Cbar h o p e +ca ~'~/ ~ ;

f o r _{NP -cm Ibar}

John

to leave

c. Case-filter satisfied at Cbar, --cm cleared

+govern V:IP ~..-cm

:;as

_/

/ /

-,<

_IP be,ieved ~ \

NP -era Ibalr John

to have left

b. Case-filter v i o l a t i o n at V:IP

v : c P ~

/

hope

NP -cm IbM

John

to leave

d. The attribute --cm is blocked by a barrier.

Figure 3: Case Filter E x a m p l e s

sive "believed" is not a case-assigner. T h e case-filter violation is detected at V:IP node (Figure 3.b).

T h e word "hope" takes a CP complement. It does not govern the subject of CP because there is a barrier between them. T h e subject of an infini- tive CP can only be governed by c o m p l e m e n t "for" (Figure 3.c and 3.d).

c r i t e r i o n : Every chain m u s t receive and one~ ly one 0-role, where a chain consists of an NP I d the traces (if any) coindexed with it (van I

emsdijk and Williams, 1986, p.245). /

We first consider chains consisting of one element. T h e 0-criterion is i m p l e m e n t e d as the following constraints:

1. An item at NP node is assigned + t h e t a if its nform a t t r i b u t e is norm. Otherwise, if the value of nform is t h e r e or i t , then the i t e m is assigned - t h e t a .

2. Lexical nodes assign + t h e t a or - t h e t a to items depending on whether they are 0-assigners (V, A, P) or not (N, C).

3. Verbs and adjectives also have a s u b j - t h e t a

attribute.

value O-role* examples

+ s u b j - t h e t a yes "take", "sleep" - s u b j - t h e t a no "seem", passive verbs *assigning O-role to subject

This a t t r i b u t e percolates with the item from V to IP. T h e IP node then check the value of

t h e t a and s u b j - t h e t a to m a k e sure t h a t tile verb assigns a 0-role to the subject if it requires one, and vice versa.

Figure 4 shows an e x a m p l e of 0-criterion in action when parsing:

(3) *It loves Mary

-theta lP ~ . +subj-theta -em /~// % +govern +ca

NP Ibar

It . . . " ...

+theta "" V. ~ +theta

+govern Iove Nl:* Mary

Figure 4: 0-criterion in action

T h e subject NP, "it", has a t t r i b u t e - t h e t a , which is percolated to the IP node. T h e verb "love" has attributes + t h e t a + s u b j - t h e t a . T h e NP, " M a r y " , has a t t r i b u t e + t h e t a , W h e n the items representing "love" and "Mary" are combined. Their t h e t a attribute are unifiable, thus satisfying the 0-criterion.

[image:6.612.149.463.54.317.2] [image:6.612.315.506.406.552.2]

(7)

T h e above constraints g u a r a n t e e t h a t chains with only one element satisfy 0-criterion. We now consider chains with more t h a n one element. T h e base-position of a w h - m o v e m e n t is case-marked and assigned a 0-role. T h e base position of an np- m o v e m e n t is assigned a 0-role, b u t not case-marked. To ensure t h a t the m o v e m e n t chains satisfy 0- criterion we need only to m a k e sure t h a t the items representing the parents of i n t e r m e d i a t e traces and landing sites of the m o v e m e n t s satisfy these conditions:

None of + c a , + t h e t a and + s u b j - t h e t a is

present in the items representing the parent of i n t e r m e d i a t e traces of (wh- and np-) move- m e n t s as well as the landing sites of wh- m o v e m e n t s , thus these positions are not case- m a r k e d and assigned a O-role.

Both +ca and + s u b j - t h e t a a r e present in the items representing parents of landing sites of n p - m o v e m e n t s .

S u b j a c e n c y : M o v e m e n t cannot cross m o r e t h a n J ne barrier ( H a e g e m a n , 1991, p.494).

A w h - m o v e m e n t carries a w h b a r r i e r attribute. T h e value - w h b a r r i e r m e a n s t h a t the m o v e m e n t has not crossed any barrier and +whbarrier m e a n s t h a t the m o v e m e n t has already crossed one barrier. Barrier links allow items with - w h b a r r i e r to pass through, but change the value to +whbarrier. I t e m s with +whbarrier are blocked by barrier links. W h e n a w h - m o v e m e n t leaves an i n t e r m e d i a t e trace at a position, the corresponding w h b a r r i e r becomes -.

T h e subjacency of n p - m o v e m e n t s is similarly bandied with a n p b a r r i e r a t t r i b u t e .

E r m p t y C a t e g o r y P r i n c i p l e ( E C P ) : A traceJ its parent m u s t be properly governed.

In literature, proper g o v e r n m e n t is not, as the t e r m suggests, s u b s u m e d by government. For example, in

(4) W h o do you think [cP e' [IP e came]]

the tensed I in liP e came] governs but does not properly govern the trace e. On the other hand, # properly governs but does not govern e ( H a e g e m a n ,

1901, p.4 6).

Here, we define proper government to be a subclass of government:

P r o p e r g o v e r n m e n t : A properly governs B iff A governs B and A is a 0-role assigner (A do not have to assign 0-role to B).

Therefore, if an i t e m have b o t h +govern and one of + t h e t a o r + s u b j - t h e t a , then the head of the i t e m properly governs the non-head source items and their sources t h a t are reachable via a sequence of non-barrier links. This definition unifies the notions of government and proper government. In (4), e is properly governed by tensed I, e I is properly governed by "think".

This definition w o n ' t be able to account for difference between (4) and (5) ( T h a t - T r a c e Effect, ( H a e g e m a n , 1991, p.456)):

(5) * W h o do you think [CP e' t h a t [IP e came]] However, T h a t - T r a c e Effect can be explained by a separate principle.

T h e proper g o v e r n m e n t of wh-traces are handled by an a t t r i b u t e whpg ( n p - m o v e m e n t s are similarly dealt with by an nppg a t t r i b u t e ) :

Value Meaning

-whpg the m o s t recent trace has yet to be properly governed.

+~hpg the m o s t recent trace has already been properly governed.

1. If an i t e m has the a t t r i b u t e s -whpg, - t h e t a , +govern, then the i t e m is an E C P violation, because the governor of the trace is not a 0- role assigner. If an i t e m has a t t r i b u t e s -whpg,

+ t h e t a , + g o v e r n , then the trace is properly governed. T h e value of whpg is changed to +. 2. Whenever a w h - m o v e m e n t leaves an interme-

diate trace, whpg becomes -.

3. Barrier links block items with -~hpg.

N:CP

-ca CP

claim

/

CSpec Cbar

that Reagan met e

Figure 5: An e x a m p l e of E C P violation

For example, the word claim takes a C P complement. In the sentence:

(6) *Whol did you m a k e the claim e~ t h a t R e a g a n m e t ei

[image:7.612.334.527.465.598.2]

(8)

representing claim, their unification has a t t r i b u t e s (+govern - t h e t a -whpg), which is an E C P violation. T h e i t e m is recognized as invalid and discarded.

P R O T h e o r e m : P R O m u s t be ungoverned 1 H a e g e m a n , 1991, p.263).

W h e n the IP node receives an i t e m f r o m I b a r with cform not being f i n , the node makes a copy of the i t e m and assign +pro and -ppro to the copy and then send it further w i t h o u t combining it with any i t e m f r o m (subject) NP node. T h e a t t r i b u t e +pro represents the hypothesis t h a t the subject of the clause is P R O . T h e m e a n i n g of -ppro is t h a t the subject P R O has not yet been protected (from being governed).

W h e n an i t e m containing -ppro passes through a barrier link, -ppro becomes +ppro which means t h a t the P R O subject has now been protected. A P R O - theorem violation is detected if +govern and -ppro co-occur in an item.

5. O b j e c t e d - o r i e n t e d I m p l e m e n t a t i o n

T h e parser has been i m p l e m e n t e d in C + + , an object-oriented extension of C. T h e object-oriented p a r a d i g m makes the relationships between nodes and links in the g r a m m a r network and their soft- ware counterparts explicit and direct. C o m m u n i c a - tion via message passing is reflected in the message passing m e t a p h o r used in object-oriented languages.

I \

1 , 1

, ,_,,_1 \

\

- - - - ~ " = (~) I I

instance of subclass of instance class

Figure 6: T h e class hierarchy for nodes

Nodes and links are i m p l e m e n t e d as objects. Figure 6 shows the class hierarchy for nodes. T h e constraints t h a t i m p l e m e n t the principles are distributed over the nodes and links in the network. T h e i m p l e m e n t a t i o n of the constraints is m o d u l a r because they are defined in class definitions and all the instances of the class and its subclasses inherit

these constraints. T h e object-oriented p a r a d i g m allows the subclasses to m o d i f y the constraints.

T h e i m p l e m e n t a t i o n of the parser has been tested with e x a m p l e sentences from C h a p t e r s 4 - 10, 15-18 of (van Riemsdijk and Williams, 1986). T h e chapters left out are m o s t l y a b o u t logical f o r m and Binding Theory, which have not yet been implemented in the parser. T h e average parsing t i m e for sentences with 5 to 20 words is below half of a second on a S P A R C s t a t i o n ELC.

6. D i s c u s s i o n a n d R e l a t e d W o r k

C o m p l e x i t y o f u n i f i c a t i o n

T h e a t t r i b u t e vectors used here are similar to those in unification based g r a m m a r s / p a r s e r s . An impor- t a n t difference, however, is t h a t the a t t r i b u t e vectors used here satisfy the unil closure condition (Barton, Jr. et al., 1987, p.257). T h a t is, non- atomic a t t r i b u t e values are vectors t h a t consist only of a t o m i c a t t r i b u t e values. For example:

(7)

a. ((cat v) +pas +whpg (wh-atts (cat p)) b. * ((cat v) +pas +ghpg (wh-atts (cat v)

(np-att (cat n))))

(7a) satisfies the unit closure condition, whereas (7b) does not, because w h - a t t s in (7b) contains a n o n - a t o m i c a t t r i b u t e n p - a t t s . (Barton, Jr. et al., 1987) argued t h a t the unification of recursive at- t r i b u t e structures is a m a j o r source of c o m p u t a - tional complexity. On the other hand, let a be the n u m b e r of a t o m i c a t t r i b u t e s , n be the n u m b e r of n o n - a t o m i c attributes. T h e t i m e it takes to unify two a t t r i b u t e vectors is a + n a if they satisfy the unit closure condition. Since b o t h n and a can be regarded as constants, the unification takes only constant a m o u n t of time. In our current implementation, n = 2, a = 59.

A t t r i b u t e g r a m m a r i n t e r p r e t a t i o n

[image:8.612.65.298.445.593.2]

(9)

encoded in the attribution rules, and the phrase structure grammar is replaced by X-bar theory and Move-~. Therefore, a large number of structures will be constructed and evaluated by the attribution rules, thus leading to a serious overgeneration problem. For this reason, Correa pointed out that the attribute grammar interpretation should be used as a specification of an implementation, rather than an implementation itself.

A c t o r - b a s e d GB parsing

Abney and Cole (1986) presented a GB parser that uses actors (Agha, 1986). Actors are similar to objects in having internal states and responding to messages. In our model, each syntactic category is represented by an object. In (Abney and Cole, 1986), each instance of a category is represented by an actor. The actors build structures by creat- ing other actors and their relationships according to 0-assignment, predication, and functional-selection. Other principles are then used to filter out illicit structures, such as subjacency and case-filter. This generate-and-test nature of the algorithm makes it suscetible to the overgeneration problem.

7. C o n c l u s i o n

We have presented an efficient message passing algorithm for principle-based parsing, where

* overgeneration is avoided by interpreting principles in terms of descriptions of structures;

* constraint checking involves only a constant- sized attribute vector;

• principles are checked in different orders at different places so that stricter principles are ap- plied earlier.

We have also proposed simplifications of GB theory with regard to harrier and proper government, which have been found to be adequate in our exper- iments so far.

R e f e r e n c e s

Abney, S. and Cole, J. (1986). A government- binding parser. In Proceedings of NELS.

Agha, G. A. (1986). Actors: a model of concurrent computation in distributed system. MIT Press, Cambridge, MA.

Barton, Jr., G. E., Berwick, R. C., and Ristad, E. S. (1987). Computational Complexity and Natural

Language. The MIT Press, Cambridge, Mas- sachusetts.

Berwick, R. C. (1991). Principles of principle-based parsing. In Berwick, B. C., Abney, S. P., and Tenny, C., editors, Principle-Based Parsing: Computation and Psycholinguistics, pages 1- 38. Kluwer Academic Publishers.

Chomsky, N. (1981). Lectures on Government and Binding. Foris Publications, Cinnaminson, USA.

Chomsky, N. (1986). Barriers. Linguistic Inquiry Monographs. The MIT Press, Cambridge, MA.

Correa, N. (1991). Empty categories, chains, and parsing. In Berwick, B. C., Abney, S. P., and Tenny, C., editors, Principle-Based Parsing: Computation and Psycholinguislics, pages 83- 121. Kluwer Academic Publishers.

Dorr, B. J. (1991). Principle-based parsing for ma- chine translation. In Berwick, B. C., Abney, S. P., and Tenny, C., editors, Principle-Based Parsing: Computation and Psycholinguistics,

pages 153-184. Kluwer Academic Publishers.

Fong, S. (1991). The computational implementation of principle-based parsers. In Berwick, B. C., Abney, S. P., and Tenny, C., editors, Principle- Based Parsing: Computation and Psycholin- guistics, pages 65-82. Kluwer Academic Pub- lishers.

Haegeman, L. (1991). Introduction to Government and Binding Theory. Basil Blackwell Ltd. Johnson, M. (1991). Deductive parsing: The use

of knowledge of language. In Berwick, B. C., Abney, S. P., and Tenny, C., editors, Principle- Based Parsing: Computation and Psycholin- guistics, pages 39-64. Kluwer Academic Pub- lishers.

Lin, D. and Goebel, I%. (1993). Contex-free grammar parsing by message passing. In Proceedings of PACLING-93, Vancouver, BC.

Radford, A. (1988). Transformational Grammar.

Cambridge Textbooks in Linguistics. Cam- bridge University Press, Cambridge, England.