Localising Barriers Theory

(1)

L o c a l i s i n g

B a r r i e r s

T h e o r y

Michael Schiehlen*

Institute for Computational Linguistics, University of Stuttgart, Azenbergstr. 12, W-7000 S t u t t g a r t 1

1 I n t r o d u c t i o n

Government-Binding Parsing has become attractive in the last few years. A variety of systems have been designed in view of a correspondence as direct as possible with linguistic theory ([Johnson, 1989], [Pollard and Sag, 1991], [Kroch, 1989]). These approaches can be classified by their m e t h o d of handling global constraints. Global constraints are syntactic in na- ture: T h e y cover more than one projection. In contrast, local constraints can be checked inside a projection and, thus, lend themselves to a treatment in the lexicon. Conditions on features have been the subject of intensive study and viable logics have been proposed for them (see e.g. the CUF formalism [Dhrre and Eisele, 1991], [Dorna, 1992]). In this paper, we assume such a unification-based mechanism to take care of local conditions and focus on global constraints. One class of approaches to principle- based parsing (see [Pollard and Sag, 1991] for HPSG, [Kroch, 1989] for TAG) attempts to reduce global conditions to local constraints and thus to make them accessible to treatment in a feature framework. This strategy has been pursued only at the expense of sacrificing the precise formulation of the theory and the definitory power stemming from it. T h e re- sult has been a shift from the structural perspec- tive assumed by GB theory to the object-oriented view taken by unification formalisms. T h e other class of approaches ([Johnson, 1989]) has allowed the full range of possible restrictions on trees and has in- curred potential undecidability for its parsers. We take up a middle stance on t h e m a t t e r in that we propose a separate logic for global constraints and posit that global constraints only work on ancestor lines (see 7).

We assume "movement" to be encoded by the kind of gap-threading technique familiar from HPSG, LFG. In order to integrate global constraints a "state" (information that serves to express barrier configura- tions in the part of the tree which has already been built up) is associated with each "chain" (information about a moved element). Following H PSG, LFG, we have in mind a rule-based parser. Thus, states are manipulated when rules are chained. We need a calculus that is able to derive global constraints working on a local basis. We begin by developing this calculus hand in hand with an analysis of Chomsky's frame-

*I wish to thank Robin Cooper, Mark Johnson and Esther KSnig-Baumer for comments on earlier versions of this paper.

work. We then go on to show that many approaches to barriers theory and a variety of diverse phenom- ena can be moulded into our format and conclude with an indication of ways to use the system on-line during parsing.

2 D e p e n d e n c i e s B e t w e e n N o d e s

We take a tree T to be a structure (N,>), where N is a set of nodes and > stands for dominance, a

binary relation on N. We say that nodes a and b

are connected iff a > b V b > a V a = b. We define the relation of immediate dominance ~- between two nodes a and b as a > b A ~ 3 c : a > c A c > b. Dominance is an irreflexive partial order relation satisfying the axioms (1--3). Ancestors of a node are connected (1), there exists a (single) root (2), dominance reduces to immediate dominance (3). Variables are universally quantified unless specified otherwise.

(1) z > z A y > z --* x connected with y

(2) ~xVy : x > y

(3) x > z ~ 3y : x ~ y A y > z

Chomsky [1986, 9,30] discusses several definitions for constraints on unbounded dependencies.

(13) a c-commands/~ iff a does not dominate/~ [and/~ does not dominate or equal a] and every 7 that dominates a dominates/~.

Where 7 is restricted to maximal projections we will say that a m-commands/?.

(18) a governs/~ iff a m-commands/~ and there is no 7, 7 a harrier for/~/, such that 7 excludes a.

(59)/~ is n-subjacent to a iff there are fewer than n + l barriers for/~ t h a t exclude a.

(2)

(4) aRb ~-* a, b unconnected ^

I{c I B(c,b) ^ - , e > a } l < n

Balanced relations like government require a definition in terms of two BC-definable relations: Rl(a, b) and R2(b, a).

(5) B(c,b) ~ c > b

We can show several properties of BC-definable relations. The nodes are unconnected.

(6) aRb ---* a, b unconnected

In order to investigate BC-definable relations it suf- fices to investigate the ancestor lines of their second argument b (that is {y J y >_ b}).

(7)

x~-y A z > a l A ",y>__al A x > a 2 A -w>_a~

A y > b --* (alRb ~ a2Rb)

(7) gives rise to equivalence classes for the first argument of R. For a particular pair (a,b) we can always find a y as defined in (8).

(s) a •

^ x > a ^ y>a

^ y > b

Definable relations are never empty. Barriers are pre- served in the upward direction of the ancestor line:

(9) [y]Ry

(10) x > y ^

[xlP

(10)

is less innocent than it looks. I give a revealing binding example from Kamp and Reyle [1993].

If [cP=~ [cP=y hei sees Mary ] and she smiles] John/ is happy.

*[cP=~ [vP=~ Hei

sees Mary ] and J o h n / i s happy].

3 B a r r i e r D e f i n i t i o n s

3.1 Adjunction

Adjunction rules raise a problem for algebraic in- vestigations of barriers theory (e.g. [Kracht, 1992]): They insert material into a tree but do not create new projections. Thus, adjunction rules imply a distinction between projections and segment nodes that correspond to graph-theoretical nodes. We shall use Greek letters to refer to projection nodes and Latin letters for segment nodes. The only way to create projections covering more than one segment is through adjunction. Since adjunction rules have equivalent mother and daughter nodes, projections are coherent in the sense that:

Va ~ fl Vbi, b2 • f~ : a > bi --* a > b2

Chomsky [1986] defines projection dominance so that dominates ~ only if every segment of a dominates (every segment of) f/. In case this definition is not empty, (1) guarantees a unique minimal segment a,~in of a. Thus, we can rephrase Chomsky's definition in terms of segment nodes and get that a dominates fl just in case the minimal segment of a dominates some segment of 3.

(11) dominate(a,/3) *-+ a e a A b • / 3 A minimal segment(a) A a > b

Likewise, Chomsky's definition of exclusion, viz that a excludes j3 if no segment of a dominates (any seg- ment of) /3, can be transformed to the equivalent condition that a excludes/3 if the maximal segment of a does not dominate a segment of 3.

(12) exclude(aft) ~ a E a A b e fl A maximal segment(a) A --a > b

This way, we reduce projection dominance to segment dominance. In (13--15), conditions of segment minimality or maximality are included where they are appropriate by (11) and (12).

3.2 C h o m s k y ' s T h e o r y

Chomsky [1986, 14] gives the following two core definitions for barriers. We are not concerned about the exact formulation of L-marking (for a definition see [Chomsky, 1986, 24]).

(25) 7 is a blocking category for fl iff 7 is not L-marked and 7 dominates/3. (26) 7 is a barrier for ~ iff (a) or (b):

a. 7 immediately dominates 6, a blocking category for 3;

b. 7 is a blocking category for 3, 7 ~ IP.

We understand 7 in (25) and (26) to be a maximal projection, and we understand "immediately dominate" in (26a) to be a relation between maximal projections (so that 7 immediately dominates 5 in this sense even if a nonmaximal projection intervenes).

Formulation of these definitions in first order logic yields (13--15). In order to obtain an open-ended definition scheme the equivalence of the above definitions is held implicit: Barrier concepts are true iff they comply with a manifest definition (see also 22 and 23).

(3)

-, L-marked(c) A minimal segment(c) A

c > b .

(14) barrier(c,b)

maximal projection(c) A minimal segment(c) A 3d : blocking category(d,b) A

c > d A

V e : c > e > d - - +

-, ( maximal projection(e) A minimal segment(e) ). (15) barrier(c,b) ¢=

blocking category(c,b) A -,IP(c).

We regard unary predicates as local conditions (L) and binary predicates as global concepts (B for "barrier concept"). Abstracting over the particular predicates involved we end up with the following definition schemes (16 for 13 and 15, 17 for 14).

(16)

B(c, b) ¢= L(e) A c > b . (17) S(e, b)

L(e) A 3d : B(d, b) A

e > d A

Ve : e > e > d ~ ",L(e).

We call the existential subformula of (17) an inher- itance clause I. The only global conditions in our system are inheritance clauses and c > b, a condition that always holds for barrier concepts (see 5). We will discuss in detail a way to derive inheritance clauses on a rule to rule basis. For the sake of conciseness we adopt the following abbreviation for inheritance clauses.

35 : B(d, b) A e > d A Ve : c > e > d --* -,L(e) ,: y

I(c,b,B,L)

3.3 N e g a t i v e I n h e r i t a n c e C l a u s e s

It has interesting repercussions to incorporate a scheme with a negated inheritance clause, viz. (18).

(18) B(e, b)

L(c) A

c > b A

-,3d : B(d, b) A

c > d A

Ve : c > e > d - * -,L(e).

For illustration we discuss several applications for negative inheritance clauses.

Chomsky [1986, 37] talks about IPs as inherent barriers, this effect being restricted to the most deeply embedded tensed IP. To capture this concept we once again need a negative inheritance clause: An IP is most deeply embedded if it does not dominate any other IP.

(20) barrier(Tfl) ¢= tensed IP(7) A

7>8A

--,36 : IP(6,8) A 7>6.

IP(7,3) ~ IP(7) A 7>8.

A feature of negative inheritance clauses that is de- sirable in m a n y cases is t h a t they allow to cancel barriers higher up in the tree. They can be used to circumvent (24). Classical GB theory has had to re- sort to a variety of tricks to account for discontinuous domains. A case in point is the coherent infinitive construction found in German and Dutch ~. A stan- dard account is to reanalyse 0-structure into another structure that lacks the annoying barrier-generating nodes. Different submodules of the theory will work on different structures. Consider the following example.

dab [cP [tP PRO [vp

[NP

der Wagen] zu

reparieren]]] [v versucht] wurde

In this example V governs NP but not "PRO" even though "PRO" intervenes between V and NP. CP might be called a phantom barrier. Generally, a phantom (like CP, IP above) is a barrier just in case it does not dominate a non-phantom (VP above). Thus CP shields "PRO" but remains open for government of NP. This state of affairs can be caught in the present framework by a negative inheritance clause.

(21) barrier(7,#) ¢= phantom(7) A

7>#A

"~q# : nonphantom(~,3) A 7 > 8 .

nonphantom(7,8 ) ¢= nonphantom(7) A 7 > 8.

Similar cases arise with negation. Again, the litera- ture adopts different lines of argument to account for the phenomenon. Kamp and Reyle [1993] handle the binding case below with a rule of double negation elimination, an operation that deletes structure.

*Either he~ owns a Porsche or John/ hides it.

Either h e / d o e s not own a Porsche or John/ hides it.

(4)

The examples below are drawn from Cinque [1990, 83]. He uses a superscription convention to annotate the scope of the negation and assumes an LF amalgamation process triggered by coindexing of this sort. CP is no barrier anymore for LF-amalgamated elements since they become wh-movable. We might model amalgamation with the "nonphantom" clause of (21). Then, this clause would have to hold true for inherently wh-movable elements (bare quantifiers in Cinque's analysis) as well.

*Molti amici, [cP ha invitato t, che io sap- pin.

Molti amici, [cP

[NegP

n o n ha invitato t,

che io sappia.

3 . 4 P r o p e r t i e s o f t h e D e f i n i t i o n S c h e m e s

In this paragraph we further investigate properties of the three definition schemes we are dealing with. We summarize scheme (16) in (22). def is a variable ranging over the given definitions.

(22) B(c,b) ~ Bdef: Ldef(c ) A c>b

We can collapse all definitions de/into a single definition with local condition K(c) ~

Vd4Ld4(c).

In order to summarize the schemes (16--17) we intro-

duce vectors of definitions def" of length n and corre- sponding sequences of nodes Z of length n + 1. xl is fixed to c and Xn+l to b.

(23) B(c,b) *-* B def, Z : V i • { 1 , . . . , n } :

Ldef(i)(xi) A xi > xi+l.

For definitions conforming to type (16--17) we can show the following property: If we have found a son y violating the relation R all descendants b of the father x will be inaccessible to R.

(24) x ~- y A aRx A ~ a R y A x > b --* --,aRb

In a full-fledged definition scheme where (16--18) are available (24) ceases to hold. In the example dis- cussed above a does not govern y but does govern b.

a [cP=, [vP=y b

In pre-Barriers GB theory and most current computational approaches only inherent barriers are allowed (scheme 16) and the violating number of barriers in axiom (4) is set to null. Note that under these provisos, barriers theory shrinks to command theory:

(4') aRb ~ a, b unconnected A Vc : K ( c ) A c>b---*c>a

T h e following constraint holds in this configuration: A barrier as in (24) is not affected by the triggering first argument.

(25) x ~-y A Ba : [aRx A --,aRy] A bRx ---. --,bRy

Chomsky [1986, 11] discusses (25) at some length. In his example (see below) "decide" =a does not govern " P R O " , but "e" =b would. He shows t h a t if either of the mentioned requirements (n=O and intrinsic barriers) is not met the theorem is refuted.

(21) John decided [cP e [xP P R O to [ r e see the movie ]]]

If (16--18) are given then we can show the following theorem: Brothers are equivalent when occurring as a second argument of a BC-definable relation.

(26) a, bl unconnected A a, b2 unconnected A

by N- bl A by N- b2 ~ (aP0bl ~ aRb2)

4 Localising t h e Global C o n s t r a i n t s

The next step is to localise the definitions ( 1 6 - - 18). For ease of reference we repeat the definition schemes.

(27) B(c,b) ~ 3def: [Ll(C) A c>b] V

ILl(c) A I(c,b,B, L2)] V [Ll(c) A c > b A -,I(c,b,B, L2)]

We only take into account nodes c t h a t separate a from b in the sense t h a t they sit on the ancestor line of b but not on that of a (see also the restrictions of 4 and 5). T h e o r e m (28) specifies a connection between the inheritance clauses valid on a father node z and those valid on the son y. Recall that inheritance clauses are the only global conditions we consider.

(28) x N y A y>_b A "-,y>_a ---*

(B(y, b) V (I(y, b, B, L) A -~L(y)) *-* I(x, b, B, L))

(5)

4.1 U p w a r d S t a t e s

Upward states supply information about barrier nodes encountered on the ancestor line below. T h e y are constructed when the second argument b of a relation R has been found and the tree is being searched for the first argument a. Formally, upward states are sets (standing for conjunctions) associated with some node c and some dependency coming from b.

{B,L) e UState(c,b) ~ I(c,b,B,L)

Any inheritance clause that can be derived at c on the basis of the lower upward state and the rule schemes (27--28) is included in c's upward state. If a clause is not in the state, it cannot be inferred by (16--18). Consequently, the negation of a missing clause must hold. We assume a counter for c and b to be increased and checked as defined by the theory (computing the number n of passed barriers, see 4).

IncreaseCounter(c,b) ~ B(c,b)

We use the upward state to break off search as soon as we can infer from the theory that an element a cannot possibly be found in the rest of the tree. Theorem (29) stands to express that as soon as we have found a node y violating the definitions upward search becomes obsolete.

(29)

4.2 D o w n w a r d States

Downward states encode information about barrier nodes encountered on the ancestor line above. They are computed when the second argument b of a relation tt is being expected because a first argument a has been discovered. Formally, downward states are first order formulae associated with some node c, some ancestor node ct of c, and some dependency leading to b. Atomic formulae of DState(c,cl,b) are inheritance clauses I with respect to c and b.

formula E DState(c,ct,b)

formula(c,b) ~ IncreaseCounter(cl ,b)

T h e rule schemes (27--28) supply all sufficient and necessary conditions for transfer of inheritance clauses between nodes. Accordingly an atomic formula in the upper downward state can be transformed into a formula holding for the lower node c. False formulae are discarded, while true formulae in- crease the counter.

We use downward states to restrict the search space. By (24) we can sometimes infer that search into a subtree will be pointless. Negative inheritance

clauses, however, can only be checked when a can- didate for b has been encountered. When the parser descends paths while searching, it always assumes that the current path will dominate b. For upward states, in contrast, the ancestor line of b is fixed. Only downward states scan trees. (26) shows that a state will not change for brother nodes. So we only have to store one downward state per rule (e.g. under its mother node).

4.3 Example

Consider the chain of "how" in the following example

how do [zp. you

[vP, t [vP

remember [cp t / * w h y lip Bill t behaved t ]]]]]

In a left-to-right top-down parse, the first barrier to be encountered would be IP* if it dominated either a blocking category (BC) or no other tensed IP. VP* is no BC or barrier since it does not dominate the intermediate trace (it is not the minimal segment of the VP node). CP is L-marked and hence a barrier only if it dominates a BC. If "why" excludes a trace in SpecCP, the BC IP occurs between CP and the next trace. Due to the d-role of "how", government is violated leading to an ungrammatical sentence. If an intermediate trace is allowed, a new chain is started and no BC occurs. IP refutes the hypothesis that IP* is the deepest embedded tensed IP, and it turns out to be this IP as soon as the variable is found. So only one subjacency barrier occurs: The sentence is grammatical.

5 C o n c l u s i o n

(6)

A

P r o o f s

Proof of (6) is trivial.

The theorem (7) is symmetric for al and a2. Suppose

alRb A "~a2Rb. a2 and b are unconnected. So there exist kl barriers not dominating al (kl < n) and k2 barriers not dominating a2 (k2 > n). Suppose c is a barrier not dominating a2 but dominating al (there are at least k 2 - k l > 1 such barriers), c > b and y > b ,

hence c and y are connected. But y>_c entails y > a l .

I f c > y then either x > c > y or c > x . But c > x implies c > a2.

To prove (9) note that all barriers for y dominate y by (5). Hence they also dominate a e [y].

We now turn to (10). Take al E [x] and a2 E [y]. a2 and y are not connected. We show that if --c > a2 and c > b then -~c > al. Assume c > b and c > ax. Then x and c are connected both dominating b. We know that -~x _> c > ax. Hence c > x > y. Suppose

y! is y's father. Then c > x >_ y! ~.- y and equally

c > x > y! ~- a2. We obtain that {c I B(c,b) A -~c> el} D {c I B(c,b) ^ -~c>a2}. Hence -~[x]Rb.

We prove (24). Suppose c is a barrier for x. Then by (23) there is a sequence of nodes xl = c and

xn > xn+l = x. But xn > x > b, so c is a barrier for b as well. a and y are unconnected. Suppose c is a barrier for y but not x. Then xl = c and x n > x ~ + l = y. xn

and x are connected both dominating y. We know that -~x > xn > y and ~xn > x else c would be a barrier for x. Hence Xn = x and we get x , = x > b. There are at least as many barriers for b as there are for y, so -~aRb.

To prove (25) we adopt the argumentation of the foregoing proof and infer that x is a barrier for y. bILz shows that b, x are unconnected, hence -~x > b and -~bRy.

(26) follows if we prove B(c, bl) ~ B(c, b2) by induction. The theorem is symmetric. Assume a c such that B(c, bl). Then either scheme (16) holds: L(c) A c>bx hence c>b2. Or (17) and L(c) A 3d : B(d, bl) A c > d A Ve : c > e > d ---* ~L(e). By induction B(d, b2) as well. For the negative scheme (18) we use symmetry to extend the implication I(c, bx, B, L) ---, I(c, b2, B, L) to an equivalence.

For (28) we give a proof by cases. Either B(y, b) --. I(z, b, B, L). y is the barrier node d referred to in the consequent. Or I(y, b, B, L) A -~L(y) --* I(x, b, B, L). We set the barrier node d of the first inheritance clause equal to the one of the second. Does a node e between x and d satisfy L? y does not, nor do the nodes between y a n d d, and there is no node between x and y. But y and e must be connected, both dominating d. We show I(x, b, B, L) --* B(y, b) V I(y, b, B, L). The barrier node d of the antecedent clause and y are connected, both dominating b (see

5). d cannot sit between x and y. If d - y the first disjunct holds. If y > d we set d equal to the barrier node of the second disjunct. No e between y and d satisfies L.

We reduce (29) to (10). If a > y > b we make use of (6). Otherwise let x! be the smallest node that dominates both y and a and let x be such t h a t x ! ~- x >__ y. Then by (10) "~[x] Rb, meaning --~aRb (see 8).

R e f e r e n c e s

[Chomsky, 1986] Noam Chomsky. Barriers. Linguis- tic Inquiry Monograph 13, MIT Press, Cambridge, Massachusetts, 1986.

[Cinque, 1990] Guglielmo Cinque. Types of -A- Dependencies. Linguistic Inquiry Monograph 17, MIT Press, Cambridge, Massachusetts, 1990.

[DSrre and Eisele, 1991] Jochen DSrre and Andreas Eisele. A Comprehensive Unification-Based Gram- mar Formulism. Deliverable R3.1.B, DYANA - - ESPRIT Basic Research Action BR3175, 1991.

[Dorna, 1992] Michael Dorna. Erweiterung der Constraint-Logiksprache CUF um ein Typsystem.

Diplomarbeit Nr. 896, Institut fiir Informatik, Universit~t Stuttgart, 1992.

[Johnson, 1989] Mark Johnson. The Use of K n o w l - edge of Language. In Journal of Psycholinguistic Research, 18(1), 1989.

[Kamp and Reyle, 1993]

Hans Kamp and Uwe Reyle. From Discourse to Logic, Vol I. to appear: Kluwer, Dordrecht, 1993. [Kracht, 1992] Marcus Kracht. The Theory of Syn-

tactic Domains. Logic Group Preprint Series No. 75, Department of Philosophy, University of Utrecht, February 1992.

[Kroch, 1989] Anthony S. Kroch. Asymmetries in Long-Distance Extraction in a Tree-Adjoining Grammar. In Mark Baltin and Anthony Kroch, eds. Alternative Conceptions of Phrase Structure.

University of Chicago Press, Chicago, 1989.

[Miiller and Sternefeld, 1991] Gereon Miiller and Wolfgang Sternefeld. Extraction, Lexical Varia- tion, and the Theory of Barriers. Universit~it Kon- stanz, September 1991.

[Pollard and Sag, 1991] Carl Pollard and Ivan A. Sag. Agreement, Binding and Control. draft, June 1991.

[Rizzi, 1990] Luigi Rizzi. Relativized Minimality.

Linguistic Inquiry Monograph 16, MIT Press, Cambridge, Massachusetts, 1990.