• No results found

Localising Barriers Theory

N/A
N/A
Protected

Academic year: 2020

Share "Localising Barriers Theory"

Copied!
6
0
0

Loading.... (view fulltext now)

Full text

(1)

L o c a l i s i n g

B a r r i e r s

T h e o r y

Michael Schiehlen*

Institute for Computational Linguistics, University of Stuttgart, Azenbergstr. 12, W-7000 S t u t t g a r t 1

E-mail: [email protected]

1

I n t r o d u c t i o n

Government-Binding Parsing has become attractive in the last few years. A variety of systems have been designed in view of a correspondence as direct as pos- sible with linguistic theory ([Johnson, 1989], [Pollard and Sag, 1991], [Kroch, 1989]). These approaches can be classified by their m e t h o d of handling global constraints. Global constraints are syntactic in na- ture: T h e y cover more than one projection. In con- trast, local constraints can be checked inside a pro- jection and, thus, lend themselves to a treatment in the lexicon. Conditions on features have been the subject of intensive study and viable logics have been proposed for them (see e.g. the CUF formalism [Dhrre and Eisele, 1991], [Dorna, 1992]). In this pa- per, we assume such a unification-based mechanism to take care of local conditions and focus on global constraints. One class of approaches to principle- based parsing (see [Pollard and Sag, 1991] for HPSG, [Kroch, 1989] for TAG) attempts to reduce global conditions to local constraints and thus to make them accessible to treatment in a feature framework. This strategy has been pursued only at the expense of sacrificing the precise formulation of the theory and the definitory power stemming from it. T h e re- sult has been a shift from the structural perspec- tive assumed by GB theory to the object-oriented view taken by unification formalisms. T h e other class of approaches ([Johnson, 1989]) has allowed the full range of possible restrictions on trees and has in- curred potential undecidability for its parsers. We take up a middle stance on t h e m a t t e r in that we propose a separate logic for global constraints and posit that global constraints only work on ancestor lines (see 7).

We assume "movement" to be encoded by the kind of gap-threading technique familiar from HPSG, LFG. In order to integrate global constraints a "state" (in- formation that serves to express barrier configura- tions in the part of the tree which has already been built up) is associated with each "chain" (informa- tion about a moved element). Following H PSG, LFG, we have in mind a rule-based parser. Thus, states are manipulated when rules are chained. We need a cal- culus that is able to derive global constraints working on a local basis. We begin by developing this calculus hand in hand with an analysis of Chomsky's frame-

*I wish to thank Robin Cooper, Mark Johnson and Esther KSnig-Baumer for comments on earlier versions of this paper.

work. We then go on to show that many approaches to barriers theory and a variety of diverse phenom- ena can be moulded into our format and conclude with an indication of ways to use the system on-line during parsing.

2

D e p e n d e n c i e s B e t w e e n N o d e s

We take a tree T to be a structure (N,>), where N is a set of nodes and > stands for dominance, a

binary relation on N. We say that nodes a and b

are connected iff a > b V b > a V a = b. We define the relation of immediate dominance ~- between two nodes a and b as a > b A ~ 3 c : a > c A c > b. Dominance is an irreflexive partial order relation satisfying the axioms (1--3). Ancestors of a node are connected (1), there exists a (single) root (2), dominance reduces to immediate dominance (3). Variables are universally quantified unless specified otherwise.

(1) z > z A y > z --* x connected with y

(2) ~xVy : x > y

(3) x > z ~ 3y : x ~ y A y > z

Chomsky [1986, 9,30] discusses several definitions for constraints on unbounded dependencies.

(13) a c-commands/~ iff a does not domi- nate/~ [and/~ does not dominate or equal a] and every 7 that dominates a dominates/~.

Where 7 is restricted to maximal projec- tions we will say that a m-commands/?.

(18) a governs/~ iff a m-commands/~ and there is no 7, 7 a harrier for/~/, such that 7 excludes a.

(59)/~ is n-subjacent to a iff there are fewer than n + l barriers for/~ t h a t exclude a.

(2)

(4) aRb ~-* a, b unconnected ^

I{c I B(c,b) ^ - , e > a } l < n

Balanced relations like government require a defini- tion in terms of two BC-definable relations: Rl(a, b) and R2(b, a).

(5) B(c,b) ~ c > b

We can show several properties of BC-definable re- lations. The nodes are unconnected.

(6) aRb ---* a, b unconnected

In order to investigate BC-definable relations it suf- fices to investigate the ancestor lines of their second argument b (that is {y J y >_ b}).

(7)

x~-y A z > a l A ",y>__al A x > a 2 A -w>_a~

A y > b --* (alRb ~ a2Rb)

(7) gives rise to equivalence classes for the first argu- ment of R. For a particular pair (a,b) we can always find a y as defined in (8).

(s) a •

^ x > a ^ y>a

^ y > b

Definable relations are never empty. Barriers are pre- served in the upward direction of the ancestor line:

(9) [y]Ry

(10) x > y ^

[xlP

(10)

is less innocent than it looks. I give a revealing binding example from Kamp and Reyle [1993].

If [cP=~ [cP=y hei sees Mary ] and she smiles] John/ is happy.

*[cP=~ [vP=~ Hei

sees Mary ] and J o h n / i s happy].

3 B a r r i e r D e f i n i t i o n s

3.1 Adjunction

Adjunction rules raise a problem for algebraic in- vestigations of barriers theory (e.g. [Kracht, 1992]): They insert material into a tree but do not cre- ate new projections. Thus, adjunction rules imply a distinction between projections and segment nodes that correspond to graph-theoretical nodes. We shall use Greek letters to refer to projection nodes and Latin letters for segment nodes. The only way to create projections covering more than one segment is through adjunction. Since adjunction rules have equivalent mother and daughter nodes, projections are coherent in the sense that:

Va ~ fl Vbi, b2 • f~ : a > bi --* a > b2

Chomsky [1986] defines projection dominance so that dominates ~ only if every segment of a domi- nates (every segment of) f/. In case this definition is not empty, (1) guarantees a unique minimal seg- ment a,~in of a. Thus, we can rephrase Chomsky's definition in terms of segment nodes and get that a dominates fl just in case the minimal segment of a dominates some segment of 3.

(11) dominate(a,/3) *-+ a e a A b • / 3 A minimal segment(a) A a > b

Likewise, Chomsky's definition of exclusion, viz that a excludes j3 if no segment of a dominates (any seg- ment of) /3, can be transformed to the equivalent condition that a excludes/3 if the maximal segment of a does not dominate a segment of 3.

(12) exclude(aft) ~ a E a A b e fl A maximal segment(a) A --a > b

This way, we reduce projection dominance to seg- ment dominance. In (13--15), conditions of segment minimality or maximality are included where they are appropriate by (11) and (12).

3.2 C h o m s k y ' s T h e o r y

Chomsky [1986, 14] gives the following two core def- initions for barriers. We are not concerned about the exact formulation of L-marking (for a definition see [Chomsky, 1986, 24]).

(25) 7 is a blocking category for fl iff 7 is not L-marked and 7 dominates/3. (26) 7 is a barrier for ~ iff (a) or (b):

a. 7 immediately dominates 6, a blocking category for 3;

b. 7 is a blocking category for 3, 7 ~ IP.

We understand 7 in (25) and (26) to be a maximal projection, and we understand "immediately dominate" in (26a) to be a relation between maximal projections (so that 7 immediately dominates 5 in this sense even if a nonmaximal projection in- tervenes).

Formulation of these definitions in first order logic yields (13--15). In order to obtain an open-ended definition scheme the equivalence of the above defi- nitions is held implicit: Barrier concepts are true iff they comply with a manifest definition (see also 22 and 23).

(3)

-, L-marked(c) A minimal segment(c) A

c > b .

(14) barrier(c,b)

maximal projection(c) A minimal segment(c) A 3d : blocking category(d,b) A

c > d A

V e : c > e > d - - +

-, ( maximal projection(e) A minimal segment(e) ). (15) barrier(c,b) ¢=

blocking category(c,b) A -,IP(c).

We regard unary predicates as local conditions (L) and binary predicates as global concepts (B for "bar- rier concept"). Abstracting over the particular predi- cates involved we end up with the following definition schemes (16 for 13 and 15, 17 for 14).

(16)

B(c, b) ¢= L(e) A c > b . (17) S(e, b)

L(e) A 3d : B(d, b) A

e > d A

Ve : e > e > d ~ ",L(e).

We call the existential subformula of (17) an inher- itance clause I. The only global conditions in our system are inheritance clauses and c > b, a condition that always holds for barrier concepts (see 5). We will discuss in detail a way to derive inheritance clauses on a rule to rule basis. For the sake of conciseness we adopt the following abbreviation for inheritance clauses.

35 : B(d, b) A e > d A Ve : c > e > d --* -,L(e) ,: y

I(c,b,B,L)

3.3 N e g a t i v e I n h e r i t a n c e C l a u s e s

It has interesting repercussions to incorporate a scheme with a negated inheritance clause, viz. (18).

(18) B(e, b)

L(c) A

c > b A

-,3d : B(d, b) A

c > d A

Ve : c > e > d - * -,L(e).

For illustration we discuss several applications for negative inheritance clauses.

Chomsky [1986, 37] talks about IPs as inherent bar- riers, this effect being restricted to the most deeply embedded tensed IP. To capture this concept we once again need a negative inheritance clause: An IP is most deeply embedded if it does not dominate any other IP.

(20) barrier(Tfl) ¢= tensed IP(7) A

7>8A

--,36 : IP(6,8) A 7>6.

IP(7,3) ~ IP(7) A 7>8.

A feature of negative inheritance clauses that is de- sirable in m a n y cases is t h a t they allow to cancel barriers higher up in the tree. They can be used to circumvent (24). Classical GB theory has had to re- sort to a variety of tricks to account for discontinuous domains. A case in point is the coherent infinitive construction found in German and Dutch ~. A stan- dard account is to reanalyse 0-structure into another structure that lacks the annoying barrier-generating nodes. Different submodules of the theory will work on different structures. Consider the following exam- ple.

dab [cP [tP PRO [vp

[NP

der Wagen] zu

reparieren]]] [v versucht] wurde

In this example V governs NP but not "PRO" even though "PRO" intervenes between V and NP. CP might be called a phantom barrier. Generally, a phan- tom (like CP, IP above) is a barrier just in case it does not dominate a non-phantom (VP above). Thus CP shields "PRO" but remains open for government of NP. This state of affairs can be caught in the present framework by a negative inheritance clause.

(21) barrier(7,#) ¢= phantom(7) A

7>#A

"~q# : nonphantom(~,3) A 7 > 8 .

nonphantom(7,8 ) ¢= nonphantom(7) A 7 > 8.

Similar cases arise with negation. Again, the litera- ture adopts different lines of argument to account for the phenomenon. Kamp and Reyle [1993] handle the binding case below with a rule of double negation elimination, an operation that deletes structure.

*Either he~ owns a Porsche or John/ hides it.

Either h e / d o e s not own a Porsche or John/ hides it.

(4)

The examples below are drawn from Cinque [1990, 83]. He uses a superscription convention to annotate the scope of the negation and assumes an LF amalga- mation process triggered by coindexing of this sort. CP is no barrier anymore for LF-amalgamated el- ements since they become wh-movable. We might model amalgamation with the "nonphantom" clause of (21). Then, this clause would have to hold true for inherently wh-movable elements (bare quantifiers in Cinque's analysis) as well.

*Molti amici, [cP ha invitato t, che io sap- pin.

Molti amici, [cP

[NegP

n o n ha invitato t,

che io sappia.

3 . 4 P r o p e r t i e s o f t h e D e f i n i t i o n S c h e m e s

In this paragraph we further investigate properties of the three definition schemes we are dealing with. We summarize scheme (16) in (22). def is a variable ranging over the given definitions.

(22) B(c,b) ~ Bdef: Ldef(c ) A c>b

We can collapse all definitions de/into a single defi- nition with local condition K(c) ~

Vd4Ld4(c).

In order to summarize the schemes (16--17) we intro-

duce vectors of definitions def" of length n and corre- sponding sequences of nodes Z of length n + 1. xl is fixed to c and Xn+l to b.

(23) B(c,b) *-* B def, Z : V i • { 1 , . . . , n } :

Ldef(i)(xi) A xi > xi+l.

For definitions conforming to type (16--17) we can show the following property: If we have found a son y violating the relation R all descendants b of the father x will be inaccessible to R.

(24) x ~- y A aRx A ~ a R y A x > b --* --,aRb

In a full-fledged definition scheme where (16--18) are available (24) ceases to hold. In the example dis- cussed above a does not govern y but does govern b.

a [cP=, [vP=y b

In pre-Barriers GB theory and most current com- putational approaches only inherent barriers are al- lowed (scheme 16) and the violating number of barri- ers in axiom (4) is set to null. Note that under these provisos, barriers theory shrinks to command theory:

(4') aRb ~ a, b unconnected A Vc : K ( c ) A c>b---*c>a

T h e following constraint holds in this configuration: A barrier as in (24) is not affected by the triggering first argument.

(25) x ~-y A Ba : [aRx A --,aRy] A bRx ---. --,bRy

Chomsky [1986, 11] discusses (25) at some length. In his example (see below) "decide" =a does not govern " P R O " , but "e" =b would. He shows t h a t if either of the mentioned requirements (n=O and intrinsic bar- riers) is not met the theorem is refuted.

(21) John decided [cP e [xP P R O to [ r e see the movie ]]]

If (16--18) are given then we can show the following theorem: Brothers are equivalent when occurring as a second argument of a BC-definable relation.

(26) a, bl unconnected A a, b2 unconnected A

by N- bl A by N- b2 ~ (aP0bl ~ aRb2)

4

Localising t h e Global C o n s t r a i n t s

The next step is to localise the definitions ( 1 6 - - 18). For ease of reference we repeat the definition schemes.

(27) B(c,b) ~ 3def: [Ll(C) A c>b] V

ILl(c) A I(c,b,B, L2)] V [Ll(c) A c > b A -,I(c,b,B, L2)]

We only take into account nodes c t h a t separate a from b in the sense t h a t they sit on the ancestor line of b but not on that of a (see also the restrictions of 4 and 5). T h e o r e m (28) specifies a connection be- tween the inheritance clauses valid on a father node z and those valid on the son y. Recall that inheritance clauses are the only global conditions we consider.

(28) x N y A y>_b A "-,y>_a ---*

(B(y, b) V (I(y, b, B, L) A -~L(y)) *-* I(x, b, B, L))

(5)

4.1 U p w a r d S t a t e s

Upward states supply information about barrier nodes encountered on the ancestor line below. T h e y are constructed when the second argument b of a relation R has been found and the tree is being searched for the first argument a. Formally, upward states are sets (standing for conjunctions) associ- ated with some node c and some dependency coming from b.

{B,L) e UState(c,b) ~ I(c,b,B,L)

Any inheritance clause that can be derived at c on the basis of the lower upward state and the rule schemes (27--28) is included in c's upward state. If a clause is not in the state, it cannot be inferred by (16--18). Consequently, the negation of a missing clause must hold. We assume a counter for c and b to be increased and checked as defined by the theory (computing the number n of passed barriers, see 4).

IncreaseCounter(c,b) ~ B(c,b)

We use the upward state to break off search as soon as we can infer from the theory that an element a cannot possibly be found in the rest of the tree. Theorem (29) stands to express that as soon as we have found a node y violating the definitions upward search becomes obsolete.

(29)

4.2 D o w n w a r d States

Downward states encode information about barrier nodes encountered on the ancestor line above. They are computed when the second argument b of a re- lation tt is being expected because a first argument a has been discovered. Formally, downward states are first order formulae associated with some node c, some ancestor node ct of c, and some dependency leading to b. Atomic formulae of DState(c,cl,b) are inheritance clauses I with respect to c and b.

formula E DState(c,ct,b)

formula(c,b) ~ IncreaseCounter(cl ,b)

T h e rule schemes (27--28) supply all sufficient and necessary conditions for transfer of inheritance clauses between nodes. Accordingly an atomic for- mula in the upper downward state can be trans- formed into a formula holding for the lower node c. False formulae are discarded, while true formulae in- crease the counter.

We use downward states to restrict the search space. By (24) we can sometimes infer that search into a subtree will be pointless. Negative inheritance

clauses, however, can only be checked when a can- didate for b has been encountered. When the parser descends paths while searching, it always assumes that the current path will dominate b. For upward states, in contrast, the ancestor line of b is fixed. Only downward states scan trees. (26) shows that a state will not change for brother nodes. So we only have to store one downward state per rule (e.g. under its mother node).

4.3 Example

Consider the chain of "how" in the following example

how do [zp. you

[vP, t [vP

remember [cp t / * w h y lip Bill t behaved t ]]]]]

In a left-to-right top-down parse, the first barrier to be encountered would be IP* if it dominated either a blocking category (BC) or no other tensed IP. VP* is no BC or barrier since it does not dominate the intermediate trace (it is not the minimal segment of the VP node). CP is L-marked and hence a barrier only if it dominates a BC. If "why" excludes a trace in SpecCP, the BC IP occurs between CP and the next trace. Due to the d-role of "how", government is violated leading to an ungrammatical sentence. If an intermediate trace is allowed, a new chain is started and no BC occurs. IP refutes the hypothesis that IP* is the deepest embedded tensed IP, and it turns out to be this IP as soon as the variable is found. So only one subjacency barrier occurs: The sentence is grammatical.

5 C o n c l u s i o n

(6)

A

P r o o f s

Proof of (6) is trivial.

The theorem (7) is symmetric for al and a2. Suppose

alRb A "~a2Rb. a2 and b are unconnected. So there exist kl barriers not dominating al (kl < n) and k2 barriers not dominating a2 (k2 > n). Suppose c is a barrier not dominating a2 but dominating al (there are at least k 2 - k l > 1 such barriers), c > b and y > b ,

hence c and y are connected. But y>_c entails y > a l .

I f c > y then either x > c > y or c > x . But c > x implies c > a2.

To prove (9) note that all barriers for y dominate y by (5). Hence they also dominate a e [y].

We now turn to (10). Take al E [x] and a2 E [y]. a2 and y are not connected. We show that if --c > a2 and c > b then -~c > al. Assume c > b and c > ax. Then x and c are connected both dominating b. We know that -~x _> c > ax. Hence c > x > y. Suppose

y! is y's father. Then c > x >_ y! ~.- y and equally

c > x > y! ~- a2. We obtain that {c I B(c,b) A -~c> el} D {c I B(c,b) ^ -~c>a2}. Hence -~[x]Rb.

We prove (24). Suppose c is a barrier for x. Then by (23) there is a sequence of nodes xl = c and

xn > xn+l = x. But xn > x > b, so c is a barrier for b as well. a and y are unconnected. Suppose c is a barrier for y but not x. Then xl = c and x n > x ~ + l = y. xn

and x are connected both dominating y. We know that -~x > xn > y and ~xn > x else c would be a barrier for x. Hence Xn = x and we get x , = x > b. There are at least as many barriers for b as there are for y, so -~aRb.

To prove (25) we adopt the argumentation of the foregoing proof and infer that x is a barrier for y. bILz shows that b, x are unconnected, hence -~x > b and -~bRy.

(26) follows if we prove B(c, bl) ~ B(c, b2) by in- duction. The theorem is symmetric. Assume a c such that B(c, bl). Then either scheme (16) holds: L(c) A c>bx hence c>b2. Or (17) and L(c) A 3d : B(d, bl) A c > d A Ve : c > e > d ---* ~L(e). By induction B(d, b2) as well. For the negative scheme (18) we use symmetry to extend the implication I(c, bx, B, L) ---, I(c, b2, B, L) to an equivalence.

For (28) we give a proof by cases. Either B(y, b) --. I(z, b, B, L). y is the barrier node d referred to in the consequent. Or I(y, b, B, L) A -~L(y) --* I(x, b, B, L). We set the barrier node d of the first inheritance clause equal to the one of the second. Does a node e between x and d satisfy L? y does not, nor do the nodes between y a n d d, and there is no node between x and y. But y and e must be connected, both dominating d. We show I(x, b, B, L) --* B(y, b) V I(y, b, B, L). The barrier node d of the antecedent clause and y are connected, both dominating b (see

5). d cannot sit between x and y. If d - y the first disjunct holds. If y > d we set d equal to the barrier node of the second disjunct. No e between y and d satisfies L.

We reduce (29) to (10). If a > y > b we make use of (6). Otherwise let x! be the smallest node that dominates both y and a and let x be such t h a t x ! ~- x >__ y. Then by (10) "~[x] Rb, meaning --~aRb (see 8).

R e f e r e n c e s

[Chomsky, 1986] Noam Chomsky. Barriers. Linguis- tic Inquiry Monograph 13, MIT Press, Cambridge, Massachusetts, 1986.

[Cinque, 1990] Guglielmo Cinque. Types of -A- Dependencies. Linguistic Inquiry Monograph 17, MIT Press, Cambridge, Massachusetts, 1990.

[DSrre and Eisele, 1991] Jochen DSrre and Andreas Eisele. A Comprehensive Unification-Based Gram- mar Formulism. Deliverable R3.1.B, DYANA - - ESPRIT Basic Research Action BR3175, 1991.

[Dorna, 1992] Michael Dorna. Erweiterung der Constraint-Logiksprache CUF um ein Typsystem.

Diplomarbeit Nr. 896, Institut fiir Informatik, Universit~t Stuttgart, 1992.

[Johnson, 1989] Mark Johnson. The Use of K n o w l - edge of Language. In Journal of Psycholinguistic Research, 18(1), 1989.

[Kamp and Reyle, 1993]

Hans Kamp and Uwe Reyle. From Discourse to Logic, Vol I. to appear: Kluwer, Dordrecht, 1993. [Kracht, 1992] Marcus Kracht. The Theory of Syn-

tactic Domains. Logic Group Preprint Series No. 75, Department of Philosophy, University of Utrecht, February 1992.

[Kroch, 1989] Anthony S. Kroch. Asymmetries in Long-Distance Extraction in a Tree-Adjoining Grammar. In Mark Baltin and Anthony Kroch, eds. Alternative Conceptions of Phrase Structure.

University of Chicago Press, Chicago, 1989.

[Miiller and Sternefeld, 1991] Gereon Miiller and Wolfgang Sternefeld. Extraction, Lexical Varia- tion, and the Theory of Barriers. Universit~it Kon- stanz, September 1991.

[Pollard and Sag, 1991] Carl Pollard and Ivan A. Sag. Agreement, Binding and Control. draft, June 1991.

[Rizzi, 1990] Luigi Rizzi. Relativized Minimality.

Linguistic Inquiry Monograph 16, MIT Press, Cambridge, Massachusetts, 1990.

References

Related documents

MET 109 Computer Programming and Applications 2 MET 117 Manufacturing Processes 2 MET 127 Advanced Manufacturing Processes 2 MET 201 Statics 3 MET 205W Material Science 3 MET

d) Present their action plan developed after CLTS triggering and discuss if it’s achievable e) Define the roles of community leaders &amp; HSA in CLTS and tie them to the action plan

(Middle) Five heatmap blocks, each of which demonstrates (first column) antimicrobial susceptibility, (second column) morphology under exposure to antimicrobials, and

In this study, CHD4 rs74790047, TSC2 rs2121870, and AR rs66766408, were found to be common exonic mutations in both lung cancer patients and normal individuals exposed to high

Для розробки технології обкочування труб існує необхідність порівняння НДС при обкочуванні труб різної товщини за різними схемами (див. 1,2) Метою

This document contains some of the most frequently used formulae that are discussed in

Using the [2nd][TBLSET] screen and then [2nd][TABLE] we can see that the values of the function are indeed zero at -1, 2 and 3. Hence we can now say that x = -1 , x = 2 and

Entrepreneurial Leadership Program Facilitator’s Guide Session Four: Challenges Facing Today’s Business Leaders.. Activity 4.3:. Interaction — Business Case Studies