
Computing with Features as Formulae


Mark Johnson*

Brown University

This paper extends the approach to feature structures developed in Johnson (1991a), which uses Schönfinkel-Bernays' formulae to express feature structure constraints. These are shown to be a disjunctive generalization of Datalog clauses, as used in database theory. This paper provides a fixed-point characterization of the minimal models of these formulae that serves as the theoretical foundation of a forward-chaining algorithm for determining their satisfiability. This algorithm, which generalizes the standard attribute-value unification algorithm, is also recognizable as a nondeterministic variant of the semi-naive bottom-up algorithm for evaluating Datalog programs, further strengthening the connection between the theory of feature structures and databases.

1. Introduction

Despite their simplicity, a surprisingly wide range of linguistic phenomena can be described in terms of simple equality constraints on values in attribute-value structures, which are a particularly simple kind of feature structure (see Shieber 1986; Johnson 1988; Uszkoreit 1986; and Bresnan 1982 for examples of some of these analyses). But some phenomena do not seem to be describable in such a pure 'unification' framework. For example, the analysis of conjunctions in LFG (Kaplan and Maxwell 1988b) and the formalizations of Discourse Representation Theory (Kamp 1981) presented in Johnson and Klein (1986) and Johnson and Kay (1990) require additional mechanisms for representing and manipulating aggregates or sets of values in ways that are beyond the capability of such "pure" attribute-value systems. Further, sortal constraints (which also cannot be expressed as simple equality constraints) can be used to formulate simpler and more comprehensible grammars (Carpenter 1992; Carpenter and Pollard 1991; Pollard and Sag 1987, 1992).

Versions of both of these kinds of constraint, as well as the familiar attribute-value constraints, can be expressed as Schönfinkel-Bernays' formulae (as demonstrated in Johnson 1991a, 1991b), so that the problem of determining the satisfiability of a system of such constraints is reduced to the satisfiability problem for the corresponding formula. This class of formulae (defined in Section 3.1) seems to be expressive enough for most linguistic purposes when used with an external phrase-structure backbone. That is, these formulae are used as annotations on phrase structure rules in the manner described in, e.g., Kaplan and Bresnan (1982), Shieber (1986), and Johnson (1988). This paper extends the author's previous paper on the topic (Johnson 1991a) by sketching several other linguistic applications of Schönfinkel-Bernays' formulae (including a version of D-theory [Marcus, Hindle, and Fleck 1983; Vijay-Shanker 1992]), and presenting a least-fixed-point theorem that serves as the theoretical basis for a "forward-chaining" algorithm for determining satisfiability of Schönfinkel-Bernays' formulae. Interestingly, this algorithm can be viewed both as a straightforward generalization

* Cognitive and Linguistic Sciences, Box 1978, Brown University, Providence, RI. E-mail: [email protected]


Computational Linguistics Volume 20, Number 1

of the standard attribute-value unification algorithm and also as a nondeterministic variant of the semi-naive evaluation method for Datalog clauses.

Several extended "unification-based" constraint formalisms have been developed. In this paper, the term "feature structure" denotes any kind of structured entity used as a component of a category label. An attribute-value structure is a particularly simple kind of feature structure of the kind used in "pure" unification-based frameworks (Shieber 1986). Some extensions to the basic attribute-value framework are rather weak, e.g., allowing disjunctive and negative constraints and preserving decidability. 1 Such systems require an "off-line" phrase structure backbone to which these constraints are attached. It seems that most of the constraints that can be expressed in these formalisms can be expressed as Schönfinkel-Bernays' formulae, the constraint formalism described below.

A second class of extended constraint formalisms has been devised to be capable of expressing the entire grammar as systems of constraints, and as far as I know, for all of these systems the problem of determining the satisfiability of an arbitrary system of constraints that they can express is undecidable. 2 This is because the recognition problem for an arbitrary "unification-based" grammar is undecidable unless the size of the phrase structure tree is constrained somehow, e.g., by the offline parsability constraint (Johnson 1988; Kaplan and Bresnan 1982; Pereira 1982; Shieber 1992), but there seems to be no natural way to impose such constraints in these systems because the encoding of the phrase structure tree in the feature structure is not distinguished from other features. 3 Thus, in order to maintain decidability, the system described here is not designed to be capable of expressing phrase structure constraints directly, and must be used with an external phrase-structure component, as in LFG (Bresnan 1982). (However, Bob Carpenter [p.c.] points out that one can impose a bound on the size of the feature structure that can serve as an analysis [say, some polynomial of the length of the input], and so ensure decidability.) Interestingly, a first-order logic-based approach similar to the one presented in this paper can also be developed for extended constraint formalisms capable of expressing the entire grammar, but this is not discussed further here; see Johnson (in press b) for details.

In the approach developed here Schönfinkel-Bernays' formulae are used to express a variety of feature structure constraints. Previous work has shown that these formulae are expressive enough to define arbitrary disjunctions and negations of constraints (Johnson 1990a, 1990b), a kind of 'set-valued' entity (Johnson 1991a), and they can be used to impose useful sort constraints (Johnson 1991b). The expression of D-theory constraints on nodes in trees is discussed in this paper.

This paper extends the ideas in these earlier papers with theoretical results that suggest a forward-chaining algorithm for determining the satisfiability of an arbitrary Schönfinkel-Bernays' formula. This generalizes the standard feature-graph unification algorithm and is closely related to the semi-naive bottom-up algorithm used in database theory.

1 For examples of this approach see Dawar and Vijay-Shanker (1990), Dörre and Eisele (1990), Johnson (1988, 1990a, 1990b, 1991a, 1991b, in press a), Karttunen (1984), Kasper (1987a, 1987b, 1988), Kasper and Rounds (1986, 1990), Langholm (1989), Pereira (1987), and Smolka (1992).

2 Examples of this approach are Carpenter, Pollard, and Franz (1991), Dörre (1991), Dörre and Eisele (1991), Johnson (in press b), Kay (1979, 1985a, 1985b), Pollard and Sag (1987), Rounds and Manaster-Ramer (1987), Smolka (1988), and Zajac (1992).

3 While it may well be that the universal recognition and parsing problems for natural language are undecidable (Chomsky [1986, 1988] points out that there is no contrary evidence), I know of no evidence that this is actually the case. It seems reasonable then to also investigate formalisms that can only express decidable systems of constraints (and for which there exist satisfiability-testing algorithms) if linguistically adequate systems can be found.


Specifically, it is shown that the satisfying Herbrand models of an arbitrary Schönfinkel-Bernays' formula are the fixed points of certain functions, and that the least fixed points of these functions are all of the models of the formula that are "minimal" in a certain sense. This leads to a forward-chaining algorithm for computing all of the atomic consequences of a Schönfinkel-Bernays' formula; the fixed-point theorem shows that this suffices to determine the satisfiability of an arbitrary Schönfinkel-Bernays' formula.

2. Constraints, Partial Information, and Feature Structures

This approach exploits the fact that constraints on well-formed linguistic structures (e.g., well-formedness constraints imposed by the grammar) do not need to be isomorphic to the structures that satisfy them. Although the distinction between constraints and structures that satisfy them might seem too obvious to warrant comment, it is not made in most work on feature structures.

A common view holds that feature structures are inherently "partially specified" entities, which "unify" or merge with other feature structures to yield more instantiated feature structures in an "information-preserving" way (Shieber 1986). If two feature structures contain "contradictory information," then it is impossible to merge them to produce a consistent object; unification is then said to fail. The feature structure for an utterance is the result (if one exists) of unifying all of the feature structures for the lexical entries and syntactic rules in appropriate ways. Thus in this view feature structures play two roles; not only do they serve as linguistic structures, but they are also used to encode constraints that the linguistic structures must satisfy (see Section 2.10 of Johnson (1988) for an extended discussion).

That is, under this view feature structures serve not only as linguistic structures that may or may not satisfy a constraint, but are also interpreted as 'representing' or 'describing' all of the feature structures that they subsume. Given this dual role for feature structures, it is important in this approach that if a feature structure S satisfies a constraint $\alpha$, then every feature structure subsumed by S should also satisfy $\alpha$ (Pereira 1987). If this "upward closure" property holds, then the set of feature structures satisfying any constraint can be represented by the set of its "minimal models." Unfortunately, many useful constraints do not have this property. For example, under a classical interpretation, the set of feature structures satisfying negated feature structure constraints is not upward-closed (Moshier and Rounds 1987).

The work described in this paper pursues a different approach. Following Kaplan and Bresnan (1982), feature structures are only (components of) linguistic structures, and not partial descriptions of (other) linguistic structures. As such, a feature structure either does or does not satisfy any particular set of constraints. An utterance is well-formed just in case there is some linguistic structure that satisfies all of the constraints imposed by the grammar and has the phonological form of that utterance as its phonological form (which itself is just another constraint that the structure must satisfy). Since the relationship between a feature structure and a constraint that it satisfies is essentially the same as the relationship between an interpretation and a formula that is true under that interpretation, it seems natural to conceive of a constraint as a kind of formula (in a format that allows efficient computational manipulation) that has feature structures as its intended interpretations.


feature structures plays no special role in this approach; specifically, it is not required that the set of structures that satisfy a constraint be upward-closed.

In general, a linguistic structure S must satisfy several constraints, say $\alpha_1, \ldots, \alpha_n$, in order to be well formed, so in order to solve the recognition and parsing problems, all we need do is determine if there are any S that satisfy $\alpha_1, \ldots, \alpha_n$, and if so, describe them somehow.

It is convenient to devise a language for expressing constraints, so that the $\alpha_i$ are well-formed formulae of this language, and its satisfaction relation is exactly the satisfaction relation mentioned above. Viewed from this perspective, the problem of determining if there is a structure S that satisfies the constraints $\alpha_1, \ldots, \alpha_n$ is the same as the problem of determining if the formula $\alpha$ is satisfiable, where $\alpha$ is $\alpha_1 \land \cdots \land \alpha_n$ and conjunction is given the standard interpretation. Algorithms for deciding the satisfiability of arbitrary formulae in this language (if they exist) can therefore be used to determine the satisfiability of the linguistic constraints. Moreover, if $\alpha \models \alpha'$ then $\alpha'$ is a true description of every model of $\alpha$, i.e., the logical consequences of $\alpha$ are descriptions of every well-formed linguistic structure that satisfies the constraints. Thus the logic of the constraint language provides in principle all the necessary tools for determining if a set of constraints is satisfiable, and if it is, providing descriptions of the satisfying structures.

From this perspective, an "information state" is a kind of formula, and "unifying" two such information states is accomplished by conjoining them and simplifying the resulting formula, not by some manipulation of their models. Partial information states are those that are satisfied by more than one interpretation. The consequence relation corresponds to the subsumption relation of traditional unification grammar (a formula $\alpha$ "contains more information" than formula $\alpha'$ iff $\alpha \models \alpha'$), and unsatisfiability corresponds to unification failure.
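The correspondence between unification and conjunction can be illustrated with a minimal Python sketch. It rests on strong simplifying assumptions that are not part of the paper: an information state is a set of ground atoms `(node, attribute, value)`, every value is an atomic constant, and no node name is itself such a constant. Under those assumptions the only possible clash is two different values for one attribute of one node. The function names `conjoin`, `is_satisfiable`, and `entails` are hypothetical.

```python
from itertools import combinations


def conjoin(*states):
    """'Unify' information states by conjoining them (here: set union of atoms)."""
    result = set()
    for state in states:
        result |= set(state)
    return frozenset(result)


def is_satisfiable(state):
    """Assumes every value is a distinct atomic constant and no node is one.

    The conjunction then fails only if some node has two different
    values for the same attribute (the analogue of unification failure).
    """
    for (n1, a1, v1), (n2, a2, v2) in combinations(state, 2):
        if n1 == n2 and a1 == a2 and v1 != v2:
            return False
    return True


def entails(state, other):
    """Sufficient syntactic test: every atom of `other` occurs in `state`."""
    return set(other) <= set(state)


singular = {("n", "number", "singular")}
third = {("n", "person", "3rd")}
plural = {("n", "number", "plural")}

both = conjoin(singular, third)       # more information than either conjunct
clash = conjoin(singular, plural)     # unsatisfiable: 'unification failure'
```

Here `both` entails each of its conjuncts, mirroring the claim that a formula containing more information has the less informative formula as a consequence, while `clash` has no model under the stated assumptions.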

3. Languages for Expressing Feature Structure Constraints

There are many different possible constraint languages. Specialized languages can be constructed specifically for the task of expressing feature structure constraints (such as Kasper and Rounds's FDL [Kasper and Rounds 1990] and Johnson's attribute-value languages [Johnson 1988]). Alternatively, the constraints may be able to be expressed in some standard language, so that the satisfiability problem for linguistic constraints is reduced to the satisfiability problem for that language, as is done here. 4

Johnson (1990a), following a suggestion first made in Kaplan and Bresnan (1982), showed how attribute-value constraints could be formalized in the quantifier-free subset of first-order logic, while later work (Johnson 1991a, 1991b) proposed a different formalization in the Schönfinkel-Bernays' subset of first-order formulae. 5

Roughly speaking, there is a trade-off between the expressive power of a language and its computational tractability. For example, the satisfiability problem for the language consisting of conjunctions of equalities and inequalities of first-order terms can

4 A third approach, developed by Smolka (1992), is to define a specialized language tailored for expressing attribute-value constraints and note its translation into some standard language, in this case, also the Schönfinkel-Bernays' class.

5 Of course, there is no a priori reason for these subsets of first-order logic to be optimally suited for expressing feature structure constraints. Kasper and Rounds (1990) and more recently Blackburn (1991) and Blackburn and Spaan (1992) have suggested that it may be useful to express feature structure constraints in a special kind of modal logic. Johnson (1991b) also discusses the application of general first-order logic and nonmonotonic logics to the specification of more complex constraints on feature structures.


be decided in quasi-linear time using the congruence-closure algorithm, but this language can only express conjunctions of feature-value equalities and inequalities. If this language is extended to allow disjunctions (so that disjunctive feature-value constraints can be expressed), the satisfiability problem becomes NP-complete (Gallier 1986; Kasper and Rounds 1990; Nelson and Oppen 1980).

Since disjunctive constraints seem to be a practical necessity for describing natural languages (Barton, Berwick, and Ristad 1987; Karttunen 1984), most practical feature structure systems will probably have NP-hard satisfiability problems. Given that we have to solve an NP-hard problem anyway, it seems reasonable to investigate the most expressive feature structure constraint language that has an NP-complete satisfiability problem. The Schönfinkel-Bernays' class, used in the manner described here, appears to be the most expressive language for feature structure constraints proposed in the literature so far whose satisfiability problem is no harder than NP.

3.1 The Schönfinkel-Bernays' Class

The Schönfinkel-Bernays' class (hereafter SB) is the class of first-order closed prenex formulae without function symbols in which no existential quantifier occurs in the scope of any universal quantifier. That is, a formula is in SB iff it has no free variables and is of the form

$\exists y_1 \ldots \exists y_m \, \forall x_1 \ldots \forall x_n \; \alpha,$

where $\alpha$ contains no quantifier symbols or function symbols. SB formulae are a proper subset of first-order formulae, and they are interpreted in exactly the same way as first-order formulae. The body $\alpha$ may contain boolean connectives (including negation), which can be used to express arbitrary boolean combinations of constraints.

Unlike the satisfiability problem for full first-order logic, which is undecidable (co-recursively enumerable), the satisfiability problem for SB is decidable; in fact it is PSPACE-complete (Lewis and Papadimitriou 1981). Further, if SBn is the class of SB formulae with n or fewer universal quantifiers, then for any fixed n the satisfiability problem for SBn is NP-complete (Lewis 1980). In the applications described here, the number of universal quantifiers is fixed (i.e., it does not vary with the utterance or even with the grammar), so the corresponding satisfiability problems are all NP-complete. The class of SB formulae is interesting for other reasons besides its ability to express a wide range of linguistic constraints. As shown below, the class of SB formulae in clausal form constitute an extension of Datalog that allows disjunctive consequents.
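The syntactic condition on the quantifier prefix is easy to test mechanically. The following is a minimal Python sketch, assuming a hypothetical representation in which a prenex prefix is a list of `(quantifier, variable)` pairs; it checks only the prefix shape (all existentials before all universals) and counts universal quantifiers, which is the parameter n of the classes SBn. It does not check the other conditions (closedness, absence of function symbols).

```python
def is_sb_prefix(prefix):
    """True iff no existential quantifier follows a universal one.

    `prefix` is a list of (quantifier, variable) pairs, where quantifier
    is "exists" or "forall" (a representation assumed for illustration).
    """
    seen_forall = False
    for quantifier, _variable in prefix:
        if quantifier == "forall":
            seen_forall = True
        elif quantifier == "exists":
            if seen_forall:
                return False  # existential inside the scope of a universal
        else:
            raise ValueError(f"unknown quantifier: {quantifier!r}")
    return True


def universal_count(prefix):
    """Number of universal quantifiers: the n of SBn."""
    return sum(1 for quantifier, _ in prefix if quantifier == "forall")


sb_like = [("exists", "y1"), ("forall", "x1"), ("forall", "x2")]
not_sb = [("forall", "x"), ("forall", "y"), ("exists", "z")]
```

With these definitions `sb_like` passes the test with two universal quantifiers, whereas `not_sb` is rejected because its existential quantifier lies in the scope of universal quantifiers.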

3.2 Formalizing Attribute-Value Structures Using SB

SB is both simple and expressive enough that grammar designers might choose to state linguistic constraints directly in SB, rather than in terms of attributes and values. Nevertheless, it is important to understand how the properties of attribute-value structures can be stated in SB, since many of the techniques used to formalize them can be applied to other linguistically interesting structures as well.

In fact there are several ways of formalizing attribute-value structures in SB, all of which seem to be linguistically equivalent. What follows is a formalization in SB that allows values to be used as attributes and allows attributes to be quantified over (this is handy for stating "sort constraints"), but no special claims are made for it over and above any other SB formalization.


a three-place relation arc, where $\mathit{arc}(x, a, y)$ means that there is an arc leaving node x labeled a pointing to node y. 6

Of course, not all interpretations qualify as attribute-value structures; e.g., those which satisfy both $\mathit{arc}(x, a, y)$ and $\mathit{arc}(x, a, z)$ for some $y \neq z$ violate the requirement that there is at most one arc with any given label leaving any node. We can express this requirement as an SB formula that is true in the intended interpretations (namely attribute-value feature structures).

$\forall x \, \forall a \, \forall y \, \forall z \;\; \mathit{arc}(x, a, y) \land \mathit{arc}(x, a, z) \rightarrow y = z.$ (1)

Similarly, we can express the properties of the "attribute-value constants" with SB formulae. Let con be a property (i.e., a one-place relation) true of the "attribute-value constant" elements. These elements are required to have no arcs leaving them. The following formula expresses this requirement.

$\forall x \, \forall a \, \forall y \;\; \neg(\mathit{con}(x) \land \mathit{arc}(x, a, y)).$ (2)

Note that the word "constant" in the name "attribute-value constant" is misleading here, since in this framework not all SB constant symbols will denote attribute-value "constants." More precisely, being an 'attribute-value constant' is a property of an individual in an interpretation (i.e., an element of a feature structure), whereas being a constant is a property of a symbol in a formula. Constants can be used to denote complex attribute-value entities as well as attribute-value constants.

Finally, we require that the names of attribute-value constants denote distinct attribute-value constants. We reserve a finite subset N of the constants of our language for use as the names of attribute-value constants, and require that they satisfy the following schemata. 7

For each c in N, $\mathit{con}(c)$. (3)

For each distinct pair $c_1, c_2$ in N, $c_1 \neq c_2$. (4)

Schema (3) requires each symbol in N to denote an attribute-value constant, and schema (4) enforces distinctness in essentially the same manner as that used in the specification systems of algebraic data-type theory (Kapur and Musser 1987).
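To make axioms (1) and (2) and schemata (3) and (4) concrete, here is a minimal Python sketch that tests whether a given finite interpretation satisfies them. The representation is an assumption made for illustration: `arcs` is a set of `(x, a, y)` triples over domain elements, `con` is the set of elements with the con property, `names` is the reserved set N of constant symbols, and `denotation` maps each symbol in N to a domain element. The function name is hypothetical.

```python
def is_attribute_value_structure(arcs, con, denotation, names):
    """Check axioms (1), (2) and schemata (3), (4) on a finite interpretation."""
    # Axiom (1): at most one arc with a given label leaves any node.
    target = {}
    for x, a, y in arcs:
        if target.setdefault((x, a), y) != y:
            return False
    # Axiom (2): attribute-value constants have no arcs leaving them.
    if any(x in con for x, _a, _y in arcs):
        return False
    # Schema (3): every name in N denotes an attribute-value constant.
    if any(denotation[c] not in con for c in names):
        return False
    # Schema (4): distinct names in N denote distinct elements.
    denoted = [denotation[c] for c in names]
    return len(set(denoted)) == len(denoted)


names = ["number", "singular", "plural"]
denotation = {"number": "e_num", "singular": "e_sg", "plural": "e_pl"}
con = {"e_num", "e_sg", "e_pl"}

good = {("n0", "e_num", "e_sg")}
two_values = good | {("n0", "e_num", "e_pl")}        # violates axiom (1)
arc_from_constant = {("e_sg", "e_num", "e_pl")}      # violates axiom (2)
merged_names = dict(denotation, plural="e_sg")       # violates schema (4)
```

Note that arc labels are ordinary domain elements in this sketch, in keeping with the three-place arc relation, and that nothing here rules out cyclic structures.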

Formulas (1) and (2) and the instances of schemata (3) and (4) can be regarded as defining attribute-value feature structures. These axioms are quite permissive: in

6 Johnson (1991a) and Smolka (1992) propose that an attribute-value arc labeled a from x to y be conceptualized as an instance of a two-place relation a(x, y). For most applications there is little substantive difference between these two approaches; the approach taken here allows attributes to be quantified over, e.g., to state sortal constraints, and permits values to be used as attributes, as in e.g., LFG (Kaplan and Bresnan 1982); for discussion and linguistic applications see also Johnson (1988).

7 As Patrick Blackburn (p.c.) points out, one consequence of this is that every model of these constraints will contain individuals corresponding to each attribute-value constant (since each constant symbol will be assigned a denotation). Whether this is desirable or problematic is debatable, but as he pointed out, it is easy to devise a conceptualization in which each attribute-value constant $c_i$ is conceptualized as a one-place predicate $c_i(\cdot)$ that is true of at most one element. Under such a conceptualization (which can be formalized in SB as shown below) attribute-value constants would be the unique members of one-element sorts.

(i) For each c in N, $\forall x \, \forall y \;\; c(x) \land c(y) \rightarrow x = y$. (Uniqueness)

(ii) For each c in N, $\forall x \;\; c(x) \rightarrow \mathit{con}(x)$. (Constant property)


Figure 1
Three constraints expressed as formulae and also depicted graphically: $\mathit{arc}(n, a_1, b_1)$; $\mathit{arc}(n, a_1, n') \land \mathit{arc}(n, a_2, n')$; $\mathit{arc}(n, a_2, b_2)$.

addition to the usual finite acyclic feature structures, they allow infinite structures, cyclic structures, structures in which complex values serve as attributes, etc. While ruled out by fiat in standard treatments, admitting these additional structures causes no linguistic difficulties that I am aware of (in fact, some analyses crucially depend on their existence, as described in section 2.1.3 of Johnson [1988]), so in the interests of parsimony additional constraints that forbid them are not stipulated.

In fact, because SB formulae possess the finite model property (i.e., if an SB formula has a model, then it has a finite model), restricting attention to finite models does not change the set of satisfiable SB formulae. Therefore it could have no effect on the set of well-formed utterances. Cyclic feature structures can be prohibited with a constraint formalizable in SB, as described in Johnson (1991b), and one can express a constraint in SB that requires that all attributes are "attribute-value constants" (even though there appears to be no linguistic motivation for such a constraint, and indeed, some analyses crucially depend on this not being the case, as pointed out in Johnson [1988]).

To summarize, the simplest SB axioms defining attribute-value structures are quite permissive, allowing a wider range of structures to count as attribute-value structures than many other formalizations. However, all of the major restrictions on attribute-value structures discussed in the literature either have no effect whatsoever in this framework, or else can be directly stated as additional SB constraints.

3.3 Expressing Feature Structure Constraints with SB

In this approach, simple attribute-value constraints are represented by quantifier-free atomic formulae. For example, a constraint that the value of n's $a_1$ arc is $b_1$ would be represented by the atom $\mathit{arc}(n, a_1, b_1)$, a constraint that the value of n's $a_2$ arc is $b_2$ is represented by $\mathit{arc}(n, a_2, b_2)$, and a constraint that the value of n's $a_1$ arc is the same as the value of its $a_2$ arc is represented by the conjunction $\mathit{arc}(n, a_1, n') \land \mathit{arc}(n, a_2, n')$ ($n'$ is the single value of both arcs). These three constraints are depicted graphically in Figure 1. Note that the graphs in this figure are (depictions of) formulae, not attribute-value feature structures.


A major motivation for using SB is that a wide variety of constraints, in addition to standard attribute-value constraints, can be expressed using it. This allows a grammar developer to introduce a wide variety of "designer features" with possibly idiosyncratic, customized properties, while guaranteeing that the composite system is decidable (usually in NP-time, as noted above).

For example, suppose we want to impose sort restrictions of the following kind. To abbreviate the lexical entries of verbs we might introduce the one-place predicate 3rd-sg, where 3rd-sg(x) indicates that the value of x's person attribute is 3rd and x's number attribute is singular. This constraint can be expressed using the following SB formula.

$\forall x \;\; \text{3rd-sg}(x) \leftrightarrow \mathit{arc}(x, \mathit{person}, \mathit{3rd}) \land \mathit{arc}(x, \mathit{number}, \mathit{singular}).$ (5)

Similarly, constraints that restrict the possible values of certain attributes can be imposed. For example, one might want to require that the value of every arc labeled number is either singular or plural. This constraint can be expressed as the following SB formula.

$\forall x \, \forall y \;\; \mathit{arc}(x, \mathit{number}, y) \rightarrow y = \mathit{singular} \lor y = \mathit{plural}.$ (6)

These examples demonstrate only a small fraction of the variety of the feature structure constraints that can be expressed in SB. Even though all of these examples are based on attribute-value features, other sorts of features can be described in SB as well. For example, Johnson (1991a) shows how to formulate a variety of constraints on 'set-valued' features in SB.
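As an illustration of how such sort and value restrictions behave on a concrete structure, the following minimal Python sketch evaluates them over a finite set of arcs. It makes two assumptions not taken from the paper: constants are identified with their names (so the element denoted by singular is the string "singular"), and for the 3rd-sg abbreviation only the direction "3rd-sg(x) implies both arcs" is tested. The function names are hypothetical.

```python
def satisfies_number_restriction(arcs):
    """Formula (6): every value of a 'number' arc is singular or plural."""
    return all(
        y in ("singular", "plural")
        for (_x, a, y) in arcs
        if a == "number"
    )


def satisfies_third_sg(arcs, third_sg):
    """Every element in the extension of 3rd-sg has the two required arcs."""
    return all(
        (x, "person", "3rd") in arcs and (x, "number", "singular") in arcs
        for x in third_sg
    )


# A verb node v1 marked 3rd-sg, and a noun node n1 with a number arc.
arcs = {
    ("v1", "person", "3rd"),
    ("v1", "number", "singular"),
    ("n1", "number", "plural"),
}
bad_arcs = arcs | {("n2", "number", "dual")}
```

The structure `arcs` satisfies both restrictions when 3rd-sg holds of `v1` only; `bad_arcs` violates the restriction on number values, and declaring `n1` to be 3rd-sg violates the sort restriction.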

3.4 Expressing Tree Structure Constraints with SB Formulae

Inspired by the work on description theory or 'D-theory' (Marcus, Hindle, and Fleck 1983; Vijay-Shanker 1992), this section shows how some elementary constraints on precedence and dominance in a tree can be expressed as SB formulae. It differs from that work in that different kinds of constraints are expressible (Vijay-Shanker was concerned with the formalization of a different kind of grammar), and that all of the constraints expressible in the system described below are decidable (this follows from the fact that they are defined and expressed using Schönfinkel-Bernays' formulae). These constraints are intended to appear as annotations on phrase structure rules (in the same way that attribute-value constraints do) and could be used to enforce a variety of "long-distance" relationships, such as the co- and contra-indexing constraints of binding theory (i.e., equality and inequality constraints on the values of index attributes).

The axiomatization begins by defining the primitive tree structure relations precedes and dominates. Once these primitive tree structure relations are defined, they can be used to approximate more complex relationships such as c-commands, as described below. All of these axioms are in the Schönfinkel-Bernays' class, so the satisfiability of arbitrary boolean combinations of such constraints is decidable.


that D is a weak partial order over the nodes in a tree. In what follows, N(x) is interpreted as meaning that x is a tree node.

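$\forall x \;\; \neg\, x < x$ (irreflexivity) (7a)

$\forall x \, \forall y \;\; x < y \rightarrow \neg\, y < x$ (asymmetry) (7b)

$\forall x \, \forall y \, \forall z \;\; x < y \land y < z \rightarrow x < z$ (transitive closure) (7c)

$\forall x \;\; D(x, x) \leftrightarrow N(x)$ (reflexivity) (8a)

$\forall x \, \forall y \;\; D(x, y) \land D(y, x) \rightarrow x = y$ (antisymmetry) (8b)

$\forall x \, \forall y \, \forall z \;\; D(x, y) \land D(y, z) \rightarrow D(x, z)$ (transitive closure) (8c)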

Axiom (9) requires that there is a node that dominates all other nodes, and axiom (10) requires that for each pair of nodes either one precedes the other or one dominates the other. Axiom (11) enforces the "no tangling" constraint.

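$\exists x \;\; N(x) \land \forall y \;\; N(y) \rightarrow D(x, y)$ (single root condition) (9)

$\forall x \, \forall y \;\; N(x) \land N(y) \rightarrow ((x < y \lor y < x) \leftrightarrow \neg(D(x, y) \lor D(y, x)))$ (exclusivity) (10)

$\forall w \, \forall x \, \forall y \, \forall z \;\; w < x \land D(w, y) \land D(x, z) \rightarrow y < z$ (nontangling condition) (11)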

Finally, the following axioms (implicit in the standard treatments cited above) require the precedence and dominance relations to range over tree nodes.

$\forall x \, \forall y \;\; x < y \rightarrow N(x) \land N(y)$ (12a)

$\forall x \, \forall y \;\; D(x, y) \rightarrow N(x) \land N(y)$ (12b)
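One way to get a feel for these axioms is to build the precedence and dominance relations of a small concrete tree and test each axiom by brute force. The Python sketch below does this under stated assumptions: the domain of quantification is exactly the set of tree nodes (so axiom (8a) reduces to reflexivity of D), axiom (10) is read as "exactly one of precedence or dominance holds between any two nodes", and the representation (`children` as a dict from a node to its ordered daughters) and function names are hypothetical.

```python
from itertools import product


def tree_relations(children, root):
    """Return (N, D, prec) for a tree given as node -> ordered daughters."""
    nodes = []

    def visit(n):
        nodes.append(n)
        for c in children.get(n, []):
            visit(c)

    visit(root)
    dom = set()  # reflexive, transitive dominance

    def descend(ancestor, n):
        dom.add((ancestor, n))
        for c in children.get(n, []):
            descend(ancestor, c)

    for n in nodes:
        descend(n, n)
    prec = set()  # x precedes y iff they sit under left/right sisters
    for n in nodes:
        kids = children.get(n, [])
        for i, left in enumerate(kids):
            for right in kids[i + 1:]:
                for x, y in product(nodes, repeat=2):
                    if (left, x) in dom and (right, y) in dom:
                        prec.add((x, y))
    return set(nodes), dom, prec


def check_tree_axioms(N, D, prec):
    """Evaluate axioms (7a)-(12b) with quantifiers ranging over N."""
    lt = lambda x, y: (x, y) in prec
    d = lambda x, y: (x, y) in D
    pairs = list(product(N, repeat=2))
    triples = list(product(N, repeat=3))
    return {
        "7a": all(not lt(x, x) for x in N),
        "7b": all(not (lt(x, y) and lt(y, x)) for x, y in pairs),
        "7c": all(lt(x, z) or not (lt(x, y) and lt(y, z)) for x, y, z in triples),
        "8a": all(d(x, x) for x in N),
        "8b": all(x == y or not (d(x, y) and d(y, x)) for x, y in pairs),
        "8c": all(d(x, z) or not (d(x, y) and d(y, z)) for x, y, z in triples),
        "9": any(all(d(x, y) for y in N) for x in N),
        "10": all((lt(x, y) or lt(y, x)) != (d(x, y) or d(y, x)) for x, y in pairs),
        "11": all(
            lt(y, z) or not (lt(w, x) and d(w, y) and d(x, z))
            for w, x, y, z in product(N, repeat=4)
        ),
        "12": all(x in N and y in N for x, y in D | prec),
    }


children = {"S": ["NP", "VP"], "VP": ["V", "NP2"]}
N, D, prec = tree_relations(children, "S")
results = check_tree_axioms(N, D, prec)
tangled = check_tree_axioms(N, D, prec | {("V", "NP")})
```

For this tree every entry of `results` is true. Adding the pair ("V", "NP") to the precedence relation makes NP and V precede each other, so `tangled` reports a violation of the asymmetry axiom (7b).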

This concludes the specification of linear precedence and dominance relations over nodes. We now turn to the specification of other relations in terms of these. The proper dominance relation P can be defined in terms of dominance as follows.

$\forall x \, \forall y \;\; P(x, y) \leftrightarrow x \neq y \land D(x, y).$ (13)

However, many interesting linguistic relations cannot be defined by Schönfinkel-Bernays' axioms. For example, the c-commands relation C is defined by the following formula (which says that x c-commands y iff x does not dominate y, and every node z that properly dominates x also properly dominates y).

$\forall x\, \forall y\; C(x, y) \leftrightarrow \neg D(x, y) \wedge \forall z\; P(z, x) \rightarrow P(z, y).$ (14)

It is easy to see that this definition is not equivalent to a Schönfinkel-Bernays' formula by expanding the equivalence into two implications and moving the embedded quantifier out.

$\forall x\, \forall y\, \forall z\; C(x, y) \rightarrow (\neg D(x, y) \wedge (P(z, x) \rightarrow P(z, y))).$ (14a)

$\forall x\, \forall y\, \exists z\; (\neg D(x, y) \wedge (P(z, x) \rightarrow P(z, y))) \rightarrow C(x, y).$ (14b)

Formula (14b) is not in SB because it contains an existential quantifier inside the scope of a universal quantifier. There are a number of ways to respond to this problem.


Computational Linguistics Volume 20, Number 1

restricted (just as in SB). However, it seems to be difficult to devise a decidable system capable of simultaneously expressing both tree structure and the variety of feature structure constraints that the SB approach described here can. Blackburn, Gardent, and Meyer-Viol (1993) introduce a modal language $L^T$ for describing trees decorated with feature structures, whose satisfiability problem is undecidable. In the long run, such specialized "designer logics" may provide the most satisfying integration of tree structure and feature structure constraints.

Second, the 'one-sided' approximation (14a) can be used in place of the correct axiom (14). The effect of using such one-sided approximations was investigated in Johnson (1991a). It was shown there that if $\varphi$ is a formula such as the one in (14) and $\varphi'$ is the one-sided approximation (14a), then for any formula $\psi^{+}(C)$ in which C only appears positively, $\varphi \wedge \psi^{+}(C)$ is satisfiable iff $\varphi' \wedge \psi^{+}(C)$ is satisfiable. That is, if we are concerned only with positively occurring constraints, we can simplify (14) to (14a), i.e., ignore (14b), without affecting constraint satisfiability.

Third, we can regard formulae such as (14) as the "macro" (15), used to expand constraints at the interface between the syntactic rules and the constraint solver. This "macro expansion" rewrites c-commands constraints into boolean combinations of constraints that the constraint solver can handle.

$C(x, y) \Rightarrow \neg D(x, y) \wedge \forall z\; P(z, x) \rightarrow P(z, y).$ (15a)

$\neg C(x, y) \Rightarrow D(x, y) \vee \exists z\; (P(z, x) \wedge \neg P(z, y)).$ (15b)

The second and the third approaches differ in important ways. In the second approach, c-commands is a relation that is "understood" by the constraint solver (albeit only in its one-sided form), so it can be used to define other relations. In the third approach, c-commands constraints are not primitive constraints, so relations defined in terms of c-commands must also be expressible in terms of "macro expansion." In the second approach, constraints are quantifier-free formulae (quantifiers appear only in the axioms), so the satisfiability problem is in NP. But in the third approach, macro expansion produces formulae that contain additional quantifiers, so the satisfiability problem may be PSPACE-complete.
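To make the content of definition (14) concrete, it can be evaluated over one fully specified finite tree. The sketch below does only that; it is not a constraint solver and implements neither the one-sided approximation nor the macro expansion. The function names and the encoding of D as a set of pairs are assumptions made for illustration.

```python
def proper_dominance(dominates):
    """P(x, y) iff x != y and D(x, y), as in (13)."""
    return {(x, y) for (x, y) in dominates if x != y}

def c_commands(nodes, dominates):
    """All pairs (x, y) that satisfy the right-hand side of (14)."""
    P = proper_dominance(dominates)
    return {
        (x, y)
        for x in nodes
        for y in nodes
        if (x, y) not in dominates                           # x does not dominate y
        and all((z, y) in P for z in nodes if (z, x) in P)   # every z above x is above y
    }

# S dominates everything; VP dominates V and Obj.
nodes = {"S", "NP", "VP", "V", "Obj"}
dominates = ({(n, n) for n in nodes}
             | {("S", n) for n in nodes}
             | {("VP", "V"), ("VP", "Obj")})
cc = c_commands(nodes, dominates)
```

In this tree the subject NP c-commands the object, but not conversely, because VP properly dominates Obj without properly dominating NP.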

3.5 Limitations on Constraints Expressible with SB Formulae

Of course, SB is not as expressive as full first-order logic. It is incapable of expressing functional relationships, since these require an existential quantifier inside the scope of a universal quantifier. This means, among other things, that it is impossible to state a constraint in SB requiring that a certain node must exist (as was noted in the discussion of c-command in the previous section) or that all nodes possess certain attributes. Thus for example, the following constraint, which requires that every tensed entity possess number and person attributes, is a first-order formula that is not in SB, since it requires a functional relationship between entities with tense attributes and the values of their number and person attributes.

$\forall x\, \forall y\; arc(x, tense, y) \rightarrow (\exists z\; arc(x, number, z)) \wedge (\exists z\; arc(x, person, z))$ (16)

Similarly, a number of other extensions to the basic attribute-value framework discussed in the literature cannot be formalized in SB. Subsumption constraints, used in the treatment of (natural language) conjunction, are not expressible as SB formulae because the satisfiability problem for conjunctions of subsumption and attribute-value constraints is undecidable (Dörre and Rounds 1992). Positively occurring functional uncertainty constraints, used in the LFG treatment of long-distance dependencies (Kaplan and Zaenen 1989), appear to have a decidable satisfiability problem (Kaplan and Maxwell 1988a), but the satisfiability problem for arbitrary boolean combinations of functional uncertainty constraints is undecidable (Keller 1991), so these cannot be expressed using SB formulae either (since the quantifier-free subclass of SB is closed under boolean operations).

(E1) $\forall x\; x = x.$

(E2) $\forall x\, \forall y\; x = y \rightarrow y = x.$

(E3) $\forall x_0 \ldots \forall x_n\; x_j = x_0 \wedge P(x_1, \ldots, x_j, \ldots, x_n) \rightarrow P(x_1, \ldots, x_0, \ldots, x_n)$
for $j = 1, \ldots, n$, for every predicate symbol P appearing in $\varphi$.

(E4) $\forall x_0 \ldots \forall x_n\; x_j = x_0 \rightarrow f(x_1, \ldots, x_j, \ldots, x_n) = f(x_1, \ldots, x_0, \ldots, x_n)$
for $j = 1, \ldots, n$, for every function symbol f appearing in $\varphi$.

Figure 2
Equality axiom schemata for a first-order formula $\varphi$.

3.6 The Equality Relation

In this paper the intended interpretation of the equality relation is identity; i.e., a = b if and only if a and b denote the same individual. However, for some purposes (e.g., in the least-fixed-point characterization of minimal models given below) this "special" interpretation of the equality complicates matters, and it is more convenient to treat the equality relation as a "normal" relation that is defined by a set of axioms E.

The idea is that E has the property that a formula $\varphi$ is satisfiable under the identity interpretation of equality if and only if $\{\varphi\} \cup E$ is satisfiable in an interpretation in which equality is not given any special treatment. In effect, the axioms E require that the equality relation denotes an equivalence relation, and permit the substitution of equals for equals. Together these imply that no predicate can distinguish equal individuals. This means that in terms of satisfiability and the consequence relation, exactly the same results are obtained irrespective of whether equality is treated as identity or defined by the axioms E.

Such treatments of equality in first-order logic are well known and described in standard texts. For example, Chang and Lee (1973) give the axiom schemata in Figure 2, which generate syntactic equality axioms E for a first-order formula $\varphi$, and prove that E has the properties just described.8

What is important here is that for an SB formula $\varphi$ the instances of the axiom schemata are all SB formulae, and there are only finitely many instances of these schemata.

This means that for an arbitrary SB formula $\varphi$ there is another SB formula $\varphi_E$ (the conjunction of these instances) such that $\varphi$ is satisfiable with respect to an identity interpretation of equality if and only if $\varphi \wedge \varphi_E$ is satisfiable with respect to an interpretation in which equality is treated like any other relation. Thus a method for determining the satisfiability of SB formulae without equality can be used to determine satisfiability of SB formulae in which equality is interpreted as identity.

8 Of course, (E4) has no instances if $\varphi$ is an SB formula, since SB formulae do not contain function symbols.


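(17) $\forall x\; x = x.$

(18) $\forall x\, \forall y\; x = y \rightarrow y = x.$

(19) $\forall x\, \forall a\, \forall y\, \forall x_1\; x = x_1 \wedge arc(x, a, y) \rightarrow arc(x_1, a, y).$

(20) $\forall x\, \forall a\, \forall y\, \forall a_1\; a = a_1 \wedge arc(x, a, y) \rightarrow arc(x, a_1, y).$

(21) $\forall x\, \forall a\, \forall y\, \forall y_1\; y = y_1 \wedge arc(x, a, y) \rightarrow arc(x, a, y_1).$

(22) $\forall c\, \forall c_1\; c = c_1 \wedge con(c) \rightarrow con(c_1).$

(23) $\forall x\, \forall y\; x = y \wedge \text{3rd-sg}(x) \rightarrow \text{3rd-sg}(y).$

Figure 3
The equality axioms for arc, con, and 3rd-sg predicates.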

For example, consider the SB formulae in (1-4) and (5-16). These contain the three-place relation symbol arc and the one-place relation symbols con and 3rd-sg. The equality axioms obtained from schemata (E1-E4) for any system of constraints that mention just these relations are given in Figure 3.
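The instantiation of the schemata is mechanical, which is why only finitely many instances arise. As a rough sketch (the function `equality_axioms`, the string rendering of formulae, and the dictionary of arities are all assumptions for illustration), the instances of (E1)-(E3) for a given set of predicate symbols can be enumerated as follows. The sketch omits (E4), since SB formulae contain no function symbols, and only instantiates (E3) for the predicates it is given.

```python
def equality_axioms(predicates):
    """Instances of schemata (E1)-(E3), rendered as strings.

    predicates -- dict mapping each predicate name to its arity.
    """
    axioms = ["forall x. x = x", "forall x y. x = y -> y = x"]
    for name, arity in predicates.items():
        args = [f"x{i}" for i in range(1, arity + 1)]
        for j in range(arity):
            # Replace the j-th argument by x0 in the consequent.
            new_args = args[:j] + ["x0"] + args[j + 1:]
            axioms.append(
                f"forall x0 {' '.join(args)}. "
                f"{args[j]} = x0 & {name}({', '.join(args)})"
                f" -> {name}({', '.join(new_args)})"
            )
    return axioms

figure3 = equality_axioms({"arc": 3, "con": 1, "3rd-sg": 1})
```

For the three predicates of the running example this yields seven axioms, one for each of (17)-(23) up to renaming of variables.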

4. Clausal Form and Disjunctive Datalog

It is technically easier to work with a syntactically restricted class of SB formulae where the body of each formula has a particular syntactic form known as clausal form or Skolem standard form.

Definition

A clause is a formula of the form $\neg\alpha_1 \vee \cdots \vee \neg\alpha_m \vee \beta_1 \vee \cdots \vee \beta_n$, where each $\alpha_i$ and $\beta_j$ is an atomic formula (i.e., is of the form $p(t_1, \ldots, t_n)$), and $m, n \geq 0$. A formula $\varphi$ is in clausal form iff it is a conjunction of clauses.

The $\neg\alpha_i$ are called negative literals and the $\beta_j$ are called positive literals. A clause for which $m = 0$ (i.e., one that consists solely of positive literals) is called a positive clause, and one for which $n = 0$ (i.e., one that consists solely of negative literals) is called a negative clause. A clause for which $n = 1$ is called a definite clause. A Horn clause is a clause for which $n \leq 1$, i.e., either a negative clause or a definite clause.
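These definitions depend only on the counts m and n, so they translate directly into a small data structure. The following sketch is illustrative only: the class name `Clause`, its field names, the representation of atoms as strings, and the extra label "non-Horn" for clauses with two or more positive literals are assumptions, not notation from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Clause:
    negative: tuple = ()   # the atoms alpha_1 .. alpha_m (each occurs negated)
    positive: tuple = ()   # the atoms beta_1 .. beta_n

    def kinds(self):
        """Labels from the definition that apply to this clause."""
        m, n = len(self.negative), len(self.positive)
        labels = []
        if m == 0:
            labels.append("positive")
        if n == 0:
            labels.append("negative")
        if n == 1:
            labels.append("definite")
        labels.append("Horn" if n <= 1 else "non-Horn")
        return labels

functionality = Clause(negative=("arc(x,a,y)", "arc(x,a,z)"), positive=("y=z",))
fact = Clause(positive=("con(sg)",))
clash = Clause(negative=("con(x)", "arc(x,a,y)"))
disjunctive = Clause(negative=("N(x)", "N(y)"), positive=("p(x,y)", "q(x,y)"))
```

A simple assertion such as con(sg) is simultaneously a positive clause and a definite clause, since it has no negative literals and exactly one positive literal.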

Abusing notation somewhat, a formula $\varphi$ in clausal form will sometimes also be treated as the set of the clauses that make up the conjunction $\varphi$. Similarly, because clauses are used as rewriting rules below, the clause $\neg\alpha_1 \vee \cdots \vee \neg\alpha_m \vee \beta_1 \vee \cdots \vee \beta_n$ will sometimes be written as the equivalent implication $\alpha_1 \wedge \cdots \wedge \alpha_m \rightarrow \beta_1 \vee \cdots \vee \beta_n$.

A formula $\varphi$ in clausal form does not contain any quantifier symbols. As is standard, all variables in $\varphi$ are treated as implicitly universally quantified at the clausal level. Existentially quantified variables in SB formulae are inessential, in that they can always be directly replaced by Skolem constants.

Restricting attention to SB formulae in clausal form imposes no real restriction on the class of constraints expressible. Standard procedures for transforming first-order formulae into clausal form, such as the ones described in Chang and Lee (1973), Duffy (1991), or Genesereth and Nilsson (1987), transform SB formulae into SB formulae in clausal form.

Interestingly, clausal form SB formulae correspond one-to-one with an extension to Datalog (Ullman 1988) that allows disjunctive "heads" or consequences. In notation borrowed from disjunctive logic programming (Kowalski 1979; Lobo, Minker, and Rajasekar 1992; Loveland 1987), the clauses above would be written as listed in the appendix. While the connection between logic programming and feature structures is well known (Aït-Kaci 1984; Aït-Kaci and Podelski 1993; Carpenter 1991, 1992; Höhfeld and Smolka 1988; Pereira 1987; Shieber 1992; Smolka 1992), this shows that the theory of feature structure constraints is also related to database theory.

Negative clauses correspond to Datalog integrity constraints, and clauses with a single positive literal are definite clauses. Simple assertions, e.g., about the existence of arcs, consisting of exactly one positive literal are Datalog atomic clauses. Clauses with two or more positive literals cannot be expressed in Datalog itself, but require the disjunctive extension of Datalog. The appendix displays all of the SB formulae mentioned in this paper so far in clausal form in Datalog notation; (6'), (10') and (7'') are expressed in the disjunctive extension to Datalog. In fact, the axioms defining attribute-value structures (1-4) and syntactic equality (E1-E3) are all Horn Datalog clauses; i.e., the disjunctive extension is not needed for defining attribute-value feature structures.

5. Determining the Satisfiability of SB Formulae

This section describes a forward-chaining algorithm for determining the satisfiability of SB formulae in clausal form. This algorithm is a nondeterministic variant of the semi-naive evaluation method for Datalog clauses in which the union-find algorithm is used to efficiently maintain equivalence classes of equal terms. It is also recognizable as a generalization of the standard unification algorithm for feature structures to arbitrary Horn SB constraints.9 The treatment is informal because the goal of the section is to point out several important standard implementation techniques rather than to advance a totally new algorithm.

The key intuition behind the algorithm is this. To demonstrate the satisfiability of a set S of clauses, it is sufficient to exhibit a set A of ground atoms drawn from the Herbrand base of S such that the following conditions hold (the next section proves this assertion).

(a) For no ground instance $\neg\alpha'_1 \vee \cdots \vee \neg\alpha'_m$ of any negative clause in S are all of $\alpha'_1, \ldots, \alpha'_m$ in A. (If they were, then that clause would be falsified by A.)

(b) For each ground instance $\beta'_1 \vee \cdots \vee \beta'_n$ of a positive clause in S, at least one of the $\beta'_i$ is in A.

(c) For each ground instance $\alpha'_1 \wedge \cdots \wedge \alpha'_m \rightarrow \beta'_1 \vee \cdots \vee \beta'_n$ of an implication in S, if all of the $\alpha'_1, \ldots, \alpha'_m$ are in A then so is at least one of the $\beta'_1, \ldots, \beta'_n$. (In fact, the other two conditions are just special cases of this condition.)
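The remark that (a) and (b) are special cases of (c) can be seen in a direct check of the conditions. The sketch below assumes that the clauses have already been grounded and that each ground clause is a pair `(antecedents, consequents)` of tuples of atoms; the function name `satisfies_conditions` and this encoding are illustrative assumptions.

```python
def satisfies_conditions(ground_clauses, atoms):
    """Check condition (c) for every ground clause.

    A negative clause has no consequents, so it fails as soon as all of its
    antecedents are in `atoms` (condition (a)). A positive clause has no
    antecedents, so it always requires a consequent in `atoms` (condition (b)).
    """
    return all(
        any(b in atoms for b in consequents)
        for antecedents, consequents in ground_clauses
        if all(a in atoms for a in antecedents)
    )

clauses = [
    ((), ("p", "q")),        # positive clause: p or q
    (("p",), ("r",)),        # implication: p -> r
    (("q", "r"), ()),        # negative clause: not q or not r
]
```

With these three clauses, {p, r} and {q} both pass the check, whereas {p} fails the implication and {q, r} falsifies the negative clause.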

9 It is a generalization of the algorithm described in Hegner (1991), which treats Horn combinations of attribute-value constraints.


5.1 Naive Evaluation

One could attempt to find such a set A in the following manner. First, one nondeterministically selects a $\beta'_i$ from each of the ground instances of the positive clauses in S and adds these to A. Then one attempts to close A with respect to condition (c); if all of the antecedents $\alpha'_1, \ldots, \alpha'_m$ of some (ground instance of an) implication are in A, then one of the consequents $\beta'_1, \ldots, \beta'_n$ is nondeterministically selected and added to A, unless at least one of them is already present. (Of course, all such nondeterministic paths might have to be investigated.) Periodically, condition (a) is checked; if it fails to hold, then this nondeterministic path on the search for A must be abandoned. Nondeterminism arises solely from the presence of disjunction in consequents of clauses; if S is a set of Horn clauses then the fixed-point calculation proceeds deterministically.

Ignoring the checking of condition (a), the method is essentially computing a fixed-point of the nonnegative clauses in S via a kind of iterative approximation known as naive evaluation. Naive evaluation is unnecessarily computationally inefficient. Once the set A is large enough to require an atom $\alpha$ to be added to A, naive evaluation "rediscovers" this requirement on all subsequent passes.
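The rediscovery problem is easy to observe in the deterministic Horn case. The following sketch counts how often a clause fires during naive evaluation; it assumes ground Horn clauses encoded as `(antecedents, consequent)` pairs, with `None` as the consequent of a negative clause. The function name and the `firings` counter are illustrative additions.

```python
def naive_evaluation(ground_clauses):
    """Naive bottom-up evaluation of ground Horn clauses.

    Returns (atoms, firings); atoms is None if a negative clause is falsified.
    """
    atoms, firings = set(), 0
    while True:
        new_atoms = set()
        for antecedents, consequent in ground_clauses:
            if all(a in atoms for a in antecedents):
                firings += 1              # counted again on every later pass
                if consequent is None:
                    return None, firings  # condition (a) fails
                new_atoms.add(consequent)
        if new_atoms <= atoms:            # nothing new: fixed point reached
            return atoms, firings
        atoms |= new_atoms

chain = [((), "p"), (("p",), "q"), (("q",), "r")]
model, firings = naive_evaluation(chain)
```

On this three-clause chain only three atoms are ever derived, yet the clauses fire nine times in total, because each pass re-derives everything found on earlier passes.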

5.2 Semi-Naive Evaluation

Semi-naive evaluation avoids rediscovering the same fact in the same way by insisting that each time a clause is applied at least one of the antecedents was just discovered on the previous round (Ullman 1988, 1989). This is done by maintaining two sets of atoms, $A$ and $\Delta A$, where $A$ is the set of atoms discovered one or more iterations ago, and $\Delta A$ is the set of atoms discovered at the last iteration. The nondeterministic semi-naive algorithm for computing a set A (if it exists) is sketched in Figure 4. In that algorithm choose is a "function" that nondeterministically picks one member from its set argument; it can be implemented using, e.g., backtracking. Ullman describes methods of matching clauses in S against the sets $A$ and $\Delta A$ that avoid calculating all of the ground instances of the clauses in S.

The semi-naive algorithm can be used directly with the syntactic equality axioms given in Section 3.6 as a decision procedure for SB formulae, and hence for systems of feature structure constraints. However, the resulting system is inefficient because the equality axioms, specifically the instances of schemata (E2) and (E3), cause the "copying" of any atom containing an argument that appears in an equality atom to all members of the equivalence class containing that argument.

For example, if p(a), q(b) and a = b are atoms in A, then instances of (E2) and (E3) ensure that p(b), q(a) and b = a will be added to $\Delta A$ and thence to A. In general, if it is discovered that n constants $a_1, \ldots, a_n$ are equal, then A will ultimately contain the $n^2$ equalities $a_i = a_j$, $1 \leq i \leq n$, $1 \leq j \leq n$, as well as at least n "copies" of any predicate containing any $a_i$.

5.3 Union-Find and Equality

As noted above, the equality axioms ensure that the relation denoted by the equality symbol is an equivalence relation and that equals can be substituted for equals. In general, the union-find algorithm (Cormen, Leiserson, and Rivest 1990; Gallier 1986; Nelson and Oppen 1980) maintains the equivalence classes of the equality relation far more efficiently than an approach that uses the syntactic equality axioms.

The equivalence classes are encoded by associating each constant with a pointer that is either null or points to another constant, where a points to b only if a = b. These pointers correspond exactly to the "invisible pointers" used in standard implementations of the attribute-value unification algorithm.


Input: A set of SB clauses S.

Output: A set of ground clauses A iff S is satisfiable.

$A := \emptyset$,

$\Delta A := \{\mathit{choose}(\{\beta'_1, \ldots, \beta'_n\}) : \beta'_1 \vee \cdots \vee \beta'_n$ is a ground instance of a positive clause in $S\}$,

until $\Delta A = \emptyset$ do

    $A := A \cup \Delta A$,

    if $\{\alpha'_1, \ldots, \alpha'_m\} \subseteq A$, where $\neg\alpha'_1 \vee \cdots \vee \neg\alpha'_m$ is a ground instance of a negative clause in S and at least one of the $\alpha'_i$ is in $\Delta A$,

    then fail,

    $\Delta A := \{\mathit{choose}(\{\beta'_1, \ldots, \beta'_n\}) : \alpha'_1 \wedge \cdots \wedge \alpha'_m \rightarrow \beta'_1 \vee \cdots \vee \beta'_n$ is a ground instance of an implication in S such that $\{\alpha'_1, \ldots, \alpha'_m\} \subseteq A$, at least one of the $\alpha'_i$ is in $\Delta A$ and no $\beta'_j$ is in $A\}$,

return A.

Figure 4
The semi-naive algorithm for computing A.
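The procedure in Figure 4 can be prototyped with backtracking standing in for choose. The sketch below is a simplification under stated assumptions: it works on clauses that have already been grounded (Ullman's matching methods, which avoid enumerating ground instances, are not attempted), each clause is a pair `(antecedents, consequents)` of tuples of atoms, and the generator `semi_naive` is a hypothetical name. It yields every set A reached along a non-failing path.

```python
from itertools import product

def semi_naive(ground_clauses):
    """Yield each set A that some sequence of choices in Figure 4 produces."""
    positive = [cons for ants, cons in ground_clauses if not ants]
    negative = [ants for ants, cons in ground_clauses if not cons]
    implications = [(ants, cons) for ants, cons in ground_clauses if ants and cons]

    def run(atoms, delta):
        if not delta:                       # until delta-A is empty
            yield atoms
            return
        atoms = atoms | delta
        # Condition (a), checked only against clauses touched by new atoms.
        for ants in negative:
            if set(ants) <= atoms and set(ants) & delta:
                return                      # fail: abandon this path
        pending = [cons for ants, cons in implications
                   if set(ants) <= atoms and set(ants) & delta
                   and not set(cons) & atoms]
        # Backtrack over one chosen consequent per pending clause.
        for choice in product(*pending):
            yield from run(atoms, frozenset(choice))

    for choice in product(*positive):
        yield from run(frozenset(), frozenset(choice))

clauses = [
    ((), ("p", "q")),      # p or q
    (("p",), ("r",)),      # p -> r
    (("q",), ("s",)),      # q -> s
    (("r",), ()),          # not r
]
models = list(semi_naive(clauses))
```

Here the path that chooses p derives r and is abandoned at the negative clause, while the path that chooses q closes off with the atoms q and s. An empty result means that every path failed, i.e., that the clause set is unsatisfiable.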

The find operation dereferences its argument, i.e., it follows these pointers until it reaches a constant with a null pointer, which is the equivalence class's representative. Just as in the standard attribute-value unification algorithm, all arguments are always dereferenced before they are used.

The union operation, called whenever an atom a = b is added to the set A, merges their equivalence classes by redirecting the pointer associated with the representative of one of them to point to the representative of the other.10

In this approach, only atoms that contain the redirected constant need to be added to $\Delta A$ and thence to A. For example, if p(a) and q(b) are atoms in A and the equality a = b is discovered, causing a to be redirected to b, then only p(b) is added to $\Delta A$, and thence to A. Further, the "original" atom p(a) is no longer required; indeed, the new atom p(b) is exactly an argument-dereferenced variant of the old atom, so it is not necessary to copy the atom at all. In general, equalities between n items are represented by n - 1 nonnull pointers, and copying of atoms can be avoided by argument dereferencing.
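The pointer encoding and argument dereferencing can be sketched in a few lines. This is a deliberately bare version: it omits path compression and union by rank, the class and function names are assumptions, and atoms are encoded as tuples whose first element is the predicate name.

```python
class UnionFind:
    def __init__(self):
        self.pointer = {}      # constant -> constant; absent means a null pointer

    def find(self, c):
        """Dereference c by following pointers to the class representative."""
        while c in self.pointer:
            c = self.pointer[c]
        return c

    def union(self, a, b):
        """Record a = b; return the representative that was redirected, if any."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return None        # already known to be equal
        self.pointer[ra] = rb
        return ra

def dereference(atom, uf):
    """Replace every argument of an atom by its class representative."""
    pred, *args = atom
    return (pred, *(uf.find(x) for x in args))

uf = UnionFind()
redirected = uf.union("a", "b")                 # a now points to b
p_atom = dereference(("p", "a"), uf)            # read p(a) through the pointers
uf.union("b", "c")
```

After the two unions the three constants a, b, and c share one class that is represented by just two nonnull pointers, and the stored atom p(a) never has to be copied: dereferencing it on use yields the current representative.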

5.4 An Example

This section presents a very simple example that demonstrates the semi-naive algorithm and the union-find techniques. The clauses used are the attribute-value axiom schemata (1-4) and the axioms defining the sort 3rd-sg (5), as well as the additional constraints 3rd-sg(u), arc(u, number, v), and $v \neq sg$. These latter constraints assert the existence of an entity (denoted by u) with the property 3rd-sg and with a number arc whose value v is something other than sg. The complete set of clauses is given in Figure 5.

10 The union-find algorithm achieves quasi-linear running time when it incorporates path compression and union by rank (Cormen, Leiserson, and Rivest 1990).

S = {

3rd-sg(u), arc(u, number, v), $v \neq sg$, (constraints)

$\forall x\, \forall a\, \forall y\, \forall z\; arc(x, a, y) \wedge arc(x, a, z) \rightarrow y = z$, (AV axioms)

$\forall x\, \forall a\, \forall y\; \neg con(x) \vee \neg arc(x, a, y)$, (AV axioms)

con(sg), con(pl), con(3rd), (AV axioms)

$sg \neq pl$, $sg \neq 3rd$, $pl \neq 3rd$, (AV axioms)

$\forall x\; \text{3rd-sg}(x) \rightarrow arc(x, person, 3rd)$, (sort defn.)

$\forall x\; \text{3rd-sg}(x) \rightarrow arc(x, number, sg)$ (sort defn.)

}

Figure 5
Input clauses.

In practice the attribute-value axioms would probably not be explicitly enumerated but "built in," so that appropriate instances are generated only when needed.

Now we proceed to iteratively calculate the sets $\Delta A$ and $A$ using the algorithm in Figure 4. We calculate the initial set of new atoms, $\Delta A_0$. $A_0 = \emptyset$, of course.

$\Delta A_0$ = {3rd-sg(u), arc(u, number, v), con(sg), con(pl), con(3rd)}

On the first iteration we note that the antecedents of both clauses defining the sort 3rd-sg are satisfied (with x bound to u), so $\Delta A_1$ is given as follows.

$A_1$ = {3rd-sg(u), arc(u, number, v), con(sg), con(pl), con(3rd)}

$\Delta A_1$ = {arc(u, person, 3rd), arc(u, number, sg)}

In the second iteration the antecedents of the first attribute-value axiom are satisfied, so $\Delta A_2$ contains an equality atom.

$A_2$ = {3rd-sg(u), arc(u, number, v), con(sg), con(pl), con(3rd), arc(u, person, 3rd), arc(u, number, sg)}

$\Delta A_2$ = {v = sg}

The equality atom causes v to be redirected to sg, and at this stage the inconsistency of the derived atom v = sg in $\Delta A_2$ with the input constraint $v \neq sg$ in S is detected. (It may be helpful to think of the constraint $v \neq sg$ as the equivalent clause $v = sg \rightarrow \mathit{false}$.) The algorithm therefore returns with failure, indicating that the set S is unsatisfiable. At the point at which the inconsistency is detected, the set A contains the following atoms, where $v \Rightarrow sg$ indicates that v is redirected to sg.

$A_3$ = {3rd-sg(u), arc(u, number, v), con(sg), con(pl), con(3rd), arc(u, person, 3rd), arc(u, number, sg), $v \Rightarrow sg$}

The correspondence of this procedure to the standard attribute-value unification algorithm is quite strong. In this procedure, the attribute-value axiom (1) detects situations in which some node has two arcs with the same label pointing to, say, y and z. If such a situation arises, the equality y = z is inferred, which results in y being redirected to z and causes all of the arcs leaving y to be added to $\Delta A$, where they will be compared with the arcs leaving z. The other attribute-value axiom schemata (2-4) detect constant-constant and constant-complex clashes, causing failure if one is found.

Efficient processing demands that the atoms in A be indexed by their arguments to speed up the matching of atoms with the antecedents of clauses. One way of doing this is to store on each constant a list of the atoms in which that constant appears. Such an index has the same structure as the standard graph encoding of feature structure constraints.
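One possible shape for such an index is a map from each constant to the atoms that mention it. The sketch below is illustrative only: the function name `build_index` and the tuple encoding of atoms are assumptions, and the atoms are those of $A_2$ from the example above.

```python
from collections import defaultdict

def build_index(atoms):
    """Map each constant to the set of atoms in which it appears as an argument."""
    index = defaultdict(set)
    for atom in atoms:
        for constant in atom[1:]:      # atom[0] is the predicate name
            index[constant].add(atom)
    return index

A2 = {
    ("3rd-sg", "u"),
    ("arc", "u", "number", "v"),
    ("arc", "u", "person", "3rd"),
    ("arc", "u", "number", "sg"),
    ("con", "sg"), ("con", "pl"), ("con", "3rd"),
}
index = build_index(A2)
```

Looking up u returns the sort atom together with the three arcs leaving u, which is the information that a graph encoding stores on the node for u. When v is redirected to sg, only the atoms in the entry for v need to be revisited.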

6. A Fixed-Point Theorem

We now turn to the theoretical justification of the bottom-up forward-chaining procedures sketched in the last section, and show that such methods will find a model for a set of SB formulae in clausal form if one exists. This section demonstrates that an SB formula in clausal form is satisfiable if and only if a bottom-up forward-chaining procedure finds a deductively closed set of atoms A. A similar theorem for the case in which all the clauses are Horn clauses is presented in Lloyd (1984); this section extends that work to arbitrary clauses.

It presents a characterization of the models of an arbitrary first-order formula $\varphi$ in clausal form in terms of the least-fixed points of a set $\{T_{\varphi,\chi}\}$ of partial functions from Herbrand interpretations to Herbrand interpretations. These functions have the property that A is a Herbrand interpretation that satisfies $\varphi$ if and only if the least-fixed point of at least one of them is a submodel of A.

For SB formulae this set of functions is finite and the least-fixed points are reached in a finitely bounded number of iterations. Since the procedures described in the last section calculate the least-fixed points of these functions, they can be used to determine the satisfiability of an arbitrary SB formula as well as all of its ground atomic consequences.

The functions $T_{\varphi,\chi}$ play a similar role here to the one that the transformation $T_P$ plays in the least-fixed-point semantics of Horn clause programs. Informally, each function in the set $\{T_{\varphi,\chi}\}$ corresponds to one whole sequence of nondeterministic choices of disjuncts in non-Horn clauses that could be made during an iterative approximation of the least-fixed point. This section is based on Sections 5 and 6 of Chapter 1 of Lloyd (1984), to which the reader should turn for further details.

The fixed-point theorem holds for arbitrary first-order formulae in clausal form, but the set $\{T_{\varphi,\chi}\}$ is finite if and only if $\varphi$ does not contain any function symbols, i.e., is an SB formula. Equality is not treated specially, so the formula $\varphi$ must contain appropriate equality axioms, as mentioned above.

Let U be the Herbrand universe with respect to $\varphi$ (i.e., the set of all terms that can be constructed using the constant and function symbols appearing in $\varphi$),11 and let $B_\varphi$ be the set of all ground atoms that can be formed using the predicate symbols of $\varphi$ with elements of U as arguments. A Herbrand interpretation A is a subset of $B_\varphi$.

Note that the set of Herbrand interpretations $2^{B_\varphi}$ partially ordered by the subset relation forms a complete lattice. Further, U and hence $B_\varphi$ are finite iff $\varphi$ is an SB formula. (If $\varphi$ is in clausal form but is not an SB formula then it must contain a function symbol, so its Herbrand universe U is infinite.)

11 As is standard, if $\varphi$ has no constant symbols then it is necessary to take U to be a set consisting of a single constant, say {a}. See, e.g., Chang and Lee (1973) for details.
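For a formula without function symbols the Herbrand base can be enumerated outright, which makes its finiteness tangible. The following is a small sketch under the assumption that the constants and the predicate arities have already been collected from the formula; the function name `herbrand_base` is illustrative.

```python
from itertools import product

def herbrand_base(constants, predicates):
    """All ground atoms over `constants` for predicates given as name -> arity."""
    # With no constants at all, fall back on a single dummy constant.
    universe = sorted(constants) or ["a"]
    return {
        (name, *args)
        for name, arity in predicates.items()
        for args in product(universe, repeat=arity)
    }

base = herbrand_base({"u", "v", "sg"}, {"arc": 3, "con": 1})
```

With three constants, a three-place predicate contributes 27 ground atoms and a one-place predicate contributes 3, so this base has 30 elements and there are $2^{30}$ Herbrand interpretations over it.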


Definition

A Herbrand interpretation A trivially falsifies a set of clauses $\varphi$ iff there is a ground instance $\neg\alpha'_1 \vee \cdots \vee \neg\alpha'_m$ of a negative clause $\neg\alpha_1 \vee \cdots \vee \neg\alpha_m$ in $\varphi$ such that $\{\alpha'_1, \ldots, \alpha'_m\} \subseteq A$.

That is, a Herbrand interpretation trivially falsifies a set of clauses $\varphi$ just in case some negative clause in $\varphi$ is false in that interpretation. Clearly, any such interpretation cannot satisfy $\varphi$, but the converse does not hold: there are interpretations that do not trivially falsify $\varphi$ but still do not satisfy $\varphi$ because they do not satisfy one or more of the nonnegative clauses in $\varphi$.

We turn now to the nonnegative clauses in $\varphi$. The idea is that even if A does not satisfy a ground instance $\alpha'_1 \wedge \cdots \wedge \alpha'_m \rightarrow \beta'_1 \vee \cdots \vee \beta'_n$ of some nonnegative clause in $\varphi$, we can extend A so that it does so by adding one of the $\beta'_i$. The chief technical difficulty here is caused by the nondeterminism involved in deciding which of the $\beta'_i$ to add, and a device called a "choice function" is introduced to choose, for each ground instance of a clause, which of the atoms in its consequent will be added to A if its antecedent is contained in A. A choice function for a clause $\alpha_1 \wedge \cdots \wedge \alpha_m \rightarrow \beta_1 \vee \cdots \vee \beta_n$ is therefore a function from all possible ways of grounding that clause to one of the $\beta_i$.

Definition

A choice function for a clause $c = (\alpha_1 \wedge \cdots \wedge \alpha_m \rightarrow \beta_1 \vee \cdots \vee \beta_n)$ is any function in $(V_c \rightarrow U) \rightarrow \{1, \ldots, n\}$, where $V_c$ is the set of variables in the clause c.

That is, a choice function for a clause is a function from variable assignments to an integer representing one of the clause's consequents. It is so named because for each variable assignment (i.e., each way of grounding the variables in the clause) it "chooses" an atom from the consequent of the clause. Horn clauses have only one choice function, and negative clauses have no choice functions at all. Note that since U is finite for SB formulae, there are a finite number of variable assignment functions for any SB clause and hence only a finite number of choice functions for any SB clause.
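The finiteness claim can be made quantitative with a short count. This is a worked illustration rather than a formula stated in the text; k is introduced here as a name for the number of variables $|V_c|$ in the clause. There are $|U|^k$ variable assignments, and a choice function sends each of them independently to one of the n consequents, so

```latex
\#\{\text{choice functions for } c\} \;=\; n^{\,|U|^{k}},
\qquad k = |V_c| .
```

For a definite clause (n = 1) the count is 1, and for a negative clause (n = 0) it is 0, in agreement with the remarks above. For a clause with two variables and two consequents over a universe of three constants it is $2^{9} = 512$.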

A choice function $\chi$ for a set of clauses $\varphi$ is a function from $\varphi$ to choice functions such that for each nonnegative clause c in $\varphi$, $\chi(c)$ is a choice function for c (the value that $\chi$ takes on negative clauses is ignored). Clearly, a choice function exists for every set of clauses.

Given a set of clauses $\varphi$ and a choice function $\chi$ for $\varphi$, we define a function $F_{\varphi,\chi}$ from Herbrand interpretations to Herbrand interpretations as follows.

Definition

$F_{\varphi,\chi}$ is a function in $2^{B_\varphi} \rightarrow 2^{B_\varphi}$ that is defined as follows.

$F_{\varphi,\chi}(A) = \{\theta(\beta_i) :$ for all nonnegative clauses $c = (\alpha_1 \wedge \cdots \wedge \alpha_m \rightarrow \beta_1 \vee \cdots \vee \beta_n)$ in $\varphi$ and for all variable assignments $\theta : V_c \rightarrow U$ such that $\theta(\alpha_1), \ldots, \theta(\alpha_m) \in A$ and $i = \chi(c)(\theta)\}$

Intuitively, $F_{\varphi,\chi}$ corresponds to one nondeterministic step in the 'bottom-up' construction of a Herbrand model for $\varphi$ described in the previous section. If A makes the antecedent of some ground instance of some clause in $\varphi$ true, then we use the choice function $\chi$ to pick an atom in the consequent of that ground clause and add it to the interpretation. Different choice functions $\chi$ represent different sequences of nondeterministic choices, and result in the construction of possibly different interpretations.
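One application of $F_{\varphi,\chi}$ can be computed by brute-force enumeration of variable assignments when U is finite. The sketch below makes several assumptions for illustration: clauses are pairs `(antecedents, consequents)` of tuples of atoms, atoms are tuples headed by the predicate name, terms that start with an upper-case letter are variables, and the choice function is passed as a curried Python function `chi(c)(theta)` returning a 1-based index.

```python
from itertools import product

def is_var(term):
    return term[:1].isupper()          # naming convention of this sketch only

def ground(atom, theta):
    """Apply the variable assignment theta to an atom."""
    pred, *args = atom
    return (pred, *(theta.get(t, t) for t in args))

def F(clauses, chi, universe, A):
    """One application of F_{phi,chi} to the Herbrand interpretation A."""
    result = set()
    for c in clauses:
        antecedents, consequents = c
        if not consequents:
            continue                   # negative clauses contribute nothing
        variables = sorted({t for atom in antecedents + consequents
                            for t in atom[1:] if is_var(t)})
        for values in product(sorted(universe), repeat=len(variables)):
            theta = dict(zip(variables, values))
            if all(ground(a, theta) in A for a in antecedents):
                i = chi(c)(theta)      # index of the chosen consequent
                result.add(ground(consequents[i - 1], theta))
    return result

clauses = (
    ((), (("N", "a"),)),                               # N(a)
    ((("N", "X"),), (("p", "X"), ("q", "X"))),         # N(X) -> p(X) or q(X)
)
first = lambda c: (lambda theta: 1)                    # always the first disjunct
last = lambda c: (lambda theta: len(c[1]))             # always the last disjunct
step1 = F(clauses, first, {"a"}, set())
step2 = F(clauses, first, {"a"}, step1)
```

Starting from the empty interpretation, the choice function `first` reaches the fixed point {N(a), p(a)} after two applications, whereas `last` leads to {N(a), q(a)} instead, illustrating how different choice functions build different interpretations.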
