Dependency Parsing

Norman MacAskill Fraser

Thesis submitted for the degree of PhD
University College London


ProQuest Number: 10106699


Published by ProQuest LLC(2016). Copyright of the Dissertation is held by the Author.

All rights reserved.


Abstract

Syntactic structure can be expressed in terms of either constituency or dependency. Constituency relations hold between phrases and their constituent lexical or phrasal parts. Dependency relations hold between individual words. Almost all results in formal language theory relate to constituency grammars, of which the phrase structure grammars are best known. In the realm of natural language description, almost all major linguistic theories express syntactic structure in terms of constituency. This dominance carries over into natural language processing, where most parsers are designed to discover the vertical constituency relations which hold between words and phrases, rather than the horizontal dependency relations which hold between pairs of words.

This thesis introduces dependency grammars, their formal properties, their origins in linguistic theory and, particularly, their use in parsers for natural language processing. A survey of dependency parsers, the most comprehensive to date, is presented. It includes detailed discussions of twelve published dependency parsing algorithms. The survey highlights similarities and differences between dependency parsing and mainstream phrase structure grammar parsing. In particular, it examines the hypotheses that (i) it is possible to construct a fully functional dependency parser based on an established phrase structure parsing algorithm without altering any fundamental aspects of the algorithm, and (ii) it is possible to construct a fully functional dependency parser using an algorithm which could not be applied without substantial modification in a fully functional phrase structure parser.

Elements of a taxonomy of dependency parsing are outlined. These include variables in origin, manner, order, and focus of search, as well as in the number of passes made during parsing, techniques for the management of ambiguity, and the use of an adjacency constraint to limit search.

Computer implementations of a number of original dependency parsing algorithms are presented in an Appendix, together with new implementations

Contents

Acknowledgements
Abbreviations
1 Introduction
  1.1 Scope of the thesis
  1.2 Chapter outline
2 Dependency grammar
  2.1 Overview
  2.2 Gaifman grammars
    2.2.1 Definitions
    2.2.2 A recognizer for Gaifman grammars
    2.2.3 Representing dependency structures
    2.2.4 The generative capacity of Gaifman grammars
  2.3 Beyond Gaifman grammars
  2.4 Origins in linguistic theory
  2.5 Related grammatical formalisms
    2.5.1 Case grammar
    2.5.2 Categorial grammar
    2.5.3 Head-driven phrase structure grammar
  2.6 Summary
    3.1.1 Machine translation systems
    3.1.2 Speech understanding systems
    3.1.3 Other applications
    3.1.4 Implementations of theories
    3.1.5 Exploratory systems
  3.2 PARS: Parsing Algorithm Representation Scheme
    3.2.1 Data structures
    3.2.2 Expressions
  3.3 Summary
4 The RAND parsers
  4.1 Overview
  4.2 The bottom-up algorithm
    4.2.1 Basic principles
    4.2.2 The parsing algorithm
  4.3 The top-down algorithm
    4.3.1 The parsing algorithm
  4.4 Summary
5 Hellwig’s PLAIN system
  5.1 Overview
  5.2 Dependency Representation Language
    5.2.1 The form of DRL expressions
    5.2.2 Word order constraints
    5.2.3 The base lexicon
    5.2.4 The valency lexicon
  5.3 The parsing algorithm
  5.4 The well-formed substring table
6 The Kielikone parser
  6.1 Overview
  6.2 Evolution of the parser
    6.2.1 The earliest version: two way finite automata
    6.2.2 A grammar representation language: DPL
    6.2.3 Constraint based grammar: FUNDPL
  6.3 The parser
    6.3.1 The grammar
    6.3.2 Blackboard-based control
    6.3.3 The parsing algorithm
    6.3.4 Ambiguity
    6.3.5 Long distance dependencies
    6.3.6 Statistics and performance
    6.3.7 Open questions
  6.4 Summary
7 The DLT MT system
  7.1 Overview
  7.2 Dependency grammar in DLT
  7.3 An ATN for parsing dependencies
  7.4 A probabilistic dependency parser
  7.5 Summary
8 Lexicase parsers
  8.1 Overview
  8.2 Lexicase theory
    8.2.1 Dependency in Lexicase
    8.2.2 Lexical entries in Lexicase
  8.3 Lexicase parsing
    8.3.2 Lindsey’s parser
  8.4 Summary
9 Word Grammar parsers
  9.1 Overview
  9.2 Word Grammar theory
    9.2.1 Facts about words
    9.2.2 Generalizations about words
    9.2.3 A single-predicate system
    9.2.4 Syntax in WG
    9.2.5 Semantics in Word Grammar
  9.3 Word Grammar parsing
    9.3.1 Fraser’s parser
    9.3.2 Hudson’s parser
  9.4 Summary
10 Covington’s parser
  10.1 Overview
  10.2 Early dependency grammarians
  10.3 Unification-based dependency grammar
  10.4 Covington’s parser
  10.5 Summary
11 The CSELT lattice parser
  11.1 Overview
  11.2 The problem: lattice parsing
  11.3 The solution: the SYNAPSIS parser
    11.3.1 Overview of SYNAPSIS
    11.3.2 Dependency grammar
    11.3.3 Caseframes
    11.3.5 The sequential parser
    11.3.6 The parallel parser
  11.4 Summary
12 Elements of a taxonomy of dependency parsing
  12.1 Search origin
    12.1.1 Bottom-up dependency parsing
    12.1.2 Top-down dependency parsing
    12.1.3 Mixed top-down and bottom-up dependency parsing
  12.2 Search manner
  12.3 Search order
  12.4 Number of passes
  12.5 Search focus
    12.5.1 Network navigation
    12.5.2 Pair selection
    12.5.3 Heads seek dependents
    12.5.4 Dependents seek heads
    12.5.5 Heads seek dependents or dependents seek heads
    12.5.6 Heads seek dependents and dependents seek heads
    12.5.7 Heads seek dependents then dependents seek heads
    12.5.8 Dependents seek heads then heads seek dependents
  12.6 Ambiguity management
  12.7 Adjacency as a constraint on search
  12.8 Summary

List of Figures

2.1 stemma for Smart people dislike stupid robots
2.2 tree diagram (D-marker) for Smart people dislike stupid robots
2.3 arc diagram for Smart people dislike stupid robots
2.4 dependency tree for *Smart people stupid dislike robots
2.5 arc diagram for *Smart people stupid dislike robots
2.6 Dependency structure of Old sailors tell tall tales
2.7 First phrase structure analysis of They are racing horses
2.8 Second phrase structure analysis of They are racing horses
2.9 Dependency structure for They are racing horses. The sentence root is racing
2.10 syntactic structure in DG (a) and in HPSG (b)
3.1 dependency-based NLP projects
5.1 stemma showing a simple dependency structure
5.2 Hellwig’s WFST for Flying planes can be dangerous
6.1 a functional dependency structure
6.2 left and right context stacks
6.3 a DPL definition of Subject
6.4 the general form of functional schemata
6.5 a schema for Finnish transitive verbs
6.6 the binary relation ‘Subject’
6.7 the ‘SynCat’ category
6.9 the Kielikone parser control strategy automaton
7.1 the Distributed Language Translation system
7.2 dependency analysis of the sentence Whom did you say it was given to?
7.3 the use of comma in coordinate structure analyses
7.4 an ATN for parsing Danish sentences
7.5 an ATN for parsing Danish subjects
7.6 a dependency link network for the sentence You can remove the document from the drawer
8.1 a syntactic structure with empty nodes
8.2 a syntactic structure without empty nodes
8.3 a syntactic structure constrained by the one-bar constraint
8.4 a Lexicase syntactic structure
8.5 components of Starosta & Nomura’s Lexicase parser
8.6 a master entry showing the intersection of the feature sets of two homographic words
9.1 dependency structure of Ollie obeyed Ronnie
9.2 part of the WG ontological hierarchy
9.3 part of the WG word type hierarchy
9.4 part of the WG grammatical relation hierarchy
9.5 a WG dependency analysis
9.6 the use of constituency in WG
9.7 a structure permitted by WG’s version of adjacency
9.8 the use of visitor links to bind an extracted element to the main verb
9.10 the use of visitor links to interpret the object of an embedded sentence
9.11 semantic structure is very similar to syntactic structure in WG
9.12 a prohibited dependency structure
9.13 with a telescope depends on saw
9.14 with a telescope depends on the man
11.1 a simple lattice for the uttered words I know
11.2 a SYNAPSIS caseframe
11.3 a SYNAPSIS dependency rule
11.4 another SYNAPSIS caseframe
11.5 a SYNAPSIS knowledge source
11.6 a simplified DI showing jolly slots
11.7 a single parse tree
11.8 a distributed representation of the same parse tree
12.1 PSG and DG analyses of the sentence Tall people sleep in long beds
12.2 phrase structure of A cat sleeps on the computer

List of Tables

2.1 Subtrees in Figure 2.6
2.2 Complete subtrees in Figure 2.6
2.3 Complete subtree labels in Figure 2.6
2.4 Subtrees and complete subtrees in the DG analysis of the sentence They are racing horses shown in Figure 2.9. Only complete subtrees are labelled
2.5 Constituents in the phrase structure analysis of the sentence They are racing horses shown in Figure 2.7
4.1 main features of Hays’ bottom-up dependency parser
4.2 main features of Hays’ top-down dependency parser
5.1 main features of Hellwig’s dependency parser
6.1 main features of the Kielikone dependency parser
7.1 different dependency links retrieved from the BKB
7.2 main features of the DLT ATN dependency parser
7.3 main features of the DLT probabilistic dependency parser
8.1 main features of Starosta and Nomura’s Lexicase parser
8.2 main features of Lindsey’s Lexicase parser
9.1 inheriting properties for w1
9.2 main features of Fraser’s Word Grammar parser
10.1 main features of Covington’s first two dependency parsers
11.1 main features of the SYNAPSIS dependency parser
12.1 origin of search: summary
12.2 manner of search: summary
12.3 order of search: summary
12.4 number of passes: summary
12.5 focus of search: summary

Acknowledgements

This thesis may bear one name on its title page but it represents an investment of time and effort, of wise advice and honest criticism, of practical support and unfailing love on the part of many people. I am grateful to them all.

First mention must go to Dick Hudson, who has been so much more than just a thesis supervisor. Over the years he has selflessly given me his time, enthusiasm and insight. He has listened patiently to all of my hair-brained ideas and helped me to have fewer of them. My heartfelt thanks go to him and to his family, Gay, Lucy and Alice, who have never failed to respond positively to my all too frequent disruptions of their domestic lives.

I am very grateful to Neil Smith and all members of the Department of Phonetics and Linguistics at University College London for supporting me so well during my time in their midst. Special thanks are due to Mark Huckvale, Monika Pounder, and a number of members of the Word Grammar seminar, including Billy Clark, John Fletcher, and And Rosta. I have also benefited enormously from the support and encouragement I have received as a member of the Social and Computer Sciences Research Group at the University of Surrey. I am grateful to all members of the group, and especially to Nigel Gilbert for enabling me to fit thesis-writing into a hectic research schedule, and to Scott McGlashan for his expert assistance with the LaTeX typesetting package. I have gained much from discussions with other people at the University of Surrey, particularly Grev Corbett and Ron Knott.

The finishing touches were added while I was a member of the Speech and Language Division of Logica Cambridge Ltd. I am grateful to Jeremy Peckham for his persistent belief in the value of NLP research and for his practical support, and to Nick Youd, Simon Thornton, Trevor Thomas and

A significant portion of this thesis is devoted to dissecting other people’s dependency parsers. I would not have been able to do so without the help of those individuals who made otherwise unobtainable information available to me. Many of them have read drafts of parts of the thesis, and their comments have been invaluable. They include Doug Arnold, Paulo Baggia, Michael Covington, Peter Hellwig, Gerhard Niedermair, Claudio Rullent, Klaus Schubert, Stan Starosta and Job van Zuijlen.

I have lost track of the number of friends and relations who have helped me by providing practical support, by telling me to get on with it, and by making me laugh. The generous gift of Ian and Mair Bunting, who provided the perfect retreat in which to work without fear of interruption, has hastened the completion of this thesis by an enormous amount. Likewise, the practical support of Jim and Rilla Cannon, whose hospitality knows no bounds. My family have provided the sort of long-distance support which always feels close at hand.

Most of all, I want to thank Sarah for putting up with my nocturnal writing habits, for believing that I really would finish this thing, and for being my friend.

Abbreviations

APSG   augmented phrase structure grammar
ATN    augmented transition network
BFP    best fit principle
CCG    combinatory categorial grammar
CD     conceptual dependency
CFPSG  context-free phrase structure grammar
CG     categorial grammar
CNF    Chomsky normal form
DCG    definite clause grammar
DDG    daughter dependency grammar
DG     dependency grammar
DUG    dependency unification grammar
FUG    functional unification grammar
GB     government-binding theory
GPSG   generalized phrase structure grammar
HPSG   head-driven phrase structure grammar
ID     immediate dominance
LFG    lexical-functional grammar
LP     linear precedence
MT     machine translation
NLP    natural language processing
PSG    phrase structure grammar
TAG    tree-adjoining grammar

Chapter 1

Introduction

    The intuitive appeals of the two theories cannot be discussed, since intuitions are personal and irrational. (Hays 1964: 522)

1.1 Scope of the thesis

There are, in contemporary linguistic theory, two different views of grammatical relations. The first of these sees relations of grammatical dependency as basic: syntactic structures are essentially networks of grammatically related entities. The second view denies grammatical relations basic status, instead seeing them as being derived from more fundamental structures, such as constituent structures. This latter view has predominated throughout most of this century, first in Immediate Constituent (IC) analysis (Bloomfield 1914, 1933), and later, from the mid-1950s onwards, in Phrase Structure Grammar (PSG) (Chomsky 1957).

The domination of constituency-based approaches has not been limited to theoretical linguistics. In computational linguistics also, the overwhelming majority of proposals which posit a distinct syntactic layer assume that that layer is based on constituent structure rather than dependency structure. This asymmetry cannot legitimately be attributed to any established results showing the superiority of one system over the other in respect of descriptive adequacy, or any other substantive function: no such results exist. However,

grammatical dependency is almost as old as the study of grammar, it has, for most of its existence, remained just that: a notion.

The first rigorous formalization of a dependency grammar (DG) came just over thirty years ago (see Gaifman 1965), a few years after the first formalization of the class of PSGs (Chomsky 1956). By the time the formal definition of a DG was published in a wide circulation journal, the corresponding definitions of PSG had been in the public domain for a decade, with large international programmes of research in formal language theory and theoretical linguistics building on a PSG foundation. DG as an explicitly articulated system thus entered an arena in which PSG was already well-established. Given that the earliest published formal accounts of DG established its equivalence (weak and strong) with context-free PSG (CFPSG)¹, there was little incentive to abandon the now familiar and well-understood formalism in favour of the unfamiliar and comparatively less-well understood formalism.

A remarkable situation now obtains. Formal work in DG is virtually frozen in the state it was in around the mid-1960s, with only a handful of groups around the world making any (modest) advances since then (hardly any of which has ever been published in English). In contrast, a much larger, though still modest by PSG standards, number of theoretical linguists continues to assume some version of DG as the foundation of syntactic structure. Unfortunately, almost all linguistic theories based on DG have departed to some extent from the terra firma of formal definition.² Since the choice of DG as basic is a minority preference, those making the choice have gone to some lengths to argue the case for DG rather than PSG (for example, Hudson 1984: 92-8, forthcoming; Starosta 1988: 35-6). The opposite is generally not found: proponents of theories based on PSG do not typically support the choice of PSG with arguments for the superiority of PSG over DG (but see the debate in Hudson 1980a; Dahl 1980; Hudson 1980b; Hietaranta 1981; and Hudson 1981b for some responses to arguments against PSG).

¹ Given a definition of equivalence to be described in Chapter 2 below.

The principal argument offered by proponents of DG is that PSG approaches introduce a redundant layer of structure. Lexical-Functional Grammar (LFG) offers a particularly clear illustration of this, with its c-structure (constituent structure) and separate f-structure (functional structure), the latter being constructed by reference to the former (Kaplan and Bresnan 1982). In a DG approach a single structure suffices. The position adopted by many advocates of PSG is that it is unnecessary, not to say impossible, to argue against moving targets such as the underformalized versions of DG on offer.

This is to present the issues as being neatly polarized. In fact, most linguists nowadays work with hybrid systems which express both dependency and constituency in a single structure, albeit one which owes more to the PSG tradition than to the DG tradition. The most widespread example is X-bar grammar (originally proposed by Harris 1951) which augments a CFPSG by distinguishing one element in each constituent as the head of that constituent. However, there are complications here since a number of syntactic theories have been charged with uncritically adopting unformalized versions of X-bar theory (Pullum 1985; Kornai and Pullum 1990), the very charge laid at the door of certain DG theories!

The general paucity of formal results concerning DG carries over from theoretical to computational linguistics. Here DG is scarcely mentioned, far less argued against. In the small number of cases in which it achieves passing mention, the same reasons for not using DG are employed: first, the only existing formal results show the equivalence of DG and CFPSG so there is no incentive to work with the less familiar system; second, almost nothing else is known formally about DG so until such time as additional solid results become available there is no incentive to invest effort in trying to work within

Let us consider these points in turn. First, then, the equivalence of DG and CFPSG. In their monograph Linguistics and Information Science Sparck Jones and Kay provide a brief introduction to DG and then furnish an account for why DG is not mentioned again:

    We have put phrase structure and dependency together in the same class because it is easy to show that the differences between them are trivial from almost every point of view (see Gaifman 1965). It is also possible to write grammatical rules in a suitable notation which describes a single language and which assigns to each sentence of that language both phrase-structure and dependency trees (see Kay 1965; Robinson 1967). In this paper we shall make no further references to dependency grammar, intending what we say about phrase-structure grammar to be understood as applying also to dependency with occasional minor modifications (Sparck Jones and Kay 1973: 83-4).

Sparck Jones and Kay’s observation that it is possible to devise a meta-formalism which includes both dependency and constituency information is useful from a descriptive point of view. However, the point it misses is that the equivalence of the formalisms or the possibility of devising a meta-formalism leaves open the question of whether phrase structure parsing and dependency parsing can be achieved by means of identical algorithms. This is a question which has hardly ever been raised in the literature. Hays’ claim that “a phrase-structure parser can be converted into a dependency parser with only a minor alteration” (Hays 1966b: 79) is presented without argument or illustration so its status is, at best, uncertain. A seminal text in computer science bears the title Algorithms + Data Structures = Programs (Wirth 1975). It is well understood that a change in data structure may necessitate a change in algorithm if the net effects of the program are to remain constant. “The development of

ture” (Goldschlager and Lister 1982: 65). Thus it cannot be taken for granted a priori that familiar phrase structure parsing algorithms will map effortlessly into the dependency parsing domain.
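The point about data structures can be made concrete with a minimal sketch in Python. The sentence is one used elsewhere in the thesis, but the bracketing, the category labels, the head assignments and all function names below are assumptions made purely for illustration; they are not reproduced from the thesis's figures or from any parser it surveys.

```python
# Illustrative only: both analyses are assumed for this sketch.

WORDS = ["Smart", "people", "dislike", "stupid", "robots"]

# Phrase structure: a nested tree of (label, child, ...) tuples; leaves are words.
PS_TREE = ("S",
           ("NP", ("A", "Smart"), ("N", "people")),
           ("VP", ("V", "dislike"),
                  ("NP", ("A", "stupid"), ("N", "robots"))))

# Dependency structure: one head position per word (None marks the root).
DEP_HEADS = [1, 2, None, 4, 2]


def ps_yield(node):
    """Collect the words of a phrase structure tree by recursive descent."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:  # node[0] is the category label
        words.extend(ps_yield(child))
    return words


def dep_dependents(heads, head_index):
    """List the positions of the words that depend directly on head_index."""
    return [i for i, h in enumerate(heads) if h == head_index]


def dep_root(heads):
    """Return the position of the single word that has no head."""
    roots = [i for i, h in enumerate(heads) if h is None]
    if len(roots) != 1:
        raise ValueError("expected exactly one root")
    return roots[0]
```

The phrase structure analysis has to be traversed recursively through labelled phrasal nodes, whereas the dependency analysis is a flat list of word-to-word links with no phrasal nodes at all, so a routine written against one representation does not apply unchanged to the other.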

The second criticism of DG in computational linguistics is that where DG has been employed, for example in parsing systems, the resulting systems have not been constructed on a principled or even well-defined foundation. Winograd writes:

    The formal theory of dependency grammar has emphasized ways of describing structures rather than how the system’s permanent knowledge is structured or how a sentence is processed. It does not address in a systematic way the problem of finding the correct dependency structure for a given sequence of words. In systems that use dependency as a way of characterizing structure, the parsing process is generally of an ad hoc nature (Winograd 1983: 75).

Once again, this claim is presented without further argument or evidence.

The absence of empirical data which characterizes these claims is not as surprising as it might first seem when it is understood that the number of dependency parsing systems in existence is severely limited in comparison with the number of phrase structure parsing systems. It is also the case that those descriptions of dependency parsing systems which have appeared in print have, on the whole, been published in relatively obscure sources or have only been circulated privately. Some accounts have been terse to the point of leaving most of the detail unreported. No survey or comparative account of dependency parsers is currently in existence.

One of the chief objectives of this thesis is to fill this gap in the literature by presenting an extensive survey of existing dependency parsing systems, the first such survey to be prepared.

The availability of this survey material presents a unique opportunity to

in dependency parsing compare with those which are widely used and well-understood in phrase structure parsing. This study focuses on two hypotheses:

Hypothesis 1

    It is possible to construct a fully functional dependency parser based directly on an established phrase structure parsing algorithm without altering any fundamental aspects of the algorithm.

This hypothesis is a strong version of Hays’ (1966b: 79) claim. It is motivated by Gaifman’s definition of strong equivalence between DG and PSG which guarantees some measure of structural correspondence at each point in the DG and PSG parse trees (see Chapter 2 below). However, it is not the strongest possible hypothesis, since it stops short of predicting that a dependency parser can be constructed based on any phrase structure parsing algorithm.

Hypothesis 2

    It is possible to construct a fully functional dependency parser using an algorithm which could not be used without substantial modification in a fully functional conventional phrase structure parser.

This hypothesis is motivated by an appreciation of the particular way in which DG rules encode information, as compared with the way in which PSG rules encode information.

As I have previously noted, most linguistically motivated DGs have proceeded beyond the limits of what has been defined in a mathematically rigorous way. It is impossible to undertake a survey of dependency parsing systems without encountering some of these devices of unknown formal power. While noting in passing these extensions where relevant, I shall concentrate my analysis on the parsing of the context free backbone of these theories (i.e. that which can be mapped onto a Gaifman grammar). I shall not be concerned in this thesis to make any qualitative judgements between DG and PSG qua

1.2 Chapter outline

What follows divides conceptually into three parts.

1. Chapter 2 introduces dependency grammar. It presents a formal account of DG and outlines the equivalence relation used to compare DG with PSG. The development of DG from its origins in the classical world through to the present day is charted in the latter part of the chapter.

2. Chapters 3 to 11 present the most detailed review and critique of dependency parsers yet assembled. Chapter 3 describes the growth of the use of DG in computational systems for natural language processing. Chapters 4 to 11 are each devoted to the description and evaluation of a different dependency parser or closely related family of dependency parsers. The chapters are arranged in approximate chronological order; the oldest parser is presented first and the most recent parser is presented last. Needless to say, the development phases of some parsers overlapped so the ordering of chapters must be regarded as no more than a rough guide to the relative age of the systems reported therein.

3. Finally, drawing heavily on the preceding analyses of existing dependency parsers, Chapter 12 sets out some elements of a first taxonomy of dependency parsing, defines some technical vocabulary for the field and specifies the range of relevant variables. The two hypotheses stated above are examined in Chapter 13 in light of the survey of existing dependency

Chapter 2

Dependency grammar

    “It all depends.” C.E.M. Joad, BBC Radio ‘Brains Trust’, 1942-1948

2.1 Overview

Before proceeding with a survey of parsing systems based on DG it is necessary to be clear about exactly what a DG is. One of the dangers when working with a notion like grammatical dependency is that it can come to mean all things to all people. The purpose of this chapter is therefore to furnish an unambiguous definition of DG, to introduce some terminology, and to review where systems approximating to this definition of DG have been employed in theoretical linguistics.

Section 2.2 introduces Gaifman grammars, the only version of DG to be defined with full mathematical rigour. Accordingly, these systems are taken as a stable reference point in this thesis. The formal properties of Gaifman grammars are defined, together with a decision procedure for determining whether or not a given string is accepted or rejected by an arbitrary Gaifman grammar. Alternative conventions for portraying dependency structures diagrammatically are introduced. Although there is insufficient space here to reproduce

PSG, the equivalence relation employed is described and scrutinized.

In practice, very few linguists, if any, have used Gaifman’s system in the description of natural language without making use of various augmentations of unknown formal power. These augmentations are flagged in Section 2.3. Those which must necessarily be examined in the course of the survey of dependency parsing systems are described in greater detail in later chapters. Section 2.4 charts the origins and development of DG in linguistic theory.

In Section 2.5, three grammatical formalisms bearing some similarities to DG are identified, namely Case Grammar, Categorial Grammar, and Head-Driven Phrase Structure Grammar. Although a full description of these frameworks is not appropriate here, their basic concepts are introduced and some reasons for excluding them from this study are provided.

2.2 Gaifman grammars

2.2.1 Definitions

The first formal definition of DG was offered by Haim Gaifman (1965). In this section, I present his definition along with illustrative examples.

Definition

A dependency grammar $\Delta$ is a 5-tuple

$$\Delta = (T, C, \mathcal{A}, \mathcal{R}, \mathcal{G})$$

where

1. $T$ is a finite set of word symbols, i.e. the terminal symbols. For the purposes of exposition, the letters u, v, w, x, y, z, with or without subscripts, will denote members of this set.

2. $C$ is a finite set of category symbols. For the purposes of exposition, the letters U, V, W, X, Y, Z, with or without subscripts, will denote members of this set.

3. $\mathcal{A}$ is a set of assignment rules, whose elements are all members of $T \times C$. Every word belongs to at least one category and every category must have at least one word assigned to it. A word may be assigned to more than one category.

4. $\mathcal{R}$ is a set of rules which give for each category the set of categories which may derive directly from it with their relative positions. For each category $X$, there is a finite number of rules of the form $X(Y_1, \ldots, Y_i, *, Y_{i+1}, \ldots, Y_n)$ (where $Y_1$ to $Y_n$ are members of $C$) indicating that $Y_1 \cdots Y_n$ may depend on $X$ in the order given, where $*$ marks the position of $X$ in the sequence. A rule of the form $X(*)$ allows $X$ to occur without any dependents.

5. $\mathcal{G}$ is a subset of $C$ whose members are those categories which may govern a sentence, i.e. the start symbols.

Example

$\Delta_1$ is an example of a dependency grammar, where $\Delta_1$ = ({people, robots, dislike, smart, stupid}, {N, V, A}, {(people, N), (robots, N), (dislike, V), (smart, A), (stupid, A)}, {N(*), N(A,*), V(N,*,N), A(*)}, {V}).
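The 5-tuple can be written down directly as a data structure. The following is a minimal Python sketch of the example grammar, not part of Gaifman's formalization: the names `Rule`, `GaifmanGrammar`, `is_well_formed` and `DELTA_1` are illustrative assumptions, and the check covers only the conditions stated in the definition (assignments drawn from $T \times C$, every word and every category assigned at least once, rules and start symbols drawn from $C$).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A dependency rule X(Y1, ..., Yi, *, Yi+1, ..., Yn)."""
    head: str          # the category X
    left: tuple = ()   # dependent categories preceding the head
    right: tuple = ()  # dependent categories following the head

    def __str__(self):
        return f"{self.head}({','.join(self.left + ('*',) + self.right)})"


@dataclass(frozen=True)
class GaifmanGrammar:
    words: frozenset        # T
    categories: frozenset   # C
    assignments: frozenset  # set of (word, category) pairs
    rules: frozenset        # set of Rule objects
    starts: frozenset       # categories which may govern a sentence

    def is_well_formed(self) -> bool:
        assigned_words = {w for w, _ in self.assignments}
        assigned_cats = {c for _, c in self.assignments}
        rule_cats = {r.head for r in self.rules}
        rule_cats |= {c for r in self.rules for c in r.left + r.right}
        return (
            # every word has a category and every category has a word
            assigned_words == self.words
            and assigned_cats == self.categories
            # rules and start symbols mention only known categories
            and rule_cats <= self.categories
            and self.starts <= self.categories
        )


# The example grammar from the text (identifier name is an assumption).
DELTA_1 = GaifmanGrammar(
    words=frozenset({"people", "robots", "dislike", "smart", "stupid"}),
    categories=frozenset({"N", "V", "A"}),
    assignments=frozenset({
        ("people", "N"), ("robots", "N"), ("dislike", "V"),
        ("smart", "A"), ("stupid", "A"),
    }),
    rules=frozenset({
        Rule("N"),
        Rule("N", left=("A",)),
        Rule("V", left=("N",), right=("N",)),
        Rule("A"),
    }),
    starts=frozenset({"V"}),
)
```

Keeping the left and right dependents in separate tuples makes the position of the head explicit without storing the `*` marker itself.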

Convention

By convention, the fact that some $X$ is a member of $\mathcal{G}$ may be indicated [...]

Following this convention, $\mathcal{G}$ of $\Delta_1$ may be represented as *(V).

Convention

By convention, $\mathcal{A}$ may be represented as follows: for each distinct category $X$ in $C$ create a correspondence of the form $X : L$ where $L$ is the set of all words $x$ such that $(x, X)$ is in $\mathcal{A}$.

Thus, $\mathcal{A}$ of $\Delta_1$ may be represented as {N:{people, robots}, V:{dislike}, A:{smart, stupid}}.

Convention

To improve readability, a grammar of type $\Delta$ may be represented by writing each member of $\mathcal{G}$ on a line by itself, followed by each member of $\mathcal{R}$ on a line by itself, followed by each member of $\mathcal{A}$ on a line by itself. $T$ and $C$ are implicitly defined in $\mathcal{A}$.

Thus, $\Delta_1$ may be represented as follows:

*(V)
N(*)
N(A,*)
V(N,*,N)
A(*)
N:{people, robots}
V:{dislike}
A:{smart, stupid}

The next definition elucidates the relationship between sentences of a language $\Lambda$ and the grammar of type $\Delta$ which generates $\Lambda$.

In this definition it is necessary to make reference to occurrences of words or categories in a sequence. An occurrence is an ordered pair $(x, i)$, where $x$ is the word or category and $i$ is the position number of $x$ in the sequence. $P$, $Q$ and $R$, with or without subscripts, denote occurrences of words or categories. [...] said to be of category $X$.

Definition

A sentence $x_1 x_2 \cdots x_m$ is analyzed by a grammar of type $\Delta$ iff the following are true:

1. A sequence of categories $X_1 X_2 \cdots X_m$ can be formed such that $x_i$ is of category $X_i$ for $1 \le i \le m$.

2. A 2-place relation $d$ can be established between pairs of words in $x_1 x_2 \cdots x_m$. $P\,d\,Q$ signifies the fact that $P$ depends on $Q$, i.e. the relation $d$ holds between $P$ and $Q$.

For every $d$ we define another relation $d^*$ where $P\,d^*\,Q$ iff there is a sequence $P_0, P_1, \ldots, P_n$ such that $P_0 = P$, $P_n = Q$ and $P_i\,d\,P_{i+1}$ for every $0 \le i \le n-1$.

The relation $d$ is constrained in the following ways:

(a) For no $P$, $P\,d^*\,P$.

(b) For every $P$, there is at most one $Q$ such that $P\,d\,Q$.

(c) If $P\,d^*\,Q$ and $R$ is between $P$ and $Q$ in sequence (i.e. either $S(P) < S(R) < S(Q)$ or $S(P) > S(R) > S(Q)$), then $R\,d^*\,Q$.

(d) The whole set of occurrences is connected by the relation $d$.

3. If $P$ is an occurrence of $X_j$ and if the occurrences that depend on it are $P_1, P_2, \ldots, P_n$, also, if $P_h$ is an occurrence of $X_{i_h}$ where $h = 1, \ldots, n$, and the order in which these words occur in the sentence is $P_1, P_2, \ldots, P_k, P, P_{k+1}, \ldots, P_n$, then $X_j(X_{i_1}, \ldots, X_{i_k}, *, X_{i_{k+1}}, \ldots, X_{i_n})$ is a rule of $\mathcal{R}$. In the case that no occurrence depends on $P$, $X_j(*)$ is a rule of $\mathcal{R}$.

4. The occurrence which governs the sentence (i.e. which depends on no other occurrence) is an occurrence of a word whose category is a member of $\mathcal{G}$.

The structure corresponding to a sentence of a language generated by a grammar of type $\Delta$ is called a dependency tree.

Definition

A dependency tree for a sentence $x_1 \cdots x_n$ consists of the string of categories $X_1 \cdots X_n$ together with the relation $d$.

Definition

A language is weakly generated by a dependency grammar iff for every sentence in that language there is a corresponding dependency tree and no dependency tree exists for a sequence of words which is not a sentence. A language is strongly generated by a dependency grammar iff it is weakly generated by that dependency grammar and, for every syntactically correct interpretation, and only for these, there are corresponding dependency trees.

The above definitions can be summarized informally as follows. In the structure corresponding to a sentence of a language generated by a dependency grammar of type $\Delta$:

1. one and only one occurrence is independent (i.e. does not depend on any other);

2. all other occurrences depend on some element;

3. no occurrence depends on more than one other; and

4. if A depends directly on B and some occurrence C intervenes between them (in linear order of string), then C depends directly on A or on B [...]

To aid discussion, I shall adopt the following terminology. All occurrences of words in a sentence shall be called words. Where the intention is to refer to words in the lexicon, this will be stated explicitly. The single independent word in a sequence (i.e. the word which depends on no other) shall be called the root. One word $W_1$ is said to be a subordinate of another word $W_2$ if $W_1$ depends on $W_2$ or on another subordinate of $W_2$, i.e. $W_1$ depends directly or indirectly on $W_2$. The word on which another word depends shall be called its head. The requirement that a head-dependent pair either be next to each other or separated by direct or indirect dependents of themselves (point 4 above) is known as the adjacency constraint.
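The constraints on the dependency relation can be checked mechanically once a structure is encoded. The sketch below is an illustration under stated assumptions rather than anything given in the source: a structure is a Python list `heads` in which `heads[i]` is the position of the head of word `i` (or `None` for the root), positions are counted from 0 rather than 1, and the function names are hypothetical. The adjacency check follows constraint (c) of the formal definition: every word lying between a word and one of its direct or indirect heads must itself be a subordinate of that head.

```python
def ancestors(heads, p):
    """Positions q such that p depends on q directly or indirectly."""
    seen = []
    q = heads[p]
    while q is not None and q not in seen:
        seen.append(q)
        q = heads[q]
    return seen


def is_well_formed(heads):
    """Check a dependency structure against the constraints on d.

    The 'at most one head per word' constraint holds by construction,
    because each word has a single entry in `heads`.
    """
    n = len(heads)
    # No word may depend on itself, directly or indirectly.
    if any(p in ancestors(heads, p) for p in range(n)):
        return False
    # Given the two constraints above, the structure is connected
    # exactly when there is a single independent word (the root).
    if sum(h is None for h in heads) != 1:
        return False
    # Adjacency: any word between p and one of its (indirect) heads q
    # must itself depend, directly or indirectly, on q.
    for p in range(n):
        for q in ancestors(heads, p):
            lo, hi = min(p, q), max(p, q)
            if any(q not in ancestors(heads, r) for r in range(lo + 1, hi)):
                return False
    return True


# "Smart people dislike stupid robots": smart<-people<-dislike->robots->stupid
GOOD = [1, 2, None, 4, 2]
# "*Smart people stupid dislike robots": stupid is cut off from robots by dislike
BAD = [1, 3, 4, None, 3]
```

With these encodings, `is_well_formed(GOOD)` holds and `is_well_formed(BAD)` does not, because *dislike* intervenes between *stupid* and *robots* without being a subordinate of *robots*.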

Example

Given these definitions, the sentences in (1) belong to the language defined by $\Delta_1$, whereas the sequences in (2) are outside of that language. (By convention, sequences which are not well-formed in respect of a particular grammar are prefixed by ‘*’.)

(1) a People dislike robots.
    b Stupid people dislike smart robots.
    c Smart robots dislike people.
    d People dislike smart people.

(2) a *Smart people dislike.
    b *Stupid dislike robots.
    c *Stupid robots.
    d *Robots people dislike.
    e *Robots smart dislike people.

Example (2a) is ill-formed because dislike is a V, and Vs require two dependents, one preceding and one following. In this case, no following dependent is present. Example (2b) is ill-formed because not all of the words are connected together by dependency. The sequence is divided into two parts: stupid (which [...] category N for dislike). None of the words in (2c) is missing a dependent. However, the independent word robots is of category N, but only words of category V may govern a sentence. In example (2d), none of the words is missing a dependent and the independent element dislike belongs to the required category V. However, the dependents of V are required to occur one on either side of V, whereas here they both occur before it. Example (2e) is ill-formed because of the inappropriate position of smart. Either it is a dependent of robots, in which case it should precede that word, or it is a dependent of people. If it is a dependent of people then it precedes it as it ought, but smart and people are separated by the word dislike, which is dependent on neither.

I shall henceforth refer to dependency grammars of type $\Delta$ as Gaifman grammars.

2.2.2 A recognizer for Gaifman grammars

So far, I have characterized Gaifman grammars in terms of constraints on the well-formedness of grammar rules and dependency structures. In this section I describe a decision procedure, a recognizer, which accepts all and only the well-formed strings of the language described by a Gaifman grammar. The recognizer is based on one described by Hays (1964: 516-17).

The principal data structure used by the recognizer is a table. To determine whether or not a string is generated by a Gaifman grammar $\Delta$ proceed as follows:

1. Starting from 1, and counting upwards in units of 1, assign an integer to each word in the string, working from left to right. The integer assigned to a word shall be known as the position of that word. Let $Max$ equal the position of the rightmost word.

2. Set up a table, having $Max$ positions, numbered from 1 to $Max$. A cell $[a, b]$ shall occupy all the positions from $P_a$ to $P_b$, where $1 \le a \le b \le Max$.

3. For each word $W_i$ in the string retrieve all the classes $X_1$ to $X_n$ assigned to that word by assignment rules of the form $W : \{X_1, \ldots, X_n\}$ in $\Delta$. If $P_i$ is the position of $W_i$, write $X_1$ to $X_n$ in the table at cell $[P_i, P_i]$.

4. For each word class $X$ at cell $[j, j]$ in the table ($1 \le j \le Max$) determine whether a rule of the form $X(*)$ exists in $\Delta$. If so, insert $X(*)$ in the table at cell $[j, j]$.

5. Let $V$ be a variable. Set $V = 2$.

6. Consider each sequence of $V$ adjacent cells in the table. For each sequence which consists of exactly one word class symbol $X$ and $V-1$ trees, arranged in the order

$$Y_1, \ldots, Y_i, X, Y_j, \ldots, Y_{V-1}$$

search in $\Delta$ for a corresponding rule of the form:

$$X(Z_1, \ldots, Z_i, *, Z_j, \ldots, Z_{V-1})$$

If the root of each tree $Y_n$ in the table is identical to each dependent $Z_n$ in the grammar rule then, if $Y_1$ is located at cell $[Y_{1,left}, Y_{1,right}]$ and $Y_{V-1}$ is located at cell $[Y_{V-1,left}, Y_{V-1,right}]$, insert a new tree in the table occupying cell $[Y_{1,left}, Y_{V-1,right}]$. The form of the new tree should be as follows:

$$X(Y_1, Y_2, \ldots, Y_i, *, Y_j, \ldots, Y_{V-1})$$

7. If $V = Max$ then go to step 8, otherwise increment $V$ and go to step 6.

8. If a tree exists in the table occupying cell $[1, Max]$ then succeed if the root of the tree is of type $X$ and a rule of the form $*(X)$ exists in $\Delta$.
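The idea behind the table procedure can be illustrated with a short program. The sketch below is not a transcription of the steps above, nor of the Prolog implementation in the appendix: it is a span-based chart recognizer in the style of CYK, chosen because it fills the table strictly bottom-up by span length. The function name `recognize`, the dictionary encoding of the assignment rules and the `(head, left, right)` encoding of dependency rules are all assumptions made for the illustration.

```python
def recognize(words, assignments, rules, starts):
    """Return True if `words` is accepted by the grammar.

    assignments: dict mapping each word to a set of categories
    rules:       iterable of (head, left, right) triples, where left and
                 right are tuples of dependent categories
    starts:      set of categories which may govern a sentence
    """
    n = len(words)
    # chart[(i, j)] holds the root categories of every complete tree
    # spanning words[i:j].
    chart = {}

    def tiles(cats, i, j):
        """True if `cats` root consecutive trees covering words[i:j] exactly."""
        if not cats:
            return i == j
        first, rest = cats[0], cats[1:]
        return any(
            first in chart.get((i, k), ()) and tiles(rest, k, j)
            for k in range(i + 1, j - len(rest) + 1)
        )

    for length in range(1, n + 1):            # shorter spans first
        for i in range(0, n - length + 1):
            j = i + length
            found = set()
            for h in range(i, j):             # candidate head position
                for cat in assignments.get(words[h], ()):
                    for head, left, right in rules:
                        if (head == cat
                                and tiles(left, i, h)
                                and tiles(right, h + 1, j)):
                            found.add(cat)
            chart[(i, j)] = found
    return bool(chart.get((0, n), set()) & starts)


# The example grammar from Section 2.2.1 in this encoding.
ASSIGNMENTS = {
    "people": {"N"}, "robots": {"N"}, "dislike": {"V"},
    "smart": {"A"}, "stupid": {"A"},
}
RULES = [
    ("N", (), ()),
    ("N", ("A",), ()),
    ("V", ("N",), ("N",)),
    ("A", (), ()),
]
STARTS = {"V"}


def accepts(sentence):
    return recognize(sentence.lower().split(), ASSIGNMENTS, RULES, STARTS)
```

Because every dependent subtree occupies a shorter span than the tree that contains it, all the entries consulted by `tiles` have already been computed when a span is processed. Under this encoding, `accepts` returns `True` for each sentence in (1) and `False` for each sequence in (2).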


Hays presents his algorithm informally, so it has been necessary to reconstruct some of the details in the above account.

A Prolog implementation of this recognition algorithm can be found in the file hays_recognizer.pl in Appendix A.3.

Hays also outlines a generative procedure for enumerating all the strings generated by a Gaifman grammar (Hays 1964: 514-15). A Prolog implementation of a reconstructed version of Hays’ procedure can be found in the file hays_generator.pl in Appendix A.3.

2.2.3 Representing dependency structures

There are at least three conventions for presenting dependency structures diagrammatically: stemmas, tree diagrams and arc diagrams.

The first representational scheme, due to Tesnière (1959), presents words as nodes in a graph which is known as a stemma (see Figure 2.1, for example). Dependencies between word occurrences are signalled by links between nodes. By convention, heads are located nearer the top of the diagram than their dependents. The first occurrence in a sentence is positioned furthest to the left in a diagram and the nth occurrence appears to the right of the (n-1)th occurrence and to the left of the (n+1)th occurrence. For simplicity, category labels are usually omitted from diagrams of all types.

Although stemmas contain the appropriate amount of information, they can sometimes prove to be difficult to read, especially when the sentences represented are long and involve a lot of alternation between left-pointing and right-pointing dependencies.

In the second type of diagram, exemplified in Figure 2.2, dependency is represented by the relative vertical position of nodes in a tree; if a line connects a lower node to a higher node then the symbol corresponding to the lower node depends on the one corresponding to the higher node. I shall call diagrams of this kind tree diagrams. They are also known as D-markers.

Figure 2.1: stemma for Smart people dislike stupid robots

Figure 2.2: tree diagram for Smart people dislike stupid robots

Figure 2.3: arc diagram for Smart people dislike stupid robots

[...] means of directed arcs. I shall adopt the convention of directing arcs from heads to dependents, although (unfortunately) there is no generally accepted convention and it is not unusual to find examples in the literature of arcs being oppositely directed. I shall refer to diagrams of this kind as arc diagrams. Figure 2.3 is equivalent to Figures 2.1 and 2.2 in the information it expresses.

Some authors (such as Matthews 1981) draw arc diagrams with the arcs below the symbols in the sentence rather than above them as shown here. Hudson sometimes divides the arcs so that those having a designated function appear below the sentence symbols, whilst the rest appear above them (Hudson 1988b: 202; page 189 below).

The adjacency constraint is satisfied in the sentence Smart people dislike stupid robots, as can be seen in the dependency structure variously represented in Figures 2.1, 2.2 and 2.3. The constraint is violated in the dependency structure shown in Figure 2.4.

In Figure 2.4, stupid violates the constraint. Stupid is separated from its head robots by dislike, which depends on neither stupid nor robots; neither is it a subordinate of stupid or robots. In a tree diagram, the dotted line which connects a word with its node is called its projection. Note that in Figure 2.2, links and projections do not intersect. Such tree diagrams and their corresponding syntactic structures are said to be projective. In Figure 2.4 a link and a projection intersect at precisely the point where ill-formedness was [...]

Figure 2.4: dependency tree for *Smart people stupid dislike robots

Figure 2.5: arc diagram for *Smart people stupid dislike robots

[...] are said to be non-projective.

The vocabulary of projectivity is rooted in the imagery of tree diagrams. I shall henceforth make use of the more neutral terms adjacent and non-adjacent.

The arc diagram corresponding to Figure 2.4 is shown in Figure 2.5. Notice that arcs never cross in arc diagrams of structures which satisfy the adjacency constraint, whereas arcs do cross where the structures violate the adjacency constraint. (The only exception to this generalization is discussed below.)
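The crossing-arcs observation translates into a simple test on pairs of arcs. The following sketch assumes the same hypothetical encoding used for illustration elsewhere in this chapter (a list `heads` giving, for each word position counted from 0, the position of its head, with `None` for the root); the function name `crossing_arcs` is likewise an assumption. Two arcs cross when exactly one end of one arc lies strictly inside the span of the other. The test looks at arcs only, so it does not cover the exception mentioned in the text.

```python
from itertools import combinations


def crossing_arcs(heads):
    """Return every pair of arcs that cross when drawn above the sentence."""
    # Represent each arc by its (left end, right end) positions.
    arcs = [
        tuple(sorted((dep, head)))
        for dep, head in enumerate(heads)
        if head is not None
    ]
    return [
        (a, b)
        for a, b in combinations(arcs, 2)
        if a[0] < b[0] < a[1] < b[1] or b[0] < a[0] < b[1] < a[1]
    ]


# Figure 2.3: Smart people dislike stupid robots
FIGURE_2_3 = [1, 2, None, 4, 2]
# Figure 2.5: *Smart people stupid dislike robots
FIGURE_2_5 = [1, 3, 4, None, 3]
```

For the structure of Figure 2.3 the function returns an empty list; for that of Figure 2.5 it reports that the arc between *people* and *dislike* crosses the arc between *stupid* and *robots*.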

In general, I shall use arc diagrams to represent dependency structures; when describing a particular dependency system reported in the literature I [...]

Figure 2.6: Dependency structure of Old sailors tell tall tales

2.2.4 The generative capacity of Gaifman grammars

As well as providing a formally explicit definition of one class of DG, Gaifman went on to investigate the generative capacity of the class. He did this by comparing his DG with phrase structure grammar.

He concluded that for every DG there is a strongly equivalent CFPSG and for a subclass of CFPSGs (in which every phrase is a projection of a lexical category) there is a strongly equivalent DG. His proof is too lengthy to reproduce here; it can be found in Gaifman (1965). Definitions of strong equivalence between the two systems can be found in Hays (1961b) and in Gaifman (1965: 320-25).

Let a subtree be a connected subset of a dependency tree. (This is what Pickering and Barry (1991) have recently called a ‘dependency constituent’.) Let a complete subtree consist of some element of a tree, plus all other elements directly or indirectly dependent on it. Thus, the dependency tree in Figure 2.6 includes the subtrees shown in Table 2.1. Of these, only those shown in Table 2.2 are complete subtrees.
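The complete subtrees of a dependency tree can be computed by collecting, for each word, everything that depends on it directly or indirectly. The sketch below is illustrative only. It assumes that the structure of Figure 2.6 is the one implied by the complete subtrees listed in the text (Old depends on sailors, sailors and tales depend on tell, tall depends on tales), encoded as a list of head positions counted from 0; the names `complete_subtree` and `complete_subtrees_by_top` are hypothetical.

```python
WORDS = ["Old", "sailors", "tell", "tall", "tales"]
# Assumed head positions for Figure 2.6 (None marks the root, "tell").
HEADS = [1, 2, None, 4, 2]


def complete_subtree(heads, top):
    """Positions of `top` and of every word depending on it, in string order."""
    members = {top}
    grew = True
    while grew:
        grew = False
        for dep, head in enumerate(heads):
            if head in members and dep not in members:
                members.add(dep)
                grew = True
    return sorted(members)


def complete_subtrees_by_top(words, heads):
    """Map each word to the string covered by its complete subtree."""
    return {
        words[top]: " ".join(words[m] for m in complete_subtree(heads, top))
        for top in range(len(words))
    }
```

Calling `complete_subtrees_by_top(WORDS, HEADS)` pairs each word with one complete subtree: *Old*, *Old sailors*, *Old sailors tell tall tales*, *tall* and *tall tales*.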

A phrase structure and a dependency structure, both defined over the same string, correspond relationally if every constituent is coextensive with a subtree and every complete subtree is coextensive with a constituent. Two structural entities are coextensive if they refer to exactly the same elements in a string.

Old
Old sailors
Old sailors tell
Old sailors tell tall tales
sailors
sailors tell
tell
tell tall tales
tell tales
tall
tall tales
tales

Table 2.1: Subtrees in Figure 2.6

Old
Old sailors
Old sailors tell tall tales
tall
tall tales

Table 2.2: Complete subtrees in Figure 2.6

LABEL     SUBTREE
Old       Old
sailors   Old sailors
tell      Old sailors tell tall tales
tall      tall
tales     tall tales

Table 2.3: Complete subtree labels in Figure 2.6

[...] which depends on no other word in the same subtree. Labels for the complete subtrees of the dependency tree shown in Figure 2.6 are given in Table 2.3.

Let each phrasal constituent in a PSG also have a label, where the label is conventionally understood (for example, the label of a noun phrase is often given as ‘NP’, etc).

In dependency theory, a string is said to derive from the label of the corresponding complete subtree. In phrase structure theory, a string is said to derive from the label of the corresponding constituent. A label accounts for the set of strings that derive from it. Two labels are substantively equivalent if they account for the same set of strings.

A phrase structure and a dependency structure correspond if (i) they correspond relationally and (ii) every complete subtree has a label which is substantively equivalent to the label of the coextensive constituent.

A DG is strongly equivalent to a PSG if (i) they have the same terminal alphabet, and (ii) for every string over that alphabet, every structure attributed by either grammar corresponds to a structure attributed by the other.

Let us consider, by way of example, the ambiguous sentence (3), the two phrase structure interpretations of which are shown in Figures 2.7 and 2.8. (The linguistic plausibility of these analyses is not an issue here.)

(3) They are racing horses.

Figure 2.7: First phrase structure analysis of They are racing horses

Figure 2.8: Second phrase structure analysis of They are racing horses

Figure 2.9: Dependency structure for They are racing horses. The sentence root is racing.

LABEL     SUBTREE
they      they
are       are
          they racing
          they are racing
racing    they are racing horses
          are racing
          are racing horses
          racing
          racing horses
horses    horses

Table 2.4: Subtrees and complete subtrees in the DG analysis of the sentence They are racing horses shown in Figure 2.9. Only complete subtrees are labelled.

Now consider the dependency structure in Figure 2.9. This includes the subtrees shown in Table 2.4.

The constituents in Figure 2.7 are shown in Table 2.5 (ignoring the initial category assignments).

Since every constituent in Figure 2.7 is coextensive with a subtree in Figure 2.9 and every complete subtree in Figure 2.9 is coextensive with a constituent, the structures correspond relationally. Since it is also the case that every complete subtree has a label which is substantively equivalent to the label of the coextensive constituent, the structures correspond. Close examination [...]

LABEL     CONSTITUENT
NP        they
S         they are racing horses
AuxP      are
VP        are racing
VP        are racing horses
NP        horses

Table 2.5: Constituents in the phrase structure analysis of the sentence They are racing horses shown in Figure 2.7

However, only Figures 2.7 and 2.9 share substantively equivalent labellings so only these structures can be said to correspond.
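Relational correspondence, the first of the two conditions for correspondence, can be verified mechanically for the example. The sketch below is a hedged illustration: it encodes the dependency structure of Figure 2.9 as a list of head positions counted from 0 (all three other words depend on *racing*) and the constituents of Table 2.5 as sets of word positions, and every function name is an assumption. It does not attempt the second condition, substantive equivalence of labels, which concerns the sets of strings that labels account for.

```python
def complete_subtrees(heads):
    """The set of positions covered by each word's complete subtree."""
    result = []
    for top in range(len(heads)):
        members = {top}
        grew = True
        while grew:
            grew = False
            for dep, head in enumerate(heads):
                if head in members and dep not in members:
                    members.add(dep)
                    grew = True
        result.append(frozenset(members))
    return result


def is_subtree(heads, members):
    """A set of positions is connected iff exactly one member has its head outside."""
    members = set(members)
    tops = [m for m in members if heads[m] not in members]
    return len(tops) == 1


def correspond_relationally(heads, constituents):
    spans = {frozenset(c) for c in constituents}
    return (
        # every constituent is coextensive with a subtree
        all(is_subtree(heads, span) for span in spans)
        # every complete subtree is coextensive with a constituent
        and all(subtree in spans for subtree in complete_subtrees(heads))
    )


# They(0) are(1) racing(2) horses(3); the root is "racing" (Figure 2.9).
HEADS_2_9 = [2, 2, None, 2]
# Constituents of Table 2.5 as sets of word positions.
CONSTITUENTS_2_7 = [{0}, {0, 1, 2, 3}, {1}, {1, 2}, {1, 2, 3}, {3}]
```

Here `correspond_relationally(HEADS_2_9, CONSTITUENTS_2_7)` holds. Adding a hypothetical constituent such as *they are* (positions 0 and 1) would break the correspondence, since those two words are not connected to each other in Figure 2.9.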

2.3 Beyond Gaifman grammars

In presenting his work on PSGs, Chomsky frequently and explicitly represented them as a formalization of the structuralist Immediate Constituent model (e.g. Chomsky 1962). This claim has recently been contested by Manaster-Ramer and Kac (1990), thus highlighting some of the difficulties inherent in trying to formalize a pre-existing linguistic notion faithfully.

The issues are somewhat clearer in the case of DG, since Gaifman, as author of the formalization, makes no claims regarding its relation to any existing notion other than that embodied in a RAND Corporation machine translation program. Hays, on the other hand, represents Gaifman’s work as being a formalization of the linguistic notion of dependency. For example, following a discussion of the different linguistic notions underlying IC theory and dependency theory in his 1964 Language paper, his summary of what is to follow includes the following statement:

Section 2 presents a formalism for the theory, identifying the components of any dependency grammar (Hays 1964: 512, my emphasis)

I have been unable to find any discussions anywhere in the literature which investigate this assertion by reference to actual linguistic theories which claim to be based on some notion of dependency.

What is noticeable is that few of the self-proclaimed dependency-based theories of language have made use of Gaifman’s formalism. This contrasts sharply with the uptake of Chomsky’s PSG formalism, and particularly CFPSG. The only DGs which incorporate a more or less intact version of Gaifman grammar are those which use it as the base component in a transformational grammar (Hays 1964: 522-4; Robinson 1970) or as the transcription system on one stratum of a stratificational grammar (Hays 1964: 522-4). Otherwise, alternative quasi-formalisms are employed.

It is common to find versions of DG which make use of complex feature structures instead of or as well as word category labels, with dependency rules being allowed to manipulate features in arbitrary ways (e.g. Starosta 1988; Covington 1990b). Consider the following illustrative example of a dependency rule for intransitive verbs which enforces subject-verb agreement (adapted from Covington 1990b: 234):

[category: verb, person: X, number: Y] ( [category: noun, person: X, number: Y, case: nominative], * )

Here the head is of syntactic category ‘verb’, of person ‘X’ and number ‘Y’. Its single dependent must be a preceding nominative case noun, also of person ‘X’ and number ‘Y’. ‘X’ and ‘Y’ are variables over feature values.
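The effect of the shared variables can be shown with a small matcher. This is a minimal sketch under stated assumptions, not Covington's implementation: feature structures are flat Python dictionaries, a value beginning with `?` is treated as a variable (a convention invented for this illustration), and the lexical entries for *she*, *her*, *sleeps* and *sleep* are hypothetical.

```python
# The agreement rule, with "?X" and "?Y" standing for the variables X and Y.
RULE = {
    "head": {"category": "verb", "person": "?X", "number": "?Y"},
    "left": [{"category": "noun", "person": "?X", "number": "?Y",
              "case": "nominative"}],
    "right": [],
}


def match(pattern, word, bindings):
    """Match a flat feature pattern against a word's features.

    Returns the extended variable bindings, or None if matching fails.
    """
    bindings = dict(bindings)
    for feature, value in pattern.items():
        if feature not in word:
            return None
        if isinstance(value, str) and value.startswith("?"):
            # Bind the variable on first use; afterwards it must agree.
            if bindings.setdefault(value, word[feature]) != word[feature]:
                return None
        elif word[feature] != value:
            return None
    return bindings


def licenses(rule, left_words, head_word, right_words):
    """True if the rule allows head_word to take exactly these dependents."""
    if (len(left_words) != len(rule["left"])
            or len(right_words) != len(rule["right"])):
        return False
    pairs = [(rule["head"], head_word)]
    pairs += list(zip(rule["left"], left_words))
    pairs += list(zip(rule["right"], right_words))
    bindings = {}
    for pattern, word in pairs:
        bindings = match(pattern, word, bindings)
        if bindings is None:
            return False
    return True


# Hypothetical lexical entries used only for this illustration.
SHE = {"category": "noun", "person": 3, "number": "sg", "case": "nominative"}
HER = {"category": "noun", "person": 3, "number": "sg", "case": "accusative"}
SLEEPS = {"category": "verb", "person": 3, "number": "sg"}
SLEEP = {"category": "verb", "person": 3, "number": "pl"}
```

With these entries the rule licenses *she sleeps*, but rejects *she sleep* (the number values bound to `?Y` disagree) and *her sleeps* (the dependent is not nominative).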

This kind of augmentation could easily be formalized as an extension to Gaifman’s definition of DG. So long as the feature structures are simply arrangements of symbols drawn from a finite set, the generative power remains unchanged. The proof is trivial: any arrangement of features may be ‘frozen’ and treated as though it were an atomic symbol. This is directly analogous to [...]
