An Experiment On Incremental Analysis Using Robust Parsing Techniques


Kilian Foth and Wolfgang Menzel and Horia F. Pop and Ingo Schröder

foth j menzelj hfpop [email protected]

Fachbereich Informatik, Universität Hamburg

Vogt-Kölln-Straße 30, 22527 Hamburg, Germany

Abstract

The results of an experiment are presented in which an approach for robust parsing has been applied incrementally. They confirm that due to the robust nature of the underlying technology an arbitrary prefix of a sentence can be analysed into an intermediate structural description which is able to direct the further analysis with a high degree of reliability. Most notably, this result can be achieved without adapting the grammar or the parsing algorithms to the case of incremental processing. The resulting incremental parsing procedure is significantly faster if compared to a non-incremental best-first search. Additionally it turns out that longer sentences benefit most from this acceleration.

1 Introduction

Natural language utterances usually unfold over time, i.e., both listening and reading are carried out in an incremental left-to-right manner. Modeling a similar type of behaviour in computer-based solutions is a challenging and particularly interesting task for a number of quite different reasons that are most relevant in the context of spoken language systems:

• Without any external signals about the end of an utterance, incremental analysis is the only means to segment the incoming stream of speech input.

• An incremental analysis mode provides for a more natural (mixed-initiative) dialogue behaviour because partial results are already available well before the end of an utterance.

• Parsing may already take place concurrently with sentence production. Therefore the speaking time becomes available as computing time.

• Dynamic expectations about the upcoming parts of the utterance might be derived right in time to provide guiding hints for other processing components, e.g., predictions about likely word forms for a speech recognizer (Hauenstein and Weber, 1994).

In principle, two alternative strategies can be pursued when designing an incremental parsing procedure:

1. To keep open all necessary structural hypotheses required to accommodate every possible continuation of an utterance. This is the strategy usually adopted in an incremental chart parser (Wirén, 1992).

2. To commit to one or a limited number of interpretations where

(a) either this commitment is made rather early and a mechanism for partial re-analysis is provided (Lombardo, 1992) or

(b) it is delayed until sufficient information is eventually available to take an ultimate decision (Marcus, 1987).

The apparent efficiency of human language understanding is usually attributed to an early commitment strategy.

Our approach, in fact, represents an attempt to combine these two strategies: On the one hand, it keeps many of the available building blocks for the initial part of an utterance and passes an (updated) search space to the following processing step. Additionally, the optimal structural description for the data already available is determined. This not only makes an actual interpretation (together with expectations for possible continuations) available to subsequent processing components, but also opens up the possibility to use this information to effectively constrain the set of new structural hypotheses.

Determining the optimal interpretation for a yet incomplete sentence, however, requires a parsing approach robust enough to analyse an arbitrary sentence prefix into a meaningful structure. Therefore two closely related questions need to be raised:

1. Can the necessary degree of robustness be achieved?

2. Is the information contained in the currently


[...] potential for search space reductions against a possible loss of accuracy, a series of experiments has been conducted. Sections 2 and 3 introduce and motivate the framework of the experiments. Section 4 describes a number of heuristics for re-use of previous solutions and Section 5 presents the results.

2 Robust Parsing in a Dependency Framework

Our grammar models utterances as dependency trees, which consist of pairs of words so that one depends directly on the other. This subordination relation can be qualified by a label (e.g. to distinguish complements from modifiers). Since each word can only depend on one other word, a labeled tree is formed, usually with the finite verb as its root.

The decision on which structure to postulate for an utterance is guided by explicit constraints, which are represented as universally quantified logical formulas about features of word forms and partial trees. For instance, one constraint might postulate that a relation labeled as `Subject' can only occur between a noun and a finite verb to its right, or that two different dependencies of the same verb may not both be labeled `Subject'. For efficiency reasons, these formulas may constrain individual dependency edges or pairs of edges only. The application of constraints can begin as soon as the first word of an utterance is read; no global information about the utterance is required for analysis of its beginning.

Since natural language input will often exhibit irregularities such as restarts, repairs, hesitations and other grammatical errors, individual errors should not make further analysis impossible. Instead, a robust parser should continue to build a structure for the utterance. Ideally, this structure should be close to that of a similar, but grammatical utterance.

This goal is attained by annotating the constraints that constitute the grammar with scores ranging from 0 to 1. A structure that violates one or more constraints is annotated with the product of the corresponding scores, and the structure with the highest combined score is defined as the solution of the parsing problem. In general, the higher the scores of the constraints, the more irregular constructions can be analysed. Parsing an utterance with annotated or soft constraints thus amounts to multi-dimensional optimization. Both complete and heuristic search methods can be employed to solve such a problem.
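The scoring scheme just described can be illustrated with a minimal sketch. The edge representation, the constraint name and the penalty value 0.1 are assumptions made for illustration only; they are not taken from the grammar used in the experiments, and the sketch checks a whole candidate structure at once rather than single edges or pairs of edges.

```python
from dataclasses import dataclass
from math import prod
from typing import Callable, List, Tuple

# Hypothetical edge representation: (dependent, label, head).
Edge = Tuple[str, str, str]


@dataclass
class Constraint:
    name: str
    score: float  # 0.0 forbids a violation outright; values near 1.0 barely penalize it
    violated: Callable[[List[Edge]], bool]


def two_subjects(edges: List[Edge]) -> bool:
    """True if some head governs more than one edge labeled 'Subject'."""
    heads = [head for (_, label, head) in edges if label == "Subject"]
    return len(heads) != len(set(heads))


def structure_score(edges: List[Edge], constraints: List[Constraint]) -> float:
    """Product of the scores of all violated constraints (1.0 if none is violated)."""
    return prod((c.score for c in constraints if c.violated(edges)), start=1.0)


def best_structure(candidates: List[List[Edge]], constraints: List[Constraint]) -> List[Edge]:
    """The candidate with the highest combined score is taken as the solution."""
    return max(candidates, key=lambda edges: structure_score(edges, constraints))


# Illustrative grammar with a single soft constraint (penalty value is assumed).
GRAMMAR = [Constraint("at most one subject per verb", 0.1, two_subjects)]

both_subjects = [("Frau", "Subject", "sieht"), ("Mann", "Subject", "sieht")]
subject_object = [("Frau", "Subject", "sieht"), ("Mann", "Object", "sieht")]
```

Because a violated soft constraint only lowers the score instead of ruling the structure out, `both_subjects` remains a possible, if dispreferred, analysis when no better candidate exists.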

Our robust approach also provides an easy way to implement partial parsing. If necessary, e.g., an isolated noun labeled as `Subject' may form the root of a dependency tree, although this would violate the first constraint mentioned above. If a finite verb is available, however, subordinating the noun under [...] analysis of incomplete utterances.

Different levels of analysis can be defined to model syntactic as well as semantic structures. A dependency tree is constructed for each of these levels. Since constraints can relate the edges in parallel dependency trees to each other, having several trees contributes to the robustness of the approach. Altogether, the grammar used in the experiments described comprises 12 levels of analysis and 490 constraints (Schröder et al., 2000).

3 Prefix Parsing with Weighted Constraints

In general, dependency analysis is well-suited for incremental analysis. Since subordinations always concern two words rather than full constituents, each word can be integrated into the analysis as soon as it is read, although not necessarily in the optimal way. Also, the pre-computed dependency links can easily be re-used in subsequent iterations. Therefore, dependency grammar allows a fine-grained incremental analysis (Lombardo, 1992).

[Figure 1, panels (a) and (b): dependency trees with edge labels mod, c1, c2 and c3 for (a) the prefix `dann lassen sie uns doch noch einen Termin' and (b) the complete utterance `dann lassen sie uns doch noch einen Termin ausmachen'. Word-by-word gloss: `Then let you us yet a meeting appoint'. Translation: `Let's appoint yet another meeting then.']

Figure 1: An example for a prefix analysis

When assigning a dependency structure to incomplete utterances, the problem arises how to analyse words whose governors or complements still lie beyond the time horizon. Two distinct alternatives are possible:

1. The parser can establish a dependency between the word and a special node representing a putative word that is assumed to follow in the remaining input. This explicitly models the expectations that would be raised by the prefix. [...] many constraints cannot be meaningfully applied to words with unknown features.

2. An incomplete prefix can be analysed directly if a grammar is robust enough to allow partial parsing as discussed in the previous section: If the constraint that forbids multiple trees receives a severe but non-zero penalty, missing governors or complements are acceptable as long as no better structure is possible.

Experiments in prefix parsing using a dependency grammar of German have shown that even complex utterances with nested subclauses can be analysed in the second way. Figure 1a provides an example of this: Because the infinitive verb `ausmachen' is not yet visible, its complement `Termin' is analysed as an isolated subtree, and the main verb `lassen' is lacking a complement. After the missing verb has been read, two additional dependency edges suffice to build the correct structure from the partial parse.
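How a penalized multiple-tree constraint admits such a partial parse can be sketched as follows. This is an illustration under assumptions: the penalty value 0.1, its application once per surplus root, and the function names are hypothetical. Only the two roots of the prefix (`lassen' and `Termin') and the two edges added for `ausmachen' follow the description above; the remaining attachments are assumed for the sake of the example.

```python
from typing import Dict, Optional

# Maps each word to its governor; None marks the root of a (sub)tree.
Heads = Dict[str, Optional[str]]


def count_roots(heads: Heads) -> int:
    """Number of words without a governor, i.e. number of separate trees."""
    return sum(1 for governor in heads.values() if governor is None)


def fragment_score(heads: Heads, penalty: float = 0.1) -> float:
    """Assumed scoring: one soft-constraint penalty for every root beyond the first."""
    surplus_roots = max(count_roots(heads) - 1, 0)
    return penalty ** surplus_roots


# Prefix as in Figure 1a: 'Termin' is an isolated subtree next to the main tree.
prefix: Heads = {
    "dann": "lassen", "lassen": None, "sie": "lassen", "uns": "lassen",
    "doch": "lassen", "noch": "Termin", "einen": "Termin", "Termin": None,
}

# After reading 'ausmachen', two additional edges connect the fragments.
complete: Heads = dict(prefix)
complete["Termin"] = "ausmachen"
complete["ausmachen"] = "lassen"
```

With these assumptions the prefix analysis is penalized but still receives a non-zero score, while the completed utterance forms a single tree and incurs no fragmentation penalty.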

This method allows direct comparison between incremental and non-incremental parser runs, since both methods use the same grammar. Therefore, we will follow up on the second alternative only and construct extended structures guided by the structures of prefixes, without explicitly modeling missing words.

4 Re-Use of Partial Results

While a prefix analysis can produce partial parses and diagnoses, so far this information has not been used in subsequent iterations. In fact, after a new word has been read, another search is conducted on all words already available. To reduce this duplication of work, we wish to narrow down the problem space for these words. Therefore, at each iteration, the set of hypotheses has to be updated:

• By deciding which old dependency hypotheses should be kept.

• By deciding which new dependency hypotheses should be added to the search space in order to accommodate the incoming word.

For that purpose, several heuristics have been devised, based on the following principles:

Prediction strength. Restrict the search space as much as possible, while maintaining correctness.

Economy. Keep as much of the previous structure as possible.

Rightmost attachment. Attach the incoming word to the most recent words.

The heuristics are presented here in increasing order [...]

A. [...] optimal solution. Add all dependency edges where the incoming word modifies, or is modified by, another word.

B. As A, but also keep all links that differ from the previous optimal solution only in their lexical readings.

C. As B, but also keep all links that differ from the previous optimal solution only in the subordination of its last word.

D. As C, but also keep all links that differ from the previous optimal solution only in the subordination of all the words lying on the path from the last word to the root of the solution tree.

E. As D, but for all trees in the previous solution.
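The difference between heuristics C and D lies in which words of the previous optimal solution may receive a new governor. The following sketch computes that set of words. The head-array representation and the function names are assumptions for illustration; the lexical-reading variation of heuristic B and the extension to all trees in heuristic E are not modeled.

```python
from typing import Dict, List, Optional, Set

# Previous optimal solution as a head map: word position -> governor position,
# with None marking the root (hypothetical representation).
Heads = Dict[int, Optional[int]]


def path_to_root(heads: Heads, start: int) -> List[int]:
    """Words on the path from `start` up to the root of its tree, inclusive."""
    path: List[int] = []
    seen: Set[int] = set()
    node: Optional[int] = start
    while node is not None and node not in seen:  # `seen` guards against cycles
        path.append(node)
        seen.add(node)
        node = heads[node]
    return path


def resubordinable_words(heads: Heads, last_word: int, heuristic: str) -> Set[int]:
    """Words whose subordination may differ from the previous optimal solution."""
    if heuristic == "C":
        return {last_word}
    if heuristic == "D":
        return set(path_to_root(heads, last_word))
    raise ValueError("only heuristics C and D are sketched here")


# Toy solution: word 1 is the root, word 3 is the last word of the prefix.
previous_solution: Heads = {0: 1, 1: None, 2: 1, 3: 2}
```

For the toy solution, heuristic C reopens only word 3, whereas heuristic D reopens words 3, 2 and 1, which enlarges the search space accordingly.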

5 Results

In order to evaluate the potential of the heuristics described above, we have conducted a series of experiments using a grammar that was designed for non-incremental, robust parsing. We tested the incremental against a non-incremental parser using 222 utterances taken from the VERBMOBIL domain (Wahlster, 1993).

Heuristics   Weak recall   Accuracy   Relative run-time
A            68.7%         81.1%       14.1%
B            86.8%         92.9%       20.5%
C            88.3%         93.7%       22.8%
D            80.4%         95.7%       84.3%
E            78.5%         96.4%      178.5%

Figure 2: Solution quality and processing time for different heuristics

Figure 2 compares the five heuristics with respect to the following criteria:

Accuracy. The accuracy (gray bar) describes how many edges of the solutions are correct.

\[ \text{accuracy} = \frac{\#\,\text{correct edges}}{\#\,\text{edges found}} \]


Weak recall. [...] the black bar on the number of solutions found non-incrementally (which is less than 100%) because we want to focus on the impact of our heuristics, not the coverage of the grammar.

\[ \text{weak recall} = \frac{\#\,\text{correct edges}}{\#\,\text{edges found non-incrementally}} \]

Relative run-time. The run-time required by the incremental procedure as a percentage of the time required by the non-incremental search algorithm is given as the white bar.

The difference between the gray and the black bar is due to errors of the heuristic method, i.e., either its inability to find the correct subordination or excessive resource demands (which lead to the process being aborted).
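The two quality measures can be computed as in the following sketch. The set-based edge representation, the function names and the toy data are assumptions for illustration; they do not reproduce the evaluation code or data of the experiments.

```python
from typing import Set, Tuple

# Hypothetical edge representation: (dependent position, label, head position).
Edge = Tuple[int, str, int]


def accuracy(found: Set[Edge], correct: Set[Edge]) -> float:
    """# correct edges / # edges found by the incremental parser."""
    if not found:
        return 0.0
    return len(found & correct) / len(found)


def weak_recall(found: Set[Edge], correct: Set[Edge], n_found_non_incrementally: int) -> float:
    """# correct edges / # edges found by the non-incremental parser."""
    if n_found_non_incrementally == 0:
        return 0.0
    return len(found & correct) / n_found_non_incrementally


# Toy data: the incremental run returns four edges, three of which are correct,
# while the non-incremental run found five edges in total.
correct_edges = {(0, "mod", 1), (2, "c1", 1), (3, "c2", 1), (4, "mod", 1), (5, "c3", 1)}
incremental_edges = {(0, "mod", 1), (2, "c1", 1), (3, "c2", 1), (4, "mod", 3)}
```

In the toy data accuracy is 3/4 and weak recall is 3/5. An aborted run that returns no edges leaves accuracy over the remaining edges unaffected but lowers weak recall, which is the gap between the gray and the black bars described above.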

Two observations can be made: First, all but the last heuristic need less time than the non-incremental algorithm to complete while maintaining a relatively high degree of quality. Second, the more elaborate the heuristics are, the longer they need to run (as expected) and the better are the results for the accuracy measure. However, heuristics D and E could not finish parsing all sentences because in some cases a pre-defined time limit was exceeded; this leads to the observed decrease in weak recall when compared to heuristic C. As expected, a trade-off between computing time and quality can be found. Overall, heuristic C seems to be a good choice because it achieves an accuracy of up to 93.7% in only about one fifth of the run-time.

Sentence length   Time for incremental compared to non-incremental method   Absolute time for incremental (16-20 set to 100%)
1-5               250.8%                                                      0.3%
6-10               37.2%                                                      3.5%
11-15              31.4%                                                     48.2%
16-20              10.6%                                                    100%
overall            22.8%                                                     12.8%

Figure 3: Processing time vs. sentence length

Figure 3 compares the time requirements of heuristic C for different sentence lengths. The relative run-time (as in Figure 2) is given [...] the time for sentence lengths between 16 and 20 set to 100%.

The results show that the speedup observed in Figure 2 is not evenly distributed. While the incremental analysis of the short sentences takes longer (2.5 times slower) than the non-incremental algorithm, the opposite is true for longer sentences (10 times faster). However, this is welcome behavior: The incremental procedure takes longer only in those cases that are solved very fast anyway; the problematic cases are parsed more quickly. This behavior is a first hint that the incremental analysis with re-use of partial results is a step that alleviates the combinatorial explosion of resource demands.

[Bar chart: weak recall and accuracy of heuristic C for sentence lengths 1-5, 6-10, 11-15, 16-20 and overall, on a scale from 0% to 100%.]

Figure 4: Accuracy vs. sentence length (colors have the same meaning as in Figure 2)

Finally, Figure 4 compares the quality resulting from heuristic C for different sentence lengths. It turns out that, although a slight decrease is observable, the accuracy is relatively independent of sentence length.

6 Conclusions

An approach to the incremental parsing of natural language utterances has been presented, which is based on the idea to use robust parsing techniques to deal with incomplete sentences. It determines a structural description for arbitrary sentence prefixes by searching for the optimal combination of local hypotheses. This search is conducted in a problem space which is repeatedly narrowed down according


[...] expectation that the grammar used is robust enough to reliably carry out such a prefix analysis, although it has originally been developed for the non-incremental case. The optimal structure as determined by the parser obviously contains relevant information about the sentence prefix, so that even very simple and cheap heuristics can achieve a considerable level of accuracy. Therefore, large parts of the search space can be excluded from repeated re-analysis, which eventually makes it even faster than its non-incremental counterpart. Most importantly, the observed speedup grows with the length of the utterance.

On the other hand, none of the structure-based heuristics used produces a significant improvement of quality even if a large amount of computational resources is spent. Quite a number of cases can be identified where even the most expensive of our heuristics is not strong enough, e.g., the German sentence with a topicalized direct object:

Die[nom,acc] Frau sieht der[nom] Mann.
The woman sees the man.
`The woman, the man sees.'

Here, when analysing the subsentence die Frau sieht, the parser will wrongly consider die Frau as the subject, because it appears to have the right case and there is a clear preference to do so. Later, when the next word comes in, there is no way to allow for die Frau to change its structural interpretation, because this is not licensed by any of the given heuristics.

Therefore, substantially more problem-oriented heuristics are required, which should take into account not only the optimal structure, but also the conflicts caused by it. Using a weak but cheap heuristic, a fast approximation of the optimal structure can be obtained within a very restricted search space, and then refined by subsequent structural transformations (Foth et al., 2000). To a certain degree this resembles the idea of applying reason maintenance techniques for conflict resolution in incremental parsing (Wirén, 1990). In deciding which strategy is good enough to find the necessary first approximation, the results of this paper might play a crucial role, since the possible contribution of individual heuristics in such an extended framework can be precisely estimated.

Acknowledgements

This research has been partly funded by the German Research Foundation Deutsche Forschungsgemeinschaft [...]

References

Kilian Foth, Wolfgang Menzel, and Ingo Schröder. 2000. A transformation-based parsing technique with anytime property. In Proceedings of the International Workshop on Parsing Technologies (IWPT-2000), pages 89-100, Trento, Italy.

Andreas Hauenstein and Hans Weber. 1994. An investigation of tightly coupled time synchronous speech language interfaces using a unification grammar. In Proceedings of the 12th National Conference on Artificial Intelligence: Workshop on the Integration of Natural Language and Speech Processing, pages 42-49, Seattle, Washington.

Johannes Heinecke, Jürgen Kunze, Wolfgang Menzel, and Ingo Schröder. 1998. Eliminative parsing with graded constraints. In Proceedings of the Joint Conference COLING/ACL-98, Montréal, Canada.

Vincenzo Lombardo. 1992. Incremental dependency parsing. In Proceedings of the Annual Meeting of the ACL, Newark, Delaware, USA.

Mitchell P. Marcus. 1987. Deterministic parsing and description theory. In Peter Whitelock, Mary McGee Wood, Harold Somers, Rod Johnson, and Paul Bennett, editors, Linguistic Theory and Computer Applications, pages 69-112. Academic Press, London, England.

Wolfgang Menzel and Ingo Schröder. 1998. Decision procedures for dependency parsing using graded constraints. In Sylvain Kahane and Alain Polguère, editors, Proceedings of the Joint Conference COLING/ACL-98 Workshop: Processing of Dependency-based Grammars, pages 78-87, Montréal, Canada.

Ingo Schröder, Wolfgang Menzel, Kilian Foth, and Michael Schulz. 2000. Modeling dependency grammar with restricted constraints. International Journal Traitement Automatique des Langues: Grammaires de dépendance, 41(1).

Wolfgang Wahlster. 1993. Verbmobil: Translation of face-to-face dialogs. In Proceedings of the 3rd European Conference on Speech Communication and Technology, pages 29-38, Berlin, Germany.

Hans Weber. 1995. LR-inkrementelles probabilistisches Chartparsing von Worthypothesengraphen mit Unifikationsgrammatiken: Eine enge Kopplung von Suche und Analyse. Verbmobil-Report 52, Universität Erlangen-Nürnberg.

Mats Wirén. 1992. Studies in Incremental Natural-Language Analysis. Ph.D. thesis, Department of Computer and Information Science, Linköping University, Linköping, Sweden.

Mats Wirén. 1990. Incremental parsing and reason maintenance. In Proceedings of the 13th International Conference on Computational Linguistics