Using Robust Parsing Techniques
Kilian Foth and Wolfgang Menzel and Horia F. Pop and Ingo Schröder
foth j menzelj hfpop [email protected]
Fachbereich Informatik,Universität Hamburg
Vogt-Kölln-Straÿe 30,22527 Hamburg, Germany
Abstract
Theresultsofanexperimentarepresentedinwhich
anapproachforrobustparsinghasbeenapplied
in-crementally. Theyconrmthatduetotherobust
na-tureof theunderlyingtechnologyan arbitrary
pre-x of a sentence can be analysedinto an
interme-diate structural description which is ableto direct
thefurtheranalysiswithahighdegreeofreliability.
Most notably, this result can be achieved without
adaptingthe grammarortheparsingalgorithmsto
the case of incremental processing. The resulting
incrementalparsingprocedureissignicantlyfaster
if comparedto a non-incrementalbest-rst search.
Additionallyitturnsoutthat longersentences
ben-etmostfromthisacceleration.
1 Introduction
Natural language utterances usually unfold over
time, i. e., both listening and reading are carried
outinan incrementalleft-to-rightmanner.
Model-ing a similar type of behaviourin computer-based
solutions is achallenging and particularly
interest-ingtaskforanumberofquitedierentreasonsthat
aremostrelevantinthecontextofspokenlanguage
systems:
Without any external signals about the end
of an utterance, incremental analysis is the
onlymeansto segmenttheincomingstreamof
speechinput.
An incremental analysis mode provides for a
more natural (mixed-initiative) dialogue
be-haviourbecausepartialresultsarealready
avail-ablewellbefore theend ofanutterance.
Parsingmayalreadytakeplaceinconcurrency
tosentenceproduction. Thereforethespeaking
timebecomesavailableascomputingtime.
Dynamic expectations about the upcoming
partsoftheutterancemightbederivedrightin
timetoprovideguidinghintsforother
process-ing components,e. g., predictionsaboutlikely
wordformsforaspeechrecognizer(Hauenstein
Inprinciple,twoalternativestrategiescanbe
pur-suedwhen designingan incrementalparsing
proce-dure:
1. Tokeepopenall necessarystructural
hypothe-sesrequiredto accomodateeverypossible
con-tinuationof anutterance. This is thestrategy
usuallyadoptedinanincrementalchartparser
(Wirén, 1992).
2. Tocommittooneoralimitednumberof
inter-pretationswhere
(a) either this commitment is made rather
early and a mechanism for partial
re-analysisis provided(Lombardo,1992)or
(b) it is delayeduntilsucientinformationis
eventuallyavailabletotakeanultimate
de-cision(Marcus,1987).
The apparent eciency of human language
un-derstandingis usually attributed to an early
com-mitmentstrategy.
Our approach, in fact, represents an attempt to
combinethese two strategies: On the onehand, it
keepsmany ofthe availablebuildingblocks forthe
initialpartof anutteranceand passesan(updated)
search space to the followingprocessing step.
Ad-ditionally,theoptimal structuraldescriptionforthe
dataalreadyavailableis determined. Thisnotonly
makesanactualinterpretation(togetherwith
expec-tationsforpossiblecontinuations) availableto
sub-sequent processing components, but also opens up
thepossibilityto usethis informationto eectively
constrainthesetofnewstructuralhypotheses.
Determining theoptimal interpretation for ayet
incompletesentence,however,requiresaparsing
ap-proach robustenough to analyse an arbitrary
sen-tenceprex intoameaningful structure. Therefore
twocloselyrelatedquestionsneedtoberaised:
1. Can the necessary degree of robustness be
achieved?
2. Is the information contained in the currently
tentialforsearchspacereductionsagainstapossible
loss of accuracy, a series of experiments has been
conducted. Sections2and3introduceandmotivate
the framework of the experiments. Section 4
de-scribesanumberofheuristicsforre-useofprevious
solutionsandSection5presentstheresults.
2 Robust Parsing in a Dependency
Framework
Our grammar models utterances as dependency
trees, which consist of pairs of words so that one
depends directly on theother. This subordination
relation canbe qualiedbyalabel(e. g.to
distin-guishcomplementsfrommodiers). Sinceeachword
canonly depend on oneother word, a labeledtree
isformed,usuallywiththenite verbasitsroot.
The decisionon which structure to postulate for
anutteranceisguidedbyexplicitconstraints,which
arerepresentedasuniversallyquantiedlogical
for-mulasaboutfeaturesofwordformsandpartialtrees.
Forinstance,oneconstraintmightpostulatethat a
relationlabeledas`Subject'canonlyoccurbetween
anounandaniteverbtoitsright,orthattwo
dif-ferentdependenciesofthesameverbmaynotboth
be labeled `Subject'. For eciency reasons, these
formulasmayconstrainindividualdependencyedges
orpairsofedgesonly. Theapplicationofconstraints
canbeginassoonastherst wordof anutterance
isread;noglobalinformationabouttheutteranceis
requiredforanalysisofitsbeginning.
Sincenaturallanguageinputwilloftenexhibit
ir-regularitiessuchasrestarts,repairs,hesitationsand
other grammatical errors, individual errors should
notmakefurtheranalysisimpossible. Instead,a
ro-bust parsershouldcontinuetobuild astructure for
theutterance. Ideally,thisstructureshouldbeclose
tothat ofasimilar,but grammaticalutterance.
Thisgoalisattainedbyannotatingtheconstraints
that constitute the grammar with scores ranging
from0to 1. A structure that violatesoneormore
constraintsisannotatedwiththeproductofthe
cor-respondingscores,andthestructurewiththehighest
combinedscoreisdenedasthesolutionofthe
pars-ingproblem. Ingeneral,thehigherthescoresofthe
constraints,themoreirregularconstructionscanbe
analysed. Parsing an utterance with annotated or
soft constraintsthus amountsto multi-dimensional
optimization. Both complete and heuristic search
methodscanbeemployedtosolvesuch aproblem.
Our robust approach also provides an easy way
to implementpartialparsing. If necessary,e.g., an
isolatednounlabeledas`Subject'mayformtheroot
of a dependency tree, although this would violate
therstconstraintmentionedabove. Ifaniteverb
isavailable,however,subordinatingthenoununder
analysisofincompleteutterances.
Dierentlevels ofanalysiscanbedenedtomodel
syntactic aswell assemanticstructures. A
depen-dency tree is constructed for each of these levels.
Sinceconstraintscanrelatetheedgesinparallel
de-pendency trees to each other, having several trees
contributesto therobustness of theapproach.
Al-together,the grammarused intheexperiments
de-scribedcomprises12levelsof analysisand490
con-straints(Schröderet al.,2000).
3 Prex Parsing with Weighted
Constraints
In general, dependency analysis is well-suited for
incrementalanalysis. Since subordinations always
concerntwowordsratherthanfullconstituents,each
wordcanbeintegratedintotheanalysisassoonasit
isread,althoughnotnecessarilyintheoptimalway.
Also,thepre-computeddependencylinkscaneasily
be re-usedin subsequentiterations. Therefore,
de-pendencygrammarallowsane-grainedincremental
analysis(Lombardo,1992).
dann
lassen
sie
uns
doch
noch
einen
Termin
mod
c1
mod
c1
c2
mod
(a)
let
you
yet
a
Then
us
meeting
dann
lassen
sie
uns
doch
noch
einen
Termin
c1
mod
c3
c2
ausmachen
c1
c2
mod
mod
(b)
Then
let
you
us
yet
a
meeting
appoint
Let’s appoint yet another meeting then.
<part>
<part>
Figure1: An exampleforaprexanalysis
Whenassigningadependencystructureto
incom-pleteutterances,theproblemariseshowtoanalyse
wordswhose governorsor complementsstilllie
be-yondthetimehorizon. Twodistinctalternativesare
possible:
1. Theparsercanestablishadependencybetween
thewordandaspecialnoderepresentinga
pu-tativewordthat isassumedtofollowinthe
re-maining input. This explicitly models the
ex-pectationsthat would beraised by the prex.
under-many constraints cannot be meaningfully
ap-pliedto wordswithunknownfeatures.
2. An incomplete prex can be analyzed directly
if a grammar is robust enough to allow
par-tial parsing as discussed in the previous
sec-tion: If the constraint that forbids multiple
trees receives a severe but non-zero penalty,
missing governors or complementsare
accept-ableaslongasnobetterstructureispossible.
Experimentsinprexparsingusingadependency
grammarofGermanhaveshownthatevencomplex
utterances with nested subclauses can be analysed
in thesecond way. Figure 1a provides an example
of this: Because the innitive verb `ausmachen' is
notyetvisible,itscomplement`Termin'isanalysed
asanisolatedsubtree,andthemainverb`lassen'is
lacking a complement. After the missingverb has
beenread,twoadditional dependency edges suce
tobuildthecorrectstructurefromthepartialparse.
Thismethodallowsdirectcomparisonbetween
in-cremental and non-incremental parser runs, since
both methods use the same grammar. Therefore,
wewillfollowuponthesecondalternativeonlyand
construct extendedstructures guided by the
struc-turesofprexes,withoutexplicitlymodelingmissing
words.
4 Re-Use of Partial Results
While a prex analysiscan produce partial parses
anddiagnoses, sofar thisinformationhasnotbeen
used in subsequentiterations. In fact, after a new
wordhasbeenread,anothersearchisconductedon
allwordsalreadyavailable. Toreduce this
duplica-tionof work, wewish to narrowdown theproblem
spacefor thesewords. Therefore, at each iteration,
thesetofhypotheseshastobeupdated:
Bydeciding which old dependencyhypotheses
shouldbekept.
Bydecidingwhichnew dependencyhypotheses
shouldbeadded tothesearchspaceinorderto
accomodatetheincomingword.
Forthatpurpose,severalheuristicshavebeen
de-vised,basedonthefollowingprinciples:
Prediction strength. Restrictthesearchspaceas
much as possible, while maintaining
correct-ness.
Economy. Keep asmuchofthepreviousstructure
aspossible.
Rightmost attachment. Attach the incoming
wordtothemostrecentwords.
Theheuristicsarepresentedhereinincreasing
or-optimal solution. Add all dependency edges
where theincomingwordmodies,oris
modi-ed by,anotherword.
B. AsA,butalsokeepalllinksthatdierfromthe
previous optimal solution only in their lexical
readings.
C. AsB,butalsokeepalllinksthatdierfromthe
previous optimal solution only inthe
subordi-nationofitslastword.
D. AsC,butalsokeepalllinksthatdierfromthe
previous optimal solution only inthe
subordi-nation ofall thewordslyingon thepath from
thelastwordto therootofthesolutiontree.
E. As D,but foralltreesintheprevioussolution.
5 Results
Inorder to evaluate the potential of the heuristics
described above,wehave conducted a series of
ex-periments using a grammar that was designed for
non-incremental,robustparsing. Wetested the
in-cremental against a non-incremental parser using
222 utterances taken from the VERBMOBIL
do-main(Wahlster,1993).
0% 20% 40% 60% 80% 100%
Heuristics
A B C D E
68.7%
81.1%
14.1%
86.8%
92.9%
20.5% 88.3%
93.7%
22.8% 80.4%
95.7%
84.3% 78.5%
96.4%
178.5%
...
Figure 2: Solution qualityand processing time for
dierentheuristics
Figure2comparestheveheuristicswithrespect
tothefollowingcriteria: 1
Accuracy. Theaccuracy (graybar)describeshow
manyedgesofthesolutionsarecorrect.
accuracy=
# correctedges
# edgesfound
1
theblackbaronthenumberofsolutionsfound
non-incrementally(whichislessthan100%)
be-causewewantfocusontheimpactofour
heuris-tics,notthecoverageofthegrammar.
weakrecall=
#correctedges
#edgesfoundnon-incrementally
Relativerun-time. Therun-timerequiredbythe
incremental procedure as a percentage of the
timerequiredbythenon-incrementalsearch
al-gorithmisgivenasthewhitebar.
Thedierencebetweenthegrayandtheblackbar
is due to errors of the heuristic method, i. e.,
ei-ther because of its incapability to nd the correct
subordinationorduetoexcessiveresourcedemands
(which leadtoprocessabortion).
Two observations can be made: First, all but
the last heuristics need less time than the
non-incrementalalgorithmto completewhile
maintain-ing a relative high degree of quality. Second, the
more elaborate the heuristics are, the longer they
needtorun(asexpected)andthebetterarethe
re-sultsfortheaccuracymeasure. However,the
heuris-tics D and E could not complete to parse all
sen-tencesbecauseinsomecasesapre-denedtimelimit
was exceeded; this leads to the observed decrease
in weak recall when compared to heuristics C. As
expected, a trade-o between computing time and
qualitycanbefound. Overall,heuristicsCseemsto
beagoodchoicebecauseitachievesanaccuracyof
upto93.7%inonlyonefthoftherun-time.
0% 50% 100% 150% 200% 250%
SentenceLength 1:::5 6:::10 11:::15 16:::20 overall
250.8% 0.3%
37.2%
3.5%
31.4%
48.2%
10.6%
100%
22.8%
12.8%
Time for
incremen-tal compared to
non-incremental method
Absolute time for
in-cremental(16:::20set
to100%)
Figure3: Processingtimevs.sentencelength
Figure 3 compares the time requirements of
heuristicsCfordierentsentencelengths.
Therelativerun-time(as inFigure 2) isgiven
thetimeforsentencelengthbetween16and20
set to100%.
The results show that the speedup observed in
Figure2is notevenlydistributed. Whilethe
incre-mentalanalysisof the short sentences takeslonger
(2.5 times slower) than the non-incremental
algo-rithm,the opposite is truefor longersentences(10
times faster). However, this is welcome behavior:
Theincrementalproceduretakeslongeronlyinthose
casesthataresolvedveryfastanyway;the
problem-aticcasesareparsedmorequickly. Thisbehavioris
arsthintthattheincrementalanalysiswithre-use
ofpartialresultsisastepthatalleviatesthe
combi-natorialexplosionofresourcedemands.
0% 20% 40% 60% 80% 100%
SentenceLength 1:::5 6:::10 11:::15 16:::20
overall
92.6%
94.8%
93.3%
94.5%
76.7%
92.9%
87.4%
87.4%
88.3%
93.7%
Figure4: Accuracyvs.sentencelength(colorshave
thesamemeaningasinFigure2)
Finally, Figure 4 compares the quality resulting
fromheuristics C for dierent sentence lengths. It
turnsoutthat,althoughaslightdecreaseis
observ-able,the accuracy is relativelyindependentof
sen-tencelength.
6 Conclusions
An approachto theincrementalparsing of natural
language utterances has been presented, which is
basedon theideato userobust parsingtechniques
to dealwith incompletesentences. It determinesa
structuraldescriptionforarbitrarysentenceprexes
by searching for the optimal combination of local
hypotheses. Thissearch isconducted in aproblem
spacewhichisrepeatedlynarroweddownaccording
pectationthat the grammarused is robust enough
to reliably carry out such a prex analysis,
al-thoughithasoriginallybeendevelopedforthe
non-incremental case. The optimal structure as
deter-minedbytheparser obviouslycontainsrelevant
in-formation about the sentence prex, so that even
verysimpleandcheapheuristicscanachievea
con-siderablelevelofaccuracy. Therefore,largepartsof
thesearch spacecanbe excludedfromrepeated
re-analysis,whicheventuallymakesitevenfasterthan
itsnon-incrementalcounterpart. Mostimportantly,
the observedspeedup grows with thelength of the
utterance.
On the other hand, none of the used
structure-basedheuristicsproducesasignicantimprovement
of qualityevenif a largeamount of computational
resources is spent. Quite a number of cases can
beidentied where even the mostexpensiveof our
heuristics is not strong enough, e. g., the German
sentencewithatopicalizeddirectobject:
Die
nom,acc
Frau sieht der
nom
Mann.
The woman sees the man.
Thewoman,themansees.
Here, when analysing the subsentence die Frau
sieht, the parser will wrongly considerdie Frau as
the subject, because it appears to have the right
caseandthereisaclearpreferenceto doso. Later,
when the next word comes in, there is no way to
allowfordie Frau tochangeitsstructural
interpre-tation, because this is not licensed by any of the
givenheuristics.
Therefore, substantially more problem-oriented
heuristics are required, which should take into
ac-count not only the optimal structure, but also the
conicts caused by it. Using a weak but cheap
heuristics,afastapproximationoftheoptimal
struc-turecanbeobtainedwithinaveryrestrictedsearch
space, and then rened by subsequent structural
transformations (Foth et al., 2000). To a certain
degree this resembles the idea of applying reason
maintenancetechniquesforconictresolutionin
in-crementalparsing(Wirén,1990). Indecidingwhich
strategy is good enough to nd the necessary rst
approximationtheresultsofthispapermightplaya
crucialrole,sincethe possiblecontributionof
indi-vidualheuristicsinsuchanextendedframeworkcan
bepreciselyestimated.
Acknowledgements
ThisresearchhasbeenpartlyfundedbytheGerman
Research Foundation Deutsche
Forschungsgemein-KilianFoth, Wolfgang Menzel, and IngoSchröder.
2000. A transformation-based parsing technique
with anytime property. In Proceedings of the
International Workshop on Parsing Technologies
(IWPT-2000), pages89100,Trento,Italy.
AndreasHauensteinandHansWeber. 1994. An
in-vestigation of tightly coupled time synchronous
speech language interfaces using a unication
grammar. In Proceedings of the 12th National
Conference on Articial Intelligence: Workshop
ontheIntegrationofNaturalLanguageandSpeech
Processing,pages4249,Seattle,Washington.
Johannes Heinecke, Jürgen Kunze, Wolfgang
Men-zel, and Ingo Schröder. 1998. Eliminative
pars-ing with graded constraints. In Proceedings of
the Joint Conference COLING/ACL-98,
Mon-tréal, Canada.
VincenzoLombardo.1992. Incrementaldependency
parsing. InProceedings of the Annual Meeting of
the ACL,Delaware,Newark,USA.
Mitchell P. Marcus. 1987. Deterministic
pars-ing and description theory. In Peter Whitelock,
Mary McGee Wood, HaroldSomers, Rod
John-son,andPaulBennett,editors,LinguisticTheory
and Computer Applications, pages 69112.
Aca-demicPress,London, England.
WolfgangMenzelandIngoSchröder. 1998. Decision
procedures for dependency parsing using graded
constraints. In Sylvain Kahane and Alain
Pol-guère, editors, Proceedings of the Joint
Confer-ence COLING/ACL-98 Workshop: Processing of
Dependency-basedGrammars, pages7887,
Mon-tréal, Canada.
Ingo Schröder, Wolfgang Menzel, Kilian Foth,
and Michael Schulz. 2000. Modeling
depen-dency grammar with restricted constraints.
In-ternational Journal Traitement Automatique des
Langues: Grammaires dedépendance,41(1).
Wolfgang Wahlster. 1993. Verbmobil: Translation
of face-to-face dialogs. InProceedings of the 3rd
European Conference on Speech Communication
andTechnology, pages2938,Berlin,Germany.
Hans Weber. 1995. LR-inkrementelles
proba-bilistisches Chartparsing von
Worthypothesen-graphenmitUnikationsgrammatiken: Eineenge
Kopplung von Suche und Analyse.
Verbmobil-Report52,UniversitätErlangen-Nürnberg.
MatsWirén. 1992. StudiesinIncremental
Natural-Language Analysis. Ph.D. thesis, Department of
Computer and Information Science, Linköping
University,Linköping,Sweden.
MatsWirén. 1990. Incrementalparsingandreason
maintenance. InProceedings of the 13th
Interna-tional Conference on Computational Linguistics