parsing and word recognition in a spoken language interface
WilliamSchuler
Departmentof Computer and Information Science
University of Pennsylvania
200 S.33rd Street
Philadelphia, PA 19104
Abstract
Thispaperdescribesanextensionofthe
se-manticgrammarsusedinconventional
sta-tisticalspokenlanguageinterfacesto allow
the probabilities of derivedanalyses to be
conditionedonthemeaningsordenotations
of input utterances in the context of an
interface's underlying application
environ-ment orworld model. Since these
denota-tionswill beused toguide disambiguation
ininteractiveapplications,theymustbe
ef-ciently shared among the many possible
analysesthat maybeassignedto aninput
utterance. This paperthereforepresentsa
formalrestrictiononthescopeofvariables
in a semantic grammar which guarantees
that the denotationsof all possible
analy-sesofaninpututterancecanbecalculated
in polynomial time, without undue
con-straints on the expressivity of the derived
semantics. Empiricaltests show that this
model-theoreticinterpretationyieldsa
sta-tistically signicant improvement on
stan-dard measures of parsing accuracy over a
baselinegrammarnotconditionedon
deno-tations.
1 Introduction
The development of speaker-independent
mixed-initiative speech interfaces, in which users notonly
answer questions but also ask questions and give
instructions, is currently limited by the
perfor-manceoflanguagemodelsbasedlargelyonword
co-occurrences. Even under ideal circumstances, with
largeapplication-specic corporaon which totrain,
TheauthorwouldliketothankDavidChiang,Karin
Kipper, and threeanonymousreviewersfor particularly
helpfulcommentsonthismaterial. Thisworkwas
sup-conventional language models are not suÆciently
predictiveto correctlyanalyze awide variety of
in-puts fromawide variety ofspeakers,suchasmight
beencounteredinageneral-purposeinterfacefor
di-rectingrobots, oÆceassistants,orotheragentswith
complexcapabilities. Such tasks may involve
unla-beledobjectsthatmustbepreciselydescribed,anda
widerrangeofactionsthanastandarddatabase
in-terfacewould require(which also must be precisely
described),introducingagreatdealofambiguityinto
inputprocessing.
This paper thereforeexplores the use of a
statis-tical model of language conditioned on the
mean-ingsordenotationsofinpututterancesinthecontext
ofaninterface'sunderlyingapplicationenvironment
or world model. This use of model-theoretic
inter-pretation represents an important extension to the
`semanticgrammars'usedinexistingstatistical
spo-kenlanguageinterfaces,whichrelyonco-occurrences
amonglexically-determinedsemanticclassesandslot
llers (Miller et al., 1996), in that the probability
of an analysis is now also conditioned on the
exis-tence of denoted entities andrelations in the world
model. The advantage of the interpretation-based
disambiguationadvanced here is that the
probabil-ityof generating,forexample,thenounphrase`the
lemon next to the safe' can be more reliably
esti-matedfrom thefrequencywith which noun phrases
have non-empty denotations { given the fact that
`thelemonnexttothesafe'doesindeeddenote
some-thingintheworldmodel {thanitcanfrom the
rel-ativelysparseco-occurrencesofframelabelssuchas
lemonandnext-to,orofnext-toandsafe.
Since there are exponentially many word strings
attributable to any utterance, and an exponential
(Catalan-order)numberofpossibleparse tree
anal-yses attributable to any string of words, this use
ofmodel-theoreticinterpretationfordisambiguation
performedonlargenumbersofpossibleanalysesina
practical interactiveapplication. This paper
there-forealsopresentsaformalrestrictiononthescopeof
variablesinasemanticgrammar(withoutuntoward
constraintsontheexpressivityofthederived
seman-tics) which guarantees that the denotations of all
possibleanalysesofaninpututterancecanbe
calcu-latedin polynomialtime. Empiricaltestsshowthat
this use of model-theoretic interpretation in
disam-biguation yields a statistically signicant
improve-mentonstandardmeasuresofparsingaccuracyover
abaselinegrammarnotconditionedondenotations.
2 Model-theoretic interpretation
Inordertodeterminewhetherauser'sdirections
de-note entities and relations that exist in the world
model { and of course, in order to execute those
directions once they are disambiguated { it is
nec-essary to precisely representthe meanings of input
utterances.
Semantic grammars of the sort employed in
cur-rentspokenlanguageinterfacesforightreservation
tasks (Miller et al., 1996; Sene et al., 1998)
asso-ciatefragmentsoflogical{typicallyrelational
alge-bra{expressionswith recursivetransitionnetworks
encodinglexicalizedrulesinacontext-freegrammar
(theindependentprobabilitiesoftheserulescanthen
beestimated from atrainingcorpusand multiplied
togethertogiveaprobabilityforanygivenanalysis).
Inightreservationsystems,theseassociated
seman-tic expressions usually designate entities through a
xed set ofconstant symbolsused asproper names
(e.g.forcitiesandnumberedights);butin
applica-tions with unlabeled (perhaps visually-represented)
environments, entities must be described by
pred-icating one or more modiers over some variable,
narrowing the set of potential referents by
specify-ing colors,spatial locations,etc., until only the
de-siredentityorentitiesremain. Asemanticgrammar
for interacting with this kind of unlabeled
environ-ment might contain the following rules, using
vari-ables x
1 :::x
n
(over entities in the world model) in
theassociatedlogicalexpressions:
VP!VPPP :x
1 :::x n :$1(x 1 :::x m )^ $2(x 1 ;x m+1 :::x n )
VP!holdNP :x
1
:Hold(Agent;x
1
)^$2(x
1 )
NP!aglass :x
1
:Glass(x
1 )
PP!underNP :x
1 x
2
:Under(x
1 ;x
2
)^$2(x
2 )
NP!thefaucet:x
1
:Faucet(x
1 )
inwhichmandnareintegersand0mn. Each
lambda expression x
1 :::x
n
: indicates a function
from a tuple of entities he
1 :::e
n
i to a truth value
x 1 :::x n=2 :$1(x 1 :::x m=1 )^$2(x
1 ;x m+1 :::x n ) fhf 1 ;g 1 i;hf 2 ;g 2 i;:::g
VP!holdNP
x1:H(A;x1)^$2(x1)
fg1;g2;:::g
hold NP!...
x 1 :G(x 1 ) fg 1 ;g 2 ;:::g
a glass
PP!underNP
x1x2:U(x1;x2)^$2(x2)
fhf1;g1i;hf2;g2i;:::g
under NP!...
x
1 :F(x
1 ) ff 1 ;f 2 ;:::g
the faucet
Figure1: Semanticgrammarderivationshowingthe
associated semantics and denotation of each
con-stituent. Entire rulesareshownat eachstepin the
derivationinordertomakethesemanticassociations
explicit. stitutinge 1 :::e n forx 1 :::x n
), which denotesaset of
tuples satisfying, drawn from E n
(where E is the
setofentitiesin theworldmodel).
Thepseudo-variables$1;$2;::: inthisnotation
in-dicate the sites at which the semantic expressions
associatedwitheachrule'snonterminalsymbolsare
tocompose(the numberscorrespondto therelative
positions of the symbols on the right hand side of
eachrule,numberedfromlefttoright). Semantic
ex-pressionsforcompletesentencesarethenformedby
composing thesub-expressionsassociatedwitheach
ruleattheappropriatesites. 1
Figure 1 shows the above rules assembled in a
derivation of the sentence `hold a glass under the
faucet.' Thedenotationannotatedbeneatheach
con-stituent is simply the set of variable assignments
(foreachfreevariable)thatsatisfytheconstituent's
semantics. These denotations exactly capture the
meaning (in a given world model) of the
assem-bled semantic expressions dominated by each
con-stituent,regardlessofhowmanysub-expressionsare
subsumedbythat constituent,andcanthereforebe
sharedamong competing analysesin lieu of the
se-manticexpressionitself,asapartialresultin
model-theoreticinterpretation.
2.1 Variable scope
Note, however, that the adjunction of the
preposi-tional phrase modier `under the faucet' adds
an-otherfreevariable(x
2
)to thesemanticsoftheverb
1
Thisuseofpseudo-variablesisintendedtoresemble
thatoftheunixprogram`yacc,'whichhasasimilar
VP!VPPP
x1:::xn=1:$1(x1:::xm=0)^$2(x1;xm+1:::xn)
VP!holdNP
Qx
1
:H(A;x1)^$2(x1)
hold NP!...
PP!underNP
x1:Qx
2
:U(x1;x2)^$2(x2)
under NP!...
Figure 2: Derivation with minimal scoping. The
variable x
1
in the semantic expression associated
withtheprepositionalphrase`underthefaucet'
can-notbeidentiedwiththevariableintheverbphrase.
phrase,andthereforeanotherfactorofjEjtothe
car-dinalityofitsdenotation. Moreover,underthiskind
ofglobalscoping,ifadditionalprepositionalphrases
are adjoined, they would each contribute yet
an-otherfree variable,increasing thecomplexityof the
denotation by an additional factor of jEj, making
the shared interpretation of such structures
poten-tially exponential on the length of the input. This
proliferation of free variables means that the
vari-ables introduced by the noun phrases in an
utter-ance, such as `hold a glass under the faucet,'
can-not all be given global scope, as in Figure 1. On
theother hand,thevariables introducedby
quanti-ed noun phrasescannot be bound as soon as the
noun phrasesarecomposed,asin Figure2,because
these variables may need to be used in modiers
composed in subsequent (higher) rule applications.
Fortunately, if these non-immediate variable
scop-ingarrangementsareexpressedstructurally,as
dom-inance relationships in the elementary tree
struc-turesofsomegrammar,thenastructuralrestriction
on this grammar can beenforced that preservesas
manynon-immediatescopingarrangementsas
possi-blewhilestillpreventinganunboundedproliferation
offreevariables.
Thecorrectscopingarrangements(e.g.forthe
sen-tence `hold a glass under the faucet,' shown
Fig-ure 3) canbeexpressed using ordered sets of parse
rules grouped together in such a way as to allow
other structuralmaterial to intervene. In thiscase,
a groupwould include arule for composing a verb
and a noun phrase with someassociated predicate,
and oneormorerulesfor bindingeachof the
pred-icate's variables in a quantiersomewhere aboveit
(thereby ensuring that these rules always occur
to-getherwiththequantierrulesdominatingthe
pred-icaterule),whilestillallowingrulesadjoining
prepo-sitional phrasemodiers to apply in between them
x
2 :::x
n=1 :Q
x
1 :$1(x
1 :::x
n )
fhig
VP!VPPP
x1:::xn=1:$1(x1:::xm=1)^$2(x1;xm+1:::xn)
fg1;g2;:::g
VP!holdNP
x
1 :H(A;x
1 )^$2(x
1 )
fg
1 ;g
2 ;:::g
hold NP!...
PP!PP
x
2 :::x
n :Q
x1 :$1(x
1 :::x
n )
fg
1 ;g
2 ;:::g
PP!underNP
x1x2:U(x2;x1)^$2(x1)
fhf1;g1i;hf2;g2i;:::g
under NP!...
Figure3: Derivation withdesiredscoping.
beboundbythesamequantiers).
These `grouped rules' can be formalized using a
tree-rewriting system whose elementary trees can
subsume several ordered CFGrule applications (or
stepsin acontext-freederivation),asshown in
Fig-ure 4. Each such elementary tree contains a rule
(node)associated withalogicalpredicate andrules
(nodes) associated with quantiers binding each of
thepredicate'svariables. These treesarethen
com-posed byrewriting operations (dotted lines), which
splitthemupandeitherinsertthembetweenor
iden-tifythemwith(ifdemarcatedwithdashedlines)the
rules in another elementary tree {in this case, the
elementarytreeanchoredbytheword`under.' These
trees areconsidered elementary in order to exclude
thepossibilityofgeneratingderivationsthatcontain
unbound variables or quantiers over unused
vari-ables,which would haveno intuitive meaning. The
composition operations will bepresentedin further
detailin Section2.2.
2.2 Semantic compositionas tree-rewriting
A generalclass of rewriting systemscanbe dened
using sets of allowable expansions of some type of
object to incorporate zero or more other instances
of the same type of object, each of which is
simi-larly expandable. Such a system can generate
ar-bitrarily complex structure by recursively
expand-ingor`rewriting'eachnewobject,concludingwitha
set ofzero-expansions at thefrontier. Forexample,
acontext-free grammar may be cast asa rewriting
systemwhose objectsare strings, andwhose
VP!VP x 2 :::x n :Q x 1 :$1(x 1 :::x n )
VP!holdNP
x1:::xn:$1(x1:::xn)^$2(x1)
V!hold
x1:Hold(A;x1)
NP!...
...
VP!VP
x 2 :::x n :Q x1 :$1(x 1 :::x n )
VP!VPPP
x 1 :::x n :$1(x 1 :::x m )^$2(x
1 ;x m+1 :::x n ) VP 1
PP!PP
x 2 :::x n :Q x 1 :$1(x 1 :::x n )
PP!PNP
x1:::xn:$1(x1:::xn)^$2(x1)
P!under
x1x2:Under(x2;x1)
NP
2 1
2
PP!PP
x2:::xn:Qx
1
:$1(x1:::xn)
NP!QN
$2
Q!a N!faucet
x
1
:Faucet(x
1 )
Figure4: Completeelementarytreefor`under'showingargumentinsertionsites.
ofzeroormoresub-strings arrangedaroundcertain
`elementary'stringscontributingterminalsymbols.
A class of tree-rewriting systems can similarly
be dened as rewriting systems whose objects are
trees, and whose allowable expansions are
produc-tions (similar to context-free productions), each of
which rewrite a tree A as some function f applied
to zeroormoresub-treesA
1 ;:::A
s
;s0arranged
around some`elementary' tree structure dened by
f (Pollard,1984;Weir,1988):
A!f(A
1 ;:::A
s
) (1)
This elementary tree structure can be used to
ex-press the dominance relationship between a logical
predicateand thequantiersthatbind itsvariables
(whichmustbepreservedinanymeaningfulderived
structure); but in order to allow thesame instance
of a quantier to bind variables in more than one
predicate, the rewriting productions of such a
se-mantic tree-rewriting system must allow expanded
subtrees toidentify parts oftheirstructure
(speci-cally, theparts containingquantiers)with partsof
each other's structure, and with that of their host
elementary tree.
Inparticular,arewritingproductioninsucha
sys-temwouldrewriteatreeAasanelementarytree
0
withasetofsub-treesA
1 ;:::A
s
insertedintoit,each
of which isrst partitioned intoaset of contiguous
components(inorderto isolateparticularquantier
nodesandotherkindsof sub-structure)usingatree
partitionfunction gatsomesequenceofsplitpoints
h#
i1 ,...#
ici
i,whicharenodeaddressesinA
i
(therst
ofwhichsimplyspeciestheroot). 2
Theresulting
se-quence ofpartitioned componentsofeachexpanded
2
sub-treeare then inserted into
0
at a
correspond-ing sequence of insertion site addresses h
i1 ,...
ici i
denedbytherewritingfunctionf:
f(A
1 ;:::A
s )= 0 [h 11 ,... 1c1 i;g #11,...#1c 1 (A 1 )]::: [h s1 ,... sc s i;g # s1 ,...# scs (A s )] (2)
Since each address can only host a single inserted
component,any componentsfrom dierent sub-tree
argumentsoff that areassigned tothesame
inser-tionsiteaddressareconstrainedtobeidenticalin
or-derforthe productionto apply. Additionally, some
addresses may be`pre-lled' aspartof the
elemen-tary structure dened in f, and therefore may also
beidentiedwithcomponentsofsub-treearguments
off thatareinsertedatthesameaddress.
Figure4showsthesetofinsertionsites(designated
withboxedindices)foreachargumentofan
elemen-tarytreeanchoredby`under.' Thesites labeled 1,
associated with the rstargumentsub-tree (in this
case,thetreeanchoredby`hold'),indicate thatitis
composed bypartitioning it into three components,
eachdominatingordominatedbytheothers,the
low-estofwhichisinsertedattheterminalnodelabeled
`VP,' the middle of which is identied with a
pre-lled component (delimited by dashed lines),
con-tainingthequantiernodelabeled`VP!VP,'and
theuppermost of which (emptyin thegure) is
in-sertedattheroot,whilepreservingtherelative
dom-inancerelationshipsamong thenodesin bothtrees.
Similarly, sites labeled 2, associated with the
sec-ondargumentsub-tree(forthenounphrase
comple-thetreeinwhicheveryaddressispeciesthei th
bypartitioning itinto twocomponents{again, one
dominatingtheother{thelowestofwhichisinserted
attheterminalnodelabeled`NP,'andtheuppermost
ofwhichisidentied withanotherpre-lled
compo-nent containing the quantier node labeled `PP !
PP,' again preserving the relative dominance
rela-tionshipsamongthenodesin bothtrees.
2.3 Shared interpretation
Recall the problem of unbounded variable
prolif-eration described in Section 2.1. The advantage
of using atree-rewriting system to model semantic
composition is that such systems allow the
appli-cation of well-studied restrictions to limit their
re-cursivecapacity to generate structural descriptions
(in this case, to limit the unbounded overlapping
ofquantier-variabledependenciesthatcanproduce
unlimitednumbersoffreevariablesatcertainstepsin
aderivation), withoutlimitingthemulti-level
struc-tureoftheirelementarytrees, usedherefor
captur-ing the well-formedness constraint that apredicate
bedominatedbyitsvariables'quantiers.
One such restriction, based on the regular form
restriction dened for tree adjoining grammars
(Rogers, 1994), prohibits a grammar from allowing
anycycleofelementarytrees,eachinterveninginside
aspine(apathconnectingtheinsertionsites ofany
argument) of the next. This restriction is dened
below:
Denition2.1 Letaspineinanelementarytreebe
the path of nodes (or object-level rule applications)
connecting all insertion site addresses of the same
argument.
Denition2.2 AgrammarGisinregularformifa
directedacyclic graph hV;Ei can bedrawn with
ver-tices v
H ;v
A
2 V corresponding to partitioned
ele-mentarytrees ofG (partitionedas describedabove),
anddirectededges hv
H ;v
A
i2EV V fromeach
vertexv
H
,correspondingtoapartitionedelementary
tree that can host an argument, to each vertex v
A ,
corresponding to a partitioned elementary tree that
can function asitsargument,whose partition
inter-sects its spine atany place other thanthe top node
inthe spine.
This restrictionensuresthat there will beno
un-bounded `pumping' of intervening tree structure in
anyderivation,sotherewillneverbeanunbounded
amountofunrecognizedtreestructuretokeeptrack
ofat any stepinabottom-upparse,sothenumber
of possible descriptions of each sub-span of the
in-putwillbeboundedbysomeconstant. Itiscalleda
formaregularlanguage.
A CKY-style parser can now be built that
rec-ognizeseach context-freerule in anelementarytree
from the bottom up, storing in order the
unrecog-nized rules that lie aboveit in the elementary tree
(as well asanyremaining rulesfrom anycomposed
sub-trees) as a kind of promissory note. The fact
that any regular-form grammar has a regular path
setmeansthat onlyanitenumberofstateswillbe
requiredtokeeptrackofthispromised,unrecognized
structureinabottom-uptraversal,sotheparserwill
havethe usual O(n 3
) complexity (timesa constant
equalto the nite numberof possibleunrecognized
structures).
Moreover,sincetheparsercanrecognizeanystring
derivablebysuchagrammar,itcancreateashared
forest representation of every possible analysis of a
given input by annotating every possible
applica-tionofparse rulesthatcould beused inthe
deriva-tion of each constituent (Billot and Lang, 1989).
This polynomial-sized shared forest representation
canthenbeinterpreteddeterminewhichconstituents
denote entities and relationsin the worldmodel, in
ordertoallowmodel-theoreticsemanticinformation
toguidedisambiguationdecisionsinparsing.
Finally, the regular form restriction also has the
importanteectof ensuringthatthenumberof
un-recognizedquantiernodesatanystepina
bottom-upanalysis{andthereforethe numberof free
vari-ablesinanywordorphraseconstituentofaparse{is
alsoboundedbysomeconstant,whichlimitsthesize
of any constituent's denotationto a polynomial
or-derofE, thenumberofentitiesin theenvironment.
The interpretation of any shared forest derived by
this kind of regular-formtree-rewriting system can
thereforebecalculatedinworst-casepolynomialtime
onE.
Adenotation-annotatedsharedforestforthenoun
phrase`the girlwith thehat behind thecounter'is
shown in Figure 5, using the noun and preposition
treesfromFigure 4,withalternativeapplicationsof
parserulesrepresentedascirclesbeloweachderived
constituent. This shared structure subsumes two
competinganalyses: onecontainingthenounphrase
`the girl with the hat', denotingthe entity g
1 , and
the other containing the noun phrase `the hat
be-hind the counter', which does not denote anything
in the world model. Assuming that noun phrases
rarelyoccurwithemptydenotationsinthe training
data, theparse containing thephrase`the girlwith
thehat'will be preferred,becausethereis indeeda
NP!girl
x1:Girl(x1)
fg1;g2;g3g
P!with
x1x2:With(x2;x1)
fhh1;g1i;hh2;b1ig
NP!hat
x1:Hat(x1)
fh1;h2;h3;h4g
P!behind
x1x2:Behind(x2;x1)
fhc1;g1ig
NP!counter
x1:Counter(x1)
fc1;c2g PP!PNP
x1:::xn=2:$1(x1:::xn)^$2(x1)
fhh1;g1i;hh2;b1ig PP!PP
x2:::xn=2:Qx
1
:$1(x1:::xn)
fg1;b1g
PP!PNP
x1:::xn=2:$1(x1:::xn)^$2(x1)
fhc1;g1ig PP!PP
x2:::xn=2:Qx
1
:$1(x1:::xn)
fg1g NP!NPPP
x1:::xn=1:$1(x1:::xm=1)^$2(x1;xm+1:::xn)
fg1g
NP!NPPP
x1:::xn=1:$1(x1:::xm=1)^$2(x1;xm+1:::xn)
; PP!PNP
x1:::xn=2:$1(x1:::xn)^$2(x1)
; PP!PP
x2:::xn=2:Qx
1
:$1(x1:::xn)
;
x1:::xn=1:$1(x1:::xm=1)^$2(x1;xm+1:::xn)
; or fg1g
Figure5: Sharedforestfor`thegirlwiththehatbehindthecounter.'
tensions of tree-adjoining grammar (Joshi, 1985),
namely multi-component tree adjoining grammar
(Becker et al., 1991) and descriptiontree
substitu-tion grammar (Rambow et al., 1995), and indeed
representssomethingofacombinationofthetwo:
1. Like description tree substitution grammars,
but unlike multi-component TAGs, it allows
trees to be partitioned into any desired set of
contiguouscomponentsduring composition,
2. Likemulti-componentTAGs,butunlike
descrip-tion tree substitution grammars, it allows the
specication ofparticular insertionsites within
elementarytrees,and
3. Unlike both, it allow duplication of structure
(which isusedfor mergingquantiersfrom
dif-ferentelementarytrees).
The use of lambda calculus functions to dene
de-composable meaningsfor input sentences drawson
traditions of Church (1940) and Montague (1973),
butthisapproachdiersfromtheMontagovian
sys-complexity(in order to allowtractable
disambigua-tion).
Thisapproachtosemanticsisverysimilartothat
describedbyShieber(1994),inwhich syntacticand
semantic expressions are assembled synchronously
using paired tree-adjoining grammars with
isomor-phic derivations, except that in this approach the
derivedstructures areisomorphicaswell,hencethe
reductionofsynchronoustreepairsto
semantically-annotated syntax trees. This isomorphism
restric-tiononderivedtreesreducesthenumberofquantier
scoping congurations that can beassigned to any
given input (most of which are unlikely to be used
inapracticalapplication),butitsrelativeparsimony
allowssyntactically ambiguous inputsto be
seman-tically interpreted in ashared forest representation
in worst-case polynomial time. The interleaving of
semantic evaluation and parsing for the purpose of
disambiguationalsohasmuchin commonwith that
of Dowding et al. (1994), except that in this case,
constituentsarenotonlysemanticallytype-checked,
pro-derspeciedsemanticrepresentationof
structurally-ambiguouselementarytree constituents in ashared
forest and the underspecied semantic
representa-tion of (e.g. quantier) scope ambiguity described
byReyle(1993). 3
3 Evaluation
Thecontributionofthismodel-theoreticsemantic
in-formation toward disambiguationwasevaluated on
asetof directionsto animatedagentscollectedin a
controlled but spatially complex 3-D simulated
en-vironment (of children running a lemonadestand).
In order to avoid priming them towards
particu-larlinguisticconstructions,subjectswereshown
un-narrated animations of computer-simulated agents
performingdierenttasksinthisenvironment
(pick-ingfruit, operatingajuicer, and exchanging
lemon-adeformoney),whichweredescribedonlyasthe
`de-siredbehavior'ofeachagent. Thesubjectswerethen
askedtodirecttheagents,usingtheirownwords,to
performthedesiredbehaviorsasshown.
340utteranceswere collectedandannotatedwith
brackets andelementary treenodeaddresses as
de-scribed in Section 2.2, foruse as trainingdata and
asgoldstandarddataintesting. Somesample
direc-tionsare shown in Figure6. Mostelementarytrees
wereextracted,withsomesimplicationsforparsing
eÆciency,from anexisting broad-coveragegrammar
resource (XTAG Research Group, 1998), but some
elementary trees for multi-word expressions had to
becreatedanew. Inall,acompleteannotationofthis
corpus required a grammar of 68 elementary trees
and a lexicon of 288 lexicalizations (that is, words
orsets of wordswith indivisible semantics,forming
the anchors of a given elementary tree). Each
lex-icalizationwasthen assigned asemanticexpression
describingtheintendedgeometricrelationorclassof
objectsin thesimulated3-Denvironment. 4
Theinterfacewastestedontherst100collected
utterances, and the parsing model was trained on
the remainingutterances. Thepresence orabsence
ofadenotationofeachconstituentwasaddedtothe
label ofeachconstituentin the denotation-sensitive
parsingmodel(forexample,statisticswerecollected
for the frequency of `NP: ! NP:+ PP:+' events,
meaning a noun phrase that does not denote
any-3
Denotationof competingapplications ofparserules
canbeunioned(thoughthiseectivelytreatsambiguity
as a form of disjunction), or stored separately to some
nitiebeam(thoughsomegloballypreferablebutlocally
dispreferredstructureswouldbelost).
4
Here it was assumed that the intention of the user
wastodirecttheagentto performtheactionsshownin
onthe ground.
Pick upthelemon.
Place the lemoninthe pool.
Take the dollar billfromthe person infront ofyou.
Walktothe left towards thebig blackcube.
Figure6: Sampleutterancesfrom collectedcorpus.
thingin theenvironmentexpandsto anoun phrase
and a prepositional phrasethat do have a
denota-tionin theenvironment), whereasthe baseline
sys-temused aparsingmodel conditionedononly
con-stituentlabels(forexample,`NP!NPPP'events).
Theentire wordlatticeoutput of the speech
recog-nizer was fed directly into the parser, so as to
al-low themodel-theoreticsemanticinformation to be
brought to bear on word recognition ambiguity as
wellasonstructuralambiguityin parsing.
Sinceanyderivation ofelementarytreesuniquely
denesasemanticexpressionateachnode,thetask
ofevaluatingthiskindofsemanticanalysisisreduced
to the familiar task of evaluating athe accuracy of
a labeled bracketing (labeled with elementary tree
namesandnodeaddresses). Here,thestandard
mea-suresof labeled precisionand recallare used. Note
that there maybemultiple possible bracketings for
each goldstandardtreein agivenwordlatticethat
dieronly in the startand end frames of the
com-ponent words. Since neither the baseline nor test
parsing models are sensitive to the start and end
frames of the component words, the gold standard
bracketing is simplyassumedto usethe mostlikely
frame segmentation in the word lattice that yields
thecorrectwordsequence.
The results of the experiment are summarized
below. The environment-based model shows a
statistically signicant (p<.05) improvement of 3
points in labeled recall, a 12% reduction in error.
Most of the improvement can be attributed to the
denotation-sensitiveparserdispreferringnounphrase
constituents with mis-attached modiers, which do
notdenote anythingintheworldmodel.
Model LR LP
baselinemodel 82% 78%
baseline+denotationbit 85% 81%
4 Conclusion
seman-yses to be conditioned on the results of a
model-theoretic interpretation. In particular, a formal
re-strictionwaspresentedonthescopeofvariablesina
semanticgrammar whichguaranteesthat the
deno-tationsofallpossibleanalysesofaninpututterance
can be calculated in polynomial time, without
un-due constraints on the expressivity of the derived
semantics. Empirical tests show that this
model-theoretic interpretation yields a statistically
signif-icantimprovementonstandardmeasuresof parsing
accuracy over a baseline grammar not conditioned
ondenotations.
References
Tilman Becker,AravindJoshi,and OwenRambow.
1991. Longdistancescramblingandtreeadjoining
grammars. In Fifth Conference of the European
ChapteroftheAssociationforComputational
Lin-guistics(EACL'91),pages21{26.
SylvieBillotandBernardLang. 1989. Thestructure
ofsharedforestsinambiguousparsing. In
Proceed-ingsofthe27 th
AnnualMeetingoftheAssociation
for Computational Linguistics (ACL '89), pages
143{151.
Alonzo Church. 1940. A formulation of the
sim-ple theory of types. Journal of Symbolic Logic,
5(2):56{68.
JohnDowding,RobertMoore,FrancoisAndery,and
DouglasMoran. 1994. Interleavingsyntaxand
se-manticsin aneÆcientbottom-upparser. In
Pro-ceedings of the 32nd AnnualMeetingof the
Asso-ciation for ComputationalLinguistics (ACL'94).
Aravind K. Joshi. 1985. How much context
sensi-tivityisnecessaryforcharacterizingstructural
de-scriptions: Treeadjoining grammars. InL.
Kart-tunen D. Dowty and A. Zwicky, editors,
Natu-rallanguageparsing: Psychological,computational
andtheoretical perspectives, pages 206{250.
Cam-bridgeUniversityPress,Cambridge,U.K.
Scott Miller, David Stallard, Robert Bobrow, and
Richard Schwartz. 1996. A fully statistical
ap-proach to natural language interfaces. In
Pro-ceedings of the 34th Annual Meetingof the
Asso-ciation for Computational Linguistics (ACL'96),
pages55{61.
Richard Montague. 1973. The proper treatment
of quantication in ordinary English. In J.
Hin-tikka, J.M.E. Moravcsik, and P. Suppes, editors,
Approaches to Natural Langauge, pages 221{242.
D.Riedel,Dordrecht. ReprintedinR.H.
Thoma-soned.,FormalPhilosophy,YaleUniversityPress,
grammars, head grammars and natural langauge.
Ph.D.thesis,StanfordUniversity.
OwenRambow,David Weir,and K.Vijay-Shanker.
1995. D-treegrammars. InProceedingsofthe33rd
Annual Meeting of the Association for
Computa-tionalLinguistics (ACL'95).
Uwe Reyle. 1993. Dealing with ambiguities by
un-derspecication: Construction,representationand
deduction. Journal ofSemantics,10:123{179.
JamesRogers. 1994. CapturingCFLswithtree
ad-joininggrammars. InProceedingsof the32nd
An-nualMeetingoftheAssociationforComputational
Linguistics (ACL '94).
Stephanie Sene, EdHurley, Raymond Lau,
Chris-tine Pao, Philipp Schmid, and Victor Zue. 1998.
Galaxy-II:areferencearchitecturefor
conversa-tional systemdevelopment. In Proceedings of the
5thInternational ConferenceonSpokenLanguage
Processing (ICSLP'98), Sydney,Australia.
Stuart M. Shieber. 1994. Restricting the
weak-generativecapabilityofsynchronoustreeadjoining
grammars. ComputationalIntelligence,10(4).
David Weir. 1988. Characterizing mildly
context-sensitive grammar formalisms. Ph.D. thesis,
De-partment of Computer and Information Science,
UniversityofPennsylvania.
XTAG Research Group. 1998. A lexicalized tree
adjoining grammar for english. Technical report,