American
Journal
of
Computational
Linguistics
H i c r o f i c h e 32 :9
CGmputer Scf ence Department R u t g e r s Uni versi ty
Few
B r m i c k , New Jersey 08903 ABSTRACTPEDAGLOT i s a programmable p a r s e r , a ' m e t a - p a r s e r . ' To program i t , one
describes
n o t j u s t syntax and some semantics,but
a l s o - - i n d e p e n d e n t l y - - i t smodes of b e h a v i o r . The PEDAGLOT formulation o f such modes o f behavior f o l l o w s
a
c a t e g o r i z a t i o n of p a r s i n g p r o c e s s e s i n t o a t t e n t i o n - c o n t r o l , d i s c o v e r y , p r e -d i c t i o n and c o n s t r u c t i o n . Within t h e s e o v e r a l l t y p e s o f - a c t i v i t i e s , c o n t r o l
can
b e s p e c i f l e d covering a numbero f
s y n t a x - p r o c e s s i n g and semantics-process-ing
o p e r a t i o n s . While i t i s not t h e only p o s s i b l e way o f p r o g r a m i n g a meta-p a r s e r , t h e
PEDAGLOT
m o d e - s p e c i f i c a t i o n technique i s s u g g e s t i v e i n i t s e l f o fv a r i o u s new approaches t o modeling and understanding same language p r o c e s s i n g
a c t i v i t i e s besides p a r s i n g , such as g e n e r a t i o n and i n f e r e n c e ,
I t
i s
well known t h a t t o process n a t u r a l language, one needs botha
syntactic
d e s c r i p t i o n of p o s s i b l e sentences, blended i n some way with a semanticd e s c r i p t i o n b f a c e r t a i n domain of d i s c o u r s e , and
a
r a t h e r d e t a i l e d d e s c r i p t i o nof t h e a c t u a l processes used i n hearing
o r
producing sentences.An augmented t r a n s i t i o n network (Woods, 1970)
i s
qn example of t h e blendingof s y n t a c t i c and quasi-semantic d e s c r i p t i o n s , Here r e g i s t e r s would be r e p o s i -
t o r i e s
o f , o r p o i n t e r st o ,
semantics. When used i n conjunctionwith
a semanticnqtwork,
an
ATN can be used 60 p a r s e o r t o generate (Simmons and Slocum, 1912)sentences. The i s s u e o f changing t h e des'cription of t h e a c t u a l processes used
i n such systems has been touched on by Woods ( i n using a 'generation modet), t o
some e x t e n t by Gimmons and S l o ~ u m ( u s i ~ g d e c i s i o n f u n c t i o n s t o c o n t r o l s t y l e of
g e n e r a t i o n ) , and t o a l a r g e r e x t e n t by Kaplari (19751, i n h i s General S y n t a c t i c
Procdssor,
GSP. GSP indeed i s one example o f a system i n which syntax, semanticsand t o
some
e x t e n t processes can each be u s e f u l l y defined.If
we look at syntax, semantics and processes a s t h r e e d e s c r i b a b l e components,t h e s e systems j u s t mentioned i l l u s t r a t e
how
thoroughly intertwined they can become--t o
t h e
e x t e n t t h a t t h e o r i s t sfrom
time
t o time deny t h e e x i s t e n c e o r a t l e a s t t h eimportance of
some
one o f them. Ignoring t h a t d i s p u t e , I would- l i k e t o concentrateon
the
q u e s t i o n o f being a b l e t o comprehensively d e s c r i b e o n e ' s theory of languagei n
terms
o f its syntax, semantics and processes i n a way t h a t allows f o r t h e i rnecessary and
extensive
i n t e r t w i n i n g connections, b u ta t
t h e sane time allows onet o describe them independently.
I came
t a
t h e need f o r doing t h i s while designing a T r e l a x a t i o n p a r s e r , 'a
parser
which
canmake
grammatical r e l a x a t i o n s i f i ti s
given an Ill-formed s t r i n g ,so
as t o
arrive
a t a k l o s e s t t p o s s i b l e p a r s e f o r the' s t r i n g . This probl-em involvedaz
that
night
be allowed by t h e paxser.Thus
the
syntax would be fixed and the wayt h e
parser uses
it would separately have t obe
described. I twas
soon noticedthat
efficiency could be g r e a t l y enhanced i f some rudimentary notion of semanticp l a u s i b i l i t y
could
also
be used. I twould
have t o be describedi n a
way r e l a t e dt o
t h ecbrrect
syntaxbut
s t i l l be usable by t h e parser. Thus, f o r mypurposes,
the
descriptionshad
t o be independent of one another.One f e a t u r e of
a
relaxation parser i s t h a t it can ' f i l l i n t h e gaps' of astring
t h a ti s
missing
various
words. If one could, which my r e l a x a t i o n parserdid
not,
specify the
semantic
contextof
a
sentence, t h egenerated
sentence mightbe semantically
rather
plausible.
Inany
case, t h e relaxation parser operatesi n
various respects l i k e an
actual parser or like
a generator, and it was t h i s r e l a -tionship
between parsing and generating that became of i n t e r e s t ,Out o f
the
design of the relaxation p a r s e r , t h e notation (independent ofsyn-
tax)
which t o someextent
describes various processes and choices o f a l t e r n a t e waysof processing
was
developed. Thus,one
maytake
a s e t of syntax andsemantic
de-scriptions and
then
throughdescribing
t h e processing 'modest involved, d e f i n e aprocessor
whichuses
the p a r t i c u l a r algorithm t h a t t h e individual processes togetherdefine,
One may c a l l the parsert h a t
i s programmabLei n
i t s processes a meta-parser,of which
various existing q a r s e r sand
generators appear t o be s p e c i a l cases,A closer examination of t h e parser I have developed (called
PEDAGLOT*)
may showsome
such
aspects
of
meta-parsing, especially
as
regards t h e r e l a t i o n s h i p betweenparsing
and generating.I
will
describe the syntactic and semantic parts o f t h eparser first:
bynoting
i t s
resemblances t o t h e parser of J . Earley (1970) and theATN system of Woods.
Then
I
w i l l describe t h e process-type specifications t h a t areavailable, and
t h e useof
meta-parsersas
a b a s i s f o r defining general language be-haviors. Purther detail
canbe
found in
the PEDAGMT
manual
(Fabens, 1972 and 1 9 7 3 ) .1.
The
Core
of
the Parqer
-1
The fundamental
operation of
t h e p a r s e r i s very s i m i l a rt o
t h e operationof Earleyvs p a r s e r , with augmentations f o r recording t h e r e s u l t s of p a r s e s
( e , g , ,
their
t r e s t r u c t u r e , and various of t h e i r a t t r i b u t e s , which I c a l lftags'). It
is given a
grammar a s a s e t of context-free rules with v a r i o u sextensions,
most i m p ~ r t a n t ofwhich
a r e t h a t LISP functions may be used aspredicates instead of
terminals,
and thay each rule may be followed by opera-t i o n s
t h a tare
definedi n
tbnns of t h e s y n t a c t i c elements a f t h e r u l ei n
question,An
example of
t h i s n o t a t i o n i sas
follows:S -+ NP
VP
=>
[AGREE [REF NP] [VB VP]
][ S U M = [REF
NP]
]
[OW
=[REF VP] [VB
=[VB VP]
]S -*
NP
[BE] [VPASS] BY NP
=>
[AGREE
[REF
NP]
[VB
[BE]
]I
[SUM =
[REF NP
I ] ][OBJ
=REF
NP]
]
[VB
=[VB
[VPASS]
]
]NP
-+[DET]
[N]
=>
[REF
=[N]]
W +
[VINP
=>
[VB
=[V]] [REF
=[REF
NP]]
Here, each
bracketed
symboli s
the name of arecognition
predicate ( e .g.,IN]
recognizes
nouns,
[BE]
recognizesf o n s
o fh t o
b e 1 ) ,
Following t h e => aret h e
post -recognit ionfunctions.
Forinstance
[AGREE
[REFNP]
[VB VP] ]specifies
a
c a l l
t o t h eAGREE
function which i sgiven,
as arguments, t h e REF a t t r i b u t e (tag)of
the
sub-parse
involved in
t h a tr u l e
and
t h e VB a t t r i b u t eof
t h e VP part oft h e
rule.
Following
i s a
parse t r e e f o r'The
Man
Bites t h e D o g l and values o f t a g sThe Dog
The general flow of t h e parser
is
from top-down, and a s t h e lowest compo-nents (symbols i n t h e s t r i n g ) are found, the post-recognition functions t h a t are
associated with t h e r u l e t h a t recognized them a r e applied. Tags become associated
with sub-parses when t h e post-recognition operation uses t h e form [x = y ] ( i n which
t h e value referenced by y i s s t o r e d
as
t h e x t a g of t h e sub-parse). I n t h e example,[DET] and [N] recognize 'The Manf and 'Manf i s used a s the REF a t t r i b u t e ofi t h e
first
NP.
In t h e second S r u l e , t h e operation of [SUM = [REFNP']]
would be t oretrieve
the
REF
tag o f the second NP (thus t h e prime), and t o s t o r e t h a t as t h eSUM t a g of t h e f i n a l p a n e .
As i n most top-down parses, t h i s p a r s e r begins with S and i t s
two
r u l e s ,s i n c e S i s non-terminal. S is expanded i n t o t h e two sequences of matches
it
shouldperform. This expansion r e s u l t s i n various
(in
t h i s case, two) p r e d i c t i s n s of whatt o
f i n d next, When t h e i n i t i a l symbol i n some r u l e i s a terminal o r a predicate,a discovery i s c a l l e d f o r (in which a match i s pexformed, p o s s i b l y involving t h e
known
values of t h e tags). When some complete sequence of elements i s found (here,for instance,
when
NP
-+[DET]
IN]
has matched t h e[N]
).
Construction invokes t h epost-reoognit ion operat ions and then usual lyt completes some e a r l i e r part of a r u l e
are
thenspecified.
1
have
broken up the
parsingprocess into
t\he$e
three parts
so
as
t o
simi4arly
catalpg t h e
'parsing
modes,' t u r n i n gthis
parserinto
ameta-parser.
Before doingso,
fshould note tbat t h i s
parser
stores each zesult underconstruction
in
a
'chart'
as
is done by Kaplan
i nhis
GSP,so
that, for
instance,
the NP ' t e s t t
w i l l
only
have t o beevaluated once
for each place onei s
wanted i n thestring.
[Nl
[;I
1
[Nl
5
The Man
T
Bites
The Dog$
I1
lustrat
ion
of PEDAGLOT'
s
ParsingChart
Simple Arrows
indicate 'Predictions.'
Double
HeadArrows indicate
iDiscoveries,
Dotted
Arrows indicate tlConstruction.
Also,
for various
wellknown
reasonsof efficiency,
Earley's concept ofindependent
processing of syntactic events i s used (combined conceptually withthe chart),
SOthat a
main
controller
can evaluate
the individual syntactic
' t e s t s 115
are
effectively abandoned if
other results can complete t h e parse, o r a sub-parse,
first
.
2 .
Meta-Parsing Modes
One
can see
that, except f o rthe
n o t a t i o n a l i n e f f i c i e n c i e s o f t h e context,free
formalism
(asopposed to
theaugmented
t r a n s i t i o n networkform),
t h i s parseri s
verymuch like
other standard parsers (especially
ATN s).
I t differs i n t h a tthere
is
a waytof specifying how t o proceed. Currently, this system has approxi-mately a dozen toodesr and
I will
present some of them here. Each mode s p e c i f i e show
t ohandle
a certain partof
t h e parsing process. They can be classified i n t ofour categories: attention
control, prediction,
discovery and c o n s t r u c t i o n .a, Attention
Control W e s :Since the
parser operates
on a chart of independent events ( ' p a r s i n gquestions1), one must give t h e p a r s e r a method of sequencing through them.
Thus, one may specify 'breadth-first1 or 'depth-first1 and t h e appropriate
~echanism
will
be invoked{this merely involves t h e
way t h e processor stacksi t s jobs). A
'best-first
'
option i s -under development, which, when given anevaluation function to be applied
to the set of currently a c t i v e p a r t i a lparses,
allows the system to o p e r a t e on the 'best1 problem n e x t , Experi-Bents
with this
mode have so far been inconclusive.
One also can s p e c i f y when t o stop (i.e., at the first complete p a r s e ,
or t o w a i t
until
a l l other ambiguous parses have been discovered). The d i s -it~gbiguation routine (which
i s
describedas
a part o f t h e c o n s t r u c t i o n modes)defines
which
parseis
% e s t l , Further, one may specifya
left-to-right orright-to-left mode of
how
to progress along t h e s t r i n g .b. Discovery Modes:
The
starting point of building a relaxation parser is to s p e c i f y whatt o do when
an
exact match i s not made. If the parser i s expecting one wordwhat- it i s looking f o r , o r it can i n c e r t a i n o t h e r circumstances simply
i n s e r t t h e expected word i n t o t'he s t r i n g . Thus, under discovery.modes,
there are vaxious o p t i o n s : e i t h e r t h e p a r s e r i s allowed t o attempt matches
in out-of-sequence
p a r t s of t h e string, o r n a t , And i f n o t , o r i f no suchmatch i s found, t h e p a r s e r may or may not be allowed to make
an
i n s e r t i o n .So i n PEDAGLOT, t h e r e i s an
INSERT
mode (and v a r i o u s r e s t r i c t e d v e r s i o n s o f f t ) and a 'where t o look1 mode which i s used t o c o n t r o l t h e degree t o whicht h e parser can t r y t o f i n d out-of-place matches, There a r e t a g s a s s o c i a t e d
w i t h - t h e s e two s p e c i f i c a t i o n s , t h e INSERT t a g and t h e OMIT t a g , which a r e
associated
with
t h e parses involving i n s e r t i o n s and omissions t b a t containt h e number of insertions made and t h e number of i n p u t symbols omitted i n b u i l d i n g t h e parse.
There i s also a rearrangenient mode.
mus,
given c e r t a i n c o n s t r a i n t s ,the parser could be givep
'The
Bites M a n Dogt and produce a parse f o r *TheMan
Bites t h e Dogt s i n c e it would have found 'Man,' by temporarily omitting'Bites,' but then it looks f o r and finds 'Bitest and f i n a l l y , f i n d i n g no se-
cond
lthe,fthe,l
i n s e r t s one [or some o t h e r determiner because of t h e [DET] func-tion])
and f i n d s 'Dog. I n a similar way it would t r y t o produce a passiveform [i.e., the Man Is Bitten By the Dog) but since t h i s involves more i n s e r - t i o n s , e t c . i t would not be chosen.
These h e u r i s t i c s a r e c o n t r o l l e d by recording numerical summary t a g s
with
each sub-parse that p a r t i c i p a t ei n ,
and a r e judged1 by t h e d i s a m b i g u a t i c ; ~ ~r o u t i n e s . Similar ideas are used by Lyon (1974).
c. P r e d i c t i o n Modes:
As Woods (1975) has pointed o u t , t h e extent t o which a p a r s e r ' s p r e d i c t i o n
increases efficiency varies with the quality of the expected input. This f a c t
p l o s i o n .
In
PEDAGLOT, t h e r ei s
a programmable choice'
f u n c t i o n t h a t - con-t r o l s p r e d l c t i o n s . S p e c i f i c a l l y , when t h e parser encounters a non-terminal
symbol, t h a t symbol i s t h e left-hand s i d e of v a r i o u s r u l e s . An u n c o n t r o l l e d
pkediction (used by a
canonical
top-down p a r s e r ) i s t o s e l e c t each such r u l eas the expansion. I n t u i t i v e l y , however, people do n o t seem t o do t h i s . In-
stead, as
i n
an A'I??, they t r y one and only i f t h a t f a i l s , go I n t o t h e n e x t .I n PEDAGLOT, t h e choice of which r u l e t o t r y can be defined a s t h e r e s u l t of
the c a l l
t o
a 'choosef f u n c t i o n(or
it can be l e f t uncontrolled] , We havedes&ned
various
approaches t o such p r e d i c t i o n s (e.g., a l i m i t e d key-wordscan
of the incoming s t r i n g , and t h e use of 'language s t a t i s t i c s such as t h es e t of rules which can g e n e r a t e the next symbol i n t h e s t r i n g as t h e i r l e f t
most symbol).
The p r e d i c t i o n is c u r r e n t l y made once f o r any given choice p o i n t ; its
outcomes are expected t o be an ordered s e t of r u l e s t o t r y
next.
d
.
Construction
Modes
:The
phase
of p a r s i n g i n which t h e p a r t s of t h e p a r s e t r e e and a s s o c i a t e dtag values
are
formed,i s
a
p l a c e where most of t h e n o n - s y n t a c t i c information(tags]
about
the s t r i n g being parsed can come i n t o play.In
t h e first p l a c e , new t a g s can be formed as f u n c t i o n s of lower l e v e lparse tags
t b o u g h a process called melding, Thus, 'nonsense1 can be discoveredd
pronoun references can sometimes be t i e d down, I n t h e second p l a c e , it i sa r e s u l t of c o n s t r u c t i o n t h a t ambiguity i s discovered and d e a l t w i t h ,
Since these f e a t u r e s of parsing d e a l p r i m a r i l y w i t h semantics (and s i n c e ,
i f
anyrcthere,
sttsntantic r e p r e s e n t a t i o n sof
the s t r i n g r e s i d e i n t h e t a g s ) , mostof
t b PEDAGLOT c o n s t r u c t i o n modes involve t a g s .One play e x p l i c i t l y meld t a g values by using p o s t - r e c o g n i t i o n o p e r a t o r s , o r
names themselves i n s t e a d of with i n d i v i d u a l r u l e s . I n our example we use
this d e v i c e t o i m p l i c i t l y form a simple l i s t o f t h e two
REF
t a g s t h a t be-come a s s o c i a t e d w i t h t h e S r u l e . This implicit melding o p e r a t i o n
can
a l s oi n c l u d e a blocking function, o r some r e f e r e n c e t o a d a t a base. The t a g s
t h a t c o n t a i n INSERT and OMIT information
are
used i n t h i s way t o keep runningt o t a l s o f , and t o minimize t h e munber o f such h e u r i s t i c s
i n
t h e r e l a x a t i o np a r s i n g modes. One may a l s o a s s o c i a t e a LIFT f u n c t i o n which,
when
t h e par- t i a l parse becomes complete, s p e c i f i e s a transformation of t h a t t a gt o
beused as The tag o f the next h i g h e r l e v e l p a r s e .
Ambiguity i s discovered
when
two parses from the same symbol, cbveringt h e same s t r i n g segment axe found. For t h i s case, an AMBIG f u n c t i o n i s asso-
c i a t e d with t a g names, and it makes a ' v a l u e judgement1 o f which t a g i s ' b e t t e r ,
hence which i n t e r p r e t a t i o n t o use. (Other types o f c r i t e r i a can a l s o come i n t o
p l a y here such as u s e r i n t e r a c t i o n , (cf. Kay, 1973).
3. The Uses of Meta-Parsers
I ha-re just catalogued some o f t h e p a r s i n g modes a v a i l a b l e i n
PEDAGLOT.
Others,such
as Bottom-Up ( i n s t e a d of Top-Down) o r Inside-Out ( i n s t e a d o f Left-to-Right, e t c . ) ,are envisionedlbut n o t implemented. Since PEDAGLOT i s an i n t e r a c t i v e program, t h e
u s e r can change modes a t w i l l , j u s t a s he can change syntax o r i n t r o d u c e new t a g s ,
Thus, the obvious first u s e a f meta-parsers i s t h a t one may use them t o d e s i s n
language processors without having t o t i e oneself down from t h e s t a r t t o say, a d e p t h - f i r s t p a r s e r ,
Meta-parsers a l s o have a c e r t a i n amount of t r a c t i b i l i t y t h a t parsers t h a t
blend a l l . a c t i v i t i e s i n t o one huge network may n o t . Ono may sea a t a r a t h e r high l e v e l
what i s going t o be happening ( i . e , , a l l t a g s of a c e r t a i n name w i l l meld t o g e t h e r
i n a c e r t a i n way, u n l e s s t h e grammar s p e c i f i e s o t h e r w i s e ) , If one, however, wants
c e r t a i n foms of local behavior, one may use p r e d i c a t e s o r f u n c t i o n s on individuaQ
one
can
program
a
tchoosel
function which
w i l lmake
t h a t globalchange. To a
large
extent,
thelanguage
designer
may
specify mch
of t h eprocessor
in broad
ternas
and
s t i l l
be ablet o
c o n t r o llocal
events wherenecessary.
In a
more
general
sense,
a
meta-parser
allowsone
t o understandand build
higher
order
theories
abouthow
peoplemight represent and process
language.For
instance,
while
it
may
betrue
t h a tgenerating
i s t h einverse
ofparsing,
there
is
more
thanone
way
t o
do such i n v e r t i n g . One could s t a r tfrom a
senantic
network,
using the choose
functionalong
with t h e INSERT modet o restrict
means
o f
expression
consistent
with theintendea message,
and usingAMBIG functions
to weedout
a l l but reasonable messagesfrom
m n g the
many the parser may produceo r
one
might
simply
t a k e from t h esemantic
network
a simplestring o f
meaningful
words,and then
we a less t i g h t l y programmed'relaxation
parser' t orearrange these
words
to
be
syntactically
correct.
W
are
e
now considering
using
a crude'backwardsT mode
which begins
with
the
operati~n
part
o f
ar u l e
and,
byusing
predicates ( e . g . ,AGREE)
to
yield inverses, specifies what
the
context-free pattern must
produce.
Thus
there
a r e many
variations
of
how
t ogenerate
using a meta-parser.
In
the area
of
language
inference,
t o
take another example of language
processing,PEDAGLOT
suggests
various differing
waysof
approaching
t h eproblem.
First, ofie mayuse it a5
a 'relaxation-parser,
the'parse t r e e 1 can
be pattern-matched a g a i n s tthe
new
sentence,
and
hypothesescan
be
famed.
Or,
one could placea
more
rudimentary
inference
systwon
the 'prediction' part
of
the processor
i t s e l f , and using othercontrols,
thepredictions
t h a tare successful
could berewritten
as
anew
gramar.These two
learning
paradigms
could
each be strengthened by way o f t h e use o f tagst o
contain
(in
asense)
t h e meaning of t h e sentelzces t o be learned, Each of theseparadips
canbe modeled using
a meta-parserlike PEDAGLM.
Thus, a
meta-parser canReferences
Earley,
J.
(1970),llAn
Efficient Context-Free Parsing Algorithm,I1
Comm.
ACM 13,number
2, (February 1970))
pp,94-102.
Fabens, W,
(1972),
PEDAGLOT
Users Manual, Rutgers University CBM-TR-12,
kt. 19722,
Fabens,
W.
(1973),PEDAGLOT
Users Manual
:Part
11,Rutgers University CBM-TR-23,
Nov. 1973.
Kaplan,
R.M.
(1973),
"A General Syntactic Proce~sor,~~
in
R .Rustin
(ed.)Natural Language Processing,
NewYork:
Algorithmics Press,
(1973),pp. 193-242.
Kay,
M.
(1973), llThe
MIND
Systemjfl
in
R.
Rustin
(ed.)Natural
Language Processing,
New
York:
Algorithmics Press, (1973).
pp.
155-188.Lyon, G.
(1974))"Syntax-Directed Least-Errors Analysis
for
Context-Free
Languages:
APractical Approach.lr Comm.
ACM
17,
number
1,
(January 1974),pp.
3-13.Simmons,
R.
and Slocum,
J.
(1972), "Generating English Discourse
fromSemantic
Networks,''l Comm.
ACM
15,number
10,(October 1972), pp.
891-905,Woods,
W.A.
(1970),"Transition Network Grammars
for
Natural Language Analysis,"
Comm, ACkl
13, number 10, (October
1970))pp.
591-606,Woods, W.A.,