Based on an SLM with Arboreal Context Trees
Shinsuke MORI
IBM Research,TokyoResearchLaboratory, IBM Japan,Ltd.
1623-14 Shimotsuruma Yamato-shi, 242-8502, Japan
Abstract
Inthispaper,wepresentaparserbasedona
stochas-ticstructuredlanguagemodel(SLM)withaexible
historyreferencemechanism. AnSLMisan
alterna-tivetoan n-grammodelas a language model fora
speechrecognizer. TheadvantageofanSLMagainst
ann-gram modelis theabilityto return the
struc-ture of a given sentence. Thus SLMs are expected
toplayanimportantpartinspokenlanguage
under-standingsystems. ThecurrentSLMsrefertoaxed
partofthehistoryforpredictionjustlikeann-gram
model. We intro duce a exible history reference
mechanism called an ACT (arboreal context tree;
anextension ofthecontext treeto tree-shaped
his-tories)anddescribeaparserbasedonanSLM with
ACTs. Inthe experiment, we built anSLM-based
parserwitha xedhistoryandonewithACTs,and
comparedtheirparsingaccuracies. Theaccuracyof
our parser was 92.8%, which was higher than that
for theparser withthe xed history (89.8%). This
resultshowsthattheexiblehistoryreference
mech-anismimprovestheparsingabilityofanSLM,which
hasgreatimportanceforlanguage understanding.
1 Introduction
Currently, the state-of-the-art speech recognizers
can take dictation with satisfactory accuracy.
Al-though continuing attempts for improvements in
predictive power are needed in the language
mod-eling area for speech recognizers, another research
topic,understandingofthedictationresults,is
com-inginto focus. Structuredlanguagemodels (SLMs)
(Chelba andJelinek,1998;Charniak,2001; Moriet
al., 2001) were proposed for these purposes. Their
predictivepowersarereportedtobeslightlyhigher
thananorthodoxwordtri-grammodeliftheSLMs
are interp olated with a word tri-gram model. In
contrast with word n-gram models, SLMs use the
syntactic structure (a partial parse tree) covering
the preceding words at each step of word
predic-tion. The syntacticstructure also growsin parallel
with the word prediction. Thus after the
predic-tionofthelastwordofasentence,SLMsareableto
inputsentence(parse trees)with associated
proba-bilities. Thoughtheimpactonthepredictivepower
isnot major, this ability, which is indispensable to
spokenlanguageunderstanding,isaclearadvantage
ofSLMs over word n-gram models. With an SLM
asa language model, a speech recognizeris ableto
directlyoutputa recognitionresultwithits
syntac-ticstructureafterbeinggivenasequenceofacoustic
signals.
The early SLMs refer to only a limited and
xed part of the histories for each step of word
and structure prediction in order to avoid a
data-sparseness problem. For example, in an English
model (Chelba and Jelinek,2000) thenextwordis
predicted from the two right-most exposed heads.
Also in a Japanese model (Mori et al., 2000) the
next word is predicted from 1) all exposed heads
depending on the next word and 2) the words
de-pendingonthoseexposedheads. Oneofthenatural
improvementsin predictivepower for an SLM can
beachievedbyaddingsomeexibilitytothehistory
referencemechanism. Fora linearhistory, which is
referred to by using word n-gram models, we can
use a context tree (Ron et al., 1996) as a exible
history reference mechanism. In an n-gram model
withacontexttree,thelengthofeachn-gramis
in-creased selectively according to an estimate of the
resultingimprovement in predictive quality. Thus,
ingeneral,ann-grammodelwithacontexttreehas
morepredictivepowerin asmallermodel.
In SLMs, the history is not a simple word
se-quence but a sequence of partial parse trees. For
atree-shapedcontext,thereisalsoaexiblehistory
referencemechanismcalledanarborealcontexttree
(ACT) (Mori et al., 2001). 1
Similar to a context
tree,an SLMwith ACTs selects, depending onthe
context,theregionof thetree-shapedhistory to be
referredtoforthenextwordpredictionandthenext
structureprediction. Moriet al. (2001)report that
anSLMwithACTshasmorepredictivepowerthan
anSLM with a xed history reference mechanism.
Therefore,ifaparser basedonanSLMwith ACTs
outperformsan SLM withoutACTs, an SLM with
searchmilestoneforspokenlanguageunderstanding
systems.
Inthispaper,rstwedescribeanSLMwithACTs
for a Japanese dependency grammar. Next, we
presentourstochasticparserbasedontheSLM.
Fi-nally,wereporttwoexperimentalresults: a
compar-isonwithanSLM withoutACTs andanother
com-parisonwithastate-of-the-artJapanesedependency
parser. Theparametersofourparserwereestimated
from9,108syntacticallyannotatedsentencesfroma
nancial newspaper. Wethen tested the parser on
1,011 sentences from thesame newspaper. The
ac-curacyofthe dependencyrelationships reported by
our parser was 92.8%, higher than theaccuracy of
theparserbasedonanSLMwithoutACTs(89.8%).
This provedexperimentally that anACT improves
a parserbasedonanSLM.
2 Structured Language Model based
on Dependency
Themostpopularlanguage modelfora speech
rec-ognizerisawordn-grammodel,inwhicheachword
ispredictedfromthelast(n01)words. Thismodel
works so well that the current recognizer can take
dictationwithanalmostsatisfactoryaccuracy. Now
theresearchfocusinthelanguagemodelareais
un-derstandingthedictationresults. Inthissituation,a
structured languagemodel(SLM) was proposed by
Chelba and Jelinek (1998). In this section, we
de-scribethedependencygrammarversionofanSLM.
2.1 Structured Language Model
Thebasic ideaofan SLM is that each wordwould
bebetter predicted from the wordsthat may have
a dependencyrelationshipwiththewordtobe
pre-dictedthanfromtheproceeding(n01)words. Thus
theprobabilityPofasentencew=w
1
itsparsetreeT isgivenasfollows:
P(T)=
isthei-thpartialparsetreesequence. The
partial parse tree depicted at the top of Figure 1
shows the status before the 9th wordis predicted.
Fromthisstatus,forexample,rstthe9thwordw
9
ispredictedfromthe8thpartialparsetreesequence
t
,andthen the9thpartialparse tree
sequence t
9
is predictedfrom the9th wordw
9 and
the 8thpartial parse tree sequence t
8
to getready
for the 10th word prediction. The problem here
is how to classify the conditional parts of the two
conditionalprobabilitiesin Equation(1)inorderto
predictthenextwordand thenextstructure
with-outencounteringa data-sparsenessproblem. In an
English model (Chelba and Jelinek,2000) thenext P(t
8 )
w2
w4
w6
w
1
w
3
w5
w7
w8
t8,3
t8,2
t8,1
?
Figure1: Wordpredictionfromapartial parse
wordis predicted from the two right-most exposed
heads(forexamplew
6 andw
8
inFigure1)asfollows:
P(w
where root(t) is a function returning the root
la-bel wordof the tree t. A similar approximation is
adaptedtotheprobabilityfunctionforstructure
pre-diction. InaJapanesemodel(Morietal.,2000)the
next word is predicted from 1) all exposed heads
depending on the next word and 2) the words
de-pendingonthoseexposedheads.
Itisclear,however,thatinsomecasessomechild
nodes of the tree t
i01;2 or t
i01;1
are useful for the
next word prediction and in other cases even the
consideration of an exposed head (root of the tree
t
i01;1 ort
i01;2
)suersfroma data-sparseness
prob-lem becauseofthelimitationofthelearningcorpus
size. Thereforeamoreexiblemechanismforhistory
classicationshouldimprovethepredictivepowerof
theSLM.
2.2 SLM forDependency Grammar
Since in a dependencygrammar ofJapanese,every
dependency relationshipis in a uniquedirection as
shownin Figure1 andsincenotwodependency
re-lationshipscrosseachother,thestructureprediction
modelonlyhastopredictthenumb eroftrees. Thus,
thesecondconditionalprobabilityintherighthand
side of Equation (1) is rewritten as P(l
i
is the length (numb er of elements) of the
depen-w2
Figure2: Ahistorytree.
dencygrammarisdenedas follows:
P(T)=
According to a psycholinguistic report on
lan-guage structure (Yngve, 1960), there is an upper
limitonl
i
,thenumb erofwordswhosemodicands
havenotappearedyet. Wesettheupperlimitto9,
themaximumnumb erofslots inhumanshort-term
memory (Miller, 1956). With this limitation, our
SLMbecomesahiddenMarkovmodel.
3 Arboreal Context Tree
A variable memory length Markov model (Ron et
al.,1996),a naturalextensionofthen-grammodel,
is a exible mechanism for a linear context (word
sequence) which selects, depending onthe context,
the length of the history to be referred to for the
nextwordprediction. Thismodelisrepresentedby
a suxtree,called a contexttree, whosenodes are
labeled witha sux ofthe context. Inthis model,
thelengthofeachn-gramisincreasedselectively
ac-cordingtoanestimateoftheresultingimprovement
inpredictivequality.
In SLMs, the history is not a simple word
se-quence but a sequence of partial parse trees. For
atree-shapedcontext,thereisalsoaexiblehistory
referencemechanismcalledanarborealcontexttree
(ACT)(Mori et al.,2001)whichselects, depending
onthecontext,theregionofthetree-shapedhistory
to be referred to for the next word predictionand
forthenextstructureprediction. Inthissection,we
explainACTsandtheirapplicationtoSLMs.
3.1 DataStructure
As we mentioned above, in SLMs the history is a
sequenceofpartialparsetrees. Thiscanberegarded
as a single tree, called a history tree, by adding a
virtualrootnodehavingthesepartialtreesunderit.
a
Figure 3: Anarborealcontexttree(ACT).
9thwordpredictionbasedonthestatusdepictedat
thetop of Figure 1. An arboreal context tree is a
datastructureforexiblehistorytreeclassication.
Each node of an ACT is labeled with a subtree of
thehistory tree. Thelabelof therootis anulltree
and if a node has child nodes, their labels are the
seriesoftrees madebyexpandinga leafof thetree
labeling the parent node. For example, each child
node of the root in Figure 3 is labeledwith a tree
producedbyaddingtheright-mostchildtothelabel
oftheroot. EachnodeofanACThasa probability
distributionP(xjt),wherexisansymbolandtisthe
label of the node. For example, let ha
k
represent a tree consisting of theroot labeled with
a
0
andkchildnodeslabeledwitha
k
sotheright-mostnodeatthebottomoftheACTin
Figure3hasaprobabilitydistributionofthesymb ol
xunder theconditionthat thehistory matchesthe
partial parse trees hhz?iaihbi, where \?" matches
withanarbitrarysymb ol. Puttingitinanotherway,
thenextwordispredictedfromthehistoryhavingb
astheheadoftheright-mostpartialparsetree,aas
theheadofthesecondright-mostpartialparsetree,
and z as the second right-most child of the second
right-mostpartialparsetree. Forexample,inFigure
2thesubtreeconsistingofw
4
to for the prediction of the 9thword w
9
in Figure
1 under the following set of conditions: a = w
6
3.2 AnSLM with ACTs
P(T) =
is an ACT for word prediction and
ACT
s
isanACTforstructureprediction. Notethat
thisisageneralizationofthepredictionfromthetwo
right-mostexposedheads(w
6 andw
8
)intheEnglish
model(ChelbaandJelinek,2000). Ingeneral,SLMs
with ACTs includes SLMs with xed history
refer-encemechanismsasspecialcases.
4 Parser
Inthis section, weexplain ourparser basedon the
SLMwithACTs wedescribedinSections2 and3.
4.1 StochasticParserBased on anSLM
Asyntacticanalyzer,basedonastochasticlanguage
model, calculates the parse tree with the highest
probability ^
T for a given sequence of wordsw
ac-cordingto
wheretheconcatenation ofthewordsinthe
syntac-tic tree T is equal to w. P(T) is an SLM. In our
parser,P(T)istheprobabilityofaparsetreeT
de-nedbytheSLMbasedonthedependencywiththe
ACTs(seeEquation(3)).
4.2 Solution SearchAlgorithm
As shownin Equation(3),ourparser isbasedona
hidden Markov model. It follows that the Viterbi
algorithm is applicable to search for the best
solu-tion. TheViterbialgorithmiscapableofndingthe
best solutionin O(n)time, where nis the number
ofinputwords.
The parser repeats state transitions, reading
wordsof theinputsentencefrom beginningto end.
So that the structure of the inputsentence will be
a singleparse tree, thenumber of treesin thenal
statet
n
mustbe 1 (l
n
=1). Among thenal
pos-sible states that satisfy this constraint, the parser
selectsthestatewiththehighestprobability. Since
our language model uses only a limited part of a
partial parse tree to distinguish among states, the
nal statedoesnot contain enough information to
constructthe parse tree. The parsercan, however,
Table1: Corpus.
#sentences #words #chars
learning 9,108 260,054 400,318
test 1,011 28,825 44,667
Table2: Word-basedparsingaccuracy.
language model parsing accuracy
SLMwithACTs 92.8%(24,867/26,803)
SLM withxedhistory 89.8%(24,060/26,803)
baseline 3
79.4%(21,278/26,803)
*Eachworddependsonthenextone.
or from the combination ofthe wordsequence and
thesequenceofl
i
,thenumberofwordswhose
modi-candshavenotappearedyet. Thereforeourparser
recordsthese values at each prediction step. After
themostprobable last statehasbeenselected, the
parserconstructstheparsetreebyreadingthese
se-quencesfrombeginningtoend.
5 Evaluation
Wedevelop edan SLM with a constant history
ref-erence (Mori et al., 2000) and one with ACTs as
explainedinSection3,andthenimplemented
SLM-based parsers using the solution search algorithm
presented in Section 4. In this section, we report
the results of the parsing experiments and discuss
them.
5.1 Conditions onthe Experiments
Thecorpususedinourexperimentsconsistedof
ar-ticles extracted from a nancial newspaper (Nihon
KeizaiShinbun). Eachsentenceinthearticlesis
seg-mentedintowordsandeachwordisannotatedwith
apart-of-speech(POS)andtheworditdependson.
There are 16 basic POSs in this corpus. Table 1
showsthecorpussize. Thecorpus was dividedinto
tenparts,andtheparametersofthemodelwere
es-timatedfromnineofthem(learning)andthemodel
wastestedontheremainingone(test).
In parameter estimation and parsing, the SLM
withACTs distinguisheslexicons offunctionwords
(4POSs)and ignoreslexiconsofcontentwords(12
POSs) in order to avoid the data-sparseness
prob-lem. As a result, the alphabet of the SLM with
ACTsconsistsof192 functionwords,4 symb olsfor
unknownfunctionwords,and12symb olsforcontent
words. The SLM of the constant history reference
selects words to be lexicalized referring to the
85%
90%
95%
80%
100%
0
1
0
1
1
0
2
1
0
3
1
0
4
1
0
5
1
0
6
1
0
7
0
1
#characters in learning corpus
accuracy
83.15%
86.98%
89.17%
90.97%
92.78%
Figure4: Relationbetweencorpus size andparsing
accuracy.
5.2 Evaluation
Oneofthemajorcriteriaforadependencyparseris
theaccuracyofitsoutputdependencyrelationships.
Forthiscriterion,theinputofaparserisasequence
ofwords,eachannotatedwithaPOS.Theaccuracy
istheratioof correctdependencies(matchesin the
corpus)tothenumber ofthewordsin theinput:
accuracy
=
#wordsdependingonthecorrectword
#words
:
Thelastwordandthesecond-to-lastwordofa
sen-tence are excluded, because there is no ambiguity.
The last word has no word to depend on and the
second-to-lastwordalwaysdependsonthelastword.
Table 2 shows the accuracies of the SLM with
ACTs, the SLM of the constant history reference,
and a baselinein which each worddepends on the
next one. This result shows that the variable
his-tory reference mechanism based on ACTs reduces
30% of theerrorsof theSLM ofa constanthistory
reference. This proves experimentally that ACTs
improve an SLM for use as a spoken language
un-derstandingengine.
Wecalculatedtheparsingaccuracyofthemodels
whose parameters were estimated from 1/4, 1/16,
and1/64ofthelearningcorpusandplottedthemin
Figure4. Thegradientoftheaccuracycurveatthe
pointofthemaximumlearningcorpussizeisstill
im-portant. Itsuggeststhatanaccuracyof95%should
beachievedbyannotatingabout30,000sentences.
Similartomostoftheparsersformanylanguages,
ourparseris basedonwords. However,mostother
parsersfor Japanese arebased ona uniquephrasal
unit called a bunsetsu, a concatenation of one or
w4
w6
w
1
w3
w
5
w7
w
8
w9
1
b
b
2
b
3
b4
b
w2
Figure 5: Conversion from word dependencies to
bunsetsu dependencies.
Table3: Bunsetsu-basedparsingaccuracy.
languagemodel parsingaccuracy
SLM withACTs 87.8%(674/768)
JUMAN+KNP 85.3%(655/768)
baseline 3
62.4%(479/768)
*Eachbunsetsudependsonthenext one.
functionwords. Inordertocompareourparserwith
oneofthestate-of-the-artparsers,wecalculatedthe
bunsetsu-based accuracies of our model and KNP
(Kurohashi and Nagao, 1994) on the rst 100
sen-tencesof thetest corpus. First the sentenceswere
segmentedintowordsbyJUMAN(Kurohashietal.,
1994)andtheoutputwordsequencesareparsedby
KNP.Next,theword-baseddependenciesoutputby
our parser were changed into bunsetsu as used by
KNP, where the bunsetsu which is depended upon
bya bunsetsu is dened as thebunsetsu containing
the word depended upon by the last word of the
source bunsetsu (see Figure 5). Table3 shows the
bunsetsu-basedaccuraciesofourmodelandKNP.In
accuracy,ourparseroutperformedKNP,butthe
dif-ferencewasnotstatisticallysignicant. Inaddition,
there were dierences in theexperimental
environ-ment:
Thetestcorpus sizewaslimited.
ThePOSsystemfortheKNPinputisdetailed,
soithasmuchmoreinformationthanour
SLM-basedparser.
KNPinthisexperimentwasnotequippedwith
commercialdictionaries.
As we mentionedabove, our current model does
not attempt to use lexical information about
con-tentwordsbecauseofthe data-sparsenessproblem.
If we select the content words to be lexicalized by
referringtotheaccuracyofthewithheldcorpus,the
accuracy increases slightly to 92.9%. This means,
however, our method is not able to eciently use
lexical information aboutthecontent wordsat this
Historically,thestructuresofnaturallanguageshave
beendescribedbyaCFGandmostparsers(Fujisaki
et al., 1989; Pereira and Schabes, 1992; Charniak,
1997)arebasedonit. AnSLM forEnglish(Chelba
andJelinek,2000),proposedasalanguagemodelfor
speech recognition,is alsobasedonaCFG.Onthe
otherhand,anSLMforJapanese(Morietal.,2000)
isbasedonaMarkovmodelbyintro ducingalimiton
language structures caused by our human memory
limitations (Yngve, 1960; Miller, 1956). We
intro-duced thesame limitationintoourlanguage model
andourparserisalsobasedonaMarkovmodel.
In the last decade, the importance of the
lexi-con has come into focus in the area of stochastic
parsers. Nowadays, many state-of-the-art parsers
are based on lexicalized models (Charniak, 1997;
Collins, 1997). In these papers, theyreported
sig-nicant improvement in parsing accuracy by
lexi-calization. Our model is also lexicalized, the
lexi-calization islimited to grammatical functionwords
becauseofthesparsenessofdataatthestepofnext
word prediction. The greatest dierence between
ourparserandmanystate-of-the-artparsersisthat
ourparserisbasedona generativelanguagemodel,
whichworksasalanguage modelofaspeech
recog-nizer. Therefore,a speechrecognizerequippedwith
our parser as its language model should be useful
for a spoken language understanding system. The
greatest advantage of our model over other
struc-turedlanguagemodelsistheablitytorefertoa
vari-ablepartofthestructuredhistorybyusingACTs.
There have been several attempts at Japanese
parsers(KurohashiandNagao,1994;Harunoet al.,
1998; Fujio and Matsumoto, 1998; Kudo and
Mat-sumoto,2000). TheseJapaneseparsershaveallbeen
basedona uniquephrasal unitcalled a bunsetsu,a
concatenationofoneormorecontentwordsfollowed
bysome grammatical functionwords. Unlikethese
parsers, our model describes dependencies between
words.Thusourparsercanmoreeasilybeextended
to other languages. In addition, since almost all
pasersin other languagesthan Japanese output
re-lationshipsbetweenwords,theoutputofourparser
canbeusedbypost-parserlanguageprocessing
sys-temsproposed formanyotherlanguages (suchas a
word-level structural alignment of sentences in
dif-ferentlanguages).
7 Conclusion
In this paper we have described a structured
lan-guage model (SLM) based on a dependency
gram-mar. AnSLM treatsa sentenceas awordsequence
andpredictseachwordfrombeginningtoend. The
history at each step of prediction is a sequence of
partial parse trees covering the preceding words.
ingdata-sparsenessproblems. Asananswer,we
pro-pose to apply arborealcontext trees (ACTs) to an
SLM. AnACTis anextension of a context tree to
a tree-shaped history. We built a parser based on
an SLM with ACTs, whose parameters were
esti-matedfrom9,108syntacticallyannotatedsentences
from a nancial newspaper. We then tested the
parser on 1,011 sentences from the same
newspa-per. Theaccuracy of thedependency relationships
oftheparserwas92.8%,higherthantheaccuracyof
a parser basedon anSLM withoutACTs (89.8%).
This proved experimentally that ACTs improve a
parser basedonanSLM.
References
Eugene Charniak. 1997. Statistical parsing with a
context-freegrammarandwordstatistics.In
Pro-ceedingsof the14thNationalConferenceon
Arti-cial Intelligence,pages598{603.
Eugene Charniak. 2001. Immediate-head parsing
for language models. In Proceedings of the 39th
Annual Meetingof the Association for
Computa-tional Linguistics,pages124{131.
CiprianChelbaandFredericJelinek. 1998.
Exploit-ing syntacticstructureforlanguage modeling. In
Proceedingsof the 17th International Conference
on ComputationalLinguistics,pages225{231.
Ciprian Chelba and Frederic Jelinek. 2000.
Struc-tured language modeling. Computer Speech and
Language,14:283{332.
MichaelCollins. 1997. Threegenerative,lexicalised
models for statistical parsing. In Proceedings of
the 35th Annual Meeting of the Association for
Computational Linguistics,pages16{23.
Masakazu Fujio and Yuji Matsumoto. 1998.
Japanese dependencystructureanalysis basedon
lexicalized statistics. InProceedings of the Third
ConferenceonEmpiricalMethodsinNatural
Lan-guage Processing,pages87{96.
T. Fujisaki, F. Jelinek, J. Cocke, E. Black, and
T.Nishino. 1989. Aprobabilisticparsingmethod
forsentencedisambiguation.InProceedingsofthe
International ParsingWorkshop.
Masahiko Haruno, Satoshi Shirai, and Yoshifumi
Ooyama.1998. Usingdecisiontreestoconstructa
practical parser. InProceedingsof the17th
Inter-national Conference on Computational
Linguis-tics,pages505{511.
Taku Kudo and Yuji Matsumoto. 2000. Japanese
dependency structure analysis based on support
vectormachines. InProceedingsofthe2000Joint
SIGDAT Conference on Empirical Methods in
NaturalLanguageProcessingandVeryLarge
Cor-pora.
syn-ComputationalLinguistics,20(4):507{534.
Sadao Kurohashi, Toshihisa Nakamura, Yuji
Mat-sumoto,andMakotoNagao. 1994. Improvements
of Japanese morphological analyzer JUMAN. In
Proceedings of the International Workshop on
Sharable Natural Language Resources, pages 22{
28.
GeorgeA.Miller. 1956. Themagicalnumb erseven,
plus or minus two: Some limits on our capacity
forprocessinginformation. ThePsychological
Re-view,63:81{97.
Shinsuke Mori, Masafumi Nishimura, Nobuyasu
Itoh,ShihoOgino,andHideoWatanab e. 2000. A
stochasticparserbasedona structuralword
pre-dictionmodel. InProceedingsofthe18th
Interna-tional Conference on Computational Linguistics,
pages558{564.
ShinsukeMori,MasafumiNishimura,andNobuyasu
Itoh. 2001. Improvementofastructuredlanguage
model: Arbori-contexttree. InProceedingsof the
SeventhEuropeanConferenceonSpeech
Commu-nication andTechnology.
Fernando Pereira and YvesSchabes. 1992.
Inside-outsidereestimationfrompartiallybracketed
cor-pora. In Proceedingsof the 30thAnnual Meeting
of theAssociationforComputationalLinguistics,
pages128{135.
DanaRon,YoramSinger,andNaftaliTishby. 1996.
Thepowerofamnesia: Learningprobabilistic
au-tomata with variable memory length. Machine
Learning,25:117{149.
Victor H. Yngve. 1960. A model anda hypothesis
forlanguagestructure. The American