FST and Lexical Data from Corpus
JuntaeYoon YoonkwanKim MansukSong
[email protected] fgeneral,[email protected]
DaumCommunications Corp. Dept. ofComputer Science,Engineering College
Kangnam-gu,Samsung-dong, 154-8 YonseiUniv.
Seoul 135-090, Korea Seoul 120-749, Korea
Abstract
Accurateanalysisofthetemporalexpressionis
cru-cial for Korean text processing applications such
asinformationextractionandchunkingforeÆcient
syntacticanalysis. Itisacomplicatedproblemsince
temporal expressions often have the ambiguity of
syntacticroles. Thispaperdiscussestwoproblems:
(1)representingandidentifyingthetemporal
expres-sion(2)distinguishingthesyntacticfunction ofthe
temporal expression in case it has a dual
syntac-tic role. In this paper, temporal expressions and
thecontext fordisambiguationwhich iscalled local
contextare representedusing lexicaldataextracted
from corpusand thenite statetransducer. By
ex-periments, itturns outthatthe methodis eective
fortemporalexpression analysis. Inparticular, our
approach showsthe corpus-basedwork couldmake
a promising result for the problem in a restricted
domaininthat wecaneectievelydealwithalarge
sizeoflexicaldata.
1 Introduction
Accurateanalysisofthetemporalexpressionis
cru-cialfortextprocessingapplicationssuchas
informa-tionextractionandforchunkingforeÆcient
syntac-ticanalysis. Ininformationextraction,ausermight
wantto geta piece of information aboutan event.
Typically, the event is related with date or time,
whichisrepresentedbytemporalexpression.
Chunking is helpful for eÆcient syntactic
analy-sisbyremovingirrelevantintermediateconstituents
generated through parsing. It involvesthe task to
divide sentencesinto non-overlappingsegments. As
aresultofchunking,parsingwould beaproblemof
analysisinsidechunksandbetweenchunks(Yoon,et
al., 1999). Chunking preventstheparser from
pro-ducing intermediate structures irrelevant to anal
output,whichmakestheparsereÆcientwithout
los-ingaccuracy. Thus,itturnsoutthatchunkingisan
essential stage for the application system like MT
thatshould pursuebotheÆciency andprecision.
Korean, an agglutinative language, has
well-developed functional words such as postposition
aphraseis decisivelydetermined. Besides, because
it is a headnal languageand sothe head always
follows its complement, the chunking is relatively
easy. However,wearealsofacedwithanambiguity
probleminchunking,whichisoftenduetothe
tem-poral expression. This is because many temporal
nounsareused asthemodier ofnoun andverb in
asentence. Letusconsiderthefollowingexamples:
[Example]
1a jinan(last)yeoreum(summer) uri-neun
(we/NOM)hamgge(together)san-e(to
moun-tain)gassda(went)
! Wewent to the mountain togetherlast
sum-mer.
1b jinan(last)yeoreum(summer)banghag-e(in va
cation)uri-neun(we/NOM)hamgge(together)
san-e(tomountain) gassda(went)
! Wewenttothemountaintogetherinthelast
summervacation.
2a 10weol(October)9il(9th) jeonyeog(evening)
7si(7o'clock) daetongryeong-yi
(presi-dent/GEN)damhwa-ga(talk/NOM)issda(be)
! The president will give a talk at 7:00pm in
Oct. 7th.
2b 10weol(October)9il(9th) jeonyeog(evening)
7si(7o'clock)bihaenggipyo-reul(ight ticket/
ACC) yeyaghalsu issseubnigga(canreserve)
! Can I reservethe ight ticket for 7:00pm in
Oct. 7?
Intheexamples,eachtemporalexpressionplaysa
syntactically dierent role used as noun phrase or
adverbial phrase (The underlined is a phrase)
al-thoughtheycomprisethesamephrasalforms. The
temporal expressions in 1a and 2a of the example
serveasthetemporaladverbto modifypredicates.
Onthe otherhand, thetemporal expressionsin 1b
and2bareusedasthemodierofothernouns. That
is,asatemporalnouneithercontributesto
construc-tionofanoun compoundormodiesapredicate,it
causesastructuralambiguity.
as-nouns are lexically decided, it does not seem that
their syntactic tags could be accurately predicted
with a relatively small size of POS tagged corpus.
Also, the simple rule based approach cannot make
satisfactoryresults withoutlexical information. As
such,identicationoftemporalexpressionisa
com-plicated probleminKoreantextanalysis.
Thispaperdiscussesidenticationoftemporal
ex-pressions and their syntactic roles. In this paper,
we would deal with two problems: (1)
represent-ingandidentifying thetemporalexpression(2)
dis-tinguishing its syntactic function in case it has a
dual syntacticrole. Actually, thetwoproblems are
closely related since the identication and
disam-biguation process would be done under the
repre-sentationscheme of temporal expression. The
pro-cessbasesonlexicaldataextractedfromcorpusand
thenite statetransducer(FST). Accordingto our
observation oftexts, wecould seethat afew words
followingatemporal noun havegreat eect on the
syntacticfunctionofthetemporal noun. Therefore,
we note that the structural ambiguity could be
re-solved in local contexts, and so obtain lexical
in-formation for the local contexts from corpus. The
lexicaldata which containcontextsfor
disambigua-tion arerepresentedwith temporal wordtransition
overtheFST.
Briey describing our methodology, we rst
ex-tractconcordancedataofeachtemporalwordusing
aconcordance program. The co-occurrences
repre-sentrelationsbetweentemporal wordsand also
ex-plain how temporal nouns and common nouns are
combined to generate a compound noun. It would
bethelikelihood ofwordcombination, which helps
disambiguate the syntactic role if atemporalword
have a syntactic duality. In particular, we classify
temporal nouns into 26 classes in accordance with
their meaning and function. Thus, the word
co-occurrences become those among temporal classes
ortemporal classesand other nouns, which results
in reducing the parameter space. Second,
tempo-ralexpressionscontainingtheco-occurrencesof
tem-poral classesand other nouns are represented with
theFSTtoidentifytemporalexpressionsandassign
theirsyntactictagsinasentence. Ithasbeenshown
thattheFSTpresentsaveryeÆcientwayfor
repre-sentingphraseswithlocality. TheinputoftheFST
is the result from morphological analysis and POS
tagging (here, the temporal noun is tagged onlyas
noun). Itsoutputisthesyntactictagforeachword
inthesentenceandtemporalwordsareattachedtags
suchasnoun andadverb. Figure1showsthe
over-all system from the morphological analyzer to the
chunker.
Therefore, the process attaches syntactic labels
[Example]
1a 0
[jinan(last) yeoreum(summer)]
TA
uri-neun
(we/NOM)hamgge(together)san-e(to
moun-tain)gassda(went)
! Wewent to the mountain togetherlast
sum-mer.
1b 0
[jinan(last) yeoreum(summer)]
TN
banghag-e(in vacation) uri-neun(we/NOM) hamgge
(together)san-e(to mountain)gassda(went)
! Wewenttothemountaintogetherinthelast
summervacation.
2a 0
[10 weol(October) 9 il(9th) jeonyeog(evening)
7 si(7 o'clock)]
TA
daetongryeong-yi
(presi-dent/GEN)damhwa-ga(talk/NOM)issda(be)
! The president will give a talk at 7:00pm in
Oct. 7th.
2b 0
[10 weol(October) 9 il(9th) jeonyeog(evening)
7 si(7 o'clock)]
TN
bihaenggipyo-reul(ight
ticket/ACC) yeyaghalsu issseubnigga(can
re-serve)
! Can I reservethe ight ticket for 7:00pm in
Oct. 7?
2 Related Works
Abney(1991)hasproposed textchunking asa
pre-liminary step to parsing on the basis of
psycho-logical evidence. In his work, the chunk was
de-nedasapartitionedsegmentwhichcorrespondsin
someway to prosodic patterns. In addition,
com-plex attachment decisions as occurring in NP or
VP analysis are postponed without being decided
in chunking. Ramshaw and Marcus (1995)
intro-ducedabaseNPwhich isanon-recursiveNP.They
usedtransformation-basedlearningtoidentify
non-recursivebaseNPsinasentence. Also,V-typechunk
was introduced in their system, and so they tried
to partition sentences into non-overlapping N-type
and V-type chunks. Yoon, et al. (1999) have
de-ned chunkingin variouswaysforeÆcientanalysis
ofKoreantextsand shownthat themethod isvery
eectiveforpracticalapplication.
Besides,therehavebeenmanyworksbasedonthe
nitestatemachine. Thenitestatemachine is
of-tenusedforsystemssuchasspeechprocessing,
pat-tern matching, POS tagging and so forth because
of its eÆciency of speed and space and its
conve-nience of representation. As for parsing, it is not
suitableforfull parsingbasedonthegrammarthat
has recurrent property, but for partial parsing
re-quiringsimplestatetransition. Rocheand Schabes
(1995)havetransformedtheBrill'srulebasedtagger
to the optimized deterministic FST and improved
thespeedandspaceofthetagger. Anotableone
sentence
jinan yeoreum banghag-e uri-neun hamgge san-e gassda
Morp Anal and POS Tagger
FST for idetifying temporal expression
Text Chunker
jinan/A yeoreum/N banghag/N-e/P uri/PN-neun/P hamgge/AD san/N-e/P ga/V-ss/TE-da/E
[jinan/A_tn yeoreum/N_tn banghag/N-e/P] [uri/PN-neun/P] [hamgge/AD] [san/N-e/P] [ga/V-ss/TE-da/E]
jinan/A_tn yeoreum/N_tn banghag/N-e/P uri/PN-neun/P hamgge/AD san/N-e/P ga/V-ss/TE-da/E
Figure1: Systemoverviewfromthemorphologicalanalyzerfromthechunker
rigid phrases,collocationsand idioms unlikeglobal
grammar for describing sentences of a language in
aformallevel. Thetemporalexpression was
repre-sentedwithlocalgrammarinhiswork,whereitwas
claimedthattheformalismofniteautomatacould
beeasily usedtorepresentthem.
3 Acquiring Co-occurrence of
Temporal Expression
3.1 Categorizing TemporalNouns
Sincemanywordshavein commonasimilar
mean-ing and function, theycan be categorized by their
features. Sodotemporalnouns. Thatis,wesaythat
`Sunday' and`Monday'havethesamefeaturesand
sowouldtakethesimilarbehaviorpatternssuchas
co-occurringwiththesimilarwordsinasentenceor
phrase. Hence,intherstplacewecategorize
tem-poral nouns according to their meaning and
func-tion. Werstselect259temporal nounsanddivide
them into 26 classes as shown in Table 1. Among
them, some temporal words havesyntactic duality
and others play one syntactic role. Thus, the
dis-ambiguationprocess would be applied only to the
wordswithdualsyntacticfunctions.
3.2 Acquisition of TemporalExpressions
from Corpus
Temporalwordswouldbecombinedwitheachother
inordertobemadereferencetotime,whichiscalled
temporal expression. Since atemporalexpressionis
typicallycomposed ofoneorafewtemporalwords,
oneul(today)
haggyo-e(to school)
gassda(went)
X
modifying noun
modifying predicate
T
V
V
NP
NP
jeomsim-eun
(lunch/NOM)
masisseossda
(was delicious)
Figure2: Syntacticfunctional ambiguityof
tempo-ralexpression
thetemporalexpressionwithasimplemodellike
-niteautomata. Inthepracticalsystem,however,we
areconfrontedwithacomplicatedproblemin
treat-ingtemporalexpressionssincemanytemporalwords
haveafunctionalambiguityusedasbothanominal
and predicate modier. For instance, a temporal
nounoneul(today)couldplayadierentrole inthe
similarsituation asshown in Figure 2. In therst
andthesecondpath,thewordstofollowoneulareall
noun,but theroles (dependencyrelations)ofoneul
aredierent.
Accurateclassicationoftheirsyntacticfunctions
iscrucialfortheapplication systemsincegreat
dif-ferencewouldbemadeaccordingtoaccuracyofthe
dependencyresult. Practically, wetherefore should
res-temporal prexes
modier 1 ol(this),jinan(last),...
number 2 number...
temporal nouns
temporalunit 3{10 segi(centry),nyeon(year),...
era, age 11 gosaengdae(Paleozoic),...
years 12 geumnyeon(thisyear),saehae(newyear),...
months 13 naedal(next month),jeongwol(January),...
weeks 14 geumju(thisweek),naeju(nextweek),...
days
dayofweek 15 ilyo 0
il(Sunday),wolyo 0
il, ...
day1 16{17 haru(oneday),chuseog(Thanksgivingday),...
day2 18 oneul(today),nae
0
il(tomorrow),...
time
and
dura-tion
time1 19 saebyeog(dawn), achim(morning),...
time2 20{21 yeonmal(year-end),...
season 22 bom(spring),yeoreum(summer), ...
specicduration 23 hwanjeolgi(time ofseasonchanging),...
edge 24-25 chogi(early time),jungban(mid),...
temporal suÆxes temporalsuÆxes 26 dongan(during),naenae(through),...
Table1: Categorizationoftemporalwords
fyingtemporalexpressions. Thepointthatwenote
hereisthatwecouldpredictthesyntacticfunctionof
temporalwordsbylookingaheadoneortwowords.
Namely,looking atafew wordsthatfollowsa
tem-poralwordwecangureoutwhichwordthe
tempo-ralexpressionmodies,andcallthefollowingwords
local context.
Unfortunately, it is not easy to dene the local
context for determining the syntactic function of
each temporal word because they are lexically
re-lated. That is,itiswhollydierentfromeachword
whether atemporalnoun wouldmodifyothernoun
to formacompound nounor modifyapredicateas
an adverbial phrase. Our approach is to use
cor-pus to acquireinformationaboutthelocalcontext.
Since wecould obtain from corpus asmany
exam-plesasneeded,rulesforcompound wordgeneration
canbeconstructedfromtheexamples. Inthispaper,
weuseco-occurrencerelationsoftemporalnouns
ex-tractedfromlargecorpustorepresentandconstruct
rulesforidenticationoftemporalexpressions.
As mentioned before, we would pay attentionto
twopointshere: (1) In what order atemporal
ex-pressionwouldberepresentedwithtemporalwords,
i.e. descriptionofthetemporalexpressionnetwork.
(2)howthelocalcontextwould bedescribed to
re-solvetheambiguityofthesyntacticfunctionof
tem-poralexpressions. Forthispurpose, werstextract
examplesentencescontaining each of 259temporal
words from corpus using the KAIST concordance
program 1
(KAIST,1998). Thenumberoftemporal
words is small and so we could manually
manipu-late lexical data extracted from corpus. Figure 3
1
KAISTcorpusconsistsofabout50millioneojeols.Eojeol
isaspacingunitcomposedofacontentwordandfunctional
shows example sentences about yeoreum(summer)
extractedbytheconcordanceprogram.
Second, we select only the phrases related with
temporal words from the examples (Table 4). As
shownin Table4,yeoreum isassociatedwith
vary-ing words. Temporal words like temporal prexes
cancomebeforeitandcommonnounscanfollowit.
Inthisstage wedescribecontextsofeach temporal
wordandtheoutput (syntactictagof thetemporal
word)under the given context. Inparticular, each
temporalwordisassignedatemporalclass. Besides,
othernouns serveaslocal contextsfor
disambigua-tionofsyntacticfunction oftemporal words.
Fromtheexamples,wecanseethatifbam(night),
byeoljang(villa),banghag(vacation)andsoonfollows
it, yeoreum servesas a component of a compound
noun with thefollowingword. On the otherhand,
thewordnaenaewhichmeansallthetimeisa
tem-poral noun and forms a temporal adverbial phrase
withotherprecedingtemporalnoun. Moreover,
yeo-reum(summer)mightrepresenttime-related
expres-sionwithpreceding temporalprexes.
4 Identifying Temporal Expressions
and Chunking
4.1 RepresentingTemporalExpression
UsingFST
The co-occurrence data extracted by the way
de-scribed in the previous section can be represented
withanite state machine (Figure 5). For
syntac-tic function disambiguation and chunking, the
au-tomatashould produceanoutput,whichleadstoa
nite state transducer. In fact, individual
descrip-tionforeachdatacouldbeintegratedintoonelarge
Fig-L#|ö\(ú°. [ö ]]L],¿#¤:
Figure3: Exampleconcordancedataofyeoreum(summer)
before temporalnoun after output freq
yeoreum(summer)
ö/t
22
(bam,night) TN 2
ö/t
22
¡(banghag,vacation) TN 7
ö/t
22
»$(byeoljang,villa) TN 1
ö/t
22
s(jumal,weekends) TN 1
ö/t
22
(gamgi,u) TN 1
ö/t
22
44/t
26
(naenae,allthetime) TA 1
&/t
1
(jinan, last) ö/t
22
#(naneun,I/TOP) TA 1
©/t
10
(hae,year) ö/t
22
6.25,(majimag,the last) TA 2
¥/t
1
(ibeon,this) ö/t
22
;(jeontuneun,battle/TOP) TA 1
Figure4: Temporalexpression phrasesselectedfrom examples
tuple (
isanite input
alphabet;
2
is anite output alphabet; Q is a
-nitesetofstatesorvertices;i2Qistheinitialstate;
F Qisthesetofnalstates;EQ
isthesetoftransitionsoredges.
Althoughthesyntacticfunctionofatemporal
ex-pressionwouldbenondeterministicallyselectedfrom
the context, temporal expressions and the lexical
data of local context can be represented in a
de-terministic way due to their nite length. For the
deterministicFST,wedenethepartialfunctions
andwhereqa=q
(Roche and Schabes,1995). Then, asubsequential
FST is a eight-tuple (
1
thedeterministicstatetransitionfunctionthatmaps
Q onQ;isthedeterministicemissionfunction
T =(
Figure6: DeterministicFSTresultedfrom Figure5
that maps Q
Our temporal co-occurrence data can be
yeoreum/TN
naenae/TA
Figure5: Finitestatemachineconstructedwiththedatain Figure4
0
1
2
3
W =fbam;jumal;banghag;:::g
wi2W
w
j 62W
Figure 7: A deterministicnite statetransducerto
processtemporal expression
in a similar way. The subsequential FST for our
systemis dened asin Figure6and Figure 7
illus-trates thetransducer in Figure 6. Inthe gure, t
i
isaclasstowhichthetemporalwordbelongsinthe
temporalclassication. w
i
isawordotherthan
tem-poralonesthathastheprecedingtemporalwordbe
its modier, and w
j
is not such awordto make a
compound noun. TN, TA and NT are syntactic
tags. A wordtagged withTN wouldmodifya
suc-ceedingnoun likebam(night),banghag(vacation). A
word attached with TA would modify a predicate
andonewithNT meansit isnotatemporalword.
Actually,individualFSTsarecombinedintooneand
rulesfortaggingoftemporalwordsareputoverthe
FST. The rule is applied according to the priority
byfrequencyin casemorethanoneoutputare
pos-sibleforacontext. Namely,itisarule-basedsystem
wheretherulesareextractedfromcorpus.
4.2 Chunking
AftertheFSToftemporalexpressionsaddstowords
syntactictagssuchasTN andTA,chunkingis
con-ductedwithresultsfromoutputsbytheFST.Aswe
said earlier, chunking in Korean is relatively easy
fullyrecognized. Actually,ourchunkerisalsobased
on thenite statemachine. Thefollowingisan
ex-ampleforchunkingrules.
hNPchunki!hNPijhTNPi
Here,N isanounwithoutanypostposition,NP is
anounwithapostposition,TN isatemporalnoun
recognized asmodifying a succeeding noun,NU is
a numberand UN is a unit noun. After temporal
tagging, the chunker transforms `NT' into N, NP,
etc. according to morphological constituents and
theirPOS.Briey,therulesaysthatanNPchunkis
madefromeitherNPortemporalNP.AnNPwould
be constructed with one or more nouns and their
modieeor withanoun quantied. A TNP, which
is related with time, is made from nouns modied
bytemporalwordswhichwouldbeidentiedbythe
FST. By identication of temporal expression and
chunking,thefollowingexamplesentenceischunked
asbelow.
jinan(last) yeoreum(summer) banghag-e(in
vacation) uri-neun(we/SUBJ)
keompyu-teo(computer) se(three) dae-reul(unit/OBJ)
sassda(bought)
! We bought three computers in the last
summervacation.
5 Experimental Results
For the experimentabout temporal expression, we
expres-precision recall
rate(%) 97.5 90.56
Table2: Resultsofidentifyingtemporalexpression
nochunking usingchunking
avg. # ofcand 4.8 3.3
Table 3: Reduction of candidates resulted from
chunking
sultsfromidentifyingtemporalexpressionsand
dis-ambiguatingtheirsyntacticfunctions. Fromthe
re-sultinthetableweseethatthemethodisvery
eec-tiveinthat itveryaccuratelyidentiesallthe
tem-poralexpressionsandassignsthemsyntactictags.
And, Table 2 shows the reduction resulted from
chunking after temporal expression identication.
We take into consideration the average number of
head candidates for each word since our parser is
dependencybasedone. The testwasconducted on
the rst le (about800 sentences) of KAIST
tree-bank(Choietal. ,1994). Thenumberwasreduced
by31%incandidates comparedto thesystemwith
nochunking,whichmakesparsingeÆcient.
Mostoferrorswerecausedbythecasewhere
tem-poralwordshavedierentsyntacticrolesunder the
samecontext. Inthiscase,theglobalcontextsuchas
thewholesentenceor intersententialinformationor
sometimesverysophisticatedprocessingisneededto
resolvethe problem. For instance, `82 nyeon(year)
hyeonjae-yi(now/GEN)' could be used two-way. If
thespeechtimeistheyear1982,thenhyeonjae-yiare
combined with 82nyeon to represent time.
Other-wise,82doesnotmodifyhyeonjae-yi, which cannot
berecognizedonlywiththelocalcontext.
Neverthe-less,thesystemispromisinginthatgenerallyitcan
improveeÆciency without losingaccuracywhichis
crucialforthepracticalsystem.
6 Conclusions
In this paper, we presented a method for
identi-cation of temporal expressions and their syntactic
functions based on FSTand lexical data extracted
fromcorpus. Sincetemporalwordshavethe
syntac-ticambiguitywhenused in asentence, itis
impor-tanttoidentifythesyntacticfunction aswellasthe
temporal expressionitself.
Forthepurpose,wemanuallyextractedlexical
co-occurrencesfromlargecorpusanditwaspossibleas
the number of temporal nouns is tractable enough
to manipulate lexical data by hand. As shown in
theresult,lexicalco-occurrencesarecrucialfor
dis-ambiguatingthe syntacticfunction of thetemporal
expression. Besides, the nite state approach
pro-markably lessen, by pruning irrelevant candidates,
intermediatestructuresgeneratedwhileparsing.
References
Abney, S. 1991. Parsing By Chunks. In Berwick,
Abney,andTenny,editors,Principle-Based
Pars-ing. Boston: KluwerAcademicPublishers.
Choi, K.S.,Han,Y. S.,Han,Y. G.,and Kwon,O.
W. 1994. KAIST TreeBank Projectfor Korean:
PresentandFutureDevelopment. InProceedings
of the International Workshop on Sharable
Natu-ral Language Resources.
Ciravegna, F. and Lavelli, A. 1997. Controlling
Bottom-Up Chart Parsers through Text
Chunk-ing. InProceedingsofthe5thInternational
Work-shop onParsing Technology.
Collins,M.J. 1996. ANewStatisticalParserBased
on Bigram Lexical Dependencies. InProceedings
of the 34thAnnualMeetingof theACL.
Elgot,C.C.andMezei,J.E. 1965. Onrelations
de-nedbygeneralizedniteautomata. IBMJournal
of ResearchandDevelopment, 9,47{65.
Gross, M. 1993. Local Grammars and their
Rep-resentation by Finite Automata. Data,
Descrip-tion, Discourse: Papers on the English language
in hornour of John McH Sinclair, MichaelHoey
(ed). London: HarperCollinsPublishers.
KAIST. KAIST Concordance Program. URL
http://csve.kaist.ac.kr/kcp/.
Mohri, M. 1997. Finite-state Transducers in
lan-guage and Speech Processing. Computational
Linguistics,Vol23,No(2).
Ramshaw, L. A. and Marcus, M. P. 1995. Text
Chunking UsingTransformation-BasedLearning.
In Proceedings of the ACL Workshop on Very
LargeCorpora.
Roche, E. and Schabes, Y. 1995. Deterministic
Part-of-Speech Tagging with Finite-State T
rans-ducers. Computational Linguistics, Vol 21, No
(2).
Roche,E. andSchabes,Y. 1997. Finite-State
Lan-guage Processing. TheMITPress.
Skut, W. and Brants, T. 1999. Chunk Tagger.
In Proceedings of ESSLLI-98 Workshop on
Au-tomatedAcquisitionofSyntaxandParsing.
Sproat, R. W., Shih, W., Gale, W. and Chang,
N. 1994. A Stochastic Finite-State
Word-segmentationAlgorithmfor Chinese. In
Proceed-ings ofthe 32nd AnnualMeetingofACL
Yoon, J., Choi, K. S. and Song, M. 1999. Three
Types of Chunking in Korean and Dependency
Analysis Based on Lexical Association. In