Bunsetsu Identification Using Category Exclusive Rules


Masaki Murata Kiyotaka Uchimoto Qing Ma Hitoshi Isahara

Communications Research Laboratory, Ministry of Posts and Telecommunications

588-2, Iwaoka, Nishi-ku, Kobe, 651-2492, Japan

tel:+81-78-969-2181 fax:+81-78-969-2189 http://www-karc.crl.go.jp/ips/murata

{murata,uchimoto,qma,isahara}@crl.go.jp

Abstract

This paper describes two new bunsetsu identification methods using supervised learning. Since Japanese syntactic analysis is usually done after bunsetsu identification, bunsetsu identification is important for analyzing Japanese sentences. In experiments comparing the four previously available machine-learning methods (decision tree, maximum-entropy method, example-based approach, and decision list) and two new methods using category-exclusive rules, the new method using category-exclusive rules with the highest similarity performed best.

1 Introduction

This paper is about machine learning methods for identifying bunsetsus, which correspond to English phrasal units such as noun phrases and prepositional phrases. Since Japanese syntactic analysis is usually done after bunsetsu identification (Uchimoto et al., 1999), identifying bunsetsus is important for analyzing Japanese sentences. Conventional studies on bunsetsu identification [1] have used hand-made rules (Kameda, 1995; Kurohashi, 1998), but bunsetsu identification is not an easy task. Conventional studies used many hand-made rules developed at the cost of many man-hours. Kurohashi, for example, made 146 rules for bunsetsu identification (Kurohashi, 1998).

In an attempt to reduce the number of man-hours, we used machine-learning methods for bunsetsu identification. Because it was not clear which machine-learning method would be the most appropriate for bunsetsu identification, we tried a variety of them. In this paper we report experiments comparing four machine-learning methods (decision tree, maximum entropy, example-based, and decision list methods) and our new methods using category-exclusive rules.

[1] Bunsetsu identification is a problem similar to chunking (Ramshaw and Marcus, 1995; Sang and Veenstra, 1999) in English.

2 Bunsetsu identification problem

We conducted experiments on the following supervised learning methods for identifying bunsetsus:

- Decision-tree method
- Maximum-entropy method
- Example-based method (use of similarity)
- Decision-list method (use of probability and frequency)
- Method 1 (use of exclusive rules)
- Method 2 (use of exclusive rules with the highest similarity)

In general, bunsetsu identification is done after morphological analysis and before syntactic analysis. Morphological analysis corresponds to part-of-speech tagging in English. Japanese syntactic structures are usually represented by the relations between bunsetsus, which correspond to phrasal units such as a noun phrase or a prepositional phrase in English. So, bunsetsu identification is important in Japanese sentence analysis.

In this paper, we identify a bunsetsu by using information from a morphological analysis. Bunsetsu identification is treated as the task of deciding whether to insert a "|" mark to indicate the partition between two bunsetsus, as in Figure 1. Therefore, bunsetsu identification is done by judging whether or not a partition mark should be inserted between two adjacent morphemes. (For the sake of simplicity, we do not use the inserted partition mark in the following analysis in this paper.)

Our bunsetsu identification method uses the morphological information of the two preceding and the two succeeding morphemes of an analyzed space between two adjacent morphemes. We use the following morphological information:

(i) Major part-of-speech (POS) category,
(ii) Minor POS category or inflection type,
(iii) Semantic information (the first three-digit number of a category number as used in "BGH" (NLRI, 1964)),
(iv) Word (lexical information).

For simplicity we do not use the "Semantic information" and "Word" in either of the two outside morphemes.

Figure 1: Example of identified bunsetsus

(I) nominative-case particle | (bunsetsu) objective-case particle | (identify) .
(I identify bunsetsu.)

Figure 2: Information used in bunsetsu identification

bun wo kugiru .
(sentence) (obj) (divide) .
((I) divide sentences.)

            far left      left            right         far right
Major POS   Noun          Particle        Verb          Symbol
Minor POS   Normal Noun   Case-Particle   Normal Form   Punctuation
Semantics   -             None            217           -
Word        -             wo              kugiru        -

Figure 2 shows the information used to judge whether or not to insert a partition mark in the space between the two adjacent morphemes "wo (obj)" and "kugiru (divide)" in the sentence "bun wo kugiru. ((I) divide sentences.)"
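To make the representation concrete, here is a minimal sketch of the four-morpheme window around one space; it is not from the paper, and the class and field names are my own. It encodes the space between "wo" and "kugiru" of Figure 2, with semantic and word information omitted for the two outside morphemes as described above.

```python
# Minimal sketch of the four-morpheme window around one space (Figure 2).
# Field names are illustrative; None marks information that is not used.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Morpheme:
    major_pos: str                    # (i)  major POS category
    minor_pos: str                    # (ii) minor POS category or inflection type
    semantics: Optional[str] = None   # (iii) first three digits of the BGH category number
    word: Optional[str] = None        # (iv) lexical form

# The space between "wo" and "kugiru" in "bun wo kugiru ." ((I) divide sentences.)
window = {
    "far_left":  Morpheme("Noun", "Normal Noun"),
    "left":      Morpheme("Particle", "Case-Particle", None, "wo"),
    "right":     Morpheme("Verb", "Normal Form", "217", "kugiru"),
    "far_right": Morpheme("Symbol", "Punctuation"),
}
is_partition = True   # whether a "|" mark is inserted in this space (the training label)
```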

3 Bunsetsu identification process for each machine-learning method

3.1 Decision-tree method

In this work we used the program C4.5 (Quinlan, 1995) for the decision-tree learning method. The four types of information mentioned in the previous section, (i) major POS, (ii) minor POS, (iii) semantic information, and (iv) word, were also used as features with the decision-tree learning method. As shown in Figure 3, the number of features is 12 (2 + 4 + 4 + 2) because we do not use (iii) semantic information and (iv) word information from the two outside morphemes.

In Figure 2, for example, the value of the feature "the major POS of the far left morpheme" is "Noun."
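As a rough illustration (not the authors' code), the 12 attributes handed to the decision-tree learner for one space could be assembled as below, reusing the window object sketched in Section 2; "NA" stands for information that is not used for the outside morphemes.

```python
# Sketch: flatten one four-morpheme window into the 12 decision-tree features.
def decision_tree_features(window):
    features = []
    for position in ("far_left", "left", "right", "far_right"):
        m = window[position]
        features.append(m.major_pos)           # (i)  major POS
        features.append(m.minor_pos)            # (ii) minor POS / inflection type
        if position in ("left", "right"):       # (iii) and (iv) only for the middle morphemes
            features.append(m.semantics or "NA")
            features.append(m.word or "NA")
    return features                              # 2 + 4 + 4 + 2 = 12 attribute values
```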

3.2 Maximum-entropy method

The maximum-entropy method is useful under sparse-data conditions and has been used by many researchers (Berger et al., 1996; Ratnaparkhi, 1996; Ratnaparkhi, 1997; Borthwick et al., 1998; Uchimoto et al., 1999). In our maximum-entropy experiment we used Ristad's system (Ristad, 1998). The analysis is performed by calculating, from the output of the system, the probability of inserting or not inserting a partition mark; whichever probability is larger determines the answer.

In the maximum-entropy method, we use the same four types of morphological information, (i) major POS, (ii) minor POS, (iii) semantic information, and (iv) word, as in the decision-tree method. However, this method does not consider combinations of features. As a result, unlike with the decision-tree method, we had to combine features manually.

First we considered combinations of the bits of morphological information. Because there were four types of information, the total number of combinations was 2^4 - 1. Since this number is large and intractable, we considered that (i) major POS, (ii) minor POS, (iii) semantic information, and (iv) word information gradually become more specific in this order, and we combined the four types of information in the following way:

Information A: (i) major POS
Information B: (i) major POS and (ii) minor POS
Information C: (i) major POS, (ii) minor POS, and (iii) semantic information
Information D: (i) major POS, (ii) minor POS, (iii) semantic information, and (iv) word     (1)

We used only Information A and B for the two outside morphemes because, as in the decision-tree method, we did not use semantic and word information for them.

Next, we considered the combinations of each type of information. As shown in Figure 4, the number of combinations is 64 (2 × 4 × 4 × 2).

To cope with data sparseness, in addition to the above combinations, we considered the cases in which, first, one of the two outside morphemes is not used; secondly, neither of the two outside morphemes is used; and thirdly, only one of the two middle morphemes is used. The number of features used in the maximum-entropy method is 152, which is obtained as follows: [3]

Figure 3: Features used in the decision-tree method

Figure 4: Features used in the maximum-entropy method (far left morpheme, left morpheme, right morpheme, far right morpheme)

No. of features = 2×4×4×2 + 4×4×2 + 2×4×4 + 4×4 + 4 + 4 = 152

In Figure 2, the feature that uses Information B in the far left morpheme, Information D in the left morpheme, Information C in the right morpheme, and Information A in the far right morpheme is "Noun: Normal Noun; Particle: Case-Particle: none: wo; Verb: Normal Form: 217; Symbol". In the maximum-entropy method we used, for each space, 152 features such as this one.
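The figure of 152 can be checked mechanically. The sketch below is my reconstruction, not the authors' code: it enumerates one pattern per admissible choice of Information A-D for each morpheme that is kept, over the window shapes described in the text, and arrives at 152.

```python
# Sketch: count the feature patterns of the maximum-entropy method.
# Outside morphemes may only use Information A or B; middle morphemes may use A-D.
from itertools import product

MIDDLE = ["A", "B", "C", "D"]
OUTSIDE = ["A", "B"]

def count(keep_far_left, keep_left, keep_right, keep_far_right):
    choices = []
    if keep_far_left:
        choices.append(OUTSIDE)
    if keep_left:
        choices.append(MIDDLE)
    if keep_right:
        choices.append(MIDDLE)
    if keep_far_right:
        choices.append(OUTSIDE)
    return len(list(product(*choices)))

total = (
    count(True, True, True, True)       # all four morphemes:         2*4*4*2 = 64
    + count(False, True, True, True)    # far-left morpheme dropped:    4*4*2 = 32
    + count(True, True, True, False)    # far-right morpheme dropped:   2*4*4 = 32
    + count(False, True, True, False)   # both outside dropped:           4*4 = 16
    + count(False, True, False, False)  # only the left morpheme:           4
    + count(False, False, True, False)  # only the right morpheme:          4
)
print(total)  # 152
```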

3.3 Example-based method (use of similarity)

An example-based method was proposed by Nagao (Nagao, 1984) in an attempt to solve problems in machine translation. To resolve a problem, it uses the most similar example. In the present work, the example-based method impartially used the same four types of information (see Eq. (1)) as used in the maximum-entropy method.

To use this method, we must define the similarity of an input to an example. We use the 152 patterns from the maximum-entropy method to establish the level of similarity. We define the similarity S between an input and an example according to which one of these 152 levels is the matching level, as follows. (The equation reflects the importance of the two middle morphemes.)

S = s(m-1) × s(m+1) × 10,000 + s(m-2) × s(m+2)     (2)

where m-1, m+1, m-2, and m+2 are the left, right, far left, and far right morphemes, and s(x) is the morphological similarity of a morpheme x, which is defined as follows:

s(x) = 1 (when no information of x is matched)
       2 (when Information A of x is matched)
       3 (when Information B of x is matched)
       4 (when Information C of x is matched)
       5 (when Information D of x is matched)     (3)

[3] ... January 1, 1995 of a Kyoto University corpus (the number of spaces between morphemes was 25,814) by using this method ...

Figure 5 shows an example of the levels of similarity. When a pattern matches Information A of all four morphemes, such as "Noun; Particle; Verb; Symbol", its similarity is 40,004 (2 × 2 × 10,000 + 2 × 2). When a pattern such as "-; Particle: Case-Particle: none: wo; -; -" matches, its similarity is 50,001 (5 × 1 × 10,000 + 1 × 1).

The example-based method extracts the example with the highest level of similarity and checks whether or not that example is marked. A partition mark is inserted in the input data only when the example is marked. When multiple examples have the same highest level of similarity, the selection of the best example is ambiguous. In this case, we count the number of marked and unmarked spaces among all of these examples and choose the larger.
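A minimal sketch of Eqs. (2)-(3) and of the example-based decision follows; the morpheme encoding (a 4-tuple of major POS, minor POS, semantics, word) and the function names are my own simplification, not the authors' implementation.

```python
# Sketch: similarity of Eqs. (2)-(3) and the example-based decision rule.
# A morpheme is a tuple (major_pos, minor_pos, semantics, word); None marks unused fields.
# A space is a tuple (far_left, left, right, far_right); an example is (space, is_partition).

def s(input_m, example_m):
    # Matching level: 1 = nothing matched, 2..5 = Information A..D matched.
    level = 1
    for i in range(4):
        if input_m[i] is not None and input_m[i] == example_m[i]:
            level = i + 2
        else:
            break
    return level

def similarity(input_space, example_space):
    fl_i, l_i, r_i, fr_i = input_space
    fl_e, l_e, r_e, fr_e = example_space
    # Eq. (2): the two middle morphemes dominate the score.
    return s(l_i, l_e) * s(r_i, r_e) * 10_000 + s(fl_i, fl_e) * s(fr_i, fr_e)

def example_based_decision(input_space, examples):
    best = max(similarity(input_space, e) for e, _ in examples)
    tied_labels = [label for e, label in examples if similarity(input_space, e) == best]
    # Ties among equally similar examples are resolved by a majority of their labels.
    return tied_labels.count(True) >= tied_labels.count(False)
```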

Figure 5: Example of levels of similarity

bun wo kugiru .
(sentence) (obj) (divide) .

                 s(x)   m-2           m-1             m+1           m+2
No information   1      -             -               -             -
Information A    2      Noun          Particle        Verb          Symbol
Information B    3      Normal Noun   Case-Particle   Normal Form   Punctuation
Information C    4      -             None            217           -
Information D    5      -             wo              kugiru        -

3.4 Decision-list method (use of probability and frequency)

The decision-list method was proposed by Rivest (Rivest, 1987). Rules are expanded by combining all the features and are stored in a one-dimensional list. A priority order is defined in a certain way, and all of the rules are arranged in this order. The decision-list method searches for rules from the top of the list and analyzes a particular problem by using only the first applicable rule.

In this study we used in the decision-list method the same 152 types of patterns that were used in the maximum-entropy method.

To determine the priority order of the rules, we referred to Yarowsky's method (Yarowsky, 1994) and Nishiokayama's method (Nishiokayama et al., 1998) and used the probability and frequency of each rule as measures of this priority order. When multiple rules had the same probability, the rules were arranged in order of their frequency.

Suppose, for example, that Pattern A "Noun: Normal Noun; Particle: Case-Particle: none: wo; Verb: Normal Form: 217; Symbol: Punctuation" occurs 13 times in a learning set and that ten of the occurrences include the inserted partition mark. Suppose also that Pattern B "Noun; Particle; Verb; Symbol" occurs 123 times in a learning set and that 90 of the occurrences include the mark.

This example is recognized by the following rules:

Pattern A => Partition   76.9% (10/13),  Freq. 13
Pattern B => Partition   73.2% (90/123), Freq. 123

Many similar rules were made and were then listed in order of their probabilities and, for any one probability, in order of their frequencies. This list was searched from the top and the answer was obtained by using the first applicable rule.
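A small sketch of the decision-list lookup under the ordering just described (probability first, frequency as a tie-breaker); the rule representation is my own, and the two rules are the Pattern A and Pattern B examples from the text.

```python
# Sketch: decision list ordered by probability, then frequency; the first applicable rule decides.
rules = [
    # (pattern name, probability, frequency, decision: True = insert a partition mark)
    ("Pattern A", 10 / 13, 13, True),     # Partition 76.9% (10/13),  Freq. 13
    ("Pattern B", 90 / 123, 123, True),   # Partition 73.2% (90/123), Freq. 123
]
rules.sort(key=lambda rule: (rule[1], rule[2]), reverse=True)

def decision_list(applicable):
    # applicable: set of pattern names that match the space being analyzed
    for pattern, probability, frequency, insert_mark in rules:
        if pattern in applicable:
            return insert_mark
    return False  # behaviour when no rule applies is not specified in the text

print(decision_list({"Pattern A", "Pattern B"}))  # True: Pattern A is reached first
```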

3.5 Method 1 (use of category-exclusive rules)

So far, we have described the four existing machine learning methods. In the next two sections we describe our methods.

It is reasonable to consider the 152 patterns used in three of the previous methods. Now, let us suppose that the 152 patterns from the learning set yield the rules shown in Figure 6.

"Partition" means that the rule determines that a partition mark should be inserted in the input data, and "non-partition" means that the rule determines that a partition mark should not be inserted.

Suppose that when we solve a hypothetical problem, Patterns A to G are applicable. If we use the decision-list method, only Rule A, which is applied first, is used, and this determines that a partition mark should not be inserted. For Rules B, C, and D, although the frequency of each rule is lower than that of Rule A, the sum of their frequencies is higher, so we think that it is better to use Rules B, C, and D than Rule A. Method 1 follows this idea, but we do not simply sum up the frequencies. Instead, we count the number of examples used in Rules B, C, and D and judge the category having the largest number of examples satisfying the patterns with the highest probability to be the desired answer.

For example, suppose that in the above example the number of examples satisfying Rules B, C, and D is 65. (Because some examples overlap in multiple rules, the total number of examples is actually smaller than the sum of the frequencies of the three rules.) In this case, among the examples used by the rules having 100% probability, the number of examples of partition is 65 and the number of examples of non-partition is 34. So, we determine that the desired answer is to partition.

A rule having 100% probability is called a category-exclusive rule because all the data satisfying it belong to one category, which is either partition or non-partition. Because for any given space the number of rules used can be as large as 152, category-exclusive rules are applied often [4]. Method 1 uses all of these category-exclusive rules, so we call it the method using category-exclusive rules.
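The following sketch is my reconstruction of the core of Method 1 under the description above, not the authors' code: among the applicable rules with the highest probability, the examples they cover are pooled without double-counting, and the majority category is returned.

```python
# Sketch of Method 1: pool the examples of the highest-probability (category-exclusive)
# rules, count each learning-set example once, and take the majority category.
def method1(applicable_rules):
    # applicable_rules: list of (probability, examples),
    # where examples is a list of (example_id, is_partition) pairs covered by the rule.
    best_probability = max(probability for probability, _ in applicable_rules)
    pooled = {}                                   # example_id -> label (overlaps counted once)
    for probability, examples in applicable_rules:
        if probability == best_probability:
            for example_id, is_partition in examples:
                pooled[example_id] = is_partition
    labels = list(pooled.values())
    return labels.count(True) >= labels.count(False)
```

With the rules of the running example, the 100% rules would contribute 65 partition examples against 34 non-partition examples, so this sketch would return a partition, as in the text.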

Figure 6: An example of rules used in Method 1

Rule B: Pattern B => probability of partition      100%  (33/33)    Frequency 33
Rule C: Pattern C => probability of partition      100%  (25/25)    Frequency 25
Rule D: Pattern D => probability of partition      100%  (19/19)    Frequency 19
Rule E: Pattern E => probability of partition      81.3% (100/123)  Frequency 123
Rule F: Pattern F => probability of partition      76.9% (10/13)    Frequency 13
Rule G: Pattern G => probability of non-partition  57.4% (310/540)  Frequency 540
...     ...        ...                              ...              ...

Solving problems by using rules whose probabilities are not 100% may result in wrong solutions. Almost all of the traditional machine learning methods solve problems by using rules whose probabilities are not 100%. By using such methods, we cannot hope to improve accuracy. If we want to improve accuracy, we must use category-exclusive rules. There are some cases, however, in which, even if we take this approach, category-exclusive rules are rarely applied. In such cases, we must add new features to the analysis to create a situation in which many category-exclusive rules can be applied.

However, it is not sufficient to use category-exclusive rules. There are many meaningless rules which happen to be category-exclusive only in a learning set. We must consider how to eliminate such meaningless rules.

[4] The ratio of the spaces analyzed by using category-exclusive rules is 99.30% (16,864/16,983) in Experiment 1 of Section 4.

3.6 Method 2 (using category-exclusive rules with the highest similarity)

Method 2 combines the example-based method and Method 1. That is, it combines the method using similarity and the method using category-exclusive rules in order to eliminate the meaningless category-exclusive rules mentioned in the previous section.

Method 2 also uses the 152 patterns for identifying bunsetsus. These patterns are used as rules in the same way as in Method 1. Desired answers are determined by using the rule having the highest probability. When multiple rules have the same probability, Method 2 uses the value of the similarity described in the section on the example-based method and analyzes the problem with the rule having the highest similarity. When multiple rules have the same probability and similarity, the method takes the examples used by the rules having the highest probability and the highest similarity, and chooses the category with the larger number of examples as the desired answer, in the same way as in Method 1.

However, when category-exclusive rules having a frequency of more than one exist, the above procedure is performed after eliminating all of the category-exclusive rules having a frequency of one. In other words, category-exclusive rules having a frequency of more than one are given a higher priority than category-exclusive rules having a frequency of only one but a higher similarity. This is because category-exclusive rules having a frequency of only one are not so reliable.
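A sketch of Method 2's selection order as described above (again a reconstruction with my own rule representation): frequency-one category-exclusive rules are discarded when exclusive rules of higher frequency exist, then the highest probability is chosen, then the highest similarity, and finally a majority over the covered examples.

```python
# Sketch of Method 2. Each applicable rule is a dict with keys:
#   "prob" (probability), "sim" (similarity of Eq. (2)), "freq" (frequency),
#   "examples" (list of (example_id, is_partition) covered in the learning set).
def method2(applicable_rules):
    exclusive = [r for r in applicable_rules if r["prob"] == 1.0]
    if any(r["freq"] > 1 for r in exclusive):
        # Drop frequency-one category-exclusive rules; they are not reliable enough.
        applicable_rules = [r for r in applicable_rules
                            if not (r["prob"] == 1.0 and r["freq"] == 1)]
    best_prob = max(r["prob"] for r in applicable_rules)
    candidates = [r for r in applicable_rules if r["prob"] == best_prob]
    best_sim = max(r["sim"] for r in candidates)
    candidates = [r for r in candidates if r["sim"] == best_sim]
    pooled = {}                                   # example_id -> label (overlaps counted once)
    for r in candidates:
        for example_id, is_partition in r["examples"]:
            pooled[example_id] = is_partition
    labels = list(pooled.values())
    return labels.count(True) >= labels.count(False)
```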

4 Experiments and discussion

In our experiments we used the Kyoto University text corpus (Kurohashi and Nagao, 1997), which is a tagged corpus made up of articles from the Mainichi newspaper. All experiments reported in this paper were performed using articles dated from January 1 to 5, 1995. We obtained the correct information on morphology and bunsetsu identification from the tagged corpus.

The following experiments were conducted to determine which supervised learning method achieves the highest accuracy rate.

Experiment 1
  Learning set: January 1, 1995
  Test set: January 3, 1995

Experiment 2
  Learning set: January 4, 1995
  Test set: January 5, 1995

Because we used Experiment 1 in developing Method 1 and Method 2, Experiment 1 is a closed data set for Method 1 and Method 2. So, we also performed Experiment 2.

The results are listed in Tables 1 to 4. In addition to the six methods described in Section 3, we used KNP 2.0b4 (Kurohashi, 1997) and KNP 2.0b6 (Kurohashi, 1998), which are bunsetsu identification and syntactic analysis systems using many hand-made rules. Because KNP is based not on a machine learning method but on many hand-made rules, "Learning set" and "Test set" in the tables have no meaning for the KNP results. In the KNP experiments we also used the morphological information in the corpus. The "F" in the tables indicates the F-measure, which is the harmonic mean of recall and precision. Recall is the fraction of correctly identified partitions out of all the partitions. Precision is the fraction of correctly identified partitions out of all the spaces which were judged to have a partition mark inserted.
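For concreteness, here is a small sketch of the three measures as defined in the text; the function and the toy labels are my own, operating on per-space boolean labels.

```python
# Sketch: recall, precision, and F-measure over the spaces between adjacent morphemes.
def evaluate(gold, predicted):
    # gold, predicted: one boolean per space (True = a partition mark is inserted)
    correct = sum(1 for g, p in zip(gold, predicted) if g and p)
    recall = correct / sum(gold)           # correct partitions / all true partitions
    precision = correct / sum(predicted)   # correct partitions / all predicted partitions
    f_measure = 2 * recall * precision / (recall + precision)
    return f_measure, recall, precision

print(evaluate([True, True, False, True], [True, True, True, False]))
```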

Tables 1 to 4 show the following results:

Table 1: Results of learning set of Experiment 1

Method            F        Recall     Precision
Decision Tree     99.58%   99.66%     99.51%
Maximum Entropy   99.20%   99.35%     99.06%
Example-Based     99.98%   100.00%    99.97%
Decision List     99.98%   100.00%    99.97%
Method 1          99.98%   100.00%    99.97%
Method 2          99.98%   100.00%    99.97%
KNP 2.0b4         99.23%   99.78%     98.69%
KNP 2.0b6         99.73%   99.77%     99.69%

The number of spaces between two morphemes is 25,814. The number of partitions is 9,523.

Table 2: Results of test set of Experiment 1

Method            F        Recall     Precision
Decision Tree     98.87%   98.67%     99.08%
Maximum Entropy   98.90%   98.75%     99.06%
Example-Based     99.02%   98.69%     99.36%
Decision List     98.95%   98.43%     99.48%
Method 1          98.98%   98.54%     99.43%
Method 2          99.16%   98.88%     99.45%
KNP 2.0b4         99.13%   99.72%     98.54%
KNP 2.0b6         99.66%   99.68%     99.64%

The number of spaces between two morphemes is 16,983. The number of partitions is 6,166.

- The maximum-entropy method was better than the decision-tree method. Although the maximum-entropy method has a weak point in that it does not learn combinations of features, we could overcome this weakness by making almost all of the combinations of features, which produced a higher accuracy rate.

- The decision-list method was better than the maximum-entropy method in this experiment.

- The example-based method obtained the highest accuracy rate among the four existing methods.

- Although Method 1, which uses category-exclusive rules, was worse than the example-based method, it was better than the decision-list method. One reason for this is that the decision-list method chooses rules randomly when multiple rules have identical probabilities and frequencies.

- Method 2, which uses the category-exclusive rules with the highest similarity, achieved the highest accuracy rate among the supervised learning methods.

- The example-based method, the decision-list method, Method 1, and Method 2 obtained accuracy rates of about 100% for the learning set.

Table 3: Results of learning set of Experiment 2

Method            F        Recall     Precision
Decision Tree     99.70%   99.71%     99.69%
Maximum Entropy   99.07%   99.23%     98.92%
Example-Based     99.99%   100.00%    99.98%
Decision List     99.99%   100.00%    99.98%
Method 1          99.99%   100.00%    99.98%
Method 2          99.99%   100.00%    99.98%
KNP 2.0b4         98.94%   99.50%     98.39%
KNP 2.0b6         99.47%   99.47%     99.48%

The number of spaces between two morphemes is 27,665. The number of partitions is 10,143.

Table 4: Results of test set of Experiment 2

Method            F        Recall     Precision
Decision Tree     98.50%   98.51%     98.49%
Maximum Entropy   98.57%   98.55%     98.59%
Example-Based     98.82%   98.71%     98.93%
Decision List     98.75%   98.27%     99.23%
Method 1          98.79%   98.54%     99.43%
Method 2          98.90%   98.65%     99.15%
KNP 2.0b4         99.07%   99.43%     98.71%
KNP 2.0b6         99.51%   99.40%     99.61%

The number of spaces between two morphemes is 32,304. The number of partitions is 11,756.

- These four methods are thus strong for learning sets.

- The two methods using similarity (the example-based method and Method 2) were always better than the other methods, indicating that the use of similarity is effective if we can define it appropriately.

- We carried out experiments using KNP, a system that uses many hand-made rules. The F-measure of KNP was highest on the test set. We used two versions of KNP, KNP 2.0b4 and KNP 2.0b6. The latter was much better than the former, indicating that the improvements made by hand are effective. But the maintenance of rules by hand has a limit, so the improvements made by hand are not always effective.

The above experiments indicate that Method 2 is the best among the machine learning methods [5].

In Table 5 we show some cases which were partitioned incorrectly with KNP but correctly with Method 2.

[5] In these experiments, the differences were very small. But we think that the differences are significant to some extent because we performed both Experiment 1 and Experiment 2, and the data we used are a large corpus containing a few tens of thousands of morphemes, tagged objectively in advance.

Table 5: Cases when KNP was incorrect and Method 2 was correct

kotsukotsu | (NEED) gaman-shi
(steadily) (be patient with)
(... be patient with ... steadily)

yoyuu wo | motte | (NEED) shirizoke
(enough strength) obj (have) (beat off)
(... beat off ... having enough strength)

kaisha wo | gurupu-wake | (WRONG) shite
(company) obj (grouping) (do)
(... do grouping of companies)

A partition with "NEED" indicates that KNP missed inserting the partition mark, and a partition with "WRONG" indicates that KNP inserted the partition mark incorrectly. In the test set of Experiment 1, the F-measure of KNP 2.0b6 was 99.66%. The F-measure increases to 99.83% under the assumption that the answer is correct whenever KNP 2.0b6 or Method 2 is correct. Although the accuracy rate for KNP 2.0b6 was high, there were some cases in which KNP partitioned incorrectly and Method 2 partitioned correctly. A combination of Method 2 with KNP 2.0b6 may be able to improve the F-measure.

The only previous research resolving bunsetsu identification by machine learning methods is the work by Zhang (Zhang and Ozeki, 1998). The decision-tree method was used in that work. But that work used only a small amount of information for bunsetsu identification [6] and did not achieve high accuracy rates. (The recall rate was 97.6% (= 2502/(2502+62)), the precision rate was 92.4% (= 2502/(2502+205)), and the F-measure was 94.2%.)

5 Conclusion

To solve the problem of accurate bunsetsu identification, we carried out experiments comparing four existing machine-learning methods (the decision-tree method, maximum-entropy method, example-based method, and decision-list method). We obtained the following order of accuracy in bunsetsu identification:

Example-Based > Decision List > Maximum Entropy > Decision Tree

We also described a new method which uses category-exclusive rules with the highest similarity. This method performed better than the other learning methods in our experiments.

[6] This work used only the POS information of the two morphemes adjacent to the analyzed space.

References

Adam L. Berger, Stephen A. Della Pietra, and Vincent J. Della Pietra. 1996. A Maximum Entropy Approach to Natural Language Processing. Computational Linguistics, 22(1):39-71.

Andrew Borthwick, John Sterling, Eugene Agichtein, and Ralph Grishman. 1998. Exploiting Diverse Knowledge Sources via Maximum Entropy in Named Entity Recognition. In Proceedings of the Sixth Workshop on Very Large Corpora, pages 152-160.

Masayuki Kameda. 1995. Simple Japanese analysis tool QJP. The Association for Natural Language Processing, the 1st National Convention, pages 349-352. (in Japanese).

Sadao Kurohashi and Makoto Nagao. 1997. Kyoto University text corpus project. pages 115-118. (in Japanese).

Sadao Kurohashi and Makoto Nagao. 1998. Japanese Morphological Analysis System JUMAN version 3.5. Department of Informatics, Kyoto University. (in Japanese).

Sadao Kurohashi. 1997. Japanese Dependency/Case Structure Analyzer KNP version 2.0b4. Department of Informatics, Kyoto University. (in Japanese).

Sadao Kurohashi. 1998. Japanese Dependency/Case Structure Analyzer KNP version 2.0b6. Department of Informatics, Kyoto University. (in Japanese).

Makoto Nagao. 1984. A Framework of a Mechanical Translation between Japanese and English by Analogy Principle. Artificial and Human Intelligence, pages 173-180.

Shigeyuki Nishiokayama, Takehito Utsuro, and Yuji Matsumoto. 1998. Extracting preference of dependency between Japanese subordinate clauses from corpus. IEICE-WGNLC 98-11, pages 31-38. (in Japanese).

NLRI (National Language Research Institute). 1964. Word List by Semantic Principles. Syuei Syuppan. (in Japanese).

J. R. Quinlan. 1995. Programs for Machine Learning.

Lance A. Ramshaw and Mitchell P. Marcus. 1995. Text chunking using transformation-based learning. In Proceedings of the Third Workshop on Very Large Corpora, pages 82-94.

Adwait Ratnaparkhi. 1996. A Maximum Entropy Model for Part-Of-Speech Tagging. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 133-142.

Adwait Ratnaparkhi. 1997. A Linear Observed Time Statistical Parser Based on Maximum Entropy Models. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.

Eric Sven Ristad. 1998. Maximum Entropy Modeling Toolkit, Release 1.6 beta. http://www.mnemonic.com/software/memt.

Ronald L. Rivest. 1987. Learning Decision Lists. Machine Learning, 2:229-246.

Erik F. Tjong Kim Sang and Jorn Veenstra. 1999. Representing text chunks. In EACL'99.

Kiyotaka Uchimoto, Satoshi Sekine, and Hitoshi Isahara. 1999. Japanese dependency structure analysis based on maximum entropy models. In Proceedings of the Ninth Conference of the European Chapter of the Association for Computational Linguistics (EACL), pages 196-203.

David Yarowsky. 1994. Decision lists for lexical ambiguity resolution: Application to accent restoration in Spanish and French. In 32nd Annual Meeting of the Association for Computational Linguistics, pages 88-95.

Yujie Zhang and Kazuhiko Ozeki. 1998. The application of classification trees to bunsetsu segmentation of Japanese sentences. Journal of Natural Language Processing.
