Hidetsugu Nanba (y) and Manabu Okumura (y, z)
yScho ol of InformationScience
JapanAdvanced Instituteof Scienceand Technology
zPrecisionand Intelligence Lab oratory
TokyoInstitute of Technology
nanba@jaist.ac.jp, oku@pi.titech.ac.jp
Abstract
In this pap er, we rst experimentally investigated
thefactorsthatmakeextractshardtoread. Wedid
thisbyhavinghumansubjectstrytoreviseextracts
to pro duce more readableones. Wethen classied
the factors into ve, most of which are related to
cohesion, after which we devised revision rules for
eachfactor,andpartiallyimplementedasystemthat
revisesextracts.
1 Intro duction
Theincreasingnumb erofon-linetextsavailablehas
resulted in automatic text summarization b
ecom-ing a major researchtopic in theNLP community.
Themainapproachistoextractimp ortantsentences
from thetexts, and themain taskin this approach
is that of evaluating the imp ortance of sentences
[MIT,1999]. This pro ducing of extracts - that is,
setsofextractedimp ortantsentences-isthoughtto
b e easy, and hastherefore long b een themain way
that texts are summarized. As Paice p ointed out,
however,computer-pro ducedextractstendtosuer
froma`lackofcohesion'[Paice,1990]. Forexample,
theantecedentscorrespondingtoanaphorsinan
ex-tract are not always included in the extract. This
oftenmakestheextractshardto read.
In the work describ ed in this pap er, we
there-foredeveloped a metho dformakingextracts easier
to read byrevising them. We rst exp erimentally
investigatedthefactors that makeextracts hardto
read. Wedidthis byhavinghumansubjectstryto
revise extracts topro duce more readable ones. We
then classied the factors into ve, most of which
are related to cohesion [Hallidayet al.,1976], after
whichwedevisedrevision rulesforeachfactor, and
partiallyimplementedasystemthatrevises
extract-s. Wethen evaluatedour systemby comparingits
revisions with those pro duced by human subjects
andalsobycomparingthereadabilityjudgmentsof
humansubjectsb etweentherevisedandoriginal
ex-Inthefollowingsectionswebriey review
relat-ed works, describ e our investigation of what make
extracts hard to read, and explain our system for
revising extracts to makethem more readable.
Fi-nally,wedescrib eourevaluationof thesystemand
discusstheresultsofthatevaluation.
2 Related Works
Many investigatorshavetriedto measure the
read-abilityof texts[Klare,1963]. Most ofthem have
e-valuatedwell-formedtextspro ducedbyp eople,and
used two measures: p ercentageof familiarwordsin
thetexts(wordlevel)andtheaveragelengthofthe
sentencesin thetexts(syntacticlevel). These
mea-sures, however, do notnecessarily reect the
actu-al readability of computer-produced extracts. We
thereforehavetotakeintoaccountotherfactorsthat
mightreduce thereadabilityofextracts.
Oneof themcould b e a lackof cohesion.
Halli-dayandHasan [Hallidayetal.,1976]describ ed ve
kinds of cohesion: reference, substitution, ellipsis,
conjunction,and lexicalcohesion.
Minel [Minelet al.,1997] tried to measure the
readabilityofextractsintwoways: bycountingthe
numberof anaphorsin anextract thatdo nothave
antecedentsintheextract,andbycountingthe
num-berofsentenceswhicharenotincludedinanextract
but closelyconnectedto sentencesintheextract.
Wethereforeregardkindsof cohesionasimp
or-tant in trying to classify the factorsthat make
ex-tracts lessreadablein thenextsection.
Oneof thenotable previousworks dealing with
ways to pro duce more cohesive extracts is that
of Paice [Paice,1990]. Mathis presented a
frame-work in which a pair of short sentences are
com-bined into one to yield a more readable extract
[Mathis etal.,1973]. Wethink,however,thatnone
ofthepreviousstudieshaveadequatelyinvestigated
thefactorsmakingextractshardtoread.
Some investigators have compared
inves-[Kawahara,1989,Jing,1999]. Revisionisthoughtto
b edonefor(atleast)thefollowingthreepurp oses:
(1) to shortentexts,
(2) to changethestyleoftexts,
(3) to maketextsmorereadable.
Jing[Jing,1999]istryingtoimplementahuman
summarizationmo delthatincludestworevision
op-erations: reduction (1) and combination (3). Mani
[Manietal.,1999] prop osed a revision system that
uses three op erations: elimination (1), aggregation
(1), and smo othing (1, 3). Mani showed that his
system can make extracts more informative
with-out degrading theirreadability. The present work,
however,isconcernednotwithimproving
readabili-tybut withimprovingtheinformativeness.
3 Less Readability of Extracts
Toinvestigatetherevisionofextractsexp
erimental-ly, we had 12 graduate students pro duce extracts
of 25 newspap erarticles from theNIHON KEIZAI
SHINBUN,theaveragelengthofwhichwas30
sen-tences. Wethen askedthem to revise theextracts
(sixsubjectsp erextract).
We obtained extracts containing 343 revisions,
madeforanyofthethreepurp oseslistedinthelast
section. We selected the revisions for readability,
andclassiedtheminto5categories,bytakinginto
accountthe categories of cohesion byHallidayand
Hasan[Hallidayetal.,1976]. Table1showsthesum
oftheinvestigation.
Next,weillustrateeachcategoryofrevisions. In
theexamples,darkenedsentencesarethosethatare
notincludedin extracts,butareshownfor
explana-tion. Theserial numb erin the original text isalso
shownat thebeginningofsentences 1
.
A)Lack ofconjunctive expressions/presence
of extraneous conjunctiveexpressions
Therelationb etweensentences15and16is
ad-versative, b ecause there is a conjunctive `
(However)' at the b eginning of sentence 16. Butb ecause sentence 15 is not in the extract, `
(However)'isconsideredunnecessaryandshouldb edeleted. Conversely,lackof conjunctive
expression-s mightcause therelation b etweensentencesto b e
dicult to understand. In such a case, a suitable
conjunctiveexpression should b e added. For these
tasks,discoursestructureanalyzer isrequired.
1
Weusethefollowingthreetagstoshowrevisions.
<add>E
1
<=add>:addanewexpressionE
1 .
<del>E
2
<=del>:deleteanexpressionE
2
<=r ep>:replaceanexpressionE
3
(Thecompanyplanstogivewomenmoreopp
ortu-nitytoworkbyemployingfull-timeworkers.)
15.
·@ÝAà-(, 1@F)>Àͬ~@
enVOR^@Ö,>. ©ApÕñä!
(Sincetherehaveb eennosimilarcasesb efore,the
projectthatwomenjoinisnowinahardsituation,
thoughthecompanyputshop esonit.)
16. <del>
3+3
</del>Lô##tø?ã(:ÀÍ
·u?<8: Ú;-I³ ·?<8:¢>
˪@<¸?y@'ILÉD:(I!
(<del>However,</del>itismakingeortsof
ref-ormationwhichwillb eprotableb othforthe
com-panyandthefemaleworkers.)
B) Syntacticcomplexity
2. (
¼ÏÖ
;b eforerevision)þ·æµG%Â@x@&?0IéƳ
@átâ;'H ¦¡ Üï>=?»L£0:(.
6D@Ñ;E'I!
(It istherst projectintelecommunication
busi-ness, which President Kashio wantsto b e oneof
thecentralbusinessesinthefuture, andit isalso
thepreparationforexpandingthebusiness to
cel-lularphone.)
#
(
¼Ï¡
;afterrevision)þ·æµG%Â@x@&?0IéƳ
@átâ;'I!
(It istherst projectintelecommunication
busi-nesses,whichPresidentKashiowantstob eoneof
thecentralbusinessinthefuture.)
¦¡ Üï>=?»L£0:(.6D@Ñ;E'
I!
(It isalsothepreparation forexpandingthe
busi-nesstocellularphone.)
Longer sentences tend to have a syntactically
complex structure [Klare, 1963], and a long
com-poundsentenceshouldgenerallyb edividedintotwo
simplersentences. Ithasalsob eenclaimed,however,
that shortco ordinatesentencesshouldb ecombined
[Mathis etal.,1973].
C) Redundantrep etition
2.
z¡Ð,¯?°D6ÇÁUl$X%z¡fW^t
##&,º?Ë7!
(The new pro duct `ECHIGO BEST 100' which
ECHIGOSEIKAreleasedthisAprilisp opular
a-monghousewives.)
4. <rep
ó·
>z¡Ð
</rep>A;ú¶ø+
G
NTT@Qie\o>=L!
(<rep The company> ECHIGO SEIKA </rep>
hasb eenmakinguseofNTTCaptainsystemsince
1987.)
Ifsubjectsofadjacentsentencesinanextractare
the same, as in the ab ove example, readers might
think theyareredundant. Insuch acase, rep eated
expressions should b e omitted or replaced by
pro-nouns. In this example, the anaphoric expression
factors revisionmetho ds requiredtechniques
A lackofconjunctiveexpressions/ add/deleteconjunctiveexpressions discoursestructure
presenceofextraneous analysis
conjunctiveexpressions
B syntacticcomplexity combinetwosentences;divideasentenceintotwo
C redundantrep etition pronominalize;omitexpressions;
adddemonstratives
D lackofinformation supplementomittedexpressions; anaphoraand
replaceanaphorsbyantecedents;deleteanaphors ellipsisresolution
addsupplementaryinformation informationextraction
E lackofadverbialparticles; add/deleteadverbialparticles
presenceofextraneous
adverbialparticles
D) Lackof information
2.
µò¹@RkNWk$<bYTo@Tob[R"To
cj$Z$7!
(ThesearethecarmakerCHRYSLERandthe
com-putermakerCOMPAC.)
8.
@Ëu«,ÃýrLèÞ24 5J,
B6Ó@ÚLv8åI<()]dm@
p¾?H99'I!
(Wearenowinaviciouscirclewherethelayosby
companiesdiscourageconsumptions,whichinturn
resultsinlowersales.)
9. <del>
5@ä;
</del>RkNWk$,àñ3:
(I@A ÇÐðö@´<} "ÁÕ,B23
.êÎæ´ß?ì|3:(I+G7!
(<del>Insuchasituation,</del>CHRYSLERhas
donewell,b ecauseitsmanagementstrategyexactly
tstheageoflowgrowth.)
In this example, the referent of `
(in such a situation)' in sentence 9 is sentence 8, which isnot in the extract. In such a case, there are two
waystorevise: toreplace theanaphoric expression
withitsantecedent,ortodeletetheexpression. The
revision in the example is the latter one. For the
task, a metho d for anaphoraandellipsis resolution
isrequired.
1.
Yd^aoR@ÛÏ·æA1@<1K´§
gS,çG>(!
(MasayoshiSon,CEOofSoftbank,isnowsuering
fromjetlag.)
3. <add>
Yd^aoR@
<=add>Û·æ,%·wL+
/6enVOR^&<r¥CÇéUW\hA
CD-ROM
L®86Yd^@üù!
(CEOSon<add>ofSoftbank</add>iseagerto
sellsoftwaresusingCD-ROM,andhethinkitisa
bigprojectforhiscompany.)
In this second example, since `CEO Son' app ears
without the name of the company in the
extrac-t, withoutanybackground knowledge,wemaynot
of. Therefore,the nameof thecompany`Softbank'
should b eadded asthesupplementaryinformation.
Thetask requires ametho d forinformation
extrac-tionorat leastnamedentityextraction.
E) lackof adverbialparticles/presence of
extraneous adverbial particles
26.
¦@_"hPN¿æ@õA¤,Ø LÈD
I)*; F(;'I!
(Itisago o dopp ortunitytopromotethemutual
un-derstanding betweenJapan and Vietnamthat
M-r. DoMUOI,a chiefsecretary ofVietnam, visits
Japan.)
29.
f^`hAMVM@q×Ä@î+GE¦¡
½ÍLÙ3:(.!
(Froma viewp ointof security,Vietnamwill b ea
keycountryinAsia.)
30.
¨@;
<del>E
</del>æ(;ûíL²{
3:(.±Ì,ÿ7K)!
(Japanesegovernmentshouldconsiderlong-term
e-conomicalsupp ort<del>,to o</del>.)
Intheaboveexample,thereisanadverbial
parti-cle`
(,too)'andwecanndthatsentences29and 30areparatactical. But,b ecausesentence29isnotintheextract,theparticle`
(,to o)'isunnecessary andshould b edeleted.4 Revision System
Our system uses the Japanese public-domain
an-alyzers JUMAN [Kurohashiet al.,1998] and KNP
[Kurohashi,1998]morphologicallyandsyntactically
analyzeanoriginalnewspap erarticleandits
extrac-t. It then appliesrevisions rules to theextract
re-peatedly,withreferencetotheoriginaltext,untilno
rulescan revisetheextractfurther.
4.1 Revision Rules
mentedrevision rulesonly forfactors(A), (C),and
(D)in Table1byusingJPerl.
a) Deletion ofconjunctive expressions
We prepared a list of 52 conjunctive
expres-sions, and made it a rule to delete each of them
whenevertheextract do esnotinclude thesentence
that expression is related. To identify the
sen-tence related to the sentence by the conjunction
[Mannetal.,1986],thesystemp erformspartial
dis-coursestructureanalysistakingintoaccountall
sen-tences withinthree sentencesofthe one containing
theconjunctiveexpression.
The implementation of our partial discourse
structure analyzer was based on Fukumoto's
dis-course structure analyzer [Fukumoto,1990]. It
in-ferstherelationshipbetweentwosentencesby
refer-ring to the conjunctive expressions, topical words,
anddemonstrativewords.
c) Omission of redundantexpressions
If subjects (or topical expressions marked with
topical postp osition `wa') of adjacent sentences in
anextract werethesame, therep eatedexpressions
wereconsideredredundantand weredeleted.
d-1)Deletion of anaphors
To treat anaphora and ellipsis successfully, we
would need a mechanism for anaphora and ellipsis
resolution(ndingthe antecedentsand omitted
ex-pressions). Because we have no such mechanism,
we implement a rule with ad hoc heuristics: If an
anaphor appears at the b eginning of a sentence in
anextract, itsantecedentmustb e in thepreceding
sentence. Therefore,ifthatsentencewasnotin the
extract,theanaphorwasdeleted.
d-2)Supplement ofomitted subjects
If a subject in a sentencein an extract is
omit-ted,the revision rulesupplementsthesubjectfrom
thenearestpreceding sentencewhose subjectisnot
omittedintheoriginaltext. Thisruleis
implement-ed byusing heuristicssimilar to theab overevision
rule.
5 Evaluation of Revision
Sys-tem
Weevaluated ourrevisionsystem bycomparing its
revisions withthose byhumansubjects(evaluation
1), and comparing readability judgments between
therevisedandoriginalextracts(evaluation2).
5.1 Evaluation 1: comparing system
revisions and human revisions
Becauserevisionisasubjectivetask,itwasnoteasy
that more subjects make, however, can be
consid-ered more reliable and more likelyto benecessary.
Whencomparing therevisionsmade byoursystem
with those made by human subjects, we therefore
took into account the degree of agreement among
subjects.
For this evaluation, we used 31 newspap er
ar-ticles (NIHON KEIZAI SHINBUN) and their
ex-tracts. They were dierent from the articles used
formakingrules. Fifteenofextractsaretakenfrom
Nomoto's work [Nomotoet al.,1997], and the rest
were made by our group. The average numb ers of
sentences in the original articles and the extracts
were25.2and5.1.
Each extract was revised by ve subjects who
had b een instructedto revise the extracts to make
them more readable and hadbeenshownthe5
ex-amples in section 3. As a result, we obtained 167
revisions intotal. TheresultsarelistedinTable2.
Table2: Thenumberofrevisions
revisionmetho ds total
A add(61)/delete(11)
conjunctiveexpressions 72
B combinetwosentences(2)
divideasentenceintotwo(6) 8
C pronominalize(5);omitexpressions(3)
adddemonstratives(8) 16
D supplementomittedexpressions(11)
replaceanaphorsbyantecedents(10)
deleteanaphors(15) 36
addsupplementaryinformation(26) 26
E deleteadverbialparticles(4)
addadverbialparticles(5) 9
167
Wecomparedoursystem'srevisionswiththe
an-swer set comprising revisions that more than two
subjects made. And we used recall (R) and
preci-sion(P)asmeasures ofthesystem'sp erformances.
R=
Numberof system 0
sr evisions
matchedtotheansw er
Numberof r ev isionsintheansw er
P =
Numberof sy stem 0
sr ev isions
matchedtotheansw er
Numberof sy stem 0
sr ev isions
Evaluation results are listed in Table 3. As in
Table3,the coverageofourrevision rules israther
small (ab out 1/4) in the whole set of revisions in
Table2. Itistruethattheexp erimentisrathersmall
and can b econsidered aslessreliable. Thoughitis
lessreliable,someoftheimplementedrulescancover
Table3: Comparison b etweenthe revisions by
hu-manandoursystem
revisionrules R P
a(total:11) 2/2 2/5
c(total:3) 0/0 0/0
d-1(total:15) 4/5 4/7
d-2(total:11) 2/4 2/10
5.2 Evaluation 2: comparing human
readability judgments of original
and revised extracts
In the second evaluation, using the same 31 texts
as in evaluation 1, we asked ve human sub
ject-s to rank the following four kinds extracts in the
order of readability: the original extract (without
revision)(NON-REV), human-revised ones (REV-1
and REV-2), and the one revised by our system
(REV-AUTO).REV-1and REV-2were
respective-lyextractsrevisedin thecaseswheremorethanone
andmore thantwo subjectsagreed torevise.
Weconsideredajudgmentbythemajority(more
than two subjects) to b e reliable. The results are
listedinTable4. Thecolumn`split'inTable4
indi-cates thenumber ofcaseswhere no majoritycould
agree. Theresultsshowthatb othREV-1and
REV-2 extracts weremore readablethan NON-REV
ex-tractsandthatREV-2extractsmightb eb etterthan
REV-1extracts,sincethenumberof`worse'ev
alua-tionswassmallerforREV-2extracts.
Table4: Comparison of readability among original
extractsandrevised ones
b etter same worse split
REV-2vs. NON 15 12 2 2
REV-1vs. NON 22 1 7 1
AUTOvs. NON 2 13 12 0
In comparing REV-AUTO with NON-REV,we
use 27 texts where the readability do es not
de-grade in REV-2, since the readability cannot
im-prove with revisions by our system in those texts
wherethereadabilitydegradesevenwithhuman
re-visions. Even with those texts, however, in almost
half thecases,the readabilityof therevised
extrac-t wasworse thanthat of theoriginal extract. The
mainreasonisthattherevisionsystem
supplement-edincorrectsubjects.
6 Discussion
Although the results of the evaluation are
encour-aging, they also show that oursystem needs to b e
improved. Wehavetoimplementmorerevisionrules
to enlarge the coverage of our system. One of the
mostfrequentrevisionsistoaddconjunctions(37%).
Wealsoneedtoreformourrevision rulesintomore
structureanalyzer,arobustmechanismfor
anapho-ra and ellipsis resolution, and a robust system of
extracting namedentities. Theyareunder
develop-mentnow.
7 Conclusion
In this paper we describ ed our investigationof the
factors that make extracts less readable than they
should b e. Wehad humansubjectsrevise extracts
to made them more readable, andwe classied the
factorsintovecategories. Wethendevisedrevision
rules for three of these factors and implementeda
systemthatusesthemtoreviseextracts. Wefound
experimentallythatourrevisionsystemcanimprove
thereadabilityofextracts.
References
[Fukumoto,1990] Fukumoto, J.(1990) Context
Struc-tureAnalysisBasedontheWriter'sInsistence. IPSJ
SIGNotes,NL-78-15,pp.113{120. inJapanese.
[Hallidayetal.,1976] Halliday,M.A.K., Hasan,R.
(1976) CohesioninEnglish. Longman.
[Jing,1999] Jing,H. (1999) Summary Generation
throughIntelligentCuttingandPastingoftheInput
Do cument. Ph.D.ThesisProposal,ColumbiaUniv.
[Kawahara,1989] Kawahara,H. (1989) Chapter 9, in
Bunshoukouzoutoyouyakubunnoshosou.
Kuroshio-shuppan,pp.141{167.inJapanese.
[Klare,1963] Klare,G.R. (1963) The Measurement of
Readability.IowaStateUniversityPress.
[Kurohashietal.,1998] Kurohashi,S.,Nagao,M.(1998)
Japanese Morphological Analysis System JUMAN
version3.5.
[Kurohashi,1998] Kurohashi,S.(1998)Japaneseparser
KNPversion2.0b6.
[Mathisetal.,1973]
Mathis,B.,Rush,J.,Young,C.(1973) Improvementof
AutomaticAbstractsbytheUseofStructural
Analy-sis. JASIS,24(2),pp.101{109.
[Manietal.,1999] Mani,I., Gates,B., Blo edorn,E.
(1999) ImprovingSummariesbyRevisingThem. the
37thAnnualMeetingoftheACL,pp.558{565.
[MIT,1999] Mani,I.,Maybury,M.T.(1999)Advancesin
AutomaticTextSummarization. MITPress.
[Mannetal.,1986] Mann,W.C.,Thompson,S.A. (1986)
Rhetorical Structure Theory: Description and
Con-structionof TextStructure. Proc.of the third
Inter-nationalWorkshoponTextGeneration.
[Mineletal.,1997] Minel,J.,Nugier,S.,Gerald,P.(1997)
How to Appreciate the Quality of Automatic Text
Summarization? ExamplesofFANandMLUCE
Pro-to cols and their Results onSERAPHIN. Intelligent
Scalable Text Summarization, Proc. of a Workshop,
ACL,pp.25{30.
[Nomotoetal.,1997] Nomoto,T.,Matsumoto,Y.(1997)
TheReadabilityofHumanCo dingandEectson
Au-tomatic Abstracting. IPSJ SIG Notes, NL-120-11,
pp.71{76.inJapanese.
[Paice,1990] Paice,C.D.(1990)ConstructingLiterature