Sentence Ordering in Multidocument Summarization

(1)

Sentence Ordering in Multidocument Summarization

Regina Barzilay

Computer Science

Department

1214 Amsterdam Ave

New York, 10027, NY, USA

[email protected]

Noemie Elhadad

Computer Science

Department

1214 Amsterdam Ave

[email protected]

Kathleen R. McKeown

Computer Science

Department

1214 Amsterdam Ave

[email protected]

ABSTRACT

Theproblem of organizing information formultido cument summarization so that thegenerated summary iscoherent has received relatively little attention. In this pap er, we describ e twonaive ordering techniques andshow thatthey donotp erformwell. Wepresent anintegratedstrategyfor orderinginformation,combiningconstraintsfrom chronolog-icalorderofeventsandcohesion. Thisstrategywasderived fromempiricalobservationsbasedonexp erimentsasking hu-mans to order information. Evaluation of our augmented algorithm shows a signicant improvement of the ordering overthetwonaivetechniques weusedasbaseline.

1.

INTRODUCTION

Multido cument summarization p oses a numb er of new challengesoversingledo cumentsummarization. Researchers havealready investigated issues suchas identifying rep eti-tions or contradictions across input do cuments and deter-miningwhichinformationissalientenoughtoincludeinthe summary [1,3,6, 11,15,19]. One issue thathas received littleattention ishow to organize the selected information sothat the output summary iscoherent. Onceall the rel-evant pieces of information have b een selected across the input do cuments, the summarizer has to decide in which order topresent themso that thewhole text makessense. Insingledo cumentsummarization, onep ossible orderingof the extracted information is provided by the input do cu-mentitself. However,[10]observedthat,insingledo cument summaries written by professional summarizers, extracted sentencesdo notretaintheirprecedenceordersinthe sum-mary. Moreover, in the case of multiple input do cuments, this do es not provide a useful solution: information may b e drawnfrom dierent do cuments and therefore, no one do cument canprovide anordering. Furthermore, theorder b etween twopiecesof information can changesignicantly fromonedo cumenttoanother.

We investigate constraints on ordering in the context of multido cument summarization. Werstdescrib e twonaive

ordering algorithms,usedinseveralsystemsandshow that they do not yield satisfactory results. The rst, Majority Ordering,is critically linkedtothelevelofsimilarity ofthe informationorderingacrosstheinputtexts. Butmanytimes input textshave dierent structure, and therefore, this al-gorithm is notacceptable. The second,Chronological Or-dering, can pro duce go o d results when the information is event-basedandcan,therefore,b eorderedbased ontemp o-ralo ccurence. However,textsdonotalwaysrefertoevents. Wehaveconducted exp erimentstoidentifyadditional con-straintsusingamanually builtcollection ofmultiple order-ings of texts. These exp erimentsshowthat cohesion asan imp ortant constraint. While it is recognized in the gener-ation community thatcohesion isanecessary featurefora generatedtext,weprovideanop erationalwayto automati-cally ensurecohesion whenorderingsentencesinanoutput summary. We augment the Chronological Ordering algo-rithm with a cohesion constraint, and compare it to the naivealgorithms.

OurframeworkistheMultiGensystem[15],adomain in-dep endentmultido cumentsummarizerwhichhasb eentrained and tested on news articles. Inthe following sections, we rst give an overview of MultiGen. We then describ e the twonaiveorderingalgorithmsandevaluatethem. Wefollow thiswithastudyofmultipleorderingspro ducedbyhumans. Thisallowsustodeterminehowtoimprovethe Chronologi-calOrderingalgorithmusingcohesionasanadditional con-straint. Thelastsectiondescrib estheaugmentedalgorithm alongwithitsevaluation.

2.

MULTIGEN OVERVIEW

MultiGen op erates on a set of news articles describing the same event. It creates a summary which synthesizes common informationacrossdo cuments. Inthecaseof mul-tido cument summarizationofarticlesab outthesameevent, source articles cancontain b othrep etitions and contradic-tions. Extractingallthesimilar sentenceswouldpro ducea verb oseandrep etitivesummary,whileextractingonlysome of the similar sentences wouldpro duce a summary biased towards somesources. MultiGen uses acomparison of ex-tracted similar sentences to select the appropriate phrases toinclude inthesummaryandreformulatesthemasanew text.

(2)

analy-do cuments that contain rep eated information and do not necessarily contain sentences from all the do cuments. For eachtheme,thegeneration comp onent[1]identies phrases which are in the intersection of the theme sentences, and selectsthemas partofthesummary. Theintersection sen-tencesarethenorderedtopro duceacoherent text.

3.

NAIVE ORDERING ALGORITHMS ARE

NOT SUFFICIENT

Whenpro ducingasummary,anymultido cument summa-rizationsystemhastocho oseinwhichordertopresentthe outputsentences. Inthissection,wedescrib etwoalgorithms fororderingsentencessuitablefordomainindep endent mul-tido cument summarization. The rst algorithm, Majority Ordering (MO), relies only on the original orders of sen-tencesintheinputdo cuments. Itistherstsolutiononecan thinkofwhenaddressingtheorderingproblem. Thesecond one,ChronologicalOrdering(CO)usestimerelatedfeatures toordersentences. Weanalyze thisstrategyb ecauseitwas originall y implemented in MultiGen and followed by other summarization systems [18]. In the MultiGen framework, ordering sentences isequivalent toordering themesand we describ e the algorithms in terms of themes, but the con-ceptscanb eadaptedtoothersummarization systems such as [3]. Our evaluation shows thatthese metho ds alone do notprovidean adequatestrategyforordering.

3.1

Majority Ordering

3.1.1

The Algorithm

Typically, in single do cument summarization, the order ofsentencesintheoutput summary isdeterminedbytheir order in the input text. This strategy canb e adapted to multido cument summarization. Consider two themes,Th

1 andTh2;ifsentencesfromTh1 preceedsentencesfromTh2 inallinputtexts,thenpresentingTh1 b eforeTh2 isan ac-ceptableorder. But,whentheorderb etweensentencesfrom Th

1

andTh 2

variesfromonetexttoanother,this strategy isnotvalidanymore. One waytodenetheorder b etween Th1 andTh2 istoadopttheordero ccuringinthemajority ofthetextswhereTh1andTh2 o ccur. Thisstrategydenes apairwiseorderb etweenthemes. However,thispairwise re-lation isnottransitive; forexample, giventhe themesTh1 andTh conictb etweentheorders(Th

1 Sincetransitivityisanecessaryconditionforarelationtob e calledanorder,thisrelation do esnotformaglobalorder.

We, therefore, have to expand this pairwise relation to a global order. In other words, we have to nd a linear order b etweenthemes whichmaximizes the agreement b e-tween theorderings imp osed by the input texts. For each pairofthemes,Thi andThj,wekeeptwocounts,Ci;j and C

j;i |C

i;j

isthenumb er ofinputtextsinwhichsentences fromThi o ccur b efore sentencesfrom Thj and Cj;i is the samefor theopp osite order. Theweight of alinear order (Th

)isdenedasthesumofthecountsforevery pair Ci

l ;i

m

,suchthat ilimand l ;m2f1:::k g. Stating thisproblem in termsofadirected graph whereno des are themes,andavertex fromTh

i we are lo oking fora pathwith maximal weight which

tra-ingsalesmanproblemtothisproblem. Despitethisfact,we stillcanapplythisordering, b ecausetypicallythelengthof the output summary is limited to asmall numb er of sen-tences. Forlonger summaries,theapproximationalgorithm describ ed in [4]canb e applied. Figures 1and 2show ex-amplesofpro duced summaries.

Themainproblem withthis strategy is thatit can pro-duce severalorderingswiththesameweight. Thishapp ens when thereisatieb etweentwoopp ositeorderings. Inthis situation,thisstrategydo esnotprovide enoughconstraints todetermineoneoptimal ordering;oneorderischosen ran-domlyamongtheorderswithmaximal weight.

The man accusedof reb ombing two Manhattan subways in1994 wasconvictedThursdayafterthejuryrejectedthe notionthatthedrugProzacledhimtocommitthecrimes. He was foundguilty of two counts of attempted murder, 14countsofrst-degreeassaultandtwocountsofcriminal p ossessionofaweap on.

InDecemb er1994,Learyignitedreb ombsontwo Manhat-tansubwaytrains. Thesecondblastinjured50p eople{16 seriously,includingLeary.

LearywantedtoextortmoneyfromtheTransitAuthority. Thedefense argued thatLearywas notresp onsibleforhis actionsb ecauseof"toxicpsychosis"causedbytheProzac.

Figure1: Asummarypro ducedusingtheMajority Or-deringalgorithm,gradedasGo o d.

A man armed witha handgun hassurrenderedtoSpanish authorities,p eacefullyendingahijackingofaMoro ccanjet. OÆcials in Spain say a p erson commandeered the plane. Afterthe planewasdirectedtoSpain,thehijackersaidhe wantedtob etakentoGermany.

After several hours of negotiations, authorities convinced thep ersontosurrenderearlyto day.

Policesaidthe man hada pistol,buta Moro ccansecurity sourceinRabatsaidthegunwaslikelya\toy".

Therewerenorep ortedinjuries.

OÆcialsinSpainsay theBo eing737leftCasablanca, Mo-ro cco,Wednesdaynightwith83passengersandanine-p er-soncrewheadedforTunis,Tunisia.

Spanishauthoritiesdirectedtheplanetoanisolatedsection ofElPratAirp ortandoÆcialsb egannegotiations.

Figure2: Asummarypro ducedusingtheMajority Or-deringalgorithm,gradedasPo or.

3.1.2

Evaluation

We asked three human judges to evaluate the order of information in 20summaries pro duced usingtheMO algo-rithm intothreecategories| Po or,FairandGo o d. We de-neaPo orsummary,inanop erationalway,asatextwhose readabilitywouldb esignicantl yimprovedbyreorderingits sentences. AFairsummaryisatextwhichmakessensebut reordering ofsomesentencescanyield ab etter readability. Finally, a summary which cannot b e furtherimproved by anysentencereorderingisconsidered aGo o dsummary.

(3)

ment; theynevergavethreedierent gradestoasummary. TheMOalgorithmpro ducesasmallnumb erofGo o d sum-maries,butmostofthesummariesweregradedasFair. For instance,thesummary graded Go o dshowninFigure1 or-ders theinformation ina naturalway; thetextstarts with asentencesummary oftheevent, thenthe outcome ofthe trial isgiven, areminder ofthe factsthat caused thetrial andap ossibleexplanationofthefacts. Lo okingattheGo o d summariespro ducedbyMO,wefoundthatitp erformswell whentheinputarticlesfollowthesameorderwhen present-ingtheinformation. Inotherwords,thealgorithmpro duces a go o d ordering if the input articles orderings have high agreement.

Ontheotherhand,whenanalyzingPo orsummaries,asin Figure2,weobservethattheinputtextshaveverydierent orderings. Bytryingtomaximizetheagreementoftheinput textsorderings, MO pro ducesanew ordering that do esn't o ccurinanyinputtext. Theorderingis,therefore,not guar-anteedanymoretob eacceptable. Anexampleofanew pro-ducedorderingisgiveninFigure2. Thesummarywouldb e morereadableif severalsentencesweremoved around(the lastsentence wouldb eb etter placedb eforethefourth sen-tenceb ecausetheyb othtalkab outtheSpanish authorities handling thehijacking).

Thisalgorithmcanb eusedtoordersentencesaccurately if we are certain that the input texts follow similar orga-nizations. This assumption may hold in limited domains. However, in ourcase,the inputtextswearepro cessing do nothave suchregularities. MO'sp erformance critically de-p endsonthequalityoftheinputtexts,therefore,weshould designanorderingstrategywhichb ettertsourinputdata. Fromhereon, we will fo cusonly on theChronological Or-deringalgorithmandwaystoimproveit.

3.2

Chronological Ordering

3.2.1

The Algorithm

Multido cumentsummarizationofnewstypicallydealswith articlespublishedondierentdates,andarticlesthemselves cover events o ccurring over a wide range in time. Using chronological order in the summary to describ e the main events helps the user understand what has happ ened. It seemslikeanaturalandappropriatestrategy.Asmentioned earlier, in our framework, we are ordering themes; in this strategy, wetherefore need toassign adateto themes. To identify the date an event o ccured requires a detailed in-terpretationoftemp oralreferencesin articles. While there haveb eenrecentdevelopments indisambiguatin g temp oral expressionsandeventordering [12],correlatingeventswith thedateonwhichtheyo ccurredisahardtask. Inourcase, weapproximatethethemetimebyitsrstpublicationdate; that is, the rsttime the theme hasb een rep orted in our setofinput articles. It isan acceptable approximation for news events; the rstpublicati on dateof an event usually corresp ondstoitso ccurrenceinreallife. For instance,ina terroristattackstory,thethemeconveying theattackitself willhaveadateprevioustothedateofthethemedescribing atrialfollowing theattack.

Articlesreleasedbynewsagenciesaremarkedwitha pub-licationdate,consistingofadateandatimewiththreeelds (hour,minutes and seconds). Articles from thesamenews

newsagencies. We neverencounteredtwoarticles withthe samepublication dateduringthedevelopmentofMultiGen. Thus,thepublicatio ndateservesasauniqueidentierover articles. Asaresult, whentwothemes havethesame pub-lication date,it meansthat they b othare rep ortedforthe rsttimein thesamearticle.

OurChronological Ordering (CO)algorithm takesas in-putasetofthemesandordersthemchronologicall y when-everp ossible. Eachthemeisassigned adatecorresp onding toitsrstpublication. Thisestablishes apartial orderover thethemes. Whentwothemeshavethesamedate(thatis, they arerep orted forthe rsttimeinthe samearticle) we sortthemaccordingtotheirorderofpresentationinthis ar-ticle. Wehavenowacompleteorderovertheinputthemes. Toimplement this algorithm in MultiGen, weselect for each theme the sentence that has the earliest publication date. We call it the time stamp sentence and assign its publication dateasthetimestampofthetheme. Figures3 and4showexamplesofpro duced summariesusingCO.

One of four p eople accused along with former Pakistani Prime MinisterNawazSharifhas agreed totestifyagainst him ina case involvingp ossiblehijacking and kidnapping charges,aprosecutorsaidWednesday.

Raja Quereshi, the attorney general, said thatthe former CivilAviationAuthoritychairmanhasalreadygivena state-menttop olice.

Sharif's lawyer dismissed the news when sp eaking to re-p orters after Sharif made an app earance b eforea judicial magistrate to hearwitnessesgivestatementsagainst him. Sharifhassaidheisinno cent.

The allegations stem from an alleged attempt to divert a plane bringingarmy chief General PervezMusharraf to KarachifromSriLankaonOctob er12.

Figure3: Asummarypro ducedusingtheChronological OrderingalgorithmgradedasGo o d.

Thousands ofp eople haveattendeda ceremony inNairobi commemoratingtherstanniversaryofthedeadlyb ombings attacksagainstU.S.EmbassiesinKenya andTanzania. SaudidissidentOsamabinLaden,accusedofmasterminding theattacks,andnineothersarestillatlarge.

PresidentClintonsaid,"Theintendedvictimsofthisvicious crime sto o dforeverythingthatis rightab outour country andtheworld".

U.S. federal prosecutors have charged 17 p eople in the b ombings.

Albrightsaidthatthemourningcontinues.

Kenyansareobservinganationaldayofmourninginhonor ofthe215p eoplewhodiedthere.

Figure4: Asummarypro ducedusingtheChronological OrderingalgorithmgradedasPo or.

3.2.2

Evaluation

Following thesamemetho dologyweusedfortheMO al-gorithm evaluation, weaskedthreehuman judgesto grade 20summaries generated by thesystemusing the CO algo-rithm applied to the same collection of input texts. The resultsareshowninFigure8.

(4)

ify this hyp othesis, we identied sentences that broke the original chronological orderandrestoredtheordering man-ually. Interestingly, the displaced sentences were mainly background information. The evaluation of the mo died summariesshowsaslight butnotvisibleimprovement.

When comparing Go o d (Figure 3) and Po or (Figure 4) summaries, we notice two phenomena: rst, many of the badly placed sentences cannot b e ordered based on their temp oralo ccurence. Forinstance,inFigure4,thesentence quoting Clinton is notoneevent in thesequence ofevents b eing describ ed, butrather areaction to the mainevents. Thisis alsotrue forthesentence rep ortingAlbright's reac-tion. Assigning a dateto a reaction, or moregenerally to any sentence conveying background information, and plac-ingitintothechronologicalstreamofthemaineventsdo es notpro ducealogicalordering. Theorderingofthesethemes isthereforenotcoveredbytheCOalgorithm.

Thesecond phenomenon weobserved is that Po or sum-mariestypicallycontainabrupt switchesof topicsand gen-eralincoherences. Forinstance,inFigure4,quotesfromUS oÆcials (third and fthsentences) are split and sentences ab outthe mourning (rst andsixth sentences) app ear to o farapartin thesummary. Grouping themtogether would increase thereadability ofthe summary. Atthis p oint, we needtondadditional constraints toimprovetheordering.

4.

IMPROVING THE ORDERING:

EXPERIMENTS AND ANALYSIS

In the previous section, weshowedthat using naive or-dering algorithms do es notpro duce satisfactory orderings. Inthissection,weinvestigatethroughexp erimentswith hu-mans,howtoidentifypatternsoforderingsthatcanimprove thealgorithm.

Sentencesin atextcanb eordered in anumb er of ways, andthetextasawholewillstillconvey thesamemeaning. But undoubtedly, some orders are denitely unacceptable b ecausetheybreakconventionsofinformationpresentation. One way to identify theseconventions is to nd common-alities b etween dierent acceptable orderings of the same information. Extracting regularities in several acceptable orderingscanhelp ussp ecifythemainordering constraints foragiven inputtyp e. Since acollection ofmultiple sum-mariesoverthesamesetofarticlesdo esn'texist,wecreated our own collection of multiple orderings pro duced by dif-ferent humans. Using this collection, we studied common b ehaviors andmapp edthemtostrategiesforordering.

Ourcollectionofmultiple orderingsisavailable at http://www.cs.columbia.edu/~noemie/ordering/. Itwas builtinthefollowingway. Wecollectedten setsofarticles. Eachsetconsistedoftwotothreenewsarticlesrep ortingthe sameevent. For each set, we manually selected the inter-sectionsentences,simulating MultiGen

1

. Onaverage, each setcontained8.8intersectionsentences.Thesentenceswere cleaned of explicit references (for instance, o ccurrences of \thePresident" wereresolved to\President Clinton") and connectives,sothatparticipantswouldn'tusethemasclues forordering. Ten subjects participated in the exp eriment andtheyeachbuiltoneorderingp ersetofintersection sen-tences. Each subject was asked to order the intersection 1

We p erformed a manual simulation to ensure that ideal

all,weobtained100orderings,tenalternative orderingsp er set. Figure 5shows theten alternative orderings collected foroneset.

We rst observe that a surprising majority of orderings are dierent. Outof the ten sets, only twosets had some identical orderings (in one set, one pair of orderings were identicalwhilein theotherset,twopairsoforderingswere identical). Inotherwords,therearemanyacceptable order-ings givenonesetofsentences. Thisconrms theintuition thatwedonotneedtolo okforasingleidealglobalordering butratherconstructanacceptable one.

We also notice that, within the multiple orderings of a set, some sentences alwaysapp ear together. They do not app earinthesameorderfromoneorderingtoanother,but theyshare anadjacencyrelation. Fromnowon,wereferto themas blo cks. Foreachset,weidentifyblo cksby cluster-ingsentences. Weuseasadistancemetricb etweentwo sen-tencestheaverage numb er ofsentencesthatseparatethem overall orderings. InFigure 5, forinstance, the distance b etweenthe sentences Dand Gis2. Theblo cksidentied byclusteringare: sentencesB,D,GandI;sentencesAand J;sentencesCandF;andsentencesEandH.

Participant 1 D BGIHFCJAE Participant 2 D GBICFAJEH Participant 3 D BIGFJA EHC Participant 4 D CFGIBJAHE Participant 5 D GBIHFJACE Participant 6 D GIBFCEHJA Participant 7 D BGIFCHEJA Participant 8 D BCFGIEHAJ Participant 9 D GIBEHFAJC Participant 10 D BGICFAJEH

Figure 5: Multipleorderings forone set inour collec-tion.

We observed that all the blo cks in the exp eriment cor-resp ond to clusters of topically related sentences. These blo cksformunitsoftextdealingwiththesamesubject,and exhibitcohesiveprop erties. Forordering,wecanusethisto opp ortunistical l y groupsentences togetherthatall refer to thesametopic.

Collectingasetofmultipleorderingsisanexp ensivetask; itisdiÆcultandtimeconsumingforahumantoorder sen-tences from scratch. Furthermore, to discover signicant commonalities acrossorderings,manymultipleorderings of thesamesetarenecessary. Weplantoextendourcollection and we are condent that it will provide more insights on ordering. Still,theexistingcollection enablesustoidentify cohesion as an imp ortant factor forordering. We describ e next how we integrate the cohesion constraint in the CO algorithm.

5.

THE AUGMENTED ALGORITHM

IntheoutputoftheCOalgorithm,disuenciesarisewhen topicsaredistributedoverthewholetext,violatingcohesion prop erties[13].Atypicalscenarioisillustrated inFigure6. The inputs are texts T

1 , T

2 , T

3

(5)

1 basedonlyonchronologicalclues,containstwotopicalshifts; fromAtoCandbackfromCtoB. Ab ettersummarywould b eS

2

whichkeepsAandB together.

A

Figure 6: Input texts T 1 Chronological Ordering(S1)orbytheAugmented algo-rithm(S

2 ).

5.1

The Algorithm

Ourgoal isto remove disuencies fromthe summary by grouping together topically related themes. This can b e achievedbyintegratingcohesionasanadditional constraint to the COalgorithm. The main technical diÆculty in in-corp orating cohesion in our ordering algorithm is to iden-tifyand to group topically related themes across multiple do cuments. Inother words, giventwothemes, weneed to determineiftheyb elong tothesamecohesionblo ck. Fora singledo cument,segmentation[8]couldb eusedtoidentify blo cks,but we cannot use such atechnique toidentify co-hesion b etween sentences across multiple do cuments. The mainreasonisthatsegmentationalgorithmsexploitthe lin-earstructureofaninputtext;inourcase,wewanttogroup togethersentencesb elongingtodierent texts.

Oursolutionconsistsof thefollowingsteps. Ina prepro-cessingstage,wesegmenteachinputtext,sothatgiventwo sentences within the same text, we candetermine if they aretopically related. Assume thethemes A andB,where Acontainssentences(A1:::An),andB containssentences (B

1 :::B

m

). Recall thatathemeis asetofsentences con-veyingsimilarinformation drawnfromdierentinputtexts. We denote #AB to b e the numb er of pairs of sentences (Ai;Bj) which app earin the sametext, and #AB

+ tob e thenumb erofsentencepairswhichapp earinthesametext andareinthesamesegment.

Inarststage,foreachpairofthemesAandB,we com-putethe ratio #AB

+

=#AB to measure therelatedness of twothemes. Thismeasuretakesintoaccountb othp ositive and negative evidence. If most of the sentencesin A and B that app ear together in the same texts are also in the samesegments,itmeans thatAandB arehighlytopically related. Inthis case, theratio is close to 1. On the other hand, if amongthetexts containing sentences fromA and B,onlyafewpairsareinthesamesegments,thenAandB arenottopicallyrelated. Accordingly theratioiscloseto0. Aand B are considered relatedif thisratio is higher than apredeterminedthreshold. Inourexp eriments,wesetitto 0.6.

This strategydenes pairwise relations b etweenthemes. Atransitiveclosureofthisrelation buildsgroupsofrelated themesandasaresultensuresthatthemesthatdonot ap-p eartogetherinany articlebutareb othrelated toathird themewillstillb elinked. Thiscreatesanevenhigherdegree of relatednessamong themes. Because we use athreshold

latedthemes. Wearenowabletoidentifytopicallyrelated themes. Attheendoftherststage,theyaregroup edinto blo cks.

Inasecondstage,weassignatimestamptoeachblo ckof related themes,asthe earliest timestampofthethemes it contains. WeadapttheCOalgorithmdescrib edin

3.2.1

to workattheleveloftheblo cks. Theblo cksandthethemes corresp ondto,resp ectively,themesandsentencesintheCO algorithm. Byanalogy,wecaneasilyshowthattheadapted algorithm pro duces a complete order of the blo cks. This yields a macro-ordering of the summary. We still need to orderthethemesinsideeachblo ck.

In the last stage of the augmented algorithm, for each blo ck, weorder thethemesitcontainsbyapplying theCO algorithmtothem. Figure7showsanexampleofasummary pro duced bytheaugmentedalgorithm.

Thisalgorithmensuresthatcohesivelyrelatedthemeswill not b e spread overthe text, and decreases the numb er of abrupt switches of topics. Figure 7 shows how the Aug-mented algorithm improves the sentence order compared with the order in the summary pro duced by the CO al-gorithm inFigure4;sentencesquotingUSoÆcials arenow group edtogetherandsoaredescriptionsofthemourning.

Thousands ofp eople haveattendeda ceremony inNairobi commemorating the rstanniversary ofthe deadly b omb-ingsattacksagainstU.S.EmbassiesinKenyaandTanzania. Kenyansareobservinganationaldayofmourninginhonor ofthe215p eoplewhodiedthere.

SaudidissidentOsamabinLaden,accusedofmasterminding the attacks,andnineothersarestillatlarge. U.S.federal prosecutorshavecharged17p eopleintheb ombings.

PresidentClintonsaid,"Theintendedvictimsofthisvicious crime sto o dforeverythingthatis rightab outour country andtheworld". Albrightsaidthatthemourningcontinues.

Figure 7: A Summary pro duced using the Aug-mented algorithm. Related sentences are group ed into paragraphs.

5.2

Evaluation

Followingthesamemetho dologyusedtoevaluatetheMO and the CO algorithms, we asked the judges to grade 20 summaries pro ducedby theAugmentedalgorithm. Results areshowninFigure8.

Themanual eortneeded to compare and judge system output isextensive; considerthateachhuman judgehadto readthreesummaries foreachinputsetaswellasskimthe inputtextstoverifythatnomisleadingorderwasintro duced in the summaries. Consequently, the evaluation that we p erformed todateis limited. Still,this evaluation shows a signicantimprovementinthequalityoftheorderingsfrom theCOalgorithmtotheaugmentedalgorithm. Toassessthe signicance of the improvement, we used the Fisher exact test,conating Po orandFairsummaries intoonecategory. Thistestisadaptedtoourcaseb ecauseofthereducedsize ofourtestset. Weobtained apvalue of0.014[20].

6.

RELATED WORK

(6)

e-MajorityOrdering 2 12 6 ChronologicalOrdering 7 7 6 AugmentedOrdering 2 7 11

Figure8: EvaluationofthetheMajorityOrdering,the Chronological OrderingandtheAugmentedOrdering.

summarysentencesaretypicallyarrangedinthesameorder that they were found in the full do cument (although [10] rep orts thathuman summarizersdo sometimes change the originalorder). Inmultido cumentsummarization, the sum-mary consists of fragments of text or sentences that were selected from dierent texts. Thus, there is no complete ordering of summary sentences that can b e found in the original do cuments.

Theorderingtaskhasb eenextensivelyinvestigatedinthe generation community [14, 17, 9, 2, 16]. One approach is top-down,using schemas [14]orplans [5]todetermine the organizationalstructure ofthetext. Thisappproachp ostu-lates arhetorical structure whichcanb e usedto select in-formationfromanunderlying knowledgebase. Becausethe domainislimited,anenco dingcanb edevelop edofthekinds of prop ositiona l content that match rhetorical elements of theschemaorplan,therebyallowing contenttob eselected andordered. RhetoricalStructureTheory(RST)allows for moreexibili ty inordering content. Therelationso ccurb e-tweenpairsofprop osition s. Constraintsbasedon intention (e.g.,[17]),plan-like conventions[9],or stylistic constraints [2]areusedaspreconditionsontheplanop erators contain-ingRSTrelationstodetermine whenarelation isusedand howitisorderedwithresp ecttootherrelations.

MultiGen generates summaries ofnews onany topic. In an unconstrained domain like this, it would b e imp ossible to enumerate the semantics for all p ossible typ es of sen-tenceswhichcouldmatchtheelementsofaschema,aplan orrhetoricalrelations. Furthermore,itwouldb ediÆcultto sp ecifyagenericrhetoricalplanforasummaryofnews. In-stead,content determination in MultiGen isopp ortunistic, dep ending onthekinds ofsimilariti es thathapp entoexist b etween a set of news do cuments. Similarly, we describ e herean ordering schemethat isopp ortunistic and b ottom-up, dep ending on the coherence and temp oralconnections that happ ento existb etweenselected text. Ourapproach issimilartotheuseofbasicblo cks[16]whereab ottom-up technique is used to group together stretches of text in a long, generateddo cument by nding prop ositions that are relatedbyacommonfo cus. Sincethisapproachwas devel-op edforagenerationsystem,itndsrelatedprop ositionsby comparisonsofprop ositionargumentsatthesemanticlevel. Inourcase,wearedealingwithasurfacerepresentation,so wendalternative metho dsforgroupingtextfragments.

7.

CONCLUSION AND FUTURE WORK

In this pap er weinvestigated information ordering con-straintsinmultido cument summarization. Weanalyzedtwo naiveorderingalgorithms,theMajorityOrdering(MO)and theChronologicalOrdering(CO).WeshowthattheMO al-gorithmp erformswellonlywhenallinputtextsfollow sim-ilarpresentationoftheinformation. TheCOalgorithmcan provide an acceptable solution for many cases, but is not

event based. We rep orton the exp eriments we conducted to identify other constraints contributing to ordering. We showthatcohesionis animp ortant factor,and describ ean op erationalwaytoincorp orateitintheCOalgorithm. This resultsinadeniteimprovementoftheoverallqualityof au-tomatically generatedsummaries.

Infuture work, werst plan to extend ourcollection of multiple orderings, so that we can extract more regulari-tiesandunderstandb etterhowhumanorderinformation to pro duce a readable and uent text. Even though we did not encounter any misleading inferences intro duced by re-orderingMultiGenoutput,weplantodoanextendedstudy of the side eects caused by reorderings. We also plan to investigatewhether theMO algorithmcanb e improvedby applyingitoncohesiveblo cksofthemes,ratherthanthemes.

8.

ACKNOWLEDGMENT

Thisworkwaspartiallysupp ortedbyDARPAgrant N66001-00-1-8919, aLouis Morin scholarship and a Viros scholar-ship. We thank Eli Barzilay for providing help with the exp eriments interface, Michael Elhadad forthe useful dis-cussions and comments, andall the voluntary participants intheexp eriments.

9.

REFERENCES

[1] R.Barzilay,K.McKeown,andM.Elhadad.

Information fusioninthecontextofmulti-do cument summarization. InProc.ofthe37thAnnualMeeting oftheAssoc.ofComputationalLinguistics,1999. [2] N.Bouayad-Agha,R.Power,andD.Scott.Cantext

structure b eincompatibl e withrhetoricalstructure? InProceedingsoftheFirstInternationalConference onNaturalLanguageGeneration(INLG'2000), Mitzp eRamon,Israel,2000.

[3] J.Carb onellandJ.Goldstein.Theuseofmmr, diversity-based rerankingforreorderingdo cuments andpro ducing summaries. InProceedingsofthe21st AnnualInternationalACMSIGIRConferenceon ResearchandDevelopmentinInformationRetrieval, 1998.

[4] T.Cormen,C.Leiserson,andR.Rivest.Introduction toAlgorithms.TheMITPress,1990.

[5] R.Dale.GeneratingReferringExpressions:

ConstructingDescriptionsinaDomainofObjectsand Processes.MITPress,Cambridge,MA,1992.

[6] N.Elhadad andK.McKeown.Generatingpatient sp ecic summariesofmedical articles.Submitted, 2001.

[7] V.Hatzivassiloglo u,J.Klavans,andE.Eskin. Detectingtextsimilarity overshortpassages:

Exploring linguisti cfeaturecombinations viamachine learning. InProceedingsoftheJointSIGDAT ConferenceonEmpiricalMethodsinNatural LanguageProcessingandVeryLargeCorpora,1999. [8] M.Hearst.Multi-paragraph segmentationof

exp ository text.InProceedingsofthe32thAnnual MeetingoftheAssociationforComputational Linguistics,1994.

(7)

cutting andpasting oftheinputdo cument. Technical rep ort,ColumbiaUniversity,1998.

[11] I.ManiandE.Blo edorn.Multi-do cument summarization bygraphsearchandmatching.In ProceedingsoftheFifteenthNationalConferenceon ArticialIntelligence,1997.

[12] I.ManiandG.Wilson. Robusttemp oralpro cessing of news.InProceedingsofthe38thAnnualMeetingof theAssociationforComputationalLinguistics,2000. [13] K.McCoyandJ.Cheng.Fo cusofattention:

Constrainingwhatcanb esaidnext.InC.Paris, W.Swartout,andW.Mann,editors,Natural LanguageGenerationinArticialIntelligenceand ComputationalLinguistics.KluwerAcademic Publishers,1991.

[14] K.McKeown.TextGeneration:UsingDiscourse StrategiesandFocusConstraintstoGenerateNatural LanguageText.CambridgeUniversity Press,England, 1985.

[15] K.McKeown,J.Klavans,V.Hatzivassilogl ou, R.Barzilay, andE.Eskin.Towardsmultido cument summarization byreformulatin: Progressand prosp ects.InProceedingsoftheSeventeenthNational ConferenceonArticialIntelligence,1999.

[16] D.Mo oney,S.Carb erry,andK.McCoy.The generationofhigh-level structureforextended explanations. InProceedingsoftheInternational ConferenceonComputationalLinguistics (COLING{90),pages276{281,Helsinki, 1990. [17] J.Mo oreandC.Paris.Planningtextforadvisory

dialogues: Capturingintentional andrhetorical information. JournalofComputationalLinguistics, 19(4),1993.

[18] D.Radev,H.Jing, andM.Budzikowska.

Centroid-basedsummarization ofmultiple do cuments: sentenceextraction, utility-based evaluation, anduser studies.InProceedingsoftheANLP/NAACL2000 WorkshoponAutomaticSummarization,2000. [19] D.RadevandK.McKeown.Generating natural

language summariesfrommultiple on-linesources. ComputationalLinguistics,24(3):469{500,Septemb er 1998.