Clustering (ATESC) Based Summarization Anand Gupta 1 , Manpreet Kathuria 2 , Arjun Singh 2 , Ashish Sachdeva 2 ,
3.3 Performance Evaluation and Discussion
There are three major differences between the summarization methods using ATESC and LTT:
1. Calculation of Entailment Matrix: Analog (ATESC) vs. Binary (LTT)
2. Segmentation: Spectral Clustering (ATESC) vs. Logical TextTiling (LTT)
3. Segment Scoring: Mean Sentence Score (ATESC) vs. Assumed Equal (LTT)
On the basis of the above, there are a total of 8 (2*2*2) combinations, with two options for each of the three phases in which ATESC differs from LTT. These eight combinations are numbered 1 to 8 and are listed in Table 1.
Each of the above combinations is evaluated against both baselines in order to observe the effect that each of the three changes proposed in this paper has on the quality of the final summary produced.
Table 1. All combinations of methods. Analog = [0,1]. LTT: Logical TextTiling [1]. NSC: Normalized Spectral Clustering [5]. AE: Assumed Equal. MSS: Mean Sentence Score. The combinations used in the graphs are as per the method codes given in this table (Column 1).
Method Code   Entailment Matrix   Segmentation   Salience
1             Binary (0/1)        LTT            AE
2             Binary (0/1)        LTT            MSS
3             Binary (0/1)        NSC            AE
4             Binary (0/1)        NSC            MSS
5             Analog              LTT            AE
6             Analog              LTT            MSS
7             Analog              NSC            AE
8             Analog              NSC            MSS
We have considered four distinct cases to conduct the experiments, and their results are discussed as follows. (The numbers in curly braces in the following text denote a bar graph in one of the figures, identified by the same number. For example, ({4} in Fig. 13) refers to the bar graph of the method having Method Code 4 as per Table 1 in Figure 13.)
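The numbering of the eight method codes follows directly from taking every combination of the three two-way choices. The short Python sketch below reproduces that enumeration; the list and function names are illustrative assumptions and do not come from the original implementation.

```python
from itertools import product

# The two options for each phase, in the order used by Table 1.
# (Names are illustrative; they are not taken from the authors' code.)
ENTAILMENT = ["Binary(0/1)", "Analog"]
SEGMENTATION = ["LTT", "NSC"]
SALIENCE = ["AE", "MSS"]


def method_table():
    """Map method codes 1..8 to (entailment, segmentation, salience)."""
    combos = product(ENTAILMENT, SEGMENTATION, SALIENCE)
    return {code: combo for code, combo in enumerate(combos, start=1)}


METHODS = method_table()

if __name__ == "__main__":
    for code, (entailment, segmentation, salience) in METHODS.items():
        print(code, entailment, segmentation, salience)
```

Because `product` varies its last argument fastest, code 1 is the pure LTT configuration and code 8 is ATESC in its entirety.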
3.3.1 Different Length of Input Text
The text documents in our dataset are classified into two categories according to their lengths: Short and Long.
Short Texts. The results for this category are shown in Figs. 2 and 3. With MS Word AutoSummarize as the baseline, the most accurate summaries ({6} in Fig. 2) are generated when LTT segmentation is based on Analog Entailment values and is followed by the assignment of segment scores, rather than assuming the scores to be equal for all segments. Both of these changes are employed in ATESC based summarization.
LTT revolves around the notion that sentences are most closely related when they are at shorter distances from each other. ATESC does not rely on such a presumption. In shorter texts, however, the distances among sentences are not very large, so this characteristic feature of ATESC does not get an opportunity to be tested. As a result, for Short Texts LTT is observed to produce more accurate summaries.
Fig. 2. P, R, F for Short Texts - MS Word 2007 AutoSummarize
Fig. 3. P, R, F for Short Texts - Human Developed Summaries
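The figures in this section report P, R and F for each method against a baseline summary. As a rough illustration, the sketch below computes precision, recall and F-measure under the assumption that a summary is represented as a set of selected sentence indices and that a match means the same sentence appears in both summaries; the function name and this matching rule are assumptions and may differ from the evaluation actually used.

```python
def precision_recall_f(generated, reference):
    """Sentence-overlap precision, recall and F-measure.

    `generated` and `reference` are iterables of sentence indices
    (an assumed representation, chosen for illustration only).
    """
    gen, ref = set(generated), set(reference)
    overlap = len(gen & ref)
    precision = overlap / len(gen) if gen else 0.0
    recall = overlap / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    # Hypothetical example: 4 generated sentences, 5 reference sentences.
    print(precision_recall_f([0, 2, 5, 7], [0, 2, 3, 7, 9]))
```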
Long Texts. The results for this category are shown in Figs. 4 and 5. The notion that distant sentences can exhibit a strong relationship with each other (with respect to connotation) is supported by the results of experimentation on Long Texts. From Figs. 4 and 5, it is observed that as the LTT methodology is replaced by the ATESC methodology (moving from {1} to {8} in Figs. 4 and 5), there is a gradual and proportional rise in the quality of the summaries produced. The quality increases almost linearly for both baselines, with minor differences between them. In both graphs the best summary is generated by employing ATESC in its entirety ({8} in both Figs. 4 and 5).
Fig. 4. P, R, F for Long Texts - MS Word 2007 AutoSummarize
Fig. 5. P, R, F for Long Texts - Human Developed Summaries
3.3.2 Jumbling the Sentences of a Given Text
Jumbled text refers to a text document in which the original order of sentences is changed randomly. The objective of jumbling a given text and then summarizing it is to address those text documents where contextual sentences (sentences closely related to each other) are distributed erratically within the text. A prime example of such texts is e-mails. It is observed from Figs. 6 and 7 that there is a prominent difference between the quality of summaries generated by ATESC and LTT, in favor of ATESC.
Fig. 6. P, R, F for Jumbled Texts - MS Word 2007 AutoSummarize
Fig. 7. P, R, F measure for Jumbled Texts - Human Developed Summaries
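The jumbling step described above amounts to a random permutation of the sentences of a document. A minimal sketch is given below; the function name, the use of a fixed seed for repeatability, and the assumption that the text has already been split into sentences are all illustrative choices and not details from the original experiments.

```python
import random


def jumble_sentences(sentences, seed=0):
    """Return a randomly reordered copy of `sentences`.

    A seeded generator is used so that the same jumbled text can be
    fed to every method under comparison (an assumed design choice).
    """
    rng = random.Random(seed)
    jumbled = list(sentences)  # copy, so the original order is preserved
    rng.shuffle(jumbled)
    return jumbled


if __name__ == "__main__":
    text = ["S1.", "S2.", "S3.", "S4.", "S5."]
    print(jumble_sentences(text, seed=42))
```

Only the order changes: the jumbled text contains exactly the same sentences, so any drop in summary quality can be attributed to the loss of positional structure.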
3.3.3 Input Text Classification
The dataset is intentionally composed of different types of texts in order to study the performance of all the methods in detail. We classify our input texts into two categories: Essays and News Articles.
Essays. The articles belonging to this category follow a logical structural organization. For such articles, LTT is expected to give better results than ATESC, which is confirmed experimentally and demonstrated by the graphs in Figs. 8 and 9. It is also inferred that, with MS Word 2007 AutoSummarize as reference (Fig. 8), the results are most accurate when the LTT method is integrated with analog entailment values rather than discrete ones ({5} in Fig. 8), or when segment scoring is employed after segmentation using LTT ({2} in Fig. 8); both of these changes are proposed in ATESC. Furthermore, when the human generated summary is taken as the baseline, there is very little difference between the results of ATESC and LTT, as shown in Fig. 9.
News Articles. ATESC performs better than LTT in the case of news articles when the baseline selected is MS Word 2007 AutoSummarize, as illustrated in Fig. 10. The editor of a news article generally connects various sentences present in different paragraphs of the article. Unlike LTT, ATESC can determine such affiliations and create segments comprising such sentences.
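To illustrate how a clustering-based segmentation can group related sentences that are not adjacent, the sketch below applies normalized spectral clustering in the style of Ng, Jordan and Weiss [5] to a small matrix of analog values. It is a hedged illustration only: the symmetrisation of the matrix, the farthest-point k-means initialisation, the function name and the toy data are all assumptions, NumPy is assumed to be available, and none of it is taken from the ATESC implementation.

```python
import numpy as np


def njw_spectral_clustering(affinity, k, n_iter=50):
    """Cluster items with normalized spectral clustering (NJW style).

    `affinity` is a square matrix of pairwise relatedness in [0, 1].
    """
    a = np.asarray(affinity, dtype=float)
    a = (a + a.T) / 2.0            # assumption: symmetrise directional scores
    np.fill_diagonal(a, 0.0)
    degree = a.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    lap = d_inv_sqrt[:, None] * a * d_inv_sqrt[None, :]  # D^-1/2 A D^-1/2
    _, eigvecs = np.linalg.eigh(lap)                     # ascending eigenvalues
    x = eigvecs[:, -k:]                                  # k largest eigenvectors
    y = x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), 1e-12)

    # Deterministic farthest-point initialisation for k-means (assumption).
    centers = [y[0]]
    for _ in range(1, k):
        dist = np.min([np.linalg.norm(y - c, axis=1) for c in centers], axis=0)
        centers.append(y[np.argmax(dist)])
    centers = np.array(centers)

    # Plain k-means on the row-normalised spectral embedding.
    for _ in range(n_iter):
        dists = np.linalg.norm(y[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        new_centers = np.array([
            y[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels.tolist()


if __name__ == "__main__":
    # Toy data: even-numbered sentences relate strongly to each other,
    # as do odd-numbered ones, although they are interleaved in the text.
    n = 6
    toy = [[0.0 if i == j else (0.9 if i % 2 == j % 2 else 0.05)
            for j in range(n)] for i in range(n)]
    print(njw_spectral_clustering(toy, k=2))
```

In this toy case the two clusters interleave in the text, which is a grouping that a method restricted to contiguous segments could not produce.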
Fig. 8. P, R, F for Essays - MS Word 2007 AutoSummarize
Fig. 9. P, R, F for Essays - Human
Fig. 10. P, R, F for News Articles - MS Word 2007 AutoSummarize
Fig. 11. P, R, F for News Articles - Human Developed Summaries
3.3.4 Different Summary Lengths of the Same Text
Summaries measuring 25%, 35% and 45% of the length of the entire text are produced using ATESC and LTT, with MS Word 2007 AutoSummarize as reference. It can be observed from Figs. 12 and 13 that on increasing the summary length for a given text, both ATESC and LTT based summarizations exhibit improvements, though these improvements are more significant in ATESC. This is attributed to the fact that assigning scores to the segments created enables ATESC to select more appropriate sentences as the length of the summary is increased. In the case of LTT, since each segment is assigned equal importance, the next sentence included on increasing the length of the summary is less likely to be the most appropriate one.
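The interaction between summary length and segment scoring can be sketched as follows. The segment score under MSS is the mean of the scores of the sentences in the segment, while AE gives every segment the same score. The selection rule used here (ranking sentences by the product of segment score and sentence score), the rounding of the summary size, the function names and the numbers are hypothetical and serve only to make the example runnable; they are not the selection procedure of either ATESC or LTT.

```python
import math


def summary_size(n_sentences, rate):
    """Number of sentences in a summary covering `rate` of the text."""
    return max(1, math.floor(n_sentences * rate + 0.5))


def segment_scores(segments, sentence_scores, mode="MSS"):
    """Score each segment: mean sentence score (MSS) or assumed equal (AE)."""
    if mode == "AE":
        return [1.0 for _ in segments]
    return [sum(sentence_scores[i] for i in seg) / len(seg) for seg in segments]


def select_sentences(segments, sentence_scores, rate, mode="MSS"):
    """Pick summary sentences; the ranking rule is a hypothetical stand-in."""
    seg_scores = segment_scores(segments, sentence_scores, mode)
    ranked = sorted(
        ((seg_scores[s] * sentence_scores[i], i)
         for s, seg in enumerate(segments) for i in seg),
        key=lambda pair: (-pair[0], pair[1]),
    )
    n_sentences = sum(len(seg) for seg in segments)
    chosen = ranked[:summary_size(n_sentences, rate)]
    return sorted(index for _, index in chosen)  # restore document order


if __name__ == "__main__":
    # Hypothetical segments (lists of sentence indices) and sentence scores.
    segments = [[0, 1, 2], [3, 4], [5, 6, 7]]
    scores = [0.9, 0.8, 0.7, 0.2, 0.1, 0.5, 0.6, 0.4]
    for rate in (0.25, 0.35, 0.45):
        print(rate, select_sentences(segments, scores, rate, mode="MSS"))
```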
Fig. 12. P, R, F for different summary rates using ATESC
Fig. 13. P, R, F for different summary rates using LTT
4 Conclusions and Future Work
In this paper, we have presented a new approach to Text Summarization based on Analog Text Entailment and Segmentation using Normalized Spectral Clustering. It has shown the best results for texts which are long and belong to the domain of texts/documents where the relationship among sentences is independent of their respective positions, such as News Articles and e-mails. However, in texts like essays, which are more structured, the best results are observed when the LTT segmentation methodology is supplemented with one of the changes proposed by ATESC. In the case of texts/documents where the likelihood of sentences at larger distances being related is constrained by the length of the text (Short Texts), ATESC is not found to be the most effective Summarization approach. The increase in the length of the produced summaries has a greater influence in improving the performance of ATESC than of LTT, owing to the introduction of mean segment score values in the former. It can thus be concluded that ATESC is able to overcome several shortcomings of LTT, particularly for long and loosely structured texts. We expect these results to encourage further research and experimentation on the use of analog entailment values and clustering algorithms for segmentation of text.
It should be noted that the performance of ATESC based Summarization is limited by the effectiveness and computation speed of the employed Textual Entailment Engine and the Segmentation Algorithm. An improvement in either of these is expected to enhance the quality of the summary produced.
References
1. Tatar, D., Tamaianu-Morita, E., Mihis, A., Lupsa, D.: Summarization by Logic Segmentation and text Entailment. In: The Proceedings of Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2008), Haifa, Israel, February 17-23, pp. 15–26 (2008)
2. Jones, K.S.: Automatic summarizing: The state of the art. Information Processing & Management 43(6), 1449–1481 (2007)
3. Iftene, A.: Thesis on AI, Textual Entailment, TR 09-02, University “Alexandru Ioan Cuza” of Iasi, Faculty of Computer Science (October 2009)
4. Delmonte, R.: Venses, http://project.cgm.unive.it/venses_en.html
5. Ng, A., Jordan, M.I., Weiss, Y.: On Spectral Clustering: Analysis and an algorithm. In: Dietterich, T., Becker, S., Ghahramani, Z. (eds.) The Advances in Neural Information Processing Systems, Vancouver, British Columbia, Canada, December 3-8, pp. 849–856 (2001)
6. Luxburg, U.V.: A Tutorial on Spectral Clustering. Journal Statistics and Computing 17(4), 1–32 (2007)
7. Classic Essays, http://grammar.about.com/od/classicessays/ CLASSIC ESSAYS.html, News Articles, http://www.nytimes.com/
8. Radev, D.R., Hovy, E., McKeown, K.: Introduction to the Special issue on Summarization. Journal Computational Linguistics 28(4), 399–408 (2002)
9. Microsoft Auto Summarizer 2007 is used to identify the key points in the document and create a summary,
http://office.microsoft.com/en-us/word-help/ automatically-summarize-a-document-HA010255206.aspx