2.5 Text Summarisation

2.5.2 Relevant Text Summarisation approaches

A number of popular text summarisation approaches are presented in this sub-section. The aim is to provide an overview of existing approaches with which the approaches proposed later in this thesis can be compared. The approaches are categorised according to the concepts on which they are based: (i) lexical chains, (ii) sentence extraction and ranking, and (iii) other techniques.

2.5.2.1 Techniques based on lexical chains

The techniques based on the use of lexical chains are closely related to Natural Language Processing (NLP). According to Morris and Hirst [76], a lexical chain is “a succession of a number of nearby related words spanning a topical unit of the text”; in other words, a lexical chain is a sequence of related words. Many approaches have been proposed that use the concept of lexical chains, which can “hold a set of semantically related words of a text” [26]. For example, consider the following text: “Recently, the Department of Transport has been addressing issues related to aircraft safety. Although the use of airplanes is more common, the use of helicopters in cities has increased in the last couple of years.” In this case the lexical chain is “{transport, aircraft, airplanes, helicopters}”, where “airplanes” and “helicopters” are specialisations of “aircraft”, which in turn is a specialisation of “transport”.
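As an illustration only, the chain in the example above can be reproduced with a few lines of Python. The sketch assumes a hand-written hypernym map (`HYPERNYMS`) standing in for a thesaurus such as WordNet; the map, the function names and the tokenisation are hypothetical and are not taken from any of the cited approaches.

```python
# Hypothetical toy hypernym map (child -> parent). A real system would
# consult a thesaurus such as WordNet instead of a hand-written dictionary.
HYPERNYMS = {
    "airplanes": "aircraft",
    "helicopters": "aircraft",
    "aircraft": "transport",
}


def root_of(word):
    """Follow hypernym links upwards until a word with no parent is reached."""
    while word in HYPERNYMS:
        word = HYPERNYMS[word]
    return word


def build_chains(words):
    """Group known words that share a top-level hypernym, in text order."""
    known = set(HYPERNYMS) | set(HYPERNYMS.values())
    chains = {}
    for word in words:
        if word not in known:
            continue  # words outside the toy map are ignored
        chain = chains.setdefault(root_of(word), [])
        if word not in chain:
            chain.append(word)
    return list(chains.values())


text = ("Recently, the Department of Transport has been addressing issues "
        "related to aircraft safety. Although the use of airplanes is more "
        "common, the use of helicopters in cities has increased in the last "
        "couple of years.")
tokens = [t.strip(".,").lower() for t in text.split()]
chains = build_chains(tokens)
print(chains)  # [['transport', 'aircraft', 'airplanes', 'helicopters']]
```

The sketch only captures the specialisation relation used in the example; published algorithms also consider other relations (for instance synonymy) and the distance between words in the text.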

Barzilay and Elhadad [7] proposed an algorithm that identifies lexical chains in a text without the need for full semantic interpretation. They merged several knowledge sources: the WordNet thesaurus, a part-of-speech tagger, a parser to identify nominal groups, and a segmentation algorithm. Their method treats lexical chains as its source of information but ignores any other information from the text. Although their approach was not the first to focus on the use of lexical chains [76], it was very influential with respect to the work of many researchers in the field of text summarisation.

Silber and McCoy [92] proposed a linear-time algorithm for computing lexical chains, with a focus on efficiency. Later, in [93], they proposed an improved version of this algorithm, together with a method for evaluating lexical chains as an intermediate step in summarisation. The algorithm can handle larger documents than that proposed by Barzilay and Elhadad [7], which makes Silber and McCoy’s approach much more tractable.

Co-reference chains are lexical chains whose elements are related to each other by having the same referent (concept or idea). The approach described in [5] is based on co-reference chains: a summary representation is constructed by selecting the “best” co-reference chain, that is, the chain that best represents the main topic of a text. The generated summary is thus a concatenation of the sentences from the text that contain one or more elements of the “best” co-reference chain. The evaluation of the implemented algorithm showed a high level of precision with respect to the different criteria taken into consideration.
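The selection step can be sketched as follows. This is a minimal illustration, not the method of [5]: the co-reference chains are assumed to be given (no co-reference resolution is performed), and the “best” chain is assumed, purely for the sake of the example, to be the one with the most mentions. The function name and the sample data are hypothetical.

```python
def summarise_with_chain(sentences, chains):
    """Concatenate the sentences that contain an element of the 'best' chain."""
    # Assumption: the 'best' chain is simply the one with the most mentions.
    best = max(chains, key=len)
    members = {mention.lower() for mention in best}
    selected = []
    for sentence in sentences:
        words = {w.strip(".,;:").lower() for w in sentence.split()}
        if words & members:  # the sentence shares at least one chain element
            selected.append(sentence)
    return " ".join(selected)


sentences = [
    "The minister announced a new policy.",
    "She said it would take effect in May.",
    "Critics were not convinced.",
    "The minister rejected their concerns.",
]
# Hypothetical chains, as if produced by a co-reference resolver.
chains = [["minister", "She", "minister"], ["policy", "it"]]
summary = summarise_with_chain(sentences, chains)
print(summary)
```

In this example the third sentence is dropped because it contains no mention belonging to the selected chain.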

Another approach related to lexical chains is that presented by Fuentes and Rodríguez [29], in which the cohesive properties of text (namely lexical chains, co-reference chains and named-entity chains) are used to develop a system for the automatic extraction of summaries. Named-entity chains are those that are related to existing categories of names, such as the names of persons, places, etc. The system’s performance was described as positive; nevertheless, further improvement was needed, and the fact that it has only been used in the context of the Spanish language limits its application.

2.5.2.2 Techniques based on sentence extraction and ranking

Many approaches use sentence extraction to fragment the text and then generate summaries from the extracted sentences. Some approaches, however, optimise the way that sentences are extracted from the text, because of the importance of this process in the construction and synthesis of the output summary.

Jing [59] presented a sentence reduction system that focused on determining the sentences that are less important in the text and that can therefore be removed. In the evaluation of the method, the system’s reduction of a particular text was close to the reduction made by humans. A limitation of this approach is that it is a generic summarisation approach.

Chuang and Yang [15] focused on sentence extraction from a machine learning perspective and presented the design of an automatic text summariser trained using a supervised learning algorithm. It extracted sentence segments based on a feature vector representation. The learning algorithms chosen for training the summariser were the C4.5 decision tree algorithm, the naive Bayesian classifier and the DistAl neural network algorithm. In the experiments, the DistAl and naive Bayesian learners outperformed the C4.5 algorithm.
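To make the idea of classifying feature vectors concrete, the following sketch trains a small naive Bayesian classifier that labels a segment as belonging to the summary or not. It is an illustration under stated assumptions, not the summariser of [15]: the three binary features (first segment of a paragraph, contains a title word, is long), the training data and the function names are all hypothetical.

```python
import math
from collections import defaultdict


def train_naive_bayes(examples):
    """Count labels and, per label, how often each binary feature is 1."""
    label_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    for features, label in examples:
        label_counts[label] += 1
        for i, value in enumerate(features):
            feature_counts[label][i] += value
    return label_counts, feature_counts


def predict(model, features):
    """Return the label with the highest log-posterior for a feature vector."""
    label_counts, feature_counts = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(features):
            # Laplace-smoothed estimate of P(feature_i = 1 | label)
            p_one = (feature_counts[label][i] + 1) / (count + 2)
            score += math.log(p_one if value else 1 - p_one)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical features: (is_first_in_paragraph, has_title_word, is_long)
training = [
    ((1, 1, 1), True), ((1, 1, 0), True), ((0, 1, 1), True),
    ((0, 0, 0), False), ((0, 0, 1), False), ((1, 0, 0), False),
]
model = train_naive_bayes(training)
print(predict(model, (1, 1, 1)), predict(model, (0, 0, 0)))
```

With this toy data, a segment that has all three properties is classified as a summary segment and one that has none is not.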

Mihalcea [73] proposed an unsupervised method for automatic text summarisation that used a graph-based ranking algorithm for sentence extraction. The sentence extraction algorithm, called TextRank, took into account the local context of a word and the information recursively produced from the entire text. In order to identify important sentences, a process of “recommendation” was used: a link was established between two sentences that were related to similar concepts. Consequently, sentences that received a stronger recommendation were given a higher score, and the highest-scoring sentences were the ones that appeared in the summary. The TextRank algorithm was portable to other domains, genres or languages because it did not require significant linguistic knowledge.
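The “recommendation” process can be sketched as a weighted PageRank-style iteration over a sentence graph. The following is a minimal illustration rather than the implementation of [73]: the similarity function (word overlap normalised by the logarithms of the sentence sizes), the damping factor of 0.85, the fixed number of iterations and all function names are assumptions made for this example.

```python
import math


def similarity(a, b):
    """Word overlap between two token sets, normalised by their log sizes."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    if denom == 0:
        return 0.0
    return len(a & b) / denom


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences by iterating a weighted PageRank-style update."""
    tokens = [{w.strip(".,;:").lower() for w in s.split()} for s in sentences]
    n = len(sentences)
    # Edge weights: similarity between every pair of distinct sentences.
    weights = [[similarity(tokens[i], tokens[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j "recommends" i in proportion to the edge weight.
            incoming = sum(weights[j][i] / out_sum[j] * scores[j]
                           for j in range(n) if out_sum[j] > 0)
            new_scores.append((1 - damping) + damping * incoming)
        scores = new_scores
    return scores


sentences = [
    "The cat sat on the mat.",
    "The cat chased the dog.",
    "The dog sat on the mat.",
    "Stock prices fell sharply today.",
]
scores = rank_sentences(sentences)
ranked = sorted(zip(scores, sentences), reverse=True)
for score, sentence in ranked:
    print(round(score, 3), sentence)
```

The last sentence shares no words with the others, so it receives no recommendations and ends up with the lowest score; a summary would be formed from the top-ranked sentences.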

2.5.2.3 Other techniques

Knight and Marcu [67] suggested a summarisation technique that went beyond sentence extraction; two approaches were proposed for optimising the process of sentence compression: a noisy-channel based model and a decision-tree based model. In the evaluation of the approaches, the performance of the compression algorithms was close to human performance. However, the performance dropped when the sentences came from a corpus or set of texts other than the one used to train the models: the performance of the noisy-channel based model decreased slightly, while that of the decision-tree based model decreased dramatically.
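The noisy-channel view can be stated compactly. In the general formulation below, which uses notation introduced here for illustration rather than that of [67], t denotes the original long sentence and s a candidate compression; the long sentence is regarded as a “noisy” expansion of a short one, and the best compression is the one that maximises the posterior probability:

```latex
\hat{s} \;=\; \arg\max_{s} P(s \mid t) \;=\; \arg\max_{s} P(s)\, P(t \mid s)
```

Here P(s) is a source model that favours well-formed short sentences, and P(t | s) is a channel model that scores how plausibly s could have been expanded into t. The term P(t) is omitted from the right-hand side because it does not depend on s.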

Jing and McKeown [60] presented an approach that edits the sentences resulting from a sentence extraction process, in the form of a “cut and paste” based automatic text summariser. The “cut and paste” operations used by the summariser are based on the analysis of abstracts generated by humans: sentence reduction, sentence combination, syntactic transformation, lexical paraphrasing, generalisation or specification (depending on the needs and context), and reordering of the extracted sentences. The text summariser included a decomposition program for the analysis of abstracts written by humans and a sentence reduction module that based its decisions on several knowledge sources. The overall evaluation of the system was satisfactory.

A different approach to text summarisation, involving the use of Singular Value Decomposition (SVD), was taken by Steinberger and Ježek [97]. SVD had been used extensively in the area of statistics. In this approach, two evaluation methods based on SVD were proposed; these methods measured the similarity between the contents of an original document and its summary. The proposed summarising method performed better than the other methods considered during testing.
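For reference, the decomposition itself and one way in which an SVD-based content similarity could be formulated are given below. The notation is introduced here for illustration and is not necessarily the exact formulation used in [97]: A is assumed to be a term-by-sentence matrix, and u^(d) and u^(s) are assumed to be the first left singular vectors obtained from the document and from the summary respectively, expressed over the same n terms.

```latex
A = U \Sigma V^{\top},
\qquad
\cos\varphi = \left| \sum_{i=1}^{n} u^{(d)}_{i}\, u^{(s)}_{i} \right|
```

Because singular vectors have unit length, the cosine of the angle between them reduces to their dot product; the absolute value removes the arbitrary sign of a singular vector. A value close to 1 would indicate that the dominant topic of the summary is aligned with that of the document.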

2.5.3 Text Summarisation approaches that use Text Classification
