Further work: Non-scopal chunks

The chunking approach based on scopal relationships reflects inherent properties of *MRS representation, but does not capture all constructions capable of acting as semantic chunks. Scopal arguments match the concept of situation discussed in Chapter 2 well enough that the presence of scopal links allowed us to expand viable chunk centres to parts of speech other than verbs. It reliably indicates non-attributive occurrences of adjectives and prepositional phrases. In contrast, when considering non-scopal chunks, there are no clear criteria on which to base the distinction. Verbs and tensed nodes seem to be the only safe candidates for chunk centres.

5.7 Further work: Non-scopal chunks 135

(a) DMRS of the cat which sat on the roof.

(b) DMRS of Sam ruined the fish by cooking it.

Fig. 5.28 DMRS of predicting of the outcome/predicting the outcome.

In particular, two major categories of chunks elude detection based on scopal relationships: relative clauses (Fig. 5.27a) and gerund clauses introduced by prepositions (Fig. 5.27b). Central nodes of these satellites have no scopal connection to the top of the sentence and because of that, they are not captured by the scopal hierarchy introduced in §5.1.

Relative clauses are parts of nominal phrases, with the antecedent of the clause acting as the top node of their subgraph. Prepositional phrases are themselves modifiers, with verbs bound within a nominalized phrase by anominalization predicate. Since the ERG 1214 does not maintain a clear distinction between gerundive and mixed nominals (§2.1.3), it is unclear whether subgraphs introduced bynominalization nodes should be considered valid chunks. The DMRS graphs from Figure 5.28 generates both of the following surface forms:

(103) predicting the outcome, (104) predicting of the outcome.

The ambiguity is amplified by the fact that the same type of nominalized phrase can be used in coordination with nominal phrases based on nouns and as sentence subjects.

Given the constraints on valid chunk centres, grammatical constructions with non-scopal chunks can be represented as templates, just as their scopal counterparts. The starting point would be a DMRS graph of a scopal chunk, which is the nucleus situation, e.g Sam ruined the fishin Figure 5.27b. The functional graph for the chunking decision would comprise nodes participating in simple paths between the two chunk centres, and could be converted to a template, like other functional graphs we encountered. Long prepositional phrases with gerund arguments act syntactically like subordinate clauses, falling naturally to the end of the sentence, so we expect the templates to produce usable surface representations, similar to those of scopal templates.

Relative clauses pose more problems, as they can modify any element of the chunk, independent of its syntactic position in the sentence. For example, the cat from Figure 5.27a could be the subject or the object of the nucleus clause:

(105) The cat which sat on the roof looked bored. (106) I miss the cat which sat on the roof.

In Example 105 the relative clause is interjected in the middle of the surface string representing the nucleus chunk. There are no heuristics similar to the left-to-right assumption which we used for scopal chunks. If we had access to the mapping between realized tokens and their source nodes, we could replace the string of the antecedent in the realization of the nucleus with the instantiated template representation, using surface representations of templates such as ANTECEDENT which SATELLITE. Without that information, assembling the full surface string out of the resulting chunks is not straightforward. The alternative solution is to realize the entire template and identify the common substring between the nucleus surface and the template realization as the antecedent string. The downside of this approach is that the antecedent would have to be realized twice, potentially incurring significant computational cost.

Addressing these difficulties would allow the chunking model described in this chapter to include an even wider variety of semantic chunks.

Chapter 6 Conclusions

This thesis introduced a new task called semantic chunking, dedicated to reducing processing costs of downstream NLP tasks reliant on semantic information. In this chapter we give an overview of its contributions, expanding upon the list from §1.2, and we point out potential future research directions for each aspect of the investigation.

6.1 The foundations of the task

The main contribution of our work is the introduction of semantic chunking. The starting point was a practical definition of the task designed to isolate semantically contained fragments of a sentence that could be processed independently without loss of information. We envisioned a pre-processing divide-and-conquer approach tailored to the needs of the input representation and the target task. The results of processing individual fragments, i.e. semantic chunks, can be assembled into a result that is identical to the output of the target task performed on the full sentence, but which was obtained at a smaller computational cost.

Semantic chunking defined in this way relates to existing NLP tasks, such as shallow chunking and sentence simplification (§2.2), especially the ones that act as pre-processing steps for applications requiring semantic annotation, e.g. machine translation or summarisa- tion. The distinguishing feature of semantic chunking is the flexibility of what constitutes a valid output, and its primary consideration is the practical improvement observed in the downstream task. The customisable nature of semantic chunking affects the interpretation of the condition that semantic chunks have to be semantically contained. This constraint has to be resolved in the context of the application and its inputs. If an aspect of semantic information is irrelevant to the task or inaccessible through the given representation, semantic chunking does not have to preserve it. At the same time, auxiliary structures can allow chunks

to interact by preserving the information that would otherwise be lost if the constituents in question are separated (cf. §6.2).

In a sense, many formulations of the related tasks can be considered specific cases of semantic chunking. For example, one of the common steps in sentence simplification (§2.2.2) is splitting a sentence into smaller fragments and the elements of the models responsible for this operation can be seen as performing a dedicated form of semantic chunking. The task defined in this thesis is in a way a generalization of the similar tasks.

In Chapter 2 we investigated theoretical concepts used to describe the type of information that semantic chunking is interested in preserving. After exploring propositions as potential theoretical equivalents of semantic chunks (§2.1.1), we homed in on situation and event semantics as two related frameworks which offer the most insights into the form of suitable constituents (§2.1.2). Situations combine propositions that share participants and spatiotem- poral locations and, together with event variables, establish the basis for distinguishing semantically contained constituents.

The task is deeply rooted in the principle of compositionality, reflected in the underpinning assumption that a sentence can be divided into constituents suitable to act as semantic chunks. In §2.1.3 we identified individual situations with syntactic clauses in a sentence, reviewing a variety of other relevant grammatical constructions, such as eventive noun phrases. In the process, the issue of granularity emerged as a key consideration. Semantic chunks have to be small enough to be processed more readily than their full source sentences, but at the same time they have to be large enough to be distinct from individual units of representation, e.g. tokens in a sentence string. This balancing act is one of the examples how semantic chunking is defined primarily through its practical goals and how its exact form depends on the intended use.

Identifying semantic chunks as related to situations and syntactic clauses provided us with a framework in which we can consider relationships between chunks (§2.1.4). We borrow terminology from Rhetorical Structure Theory (RST) to connect semantic chunks within an individual sentence through nucleus-satellite relationships, as we observe that each sentence has a nucleus semantic chunk describing the core situation expressed by a sentence. The nucleus chunk is further qualified by satellite chunks. Syntactically, the relationships are often expressed by operators, e.g. subordinating conjunctions, which do not belong to semantic chunks but have no independent meaning. Throughout the thesis we referred to them as functional fragments and investigated ways in which the information they convey can be preserved during chunking for the purpose of the final assembly step of partial results into the full output. We summarise the explored solutions in §6.3.

6.1 The foundations of the task 139

The primary representation on which we focused in the thesis is Dependency Minimal Recursion Semantics (DMRS), which takes the form of semantic dependency graphs. We gave an overview of DMRS and its related *MRS framework in §2.3. *MRS is a formalism with a deeply in-built syntax-semantics interaction and a sophisticated treatment of scopal relationships. Both of these properties make it an appealing domain for the exploration of semantic chunking. On one hand, strong compositional properties of the representation, supported by a wide coverage, linguistically motivated grammar, mean that constituents suitable to act as semantic chunks emerge naturally from DMRS structures (cf. Chapter 5.3). On the other hand, the close knit relation of the complex representation to the underlying surface form creates an opening into the investigation of semantic chunking of the surface string representation (§6.2, cf. Chapter 4).

6.1.1 Further research on the foundations

Our thesis practically explored semantic chunking on two representations: DMRS graphs and surface strings, but the task could be defined on other ones as well. The experiments of Narayan and Gardent (2014) on sentence simplification (§2.2.2) suggest that semantic chunking could be successfully applied to Discourse Representation Structures (DRS; Kamp (1981)). Another promising formalism is Abstract Meaning Representation (AMR; Banarescu et al. (2013)), which is a representation language, rather than a fixed representation itself. Like DMRS, AMR structures are graphs with directed edges representing relations, so that some of the techniques developed in this thesis could be adapted to serve the new representation. On the other hand, like DRS, the information contained in AMR graphs is dependent on the particular annotation conventions adopted and often encompasses elements of discourse. Both DRS and AMR could open a way to researching a chunking-like task which is not limited to the semantic information of individual sentences but can target larger text spans.

Another expansion of the task domain could see chunking applied to different languages. The thesis focuses exclusively on English, but the DELPH-IN initiative (§2.3) originated *MRS-based grammars of multiple languages, including Japanese, German and Spanish.

These grammars could be used to assess how well our observations transfer to other languages.

Finally, the related task of speech segmentation (§2.2.4) suggests another potential extension. Like semantic chunks, speech units aim to divide an utterance into semantic fragments that facilitate the processing of longer input. Although idiosyncrasies of speech mean that spoken utterances fall outside of the scope of this thesis, semantic chunking could be adapted to meet the needs of that domain.

In document Semantic chunking (Page 146-152)